
Download/format help

The information below relates to data available to download using the Download/order links in the data catalogue or via Usage details in your account. ESDS guides provide help with downloading data in other ways e.g. from the ESDS International multi-nation aggregate databanks or the ESDS Nesstar Catalogue.

Download data in your chosen format
What is a ZIP file?
Saving the ZIP file
Opening the ZIP file
Contents of the unzipped file
Commonly occurring file extensions

Quantitative data formats - availability and advice

Qualitative data formats - availability and advice

Download data in your chosen format

  1. Within 'Your account' select 'Usage details'
  2. Click the relevant usage title and a list of studies associated with that usage will be displayed
  3. Click the 'Download' button associated with the study you wish to download, and an 'End User Licence' reminder page will appear
  4. Click the 'I accept' option to continue, and the data formats available for your chosen study will then be displayed
  5. Click the required format button and the download will begin

Note: if you get the message "Access Denied - Referral Block", this may be due to a particular type of firewall at your institution or on your computer. Try to download the data from another computer that does not have the same firewall installed, or temporarily deactivate the firewall after consulting your local computing support.

If Internet Explorer blocks a download with a 'no entry' sign and displays a notification in the Information Bar, click the Information Bar and select the option to allow the download. Alternatively, hold down the Control button on your keyboard when you try to download.

What is a ZIP file?

A ZIP file allows several files to be downloaded as one file. The files are compressed so that the ZIP file is smaller than the combined size of the uncompressed files, resulting in a faster download.

Saving the ZIP file

In the 'File Download' dialog box choose to Save this file. Do NOT choose to open the file as this will only create a temporary copy on your computer.

In the 'Save As' dialog box, choose the directory/folder where you want to save your data. Make a note of the file name and click on 'Save'.

Note: if a dialog box does not appear you may need to change the security settings in your browser to Medium. In Internet Explorer, you can do this using Tools, Internet Options, Security. If this does not work, contact your local computer support.

If you are using Internet Explorer on a Macintosh operating system (OS) and a dialog box does not appear, try one of the following web browsers instead: on OS 9, Mozilla (version 5.1.7 or higher); on OS X, Safari (the browser supplied with the OS).

Opening the ZIP file

Locate the directory/folder where you saved the ZIP file. Double click the filename to uncompress/open it using decompression software such as Winzip, Pkunzip or Stuffit Expander.

Extract the files to a directory/folder, opting to keep the folder names/directory structure.
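
For users who prefer a script to a desktop decompression tool, the extraction step can be sketched in Python using the standard zipfile module. This is a minimal illustration, not an ESDS-supplied tool: the function name extract_bundle and the paths passed to it are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_bundle(zip_path, dest_dir):
    """Extract every file in a ZIP into dest_dir, keeping the folder structure.

    Returns the list of member names found in the ZIP file.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as bundle:
        # extractall() recreates the folders stored in the ZIP under dest.
        bundle.extractall(dest)
        return bundle.namelist()
```

Calling extract_bundle with the saved ZIP file and a target folder of your choice should leave the same folder layout on disk as extracting with the "keep folder names" option in a desktop tool.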

Contents of the unzipped file

The top level folder is usually named UKDA[study number]-[format] (e.g. UKDA4651-spss). In this top level folder there will usually be two folders, one containing the data and named according to format (e.g. SPSS, Stata, tab, rtf), and one containing the documentation and called 'mrdoc' (short for machine-readable documentation). Occasionally, there will also be a folder called 'code', which will contain command files that create derived variables or aid analysis in some way.
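
The layout described above can be checked with a short Python sketch. It assumes the usual UKDA[study number]-[format] naming and treats any folder other than 'mrdoc' and 'code' as a data folder; the function name describe_bundle is hypothetical, and bundles that depart from the usual layout will not match.

```python
import re
from pathlib import Path

# Usual top level folder name, e.g. UKDA4651-spss (assumed pattern).
TOP_LEVEL = re.compile(r"^UKDA(\d+)-(\w+)$")


def describe_bundle(top_folder):
    """Summarise the folders found inside an unzipped download."""
    top = Path(top_folder)
    match = TOP_LEVEL.match(top.name)
    if match is None:
        raise ValueError(f"{top.name!r} does not follow UKDA[study number]-[format]")
    study_number, fmt = match.groups()
    subfolders = sorted(p.name for p in top.iterdir() if p.is_dir())
    return {
        "study_number": study_number,
        "format": fmt,
        "has_mrdoc": "mrdoc" in subfolders,  # documentation folder
        "has_code": "code" in subfolders,    # optional command files
        "data_folders": [n for n in subfolders if n not in ("mrdoc", "code")],
    }
```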

The 'mrdoc' folder contains a number of other folders holding the user guides supplied by the data depositor, which are usually in PDF format; a file containing information on how to cite and acknowledge the data in publications; a shortened version of the catalogue record, called UKDA_Study_[study number]_Information.htm (this may also be called cite[study number].txt); and, for studies processed after April 2004, a UK Data Archive data dictionary called [filename]_UKDA_Data_Dictionary.rtf.

The following files may also be available:

  • read[study number].txt (or .htm) files - these contain information about the UK Data Archive processing levels, useful information in addition to the user guides supplied by the data depositor and copies of any additional agreements on conditions of use.
  • rd[series number].txt (or .htm) files - these provide additional information that relates to the entire series (e.g. every quarter of the Labour Force Survey).
  • [study number]_file_information.rtf - contains file names and a brief description of the files included in the downloaded zip file.

Commonly occurring file extensions

  • txt - plain text format can be opened using any text editor or word processing software such as MS Word
  • rtf - rich text format can be opened using any word processing software
  • pdf - portable document files can be opened using Adobe Acrobat Reader
  • htm - html/web files can be opened using a web browser or MS Word
  • por - SPSS portable files, which require SPSS to open
  • sav - SPSS system files, which require SPSS to open
  • dta - Stata files, which require Stata to open
  • tab - tab delimited text files can be opened by spreadsheet software such as MS Excel

If you do not have Adobe Acrobat Reader installed, it can be downloaded from the Adobe website.

Quantitative data formats - availability and advice

SPSS

This is the most popular dissemination format. The files supplied by the UK Data Archive (the Archive) are in SPSS portable (.por) format for older studies, and SPSS system (.sav) format for newer studies (processed from October 2005 onwards). SPSS portable files open in all versions of SPSS, and SPSS system files open in all recent versions of SPSS on all platforms.

For .sav files that include variable names longer than eight characters, a file named [study number]_SPSS_varnameinfo.txt is also supplied. This contains a lookup table of extended and abbreviated SPSS variable names. It enables users of SPSS version 11.5 or earlier (which abbreviate long variable names) to match the abbreviated variable names shown in the SPSS file to the full variable names.

Studies processed after April 2004 are also supplied with a UK Data Archive Data Dictionary file, which is named [data file name]_UKDA_Data_Dictionary.rtf and should be more easily readable than SPSS data dictionaries or Stata codebooks. For an example, see UK Data Archive Data Dictionary.

Stata

This format is increasing in popularity, and is recommended for surveys that require weighting and other survey design effects to be incorporated into any analyses. Studies are made available in Stata 6 format, and, for studies processed after October 2005, in Stata 8 format. If one or more data files have more than 2,047 variables (the limit for version 6 and the 'intercooled' versions 7 and 8), the data files are (after April 2004) made available in Stata version 8 Special Edition format. Versions are indicated in the names of the zipped download bundle i.e. [study number]stata6 (version 6), [study number]stata8 (version 8), and [study number]stata8se (version 8 special edition).
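
The bundle naming convention above can be decoded with a small Python sketch. It assumes the bundle name is exactly [study number]stata6, [study number]stata8 or [study number]stata8se in lower case, optionally followed by a .zip extension (the extension is an assumption, as is the function name stata_version).

```python
import re

# "8se" is listed before "8" so the special edition suffix is tried first.
BUNDLE = re.compile(r"^(\d+)stata(6|8se|8)(?:\.zip)?$")
LABELS = {
    "6": "version 6",
    "8": "version 8",
    "8se": "version 8 special edition",
}


def stata_version(bundle_name):
    """Return (study number, Stata version label), or None if the name does not match."""
    match = BUNDLE.match(bundle_name)
    if match is None:
        return None
    return match.group(1), LABELS[match.group(2)]
```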

Stata's data handling limits are generally slightly less generous than those of SPSS, so some loss or truncation of information is inevitable: variable and value labels may be lost or truncated, and user missing value definitions may be lost. The UK Data Archive has developed its own scripts to guarantee optimal translation of data between SPSS and Stata. For studies processed after April 2004, the file [Study Number]_SPSS_to_STATA_conversion.rtf is supplied with the data. This provides a log of any information that has been lost or truncated upon translation. For an example, see UK Data Archive SPSS to Stata Conversion Information File.

Users can then locate the full label and user missing value information in the UK Data Archive's data dictionary files, named [data file name]_UKDA_Data_Dictionary.rtf. For an example, see UK Data Archive Data Dictionary.

Tab-delimited text

This is an entirely generic format that stores just the variable names and the rectangular matrix of data (there is no information on variable formats, label information or missing value definitions). The character set is normally ASCII but may be UNICODE.

ESDS recommends ordering data in tab-delimited format where this is the most effective means of reading the data into a specialist analysis package. When data are supplied in tab-delimited format, data dictionary or database structure information will also be provided. Depending on the application from which the tab-delimited data were created, these files will be named either [data file name]_variableinformation.rtf or [data file name]_UKDA_Data_Dictionary.rtf

Tab-delimited files can be opened in MS Excel, but MS Excel 2003 is limited to 256 columns (variables) and 65,536 rows (cases). MS Excel 2007 supports 16,384 columns (variables) and 1,048,576 rows (cases).
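
Before opening a large tab-delimited file in a spreadsheet, it may help to count its cases and variables. The Python sketch below is one way to do this, assuming the first row of the file holds the variable names and that those names take up one worksheet row; the function names and the default "ascii" encoding are illustrative choices, and a Unicode file would need a different encoding argument.

```python
import csv

# (maximum rows, maximum columns) per worksheet, as quoted above.
EXCEL_LIMITS = {"2003": (65_536, 256), "2007": (1_048_576, 16_384)}


def tab_file_shape(path, encoding="ascii"):
    """Return (cases, variables) for a tab-delimited file with a header row."""
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)           # row of variable names
        cases = sum(1 for _ in reader)  # remaining rows are cases
    return cases, len(header)


def fits_in_excel(cases, variables, version="2003"):
    """Check the file against the worksheet limits of the given Excel version."""
    max_rows, max_cols = EXCEL_LIMITS[version]
    # The header row of variable names also occupies one worksheet row.
    return cases + 1 <= max_rows and variables <= max_cols
```

If fits_in_excel returns False for the version you use, the file is better read directly into a statistical or specialist analysis package.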

SAS

Due to limited demand, data are not routinely made available in SAS format by the UK Data Archive (the Archive). However, the Archive will create SAS formats upon request. The standard method of delivery of SAS datasets is as a fixed-width ASCII file with a .sas command file to read the data into SAS and create the formats library. This means that a SAS dataset can be created in almost any version of SAS running on any operating system. Users can also decide whether to preserve the SPSS 'user missing' codes or collapse them into the SAS system missing code, since this is supplied as a discrete block of commands in the .sas file, under the heading /* User Missing Value Specifications */

R

This open source variant of S-Plus is gaining in popularity as it offers advanced functionality not present in SPSS or, in some instances, even Stata. However, the ESDS does not make data available in R format, since R will read both SPSS and Stata formats (using the 'read.spss' and 'read.dta' functions, which are provided by R's 'foreign' package).

Other formats

Occasionally, datasets are not suitable for SPSS or Stata. This occurs when, for example, unstructured or semi-structured interviews record literal textual responses of greater than 255 characters. While packages such as MS Excel and MS Access can store these long strings, statistical packages (like SPSS, prior to version 13, and Stata) cannot. In these cases, the data are made available in format(s) that do not truncate these long strings. This will typically be a choice of a proprietary format (e.g. MS Excel or MS Access) and tab-delimited text. MS Access 'data documenter' information is provided for each table when data are extracted from an MS Access database.

Qualitative data formats - availability and advice

Rich text format (.rtf)

Rich text format is used for the majority of qualitative studies. Rich text format files will open into most text editors and almost all word processing packages.

Portable document format (.pdf)

PDF format is used when data were only available to the ESDS as hard copy (paper) and the level of ESDS data processing did not permit Optical Character Recognition to convert them into text. In such instances the hard copy material is scanned into image files (400 dpi TIFFs) and then converted into PDF format.

CAQDAS format

Where qualitative data have been coded and analysed using Computer Assisted Qualitative Data Analysis Software (CAQDAS), these files may also be supplied, in addition to the 'raw' transcripts in rich text format.
