UK DATA ARCHIVE DOCUMENTATION 4504 - United Kingdom Time Use Survey, 2000

Data Archive processing standards
---------------------------------

The data were processed to the UK Data Archive's 'A' standard. A rigorous
and comprehensive series of checks was carried out to ensure the quality
of the data and documentation. The most important procedures were as
follows. Firstly, checks were made that the number of cases and variables
matched the depositor's records. Secondly, checks were made that all
variables had variable labels and all nominal (categorical) variables had
value labels. Where possible, either with reference to the documentation
and/or in communication with the depositor, absent labels were created.
Thirdly, logical checks were performed to ensure that nominal
(categorical) variables had values within the range defined (either by
value labels or in the depositor's documentation). Lastly, any data or
documentation that breached confidentiality rules were altered or
suppressed to preserve anonymity.

All notable and/or outstanding problems discovered are detailed under the
Data and documentation problems heading below.

Methodology report
------------------

The UK Time Use Survey 2000 methodology report is available online at:
http://www.statistics.gov.uk/timeuse/methodology.asp

Data and documentation problems
-------------------------------

1. Diary file

The diary file summarises the information collected on one weekday and
one weekend day of the self-completion diary. Diaries are divided into
144 ten-minute time slots over 24 hours. Users are strongly recommended
to analyse this file in close conjunction with the section on
'Understanding the UK Time Use Survey' and the Activity Coding lists in
the User Guide, as the coding can be quite tricky to interpret.

2. Household file

This file comprises data from the household questionnaire, which was
always the first questionnaire to be completed at any address.
It collected information on details of individual household members,
housing and household appliances, household vehicles, home produce and
DIY, help received from outside the household, household income and
accommodation type.

3. Individual file

This file comprises data from the individual questionnaire, which
collected information on current employment, looking for work, receipt of
benefits, education and training, voluntary work, help and services for
others, leisure activities, health, childcare, carers and classification.

Some coding anomalies were found while checking the individual file.
They include:

Q8E 'Type of shift pattern worked' has a few responses of 11 and 12,
which are not mentioned in the questionnaire.

Q14F 'Number of hours unpaid overtime usually worked per week' has 1 case
of 70 hours and 1 case of 150 hours - these seem abnormally high, so may
be coding errors.

Q23A 'Age of leaving full-time education' has some cases with values
under age 12/13 (older respondents may have left school at that age).

Q23C 'How old when left that full-time education' has one value of 94,
which seems abnormally high.

PQ48 'Number of sick people cared for not living with helper' has
isolated values of 15, 34, 40, 79 and 99, which seem rather high.

Q27DA 'Number of times helped in last week' has some values which seem
abnormally high, such as 99.

Users should note that other out-of-range codes may be present in the
data files.

4. Worksheet

This file contains data from the worksheet. The worksheet is used to
record hours spent in main job, full-time education or in other paid work
for seven days starting the first day of the diary. Information on
travelling at work was also collected. Whilst no problems were
encountered while processing this file, the coding can be tricky to
interpret, so it is strongly recommended that users analyse this file in
close conjunction with the example on the worksheet page in the User
Guide (p.168).
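
The individual-file anomalies listed under item 3 above can be screened
for with a simple range check. The following is a minimal Python sketch,
not part of the UKDA processing: the function name, the sample records
and, in particular, the upper limits are illustrative assumptions and are
not taken from the questionnaire or the User Guide. Negative values are
skipped because codes such as -1, -7, -8 and -9 denote missing data in
this study.

```python
# Hypothetical plausibility limits. These thresholds are assumptions made
# for illustration only; choose limits from the questionnaire instead.
PLAUSIBLE_MAX = {
    "Q14F": 60,   # hours of unpaid overtime usually worked per week
    "Q23C": 30,   # age when left that full-time education
    "PQ48": 10,   # sick people cared for not living with helper
    "Q27DA": 50,  # number of times helped in last week
}


def flag_implausible(records, limits=PLAUSIBLE_MAX):
    """Return (row_index, variable, value) for each value above its limit."""
    flagged = []
    for i, rec in enumerate(records):
        for var, upper in limits.items():
            value = rec.get(var)
            # Skip absent variables and negative (missing-data) codes.
            if value is None or value < 0:
                continue
            if value > upper:
                flagged.append((i, var, value))
    return flagged


# Invented sample records, using values quoted in the notes above.
sample = [
    {"Q14F": 5, "Q23C": 16},
    {"Q14F": 150, "Q23C": 94},
    {"Q14F": -9, "Q23C": 18},
]
flags = flag_implausible(sample)
for row, var, value in flags:
    print(f"row {row}: {var} = {value} exceeds the assumed limit")
```

Flagged values are candidates for inspection rather than confirmed
errors; as noted for Q23A, an unusual value may be genuine.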

2nd edition information
-----------------------

Several changes were made to the diary, household and individual data for
the 2nd edition (though the worksheet data remains unchanged), and the
documentation has also been updated. For full details of the changes,
please refer to the User Guide (pp.1 and 2).

3rd edition information
-----------------------

All files have new weights added which use 2000 population figures based
on the 2001 Census (the previous versions of the data files had used 1991
Census based estimates). For other changes to the data, please see
documentation.

Individual file
---------------

There are a number of unlabelled values in the variables SIC, SIC2, SOC
and SOC2. Most of these have only one case.

Some variables have unlabelled values of -1, -7, -8 or -9. Other
variables within this data file have labels as follows:

  -1  missing
  -7  refused
  -8  don't know
  -9  not answered

Worksheet
---------

There are a number of unlabelled values in the variables SIC, SIC2, SOC
and SOC2. Most of these have only one case.

Notes from data delivery and post-order corrections
---------------------------------------------------

The data depositor requested in May 2002 that the following variables be
removed from the first edition of the household data file - they are not
present in the 2nd or 3rd edition of the data:

  agehh
  comphh
  lengtr
  director
  ptype
  gender

Useful Notes
------------

File name diary_data_8 is not available in STATA format, as there are too
many variables.

Conversion of Documentation
---------------------------

All electronic and paper documentation supplied with this study is
normally incorporated into the UKDA User Guide (in PDF format). The
conversion programmes used are the latest versions of Adobe PDF Writer
for electronic documentation and Adobe Paper Capture (Acrobat 'plugin'
version) for paper documentation. Occasionally, some or all of the
electronic documentation cannot be usefully converted to PDF (e.g.
MS Excel files with wide worksheets) and this is supplied in other
formats. All User Guides are fully bookmarked.

Conversion of Data
------------------

Ingest format(s) of the data = SPSS .sav files

From January 2003 onwards, almost all data conversions have been
performed using software developed by the UKDA. This enables
standardisation of the conversion methods and ensures optimal data
quality. In addition to its own data processing/conversion functionality,
this software invokes the SPSS and Stat/Transfer command processors to
perform certain translations in a standardised and optimal way. Although
data conversion is automated, all data files created are subject to
inspection by a UKDA data processor.

Depending on the ingest format, one of the following conversions will
have been performed to create the format in which you have been supplied
the data. Note that you will have been provided with the data only in the
format you requested.

SPSS portable:
If SPSS portable is not the ingest format, this format will generally
have been created either via the SPSS command processor (e.g. if the
ingest format is SPSS .sav, SAS, Excel, or dBase) or, if the ingest
format is STATA, via the Stat/Transfer command processor. If the ingest
format is text (e.g. fixed width ASCII) and no setup files are provided,
the UKDA will write the necessary setup files to read the data into SPSS.

STATA:
If STATA is not the ingest format, all STATA files will have been created
from SPSS .sav format via the Stat/Transfer command processor. All files
created are in STATA 6 format. Importantly, Stat/Transfer's optimisation
routine is run so that variables with SPSS write formats narrower than
the data (e.g. numeric variables with 10 decimal places of data formatted
to FX.2) are not rounded upon conversion to STATA because they are
converted to "doubles" rather than floats.
User missing values are copied across into STATA: the user-missing
definition is lost, but the code itself is retained (as opposed to being
collapsed into STATA's single missing code in versions 6 and 7).

Issues:
Variables that include both date and time in the SPSS version, such as
mm-dd-yyyy hh:mm:ss (e.g. 18-JUN-2001 13:28:00), will lose the time
information and become date only. If the time information is critical, a
new variable will have been created in the STATA data file by the UKDA.

Tab-delimited text:
If tab-delimited text is not the ingest format, tab-delimited files are
created from SPSS portable files via the SPSS command processor, Excel
spreadsheets, or MS Access databases. When exporting from Access data
tables to tab-delimited text, the many undesirable embedded special
characters allowed by Access memo and text fields (tabs, carriage
returns, line feeds, etc.) are stripped out by the UKDA software.

Issues:
Date formats in SPSS are always exported to mm/dd/yyyy in tab-delimited
text format, so you may note a mismatch with the documentation on such
variables. Variables that include both date and time, such as mm-dd-yyyy
hh:mm:ss (e.g. 18-JUN-2001 13:28:00), will lose the time information and
become mm/dd/yyyy. If the time information is critical, a new variable
will have been created in the tab-delimited data file by the UKDA.

All users of the data in tab-delimited format are provided with the SPSS
data dictionary, this being the rich text file named according to the
convention _variableinformation.rtf. This contains the SPSS format
information as well as the variable and value labels, and it is therefore
recommended that all tab-delimited data users consult this information.
If the tab-delimited data were converted from MS Access, analogous 'data
documenter' output will be supplied in rtf format. Likewise, the files
may contain SQL setup information.

MS Excel:
If MS Excel is not the ingest format, Excel files are created via the
SPSS command processor.
The date and time issues noted under STATA and tab-delimited apply to
SPSS to Excel conversion via the SPSS command processor.

MS Access:
Due to the substantial incompatibilities between versions of MS Access,
the UKDA only make data available in MS Access format if this is the
ingest format and the database contains important information in addition
to the data tables (forms, queries, etc.).

Other formats:
Data are only made available in other formats on the rare occasion when
there is no reliable method of extracting the data into a more accessible
format.
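
For users of the tab-delimited format, the notes above on mm/dd/yyyy
dates and on the negative missing-data codes can be sketched in Python as
follows. This is an illustration under stated assumptions, not a UKDA
tool: the column names (serial, intdate, hours) and the sample rows are
invented, and treating -1, -7, -8 and -9 as missing for a given variable
is an assumption that should be checked against the
_variableinformation.rtf data dictionary.

```python
import csv
import io
from datetime import datetime

# Missing-data codes as labelled in the individual file notes.
MISSING_CODES = {-1: "missing", -7: "refused",
                 -8: "don't know", -9: "not answered"}


def parse_row(row, date_fields=(), numeric_fields=()):
    """Convert mm/dd/yyyy strings to dates and missing codes to None."""
    out = dict(row)
    for name in date_fields:
        # SPSS dates are exported as mm/dd/yyyy in tab-delimited files.
        out[name] = datetime.strptime(row[name], "%m/%d/%Y").date()
    for name in numeric_fields:
        value = int(row[name])
        out[name] = None if value in MISSING_CODES else value
    return out


# Invented two-row sample standing in for a tab-delimited data file.
sample = "serial\tintdate\thours\n1\t06/18/2001\t37\n2\t06/19/2001\t-9\n"
reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
rows = [parse_row(r, date_fields=["intdate"], numeric_fields=["hours"])
        for r in reader]
for r in rows:
    print(r["serial"], r["intdate"].isoformat(), r["hours"])
```

Fields not named in date_fields or numeric_fields are left as the strings
read from the file, so the conversion stays explicit per variable.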