Glossary of terms in the Data Catalogue
Introduction
Each dataset within the UK Data Archive collection is described by a structured catalogue record that contains a number of pre-defined elements,
such as Title and Abstract. These elements are based on the Data Documentation Initiative (DDI) version 2.1, an XML-defined
international standard for describing social science data. For further details on the
UK Data Archive catalogue record and metadata in use at the UK Data Archive see: Catalogue metadata.
Listed below are definitions for the main data catalogue elements currently in use at the UKDA.
Elements marked with an asterisk (*) are mandatory
and should be present, where applicable, in every record in the Data Catalogue, with the exception of some
older studies.
Some elements consist of a free-text entry e.g. the abstract and main topics. Others draw on
controlled vocabularies e.g. sampling procedures, keywords. Controlled vocabularies are lists of standardised
terminology. They provide efficiency and consistency in both creating and retrieving information.
The controlled vocabularies currently in use at the UK Data Archive are indicated in the list below.
For more information about each DDI tag please refer to the DDI XML Schema Tag Library.
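As a minimal sketch of how a controlled vocabulary supports consistency, the Python below checks free-typed entries against a fixed list. The terms are copied from the Sampling Procedures entry in this glossary; the function name and the case-insensitive matching rule are assumptions made for illustration and do not describe the UK Data Archive's own tooling.

```python
# Hypothetical sketch: the terms come from the Sampling Procedures entry in
# this glossary; the function name and matching rule are assumptions.
SAMPLING_PROCEDURES = {
    "no sampling (total universe)",
    "quota sample",
    "simple random sample",
    "one-stage stratified or systematic random sample",
    "one-stage cluster sample",
    "multi-stage stratified random sample",
    "quasi-random (e.g. random walk) sample",
    "purposive selection/case studies",
    "volunteer sample",
    "convenience sample",
}


def unrecognised_terms(values, vocabulary):
    """Return the entries that are not in the controlled vocabulary."""
    normalised = {term.lower() for term in vocabulary}
    # Compare case-insensitively and ignore surrounding whitespace.
    return [v for v in values if v.strip().lower() not in normalised]


entries = ["Quota sample", "random-ish sample"]
print(unrecognised_terms(entries, SAMPLING_PROCEDURES))
```

Any entry returned by the function would need to be replaced with a standard term before the record could be considered consistent with the vocabulary.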
List of elements
Study Number *
Each study (or dataset) is given a unique number on accession to the collection.
DDI element: <IDNo> Identification Number
Title *
Full authoritative title of the study, usually indicating the geographic scope of the data as well as the time
period covered.
Alternative Title
The title statement may also include an alternative title (this is found below the main title in brackets). This is a title by
which the work is commonly referred to, or an abbreviation of the title, such as an acronym. It might also be used where a title has changed
over time.
DDI elements:
- <titl> Title
- <altTitl> Alternative Title
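The study number, title and alternative title can be read from a DDI record with a few lines of Python. This is a sketch under stated assumptions: the sample record and its values are invented, the nesting codeBook/stdyDscr/citation/titlStmt is assumed to follow DDI 2.1, and the record is assumed to carry no XML namespace (real files may declare one, in which case the paths would need qualifying).

```python
import xml.etree.ElementTree as ET

# Illustrative record only: the values are invented, and the element nesting
# (codeBook/stdyDscr/citation/titlStmt) is assumed to follow DDI 2.1.
SAMPLE = """
<codeBook>
  <stdyDscr>
    <citation>
      <titlStmt>
        <titl>Example Household Survey, 2001</titl>
        <altTitl>EHS</altTitl>
        <IDNo>0000</IDNo>
      </titlStmt>
    </citation>
  </stdyDscr>
</codeBook>
"""


def title_statement(xml_text):
    """Return study number, title and alternative title (None if absent)."""
    root = ET.fromstring(xml_text.strip())
    stmt = root.find("./stdyDscr/citation/titlStmt")
    if stmt is None:
        return {"IDNo": None, "titl": None, "altTitl": None}
    # findtext returns None when the child element is missing.
    return {tag: stmt.findtext(tag) for tag in ("IDNo", "titl", "altTitl")}


print(title_statement(SAMPLE))
```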
Series
A collective title is assigned to a group, or series, of datasets which are closely related but where the individual
titles may differ, e.g. being part of a larger study or a long-standing regularly conducted survey where the title has
changed over time. An example of this is the Labour Force Survey Series.
DDI element: <serName> Series Name
Related Studies
These fall under two categories:
Group Constituents - these are datasets which are generated from the same data collection vehicle
(e.g. predecessors, successors, other waves or rounds), for example the individual years of the
British Crime Survey. The individual datasets are brought together by means of a group number (GN).
Other Related - these are datasets which are closely related but not part of the same data collection vehicle.
DDI element: <relStdy> Related Studies
Subject Categories *
One or more subject categories from a controlled vocabulary list are assigned to each study to define the
overall subject content of the data at study level. They can be used for retrieval purposes to identify data that
cover a particular subject area. The list currently in use at the UK Data Archive can be seen on
the browse by subject web page.
DDI element: <topcClas> Topic Classification
Keywords *
Each catalogue record contains a list of Assigned Subject Keywords covering all topics included in the
data at question/variable level. The keywords are taken from
the UK Data Archive Humanities and Social Science Electronic Thesaurus (HASSET).
DDI element: <keyword> Keywords
Depositor(s) *
The name of the person(s) and/or institution(s) who deposited the study with the archive. These are taken from a controlled vocabulary names authority list.
DDI element: <depositr> Depositor
Principal Investigator(s) *
The person(s) and/or organisation(s) responsible for creating the data. These are taken from a controlled vocabulary names authority list.
DDI element: <AuthEnty> Authoring Entity/Primary Investigator
Data Collector(s)
The person(s) and/or organisation(s) who collected the data if different from the principal investigators. These are taken from a controlled vocabulary names authority list.
DDI element: <dataCollector> Data Collector
Original Data Producer
For studies that are based on data previously collected e.g. datasets based on government data, census
data, this field is used to record who was responsible for the original data on which the study is based. These are taken from a controlled vocabulary names authority list.
DDI element: <producer> Producer
using the "role" attribute: role="original producer"
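A minimal sketch of how the "role" attribute distinguishes an original data producer from other producers is given below. The fragment is invented for illustration: the organisation names are placeholders, and the enclosing prodStmt element is assumed from DDI 2.1 rather than taken from this glossary.

```python
import xml.etree.ElementTree as ET

# Invented fragment for illustration; organisation names are placeholders and
# the prodStmt wrapper is an assumption about the surrounding DDI structure.
FRAGMENT = """
<prodStmt>
  <producer role="original producer">Example Statistics Office</producer>
  <producer>Example Research Unit</producer>
</prodStmt>
"""

root = ET.fromstring(FRAGMENT.strip())
# Select only the producer elements whose role attribute matches exactly.
original = [p.text for p in root.findall("./producer[@role='original producer']")]
print(original)
```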
Sponsor(s)
The person(s) and/or organisation(s) funding the research/data collection. These are taken from a controlled vocabulary names authority list.
DDI element: <fundAg> Funding Agency/Sponsor
Grant Number
This element contains the grant numbers given by the funding bodies e.g. ESRC grant number.
DDI element: <grantNo> Grant Number
Other Acknowledgements
This field is used for any names of individuals or organisations which should be acknowledged as having some input into the dataset, but that do not
fit into other fields or where it is not practical to name them individually e.g. 'All District Councils in Great Britain provided data on..'
It may also be used to qualify or further explain other fields relating to individuals or organisations.
DDI element: <othId> Other Identifications/Acknowledgments
Abstract *
A summary describing the purpose, nature, and scope of the data collection, special characteristics of its contents, major subject areas covered,
and what questions the principal investigators attempted to answer when they conducted the study.
- Main Topics *
These are a sub-section of the abstract and include information on the subject coverage of the data.
DDI element: <abstract> Abstract
Coverage:
- Time Period Covered * mandatory if no entry under Dates of Fieldwork
The time period to which the data refer. This item reflects the time period covered by the data, not the dates of coding or making documents machine-readable or the dates the data were collected.
DDI element: <timePrd> Time Period Covered
- Dates of Fieldwork * mandatory if no entry under Time Period Covered
The date(s) when the data were collected.
DDI element: <collDate> Date of Collection
- Country *
Indicates the country or countries covered by the data. These are taken from a controlled vocabulary geographic names authority list. If the data cover more than one country, the term 'multi-nation' is also added.
DDI element: <nation> Country
- Geography
Where the data collection focused on specific town(s)/village(s) or region(s)/counties, these are named here. These are taken from a controlled vocabulary geographic names authority list.
DDI element: <geogCover> Geographic Coverage
- Spatial Units *
Lowest level(s) of geographic aggregation covered by the data. Examples include Government Office Regions, Metropolitan Districts and local authorities. More information
can be found in the FAQ "What is the most detailed geographical level I can analyse the data at?"
DDI element: <geogUnit> Geographic Unit
- Observation Units
One or more basic units of analysis or observation that the data describe. For example: individuals; families/households; groups;
institutions/organisations; administrative units (geographical/political); text units (documents/chapters/word).
DDI element: <anlyUnit> Unit of Analysis
- Kind of Data
Each study usually indicates whether the data are individual (micro) and/or aggregate (macro) level. In addition one or more of the following may be
indicated, depending upon whether the data are quantitative and/or qualitative: textual; numeric; alpha/numeric; image; sound; in-depth/unstructured interview transcripts; semi-structured interview transcripts;
structured interview questionnaires; focus group transcripts; interview notes; interview summaries; interview extracts; unstructured/semi-structured diaries;
observation field notes; kinship diagrams; press clippings; photographs; video-taped interviews; audio-taped interviews; minutes of meetings; case study
notes; naturally occurring speech/conversation transcripts; correspondence.
DDI element: <dataKind> Kind of Data
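Two of the Coverage rules above lend themselves to a short check: a record needs either Time Period Covered or Dates of Fieldwork, and the term 'multi-nation' is added when more than one country is covered. The sketch below assumes a record held as a plain Python dict; the key names and function names are hypothetical and chosen only for this example.

```python
def coverage_problems(record):
    """List problems with the Coverage fields of a record (a plain dict)."""
    problems = []
    # Each of the two time fields is mandatory only if the other is empty.
    if not record.get("time_period_covered") and not record.get("dates_of_fieldwork"):
        problems.append("need Time Period Covered or Dates of Fieldwork")
    if not record.get("countries"):
        problems.append("need at least one Country")
    return problems


def country_terms(countries):
    """Add the term 'multi-nation' when more than one country is covered."""
    terms = list(countries)
    if len(set(countries)) > 1:
        terms.append("multi-nation")
    return terms


record = {"countries": ["France", "Germany"], "dates_of_fieldwork": "2001"}
print(coverage_problems(record))
print(country_terms(record["countries"]))
```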
Universe Sampled:
- Location of Units of Observation *
This identifies whether the data are national, cross-national or subnational. If the geographic coverage of the data is for an entire country then the
entry is national. If the data are for only part of a country (e.g. a number of towns, or counties, or a single place) then the entry is subnational.
If the data cover more than one country then the entry is cross-national. For cross-national data, national and/or subnational is also specified.
DDI element: <universe> Universe
- Population *
The group of persons or other elements that are the object of research and to which any analytic results refer. Age, nationality, and residence
commonly help to delineate a given universe, but any of a number of factors may be involved, such as sex, ethnicity, income, etc. The universe may consist
of elements other than persons, such as housing units, court cases, deaths, countries, etc. In general,
it should be possible to tell from the description of the universe whether a given individual or element (hypothetical or real) is a member of the
population under study e.g. Adults aged 16 and over in private households in England and Wales.
DDI element: <universe> Universe
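The rules for Location of Units of Observation can be sketched as a small function. The input format is an assumption made for this example: a dict mapping each country to True when the data cover the whole country and False when they cover only part of it. The function name is hypothetical.

```python
def location_of_units(coverage):
    """Return the location labels for a study.

    coverage maps each country name to True if the whole country is covered,
    or False if only part of it (towns, counties, a single place) is covered.
    """
    if not coverage:
        raise ValueError("at least one country is required")
    levels = set()
    if any(coverage.values()):
        levels.add("national")
    if not all(coverage.values()):
        levels.add("subnational")
    labels = sorted(levels)
    # More than one country: cross-national, plus the level(s) found above.
    if len(coverage) > 1:
        labels.insert(0, "cross-national")
    return labels


print(location_of_units({"France": True}))
print(location_of_units({"France": True, "Germany": False}))
```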
Methodology:
- Time Dimensions *
The time method or time dimension of the data collection. These include: cross-sectional (one-time) study; follow-up to cross-sectional study;
repeated cross-sectional study; longitudinal/panel/cohort study; time series. For studies other than cross-sectional one-time studies, the frequency
(e.g. number of follow-ups, or intention of numbers) and interval (e.g. how often) is also shown e.g. Repeated cross-sectional study. Data are collected quarterly.
DDI element: <timeMeth> Time Method
- Sampling Procedures *
The type of sample and sample design used to select the survey respondents to represent the population. These may include one or more of the following: no sampling (total universe);
quota sample; simple random sample; one-stage stratified or systematic random sample; one-stage cluster sample; multi-stage stratified random sample;
quasi-random (e.g. random walk) sample; purposive selection/case studies; volunteer sample; convenience sample.
DDI element: <sampProc> Sampling Procedure
- Number of Units
Information on the number of cases targeted and obtained. For example, the survey may have posted a questionnaire to 4000 households and received 2000 back.
DDI element: <deviat> Major Deviations from the Sample Design
- Method of Data Collection *
The method used to collect the data, including one or more of the following: face-to-face interview; telephone interview; postal survey; self-completion;
psychological measurements; educational measurements; observation; clinical measurements; simulation; diaries; physical measurements; transcription of
existing materials; compilation or synthesis of existing material; focus group; video recording; audio recording; email survey; web-based self-completion; content analysis.
DDI element: <collMode> Mode of Data Collection
- Weighting *
The use of sampling procedures may make it necessary to apply weights to produce accurate statistical results. For example, population weights may be applied
to make the data representative of the general population and to compensate for groups over- or under-represented in the data. This section indicates whether weights
should be applied and describes the criteria for using weights. Where the weighting information is particularly complex and/or extensive the user may be
advised to consult the documentation.
DDI element: <weight> Weighting
- Data Sources
This field is used to describe the sources used for the data. It is typically used for historical data
collections to describe the originating materials and their location e.g. Kings' Speeches 1940 - 1951, House
of Lords/House of Commons Publications, The National Archives.
It is also used for studies that are based on data previously collected, for example where the data are based on Census data.
DDI element: <sources> Sources Statement
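The Number of Units entry above records cases targeted and cases obtained. Users often derive a response rate from these two figures; the sketch below shows that arithmetic using the glossary's own example of 4000 questionnaires posted and 2000 returned. The function name is hypothetical, and the catalogue record itself is not stated to hold a computed rate.

```python
def response_rate(targeted, obtained):
    """Return the share of targeted cases that were obtained."""
    if targeted <= 0:
        raise ValueError("targeted must be a positive number")
    return obtained / targeted


# Glossary example: 4000 households targeted, 2000 questionnaires returned.
print(f"{response_rate(4000, 2000):.0%}")  # prints 50%
```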
Language(s) of Written Material *
This field specifies the language of the study description and the study documentation.
DDI element: <othRefs> Other References Notes
Access:
- Access Conditions *
Each study is assigned an access code which has certain access conditions associated with it. The most common message to see here is: "The depositor has specified that
registration is required and standard conditions of use apply. The depositor may be informed about usage." The standard conditions of use are those that
are agreed to at registration in the End User Licence. However, some studies may have additional conditions of use. Where this is the case this field will
also have the message "Additional special conditions of use also apply." with a link to a web page that lists those studies for which special conditions apply and links
to the special conditions wording for each one. Where data are only available to certain users, such as the ESDS International macro databanks which are only
available to users at institutes of UK higher or further education, this is also specified in this section.
DDI element: <restrctn> Restrictions
- Availability *
This field indicates the service that supports users of the data and where the data are held, e.g. ESDS Government, UK Data Archive.
DDI element: <accsPlac> Location of Data Collection
- Contact *
This field indicates who to contact with queries about the data.
DDI element: <contact> Contact Persons
Date of release *
The First Edition date is when the study was first made available through the online Data Catalogue. If there
have been subsequent editions, the Latest Edition date is also displayed.
DDI elements:
- <distDate> Date of Distribution
- <version> Version
Copyright *
This field contains a copyright statement indicating the person(s) and/or organisation(s) that own the
intellectual property. Copyright is usually retained
by the author/author's employer. Also see Copyright and the use of existing resources.
DDI element: <copyright> Copyright
Publications
Where the UK Data Archive has been informed of any publications, the references to these are provided. Where possible, a link is provided to the actual publication.
The publications are split into those by principal investigators and those resulting from secondary analysis.
DDI element: <relPubl> Related Publications