How to cite datasets used in your published works
If you are a researcher analysing statistical data, you will need to cite primary data sources in your published papers, just as you would cite other sources.
The term “dataset” refers to the raw microdata files from research, as well as the documents that give provenance and usage information about the data. Here are some tips on correct data citations.
- Identify the data early in your paper, preferably in the abstract.
- Include a dedicated "data" section so that readers can immediately identify the data that underlies your work.
- Reference the data in your data tables.
- Cite data in your references. References are more frequently indexed than full papers, so the citation will be made more visible by its inclusion here.
- Cite the exact version of the data used in your research, to support data discovery.
- Include a unique identifier in your citation, such as a Digital Object Identifier (DOI). A DOI provides a permanent link to the data, so the data can still be accessed even if its URL changes.
DataFirst will be providing Digital Object Identifiers (DOIs) for our datasets from 2016.
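As a rough illustration of why a DOI acts as a permanent link, the sketch below turns a DOI into a resolvable web address using the public doi.org resolver. The function name `doi_to_url` and the DOI `10.1234/example` are hypothetical and used only for illustration; they do not refer to a real DataFirst dataset.

```python
def doi_to_url(doi: str) -> str:
    """Return a resolvable https://doi.org/ link for a DOI string.

    Accepts a bare DOI ("10.1234/example") or one that already carries
    a common prefix such as "doi:" or "https://doi.org/".
    """
    cleaned = doi.strip()
    known_prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in known_prefixes:
        if cleaned.lower().startswith(prefix):
            # Drop the prefix so only the bare DOI remains.
            cleaned = cleaned[len(prefix):].strip()
            break
    return "https://doi.org/" + cleaned


# Hypothetical DOI, for illustration only.
example_link = doi_to_url("doi:10.1234/example")
print(example_link)
```

Because the resolver redirects to wherever the data currently lives, a citation that carries the DOI keeps working even if the data portal's own URLs are reorganised.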
DataFirst uses the international data citation format recommended by DataCite. This follows the format below:
Name of producer. Survey name and date [dataset]. Version number. Place of production: Producer [producer], date of production. Place of distribution: Distributor [distributor]. URL or DOI
Statistics South Africa. General Household Survey 2010 [dataset]. Version 2. Pretoria: Statistics South Africa [producer], 2011. Cape Town: DataFirst [distributor], 2011. http://www.datafirst.uct.ac.za/dataportal/index.php/catalog/192
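If you manage many citations, the format above could be assembled programmatically. The following is a minimal sketch, not a DataFirst tool: the class `DatasetCitation` and its field names are assumptions made for illustration. It follows the template as written, which places no date after the distributor.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """Fields of a dataset citation (hypothetical names, for illustration)."""
    producer: str
    title: str               # survey name and date
    version: str
    production_place: str
    production_year: int
    distribution_place: str
    distributor: str
    identifier: str          # URL or DOI

    def format(self) -> str:
        # Mirrors: Name of producer. Survey name and date [dataset].
        # Version number. Place of production: Producer [producer],
        # date of production. Place of distribution: Distributor
        # [distributor]. URL or DOI
        return (
            f"{self.producer}. {self.title} [dataset]. {self.version}. "
            f"{self.production_place}: {self.producer} [producer], "
            f"{self.production_year}. "
            f"{self.distribution_place}: {self.distributor} [distributor]. "
            f"{self.identifier}"
        )


ghs_2010 = DatasetCitation(
    producer="Statistics South Africa",
    title="General Household Survey 2010",
    version="Version 2",
    production_place="Pretoria",
    production_year=2011,
    distribution_place="Cape Town",
    distributor="DataFirst",
    identifier="http://www.datafirst.uct.ac.za/dataportal/index.php/catalog/192",
)
citation_text = ghs_2010.format()
print(citation_text)
```

Keeping each element in its own field makes it easy to swap the URL for a DOI once one is available, without retyping the rest of the citation.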
Contact our helpdesk support[at]data1st.org for help with citing data in your published research.
Based on: Ball, A. & Duke, M. (2011). ‘How to Cite Datasets and Link to Publications’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available Online: http://www.dcc.ac.uk/resources/how-guides/cite-datasets.