This dataset contains a number of cross-sectional datafiles generated from Markinor's M-Bus survey (which later became Ipsos-Markinor's KhayaBus survey). The data from these sections was then packaged as the South African Reconciliation Barometer (SARB), an omnibus survey with national coverage aimed at measuring socio-political trends.
The South African Reconciliation Barometer survey has been conducted 12 times. This release excludes the 12th round, which is due to be released in mid-2013. The datasets are labelled "round 1" to "round 11" and represent independent cross-sectional surveys, so results can be compared between rounds only in aggregate. Each data file contains a variable called "round" that takes a value from 1 to 11 (the same value for every observation in a given data file) and has a value label giving the month(s) and year in which the data were collected. Each data file should also carry a datafile label with this information.
Kind of Data
Sample survey data
Version 1.1 Edited, anonymised data for public distribution
Version 1 of the data was received by DataFirst from the Institute for Justice and Reconciliation (IJR) in 2012.
DataFirst has identified the following data quality issues concerning this dataset:
A questionnaire was supplied for each round of the survey, however:
The round 1 questionnaire does not match the variable labels in the dataset -- this probably means that this is an incorrect questionnaire or an incorrect earlier version of the questionnaire
The round 2 questionnaire is missing
The round 9 questionnaire was not available during the data cleaning process, potentially resulting in mismatches between the questions in the codebook and in the questionnaire -- in the event of mismatches the questionnaire takes precedence
Many questions were repeated in every round of the survey, although there were changes with every round. Researchers should check for changes in the wording of questions (or in the response options offered) before comparing results across rounds.
A technical report for each round of the survey was supplied by IPSOS/Markinor (sometimes in the form of PowerPoint presentations), but these reports are incomplete with respect to sampling processes etc. Users with queries are encouraged to submit them to DataFirst, which can pass them on to the IJR or IPSOS/Markinor for clarification if required.
The archived IJR version of the dataset has been substantially cleaned by DataFirst, mainly in the form of harmonising coding and adding metadata to the datasets.
The SARB datasets were cleaned and harmonised across rounds in the following ways:
1. Unique identifier variables ("id") were made unique both within and between rounds by converting each id into an 8-digit number (for single-digit rounds, a 7-digit number with a leading zero) whose first two digits represent the round (e.g. the first respondent in the round 1 dataset is identified by 01000001 and the first respondent in the round 12 dataset by 12000001).
2. Demographic and general (i.e. non-SARB question) variables were renamed to a standard list of names (including "id", "weight", "gender", "race", etc.), given consistent variable labels and placed at the beginning of the dataset in a standard order. NOTE: While the coding of these variables was often harmonised, we cannot guarantee complete compatibility: for many variables (e.g. dwelling type) the categories changed several times. The rounds should therefore NOT be merged/appended together but rather analysed separately.
3. Substantive questions were renamed with a round prefix in the form "r01_" (for the first round), usually followed by the letter/number combination corresponding with the designation of questions on the questionnaire.
4. Substantive questions were recoded to match a standard coding scheme consistent across questions in a round and, as far as possible, consistent across rounds. Missing values were recoded to -9 for "Don't know", -5 for "Refused", -3 for "Not applicable" and -1 for "Missing" (the latter representing cases where there is no response recorded but the data producer specified no reason in the data for there being no response). Scales were transformed to run from "negative" to "positive" (e.g. "disagree" to "agree", "unjustified" to "justified", "never" to "often", etc.). Standard label schemes were applied to ensure value labels are consistently spelled.
5. All transformations of the data from the supplied form to the first release version can be traced by viewing the cleaning do-files released with the data. Researchers requiring the original files to re-run the syntax files used in the data cleaning process should contact email@example.com
6. A codebook was produced for each round of the survey EXCEPT rounds 1 and 2 (owing to the incorrect and missing questionnaires). The codebooks are automatically generated from the post-cleaning data files, which have the full questions from the questionnaires inserted into the data as "notes". The codebooks show the variable names (altered during the cleaning process, but easy to match with variables in the original data files and with questions in the questionnaires), variable labels (usually truncated versions of the full question but sometimes more descriptive) and coding/value labels (which have been harmonised during data cleaning). For rounds 1 and 2 the best source of the full questions is the variable labels in the original, unaltered SPSS-version data files. Researchers requiring these files for their data analysis should contact firstname.lastname@example.org
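The identifier, variable-naming and missing-value conventions described in the list above can be illustrated with a short Python sketch. It is not part of the released cleaning do-files. The function names are hypothetical, the question code "q5" is made up for illustration, and the split of the identifier into a 2-digit round plus a 6-digit respondent number is inferred from the examples 01000001 and 12000001 rather than stated in the documentation.

```python
# Missing-value codes as listed in the cleaning notes.
MISSING_CODES = {
    -9: "Don't know",
    -5: "Refused",
    -3: "Not applicable",
    -1: "Missing",
}


def make_id(round_number, respondent_number):
    """Build an 8-character id: 2-digit round, then a 6-digit respondent number.

    The 6-digit width is an assumption inferred from the documented examples.
    """
    if not 1 <= round_number <= 99:
        raise ValueError("round_number must be between 1 and 99")
    if not 1 <= respondent_number <= 999999:
        raise ValueError("respondent_number must be between 1 and 999999")
    return f"{round_number:02d}{respondent_number:06d}"


def split_id(identifier):
    """Recover (round, respondent number) from an id stored as text or number.

    Ids stored numerically lose their leading zero, so pad back to 8 digits.
    """
    text = str(identifier).zfill(8)
    return int(text[:2]), int(text[2:])


def prefixed_name(round_number, question_code):
    """Return a substantive-question variable name such as 'r01_q5'."""
    return f"r{round_number:02d}_{question_code}"


def valid_values(values):
    """Drop the harmonised missing-value codes before computing statistics."""
    return [v for v in values if v not in MISSING_CODES]
```

For example, make_id(1, 1) gives "01000001", and valid_values should be applied to a substantive variable before taking means or proportions so that the negative missing codes are not treated as scale points.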
INDIVIDUAL: Racial attitudes, opinion of government, race, perception of government performance, public opinion, service delivery, intertemporal perception of well-being, happiness, dependence on public services, education, employment status, income
The survey covered all persons aged 16 years and older living in multimember households. Squatters were also included in the sampling frame. However, domestic workers, hostel dwellers and persons younger than 16 years of age were excluded from the sample.
Producers and sponsors
Institute for Justice and Reconciliation
Institute for Justice and Reconciliation
Sampling of the respondents took the form of a multistage area-probability sample with three calls. The sample included persons aged 16 years and older living in multimember households. Squatters were included in the sampling frame. However, domestic workers, hostel dwellers and persons younger than 16 years of age were excluded from the sample. Enumeration Areas were drawn from the 2001 Population Census and sampling points were allocated to sub-places in each of the metros. Within each of the sampled sub-places, a street was randomly selected using the Geographical Information System (GIS), which indicates all the streets within the boundaries of each sub-place. The streets were listed and a street was randomly selected from the list of streets. In the selected street, between four and six dwellings were then selected using a random walk procedure in the selected area. If there was more than one household at a dwelling, one household was chosen using a random procedure. At every alternate dwelling, either all the males or all the females aged 16 years and older were listed in order of age, and one of them was then chosen using a random selection grid.
If the interviewer found at the first visit that the qualifying person would be unavailable for the duration of the fieldwork (for example, on holiday or sick) or could not speak any of the South African official languages, the interviewer was allowed to substitute immediately. Otherwise, three calls had to be made before substituting.
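The within-household selection step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the fieldwork procedure itself. The documentation does not describe the selection grid, so a uniform random draw stands in for it. Which gender is listed at which alternate dwelling is also not stated, so the even/odd rule below is arbitrary, and the function name and the (name, gender, age) tuple layout are hypothetical.

```python
import random


def select_respondent(household, dwelling_index, rng):
    """Pick one qualifying person from a household.

    household: list of (name, gender, age) tuples, gender "male" or "female".
    dwelling_index: position of the dwelling along the random walk (0, 1, 2, ...).
    rng: a random.Random instance, passed in so draws can be reproduced.
    """
    # Assumption: males are listed at even-numbered dwellings, females at odd.
    target_gender = "male" if dwelling_index % 2 == 0 else "female"

    # List eligible persons (aged 16 and older) of that gender in order of age.
    eligible = sorted(
        (p for p in household if p[1] == target_gender and p[2] >= 16),
        key=lambda p: p[2],
    )
    if not eligible:
        return None  # nobody qualifies at this dwelling

    # Assumption: a uniform random draw replaces the unspecified selection grid.
    return rng.choice(eligible)
```

Passing a seeded random.Random instance makes the draw reproducible, which is convenient when checking the logic but is not something the source says the interviewers did.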