In 2014, Financial Sector Deepening Zambia (FSDZ) commissioned Microfinance Opportunities (MFO) to implement the Zambia Financial Diaries Project (FDP). The aim was to understand how low-income people in Zambia managed their cash flows and how they used transfers, savings, loans and insurance to do so.
Kind of Data
Event/transaction data [evn]
v1.0: Edited, anonymised dataset for licensed distribution.
Household: Household characteristics, the make-up of the household, demographic characteristics of respondent, asset ownership, use of assets as investments, asset financing decisions, insurance purchasing behaviour and knowledge, retirement savings decisions, perspectives on savings groups, perceptions of shock, depositing patterns, earnings and occupation (for all members of household), general savings behaviour, farming in the household, micro-retail behaviour, propensity to permit credit, and energy and water consumption.
Transactions: Purchases of goods and services, exchanges of goods and services for other goods and services (bartering), gifts, loans, savings deposits, transfers of money between household residents (intra-household transfers), withdrawals or inputs into food storage
Events: Event was a shock, event happened to the respondent directly, event led to a loss of income, event resulted in expenditure, event required travelling, event brought about income gain, event happened to a peripheral party (i.e. did not happen to the respondent or their family but someone in the community), nature of event
Panel: Value of all inflows and outflows in ZMW, count of various transactional events in the week, count of various event types during the week
Roster: Material living conditions, educational attainment, demographic characteristics, financial service usage, occupation
Four provinces (Lusaka, Copperbelt, Eastern, Western), in districts with a town that could accommodate fieldworkers for a year and that was within one hour of all field sites.
The survey universe of the ZFDP 2015 is slightly unusual because of logistical constraints imposed on the sampling frame. These constraints meant that only certain districts from Copperbelt, Eastern, Lusaka, and Western Provinces could be selected. Districts within those provinces could then only be selected if there was a town with sufficient services to support a fieldwork team for a year that was within an hour and a half of all field sites. Enumerator areas were then drawn from those selected districts and households within those EAs were then selected using a random walk. The survey, then, covered all de jure household members within households in those enumerator areas in Zambia.
Producers and sponsors
Financial Sector Deepening Zambia
University of Cape Town
Data cleaning, hosting and support
The sampling frame for the FDP was developed under certain logistical constraints. The priority was to develop a sample that, while not statistically representative, was still reflective of the varying levels of financial service access and livelihoods of low-income Zambians. Four provinces were selected (Copperbelt, Eastern, Lusaka, and Western) that contained a diverse mix of urban and rural respondents, various levels of financial access, and a preponderance of individuals involved in informal businesses (Lusaka Province), the mining sector (Copperbelt Province), or farming (Eastern and Western Provinces).
From those provinces, districts were selected based on further conditions. Chief amongst these was that any selected district needed to have a town with sufficient services to support a fieldwork team for a year that was within an hour and a half of all field sites. Once those constraints were satisfied, standard enumerator areas (as drawn from the master sample developed in the Zambian 2010 Census) were randomly selected from the pruned set of districts in the provinces mentioned above. Households were then selected within those enumerator areas using a random walk. Respondents within households were chosen using a Kish grid as per and screened for eligibility with an enrolment questionnaire. This questionnaire set requirements that the respondent had to meet to be included in the sample. For example, if the respondent was going to be away for the majority of the year, the interview was terminated and the respondent was excluded.
Dates of Data Collection
Data Collection Mode
Data Collection Notes
Most of the data generated in this dataset is the result of respondents filling out notebooks on a weekly basis. Those notebooks were then used to create official records under the supervision of fieldworkers. The intervention of fieldworkers every week ensured that notebooks were filled out diligently, with respondents probed regarding the accuracy of their entries.
The financial diaries paper instrument: this data sheet was filled out weekly and then checked with respondents by enumerators at the end of each week for accuracy
Enrolment questionnaire: the enrolment questionnaire was used as a screen at the beginning of the data collection period
Cross-sectional survey questionnaire: the cross-sectional survey questionnaire was administered towards the end of the year of observation
The data was anonymised by DataFirst, encoded and cleaned. Some duplicates were removed.
The date start and date end variables are imprecise: a recorded week does not always span seven days, and weeks do not always begin on the same day. This is most likely attributable to data capture errors on the part of the fieldworker. Most of the week lengths, when evaluated, come to seven days (as expected), but not all. For the user, the more reliable measure of the week of observation is the transact_week variable.
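A user who wants to see which weeks are affected could compute the span of each recorded week and flag those that are not seven days long. The following is only a minimal sketch: the field names date_start and date_end and the sample records are hypothetical (only transact_week and RespID are named in this documentation), and the real files may store dates in a different form.

```python
from datetime import date

# Made-up records for illustration only. The field names date_start and
# date_end are assumptions; transact_week and RespID come from the notes.
records = [
    {"RespID": "A01", "transact_week": 1,
     "date_start": date(2015, 3, 2), "date_end": date(2015, 3, 8)},
    {"RespID": "A01", "transact_week": 2,
     "date_start": date(2015, 3, 9), "date_end": date(2015, 3, 17)},
    {"RespID": "B02", "transact_week": 1,
     "date_start": date(2015, 3, 3), "date_end": date(2015, 3, 9)},
]


def week_length(record):
    """Number of calendar days covered, counting both endpoints."""
    return (record["date_end"] - record["date_start"]).days + 1


def irregular_weeks(rows, expected_days=7):
    """Return the rows whose recorded span is not the expected length."""
    return [r for r in rows if week_length(r) != expected_days]


flagged = irregular_weeks(records)
for r in flagged:
    print(r["RespID"], r["transact_week"], week_length(r))
```

Rows flagged this way can still be analysed by week of observation using transact_week rather than the recorded dates.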
The wards variable seems to be imperfectly captured, as many entries do not match lists of recorded Zambian electoral wards. Efforts have been made to make the entries more readable, but the result is imperfect. The phone access variable is also a bit misleading. There are 58 missing values for the variable roster_phoneaccess_or_own which seem to have a corresponding follow-up response in the variable roster_accessph_only. It is unclear what the roster_accessph_only variable is meant to represent because it is not on the enrolment questionnaire.
There are two sets of duplicates in terms of RespID in the cross-sectional data file. Further investigation showed that one set of duplicates is the apparent result of the same fieldworker visiting the same household twice (the second visit occurring eight days after the first). The policy adopted here was to keep the latest observed row in the data. Notably, a few variables had different values between the two observations. These are easily attributable to actual dynamics within the household. There were no substantive differences between static household characteristics.
The other set of duplicates was slightly more complicated. It involved the apparent incorrect assignment of the RespID code (that is, the RespID code was assigned correctly for one entry and incorrectly for the other). Fortunately, it was possible to check the correct RespID using the other data files. It turned out that the incorrectly assigned entry was meant to be represented by another code entirely. This is most likely a data capturing error. Correcting the incorrectly coded RespID yielded another set of two duplicates. The correctly coded entry of these two duplicates was the one used in the final file.
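The keep-the-latest-row rule described for the first set of duplicates can be illustrated as follows. This is a sketch under stated assumptions, not the cleaning script actually used: the field names interview_date and hh_size and the sample rows are hypothetical, and only RespID is a name taken from the data.

```python
from datetime import date

# Made-up rows for illustration only. interview_date and hh_size are
# hypothetical field names; RespID is the identifier named in the notes.
rows = [
    {"RespID": "A01", "interview_date": date(2015, 11, 2), "hh_size": 4},
    {"RespID": "A01", "interview_date": date(2015, 11, 10), "hh_size": 5},
    {"RespID": "B02", "interview_date": date(2015, 11, 3), "hh_size": 3},
]


def keep_latest(rows, key="RespID", when="interview_date"):
    """Keep one row per key value: the one with the latest date."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[when] > current[when]:
            latest[row[key]] = row
    return list(latest.values())


deduplicated = keep_latest(rows)
for row in deduplicated:
    print(row["RespID"], row["interview_date"], row["hh_size"])
```

The second set of duplicates could not be resolved by a rule like this, since it required checking the RespID against the other data files first.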
Licensed data, available under conditions
Financial Services Deepening Zambia. Zambian Financial Diaries Project 2015 [dataset]. Version 1. Lusaka: Financial Services Deepening Zambia [producer], March 2017. Cape Town: DataFirst [distributor], March 2017. DOI: https://doi.org/10.25828/v91e-hj78