South Sudan - Population and Housing Census 2008
Reference ID | SSD-NBS-PHC-2008-v01 |
Year | 2008 |
Country | South Sudan |
Producer(s) | National Bureau of Statistics - Government of South Sudan |
Sponsor(s) | Transitional Government of National Unity (TGoNU), financial support; Multi-Donor Trust Fund - South Sudan (MDTF-SS), financial support; Danish International Development Agency (DANIDA), financial support; European Union (EU), Financial su |
Created on
May 24, 2016
Last modified
Jul 20, 2016
Sampling
Sampling Procedure
Sampling applied only to the Long Form Questionnaire (LFQ), which was designed to capture information on the socio-economic status of the country. The sample design for the LFQ is documented in the report "Review of Sampling Procedures for the Sudan Census Long Form Questionnaire (LFQ)" (Megill, March 2008). At the Southern Sudan Center for Census, Statistics and Evaluation (SSCCSE), Excel spreadsheets were produced with the cartographic database of enumeration areas (EAs) for each state. This database contained all of the geographic names and codes for each EA, as well as the estimated number of households from the pre-census cartographic quick count. The sampling frame of EAs for each state was first sorted by urban and rural stratum, and then by geographic code (county, payam, boma and EA number) within each stratum. The 10 percent sample of EAs for the LFQ was then selected systematically with equal probability from this ordered frame for each state. To save time and increase the accuracy of the sample selection, formulas were added to the sampling frame spreadsheets to generate the random start and select the systematic sample. Given the geographic ordering of the sampling frame, this strategy provided a highly representative sample of EAs within each county of each state for the LFQ enumeration.
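The selection procedure described above can be sketched in Python. This is only an illustration, not the spreadsheet formulas actually used: the field names (`stratum`, `county`, `payam`, `boma`, `ea`), the function names, and the toy frame of 100 EAs are all assumptions made for the example.

```python
import random

def sort_frame(frame):
    """Order the frame by stratum, then by geographic code within stratum."""
    return sorted(
        frame,
        key=lambda ea: (ea["stratum"], ea["county"], ea["payam"], ea["boma"], ea["ea"]),
    )

def systematic_sample(ordered_frame, interval=10, seed=None):
    """Equal-probability systematic sample: random start, then every interval-th EA."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start in 0..interval-1
    return ordered_frame[start::interval]

# Hypothetical frame: 2 strata x 5 counties x 10 EAs = 100 EAs (illustrative only).
frame = [
    {"stratum": s, "county": c, "payam": 1, "boma": 1, "ea": e}
    for s in ("urban", "rural")
    for c in range(1, 6)
    for e in range(1, 11)
]

ordered = sort_frame(frame)
sample = systematic_sample(ordered, interval=10, seed=2008)
print(len(ordered), len(sample))
```

Because the frame is sorted geographically before selection, an interval of 10 spreads the selected EAs across strata and counties, which is the implicit stratification the text refers to.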
Response Rate
99.8%
Weighting
The basic design weight for the LFQ data is the inverse of the probability of selection of the sample EAs and households. Given that a 10 percent systematic sample of EAs was selected with equal probability within each state, the basic weight for the LFQ sample should be close to 10. However, as described in the previous section, the original sample was not strictly followed: there was a considerable amount of substitution, many EAs had both LFQ and Short Form Questionnaire (SFQ) data, and the average number of households was smaller in the EAs with exclusively LFQ data. As a result, the average LFQ household weight for South Sudan is about 14.1720, implying an overall 7.06 percent sample of households. The average household weight varies by state, from 12.5600 for Western Equatoria to 15.7625 for Central Equatoria, reflecting the different LFQ sampling results in each state.
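The arithmetic linking weights and sampling fractions can be checked with a short Python sketch. The function names are illustrative, and the default of 1.0 for the within-EA household selection probability is an assumption (that every household in a selected EA received the LFQ), not something stated in the source.

```python
def design_weight(p_ea, p_household_within_ea=1.0):
    """Design weight = inverse of the overall selection probability.

    p_household_within_ea=1.0 is an assumption: all households in a
    selected EA are taken to be enumerated with the LFQ.
    """
    return 1.0 / (p_ea * p_household_within_ea)

def implied_fraction(average_weight):
    """Sampling fraction implied by an average weight."""
    return 1.0 / average_weight

# Nominal weight for a 10 percent equal-probability sample of EAs.
nominal_weight = design_weight(0.10)

# Average household weights quoted in the text.
average_weights = {
    "South Sudan": 14.1720,
    "Western Equatoria": 12.5600,
    "Central Equatoria": 15.7625,
}
for area, weight in average_weights.items():
    print(f"{area}: {implied_fraction(weight) * 100:.2f} percent of households")
```

For the national average weight of 14.1720, this reproduces the 7.06 percent overall sampling fraction quoted above.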
Given the availability of the 100 percent household and population count data from the 2008 Sudan Census, it was possible to calculate the LFQ weights using the 100 percent census count for the in-scope population as the numerator and the sample count as the denominator within each stratum. To determine the best post-stratification level for calculating the weights, it was important to examine the geographic distribution of the census population counts (from both the SFQ and LFQ) for the in-scope population, together with the corresponding distribution of the sample population enumerated with the LFQ. It should be noted again that LFQ sampling was used for the following types of households: private households, internally displaced persons, and refugees.
In order for the distribution of the LFQ weighted population estimates to be consistent with the distribution of the 100 percent census count data by sex and 5-year age groups, the LFQ sample was first post-stratified at this level. The preliminary tabulation plans for the LFQ data included tables by sex and 5-year age groups. The highest age group in the tables was 75+ years, so this grouping was also used for calculating the weights. This post-stratification of persons by sex and 5-year age group also improved the LFQ sample estimates of characteristics that are correlated with sex and age group, such as education, labor force and fertility.
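A minimal Python sketch of this post-stratification follows, assuming each cell is defined by an area, sex and 5-year age group with 75+ as the top group. The counts, the "State A" label and the function names are hypothetical, and the actual weighting procedure may have included steps not shown here.

```python
from collections import Counter

def age_group(age):
    """Map an age in years to a 5-year group, with 75+ as the top group."""
    if age >= 75:
        return "75+"
    lower = (age // 5) * 5
    return f"{lower}-{lower + 4}"

def cell_of(person):
    """Post-stratification cell: (area, sex, age group)."""
    return (person["state"], person["sex"], age_group(person["age"]))

def poststratified_weights(census_counts, sample_persons):
    """Weight per cell = 100 percent census count / LFQ sample count."""
    sample_counts = Counter(cell_of(p) for p in sample_persons)
    return {cell: census_counts[cell] / n for cell, n in sample_counts.items()}

# Hypothetical counts for illustration only (not census figures).
census_counts = {("State A", "F", "0-4"): 1200, ("State A", "M", "75+"): 90}
sample_persons = (
    [{"state": "State A", "sex": "F", "age": 2}] * 100
    + [{"state": "State A", "sex": "M", "age": 80}] * 6
)

weights = poststratified_weights(census_counts, sample_persons)

# Weighted sample totals reproduce the census counts in every cell.
estimated = Counter()
for person in sample_persons:
    estimated[cell_of(person)] += weights[cell_of(person)]
print(weights)
```

By construction, the weighted LFQ totals match the census counts in each cell, which is the consistency property the paragraph describes.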
As described in the report on the LFQ sample design, the area sample of EAs selected for the LFQ only provided reliable results down to the state level. Because many of the LFQ estimates were tabulated at the state level, this was the first geographic level used for calculating the weights. Tabulations of the distribution of the population by sex and age group in the frame and the LFQ sample were also examined at the county and payam levels. As expected, the weights were very unstable at the payam level, and some individual payams did not have any LFQ sample. Even at the county level, the household weights varied considerably.
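The instability at finer geographic levels can be illustrated with a small Python sketch that computes frame-to-sample ratios per area and flags areas with no sample. The payam names and counts are hypothetical, and `weights_by_area` is an illustrative helper rather than part of the census processing.

```python
def weights_by_area(frame_counts, sample_counts):
    """Return (weights, empty): ratio weights per area and areas with no sample."""
    weights, empty = {}, []
    for area, n_frame in frame_counts.items():
        n_sample = sample_counts.get(area, 0)
        if n_sample == 0:
            empty.append(area)  # no weight can be computed for this area
        else:
            weights[area] = n_frame / n_sample
    return weights, empty

# Hypothetical household counts for three payams (illustrative only).
frame_counts = {"Payam 1": 400, "Payam 2": 350, "Payam 3": 120}
sample_counts = {"Payam 1": 50, "Payam 2": 14}

payam_weights, payams_without_sample = weights_by_area(frame_counts, sample_counts)
print(payam_weights)
print(payams_without_sample)
```

In this toy example the weights range from 8 to 25 and one payam has no sample at all, which is the kind of behaviour that makes a coarser level such as the state preferable for weighting.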