Digit preference in Nigerian census data of 1991 and 2006

Background: Censuses in developing countries are prone to errors of age misreporting due to ignorance, low literacy levels and other social, economic and cultural factors. Ages are commonly rounded, with a strong affinity for terminal digits 0 or 5. This tendency towards digit preference and/or avoidance results in age heaping, that is, a concentration of ages at certain digits. This study examined the extent of digit preference in the Nigerian census data of 1991 and 2006. Methods: This study utilized age data from the 1991 and 2006 Nigerian censuses reported in single years. The Whipple and Myers indices were used to determine the extent of digit preference. Results: Both the 1991 and 2006 census data showed the expected pattern of errors, with Whipple and Myers indices beyond acceptable levels. The Whipple index for 1991 and 2006 was 293 and 251 respectively, while the Myers index was 62.3 and 67.1 respectively. There was a strong preference for terminal digits 0 and 5, followed by 8, whereas terminal digits 1 and 9 were strongly avoided. Conclusions: The quality of age data in the Nigerian censuses is poor as a result of misreporting, and no significant improvement or difference was observed between the 1991 and 2006 censuses.


Introduction
Nigeria has a long history of census taking. The first census, confined to the old Lagos colony, was conducted in 1866. Subsequent censuses gradually involved other parts of the country, and by 2006 a total of 12 censuses had been conducted in Nigeria [1,2].
Censuses conducted across Africa have been documented to suffer from errors in the form of under-count or age misreporting [3].
Information about age is an important variable collected in a census because it provides the basis for further demographic analyses and estimations, such as population and fertility estimation [3]. Conceptually, age seems simple to collect in any demographic inquiry, but the literature indicates that it is not as simple as it seems [4]. Several authors [3,5,6] [...]. Results demonstrated that an age exaggeration of 2.5 years will bias the estimated life expectancy upward by approximately the same amount [5]. Age exaggeration also has implications for other indices that are calculated from the age of respondents. In an assessment of the quality of data used in demographic and health surveys, it was found that information was intentionally misrecorded in order to exclude some women from in-depth interviews. The effect of this misrecording was an overestimation of the total fertility rate by 2-4% and an underestimation of the under-5 mortality rate by 2-4% [7].
To date, the quality of census data has been assessed in some African countries [8,9], but not in Nigeria. Therefore, the aim of this study was to assess the age data of the 1991 and 2006 Nigerian censuses.

Methods
Reported ages in single years from the 1991 and 2006 Nigerian censuses were used and disaggregated by sex. Two methods of analysis were used: the Whipple index and the Myers blended index [10,11].
The Whipple index is a summary measure of age heaping for ages ending in 0 or 5 [10]. It is calculated by summing the persons in the age range 23 to 62 years inclusive who reported ages ending in 0 or 5, dividing this sum by one-fifth of all persons in that age range, and multiplying by 100. This measure assumes linearity in the distribution of ages in each five-year age range. The index varies between a minimum of 100, indicating no concentration or preference at all, and a maximum of 500, if only ages ending in 0 and 5 were reported. The United Nations recommended standard for measuring age heaping on digits 0 and 5 classifies the quality of data from very accurate to very bad (Table 1).
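As a minimal sketch of the calculation described above, the following Python function computes the Whipple index from single-year age counts. The function name `whipple_index` and the input layout (a dictionary mapping age to number of persons) are illustrative assumptions and are not taken from the census tabulations; multiplying the ratio by 100 follows from the stated range of 100 to 500.

```python
def whipple_index(counts):
    """Whipple index from single-year age counts.

    counts: dict mapping integer age -> number of persons (assumed layout).
    Only ages 23 to 62 inclusive are used.
    """
    # Restrict to the 40 single years of age from 23 to 62 inclusive.
    in_range = {age: n for age, n in counts.items() if 23 <= age <= 62}
    total = sum(in_range.values())
    if total == 0:
        raise ValueError("no persons reported in the age range 23-62")

    # Ages ending in 0 or 5 within the range: 25, 30, ..., 60.
    heaped = sum(n for age, n in in_range.items() if age % 5 == 0)

    # Ratio of heaped ages to one-fifth of the total, scaled to 100-500.
    return 100.0 * heaped / (total / 5.0)
```

Under these assumptions, a perfectly even age distribution gives a value of 100, and a distribution in which every person aged 23 to 62 reports an age ending in 0 or 5 gives 500.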
The Myers index was developed to detect preference and avoidance of all the terminal digits, from 0 to 9 [11]. It is based on the principle that, in the absence of age preference and avoidance, the aggregate population at ages ending in each of the digits 0 to 9 should represent 10% of the total population. The method produces a reference value for each of the ten digits as well as a summary index for the ten terminal digits, the index of preference (IP). Theoretically, the index ranges from a minimum of 0 (when all deviations from 10% are zero), which represents no heaping, to a maximum of 90 (when all reported ages are heaped at a single digit, e.g. zero).
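The blending procedure behind the Myers index can be sketched in Python as follows. This is an illustration under stated assumptions rather than the exact computation used in this study: the starting age of 10, the 80-year span of each sum, the function name `myers_index` and the dictionary input are all hypothetical choices. The summary index is taken as half the sum of the absolute deviations from 10%, which is consistent with the stated maximum of 90.

```python
def myers_index(counts, start=10, span=80):
    """Myers blended index from single-year age counts.

    counts: dict mapping integer age -> number of persons (assumed layout).
    start:  lowest age used (assumed default 10).
    span:   width in years of each sum, a multiple of 10 (assumed default 80).
    Returns (deviations, summary), where deviations maps each terminal
    digit to its deviation from 10% and summary is the index of preference.
    """
    if span <= 0 or span % 10 != 0:
        raise ValueError("span must be a positive multiple of 10")

    blended = {}
    for digit in range(10):
        # Position of this digit relative to the starting age.
        offset = (digit - start) % 10
        # Persons at ages ending in `digit` from start to start + span - 1.
        first = sum(counts.get(age, 0)
                    for age in range(start, start + span)
                    if age % 10 == digit)
        # The same sum shifted ten years upward.
        second = sum(counts.get(age, 0)
                     for age in range(start + 10, start + span + 10)
                     if age % 10 == digit)
        # Blending weights: 1..10 for the first sum, 9..0 for the second.
        blended[digit] = (offset + 1) * first + (9 - offset) * second

    total = sum(blended.values())
    if total == 0:
        raise ValueError("no persons reported in the selected age range")

    # Percentage share of each digit minus the expected 10%.
    deviations = {d: 100.0 * b / total - 10.0 for d, b in blended.items()}
    # Half the sum of absolute deviations: 0 (no heaping) to 90.
    summary = sum(abs(v) for v in deviations.values()) / 2.0
    return deviations, summary
```

In this sketch a positive deviation for a digit indicates preference and a negative deviation indicates avoidance. Using two sums of equal width means that an even age distribution yields a deviation of zero for every digit.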

Results
The Whipple indices for each sex and for the total sample indicate that the quality of the age data in both censuses was very bad (Table 2). Digit preference and avoidance as measured by the Myers index show a general preference for digits 0 and 5, while the other digits were avoided, with digits 1 and 9 being the most avoided (Table 3). There was a general improvement in the quality of age reporting according to the digit-specific Myers values; both the avoidance of certain digits and the preference for certain digits declined. For instance, Myers index for digit 1 declined from 20.2 to 16.8 while avoidance for digit 1 declined from -5.6 to 5.2. Table 4 summarizes the rank order of digit preference and avoidance for the 10 digits and shows the declines (or improvements) in the quality of the age data. Some reversal of digit preference and avoidance was found when two digits were combined: (2,8), (4,6) and (3,7).

Discussion
This study has brought to the fore the quality of Nigeria's census data, at least as far as age reporting is concerned. The analyses of the 1991 and 2006 Nigerian census data show that the quality is poor with respect to both the Myers and Whipple indices.
Results indicate a concentration on terminal digits 0 and 5 and very poor data quality. However, when disaggregated by sex, the index is slightly better for males than for females. The quality of Whipple's index has been linked to the level of literacy of the respondent or the individual declaring the age, i.e. the better the literacy, the better [...] there was an improvement in literacy levels of Nigerians and people became more aware of their exact ages. This can be deduced from the fact that more males than females are literate in Nigeria, and the Whipple index also shows this sex differential, as females had higher Whipple indices than males in both censuses. Also, the programme launched by the National Population Commission to improve birth registration coverage must have had some impact, although it was not estimated here. Furthermore, during the 2006 census, calendars of local events were developed as an aid to exact age estimation, and this would also have had some effect on age reporting.
The Myers index results show a general pattern of terminal digit preference and avoidance. A strong preference for terminal digits 0 and 5, and a strong avoidance of terminal digits 1 and 9, was noted. Digits 2 and 8 showed the smallest deviations; digit 8 showed the smallest, followed by digit 2. The negative deviations for these two digits declined between the two censuses: the deviation for digit 8 went from -0.2 in 1991 to 0.0 (no deviation) in 2006, while that for digit 2 went from -2.3 in 1991 to -1.6 in 2006. For these two digits, this observation conforms to what was observed by Nagi, Stockwell and Snavely [6].

Conclusions
It can be concluded that the quality of the age data is poor as a result of misreporting, and there appears to have been no significant improvement in the indices between the 1991 and 2006 censuses.