Political parties are clamouring for a caste census. The first and only caste census since independence, conducted in 2011, was a disaster and has now been discarded by the Narendra Modi government. Seeing the Union government’s reluctance to launch a new caste census, many state governments have decided to conduct their own. That is a good idea, but it requires preparation.
The abandoned 2011 caste census contains important information that surveyors and statisticians should use to prepare for future censuses. It recorded 4.6 million caste categories, a number that most people will find difficult to wrap their heads around. This happens when open-ended questions ask people to state their caste. Some people list their caste, some their sub-caste, gotra or profession, and some their last names. This does not mean the data is bogus. It simply means enumerators need to reclassify the responses into fewer, consistent categories.
Before conducting the next census, the data from 2011 should be used to create a code book assigning the 4.6 million castes (or sub-castes, gotras, professions and last names) to fewer, consistent categories. In Western countries, surveyors receive similarly varied responses (although not nearly as many) to questions about occupation. The US Census Bureau, for instance, has a code book that assigns occupations to various categories. The code book is revised every 10 to 15 years to reflect changes in economic structure.
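A minimal sketch of how such a code book might work in practice is given here. The entries are hypothetical spelling variants chosen for illustration and are not drawn from the 2011 data; the names `CODE_BOOK`, `normalise` and `assign_category` are likewise assumptions. Responses with no entry are set aside for later review rather than forced into a category.

```python
import re
from collections import Counter

# Hypothetical code book: keys are normalised free-text responses, values are
# the consolidated categories. Entries are illustrative spelling variants only.
CODE_BOOK = {
    "brahmin": "Brahmin",
    "brahman": "Brahmin",
    "bania": "Vaishya (Bania)",
    "baniya": "Vaishya (Bania)",
    "vaishya": "Vaishya (Bania)",
}

UNCODED = "UNCODED"  # responses with no code-book entry, kept for manual review


def normalise(response: str) -> str:
    """Lower-case the response, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^a-z\s]", " ", response.lower())
    return " ".join(cleaned.split())


def assign_category(response: str) -> str:
    """Return the code-book category, or UNCODED if there is no entry."""
    return CODE_BOOK.get(normalise(response), UNCODED)


# Hypothetical raw responses as an enumerator might record them.
responses = ["Brahmin", " BRAHMAN ", "Baniya.", "vaishya", "Xyz"]
tally = Counter(assign_category(r) for r in responses)
print(tally)
```

An exact-match lookup is used deliberately: fuzzy matching could merge distinct groups by accident, so anything ambiguous is better left for a human coder.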
Broad caste-related questions should be asked through drop-down menus. The survey could start with simple yes/no questions, asking people if they identify as Brahmin, Kshatriya, Vaishya (Bania), SC, ST, a few additional major OBC categories, and other OBCs. Individuals who do not identify with any of these categories can be classified as other castes or no caste. This alone will provide valuable data untainted by confusion about sub-castes, gotras and the like. It will also avoid errors caused by misspellings in self-identification surveys.
The defunct 2011 census has become a laughing stock because it had 1.2 crore errors. The mirth is unwarranted. In a census of 121 crore people (the population in 2011), this represents an error rate of about 1%, which is not exorbitantly high. In any case, the government has spent Rs 5,000 crore on conducting the caste census. Instead of throwing out the entire census data, it can use the 99% of the data that is error-free.
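The arithmetic behind the 1% figure can be checked in a few lines. This is only a back-of-the-envelope sketch that treats each reported error as affecting one person, which is an assumption; the variable names are illustrative.

```python
CRORE = 10_000_000  # one crore is ten million

errors = 12_000_000          # 1.2 crore reported errors
population = 121 * CRORE     # 121 crore people in 2011

# Assumption: each error corresponds to one person's record.
error_rate = errors / population
error_free_share = 1 - error_rate

print(f"Error rate: {error_rate:.2%}")
print(f"Error-free share: {error_free_share:.2%}")
```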
Initial reports said the 2011 caste census had 8.2 crore errors. Fortunately, seven crore of these were rectified by state governments. Is there any documentation on why these entries were categorised as errors and how they were rectified? Such documentation could help rectify future errors and help prepare the code book for assigning public responses to listed castes. Most importantly, that code book should guide the preparation of the next census. Such an exercise may yield a few thousand additional castes or names outside the code book that can be considered later.
Before undertaking caste censuses, states should clearly define their purpose. Without a clear purpose, state-level censuses are also likely to yield chaotic outcomes that would not be any different from the 2011 census.
The primary purpose of a caste census should be to ascertain the income and asset ownership of various caste groups. Caste is an integral part of India’s political economy. Data on the economic wellbeing of various caste groups is critical for an informed discourse on how policies and practices have impacted various groups.
Data from the caste census will determine which castes should be eligible for reservations in educational institutions and jobs. One big challenge in collecting any such data is that people are likely to understate their incomes and assets once they learn the objective of the survey. Why tell the truth if it means a loss of opportunities?
With the proliferation of welfare schemes, getting truthful answers has become a challenge for all surveyors, including the NSSO. Even in innocent surveys unrelated to quotas, most respondents may assume that the surveyors are in some way linked to government agencies and that their answers will determine welfare eligibility. This problem is not unique to India. Surveyors in other countries have also faced it, and they address it by using other sources, such as macro-data or administrative data. For instance, if surveys show the purchase of one lakh cycles in a city and cycle companies report sales of two lakh, data from the survey can be adjusted upward.
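The cycle example amounts to a simple ratio adjustment, sketched below. The split of purchases across two wards is hypothetical, and the sketch assumes under-reporting is uniform across groups, which real surveys would need to verify. The function names are illustrative.

```python
LAKH = 100_000


def adjustment_factor(survey_total: float, benchmark_total: float) -> float:
    """Ratio of the external benchmark to the survey's own total."""
    if survey_total <= 0:
        raise ValueError("survey total must be positive")
    return benchmark_total / survey_total


def adjust(survey_by_group: dict, benchmark_total: float) -> dict:
    """Scale every group's survey figure by the same factor.

    Assumption: all groups under-report to the same degree.
    """
    factor = adjustment_factor(sum(survey_by_group.values()), benchmark_total)
    return {group: value * factor for group, value in survey_by_group.items()}


# Hypothetical split of the one lakh cycles reported in the survey.
survey_cycles = {"ward_a": 60_000, "ward_b": 40_000}

# Cycle companies report sales of two lakh in the same city.
adjusted = adjust(survey_cycles, 2 * LAKH)
print(adjusted)
```

If under-reporting differs by group, a single factor will misallocate the correction, so group-level benchmarks would be preferable where they exist.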
Indian policymakers can also use administrative data (on welfare use and subsidies), market data on sales, and property values by location to determine the wellbeing of populations by geography (e.g. at the city, village or block level). These data can then be used to predict income and asset ownership across population groups based on caste dispersion across geographies from the census. At a minimum, local data must be used along with micro-level survey data and censuses to minimise errors.
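One rough way to combine geographic data with caste dispersion is a population-weighted average, sketched below. All figures, block names and group names are hypothetical. The sketch makes the strong assumption that everyone in an area has that area's average income, so it can only serve as a cross-check on survey data, not a replacement for it.

```python
def group_estimates(area_income: dict, group_population: dict) -> dict:
    """Estimate each group's average income from area-level averages.

    Assumption: every resident of an area earns that area's average,
    so a group's estimate is weighted by where its members live.
    """
    estimates = {}
    for group, population_by_area in group_population.items():
        total = sum(population_by_area.values())
        if total == 0:
            continue  # no census count for this group
        weighted = sum(
            area_income[area] * count
            for area, count in population_by_area.items()
        )
        estimates[group] = weighted / total
    return estimates


# Hypothetical area-level income index from administrative and market data.
area_income = {"block_1": 100.0, "block_2": 200.0}

# Hypothetical census counts of each group in each block.
group_population = {
    "group_x": {"block_1": 300, "block_2": 100},
    "group_y": {"block_1": 100, "block_2": 300},
}

estimates = group_estimates(area_income, group_population)
print(estimates)
```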
The responsibility of conducting censuses should be with the registrar general and census commissioner of India. This office has the relevant experience and knows how challenging a census is. The office should be independent of political pressures.
Neeraj Kaushal is a professor of social policy at Columbia University, New York.