A civilization of regions

It is well known that in Western Europe the south of Italy is the poorest region. It is less well known, though not totally surprising, that regions of northern Italy such as Lombardy are among the wealthier areas of Western Europe. The aggregation of some of Europe’s wealthiest and poorest regions into one nation, Italy, obscures some very interesting fine scale trends. But since this is a weblog about brown-folk I’m not going to be discussing variance in statistics across the European Union. Rather, I want to address the issue of variance of statistics, and culture, across South Asia: India, Pakistan, Bangladesh, Nepal, and Sri Lanka (Bhutan is so small that I will leave it out of this treatment).

Intuitively we know that comparing Sri Lanka to India, or India to Pakistan, is apples and oranges. In terms of administrative units in the post-Westphalian age they’re equivalent, but we know that a nation of 20 million (Sri Lanka) and one of over 1 billion (India) are ludicrously mismatched. Even when comparing Pakistan to India, you have to face the fact that one state of India, Uttar Pradesh, is more populous than all of Pakistan! If Uttar Pradesh were a nation unto itself it would be the fifth most populous in the world.

Obviously, as a broader region, South Asia, what would once have been simply termed India, has real coherency. It does so genetically and geographically. No matter the religious, social, or class background of a brown desi individual, in the rest of the world one is marked off clearly as being “Indian.” And yet being brown does not mean that one is part of an amorphous whole. We know this well, with the cacophony of languages, religions, and cultures, which all partake of the brown identity. This is no different from other civilizations and regional identities. A person from Finland and a person from Portugal are both white Europeans, but very different in deep and significant ways. Similarly, a Kenyan may have little understanding of the details of the life of a Liberian, one being East African, the other West African.

With all that diversity understood, reading this weblog since 2004 has nonetheless exposed me to a great many surprises in cultural variation. Here is a 2005 post from Anna on the late, great, television show E.R., in relation to the main brown character, Neela, getting married to another doctor. The first commenter observes:

Hey, A N N A, I lurve ya, but don’t be dissing Kovac… he’s my future husband. 😉 He just suffers from being stuck with some of the worst writers in television. What’s up with a white sari at a wedding though? I found that a little odd.

And here’s a quote from a later post of a movie reviewer:

The representations tread the line between cultural authenticity, sometimes considered stereotype, and colorblindness. The women exhibit some level of conflict with their cultures and are slightly neurotic: Ming Na dreaded telling her immigrant parents that she was having a baby out of wedlock; Nagra quit her job in a bout of rebellion against family expectation to work as a convenience store clerk. The men are dangerous but tender. Phifer grew up without a father and has a temper; Gallant went off to serve in Iraq. I did laugh at the effort to bridge cultures, though, when Nagra’s character got married wearing a white sari. White is the Hindu color of mourning.

My own grandmothers had worn white saris after the deaths of my grandfathers, so I too associated white with death. This is not uncommon across Asia. But the assumption is very problematic in South Asia, as you can see in the comments. I’ll quote one which gets to the heart of the issue:

A General Rant Directed at Sepia Mutiny: I see so often people posting here speaking of how things are done in South Asia/India, when what they are really talking about is how things are done in their grandparents house.

Those of us who grew up in the United States during a time when we were the “foreign kid” or the kid from “India” implicitly represented being “Indian” to the rest of the world. Now, for me this is ironic insofar as I was born in Bangladesh, and you have to go back to my great-great-grandparents’ generation to find anyone born within the boundaries of the current state of India (my maternal grandmother’s paternal grandfather was from Old Delhi). But I look Indian, and frankly no one knew where Bangladesh was, unless they were socially conscious in the early 1970s, or watched a lot of Sally Struthers. But one part of representing brownness to the rest of the world is that you implicitly and unconsciously conflated your own specific representation of brownness with brownness more broadly. When confronted with the concrete range of diversity of authentically brown cultural forms and folkways the water recedes, so to speak, and your own very ethnocentric superstructure, long hidden, is exposed.

I could go on in this personal vein, but let me tack back to statistics, and administrative boundaries. My Canuck nemesis Ikram Saeed usefully pointed out recently that of the nation-states of South Asia only Bangladesh really qualifies in a conventional sense as a nation-state. That is, Bangladesh is a relatively homogeneous state ethnically, so that the national identity and political identities are nearly coterminous. In contrast, Sri Lanka and Pakistan have long been riven by ethnic and religious tensions. India has been a serviceable coalition between various linguistic groups (themselves divided along religious and caste lines). And finally, Nepal has a marginalized culturally Tibetan minority.

More importantly, India clearly has a coherent identity as the world’s largest democracy, but aggregating economic and social data on the national level, and using it to compare India to its neighbors, is possibly very misleading. So I decided to break out some of the data I could find by Indian state, and compare them to Pakistan, Bangladesh, Nepal, and Sri Lanka. You can see the original data in csv format here. I have a methodological addendum at the bottom of this post.

So the charts….

indiaAREA.png

indiaPOP.png

popDENSITY.png

indiaSEXRATIO.png

indiaLIT.png

indiaTFR.png

indiaLIFEEXPEC.png

indiaGDP.png

indiaGDPCAPITA.png

Finally, I think it might be useful to also lay out a “bubble plot.” The x axis is GDP per capita, the y axis total GDP, and the size of the bubble is proportional to population. Excuse the obscuring of some labels due to overlap.

bubbleplot.png
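For readers who want to reproduce a plot like this, here is a minimal sketch in Python. It is not the code used for the chart above; the function names, the `max_area` default, and the use of matplotlib are all assumptions for illustration. The one real design point is that the bubble's area, not its radius, should scale with population, which is how matplotlib's `scatter` interprets its `s` argument.

```python
def bubble_areas(populations, max_area=2000.0):
    """Scale populations to marker areas so that area is proportional to population.

    max_area is a hypothetical choice: the area (in points squared) given to
    the most populous unit. Every other unit gets a proportionally smaller area.
    """
    largest = max(populations)
    return [max_area * p / largest for p in populations]


def plot_bubbles(labels, gdp_per_capita, total_gdp, populations):
    """Draw the bubble plot: x = GDP per capita, y = total GDP, size = population."""
    # matplotlib is imported here so the scaling helper above works without it.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(gdp_per_capita, total_gdp, s=bubble_areas(populations), alpha=0.5)
    for label, x, y in zip(labels, gdp_per_capita, total_gdp):
        ax.annotate(label, (x, y))
    ax.set_xlabel("GDP per capita")
    ax.set_ylabel("Total GDP")
    return fig
```

Semi-transparent markers (`alpha=0.5`) help a little with the label overlap mentioned above, though they do not eliminate it.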

For me the biggest surprise is how big Pakistan is. I assumed that the data was entered incorrectly, but it seems that Pakistan is just really big. Far bigger than Uttar Pradesh. The second surprise was the GDP of a state like Tamil Nadu. The fact that it’s not very poor, like Bihar, but also relatively populous, means it has a large GDP heft. Larger than Gujarat and Punjab.

Methodological notes: Most of the Indian data is from this Wikipedia entry. In its turn most of the data within that entry is from the Indian census, or various health and economic surveys. The life expectancy data I mostly got from Google books. Some states had life expectancy for the two sexes disaggregated. If I couldn’t find it averaged, I did so myself. For the non-Indian states I used the World Bank data sets on Google data explorer. I simply lined up the years between that and the Indian data. For the GDP I had to re-weight. The Indian government doesn’t seem to calculate GDP the same way as the World Bank, because the Indian aggregate and state level per capita GDP figures were way too low. So I simply shifted the non-Indian GDP in proportion to how much lower the government estimates were than the World Bank’s (i.e., the GDP figures for Pakistan and Bangladesh are lower than you’ll find for that period online because I changed them to align with the Indian data). I’m still not totally satisfied with this, as I am suspicious that Bangladesh had a higher per capita GDP than Uttar Pradesh in the time period noted. But it is what it is. Also, the literacy rate numbers vary quite a bit, so they may not be totally comparable cross-nationally because what counts as literate varies depending on the criteria laid out. Life expectancy and total fertility rate on the other hand should be robust survey to survey assuming representativeness and no obvious data manipulation.
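The two adjustments described in these notes can be sketched as follows. This is an illustrative reading of the notes, not the original calculation: the function names are hypothetical, the life expectancy average is assumed to be a simple mean of the male and female figures, and the re-weighting is assumed to be a single ratio (Indian government estimate divided by the World Bank estimate for India) applied to each non-Indian World Bank figure.

```python
def average_life_expectancy(male, female):
    """Simple mean of the two sex-specific figures.

    Assumption: an unweighted mean. A population-weighted mean would differ
    slightly wherever the sex ratio is skewed.
    """
    return (male + female) / 2.0


def rescale_to_indian_basis(world_bank_value, india_gov_value, india_world_bank_value):
    """Shift a non-Indian World Bank GDP figure onto the Indian government's scale.

    The ratio measures how much lower the Indian government's estimate for
    India is than the World Bank's estimate for India; the same ratio is then
    applied to the other country's World Bank figure.
    """
    ratio = india_gov_value / india_world_bank_value
    return world_bank_value * ratio
```

With made-up numbers: if the government estimate for India were 600 and the World Bank estimate 800, the ratio is 0.75, so a neighbor's World Bank figure of 1000 becomes 750. Because the ratio is computed from India alone, it preserves the World Bank's ordering among the non-Indian countries but cannot correct for methodological differences specific to each country.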

Also, you can see above that I removed many small administrative units as well as city-states. Cities in India are always very different from the hinterlands, so they don’t give a good representative picture. By analogy, it’s like discarding Luxembourg from European Union data sets when not weighting by population, since it’s so atypical. I wanted to give a sense of geographical variation, and aggregating city-states with more expansive units causes confusion.
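The Luxembourg analogy comes down to the difference between an unweighted and a population-weighted average. Here is a small sketch of that difference; the function names are hypothetical and the figures in the usage note are made up purely for illustration, not taken from the data set.

```python
def unweighted_mean(values):
    """Every unit counts equally, regardless of how many people live in it."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Each unit counts in proportion to its weight (for example, its population)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up illustration: two large poor units and one tiny rich city-state.
example_gdp_per_capita = [500.0, 700.0, 5000.0]
example_population_millions = [100.0, 80.0, 1.0]

example_unweighted = unweighted_mean(example_gdp_per_capita)
example_weighted = weighted_mean(example_gdp_per_capita, example_population_millions)
```

In this made-up case the unweighted mean is pulled far above either large unit by the tiny rich one, while the weighted mean stays close to the two large units. That is the sense in which an atypical small unit distorts a picture that is not weighted by population.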

Finally, obviously you should put more weight on intra-Indian comparisons, because the methodology will be more standard than when you compare across nations. Calculations of GDP can be an art more than a science. Additionally, metrics like literacy are notoriously fudged (some sources give Kerala a literacy rate of 85%, while others give 100%).

60 thoughts on “A civilization of regions”

  1. The data for “Total fertility Rate” for Andhra Pradesh doesn’t seem to be right, it can’t be lower than Kerala (I am from Andhra Pradesh)

  2. re: TFR kerala vs. A.P., my intuition is with you. though the difference is only 0.1. a more recent survey gives a TFR of 1.7 for kerala vs. 1.8 for A.P. i generally used numbers for years where i could get the most states, so i didn’t use that….

    http://www.jsk.gov.in/total_fertility_rate.asp

    the TFRs don’t seem to be counts, but surveys based on samples from the broader population. so caution is warranted.

  3. the bigger point re: TFR is that there is a huge “natural break” between pakistan, U.P. & bihar, and the rest of south asia (small states like nagaland are not demographically too important). this is interesting insofar as indian punjab is a low TFR “valley” between these peaks.

  4. I’d be interested to see the same charts with Pakistan similarly broken down into Punjab, Sindh, etc. I suspect you’ll see a lot more variation with Punjab and Sindh being way up on things like literacy and GDP while everywhere else stays low and drags the aggregate down.

    I always knew Maharashtra was richer than the rest of the country but damn I never realized the extent of it. Also Bihar barely has $200 per head per year? That appalling!

    The data for “Total fertility Rate” for Andhra Pradesh doesn’t seem to be right, it can’t be lower than Kerala (I am from Andhra Pradesh)

    It’s barely lower. Statistically numbers that close are basically identical.

  5. Also Bihar barely has $200 per head per year? That appalling!

    the mid-2000s was kind of a nadir. though when it comes to really poor areas like bihar or bangladesh, i do wonder about the big effects which small data manipulation can have…. that’s why i like life expectancy and what not more.

  6. I always knew Maharashtra was richer than the rest of the country but damn I never realized the extent of it.

    if delhi was part of U.P. it wouldn’t look so bad. so it’s a little deceptive comparing U.P. to maharashtra + delhi i suppose. and you’re spot on about pakistan. when i have time i am curious to look up the % of pashtuns in 1981 vs. today.

  7. This is great! Thanks, Razib.

    I hope we keep this as a “sticky thread” and keep updating it with statistics from the ongoing census in india. Perhaps we should split states into urban and rural parts as well.

    It might be a good idea to add an additional data point—Mumbai alone. In my reckoning, using fairly old statistics, Mumbai alone has GDP that is probably close to half or even greater than half of the GDP of MH, and as such, deserves a data point by itself.

    It would be so great if we have a “fivethirtyeight” like statistical goldmine for the subcontinent :).

  8. the india census website is pretty confusing. i used to check in on it… i am thinking of doing some GIS based visualization at some point. in particular, generating more fine-grained district level thematic maps. i wish that every gov. would emulate eurostat.

  9. It would be so great if we have a “fivethirtyeight” like statistical goldmine for the subcontinent :).

    if you or anyone ever starts such a site, be aware that my pagerank will be at your service 🙂

  10. I am getting married in September, and I will be wearing a white sari for the wedding–as my ancestors have for as far back as the photographs go, because we are Christian.

    These statistics are fascinating. Thanks for making them easier to understand.

  11. I’d be interested to see the same charts with Pakistan similarly broken down into Punjab, Sindh, etc. I suspect you’ll see a lot more variation with Punjab and Sindh being way up on things like literacy and GDP while everywhere else stays low and drags the aggregate down.

    While that’s correct, it’s worth keeping in mind that Punjab and Sindh together make up almost 80% of Pakistan’s population (according to the last census in 1998; another census scheduled this year).

  12. Tamil Nadu has always been a leading state in terms of overall development. No surprises there. Anyway, what is the point of all this analysis?

  13. Very interesting! I always knew that Maharashtra state was the most productive state. Moreover, most of the GDP is from Mumbai by mostly non-Marathis.

    Here is my opinion to make your analysis more robust for TFR, literacy, life exp., and other demographic data: You should average 2 years of data together. That should be more accurate.

    Regarding the economic statistics of India – treat them all with suspicion. A few years ago in ’06, the PPP GDP of India and Bangladesh deflated by 40%, but their economies grew robustly. What happened? Well, economists figured that they used a linear model since the ’70s or ’80s for certain consumer staples which are needed for the computation of purchasing power (deflated costs, but the real costs were much higher). That’s why India’s PPP in ’05 was like $3600, and then a year later, it was $2600.

    Finally, whenever I read about the PPP Per Capita of Pakistan, I’m so shocked that their numbers compare quite well to India’s, as do their financial markets. However, we always hear on the news that their country is financially very unhealthy. Their PPP GDP/capita is only like 10% less than India’s.

  14. those weren’t PPP. i used nominal because of the issues with PPP. some of the data were from 4 year surveys (you can see at the links). so from 2002-2006, so i put 2004.

  15. Interesting study, good effort. It reminds me of that saying, though – “There are lies, there are damned lies and there are statistics”. Just kidding, but you have to admit that a lot of these stats are unreliable and old at best.

    One point I would just mention is that if you compared New York State or California to the rest of the US or London to the rest of Britain they would be far ahead in terms of GDP. It tends to be the case all over the world that a few states or areas dominate in any country. That is not unusual in both developed and developing countries. Also, I’m not sure you can compare states to countries directly. I think as someone mentioned, you really need to break Pakistan, Bangladesh, Sri Lanka down into states too.

    Overall, however, it is a good thing that Pakistan does well economically, as well as Bangladesh and all the neighbours of India. It should lead to better relations between all the countries in that area as war and terrorism become less important relative to peace, prosperity, mutual trade and economic growth. The better the region does as a whole, the more peaceful it should become, in my opinion.

  16. I think as someone mentioned, you really need to break Pakistan, Bangladesh, Sri Lanka down into states too.

    sri lanka has 20 million people. pakistan has 170 million. pakistan needs to be broken down…as does western UP + delhi vs. eastern UP. sri lanka, not as much. at least if you want to compare it to indian states.

  17. I read somewhere that the GDP statistics for Maharashtra and especially Bombay may be skewed because a number of national and international companies have headquarters there and they report their income at headquarters. Don’t know if that is true and how much it affects statistics.

  18. great stuff, razib – but i think pakistan should be broken out to sindh/punjab/other.

    44% of the Pakistani land mass is balochistan which also has a population of 7 million – so that definitely skews things a lot.

    It’s sad to read of the appalling conditions of the indian heartland – UP/Bihar/MP – so much potential and yet such enormous poverty…

  19. Wow – I can see the effort that went into this blog post. I don’t have anything interesting to say or add to the conversation, but I just wanted to say that the results are fascinating. Thanks.

  20. I would be interested in seeing similar data on governance quality between states. Any such data out there, i.e., some kind of corruption index? Also, I agree with the previous commentator that Pakistan needs to be broken up by states as well. Sindh and Punjab form the economic heart of Pakistan and account for the bulk of the population.

    Also, this new anonymous comments policy is totally pointless. If I really wanted to troll, I could very easily setup a fake facebook profile. I am sure Razib could find some data on how this policy does more harm than good by suppressing legitimate commenting.

  21. hi razib, thank you for this very interesting post. i am not sure how one does this but i am wondering if there is a way to manipulate the data to show the correlation between sex ratio, gdp, and literacy, all in one place like the bubble chart. thanks.

  22. very fine post my man!

    Incredible detail.

    Twitter me if you ever feel like compiling statistical data on texas hold em.

  23. Thank you Razib for your efforts. IMHO just one important factor is missing: average growth rates for the last few years. For ex., Pakistan may seem to be doing well, but per capita GDP has stagnated over the last few years. Bihar is in bad shape but its growth rate is at 11%. This will give an idea as to where they are headed.

    • Well as long as Razib is sitting on a bottomless trove of data I’d also like to see morbidity rates, PPM pollutants urban, PPM pollutants rural, CO2 emissions aggregate, Tons of CO2 emission per unit GDP, cost of living, median income, average income, GINI coefficient, water use, potable water supply, and so on and so forth.

      Bihar is in bad shape but its growth rate is at 11%

      If you make $5 an hour and I hand you a $5 an hour raise then congratulations! Your income is growing at 100%! That growth is nice, but we’ll see how long it lasts. As it stands I’m assuming that’s mostly remittance from young Bihari lads doing migrant labor around the rest of the country, but I could be wrong.

      • Yoga Fire said: “If you make $5 an hour and I hand you a $5 an hour raise then congratulations! Your income is growing at 100%! That growth is nice, but we’ll see how long it lasts. As it stands I’m assuming that’s mostly remittance from young Bihari lads doing migrant labor around the rest of the country, but I could be wrong.”

        I agree that Bihar is growing from a very low base. But the growth is coming because of a huge improvement in governance which hopefully is a long term phenomenon. Read about their superb new CM Nitish Kumar. http://economictimes.indiatimes.com/features/nitish-kumar-bihars-change-agent-charms-all/articleshow/5435106.cms

  24. i am not sure how one does this but i am wondering if there is a way to manipulate the data to show the correlation between sex ratio, gdp, and literacy, all in one place like the bubble chart.

    first, let’s use the word analyze, not manipulate. anyone with some knowledge of R and my csv file could do what you’re suggesting easily with the symbol function. i’ll do that in a bit, have to go do errands.

    most of the work here is finding data which used similar methodologies. i thought about breaking out pakistan, but couldn’t find data sources so easily.

  25. Goa is India’s richest state by GDP per capita. It has a greater population than Arunachal Pradesh and Mizoram, which are in your chart.

  26. A good effort, but the Indian GDP data looks funny. On a per-capita basis Maharashtra is richer, but not that much more than Gujarat or Tamil Nadu. And incomes in Haryana are higher than in Maharashtra. Tamil Nadu and Gujarat have about the same nominal GDP.

    As far as I can make out, Table 1.8 in this link has accurate per-capita GDP data. But the nominal GDP data in Table 1.7 is patchy – Gujarat and Tamil Nadu are both reported in terms of GDP in constant prices, when other states like Maharashtra are reported at current prices.

    http://indiabudget.nic.in/es2010-11/estat1.pdf

  27. First of all, excuse me for my bad writing. I’m terrible at analyzing and writing, especially on desi-related topics.

    “It is well known that in Western Europe the south of Italy is the poorest region. It is less well known, though not totally surprising, that regions of northern Italy such as Lombardy are among the wealthier areas of Western Europe. The aggregation of some of Europe’s wealthiest and poorest regions into one nation, Italy, obscures some very interesting fine scale trends. “

    As a Desi who has lived in desiland and Italy — terrible comparison. The Desi mentality of my aunts, uncles and cousins back in the homeland is scary; I am vilified for not being an engineer or doctor or someone who is ‘successful’ in their eyes. I had the chance to interact with a decent number of Italian families in northern and southern Italy and I found them to be genuinely family-oriented. Can’t say the same about Desi families back in the homeland. Indians have some obsession with Italy; I don’t know why our brown *sses think we look Italian or something.

    Looking at the GDP chart, it’s interesting how Sri Lanka and Punjab overlap each other. So Punjab has the same GDP as Sri Lanka? Well done, Punjab. And I see Orissa, Jharkhand and Chhattisgarh have the same GDP — if they combine those 3 states into 1 — they could develop it into Punjab/Gujarat.

    Thank you for the informative post.

    Good Luck Sepia Mutiny. I look forward to reading more of your threads, Mr. Razib Khan.

    Cheers!

    • Actually, it is not such a terrible comparison. One of the keys to developing pride and confidence in desis in their nations is to show that western countries are just as prone to inequalities and stupidities as eastern countries are. The base levels are just higher. Showing how there are vast gaps in equality in western countries helps desis to see this and lose their ridiculous idea that somehow phoren is better than us. True, desi countries still have a long way to go but they are getting there and they are having to work through a lot of baggage right now that has been with them for quite some time.

      Also, the comparison was an economic one, not a socio-cultural one. In addition, the idea that Italians are more family-oriented than Desis is just delusional! It may be your experience, but I would wager that 99% of desis would disagree.

      I realize by saying that I have just opened the door for that 1% to make their voices heard! Bring it on…

  28. Some other statistics of interest, which are usually reported in census data in India:

    1. Calorie intake, disposable income (income that is not spent on essentials)
    2. Availability of water per capita
    3. Percentage in primary school, dropout rates…
    4. NREGA data

    “the india census website is pretty confusing. i used to check in on it… i am thinking of doing some GIS based visualization at some point. in particular, generating more fine-grained district level thematic maps. i wish that every gov. would emulate eurostat.”

    I agree. Right now, I don’t think the census is complete, but at some point, the ISI (Indian Statistical Institute, not the more notorious organization that is always up to no good) publishes extensive summaries. Perhaps you would be able to collaborate with them as well?

  29. Sri Lanka and Kerala (and I guess Goa also) are the highest in Human Development Index in South Asia. Why aren’t corporations going there?

      • Put another way: If I’m trying to sell bottles of Mirinda, I don’t really care how literate, educated, or healthy my Mirinda buying customers are. The illiterate domestic servant is just as valuable to me as the literate one. The guy without access to a doctor is just as important as the one with.

        • I disagree with this much simplification. Tech companies go into other states because, quite frankly, higher education is what attracts tech companies, not primary education. Manufacturing goes into TN or Gujarat since the labor force is more skilled and efficient, coupled with better infrastructure, universities and governance.

          Ironically, bottling Mirinda is in Kerala btw :).

          The HDI is not a catch-all index. At least in India, much more refined indices are essential. Kerala is a good example to emulate in many ways (healthcare), but it is a horrible example in some (education, as opposed to literacy).

  30. Razib, thank you for the chart. high gdp does not seem to equalize the sex ratios, but high literacy seems to do a bit. i presume this is sex ratio at birth?

  31. “Sri Lanka and Kerala (and I guess Goa also) are the highest in Human Development Index in South Asia. Why aren’t corporations going there?”

    Kerala has been a communist anti-business state for many decades. Its people will strike at the drop of a hat and the government being of a communist bent will support it. It is very difficult to run a big corporation there – especially manufacturing (even now Coke bottling has all kinds of problems there). Malayalis who can leave the state do so as soon as they can, either for other states or for the Gulf. The HDI is high because huge remittances from expats abroad (esp. the Gulf) make it a rich state that can afford its socialist ways. The attitude to business is slowly changing now. Kerala has its own hi-tech technology parks but big changes do not happen overnight.

    Sri-Lanka is actually doing quite well. For ex., it is one of the largest destinations for outsourced accounting.

  32. Hi Razib,

    Do you have this data in nice spreadsheet format? Like Excel or Google Docs?

    I’m interested in creating some unified data sets so people can run regressions, etc. on them.

  33. I heart Razib Khan.

    Tamil State has done well in spite of the pro-Hindi ways of Delhi. Delhi offers farming subsidies to wheat farmers in the heartland of India (mostly N. Indian farmers in Punjab, Haryana, Gangetic Plains), but not for the rice farmers of South and East India.

    Moreover, S. Indians are not a business-oriented people like the N. Western Indians are. I have no clue how this state is progressing, but I’m willing to bet it has something to do with remittance payments from SE Asia, Middle East, and N. America.

  34. Economic data is seriously flawed (might be outdated), especially those belonging to my state Andhra Pradesh. 2011 budget for AP with 8 crore population is 1,13,675 crore ($25 Billion) which is greater than Pakistan’s total tax revenues for year 2010-2011. PCI of AP in 2011 is INR 55000, which translates to $1250. Do your article again with the latest statistics.

  35. btw, if u have really up to date sets of most of the indian states, feel free to email me the csv. i’ll post updates. but don’t bitch if you aren’t going to do the leg work. it was time consuming enough double checking and standardizing this data. i don’t care that andhra pradesh has grown a lot over the last 5 years, i need data on all the states in 2011. if you care that much, collect them all and send them to me, and i will replot (yes, i could do it, but it takes A LOT of time from what i saw, and i don’t care that much. if you do, do it).

    email, contactgnxp -at- gmail.com

    otherwise, shut it.

  36. Sorry for not looking more carefully.

    Anyway, I’m looking to expand Razib’s original data set and add more variables such as % in Forward Castes, % Hindu, % Muslim, etc. It shouldn’t be hard but it is the most time consuming part. I have already done some plots and work on my site and made all my Stata results completely public as how I saw them.

  37. Your data just points to the obvious: the sooner South Asia becomes one common economic entity the better. India may think it is doing great, but as your wonderful compilation shows, India is doing great in parts and Sind, West Punjab, East Bengal, Sri Lanka and Nepal fit in quite well among the states of India. Of course this is not politically viable at this time. But how long must the current state go on? If Europe can have its EU – we ought to have a SAU. Perhaps that’s the only way the Afghanistan, Burma and Kashmir problems will be solved. Economics uber alles.

  38. Tut Tut, Razib… I can’t believe you forgot the Maldives or is that country too small to fit into your analysis as Bhutan was?

  39. I should have written this earlier in the thread, but is southern Italy really poorer than Portugal?

  40. On Wikipedia (maybe not the most reliable source, but it has never failed me!) it says Moldova is the poorest part of Europe (by GDP per capita). But of course Italy is a good example here because of the massive poor/rich divide.

    • Moldova is an eastern European country which is why Razib made it a point to state that southern Italy has one of the poorest economies of western Europe.

  41. re: moldova. i was looking at EU data sets. moldova is not in the EU. specifically, i was looking at the “NUTS 2” region data.