The decline of Hindi among American brown folk

A few days ago Taz pointed me to the fact that Census 2010 had been releasing a lot more data in the past few months. I was naturally curious, so I decided to check out the website. Mind you, I’m someone rather familiar with the older website, and have downloaded raw data sets and crunched them with R. So I was cautiously optimistic. Very cautiously. Apparently the big news is that the American FactFinder now has a “web 2.0” version which will be releasing 2010 count data. Unfortunately, the implementation of AJAX makes the site very, very slow (and beware of unsupported browsers!). And there isn’t that much 2010 information.

To review, there are decennial censuses which are straightforward counts, and, since the aughts there have been American Community Survey results which are based on a sample, and so have a margin of error (this makes county-level data less useful because for small subpopulations within a county the margin of error can be very large). I’m looking forward to 2010 results because they are going to be much more robust than the ACS data which makes the news periodically for small subgroups. You can see where I’m going with this, since the core readership of this weblog comes from a group which is itself subdivided by nationality, ethnicity, class, and religion.
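To make the margin-of-error point concrete, here is a rough sketch. It assumes a simple random sample and the textbook formula for the error on a proportion, which is a simplification of how ACS margins are actually computed; the county sample size and the subgroup shares are made up for illustration.

```python
import math

def margin_of_error(p, n, z=1.645):
    """Approximate margin of error for a share p estimated from n sampled people.

    Assumes a simple random sample; z=1.645 corresponds to a 90% confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

def relative_margin(p, n, z=1.645):
    """Margin of error expressed as a fraction of the estimated share itself."""
    return margin_of_error(p, n, z) / p

# Hypothetical county sample of 5,000 people and three made-up subgroup shares.
for share in (0.20, 0.02, 0.002):
    print(f"share={share:.3f}  relative margin={relative_margin(share, 5000):.0%}")
```

The smaller the subgroup's share of the sample, the larger the error relative to the estimate, which is why small subpopulations at the county level are so noisy.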

Though the initial intent was to find 2010 results for Indian Americans which I could compare across the older censuses, I stumbled onto some interesting language data. The pie chart is based on 2006-2008 ACS data. I’ve put a csv file of the data online. For the pie chart I removed some languages with less than 1,000 claimed speakers. The original data can be found here. The sample was limited to those aged 5 and over. If these data are correct ~2.25 million Americans spoke indigenous South Asian languages at home in the second half of the aughts.

But what about the title? Well, I found data for selected South Asian languages from 1980 to 2000. And therein one can find an interesting pattern. I have placed the csv online. You can find the full table here. Note that these tables apply only to foreign born individuals. Again, age 5 and older. For South Asians this is actually pretty close to the total population, as contrary to the readership of this weblog, the national communities are still mostly first generation immigrant in composition, and biased toward people in the 30something age bracket.
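For anyone who wants to redo the pie-chart step (dropping languages with fewer than 1,000 claimed speakers, then computing shares), something like the following would work. This is only a sketch: the column names `language` and `speakers` and the sample rows are hypothetical placeholders, not the layout or figures of the actual csv.

```python
import csv
import io

# Hypothetical rows for illustration; the real figures are in the csv mentioned above.
SAMPLE_CSV = """language,speakers
Language A,500000
Language B,250000
Language C,900
"""

def pie_shares(csv_text, min_speakers=1000):
    """Return each language's share of speakers, ignoring languages below the cutoff."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = {row["language"]: int(row["speakers"]) for row in rows}
    # Keep only languages with at least `min_speakers` claimed speakers.
    kept = {lang: n for lang, n in counts.items() if n >= min_speakers}
    total = sum(kept.values())
    return {lang: n / total for lang, n in kept.items()}

for lang, share in pie_shares(SAMPLE_CSV).items():
    print(f"{lang}: {share:.1%}")
```

Note that the shares are computed after the cutoff is applied, so they sum to 100% of the languages shown rather than of all speakers.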

[Charts: lang1b.png, lang2b.png, lang3b.png]

Unfortunately, Hindi and Urdu are not disaggregated for the first two censuses, so it’s a moot point to compare these two individually. I believe that the greater increase in Pakistani Americans from a lower population base (relatively) in 1980 has probably masked an even greater proportionate decline in native-language Hindi speech. Of course, this masks the opposite trend which I’ve seen anecdotally, the picking up of Hindi phrases and terms by Indian Americans whose parents came from non-Hindi speaking regions, whose first language is English. The rise in Bengali is probably in part due to the greater percentage growth of Bangladeshis in the USA since 1980. Last I checked, Census 2000 had about 60,000 people with some Bangladeshi heritage. There were nearly 2 million people of at least partial Indian heritage, 200,000 Pakistanis, 25,000 Sri Lankans and 10,000 Nepalis.

Someone with more time than I could probably figure out what proportion of South Asian Americans have English as their first language at home. This would include people who are born and raised here, as well as those in inter-language families (though I presume that Hindi’s popularity among Gujarati and Punjabi speakers would mitigate this trend). Overall it looks like a substantial minority of South Asians already habitually speak English at home as their first language, irrespective of natality.

49 thoughts on “The decline of Hindi among American brown folk”

  1. Unfortunately, Hindi and Urdu are not disaggregated for the first two censuses …

    I’m curious why you think this is unfortunate. Most linguists agree that Hindi and Urdu are dialects of the same language, with the most characteristic difference being the script. Lots of youngsters these days write, say, Tamil in the Latin script. I can’t see how that makes it a different language.

    • More granular data is generally preferable to less granular data. You can always aggregate components together, but you can’t split them apart without making assumptions, and assumptions reduce your confidence.

  2. I’m curious why you think this is unfortunate. Most linguists agree that Hindi and Urdu are dialects of the same language, with the most characteristic difference being the script. Lots of youngsters these days write, say, Tamil in the Latin script. I can’t see how that makes it a different language.

    could partially eliminate the differential growth of pak vs. indian communities, so a better apples-to-apples comparison of hindi with south indian languages.

  3. Although, why did you put your graphs in reverse chronological order? That seems bizarre.

  4. Although, why did you put your graphs in reverse chronological order? That seems bizarre.

    yes. i’m going to update that….

  5. Can someone please explain how to pronounce Kannada? I’m pretty sure I’ve been saying it wrong for years…

    Also…why didn’t the Census data include Dari as a language? Or I guess they just lumped it together with Farsi to make “Persian” I suppose (an ethnic group, not a language…it would be like saying Jewish instead of Hebrew…not sure why this is).

    • “Can someone please explain how to pronounce Kannada? I’m pretty sure I’ve been saying it wrong for years…”

      Wikipedia has accurate pronunciations if you have the time to figure it out.

      http://en.wikipedia.org/wiki/Kannada

      Here is my attempt

      Kuh-nuh-da

  6. Agree on the craziness of treating Hindi and Urdu as separate languages. If you remove the script issue by writing an everyday phrase in Roman letters, no one can tell if it’s “Hindi” or “Urdu”.

  7. English is my first language. I can read Tamil but written Tamil is different from spoken Tamil so I don’t understand some of what I read. I studied Sanskrit when I was in middle school at a local temple’s sunday school classes. A few years ago I tried to learn Hindi on my own using Rosetta Stone software and the Hindi alphabet is the same as the Sanskrit alphabet so I was able to pick up the reading really quickly. So back then I was able to read Hindi but I didn’t understand any of it.

  8. In my experience, there are substantial differences between Hindi and Urdu words, even in everyday speak.

    When I first moved to the US, right across from my apartment was a family of Pakistani Hindus recently converted to Islam. When my friend used words like “waalid”, “khaala”, “saalan”, I used to have to request a translation. They were a lot more comfortable with Hindi; they’d watched more Hindi movies than my sporadic Pakistani drama watching.

    I actually think it’s better to treat Hindi and Urdu as separate linguistically, simply for acknowledging differences. It’s something of a challenge when a majority identity subsumes a minority one (though in this case, I’m really not sure which is which, given the stats)

    Sadaiyappan, similar situation in my home. I’m trying to teach my daughter and her friends Hindi. They’re at the stage where they can read the script pretty well, except they don’t know what they’re reading. I think a kid friendly solution to that might be more Bollywood watching 🙂

  9. I think a kid friendly solution to that might be more Bollywood watching 🙂

    That solution’s not friendly to anyone.

  10. thanks guys…i’ve been pronouncing it like “Canada” with a Desi accent, whoops 🙂

    As for Hindi/Urdu…I consider them different dialects of one language. On Facebook (there’s a section to list languages you speak) they’re listed as “Hindi-Urdu” and I like that; it acknowledges both individually while reinforcing that they’re pretty much the same thing. Same reason Dari/Farsi are collectively listed as “Persian” I guess.

    English is the primary language in my house, but everyone also knows Urdu, Pashto, Dari, and a bit of Arabic…I can’t write or read any of them particularly well though :/

    • Now the next question. What do you call a person who speaks Kannada? Kannadian?

      (Yes I know the answer is Kannadiga, but Kannadian is more fun.)

  11. Also, when you’re referring to both Hindi and Urdu I believe the term is “Hindustani.”

    The languages are undergoing a bit of divergent evolution these days though. Urdu speakers tend to use more Arabic and Farsi loanwords while Hindi speakers tend to use more Sanskrit loanwords. Right now it’s not much of a difference, but over time they will probably end up evolving to be pretty distinct. I suppose it’ll be something like Spanish and Italian. You can pretty easily pick one up if you know the other, but they’re unmistakably different languages.

  12. the disjunction between fluency and literacy is interesting. i’m marginally fluent in bengali, but have always been illiterate. because of the diglossia which u can find in bengali (most evident for those of us far from the cultural core of bengal in the west) that means that depending on social context intelligibility varies a great deal.

  13. I think data like this still continues to be heavily influenced by the predominance of DBDs in the population. I dare say a 3rd generation family of S.Asian descent, assuming that they have continued to marry S. Asians will not list Hindi or Tamil or whatever… So this is almost transient data

  14. I’m actually fairly ill today (unfortunately) so can’t muster more than a few thoughts.

    Urdu actually borrows from Dari, rather than Farsi; that’s why a lot of Urdu words are archaic or medieval Persian. Also, on the idea that Arabization or Persianisation is somehow a new and ongoing process to differentiate the language: well, Dakhini Urdu is the one that seems to be particularly ornate and flowery.

    Also there is a schedule in the Indian constitution where Hindi must borrow words from Sanskrit, to my knowledge there isn’t a similar one in the Pakistani one (arguable though how often we adhere to the constitution).

    It’s interesting how a single comment (Unfortunately, Hindi and Urdu are not disaggregated for the first two censuses, so it’s a moot point to compare these two individually.) diverges the thread so rapidly. Also I’ve written on Hindi/Urdu (http://www.brownpundits.com/2011/02/05/answer-to-the-hindu-urdu-question-gandhis-hindustani/); I can copy and paste but it will be a very long comment. But a salient point is here:

    Are Hindi and Urdu the same language? Yes and no, they are one and the same but there’s been a conscious effort to wedge them apart. Incidentally one of the prevailing narratives is that Hindi/Hindustani was used by “Muslims”, who turned Urdu (with the help of the “Imperialist & conniving” British) into a badge of separate identity as a way to disassociate from their “Indic origins”. However Urdu should not be taken as some Muslimification or reactionary element of Muslims against “India” or the Brits; its liturgical tradition is in fact longer (by a century at least) than contemporary Hindi (which can be traced to mid 19th century Fort William as having been regularised and standardised).

    • There needs to be an edit function. Just to clarify obviously Urdu is an “Indic” language (but distinct) and liturgical tradition also has a secular dimension to it.

  15. So this is almost transient data

    if immigration rates remain high, then transience and perpetuity lose distinctive meaning. though point taken.

  16. Razib, do you think the decline in the % of Hindi could be related to simply changes in the demographics of new immigrants?

    I’m sure the % of South Indians increased starting in the late 1990s.

  17. Razib, do you think the decline in the % of Hindi could be related to simply changes in the demographics of new immigrants?

    I’m sure the % of South Indians increased starting in the late 1990s.

    yes. H1B’s? family reunification tends to amplify/solidify early migration patterns in terms of selection bias. but skills-filtered immigration can have a different long term profile (e.g., it can shift based on changes in economic conditions in different regions from which the migrants derive, not just base human capital).

  18. It might be interesting to break this down by US states. I suspect that you will find a dramatic rise in South Indian language speakers in states that have a lot of tech jobs (like Silicon Valley/DC), whereas you might find a lot of Hindi/Gujarati/Punjabi speakers in places like NY/Chicago

  19. It might be interesting to break this down by US states. I suspect that you will find a dramatic rise in South Indian language speakers in states that have a lot of tech jobs (like Silicon Valley/DC), whereas you might find a lot of Hindi/Gujarati/Punjabi speakers in places like NY/Chicago

    that is in the original data set actually. i was going to look at it, but time is time. if someone is curious, here is the link:

    http://www.census.gov/hhes/socdemo/language/data/other/detailed-lang-tables.xls

  20. Eye-balling it before seeing the data and from personal experience.

    San Francisco/Silicon Valley – Mostly Southerners. Tamil and Malayalam. San Joaquin Valley – Punjabis. Southern California – Gujaratis, Marathis, etc.

  21. Yeah, it’s interesting how certain ethnic groups flock to specific parts of the country – like we have more Persian Jews in America than in Iran, and they’re mostly around the LA/SoCal area. Here in NY most brown folk seem to be Punjabi/Gujarati/South Indian (actually just a lot of Indians in general, particularly North NJ and Long Island). My family socializes with a lot of Desis but we’ve yet to meet other Pashtuns here. Probably since “traveling” by Pashtun standards means going to the next village or god forbid out of the FATA, but hey.

  22. Do we have further breakdown of the data along economic lines? Just to validate the high-tech/s.Indian correlation of the recent migrants vs the family driven earlier ones

  23. Alina, I’m from Central California and there were only 2 related Pashtun families in a community dominated by Punjabis and Mohajirs. They certainly underperform their share of 10% of Pakistan in the Pakistani community, from my anecdotal evidence. Though it’s possible for many Pashtuns to not really associate with other Pashtuns, especially those who have had long family history in cities like Lahore.

  24. http://en.wikipedia.org/wiki/Ethnic_groups_in_Pakistan

    15% ethnic pashtun? overestimate?

    Do we have further breakdown of the data along economic lines? Just to validate the high-tech/s.Indian correlation of the recent migrants vs the family driven earlier ones

    not that i saw, and i doubt you’ll find such a cross-tab in the census, though who knows?

  25. Btw Razib, there may be people of Pashtun ancestry in the Punjab (and even Indian Muslims) who do not identify as such. Granted they probably don’t know the language and follow more regularized North Indian Muslim traditions.

    Otherwise, I’m not sure if Wikipedia is counting Afghan Pashtun refugees and Pakistani Pashtuns as one and the same.

  26. 15% is what I would guess – I’ve read estimates anywhere between 10% – 20%. I think it makes sense to lump the refugees and Paki pashtuns together, since they’re the same ethnicity, speak the same language, and seem to identify as the same culturally.

    Sorry to go off-topic 🙂

    • I think it varies Alina.

      Some Pashtuns in Pakistan are very “South Asianized”

  27. For what it’s worth, I don’t speak Hindi (well) because my family grew up speaking Gujarati exclusively. Hindi was for talking to South Asians from other parts of the subcontinent, and unfortunately there were very few of those in my parents’ social circle. So, while I wouldn’t show up on that index, I still speak a Desi language. I wonder if that’s a confounding factor in the data…

  28. I don’t speak any dessy lang fluently but I fake it. My family is Sikh so Punjabi is spoken in our household. Bizarre thing is that the Sikh family branches in Desh are quite pretentious ;). The parents speak in Hindi to their sprogs. It’s considered fashionable I spose. This has been going on for decades. We are all middle-aged now. gollum-like even 😮 So here we are abroad learning to hold on to our cultures and trying to avert the injustice of the ’84 riots etc etc from (re)happening in the home host countries

    & quite simply, it seems that the real Desis could not give a toss about the language dichotomy. but then again I only speakz for Delhi & maybe mumbai the artist formerly known as bumbai

  29. I am curious. The most common word on this website is “South Asian”. I don’t recall ever calling myself “South Asian” when I was studying in the US. Neither did my friends. We called ourselves “Indian”.

  30. “Overall it looks like a substantial minority of South Asians already habitually speak English at home as their first language, irrespective of nationality.”

    Definitely the case in Bangalore. Most young upper middle class parents address their kids in English. Kannada is a dying language, unfortunately – at least among the well-to-do.

  31. Definitely the case in Bangalore. Most young upper middle class parents address their kids in English. Kannada is a dying language, unfortunately – at least among the well-to-do.

    this weblog is american-centric, and i was analyzing american census data. interesting to know about bangalore, but please note that i was referring to ABDs, and am doing so unless otherwise stipulated.

    also, don’t revive the “south asian vs. indian” debate. it’s tired, stale, and invariably will result in deletion.

  32. One point that I forgot.

    The pie chart confirms that the US South Asian community is far more diverse than the UK South Asian community.

  33. The pie chart confirms that the US South Asian community is far more diverse than the UK South Asian community

    yes. among the “asian” community in the UK even among pakistanis and bangladeshis there’s a trend toward subregional sampling which is really strong; mirpuris == pakistani and sylheti == bangladeshi. when i visited london i was struck by the bengali dialect i often heard, which was really hard to understand. in contrast, the bengali i heard in italy was easy to understand. i presume that’s because the italian bangladeshi migrants are more geographically diverse and default more to standard bangladeshi bengali.

  34. @Red: I don’t understand your confusion – south asian refers to pakistan, bangladesh, sri lanka, afghanistan, and nepal as well as india. So naturally if you’re indian you’ll call yourself indian, but when referring to the entire region, south asia is the correct term.

  35. @Red: I don’t understand your confusion – south asian refers to pakistan, bangladesh, sri lanka, afghanistan, and nepal as well as india. So naturally if you’re indian you’ll call yourself indian, but when referring to the entire region, south asia is the correct term.

    his confusion was understandable. if he’s a first time commenter he is probably not aware that this is an indian american blog numerically in terms of contributors, and, that some of the contributors are not indian american, such as myself. “south asian” strikes me as a somewhat clinical and artificial term (i hope i’m not divulging anything by observing that at SM meetups the term “indian” is 100 X more common than “south asian”), but it does seem minmax.

    anyway, this debate has been erupting since 2004. it will never end because all parties will never get final resolution. i am probably in the set of “non-indian” nationals of south asian origin who is fine with “reclaiming” the term indian in a broader context, but this isn’t going to happen.

  36. @YogaFire who said, “Right now it’s not much of a difference, but over time they will probably end up evolving to be pretty distinct. I suppose it’ll be something like Spanish and Italian.”

    I would argue that it won’t for these reasons.

    India has both Hindi speakers and Urdu speakers, and those who say they speak Hindi who are really speaking Urdu, or a lot of Urdu. And Bollywood.

  37. Hindi is actually a very recent invention. Originally there was Braj and other older languages. Hindi derives from a mix of Hindustani, Urdu, Punjabi (both much older), Marathi, Gujarati and Rajasthani. Its core is actually Braj and Sanskrit. Take the latter and Punjabi out and what you have left is Urdu. Another way to look at it is that Urdu is heavily persianised and Hindi sanskritised. Bollywood Hindi is closer to Punjabi and Rajasthani. Essentially it is the same language as Urdu; the difference is political and religious chauvinism.

    In uk it is punjabi that rules, followed by Urdu, Gujarati and Bengali.

  38. What I am fascinated with is the very small minority born and raised in the west as Americans or Brits who have not only managed to learn to speak their heritage language but can read and write in it as well. How many generations does it take to completely anglicise this breed? The only famous examples I can provide are the singer Sukshinder Shinda, the writer Roop Dhillon and the actress Katrina Kaif.

    I give them one generation before they lose all real connection with south Asia.

  39. “Definitely the case in Bangalore. Most young upper middle class parents address their kids in English. Kannada is a dying language, unfortunately – at least among the well-to-do.”

    I don’t agree with that statement. Kannada might be less spoken in Bangalore, but it is still spoken by a large number of people in Karnataka (70% of people in Karnataka speak Kannada).