Do that Guju you do!

[Figure: gujcluster.jpg]

The 2009 paper Reconstructing Indian population history was a watershed in understanding the genomics of South Asians. Before this point, studies had relied on unrepresentative samples or fewer markers, or had treated South Asians only as a sidelight. This paper put the focus on South Asians to elucidate the group’s population history (it still undersampled eastern South Asians, though this seems part of the plan because of their focus on two, not three, ancestral Indian components). If you want to know more about the paper, here is the ungated version. But in this post I want to focus on an issue which you can find only in the supplements to the paper.

The HapMap project, which surveys genetic variation in world populations, has a set of Gujaratis from Houston, Texas. This is currently the primary population of Indian origin you have to work with in the public data sets. There are other South Asian populations in the public domain, but their number of markers is far lower. So the Gujarati sample is very useful right now. But one thing that immediately jumps out at you is that there are in fact two Gujarati clusters. In the PCA plot I’ve extracted from the supplements you see the two largest components of genetic variation. PC 1, the x axis, separates whites from South Asians, and PC 2 separates one group of Gujaratis from everyone else. What’s going on here?

First, let’s take another look at the Gujarati population, and compare it to other South Asians. I ran ADMIXTURE the other day with a combined data set of Eurasians, Papuans, and Berbers from Algeria. I excluded Africans and New World populations because for the purposes of this analysis I’m not interested in that genetic variation (Africans are so diverse that they often take up many components of any analysis). The way ADMIXTURE works is that it takes a number, K, of hypothetical ancestral populations and assigns each individual a fraction of ancestry from each of them, taking as the reference the variation found in the aggregate sample. To make this concrete, I have two Bengalis in the sample, my parents. While the Chinese come out to be almost 100% one population, and the French 100% another, my parents tend to be a mix. That’s because they share commonalities with both groups. Also, for what it’s worth, I pruned my data set down to 80,000 markers.
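To make the mechanics a little more tangible, here is a minimal Python sketch of the kind of mixture model involved. It is not the ADMIXTURE program itself: I assume two made-up ancestral populations with randomly drawn allele frequencies, hold those frequencies fixed, and estimate only one simulated individual’s ancestry fractions with a simple EM loop (the real software estimates frequencies and fractions jointly, with a faster optimizer). The function name `estimate_ancestry`, the 70/30 individual, and the 2,000 markers are all hypothetical.

```python
import random


def estimate_ancestry(genotypes, freqs, n_iter=200):
    """EM estimate of one individual's ancestry fractions.

    genotypes: list of 0/1/2 counts of the tracked allele per marker.
    freqs: freqs[k][j] is the tracked-allele frequency of marker j in
           hypothetical ancestral population k (held fixed here).
    """
    k = len(freqs)
    q = [1.0 / k] * k  # start from equal ancestry fractions
    for _ in range(n_iter):
        totals = [0.0] * k
        for j, g in enumerate(genotypes):
            # Chance that a single allele copy is the tracked allele
            p_alt = sum(q[a] * freqs[a][j] for a in range(k))
            p_ref = 1.0 - p_alt
            for a in range(k):
                # Share of each allele copy attributed to population a
                if g > 0:
                    totals[a] += g * q[a] * freqs[a][j] / p_alt
                if g < 2:
                    totals[a] += (2 - g) * q[a] * (1.0 - freqs[a][j]) / p_ref
        q = [t / (2 * len(genotypes)) for t in totals]
    return q


rng = random.Random(0)
N_MARKERS = 2000  # assumed; far fewer than the 80,000 in the real run

# Two invented ancestral populations with random allele frequencies
freqs = [[rng.uniform(0.05, 0.95) for _ in range(N_MARKERS)] for _ in range(2)]
true_q = [0.7, 0.3]  # hypothetical mixed individual


def simulate_genotype(j):
    """Draw two allele copies, each from a population chosen by true_q."""
    copies = 0
    for _ in range(2):
        source = 0 if rng.random() < true_q[0] else 1
        copies += rng.random() < freqs[source][j]
    return copies


genotypes = [simulate_genotype(j) for j in range(N_MARKERS)]
q_hat = estimate_ancestry(genotypes, freqs)
print([round(x, 2) for x in q_hat])
```

An individual drawn entirely from one source population would come out near 100% for that component, while the simulated mixed individual lands in between, which is the pattern the bars for my parents show.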

[Figure: K2to4.png]

To the right you see runs K = 2 to K = 4. That means 2, 3, and 4 ancestral populations, respectively. Each bar is divided by color into population assignments. From top to bottom I have Sindhis, French, Papuans, Chinese, Bengalis, Gujarati_A, and Gujarati_B. Gujarati_A is the group of Gujaratis who are more diverse and do not form a tight distinctive cluster in the plot above. Gujarati_B are those who do form a tight distinctive cluster.

At K = 2 the Chinese and French elements are clear. This is an east vs. west division. South Asians are a bit more western than eastern, as you might expect, with the Bengalis being the most eastern, and the Sindhis the least. At K = 3 you see a Papuan element break out in red, which South Asians share. Note that my parents have the green element which is nearly 100% in Chinese. Finally at K = 4 you see a South Asian component, in green now.

Note the difference between the two Gujarati samples. Gujarati_A has much more of whatever is distinctive to the French. Gujarati_B is more singularly South Asian. On a finer grained level, you can see in this figure that Gujarati_B is very uniform in ancestral quanta in comparison to Gujarati_A.

What does this tell us? PCA pulls out only a small fraction of the genetic variance, but because it extracts independent dimensions it is really informative about between-population differences. You generally don’t want to include related individuals in this sort of analysis, because they often show up as their own cluster, as they share so many genetic variants. To me Gujarati_B looks like a specific group within the broader ethnic group whose members are all somewhat related because of endogamy. This is a common pattern among South Asians because of jati. And in the paper I referenced above, it shows up in the genomics.
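A toy simulation shows why a group like this separates so cleanly. The sketch below is pure Python with made-up numbers: a “broad” sample drawn from one set of allele frequencies and a “tight” sample drawn from a drifted copy of those frequencies, which is how I am modeling an endogamous group here (an assumption; I am not simulating actual relatives or the HapMap data). It then extracts the top principal component by power iteration.

```python
import random

rng = random.Random(1)
N_SNPS, N_BROAD, N_TIGHT = 600, 40, 20  # assumed toy sizes


def clamp(p):
    """Keep an allele frequency away from 0 and 1."""
    return min(0.95, max(0.05, p))


def draw_genotype(p):
    """Two independent allele copies, each the tracked allele with probability p."""
    return (rng.random() < p) + (rng.random() < p)


# Broad group uses the base frequencies; tight group uses a drifted copy
base = [rng.uniform(0.1, 0.9) for _ in range(N_SNPS)]
drifted = [clamp(p + rng.gauss(0, 0.2)) for p in base]

genos = [[draw_genotype(p) for p in base] for _ in range(N_BROAD)]
genos += [[draw_genotype(p) for p in drifted] for _ in range(N_TIGHT)]

# Center each marker on its mean across all individuals
n = len(genos)
means = [sum(row[j] for row in genos) / n for j in range(N_SNPS)]
centered = [[row[j] - means[j] for j in range(N_SNPS)] for row in genos]

# Individual-by-individual similarity (Gram) matrix
gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered] for ri in centered]


def top_component(matrix, n_iter=100):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    v = [rng.gauss(0, 1) for _ in matrix]
    for _ in range(n_iter):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


pc1 = top_component(gram)
broad_scores, tight_scores = pc1[:N_BROAD], pc1[N_BROAD:]
print("broad range:", round(min(broad_scores), 3), round(max(broad_scores), 3))
print("tight range:", round(min(tight_scores), 3), round(max(tight_scores), 3))
```

With only two groups in the toy data it is the first component that splits them. In the real plot that role falls to PC 2, because PC 1 is already taken up by the larger difference between whites and South Asians.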

Why does this matter? Because different South Asian groups have different genetic susceptibilities. England has many more South Asians, so one knows that Bengalis in particular have a greater risk of type 2 diabetes, all things controlled, than Punjabis. But more specifically individual endogamous groups should have their own genetic predispositions. If this is true, and Gujarati_B is an endogamous group, then pooling it into the broader Gujarati sample may be problematic. At least without correction. If the Gujarati HapMap sample is used as a proxy for South Asian genetic variance, then the peculiarities of Gujarati_B may produce false positives when it comes to generalizing to South Asians more broadly. In contrast, the Gujarati_A individuals are all co-ethnics and, being South Asian, are likely themselves the outcome of long-term endogamy. But they may be the outcome of different histories, and so share only those risk variants common to Gujaratis, or perhaps even to South Asians as a whole.

That’s the practicality. On to the reason why I hope the readership here will be informative: does anyone know what this difference in the Houston Gujaratis could be? I have speculated that perhaps Gujarati_B are a specific jati which is well overrepresented in the United States. Ethnographically Gujaratis should have a sense of this I assume, just as non-Sylheti and non-Mirpuri Bangladeshis and Pakistanis in England are aware of the dominance of these two subgroups.

Addendum: For those who want meatier South Asian focused population genomics, please see Zack Ajmal’s continuous series of results. Looks like he just got a new batch of participants.

56 thoughts on “Do that Guju you do!”

  1. “Eastern south asians” lol.. This exposes how ridiculous the term South Asian really is. At least in anthropological, cultural and genetic contexts, it is far more accurate to refer all people of the sub-continent as Indians. The fact that one of the states in the region calls itself India doesn’t justify the ludicrous term “South Asian”.

  2. “Eastern south asians” lol.. This exposes how ridiculous the term South Asian really is. At least in anthropological, cultural and genetic contexts, it is far more accurate to refer all people of the sub-continent as Indians. The fact that one of the states in the region calls itself India doesn’t justify the ludicrous term “South Asian”.

    this is fair. but can people please engage the science for once instead of bringing up long running issues on this weblog?

  3. for me, it’d be tough to separate gujaratis (and different jatis) without thinking of religion, which has a huge role in the endogamous culture. while there are many gujarati hindus and muslims in this country, i feel (and have no data to back this up, just my skewed experiences) that there may be an over representation of jains in the u.s., especially in large metro areas like houston. i’m sure you are right that guju_b’s are specific jati; i’m just guessing as to how to narrow down what that jati may be. of course, there are many gujarati jain jati’s as well, so that’s obviously not the full answer.

    great stuff nonetheless.

  4. especially in large metro areas like houston. i’m sure you are right that guju_b’s are specific jati; i’m just guessing as to how to narrow down what that jati may be. of course, there are many gujarati jain jati’s as well, so that’s obviously not the full answer.

    hm. well, i can easily place people in A or B if i get a 23andme file 😉 my mom is in B, my dad is in A.

  5. my mom is in B, my dad is in A.

    btw, this isn’t super informative. my parents are closer to each other than any guju, but if i placed them within the guju variance my mom is within B and dad A. my mom is RM and dad is RF here.

  6. “Eastern south asians” lol.. This exposes how ridiculous the term South Asian really is. At least in anthropological, cultural and genetic contexts, it is far more accurate to refer all people of the sub-continent as Indians. The fact that one of the states in the region calls itself India doesn’t justify the ludicrous term “South Asian”.

    Good grief, what’s ludicrous is expecting everyone from Afghanistan, Pakistan, Bangladesh, Sri Lanka, and Nepal to start calling themselves “Indian”. I’m surprised to read that on Sepia Mutiny, it seems like something a redneck would say (lumping all the brown folks together as “India”). Do you realize how misleading and confusing that would be? In America it would have 3 potential meanings: Native American Indians (outside the PC context folks still call ’em Indians), general Desi’s, and legit Indian Indians. As an American of Afghani and Pakistani (Pashtun) descent, I’d feel stupid calling myself an Indian. It would be like a Canadian telling people abroad that he is from America instead of Canada – technically ok because Canada is in North America, but realistically misleading because people would assume he’s from the USA.

  7. Thanks razib for starting this discussion. Read the paper.

    Any additional reference to what the ANI and ASI are? Are these theoretical constructs using different components of PCA that don’t belong to anything else? the statistics and genetics i took in college are failing me.

    How will they determine the actual timing of the ANI + ASI admixture with additional samples. And the terms ANI and ASI, are they being used as politically correct ways of saying aryan and dravidian (i know those were coined as language groups, but they have taken a life of their own…specially on this blog).

    great to have u here as guest blogger.

  8. Any additional reference to what the ANI and ASI are?

    i believe ANI are old west eurasians, probably part of the anatolian farmer push from Çatalhöyük. a small component of ANI is probably indo-aryan. the researcher who did much of the work on the above paper told me that the genetic distance between ANI and europeans is on the order of that between finns and italians. when you go into the higher K’s there are some things you find in indo-aryan speaking south asians (and south indian brahmins) you don’t find in non-indo-aryan south asians. additionally, that small K is also found in parts of europe and the mid-east, and, lacking in basques, sardinians, and finns.

    ASI are just descendants of the longest resident modern humans in southern eurasia. their stamp is pretty evident in south asia all the way to vietnam. and they have some connection to the eastern populations (chinese, papuans), distant as it is. the stuff that cambodians and south indians share is i think deep ASI.

    Are these theoretical constructs using different components of PCA that don’t belong to anything else?

    ANI+ASI = the south asian element you see partitioning out above. basically “south asian” in ADMIXTURE is a stabilized hybrid, a compound of ANI+ASI gene frequencies so intermixed that they’re a solid continuum across much of south asia.

    How will they determine the actual timing of the ANI+ASI admixture with additional samples. And the terms ANI and ASI, are they being used as politically correct ways of saying aryan and dravidian (i know those were coined as language groups, but they have taken a life of their own…specially on this blog).

    you’d look at decay of linkage disequilibrium. the admixture seems old. you can calculate the admixture to generate uyghurs, probably ~2,000 years. the ANI+ASI is pretty old, older than that. again, i corresponded with one of the lead researchers. at the current time it just isn’t possible to extract out the signal to peg a time. i think that means it’s more than 4,000 years ago, perhaps 6 or 7?
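    to sketch the logic of dating by linkage disequilibrium decay: in the simplest model, admixture LD between two markers falls off roughly as exp(-t*d), where t is generations since admixture and d is the genetic distance between the markers in morgans, so fitting the decay rate gives t. the python below is only an illustration on noiseless synthetic numbers; the 80 generations, the 25 years per generation, and the function name are my assumptions, not values from the paper.

```python
import math


def fit_admixture_time(distances_morgans, ld_values):
    """Least-squares slope of log(LD) against distance; returns generations."""
    xs = distances_morgans
    ys = [math.log(v) for v in ld_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope  # LD ~ A * exp(-t * d), so the slope is -t


TRUE_GENERATIONS = 80        # hypothetical admixture age
YEARS_PER_GENERATION = 25    # assumed generation time

# Synthetic, noise-free LD curve sampled from 0.5 cM to 20 cM
distances = [0.005 * i for i in range(1, 41)]
ld = [0.02 * math.exp(-TRUE_GENERATIONS * d) for d in distances]

t_hat = fit_admixture_time(distances, ld)
print(round(t_hat), "generations, about", round(t_hat * YEARS_PER_GENERATION), "years")
```

    the older the admixture, the faster the curve drops, so the usable signal gets squeezed into very short distances. that is one way to see why old events are hard to pin down.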

    i do not believe that the primary ANI+ASI divide is indo-aryan or dravidian. i think the dravidian languages may have come with ANI, and that indo-aryan languages are a later overlay, and contribute to the small, but persistent difference between indo-aryan + south indian brahmins & non-brahmin dravidian. the lowest ANI percentages in dravidian speaking groups are in particular tribal populations, at 40%. the highest are among pathans, at 75%, with kashmiri pandits and sindhis at 70%. the gap there is big, but it isn’t as big as it should be if ANI = indo-aryans, and ASI = dravidians. the tribal groups shouldn’t have any indo-aryan.

    if you want me to hazard a guess, this is what i think happened. within the last 10,000 years a group of farmers arrived in the indus valley drainage area, their ultimate origin being in anatolia and its environs. these are most of the ANI. they admixed with the ancient indigenous substrate, the ASI, and produced a new hybrid population. this population then underwent a massive demographic expansion. east and south. think the settlement of north america by whites. 30,000 puritans in 1640 probably gave rise to 30 million english americans today. the gradient between northwest-southeast, and upper to lower caste, is just a function of the ANI-ASI hybrids uptaking indigenous ASI substrate across the subcontinent. i believe that the indo-aryans and munda people came later. here’s a blog post on the munda:

    http://blogs.discovermagazine.com/gnxp/2010/10/sons-of-the-conquerers-the-story-of-india/

    as for the indo-aryans, it seems like they are surely related to the ANI, being west eurasian. but they weren’t ANI. there is a weird genetic signal at high K’s from mongolia, to india, to france, but excluding some non-indo-european groups in europe and india, which makes me believe that we’ll know what the genetic origin of that group is (i suspect they had some interaction with the north caucasian peoples).

    finally, the people of the indus valley buried their dead underneath their houses. this was an anatolian practice too. but importantly, it gives us hope that one day we’ll be able to get genetic data from subfossils and remains. if we can get an indus valley person’s genome reconstructed, we could look for haplotype blocks in them. the decay is going to be far less at that point.

    thanks for focusing on the science. appreciated 🙂

  9. Gujarati is a language group, not an ethnicity. Language is an arbitrary grouping that is genetically meaningless. People who speak Gujarati include everyone from tribals to Parsis. Most groups within Gujarat are largely endogamous so obviously there are going to be vast genetic differences between the groups. Whoever collected the original data would need to (1) ask better questions that actually describe genetic (familial/clan/caste/etc) groupings, and (2) be transparent about their recruiting methodology. It doesn’t really make sense to ask the random public about what kind of Gujus live in Houston; you have to go to the source. There are lots of groups and if the data collectors recruited from one or two particular samaj’s (societies), they’re going to get very specific results.

  10. Language is an arbitrary grouping that is genetically meaningless

    this is absolutely false. you should be careful before making extreme statements like this, unless you’re god and so determine reality! in any case, there’s a fair amount of empirical evidence now that genetic clines tend to follow language group barriers. the anthropological reason for this is obvious: people tend to marry those who speak the same language. you can prove this to yourself, learn a little basic programming and run the samples yourself from the public domain. it will take 6 hours on a moderately powered computer. here’s the famous cavalli-sforza dendrogram which shows the correlation between language families and genetic trees:

    http://www.pnas.org/content/94/15/7719/F3.large.jpg

    It doesn’t really make sense to ask the random public about what kind of Gujus live in Houston; you have to go to the source.

    yes it does. did you bother to read my post? the method would be simple: just ask around to see what group could be such a large portion of the population. all the other gujus are clearly from different groups. in fact, if a few gujus send in their samples to me i could quickly figure out if any of them were in cluster B, which is very distinctive and for which the sample size is robust. cluster A seems like a large number of different groups.

  11. Interesting post, Razib. Aside to Sanjaya, please take a look at the site FAQs. “Free speech applies to the public sphere. This is a privately-run blog with moderated comments. You’re welcome to post whatever you wish on your own blog.” If the author of the post feels an individual is repeatedly hijacking/derailing a thread, s/he is free to delete comments and ban folks. Thanks, PG

  12. thanks razib for ur detailed response. It answered a lot of questions I’ve had.

    In the PCA plot in the paper, where they look at people of similar caste status but in different regions (AP/UP), did they find overlap across regions? That is, the ANI-ASI mix (from 40% – 75% south east to northwest groups), did that track with caste? I was not able to discern that from the paper.

    thanks once again for taking the time. it would be nice if ppl took more of an interest in it. at least they have a hook now. to re-affirm the superiority/inferiority of their identities.

  13. In the PCA plot in the paper, where they look at people of similar caste status but in different regions (AP/UP), did they find overlap across regions? That is, the ANI-ASI mix (from 40% – 75% south east to northwest groups), did that track with caste? I was not able to discern that from the paper.

    here here and here. to a first approximation one can say:

    % ANI = (how close to afghanistan)X + (how high up in the caste ladder)Y

    or, see this graph: http://blogs.discovermagazine.com/gnxp/files/indiareich7.png

    to the bottom-right = more ASI. to the bottom-left = more east asian. and the top = maximal ANI.

    hope u feel validated 🙂

  14. ani = “ancestral north indian” and asi = “ancestral south indian.” they’re mnemonics, and aren’t meant to be taken literally. as i stated above, ANI is very close genetically to the populations of europe and west asia (turkey, iran, etc.). if you had a line representing genetic distance, with the french and chinese as antipodes, ASI would be 2/3 of the way to the chinese. but in many ways ASI were an independent east eurasian population, which has been submerged across south and southeast asia. there are no “pure” ASI, while many iranians are excellent proxies for ANI. these two groups were inferred from the distinctive gene frequencies which are the hallmark of south asians. the possibility that south asians are a compound of these two very different groups resolves some strange peculiarities in the phylogeography of south asia.

  15. Was the “Caste” or “Jati” data of the participants not collected?? Caste might be the only scientific way to explain ethnic clusters in India. Genetic testing will (may) eventually bear it out.

  16. Was the “Caste” or “Jati” data of the participants not collected??

    no one in the USA (aside from some brownz) know or care about jati (they’re liable to think you’re talking about an obscure yoga technique). and they’re vague about caste. so i doubt it was collected.

    Caste might be the only scientific way to explain ethnic clusters in India. Genetic testing will (may) eventually bear it out.

    please see the paper and the links i’ve provided. you don’t need to be hypothetical. geography is probably the best explainer, and then caste. so, for example, the genetic distance between punjabi brahmins and punjabi jatts (or, pakistani punjabi peasants) seems lower than that between punjabi brahmins and south indian brahmins. though south indian brahmins tend to exhibit smaller genetic distance to north indian groups than south indian non-brahmins.

    click here, http://www.nature.com/nature/journal/v461/n7263/extref/nature08365-s1.pdf, go to page 3, and look at the pairwise values to the top right of the diagonal. those are genetic distance measures. i assume you are more aware of the nature of the castes than i, but i hope that will clarify the issue in your mind.

  17. Razib: Great post as expected. Full of scientific details. A minor request. There is no such word as “Gujju”. Please call us “Gujarati”. Thanks. Back to the intelligent forum discussing human genetics……

  18. 30,000 puritans in 1640 probably gave rise to 30 million english americans today.

    No they didn’t. The 30,000 puritans weren’t the only English immigrants to America. Numerous others came after them.

    I have read that one can still have an agenda when interpreting all this genetic data. Which tells me that this is not yet a hard science. The validity of the Kalash component of the Harappa Project is being questioned for example.

    I find it hard to accept that such a distinct and numerous population like desis are derived from much more sparse populations in West Asia or Central Asia who are less ancient to begin with and look substantially different as well.

    • had similar thoughts. aren’t desis incredibly wayy more diverse than the ones we’re trying to source them genetically to??

    • The validity of the Kalash component of the Harappa Project is being questioned for example.

      I myself question the Kalash component. 🙂 I look at it this way: At low K values (i.e. number of ancestral components to be calculated) several real ancestral populations might be merged as a single ancestral component. For example, we had South Asian and Papuan together as one component at K=4. Then at K=6, Papuan component split.

      BTW, where is this discussion happening?

  19. No they didn’t. The 30,000 puritans weren’t the only English immigrants to America. Numerous others came after them.

    i know a lot more american history than you, so you should chill out.

    http://www.greatmigration.org/new_englands_great_migration.html

    i also know more demographics than you, and what you should understand is that first settlers have a disproportionate impact on the ancestral makeup of later groups, because later migrants intermarry with the founding stock. see here:

    http://blogs.discovermagazine.com/gnxp/2011/01/the-genomic-heritage-of-french-canadians/

    here’s a map of english americans:

    http://en.wikipedia.org/wiki/File:English2000.png

    they track “greater new england.” it is true that many english americans settled in the south, but many of these have re-identified as “american.”

    Which tells me that this is not yet a hard science.

    ? are you a scientist? if so, you don’t do any interpretation?

    I find it hard to accept that such a distinct and numerous population like desis are derived from much more sparse populations in West Asia or Central Asia who are less ancient to begin with and look substantially different as well.

    the earliest modern human finds aside from africa are in western asia:

    http://en.wikipedia.org/wiki/Skhul_and_Qafzeh_hominids

    aren’t desis incredibly wayy more diverse than the ones we’re trying to source them genetically to??

    you need to have a better model of what’s going on.

    consider mixed-race brazilians. they have three source populations:

    1) africans 2) europeans 3) amerindians

    africans are far more diverse than group #2 or #3, so the mixed group is more diverse than the second and third parental populations. second, south asians are more diverse than europeans and east eurasians, but not necessarily middle easterners (though this is more of an open question). they are definitely not more diverse than africans, or at least that’s the current consensus.

  20. since i’m a ‘guest’ here, i feel bad about being kind of short. so i’m going to elaborate a bit as to why i said what i did.

    first, in 1790 there were 4 million americans. in 1990 there were 250 million. but, 1990 was the first census where the majority of the ancestry of american residents post-dated 1790. here’s pointers to the source:

    http://blogs.discovermagazine.com/gnxp/2010/04/when-america-was-post-colonial/

    the key is that demographically the later anglo-saxon, german, irish, southern italian, asian, etc., immigration mixed with the populations already resident, who had among the highest fertilities in the world because of land surplus. this dynamic was most accentuated in new england. my statement about 30,000 to 30 million english americans is qualified:

    1) new england did not receive much immigration quantitatively between 1650 and the massive waves of irish which accompanied the beginning of the industrial economy

    2) the south and the mid-atlantic did. in particular, the scots-irish arrived in the 18th century via philadelphia

    3) new england’s population growth was due to very high endogenous fertility, which persisted into the mid-19th century. new englanders populated northern new york, and much of the upper midwest and great lakes region, because of land shortage and labor surplus (the mormons were part of this migration wave).

    4) there is a tendency between the 1980 and 2010 census of people selecting their most “exotic” ancestry when giving ethnic information. this has resulted in a collapse of self-identified english americans. so someone who is 1/4 german and 3/4 english will identify as german many times.

    many of these demographic/historical facts can be found in the books Albion’s Seed and The Cousins’ Wars.
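    for what it’s worth, the arithmetic behind the 30,000 to 30 million figure is easy to check. a quick python sketch, assuming the 360 years from 1640 to 2000 as the interval (my choice of end date) and treating the whole thing as closed exponential growth, which is a simplification:

```python
import math

start_pop, end_pop = 30_000, 30_000_000
years = 2000 - 1640  # assumed interval; the end date is my choice

fold_increase = end_pop / start_pop
doublings = math.log2(fold_increase)            # how many times the population doubled
annual_rate = math.log(fold_increase) / years   # continuous growth rate per year
doubling_time = math.log(2) / annual_rate       # years per doubling

print(f"{fold_increase:.0f}-fold increase, {doublings:.1f} doublings")
print(f"{annual_rate:.2%} per year, doubling every {doubling_time:.0f} years")
```

    that works out to roughly ten doublings, about one every 36 years, or a bit under 2% growth per year sustained over the whole period.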

    i don’t want to talk about this anymore, but i figured it would outline my rationale since i was rather dismissive. if you disagree that’s fine, i don’t care.

    second, ppl keep asking about genetic diversity, and i keep repeating myself. i can’t find it in table format now, but here’s a figure which shows diversity as a function of distance from africa (haplotypes are sequences of markers, and heterozygosity just means that the two sequences on the two homologous chromosomes don’t match, so they’re diverse):

    http://blogs.discovermagazine.com/gnxp/files/2011/02/F3.large_.jpg

    as you can see, south asians are not the most diverse population. this isn’t like newtonian mechanics, so we need better population coverage. but that’s the closest we got to consensus now.
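    since the definition is a bit abstract, here is a small python sketch of haplotype heterozygosity as the chance that two haplotypes drawn at random (with replacement) from a sample differ. the two samples are invented toy data, not real populations, and the function name is mine.

```python
from collections import Counter


def haplotype_heterozygosity(haplotypes):
    """Probability that two haplotypes drawn with replacement differ."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Invented samples: one with many distinct haplotypes, one dominated by a single haplotype
diverse = ["ACGT", "ACGA", "TCGT", "ACCT", "GCGT", "ACGT"]
uniform = ["ACGT", "ACGT", "ACGT", "ACGT", "ACGA", "ACGT"]

print(round(haplotype_heterozygosity(diverse), 3))
print(round(haplotype_heterozygosity(uniform), 3))
```

    the sample with more distinct haplotypes at more even frequencies scores higher, which is the quantity plotted against distance from africa in the figure.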

  21. what does the “validity of the kalash component” even mean? these aren’t even necessarily real populations, they’re just patterns in the genes. the “south asian” component, as i suggest above, may simply be the intermixing of two other ancestral populations. and due to reticulation in ancestry this is true if you go far back enough with all the components.

  22. Ok much of the science goes over my head, so not sure if this helps. Anyway, based purely on my attendance at my friends wedding, there’s a large Gujarati Bohra Muslim population in Houston.

  23. Interesting the Bohras are local Gujarati converts who may or may not have Arab contributions.

    The Memons and the Khojas (the other two Muslim merchant castes of Gujarat) were converted in Sindh and then migrated during the medieval ages to Gujarat completely switching their cultural and linguistic identity in the process.

    Think Razib had some interesting comments about how “Ismailism” was the predominant faith among Gujarati Muslims until the imposition of normative Islam.

  24. if you want me to hazard a guess, this is what i think happened. within the last 10,000 years a group of farmers arrived in the indus valley drainage area, their ultimate origin being in anatolia and its environs. these are most of the ANI.

    The fact that ANI is found at 40% and more even in desi tribals living in isolated jungles far to the southeast of the Indus Valley Civilization makes your guess a mathematical impossibility.

  25. The fact that ANI is found at 40% and more even in desi tribals living in isolated jungles far to the southeast of the Indus Valley Civilization makes your guess a mathematical impossibility.

    interesting that i run into ramanujans who lecture me about mathematical impossibilities! in any case, i’m alluding to a fisher wave of advance, since you’re so versed in mathematics you’ll get my drift.

    Interesting the Bohras are local Gujarati converts who may or may not have Arab contributions.

    the bohra have no arab:

    http://blogs.discovermagazine.com/gnxp/2009/10/the-mostly-south-asian-origins-of-indian-muslims/

  26. are you a scientist? if so, you don’t do any interpretation?

    i believe ANI are old west eurasians, probably part of the anatolian farmer push from Çatalhöyük…………..i do not believe that the primary ANI+ASI divide is indo-aryan or dravidian. i think the dravidian languages may have come with ANI……..this is what i think happened. within the last 10,000 years a group of farmers arrived in the indus valley drainage area, their ultimate origin being in anatolia and its environs. these are most of the ANI. they admixed with the ancient indigenous substrate, the ASI, and produced a new hybrid population. this population then underwent a massive demographic expansion. east and south. think the settlement of north america by whites. 30,000 puritans in 1640 probably gave rise to 30 million english americans today.

    Sounds more like faith and speculations to support that faith/agenda than hard science. The people doing this stuff such as Dienekes and Polako are accused of having agendas and seeing what they want to see. As long as agendas are possible in this business it behooves desis to have a south asian agenda not a west eurasian or european agenda. Which is why the Harappa Project and its Kalash component is a step in the right direction.

  27. The people doing this stuff such as Dienekes and Polako are accused of having agendas and seeing what they want to see.

    everyone has some agenda (at least unconsciously). the steps to replicate this are trivial in any case. also, from what i have seen the accusations against polako are spurious. a lot of southern europeans are angry at traces of sub-saharan african he’s detected. i’ll let you decide whose agenda is more merit-worthy in that case.

    As long as agendas are possible in this business it behooves desis to have a south asian agenda not a west eurasian or european agenda.

    it behooves people to be as objective as possible, and, be transparent so others can replicate. (e.g., i’ve replicated zach’s ‘kalash’ cluster, and i’ve seen suggestions to dienekes’ ‘dagestani’)

  28. i know a lot more american history than you, so you should chill out. http://www.greatmigration.org/new_englands_great_migration.html i also know more demographics than you, and what you should understand is that first settlers have a disproportionate impact on the ancestral makeup of later groups, because later migrants intermarry with the founding stock.

    English immigration to America did not end in 1640. Anyone with the slightest clue about american history knows that.

    http://www.everyculture.com/multi/Du-Ha/English-Americans.html

    In the late seventeenth century most English immigrants were younger men who came from the rural areas of southern and south central England. Unlike the New England farming families, most who settled in the region from the Chesapeake to Charleston came as indentured servants and had training as farmers, skilled tradesmen, laborers, or craftsmen.

    In the eighteenth century, people from London and the northern counties comprised the majority of English immigrants. The percentage of women increased slightly, from about 15 percent to nearly 25 percent of the English settlers.

    Economic and political troubles brought new spurts of English immigration in the 1720s and in the decades preceding the American Revolution.

    While English settlers and their descendants constituted only about 60 percent of the European settlers and half of the four million residents living from Maine to Georgia, according to the 1790 census, they had ensured the dominance of English institutions and culture throughout the new republic.

    Although German, Irish, Scandinavian, Mediterranean, and Slavic peoples dominated the new waves of immigration after 1815, English settlers provided a steady and substantial influx throughout the nineteenth century.

    During the last years of 1860s, annual English immigration increased to over 60,000 and continued to rise to over 75,000 per year in 1872, before experiencing a decline. The final and most sustained wave of immigration began in 1879 and lasted until the depression of 1893. During this period English annual immigration averaged more than 80,000, with peaks in 1882 and 1888.

    In the twentieth century, English immigration to America decreased, a product of Canada and Australia having better economic opportunities and favorable immigration policies. English immigration remained low in the first four decades of the century, averaging about six percent of the total number of people from Europe

    This decline reversed itself in the decade of World War II when over 100,000 English (18 percent of all European immigrants) came from England.

    Despite the slight decline in English immigration under the current immigration structure adopted in the 1970s, 33 million Americans identify themselves as being of English descent in the 1990 census. They constitute the third largest ethnic group in the United States

    I think it is necessary to contest this point because you are trying to use it as an example of how a group of Anatolians could have determined the genetic content of desis across the length and breadth of the Indian subcontinent in a short span of time.

  29. if im reading this right there seems to be only a small (<10%?) french component to GujA’s that’s not there in GujB’s. The rest look the same. And there’s the same French strain in Bengalis though at a much smaller level so maybe these folks are from some french colony in Guj/Bengal? I think some Iyengars claim a subsect of theirs (Mandyam?) carry some european genes because of the interbreeding that happened when the Brit soldiers were stationed in the karnataka barracks

  30. wtf let me try again: there seems to be only a small (<10%?) french component to GujA’s that’s not there in GujB’s. The rest look the same. And there’s the same French strain in Bengalis though at a much smaller level so maybe these folks are from some french colony in Guj/Bengal? I think some Iyengars claim a subsect of theirs (Mandyam?) carry some european genes because of the interbreeding that happened when the Brit soldiers were stationed in the karnataka barracks

  31. Apols Razib, I used Arab as a shorthand for “non-South Asian” when I should have been much more precise. On the Bohra, I remembered reading this, and that’s what made me speculate about some foreign mixture.

    “Interestingly, in Dawoodi Bohras (TN and GUJ) and Iranian Shia significant genetic contribution from West Asia, especially Iran (49, 47 and 46%, respectively) was observed.”

    http://www.nature.com/jhg/journal/v54/n6/full/jhg200938a.html

    I tend to read the analysis more than the primary sources, and this isn’t my field, so I’m more than happy to be corrected wherever possible.

  32. “it behooves people to be as objective as possible, and, be transparent so others can replicate”

    like speculating that Southern Whites have lower IQs because they are Christians who live in warmer weather? The infamous temperature-IQ correlation you suggested a while back?

  33. The infamous temperature-IQ correlation you suggested a while back?

    what does that have to do with being objective? anyway, the low IQs had nothing to do with xtians. controlled for ethnicity people do have lower IQs in warmer american climes. stop trolling, i can actually moderate now.

  34. ” razib | July 17, 2008 2:59 PM | Reply

    (in response to me stating – “I don’t think being Evangelical has any correlation with being a moron.”)

    uh, sorry, it does.”

    Moderate away…

  36. rahul, you might like my work, but do you know the difference between a correlation and a causation? or are you one of those who don’t bother making a distinction?

  37. I think I made that very same point on that thread, and in several other instances, about the conclusion YOU arrived at. Moreover, you don’t have to assume that I’m stupid just because I disagree with you.

    Please forget that I mentioned anything, since it is giving you reason to be THIS disparaging and use your delete wand for my posts.

    Carry on Pilgrim .

  38. That’s the practicality. On to the reason why I hope the readership here will be informative: does anyone know what this difference in the Houston Gujaratis could be?

    Razib – I’ve highlighted your question in today’s Desi Living post, so that the question gets some more visibility with Gujrati Houstonians. Hope this helps to answer your question.

  40. Please forget that I mentioned anything, since it is giving you reason to be THIS disparaging and use your delete wand for my posts.

    That effectively sums up RK’s MO, regardless of his intellectual ability. It’s rather sad. Have you visited his site? It’s basically quite decent posts with nothing but insults and screaming-down.

  41. That effectively sums up RK’s MO, regardless of his intellectual ability. It’s rather sad. Have you visited his site? It’s basically quite decent posts with nothing but insults and screaming-down.

    stay on topic from now on. don’t want comments like lynn’s to get lost in the din. i’ve been away from the computer for several days, but i’m back and will be vigilant again.

  42. Hey here’s a thought. People are people and we are all the same. The sooner we start to focus on the similarities and not the differences between us the better off we will all be. Revolutionary idea I know.

    Peace out.