Saturday, December 13, 2008

Absurdities of Caste Genetic Studies - Notes

If any Indian caste genetic study has M. J. Bamshad as one of its authors, then absurdities in the service of intellectual dishonesty are only natural. I used to wonder why Bamshad et al. do not study Punjabi and UP castes thoroughly and only then come to South India. I would be more convinced of the assumed limitations of those studies once we had interpreted such results. But after reading this study I'm not so sure of their competence.

Just a thought:
I suppose it's almost 13 years (Mountain et al. 1995?) since studies of Indian caste populations started appearing. The complexity of Indian population structure has probably made it difficult for geneticists to work on a comprehensive study of Indian castes. Barring a few attempts (Sengupta et al. 2005/6, Sahoo et al. 2006, Trivedi et al. 2007), none of the other studies really had an all-India scope. But in my opinion, these studies have severe shortcomings such as small sample sizes, random caste assignments, etc.

I would think that, considering the vastness of the field, a coordinated effort is the need of the hour. There should be consensus on:
- Sample sizes
- Castes to be studied
- Position of castes
- Unambiguous caste names
-> Many caste names are titles that can be found among many castes in a region. A better approach would be to record the exact occupation and, where they exist, old tribal names.
- Validity of assumption that East Asians and Europeans are standard populations.
-> There is a study (Zhao et al. 2008) that calls R2 European when in reality there is hardly any R2 in Europe, and even the few observations can be perfectly explained from many angles.

About this study:
1. The present study throws up many surprises that support their pre-held notions beautifully.
- J2a is higher in ex-Sudras and Dalits compared to Brahmins. J2b has either done a vanishing act or shows similar frequency across all castes. Both are contrary to Sengupta et al.
- R2 frequency (which was higher than R1a1 among Telugu castes) has nosedived, contrary to many previous studies (Kivisild, Sahoo, etc.).

2. Some of the interpretations are beyond me.
- East Asian mtDNA M. But we don't have any East Asian Y-chromosomes! According to the study we have non-South Asian chromosomes and South Asian chromosomes. There is an East Eurasian Y-chromosome C (Mongolo-Oceanic), but its distribution in this study is rather counterintuitive given our understanding of mtDNA M. So I would rather call mtDNA M Mongolic. Oceanic has problems, as the major Oceanic lineages belong to mtDNA N.

3. Selective quoting of other studies
According to this study:
A recent analysis of caste and tribal populations from eastern India (Orissa) demonstrated Indo-European influences on paternal caste lineages [41]. Brahmins showed high Y-chromosome affinity to eastern Europeans (M17, haplogroup R1a1).

I have quoted this study (Sahoo, Kashyap 2006) many a time in this blog. The really important point from the study was:
Analysis of Y-chromosomes revealed that the average genetic distance between Orissa Brahmins and Eastern Europeans (0.066) is relatively less than the distance between Eastern Europeans and the Karan (0.098), Khandayat (0.150), or Gope (0.067). Since both upper and lower caste populations, i.e., the Brahmins and Gope, were closer to Europeans and Central Asians, than were the middle caste populations, the Karan and Khandayat, this indicated that genetic distances have no correlation with their position in the caste hierarchy.
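
The point of the quoted passage can be checked with a small sketch. It uses only the four distances quoted above and the quote's own upper/middle/lower labels; the numeric ranks and the function names are hypothetical conveniences, and with four populations this is an illustration, not a statistical test.

```python
# Y-chromosome genetic distances to Eastern Europeans, as quoted above.
# Ranks are an illustrative encoding of the quote's labels:
# 1 = upper caste, 2 = middle caste, 3 = lower caste.
POPULATIONS = [
    ("Brahmin", 1, 0.066),
    ("Karan", 2, 0.098),
    ("Khandayat", 2, 0.150),
    ("Gope", 3, 0.067),
]


def mean_distance_by_rank(populations):
    """Average the distances within each hierarchy rank, ordered by rank."""
    by_rank = {}
    for _name, rank, distance in populations:
        by_rank.setdefault(rank, []).append(distance)
    return {rank: sum(ds) / len(ds) for rank, ds in sorted(by_rank.items())}


def is_monotonic(values):
    """True if the sequence never decreases or never increases."""
    values = list(values)
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


if __name__ == "__main__":
    means = mean_distance_by_rank(POPULATIONS)
    for rank, mean in means.items():
        print(f"rank {rank}: mean distance {mean:.3f}")
    print("distance follows hierarchy:", is_monotonic(means.values()))
```

If distance tracked caste position, the per-rank means would rise or fall steadily from upper to lower. Here the middle castes sit farthest from Eastern Europeans while the upper and lower castes are almost equally close, so the check comes out false.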

A Ghetto study of this kind could simply have left that study unquoted. It is perfectly all right to have exceptions. However, it is rather appalling that these people went ahead and quoted it selectively.

4. Some mathematics if it helps
Let's take Tamil caste population as 6 crores (60 million)
Brahmins form 3% => Brahmin R1a1 at 34.2% : ~0.32 million
Dalits form 20% => Dalit R1a1 at 20.6% : ~1.25 million
Ex-Sudras form 77% => Ex-Sudra R1a1 at 18.6% : ~4.5 million
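
The arithmetic above can be reproduced with a short sketch. The post does not say how the counts were derived; the sketch assumes that only males carry a Y-chromosome haplogroup and that males are half the population. That assumption is not stated above, and the names used here are hypothetical.

```python
# Rough number of R1a1 carriers per caste group, from the figures above.
TOTAL_POPULATION = 60_000_000  # 6 crore, as taken above
MALE_FRACTION = 0.5  # assumption: half the population is male (Y-chromosome carriers)

# group -> (share of the population, R1a1 frequency within the group)
GROUPS = {
    "Brahmin": (0.03, 0.342),
    "Dalit": (0.20, 0.206),
    "Ex-Sudra": (0.77, 0.186),
}


def r1a1_carriers(total, male_fraction, groups):
    """Return the estimated number of R1a1 males in each group."""
    return {
        name: total * share * male_fraction * frequency
        for name, (share, frequency) in groups.items()
    }


if __name__ == "__main__":
    counts = r1a1_carriers(TOTAL_POPULATION, MALE_FRACTION, GROUPS)
    for name, count in counts.items():
        print(f"{name}: ~{count / 1e6:.2f} million")
```

Under that assumption the counts come to roughly 0.31, 1.24 and 4.30 million, close to the rounded figures above. The Brahmin group has the highest R1a1 frequency but by far the smallest absolute number of carriers.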

Genetic variation in South Indian castes: evidence from Y-chromosome, mitochondrial, and autosomal polymorphisms
Watkins et al. 2008
Via Razib's Gene Expression


milieu said...

I wanted to know more about this study, as very few genetic studies of the Indian population seem to be appearing.
Alas, I do not have any background in genetics, so I am relying on the blogosphere to understand the conclusions and their implications.
Good or bad, such studies are important to discuss.

Manjunat said...

You can find a list of the studies here.

milieu said...


Maju said...

Yeah, it looks like a great example of how NOT to do population genetics research.

Just one item that particularly caught my attention: in the mtDNA affinity graph, for some odd reason, U is treated as a distinct clade from N, while no M subclades are treated the same way. If U had been included in N (just as M33 is considered M without further consideration), the overall Indian position in that graph would have been much more intermediate, especially for the Tamil "middle/upper" groups.

In any case, there's no particular reason to consider macro-haplogroups like M and N, that must have drifted early on in the Paleolithic colonization of Eurasia. They should at least consider the next or next two level haplogroups, most of which are (at least for mtDNA) wholly native among Indians.

- Validity of assumption that East Asians and Europeans are standard populations.

Absolutely! Europeans may be somewhat genetically homogeneous by Eurasian standards, but that doesn't automatically make them a source population; rather the opposite. Additionally, most of the West Eurasian influence in South Asia must, quite logically, have come from West Asia (except, surely, the Indo-Europeans).

East Asia is also terribly ambiguous and not necessarily more important than, for example, SE Asia in the field of human genetics. While West and East Eurasia as a whole may make up two more or less coherent macro-regions (separated by the deserts and semi-deserts of Central Asia and Siberia, and by the distinct entity of South Asia), the Eurasian continent is much better approached as made up of more distinct regions (say: Europe, West, Central, South, SE, East and North Asia - plus the extra-continental "extensions" in Oceania, America and North Africa). Furthermore, I'd say that most of the high-level genetic diversity (and founder lineages) is in the southern Asian strip, from Anatolia to Indonesia - and, in this context, Europe and East Asia are secondary.