[–]DragonerneJesus is white

Finally, I would love to have this debate if you want to do that in good faith.

The best argument for race is that when we put the genetics of different populations into a clustering algorithm, we see that the clusters closely correspond to what we consider races. Blacks cluster together, Europeans cluster together, East Asians cluster together, Oceanians cluster together, American Indians cluster together, etc.

If race didn't exist we wouldn't expect that to happen. It could just as well have been eye color, hair color, or some other random attribute or combination of attributes that best represented the clusters, but what we find is that RACE is what the generic clustering algos produce.

Another argument is that if you take 2 whites or 2 blacks they will always be more similar than, say, 1 random black and 1 random white. This indicates that races are surprisingly well separated. The famous saying: "more distance between than within populations" (tongue in cheek)

Now of course you will have mixed race people like a color spectrum between the races/colors. Arguing against races is like arguing against blue, red, green, yellow etc.

[–]milkmender11

The program we usually use is called STRUCTURE. It's likewise the program where nearly all of the data from your link came from. STRUCTURE is hardly equipped to perform objective scientific analysis--it is, like many things in social science, a program which draws upon some scientific data to infuse it with social hypotheses and answer a specifically formulated, purposeful question. We can't use STRUCTURE to figure out true things about reality, the way we can use astronomical analysis packages to derive conclusions based on the data that a space probe collects. STRUCTURE requires a human element--the data, on its own, speaks for itself, as far as it can. We need to tell STRUCTURE what to look for, and in doing so, we tell it what is important to us, personally.

Take racial clusters, for instance. Those uneducated in genetic sciences see your link and assume that racial clusters are real, that they are true, and located in the data. They mistakenly believe that STRUCTURE draws this truth out of the data for all to see. However, STRUCTURE has no way to determine the correct number of racial clusters. We actually have to tell it how many we want to see. If you want to believe that there are 7 racial clusters, you can tell STRUCTURE to look for 7. It will find 7--after all, that's what you told it to do. If you ask it to find 12, it will find 12. The simplest operation is to tell STRUCTURE to find 1 cluster, and this is presently the most widely accepted number of racial clusters that exist. Sometimes the data looks more visually appealing to us, looks like it 'ought' to be 7, or 12, or 23. But we can always use a sharper magnifying glass, or take a step back, and see that, in terms of genetic science, there are only as many 'races' as we choose to see. Usually, we choose a certain number because we have a specific question, often epidemiological, that we want to answer. The 'correct' number of races in each instance is whatever helps us answer that particular question.
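The point that a clustering program returns however many groups it is asked for can be illustrated with a toy sketch. This is not STRUCTURE, which fits a model-based Bayesian method; it is a plain k-means written for illustration only. The function name `kmeans_1d` and the data are hypothetical, and the input is deliberately a set of evenly spaced numbers with no group structure at all.

```python
import random

def kmeans_1d(values, k, iters=100, seed=0):
    """Plain Lloyd's k-means on 1-D data; returns one cluster label per value."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)  # start from k distinct data points
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = sum(members) / len(members)
    return labels

# Evenly spaced values with no gaps, i.e. no built-in group structure.
values = [float(i) for i in range(300)]

for k in (2, 7, 12):
    labels = kmeans_1d(values, k)
    print(k, "requested ->", len(set(labels)), "groups returned")
```

On this input the number of groups returned matches the number requested in each case, which shows only that the algorithm partitions the data into k pieces; it says nothing about whether any particular k is the right one.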

I should point out that I am providing you the courtesy of pretending that 'race' is a legitimate taxonomic category. It is not. 'Race' typically refers to subspecies, which is likewise not a defined classification. There are hundreds of studies that use hundreds of different definitions of what qualifies as a subspecies. Historically, the definition has only had glimmers of consistency across specific areas of research, for specific species and genera. For example, wolf researchers tend to use 'subspecies' in the same way, because they cite other wolf researchers who used it that way. It is a totally different story for Drosophila researchers. So, again, before we have started, your premise is non-scientific. But I don't even need to win on that point--this is my job, I could give you free points all day and still win.

You should be aware: your link is very misleading, and has hoodwinked you. These 'ethnically dependent' SNPs? My, that sounds impressive! Damning, even! How could I possibly argue with 'ethnically dependent' SNPs? Easy.

These SNPs are specifically derived from unexpressed remnants of viral DNA (retrotransposons) that mutate very rapidly. Because they do not have an effect on our phenotype, these mutations are not 'pruned' by natural selection. They are then able to proliferate and diversify, and allow us to compare samples of aDNA (ancient DNA) to modern samples, and match up who is related to what groups based on the pattern of mutations.

Do you see an obvious problem? Humans are so incredibly closely related, so lacking in genetic diversity compared to most animals, that we have to go far out of our way to be able to detect differences at all. We actually need to look at genes that don't do anything. We need to look at genes that have no effects on us, because if we try to find genetic diversity elsewhere, we come across too many stumbling blocks. Your 'ethnically dependent' SNPs, the keystone upon which your link depends, are quite precisely the LEAST MEANINGFUL genes in the human genome. That isn't a coincidence. That's the only way we can reliably distinguish our personally preferred number of races--by looking at genes that don't do anything. You are using genes that don't do anything and saying that they enable us to distinguish race. If race is so self-evident, why don't you look at genes that DO things? Because you can't. The analyses will be inconsistent. You would have to pick and choose genes that make your point, and ignore the vastly higher number of genes that don't. Or, you could perform a genome-wide analysis, which will put you in exactly the same position--the differences will be so small relative to the whole of the genome that, by definition, they will fail to meet the standard of statistical significance. I've been doing this for a long time.

I'm not naive on this science. There are certainly genes that have an effect on IQ, testosterone levels, impulsivity, etc. These genes are predictably distributed in various populations. We all know where they are prominent, and where they aren't. But you aren't talking about that. You are trying to shoehorn in Biblical 'kinds' into modern science, under the same guise of 'race' that the racialists of the 19th and 20th century used. Your fundamental hypothesis is Biblical, not scientific. And, the anthropologist in me sees your conflict with the Jews for precisely what it is--family squabbles. Jealousy of the more 'successful' big brother, who isn't letting you in his clubhouse.

To be clear, you led with your BEST argument. I didn't characterize it as such, you did. I demonstrated why that argument is, scientifically, nonsense. It is one I have heard many times, and proliferates on boards like this. It is actually a running joke in the genetics community.

Please recognize what has happened here. Your BEST argument (your characterization) was bunk. Total drivel. Useless. 'Pseudoscience' is too decent a label for it. This is why I was hoping you would invoke machine learning as a means of possibly determining an objective k (cluster) value. But we didn't even get that far.

[–]DragonerneJesus is white

Ok, did you read my last post or did you go on autopilot? Did you notice that the clusters correspond to races, not eye color, not hair color, not some other random combination of attributes? This fact alone should tell you that the concept of race has a significant categorical meaning. The clusters could've been completely unrelated to the races, but they are not. In fact the clusters correspond surprisingly well to exactly how humans perceive races (okay, "exactly" is an exaggeration, but you get my point: Swedes lie close to Danes, Greeks close to Italians, whites far from blacks and so on, exactly as we would expect). Please try your best to explain away this phenomenon, because you have failed to do that so far.

Okay, now I will address your concern about the arbitrary k. The first point is that it does not matter "how many races you choose": you can pick 10, you can pick 3, you can pick 23 if you want. This is how race is defined and understood in the alt right anyway (and in our genetics circles too).
Now if you have trouble choosing the number of k, I can refer you to this article: https://medium.com/analytics-vidhya/how-to-determine-the-optimal-k-for-k-means-708505d204eb
I personally use the elbow method, but either works. This is basic first-year undergraduate stuff.
I don't use STRUCTURE, I use Python.
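For readers who have not seen it, the elbow method mentioned above can be sketched as follows. This is a minimal illustration on made-up 1-D data, not the poster's actual pipeline: the helper names `farthest_point_init` and `kmeans_wcss` are hypothetical, the toy values are invented, and the deterministic farthest-point initialization is an assumption chosen only so the run is reproducible. The idea is to compute the within-cluster sum of squares (WCSS) for several values of k and look for the k after which the curve stops dropping sharply.

```python
def farthest_point_init(values, k):
    """Deterministic start: first value, then repeatedly the value farthest from all centers."""
    centers = [values[0]]
    while len(centers) < k:
        centers.append(max(values, key=lambda v: min(abs(v - c) for c in centers)))
    return centers

def kmeans_wcss(values, k, iters=100):
    """Run Lloyd's k-means on 1-D data and return the within-cluster sum of squares."""
    centers = farthest_point_init(values, k)
    for _ in range(iters):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    # Squared distance from each value to its nearest final center.
    return sum(min((v - c) ** 2 for c in centers) for v in values)

# Toy data with three obvious groups (around 1, 11 and 21).
values = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]

curve = {k: kmeans_wcss(values, k) for k in range(1, 7)}
for k, wcss in curve.items():
    print(k, round(wcss, 2))
```

For this toy data the WCSS falls steeply up to k = 3 and only marginally after that, so 3 is where the "elbow" sits. On real data the bend is often much less clear, and reading it remains a judgment call.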

'Race' typically refers to subspecies, which is likewise not a defined classification.

Again, I already addressed this with the color spectrum fallacy. This is not a very advanced idea that you have. It's a common misunderstanding that is widespread in the social sciences because they want everything to be as "subjective" as possible.
With this type of logic, you can make the concept of "species" meaningless, which is simply absurd. We use categorization to say something meaningful about the data. In the case of genetic clustering, we are using a similarity measurement as the target function to optimize. "How similar is this individual to that individual?", "Sort them into k similar groups", "Here you are".
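One common way to write "sort them into k similar groups" as an optimization target is the k-means objective. The post does not say which objective or distance measure is actually used, so this is an illustrative assumption rather than the poster's method:

```latex
\min_{C_1,\dots,C_k} \; \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

Here $x_i$ is the feature vector of individual $i$, $C_j$ is the set of individuals assigned to group $j$, and $\mu_j$ is that group's mean. Note that $k$ is an input to the problem, not something the minimization itself determines.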
Your problem is that k is not as arbitrary as you want it to be and also that the clusters correspond perfectly to what people think of as race.
If the clusters didn't correspond to our understanding of race, you might've had a point, but that's not the case.

But I don't even need to win on that point--this is my job, I could give you free points all day and still win.

Your point is that you can use different measures for categorization. You never prove that these measures don't result in racial clusters. That said, I will gladly grant that it seems reasonable to think you could find some arbitrary clustering measurement (not genetic similarity) where the clustering does not end up corresponding to race, but I don't think this has any relevance to this topic.

If race is so self-evident, why don't you look at genes that DO things? Because you can't.

This is what I call ceding ground. You already acknowledge that races do exist. That whites are genetically more similar to other whites than they are to blacks.
Your strategy now is to claim that the racial clusters aren't "useful enough" and that we should only use a predetermined subset of the genome to create the clusters...
How many loci are you talking about here? How few should we use for you to think it is "useful enough"? Is it curiously so few that it makes Lewontin's fallacy relevant? Is that it?

Or, you could perform a genome-wide analysis, which will put you in exactly the same position--the differences will be so small relative to the whole of the genome that, by definition, they will fail to meet the standard of statistical significance. I've been doing this for a long time.

Genome-wide clustering separates the races well. Am I misunderstanding you here? I think I am, so could you reformulate it? I didn't get your point.

I'm not naive on this science. There are certainly genes that have an effect on IQ, testosterone levels, impulsivity, etc. These genes are predictably distributed in various populations. We all know where they are prominent, and where they aren't. But you aren't talking about that. You are trying to shoehorn in Biblical 'kinds' into modern science, under the same guise of 'race' that the racialists of the 19th and 20th century used. Your fundamental hypothesis is Biblical, not scientific. And, the anthropologist in me sees your conflict with the Jews for precisely what it is--family squabbles. Jealousy of the more 'successful' big brother, who isn't letting you in his clubhouse.

I don't know why you included this garbage paragraph. Let's stay on topic, thanks. You're published, so there's no need to divert attention elsewhere. It would be appreciated.

To be clear, you led with your BEST argument. I didn't characterize it as such, you did. I demonstrated why that argument is, scientifically, nonsense. It is one I have heard many times, and proliferates on boards like this. It is actually a running joke in the genetics community.

No, you had a misunderstanding of how clustering algos work that a first-year undergraduate won't ever have. I think this is covered in chapter 1 of a lot of books, and I just pulled up the first Medium post on the search engine. Let's cut the arrogance a bit. I treated you with respect in my initial response and I hope that you will reply properly going forward, otherwise I will return in kind.

Please recognize what has happened here. Your BEST argument (your characterization) was bunk. Total drivel. Useless. 'Pseudoscience' is too decent a label for it. This is why I was hoping you would invoke machine learning as a means of possibly determining an objective k (cluster) value. But we didn't even get that far.

I look forward to your next reply. Please keep the arrogant attitude to a minimum. I know you've been taught that we're dumb, so if that is true, less talk and show your knowledge through your presentation of your arguments. They were sorely lacking so far.

[–][deleted]

I may be reading this whole process between the two of you incorrectly, and am somewhat skimming, but it looks like she's trying to make the point to you that the systems they use are different than what you keep bringing up. SNP runs quite a bit deeper than just general ethnic analysis, but there are SNPs that they're able to use to relate towards ethnicity. It's like parsing out each little code within your genetic makeup instead of a broad picture. Ethnicity analysis is the cliff notes version, from what I understand. I'll let her explain, she's the one with the education stats and experience.

[–]DragonerneJesus is white

Her claim is that we are using data that is meaningless and that the racial clustering is happening in the subset of the data that is meaningless. She wants to strip out the meaningless data and then cluster based on the remaining meaningful data. Her implicit (never proven, never explicitly stated) claim is that doing so would result in a clustering that does not correspond to the racial groups.

In the first part, she is saying that race as a concept is biologically real, and in the second part, she is moving the goalposts and saying the concept is real but meaningless/worthless/without value.

[–]milkmender11

That is definitely not my claim! As my supervisor said, 'There is no such thing as bad data.' Of course that data is not meaningless. It is very useful for tracking ancestry, and that is exactly what we use it for. But it is useless for making meaningful distinctions between groups of people, because the genes don't DO anything. At best, you can try to use them as a proxy to say, "Well, if these do-nothing genes can be shown to form haplotypes, maybe they correspond to do-something genes that are also predictably distributed!" And you know what, it IS possible to demonstrate predictable distribution of some genes along the same lines as ancestral SNPs. But only some genes. And usually NOT the genes we socially consider important when it comes to race, like skin or eye color. It's a pretty pitiful result, but only if people are expecting it to justify race. It is what it is scientifically. Not meaningless, but not as meaningful as many on this board would like.

I never said race was biologically real. I assumed the putative truth of my sparring partner's position as a Socratic exercise in demonstrating that it cannot be right, even if I operate from his assumptions.