[–]Dragonerne (flair: Jesus is white) 3 insightful - 2 fun (35 children)

Ok, did you read my last post or did you go on autopilot? Did you notice that the clusters correspond to races, not eye color, not hair color, not some other random combination of attributes? This fact alone should tell you that the concept of race has a significant categorical meaning. The clusters could've been completely unrelated to the races but they are not. In fact the clusters correspond surprisingly closely to how humans perceive races (okay, "exactly" would be an exaggeration, but you get my point: Swedes lie close to Danes, Greeks close to Italians, whites far from blacks and so on, as we would expect). Please try your best to explain away this phenomenon, because you have failed to do that so far.

Okay, now I will address your concern about the arbitrary k. The first point is that it does not matter "how many races you choose": you can pick 10, you can pick 3, you can pick 23 if you want. This is how race is defined and understood in the alt-right anyway (and in our genetics circles too).
Now if you have trouble choosing k, I can refer you to this article: https://medium.com/analytics-vidhya/how-to-determine-the-optimal-k-for-k-means-708505d204eb
I personally use the elbow method but either works. This is basic 1st year undergraduate stuff.
I don't use STRUCTURE, I use Python.
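
For readers following along, the elbow method mentioned above can be sketched in a few lines of plain Python. This is only a minimal sketch on made-up 2-D points (three tight groups), not genetic data and not the linked article's code. The helper names (`sq_dist`, `farthest_point_init`, `kmeans`), the deterministic farthest-point initialisation, and the "largest second difference" rule for locating the bend are all assumptions chosen to keep the example reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def farthest_point_init(points, k):
    """Deterministic start (an assumption): first point, then repeatedly the
    point farthest from all centres chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (centroids, labels, inertia)."""
    centroids = farthest_point_init(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep the centre of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia

# Toy data: three 3x3 grids of points around hypothetical centres.
offsets = (-0.5, 0.0, 0.5)
centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in centres for dx in offsets for dy in offsets]

# Within-cluster sum of squares (inertia) for each candidate k.
inertia = {k: kmeans(points, k)[2] for k in range(1, 7)}

# Elbow heuristic (an assumption): the k where the drop in inertia slows the most.
bend = {k: (inertia[k - 1] - inertia[k]) - (inertia[k] - inertia[k + 1])
        for k in range(2, 6)}
elbow_k = max(bend, key=bend.get)

print(inertia)
print("elbow at k =", elbow_k)
```

On this toy layout the bend falls at k = 3. The heuristic depends on the spacing of the data: if the third centre is moved much farther away, for example to (5, 30), the largest bend shifts to k = 2.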

'Race' typically refers to subspecies, which is likewise not a defined classification.

Again, I already addressed this with the color spectrum fallacy. This is not a very advanced idea that you have. It's a common misunderstanding that is widespread in the social sciences because they want everything to be as "subjective" as possible.
With this type of logic, you can make the concept of "species" meaningless, which is simply absurd. We use categorization to say something meaningful about the data. In the case of genetic clustering, we are using a similarity measurement as the target function to optimize. "How similar is this individual to that individual", "Sort them into k similar groups", "Here you are".
Your problem is that k is not as arbitrary as you want it to be and also that the clusters correspond perfectly to what people think of as race.
If the clusters didn't correspond to our understanding of race, you might've had a point, but that's not the case.

But I don't even need to win on that point--this is my job, I could give you free points all day and still win.

Your point is that you can use different measures for categorization. You never prove that these measures don't result in racial clusters. But with that said, I will gladly say that it seems reasonable to think that you could find some arbitrary clustering measurement (not genetic similarity) where the clustering does not end up corresponding to race, but I don't think this has any relevance for this topic.

If race is so self-evident, why don't you look at genes that DO things? Because you can't.

This is what I call ceding ground. You already acknowledge that races do exist. That whites are genetically more similar to other whites than they are to blacks.
Your strategy now is to claim that the racial clusters aren't "useful enough" and that we should only use a predetermined subset of the genome to create the clusters...
How many loci are you talking about here? How few should we use for you to think it is "useful enough"? Is it curiously so few that it makes Lewontin's fallacy relevant? Is that it?

Or, you could perform a genome-wide analysis, which will put you in exactly the same position--the differences will be so small relative to the whole of the genome that, by definition, they will fail to meet the standard of statistical significance. I've been doing this for a long time.

Genome-wide clustering separates the races well. Am I misunderstanding you here? I think I am, so could you reformulate it? I didn't get your point.

I'm not naive on this science. There are certainly genes that have an effect on IQ, testosterone levels, impulsivity, etc. These genes are predictably distributed in various populations. We all know where they are prominent, and where they aren't. But you aren't talking about that. You are trying to shoehorn in Biblical 'kinds' into modern science, under the same guise of 'race' that the racialists of the 19th and 20th century used. Your fundamental hypothesis is Biblical, not scientific. And, the anthropologist in me sees your conflict with the Jews for precisely what it is--family squabbles. Jealousy of the more 'successful' big brother, who isn't letting you in his clubhouse.

I don't know why you included this garbage paragraph. Let's stay on topic, thanks. You're published, so no need to divert attention elsewhere. Would be appreciated.

To be clear, you led with your BEST argument. I didn't characterize it as such, you did. I demonstrated why that argument is, scientifically, nonsense. It is one I have heard many times, and proliferates on boards like this. It is actually a running joke in the genetics community.

No, you had a misunderstanding of how clustering algos work that a 1st year undergraduate won't ever have. I think this is included in chapter 1 of a lot of books and I just pulled up the first Medium post on the search engine. Let's cut the arrogance a bit. I treated you with respect in my initial response and I hope that you will reply properly going forward, otherwise I will return in kind.

Please recognize what has happened here. Your BEST argument (your characterization) was bunk. Total drivel. Useless. 'Pseudoscience' is too decent a label for it. This is why I was hoping you would involve machine learning as a means of possibly determining an objective k (cluster) value. But we didn't even get that far.

I look forward to your next reply. Please keep the arrogant attitude to a minimum. I know you've been taught that we're dumb, so if that is true, less talk and show your knowledge through your presentation of your arguments. They were sorely lacking so far.

[–][deleted] 2 insightful - 2 fun (2 children)

I may be reading this whole process between the two of you incorrectly, and am somewhat skimming, but it looks like she's trying to make the point to you that the systems they use are different than what you keep bringing up. SNP runs quite a bit deeper than just general ethnic analysis, but there are SNPs that they're able to use to relate towards ethnicity. It's like parsing out each little code within your genetic makeup instead of a broad picture. Ethnicity analysis is the cliff notes version, from what I understand. I'll let her explain, she's the one with the education stats and experience.

[–]Dragonerne (flair: Jesus is white) 3 insightful - 1 fun (1 child)

Her claim is that we are using data that is meaningless and that the racial clustering is happening in the subset of the data that is meaningless. She wants to strip the data of meaningless data and then cluster based on the remaining meaningful data. Her implicit (never proven, never explicitly stated) claim is that doing so would result in a clustering that does not correspond to the racial groups.

In the first part, she is saying that race as a concept is biologically real and in the second part, she is moving the goalpost and saying the concept is real but meaningless/worthless/without value.

[–]milkmender11 1 insightful - 1 fun (0 children)

That is definitely not my claim! As my supervisor said, 'There is no such thing as bad data.' Of course that data is not meaningless; it is very useful for tracking ancestry, and that is exactly what we use it for. But it is useless for making meaningful distinctions between groups of people, because the genes don't DO anything. At best, you can try to use them as a proxy to say, "Well, if these do-nothing genes can be shown to form haplotypes, maybe they correspond to do-something genes that are also predictably distributed!" And you know what, it IS possible to demonstrate predictable distribution of some genes along the same lines as ancestral SNPs. But only some genes. And usually NOT the genes we socially consider important when it comes to race, like skin or eye color. It's a pretty pitiful result, but only if people are expecting it to justify race. It is what it is scientifically. Not meaningless, but not as meaningful as many on this board would like.

I never said race was biologically real. I assumed the putative truth of my sparring partner's position as a Socratic exercise in demonstrating that it cannot be right, even if I operate from his assumptions.

[–]milkmender11 1 insightful - 1 fun (31 children)

Dragonerne, I dismantled your point completely. Let me try to explain it again.

Your 'clusters' are literally human opinions. They do not exist as scientific data. You are talking about a research package called STRUCTURE which I and other scientists use. In order to get ANY clusters from this program, we must put the desired number of clusters in ourselves. You are acting as if these clusters are real data. They are not. Every time you talk about any number of clusters, you are referencing a specific researcher who decided to arbitrarily use a given number of clusters. You actually have to open STRUCTURE and TELL it how many clusters you want. It can't give you ANY clusters, none at all, until you tell it to give you a specific number. Your clusters, literally, are arbitrary opinions. They are not in the data. They never were. There is no 'race' to correspond to, either. Race is not a defined scientific concept. I am talking about science here.

Here is what you are doing: without realizing it, you are choosing to use a number of clusters that corresponds to the 'races' you want to see. You say they 'correspond.' They do not. You have to first choose an arbitrary number of clusters (a k value) for STRUCTURE to give you. Then, you backwards-rationalize that number into alignment with the racial categories that you want to believe in. What about two races? You can tell STRUCTURE, 'give me an output with two clusters,' and it will. What would your preferred two races be? You can ask it to give you 3, 4, 5, literally any number that is equal to or less than the number of individuals. Hell, you could use data that includes multiple samples from single individuals and ask STRUCTURE to give you more clusters than there are individual humans in your sample!! And it WOULD! You are only talking about a program, STRUCTURE, that you are only just learning about from me. I have been using this program for years.

Finally, the ML k values. I read your link. Honestly, I'm not sure why you would use this when I quite literally taunted you to use it in my first post. I asked you to bring this up. Don't you know a trap when you see one?? The first problem is that it assumes a single level of magnification for the sample. It decides to look at the data at one level of magnification--not closer, not further. This is precisely what I said in my previous post. Of course, at a specific level of magnification, it 'looks' like there are 3 clusters. So, again, you arbitrarily choose 3, and just pass the buck to an algorithm which you have chosen in advance. You have not gotten away from the arbitrary nature of the k selection at all. You simply loaded up your data at one level of magnification, chose the number 3, and ran an algorithm that would give you 3, based on your hunch. If you zoom out, you will see 2, or 1. If you zoom in, you will see 4, 5, 6, up to as many individual points of data as you have. This is how cluster analysis works. There is no one 'true' or 'correct' algorithm that will give you an objective k value, and I can prove it right now. I actually found this tidbit in a paper by a researcher who runs exactly this analysis, who used this argument to debunk your claim in advance, because he knew people would make it.

Take your Magic Algorithm, the one that gives you your Objective K. Let's say it's 7. Ok, now an algorithm gave you the k value of 7, and you can pretend that you didn't choose that number, that The Gods of Science did. Wow! What a great algorithm. It is so great, let's run it on the same sample again!

Whoops. 49. Get it?

Algorithms to magically 'justify' k values are not an escape from your problem of arbitrary k designation. They just pass the buck to an algorithm that was likewise developed by a person. You see that graphic in your link, it is obviously 3 clusters at that scale, so you run the algorithm that you already know will give you 3. You are abusing the function of that algorithm. Its intended function is very much like an ANOVA or MANOVA. It is a confirmation test, a way to say: "Hm, I am pretty sure I see 3 clusters here, at this scale, and I do indeed want to use 3 clusters in my analysis. However, I worry that if I eyeball it like this, the peer reviewers will take issue with that. What I can do instead is use this algorithm to confirm that, at this scale, the computer also sees 3 distinct clusters. It might seem obvious, but this way, my reviewers won't be able to chastise me for eyeballing the chart. It is obvious that I see 3 clusters here, but this is just one little test I can use to not make it seem like I am choosing to see the 3." This is common. The more we can seem like we have a test to back us up, that it isn't our opinion, the more likely we are to get through peer review. You seem to think of science as more rigorous and monolithic than it actually is. Hard reality check, my friend, we are just a bunch of stressed and overworked peons like everyone else. In a way, your idealism is invigorating, and reminds me of my more energetic graduate students. You would have made a decent geneticist, with a proper supervisor, of course :)

It's great that you use Python. Wonderful. I am glad you are developing skills. That does not change the fact that your data, the data you cited, with your first link, mostly came from studies which used STRUCTURE. I am quite familiar with those studies, having cited them myself. The same researchers who published them would tell you the same things I am telling you now. I learned much of this from their papers myself.

You didn't address the issue of the taxonomic nonexistence of the race concept. You handwaved it away and made reference to the color spectrum argument, which I didn't use. I understand if these arguments may be new to you, but please respond to the arguments I make rather than the ones you feel you are prepared to debunk, that I never invoked.

You are incorrect that my argument could be used to nullify the species concept. There are several species concepts, each well defined, with conventional classification criteria. This does not exist with the race concept. Actually, this is rich! YOU are using the equivalent of the color spectrum argument to say that my correct designation of the race concept as an undefined taxonomic classification is tantamount to denying species classification wholesale!! Brilliant! It's as if you are saying that if I throw away the subspecies concept as a scientifically robust taxonomic criterion because it is inconsistent, then I must also throw away the species concept because there are moments when its consistency flickers. But that IS the color spectrum argument, which you already reject!

You then again claim that k is not arbitrary, but fail to recognize that a human told STRUCTURE, or your analysis package in Python, how many clusters to find. At best, you can arbitrarily choose a specific scale where you see 3 clusters, and run an algorithm that will show you 3 clusters. Zoom out, adjust the parameters of the algorithm, and you will see 1 cluster. Zoom in, adjust, you will see 10. Or 100. Arbitrary. Or, just run the algorithm on the clusters it produced. Why not? If it worked so well the first time, why not learn more? And all this on selectively chosen SNPs that do not express in phenotypes, not a random sample of genes. Hell, if you did choose a random sample of genes, your analysis would look overwhelmingly Chinese.

[–]Dragonerne (flair: Jesus is white) 4 insightful - 1 fun (19 children)

I will try to teach you some basics of how clustering works. Now I've never used your program STRUCTURE, I use Python myself, but I've spent the last few minutes reading up on the documentation of your program and it's not a surprise to find that your program uses the "k-means clustering algo". It's also, unsurprisingly, the algorithm used in the link that I sent you.
"The K-Means algorithm needs no introduction. It is simple and perhaps the most commonly used algorithm for clustering."

It is not an advanced clustering algorithm but that's fine. In this case, simple is better. With K-means you have to specify K before running the algorithm. You pick, say, 7 clusters, run the algo, and the algo returns the 7 best-fit clusters, exactly as you specified.

You can pick any number. If you want 10 clusters you set k=10 and the algo will output 10 clusters.
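
To make the "you pick k, the algo returns k clusters" behaviour concrete, here is a minimal from-scratch sketch of k-means (Lloyd's algorithm) in plain Python. It runs on made-up 2-D points rather than genetic data, and it is not a claim about what STRUCTURE or any particular library does internally. The function names, the toy centres, and the choice to start from k randomly sampled data points are assumptions for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, seed=0, iters=100):
    """Lloyd's algorithm. The caller supplies k; the function returns
    (centroids, labels, inertia) for exactly that many centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # k distinct data points as starting centres
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep the centre of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia

# Toy data: three 3x3 grids of points around hypothetical centres (27 points).
offsets = (-0.5, 0.0, 0.5)
centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in centres for dx in offsets for dy in offsets]

# Any k from 1 up to the number of points is accepted.
results = {k: kmeans(points, k) for k in (2, 3, 7, len(points))}
for k, (centroids, labels, inertia) in results.items():
    print(k, len(centroids), round(inertia, 2))
```

The function never objects to the requested k. With k equal to the number of points, every point becomes its own cluster and the inertia drops to zero, so a low inertia on its own says nothing about whether a given k is a sensible choice.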

In order to get ANY clusters from this program, we must put the desired number of clusters in ourselves.

Yes, that goes without saying. Did you read the article I gave you that described how we can estimate the best K to choose?

Your clusters, literally, are arbitrary opinions. They are not in the data.

See, I think this is where your lack of understanding of this subject starts. The clusters are not arbitrary opinions. The number of clusters is arbitrary and must be chosen somewhat subjectively, although we can pick an optimal k using 1st year undergraduate methods.
If you pick 2 clusters, the algo will not give you "arbitrary opinions" as you wrote. Instead it will provide the 2 clusters that best split the data.
0. Preconceived ideas about racial groups
1. Choose K
2. Algo returns K clusters
3. These K clusters that our algorithm returned describe the same groups that we had in our preconceived ideas about racial groups.

Please pay attention here, because you seem to have missed this point several times now. We are NOT telling the algorithm to create K racial groups!!! We are telling the algorithm to create K clusters from the genetic data. This is a VERY important distinction.
Why? Because if racial groups were pseudoscience, then we would NOT expect the algorithm to return K clusters that align almost perfectly with our preconceived ideas about racial groups!!
If racial groups were pseudoscience, we might find that the algorithm would return K clusters that happen to correspond to hair type groups, or eye color groups, or nose length, or height, or IQ, or whatever random group you might think of. But AGAINST ALL ODDS, the neutral algorithm returns K clusters that just happen to correspond to our racial groups! This is a wild coincidence.

Hell, you could use data that includes multiple samples from single individuals and ask STRUCTURE to give you more clusters than there are individual humans in your sample!! And it WOULD! You are only talking about a program, STRUCTURE, that you are only just learning about from me. I have been using this program for years.

Please keep the arrogance down. I could write the clustering algorithm that you're using from scratch, it is nothing special, and I think you might want to read an introduction to k-means clustering algorithms, because you seem to have some very basic misunderstandings about the algorithms that you're using.

Here is what you are doing: without realizing it, you are choosing to use a number of clusters that corresponds to the 'races' you want to see. You say they 'correspond.' They do not. You have to first choose an arbitrary number of clusters (a k value) for STRUCTURE to give you.

I don't know if this is a case of low IQ or you just not being familiar with how the k-means algorithm works.
https://youtu.be/HVXime0nQeI
https://youtu.be/4b5d3muPQmA

Here are some videos for you to watch, which I would advise you to go through. Especially if you've been using that program for years and still haven't taken the time to understand the fundamentals of how it works.

Assuming it's not an issue of low IQ (because then we can keep going back and forth forever), we don't tell the algorithm to give us the racial clusters. We tell the algorithm to give us K clusters. And these K clusters HAPPEN to be the racial clusters. You are saying the opposite: We tell the algorithm to give us K racial clusters and then the algorithm gives us K clusters that of course correspond to the K racial clusters that we told it to give us.
What you are saying is NOT what we are doing. We tell it to pick, say, 7 clusters; the algorithm could decide to give us 7 clusters that correspond to red hair, brown hair, black hair, yellow hair, blonde hair, golden hair, orange hair, but that's not what the algorithm returns.
It returns 7 clusters that nicely correspond to 7 racial groups.

It is so great, let's run it on the same sample again!

Whoops. 49. Get it?

The elbow method won't return k=49 after having returned k=7 on the same sample. But I can see some situations where the returned k might differ if we introduce a randomized initial configuration of the k-means algo. However, setting the random_seed to a fixed number solves that "problem" (it's not a problem, it just introduces some randomness into the data analysis, which is not even a bad thing imo).
One time the algo gives you an optimal value of k = 7 and another time it gives you k = 10.
This is not arbitrary, and it's not a problem with the concept of race either; it's likely due to how the k-means algo is set up. Randomness does not suddenly introduce any human element to it either.
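
The effect of the random starting configuration can be sketched as follows. This is a minimal plain-Python illustration on made-up 2-D points, not genetic data. The `seed` argument here is a hypothetical parameter of this sketch (scikit-learn's `KMeans` exposes the same idea as `random_state`), and keeping the lowest-inertia run across several seeds is one common convention rather than the only option.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, seed, iters=100):
    """Lloyd's algorithm with a seeded random start; returns (labels, inertia)."""
    rng = random.Random(seed)              # all randomness flows from this seed
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep the centre of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia

# Toy data: three 3x3 grids of points around hypothetical centres.
offsets = (-0.5, 0.0, 0.5)
centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in centres for dx in offsets for dy in offsets]

# Same seed, same data, same k: the two runs are identical.
run_a = kmeans(points, 3, seed=42)
run_b = kmeans(points, 3, seed=42)

# Different seeds may settle in different local optima; keep the best of several.
inertia_by_seed = {s: kmeans(points, 3, seed=s)[1] for s in range(5)}
best_inertia = min(inertia_by_seed.values())

print(run_a == run_b)
print(inertia_by_seed)
```

Fixing the seed makes a run repeatable, but it does not make that run the best one. Comparing several seeds and reporting the lowest inertia is a simple way to check whether a result depends on the starting configuration.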

Are you of the misconception that race realists believe that there exists a fixed number of races? This is not the case. No one holds that position.

That does not change the fact that your data, the data you cited, with your first link, mostly came from studies which used STRUCTURE.

Are you conflating me with someone else?
I simply want to argue that race is real. You have been failing so far to deal with any argument that I've put forth and you have come to this debate underprepared, showcasing poor knowledge/fundamentals of the underlying algorithms and possibly a mental barrier where you conflate "k clusters correspond to k racial groups" with the incorrect view that we "choose k racial clusters and algo returns our chosen k racial clusters" which is not what is happening. This could also be a simple misunderstanding that you have of how the algo works.

You didn't address the issue of the taxonomic nonexistence of the race concept.

Please reformulate it then, because I fail to see how I haven't dealt with this. The same objections that you're using against race can be used to deconstruct the concept of species.

You then again claim that k is not arbitrary, but fail to recognize that a human told STRUCTURE, or your analysis package in Python, how many clusters to find.

No, it didn't.

Could you explain what you mean by "level of magnification"? Is that a STRUCTURE-specific term?

At best, you can arbitrarily choose a specific scale where you see 3 clusters, and run an algorithm that will show you 3 clusters.

This is against all laws of data analysis.
Please watch this introduction video:
https://youtu.be/fSytzGwwBVw
And then this video:
https://youtu.be/EuBBz3bI-aA

If you view your data and then decide your parameters based on it, then you don't get an unbiased estimator. In this case your estimator will be very biased and we say that it's "overfitting". You're choosing your model based on your data... can't do that.

[–]milkmender11 1 insightful - 1 fun (18 children)

Before I start, let us clarify, YOU brought STRUCTURE into this discussion, not me. It's your citation, not mine. The first link you shared uses data from studies that used STRUCTURE as their analysis package. I'm surprised that you freely admit that you didn't know about it and are only reading about it now, because it was the very first thing that you brought into this debate.

Did you read the article I gave you that described how we can estimate the best K to choose?

Yes, but you weren't the first one to show it to me. That's why I asked you to link to it in my first post :)

The clusters are not arbitrary opinions. The number of clusters is arbitrary and must be chosen somewhat subjectively, although we can pick an optimal k using 1st year undergraduate methods.

Well of course the computer doesn't come up with the clusters arbitrarily. It is done through machine learning. Yes, the number of clusters is arbitrary, as you acknowledge here. But there is no such thing as an 'optimal k' outside of a specific question. Again, clusters do not exist in reality. They are scientific tools that allow us to answer specific questions and test specific hypotheses. You are seeing 'optimal' and making the mistake of assuming that 'optimal' means 'correct.' It doesn't mean that. It means 'optimal for the parameters of our research question.' There are as many optimal values for k as there are different ways you can meaningfully analyze the data with different k values. This is how clusters work.

I am noticing a pattern here--you use your space to explain something, then sneak in a (willful?) misinterpretation of what I said, and hope that I let it go unnoticed. It hasn't worked so far, and it isn't going to.

If you pick 2 clusters, the algo will not give you "arbitrary opinions" as you wrote. Instead it will provide the 2 clusters that best split the data.

Here's your problem. You say that the algorithm does not give you arbitrary opinions. However, I have a very good friend who can debunk you right now. I will quote him directly: "The number of clusters is arbitrary." My friend is very smart and you stand no chance of defeating him. In his deep wisdom, he acknowledges that the number of clusters is arbitrary. That assumption, that arbitrary nature, follows the rest of the analysis. It is rooted in something arbitrary. Try to tell a peer reviewer, "Ok, yes, I know I chose the initial value arbitrarily, but I promise, the analysis which proceeded from that arbitrary value is NOT arbitrary!" It is by definition arbitrary. If you want to escape that, you MUST find a non-arbitrary way to determine your original number.

Please pay attention here, because you seem to have missed this point several times now. We are NOT telling the algorithm to create K racial groups!!! We are telling the algorithm to create K clusters from the genetic data. This is a VERY important distinction.

It is a completely unimportant distinction. I am pretty sure that you are again going to try to cross the boundary from science into your own personal opinion of how many races there ought to be, fail to justify why that number is correct, and hope that I don't notice that you just spouted a bunch of 101 cluster analysis stuff that you found just now on Google, all so that it would seem more legitimate when the science suddenly vanishes out the window like a stale fart.

Why? Because if racial groups were pseudoscience, then we would NOT expect the algorithm to return K clusters that align almost perfectly with our preconceived ideas about racial groups!!

Lol. Thanks. Really, I'm not psychic, I just know you already. I like you! Always have. What are our preconceived ideas about racial groups? You keep talking about these preconceived ideas over and over again. Preconceived ideas. Preconceived ideas. We have preconceived ideas. What ideas?? I asked you in my last post and you ignored the question. WHAT ARE YOUR PRECONCEIVED IDEAS ABOUT RACE? WHAT ARE OUR PRECONCEIVED IDEAS ABOUT RACE? Your preconceived ideas are not likely to be the same as mine. There are dozens, scores, hundreds of preconceived ideas of how many races there are. Sure, we tend to make small lists, but we are humans. We make small lists of everything. Small lists of gods, small lists of types of foods, small lists of animals, small lists of races.

But I know the answer already. By 'preconceived ideas,' you mean, quite specifically, the ideas of racialist thinkers of the 19th and 20th centuries, and their intellectual descendants--or, rather, you THINK you mean that. You don't know what they actually said. And hoo boy, buddy, I'll tell you what--you know I'm strong on genetics and cluster analysis. I know you are smart enough to recognize that no matter how much you pretend to call me uninformed. But I'm equally informed on racialist thought in the 19th and 20th centuries. That's where a lot of the anthropology comes in.

What preconceived ideas? The preconceived ideas of Thomas Huxley, of E.B. Tylor, of Blumenbach or Linnaeus or Meiners? Let's talk about their preconceived ideas about race. Let's name some races. Anglo-Saxon (dark & white variety), Teuton, Laplander, Finn, Sarmatian/Slav, Hindu, Celtic, Nord, Assyrian, Chaldean, Mede, Scythian, Parthian, Philistine, Phoenician, Jew (Jesus is white), Georgian, Circassian, Mingrelian, Armenian, Turk, Persian, Arabian, Afghan, Egyptian, Abyssinian, Guanche. Whew!! We have barely even covered any geography, and we have a score or more races!!

British physiologist William Lawrence, one of the most important of the racialist thinkers, wrote: "The Caucasian variety encompasses numerous races." Is this the preconceived idea of race you had in mind? I have a feeling that you disagree with Lawrence. Will you classify Caucasian as a race? No? European, then? Preconceived ideas, indeed! I think Lawrence is on to something.

The list goes on. There has NEVER been cohesion about 'preconceived racial ideas.' There is a snapshot in time, right now, where you believe there is some kind of unity of thought on this subject. There is not, and what scant unity you might try to point to unravels completely just 100 years back. Never mind 200. In fact, for most of Western history, 'race' was primarily wielded as a proxy for distinct kingdoms, a form of crude propaganda which tries to invoke phenotypic differences to drum up nationalistic fervor amongst a populace that was usually closely related to the enemy. This is exactly what your flair means. Your enemy is the Jews. Jesus was white. Jesus was a Jew. Your enemy is your fellow whites, just the ones who have more money and won't share it with you.

[–]Dragonerne (flair: Jesus is white) 3 insightful - 1 fun (0 children)

Before I start, let us clarify, YOU brought STRUCTURE into this discussion, not me. It's your citation, not mine. The first link you shared uses data from studies that used STRUCTURE as their analysis package. I'm surprised that you freely admit that you didn't know about it and are only reading about it now, because it was the very first thing that you brought into this debate.

No I didn't.

Again, clusters do not exist in reality. They are scientific tools that allow us to answer specific questions and test specific hypotheses. You are seeing 'optimal' and making the mistake of assuming that 'optimal' means 'correct.' It doesn't mean that. It means 'optimal for the parameters of our research question.' There are as many optimal values for k as there are different ways you can meaningfully analyze the data with different k values. This is how clusters work.

"It means 'optimal for the parameters of our research question.'" Yes, and when our research question is how to best group people based on their genetics, and it spits out 7 clusters that just happen to correspond to racial groups, then that answers our research question pretty well.

Here's your problem. You say that the algorithm does not give you arbitrary opinions. However, I have a very good friend who can debunk you right now. I will quote him directly: "The number of clusters is arbitrary"

Yes, the algorithm does not give you arbitrary opinions. As I said in my previous post, the number of clusters is arbitrary. Your friend is right about that.

That assumption, that arbitrary nature, follows the rest of the analysis. It is rooted in something arbitrary. Try to tell a peer reviewer, "Ok, yes, I know I chose the initial value arbitrarily, but I promise, the analysis which proceeded from that arbitrary value is NOT arbitrary!" It is by definition arbitrary. If you want to escape that, you MUST find a non-arbitrary way to determine your original number.

So first of all, as a general thing, this is factually wrong, because if you wanted to, you could train a simple deep layered network to optimize over an arbitrary input configuration. But that's more technical and not really relevant for our topic.
The arbitrary part of k-means is choosing k; how the clusters are made is not arbitrary. The clustering won't suddenly put Khoisan next to Swedes or cluster English/Aboriginals together + French/Paraguayans together. It is the number of clusters that is arbitrary, not the clusters themselves.

Try to tell a peer reviewer, "Ok, yes, I know I chose the initial value arbitrarily, but I promise, the analysis which proceeded from that arbitrary value is NOT arbitrary!" It is by definition arbitrary

That's standard practice in data analysis.

It is a completely unimportant distinction.

No, it's not. Until you understand this distinction you will never understand why you are mistaken. This is why I told you to pay close attention. This distinction is the basis of your misconception. It either stems from you not knowing how k-means clustering works or from you having a very low IQ and having trouble understanding how the distinction matters. In the post down below you write "I've barely made any reference to hair, eyes, nose, etc. You are the one who keeps bringing that up, because you are used to people raising those attributes to try and attack the idea of race. But I'm not doing that. These aren't the arguments I made." but this is exactly why this distinction is important.
Why do the k-means clusters align with our concept of race (puts English, Swedes, French together in 1 cluster, puts Khoisan, Bantu in another, puts Chinese, Koreans in another)? If race was a bogus concept we would expect the clusters to put individual Swedes in the different clusters, the French mixed with the Koreans, other Koreans mixed into a cluster with a subset of some Africans and so on.
But it is not randomly throwing individuals together into clusters, it neatly puts them into racial clusters. This is the killer argument and before you understand the distinction, you will forever be mistaken.

What are our preconceived ideas about racial groups?

Things like Swedes are genetically similar to other Swedes. Whites are genetically similar to other whites and so on. You could break up Sweden and find that some subset of Swedes are closer together than other subsets of Swedes.

And hoo boy, buddy, I'll tell you what--you know I'm strong on genetics and cluster analysis

You are definitely not strong on cluster analysis. I've found no faults with your understanding of genetics so far. However with cluster analysis you give off the vibe of someone very uneducated.

There are dozens, scores, hundreds of preconceived ideas of how many races there are.

You seem to be stuck on the misunderstanding that any race realist thinks there is a FIXED set of races. You're debunking a position that literally no one has. We don't care if you set k=3 or k=7 or k=21. They all show a clustering of racial groups and that's what race is. Swedes together, French together, Bantus together etc.

Let's name some races. Anglo-Saxon (dark & white variety), Teuton, Laplander, Finn, Sarmatian/Slav, Hindu, Celtic, Nord, Assyrian, Chaldean, Mede, Scythian, Parthian, Philistine, Phoenician, Jew (Jesus is white), Georgian, Circassian, Mingrelian, Armenian, Turk, Persian, Arabian, Afghan, Egyptian, Abyssinian, Guanche. Whew!! We have barely even covered any geography, and we have a score or more races!!

The brilliant part is that if we cluster individuals from these groups, they will likely cluster together in the same manner as we humans cluster them. Laplanders together, Teutons together, Chaldeans together etc. If some groups cluster with other groups, it's likely due to close proximity / racial mixing between the groups or a shared ancestry.

"The Caucasian variety encompasses numerous races." Is this the preconceived idea of race you had in mind?

Yes

This is exactly what your flair means. Your enemy is the Jews. Jesus was white. Jesus was a Jew. Your enemy is your fellow whites, just the ones who have more money and won't share it with you.

Let's keep the discussion on topic.

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (1 child)

If racial groups were pseudoscience, we might find that the algorithm would return K clusters that happens to correspond to hair type groups, or eye color groups, or nose length, or height, or IQ, or whatever random group you might think of.

I've barely made any reference to hair, eyes, nose, etc. You are the one who keeps bringing that up, because you are used to people raising those attributes to try and attack the idea of race. But I'm not doing that. These aren't the arguments I made. Again, you are doing what you know. You are responding to the arguments you know how to deal with, not the ones that I am raising. I'm not thinking of any random groups. That's what you're thinking of. I never said race does or should correspond to these attributes, in fact I mocked that idea in my last post. I said that caring about those things first is monkeybrain stuff, the obvious externalities that we evolved to notice so we could make quick and dirty approximations of who we are likely related to and who we are likely not related to.

[–]DragonerneJesus is white 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (0 children)

This post is exactly why you are mistaken. You misread what I wrote.

We DON'T find that the K clusters correspond to hair type clusters
We DON'T find that the K clusters correspond to eye color groups
We DON'T find that the K clusters correspond to nose length
We DON'T find that the K clusters correspond to whatever

We DO find that the K clusters correspond to RACIAL GROUPS

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (14 children)

But to answer your actual point, why on earth would we expect k clusters to align to those attributes?? As I just explained, WE care about that stuff. WE notice that stuff. To a computer, a gene is a gene. A gene that makes a blue eye is no more or less a gene than one that is not expressed. We definitely would not expect k clusters to align to those attributes in ANY case, whether race is pseudoscience or not.

Whoever was using this argument you are responding to, I agree that it isn't a great one. That's why I don't use it.

But AGAINST ALL ODDS, the neutral algorithm returns K clusters that just happens to correspond to our racial groups! This is a wild coincidence.

•what racial groups •what racial groups •what racial groups

Please tell me what these racial groups are. Please tell me what our 'preconceived ideas about race' are. You keep saying this, as if this is some information we all know. It isn't. White people in the 19th century didn't know this. The world doesn't 'know' this. I don't know how many different ways I can ask you to clarify here, but you keep ignoring me. Again, though, I know why. You are hoping I gloss over this. You know that being put on the spot and asked to say precisely what 'our preconceived ideas about race' are opens up a biiiig can of worms, and you do not want to put yourself in an even more defensive position where it will be even easier for me to poke holes in your arguments. I mean, I readily admit that I have the high ground here. You are the one claiming a positive position: "race is. Race is, this. Our ideas are, this." All I have to do is nitpick. I don't need to be right, all I have to do is show that you aren't right.

I could write the clustering algorithm that you're using from scratch, it is nothing special

I'm sure you could. I presume you are a professional or at least a serious hobbyist programmer, and I sure as heck know you aren't a geneticist or a researcher who does cluster analysis.

I don't know if this is a case of low IQ or you just not being familiar with how the k-means algorithm works.
https://youtu.be/HVXime0nQeI
https://youtu.be/4b5d3muPQmA

And there it is. I see this as the moment where I won the debate, actually. You may recall, my first post in this thread was a reply to someone who descended into calling another poster a retard. In my reply, I asked, "Are you going to call me a retard now as well?" And here, you just did. The very first point I made in this thread is that when you opt to just call someone a retard, it can look quite like you are just frustrated because you are losing.

I know you aren't going to change your mind, not tonight. That isn't how these debates work. But I already know that you know I am not stupid. I know you aren't stupid either. I already did get through to you--you are not going to change your mind, I'm not a miracle worker, but you are going to be more careful with your wording in the future. You'll refine your arguments. See, I know so much about you. You feel that your intelligence is not sufficiently appreciated. And it isn't. You want to weigh in on social issues, many of which you have vastly superior answers to than the majority of society, and yet you are silenced on grounds outside of your control. So you throw your lot on with Race and Country to reclaim some of that value that is owed to you. But the truth is, man, you're a mongrel. Europeans were already mostly mongrels in the 20th century. You're no shining beacon of Whiteness. You're a smart guy who works a job he mostly dislikes, desperate to find some secret truths that validate your intelligence, because society refuses to do so. I fucking know you, man. Of course I know you. When was the last time you even wrote back and forth this much with someone? We're basically penpals now. I'm an anthropologist, I've been reading between the lines this whole time. I know exactly who you are.

[–]DragonerneJesus is white 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (0 children)

But to answer your actual point, why on earth would we expect k clusters to align to those attributes?

Unless the clusters are totally random, then we would expect the clusters to align to SOMETHING. Those attributes were just examples of what the clusters DON'T align to. You are free to be creative and find other attributes, phenotypical or genotypical or otherwise, that the clusters DON'T align to.
The interesting part is that AGAINST ALL ODDS, the clusters align to race.
Ask yourself the opposite question (of the quoted): why would we expect to find that the k clusters align to race if race was a bogus manmade concept that had no relation to biology? How crazy is it that the k clusters just happen to correspond perfectly to our bogus manmade concept? That's like 1 in a quadrillion. And not just once? Every time we run the algo and no matter the initial conditions. Definitely a weird coincidence that keeps repeating itself.

Please tell me what these racial groups are.

The idea that we can cluster human beings into clusters of genetic similarity. A Swede won't be more similar to a Gambian than he is to another Swede. You might find examples of "Swedes" being more similar to say Danes than other Swedes but this is due to the color spectrum fallacy.

And there it is. I see this as the moment where I won the debate, actually. You may recall, my first post in this thread was a reply to someone who descended into calling another poster a retard. In my reply, I asked, "Are you going to call me a retard now as well?" And here, you just did. The very first point I made in this thread is that when you opt to just call someone a retard, it can look quite like you are just frustrated because you are losing.

No. You display a repeated misconception of how the kmeans algorithm works when it does its clustering. This is why I sent you two introduction videos so that you may educate yourself. This would help you correct your misconception and help you understand my argument. Because right now you do NOT understand my argument.
This could either be because you do not understand how kmeans works or because you have a low IQ. I assume, and I really hope, that it is because you don't understand how kmeans works. If it's an issue with IQ, then we will be stuck.
You might see this as a winning moment but I am merely responding in kind. If you do not want this kind of response, then cut out your arrogance. I have also told you to do this or I would respond in kind. And I will continue to do this as long as you continue to do this.

Your first response to me showed me that you lacked basic knowledge of this topic and to be honest I felt that it was a loss. I was hoping to learn something new that I had not considered, or gain a new insight. I am sure that you have knowledge that I don't have that can help broaden my perspective, but before we reach that, you will have to humble yourself and start listening to the arguments I am putting forward, because you are not listening or not understanding them. If it is a lack of understanding, watching a few videos could help sort out that misunderstanding and we could move on from there onto more interesting insights and angles. This is 101 stuff of cluster analysis that you are not understanding and it's a shame.

But I already know that you know I am not stupid. I know you aren't stupid either.

Yes, but someone can be low IQ and not be stupid. Most people that aren't stupid have a sufficiently high IQ, and I assume you to be high IQ too, and that you just lack knowledge in certain aspects of cluster analysis which leads you to a misunderstanding of how something works.
You seem to have the misconception that it's a given that the kmeans clustering algo returns k racial clusters but it's not a given. We didn't ask it to give us RACIAL clusters. We just told it to give us K whatever clusters and it just happened to return whatever=racial. Why didn't it give us K eye color clusters? Because human beings aren't clustered by eye color, but by race.

I know exactly who you are.

You really don't. You were right about one thing. I haven't discussed this topic in a long time, because 1: it's already settled science and 2: it's a banned topic on social media so literally impossible to have back and forths about. You're an anthropologist? The anthropology subreddit has this as their rule 3:
""Race realism", "human biodiversity", conspiracy theories, and pseudoscience will be removed as will any other content that is incorrect or not supported by reputable scholarship."
So yeah, it's hard to challenge someone with expertise on this subject when it's a banned subject.
Like you told another user, you wouldn't want to dox yourself, because just having a debate with alt righters would put your career in jeopardy. Well, it's the same for me, so you might understand why I don't regularly have a back and forth like this.
I used to, back before social media started mass censoring taboo subjects.

Europeans were already mostly mongrels in the 20th century.

Oh, noo!

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (12 children)

But, just so you don't accuse me of trying to obscure the point, I'll answer: your point... it's moot. You debunked yourself already. Yes, the k value is still arbitrary. You admitted that. You admitted that the choice of the number is arbitrary. That proceeds into the analysis. Once one arbitrary distinction is made, the entire analysis remains, at least to that extent, arbitrary. It doesn't necessarily become MORE arbitrary during the application of the algorithm, unless other arbitrary assumptions have been woven into the code. And just because an analysis contains an arbitrary component does not mean it is incorrect or useless. But it at least maintains the arbitrary quality from the first decision of the number, as you already said.

Once an element of arbitrary decision making enters your analysis, the entire analysis remains subject to challenge on those grounds, even if no additional arbitrary decisions are made.

This isn't the night before your stats final in junior year, bro. I'm a geneticist.

I did watch your videos, because I know what's up here. You've gotten desperate. When people get desperate, they mess up. At this point in the debate, I know these videos will back ME up more than you. So let's dig in.

YOUR source says: "In this case the data make three relatively obvious clusters. But rather than rely on our eye, let's see if we can get the computer to identify the same three clusters."

Sounds familiar! I refer you to my previous post:

Its intended function is very much like an ANOVA or MANOVA. It is a confirmation test, a way to say: "Hm, I am pretty sure I see 3 clusters here, at this scale, and I do indeed want to use 3 clusters in my analysis. However, I worry that if I eyeball it like this, the peer reviewers will take issue with that. What I can do instead is use this algorithm to confirm that, at this scale, the computer also sees 3 distinct clusters. It might seem obvious, but this way, my reviewers won't be able to chastise me for eyeballing the chart. It is obvious that I see 3 clusters here, but this is just one little test I can use to not make it seem like I am choosing to see the 3."

Damn. That's exactly what I effin' said. YOUR source is on MY side! This is a CONFIRMATION test designed to help you double down on your assumption! You don't even use it until you have ALREADY decided that you want to see x number of clusters! It literally says it, right here, in YOUR source!

It goes on: "Step 1: select the number of clusters you want to identify in your data." That I want to identify? That doesn't sound very scientific. That sounds like a fancy way to eyeball something. Which is exactly what I said in my last post, before you shared this video. And exactly what the video says. Hell, they even reference eyeballs, too. I could not have asked you to send a source that is more damning to your own position and more supportive of mine

[–]DragonerneJesus is white 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (0 children)

Once an element of arbitrary decision making enters your analysis, the entire analysis remains subject to challenge on those grounds, even if no additional arbitrary decisions are made.

Well put. It's open to challenge on those grounds. On those grounds. You are free to attack that we have 7 clusters instead of 5 but it's not really relevant to the concept of race.
Common misunderstandings, often intentionally so, consist of 1: Races are genetically distinct and 2: Fixed number of races
Neither of those opinions is held by any race realist. But in every study that tries to debunk race, they debunk those points or a variant of those points. Points that no one holds.

YOUR source says: "In this case the data make three relatively obvious clusters. But rather than rely on our eye, let's see if we can get the computer to identify the same three clusters."

This is an introductory video. He is showing you how the algo works. He is showing you that it works.
You don't want to actually eyeball it and then determine the number of clusters. Like.... in most data analysis we work with thousands or millions of dimensions and you can't exactly "eyeball" that.
If you ran the elbow test, you would also find that k=2 is worse than k=3 and that k>3 does not provide any meaningful improvement.
The whole point of these algos is to not rely on our eyes but to let the computer cluster high dimensional data (like genetics)
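
To make the elbow heuristic concrete, here is a minimal sketch in plain Python on made-up 2D points (three synthetic blobs, nothing to do with genetic data). The helper names, blob centers, noise level and restart count are all illustrative assumptions, not anything taken from the video or from a particular library. It runs Lloyd's k-means from random starting centroids several times for each k and keeps the lowest within-cluster sum of squares (WCSS).

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def mean(pts):
    """Coordinate-wise mean of a non-empty list of 2D points."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def kmeans_wcss(points, k, rng, iters=100):
    """One run of Lloyd's algorithm from random starting centroids; returns WCSS."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its old centroid.
        updated = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min(dist2(p, c) for c in centroids) for p in points)

def best_wcss(points, k, restarts=30, seed=0):
    """Lowest WCSS over several random restarts (guards against bad local minima)."""
    rng = random.Random(seed)
    return min(kmeans_wcss(points, k, rng) for _ in range(restarts))

# Hypothetical toy data: three tight blobs of 30 points each.
data_rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
points = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
          for cx, cy in blob_centers for _ in range(30)]

wcss_by_k = {k: best_wcss(points, k) for k in range(1, 7)}
for k, w in wcss_by_k.items():
    print(k, round(w, 1))
```

On data like this the printed WCSS should fall steeply up to k=3 and then flatten, which is the "elbow". The sketch only shows how the heuristic picks k on clean synthetic blobs; on data without well-separated groups the curve can be smooth and the choice of k stays a judgment call.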

You don't even use it until you have ALREADY decided that you want to see x number of clusters! It literally says it, right here, in YOUR source!

Yes, but I think maybe you are stuck on the assumption that race realists care about a FIXED number of races? You can put x as 3, 5, 7, 23 if you want. Swedes won't be put into the same cluster as Ethiopians, while Danes are put in another.

I could not have asked you to send a source that is more damning to your own position and more supportive of mine

What I had hoped you would get from the video was that we choose the number of clusters, not HOW it clusters. "HOW" here being that it just so happens to cluster on RACIAL grounds, not any other feature/attribute/whatever.
The algorithm doesn't even know that it is looking at genetic data. It could've been about finance and it thought that it created clusters that describe different spending habits. It's just getting numbers and returning some clusters.
To our surprise these k clusters just so happen to be RACES.

This is me repeating the distinction hoping you at some point will start to understand that the distinction is important.

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (10 children)

Now, this is medicine. About cancer, specifically. You might not recall since you haven't watched this since junior year before that test, but it's ok to make arbitrary assumptions when we have a purpose. If it helps to cure cancer, what does it matter if we assumed some clusters that might not be totally accurate? We're trying to preserve life here. If you made THAT argument for race, a social, non-scientific one, the anthropologist in me would have more sympathy. Show me that at the end of the day it is best for everyone to see things your way, but don't try to claim the mantle of science when it doesn't fit you.

Of course, this analysis COULD be wrong. What if there is a 4th cancerous cluster, a small one, just forming, that the assumption of 3 misses? The chemo might miss it. The surgeon might miss it. This hypothetical person could literally die due to a false assumption.

This entire video is rooted in the premise that we have a goal: identify cancer cells. If we fail to identify cells, that is bad. A bad-fit model fails to identify the cells. A perfect model identifies all of them. The correct number of clusters depends on our question, in this case, "How many cancerous cells are there, so we can kill them?" If we apply this logic to race, there is no clear goal. You keep talking about 'preconceived ideas' that you refuse to define. But there are so many different kinds of preconceived ideas about race. There either IS or IS NOT a cancer cell. It is not social. But 'preconceived ideas about race' ARE social. In order to find the reality of it, we must ask WHY humans form ideas about race, how they go about doing it, what parameters we use to do it, how those parameters have changed over the development of society, and how our criteria for racial classification evolved in the ancestral environment. You don't seem interested in any of those scientific questions. You just want to stop at 'preconceived ideas about race' because that is what is important to you, not the scientific method.

By the way, StatQuest is run by the genetics department at UCNC. Josh seems like a smart fellow. If you came to Josh and asked him about race, what do you think he would say? I mean, it's an academic genetics department. They tend to be pretty woke. Do you think he would agree with your assessment of race here?

Let's look at your first video now (I watched them out of order): "Step 1: Start with a dataset with known categories."

Well, fuck. Sorry, man. Step 1 knocked you out of the game again. KNOWN CATEGORIES. So we are supposed to come into this KNOWING the categories at STEP ONE, so says your source. And yet you have been speaking this whole time as if the analysis will GIVE us the categories. See, this is also what I said in my last post. I knew you had to try and shoehorn in 'preconceived ideas about race' that you refuse to define, because you know that this analysis requires KNOWN CATEGORIES at STEP ONE. You can't get this stuff past me, man.

So, the point of the video is to try and classify an unknown category based on a known category. So, presumably, you share this to try and argue that if we KNOW person A is African, we can determine the racial category of person B who happens to be a neighbor of person A in the cluster analysis. But don't you see the problem? You already classified person A before you started. You didn't arrive at their race through the analysis, you presumed that you have some other means of knowing that 'African' is their correct racial designation. Which is all well and good--probably, most of us would agree about what an 'African' looks like, as long as we don't include white South Africans, Egyptians, Moroccans, Mauritians, etc. etc. That is a lot of exceptions. But we are walking into this with the categories in-hand. As step 1. Not determined by objective scientific analysis. And I didn't even invoke the k problem, which persists in this video. Specifically, Josh points out that if you choose a k of 1, it will classify based on 1 neighbor. If you choose a k of 11, it will classify based on 11 neighbors. Josh is a great educator! I already knew this, but it is so nice to have him debunk your arguments for me. Wait, did you share these videos, or did I??

Thanks for the homework. It does come across like you are trying to bully me into not replying by just regurgitating links at me because you can't make your case effectively, but I like refreshers so I enjoyed watching them.

Assuming its not an issue of low IQ (because then we can keep going back and forth forever),

I know. There are three ways to 'win' online like this. One, someone gets tired of typing and leaves. That isn't much of a victory though. Two, the loser gets frustrated and starts calling the other person a retard with a low IQ. Also not satisfying. Three, and I know I already won here, the loser realizes that they have to refine their arguments. They don't change their mind, not yet, but they are more careful with their words the next time around, or perhaps avoid the debate entirely for fear of getting spanked again.

we don't tell the algorithm to give us the racial clusters. We tell the algorithm to give us K clusters. And these K clusters HAPPEN to be the racial clusters.

That's precisely the problem. You are saying that you know the clusters in advance. You know what the races are in advance. You are starting from the premise, "I am correct. I am right. Let's use this program to find out more about my correct answer." This is the definition of circular logic. You are wrong before you even started! YOU DO NOT KNOW THE CATEGORIES! You won't even tell me them, even though I keep asking!

[–]DragonerneJesus is white 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (0 children)

Of course, this analysis COULD be wrong. What if there is a 4th cancerous cluster, a small one, just forming, that the assumption of 3 misses? The chemo might miss it. The surgeon might miss it. This hypothetical person could literally die due to a false assumption.

It doesn't really matter to the concept of race if there is a 4th race that we are not getting, because we used k=3

If we apply this logic to race, there is no clear goal.

People have done studies that try to align genetic clusters with self-identified race and they match like 98-99%, with the mistakes being boundary cases, which is to be expected, because the races are not genetically distinct but continuous along a spectrum.

But 'preconceived ideas about race' ARE social.

That's true. It is interesting that our social concept of race aligns so perfectly with the biological reality.
And here I am not talking about 'race' as defined by sociologists where it's mostly just a political label or a label that society racializes you into.
Because if humanists get to define all words, then sure, I will happily admit that race is a bogus concept right here and now and that race has no biological meaning. With these definitions that humanists use, race as a concept has been HEAVILY debunked by science. Irrefutably so.
The only problem is that no one ever claimed that debunked position.
What's not debunked is the position of race realists.
The simple fact that you can use kmeans on genome wide population data and get racial clusters, 5 clusters, 3 clusters, 7 clusters, x clusters, is proof that race is real. And I will repeat myself again, because you failed to understand the distinction. We can use k-means to get k racial clusters - this proves that race is real, because kmeans does not return k racial clusters, it returns k clusters. The fact that these clusters are RACIAL clusters and not EYE COLOR clusters or SPENDING HABIT clusters or whatever else clusters is the proof.

By the way, StatQuest is run by the genetics department at UCNC. Josh seems like a smart fellow. If you came to Josh and asked him about race, what do you think he would say? I mean, it's an academic genetics department. They tend to be pretty woke. Do you think he would agree with your assessment of race here?

StatQuest is hilarious and he is better at explaining some concepts than a lot of professors or textbooks. I always recommend people to watch his videos because he breaks every concept down in ELI5 formats. Would he agree with me on race? I heavily doubt it

Well, fuck. Sorry, man. Step 1 knocked you out of the game again. KNOWN CATEGORIES.

You really should've watched these videos to learn and not to win a debate. Ok, so K-nearest neighbors, the subject of that video, is one of the simplest classification algos: you start with known categories and then you see which category an unknown person's nearest neighbors belong to. The Kmeans algo is slightly different but builds on the same idea and your program STRUCTURE uses a variant of the Kmeans algorithm.
In Kmeans you don't start with KNOWN CATEGORIES, but rather (usually) random initialization of the "center" of each k cluster.
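
The difference between the two algorithms can be sketched in a few lines of plain Python. This is a toy illustration on eight made-up 2D points, with hypothetical function names, and it says nothing about how any genetics package is implemented. `knn_predict` cannot run without labels supplied in advance (its k is a number of neighbors), while `kmeans` receives only raw points, starts from randomly chosen centroids and returns arbitrary integer cluster ids (its k is a number of clusters).

```python
import random
from collections import Counter

def dist2(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def knn_predict(labeled_points, query, k):
    """Supervised: majority label among the k labeled points nearest to query."""
    nearest = sorted(labeled_points, key=lambda pl: dist2(pl[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            for p in points]

def kmeans(points, k, restarts=20, seed=0, iters=100):
    """Unsupervised: random initial centroids, best of several restarts by WCSS."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        centroids = rng.sample(points, k)  # random initialization, no labels
        for _ in range(iters):
            labels = assign(points, centroids)
            updated = []
            for i in range(k):
                members = [p for p, lab in zip(points, labels) if lab == i]
                if members:
                    updated.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
                else:
                    updated.append(centroids[i])  # empty cluster keeps its centroid
            if updated == centroids:
                break
            centroids = updated
        labels = assign(points, centroids)
        wcss = sum(dist2(p, centroids[lab]) for p, lab in zip(points, labels))
        if best is None or wcss < best[0]:
            best = (wcss, labels)
    return best[1]

# Hypothetical toy data: two well-separated groups of four points.
unlabeled = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
labeled = [(p, "a") for p in unlabeled[:4]] + [(p, "b") for p in unlabeled[4:]]

print(knn_predict(labeled, (0.5, 0.5), k=3))  # needs the "a"/"b" labels as input
cluster_ids = kmeans(unlabeled, k=2)          # sees only coordinates
print(cluster_ids)
```

The restarts are there because a single random initialization can settle into a poor local minimum even on data this simple; keeping the run with the lowest WCSS is a common workaround.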

That's precisely the problem. You are saying that you know the clusters in advance. You know what the races are in advance. You are starting from the premise, "I am correct. I am right. Let's use this program to find out more about my correct answer." This is the definition of circular logic. You are wrong before you even started!

No, kmeans doesn't use KNOWN CATEGORIES, so your argument is moot. This also shows me that I was completely right. You DID NOT know the fundamentals of how your program STRUCTURE does the clustering. STRUCTURE does not use k nearest neighbor (knn). I shared that video because I wanted to teach you the basics. Knn is a prerequisite for kmeans. I hoped to build up your knowledge of cluster analysis, because these are the basics and they would help you understand my point and apply your knowledge of genetics more appropriately. I am certain that once you get the fundamentals of how clustering actually works you can then expand further and provide new insight to me using some knowledge where you have more expertise than I do.
I'm trying to learn and I am humble enough to realize that you likely know some stuff better than me, but you seem unable to humble yourself and the result is that you come off as arrogant and uneducated. Like in this comment you're heavily trying to debunk my arguments about your own program and you don't even know the basics. You confuse knn with kmeans.
I'm not really annoyed because I've come to expect this kind of behaviour from people in your camp of the debate. You've been taught that we are stupid and that you know it all and then your camp has outlawed all discussion on the topic so that your view is never challenged.
When your type then gets into a debate with one of us, you're misrepresenting our views (because you've been lied to by your educators) and you're debunking concepts that no one holds and you're displaying an extreme lack of knowledge on the subject, often citing 50 years old fallacies. I'm not saying you are doing all those things, but this is what your type usually does and that's why I have come to expect this behaviour and am more tolerant towards your admittedly rather nasty & unmannered behaviour.

With that said I am pleased that you're not resorting to the 50 year old fallacies of "more within than between" that 99% of students are still taught in class. Got to uphold that pseudoscientific narrative, eh?

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (7 children)

More from your source: "Low values for k (like k=1 or k=2) can be noisy and subject to the effects of outliers. Large values for k smooth over things, but you don't want k to be so large that a category with only a few samples in it will always be outvoted by other categories."
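
The quoted passage describes the k of k-nearest neighbors, which is the number of labeled neighbors that vote on a new sample. It is a different parameter from the k of k-means, which is the number of clusters. A minimal sketch of the voting effect, assuming a toy labeled set in which the hypothetical category "small" has only two samples:

```python
from collections import Counter

def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query.

    train is a list of ((x, y), label) pairs, so labels are required up front.
    """
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy labeled data: two "small" samples near the origin, eight "large" ones far away.
train = [((0.0, 0.0), "small"), ((0.2, 0.1), "small")]
train += [((5.0 + 0.1 * i, 5.0), "large") for i in range(8)]

query = (0.1, 0.1)  # sits right between the two "small" samples
print(knn_predict(train, query, 1))  # small
print(knn_predict(train, query, 3))  # small (2 votes to 1)
print(knn_predict(train, query, 9))  # large (the 2 "small" votes are outvoted by 7)
```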

Well, that's convenient. You keep saying that the clusters match our 'preconceived ideas about race,' but there are quirks in this method of analysis that make it not work well with low or high k values. So there is a sweet spot of values for k, which happens to correspond to the numbers of races that you like (even though you refuse to give me any of those numbers) and does not correspond to the values for k which you dislike. To be clear, there is no fundamental reason why a low or high k value would not be the best fit for the model. With a perfect dataset, with perfect computing power, we would be able to eliminate nearly all noise. It just so happens that the limitations of the method do not work very well with low or high k values, so you are going to reject low or high numbers of races out of hand. You are, as programmers often do, letting the algorithm do the thinking for you. What if your tool is not fit for the job? All you have is a hammer, so your problem looks like a nail.

It returns 7 clusters that nicely correspond to 7 racial groups.

Is this it? Are you finally going to share these mysterious 'preconceived ideas about race' with me? Or are you only using 7 because I used 7 earlier? Please tell me what these known categories are. I certainly don't know them. The racialist thinkers of the past didn't know them. I doubt most people you ask would be able to guess precisely what your preconceived ideas are, nor would you be able to guess theirs.

[–]DragonerneJesus is white 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (0 children)

Well, that's convenient. You keep saying that the clusters match our 'preconceived ideas about race,' but there are quirks in this method of analysis that make it not work well with low or high k values. So there is a sweet spot of values for k, which happens to correspond to the numbers of races that you like (even though you refuse to give me any of those numbers) and does not correspond to the values for k which you dislike. To be clear, there is no fundamental reason why a low or high k value would not be the best fit for the model. With a perfect dataset, with perfect computing power, we would be able to eliminate nearly all noise. It just so happens that the limitations of the method do not work very well with low or high k values, so you are going to reject low or high numbers of races out of hand. You are, as programmers often do, letting the algorithm do the thinking for you. What if your tool is not fit for the job? All you have is a hammer, so your problem looks like a nail.

See? Building your fundamentals was the right thing to do in order to get us into a higher quality of debate.
I think you might raise a good point here that I would like you to expand upon if you so desire. I get the overall gist of your argument and it might hold some merit that's worth exploring some more.

Is this it? Are you finally going to share these mysterious 'preconceived ideas about race' with me? Or are you only using 7 because I used 7 earlier?

Yes, only using 7 because you used 7 earlier.
You could put 3 races and it would maybe return europeans, africans and asians. You could then put 4 and it would maybe return europeans, africans, asians and oceanians. Or 5 and it would include latinos/hispanics/indians. As you increase the number of clusters it will fine tune the races.
If you have 1000 samples and you put k=1000, then it would simply return each sample as a race, which is why it's not very good with high k.
Likewise, if you pick too low a k, it will combine "clusters that ought to be" in weird ways. Again, it's worth repeating, since you didn't understand why the distinction earlier was important. It's not the number of clusters that is important, it's the fact that the clusters correspond to RACIAL clusters that is important. We didn't tell the algo to find k racial clusters, we told it to find k whatever clusters, and these "whatever clusters" happened to be "racial clusters", swedes with swedes, gambians with gambians.
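
The k = number-of-samples case, and the elbow method mentioned earlier in the thread, can both be sketched by computing the within-cluster sum of squares for several k. This is a rough illustration on synthetic 2-D points, assuming the same plain k-means as a stand-in; `kmeans`, `inertia`, and `curve` are hypothetical names, and nothing here is real genetic data.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means with randomly chosen starting centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(centers[c])  # an empty cluster keeps its center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign

def inertia(points, centers, assign):
    """Within-cluster sum of squared distances (the quantity the elbow plot shows)."""
    return sum(sq_dist(p, centers[a]) for p, a in zip(points, assign))

# Synthetic data: three blobs of 8 points each, 24 points in total.
data_rng = random.Random(2)
points = [(data_rng.gauss(mx, 0.4), data_rng.gauss(my, 0.4))
          for mx, my in [(0, 0), (6, 0), (3, 5)] for _ in range(8)]

curve = {}
for k in [1, 2, 3, 4, 6, len(points)]:
    centers, assign = kmeans(points, k)
    curve[k] = inertia(points, centers, assign)
    print(k, round(curve[k], 3))
```

At k equal to the number of points every point is its own center, so the sum is exactly zero. The elbow is the k after which the curve stops dropping sharply, and it is usually read off by eye, so it is a heuristic rather than a proof that any particular k is the right one.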

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (5 children)

The elbow method won't return k=49 after having returned k=7 on the same sample. But I can see some situations where the returned k might differ if we introduce a randomized initial configuration of the k-means algo.

No, you completely missed the point. You're tunnelvisioning, as programmers do (a strength and a weakness). I'm not saying that there is a stochastic element in machine learning. We all know that. I'm saying that there is no good reason not to run the analysis again on the 7 clusters and get several more clusters. If we are talking about cancer cells, we don't really have a good reason to do this, unless we have some reason to believe that there is a yet-smaller tumor to find (though I doubt you would simply redo the analysis on returned clusters for that, what with signal degradation). But we have many, MANY more races, historical and contemporary, to identify with repeated cluster analysis of 'known' races. It's not just 'African.' What of all the African subtypes? Dozens more races, all determined with a non-arbitrary k, right? Well, except for that it is arbitrary, like you admitted.

[–]DragonerneJesus is white 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (0 children)

But we have many, MANY more races, historical and contemporary, to identify with repeated cluster analysis of 'known' races. It's not just 'African.' What of all the African subtypes? Dozens more races, all determined with a non-arbitrary k, right? Well, except for that it is arbitrary, like you admitted.

Yes! And that's what's so wonderful about using computers to tell us how best to cluster the racial groups. We can start researching if Swedes really are that different from Danes, or we can see if Finns did cluster with the mongoloid race like some Americans claimed 100 years ago, and so on. We can start seeing how ancient races compare to modern races. Where ancient individuals cluster into modern races. Where modern individuals cluster into ancient races.
Things like this are just very exciting.

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (3 children)

Are you of the misconception that race realists believe that there exists a fixed number of races? This is not the case. No one holds that position.

Cop out. You refuse to answer the question because you know you can't. There are as many races as you want to see. Just because a bunch of people have decided that they refuse to answer a question because they can't, that doesn't mean it is a defensible position. It isn't. This is your 'turtles all the way down' moment. Scientists DO have an answer for how many species of sea cucumbers we have documented. They DO have an answer for how many subspecies of grey wolves there are. Why don't YOU have an answer? The rest of the scientific world isn't shy about this when it comes to taxonomic classification, but you have got cold feet all of a sudden.

Are you conflating me with someone else?

Nope. No conflation. I mean exactly what I said. Your first link contains data from studies that were conducted using STRUCTURE. They are landmark studies, often the first cited in these discussions, and cite them first you did.

I simply want to argue that race is real.

Of course race is real. I would never say something so ridiculous as 'race isn't real.' Race is one of the most consequential and painfully real things in the modern world, perhaps the single most consequential. But it isn't a scientific concept. It is a social construct emerging out of the biological reality of our intuitive, cognitive racial-classification modules. In fact, with reference to those modules, in a way, race IS biology. Not in the way people think of it, as a real attribute of human population genetics, but as a little part of our brain that has evolved to see race wherever we look, because so far it has proved adaptive.

The same objections that you're using against race can be used to deconstruct the concept of species.

No, they can't. As I explained before, you are using the color spectrum argument that you already admitted you reject. I say that SPECIES is a legitimate taxonomic classification and SUBSPECIES is not. You say that my same gripes with subspecies can deconstruct species as well. This is identical to someone saying that the gradient of colors shows that there cannot be an actual yellow, an actual orange, an actual green. The existence of intermediaries does not disprove the existence of the discrete categories. Subspecies is an intermediary between 'species' and 'individual.' It is undefined in science, or, rather, it has so many definitions as to render it mostly meaningless outside of very specific bodies of literature. Are there glimmers of inconsistency in species categories? Of course. There are discrepancies between biological, phylogenetic, cladistic species, etc. That does not mean that the vague and undefined intermediary (subspecies) somehow delegitimizes the defined and specific category (species). That is the color spectrum argument. You said you disagreed with it (even though I never brought it up until you did), and then you used it to try and delegitimize the species concept.

No, it didn't.

Yes, it did. I have an excellent source for this. YOUR video. Didn't quite remember Josh saying that one bit, eh? ;)

I have been very generous in that I have willingly gone into the territory you chose, stats and ML, just to show that you will lose even on your home turf. But we have hardly even explored the anthropological assumptions in your argument. What are our preconceived ideas about race? How do you KNOW what the clusters are in advance? What is this information that you refuse to talk about? It is absolutely imperative to your argument. You keep saying it over and over, so obviously it is important. What are our preconceived ideas about race?

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (10 children)

I mean, the clusters ALWAYS correspond to something. Always. If we have one cluster for every single individual in the study, those clusters will correspond perfectly to the number of individuals in the dataset. It will be a 1:1 correspondence. Does that mean there are about 8 billion+ races? Yes. It does. It is the same method that you used to get 3, or 7, or 12, or however many you wanted. There are as many races as you want to see. All an algorithm can do is put a robot-buddy next to you who obsequiously repeats what you programmed him to say. The reality of genetics is in the real genes themselves, genes that modify us. You are trying so hard to turn that reality into some kind of scientific argument for a given k value, and you have failed to do so.

I did not cede any ground. I hypothetically assumed your position for a moment to demonstrate a damning inconsistency inherent within it. Come on, man. I know I was a bit snarky in my previous message, but don't play this game. It's as if I said, "Even if I were to assume your position is true, then there would be this problem of--" "HA!! You just assumed my position is TRUE, I got you, you snake in the grass! You admitted it!" You're clearly smart enough to know the game you are playing there, and so you are smart enough to know you won't get away with it.

I never claimed that racial clusters aren't useful enough. Again, you are responding to an argument you have heard in the past, one you feel ready to reply to, but not the one I made. I said, quite specifically, that the utility of clusters depends on the question we ask. This is always the way clusters work. They are, by their nature, supportive classification schemes. They don't exist in nature, we use them as tools to answer scientific questions. Change the question, change the clusters. They could be immensely useful, or not useful at all. How many loci? Depends on the question. How useful? Depends on the question. If your question is, "Why is it possible to choose a k value that corresponds to one of the many available socially popular racial classification schemes?" Now there is a question we could talk about. But it is a specific question. The answer is not 'because race exists.' We would need to talk about why humans evolved to classify each other along ethnic boundaries, how we prioritize our distinctions, what selection pressures might have contributed. Lawrence Hirschfeld is currently the leading expert in this field.

Think about it. Let's pretend you are right (remember, this is a Socratic exercise). Why would humans have evolved to correctly determine the 'correct' racial boundaries? We don't evolve to see scientific truths, we evolve to perpetuate our genes. It ought to be assumed that we are very selective in our racial assignments, that we would care overwhelmingly about clearly expressed features like skin, eyes, hair, and not so much about hidden features that are often much more consequential, like circulatory systems. And this is exactly what Hirschfeld's work demonstrates. Indeed, we humans do have a naive race-assignment module. That is the true home of race, the closest place where race is real science. But that is a real feature of evolved human cognition, not a real feature of human populations. Scientifically, race is in the eye of the beholder. If you want to talk about human populations genetics in a way where it becomes compatible with what we observe in the data, without needless references to what racial categories might be popular in one place or another (it changes from place to place, time to time) then the word 'ancestry' is a good place to go, since researchers have already done pioneering work in that direction.

Genome wide clustering does not separate races at all. It, again, produces the number of races you input into the analysis. In fact, it does a much poorer job than the SNP analyses, because there are far fewer differences to analyze. Of course there will be groups in the data. I'm not saying you won't see obvious clusters. You see obvious clusters in one family. You see them in one individual. There is a reason you keep bringing up the social classifications of race, even though I am staying glued to the science. It's because you know that you need to leave science to come up with a way to justify your k value. You need to find a way to legitimize your preferred race number, and you can't do it with science, so you try to backwards-rationalize into it with popular opinion, and pretend that there is some grand secret truth here about a magic number that we have intuited, and science somehow justifies. But you haven't even told me how many races you think there are, you just keep saying that society has some number that corresponds to the clusters. What number? I have heard so many. Is it 3? 5? 7? 12? All of these k values have appeared in the literature. They ALL correspond to one view or another of how many races there are. I don't know which number you like, but you seem to believe we all already agree on this. Do you think that most humans today agree about how many races there are? What about 100 years ago? 100 years from now? That is hardly a scientific variable.

My friend, about my 'garbage paragraph.' Oh, my friend. My dear friend, Dragonerne. Please direct your attention to your flair. "Jesus is white." A Jew is white, apparently, and you feel the need to append that information to every single post you make, right at the top. Yeah, what I said is painfully relevant. In fact, it cuts through this debate and strikes at something personal about your own motivations. Honestly, Dragon, thank you for this. It is a rare day when someone so spectacularly makes my point for me like this. My apologies for the snark--this one is going right in the scrapbook.

You failed to demonstrate my 'misunderstanding' of the clustering algorithms. Actually, you didn't even know what STRUCTURE was, which is what you didn't know you were citing. You told me that you use Python, as if I had said YOU were running analyses through STRUCTURE. I was talking about your first link. I wouldn't know what textbook you are looking at, because we mostly stop using them in grad school, certainly by our first postdoc. Textbooks are such a vague thing to cite, and become outdated quickly. In academia proper we cite published papers, sometimes edited volumes. If you have a specific paper to cite, please do. I try to avoid doing so myself because it comes across as trying to bully someone into silence by giving them homework. But without meaning to, you are showing your hand by mentioning textbooks at all.

I know you aren't dumb. I already kind of apologized for being snarky, but I will actually apologize here. It's just that I already know you, I have met you and your arguments 100 times, and I can't help but feel as if we are already pals engaged in friendly and lightly abusive sparring. Truth be told, I learn more from altrighters than I have from many professors, who would never acknowledge something as straightforward as the warrior gene. I know you aren't dumb. You're wrong, and you're clearly not formally trained in this, but you are well-spoken and you've retained complex information well.

[–]SamiAlHayyidGrand Mufti Imam Sheikh Professor Al Hadji Dr. Sami al-Hayyid 4 insightful - 1 fun4 insightful - 0 fun5 insightful - 1 fun -  (8 children)

Can two black parents create an Asian kid?

Can two Asian parents create a White kid?

Can two White parents create a black kid?

What we mean by race is simply the fact that all of these questions can only be correctly answered with a resounding "No". Those who believe in race rightly don't care about any of this other stuff (continuum fallacy, '99.X% similarity', etc.) simply because no amount of scientific meandering is even remotely going to turn that resounding "No" into a "Yes" or even a "Maybe".

Your style of argument could just as easily be used to attack dog breeds. Sure, dogs probably are overwhelmingly genetically identical. So what?

Can two Great Danes beget a chihuahua?

Is a chihuahua as equally capable of being a police dog as a Belgian Malinois?

The answers to these questions are evidence enough of the existence of dog breeds for most people. Yet strangely when we substitute 'Great Dane' for one race and 'chihuahua' for another in the first question, suddenly the egalitarians amusingly try to backpedal and declare such arguments unsound.

It's hilarious how Western 'science' is so transparently hellbent on trying to provide quasi-scientific explanations for Left-liberal ideological views. The same people who argue this nonsense are incidentally most of the same people who believe that the male-female distinction is also 'fake', and who simultaneously use the 'born this way' argument and the 'sexuality can change over the course of one's life' argument. Hmm... we see a pattern here. Those who want racial distinctions abolished also want these other distinctions abolished. The 'science' in all three cases is subordinate to purely ideological and quasi-moral reasoning. The presupposed assumptions and the reached conclusions are numerically identical.

Until two black parents can create an Asian kid (i.e. never), there will always be a need for racial classifications among humans. End of. Keep denying race all you like—it's transparently obvious that the underlying reasons for doing so are based in ideology. Eastern European and East Asian geneticists, who overwhelmingly accept race, will bury Western science.

[–]milkmender11 1 insightful - 2 fun1 insightful - 1 fun2 insightful - 2 fun -  (7 children)

Yes, two Black parents can create an Asian kid. Yes, two Asian parents can create a white kid. Yes, two white parents can create a Black kid.

You are wrong.

https://www.konbini.com/wp-content/blogs.dir/12/files/2018/03/national-geographic-cover-april-2018-race.adapt_.1190.1.jpg

These are twin sisters, from the same parents. One is Black, one is white.

Now, it is a bit complicated. You don't realize it, but you are invoking the "rule of hypodescent." This is a known anthropological concept. Your categories are social, and they break down when we apply more rigorous scrutiny to them.

Chen is Asian, but he has half white ancestry. Because he lives in the USA, where Asians are a minority group, he is seen as Asian, not at all white. Everyone knows Chen is Asian. Nobody goes out of their way to say he is 'half white.' They don't even see it, since Asian alleles tend to be dominant.

Chen marries Sue. Sue is also Asian, but half white. Everyone likewise knows that Sue is Asian. Her census record says 'Asian,' not 'Asian and Caucasian.' Just 'Asian.'

Chen and Sue have a daughter, Sally. Sally is white!! She looks white, everyone sees her clearly as such. Of course, this can happen. It's luck of the genetic draw. The genes were there, and society has spoken. The phenotypes interact with the zeitgeist, and everyone knows exactly what box to put these people in. Chen is Asian. Sue is Asian. Sally is white. Nobody hesitates to give their 'resounding answers.'

Now, here is where you want to try and be specific, when it becomes inconvenient for you. Before, you said that these questions have resounding answers. Indeed they do! All of Chen's coworkers know that he is Asian. But now, you don't think the answers are so resounding. Now you want nuance and subtlety, now it is complicated. Now it matters to you that Chen is half white.

See, you can't have your cake and eat it too. You want it to be the case that everyone 'just knows' the answers to these questions, but they don't. They have many default assumptions that they make as a matter of their socialization, assumptions that vary enormously from culture to culture, and they won't become more nuanced to accommodate your worldview. Chen IS Asian. Everyone knows it. Sue IS Asian, everyone knows it. It is your 'resounding answer.' Sally IS white, and everyone knows it. You can kick and scream and try to persuade society otherwise, but they aren't trying to make your case. You have to do it yourself, and I doubt your reply is going to have a 'resounding answer.' You're going to try to be nuanced, to bring in the little things that matter. What happened to 'resounding answers'??

I didn't mention the continuum fallacy and I didn't mention dog breeds. I appreciate that you refer to this as my 'style' of argument, because you know I didn't say any of that. It's like 'homestyle' food, aka, not homemade. So I won't bother addressing your point about dogs, because it is YOUR point, not one I made. I'm making arguments different than the ones you seem prepared to reply to, which is very much an altright theme. I didn't talk about sex or gender either. Let's stay focused.

And who's denying race?? Again, whose argument are you replying to?? Race is extremely real. It isn't science, but not everything has to be.

[–]MarkimusNational Socialist 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (6 children)

Yes, two Black parents can create an Asian kid. Yes, two Asian parents can create a white kid. Yes, two white parents can create a Black kid.

No they can't

https://www.konbini.com/wp-content/blogs.dir/12/files/2018/03/national-geographic-cover-april-2018-race.adapt_.1190.1.jpg

These are twin sisters, from the same parents. One is Black, one is white.

No, they're both mixed race retard.

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (5 children)

Wow. "Nuh-uh!" Is your reply. Want to throw in a "nya nya!" or a "neener-neener"? Give me a wet willy maybe, or kick me in the balls? I haven't got any, sadly.

You are also mixed raced. Everyone is mixed race. That doesn't matter. What matters is society's RESOUNDING ANSWERS, as you said. Nobody looks at the white twin and says 'she is mixed race.' She is white. That is the resounding answer. Nobody looks at the Black twin and says she is mixed. Everyone gives the resounding answer: SHE IS BLACK. But now, you don't like resounding answers anymore!! Now, you like nuance. Now you like to nitpick, and be specific. Now it really matters to you that we look PAST the resounding answers of society, and keep in mind who is 'mixed' and who is 'not mixed,' even as society rolls their eyes at you and continues to give the resounding answers that you liked 30 minutes ago.

[–]DragonerneJesus is white 4 insightful - 1 fun4 insightful - 0 fun5 insightful - 1 fun -  (0 children)

You are talking about outliers and mixed race people. Yes, you can incorrectly guess someones race, but that doesn't mean that the person wasn't half-Asian, half-White to begin with. You literally stated in your post that the two parents were mixed race.

You're confusing social race with biological race.

[–]MarkimusNational Socialist 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (3 children)

No they have one black and one white parent, they're mixed race. You're literally the only person in the world that would dispute this fact because you're just trolling using tactical nihilism.

You should also get a psychological evaluation, this comment is wacky as fuck.

[–]milkmender11 1 insightful - 1 fun1 insightful - 0 fun2 insightful - 1 fun -  (2 children)

People don't see their parents. If I showed you each twin without context, you would say one is Black and the other is white. When you walk around, do people come up to you and demand to see photos of your parents? No. They just have a RESOUNDING ANSWER about what you are. It's the same for these twins. They don't go around getting called 'mixed.' Everyone takes one look at them, doesn't know anything about their parents, and knows which one is white and which one is Black. I thought you liked RESOUNDING ANSWERS! Now we need to invoke everyone's parents in order to know their race? That isn't very resounding.

What about the example I gave with multiple mixed race Asian parents? It's the same situation, with a different distribution. You see many people like this every day. You think they are Black, white, Asian. You don't know about their parents and you don't ask. You do what everyone does. You give your resounding answer.

A lot of people here care about your assessment of my sanity. I care about it a lot as well. It is very important, and consequential.

[–]MarkimusNational Socialist 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (1 child)

People don't see their parents. If I showed you each twin without context, you would say one is Black and the other is white. When you walk around, do people come up to you and demand to see photos of your parents? No. They just have a RESOUNDING ANSWER about what you are. It's the same for these twins. They don't go around getting called 'mixed.' Everyone takes one look at them, doesn't know anything about their parents, and knows which one is white and which one is Black. I thought you liked RESOUNDING ANSWERS! Now we need to invoke everyone's parents in order to know their race? That isn't very resounding.

Schizophrenia moment.

What about the example I gave with multiple mixed race Asian parents? It's the same situation, with a different distribution. You see many people like this every day. You think they are Black, white, Asian. You don't know about their parents and you don't ask. You do what everyone does. You give your resounding answer.

No it's an assumption based on appearances, and an incorrect one once facts are obtained.

Logic is a famous rapper who looks white to most people, but his father is black so he's mixed race. He's not magically white just because you insist that he is.

The other guy did better. You should leave this to him.

I'll gladly not talk to a kike battling with his schizophasia whilst attempting to do tactical nihilism.

[–]DragonerneJesus is white 3 insightful - 1 fun3 insightful - 0 fun4 insightful - 1 fun -  (0 children)

I mean, the clusters ALWAYS correspond to something. Always. If we have one cluster for every single individual in the study, those clusters will correspond perfectly to the number of individuals in the dataset. It will be a 1:1 correspondence. Does that mean there are about 8 billion+ races? Yes. It does.

Yes, 8 billion+ races. " I mean, the clusters ALWAYS correspond to something."

You never address the elephant in the room: when we set k to a low number, we don't just get clusters that correspond to "something", we get clusters that just HAPPEN, out of pure luck (who would have even thought this to be the case), to be racial clusters.

You're clearly smart enough to know the game you are playing there, and so you are smart enough to know you won't get away with it.

Ok, I will admit that I was doing that, and I should've clarified because I knew it might've come off as disingenuous. Under the assumption that my position was true, you conceded that race was real and then you moved the goalposts to say that race as a concept isn't meaningful (without ever proving that, you just claimed it to be the case):
"I should point out that I am providing you the courtesy of pretending that 'race' is a legitimate taxonomic category"
But then you didn't attack my claim that there is more than 1 human race. You moved the goalposts to say that the k racial clusters aren't meaningful.
Forgive me?

If your question is, "Why is it possible to choose a k value that corresponds to one of the many available socially popular racial classification schemes?" Now there is a question we could talk about. But it is a specific question. The answer is not 'because race exists.' We would need to talk about why humans evolved to classify each other along ethnic boundaries, how we prioritize our distinctions, what selection pressures might have contributed. Lawrence Hirschfeld is currently the leading expert in this field.

I can't explain how much I admire this level of subversion. It simply blows my mind every single time.

I would like you to answer that question though.

"We know that race is not real, so how come when we cluster human genetics, we get racial clusters? Well, since we know race is not real, the explanation must be something else. "

And this is exactly what Hirschfeld's work demonstrates. Indeed, we humans do have a naive race-assignment module. That is the true home of race, the closest place where race is real science. But that is a real feature of evolved human cognition, not a real feature of human populations. Scientifically, race is in the eye of the beholder.

I don't know that this is true. Self-identified race aligns 98-99% with estimated race. The outliers are just cases at the racial boundaries, biracials and so on.

There is a reason you keep bringing up the social classifications of race, even though I am staying glued to the science. It's because you know that you need to leave science to come up with a way to justify your k value. You need to find a way to legitimize your preferred race number, and you can't do it with science.

No, as I've said before, race realists don't believe in a FIXED set of races. We don't mind an arbitrary k, in fact we would assume an arbitrary k.

"Jesus is white." A Jew, is white, apparently, and you feel the need to append that information to every single post you make, right at the top.

It's my flair. I don't want to go into a religious/political debate. I've seen a post of yours (yes, I stalked you a bit before deciding to engage you in a debate hehe) and it said that Jews are the most pure whites. After we have finished this debate about race, I would like to delve into that subject if it interests you. However, starting on it now would divert too much attention away from the current subject, which is already huge.

But without meaning to, you are showing your hand by mentioning textbooks at all.

You had a misunderstanding of k-means algorithms that even 1st-year students don't have, because it's in chapter 1 of most textbooks on this subject. To my recollection, 1st-year undergraduates do use textbooks.

I know you aren't dumb. I already kind of apologized for being snarky, but I will actually apologize here. It's just that I already know you, I have met you and your arguments 100 times, and I can't help but feel as if we are already pals engaged in friendly and lightly abusive sparring. Truth be told, I learn more from altrighters than I have from many professors, who would never acknowledge something as straightforward as the warrior gene. I know you aren't dumb. You're wrong, and you're clearly not formally trained in this, but you are well-spoken and you've retained complex information well.

It's all good. I will reciprocate your manners and conduct. You're not wrong, but you've been misled and have some misconceptions due to a lack of understanding of the fundamentals of the methods that you're using. These can be fixed, and once that's done, you can return the favor and school me where my understanding proves to be lacking.