Software Nerd: The Limitations of Correlation-coefficients

Tuesday, June 10, 2008

The Limitations of Correlation-coefficients

Warning: All the IQ data-values below are fictional. They're contrived to demonstrate the underlying point about statistics and about inheriting IQ.

Hypothetical Study: Suppose we get a list of all the adopted kids in a small geographical area: say one suburban city. We choose a single age -- say 10 years old. From this list, we randomly select a sample. Assume we remove from our sample only those kids who are now orphaned, but every other kid in our sample agrees to participate in our study. We administer a standard IQ test to every child in the study, and to the adoptive parents of the child and to the biological parents.

Hypothetical Findings: Next, suppose we measure the correlations and find the following:

... that the IQs correlate perfectly between kids and their biological parents (coefficient = 1.0), but only weakly (low positive, 0.3) between kids and their adoptive parents. Here's the graph of the data, for 5 kids:

The data points for 5 kids are shown. For example, Kid-1 has an IQ of 105 (X-axis), his biological parents have an average IQ of 92 (Y-axis), and his adoptive parents have an IQ of about 113. Kid-2 has an IQ of 110, and so on.

The straight lines show "best fit". One can see the perfect correlation (going through all the blue data points) between IQ of kids and their biological parents. On the other hand, the red data points are all over the page, with a noticeable positive correlation, but a lot of dispersion from the "best fit".

What conclusions could we draw from such a (hypothetical) result?

Could we say that inherited ability (e.g., genes) is a strong causal factor in IQ (as measured by standard tests)? Wouldn't the correlation of 1.0 indicate this?

Secondly, can we conclude that "environment factors" do not have a large impact on a child's IQ? Wouldn't the low positive correlation indicate this?

Neither of these conclusions is warranted by the data.

Scale is abstracted away: A correlation-coefficient does not reflect the absolute numbers of the two series. It abstracts away the particular unit of measure. For instance, suppose I have two series: people's weight and height. The correlation between these two will be the same, no matter what units I use for weight (pounds, kilograms or ounces).
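To make the unit point concrete, here is a minimal sketch. The heights and weights are made-up numbers, and `pearson` is just a hand-written helper for the usual Pearson coefficient, not anything from the study above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up heights (inches) and weights (pounds) for five people.
heights_in = [62, 65, 68, 70, 74]
weights_lb = [120, 150, 145, 180, 200]

# The same weights expressed in other units.
weights_kg = [w * 0.45359237 for w in weights_lb]
weights_oz = [w * 16 for w in weights_lb]

r_lb = pearson(heights_in, weights_lb)
r_kg = pearson(heights_in, weights_kg)
r_oz = pearson(heights_in, weights_oz)

# All three coefficients agree (up to floating-point rounding).
print(r_lb, r_kg, r_oz)
```

The weight numbers differ by a factor of 16 between pounds and ounces, yet the coefficient does not move.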

Similarly, when we are calculating the correlations between the IQs, though we use the same scale, we are ignoring the absolute values of each series. Imagine a hypothetical, where something about the process of adoption gives kids a higher IQ. Imagine that adopted kids end up with 25% higher IQs than we have assumed above. Still, the correlations would remain the same: the kids would still have perfect correlation with their biological parents and a low positive with their adoptive parents.
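Here is a rough sketch of that 25% thought experiment. Only Kid-1's numbers (105, 92, 113) and Kid-2's IQ of 110 come from the description above; every other value is invented for illustration, chosen so that the biological-parent series is perfectly linear and the adoptive-parent series gives a low positive coefficient of roughly 0.3.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Fictional IQs for five kids (mostly invented; see the note above).
kids = [105, 110, 115, 120, 125]
bio_parents = [k - 13 for k in kids]        # perfectly linear in the kids' IQ
adoptive_parents = [113, 102, 118, 106, 118]  # scattered, weakly related

# Hypothetical: adoption somehow raises every kid's IQ by 25%.
boosted_kids = [k * 1.25 for k in kids]

r_bio = pearson(kids, bio_parents)
r_adoptive = pearson(kids, adoptive_parents)
r_bio_boosted = pearson(boosted_kids, bio_parents)
r_adoptive_boosted = pearson(boosted_kids, adoptive_parents)

mean_kids = sum(kids) / len(kids)
mean_boosted = sum(boosted_kids) / len(boosted_kids)

# The coefficients are unchanged, but the average IQ has jumped.
print(r_bio, r_bio_boosted)
print(r_adoptive, r_adoptive_boosted)
print(mean_kids, mean_boosted)
```

The two pairs of coefficients match, while the kids' average IQ is a quarter higher. Nothing in the coefficients records that shift.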

Meaning: If we did get a perfect correlation in an IQ experiment like the one above, it would hint strongly at inheritance being a factor. However, it would say nothing about how important a factor it is. What if further research found that the perfect correlation did indeed reflect causation? Even so, the coefficient would not speak to the importance of inheritance in the final IQ.

Another example: Here's another example to illustrate this point. Suppose we look at the wealth of five men at the start and end of various years. Suppose we find that they grow their wealth at about 5% each year. Suppose we also find a strong positive correlation between the starting and ending wealth in any year.

Now, instead, suppose that during the last year each man placed a large bet on a game; but each bet was "large" only in relation to that man's wealth. Suppose each man won, and ended the year with nearly twice the wealth he began with. Now, at the end of the year, their wealth is still positively correlated with their wealth at the beginning. Based on their "inheritance" from the previous years, we would have expected their wealth to grow 5% each, but instead it grew by nearly 100%. The betting was responsible for that, not the inheritance. Yet, the inheritance still demonstrates the same high positive correlation. Only by looking at the actual scale do we find the relative importance.
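The wealth example can be sketched the same way. The starting amounts and the per-man growth rates below are invented; they are only meant to match the story (about 5% in a normal year, nearly double in the betting year).

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented starting wealth for five men.
start = [100_000, 250_000, 400_000, 800_000, 1_500_000]

# A normal year: each man grows his wealth by about 5%.
normal_rates = [1.04, 1.05, 1.06, 1.05, 1.05]
end_normal = [w * r for w, r in zip(start, normal_rates)]

# The betting year: each man ends with nearly twice what he started with.
bet_rates = [1.95, 2.00, 1.90, 1.98, 1.97]
end_bet = [w * r for w, r in zip(start, bet_rates)]

r_normal = pearson(start, end_normal)
r_bet = pearson(start, end_bet)

# Overall growth of the group, which the coefficient ignores.
growth_normal = sum(end_normal) / sum(start) - 1
growth_bet = sum(end_bet) / sum(start) - 1

print(r_normal, r_bet)            # both very close to 1
print(growth_normal, growth_bet)  # about 5% versus nearly 100%
```

Both years show a near-perfect correlation between starting and ending wealth. Only the growth figures reveal that something other than starting wealth did most of the work in the betting year.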

Summary: The take-away is this: be careful about gleaning more information from a correlation coefficient than it is designed to give. Many experiments are designed to ask something like this: "if we assume that all other things are equal, does varying this single factor have an impact?" Well and good. However, the other factors -- the ones that have been abstracted away -- may be the crucial ones that vary in the relevant real-world situation.

5 Comments:

Interesting topic. What do you think of all the race and IQ data? People like Charles Murray, James Watson, and Richard Lynn argue that there is a strong correlation and that black's IQ is stuck at 80 while whites are at 100 and North Asians at 105 (Ashkenazi Jews are the highest at 112-115). This data is supposedly consistent for over 100 years.

Also, have you read Phillip Rushton's work? He not only argues that there are IQ differences but differences in temperament and propensities for moral behavior. The white nationalists and the racialists then take this data and these findings and run with it arguing for total racial segregation because non-whites don't have the same "civilizational abilities." Larry Auster is one such conservative. His argument is that race determines a person and determines culture. Mixing races is therefore bad.

They all base this on correlation-coefficients. So I am curious as to your thoughts on this. I disagree with racial determinism but I must admit that I really don't know what to think about inheritable differences in the races.

By Anonymous, at 2:59 PM
I should also add that there is one person who is a former Objectivist who argues that Ayn Rand's philosophy is flawed because it didn't take "race realism" into account:

http://www.thornwalker.com/ditch/barnyard.htm

I disagree with his argument but I've often wondered if modern genetic discoveries would challenge Objectivism in any way.

By Anonymous, at 3:04 PM
I'm no expert, nor do I even have an amateur interest in the topic. So, I haven't read much on the question.

One problem is that IQ tests are not based on measuring some external objective thing (e.g. how well does this person understand this topic), but are actually designed in order to fit a particular statistical pattern.

For instance, suppose I design an IQ test, and men and women get significantly different scores. What do I do? Often, a designer will look for specific questions which are tripping up men, and specific ones which are tripping up women. Such questions are removed, because they have a bias to one sex or the other. The problem with this is that one is designing a test to fit an existing thesis: i.e. the equality of IQs between the sexes.

Of course there are real academic performance differences between black, Asian and white kids, even in a typical U.S. suburb. My own observation -- of the general stereotypes, and of the kids who break the stereotype -- is that upbringing is the critical determining factor.

I don't know how genetics will throw any light on philosophy in this regard. For instance, let's suppose that science concludes that one person is born with (90 - 110) range of IQ, and cannot go much outside that range despite his effort. Another is born with a (110 - 130) range. What of it? I see no philosophical significance.

By SN, at 6:14 PM
Charles Murray and others seem uninterested in this simple fact: race has no clear definition. Is a person with a white mother and black father white or black? In the USA, he or she is black, but elsewhere..? What if he or she looks light enough to pass for white?

The idea of "scientifically" correlating race and intelligence is so ridiculous, it's amazing to me that people can take it seriously, except for the depth of bigotry that remains in societies everywhere, and the universal human urge to tribalism.

Consider also the notion of IQ - it is simply the score on a particular test. Is that intelligence? Does anyone know what intelligence is? Is there a reason to think it IS one thing?

As softwarenerd says, we are not measuring an "objective thing" here. But many people, in their zeal to "prove" their prejudices, will assume they know what they are talking about, and then proceed to "measure" things about which they have no clear notion...

By Anonymous, at 2:15 PM
Lichanos, Thanks for the comment.

Other than racism, I think a lot of people buy into the notion of a genetic component because there actually is an "inherited" component. Except, it is not really inherited biologically, but is transmitted culturally / environmentally.

By SN, at 3:20 AM
