… but this still struck me as cool. Finding correlations is an important part of just about any research, but it gets difficult once you move beyond relating one variable to one other variable. Let’s let Donald Richards explain further in this interview from Quanta Magazine:
If you want to study the correlation between one batch of variables and another batch, then there is no single Pearson correlation to measure the strength of an association. A second problem, which people often overlook in everyday applications, is that the Pearson correlation coefficient should be used only when there is a reasonably linear relationship between the two variables. If the relationship is highly nonlinear then this method is inapplicable.
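To see that second problem for myself, here is a minimal sketch in plain Python. The data is made up purely for illustration (it is not from the interview): y is completely determined by x in both cases, but only the straight-line case gets a Pearson coefficient anywhere near 1.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative data: x is symmetric around zero.
xs = [i / 10 for i in range(-50, 51)]
linear = [2 * x + 1 for x in xs]   # straight-line relationship
parabola = [x ** 2 for x in xs]    # perfectly dependent on x, but not linear

r_linear = pearson(xs, linear)
r_parabola = pearson(xs, parabola)
print(f"linear:   r = {r_linear:.3f}")
print(f"parabola: r = {r_parabola:.3f}")
```

The parabola comes out at essentially zero even though knowing x tells you y exactly, which is the kind of trap Richards is describing.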
So what do you do then? Apparently, it’s a real mess to measure correlation in those cases, but Richards describes a handy solution, distance correlation:
How does distance correlation work?
This is where the concept of a Fourier transform comes in. A Fourier transform is a way of breaking up a mathematical function into its component frequencies, similar to how a music chord can be decomposed into its constituent notes. All functions can be uniquely characterized by Fourier transforms, so people started to try to define the concept of a measure of correlation by using Fourier transforms. If you give me two probability distributions — the statistical spread of values that a variable takes on — and if I want to test whether the two distributions are the same, all I have to do is calculate their Fourier transforms. If these are equal then I know that the two probability distributions had to be equal to begin with. The distance correlation coefficient, in layman’s terms, is a measure of how far apart these Fourier transforms are.
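For anyone who wants to poke at this, here is a rough sketch of the sample distance correlation for two single variables, in plain Python. One caveat: it does not compute any Fourier transforms. It uses the pairwise-distance form of the statistic (double-centered distance matrices), which is the standard way to compute the sample version. The function names and the data are my own inventions for illustration.

```python
import math

def centered_distances(values):
    """Pairwise |a - b| distances, double-centered (row, column, grand mean)."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]

def mean_product(p, q):
    """Average of the elementwise product of two n-by-n matrices."""
    n = len(p)
    return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

def distance_correlation(xs, ys):
    """Sample distance correlation of two equal-length numeric sequences."""
    a = centered_distances(xs)
    b = centered_distances(ys)
    dcov2 = mean_product(a, b)       # squared sample distance covariance
    dvar_x = mean_product(a, a)      # squared sample distance variance of x
    dvar_y = mean_product(b, b)      # squared sample distance variance of y
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0                   # one variable is constant
    return math.sqrt(max(dcov2, 0.0) / denom)

# Made-up illustrative data: x is symmetric around zero.
xs = [i / 10 for i in range(-20, 21)]
dcor_linear = distance_correlation(xs, [2 * x + 1 for x in xs])
dcor_parabola = distance_correlation(xs, [x ** 2 for x in xs])
print(f"linear:   dCor = {dcor_linear:.3f}")
print(f"parabola: dCor = {dcor_parabola:.3f}")
```

On this toy data the straight line scores 1, and the parabola, which is a purely nonlinear dependence, still scores clearly above zero. This version builds full n-by-n matrices, so it is only sensible for small samples.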
I’m clueless on Fourier transforms, but I think this sounds very cool – not only for what it is, but for this:
You wrote a paper last year giving examples where distance correlation improves on Pearson’s method. Talk about the case of homicide rates and state gun laws.
This was prompted by an opinion piece in The Washington Post in 2015, by Eugene Volokh, a professor of law at UCLA. The title of the article is “Zero Correlation Between State Homicide Rate and State Gun Laws.” What he did was — you know, my eyes bugged out; I couldn’t believe it — he found some data on the states’ Brady scores, which are ratings based on the toughness of their gun laws, and he plotted the Brady scores on an x-y plot against the homicide rates in each of these states. And if you look at the plot, it looks like there’s no pattern. He used Excel or something to fit a straight line to this data set, and he calculated the Pearson correlation coefficient for this data set, and it came out to be nearly zero. And he said, “Aha, zero correlation between state homicide rate and state gun laws.”
That’s not kosher?
I was horrified. There are so many things wrong with this analysis.
I’ll leave off there, except to note that even a highly educated blogger such as Volokh is not always right.