In October, the Highway Loss Data Institute and the Insurance Institute for Highway Safety released a self-published report suggesting a strong correlational link between automobile crash rates and state-level legalization of marijuana. This would be harrowing news if the finding were based upon strong scientific evidence.
Unfortunately for these institutes, the underlying data are murky at best. And because the organizations self-published the report rather than submitting it to scientific peer review, it’s hard to take their findings seriously.
Before we begin to look at the data, readers should understand that although the Highway Loss Data Institute (HLDI) and the Insurance Institute for Highway Safety (IIHS) are two distinct legal organizations, they share the same senior leadership, the same physical address, and probably a lot more. Presenting findings under both names makes it sound as though two independent, unrelated organizations pooled their collective minds and reached similar results, but that is not the case.
I also have to begin this article with the usual scientific reminder: correlation does not equal causation. If I conduct a study of people opening their umbrellas in an urban downtown area, I will find a very strong positive correlation between that action and the presence of rain. But we know that opening an umbrella does not cause the rain to fall. Yet statistics will readily show a strong correlation between the two, leading some sad, unwitting researchers to suggest there’s definitely a causative relationship.
Let’s Choose Our Control States
In the new study, a researcher looked at highway crash data for three target states where recreational marijuana had been legalized (Monfort, 2018). He then compared those states’ accident rates with rates in five control states, generally states bordering the legalized-marijuana states. Colorado, for instance, was matched with three: Nebraska, Utah, and Wyoming.
You see the problem already, right? Colorado shares borders with seven states, not just three; Kansas, Oklahoma, New Mexico, and Arizona (at the Four Corners) were all passed over. We see a similar problem with the other control states chosen, too. Oregon’s control states were Idaho (a reasonable comparison state) and Montana, a land-locked state with which Oregon shares little in common and no border at all.
The researcher justified these choices by noting that IIHS’s sister organization had used the same control states in a previous report. Those unnamed researchers chose the states based upon a correlation (sometimes not a very strong one) between the states’ crash rates, not upon demographics, geography (twisty mountain roads versus flat fields), or any other characteristic.1 It is arguable whether “seasonal crash patterns prior to 2014” are a legitimate scientific comparison variable for choosing control states, when other variables seem far more appropriate.
In a follow-up email, an IIHS spokesperson said the researcher wanted “to use two different sets of data and see if the results would be similar. Using different control states would not make sense.”
As long-time readers know, researchers can pretty much manipulate their data, analysis, or hypothesis to demonstrate the results they want to find. This is one of the reasons a lot of people are generally skeptical of research. There are so many ways a researcher can manipulate data — oftentimes for very legitimate, good reasons — that it can be challenging to detect the bias introduced.
That wasn’t the case with this study, however. Here, the biases are, in my opinion, laid out pretty clearly.
11 Analyses, But Only 3 Significant
The second sign this research report is utter nonsense is that even with the deck of cards stacked in a manner the researcher himself chose, he still couldn’t find statistical significance in the majority of the analyses he ran. See for yourself in the report’s table of results:
See those three asterisks? Those are the only three analyses that reached statistical significance. Notice that only one of them appears in an actual state-to-state comparison, between Colorado and Utah. Every other state-to-state analysis showed no significant difference between a control state and a marijuana-legal state.
Of the seven state-to-state analyses, then, only one came up as significant. That’s a telling result that got glossed over in the report (and not mentioned at all by the IIHS). In my research circles, we’d call this “weak.”
Let’s Pool the Data!
It was only when the researcher pooled a whole bunch of control states together that he found two more significant correlations.
Generally, it’s my understanding that researchers refrain from pooling data from disparate sources unless they can justify the decision and ensure that the underlying variables are homogeneous (or alike). Not only would that need to hold for the group of states pooled here; the researcher also pooled data across multiple years.
It’s not apparent why the researcher picked the years he did (2012-2016), other than his statement that those were the years such data were available in all of the states. In my cursory search of each state’s government website, however, I generally found crash data reliably available going back to 2005.
We know that annual crash data are not homogeneous; they can vary considerably from year to year. For instance, crash rates were up in 2014, 2015, and 2016 no matter the state. So what the researcher is really trying to measure is whether Idaho’s 7.84 percent rise in 2015 differs from marijuana-friendly Oregon’s 7.09 percent increase, or whether marijuana-friendly Colorado’s 4.37 percent increase in crashes from 2014 to 2015 differs significantly from Utah’s 9.96 percent increase or New Mexico’s 10.19 percent increase.
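To make that arithmetic concrete, here is a small sketch of the year-over-year percent-change calculation these comparisons rest on. The crash totals are invented, chosen only so the resulting changes land near the Colorado figures quoted above:

```python
# Year-over-year percent change in crash counts.
# The totals below are HYPOTHETICAL, for illustration only.
def yoy_change(prev: int, curr: int) -> float:
    """Percent change in crashes from one year to the next."""
    return (curr - prev) / prev * 100

crashes = {2013: 106_000, 2014: 113_200, 2015: 118_150}  # invented totals
for year in (2014, 2015):
    print(year, f"{yoy_change(crashes[year - 1], crashes[year]):+.2f}%")
```

The point of the exercise: these single-digit swings happen every year, in every state, legal marijuana or not, which is why treating them as homogeneous across pooled years is questionable.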
Furthermore, different states got different pool sizes for the months examined, depending on when their retail marijuana sales began. For Oregon, the researcher clumped 2012, 2013, 2014, and most of 2015 together as one dataset, with 2016 plus the last two months of 2015 as a second (46 months versus 14 months). Colorado’s retail sales began in January 2014, so it had only two years of pre-legalization data and three years of post-legalization data.
In a fair and accurate scientific comparison, all datasets should cover similar time periods before and after the variable you are trying to measure, especially when, as here, we know the data are not homogeneous.
So What Does the Data Show?
The researcher didn’t provide the usual details one would need to replicate his work. So we have to examine the raw data and just look at the more obvious issues.
Taking a look at Colorado, we can compare its accident rate over time to two other states’ (all data from each state’s own website). Before 2013, Colorado enjoyed year-over-year declines in overall accident rates. Then in 2013 the state saw a 6.29 percent spike in crashes, which continued in 2014 (6.79 percent) and 2015 (4.37 percent). In 2016, the state saw only a 0.35 percent increase.
Compare these numbers to those of Utah, one of the control states the researcher did include. Utah also saw a rise in crashes in 2013 (9.05 percent), followed by a decline in 2014 (-2.96 percent). Its crash rates rose again in 2015 and 2016 (9.96 percent and 3.94 percent, respectively).
Now let’s compare those numbers to a control state the researcher did not choose: New Mexico. It saw a decline in crash rates in 2013 (-4.78 percent), but then two years of increases in 2014 and 2015 (3.64 percent and 10.19 percent, respectively). In 2016, New Mexico saw a decline of -0.53 percent.
I believe the reason Utah was chosen over New Mexico is simple. Over 2014-2016 (the key years for the Colorado marijuana comparison), Utah’s annual changes add up to a cumulative increase of only 10.94 percent. Had the institutes used New Mexico instead, its crash rate for the same period adds up to 13.3 percent. Against Colorado’s cumulative 11.51 percent, Utah comes out lower, while New Mexico (also a neighboring state of Colorado) comes out nearly 2 percentage points higher.
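A note on those cumulative figures: they match a simple sum of each year’s percent change (an assumption on my part, but it reproduces the quoted numbers exactly). A quick sketch shows both that sum and the slightly different result you get by compounding the changes:

```python
# "Cumulative" crash-rate change, 2014-2016, two ways:
# (a) summing annual percent changes, which reproduces the figures in the text;
# (b) compounding them, which is arguably the more correct arithmetic.
yoy = {  # year-over-year percent changes quoted above, 2014, 2015, 2016
    "Colorado":   [6.79, 4.37, 0.35],
    "Utah":       [-2.96, 9.96, 3.94],
    "New Mexico": [3.64, 10.19, -0.53],
}

def compound(changes):
    """Compound a list of percent changes into one overall percent change."""
    total = 1.0
    for c in changes:
        total *= 1 + c / 100
    return (total - 1) * 100

summed = {state: sum(changes) for state, changes in yoy.items()}
compounded = {state: compound(changes) for state, changes in yoy.items()}
for state in yoy:
    print(f"{state}: summed {summed[state]:.2f}%, "
          f"compounded {compounded[state]:.2f}%")
```

Either way the arithmetic is done, the ordering is the same: New Mexico’s cumulative rise exceeds Colorado’s, while Utah’s falls below it, which is exactly what makes the choice of control state so consequential.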
Throw it all into an analysis that accounts for “state characteristics” (here, only unemployment rates and weather), and voila! Data that seemingly show marijuana sales impact crash rates.
What Does this Mean for Marijuana?
The data, even in the current research, show essentially no statistically significant relationship between marijuana retail sales and the number of car crashes in a state. If a correlation exists at all, it is a very weak one, found only when the researcher employed what are, in my opinion, questionable analyses. Some states neighboring Colorado, such as New Mexico, were excluded from the analysis despite showing greater increases in accident rates over the same time period than Colorado did.
I believe this kind of report reflects poorly on both the IIHS and HLDI. These organizations, funded primarily by insurance companies, are, in my opinion, publishing scary findings in order to forward a political and business objective. They tout these findings as “research,” despite the fact that they are sometimes only published on the organization’s own website and are apparently not peer-reviewed, as is done in traditional scientific research. (“Our most recent research on recreational marijuana legalization and police-reported crashes has been submitted for publication to the journal Accident Analysis and Prevention,” notes the IIHS spokesperson.)
What we have here is what I would call “fake research”: research stretching to show a significant relationship where clearly there is only a big question mark. And even if there were correlational significance, this research offers zero support for a causal relationship. We reached out to the organizations for comment, and they basically just referred us back to the study for answers. The IIHS spokesperson, for instance, stated, “The crash data from the marijuana states in the IIHS study are not pooled, but the individual results from each state were combined in a meta-analysis, which is an accepted research method. The individual results for Colorado and Washington in the HLDI analysis are statistically significant.”
When I pointed out that the current study showed significance for only one state, Colorado (and then only in comparison to one other state, Utah), the spokesperson pointed me to the previous study. He seemed unaware that when newer research contradicts earlier research, you can’t simply keep pointing to the earlier research as still valid. What the new IIHS study shows is that there’s no longer any significant correlation between marijuana and crashes in Washington, which is the exact opposite of what the spokesperson claims.
IIHS is an organization that is trying to educate and inform policymakers when it comes to motorist safety. Sadly, most politicians and citizens won’t understand or realize they’ve been hoodwinked, and instead believe this sort of study is actually scientific.
In short, there are no strong data linking marijuana retail sales to an increase in automobile accidents. The data we have show, at best, a very limited, weak correlation (not causation), and even that appears only through statistical trickery.
From now on, I will be far more skeptical of anything the IIHS or HLDI publicize.
- Specifically, “Control states were selected based on proximity to the study state as well as on the similarity of seasonal crash patterns prior to 2014. This similarity was based on the correlations between the monthly frequencies in the study state and each potential control state during the 24 months of 2012–13. The Pearson correlation coefficient for Colorado and Nebraska was 0.85; for Wyoming, 0.79; and for Utah, 0.60. For Washington, the states of Montana (0.67) and Idaho (0.63) were selected as controls. For Oregon, the states of Idaho (0.67) and Montana (0.83) were used.” [↩]