A new Health Affairs paper "proves" mask mandates slow COVID-19
But does it live up to the hype?
A new analysis is out, and it claims that mask mandates slowed COVID-19 spread between March and November of 2020: the pre-vaccine era! Right off the bat, let me say it is a more sophisticated analysis than many others. But in the years to come, I expect still more complex analyses will emerge, reaching both conclusions: some will “prove” that masking and/or mask mandates saved lives, and others will prove that they didn’t help.
I hope to persuade you that, although well done, this analysis fails to show that the policy of mask mandates slows SARS-CoV-2 spread. It also fails to answer a related but different question: whether asking willing people to wear a mask slows SARS-CoV-2 spread. This essay has 3 parts: some introductory basics; a theoretical framework for why observational studies will face difficulty answering the question; and finally some specific issues with this paper and the individual vs. policy question.
Here are the basics of this paper.
Mask mandates had to happen between April and August 2020 (pre-vaccine; the height of the panic)
Follow-up and baseline measurements occurred between March and October
Counties were matched to one another within the same geographic region
Counties included “had to have at least three consecutive days of daily case count exceeding 5 as of August 31, 2020, and meet at least one of the following criteria: contain a city with a population exceeding 100,000 people, contain a state capital, be the most populated county in the state, or have an average daily case incidence that exceeded twenty during July 1–August 31, 2020”
A county could be a control county (no mask mandate) but actually implement a mask mandate, as long as it did so at least 3 weeks after the counties it was matched with. (We will return to this.)
Finally, and most importantly, there is something different about places that instituted mask mandates vs. those that did not. The proof is that some places implemented the mandate while others in the “exact same predicament” did not, and that is not a random choice. (We will also return to this point.)
Before I get into the paper, let me explain why this is such a hard nut to crack.
The analysis is trying to simulate a randomized trial where we enroll places where a politician is thinking about dropping a mask mandate, and we randomize them to mandate or not, and then measure spread. That is the core idea. Of course, we didn’t do that, so how can we re-create it?
Comparability: The first thing you have to do is convince the reader that you are comparing places that instituted a mandate to otherwise similar places that did not. The control group is supposed to be the counterfactual: what would have happened to the mask mandate counties had they not enacted the mask mandate. A randomized trial would have randomly assigned the intervention, which is perfect; how do you do it here?
In order to prove comparability, the authors cleverly use cases and Rt (the instantaneous reproduction number). Specifically, they match mask mandate counties with counties that did not mandate masks for at least 3 more weeks on these variables: “population density, total population, presidential election voting patterns in 2016, case incidence and instantaneous reproduction number (Rt) in the two weeks before time zero”
Because they do this, they can show a pretty figure like this one, in which, pre-mandate, the counties have equal COVID-19 spread. The lines superimpose because they literally matched on these variables.
It is clever, but it confuses 2 things. What they want to do is match counties based on the actual rate of spread (the pandemic trajectory); what they are doing is matching based on the rate of reported cases. That’s not the same.
The best way to measure pandemic trajectory would be random, serial seroprevalence assays in the counties, or random, repeated testing. They are using the case numbers submitted to the CDC. It’s a HUGE problem.
Their method assumes that among all the people getting sick in these 2 counties, tests are being submitted at equal rates and at the same time: that the average Joe in these counties who had a runny nose or cough was equally likely to test for COVID. But that is almost surely not true. Places that implement mask mandates are likely testing more on the margin than places that didn’t (well, at least for 3 more weeks). Mask mandates are a general marker of caring more about COVID-19, and people who care more test more, and test more on the margin.
Moreover, people in no-mask counties might also test later, on the margin. The average Joe in I-don’t-care-for-mask-ville might test a few days later than Mr. Mask Smith, only if and when symptoms worsen, and this again means you are not matching actual trajectories (time delay). Likely both of these things are happening!
All this means that while they think they are matching trajectories, the actual pandemic is almost surely doing different things. If I were to bet, it is steeper and brisker in no-mask-mandate places. Then the mask mandate is implemented, and of course COVID-19 spread, which is non-linear, may grow substantially over time, but that is going to happen disproportionately more in no-mandate counties, as they started time zero with much brisker epidemic spread.
This is a damning limitation that thwarts the entire paper, in my opinion. It cannot be saved by back calculations; trust me, I wasted an afternoon in Excel trying. Finally, this likely explains why the effect size seen here is too good to be true. Free surgical masks and strong advocacy achieved an 11% relative risk reduction in Bangladesh, and cloth failed entirely. How can a mask mandate (mostly cloth masks, let’s be honest) achieve a 20-35% reduction in cases? I suspect mismatching is the root reason why the effect is too good to be true.
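To make the ascertainment point concrete, here is a toy sketch (all numbers are hypothetical, not from the paper): two counties report literally identical case curves, yet their true epidemics differ, simply because test uptake differs.

```python
import numpy as np

# Toy numbers: both counties submit the SAME reported case curve to CDC
days = np.arange(14)
reported = 20 * np.exp(0.07 * days)

# Hypothetical ascertainment fractions: the mandate county tests more on the margin
ascertainment_mandate = 0.60
ascertainment_control = 0.40

# True infections = reported cases scaled up by how much testing is missed
true_mandate = reported / ascertainment_mandate
true_control = reported / ascertainment_control

# Matching on reported cases pairs these counties as identical at time zero,
# yet the control county's true epidemic is 50% larger at every time point.
print(true_control[-1] / true_mandate[-1])  # ≈ 1.5
```

And this sketch uses a constant ascertainment gap, which only shifts the level; add the reporting delay described above and the apparent growth rates diverge too, so even the matched Rt is matching the wrong thing.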
The paper also has the biggest non-randomized masking research problem
These authors also face the biggest masking research problem, one that will plague the literature for decades to come (PS: only the non-randomized literature; RCTs were the solution).
Mask mandates might lead people to do 2 things (left). One, filter the air coming out of their blowholes; I hear that is how cloth works. Two, it might change their behavior: get them to stand further apart, spend less time in stores, etc. Mask mandates SHOULD get credit for both, insofar as they lead to both. Alternatively, if masking makes you more cavalier, you own that too. The policy is enacting the mandate, and whether it works via behavior change or via filtration, you get the points.
But mask mandates could also be occurring in the scenario on the right. People are scared. They go on TV and panic. Politicians shout: We are mandating masks! Things are crazy here! People soil themselves. They stay home more, stand far apart, etc. etc. And they wear a mask. But it wasn’t the masking that changed their behavior. It was the lunatic on TV who led to both. Fear drove both. Mask mandates SHOULD NOT get credit for this behavior change, as even without the mandate, people would have changed behavior. They’re scared!
Any mask mandate study needs to separate these scenarios, and it is impossible to do in non-randomized data in my opinion. This study also fails to disambiguate the scenarios.
Yet I believe the authors understand the scenario on the right might be in play, and they do try to address it. Specifically, they write:
“Social distancing was quantified using daily cellphone movement provided by Unacast (which collects and aggregates human mobility data), measuring the percentage change in visits to nonessential businesses within each county compared with visits in a four-week prepandemic baseline period between February 10 and March 8, 2020.”
Then they put it in the model to adjust for it, and it is a significant covariate:
The data come from this website, which offers 3 ways to measure social distancing, shown in a very dull video.
But this creates a few problems.
Insofar as distancing is a downstream effect of masking, you don’t want to adjust for it. You are adjusting for a mediator. Not good. They will say this only biases towards the null, but that is a longer conversation.
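A minimal simulation (toy coefficients, not the paper's data) shows what adjusting for a mediator does: if the mandate works partly through distancing, controlling for distancing strips out exactly that part of the effect, leaving only the direct piece.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process:
mandate = rng.integers(0, 2, n).astype(float)
# Distancing is partly CAUSED by the mandate (a mediator), plus noise
distancing = 0.5 * mandate + rng.normal(0, 1, n)
# Spread falls directly with the mandate AND indirectly via distancing
spread = -0.3 * mandate - 0.4 * distancing + rng.normal(0, 1, n)

# Regression of spread on mandate alone recovers the TOTAL effect
X_total = np.column_stack([np.ones(n), mandate])
total_effect = np.linalg.lstsq(X_total, spread, rcond=None)[0][1]

# Adding the mediator as a covariate leaves only the DIRECT effect
X_adj = np.column_stack([np.ones(n), mandate, distancing])
adjusted_effect = np.linalg.lstsq(X_adj, spread, rcond=None)[0][1]

print(round(total_effect, 2))     # ≈ -0.50 (direct -0.3 plus mediated -0.4*0.5)
print(round(adjusted_effect, 2))  # ≈ -0.30 (mediated path removed)
```

That attenuation toward the null is what "you are adjusting for a mediator" means in practice: the model can no longer credit the mandate for the distancing it may have caused.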
Of the 3 distancing metrics, why did they pick change in visits to non-essential places? One of a thousand opportunities for multiple analytic plans? Why not the other metrics (e.g., change in average mobility)?
The metric has “social distancing” in the name, but it does not capture the “holy crap, I am scared” distancing that comes from fear. It doesn’t capture how far apart we stand in a store, or how fast I run if I hear a throat clear in Home Depot. It doesn’t capture whether I get scared in a hot, packed room with stagnant air and split, vs. how I am a bit more relaxed, and linger, in an open foyer with the doors open and a breeze. How many people gather at my mother’s house? How far apart do we sit? Is the window open? In other words, if fear drives both mask mandates and the many choices around social distancing, this metric is insensitive to that. It is only a crude measure (PS: is it even accurate?) of visiting non-essential businesses, and not of how far I stand, how long I stay, etc.
In fact, the more you think about it, social distancing is not a single number but many, many, many numbers, and the authors are not capturing any of that.
This is, in short, the second major flaw of the paper. It does not offer, nor can it provide, a conceptual framework to separate behavioral choices that happen AS A RESULT of mask mandates (they remind me to keep my distance) from those that happen ALONGSIDE mask mandates (someone scared the shit out of me). This is key because if the cause is being scared, then mask or no mask, the fear was enough to do the trick. But we want to know if the mask mandate was necessary. Finally, the variable they use to adjust for distancing is a very crude and insensitive measure of behavior and merely gives the illusion that they are considering this.
Now allow me to shift to some very specific comments (these are also damning)
No mandate for 3 weeks
The control arm includes counties that did have mask mandates, but didn’t start them for 3 more weeks. This introduces a new issue. The type of place that didn’t mandate at, say, 15 cases per 100k, but does mandate later, includes places that needed to see more carnage before dropping the mandate. As long as some or many control counties ultimately drop mandates, this almost surely guarantees that control counties would see greater rises in cases immediately after time zero, because that is what it takes for them to be persuaded to mandate. Of course, the authors do not require that a control county eventually mandate, which would make it a totally tautological exercise, but the mere fact that many do is enough, I think, to bias the results. [PS: I am open to further thoughts on this point. Am I wrong? Talk to me in the comments!]
Why is the percent of people in the county with diabetes a covariate?
“County-level covariates with the potential to confound the analysis results were considered for each model, including social distancing, population density, wet-bulb temperature, proportion of county residents with diabetes, and proportion of county residents earning less than 200 percent of the federal poverty level.”
First, this also assumes equal screening for diabetes, which is almost surely not true. Second, diabetes is a risk factor for bad COVID outcomes, but does COVID spread faster among diabetics? And if so, does it also spread faster in counties with higher average BMI? Then why is BMI not a covariate?
Masks don’t work in the burbs
The authors downplay the finding that masks only worked in urban counties. No effect in the suburbs! Reminds me of how masks didn’t work when used by daycare providers, but no one discussed it.
A strange analytic choice for a super-spreader disease
It is well known that COVID spread can be explosive: single superspreader events can lead to dozens of cases or more. As long as that is true, why do this? You might be throwing away real data:
“Data were assessed for outliers relative to the Rt before analysis. Days with an Rt value outside of the 2.5–97.5 percentiles (Rt of 0.35 at the 2.5 percentile and 3.50 at the 97.5 percentile) were excluded from the analysis.”
Some of those outliers are real data. Why discard them?
The study has nothing to do with present day
We can’t forget that none of this has anything to do with the present day. This is all pre-vaccine, pre-Omicron. We have no idea if it applies post-vaccination, and no idea if it applies to the current, highly transmissible strains. It is of historical interest only.
The last thing I want to say is
Whether masking works is a prerequisite for a mask mandate to work, but they are not the same thing. This study tests whether mask mandates work; that is a reasonable policy question. The Bangladesh RCT tested whether free masks and encouragement worked; that is different. In this study, almost all masks used were likely cloth, and cloth masking failed in Bangladesh. I think that makes it highly unlikely the large effect seen here is due to the masking. It is almost surely the first objection I raised.
I could write more (I found some other issues), but I don’t have any more time for this paper. I see patients, work on a bunch of academic articles, am writing my new book, make podcasts and videos, and am presently sleepy. So that is all for now!