The Bangladesh Mask Trial is re-analyzed and it falls apart
And, we are still missing an endpoint
Recently Marina Chikina, Wes Pegden, and Ben Recht published their re-analysis of the Bangladesh mask RCT, and the trial appears to fall apart. I will do my best to summarize the key issue in the study, and why this re-analysis is provocative.
First, let’s start with basics. There is a reason that the CDC, WHO and Fauci himself advised against community masking in early March 2020— and it was not to protect the supply for health care workers. The true reason is simple: the pre-existing evidence was poor. That’s the opinion of the Cochrane collaboration, and a systematic review that I participated in on the topic. These agencies truly did not believe it would help because that’s just what the evidence said.
How did COVID19 studies change the evidence? Well, there was a sea of low quality observational studies. They are not worth considering, as noise is 2 orders of magnitude larger than signal. I have debunked dozens in these pages and on YouTube.
There was one individual level randomized trial (DANMASK) that failed to find a benefit, but was limited by low power to exclude a small benefit. There were just 2 cluster RCTs run globally— and none pertaining to children.
One cluster RCT has not been published, and the other is the Bangladesh study. Bangladesh is a cluster RCT that randomized villages either to free masks (surgical or cloth) plus encouragement for adults to wear them, or to no intervention, and followed people for COVID19 outcomes. The study found that surgical masks lowered rates of COVID19, though the effect is very small, applies only to adults before vaccination and before widespread natural immunity, and was not seen for cloth masks.
Enter the re-analysis. The authors noticed a difference between the arms in the number of people enrolled in the study: about 9% more people enrolled in the free mask arm. This 9% gap is highly statistically significant, i.e. very unlikely to be a chance imbalance.
Of course, the purpose of a randomized trial is to minimize confounding and to balance outcome distributions in the absence of a treatment effect, but an imbalance in the size of the groups suggests that something may have happened to jeopardize that balance.
What would cause more people to sign up for the treatment arm (free mask) than the control arm? One possibility is that concealment was violated: people knew that they might get something for free in one arm, but did not feel they would get anything in the other arm.
If participants could see a big truck or boxes in intervention villages, but not see that in control villages, they may be more likely to enroll. In fact, 9% more likely!
This has huge implications. Is the extra 11th person in the mask arm the same as the 10 people in the control arm? Or is this the type of person who enrolls only on the margin: someone who enrolls only if they are getting something for free, and who is thus slightly less likely to properly report COVID symptoms (perhaps they report less, or differently) and less likely to follow through with testing?
The authors argue this is possible, and that it threatens the entire trial. If these marginal enrollees are even a little different, that alone can tip the trial's results. Their paper nicely probes this statistically and is worth your time.
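To get a feel for how little it takes, here is a back-of-envelope sketch in Python. It is not the method used in the re-analysis, just a simplified illustration under loudly stated assumptions: masks have zero true effect, the mask arm enrolls 9% more people, and those marginal enrollees report the outcome at some lower rate than everyone else. The function name and the `reporting_deficit` parameter are hypothetical, and the deficit values are made up for illustration, not estimated from the trial.

```python
def apparent_relative_reduction(extra_enrollment: float, reporting_deficit: float) -> float:
    """Apparent relative reduction in the reported outcome rate when masks do nothing.

    extra_enrollment: fraction of additional enrollees in the mask arm
        (0.09 means 9% more people than the control arm).
    reporting_deficit: ASSUMED fraction by which the extra enrollees under-report
        the outcome relative to everyone else (0 = identical, 1 = never report).
    """
    control_rate = 1.0  # reported outcome rate in the control arm, normalized to 1
    marginal_rate = control_rate * (1.0 - reporting_deficit)
    # Mask arm = the same kind of people as control, plus the marginal enrollees.
    mask_rate = (control_rate + extra_enrollment * marginal_rate) / (1.0 + extra_enrollment)
    return 1.0 - mask_rate / control_rate


# Hypothetical deficits, chosen only to show the range of possible distortion.
for deficit in (0.0, 0.25, 0.5, 1.0):
    reduction = apparent_relative_reduction(0.09, deficit)
    print(f"reporting deficit {deficit:.0%}: apparent reduction {reduction:.1%}")
```

In this toy model the spurious "benefit" works out to 0.09 × deficit / 1.09, so it is zero if the extra enrollees behave like everyone else and tops out at roughly 8% if they never report at all. The point is only that a denominator imbalance of this size is large enough to matter when the claimed effect is itself small.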
I have one separate question about this study. The strongest secondary endpoint, and the only one that is truly bias resistant, is random seroprevalence (which does not rely at all on reporting). This endpoint remains listed on ClinicalTrials.gov but is unreported. It must be completed and reported.
Finally, it is worth restating: Bangladesh has no relevance to children, or to post-vaccine populations. It also does not apply after seroprevalence rises. In other words, it is not relevant for 2022 America, but would be good to pin down for historical reasons.
Here is my thread on the topic
Imagine if I wrote a paper advocating that everyone wear a rabbit's foot to protect against car accidents, and then a year later I published a paper showing an absolute reduction in car accidents of 0.0009. Would the Washington Post call this "the nail in the coffin that rabbit's feet work"? Or would you instead expect some rigorous skepticism?
What the hell happened to science?
The author of the Bangladesh Mask Study, Jason Abaluck, previously authored "The Case for Universal Cloth Mask Adoption and Policies to Increase Supply of Medical Masks for Health Workers" on 4/6/2020 - https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3567438 which concludes:
"The economic case for universal mask wearing is convincing and urgent, but the moral need to provide adequate equipment to frontline healthcare workers is an even higher imperative. Enacting policies to increase medical mask production, and concurrently encouraging the widespread production and use of cloth masks can achieve both objectives. Public officials should encourage and support universal cloth mask adoption immediately."
Is it possible that, because he started the Bangladesh Mask Study already convinced of the conclusion that masks work, this created subconscious bias in his design, data gathering, and synthesis of results?
Look how his co-authors framed the minimal efficacy findings in their study when given the chance to summarize their findings in the Washington Post:
(this is typically paywalled but I used the gift option to bypass that)
"We conducted the largest study on masks and covid-19: They work"
https://wapo.st/3DAYmPQ
Hilarious excerpts include: "We now have the results, and the answer to both questions is an emphatic yes." ... "The results were striking." ... "We found that surgical masks averted 1 in 3 symptomatic infections among those aged 60 and older." ... "Masks may have even greater potential than our study was able to demonstrate." ... "Our study showed conclusively that masks are a cost-effective way to reduce infections and demonstrated that a mask-promotion strategy can work."
Note they didn't bother to point out they found cloth masks didn't work, but instead data-dredged their way to the "1 in 3 cases averted over the age of 60" without a second glance at the ridiculousness of that statement.
The press uncritically regurgitated the hype without ever reading the actual study.
Nature: "Face masks for COVID pass their largest test yet"
https://www.nature.com/articles/d41586-021-02457-y
"“This really should be the end of the debate,” says Ashley Styczynski, an infectious-disease researcher at Stanford University in California and a co-author of the preprint describing the trial. The research “takes things a step further in terms of scientific rigour”, says Deepak Bhatt, a medical researcher at Harvard Medical School in Boston, Massachusetts, who has published research on masking."
WebMD: "Large Study Confirms Masks Work to Limit COVID-19 Spread"
https://www.webmd.com/lung/news/20210907/masks-limit-covid-spread-study
Former Director of the CDC Tom Frieden: "Masks work. A new study of more than 340,000 adults across Bangladesh found that the more people wore masks, the less spread of Covid there was"
https://twitter.com/DrTomFrieden/status/1434901885106876416
LiveScience: "Huge, gold-standard study shows unequivocally that surgical masks work to reduce coronavirus spread"
https://www.livescience.com/randomized-trial-shows-surgical-masks-work-curbing-covid.html
"Results from a massive study in Bangladesh unequivocally show that surgical masks reduce the spread of SARS-CoV-2, scientists say. "
HOW??!! How can science be in such a state of disarray where we are so easily fooled? How can journalists covering science fall for such nonsense?
Why did it take a bad cat to debunk this and not The New York Times or Nature or The Lancet?
https://boriquagato.substack.com/p/bangladesh-mask-study-do-not-believe
We are going to look at this as a mini-dark age.
I have spent many hundreds of hours trawling through the data for this study (https://gitlab.com/emily-crawford/bd-mask-rct). According to that data, some individuals were given a serology test at endline despite being "asymptomatic" (that is, not meeting the study criteria for COVID-19 symptoms, which are based on the WHO definition of a probable COVID-19 case). I am not sure whether or not these individuals were randomly selected.
Out of 2375 asymptomatic individuals tested in the control group, 449 were seropositive (18.9%).
Out of 2987 asymptomatic individuals tested in the intervention group, 543 were seropositive (18.2%).
This is a smaller relative reduction (about 3.8%) than the one reported in the study, which is itself on the edge of statistical significance.
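For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the four counts quoted above. The two-proportion z-test is my own simplification, not the study's analysis: it treats every tested person as an independent draw, ignores the clustering by village, and says nothing about whether these individuals were randomly selected, so take the p-value as a rough indication only. The function name is hypothetical.

```python
import math


def two_proportion_z_test(pos_a: int, n_a: int, pos_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both groups share one true proportion.

    Simplifying assumption: individuals are independent (no village clustering).
    """
    p_a, p_b = pos_a / n_a, pos_b / n_b
    pooled = (pos_a + pos_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return z, p_value


# Counts of asymptomatic individuals tested, as quoted in the comment above.
control_pos, control_n = 449, 2375
mask_pos, mask_n = 543, 2987

control_rate = control_pos / control_n
mask_rate = mask_pos / mask_n
relative_reduction = 1.0 - mask_rate / control_rate
z, p_value = two_proportion_z_test(control_pos, control_n, mask_pos, mask_n)

print(f"control seropositive:      {control_rate:.1%}")
print(f"intervention seropositive: {mask_rate:.1%}")
print(f"relative reduction:        {relative_reduction:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

Run as written, this reproduces the 18.9%, 18.2% and 3.8% figures and gives a p-value well above 0.05. So under these simplifying assumptions, the difference among asymptomatic individuals is not distinguishable from zero.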
Also notably, the percentage of symptomatic individuals who tested seropositive is 29.9%. So symptomatic individuals were more likely to test seropositive than asymptomatic individuals, but not by as much as one would expect or hope (especially given the authors' claim that they ran the symptom surveys in order to cut down on the number of blood tests they would need to perform to find seropositive results). This is one of many indicators that the symptom survey is unreliable.
We must keep in mind that the surveys themselves are fraught with problems. A researcher went to each enrolled household, asked "has anyone in your household experienced any of these symptoms in the past four weeks?", and read out a list of 11 symptoms. That list of symptoms doesn't even match what's reported in the Science paper. (Here is the full script: https://docs.google.com/document/d/14li6x3Wg0INClpBfuTzhFcALqC6aFyNtYc9DjVurcv0/edit.) This is a far cry from meeting any reasonable standard for COVID-19 diagnosis, including the one used by the FDA for vaccine trials (https://www.fda.gov/media/142143/download).