Recently, I wrote articles on skepticism and debunking pseudoscience, both of which require large amounts of evidence. And of course, a true scientific skeptic needs to judge the quality of evidence, because individuals who push science denialism often cherry pick seemingly high quality science to support their beliefs.
A good scientific skeptic needs to decipher the science (or pseudoscience) in popular news articles, for example, to determine its validity. We should be critical, if not skeptical, of what is written in these articles to ascertain what is or is not factually scientific. We even need to determine the quality of science from the best to the weakest, so that we can determine the level of authority of the science before we pass it along to others.
Social media like Facebook and Twitter deliver claims in posts that may not exceed a few words, which makes it even more imperative that we separate the absurd (bananas kill cancer) from the merely misinterpreted (egg yolks are just as bad as smoking).
Wikipedia is one place that can either be an outstanding resource for science or medicine, or just a horrible mess with citations to pseudoscience and junk medicine pushers. For example, Wikipedia’s article on Alzheimer’s disease is probably one of the best medical articles in the “encyclopedia”. It is laid out in a logical manner, with an excellent summary, a discussion of causes, pathophysiology, mechanisms, treatments, and other issues. It may not be at the level of a medical review meant for a medical student or researcher, but it would be a very good start for a scientifically inclined college researcher or someone who has a family member afflicted with the disease.
Nearly everything in the article is supported by a recent peer-reviewed journal article. Furthermore, the article does its best to avoid primary sources (ones in which the authors directly participated in the research or documented their personal experiences) while utilizing secondary sources (which summarize one or more primary or secondary sources, usually to provide an overview of the current understanding of a medical topic, to make recommendations, or to combine the results of several studies). The reason secondary sources are so valuable is that they combine the works of several authors (and presumably locations), reducing the biases of any one laboratory or study. Secondary sources also include repetition of experiments that support or refute a hypothesis. As I’ve said many times, trust your secondary sources over just about anything, and, of course, Cochrane Reviews are nearly the best of the secondary sources (but once in a while, they get it all wrong).
A lot of people will immediately dismiss someone using Wikipedia as a source. On the other hand, some people think Wikipedia speaks with authority. As a former passionate Wikipedia editor, I find the quality so uneven that I spend more time looking at the sources of each article than I do actually reading the article itself. I do use Wikipedia to find original sources that I need, when other types of searches fail me. I also use Wikipedia as a “dictionary” of sorts to explain certain medical conditions, but I’ve been moving away from that to other sources.
So when you’re reading a Wikipedia or news article that makes a medical claim based on some primary research (or worse yet, primary research on animals or cell culture), remember that it’s a long way from there to a conclusion about humans. A very long way. Too often, Wikipedia, news articles, or your Twitter feed provide conclusions about medical or scientific findings that aren’t supported by the original source. Sometimes the original source itself offers a “conclusion” that is unsupported by its own data. Or worse yet, individuals who are using the source are intentionally misinterpreting the primary article to satisfy their own confirmation bias.
One of the better ways to ascertain the quality of research is to look at the quality of the journal. Articles in high quality journals are cited more often, because these journals attract the best scientific articles (which are cited more). The best articles are sent to these journals partially because of the prestige of the journal, but also because the peer review is so thorough. Journals use a metric called “impact factor” that essentially states how many times an average article is cited by other articles in an index (in this case for all medical journals). The impact factor could range from 0 (no one ever cites it) to some huge number, but the largest is in the 50-70 range. One of the highest impact factor journals is the Annual Review of Immunology, which is traditionally in the 50’s.
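To make the "average citations per article" idea concrete, here is a minimal sketch of the commonly used two-year version of the calculation: citations received this year to articles from the previous two years, divided by the number of citable articles published in those two years. The function name and the example journal's numbers are hypothetical, chosen only for illustration.

```python
def impact_factor(citations_this_year: int, citable_items_prior_two_years: int) -> float:
    """Two-year impact factor sketch.

    citations_this_year: citations received this year to articles the
        journal published in the previous two years.
    citable_items_prior_two_years: number of citable articles the journal
        published in those two years.
    """
    if citable_items_prior_two_years <= 0:
        raise ValueError("journal must have published at least one citable item")
    return citations_this_year / citable_items_prior_two_years


# Hypothetical journal: 400 articles over two years, cited 2,000 times this year.
print(impact_factor(2000, 400))  # 5.0
```

A journal that no one cites scores 0, which matches the lower bound described above.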
Impact Factor has its weaknesses. For example, a highly esoteric but high quality journal may have a moderate Impact Factor, yet still be a prestigious and extraordinary journal. Also, a new journal might have an artificially low Impact Factor but still be high quality. In this case, steady growth in the Impact Factor might be a good indication of its quality. Some journals also have a relatively low Impact Factor, but are still cited tens of thousands of times every year. This might also be an indication of quality.
A good search engine for finding the Impact Factor of nearly any indexed medical/science journal can be found here. Just type in the name of the journal (even the abbreviated name), and it will give the Impact Factor of a journal for the past five years.
As an independent, objective method to judge the quality of published research, Impact Factor is one of the best available. Frankly, I consider anything over 10 to be high quality, especially in very specialized fields of research in the biomedical sciences. Anything between 5 and 10 is still high quality, especially if the journal publishes a lot of articles that are widely cited. One of the problems with Impact Factor is that it is an average, so many journals publish popular articles that get hundreds of cites, while also publishing very esoteric and focused articles that don’t get cited often. Journals with impact factors of 2-5 can be high quality, especially if the journal is new or focuses on a very specialized field of science. If your sources have impact factors below 2, your review of that particular question in science probably needs to be more broadly based. As science comes to a consensus on an idea, hypothesis or theory, it gets published more and more often in higher and higher Impact Factor journals. If everything you claim is based on articles from non-peer-reviewed, pay-to-publish journals, then your case is weak, bordering on nonsense. If you’re cherry picking research only from journals with impact factors less than 1-2, then there might be a problem with the evidence you’re using to establish a scientific belief.
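These rough cutoffs, and the problem with averages, can be sketched in a few lines. The function name, the labels, and the citation counts below are hypothetical; the thresholds are simply my personal rules of thumb from the paragraph above, not an official scale.

```python
from statistics import mean, median


def rough_tier(impact_factor: float) -> str:
    """Map an impact factor onto the rule-of-thumb tiers described above."""
    if impact_factor > 10:
        return "high quality"
    if impact_factor >= 5:
        return "still high quality"
    if impact_factor >= 2:
        return "can be high quality (new or specialized journal)"
    return "broaden your review"


# Made-up citation counts for eight articles in one journal:
# two popular papers and six rarely cited ones.
citations = [300, 250, 4, 3, 2, 1, 0, 0]

print(rough_tier(39.06))   # high quality
print(mean(citations))     # 70 -> the average is driven by two papers
print(median(citations))   # 2.5 -> the typical article is barely cited
```

The gap between the mean and the median is the point: a journal-level average says little about any single article in that journal.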
But it’s also important, when you’re reading Wikipedia or science articles, to make sure that you don’t get fooled by what may appear to be valid science but really isn’t. Many journals, and a lot of research institutions, send out press releases when new articles are published. No one should dispute that these institutions have the right to “market” a new advance in science, but press releases don’t qualify as a reliable source of information.
David Gorski at Science Based Medicine wrote about this problem with press releases in Related by coincidence only? University and medical journal press releases versus journal articles. Gorski made a couple of interesting observations:
Specifically, the results support the hypothesis that university press offices are prone to exaggeration, particularly with respect to animal studies and their relevance to human health and disease, although press releases about human studies exaggerated 18% of the time compared to 41% of the time for animal studies. Again, this seems to make intuitive sense, because in order to “sell” animal research results it is necessary to sell its relevance to human disease. Most lay people aren’t that interested in novel and fascinating biological findings in basic science that can’t be readily translated into humans; so it’s not surprising that university press offices might stretch a bit to draw relevance where there is little or none.
This is very important, and it’s something that I see over and over again. There is a tendency to over-dramatize results from animal studies, when only a small percentage of compounds that are tested in animals or cell cultures ever make it into human clinical trials (let alone are approved by the FDA). The National Cancer Institute has screened over 400,000 compounds for treating cancer, and maybe 20-30,000 have even made it to early clinical trials, and of those just a handful are used in modern medicine. You have to be extremely skeptical of any article whose source is a press release, which might overstate the results (and skeptical even if the article refers directly to such a primary study).
Gorski continues with a more important issue:
In human studies, the problem appears to be different. There’s another saying in medicine that statistical significance doesn’t necessarily mean that a finding will be clinically significant. In other words, we find small differences in treatment effect or associations between various biomarkers and various diseases that are statistically significant all the time. However, they are often too small to be clinically significant. Is, for example, an allele whose presence means a risk of a certain condition that is increased by 5% clinically significant? It might be if the risk in the population is less than 5%, but if the risk in the population is 50%, much less so. We ask this question all the time in oncology when considering whether or not a “positive” finding in a clinical trial of adjuvant chemotherapy is clinically relevant. For example, if chemotherapy increases the five year survival by 2% in a tumor that has a high likelihood of survival after surgery clinically relevant? Or is an elevated lab value that is associated with a 5% increase in the risk of a condition clinically relevant? Yes, it’s a bit of a value judgment, but small benefits that are statistically significant aren’t always clinically relevant.
Now, as Gorski states, this is a bit of guesswork and instinct, but none of us should accept results as meaningful if they are tiny, even if they are statistically significant. I see this a lot, especially with alternative medicine studies that try to “prove” that they have some benefit beyond placebo. If there’s a new therapy, or a claim that “eating XYZ will reduce ABC by 5%”, those results may be no different than random.
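A little arithmetic shows why the same small effect can matter in one setting and not another. The sketch below makes two assumptions that are mine, not Gorski’s: that the “5%” increase is an absolute increase in percentage points, and that “number needed to treat” (100 divided by the absolute benefit in percentage points) is a reasonable way to express the 2% survival example. The function names are hypothetical.

```python
def relative_risk(baseline_pct: float, absolute_increase_pct: float) -> float:
    """Risk with the factor present divided by the baseline risk."""
    return (baseline_pct + absolute_increase_pct) / baseline_pct


def number_needed_to_treat(absolute_benefit_pct: float) -> float:
    """Patients who must be treated for one additional patient to benefit."""
    return 100 / absolute_benefit_pct


# A 5-point increase on a 5% baseline doubles the risk...
print(relative_risk(5, 5))    # 2.0
# ...but the same 5 points on a 50% baseline is only a 10% relative increase.
print(relative_risk(50, 5))   # 1.1
# A 2-point gain in five-year survival means treating 50 patients
# so that one additional patient survives.
print(number_needed_to_treat(2))  # 50.0
```

Whether treating 50 patients to help one is worthwhile depends on the side effects and the disease, which is exactly the value judgment Gorski describes.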
Just as bad as press releases are meeting abstracts (which, of course, are themselves promoted by research institutions’ press releases). They present incomplete, unpublished data and undergo varying levels of review; they are often unreviewed self-published sources, and their initial conclusions may change dramatically if and when the data are finally ready for publication. According to a 2002 paper in JAMA, a total of 252 news stories reported on 147 research abstracts. In the 3 years after the meetings, 50% of the abstracts were published in high-impact journals, 25% in low-impact journals, and 25% remained unpublished. Interestingly, the 39 abstracts that received front-page coverage in newspapers had a publication rate almost identical to the overall publication rate. The authors concluded that, “abstracts at scientific meetings receive substantial attention in the high-profile media. A substantial number of the studies remain unpublished, precluding evaluation in the scientific community.”
So in order of importance, here’s how a good skeptic can judge high quality research in published papers with respect to human health:
- Secondary reviews published in peer-reviewed, high-impact journals. These secondary research articles include meta-reviews, review articles, and Cochrane Collaboration reviews. These studies essentially roll up the data from possibly dozens of other research articles, while attempting to remove obviously poor quality research or biased data. They are mostly useful for examining numerous randomized clinical trials, providing the reader with a higher number of data points, usually with better statistical analysis. These are almost always published in high impact factor peer-reviewed journals. Occasionally, low quality meta-reviews, especially those that cherry pick only primary research that supports the hypothesis of the author, are published in low quality journals (and infrequently, in major journals), but are of little use. Remember, even high impact journals get it wrong: Mr. Andy Wakefield’s fraudulent paper, alleging a connection between MMR and autism, was published and subsequently retracted by the Lancet, a very high impact factor journal and one of the top medical journals in the world.
- High quality clinical trial reports. Again, if they’re published in high impact journals, and include fairly large numbers (in general, I like to see >1,000 subjects in each arm, but even those in the hundreds can be useful), they can provide evidence to support a hypothesis. However, and this is important, bad study design is bad study design. That’s why the peer-review process is so important, and that’s why we add weight to it. The expectation is that other experts can sort out statistical and design problems before the paper is published.
- Small clinical trials. At this point, they border on observational studies that need to be confirmed (or refuted) by larger studies. In the world of clinical trials for pharmaceuticals, research starts small (fewer than 100 patients) at Phase I, progresses through several hundred patients for Phase II, and finally ends with several thousand in Phase III. At each phase more and more data are accumulated, but at Phase I, little can be concluded from the data except for basic safety.
- Animal or cell culture studies. This is basic primary research which may or may not be applicable to human health issues. Too much credence is given to studies that may be 20 years away from having any applicability, if they ever have any at all. Someone publishes that cannabis “cures” breast cancer in mice, and that is used to support the smoking of pot to “cure cancer”. Except there’s no real evidence that it can. And the same issues apply here as elsewhere: is the research in a high quality journal? There is an old joke in medical research circles: science has cured cancer in mice for decades.
- Meeting abstract or poster presentation. These are presentations made at meetings/conventions of a scientific society. There are hundreds made every year, and preliminary research is presented in poster sessions or formal presentations. Usually, there are a lot of questions and answers that can help explain the data, but those of course don’t show up in a link to the abstract or poster. Moreover, these types of presentations vary from nearly ready for publication to pure speculation. They have not been peer-reviewed (although peer review can unintentionally happen through tough questions). They are not published formally. And they often do not contain enough explanation of methods, data, analysis, and other issues to evaluate properly. I would not consider this type of information as anything but “observational” in a scientific sense, especially since, as mentioned above, a quarter are never published and only around half reach high-impact journals.
- Press releases or news reports. Do not accept the conclusions stated by a press release from a major research university, because they haven’t been peer reviewed. I have a feed filled with press releases from major research universities, and I’ve found errors in interpretation from the university’s public relations office relative to the real research. However, it is possible to use a press release to chase down the actual published, peer-reviewed study.
- Studies published in low impact factor journals (those with ratings less than 5). I am open-minded that some article published in a bad journal with low quality peer review might actually be good research, so the line I draw at an impact factor of 5 is purely arbitrary. In the world of academic research, job promotions and tenure depend on publishing in high impact factor journals (almost always greater than 10), so impact factor has been correlated to quality of the research. In general, the scientific consensus is supported by lots of publications in high impact (and many times, low impact) factor journals. Good research that goes against the scientific consensus can make it into high impact factor journals, but only if it is high quality research, with strong supporting evidence, methods, and analysis. If you’re going to say that all immunology is wrong, and vaccines don’t work, a point of view that is definitely against the scientific consensus, then publishing in insignificant, non-peer-reviewed journals will have little meaning to anyone. Well, as far as I can tell, pseudoscience-pushing individuals love these articles published in insignificant journals.
- Medical case reports. I despise these types of articles. They often show up in high quality medical journals, but they are usually observations of one or two patients. Those in medicine know their purpose, which is to give a “heads up” about an observation. They have no scientific validity beyond the observational. Unfortunately, science deniers will pull out a published case report and use it to condemn a whole field of medicine. Don’t use them to support your argument, one way or another.
- Natural News from Mike Adams, the Health Ranger. Whale.to. Let me make this clear, Natural News is a foul, fetid, putrid sewer of information, which befouls any real science with lies. The antisemitic, anti-science, anti-knowledge website, Whale.to, similarly reeks of malodorous smells of pure garbage. Anyone who uses either site as a source for anything in medicine loses the discussion without any further debate. The pseudoscience pushers who claim that they’ve done extensive “research” into a subject, and then quote either of these websites, are beneath contempt, and they support the hypothesis that those who support anti-science are intellectually lazy charlatans.
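The ranking above can be written down as a simple ordered list and used to sort whatever sources someone hands you. This is only a sketch of my own ordering: the names `Evidence` and `strongest_first` and the example sources are hypothetical, and the numbers carry no meaning beyond their order.

```python
from enum import IntEnum


class Evidence(IntEnum):
    """Evidence types in the order listed above; lower is stronger."""
    SECONDARY_REVIEW = 1
    LARGE_CLINICAL_TRIAL = 2
    SMALL_CLINICAL_TRIAL = 3
    ANIMAL_OR_CELL_STUDY = 4
    MEETING_ABSTRACT = 5
    PRESS_RELEASE_OR_NEWS = 6
    LOW_IMPACT_JOURNAL = 7
    CASE_REPORT = 8
    JUNK_WEBSITE = 9


def strongest_first(sources):
    """Sort (description, Evidence) pairs from strongest to weakest."""
    return sorted(sources, key=lambda source: source[1])


# Hypothetical pile of sources offered in support of a claim.
sources = [
    ("university press release", Evidence.PRESS_RELEASE_OR_NEWS),
    ("Cochrane review", Evidence.SECONDARY_REVIEW),
    ("mouse study", Evidence.ANIMAL_OR_CELL_STUDY),
]

for description, kind in strongest_first(sources):
    print(kind.name, "-", description)
```

A real judgment also weighs journal quality and study design within each tier, so treat the ordering as a starting point rather than a score.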
It’s really not that hard to determine what’s good science and what is bad. Not all science is equal. If you want to peer into the future, a meeting abstract or a primary animal (or cell culture) study may give a hint as to where clinical research is heading 10-20 years from now. But it’s impossible to predict which of those studies will just end up hitting a dead end.
- Demicheli V, Rivetti A, Debalini MG, Di Pietrantonj C. Vaccines for measles, mumps and rubella in children. Cochrane Database Syst Rev. 2012 Feb 15;2:CD004407. doi: 10.1002/14651858.CD004407.pub3. Review. PubMed PMID: 22336803. Impact factor: 5.720.
- Retraction–Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children. Lancet. 2010 Feb 6;375(9713):445. doi: 10.1016/S0140-6736(10)60175-4. PubMed PMID: 20137807. Impact factor: 39.060.
- Schwartz LM, Woloshin S, Baczek L. Media coverage of scientific meetings: too much, too soon? JAMA. 2002 Jun 5;287(21):2859-63. PubMed PMID: 12038934. Impact factor: 30.000.