7 Science-Savvy Ways to Evaluate Research

Popular interpretations of science often oversimplify research and lead us astray. Some people have thrown up their hands and declared, “Research can be twisted any way to say anything anyone wants it to say.” The statement is not wrong, but it is also not grounds for dismissing science entirely. We can respond to bad science and misinterpretations of science with caution and discrimination. Fortunately, science professionals have been obsessing over how to do this for decades now, and we can approach the problem with some useful techniques.

Epistemology, the discipline that attempts to answer, “How can we know anything?” is not reducible to science. Many questions can be answered outside of science. Sometimes we can use deductive reasoning, draw on personal experience, or trust an authority on the topic. The last of these, trusting an authority, does not seem very rigorous, but can be reasonable if the person has developed a reputation for high epistemological standards.

Some epistemological methods are good for some questions but not for others. I wouldn’t ask an authority on any subject to tell me my preference for slippers, and I wouldn’t trust someone who is not an expert on a topic to give an accurate opinion about it. Almost always, with epistemology, people are working with degrees of certainty rather than with perfect and absolute Truth. (Our culture tends to seek closure and absolutes when these are actually quite rare.) Still, carefully researched hypotheses are better than a shot in the dark – a delicious espresso drink but a poor epistemological practice.


Scientific rigor refers to a number of scientific attitudes and practices demonstrating an unswerving loyalty to finding truth, setting aside any opinions or biases the researcher might have. The term also describes how close a statement or the results of a study are likely to be to the truth. The intention of scientific rigor is disciplined accuracy.

Good scientists design studies using scientific rigor, report degrees of probability and indications rather than certainties, and always follow up with a call for more research. They welcome peer review – evaluation of their work by experts. Relying solely on the peer review process has its risks and limitations, however. While people conducting peer reviews are trained, experienced, scientific professionals, they are not infallible.

Armed with an understanding of the limits of “knowing” – epistemology – and of the intention of research to seek truth beyond the personal – scientific rigor – we can use the following criteria to examine the research meaningfully on our own.

1. Sample sizes must be sufficiently large. Each scientific study, if it is to be of any use, must have a sufficiently large and representative sample. “Sufficiently large,” or meaningful sample size, can be evaluated with statistics, but as a rule the larger the sample size the better. Typically, sample sizes are limited by the amount of funding a study has.

In the field of addiction, people have asserted that one must have experienced addiction in order to understand it. In some ways this is true; the personal experience of having an addiction is not replaceable by science and is immensely valuable for helping others relate to the condition. Personal experience may even provide insight for the initial formation of scientific hypotheses. However, personal experience is a sample size of one. Science is used to evaluate things about a condition beyond the individual and personal.
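To make “sufficiently large” a little more concrete, here is a minimal sketch of one common statistical rule of thumb: the normal-approximation formula for how many people are needed to estimate a proportion (say, the share of participants who relapse) within a chosen margin of error. The function names and the default 95% confidence level are my own assumptions for illustration, and real studies use more involved power calculations tailored to their design.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, expected_proportion=0.5):
    """Approximate sample size needed to estimate a proportion.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / margin^2.
    expected_proportion=0.5 is the most conservative (largest n) choice.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_proportion
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def margin_for_sample(n, confidence=0.95, expected_proportion=0.5):
    """Approximate margin of error for a proportion estimated from n subjects."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_proportion
    return z * math.sqrt(p * (1 - p) / n)


# How the margin of error shrinks as the (hypothetical) sample grows.
for n in (25, 100, 400, 1600):
    print(f"n = {n:5d}  margin of error is about +/- {margin_for_sample(n):.3f}")

# Subjects needed for a +/- 5 and a +/- 3 percentage point margin.
print(sample_size_for_proportion(0.05))
print(sample_size_for_proportion(0.03))
```

Under these assumptions, quadrupling the sample only halves the margin of error, which is one reason funding limits sample size so sharply.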

2. One or more controls must be used for each experimental variable. Science attempts to break a large question down into a series of studies, each addressing very few, measurable variables. Human studies are notoriously difficult to engineer because of the number of variables potentially influencing the outcome. The tension is between the usefulness of a study general enough to apply to many people, and a study specific enough to be rigorous.

First, scientists attempt to narrow the focus of the study to a limited population, for example, people in prison who have heroin use disorders. Next, scientists must determine what variables still remain. Ideally, they will make all relevant variables consistent, except for the ones being tested. The more potential, unaccounted-for variables, the less useful the results. Statistics allows scientists to sort out some of this complexity, but as a general rule, fewer confounding variables are better.

For each experimental variable – for each variable being measured – scientists come up with one or more controls. In the case of addiction, if the treatment is 12-step attendance, scientists may compare it to no treatment at all. If multiple treatments are being evaluated, scientists may compare 12-step attendance with cognitive behavior therapy, dialectical behavior therapy, and no treatment. A new treatment may be compared to “treatment as usual,” or “TAU,” which must be explicitly defined.

3. Placebos are a particularly useful control for evaluating medical treatments. Using this type of experimental control, scientists will compare new treatments to false treatments, or placebos. The placebo effect refers to the ability of the human mind to fabricate some kind of change even if the treatment was a sugar pill with no effect. Scientists investigate whether treatments can physiologically improve a condition more than the brain can on its own with sufficient belief.

4. Randomization is a key part of most experimental designs to prevent bias. Randomization involves using a computer program to select test subjects or experimental plots, match test subjects with treatments, and so on. It is a technique that attempts to make samples representative of whatever is being studied. Randomization is an important precaution against bias; that is, it prevents the scientist from engineering the results of an experiment.

Especially with human studies, scientists often try to overcome some of the potential effects of human diversity (e.g., gender, race, and income level) by randomizing which test subjects are assigned to which experimental group. For example, in a study on Medication-Assisted Treatment (MAT), scientists may use a computer program to determine which person receives buprenorphine, methadone, or no medication at all.
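The kind of computer program described in the MAT example can be very small. The following is a minimal sketch, not the procedure of any particular trial: the participant IDs, the function name, and the seed are hypothetical, and real studies typically use dedicated, audited randomization systems.

```python
import random


def randomize_assignments(participant_ids, arms, seed=None):
    """Randomly assign participants to study arms in near-equal group sizes."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled participants out to the arms in turn, like dealing cards.
    return {pid: arms[i % len(arms)] for i, pid in enumerate(shuffled)}


# Hypothetical participants and the three arms from the MAT example.
participants = [f"P{n:03d}" for n in range(1, 13)]
arms = ["buprenorphine", "methadone", "no medication"]

assignments = randomize_assignments(participants, arms, seed=2024)
for pid in participants:
    print(pid, "->", assignments[pid])
```

Because the shuffle, not the scientist, decides who lands in which group, characteristics such as gender or income level tend to spread across the groups by chance rather than by anyone's choice.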

5. Double-blind experiments are a gold standard. In addition to randomization, scientists will use double-blind experiments, where possible, to assess new treatments. Not only are the patients unaware of whether they are receiving treatment or placebo, but the experimenters are unaware, too. After patients are randomized to receive either a placebo or the experimental treatment, the people administering the treatments are not told which is which, so that they can’t give the patients any unintentional hints. Double-blind trials are not always possible, especially with behavioral therapies.
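One way to picture double-blinding is as two separate lists: a sheet the clinic staff can see, which pairs each patient with a meaningless kit code, and a key held by someone outside the clinic, which says what each kit contains. Below is a minimal sketch under that assumption; the kit-code format, function name, and seed are invented for illustration and do not describe how any specific trial handles blinding.

```python
import random


def blind_allocation(participant_ids, arms, seed=None):
    """Return (blinded_sheet, unblinding_key) for a randomized, blinded study.

    blinded_sheet maps participant -> opaque kit code (safe to show staff).
    unblinding_key maps kit code -> arm (held by a third party until the end).
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Unique random 4-digit codes that carry no information about the arm.
    codes = rng.sample(range(1000, 10000), len(ids))

    blinded_sheet = {}
    unblinding_key = {}
    for i, (pid, code) in enumerate(zip(ids, codes)):
        kit = f"KIT-{code}"
        blinded_sheet[pid] = kit
        unblinding_key[kit] = arms[i % len(arms)]
    return blinded_sheet, unblinding_key


# Hypothetical patients in a two-arm placebo-controlled trial.
patients = [f"P{n:03d}" for n in range(1, 9)]
sheet, key = blind_allocation(patients, ["treatment", "placebo"], seed=7)

# What the people administering the kits would see: no arm names anywhere.
for pid in patients:
    print(pid, sheet[pid])
```

Only after the data are collected would the key be joined back to the sheet to reveal which patients received the placebo.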

6. Minimal extrapolation from results is preferred. Even a well-conducted study can be followed by fairly wild speculation on the part of the researcher. Scientists are sometimes guilty of drawing conclusions that are not supported by their research design or by their results. Usually, due to experimental limitations, the conclusions drawn should be modest. After reading the results of a study, reading the conclusion should feel like a “duh” moment.

7. Regardless of a study’s level of rigor, one study is never enough to draw a strong conclusion. As a secondary researcher, I need to use a large sample size, too. Just as my personal experience is limited as a sample size of one, each scientific study has limits to its usefulness. One of the most important aspects of the scientific method is the repeatability of experiments. Until we repeat experiments or approach a problem with multiple experiments, our ability to draw conclusions from the results is very limited. So, when I see a news article making a sensational claim that cites only a single study, I’m going to recognize that even if that study is as rigorous as it could possibly be, its results have limited value until it has been repeated.

Science is accessible to anyone and is extremely useful, though its expense, the limitations of each study, the number of repeat studies, and the work required to understand it may make it seem like an epistemological tortoise. It may be true that anyone can twist a single scientific study to seem to say something it doesn’t. It’s much harder to twist many controlled, randomized, and blinded studies with large sample sizes. Science-savvy readers, at least, will be harder to fool and will be looking for more modest, nuanced conclusions.

Photo credit: Risa Pesapane