Want to be a better consumer of social science research? Here's a short crib sheet for judging the general legitimacy and generalizability of virtually any social science study. Keep in mind that this crib sheet won't be 100% accurate or relevant for every study you might be reading about. But it's a good shorthand guide to help get you started.
What kind of research was it?
The most robust, best studies employ an experimental group and a control group. Studies that leave out the control group are usually less useful than those that include one. A survey is the least powerful type of research one can conduct, since it has no experimental or control group, but it can be helpful for identifying trends or zeroing in on concepts or hypotheses that can then be studied more in-depth.
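To see why a missing control group is such a problem, here's a minimal simulation in Python (all numbers are hypothetical, chosen purely for illustration): if most people improve over time anyway, a treatment-only study can look like a success even when the treatment adds nothing.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Suppose symptoms improve by ~5 points over the study period whether or
# not people are treated (natural recovery, regression to the mean, etc.).
treated = rng.normal(loc=5, scale=3, size=n)   # treatment adds nothing here
control = rng.normal(loc=5, scale=3, size=n)

print(f"Treated group improved {treated.mean():.1f} points on average.")
print("Without a control group, that looks like the treatment worked.")
print(f"But the control group improved {control.mean():.1f} points too, "
      f"so the actual treatment effect is about "
      f"{treated.mean() - control.mean():.1f} points.")
```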
How big was the study?
A study of less than 50 people in virtually any experimental design is going to have very limited generalizability, because such studies nearly always lack sufficient statistical power. This means that while the results may be potentially interesting, you should take them with a grain of salt until they are replicated in another (and preferably larger) group. (Some research, like single-case experimental designs, can also provide single data points of interest for future research, but generally tells us little about broader trends or treatments.)
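For readers who want to see the arithmetic, here's a rough power calculation using Python's statsmodels library. The "medium" effect size (Cohen's d = 0.5) is an assumption made purely for illustration, not a claim about any particular study:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# 50 participants split into two groups of 25, two-sided test, alpha = 0.05:
power = analysis.power(effect_size=0.5, nobs1=25, alpha=0.05)
print(f"Power with 25 per group: {power:.2f}")  # roughly 0.41

# Per-group sample size needed to reach the conventional 80% power:
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(f"Needed per group: {n_needed:.0f}")      # roughly 64, i.e. ~128 total
```

In other words, a 50-person study of a medium-sized effect would miss that effect more often than it finds it, which is why replication in larger samples matters so much.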
Who was in the study?
Good research seeks to use participants who are representative of the population in general. The more representative the sample, the more readily one can generalize from the results. So a study of 200 participants that is balanced for gender, race, socio-economic status, and history is far better than a study of 200 college students at Harvard or OSU.
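A toy simulation makes the point concrete. The subgroup sizes and means below are invented, but they show how a single-campus convenience sample can drift away from the population it's supposed to represent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend the population has two subgroups that differ on some outcome scale.
students = rng.normal(loc=30, scale=5, size=5_000)       # e.g., college students
everyone_else = rng.normal(loc=22, scale=5, size=45_000)
population = np.concatenate([students, everyone_else])

convenience_sample = rng.choice(students, size=200)       # 200 students only
representative_sample = rng.choice(population, size=200)  # 200 from everyone

print(f"True population mean:   {population.mean():.1f}")
print(f"Representative sample:  {representative_sample.mean():.1f}")
print(f"Convenience (students): {convenience_sample.mean():.1f}")  # biased high
```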
How long were people studied for?
A study that examines any type of treatment for less than 12 weeks is virtually useless. No clinician or doctor I know has ever seen a typical, mainstream treatment work in less than 12 weeks' time. And a survey that polls a group of people at one moment in time produces results that are good only for that specific moment.
There are good and reasonable exceptions to this rule, for the treatment of anxiety (medications are often taken as needed, not every day), and for things like acute psychosis or mania. Studies examining these specific concerns can be for shorter lengths of time and still provide valuable information.
Indeed, any study that is shorter (such as a 4-week or 8-week study) provides us some information. It's just that the information is a snapshot of the typical treatment regimen, and doesn't give us as full a picture as a longer treatment study. Study length is less of a concern for any study that is not specifically examining a treatment for a mental disorder.
Who funded the study?
Generally, studies that are government-funded will exhibit less bias than those funded by a company (such as a pharmaceutical company) with a direct interest in achieving a specific result. Virtually all studies are conducted within a university or hospital setting, however, so funding information may not be readily available (the researchers' affiliations usually tell you little about how a study was funded). Government funding doesn't mean a study can't be badly designed or implemented; it just means that you don't have to worry about "funding bias" influencing the results.
How do the authors talk about their results?
Authors should be humble and cautious about their results, and should not make overly broad generalizations or summary conclusions (especially about causation, if causation was not designed into the study, as it usually is not). Authors should also clearly describe the limitations of the current study in any journal article; articles that leave out such information should be viewed skeptically, as every study has limitations.
Authors should also clearly note the difference between clinical and statistical significance in treatment studies. A 2- or 3-point change on a scale measuring depression might be statistically significant (resulting in a "positive" result), but have little clinical significance for most participants. While it's informative to know that an experimental group is statistically different (e.g., better than chance) from a control group, that difference may not have real-world meaning to most of us.
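If you're curious how that plays out in numbers, here's a quick sketch (the depression-scale values and sample sizes are hypothetical) showing a 2-point difference that comes out "statistically significant" with a large enough sample even though the effect size stays small:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 500  # per group

# True difference of 2 points on a scale with an SD of 10: small in practice.
control = rng.normal(loc=20, scale=10, size=n)
treatment = rng.normal(loc=18, scale=10, size=n)  # lower score = less depressed

t_stat, p_value = stats.ttest_ind(control, treatment)
cohens_d = (control.mean() - treatment.mean()) / 10  # using the known SD of 10

print(f"p-value:   {p_value:.4f}")   # likely < 0.05: "statistically significant"
print(f"Cohen's d: {cohens_d:.2f}")  # ~0.2: a small, possibly trivial effect
```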
Beware, too, of studies that rely completely on clinician-rated measures or scales without any patient-rated scales. Who better to tell you a treatment is working than the patient themselves?
Thanks to CL Psych for reviewing an earlier draft of this article.
Comments
This is especially important during these days of the internet, where studies are blasted out and spread like wildfire. Who’s vetting this stuff? Journalists are only too happy to pass it along to their readers.
Great input! I've saved a copy as an aide-memoire for teaching undergrads in library research sessions (how to look at abstracts).
Good start, but a bit misleading. My first hesitation was that Dr. Grohol has a Psy.D., not a research-oriented Ph.D. Psy.D.s are clinical by nature, although there are many good Psy.D. researchers. As a Ph.D. who works with Ph.D. graduate students, I'm aware that there are many types of research that are valuable in many different ways. This article is specifically geared toward clinical psychology and clinical trials research. Within that context, I say great start. In the broader research arena, I would love to see acknowledgment of the many types of quantitative, mixed-methods, and qualitative research methods.
Ryan,
With all due respect, I'm sure you are good at some aspects of research. As a Ph.D. myself, I find that even Ph.D.'s are not all experts in the many forms of research. Some are very good at quantitative but less competent at qualitative work, and vice versa. I've even sensed Ph.D. academicians putting down qualitative research used in doctoral dissertations. In my opinion, it is almost impossible to be very good at all or most of the methodologies. This can be seen in the many Ph.D.'s hiring statisticians to do their dirty work of inferential stats processing and analysis.
I’m sure you realize that on this kind of forum you cannot even begin to scratch the surface of what research is because you are so limited to a few paragraphs. The world of research methodology is vast and voluminous, so much so that there are even whole journals dedicated to the study of studying (or doing research).
I get a chuckle or a smirk when I see Ph.D.'s either strongly or gently putting down the Psy.D. guys. I think that is quite unfortunate. I've seen Psy.D. dissertations that are as good as Ph.D. ones. The reality is that academia is negatively biased against Psy.D.'s and relegates them to second-class academicians. This is plainly evident in the many graduate academic departments that tend to use Psy.D.'s only in clinical supervision courses. These institutions give most or all of the regular courses to the Ph.D.'s. I know Psy.D.'s that run circles around Ph.D.'s, including me.
Unless I misunderstood you, I sense a bit of a put-down tone toward your Psy.D. brethren. I don't find that kind of attitude toward Ph.D.'s coming from Psy.D.'s. Please correct me if I am wrong. I welcome it.
Best regards,
Samuel Lopez De Victoria, Ph.D.
http://www.DrSam.tv
I’ll just add that the article is indeed geared toward research that is likely to make mainstream news in psychology and mental health. So while a new perception or beliefs experiment might be intellectually interesting (and there are dozens published every year), they are not likely to make the news headlines. Obviously a short “crib sheet” for consumers isn’t going to cover all possible experimental and research permutations.
As a clinician, I appreciate any effort to educate consumers of mental health services regarding the "product". However, the reality of the situation is that most people, including clinicians (especially those in private practice), do not have access to the peer-reviewed literature. The usual estimate is that it takes about seven years for the science to actually reach the public and, even then, it is truncated into the small sound/print bites the media favors so much, often presented incorrectly or only partially.
Good post and good comments. I’ve taken science reporting to task from time to time for some of the same reasons noted above. One of my favorite gotchas wasn’t listed above, though: How well were the variables defined in the first place? Are you studying what you think you’re studying? I’ve seen research reports that defined “spirituality” as church attendance, but the operational differences between the different groups studied were membership in a caring community and regular social and physical contact (hugs & handshakes) versus living alone without being a member of a social group that met weekly, looked after each other, etc. I’m not sure the researchers were observing the effects of spirituality so much as the effects of being a member of a warm human community.
I find that good research will have an end result in the workability of what was being discovered. That simply means that a common denominator can be derived from all of the results, which will help one evaluate all other data in reference to it. If it doesn't do that, then in my eyes the research is incomplete. This is probably why medicine is more profit-driven than anything else.