Reading a Research Paper - Part 2
Statistics is an important research tool used in many fields. In this assignment, you will read the original research article analyzing the dengue data which we have used in class. The purpose of this assignment is to help you learn how to read a research paper, and extract key details about the study and results. Later, we will try to replicate the authors’ results ourselves.
Overview
Dengue is a mosquito-borne viral disease which affects hundreds of millions of people each year. Early diagnosis is crucial for patients to have the best prognosis, but relies on a variety of laboratory tests. To enhance practitioners’ ability to diagnose dengue, a 2015 paper by Tuan et al. investigated the possibility of detecting dengue using a variety of clinical measurements like white blood cell count and platelet count. This activity will help you read the paper by Tuan et al. The paper is available at https://journals.plos.org/plosntds/article?id=10.1371/journal.pntd.0003638.
Outline of a Research Paper
Research papers in many fields, particularly the sciences, often contain the following main sections:
Abstract: A short overview of the full paper, giving highlights of the motivation and background, the research question, the data, and the results.
Introduction: A broad overview of the research question the authors want to study, motivation for studying this question, and the authors’ approach to answering their question. The introduction often starts very general, then narrows to the specific question addressed in this paper. More detail is provided in the introduction than in the abstract, and more time is spent on motivation and related literature.
Methods: The data and analysis techniques used to answer the research question. This section typically describes what the data looks like, how and where it was collected, and any statistical tools (e.g., visualizations, regression, hypothesis testing) that were used when analyzing the data.
Results: A summary of the analysis results, such as figures showing regression fits, and tables of regression coefficients and p-values.
Discussion: A discussion of the analysis results, in context of the original research question. In this section, explanations for why particular results were observed may be proposed.
Conclusion: A short summary of the paper and its key results, and their connections to broader scientific questions. The conclusion is often the reverse of the introduction: it starts with the specific question addressed by this paper, then discusses the implications of this research for science in general.
Reading a Research Paper
Reading a research paper, particularly in a field in which you are not an expert, can be challenging. The trick is to skim the paper for the most relevant information, and skip over technical details that are not essential to understanding the key take-aways. The questions below will guide you to the most important sections in the paper by Tuan et al.
Submission
Click here to download a template file for completing this assignment:
You are allowed to use any text editing software to make your document (e.g., Word, Pages, Google Docs), but your submission must be a PDF. If you are unsure how to save your file as a PDF, I recommend using Google!
The Abstract and Introduction
A good place to start is often with the Abstract and Introduction, which allow you to get an overview of the paper, and usually don’t contain too many technical details. The Abstract is more succinct than the Introduction, but it also provides less motivation. When the Introduction is long, you may want to skim for key details.
Read the Abstract and Introduction. Then answer the following questions.
- Why is it important for the researchers to build a model to detect dengue in hospital patients?
- What is the specific purpose of the research study?
So far, we know what question the researchers are trying to answer, and we know that they are going to build some kind of model to predict dengue. Our goal for the rest of the paper is to understand how the authors conducted this analysis. In particular, we want to answer the following questions:
- Who participated in the study?
- What did the researchers record about each participant?
- What statistical methods did the researchers use?
- What did the researchers conclude from their study?
This information is provided in the Methods and Results sections of the paper. These sections also contain many other details which are valuable, but not crucial to understand on a first reading, so we will focus on the most important parts of the Methods and Results.
Study Participants
Potential research subjects must meet certain criteria, defined by the researchers, to participate in a study. Inclusion criteria define requirements for inclusion in the study (e.g., a target age range or social group), while exclusion criteria are reasons a subject would be asked not to participate (e.g., certain medical conditions).
- What are the inclusion/exclusion criteria for this study?
Data Collection and Analysis
Now that we know who was studied, we want to know what data was collected about each participant, and how it was analyzed. Read the Clinical and laboratory investigations on the day of enrolment and Statistical methods subsections of the Methods. Then answer the following questions.
- Which variables were recorded for the patients in the study?
- Which types of statistical methods were used to model the relation between the explanatory variables and whether the patient had dengue? (It is okay if you’re not familiar with all these methods!)
- How did the researchers choose which variables to include in their logistic regression model?
- Which threshold did the researchers use when converting their predicted probabilities into binary predictions?
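If the idea of a threshold is unfamiliar, the short Python sketch below may help. It shows how predicted probabilities can be turned into binary (0/1) predictions. The function name `classify`, the probabilities, and the thresholds 0.5 and 0.3 are all made up for illustration; they are not taken from the paper by Tuan et al.

```python
def classify(probabilities, threshold):
    """Convert predicted probabilities into binary predictions.

    A patient is predicted positive (1) when the predicted probability
    is at least the threshold, and negative (0) otherwise.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


# Made-up predicted probabilities for four hypothetical patients
predicted_probs = [0.10, 0.35, 0.50, 0.80]

print(classify(predicted_probs, threshold=0.5))  # [0, 0, 1, 1]
print(classify(predicted_probs, threshold=0.3))  # [0, 1, 1, 1]
```

Notice that lowering the threshold labels more patients as positive, so the choice of threshold changes the predictions even though the model's probabilities stay the same.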
Results
Finally, let’s see what the researchers concluded from their statistical models! Read the Early Dengue Classifier subsection of the Results, then answer the following questions.
- Which variables did the researchers include in their final logistic regression model?
- Were there other models that performed as well as the model that the researchers chose? If so, why were these other models not selected?
- What does it mean that the model had a “sensitivity” of 74.8% for correctly classifying dengue?
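As a rough illustration of how sensitivity is calculated, the Python sketch below computes it from a small set of true and predicted labels. The function name `sensitivity` and the six data points are hypothetical and are not the study data. Here 1 means the patient has the disease and 0 means the patient does not.

```python
def sensitivity(actual, predicted):
    """Proportion of truly positive cases that were predicted positive."""
    pairs = list(zip(actual, predicted))
    true_positives = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_negatives = sum(1 for a, p in pairs if a == 1 and p == 0)
    return true_positives / (true_positives + false_negatives)


# Made-up labels for six hypothetical patients
actual = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 1]

print(sensitivity(actual, predicted))  # 0.75
```

In this toy example, four patients truly have the disease and the predictions catch three of them, so the sensitivity is 3/4. Patients without the disease do not enter the calculation at all.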