How to do Statistical Research

How to do a Statistical Project and Write a Proposal and Paper for this project
By Lynn Kuo

Obtain research ideas by
Taking classes
Attending seminars
Consulting
Collaborating
Preliminary work: go to the HuskyCT site for this course.
If you need an interesting data set, go to the folder “Data sets” and browse the file “Source for useful data”.
In the same folder, browse writing sample (1), writing sample (2), writing sample #3, writing sample #4, and writing sample #5 to see what I consider to be good papers.
Purpose of the Project: Actually do statistics; Any topic; Make it interesting
Step 1: Find a topic which
interests you
you can research easily
Write out topic and brainstorm.
Select your paper’s specific topic from this brainstorming list.
In a sentence or short paragraph, describe what you think your paper is about.
Step 2: Initial Planning, Investigation, and Outlining

Form a hypothesis, design a study, conduct the study, make notes.
Collect the data (with time constraints, this can be substituted by finding a data set collected by others); Describe the data.
Make a tentative outline to guide writing
Reorganize notes, fill in the tentative outline.
Make a final outline, reorganize notes accordingly.
As you decide where you will use outside resources in your paper, note the use of outside resources in a different font or text color from the rest of your outline. It is important to maintain a clear distinction between your own words (ideas) and those of others.
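
The "describe the data" step can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the data set below is hypothetical (the column names `group`, `hours_slept`, and `exam_score` are invented for illustration), and the numeric/categorical check is a rough heuristic, not a substitute for reading the data documentation. It reports the number of observations, the number of variables, and the apparent nature of each variable.

```python
import csv
import io

# Hypothetical data, invented for illustration only.
RAW = """group,hours_slept,exam_score
A,7.5,82
A,6.0,75
B,8.0,90
B,5.5,68
B,7.0,85
"""


def is_number(text):
    """Return True if the string can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def describe(csv_text):
    """Summarize a CSV data set: row count, variable count, variable kinds."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    variables = list(rows[0].keys())
    kinds = {}
    for name in variables:
        values = [row[name] for row in rows]
        # Heuristic: a variable is "numeric" only if every value parses as a number.
        kinds[name] = "numeric" if all(is_number(v) for v in values) else "categorical"
    return {"n_rows": len(rows), "n_variables": len(variables), "kinds": kinds}


summary = describe(RAW)
print(summary)
```

For a real project, the same function could be pointed at the text of your own CSV file instead of the inline `RAW` string.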

Step 3: Analyze Data; Describe your Statistical Methods

Perform data analysis
Write the paper, describe the statistical methods
Use your outline to guide you
Summarize the results of data analysis
Write quickly—capture flow of ideas—deal with proofreading later
Put aside overnight, if possible
Fill in details on the writing

Step 4: Draw Conclusions

Discuss limitations of the paper
Anticipate future research
Recap the whole paper

Check organization—reorganize paragraphs and add transitions where necessary.
Make sure all researched information is documented.
Rework introduction and conclusion.
Work on sentences—check spelling, punctuation, word choice, etc.
Read out loud to check for flow.
Pair up with a classmate and proofread each other’s papers. Turn on Track Changes in MS Word to mark suggested changes.

Proposal
Explain your explanatory and response variables and how you have collected the data. If you use human subjects, you must also make sure that your study will be safe and ethical (anonymous, able to quit at any time, informed consent).
Anticipate appropriate statistical methods to be used for analyzing data.
Consider what conclusions your analysis might lead to.
Rubric for the proposal (around 1 page), due February 22nd, 2020
(i) Introduction: introduce the topic and explain why you have chosen it (3-5 lines). Briefly mention current related research (cite sources).
(ii) Formulate a research hypothesis in the chosen topic. Briefly describe why you selected this hypothesis and its importance in the field (cite sources).
(iii) Provide a brief description of your data set (for instance: number of rows (observations), number of variables, variables of interest, nature of the variables) and the source of your data set if you did not collect it yourself.
(iv) Briefly describe (5-7 lines) your plan of action. If you choose to apply any specific statistical methods, please mention and cite them. Be consistent with your research hypothesis. Write a line on how the methods would help in investigating the research hypothesis.
(v) Finally, state your intuitions about and expectations for the research hypothesis. What do you expect to find, and why?
20 points for each of the above 5 items. The proposal should have at least two cited sources (5 bonus points for each well-cited source).

First Draft: due April 5th
1. Introduction
Rubric: Complete=30 pts
Describes the context of the research
Has a clearly stated question of interest
Clearly defines the parameter of interest and states correct hypotheses
Thorough literature search
Question of interest is of appropriate difficulty

Minimal=10 pts
Briefly describes the context of the research
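
As an illustration of what "clearly defines the parameter of interest and states correct hypotheses" can look like, consider a hypothetical two-group comparison (the setting is an assumption made for this example, not taken from the course materials). Let $\mu_1$ and $\mu_2$ denote the population mean responses in the two groups, so the parameter of interest is the difference $\mu_1 - \mu_2$. A two-sided pair of hypotheses is then:

```latex
H_0:\ \mu_1 - \mu_2 = 0
\qquad \text{versus} \qquad
H_a:\ \mu_1 - \mu_2 \neq 0
```

A one-sided alternative such as $H_a:\ \mu_1 - \mu_2 > 0$ is appropriate only when the direction is fixed before looking at the data.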

2. Data Collection
Complete=20 pts
Method of data collection is clearly described
Includes appropriate randomization
Describes efforts to reduce bias, variability, confounding

Minimal=5 pts
Some evidence of data collection
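
One way to document "appropriate randomization" is to generate the group assignment with a seeded random number generator and report the seed. The following is a minimal sketch, assuming a two-group experiment; the subject IDs, group labels, and the helper name `randomize` are all hypothetical.

```python
import random


def randomize(subjects, groups=("treatment", "control"), seed=None):
    """Randomly assign subjects to groups of (nearly) equal size."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    assignment = {}
    for i, subject in enumerate(shuffled):
        # After shuffling, alternate group labels so group sizes stay balanced.
        assignment[subject] = groups[i % len(groups)]
    return assignment


subject_ids = [f"S{i:02d}" for i in range(1, 13)]  # hypothetical subject IDs
assignment = randomize(subject_ids, seed=2020)
for subject in subject_ids:
    print(subject, assignment[subject])
```

Shuffling first and then alternating labels keeps the groups the same size, which a coin flip per subject would not guarantee.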

3. Graphs and Summary Statistics
Complete=10 Bonus pts
Appropriate graphs are included
Graphs are neat, clearly labeled, and easy to compare
Appropriate summary statistics are included
Graphs and summary statistics are used to give a preliminary answer to the question of interest

Minimal=2 Bonus pts
Graphs or summary statistics are included
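
The summary statistics asked for above can be computed with Python's standard `statistics` module. This is a minimal sketch on hypothetical measurements for two groups (the numbers and group names are invented for illustration); graphs are left out because they would require a plotting library.

```python
import statistics

# Hypothetical measurements for two groups, invented for illustration only.
scores = {
    "control": [68, 72, 75, 70, 74, 69, 73, 71],
    "treatment": [78, 74, 80, 77, 82, 76, 79, 75],
}


def summarize(values):
    """Return sample size, mean, standard deviation, and five-number summary."""
    q1, median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


summaries = {group: summarize(values) for group, values in scores.items()}
for group, s in summaries.items():
    print(group, {k: round(v, 2) for k, v in s.items()})
```

Putting the two groups' summaries side by side gives the "preliminary answer" the rubric mentions, before any formal inference is attempted.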

4. Analysis
Complete=30 pts
Correct inference procedure is chosen
Justifies use of inference procedure
Test statistic/P-value or confidence interval is calculated correctly
P-value or confidence interval is interpreted correctly

Minimal=10 pts
Inference procedure is attempted
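
As one example of an inference procedure, the sketch below runs a two-sided permutation test for a difference in group means. The permutation test is chosen here only because it needs nothing beyond the Python standard library; it is not prescribed by this rubric, and a t-test or another procedure may suit your design better. The data and the function name `permutation_test` are hypothetical.

```python
import random
import statistics


def permutation_test(sample_a, sample_b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean of b minus mean of a) and an
    estimated p-value.
    """
    rng = random.Random(seed)
    observed = statistics.fmean(sample_b) - statistics.fmean(sample_a)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[n_a:]) - statistics.fmean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical measurements, invented for illustration only.
control = [68, 72, 75, 70, 74, 69, 73, 71]
treatment = [78, 74, 80, 77, 82, 76, 79, 75]

observed, p_value = permutation_test(control, treatment)
print(f"observed difference = {observed:.3f}, p-value = {p_value:.4f}")
```

The p-value is the estimated proportion of label reshufflings that produce a difference at least as large as the observed one, so a small value is evidence against the hypothesis of no difference between groups.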

5. Conclusion
Complete=20 pts
Uses P-value/confidence interval to correctly answer question of interest
Discusses what inferences are appropriate based on study design (population/cause-effect)
Shows good evidence of critical reflection (discusses Type I and Type II errors, limitations, etc.)

Minimal=5 pts
Makes a conclusion
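
When discussing Type I and Type II errors, a small simulation can make the two error rates concrete. The sketch below is an illustration under stated assumptions, not part of the rubric: it uses a two-sided one-sample z-test with known standard deviation, and the sample size, effect size, and function name `rejection_rate` are all hypothetical choices.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, null_mean=0.0, sigma=1.0, n=25,
                   alpha=0.05, n_sims=4000, seed=1):
    """Fraction of simulated samples in which a two-sided z-test rejects H0."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - null_mean) / (sigma / math.sqrt(n))
        if abs(z) > critical:
            rejections += 1
    return rejections / n_sims


# H0 is true here, so every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so every rejection is a correct decision (power).
power = rejection_rate(true_mean=0.5)
# Failing to reject a false H0 is a Type II error.
type_ii_rate = 1 - power

print(f"Type I error rate ~ {type_i_rate:.3f}")
print(f"Power ~ {power:.3f}, Type II error rate ~ {type_ii_rate:.3f}")
```

The simulated Type I error rate should land near the chosen significance level, while the Type II error rate depends on the sample size and on how far the true mean is from the null value.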

6. Overall Presentation/Communication
Complete=10 Bonus pts
Clear, holistic understanding of the project
Paper is well organized, neat and easy to read
Statistical vocabulary is used correctly

Minimal=0 Bonus pts
Communication and organization are very poor

Rubric for the first draft (summary of previous pages):
(1) Define an interesting problem supported with a critical literature review (30 points).
(2) Identify or choose—and justify—the measurement system; Design the collection of data; Piloting and discerning “exploratory” data analysis (20 points).
(3) Hypothesis generation for the problem stated in (1), and statistical methods, inferential data analysis, and interpretation of results (30 points).
(4) Draw and contextualize conclusions and references, writing proficiency (20 points).

Deadline and Rubric for the Final Paper

Rubric for the final version:
1.  Interesting topic (20 points);
2.  Data, hypothesis, and methods (30 points);
3.  Writing skill, conclusion, and references (20 points);
4.  Incorporate comments from the first draft (30 points).

– All graphs and figures should be accompanied by suitable titles, legends, headings, and captions.
– Attach your code in an appendix.

From Terence’s Stuff: How to do Statistical Research

Xiao-Li Meng: X-L Files: Rejection Pursuit, http://bulletin.imstat.org/2013/08/1562/

References
(I) Terry Speed: Terence’s Stuff: How to do Statistical Research, https://www4.stat.ncsu.edu/~davidian/st810a/speed.pdf.

(II) Xiao-Li Meng: X-L Files: Rejection Pursuit, http://imstat.org/2013/08/29/1562/.
