Biostatistics-Epidemiology

Written paper Assignment
    Introduction
    Methods
    Results
    Discussion
SAS code in a separate file

Total: fewer than 15 pages and fewer than 4 tables, with citations. This assignment IS graded.
Please see the instructions below:
For this project, students may choose one of four publicly available datasets:
1) The 2017 Community Health Survey (CHS), conducted by the New York City Department of Health and Mental Hygiene. CHS is a cross-sectional telephone survey conducted annually by the department. Students interested in using this dataset should use the most up-to-date data (2017). The data (in SAS format) and documentation are available online at: https://www1.nyc.gov/site/doh/data/data-sets/community-health-survey-public-use-data.page
2) The 2013-14 NYC Health and Nutrition Examination Survey, a population-based cross-sectional study among a sample of non-institutionalized New York City residents aged 20 or older. It was modeled on the national NHANES and includes data from interviews, physical examinations, and laboratory tests on a number of health behaviors and outcomes. The data are available in SAS at: http://nychanes.org/data//. Note that the weights to use vary depending on the variables you include in your analysis.
3) The 2017 Youth Risk Behavior Survey (YRBS). The YRBS has been conducted by the Centers for Disease Control and Prevention (CDC) in odd-numbered years since 1997. The survey's primary purpose is to monitor priority health risk behaviors that contribute to the leading causes of mortality, morbidity, and social problems among youth in the USA. Students complete a self-administered, anonymous questionnaire that measures a variety of behaviors, including tobacco, alcohol, and drug use; violence; sexual behaviors; dietary behaviors; and physical activity. The data are available online at: https://www.cdc.gov/healthyyouth/data/yrbs/data.htm. The data are provided as an ASCII file or an Access file, and SAS code for opening the data is also provided (although it is sometimes quirky).
4) The Global School-based Student Health Survey (GSHS) from the World Health Organization and the Centers for Disease Control and Prevention (CDC). GSHS is a school-based survey conducted in various countries, primarily among students aged 13-15 years, that aims to provide data on health behaviors and protective factors related to the leading causes of mortality and morbidity among children. The data and documentation (code book, etc.) are available online at: https://www.cdc.gov/GSHS/. The data are available in SAS (as well as other formats). You should probably focus on one country if using these data, and you might want to choose a country with more recent data so that your analysis is more relevant.
Note: All four datasets have population weights, which should be used so that the descriptive statistics and analytic results apply to the total population rather than just the study sample. All four were also collected using a complex sampling scheme that needs to be taken into account during analysis. The weighting and sampling scheme can be taken into account in SAS for most statistical tests you would want to run. More complicated analyses might require SUDAAN, which is beyond the scope of this class. It is also possible to analyze this type of data in Stata, R, and, if you have the complex sampling module add-on, SPSS, but for this course we will be using SAS. The websites for the datasets have instructions for how to analyze the data, and students should refer to those instructions when planning and conducting their analyses.
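To illustrate why the population weights matter, here is a minimal sketch of the arithmetic behind an unweighted versus a weighted prevalence estimate. It is written in Python purely as an illustration (the course itself uses SAS), and the outcome values and weights are made-up numbers, not taken from any of the four datasets. It shows only the point estimate; it does not account for strata or clusters, which the survey procedures in SAS (for example PROC SURVEYFREQ) handle when computing standard errors.

```python
from typing import Sequence


def unweighted_prevalence(outcome: Sequence[int]) -> float:
    """Share of sampled respondents with the outcome (1 = yes, 0 = no)."""
    return sum(outcome) / len(outcome)


def weighted_prevalence(outcome: Sequence[int], weights: Sequence[float]) -> float:
    """Estimated share of the target population with the outcome.

    Each respondent counts for `weight` people in the population.
    """
    if len(outcome) != len(weights):
        raise ValueError("outcome and weights must have the same length")
    total_weight = sum(weights)
    weighted_cases = sum(y * w for y, w in zip(outcome, weights))
    return weighted_cases / total_weight


# Hypothetical data: five respondents, invented for illustration only.
outcome = [1, 0, 1, 0, 0]
weights = [100.0, 300.0, 50.0, 400.0, 150.0]

sample_share = unweighted_prevalence(outcome)
population_share = weighted_prevalence(outcome, weights)
print(f"unweighted: {sample_share:.2f}, weighted: {population_share:.2f}")
```

In this invented example the two respondents with the outcome carry small weights, so the weighted estimate is much lower than the raw sample share. This is why results should be reported using the weights supplied with each dataset.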

Instructions:

Describe the dataset you will use for your paper and the outcomes and exposures you want to examine. Specifically, provide a description of:

1. Planned paper topic and dataset to be used for the project.

2. Structured argument justifying paper topic with citations:
Briefly describe what you found online about the
    Study participants (eligibility, recruitment, participation rate, consent)
    Data collection procedures
    Data management procedures
    Ethical considerations

3. Choose an outcome and a key exposure, propose a hypothesis for why you think the outcome would be related to the exposure and in what direction, and identify other variables that might affect the relationship between exposure and outcome.

4. Describe how your outcome and exposure variables are coded, their frequencies (unweighted), and what recodes, if any, you might make.

5. Based on the frequencies in the code book, what are your outcome and primary exposure(s)? Do these variables have (1) sufficient variation for analysis and (2) any small cells? (3) Will you recode the variables? If so, how?
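The checks in items 4 and 5 can be sketched as follows. This is an illustrative Python example only (the course uses SAS). The variable coding (1 = every day, 2 = some days, 3 = not at all), the counts, and the small-cell threshold of 10 are all hypothetical assumptions; use the coding and frequencies from your dataset's code book and whatever threshold your analysis plan calls for.

```python
from collections import Counter

MIN_CELL = 10  # hypothetical small-cell threshold, not a course requirement


def frequencies(values):
    """Unweighted frequency of each code, ignoring missing values (None)."""
    return Counter(v for v in values if v is not None)


def small_cells(freq, min_cell=MIN_CELL):
    """Codes whose unweighted count falls below the threshold."""
    return sorted(code for code, n in freq.items() if n < min_cell)


def recode(values, mapping):
    """Collapse codes according to `mapping`; unmapped values (including None) are kept."""
    return [mapping.get(v, v) for v in values]


# Hypothetical smoking variable: 1 = every day, 2 = some days, 3 = not at all.
smoking = [1] * 8 + [2] * 6 + [3] * 40 + [None] * 2

raw_freq = frequencies(smoking)
print("raw:", dict(raw_freq), "small cells:", small_cells(raw_freq))

# Recode to current smoker (1) versus non-smoker (0) to remove the small cells.
current_smoker = recode(smoking, {1: 1, 2: 1, 3: 0})
new_freq = frequencies(current_smoker)
print("recoded:", dict(new_freq), "small cells:", small_cells(new_freq))
```

Here the two smoking categories are each too sparse on their own, and collapsing them into a single "current smoker" level yields cells large enough to analyze. Missing values are left as missing rather than being folded into a category.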

Please see the rubric below to help organize your thinking and research plan. 

Moving forward, it would help to state much more explicitly your:
1) outcome of interest
2) exposure of interest

Generally, it would be best for you to work with one outcome and one exposure. If you use one exposure and two outcomes, or two exposures and one outcome, then you would need to do two sets of analyses. You are welcome but not required to do this double amount of work. Similarly, if you used two exposures and two outcomes you would have to do four analyses; I strongly discourage this, as it would be too much work relative to the time available in this class.

3) your hypothesized relationship between your exposure and outcome (with directionality)

4) several key covariates, with a clear reason why each should be examined as a covariate

5) your denominator for your outcome and the numerator for your outcome

6) the numerator and denominator of the exposure (either yes/no or distribution if not dichotomous)

7) the population to which these numerators and denominators of exposure and outcome apply
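As a rough illustration of items 5 and 6, the sketch below counts a numerator and denominator for a yes/no variable and then tabulates the outcome within each exposure level. It is written in Python for illustration only (the course uses SAS); the data are invented, the 1/0 coding is an assumption, and the counts are unweighted.

```python
from typing import Dict, Optional, Sequence, Tuple


def numerator_denominator(values: Sequence[Optional[int]]) -> Tuple[int, int]:
    """Numerator = respondents coded 1; denominator = respondents with a valid answer."""
    valid = [v for v in values if v is not None]
    return sum(valid), len(valid)


def outcome_by_exposure(
    exposure: Sequence[Optional[int]], outcome: Sequence[Optional[int]]
) -> Dict[int, Tuple[int, int]]:
    """For each exposure level, (cases, respondents with a valid outcome)."""
    table = {}
    for level in (1, 0):
        ys = [y for x, y in zip(exposure, outcome) if x == level and y is not None]
        table[level] = (sum(ys), len(ys))
    return table


# Hypothetical data: 1 = yes, 0 = no, None = missing.
exposure = [1, 1, 1, 0, 0, 0, 0, None]
outcome = [1, 1, 0, 1, 0, 0, None, 1]

outcome_counts = numerator_denominator(outcome)
exposure_counts = numerator_denominator(exposure)
by_exposure = outcome_by_exposure(exposure, outcome)

print("outcome (numerator, denominator):", outcome_counts)
print("exposure (numerator, denominator):", exposure_counts)
for level, (cases, total) in by_exposure.items():
    print(f"exposure={level}: {cases}/{total} with the outcome")
```

Note that respondents with a missing value drop out of the relevant denominator, so the denominators for the outcome, the exposure, and the cross-tabulation can all differ. Stating each one explicitly, along with the population it represents, avoids ambiguity in the paper.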

Remember that your document must adhere to the following guidelines:

    Use Microsoft Word
    Double-space
    Paginate
    Include your name on each page either in the header or footer
    Use 12-point font
    Use 1-inch margins
    Use EndNote for references
    Include Table(s) in the manuscript Word document after bibliography or as separate Excel or PDF file(s)
    Include data cleaning and statistical analysis syntax (SAS syntax, well-organized and labeled) after Tables at the end of the manuscript as an Appendix or as a separate file
    The manuscript should be formatted for a specific journal where the manuscript could theoretically be submitted 
o    If no preferred journal is identified, format for AJPH (https://ajph.aphapublications.org/authorinstructions)
o    You do not need to include extra-manuscript components such as highlights, funding, acknowledgements, or conflicts of interest
o    You may include an abstract if you choose to  
