Every student should master a basic data-science toolkit that applies to a wide array of potential research projects. The midterm gives you an opportunity to practice and apply the skills we’ve discussed so far. It takes the form of an extended exercise using real data I collected for a recent project; you’ll read the data into R, wrangle it, report summary information, and create plots. It is somewhat open-ended, in that there may be multiple valid ways to write code to accomplish the same task, and you’ll have the option to dig further into different parts of the data.
It will be due in two phases:
For the metacognitive reflection, write a paragraph that describes your progress as a learner in this course, using the evidence provided by reviewing your midterm answers. There is no set structure, but you may want to think about questions like:
Our class GitHub organization has a midterm repository, which you will fork.
For the first submission, you’ll need to complete midterm.Rmd to the best of your ability and knit it to midterm.md; for the second submission, you’ll have those two files plus reflection.md (which you can create as an .Rmd initially if you want to embed the output of R code).
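If you prefer the console to the Knit button, the knitting step can be sketched roughly as follows. This is a minimal illustration, assuming the knitr package is installed; the file demo.Rmd and its contents are hypothetical stand-ins for midterm.Rmd, and the sketch runs in a temporary directory so it cannot overwrite anything in your repo. If your .Rmd header already specifies an output format, the Knit button in RStudio may behave slightly differently.

```r
# Assumes knitr is installed; demo.Rmd is a hypothetical stand-in for midterm.Rmd.
library(knitr)

# Work in a temporary directory so no real files are touched.
work_dir <- tempfile("knit-demo-")
dir.create(work_dir)
old_wd <- setwd(work_dir)

# Build a tiny R Markdown file: a heading plus one R code chunk.
fence <- strrep("`", 3)
writeLines(c(
  "# Demo",
  "",
  paste0(fence, "{r}"),
  "summary(c(1, 2, 3))",
  fence
), "demo.Rmd")

# knit() runs the chunks and writes a Markdown file next to the input.
out_file <- knit("demo.Rmd", quiet = TRUE)

# Return to the original working directory.
setwd(old_wd)
```

For the real file, the equivalent call from the root of your fork would be knit("midterm.Rmd").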
You will not contribute your fork to the upstream remote, so you don’t need to rename any files.
You are free to create R Notebooks in the process of figuring out your ‘final’ code, as long as your repo also has the required files.
Because your midterm repo comes from our class GitHub organization repo, it’ll be visible to other students; please do not look at your fellow students’ midterm repositories before you have submitted yours.
The midterm will be graded for effort, because I know enough about all of you by now that I can safely trust you all to work hard. You will mostly create your own feedback based on the “answer key” and your discussion with your partner, but please don’t hesitate to ask me for feedback as well. Don’t worry about making the Markdown file look pretty (unlike our To-dos, where you’re encouraged to pay attention to formatting); just pay attention to the content of your code.
tidyverse (ggplot2, dplyr, tidyr, readr, purrr, tibble, stringr, forcats), but you might find yourself using functions we haven’t discussed within those packages.
<package-name>.tidyverse.org (e.g., https://ggplot2.tidyverse.org/)
?purrr and click “index” at the bottom
midterm.Rmd document
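The help lookups listed above can also be done from the R console. The following is a small sketch, assuming purrr is installed (it comes with the tidyverse); the variable names are illustrative only.

```r
# Assumes purrr is installed.

# Show the package index (the same list reached via the "index" link on a help page).
help(package = "purrr")

# List every function the package exports, sorted alphabetically.
exported <- sort(getNamespaceExports("purrr"))
head(exported)

# Narrow the list to functions whose names start with "map".
map_functions <- grep("^map", exported, value = TRUE)
map_functions
```

Swapping "purrr" for another package name, such as "stringr", gives the same kind of overview for that package.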