Due by 12pm Thursday, Sep 2
The Internet is full of published linguistic data sets. Let’s data-surf! Instructions:
Go out and find two linguistic data sets you like. One should be a corpus; the other should be something friendlier for R (typically, data in table form). They must be free and downloadable in full. Make sure they are linguistic data sets, meaning they were designed for linguistic inquiry.
You might want to start with various bookmark sites listed in the Datasets section of our Learning Resources page. But don’t be constrained by them.
Download the data sets and poke around. Open up a file or two to take a peek.
In a text file (with a .txt extension), make note of:
.md file instead of a text file.
Submission: Upload your text file to the To-do 1 submission link on Canvas.
Due by 12pm Tuesday, September 7
Time for some hands-on practice! Do the following:
Install/update the tidyverse by opening up RStudio and running install.packages("tidyverse") in the console. If you’ve done it correctly, then running packageVersion("tidyverse") should return ‘1.3.1’.
Learn about ggplot2! Go through the data visualization chapter in our class version of R for data science. (Pay attention to the yellow blocks, where I’ve injected notes for our class into the chapter!) And then go through at least one other data visualization resource on our learning resources page.
It’s up to you how thoroughly you want to interact with these materials. You could just read them, or you could just copy and paste the code. But for coding, the most effective way to learn is by doing—not just typing out all the commands yourself and ensuring you get the same output, but tinkering and exploring.
Create your own study notes as ggplot2_notes_YOURNAME.Rmd. Include examples, explanations, etc. You are essentially creating your own reference material.
Submission: Share your notes on the #todo2 channel on Slack.
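For the installation step in this to-do, a quick check can save a round of troubleshooting. The sketch below installs the tidyverse only when it is missing (and only in an interactive session such as the RStudio console), then compares the installed version against 1.3.1. The helper name meets_minimum is made up for this illustration, and it assumes that a version newer than 1.3.1 is also acceptable.

```r
# Hypothetical helper (not part of any package): TRUE when `installed`
# is at least `required`, comparing version numbers rather than strings.
meets_minimum <- function(installed, required = "1.3.1") {
  package_version(installed) >= package_version(required)
}

# Install the tidyverse only when it is missing, and only when run
# interactively (e.g., from the RStudio console).
if (interactive() && !requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Report the installed version and whether it meets the expected minimum.
# This step is skipped if the tidyverse is not available.
if (requireNamespace("tidyverse", quietly = TRUE)) {
  installed <- as.character(packageVersion("tidyverse"))
  message("tidyverse ", installed, "; at least 1.3.1: ", meets_minimum(installed))
}
```

Comparing with package_version() matters because plain string comparison would rank "1.10.0" below "1.3.1".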
Due by 12pm Thursday, September 9
Share your GitHub username in a file called github_YOURNAME.txt (e.g., github_Dan.txt).
Learn about dplyr! Like you did for To-do 2, go through the data transformation chapter in our class version of R for data science, and create your own study notes as dplyr_notes_YOURNAME.Rmd.
What’s your muddiest point for dplyr? Create a file called dplyr_muddiest_YOURNAME.txt that has your muddiest point: After going through the dplyr chapter, what’s the concept or skill that’s giving you the most issues? What are you most unsure about?
Submission: Share your files on the #todo3 channel on Slack. You should have 3 files:
github_YOURNAME.txt
dplyr_notes_YOURNAME.Rmd
dplyr_muddiest_YOURNAME.txt
Due by 12pm Tuesday, September 14
It’s time to flex our newfound skills with GitHub and R Markdown!
Clean up your R Markdown files from To-dos 2 & 3. Use R code chunks, text formatting, headings, and session info.
Knit your R Markdown files to GitHub-flavored markdown files
Submission: Put your files in the todo2/ and todo3/ directories of the Class-Exercise-Repo. You should have 2 files in each: an Rmd file and an md file. Commit your changes, push to your GitHub fork, and create a pull request for me.
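The knitting step in this to-do can also be run from the console. The following is a minimal sketch, not the required workflow: it writes a placeholder Rmd file to a temporary directory, with a YAML header that requests github_document output and a chunk that prints session info, and then knits it with rmarkdown::render() if rmarkdown and pandoc are available. The title and file contents are placeholders invented for the example.

```r
# Build a minimal R Markdown file in a temporary directory.
# The file name follows the course pattern; the content is a placeholder.
fence <- strrep("`", 3)  # three backticks, built here to keep this block readable
rmd_lines <- c(
  "---",
  'title: "ggplot2 notes"',
  "output: github_document",
  "---",
  "",
  "## Session info",
  "",
  paste0(fence, "{r}"),
  "sessionInfo()",
  fence
)
rmd_path <- file.path(tempdir(), "ggplot2_notes_YOURNAME.Rmd")
writeLines(rmd_lines, rmd_path)

# Knit only when rmarkdown and pandoc are both available on this machine.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  md_path <- rmarkdown::render(rmd_path, quiet = TRUE)
  message("Knitted: ", basename(md_path))
}
```

In RStudio, the Knit button does the same thing once the header contains output: github_document.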
Due by 12pm Thursday, September 16
Go through two chapters of our class version of R4DS: the short tibbles chapter and the tidy data chapter.
Create two Rmd files of your notes, tibble_notes_YOURNAME.Rmd and tidyr_notes_YOURNAME.Rmd. Include your muddiest point for each chapter.
Knit as github_document files.
Submission: Put your files in the todo5/ directory of the Class-Exercise-Repo. You should have at least 4 files (one Rmd & one md for each chapter), plus any image directory(ies) that you create, if applicable. Commit your changes, push to your GitHub fork, check that the md files look like you expect, and create a pull request for me.
Due by 12pm Tuesday, September 21
You know the drill from here: Create the files as Rmd, knit as github_document files, put your files in Class-Exercise-Repo/todo6/, add/commit/push, and create a pull request for me.
Due by 12pm Thursday, September 23
Learn about the stringr package by going through the R4DS strings chapter. Give yourself time to go through this chapter, as it’s on the longer side. And don’t forget to list your muddiest point(s)!
You know the drill from here.
Due by 12pm Tuesday, September 28
Time for more regex practice, this time with a longer text.
todo8/regex-practice_YOURNAME.Rmd. Change the file name immediately so your own name is in it.
git add .!), commit (with an informative message), push to your upstream fork, and create a pull request for me.
Don’t forget the regex learning resources are at your disposal!
Due by 12pm Thursday, October 7 (postponed from September 30 due to illness)
Let’s pool our questions together for Dr. Lauren Collister, who will be our guest speaker on Thursday. Review the topic of open access and data publishing, focusing in particular on the first two resources (“Data Sharing for Linguists” and the “Copyright and Intellectual Property Toolkit”).
Think of a question or two on the topic, and add yours along with your name to this Google document.
SUBMISSION: The Google document is your submission—don’t forget to add your name. Dr. Collister will take a look at your questions before class, and any questions that are asked by 1pm on Wednesday are guaranteed to be answered!
Due by 12pm Thursday, October 14
The key tool for phonetic analysis of speech data is the free program Praat. In addition to a click-through GUI, Praat has its own scripting language to automate tasks. Let’s learn all about Praat scripting! Do the following:
if & for, and file input/output.
.md document, named praat_reference_YOURNAME.md. (Note: Since this file won’t contain any R code, you can create this .md file directly without first writing a .Rmd file!) Make sure you include at least one muddiest point for Praat scripting that we can discuss in class on Thursday!
Submission: Put your .md file in Class-Exercise-Repo/todo10/ (check spelling/punctuation of the folder name!), add/commit/push, and create a pull request for me.
Due by 12pm Tuesday, October 19
Let’s learn more about how to use LaBB-CAT! I’ve posted some worksheets in a new repository in our class GitHub organization; you don’t have to clone the repo, unless you want to keep the files around for your own reference. Do the following:
labbcat_research_YOURNAME.md: How could using LaBB-CAT to organize linguistic data (whether it’s speech data, written data, etc.) benefit the type of research you’ve done and/or are interested in doing? Or, what sorts of additional functionality would LaBB-CAT need to have in order to benefit the type of research you’ve done and/or are interested in doing? Feel free to be creative here!
labbcat_search_YOURNAME.md: Write a short ‘search problem’ that a classmate could complete after doing the first three worksheets (see examples at the bottom of worksheet 3). Again, be creative here! If the solution requires any layers that are referenced in worksheets 4, 5, or 6, please mention that.
.Rmd file first!
Submission: Put your .md file in Class-Exercise-Repo/todo11/ (check spelling/punctuation of the folder name!), add/commit/push, and create a pull request for me.
Due by 12pm Thursday, October 21
Which are longer: Words that start with stops, or words that start with fricatives? Time to put our LaBB-CAT and R skills together to find out!
Target or Target.
firstPhon with the first phoneme in the word, and look at the distribution of firstPhon. Try to explain how w, I, or c got into the data.
github_document. You may want to play around in an R Notebook before creating your final Rmd (though this analysis is pretty short).
Submission: Put three files (the downloaded csv, your Rmd, and the knitted md file) in a subfolder YOURNAME/ within Class-Exercise-Repo/todo12/. (This is because there might be overlaps in csv file names!) Add/commit/push, and create a pull request for me.
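As a rough illustration of the firstPhon step, here is a base-R sketch on toy data. Everything about the data is an assumption: the column name transcription, the six rows, the one-character-per-phoneme coding, the short stop and fricative lists, and the choice to measure word length in characters are all invented for the example and will not match the real LaBB-CAT export.

```r
# Toy stand-in for the downloaded csv. The column name `transcription` and
# the six rows are invented for illustration; real exports will differ.
words <- data.frame(
  transcription = c("pat", "stop", "tin", "fish", "kit", "sun"),
  stringsAsFactors = FALSE
)

# First phoneme = first character, assuming one character per phoneme.
words$firstPhon <- substr(words$transcription, 1, 1)

# Distribution of first phonemes; unexpected symbols would show up here.
first_phon_counts <- table(words$firstPhon)
print(first_phon_counts)

# Illustrative, incomplete phoneme classes (an assumption for this sketch).
stops <- c("p", "b", "t", "d", "k", "g")
fricatives <- c("f", "v", "s", "z", "h")
words$manner <- ifelse(
  words$firstPhon %in% stops, "stop",
  ifelse(words$firstPhon %in% fricatives, "fricative", NA)
)

# Word length in characters, then mean length per class.
# Rows whose first phoneme is in neither class have manner NA and are dropped.
words$n_phon <- nchar(words$transcription)
mean_length <- tapply(words$n_phon, words$manner, mean)
print(mean_length)
```

The same steps can be written with dplyr and stringr, for example mutate() with str_sub(), if you prefer the tidyverse style used in class.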
Due by 12pm Tuesday, November 2
What has everyone been up to? Let’s take a look – it’s a “visit your classmates” day!
Class-Lounge repo, but you should edit it so that:
guestbooks directory. You should visit two people after you in (wrap-around) alphabetical order (Angela: Joe & Katherine, Shaohua: Yan & Angela, etc.).
Submission: Since Class-Lounge is a fully collaborative repo, there is no formal submission process; simply add your comments to the guestbook.