APLS Documentation

Welcome to the documentation homepage for the Archive of Pittsburgh Language and Speech (aka APLS, pronounced like apples)! APLS is a linguistic data resource, powered by the open-source linguistic corpus software LaBB-CAT, that contains:

  • recordings of interviews conducted with speakers native to Pittsburgh and surrounding neighborhoods,
  • annotated transcripts with information at the phrase, word, and individual speech sound levels, allowing these recordings to be used as structured linguistic data, and
  • metadata on interviewees and transcripts that facilitate large-scale (socio)linguistic analysis.

APLS is (and will always be) free to use. APLS contains 274 sound files totaling over 45 hours of audio from 40 interviewees.

Demo: Measuring F1 and F2 for /aw/ in closed syllables

Some speakers of Pittsburgh English pronounce the /aw/ vowel (the vowel sound in words like out and downtown) more like “ah” (stereotyped as “aht” and “dahntahn”). Pittsburghers recognize this pronunciation as a marker of Pittsburgh identity and associate it with social meanings like working-class status (e.g., Johnstone et al. 2006).

Let’s say we wanted to investigate how different speakers pronounce /aw/ in different situations. A pretty typical data task would be to identify all tokens (individual instances in speech) matching a specific linguistic context (for example, when /aw/ is followed by a consonant in the same syllable) and extract a set of acoustic measurements (for example, F1 and F2 at 3 timepoints). Normally, performing this sort of batch acoustic measurement on a dataset this big would take hours of manual effort, even if you use state-of-the-art speech technologies for automatic speech recognition and segmental alignment.

With APLS, it takes as little as 2 minutes to measure all 5202 tokens of /aw/ in closed syllables in the corpus.

Show me how!

  1. Search for tokens
  2. Export search results to a CSV file
    • We get a search-results file with one token per row, and columns for different annotation layers
  3. Extract acoustic measurements for search results using APLS’s built-in Praat module
    • Our search-results file gets updated with acoustic measurements that we specify (in this case, F1 and F2 at the vowel’s 20%, 50%, and 80% timepoints)
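
To make the "20%, 50%, and 80% timepoints" in step 3 concrete, here is a minimal Python sketch of the arithmetic involved. It is not APLS's or Praat's implementation. The column names (Speaker, Word, VowelStart, VowelEnd) and the sample rows are hypothetical stand-ins for a search-results file, so adjust them to match your actual export.

```python
import csv
import io


def measurement_times(start, end, fractions=(0.2, 0.5, 0.8)):
    """Return absolute times (in seconds) at the given fractions of a vowel interval."""
    if end <= start:
        raise ValueError("vowel end must be later than vowel start")
    duration = end - start
    return [start + fraction * duration for fraction in fractions]


# Hypothetical stand-in for an exported search-results file (one token per row).
# Real exports will have different column names and many more columns.
SAMPLE_CSV = """Speaker,Word,VowelStart,VowelEnd
SPK01,out,12.30,12.50
SPK02,downtown,40.00,40.10
"""


def tokens_with_times(csv_text):
    """Parse CSV text and attach the three measurement times to each token."""
    reader = csv.DictReader(io.StringIO(csv_text))
    tokens = []
    for row in reader:
        times = measurement_times(float(row["VowelStart"]), float(row["VowelEnd"]))
        tokens.append({"speaker": row["Speaker"], "word": row["Word"], "times": times})
    return tokens


if __name__ == "__main__":
    for token in tokens_with_times(SAMPLE_CSV):
        rounded = [round(t, 3) for t in token["times"]]
        print(token["speaker"], token["word"], rounded)
```

Each timepoint is the vowel's start time plus a fraction of its duration, so a vowel spanning 1.0 s to 2.0 s would be measured at roughly 1.2 s, 1.5 s, and 1.8 s. F1 and F2 are then read off at those times.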

Ready to get started with APLS? Click here!