Searching the corpus

APLS provides powerful search capabilities for finding linguistic patterns[1] in the interview data. It all starts from the Search page: https://apls.pitt.edu/labbcat/search. Once you create a search, you can export the results in a variety of formats.

Because the Search page has many options that will interest different kinds of researchers, these documentation pages focus on building an understanding of its capabilities through “Try it!” tutorials. To understand the data structure that the Search page relies on, read the Layers and attributes and Layer typology documentation pages.

If you are looking for information on the Search results page, go to the Exporting data documentation page.

On this page
  1. What you can do on the Search page
    1. Page layout

What you can do on the Search page

The Search page allows you to…

Page layout

The following image shows the Search page when you first load it:

  • The search matrix (highlighted in red above).
    • This is where you specify patterns to search.
    • Made up of one or more pattern inputs.
  • The layer picker (highlighted in blue above).
    • This is where you can add pattern inputs to the search matrix or remove them from it.
  • The search button (highlighted in green above).
    • Click this to begin your search (or press Enter/Return while in a text box).
  • The tabs (highlighted in purple above). This is where you can…

Once you enter a pattern into the search matrix and click Search, a progress bar pops up (highlighted in orange below):

Search matrix and pattern inputs

Feel free to skip this section if you want to jump into your first search!

The search matrix is made up of one or more pattern inputs. When you first load the Search page, you’ll only see a pattern input for the orthography layer (as in the previous images). You can make more pattern inputs appear by selecting layers in the layer picker and/or widening the search matrix to multiple words.

Not all layers’ pattern inputs look alike. To demonstrate all the possible features, here’s what the search matrix looks like when you select overlap, turn, syllables, foll_pause, orthography, and segment in the layer picker:

This search matrix has 6 different pattern inputs, one for each selected layer. Let’s break down what we’re seeing:

  • Most important of all, input boxes (marked Regular expression, Minimum, or Maximum) let you specify a pattern for that layer.
    • Appearance depends on the layer’s data type:
      • Text layers (like overlap) and phonological layers (like syllables) have one input box marked Regular expression.
      • Numeric layers (like foll_pause) have two: Minimum and Maximum.
      • Timing-only layers (like turn) have none.
  • The buttons let you change the width of the search matrix.
  • The matches dropdown lets you search for the opposite of a pattern or missing annotations.
  • The button opens a symbol picker, which helps with inputting patterns for layers that have specific notation systems (like overlap, syllables, and segment).
  • The icon lets you define the “token of interest” when searching across words, within words, or across layers, given that annotations differ by both scope and alignment.
  • The icon lets you anchor a pattern to the beginning or end of another layer’s scope.
  • Black lines at the top of a pattern input indicate that the layer has either span or phrase scope (like overlap and turn).
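
To make the difference between the input box types concrete, here is a minimal Python sketch of how a Regular expression box and a Minimum/Maximum pair each narrow down a set of tokens. This is an illustration only, not how APLS itself is implemented: the Token class, the matches_regex and in_range helpers, and the sample words and pause durations are all made up for this example, and the sketch assumes that a regular expression has to match an annotation's entire label.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """Hypothetical word token with annotations on two layers."""
    orthography: str             # text layer: searched with a regular expression
    foll_pause: Optional[float]  # numeric layer (seconds); None = no annotation


def matches_regex(label: Optional[str], pattern: str) -> bool:
    # Assumption: the pattern must match the whole label, not just part of it.
    return label is not None and re.fullmatch(pattern, label) is not None


def in_range(value: Optional[float],
             minimum: Optional[float] = None,
             maximum: Optional[float] = None) -> bool:
    # Either bound may be left blank (None); missing annotations never match.
    if value is None:
        return False
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


# Invented sample data, not taken from the APLS corpus.
tokens = [
    Token("yinz", 0.35),
    Token("going", None),
    Token("downtown", 0.05),
    Token("walking", 0.80),
]

# Text layer: one input box holding a regular expression.
ing_words = [t.orthography for t in tokens
             if matches_regex(t.orthography, r".*ing")]

# Numeric layer: two input boxes, here with only Minimum filled in.
long_pauses = [t.orthography for t in tokens
               if in_range(t.foll_pause, minimum=0.3)]

print(ing_words)
print(long_pauses)
```

In this sketch, ing_words contains going and walking, while long_pauses contains yinz and walking. A timing-only layer like turn has no label or number to test, which is why its pattern input has no input box at all.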
  1. The term patterns is used throughout these pages because you can match exact words, affixes, individual sound segments, stress, part of speech, and much more!