Searching other layers

You can search more than just the orthography layer on the search page. The first step in doing so is to use the layer picker menu.

On this page
  1. Selecting other layers to search
  2. Layer data types and notation systems
    1. Searching text layers…
      1. using English spelling
      2. using part-of-speech tags
      3. using boolean values (for overlap)
    2. Searching phonological layers
    3. Searching numeric layers

Selecting other layers to search

The layer picker on the search page is similar to the layer picker on transcript pages, with some minor differences.[1]

By default, the layer picker has the orthography layer already selected. To make more layers appear, select the project that the layer belongs to in the Projects column.

The orthography, word, and segment layers are displayed without selecting any projects because these layers are not associated with a project.

Selecting a layer in the layer picker will make a pattern input field for that layer appear.

If you select a layer but don’t enter anything into its pattern input field, that layer isn’t applied to the search.

The screengrab below has the projects that are most useful to users toggled on to show all the layers that are useful for searching APLS.

Layers in the temp project aren’t meaningful for end-users, so these layers are not discussed on this page.

Hovering over the names of the layers will display a tooltip giving a brief description of that layer. For an overview of how layers work in APLS, start with the Layers and attributes documentation section. In-depth descriptions of the different layers in APLS can be found in the field guide.

Layer data types and notation systems

Different layers in APLS contain different kinds of annotations. These differences affect what and how you can search in each layer.

The four data types are:

  • Text
  • Phonological
  • Numeric
  • Timing-only

More information about these data types is provided on the Layer typology documentation page, but for learning how to use the search page it is most useful to simply know that these distinctions exist.

While data type describes the kinds of annotation that different layers contain, notation systems are what those annotations actually look like. All layers have a primary notation system (with the exception of timing-only layers because their annotations don’t have labels).

The primary notation systems by data type are:

  • Text
    • English spelling
    • Treebank part-of-speech tags
    • Boolean (TRUE or FALSE)
  • Phonological
    • DISC phonemic alphabet
    • Stress markers (' [primary stress], " [secondary stress], 0 [unstressed])
  • Numeric
    • Positive whole numbers
    • Decimal numbers

You can read more about notation systems on the Layer typology and Layer notation systems documentation pages.

Pattern input fields for layers that don’t use English spelling or numbers include drop-down selection menus (shown with an icon) that contain valid input for that layer.

This saves you the trouble of needing to memorize things like the DISC alphabet or part-of-speech tags to use all of the search page’s functionalities! Similar to other parts of APLS, you can hover your mouse over the options in these drop-down menus to get a short tooltip description for each symbol.

The options in the drop-down menus are displayed as IPA symbols, but selecting one enters its DISC equivalent into the text field. Because the purpose of DISC is to make phonetic transcription more machine-readable, the only characters that differ from IPA are the ones that aren’t found on QWERTY keyboards. IPA characters found on QWERTY keyboards, like /p/ or /i/, are represented the same way in DISC.

Searching text layers…

using English spelling

As discussed in Your first search, you can search the orthography layer using English spelling. Other layers that can be searched using English spelling are:

  • lemma
  • morphemes
  • lexical
  • word
  • comment
  • noise
  • redaction

The redaction search targets the comments that accompany redactions, not the redacted words themselves. All redactions are annotated as ___ on the word layer, which is what appears on the search results page for redactions. However, the pattern for redaction must match the comments in the redaction layer. Therefore, searching for ___ on the redaction layer won’t return any results, but searching for .*name.* will return all redactions that contain “name” in the comment.
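The difference between the two redaction patterns can be sketched in Python. This is only an illustration: it assumes that a pattern must match an entire annotation label (which is why the example pattern is wrapped in .*), and the comments in redaction_comments are hypothetical, not taken from APLS.

```python
import re

# Hypothetical redaction comments; real comments in APLS may differ.
redaction_comments = ["name of neighbor", "street name", "phone number"]


def search(pattern, labels):
    """Return the labels that the pattern matches in full (assumed behavior)."""
    return [label for label in labels if re.fullmatch(pattern, label)]


# ___ is the label on the word layer, not a redaction comment, so nothing matches.
underscore_hits = search("___", redaction_comments)

# .*name.* matches any comment that contains "name" anywhere.
name_hits = search(".*name.*", redaction_comments)

print(underscore_hits)
print(name_hits)
```

In this sketch the first search comes back empty, while the second returns the two comments that contain “name”.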

using part-of-speech tags

The only syntax layer in APLS is the part_of_speech layer, which has a selection menu for accepted part-of-speech tags.

To find matches for every superlative adjective in APLS:

  1. Go to the search page.
  2. Select the syntax project in the layer picker and click the checkbox for the part_of_speech layer to make the part_of_speech input field appear.
  3. Click the drop-down menu button and select JJS.
  4. Click the Search button.

using boolean values (for overlap)

The overlap layer is part of the timing project, but it contains text data in the form of Boolean values: TRUE for speech that is overlapping and FALSE for speech that isn’t overlapping. The overlap layer has a drop-down selection menu containing these two values.

Searching phonological layers

The phonological layers in APLS, and the contents of their drop-down menus, are:

  • phonemes (IPA symbols that are not on QWERTY keyboards)
  • dictionary_phonemes (IPA symbols that are not on QWERTY keyboards)
  • stress (stress markers)
  • pronounce (IPA symbols that are not on QWERTY keyboards and stress markers)
  • syllables (IPA symbols that are not on QWERTY keyboards and stress markers)

The segment and foll_segment layers are similar to the phonological layers in this section, but they have additional capabilities that are described in Searching segments within words.

To find all matches for syllables with /ʊ/ that have secondary stress:

  1. Go to the search page.
  2. Select the phonology project in the layer picker and click the checkbox for the syllables layer to make the syllables input field appear.
  3. Click the drop-down menu button and select secondary under the STRESS heading.
  4. Enter .* after " in the syllables input field.
  5. Click the drop-down menu button and select ʊ under the VOWEL heading.
  6. Enter .* after ".*U in the syllables input field.
  7. Click the Search button.
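The pattern that these steps build up is ".*U.* (a secondary stress marker, anything, the DISC symbol U, anything). A minimal Python sketch of how such a pattern behaves is shown below. It assumes the pattern must match a whole syllable label, and the syllable labels in the list are hypothetical examples rather than real APLS annotations.

```python
import re

# Build the pattern one step at a time, as in the steps above.
pattern = '"'    # secondary stress marker from the STRESS menu
pattern += ".*"  # typed in: any characters before the vowel
pattern += "U"   # DISC symbol entered by the VOWEL menu
pattern += ".*"  # typed in: any characters after the vowel

# Hypothetical syllable labels: a stress marker followed by DISC symbols.
syllables = ['"bUk', "'bUk", '"kIt', "0fUl"]

# Keep only the labels that the pattern matches in full (assumed behavior).
matches = [s for s in syllables if re.fullmatch(pattern, s)]

print(pattern)
print(matches)
```

Only the first label matches here: the second has primary stress, the third has no U, and the fourth is unstressed.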

Searching numeric layers

Some layers contain numeric data instead of text data. The pattern input fields for these layers therefore let you define the numeric range you want to search for.

The following layers use numeric range input fields:

  • frequency_from_celex
  • frequency_in_corpus
  • speech_rate
  • foll_pause

To find every word that appears in the corpus only once or twice:

  1. Go to the search page.
  2. Select the lexicon project in the layer picker and click the checkbox for the frequency_in_corpus layer to make the frequency_in_corpus input field appear.
  3. Enter 1 in the frequency_in_corpus Minimum input field and enter 2 in the frequency_in_corpus Maximum input field.
  4. Click the Search button.
    • This search will include incomplete words as well. To exclude incomplete words, select doesn’t match from the matches drop-down menu and enter .+~ into the orthography input field.
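The logic of this search, including the optional exclusion of incomplete words, can be sketched in Python. This is an illustration under stated assumptions: the Minimum and Maximum fields are treated as inclusive bounds, the pattern is assumed to match a whole label, and the words and counts in the list are made up for the example.

```python
import re

# Hypothetical (orthography, frequency_in_corpus) pairs.
words = [("yinz", 1), ("gum~", 1), ("bridge", 2), ("stillers", 3), ("the", 5000)]

minimum, maximum = 1, 2  # values entered in the Minimum and Maximum fields

# Numeric range search: keep words whose frequency falls within the bounds.
rare = [word for word, freq in words if minimum <= freq <= maximum]

# "doesn't match" .+~ : drop incomplete words, which end in a tilde.
complete_rare = [word for word in rare if not re.fullmatch(".+~", word)]

print(rare)
print(complete_rare)
```

The first list still contains the incomplete word gum~; the second list removes it.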

[1] The main difference is that transcript pages always have the word, turn, and utterance layers selected, and they cannot be deselected. The search page allows these layers to be toggled on and off.