Searching other layers
You can search more than just the orthography layer on the search page. The first step in doing so is to use the layer picker menu.
Selecting other layers to search
The layer picker on the search page is similar to the layer picker on transcript pages, with some minor differences.[1]
By default, the layer picker has the orthography layer already selected. To make more layers appear, select the project that those layers belong to in the Projects column.
The orthography, word, and segment layers are displayed without selecting any projects because these layers are not associated with a project.
Selecting a layer in the layer picker will make a pattern input field for that layer appear.
If you select a layer but don’t enter anything into its pattern input field, it doesn’t get applied to the search.
The screengrab below has the projects that are most useful to users toggled on to show all the layers that are useful for searching APLS.
Layers in the temp project aren’t meaningful for end-users, so these layers are not discussed on this page.
Hovering over the names of the layers will display a tooltip giving a brief description of that layer. For an overview of how layers work in APLS, start with the Layers and attributes documentation section. In-depth descriptions of the different layers in APLS can be found in the field guide.
Layer data types and notation systems
Different layers in APLS contain different kinds of annotations. These differences affect what and how you can search in each layer.
The four data types are:
- Text
- Phonological
- Numeric
- Timing-only
More information about these data types is provided on the Layer typology documentation page, but for learning how to use the search page it is most useful to simply know that these distinctions exist.
While data type describes the kinds of annotation that different layers contain, notation systems are what those annotations actually look like. All layers have a primary notation system (with the exception of timing-only layers because their annotations don’t have labels).
The primary notation systems by data type are:
- Text
  - English spelling
  - Treebank part-of-speech tags
  - Boolean (TRUE or FALSE)
- Phonological
  - DISC phonemic alphabet
  - Stress markers (' [primary stress], " [secondary stress], 0 [unstressed])
- Numeric
  - Positive whole numbers
  - Decimal numbers
You can read more about notation systems on the Layer typology and Layer notation systems documentation pages.
Pattern input fields for layers that don’t use English spelling or numbers include drop-down selection menus (opened with a drop-down menu button) that contain valid input for that layer.
This saves you the trouble of needing to memorize things like the DISC alphabet or part-of-speech tags to use all of the search page’s functionalities! Similar to other parts of APLS, you can hover your mouse over the options in these drop-down menus to get a short tooltip description for each symbol.
The options in the drop-down menus are displayed as IPA symbols, but selecting one enters its DISC equivalent into the text field. Because the purpose of DISC is to make phonetic transcription more machine-readable, the only characters that differ from IPA are the ones that aren’t found on QWERTY keyboards. IPA characters found on QWERTY keyboards, like /p/ or /i/, are represented the same in DISC.
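As a rough illustration of that correspondence, the sketch below converts a handful of IPA symbols to DISC. The mapping table is an assumption based on the standard CELEX DISC conventions (only the ʊ to U pair is confirmed elsewhere on this page), and the name ipa_to_disc is hypothetical. The drop-down menus in APLS remain the authoritative source for the symbols APLS actually accepts.

```python
# Partial IPA-to-DISC table. Assumption: these pairs follow the CELEX DISC
# conventions; check the APLS drop-down menus before relying on them.
IPA_TO_DISC = {
    "ʊ": "U",
    "ʃ": "S",
    "θ": "T",
    "ð": "D",
    "ŋ": "N",
    "ə": "@",
}


def ipa_to_disc(ipa: str) -> str:
    """Replace each IPA symbol found in the table; keep other characters as-is."""
    return "".join(IPA_TO_DISC.get(ch, ch) for ch in ipa)


print(ipa_to_disc("pʊt"))  # pUt
print(ipa_to_disc("pit"))  # pit (QWERTY characters are unchanged)
```

Characters that are already on a QWERTY keyboard pass through untouched, which mirrors the behavior described above.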
Searching text layers…
using English spelling
As discussed in Your first search, you can search the orthography layer using English spelling. Other layers that can be searched using English spelling are:
- lemma
- morphemes
- lexical
- word
- comment
- noise
- redaction
The redaction search targets the comments that are made for redactions on the word layer. All redactions are annotated as ___ on the word layer, which is what appears on the search results page for redactions. However, the pattern for redaction must match the comments in the redaction layer. Therefore, searching for ___ on the redaction layer won’t display any results, but searching for .*name.* will show all redactions that contain “name” in the comment.
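To see why ___ finds nothing while .*name.* does, the following sketch imitates the match with Python’s re module. It assumes that a pattern has to match an entire annotation label, which is what the need for the surrounding .* suggests. The comment strings and the search_layer helper are invented for illustration and are not real APLS data or APLS code, and the regular-expression engine behind APLS may differ in its details.

```python
import re

# Hypothetical redaction comments (invented for illustration).
redaction_comments = ["name of neighbor", "street name", "phone number"]


def search_layer(pattern: str, labels: list[str]) -> list[str]:
    """Return the labels that the pattern matches in full."""
    return [label for label in labels if re.fullmatch(pattern, label)]


print(search_layer("___", redaction_comments))       # []
print(search_layer(".*name.*", redaction_comments))  # ['name of neighbor', 'street name']
```

The first search is empty because no comment consists of three underscores; the second succeeds because .* on either side lets “name” appear anywhere in the comment.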
using part-of-speech tags
The only syntax layer in APLS is the part_of_speech layer, which has a selection menu for accepted part-of-speech tags.
To find matches for every superlative adjective in APLS:
- Go to the search page.
- Select the syntax project in the layer picker and click the checkbox for the part_of_speech layer to make the part_of_speech input field appear.
- Click the drop-down menu button and select JJS.
- Click the Search button.
using boolean values (for overlap)
The overlap layer is part of the timing project, but it contains text data in the form of Boolean values: TRUE for speech that is overlapping and FALSE for speech that isn’t overlapping. The overlap layer has a drop-down selection menu for these two values.
Searching phonological layers
The phonology layers in APLS (and what is included in their drop-down menus) are:
- phonemes (IPA symbols that are not on QWERTY keyboards)
- dictionary_phonemes (IPA symbols that are not on QWERTY keyboards)
- stress (stress markers)
- pronounce (IPA symbols that are not on QWERTY keyboards and stress markers)
- syllables (IPA symbols that are not on QWERTY keyboards and stress markers)
The segment and foll_segment layers are similar to the phonological layers in this section, but they have additional capabilities that are described in Searching segments within words.
To find all matches for syllables with /ʊ/ that have secondary stress:
- Go to the search page.
- Select the phonology project in the layer picker and click the checkbox for the syllables layer to make the syllables input field appear.
- Click the drop-down menu button and select secondary under the STRESS heading.
- Enter .* after " in the syllables input field.
- Click the drop-down menu button and select ʊ under the VOWEL heading.
- Enter .* after ".*U in the syllables input field.
- Click the Search button.
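The steps above build the pattern ".*U.* in the syllables input field. As a minimal sketch of what that pattern selects, the example below applies it to a few made-up syllable labels. It assumes that the stress marker comes first in a syllable label and that the pattern has to match the whole label; the labels are hypothetical and are not taken from APLS.

```python
import re

# Pattern built by the steps above: secondary stress ("), anything,
# the DISC symbol U, then anything.
pattern = '".*U.*'

# Hypothetical syllable labels in DISC with a leading stress marker.
syllables = ['"gUd', "'gUd", '"k{t', "0gUd"]

matches = [s for s in syllables if re.fullmatch(pattern, s)]
print(matches)  # ['"gUd']
```

Only the first label is kept: the second and fourth contain U but carry a different stress marker, and the third has secondary stress but no U.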
Searching numeric layers
Some layers contain numeric data, instead of text data. As a result, the pattern input fields for these layers allow you to define a numerical range you would like to search for.
The following layers use numeric range input fields:
- frequency_from_celex
- frequency_in_corpus
- speech_rate
- foll_pause
To find every word that appears in the corpus only once or twice:
- Go to the search page.
- Select the lexicon project in the layer picker and click the checkbox for the frequency_in_corpus layer to make the frequency_in_corpus input field appear.
- Enter 1 in the frequency_in_corpus Minimum input field and enter 2 in the frequency_in_corpus Maximum input field.
- Click the Search button.
  - This search will include incomplete words as well. To exclude incomplete words, select doesn’t match from the matches drop-down menu and enter .+~ into the orthography input field.
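The logic of this search, including the optional exclusion of incomplete words, can be sketched as a filter over word records. This is an illustration only: the records are invented, the range is assumed to include both the minimum and the maximum, and the .+~ pattern is assumed to have to match the whole orthography label. The function name rare_complete_words is hypothetical.

```python
import re

# Hypothetical (orthography, frequency_in_corpus) records; not real APLS data.
words = [("yinz", 2), ("th~", 1), ("the", 500), ("nebby", 1)]


def rare_complete_words(records, minimum=1, maximum=2):
    """Keep words inside the frequency range whose spelling doesn't match .+~"""
    return [
        spelling
        for spelling, frequency in records
        if minimum <= frequency <= maximum
        and not re.fullmatch(r".+~", spelling)
    ]


print(rare_complete_words(words))  # ['yinz', 'nebby']
```

Here "th~" falls inside the frequency range but is dropped by the doesn’t match condition, while "the" is dropped by the range itself.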
[1] The main difference is that transcript pages always have the word, turn, and utterance layers selected, and they cannot be deselected. The search page allows these layers to be toggled on and off.