Layers: Layer typology

The layers in APLS can be categorized along some important properties, in terms of how they appear, how they’re generated, how they can be searched, and other properties. This page outlines those categories.

On this page

Scope
1. What you’ll see in APLS
2. Layers by scope
Alignment and horizontal peers
1. What you’ll see in APLS
2. Layers by alignment
Vertical peers
1. What you’ll see in APLS
2. Layers by vertical peers
Data type
1. What you’ll see in APLS
2. Layers by data type
Notation system
1. What you’ll see in APLS
2. Layers by notation system
Segment dependency
1. What you’ll see in APLS
2. Layers by segment dependency
Project
1. What you’ll see in APLS
2. Layers by project

Scope

Not all annotations in a transcript are equally “wide” (in duration). Some layers contain annotations that pertain to individual words (e.g., part_of_speech), others to individual speech sounds (e.g., segment), and others to multiple words (e.g., redaction). This distinction is captured by layer scope.

Layers can have one of four possible scopes, defined by how long (in time) they can span. From longest to shortest, these are:

Scope	Meaning	Notes
Span	Annotations can span beyond an individual turn of talk, up to the length of the entire transcript
Phrase	Annotations can span beyond an individual word, up to the length of an entire turn
Word	Annotations usually span the length of a word	Some word layers (like syllables) have annotations that span just part of a word. These annotations are called horizontal peers, discussed below
Segment	Annotations span the length of a speech sound

What you’ll see in APLS

Individual Transcript pages
- In the layer selector, layers are organized into columns by scope: [IMAGE]
- You can usually tell what each layer’s scope is by just glancing at the transcript. For example:
  - The noise annotations aren’t bounded by a single line, so noise is clearly a span layer. (Note that transcripts always display the noise and comment layers.)
  - Both overlap and speech_rate have annotations that span multiple words, so they are clearly phrase layers. You can tell where these annotations start and end because of the colored arcs above each line.
  - Each foll_pause annotation is lined up above each word annotation, so foll_pause is clearly a word layer
  - Segment layers like segment aren’t quite as obvious as the other scopes, but you can tell them apart because they’re plain text rather than links, and they’re always underneath word
  - Finally, note that layers are always in scope order from longest to shortest
Search
- In the layered search matrix, layers are organized into columns by scope: Only a few layers appear when you first load the search page, but you can select projects to show more (see below).
Exporting data
- Layers are organized into columns by scope
- Depending on how you’re exporting data, there might be additional columns. For example, when exporting a formatted transcript, you can also export participant and transcript attributes:

Layers by scope

Span	Phrase	Word	Segment
comment noise	overlap redaction speech_rate turn utterance	foll_pause frequency_from_celex frequency_from_subtlex frequency_in_corpus hesitation lemma lexical morphemes orthography part_of_speech phonemes prec_pause pronounce stress syllables word word_index	coronal_stop_deletion foll_segment prec_segment segment

Alignment and horizontal peers

Some annotations take up just part of their scope, and some annotations take up their entire scope. For example, syllables annotations can take up just part of their corresponding word (if the word is multisyllabic); by contrast, each word has just one frequency_in_corpus annotation. When annotations take up just part of their scope, we say they have horizontal peers: “peers” because they share a scope with other annotations, “horizontal” because we often conceptualize time on an x-axis (i.e., from left to right). This distinction is captured by layer alignment.

Layers can have one of three possible alignments:

Alignment	Meaning	Notes
Complete interval	Annotations always span their entire scope
Sub-interval	Annotations can span part of their scope, or their entire scope Annotations that share their scope with other annotations are called horizontal peers	There may or may not be gaps between annotations within a scope. For example, there are usually long gaps between comment annotations within a transcript
Timepoint	Annotations don’t have a start and end time, just a time	There aren’t currently any timepoint layers in APLS. If we wanted to store vowel measurements in a layer, then we might do so in a timepoint layer (i.e., at the vowel’s midpoint)

What you’ll see in APLS

Individual Transcript pages
- In the layer selector, the icons denote alignments
- When multiple annotations share a word, they crowd into the space above the word (with spaces separating labels). For example:
  - Most words in this line have one part_of_speech annotation, but don’t and it’s each have two (n’t is an adverb, and ‘s is a present-tense 3rd-person-singular verb)
  - Most words in this line are monosyllabic and just have one syllables annotation, but Family has three syllables annotations and Dollar has two.
Search page
- In the layer selector, the icons denote alignments
- [Like with cross-scope search, you can do anchoring] [But you can’t do e.g. multiple syllables within a word]
Exporting data

Layers by alignment

Since alignment is relative to the layer’s scope, these are broken down by scope too:

Scope	Alignment
Scope	Complete interval	Sub-interval
Span		comment noise
Phrase	turn	overlap redaction speech_rate utterance
Word	foll_pause frequency_from_celex frequency_from_subtlex frequency_in_corpus hesitation lemma lexical morphemes orthography phonemes prec_pause pronounce word word_index	part_of_speech stress syllables
Segment	coronal_stop_deletion foll_segment prec_segment segment

There aren’t currently any timepoint layers in APLS.

Vertical peers

Some layers allow vertical peers: 2 or more annotations that occupy an identical timespan. For example, the dictionary_phonemes layer represents all possible phonemic representations of a word, while the phonemes layer represents the speech sounds actually in a word. Since the word the can be pronounced /ði/ or /ðə/, the has two annotations on the dictionary_phonemes layer—but since only one of these is how the word was actually pronounced, the only has one phonemes annotation.

Unlike horizontal peers, which divide the timespan of their scope, we can think of vertical peers as being “stacked” on top of one another within the same timespan.

Allow vertical peers?	Icon	Meaning
True		A single timespan may contain multiple annotations
False	(none)	A single timespan can contain at most one annotation

What you’ll see in APLS

Individual Transcript pages
- In the layer selector, the icon denotes layers that allow vertical peers
- Only one vertical peer is visible on the Transcript page, whichever is first in alphabetical order
  - Example: The dictionary_phonemes layer only shows ðə for the word the, even though the has two dictionary_phonemes annotations:
  - To check whether an annotation has vertical peers, export the line to TextGrid instead
Transcripts or fragments exported to Praat TextGrids
- Normally, TextGrid transcripts/fragments show one tier per layer per speaker. However, if a layer allows vertical peers, then that layer can have multiple tiers per speaker.
  - Example: The word times has two lemma annotations, time and times,¹ so this TextGrid shows two lemma tiers for CB17:
- If a layer allows multiple peers, but the exported transcript/fragment doesn’t actually have vertical peers on that layer, then that layer will only have one tier.
  - Example: None of the words in this fragment have multiple lemma annotations, so this TextGrid shows just one lemma tier for CB17:
- An exported TextGrid will only show as many tiers per layer as there are vertical peers on that layer in the exported transcript/fragment. As a result, longer transcripts/fragments tend to have more tiers.
  - Example: When the entire transcript for CB17interview3.eaf is exported to TextGrid, there are three lemma tiers for CB17 for the whole TextGrid, even though there is only one word in the transcript with three lemma annotations

Search page
- In the layer selector, the icon denotes layers that allow vertical peers
- Searching a layer that allows vertical peers will match any of the vertical peers for a given word
  - Example: All tokens of the have two dictionary_phonemes annotations: D@ and Di. Thus, a search for the on orthography and D@ on dictionary_phonemes returns the same set of matches as the on orthography and Di on dictionary_phonemes
  - As a result, “doesn’t match” searches on a layer that allows vertical peers can return false positives
    - Example: Searching doesn’t match D@ on dictionary_phonemes will still return all the tokens, because the Di vertical peer doesn’t match D@
Exporting data

Layers by vertical peers

True	False
comment frequency_from_celex lemma morphemes noise	coronal_stop_deletion foll_pause foll_segment frequency_from_subtlex frequency_in_corpus hesitation lexical orthography overlap part_of_speech phonemes prec_pause prec_segment pronounce redaction segment speech_rate stress syllables turn utterance word word_index

Data type

Different layers contain different kinds of annotations. For example, foll_segment annotations are speech sounds, while foll_pause annotations are numbers. This distinction is captured by a layer’s data type:

Data type	Meaning	Notes
Phonological layers	Annotations are labeled with speech sounds	Some phonological layers’ annotations are individual sounds (like segment); others’ are sequences of sounds (like syllables).
Numeric layers	Annotations are labeled with a measurement	Some numeric layers’ annotations are counts (like frequency_in_corpus); others’ are decimal numbers (like foll_pause).
Text layers	Annotations are labeled with text	Most text layers’ annotations are normal English spelling (like word) or lowercase English spelling (like orthography)
Timing-only layers	Annotations don’t have a label, only a start/end time	The two timing-only layers represent temporal subdivisions of a transcript: turn and utterance

What you’ll see in APLS

This distinction is mostly important for search.

Individual Transcript pages
- In the layer selector, alignment is denoted by the icons in the previous table
Search page
- In the layer selector, alignment is denoted by the icons in the previous table

Layers by data type

Phonological	Numeric	Text	Timing-only
foll_segment phonemes prec_segment pronounce segment stress syllables	foll_pause frequency_from_celex frequency_from_subtlex frequency_in_corpus prec_pause speech_rate word_index	comment coronal_stop_deletion hesitation lemma lexical morphemes noise orthography overlap part_of_speech redaction word	turn utterance

Notation system

While data type describes the kinds of annotation that different layers contain, notation systems are what those annotations actually look like. In other words, notation systems are the details of how layers represent their data. For example, the phonemes layer represents speech sounds as symbols in the DISC phonemic alphabet, such as fIS for the word fish.

Almost all layers have a primary notation system. (The exceptions are the timing-only layers, since their annotations don’t have labels—see above.) Some layers have additional notations, depending on what their annotations need to represent. For example, the syllables layer uses the DISC phonemic alphabet for speech sounds plus stress markers for stress, such as 'fIS for the word fish. Some additional notations are only used for a single layer, such as the morpheme boundary (+) for morphemes.

Here are brief descriptions of primary notation systems, with links to more details on the notation systems page if applicable:

Data type	Primary notation	Description
Phonological	DISC	DISC phonemic alphabet
Phonological	Stress markers	`'` (primary stress) `"` (secondary stress) `0` (unstressed)
Numeric	Count	Positive whole numbers
Numeric	Decimal	Decimal numbers
Text	Boolean	`True` or `False`
Text	English spelling
Text	English spelling (lowercase)
Text	Treebank part-of-speech tags
Timing-only	(none)	By definition, timing-only layers don’t have labels

Here are additional notations:

Data type	Additional notation	Brief description
Phonological	DISC pause	`.`
Phonological	Stress markers	`'` (primary stress) `"` (secondary stress) `0` (unstressed)
Phonological	Syllable boundary	`-`
Text	Hesitation marker	`~` (at the end of an incomplete word)
Text	Transcription pause/question markers	`.` (short pause) `-` (long pause) `?` (question)
Text	Redaction marker	`___`
Text	Morpheme boundary	`+`

What you’ll see in APLS

Search
- [More about how users will interact w/ notation systems than what they’ll see]

Layers by notation system

Primary notation system:

DISC	Stress markers	Count	Decimal	Boolean	English spelling	English spelling (lowercase)	Treebank part-of-speech tags	(none)
foll_segment phonemes prec_segment pronounce segment syllables	stress	frequency_from_celex frequency_in_corpus word_index	foll_pause frequency_from_subtlex prec_pause speech_rate	hesitation overlap	comment lexical noise redaction word	lemma morphemes orthography	part_of_speech	turn utterance

Additional notation:

DISC pause	Stress markers	Syllable boundary	Hesitation marker	Transcription pause/question markers	Redaction marker	Morpheme boundary
foll_segment prec_segment	pronounce syllables	pronounce	lexical orthography word	word	word	morphemes

Segment dependency

Some layers need information about individual speech sounds: which sounds are in a word, and where these sounds begin and end. For example, a word’s phonemes annotation is only possible if the word has segment annotations. Other layers (like part_of_speech) don’t depend on segment annotations. This distinction is captured by segment dependency:

Segment-dependent?	Meaning
True	Annotations are only present if the line contains segment annotations
False	Annotations can be present even if the line is missing segment annotations

A layer can be segment-dependent even if the segment layer isn’t an input to the layer. For example, speech_rate (a phrase layer) takes syllables as input, which takes segment as input; if segment has no annotations in a turn, then syllables won’t have any annotations, which means speech_rate won’t have any annotations.

What you’ll see in APLS

Individual Transcript pages
- [Whole turns w/ big chunks of no annotations]
Search
- [False negatives in search because turn isn’t aligned]

Layers by segment dependency

True	False
foll_pause foll_segment phonemes prec_pause prec_segment segment speech_rate stress syllables	comment coronal_stop_deletion frequency_from_celex frequency_from_subtlex frequency_in_corpus hesitation lemma lexical morphemes noise orthography overlap part_of_speech pronounce redaction turn utterance word word_index

Project

Different layers pertain to different levels of linguistic representation. For example, morphemes annotations pertain to how words are represented in the lexicon, while speech_rate annotations pertain to the timing of a particular line of speech. As a result, APLS categorizes layers into projects:

Project	Meaning	Notes
syntax	Annotations pertain to syntactic structure
lexicon	Annotations pertain to how words are represented in the lexicon
timing	Annotations pertain to timing/rhythm of speech and speaker turns
phonology	Annotations pertain to the speech sounds in a word	All of these layers are phonological layers, see data type above
imported	Annotations were imported from the original transcription
temp	Annotations are either metadata on how other layers were generated, or their only job is to provide data for other layers
(none)	Layers whose check-boxes are always shown by default on the search and transcript pages

What you’ll see in APLS

The main reason projects exist is to reduce visual clutter on the search and transcript pages. When the page loads, only a few layers are selectable (those without a project): [IMAGE] Click a layer’s project to make the layer selectable: [IMAGE]

Unlike other layer properties, projects don’t affect anything about how layers are represented, how you need to search for them, export them, etc.

Layers by project

(none)	syntax	lexicon	timing	phonology	imported	temp
orthography segment word	part_of_speech	frequency_from_celex frequency_from_subtlex frequency_in_corpus hesitation lemma morphemes word_index	foll_pause overlap prec_pause speech_rate turn utterance	foll_segment phonemes prec_segment stress syllables	comment lexical noise pronounce redaction	dictionary_phonemes htk orthography_no_clitic phonemes_no_clitic

Layers in the “temp” project aren’t meaningful for end-users, so the other “layers by property” tables on this page omit these layers.

The lemma layer represents all possible base forms of the word. Depending on context, the word times could be the plural of the base form time, or times could be the base form itself (as in “two times four”), hence the multiple annotations. ↩