journal article
Open Access Collection
Introducing the Archive of Pittsburgh Language and Speech, a Publicly Accessible, Richly Annotated Corpus of Sociolinguistic Interviews
Villarreal, Dan; Rechsteiner, Jack; Johnstone, Barbara; Kiesling, Scott
doi: 10.1111/lnc3.70038
pmid: N/A
The troves of speech data that have driven an increasing orientation towards large‐scale methods in linguistics have been, for the most part, available only to closed teams of researchers and their collaborators. The Archive of Pittsburgh Language and Speech (APLS, https://apls.pitt.edu) is a new open data resource, consisting of nearly 46 h of audio from sociolinguistic interviews with 40 speakers of Pittsburgh English. Powered by the corpus management software LaBB‐CAT, APLS interviews are richly annotated with multiple layers of linguistic information at the phrase, word, and segment levels. Thanks to APLS's graphical user interface, users can access powerful tools for searching the corpus and extracting acoustic measurements with relatively few technical barriers to entry. We describe how APLS fits into the current landscape of (socio)linguistic open data, exemplify APLS's capabilities via a case study of 7137 /aʊ/ tokens, and contextualise the data in terms of both how fieldwork was carried out and the Pittsburgh of the early 2000s.