Applied linguistics research in computer-assisted language learning (CALL) has given much attention to videogames and to assessing their efficacy as tools for second language (L2) learning and development. This efficacy is synthesized in Dixon et al.’s (2022) meta-analysis, which reported that videogames tended to have a medium-sized positive effect (d = 0.65) on L2 development compared to traditional instruction. Although these findings are encouraging, Reinhardt (2021) reminds us that not all games “are created equal” in terms of their affordances for L2 learning (p. 69). That is, the linguistic richness of videogames varies widely, from games like Tetris that present virtually no language to the linguistically rich virtual worlds of games like Baldur’s Gate 3, which engage players with hundreds of hours of recorded speech and thousands of texts containing millions of words. With millions of words in a single game, L2 educators taking a CALL-informed pedagogic approach may struggle to identify and assess an appropriate title for their specific population of learners. In making this assessment, corpus linguistics tools and methods can yield valuable insight into the linguistic environments of videogames.
In this talk, with a focus on corpus design and representativeness, Dr. Dixon discusses the compilation of the Single Player Offline Game Corpus (SPOC; see Dixon, 2024). SPOC is a 3.7-million-word corpus compiled through a process that began by extracting the language files from the directories of four popular single-player role-playing games: Divinity: Original Sin II, Fallout 4, The Elder Scrolls V: Skyrim, and The Witcher 3: Wild Hunt. Following Biber and Conrad’s (2019) Register Analysis Framework as well as methods outlined by Egbert, Biber, and Gray (2022), the language files were parsed into meaningful units of observation, and each unit was placed into one of seven registers identified through a situational analysis. The talk will present findings from empirical research investigating the linguistic similarity of SPOC to real-world registers, along with implications for CALL classrooms and directions for future research.
Presenter:
Dr. Daniel Dixon (Ph.D.) is an Assistant Professor in the Department of Applied Linguistics and ESL at Georgia State University (GSU). Broadly, his research focuses on language and technology and on computer-assisted language learning (CALL), areas in which he has investigated the characteristics of language use in technologies like Generative AI and videogames. Drawing on natural language processing as well as corpus linguistics tools and methods, he has compared linguistic variation across a number of digital and real-world domains while also exploring the effectiveness of targeted technologies for promoting second language development. He uses the programming language Python extensively in his work and has developed a number of digital tools for applied linguistics teaching and research. At GSU, he teaches Python programming for linguistic analysis to graduate (M.A., Ph.D.) and undergraduate students. He also teaches graduate courses on research methods, language and technology, and corpus linguistics, among other topics in applied linguistics.
Personal Website: https://sites.google.com/view/danielhdixon/home
Google Scholar: https://scholar.google.com/citations?user=sMLKjksAAAAJ&hl=en
Facebook: https://www.facebook.com/ALESLatGSU
Bluesky: https://bsky.app/profile/danielhdixon.bsky.social
Date & Time:
Friday, November 14, 2025, 12:00-1:00 pm PT (3:00-4:00 pm ET)
Hosted by CIRT-IG:
Elizaveta (Ellie) Kuznetsova & Margi Wald
#catesolcirt