Corpus Linguistics Workstation

Available software

The DH Lab is proud to introduce a new corpus linguistics research station. Corpus linguistics is a form of research that uses specialized software to investigate large collections of texts. Researchers look for words, phrases, and linguistic patterns that would be difficult or impossible to detect by reading alone. The new workstation was made possible by a GGC Seed Grant and consists of a single large-screen Dell desktop computer outfitted with specialized corpus research software.

The GGC Seed Grant that funded the station was awarded to Dr. Dan Vollaro, who will use it for a corpus project on Henry David Thoreau. The station is also available to other SLA professors and students.

The station features three software packages that support corpus linguistics projects. The first, WordSmith Tools, is used primarily by linguists and offers a collection of modules for searching for patterns in a language. The second, AntConc, works with language corpora through a graphical user interface and reports details about the text in one or more text files. The third, Sketch Engine, is a corpus manager and text analysis tool used by linguists, lexicographers, lexicologists, and other researchers to learn how language works. Sketch Engine was designed to let people who study language behavior search large text collections with complex, linguistically motivated queries.
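For readers new to the field, the kind of pattern search these packages perform can be illustrated with a small keyword-in-context (KWIC) listing. The Python sketch below is only an illustration and is not how WordSmith Tools, AntConc, or Sketch Engine work internally. The function name `kwic`, the `width` parameter, and the sample sentence are all made up for this example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword.

    Hypothetical helper for illustration only; width is the number of
    characters of context shown on each side of the match.
    """
    # \b restricts matches to whole words; re.escape treats the keyword literally.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Invented sample text, not a quotation from any corpus.
sample = "The pond was calm. We walked around the pond at dawn."

for line in kwic(sample, "pond"):
    print(line)
```

Each printed line shows one occurrence of the search word with the text around it, which is the basic view a concordance tool offers. The dedicated packages on the station add much more, such as frequency lists and searches across many files at once.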