ICE Scotland

In this project, we are compiling a 1-million-word corpus of spoken and written 21st-century Scottish English. The corpus contains the text categories and annotations specified by the ICE project, plus a large number of additional linguistic annotations such as part-of-speech tags and phonetic transcriptions. The corpus is available for download in XML format. The sound files are also available for download.

Corpus annotation is carried out with Pacx (Platform for Annotated Corpora). The corpus creation process is agile: it is query-driven, based on a cyclic processing model, and follows the minimal-effort principle (see Voormann & Gut 2008).

Project members:

  • Robert Fuchs
  • Ulrike Gut
  • Elvira Hadzic
  • Ole Schützler
  • Jennifer Smith
  • Laura Sollgan
  • Silke Stagg
  • Holger Voormann
  • Sarah-Loana Weiß
  • Daniel Zerner