24,316,232 Words | 354 Texts | 36 Authors
Below you will find information on how to obtain the integrated corpora, along with specialised corpora that can be downloaded directly. Our recent research has required three such corpora: a corpus of DH Lawrence's novels with a comparison corpus, a corpus of Modernist prose containing the word "colourless/colorless", and a corpus for a study of the word "creature" in Modernist prose.
As always, feel free to email us with questions.
Our aim is to provide researchers with an open, accessible corpus of Modernist literature. To that end, we offer a main Corpus of Modernist Literature, composed of three main sub-corpora:
To access these larger files, please email us at modmac2@outlook.com.
We also offer smaller sub-corpora of Prose Fiction, Shorter Prose, and Poetry, organised by author, as well as Specialist Corpora directly accessible on this website. These smaller sub-corpora have been tagged for Part of Speech (POS) and Semantic Domains (SEM).
Because more texts enter the public domain each year, we aim to update the website at least every two years.
The initial investigation for this project involved 368 prose texts from 31 authors in The Modernist Literature Project. Because our research focused on the single lexeme "colourless", we selected the four authors with the highest frequency of the lexeme: Henry James, Edith Wharton, Joseph Conrad, and Rudyard Kipling. Our analysis aims to show how "colourless" is employed in Modernist prose and is not based purely on quantitative analysis. A zipped file containing the 85 examined texts is provided below, along with the paper.
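For readers who want to run a similar selection on their own texts, the following is a minimal sketch of ranking authors by how often a word occurs. It is not the procedure used in the study: the function name `rank_authors`, the input layout (a dictionary mapping each author to a list of text strings), and the choice to count raw occurrences of the spellings "colourless" and "colorless" are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumption: only the two spelling variants are counted, case-insensitively.
PATTERN = re.compile(r"\bcolou?rless\b", re.IGNORECASE)

def rank_authors(texts_by_author, top_n=4):
    """Return the top_n (author, raw count) pairs, highest count first.

    texts_by_author is a hypothetical layout: {author: [text, text, ...]}.
    """
    counts = Counter()
    for author, texts in texts_by_author.items():
        counts[author] = sum(len(PATTERN.findall(text)) for text in texts)
    return counts.most_common(top_n)

# Invented sample data, for demonstration only.
sample = {
    "Author A": ["A colourless sky.", "Colorless and colourless."],
    "Author B": ["No match here."],
    "Author C": ["colourless"],
}
print(rank_authors(sample, top_n=2))
```

Raw counts favour authors with more or longer texts, so a count normalised per million words may be preferable when the sub-corpora differ greatly in size.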
Below you will find the ten major novels of DH Lawrence that are contained in the DHL corpus and the 85 novels that comprise the Literary Reference corpus. A list of the 95 texts and authors is available below and contains type, token, and type-token ratio (TTR) information. Also available for download are the Wmatrix part-of-speech and semantic domain tagsets. Please cite as:
McClure, S. (2021) Oppositional Language as Thematic Signals in the Novels of DH Lawrence: A Corpus-Based Examination.
PhD Dissertation, University of Liverpool.
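The type, token, and TTR figures in the text list can be understood through a short sketch. Tokens are the running words of a text, types are the distinct word forms, and TTR is types divided by tokens. The tokenisation rule below (lowercased alphabetic strings with an optional internal apostrophe) and the function name `type_token_stats` are assumptions for illustration; the figures in the list may have been produced with different tokenisation, so results need not match exactly.

```python
import re

def type_token_stats(text):
    """Return (types, tokens, ttr) for a text.

    Assumption: a token is a lowercased run of letters, optionally
    containing one apostrophe (e.g. "don't").
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    types = set(tokens)
    ttr = len(types) / len(tokens) if tokens else 0.0
    return len(types), len(tokens), ttr

# Invented sample sentence: 9 tokens, 6 distinct types.
n_types, n_tokens, ttr = type_token_stats("The sea was grey and the sky was grey.")
print(n_types, n_tokens, round(ttr, 3))
```

Because TTR falls as texts get longer, it is most meaningful when comparing texts of similar length.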
Below is a file containing 61 texts that comprise the Creature Corpus. These are texts from the Paris Group Modernist study being undertaken by Professor Liudmyla Hryzhak and Suzanne McClure (publication forthcoming).