24,316,232 Words 354 Texts 36 Authors

The Modernist Literature Project

Integrated & Specialised Corpora

  

Below you will find information on how to obtain the integrated corpora, along with specialised corpora that you can download directly. Our recent research has required three specialised corpora: a corpus of DH Lawrence's novels with a comparison corpus, a corpus of Modernist prose containing the word "colourless/colorless", and a corpus for a study of the word "creature" in Modernist prose.


As always, feel free to email us with questions. 

Contact Kimberley and Suzanne ->


Integrated Corpora

Our aim is to provide researchers with an open, accessible corpus of Modernist literature. To that end, we offer a main Corpus of Modernist Literature, composed of three sub-corpora:

  • Prose Fiction
  • Shorter Prose
  • Poetry


To access these larger files, please email us at modmac2@outlook.com.

 

We also offer smaller sub-corpora of Prose Fiction, Shorter Prose, and Poetry organised by author, as well as Specialist Corpora, all directly accessible on this website. These smaller sub-corpora have been tagged for part of speech (POS) and semantic domain (SEM).
 

Because more texts enter the public domain each year, we aim to update the website at least every two years.

Colourless Corpus

The initial investigation for this project involved 368 prose texts from 31 authors in The Modernist Literature Project. As our research focused on the single lexeme "colourless", we selected the four authors with the highest frequency of the lexeme: Henry James, Edith Wharton, Joseph Conrad, and Rudyard Kipling. Our analysis aims to show how "colourless" is employed in Modernist prose; it is not based purely on quantitative analysis. A zipped file containing the 85 examined texts is provided below, along with the paper.

  • Colourless Corpus (zip)
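
For readers who want to repeat the author-selection step on their own copy of the texts, the following is a minimal sketch only, not the procedure used in the study. It assumes that frequency is normalised per million word tokens, that both spellings ("colourless" and "colorless") count as the same lexeme, and that texts are already loaded as strings grouped by author. The function names `relative_frequency` and `top_authors` are hypothetical.

```python
import re

# Matches both spellings as whole words: "colourless" and "colorless".
LEXEME = re.compile(r"\bcolou?rless\b", re.IGNORECASE)

# Assumption: a word token is a run of letters and apostrophes.
WORD = re.compile(r"[A-Za-z']+")


def relative_frequency(text: str) -> float:
    """Occurrences of the lexeme per million word tokens."""
    tokens = WORD.findall(text)
    if not tokens:
        return 0.0
    return len(LEXEME.findall(text)) / len(tokens) * 1_000_000


def top_authors(texts_by_author: dict[str, list[str]], n: int = 4) -> list[str]:
    """Rank authors by lexeme frequency across all of their texts."""
    scores = {
        author: relative_frequency(" ".join(texts))
        for author, texts in texts_by_author.items()
    }
    # Highest relative frequency first; keep the top n authors.
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Calling `top_authors(texts_by_author)` with the default `n=4` returns the four author names with the highest normalised frequency. Ranking by raw counts instead would favour authors with more text in the corpus, so the choice of normalisation matters.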

DH Lawrence and Literary Reference Corpora

Below you will find the ten major novels of DH Lawrence that make up the DHL corpus and the 85 novels that make up the Literary Reference corpus. A list of the 95 texts and their authors is available below and includes type, token, and type-token ratio (TTR) information. The Wmatrix part-of-speech and semantic domain tagsets are also available for download. Please reference as:

McClure, S. (2021) Oppositional Language as Thematic Signals in the Novels of DH Lawrence: A Corpus-Based Examination. PhD Dissertation, University of Liverpool.

  • DHL and Reference Corpora Listing (pdf)
  • DHL Original Texts (zip)
  • Reference Corpus Original Texts (zip)
  • DHL Corpus Wmatrix tags (zip)
  • Reference Corpus A - C Wmatrix tags (zip)
  • Reference Corpus D Wmatrix tags (zip)
  • Reference Corpus J - O Wmatrix tags (zip)
  • Reference Corpus W Wmatrix tags (zip)
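
The type, token and TTR figures in the corpora listing can be approximated for any single text along the following lines. This is a minimal sketch under stated assumptions: tokens are lowercased alphabetic words (with an optional internal apostrophe), which may differ from the tokenisation behind the published figures, and `type_token_stats` is a hypothetical helper name.

```python
import re

# Assumption: a token is a lowercase alphabetic word, optionally with one
# internal apostrophe (e.g. "don't"). Other tokenisers will give other counts.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def type_token_stats(text: str) -> tuple[int, int, float]:
    """Return (types, tokens, TTR), where TTR = types / tokens."""
    tokens = TOKEN.findall(text.lower())
    types = set(tokens)
    ttr = len(types) / len(tokens) if tokens else 0.0
    return len(types), len(tokens), ttr
```

For example, `type_token_stats("The cat saw the other cat")` counts six tokens and four types. TTR generally falls as a text gets longer, so values are best compared between texts of similar length.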

Creature Corpus

Below is a file containing the 61 texts that make up the Creature Corpus. These texts come from the Paris Group Modernist study being undertaken by Professor Liudmyla Hryzhak and Suzanne McClure (publication forthcoming).

  • Creature Corpus (zip)

Copyright © 2022 The Modernist Literature Project - All Rights Reserved.


Please reference The Modernist Literature Project as follows:

McClure, S. and Pager-McClymont, K. (2022). The Modernist Literature Project. ModernistLiteratureProject.org.
