Using the latest tools in digitization and data analysis, a group of Stanford English students is helping scholars uncover new insights about British writer Virginia Woolf and the history of literary movements in the early 20th century.

English lecturer Alice Staveley, left, meets with Stanford junior Emily Elott, center, and second-year English doctoral candidate Anna Mukamal to discuss their latest work on the Modernist Archives Publishing Project. (Image credit: Alex Shashkevich)

Until now, no one has studied in detail Woolf’s impact on the publishing industry of that era and the business networks of Hogarth Press, the printing press founded in 1917 by Woolf and her husband, Leonard Woolf, at their home in southwest London. Scholars had previously thought the press was a niche business created to publish just Woolf’s work.

But as scholars involved in The Modernist Archives Publishing Project (MAPP) have unearthed, the press promoted lesser-known voices as well, said second-year English doctoral candidate and project manager Anna Mukamal.

Over a 30-year period, the press published over 500 books by over 300 authors. This revelation is bringing a new understanding of the modernist movement in literature, the interwar movement of the 1920s and 1930s that was spearheaded by a handful of writers, including Woolf, T. S. Eliot, James Joyce and Ezra Pound.

Virginia Woolf in 1902. (Image credit: George Charles Beresford)

“The idea that modernism was created by five people hanging out at Virginia Woolf’s salon couldn’t be further from the truth,” Mukamal said about MAPP’s findings. “The movement is so much more complicated.”

Founded in 2012 by English lecturer Alice Staveley and five other literature scholars from across the world, MAPP focuses on creating a public, digital archive of early 20th-century publishers.

MAPP’s goal is to trace the detailed history of that period by transcribing and digitizing thousands of documents from the publishers of the time. For example, the team members are looking at who bought books by Woolf and other modernist writers, as well as by lesser-known authors. They are also examining when books were bought and where they were distributed.

“This is a study of bookshops and the nature of the book market, which no one has really analyzed in this granular fashion before,” Staveley said. “We believe that studying the history of book production can help us glean important information about the literary figures we all love to read and teach from this period.”

So far, MAPP’s team has gathered and digitized over 4,000 artifacts related to the Woolfs’ publishing business, including one-of-a-kind book jackets, readers’ reports and correspondence between authors and their publishers. Most of these artifacts are physically scattered across institutions in England, Canada and other countries.

Difficulties of historical data analysis

With the support of the Center for Spatial and Textual Analysis, six Stanford undergraduates have been assisting with the digitization of the archives, transcription of certain documents, data analysis and other work.

A page Stanford junior Emily Elott helped transcribe from the sales records for Virginia Woolf’s 1925 novel Mrs Dalloway that catalogues order requests from booksellers. (Image credit: Courtesy of University of Reading Special Collections)

One of those students is Stanford junior Emily Elott, who over the past year has helped transcribe one of the Hogarth Press’s order books, which is about 300 pages long.

Each order book contains chronologically dated orders that the press received from booksellers for the works of Woolf and British novelist Vita Sackville-West. The project’s team is interested in analyzing these data because they can help people understand who bought Woolf’s and Sackville-West’s books and how international their reach was at the time.

Being a part of MAPP has shown Elott the extensive work and decisions that go into interpreting, visualizing and analyzing historical data.

For Elott, the challenge starts with the transcription of decades-old handwriting, some of which can be almost illegible. Parts of the order book contained scribbles of an unknown bookkeeper whose entries were so hard to understand that they inspired Elott to be more mindful of her own handwriting.

“Historical data is messier than the data we collect today,” said Elott, an English major and computer science minor.

Elott spent last summer double-checking every word in a 30,000-line spreadsheet that resulted from the six-month transcription process. As part of the data cleaning, she had to standardize some entries.

For example, the press’s bookkeepers appeared to use shorthand for a lot of orders. Many entries named “Smith” as the bookseller, while others contained the full names, such as “W. H. Smith,” “James Smith” and “John Smith.” The project team had to figure out the exact buyer to which the “Smith” entries referred. Because W. H. Smith was at the time one of the biggest booksellers, they chose to interpret all “Smith” entries as “W. H. Smith.”

“We can’t go back and ask them what they meant,” said Elott about the nearly 100-year-old entries she analyzed.

The project’s team is documenting each interpretive decision of this kind and plans to include that record in any final published analyses of the data.
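
The standardization and record-keeping described above can be illustrated with a short Python sketch. It is not the project's actual code or tooling, which the article does not describe. The rule table, the `Decision` record and the function name are hypothetical, and the sample list simply reuses the bookseller names quoted above. The sketch rewrites a bare "Smith" entry as "W. H. Smith" and logs each change together with the reason for it, so that the interpretive choice can be reported alongside the results.

```python
from dataclasses import dataclass

# Hypothetical rule table. Only the "Smith" rule comes from the article;
# the structure and the wording of the reason are assumptions.
SHORTHAND_RULES = {
    "Smith": (
        "W. H. Smith",
        "W. H. Smith was one of the biggest booksellers at the time",
    ),
}


@dataclass
class Decision:
    """One logged interpretive decision made during data cleaning."""
    row: int
    original: str
    standardized: str
    reason: str


def standardize_booksellers(names):
    """Return the cleaned names plus a log of every rule that was applied."""
    cleaned = []
    decisions = []
    for row, name in enumerate(names):
        key = " ".join(name.split())  # collapse stray whitespace
        if key in SHORTHAND_RULES:
            standardized, reason = SHORTHAND_RULES[key]
            decisions.append(Decision(row, name, standardized, reason))
            cleaned.append(standardized)
        else:
            # Full names are kept as transcribed.
            cleaned.append(key)
    return cleaned, decisions


# Illustrative entries only, not rows from the real order book.
sample = ["Smith", "W. H. Smith", "James Smith", "John Smith"]
cleaned, decisions = standardize_booksellers(sample)
```

Keeping the log separate from the cleaned column means the original transcription is never overwritten, and anyone who disagrees with the "Smith" interpretation can see exactly which rows it touched.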

Elott said that she now sees how crucial the humanistic perspective is to data science.

“I think people who are in more traditional data science fields tend to have this belief that numbers are infallible and that if it’s in the data then it must be true,” Elott said. “But through this experience, I’ve seen the ways in which you can manipulate the data to tell a story that you want it to say. Every minute decision I make as a data cleaner can naturally accrue from my own bias. So to think that these numbers are infallible to me is alarming.”

Other co-founding members of the project are Claire Battershill of Simon Fraser University in Canada; Nicola Wilson of the University of Reading in the United Kingdom; Helen Southworth of the University of Oregon; Elizabeth Willson Gordon of The King’s University in Canada; and Mike Widner, a former academic technology specialist for the Division of Literatures, Cultures, and Languages at Stanford.

The project is supported by grants from the Social Sciences and Humanities Research Council of Canada and, at Stanford, the Roberta Bowman Denning Fund for Humanities and Technology, the Center for Spatial and Textual Analysis, and the Department of English in the School of Humanities and Sciences.  

Media Contacts

Alex Shashkevich, Stanford News Service: (650) 497-4419