University of Galway | Stemma Project

The Challenge

The Stemma Project's challenge was monumental: digitising and analytically exploring early modern English manuscripts. The difficulty lay in the diversity and range of source information, which came from collections across the world.

These manuscripts, fragmented and preserved in various unstructured and legacy formats, represented a rich cultural heritage yet were inaccessible for comprehensive study. For example, we found 3291 different date formats in these data sets alone.
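To give a flavour of what reconciling date formats involves, here is a minimal sketch in Python. It is an illustration only, not the project's actual code: the function name `normalise_date` and the short list of formats are assumptions, and the real data sets contained far more variety than a fixed list could cover.

```python
from datetime import datetime
from typing import Optional

# Hypothetical sample of formats, chosen for illustration only.
CANDIDATE_FORMATS = [
    "%Y-%m-%d",   # 1623-03-12
    "%d %B %Y",   # 12 March 1623
    "%d %b %Y",   # 12 Mar 1623
    "%B %d, %Y",  # March 12, 1623
    "%d/%m/%Y",   # 12/03/1623
]


def normalise_date(raw: str) -> Optional[str]:
    """Return an ISO 8601 date string, or None if no known format matches.

    No calendar conversion is attempted: the value is read as written.
    """
    # Collapse runs of whitespace and drop a trailing full stop.
    text = " ".join(raw.split()).rstrip(".")
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # Try the next candidate format.
        return parsed.date().isoformat()
    return None  # Leave unrecognised values for manual review.


if __name__ == "__main__":
    for sample in ["12 March 1623", "March 12, 1623", "circa 1620s"]:
        print(sample, "->", normalise_date(sample))
```

Returning `None` rather than guessing keeps uncertain values visible, so that vague or partial dates can be reviewed by a person instead of being silently forced into a precise day.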

The task at hand was not just about converting these disparate pieces of history into a digital format but also about creating a structured, searchable, and analytically useful repository of information, effectively turning fragmented historical manuscripts into data for academic research and public access.

This required innovative approaches to data ingestion, cleansing, and reconciliation, along with sophisticated database design to ensure both academic researchers and the public could explore these literary treasures effectively.

Services provided

Digital Transformation

Data & AI

Software Development

Hosting & Cloud Infrastructure

What we did

We solved this intricate challenge by architecting an ingestion system with sophisticated, scalable algorithmic data cleansing processes. This allowed us to re-catalogue and categorise complex information to support academic research.

This initiative was aimed at deciphering the complexities of early modern English manuscripts, which were scattered across various formats and sources, posing a significant challenge in terms of digitisation and analysis.

Through data ingestion, cleansing, and reconciliation processes, we transformed these fragmented, historically significant manuscripts into a cohesive, digitally accessible archive.
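As a rough illustration of what reconciliation can look like, the sketch below groups catalogue records from different sources under a shared matching key. Everything in it is an assumption made for the example: the record fields (`source`, `first_line`), the function names, and the choice of a poem's first line as the matching key. It is not a description of the system we built.

```python
import re
import unicodedata
from collections import defaultdict


def match_key(first_line: str) -> str:
    """Build a crude matching key: no accents, case, or punctuation."""
    text = unicodedata.normalize("NFKD", first_line)
    # Drop combining marks left behind by the decomposition.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Keep only lowercase letters, digits, and spaces.
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(text.split())


def reconcile(records):
    """Group records (hypothetical dicts) that share a matching key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record["first_line"])].append(record)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        {"source": "A", "first_line": "Goe, and catche a falling starre"},
        {"source": "B", "first_line": "  Goe and catche a falling starre,  "},
        {"source": "C", "first_line": "Come live with mee, and be my love"},
    ]
    for key, group in reconcile(sample).items():
        print(key, "->", [r["source"] for r in group])
```

A key like this only absorbs differences in punctuation, spacing, case, and accents. Spelling variants, which are common in early modern texts, would need fuzzier matching or human review.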

The solution not only preserved the invaluable literary heritage but also made it readily available for academic research and public exploration. By implementing state-of-the-art database design and employing advanced data management techniques, we bridged the gap between historical manuscripts and modern digital accessibility.


Outcome

The Stemma Project achieved a transformative outcome by digitising and structuring fragmented, historical poetry manuscripts into a comprehensive digital archive.

This endeavour not only preserved a valuable segment of literary heritage but also made it accessible for scholarly research and public exploration. Through data processing and innovative database solutions, we overcame significant challenges related to data fragmentation and legacy formats, enabling researchers to carry out digital humanities research focused on early modern English poetry.

We’d like to thank the team from the University of Galway for their hard work and collaboration with us. Their expertise, dedication, and consistent commitment were essential to reaching our goals for the project.

Here’s what Erin McCarthy, Established Professor of English Literature and Computational Humanities at the University of Galway and PI of STEMMA, had to say about working with our team...

The Ember team has exceeded my expectations, not only because of their technical and project management expertise but because of their curiosity about the historical source material and the project’s bigger implications for those interested in it.

Lastly, we’d like to extend a huge thank you to the Irish Research Council and the European Research Council as the success of The Stemma Project would not have been possible without their support. We are incredibly grateful for their contributions and partnership throughout the project.
