ISAW Library Internship Report: Continuing work on a Linguistic Dataset of Latin Works Written by Women

By Patrick J. Burns
03/31/2026

Annotation and dataset building continued in Summer 2025 for the Representing Women Authorship in the Latin Treebanks (RWALT) project with three new contributors from our high-school internship program. The project added works by Hrotsvitha as well as sections from the Passio Perpetuae et Felicitatis. Read below reports from the contributors themselves, with Oliver Katz writing about his work on plays and letters from Hrotsvitha and John Toews and Alexangel Ventura on Perpetua.—Patrick J. Burns


Over the course of this summer, I completed an internship at NYU’s Institute for the Study of the Ancient World Library, in which I worked on the project Representing Women Authors in the Latin Treebanks (RWALT). Under the guidance of David Ratzan and Patrick J. Burns, I translated selections from the plays and letters of Hrotsvitha. Hrotsvitha was a tenth-century playwright, poet, and canoness of Gandersheim, a largely autonomous intellectual aristocratic principality and abbey composed entirely of women. Hrotsvitha’s work reflects the context within which she was writing: she heavily emphasizes themes of Christianity and extensively propagandizes in support of the Ottonian imperial court (which had close ties to Gandersheim). Because she was educated in Classical and Christian Latin, as well as Greek, her writing pulls linguistic elements from multiple time periods while foregrounding female characters and themes. Out of her seven plays, I primarily focused on Dulcitius, also known as Passio sanctarum virginum Agapis Chioniae et Hirenae, a story of three virgin Christian martyrs whom the titular Dulcitius tries (and, hindered by divine intervention, fails) to persecute for their refusal to renounce their faith. As a part of this project, I focused on the first six scenes of Dulcitius, but also annotated Hrotsvitha’s personal letters to her abbess Gerberga and to her readers as found in the Medieval Women’s Latin Letters project. For each of these works, I drafted a translation from Latin to English and then ran each original sentence through the LatinCy analyzer, which yielded annotations for each word (its lemma, part-of-speech tag, and morphological information) that were recorded in a spreadsheet; I then corrected the annotations for over 1000 words from Hrotsvitha’s writings. The aim was to help train the LatinCy analyzer so that it can more accurately annotate Latin texts by women authors, who are underrepresented in the treebank dataset.
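The annotation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual tooling: the `Token` class is a hypothetical stand-in that mirrors the attribute names of a spaCy token (`text`, `lemma_`, `pos_`, `morph`), the column names are invented for the example, and the sample analysis is hand-written rather than real LatinCy output.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical stand-in for a spaCy token (only the attributes used here)."""
    text: str
    lemma_: str
    pos_: str
    morph: str


# Assumed spreadsheet layout; the project's real column names are not given.
FIELDS = ["sentence_id", "token_id", "form", "lemma", "upos", "morph", "corrected"]


def annotation_rows(sentence_id, tokens):
    """Turn one analyzed sentence into one spreadsheet row per token."""
    rows = []
    for i, tok in enumerate(tokens, start=1):
        rows.append({
            "sentence_id": sentence_id,
            "token_id": i,
            "form": tok.text,
            "lemma": tok.lemma_,
            "upos": tok.pos_,
            "morph": str(tok.morph),
            "corrected": "",  # left blank for the human annotator
        })
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text that any spreadsheet program can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hand-written illustrative analysis, NOT real model output.
sample = [
    Token("Christiana", "Christianus", "ADJ", "Case=Nom|Gender=Fem|Number=Sing"),
    Token("sum", "sum", "AUX", "Mood=Ind|Number=Sing|Person=1|Tense=Pres"),
]
rows = annotation_rows("s1", sample)
csv_text = to_csv(rows)
print(csv_text)
```

With a spaCy pipeline loaded, the same `annotation_rows` function could in principle be given the tokens of an analyzed sentence instead of the stand-in objects, since it only reads those four attributes.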

Going into this project, I was incredibly excited to be working on the plays of Hrotsvitha. While I had not extensively studied her works, I had briefly read an excerpt from Dulcitius and was immediately won over by her mix of humor and thematic richness within a theatrical format. However, as I read through more of Dulcitius and Hrotsvitha’s letters, I was fascinated by the degree to which propaganda shaped her narrative storytelling (I guess some things never change, whether you are a Roman general and self-promoter or a tenth-century canoness). This experience shaped how I approached the annotations. I found that, in order to best correct the LatinCy program, I had to approach my translation of the Latin with Hrotsvitha’s intended rhetoric in mind, not necessarily in the ways I had become accustomed to translating Cicero and Virgil. In order to train the LatinCy program to “understand” how to work with the works of Hrotsvitha, I first had to understand Hrotsvitha’s goals within the context of her own work. To that extent, I was amazed and delighted by how interdisciplinary this project was. While I may have been spending my time working with both my computer and a Latin dictionary at my side, I was still engaging all aspects of my brain. I was thinking about religion, rhetoric, theatrical conventions, and practically everything under the sun in order to correct the program in a way that was truly representative of the women authors this project seeks to platform within the treebanks.

I am immensely grateful to both Patrick and David for the opportunity to assist on this project and all the support they have given along the way. The chance to annotate works by a woman who is now my favorite Latin author has been incredible and has helped me grow not just as a Latin student but holistically. Whether it was meeting with David on Zoom to work through a literary translation of Hrotsvitha’s letter to her readers or working through the developments of the LatinCy program with Patrick, this process has been very rewarding. I am excited to see the developments of this project, and I hope to continue my involvement through the coming year!

—Oliver Katz


I have just completed my summer internship at the ISAW Library, working with David M. Ratzan and Patrick J. Burns and assisting with their RWALT project. Over the course of the summer I read and translated the first seven chapters of the Passio Sanctarum Perpetuae et Felicitatis. One of the earliest surviving pieces of Latin literature written by a female author, the Passio contains the prison diary of Perpetua, a Christian martyr condemned to death on account of her faith; the diary was subsequently edited, structured, and completed by a mysterious figure known as “the Redactor.” With these translations in hand, I ran each sentence of the Passio through the LatinCy text analyzer, which Patrick created and which the project seeks to improve. I then processed the output data into a spreadsheet with categories for (most significantly) lemma, morphology, and part of speech, where I corrected any mistakes that I found in the analysis. In those first seven chapters I was able to process over 100 sentences and annotate over 1250 tokens of text. The purpose of the project was twofold: (1) to improve the accuracy of the LatinCy program with the long-term goal of creating a machine-learning platform capable of successfully annotating new (i.e., newly digitized or new to the platform) Latin texts and processing these annotations into an accessible digital database; and (2) to help rectify the underrepresentation of Latin texts written by female authors in resources like the Latin UD treebanks.
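One way to see what "improving accuracy" means in practice is to compare the analyzer's rows against the hand-corrected rows, column by column. The sketch below is a minimal illustration, not the project's evaluation code: the function name `field_accuracy`, the dictionary keys, and both sample analyses (including the deliberately wrong "predicted" row) are hypothetical and written by hand for the example.

```python
def field_accuracy(predicted, corrected, fields=("lemma", "upos", "morph")):
    """Share of tokens on which the model's value matches the corrected value.

    Both arguments are lists of dicts, one dict per token, in the same order.
    """
    if len(predicted) != len(corrected):
        raise ValueError("predicted and corrected must have the same number of tokens")
    total = len(predicted)
    result = {}
    for field in fields:
        matches = sum(1 for p, c in zip(predicted, corrected) if p[field] == c[field])
        result[field] = matches / total if total else 0.0
    return result


# Hypothetical model output: the first token is given a wrong lemma and tag.
predicted = [
    {"form": "Christiana", "lemma": "Christiana", "upos": "NOUN",
     "morph": "Case=Nom|Gender=Fem|Number=Sing"},
    {"form": "sum", "lemma": "sum", "upos": "AUX",
     "morph": "Mood=Ind|Number=Sing|Person=1|Tense=Pres"},
]

# The same tokens after hand correction in the spreadsheet.
corrected = [
    {"form": "Christiana", "lemma": "Christianus", "upos": "ADJ",
     "morph": "Case=Nom|Gender=Fem|Number=Sing"},
    {"form": "sum", "lemma": "sum", "upos": "AUX",
     "morph": "Mood=Ind|Number=Sing|Person=1|Tense=Pres"},
]

scores = field_accuracy(predicted, corrected)
print(scores)
```

Scoring each column separately shows whether the model struggles more with lemmas, part-of-speech tags, or morphology on a given author.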

Besides the feeling of modernizing the study and accessibility of Latin literature, the aspect of my RWALT internship that I found to be most gratifying (and most surprising) was the extent to which the Passio began to influence my experience of other literature. Before this summer, I had never heard of Saint Perpetua, to say nothing of her autobiography. Now I feel as though the experience of reading it (some combination of the particularities of Perpetua’s voice, the Redactor’s method of structuring the narrative, and the actual plot of the martyrdom) appears everywhere I look and in everything I’ve read since, from Cicero’s Pro Archia Poeta to Nathaniel Hawthorne’s The Scarlet Letter to Mikhail Bulgakov’s The Master and Margarita. And I think this phenomenon arises just as much from the specific lens of correcting annotations as it does from the text itself. When the LatinCy model analyzes a text, it looks at a token solely for its role in the sentence; and in order to correct it, you, too, as a reader, have to look at each word with a focus on grammatical status. Unlike a computer, my human brain found it difficult to separate each word from its meanings, connotations, and particular oddities. But this is exactly what was required: the program seems to make errors more often when it comes to metaphorical speech or to what one might call exceptions from standard expression or usage: idioms, homonyms, irregulars, and anything specific to the author’s style of writing. In this way, each word is individually magnified through the practice of annotation, while the reader is simultaneously forced to pay extra attention to the greater patterns of the text as a whole, in order to catch and correct the computer’s inhumanness.

Because of this annotation process, and through the discussions of diction and grammar I was able to have with David and Patrick, I got the chance to engage with this incredible text on a level of detail that was completely new to me. Practicing this unusual form of textual analysis has genuinely changed the way I read, for modern texts as well as ancient ones.

This internship has really been a rewarding and perspective-shifting experience: I’ve had the opportunity to make a lasting contribution to the study of classical literature and at the same time to grow as a learner, in terms of both my Latin skillset and more general reading experience. I can’t thank Patrick and David enough for all their time, support, and mentorship, and for creating this project in the first place. I hope to stay involved with the program as the year continues!

—John Toews


During my summer internship with RWALT at the ISAW Library, I created, analyzed, and improved Latin databases from women-authored texts, particularly excerpts from the Passio Sanctarum Perpetuae et Felicitatis, an early 3rd-century CE Christian text written during the reign of the Roman emperors Septimius Severus and Caracalla. Known for being one of the earliest Christian accounts from a woman in the Roman Empire, it describes the firsthand experience of Perpetua, a noble woman born into a Romanized African family, who was a recent convert to Christianity. The martyr act also includes other early Christians and catechumens, such as Felicitas, a slave woman, along with Saturus, Revocatus, Saturninus, and Secundulus. I used the LatinCy system to annotate the works of women authors like Perpetua while also correcting its annotations to contribute to the training of the model. The LatinCy system was trained on the Universal Dependencies Latin treebanks, learning to assign lemmata, part-of-speech tags, and morphological tags in Latin. My own human annotations are then used to refine the model and improve its accuracy.
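Universal Dependencies treebanks store each token as one tab-separated line with ten fields (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC), with an underscore marking an empty field. As a hedged sketch of how a corrected spreadsheet row could be written in that shape, the helper below fills only the fields this annotation covers; the function name `conllu_line` and the sample values are hypothetical, and the project's actual export format is not stated here.

```python
def conllu_line(token_id, form, lemma, upos, feats,
                xpos="_", head="_", deprel="_", deps="_", misc="_"):
    """Format one token as a ten-field, tab-separated CoNLL-U line.

    Fields not covered by the annotation (syntax, etc.) default to "_".
    """
    fields = [str(token_id), form, lemma, upos, xpos,
              feats or "_", head, deprel, deps, misc]
    return "\t".join(fields)


# Hand-written illustrative values, not real project data.
line = conllu_line(1, "Christiana", "Christianus", "ADJ",
                   "Case=Nom|Gender=Fem|Number=Sing")
print(line)
```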

My internship experience at RWALT fostered my love for the intersection of humanities topics like Latin with data management and computational analysis. Because I took Latin classes for three years in high school alongside calculus and computer science, this experience gave me the opportunity to bring all my interests together. I was improving my Latin skills while also gaining familiarity with data analysis, spreadsheets, and digital modeling. But this program also hit home for me as an enthusiast for Roman history and as a Roman Catholic from a Jesuit high school: Perpetua’s perseverance and tremendous dedication to her faith made her story both admirable and inspirational to read, teaching us to appreciate our cultural and religious affiliations as a diverse planet of beliefs.

—Alexangel Ventura