ISAW Hosts “Future Philologies: Digital Directions in Ancient World Text” Conference

By Patrick J. Burns

On April 20, the ISAW Library hosted Future Philologies: Digital Directions in Ancient World Text, a conference on the intersection of research on historical languages and computer science, organized by Patrick J. Burns, David Ratzan, and Sebastian Heath. Building on the series of Linked Ancient World Data Initiative New York (LAWDNY) workshops hosted by the Library and Digital Programs, Future Philologies brought together scholars working on a diverse range of languages (Latin, Greek, Coptic, Persian, Arabic, Classical Chinese, and Sumerian were all represented on the program) to discuss the effect that digitization and the ability to analyze massive amounts of text are having, and will continue to have, on philological research and teaching. The event was co-sponsored by the NYU Center for the Humanities, the NYU Division of Libraries, NYU’s Center for Ancient Study, and the NYU Department of Classics.

Two things immediately stand out about the program, both of which capture the deeply interdisciplinary and interdepartmental aims of the conference: first, the diversity of historical languages covered and, second, the evolving nature of the collaboration between humanities disciplines and computer and information science. In both its linguistic range and its incorporation of a digital approach to humanistic inquiry, the conference reflected in miniature core strengths of ISAW’s mission, namely a commitment to a wide geographic and broad chronological scope in defining the ancient world as well as a commitment to promote innovative research in our field’s digital platforms and communities.

Assistant Research Scholar Patrick J. Burns opened the conference with a reflection on where philology fits in at ISAW, and in Digital Programs at ISAW specifically. ISAW is an institution where seminars on “Anatolian Languages of the 2nd and 1st Millennium BCE: Hieroglyphic Luwian and Lycian” and “Advanced Data Structures and Querying for the Ancient World” sit side by side in the course offerings. Burns described the Future Philologies program as the productive middle ground between these two aspects of ISAW research culture, exploring ideas about ancient text-as-data and the implications of big-data approaches to these materials. Following Burns’s introductory remarks, Caroline Schroeder, Professor of Religious Studies at the University of the Pacific and co-founder of the Coptic Scriptorium, the flagship digital project in Coptic language studies, delivered the keynote address. She reflected on the tendency of digital scholarship to replicate and reinforce existing systems of knowledge, and on the ways in which we can move our disciplines forward by learning to tolerate non-canonical ways of dealing with data and by fostering interdisciplinary, collaborative research.

Donald Sturgeon presents on Chinese text analysis tools at Future Philologies.

Interdisciplinarity and collaboration featured prominently in the conference’s next two panels, the first on digital corpus and data projects and the second on digital methodologies. In the first panel, Perseus Project editor Gregory Crane (Tufts/Leipzig) provided a roadmap to how open-source development and open-access data are helping to usher in a new age of comparative philology, while Alexander Magidow (U. of Rhode Island) and Yonatan Belinkov (MIT) gave a specific example of how Shamela, a diachronic corpus of Arabic, is challenging traditional views on the basic periodization of the language. In the second panel, Donald Sturgeon (Harvard) from the Chinese Text Project discussed the need for accessible digital tools for exploratory analysis of pre-modern Chinese, and Émilie Pagé-Perron (U. of Toronto) demonstrated how machine learning is being used at the Cuneiform Digital Library Initiative to bring large numbers of untranslated texts to a wider audience through the Machine Translation and Automated Analysis of Cuneiform project.

The final panel of the day asked researchers in computer science and information science to reflect on current opportunities in and future challenges to digital philology. Kyle P. Johnson, founder of the Classical Language Toolkit, assessed a number of existing digital philology projects and laid out the minimum requirements that need to be met if natural language processing is to be extended to an increasing number of historical languages, especially ones with limited existing resources or institutional support. Laure Thompson and David Mimno (Cornell) presented a blunt challenge to classical philologists: what is the quickest, most efficient way to summarize the contents of the 164 volumes of the Patrologia Graeca, comprising nearly 57,000 pages of parallel Greek and Latin text? Thompson and Mimno then used topic modeling to reduce the massive scale of the PG to something approaching an automated thematic index. Lastly, David Smith (Northeastern) presented on the Viral Texts project and its implications for the future of philology, namely how researchers going forward will need to address the proliferation of ancient world textual data and multiple editions of this data. Using 19th-century American newspapers as a comparative model, Smith showed how machine learning can be used to collate and correct massive amounts of textual data.

A primary goal of Future Philologies was to assemble scholars working on large-scale ancient world text projects who might not otherwise find opportunities to discuss their work together. A cuneiform machine learning project will find an audience in a Near Eastern Languages department, but perhaps not in a Classics department. Corpus-based approaches to periodization in Arabic literature could be, mutatis mutandis, readily adapted to the study of other ancient languages. Chinese text analysis tools may, on the surface, appear to offer little to a Coptic scholar, and yet in Sturgeon’s presentation we learned how simplifying workflows and lowering the barrier to entry for computational text analysis could benefit any historical language text project. Linguistic differences, to be sure, translate into differences in digital approach and design; but there is also a great deal of overlap at the foundational level of most computational approaches to languages, and Future Philologies took advantage of the common ground to expand the vision of all the projects involved. This kind of inclusive view of the ancient world is a core strength of ISAW, making it an ideal venue for a conference on the future of comparative philology.

A further goal of Future Philologies was to include computer scientists in the discussion. Computer science can be seen as a discipline of solving interesting problems that, whether for reasons of scale or subtlety, are difficult to solve through human study and cognition alone. Large collections of untransliterated and untranslated Sumerian tablets are an interesting problem. Periodization of Arabic literature over centuries is an interesting problem. Helping students to learn Greek, Arabic, and Persian more quickly and more efficiently is an interesting problem. Philology, and comparative philology in particular, contains many such problems and, at the level of complex information problems, they are not dissimilar to those that CS departments study and solve in different contexts. Crane once argued that “we are all corpus linguists and don’t know it.” But with so much attention in computer science currently focused on natural language processing, text mining, text classification, and related text-focused subfields, I think we can update this to “we are all (or will soon be to some degree) computer scientists and don’t know it.” These are the areas where we find the bleeding edge in working with language and this research will gradually infiltrate how we approach philology going forward.

The LAWDNY Workshops, last fall’s Digital Publication in Mediterranean Archaeology: Current Practice and Common Goals, and Future Philologies all reflect the commitment of the Library, Digital Programs, and the ISAW faculty to support digital and computational humanities at ISAW by presenting our research to our community and by bringing together colleagues to present their work. This programming continues with our next conference, Digital Approaches to Teaching the Ancient Mediterranean, later this year on October 26.

Future Philologies: Digital Directions in Ancient World Text presentations:

  • Patrick J. Burns (ISAW), "The 'Point' of Future Philologies"

  • Caroline Schroeder (Pacific/Coptic Scriptorium), "Annotating Heresies"

  • Gregory Crane (Tufts/Leipzig), "Digital Philology 2.0, Smart Editions, and the Future of Work"

  • Alexander Magidow (URI) and Yonatan Belinkov (MIT), "Analyzing the History of Formal Written Arabic"

  • Donald Sturgeon (Harvard), "Accessible Digital Text Analysis for Classical Chinese"

  • Émilie Pagé-Perron (Toronto/MTAAC), "Machine Translation for the Sumerian language: Workflow and Prerequisites"

  • Kyle P. Johnson (Accenture), "The Next 700 Classical Languages"

  • David Mimno and Laure Thompson (Cornell), "Authorship and Translation: Bilingual Modeling of the Patrologia Graeca"

  • David Smith (Northeastern), "Viral Texts and Networked Authors: Computational Models of Information Propagation"