mtmiller

8/2/22

By Jessica Weiss ’05

The University of Maryland has received a nearly $300,000 grant from the National Science Foundation that will support efforts to improve the way handwritten documents from the premodern Islamicate world—primarily in Persian and Arabic—are turned into machine-readable text for use by academics or the public. 

Assistant Professor Matthew Thomas Miller and Mellon Postdoctoral Fellow Jonathan Parkes Allen, both of the Roshan Institute for Persian Studies, will work with researchers at the University of California San Diego (UCSD), led by computer scientist Taylor Berg-Kirkpatrick, on the innovative humanities-computer science collaboration. UCSD received its own $300,000 award.    

Over three years, the researchers will work in the domain of handwritten text recognition, a set of methods designed to automatically read diverse styles of human handwriting with high accuracy.

“This work has the potential to remove substantial roadblocks for digital study of the premodern Islamicate written tradition and would be really transformative for future studies of these manuscripts,” Miller said. “We are very grateful to the NSF for its support.” 

This latest research proposal builds on a number of ongoing efforts to develop open-source technology to expand digital access to manuscripts and books from the premodern Islamicate world in Arabic, Persian, Ottoman Turkish and Urdu; Miller currently leads an interdisciplinary team of researchers on a $1.75 million grant from the Mellon Foundation as well as a $300,000 grant from the National Endowment for the Humanities.

There are hundreds of thousands, perhaps even millions, of premodern Islamicate books and manuscripts spanning over 1,500 years, from the 7th to the 19th century, forming perhaps the largest archive of cultural production of the premodern world. Scanning and digitization efforts over the last decade have made images of Islamicate manuscripts in a large number of collections available to the public. However, these images remain mostly “locked” for digital search and manipulation because their contents have not been transcribed into machine-readable text.

The task is made more difficult by the diversity and intricacy of many Arabic manuscripts, said Allen, who is a historian of early modern Ottoman religious and cultural history. They may be written alongside diagonal notes, annotations and corrections, in multiple colors and “hands.” 

Under the NSF grant, researchers will develop new techniques that remove the need for extensive manual labor by human transcribers, an approach known as “unsupervised” transcription. Eventually, the tools under development will produce models that will be able to automatically transcribe large quantities of Persian and Arabic script in a multitude of different styles with substantially higher degrees of accuracy than is currently possible.

“The Arabic script tradition is so extensive and so broad,” Allen said. “People need to be able to read these manuscripts, search within them, and integrate them into their research.” 

Image: Staatsbibliothek zu Berlin, Ms. or. oct. 3759

7/5/22

By Jessica Weiss ’05

The University of Maryland has received a $1.75 million grant from the Mellon Foundation to continue development of open-source technology to expand digital access to manuscripts and books from the premodern Islamicate world in Arabic, Persian, Ottoman Turkish and Urdu.

Matthew Thomas Miller, assistant professor in the Roshan Institute for Persian Studies in the School of Languages, Literatures, and Cultures, leads the interdisciplinary team of researchers, including David Smith from Northeastern University, Sarah Bowen Savant from Aga Khan University (AKU) in London, Taylor Berg-Kirkpatrick from the University of California, San Diego, and Raffaele Viglianti from the Maryland Institute for Technology in the Humanities at Maryland. The Mellon Foundation has been funding the project, known as “OpenITI AOCP,” since 2019.

“Over the past four years we have made incredible progress on the creation of digital infrastructure for Islamicate studies, and that is thanks in large part to the Mellon Foundation,” Miller said. “We are honored that the foundation continues to support our efforts to expand access to and digitally preserve such a rich and important cultural tradition.”

There are currently hundreds of thousands, perhaps even millions, of premodern Islamicate books and manuscripts that academics and the public cannot access digitally, Miller said.

Thus far, the project team—made up of computer science and humanities experts—has successfully improved the accuracy of open-source Persian and Arabic optical character recognition (OCR) software, which is a system that turns physical, printed documents into machine-readable text. Under the new grant, they will use this OCR software to produce 2,500 new digitized Persian and Arabic texts, as well as expand the OCR system’s capabilities into Ottoman Turkish and Urdu.

They also aim to improve the accuracy of open-source handwritten text recognition (HTR) for Arabic-script manuscripts. HTR, a subfield of OCR technology, comprises tools designed to read diverse types of human handwriting with high accuracy.

The team will also roll out a user-friendly redesign of its eScriptorium platform, which hosts the open-source tools. This latest Mellon grant will last three years. (Last year, Miller also received a grant from the National Endowment for the Humanities to support the project.)

Though he hopes the project’s next phase of development marks a major improvement for Arabic, Persian, Ottoman Turkish and Urdu texts, Miller said the ultimate goal is for the open-source tools to be used across a wide variety of languages.

“We really hope the technology will be reused by other users, especially those working in other under-resourced languages,” he said. “It’s designed to meet the needs of varied users.”

Image description: Persian ruba‘i (quatrain) calligraphy dating between circa 1610 and circa 1620. Gift in honor of Madeline Neves Clapp; Gift of Mrs. Henry White Cannon by exchange; Bequest of Louise T. Cooper; Leonard C. Hanna Jr. Fund; From the Catherine and Ralph Benkaim Collection.

 

1/13/22

The University of Maryland Office of the Provost and Office of the Vice President for Research have announced ten recipients of this year’s Independent Scholarship, Research and Creativity Awards (ISRCA). The grant funding will support a variety of research studies and scholarly explorations ranging from poetry and literature to the immigrant experience.

“We are excited to support these projects, which embody faculty creativity and demonstrate the versatility and broad expertise of our researchers,” said Senior Vice President and Provost Jennifer King Rice.

The ISRCA program, launched in 2019, is designed to support the professional advancement of faculty engaged in scholarly and creative pursuits that use historical, humanistic, interpretive, or ethnographic approaches; explore aesthetic, ethical, and/or cultural values and their roles in society; conduct critical or rhetorical analysis; engage in archival and/or field research; and develop or produce creative works. Awardees are selected based on peer review of the quality of the proposed project, the degree to which the project will lead to the applicant’s professional advancement, and the potential academic and societal impact of the project.

In all, 44 eligible proposals were submitted, representing nine colleges and 29 departments across campus. The awards, worth up to $10,000, support faculty and their research expenses.

“We were greatly pleased to see the strong faculty interest and engagement in this program, and the robust and diverse research areas explored by our faculty,” said Interim Vice President for Research Amitabh Varshney. 

This year’s award recipients are:

In Referees We Trust? A History of Peer Review in the Sciences
Melinda Baldwin, Associate Professor (ARHU-History)

Landscape Memories, Migration, and Commons Management in Forest Systems
Madeline Brown, Assistant Professor (BSOS-Anthropology)

Radical Lens: The Photographs of Nancy Shia 
Nancy Mirabal, Associate Professor (ARHU-American Studies)

Navigating Prolonged Legal Limbo: Deferred Action for Childhood Arrivals Recipients in the D.C. Metro Region
Christina Getrich, Associate Professor (BSOS-Anthropology)

Kippax Colonoware Sourcing and Trade Study
Donald Linebaugh, Professor (ARCH-Historic Preservation)

Embodied Afterlives: Performing Love Suicide in Early Modern Japan
Jyana Browne, Assistant Professor (ARHU-SLLC)

Selective: Data, Power, and the Fight over Fit in Organizational Life
Daniel Greene, Assistant Professor (INFO)

Sensing God: Embodied Poetics and Somatic Epistemology in Medieval Persian Sufi Literature
Matthew Miller, Assistant Professor (ARHU-Persian/SLLC)

Korean Immigrant Pioneers and Intergenerational Mobility Prospects in the DC Region 
Julie Park, Associate Professor (BSOS-Sociology and Asian American Studies)

Cool Fratricide: Murder and Metaphysics in Black and Indigenous U.S. Literature 
Chad Infante, Assistant Professor (ARHU-English)

9/3/21

By Jessica Weiss ’05

The Andrew W. Mellon Foundation has awarded a $100,000 grant to support the continued development of user-friendly, open-source software capable of creating digital texts from Persian and Arabic books. 

Matthew Thomas Miller, assistant professor in the Roshan Institute for Persian Studies in the School of Languages, Literatures, and Cultures, leads an interdisciplinary team of researchers from Northeastern University, Aga Khan University (AKU) in London and the Maryland Institute for Technology in the Humanities at Maryland. The Mellon Foundation has been funding the team’s work since 2019.

“We are honored that The Andrew W. Mellon Foundation has again supported our efforts,” Miller said. “They have been global leaders in building open-source tools and open-access collections for the expansion in access to and digital preservation of cultural traditions across the world, and we are delighted to be a part of these efforts.”

The project, known as “OpenITI AOCP,” aims to enable the digitization of texts from the premodern Islamicate world—an enormous tradition stretching over 1,000 years. The tools being created by the project team will be free and open to use and will allow academics and the public to produce high-quality digital transcriptions of Persian and Arabic printed texts, from poetry to the Quran. 

“Premodern Islamicate textual production is a massive and understudied archive that remains particularly underrepresented in the field of digital humanities,” Miller said. “This democratization of access to digital text production will change the landscape of Islamicate studies.”

Thus far, the project team, made up of computer science and humanities experts, has improved the accuracy of Persian and Arabic optical character recognition (OCR) tools, which convert printed text into machine-encoded text, and has begun experimenting with Ottoman Turkish and Urdu. The team is integrating those tools into a platform called eScriptorium. It also held a training session at the University of Maryland in 2020 for OCR experts from all over the world and taught a Spring 2021 Global Classrooms course, “The Islamicate World 2.0: Studying Islamic Cultures through Computational Textual Analysis,” on the basics of computational textual analysis as it relates to textual data about the Islamicate world.

Next steps include finalizing the open-source software for widespread use, as well as holding additional workshops and community-building activities around the new tools. This latest Mellon grant will last one year.

Earlier this year, Miller was awarded $282,905 by the National Endowment for the Humanities to support the project.

Image description: The introduction to George B. Whiting's Kitab fi al-Imtina‘ ‘an Shurb al-Muskirat, published in Beirut by American Mission Press in 1838 and housed at Harvard's Houghton Library (*98Miss168). Licensed for non-commercial use.