Buddhist Translators Workbench

What's New

The Mangalam Dictionary of Buddhist Sanskrit has been released. It is fully automated and compares Buddhist and non-Buddhist word usage. For curated data, see our Visual Dictionary and Thesaurus of Buddhist Sanskrit. New lexical portraits for key terms continue to be added regularly.

Overview

The Buddhist Translators Workbench facilitates the translation and study of the Buddhist textual heritage through the development of lexicographic resources, corpora and corpus methods for classical Buddhist languages. So far, the project has focused on Sanskrit, one of the least resourced classical languages of Buddhism. We are currently developing the first processed corpus of Buddhist Sanskrit literature and using it to create a corpus dictionary, the Visual Dictionary and Thesaurus of Buddhist Sanskrit.

News

July 2025: Our fully automated corpus dictionary covering over 2000 nouns, with comparison between Buddhist and non-Buddhist sources, is now online: Mangalam Dictionary of Buddhist Sanskrit.

July 2025: We have started adding semantic tags to our Buddhist Sanskrit corpus (for now, only words in the semantic domain of Language are tagged). You can download it from Zenodo.

January 2023: Mangalam Research Center has been awarded a Digital Humanities Advancement Grant from the National Endowment for the Humanities for a project titled “Democratizing digital lexicography: an infrastructure to facilitate the creation and dissemination of electronic dictionaries” (HAA-290402-23).

January 2022: Mangalam Research Center has received an Ashoka grant from the Khyentse Foundation for the project “Machine-readable Dharma: Building a body of Buddhist Sanskrit literature for computer-aided analysis”, which seeks to expand and improve our processed corpus of Buddhist Sanskrit literature.

January 2021: Mangalam Research Center has been awarded a Digital Humanities Advancement Grant from the National Endowment for the Humanities in support of the Buddhist Translators’ Workbench project (HAA-277246-21). The grant will help build a Natural Language Processing infrastructure for the exploration of Buddhist Sanskrit lexical semantics through word embedding models.

Team

Ligeia Lugli, Ph.D., Project Director
Luis Gamaliel Quiñones-Martinez, Lexicographer
Tilak Balavijayan, Technical Assistant
Morgan Wells, Esq., Grant Administrator

Publications

Lugli, Ligeia (2024). Agile Lexicography: rapid dictionary prototyping with R Shiny, with examples from projects on Sanskrit and Tibetan. In Grammatical Theory, Language Processing and Databases in Historical Linguistics. Brill, 121-149.

Martinc, M., Pelicon, A., Pollak, S., Lugli, L. (2023). Word-sense Induction on a Corpus of Buddhist Sanskrit Literature. In Medveď, M. & Měchura, M. & Tiberius, C. & Kosem, I. & Kallas, J. & Jakubíček, M. & Krek, S. (eds.). Electronic lexicography in the 21st century (eLex 2023): Invisible Lexicography. Proceedings of the eLex 2023 conference, pages 201-215.

Lugli, L., Martinc, M., Pelicon, A., Pollak, S. 2022. Embedding models for Buddhist Sanskrit. Proceedings of the 13th Conference on Language Resources and Evaluation (LREC), Marseille, 20-25 June 2022, 3861-3871.

Lugli, Ligeia. 2021. Words or terms? Models of terminology and the translation of Buddhist Sanskrit vocabulary. In Alice Collett (ed.) Translating Buddhism: Historical and Contextual Perspectives, New York: SUNY.

Lugli, L. 2021. Dictionaries as collections of data stories: an alternative post-editing model for historical corpus lexicography. In Iztok Kosem, et al. (eds.). Post-Editing Lexicography: eLex 2021, 216–231.

Lugli, Ligeia. 2019. Smart lexicography for low-resource languages: lessons learned from Buddhist Sanskrit and Classical Tibetan. Smart Lexicography: eLex 2019, 198–212.

Lugli, Ligeia. 2018. Drifting in Timeless Polysemy: Problems of chronology in Sanskrit lexicography. Dictionaries: Journal of the Dictionary Society of North America. Vol. 39 (1): 105-129.

Lugli, Ligeia. 2015. Mapping meaning across cultures: a lexicographic resource for translators of Sanskrit Buddhist texts into English. Proceedings of the 9th International Conference of ASIALEX.

Digital Outputs

Ongoing development: Visual Dictionary and Thesaurus of Buddhist Sanskrit (online interface) [https://mangalamresearch.shinyapps.io/VisualDictionaryOfBuddhistSanskrit/]

Ongoing development: Visual Dictionary and Thesaurus of Buddhist Sanskrit (dictionary data) [10.5281/zenodo.3716014]

Ongoing development: Meaning Mapper, a lexical annotation tool. [github.com/MangalamResearch-Lexicography/MeaningMapperApp]

Ongoing development: Buddhist Sanskrit Segmenter and Lemmatizer [10.5281/zenodo.3459218]

2017-2019 Meaning Mapper (Chrome plug-in version) [https://github.com/mangalam-research/mmwp]

Ongoing development: Corpus of Buddhist Sanskrit Literature [10.5281/zenodo.3457821]

2012-2016 Buddhist Translator Workbench (early database & interface): [https://github.com/mangalam-research/btw]

2012-2016 Buddhist Translator Workbench (early dictionary data): [10.5281/zenodo.3605420]

Funding

The Buddhist Translators Workbench was started thanks to funding from the National Endowment for the Humanities (HD-51383-11; HD-51772-13).

The corpus of Buddhist Sanskrit literature [10.5281/zenodo.3457821] and the Buddhist Sanskrit segmenter and lemmatizer [10.5281/zenodo.3459218] were created by Ligeia Lugli with funding from the British Academy (NF161436) in 2018-2019 and subsequently expanded with funding from the Mangalam Research Center for Buddhist Languages. The corpus is currently being refined thanks to an Ashoka grant from the Khyentse Foundation.

Additional funding from the National Endowment for the Humanities was awarded in 2021 and 2023.

Related Work

March 2022: We are happy to announce that following Dr. Lugli’s CAPES-funded visiting professorship in Brazil, two ongoing bilingual Portuguese dictionary projects at São Paulo State University are now using an architecture and interface modelled after our Visual Dictionary. One of them is already publicly available at https://lexicografiaunesp.shinyapps.io/DicionarioEscolarDeVerbosPortuguesIngles/.

Dr. Lugli has adapted our ‘Visual Dictionary’ model to develop the interface of a diachronic lexicon of Tibetan verb valency at SOAS (University of London), with funding from UKRI and in collaboration with U. Pagel, E. Garrett, Ch. Faggionato, S. Rhode and N. Somsdorf. The Visual Dictionary of Tibetan Verb Valency is currently hosted and maintained by the Mangalam Research Center.

Mangalam Research Center