Featured datasets
- Open Parallel Corpus
This corpus is a continuously updated, growing collection of multilingual texts aligned with Tibetan (bo) texts at the sentence level. It is intended for training machine translation (MT) models.
- Vulgate Kangyur
This Kangyur was created with OpenPecha's Vulgate Generator, which compares multiple instances of a work and compiles a new version by taking the most common character at each position.
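
The character-level voting described above can be illustrated with a minimal Python sketch. This is not OpenPecha's Vulgate Generator code: the function name `majority_vote` is hypothetical, and the sketch assumes the instances have already been aligned to the same length, which real editions would need a separate alignment step to achieve. It also treats each Unicode code point as a "character", so a Tibetan stack made of several code points is voted on piece by piece.

```python
from collections import Counter
from typing import Sequence


def majority_vote(instances: Sequence[str]) -> str:
    """Build a consensus text from pre-aligned, equal-length instances.

    Hypothetical helper for illustration only; assumes alignment was done elsewhere.
    """
    if not instances:
        raise ValueError("need at least one instance")
    length = len(instances[0])
    if any(len(text) != length for text in instances):
        raise ValueError("instances must be aligned to the same length")

    consensus = []
    # zip(*instances) yields one tuple per position, holding that
    # position's character (code point) from every instance.
    for chars in zip(*instances):
        # Pick the most frequent character; on a tie, Counter keeps the
        # character that was encountered first (the earliest instance).
        consensus.append(Counter(chars).most_common(1)[0][0])
    return "".join(consensus)


# Three toy "instances" with one variant reading each at different positions.
print(majority_vote(["kangyur", "kangyor", "kamgyur"]))  # kangyur
```

Ties are resolved in favor of the earliest instance here purely for simplicity; a real tool might weight instances or flag ties for review instead.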