
qizha/buddhist-dictionary

 
 

buddhist-dictionary

NTI Buddhist Text Reader

A text reader, including dictionary and tools, for analyzing and managing Buddhist texts in Chinese and Sanskrit. This is a non-profit, open source project.

Goals:

  1. Create a dictionary that is easy to use for everybody interested in Buddhism, including lay people reading Buddhist texts, students, translators, and academics. Importantly, the goal is to create useful tools rather than authoritative definitions of terms.

  2. Create tools that are useful for linguistic analysis of Buddhist texts, including identification of specialist Buddhist terms and comparison of Chinese and Sanskrit texts.

  3. Use the tools to analyze and annotate a number of texts and share the content with the general public.

There are three parts to the project:

  1. The web user interface. This includes HTML, PHP, and JavaScript files.

  2. The data. This is the dictionary and text files. The data files are UTF-8 tab-delimited text. There is also a corpus directory, which contains the literature used to build the vocabulary and word sense frequencies. These are Chinese and Sanskrit texts from the Buddhist canon and related collections. The corpus files include both part-of-speech (POS) tagged and untagged documents.

  3. Command-line tools for building the vocabulary. These are written in Python and include a POS tagger and an HTML annotation tool.
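Because the data files are UTF-8 tab-delimited text, they can be read with Python's standard `csv` module. The following is a minimal sketch only, not the project's own loader. The column layout (id, headword, pinyin, English gloss), the sample rows, and the `load_entries` helper are all assumptions made for illustration; check the actual data files for the real columns.

```python
import csv
import io

# Hypothetical sample rows. The real column layout is not specified in this
# README; id / headword / pinyin / English gloss is assumed for illustration.
SAMPLE = "1\t佛\tfó\tBuddha\n2\t法\tfǎ\tdharma\n"


def load_entries(stream):
    """Parse tab-delimited rows into a dict keyed by the assumed headword column."""
    # QUOTE_NONE treats quote characters as ordinary text, which suits
    # plain tab-delimited files that do not use CSV-style quoting.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    entries = {}
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        entries[row[1]] = row
    return entries


entries = load_entries(io.StringIO(SAMPLE))
print(entries["佛"][3])
```

To read a file from disk instead of the in-memory sample, pass `open(path, encoding="utf-8", newline="")` to `load_entries`, where `path` is whichever data file you want to inspect.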

The license for the web site and dictionary content is Creative Commons Attribution-Share Alike 3.0. The license for source code and markup templates is Apache 2.0.


Copyright Nan Tien Institute 2013, http://www.nantien.edu.au.
