Wiktionary:About Sanskrit

See also Category:Sanskrit language

(based on Wiktionary:Entry layout)

Note 1: This guide is intended to provide guidelines both for creating Sanskrit entries on English Wiktionary and for adding Sanskrit translations to English words. The main guidelines for creating any entry on English Wiktionary are set forth in Wiktionary:Entry layout; this page is an addition to that page, not a replacement.

Note 2: If a change occurs in the basic Wiktionary template (currently at Wiktionary:Entry layout) that affects Sanskrit entries, then that change should be reflected here.


Scope

"Sanskrit" on Wiktionary refers not only to Vedic and Classical Sanskrit, but also to the broad dialect continuum of Old Indo-Aryan languages that gave rise to the Middle and Modern Indo-Aryan languages. Some literature uses a narrower definition, but for lexicographic simplicity it was agreed to use a broad one per this discussion.

Practically, this means:

  • Reconstructed words are valid Sanskrit entries, like *चिष्ट (ciṣṭa, message). These act as a hub for descendants, like चिट्ठी (ciṭṭhī) and चिठी (ciṭhī). Editors should take care that a term reconstructed from New Indo-Aryan evidence is actually Old Indo-Aryan/Sanskrit and not Middle Indo-Aryan (this is discussed further in #Sanskrit verbs in etymologies of other languages).
  • Entries for Vedic Sanskrit are valid entries. In a term like क्रीळ (krīḷa), ahead of the foremost definition we would write {{lb|sa|Vedic}} to both categorize the term as "Vedic Sanskrit" and display a "Vedic" label.
  • In the absence of the reconstruction asterisk or any "Vedic" label, a term is understood to be Classical Sanskrit, as in क्रीड (krīḍa). Where applicable, classical terms should link to Vedic equivalents in the "Alternative forms" section.

Creating Sanskrit Entries

Script

The Sanskrit language has no single script associated with it. Devanagari is the script that has predominated in India, both historically in the written literature and today. Entries in Wiktionary may be in any of the scripts if there is usage. However, all words should at least have a Devanagari entry.

The same word in other Indic scripts may be referenced under the Alternative scripts header, see WT:ELE. In scripts other than Devanagari, it often suffices to define a term as <Other Script Name> form of <Devanagari Equivalent> with {{sa-sc}}.

In the Translation section of English terms, Sanskrit entries should be presented in Devanagari, e.g. at horse:

* Sanskrit: {{t|sa|अश्व|m}}
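The Devanagari requirement for translation lines can be checked mechanically. The following is a minimal sketch, not an existing Wiktionary tool: the function names `is_devanagari` and `translation_line` are hypothetical, and the check assumes a single-word term made up only of characters from the Unicode Devanagari, Devanagari Extended, and Vedic Extensions blocks.

```python
# Unicode blocks assumed to count as "Devanagari" for this sketch:
# Devanagari, Devanagari Extended, Vedic Extensions.
DEVANAGARI_RANGES = ((0x0900, 0x097F), (0xA8E0, 0xA8FF), (0x1CD0, 0x1CFF))


def is_devanagari(term: str) -> bool:
    """Return True if every character of a non-empty term is in a Devanagari block."""
    return bool(term) and all(
        any(low <= ord(ch) <= high for low, high in DEVANAGARI_RANGES)
        for ch in term
    )


def translation_line(term: str, gender: str = "") -> str:
    """Build a Translations-section line such as '* Sanskrit: {{t|sa|...|m}}'.

    Hypothetical helper: it only assembles the wikitext shown in the guideline
    and refuses terms that are not written in Devanagari (e.g. IAST).
    """
    if not is_devanagari(term):
        raise ValueError(f"not a Devanagari term: {term!r}")
    parts = ["t", "sa", term]
    if gender:
        parts.append(gender)
    return "* Sanskrit: {{" + "|".join(parts) + "}}"


print(translation_line("अश्व", "m"))
```

A transliterated input such as "aśva" is rejected with a `ValueError`, which mirrors the rule that translations are given in Devanagari rather than in IAST.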

Accent Diacritics

The headword/inflection line should show the Devanagari or other Indic script, with the IAST transliteration in parentheses, with optional Vedic accent marks on vowels where present; example at अश्व (áśva):

{{sa-noun|g=m|tr=áśva}}

As pitch accent was lost in Classical Sanskrit, in many cases, the position of the original accent is unknown. In such cases, accent can be ignored.

Formatting entries

Nominal Lemma

For Sanskrit nominals, the lemma form is the stem (in the case of adjectives, the masculine stem). See here for an overview of Sanskrit declension. For instance, for a-stems, the lemma ends in -a (not the nominative -aḥ).

Active participles are sometimes given in dictionaries as ending in either -at or -ant. Our policy is to use -at for the lemma, as in भवत् (bhavat, present); the -ant form can be optionally listed here as an "Alternative form". Similarly, perfect participles should end in -vāṃs, like विद्वांस् (vidvāṃs, understanding).

Root Lemma

Following the style of Proto-Indo-European and related languages, the Sanskrit "root" is a basic unit of meaning from which verbal and nominal forms are derived.

The root is often in the zero-grade, like भृ (bhṛ), but can sometimes be in other grades, like धा (dhā). In some printed dictionaries where compactness is required, the symbol √ signifies that a term is a Sanskrit root, like √dā. Because there is no such space constraint in Wiktionary, the symbol should always be avoided. Instead, refer to roots like "from the root Sanskrit दा (dā)", etc.

Wiktionary makes some distinctions between "roots" and "verbs" that some dictionaries, like Monier-Williams' Sanskrit-English dictionary, do not. Some roots, like दा (dā), are "true" non-prefixed roots: they have a number of verbal/nominal derived forms and have an entry in dictionaries like Whitney's dictionary of roots. The current practice is to also include prefixed roots, like संधा (saṃdhā), as valid "root" entries.

Some other given "roots" in dictionaries like Monier-Williams essentially only correspond to the present-tense verbal forms. In such cases, there is no Wiktionary root, and the verbal form is given without a root, like खणखणायते (khaṇakhaṇāyate). This distinction can be blurry. In general, it is best to include just a verbal lemma and only add a root lemma if one knows what they are doing.

Verbal Lemma

Sanskrit verbs are lemmatised in the third-person singular present active indicative. As discussed here, the following are valid "verbal" lemmas:

  • A third-person present tense, e.g. भरति (bharati, to bear). If the verb is ubhayapada (i.e. has both parasmaipada/active and atmanepada/medio-passive forms), the parasmaipada form is the lemma and the atmanepada form is a non-lemma. If the verb is only atmanepada, then that atmanepada form becomes the lemma.

    The current practice of Wiktionary is to define Sanskrit third-person verbs (particularly in the present tense) in terms of the English infinitive. For instance, we would say भरति (bharati, to bear) and not भरति (bharati, bears), even though भरति (bharati) more literally means "(it) bears". This is simply by dictionary convention and for ease of referencing Sanskrit terms in the etymologies of other language terms, which are usually in the infinitive (e.g. Hindi भरना (bharnā)).
  • A third-person future tense, e.g. भरिष्यति (bhariṣyati, future of भृ (bhṛ))
  • A third-person periphrastic future tense, e.g. गन्ता (gantā, periphrastic future of गम् (gam))
  • A third-person aorist past tense, e.g. अगमत् (agamat, aorist of गम् (gam))
  • A third-person benedictive tense, e.g. तप्यात् (tapyāt, benedictive of तप् (tap))
  • A third-person perfect tense, e.g. जगाम (jagāma, perfect of गम् (gam))
  • The equivalent third-person forms of secondary conjugations like the passive, causative, desiderative, and intensive, like गम्यते (gamyáte, passive of गम् (gam)), गमयति (gamáyati, causative of गम् (gam)), and अजीजनत् (ajījanat, causative aorist of जन् (jan))

Notably, this means that the imperfect is non-lemma (the present tense is the lemma) and the conditional is non-lemma (the future is the lemma), among others.

In the non-present forms, usually a definition like "perfect of जन् (jan)" produced by {{inflection of|sa|जन्}}, along with the conjugation given by {{sa-conj}} suffices for a definition.
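The lemma rules above can be summarised as a small lookup. This is only an illustrative sketch: the names `LEMMA_TENSES`, `NON_LEMMA_PARENT`, `lemma_tense`, and `lemma_voice` are hypothetical, and the non-lemma table lists only the two cases named explicitly in this section (the text notes there are others).

```python
# Tenses whose third-person form is itself a valid verbal lemma.
LEMMA_TENSES = {
    "present",
    "future",
    "periphrastic future",
    "aorist",
    "benedictive",
    "perfect",
}

# Non-lemma tenses mapped to the tense of their lemma (deliberately incomplete).
NON_LEMMA_PARENT = {
    "imperfect": "present",
    "conditional": "future",
}


def lemma_tense(tense: str) -> str:
    """Return the tense under which a form of the given tense is lemmatised."""
    if tense in LEMMA_TENSES:
        return tense
    if tense in NON_LEMMA_PARENT:
        return NON_LEMMA_PARENT[tense]
    raise ValueError(f"tense not covered by this sketch: {tense!r}")


def lemma_voice(has_parasmaipada: bool, has_atmanepada: bool) -> str:
    """Pick the voice of the lemma: parasmaipada if it exists, else atmanepada."""
    if has_parasmaipada:
        return "parasmaipada"
    if has_atmanepada:
        return "atmanepada"
    raise ValueError("a verb needs at least one attested voice")


print(lemma_tense("imperfect"), lemma_voice(True, True))
```

Secondary conjugations (passive, causative, desiderative, intensive) are left out of the sketch; their third-person forms are lemmas in their own right.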

All Lemmas

If a root form exists for some lemma, it should be linked to in the headword template (e.g. {{sa-noun}}, {{sa-adj}}, {{sa-verb}}, etc.). This will categorize the word appropriately. {{root|sa|inc-pro}}, {{root|sa|iir-pro}}, and {{root|sa|ine-pro}} should also be used in the "Etymology" section where appropriate.

Inflection

The template {{inflection of}} identifies the lemma form and particular inflected form of the entry.

Noteworthy non-lemma forms

All infinitives, gerundives, and past passive participles are considered non-lemma forms of the root in Sanskrit. Active and medio-passive participles are considered non-lemma forms of their respective verbal lemmas.

Frequently, active/medio-passive/passive participles are also considered "adjectives" or "nouns" in their own right. In such cases, like कृत (kṛta), there should be a participle section (non-lemma) defined with {{inflection of}} and an adjective/noun section (lemma) with the other relevant definitions and inflections.

Quotations

Sanskrit literature chronologically encompasses more than three millennia of written and oral record. As such, owing especially to its detachment from the spoken language after the codification of Classical Sanskrit by Pāṇini around the 5th century BCE, Sanskrit words came to develop a plethora of often widely divergent meanings. Some of these are confined to a particular chronological period, a particular literary style, or a particular author, work, or tradition. All of these meanings merit inclusion per the criteria for inclusion for extinct languages. Monier-Williams' Sanskrit-English dictionary employs several hundred abbreviations, listed after a particular semantic group (which itself corresponds to a single Wiktionary definition line), for this purpose. Wiktionary employs the same set of abbreviations, by means of a quotation provided by the {{Q}} template, which accepts the abbreviation without the final dot and automatically fills in the metadata from Module:Quotations/sa/data (which can be expanded as necessary).

Such abbreviations should come bulleted following every definition line. For example, the second definition line of दृष्टि (dṛṣṭi) is in the Monier-Williams dictionary given as:

sight, the faculty of seeing, ŚBr.; Mn.; Suśr. &c;

which translates into Wiktionary syntax as:

# [[sight]], the faculty of seeing
#* {{Q|sa||ŚBr}}
#* {{Q|sa||Mn}}
#* {{Q|sa||Suśr}}
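The conversion from dictionary abbreviations to quotation lines is mechanical, so it can be sketched in a few lines. This is a hypothetical helper (`q_lines` is not an existing tool); it only strips the final dot and wraps each abbreviation in the {{Q}} call shown above, and it does not check that the abbreviation is present in Module:Quotations/sa/data.

```python
def q_lines(abbreviations, lang="sa"):
    """Turn dictionary abbreviations like 'ŚBr.' into bulleted {{Q}} lines.

    The final dot is dropped, as the {{Q}} template expects the abbreviation
    without it. The second template parameter is left empty, as in the example.
    """
    lines = []
    for abbreviation in abbreviations:
        cleaned = abbreviation.strip().rstrip(".")
        if cleaned:  # skip empty fragments
            lines.append("#* {{Q|" + lang + "||" + cleaned + "}}")
    return lines


for line in q_lines(["ŚBr.", "Mn.", "Suśr."]):
    print(line)
```

Trailing markers such as "&c" in the printed dictionary are not abbreviations of works and would need to be dropped by hand before calling the helper.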

Formatting References

See the #References section below for specific details on good Sanskrit references.

This section always appears at level 3 as ===References===. It should conclude the language section, and should never be placed within any subheader. It will include all references for the Sanskrit section as a group. If there are multiple etymologies corresponding to different terms that are homonyms, do not include a separate level 4 ====References==== section for each of the different words; instead, use the "<ref>" tag to reference specific citations throughout the subsections and use "<references/>" under the level 3 ===References=== section.

Transliteration

The standard transliteration system for Sanskrit on Wiktionary is exclusively IAST; the dozen or so other commonly used transliteration schemes, such as Harvard-Kyoto or ISO 15919, are not permitted. Transliterations shall appear in the inflection line with the tr= parameter, and everywhere else where they are commonly used, such as when a term is mentioned in prose with {{m}}. Transliterations are not mandatory for listings of Sanskrit lexemes, such as inside ====Related terms==== or appendices.

Entries written in IAST transliterations shall not appear in the main namespace. Commonly used English terms originating from Sanskrit that approximately correspond to transliterated Devanagari are subject to WT:CFI for English lexemes, and as such shall be formatted under ==English== rather than ==Sanskrit== L2 headers.

References

For reference purposes the following templates are available for dictionaries that are out of copyright and freely available on various places on the Web:

  • {{R:MW}} – the popular Monier-Williams' Sanskrit-English dictionary. This template accepts a single unnamed parameter: the page number in 4-digit format. So, for example, page 1 is referenced as {{R:MW|0001}}, page 234 as {{R:MW|0234}}, page 1234 as {{R:MW|1234}}, and so on.
  • {{R:Cappeller Sanskrit-English}} or {{R:CAP}} – Cappeller's dictionary. See the template page for instructions.
  • {{R:MCD}} – Macdonell's dictionary (1929 reprint). This template accepts a single unnamed parameter: the page number in 3-digit format.
  • {{R:WIL}} – Wilson's dictionary. This template accepts a single unnamed parameter: the page number in 3-digit format.

For example, the entry on अंश (áṃśa) has the following ===References=== section:

===References===
{{R:MW|0001}}
{{R:CAP|001}}
{{R:MCD|001}}
{{R:WIL|001}}
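The zero-padding of page numbers is easy to get wrong by hand. Below is a minimal sketch of a hypothetical helper, `reference_call`, that pads the page number to the width each template expects. Only the three templates whose format is stated above are covered; {{R:CAP}} is omitted because its parameters are documented on the template page rather than here.

```python
# Page-number widths as described in the list above.
PAGE_WIDTHS = {"R:MW": 4, "R:MCD": 3, "R:WIL": 3}


def reference_call(template: str, page: int) -> str:
    """Return a template call with the page number zero-padded to the right width."""
    if template not in PAGE_WIDTHS:
        raise ValueError(f"width not known for template {template!r}")
    width = PAGE_WIDTHS[template]
    if page < 1 or len(str(page)) > width:
        raise ValueError(f"page {page} does not fit in {width} digits")
    return "{{" + f"{template}|{page:0{width}d}" + "}}"


print(reference_call("R:MW", 234))
```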

Sanskrit verbs in etymologies of other languages

Distinction between Sanskrit and Middle Indo-Aryan in reconstruction

"Sanskrit" in Wiktionary actually refers to a dialect continuum of Old Indo-Aryan languages (see #Scope). {{R:CDIAL}} is one of the best resources for reconstructing based on New and Middle Indo-Aryan, but is not always clear about whether the reconstructed term is early Middle Indo-Aryan (which Wiktionary calls Ashokan Prakrit) or Old Indo-Aryan (which Wiktionary calls Sanskrit). Between Old Indo-Aryan and Middle Indo-Aryan, there are a few key changes:

  • In Middle Indo-Aryan, palatal ś, retroflex ṣ, and dental s are merged into dental s.
  • In Middle Indo-Aryan, the syllabic vowels ṛ and ḷ are not found.
  • Middle Indo-Aryan does not allow complex conjuncts or onsets. Only one consonant can appear at the start of a word, and the only consonant clusters allowed medially are geminated consonants or a sequence of a nasal and a stop at the same place of articulation.
  • Middle Indo-Aryan does not allow superheavy syllables. We would not find a sequence of a long vowel and a coda consonant. For this reason, the overlong vowels ai and au are not found in Middle Indo-Aryan in favour of e and o.
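The changes listed above can be turned into a rough screening aid for IAST reconstructions. The sketch below is a heuristic under stated assumptions, not an authoritative test: all names (`segments`, `allowed_in_mia`, `old_indo_aryan_features`) are hypothetical, only ṛ and ṝ are treated as syllabic (ḷ is skipped because the same letter also writes the Vedic consonant in krīḷa), any written "ai" or "au" is counted as a diphthong, and the superheavy-syllable rule is not checked beyond ai and au. A flagged form points to Old Indo-Aryan; an empty result only means that none of these features was found.

```python
import unicodedata


def nfc(text):
    """Normalise to precomposed characters so that 'ṣ', 'ṭ' etc. are single code points."""
    return unicodedata.normalize("NFC", text)


ASPIRATES = {nfc(s) for s in ("kh", "gh", "ch", "jh", "ṭh", "ḍh", "th", "dh", "ph", "bh")}
DIPHTHONGS = {"ai", "au"}
SYLLABIC = set(nfc("ṛṝ"))
VOWELS = set(nfc("aāiīuūeēoō")) | SYLLABIC | DIPHTHONGS
SIBILANTS = set(nfc("śṣ"))
# Nasals and the (first letters of the) stops made at the same place of articulation.
HOMORGANIC_STOPS = {nfc(k): nfc(v) for k, v in
                    {"ṅ": "kg", "ñ": "cj", "ṇ": "ṭḍ", "n": "td", "m": "pb"}.items()}


def segments(word):
    """Split an IAST word into sounds, keeping aspirates and ai/au as units."""
    text = nfc(word.lower().lstrip("*"))
    out, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        step = 2 if pair in ASPIRATES or pair in DIPHTHONGS else 1
        out.append(text[i:i + step])
        i += step
    return out


def allowed_in_mia(cluster):
    """A non-initial consonant run is fine if single, geminate, or nasal + homorganic stop."""
    if len(cluster) == 1:
        return True
    if len(cluster) > 2:
        return False
    first, second = cluster
    if first == second[0]:  # geminate, e.g. ll or ṭṭh
        return True
    return second[0] in HOMORGANIC_STOPS.get(first, "")


def old_indo_aryan_features(word):
    """List the features of a form that are not expected in Middle Indo-Aryan."""
    segs = segments(word)
    found = []
    if any(s in SIBILANTS for s in segs):
        found.append("sibilant ś or ṣ")
    if any(s in SYLLABIC for s in segs):
        found.append("syllabic ṛ")
    if any(s in DIPHTHONGS for s in segs):
        found.append("ai or au")
    run, start = [], 0
    for index, seg in enumerate(segs + ["a"]):  # sentinel vowel flushes the last run
        if seg in VOWELS:
            if run and ((start == 0 and len(run) > 1) or not allowed_in_mia(run)):
                found.append("consonant cluster " + "".join(run))
            run = []
        else:
            if not run:
                start = index
            run.append(seg)
    return found


print(old_indo_aryan_features("*ardhapūraka"))
print(old_indo_aryan_features("bollati"))
```

The check works on transliteration only and knows nothing about attestation, so its output is a prompt for closer inspection rather than a classification.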

For example, for Hindi अधूरा (adhūrā, incomplete), CDIAL gives a reconstructed ancestor *ardhapūraka, which is Old Indo-Aryan as it contains the complex consonant cluster rdh. We therefore class it as reconstructed Sanskrit *अर्धपूरक (ardhapūraka). For Hindi बोलना (bolnā), CDIAL gives a reconstructed ancestral root bōll (with the equivalent present tense lemma form bollati), which is phonetically valid for Middle Indo-Aryan. Hence, it would be more correct to label the ancestor of the Hindi word as Ashokan Prakrit *𑀩𑁄𑀮𑁆𑀮𑀢𑀺 (*bollati) from the root *𑀩𑁄𑀮𑁆𑀮𑁆 (*boll), rather than Sanskrit *बोल्लति (*bollati) in the absence of an attested form. The Old Indo-Aryan ancestor of बोलना (bolnā) would simply be considered unclear in such a case.

For many other words in descendant languages like Hindi, Bengali, and Marathi, there is a clear, attested ancestor in Sanskrit. In such cases, most of the above advice can be disregarded and the Sanskrit term is given as the ancestor to the modern language.

This situation is made much more complex by the Sanskritization of Prakrit forms and by hyper-Sanskritization (i.e. hypercorrection of Middle Indo-Aryan Prakrit forms in an attempt to "reconstruct" the Sanskrit form). As a general rule, words should not be given as inherited or derived from hyper-Sanskritized forms. In a few cases, the lines of what inherits from what are not entirely clear. It may be helpful to use {{rfe|sa}} or start a discussion on the Talk page of the verb.

Etymologies

As with other Wiktionary languages, apply the principle of "translate lemmas with lemmas":

  • For verbs, the descendant verb should link to the Sanskrit lemma (the third-person form) glossed with the English infinitive. As an example, an etymology for Hindi भरना (bharnā, to bear) is given as Sanskrit भरति (bhárati, to bear) and not Sanskrit भरण (bharaṇa, filling), though the latter is more correctly the "ancestor" for the literal Hindi lemma form. This is because the actual verbal forms of the Hindi lemma term are descendants of the forms of the Sanskrit verb.
  • In the case of learned borrowings, inherited nominals, and other terms from Sanskrit, the lemma should similarly be used. For instance, Hindi भरण (bharaṇ) descends from Sanskrit भरण (bharaṇa). A valid etymology for Hindi प्राणी (prāṇī) should clearly link the Sanskrit प्राणिन् (prāṇin) lemma form, but may mention that the Hindi term is derived more specifically from the masculine nominative singular प्राणी (prāṇī). It would be incorrect to only mention Sanskrit प्राणी (prāṇī).

Additional help

Help from the community

Sometimes, we know there is a problem, but don't know what to do to correct the problem. If you should find a Sanskrit entry with a problem that you do not know how to correct, there are several ways to approach the situation.

  1. Mark the page with {{attention|sa}}. This template will add the entry to Category:Requests for attention concerning Sanskrit, where another user can then find and correct the problem. It helps if you include comments on the entry's talk page explaining what the problem is or why you think the page needs attention.
  2. Raise the issue on Wiktionary talk:About Sanskrit. Note that this approach is primarily for issues of style, formatting, categorization, and not for specifics of content.
  3. Mark the page with {{rfc}}. This is a more general cleanup tag, and it allows the user to include reasons or concerns as an argument in the template. Be sure to also add an entry to WT:RFC concerning the word so that other editors will be made aware of the problem.

Other Sanskrit aids