Wiktionary:Lemmas

This is a Wiktionary policy, guideline or common practices page. This is a draft proposal. It is unofficial, and it is unknown whether it is widely accepted by Wiktionary editors.

When a word has multiple distinct forms, the lemma is the main entry at which the definitions, etymology, inflections and such are placed. All other forms of the word are non-lemma forms, and the entries for these forms generally only contain a link to the lemma form. For example, walk is the lemma of an English verb (the "bare infinitive"—"to walk" without the "to"), while walked is a non-lemma form. Lemmas and non-lemma forms are categorized separately: Category:English lemmas for lemma forms and Category:English non-lemma forms for non-lemma forms; likewise for other languages.
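
The categorization rule above can be sketched in a few lines of Python. This is only an illustration: the `ENTRIES` table and the function names are hypothetical and are not part of any Wiktionary template or module. Only the category naming pattern and the walk/walked example come from the text.

```python
def lemma_category(language: str, is_lemma: bool) -> str:
    """Build the category name following the pattern described above."""
    kind = "lemmas" if is_lemma else "non-lemma forms"
    return f"Category:{language} {kind}"


# Hypothetical miniature entry table: every form records which form is its lemma.
ENTRIES = {
    "walk": {"language": "English", "lemma": "walk"},
    "walked": {"language": "English", "lemma": "walk"},
}


def categorize(form: str) -> str:
    """A form is a lemma exactly when it is recorded as its own lemma."""
    entry = ENTRIES[form]
    return lemma_category(entry["language"], entry["lemma"] == form)
```

With this table, `categorize("walk")` yields the lemma category and `categorize("walked")` yields the non-lemma category.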

Choosing the lemma

For many words across languages, there is only one form, so that form automatically becomes the lemma. When there are multiple forms, the lemma is usually the most basic form, but not necessarily. Different languages have different conventions.

Nouns

For nouns, the lemma is normally the form that is used as the singular subject of an intransitive verb. For languages with a case system, this is the nominative or absolutive case form.

The following languages use a different form as the noun lemma:

  • Old French: objective singular
  • Primitive Irish: genitive singular (for most nouns, the only attested form)
  • Sanskrit: noun stem
  • Welsh: usually the singular, but the plural in cases where the singular is a singulative derived from the plural by suffixation (e.g. the lemma is adar (birds), not aderyn (a bird)).
  • Berber languages: free state singular
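
One way to read the noun rule is as a default with a short table of exceptions. The Python sketch below encodes it that way. The names `DEFAULT_NOUN_LEMMA`, `NOUN_LEMMA_EXCEPTIONS` and `noun_lemma_form` are made up for illustration, and the descriptive strings are paraphrases of the list above rather than official labels.

```python
# Default from the text: the form used as the singular subject of an
# intransitive verb (nominative or absolutive where a case system exists).
DEFAULT_NOUN_LEMMA = "singular subject form (nominative or absolutive)"

# Exceptions, paraphrased from the list above.
NOUN_LEMMA_EXCEPTIONS = {
    "Old French": "objective singular",
    "Primitive Irish": "genitive singular",
    "Sanskrit": "noun stem",
    "Welsh": "singular, or plural when the singular is a singulative",
    "Berber languages": "free state singular",
}


def noun_lemma_form(language: str) -> str:
    """Look up the noun lemma convention, falling back to the default."""
    return NOUN_LEMMA_EXCEPTIONS.get(language, DEFAULT_NOUN_LEMMA)
```

A language that is not listed, such as English, simply falls through to the default.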

Adjectives

For adjectives, the lemma is chosen as it is for nouns. If a language has genders, the masculine form is usually chosen. In some languages adjectives have distinct predicative and attributive forms; which of these serves as the lemma can be clarified here. Note that in many languages adjectives are stative verbs, which could be translated as "to be X".

The following languages use a different form as the adjective lemma:

  • Berber languages: free state masculine singular
  • Dutch: predicative (uninflected) form
  • German: predicative (uninflected) form
  • Korean: predicative form, the same as for verbs
  • Sami languages (except Southern Sami): predicative form in nominative singular, attributive form otherwise
  • Sanskrit: basic stem
  • Southern Sami: attributive
  • Xhosa: basic stem
  • Zulu: basic stem

Verbs

For verbs there is more variation in the form chosen as the lemma. For many languages, the infinitive is used as the lemma, but many languages have no infinitive, so another form is chosen. If the infinitive is assumed to be the default lemma form, the following lists contain the languages that deviate from it.

Multiple infinitives

The following languages have multiple infinitives, so it is specified which of them is the lemma:

  • Estonian: ma-infinitive
  • Finnish: first infinitive
  • Ido: present infinitive
  • Võro: ma-infinitive

First-person singular present active indicative

The following languages use the first-person singular form of the present tense, active voice, indicative mood. Not every language distinguishes all of these categories.

  • Albanian: first-person singular present indicative
  • Ancient Greek: first-person singular present active indicative
  • Bulgarian: first-person singular present
  • Greek: first-person singular present indicative
  • Latin: first-person singular present active indicative (first principal part)
  • Old Armenian: first-person singular present indicative
  • Proto-Hellenic: first-person singular present active indicative (as in Ancient Greek)
  • Proto-Italic: first-person singular present active indicative (as in Latin)

Third-person singular present active indicative

As above, but in the third person.

  • Avestan: third-person singular present active indicative
  • Hungarian: third-person singular indefinite present
  • Proto-Celtic: third-person singular indicative (as in Old Irish)
  • Proto-Indo-European: third-person singular indicative (as in Proto-Celtic, Proto-Indo-Iranian)
  • Proto-Indo-Iranian: third-person singular indicative (as in Avestan, Sanskrit)
  • Quechua: third-person singular present
  • Sanskrit: third-person singular present active indicative
  • Sioux: third-person singular present
  • Macedonian: third-person singular present
  • Navajo: third-person singular present
  • Ojibwe: third-person singular present
  • Old Irish (and Middle Irish): third-person singular present absolute/deuterotonic
  • Wauja: third-person singular present
  • Yup'ik: third-person singular present

Root/stem

The following languages use the basic root or stem of the verb as the lemma. This form may or may not actually exist as a real word.

  • Cherokee: bare root, without prefixes, suffixes, or anything else (e.g. -e- (go)).
  • Lingala: basic stem
  • Old Turkic: basic stem
  • Proto-Uralic: basic stem
  • Swahili: indicative root form of the verb (e.g. -peleka (to send)).
  • Xhosa: basic stem
  • Zulu: basic stem

Others

  • Arabic: third-person masculine singular past (perfect)
  • Berber languages (including Kabyle, Central Atlas Tamazight, Tuareg, etc.): second-person singular aorist imperative
  • Hebrew: third-person masculine singular past (perfect)
  • Irish: second-person singular imperative
  • Japanese: the non-past tense (verbs have no person, gender, or number)
    This is the conclusive form, known in Japanese as 終止形 (shūshi-kei), which is unique to each verb. Some also call it, colloquially, the dictionary form (辞書形). In the terminology of the Wikipedia article on Japanese grammar, it is the terminal form.
  • Korean: infinitive (i.e. ending in 다 -da)
    Note that the term "Korean infinitive" is generally (by Martin and others) taken to mean the 하여 (hayeo), 와 (wa), etc. form, which is sometimes called the "polite stem". The form ending in 다 is usually just called the "dictionary form", though 기본형 (gibonhyeong, "basic form") also has some currency.
  • Welsh (and Middle Welsh): verbal noun
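
The verb conventions in this section can be summarized as a lookup with the infinitive as the fallback. The sketch below is a hypothetical Python encoding, not an existing Wiktionary data module. The group labels reuse the subsection titles, individual bullets may be more specific than their group label, and names such as `VERB_LEMMA_GROUPS` and `verb_lemma_form` are invented for illustration.

```python
# Languages grouped under the subsection titles used above.
VERB_LEMMA_GROUPS = {
    "first-person singular present active indicative": [
        "Albanian", "Ancient Greek", "Bulgarian", "Greek", "Latin",
        "Old Armenian", "Proto-Hellenic", "Proto-Italic",
    ],
    "third-person singular present active indicative": [
        "Avestan", "Hungarian", "Proto-Celtic", "Proto-Indo-European",
        "Proto-Indo-Iranian", "Quechua", "Sanskrit", "Sioux", "Macedonian",
        "Navajo", "Ojibwe", "Old Irish", "Middle Irish", "Wauja", "Yup'ik",
    ],
    "root/stem": [
        "Cherokee", "Lingala", "Old Turkic", "Proto-Uralic",
        "Swahili", "Xhosa", "Zulu",
    ],
}

# Languages whose convention is stated individually in the lists above.
VERB_LEMMA_INDIVIDUAL = {
    "Estonian": "ma-infinitive",
    "Finnish": "first infinitive",
    "Ido": "present infinitive",
    "Võro": "ma-infinitive",
    "Arabic": "third-person masculine singular past (perfect)",
    "Hebrew": "third-person masculine singular past (perfect)",
    "Berber languages": "second-person singular aorist imperative",
    "Irish": "second-person singular imperative",
    "Japanese": "non-past (conclusive form)",
    "Welsh": "verbal noun",
    "Middle Welsh": "verbal noun",
}

# Invert the groups into a flat language -> convention mapping.
VERB_LEMMA_BY_LANGUAGE = {
    language: label
    for label, languages in VERB_LEMMA_GROUPS.items()
    for language in languages
}
VERB_LEMMA_BY_LANGUAGE.update(VERB_LEMMA_INDIVIDUAL)


def verb_lemma_form(language: str) -> str:
    """Return the listed convention, assuming the infinitive by default."""
    return VERB_LEMMA_BY_LANGUAGE.get(language, "infinitive")
```

Korean is left out of the table on purpose, since the note above shows that the label for its lemma form is itself disputed. It therefore falls back to "infinitive", matching the bullet's own wording.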