Talk:mahā

Noun

Does the noun have the same etymon as the adjective? If so, and if महा has "no meaning on its own", how did it come to be used as a standalone noun in the citation provided? Was the translator/author of that citation shortening a longer Sanskrit term? - -sche (discuss) 05:12, 9 February 2015 (UTC)

According to the Monier-Williams dictionary, as a noun in Sanskrit it means "cow", as well as being a shorthand for mahā́-kaṅkara "a particularly high number". It could be taken from Pali/Prakrit, or substantivized in English (e.g. from various mahā-X compounds meaning something and subsequently referred to as mahās). Given its unclear origin, it is best grouped under the same etymology unless evidence to the contrary arises. --Ivan Štambuk (talk) 15:54, 15 February 2015 (UTC)

... which meaning, "cow", is clearly not the meaning used in any of the citations. Moreover, it is not relevant to the given etymology (i.e. combining form of महत् (mahat, “great, big, high”)). ‑‑ Eiríkr Útlendi │ Tala við mig 02:51, 16 February 2015 (UTC)

I would guess that if mahā has a noun meaning, then it came to have this the same way that great came to have a noun meaning in English. bd2412 T 05:21, 9 February 2015 (UTC)

The purported noun in the cited text has no clear meaning. The text is apparently a translation from Tibetan, with numerous terms left untranslated.

Iff we are to accept this text as a valid representation of only English, and discard any possibility that some of the words in this text might not be English, then we must also create English entries for krīya, anu, ati, ubhaya...

Ivan himself stated over at Wiktionary:Requests for deletion:

What you found in English text is Sanskrit excerpts. That kind of format is common with classical and religious languages where "original and proper" enunciation plays an important role in setting the stage for supernatural mysticism in the minds of followers, as well as for reducing the possibility for mistranslation by enhancing the reader's understanding of the text in its original form.

I posit that these untranslated terms are similarly untranslated Sanskrit excerpts in an otherwise English text, left untranslated for setting the stage for supernatural mysticism in the minds of followers, as well as for reducing the possibility for mistranslation by enhancing the readers understanding of the text in its original form. As evidence for this possibility, the first appearance of [[mahā]] in Luminous Essence: A Guide to the Guhyagarbha Tantra is this snippet on page 3:

This is also the reasoning behind the subdivisions of the Nyingma School's mantra scriptures, such as the classification of mahāyoga into three parts, starting with the mahā of mahā.

The next appearance of [[mahā]] is on page 5 in the quote given in the citation. The next appearance is on page 23, where [[mahā]] is listed as one of “the three vehicles of mastery in means (mahā, anu, and ati).”

That's it. The term only appears five times in three separate sentences, and it is never used in a way where either its meaning or its "Englishness" is clear. ‑‑ Eiríkr Útlendi │ Tala við mig 06:15, 9 February 2015 (UTC)

I posit that these untranslated terms are similarly untranslated Sanskrit excerpt - No, they are English words, not untranslated Sanskrit terms. They undergo English morphology (e.g. taking the plural suffix -s) as well as fitting into English sentence syntax. They are used, not mentioned, which makes them English. These terms cannot be translated to English directly because English doesn't have any words to refer to these concepts, other than those Sanskrit borrowings. A similar convention of using transcribed words is followed for Islamic terminology taken from Arabic (see e.g. Encyclopedia of Islam), for Persian (and other Iranian languages) in Encyclopedia Iranica, and I suspect many others as well. For scholarly works it's a matter of precision; for religious works it's a matter of adherence to faith and tradition (specifically for Sanskrit, the study of pronunciation, śikṣā, has a tradition from Vedic times and is the reason why the archaic speech was preserved despite not being written).

Regarding this particular term, I'm not sure that its definition is correct - it was just one of the first results on Google Books. Those kinds of terms have many meanings depending on the author, period and school of thought. I don't really feel like investigating it more thoroughly though. --Ivan Štambuk (talk) 16:14, 15 February 2015 (UTC)

In what way does mahā take English morphology in any of the provided citations? The citations do not show anything of the kind. English morphology is one thing, but for syntax, any term can be used in English syntax, so that proves nothing. In fact, we see a paucity of demonstrative examples even for syntax, as demonstrated by the lack of any examples of usage like very mahā, more mahā, etc. None of the examples suggest anything more than an untranslated Sanskrit term. Wanting something doesn't make it so. ‑‑ Eiríkr Útlendi │ Tala við mig 02:51, 16 February 2015 (UTC)
While mahā specifically does not in any of the cited examples (it's for the adjectival meanings, so it can't take a plural ending), others do, and your blanket statement referred to all such supposedly "untranslated" words originating from Sanskrit: words which happen to take the definite article, plural and possessive endings, become NP heads, etc. However, a quick BGC search for mahas easily reveals plenty of attestations of the nominal meaning in the plural.

Mentioned terms have limited syntactical usages. The moment a foreign-language term becomes used in English, it becomes English. You can't just plug in any term from any language. I mean, you can for some imaginary reason to make a point - but if a sufficiently large number of people do so in durably archived sources, the term becomes English for our practical purposes.

Many (if not most, if you count X's as possessive adjectives) adjectives are in fact not gradable and cannot be used in "very X" or "more X" formations. Specifically, the issue with maha is that it has a meaning of an adjective, but its usage is limited to certain contexts.

Your insistence that maha is an untranslated Sanskrit term and not English, while the Sanskrit महा is not even a word itself (neither orthographically nor phonetically, as opposed to English maha), is simply not supported by logic. --Ivan Štambuk (talk) 21:58, 20 February 2015 (UTC)

Ivan, the term we are discussing here is mahā, NOT maha. The existence of mahas has zero relevance to the mahā entry.

(google books:"mahas" generates somewhere north of 27K hits, if Google is to be believed. That said, skimming through the results suggests that many of these are 1) personal names (“When Mr. Mahas received the November 5 decision ...”), 2) place names (“About 7km NW of Wadi es Sir suburb is the small hillside town of Mahas (Mahis)...”), 3) Semitic (“The "Nubian" type, itself a hybrid one, which dates back to the age of pre-dynastic Egypt, is most purely preserved in the Kenuz, Mahas, and Sukkot...”), 4) French (“le huitiême du mois de mars le grand chef des mahas et toute sa famille est partis d'icy pour retournés ...”), 5) Native American (“US Factory at Fort Osage: Roberdeau and Pepin: Ottoes, loways, Missourias, Pawnees, Mahas, Piankeshaws, Sioux...”), 6) German (“Er durchsetzte ebenfalls den Mahas-Fluß...”). When we do find an instance of mahas in an English context describing something related to India, and thus more likely relevant to the senses at issue here, it is in sentences like “the all-embracing truth-ideation, Mahas, Veda, Drishti, replaces the fragmentary mental activity...” where the term is clearly not the plural of purportedly-English mahā -- i.e., this is yet again a different term and not a relevant citation for mahā.)

You consistently overlook this very simple point. I'm not sure how to communicate with you. Consequently, I think I'm done here. ‑‑ Eiríkr Útlendi │ Tala við mig 22:41, 20 February 2015 (UTC)

maha is just an alternative form of mahā, which is the modern and more "proper" spelling. I already explained that at another discussion. Ordinary dictionaries combine the historical spellings at the headword. The insistence to split them is the side-effect of the anally retentive obsession with spellings here. maha and mahā are the exact same word, and the macron which is an otherwise non-existent diacritic in English changes nothing.

It took a minute to find clear examples of nominal usage: [1], [2]. All are instances of substantivized adjectives, since this is an adjective by origin. In any case, the volume of attestation of this particular word taking a plural ending -s, a word which isn't even formatted as a ===Noun===, does nothing to invalidate your original point, namely that those are untranslated Sanskritisms, and not otherwise normal LWs in English.

Perhaps I'm overlooking your points since you haven't made any of them. You're just clutching at strawmen. The only valid argument I saw against was that it appears to be like mein in mein Kampf - but this argument is refuted by the fact that in most maha X constructs X is also a loanword, that maha in some rare instances also qualifies words of non-Sanskrit origin, and that separate spelling indicates a separate word both phonetically and orthographically as perceived by the speakers - which is not the case with ordinary prefixes - so a separate entry is merited. --Ivan Štambuk (talk) 01:46, 21 February 2015 (UTC)Reply

As other users have noted, usexes for Spelling 1 are not usable as citations for Spelling 2.

Of the two links you've offered, the first only uses the term maha once in an English sentence, on page 240, with the word in reverse italics (the sentence is all italicized except for this word), clearly indicating that the word is not English. Moreover, that is an example of maha, which is irrelevant to showing any evidence of the term mahā as English. The second link is also insufficient, both because it is not the mahā spelling, and because the term is clearly listed in the book's own glossary (scroll down to the listing for page 194), amply demonstrating that even the author did not consider this to be an English word.

My point in all of these various posts, both here and in RFV, is that none of the provided citations demonstrate in any sufficient way that mahā is an English word. Without sufficient citations, this term fails RFV. ‑‑ Eiríkr Útlendi │ Tala við mig 02:34, 21 February 2015 (UTC)

These usexes are not for spelling but for the word. maha and mahā are the same word. It's OK to have a separate set of attestations for each spelling variation in order to "prove" that each exists, but they should all be combined at the main entry, because those differences are trivial and do not involve characters of English alphabet.

Just because the word is italicized doesn't mean it's not English. If it's used as a word in English, it's English. Italicization can indicate all kinds of things: emphasis, a special kind of pronunciation, as well as mentioning words (as opposed to using them), but in this particular example it does not.

Words being listed in a glossary is not an argument against them being English. Many technical and non-technical books have glossaries explaining lesser-known words. Authors don't get to decide whether the word is English or not. Only usage of words is an indication of that.

The third sense already has five citations, three of which are specifically for the spelling with the macron, and thus passes RFV. Now, whether it should be kept or not is a matter for RFD. --Ivan Štambuk (talk) 00:35, 26 February 2015 (UTC)

RfD discussion

The following discussion has been moved from Wiktionary:Requests for deletion.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

I would like to request the restoration, in some form, of mahā, the transliteration of the Sanskrit महा (great). In the course of fixing disambiguation links to this title on Wikipedia, I have found many uses of mahā with this meaning. It is similarly widely used in books. However, searching for it here takes the reader to maha, which has no information on the Sanskrit meaning of the word. Cheers! bd2412 T 17:54, 20 May 2014 (UTC)

We don't do Sanskrit romanised forms. If you want to find a term using this transliteration: 1. Paste or type it in the search window and linger to see suggestions. 2. Select "containing mahā" from the bottom and press Enter or double-click; a search results page will appear. 3. Under "Search in namespaces:", check "None" first, then check (Main), and click "Search" again; this restricts your search to the main namespace. महत् appears 4th in the results. --Anatoli (обсудить/вклад) 02:08, 21 May 2014 (UTC)

I don't think that sort of advice is going to reach the average reader, who is more likely to either type maha into the window, or to type/paste in mahā and hit enter, which will take them to maha. I'm not sure why we wouldn't "do" this unusually well attested romanization. If someone sees this word in English text, they should be able to find it defined here. bd2412 T 02:55, 21 May 2014 (UTC)

(E/C) I was just giving you technical advice on how to reach the entry currently, since searching in Wiktionary and search results keep changing. There's no policy on romanised Sanskrit, AFAIK; even if romanisations are attested, they are not in the native script. E.g. ghar is an attestable transliteration of Hindi घर but we only have घर (there's Irish but no Hindi), yeoksa is an attestable transliteration of Korean 역사 but we only have 역사. I'm just stating the fact, so if mahā is created, any admin may delete it on sight. The policies can be created and changed, though. There are romanisations for some languages with complex scripts. --Anatoli (обсудить/вклад) 03:19, 21 May 2014 (UTC)

We could add matching transliterations to the {{also}} templates. As for whether this entry should be restored, WT:About Sanskrit#Transliterated entries bans transliteration entries, so I oppose unless the Sanskrit editing community decides to change that. — Ungoliant (falai) 03:18, 21 May 2014 (UTC)

The use of {{also}}, as now at maha, seems like a decent idea that respects our prejudices and yet offers the more persistent users at least a way of finding native script entries that provide a useful definition for the transliteration they may have come across, the Wiktionary definition for which they may not find by direct search. DCDuring TALK 03:40, 21 May 2014 (UTC)Reply

I personally have no objections to redirects. --Anatoli (обсудить/вклад) 03:47, 21 May 2014 (UTC)

A redirect from mahā to महा would be fine with me, so long as there are no other meanings of mahā. bd2412 T 12:17, 21 May 2014 (UTC)

I think we should reconsider permitting Latin-alphabet entries for Sanskrit, even if all they say is "Romanization of महा". We already allow Latin-alphabet entries for Pali, Gothic, and some other ancient languages that are usually encountered in Romanization in modern editions. —Aɴɢʀ (talk) 12:27, 21 May 2014 (UTC)

Is it used as a word in any language? Renard Migrant (talk) 18:24, 27 May 2014 (UTC)

According to Google Books, it appears in about 150,000 books. bd2412 T 22:43, 28 May 2014 (UTC)

If it's used as a word in English or any other language, it may get an English or other entry. For romanised Sanskrit, I'm afraid it's a policy question; you'll have to start a separate discussion or a vote. --Anatoli (обсудить/вклад) 22:53, 28 May 2014 (UTC)

Alternative form of maha (“four”) in Tahitian. — Ungoliant (falai) 00:01, 29 May 2014 (UTC)

I would like to see a discussion or policy that says that romanizations of Sanskrit are disallowed. Until then, I consider the above statement "We don't do Sanskrit romanised forms" unsubstantiated. In fact, Wiktionary:Votes/pl-2011-08/Romanization of languages in ancient scripts resulted in 7:4 for the proposal that "If an ancient, no longer living language was written in a script that is now no longer used or widely understood, and it was not represented in another script that still is used or widely understood, then romanizations of its words will be allowed entries." (I wrote 7:4 rather than 8:4, since Ruakh only supported for Gothic.). A subsequent vote Wiktionary:Votes/pl-2011-09/Romanization of languages in ancient scripts 2 unanimously expressly allowed romanizations for Etruscan, Gothic, Lydian, Oscan, and Phoenician.

I found Wiktionary:Beer_parlour/2013/August#Sanskrit_in_Latin_script?. There, a couple of people support allowing Sanskrit romanizations, including Ivan Štambuk (apparently), Angr, Dan Polansky (me), and Eiríkr Útlendi, where Ivan reported User:Dbachmann to support including Sanskrit romanizations as well; opposition seems to include Liliana; Chuck Entz is unclear. --Dan Polansky (talk) 09:33, 1 June 2014 (UTC)

I don't know much about Sanskrit, but I do know that there are tens of thousands of books that use mahā (in that script) to signify a specific word with a specific meaning. I'm not about to suggest that we incorporate the whole transliterated Sanskrit corpus, but it seems absurd to refuse to have a definition for a word used as widely as this one. bd2412 T 15:14, 1 June 2014 (UTC)

I think we should continue to have a consistent (uniform) policy towards romanized Sanskrit. At the moment, that policy is to exclude it. I wouldn't mind reversing that policy and allowing romanized Sanskrit to be entered similarly to romanized Gothic or pinyin Chinese, and the preceding comments suggest that enough other people feel the same way that we should probably have a vote.
Allowing romanizations of some Sanskrit words and not others according to some arbitrary threshold such as "n Wiktionary users think this word is important" or "[we think] this word is used in x books (where x is some very high number, like 10 000)" does not strike me as a workable state of affairs. Google Books' raw book counts are unreliable, as are its attempts to restrict searching to particular languages, so although we might decide to include only romanizations used in e.g. more than 10 000 books, we have no easy way of ascertaining whether or not a romanization actually meets that threshold.
Even if we continue to exclude romanized Sanskrit, it might be possible to cite mahā as a loanword in some language, if it is really as common as has been suggested. - -sche (discuss) 17:11, 1 June 2014 (UTC)

What evidence supports the hypothesis that the current policy is to exclude romanized Sanskrit? Or, put differently, what makes you think and say that the policy is to exclude it? --Dan Polansky (talk) 20:12, 1 June 2014 (UTC)

See WT:ASA. — Ungoliant (falai) 20:16, 1 June 2014 (UTC)

Wiktionary:About Sanskrit is not a policy; it is a policy draft. Furthermore, this is not evidence; a discussion or a vote is evidence of policy. The draft says "Entries written in IAST transliterations shall not appear in the main namespace." which was added in diff. The first edit I can find to that effect is diff, before which the page said "If entries are made under the IAST orthographic transliteration, they should use the standard template {{temp|romanization of}} to reference the Devanagari entry." Since none of the diffs refer to a discussion or a vote, they are illegitimate as means of policy making. --Dan Polansky (talk) 20:31, 1 June 2014 (UTC)Reply

Draft or not, excluding transliterated Sanskrit is the common practice. Start a discussion if you want to change that, or continue refusing to believe it, I don’t care. — Ungoliant (falai) 21:48, 1 June 2014 (UTC)

I asked "What evidence ...". If you had no answer to that question, you did not need to answer; the question was directed to -sche anyway. --Dan Polansky (talk) 05:42, 2 June 2014 (UTC)

If you really want evidence, look for RFD archives of romanised Sanskrit entries. I’m familiar with your strategy of asking people to waste their time looking for this or that and then finding some excuse for why what they found is not valid or outright ignoring it. I’m going to act like CodeCat and not waste my time; as I said, you can continue refusing to believe it. — Ungoliant (falai) 10:32, 2 June 2014 (UTC)

Putting aside the outcomes of previous discussions, what is the reason for not having entries for such things? We are talking about a well-attested word that readers may well look to us to define. bd2412 T 16:21, 2 June 2014 (UTC)

I think the logic is that, insofar as we hold that Sanskrit is not written in the Latin script, mahā is not a Sanskrit word. Compare: insofar as Russian is not written in the Latin script, soyuz is not a Russian word. And mahā (“great”) and soyuz (“union”) have not been shown to be English words, or German/Chinese/etc. words. If mahā is not a word in any language, it is both outside our stated scope ("all words in all languages") and not technically includable anyway: what L2 would it use?

In contrast, महा (mahā) is a Sanskrit word, and is included, and союз#Russian is included.

That said, we have made exceptions for some languages, e.g. Japanese and Gothic, and we have said in effect "even though this language is not natively written in the Latin script, we will allow soft-redirects from the Latin script to the native script for all the words in this language which we include." (Note this is very different from your statement of "I'm not about to suggest that we incorporate the whole transliterated Sanskrit corpus, but [... only] a word used as widely as this one.") I think one could make a strong case that we should make a Gothic-style exception for Sanskrit, since Sanskrit, like Gothic (and unlike Russian), is very often discussed/mentioned (whether or not it is used) in the Latin script. - -sche (discuss) 20:17, 2 June 2014 (UTC)Reply

Even if we admit that "mahā is not a Sanskrit word" (and that is rather questionable since it seems to confuse words with their written forms), it still does not follow that we have a policy that forbids having Sanskrit romanization soft-redirect entries in the mainspace, on the model of Japanese, Chinese and other romanizations (Category:Japanese romaji, Category:Mandarin pinyin). We have had Japanese romanizations for a long time (dentaku was created on 17 August 2005), complete with definitions or translations, since no rogue oligarch bothered or dared to eradicate them (we still have them, albeit in reduced form). Whether we have a policy could be quite important in a possible upcoming vote about Sanskrit romanization, since it is not really clear what the status quo is. Therefore, it is rather important to avoid misrepresentations (unintentional or otherwise) about there being or not being a policy. As for the amount of Sanskrit romanization in the mainspace, there may well be none, which would be a fairly good sign for there being a common practice of avoiding Sanskrit romanizations, but one has to consider that this could be a result of rogue oligarch actions. Generally speaking, I find it hard to find a reason for having Japanese and Chinese romanizations while avoiding Sanskrit romanizations. --Dan Polansky (talk) 08:25, 3 June 2014 (UTC)

@Ungoliant MMDCCLXIV: Re: "I’m familiar with your strategy of asking people to waste their time looking for this or that ...": Not really. You would be familiar with my strategy of asking people to source their claims, supply evidence, clarify the manner in which they use ambiguous terms or explain themselves. Since you already know this strategy (as you say), since you don't like it, and since the question was not directed at you, you should have spared yourself the trouble and avoided answering the question (about evidence for there being policy as opposed to common practice or a draft page that anyone can edit regardless of consensus) that you did not intend to really answer anyway. --Dan Polansky (talk) 07:51, 3 June 2014 (UTC)

I did intend to answer. Not for your benefit, but for that of others who may otherwise be fooled by you into thinking that adding romanised Sanskrit is totally OK. — Ungoliant (falai) 13:00, 3 June 2014 (UTC)

I still see no rationale for excluding a widely used romanization that readers are likely to come across and want defined. There should be some justification beyond the naked assertion of policy or the momentum of past exclusions. bd2412 T 14:01, 3 June 2014 (UTC)

AFAICS, adding romanised Sanskrit is totally OK; there is no discussion or vote the outcome of which is that Sanskrit romanizations shall be excluded from the mainspace. --Dan Polansky (talk) 15:02, 3 June 2014 (UTC)

@BD, re "I still see no rationale": I just explained one rationale (mahā is not a word in any language).
The previous BP discussion linked-to above, and comments in this discussion by people who didn't participate in the previous discussion, suggest that a proposal to allow romanizations of all Sanskrit words would pass. I myself could support such a proposal. I suggest, for the third time, that someone make that proposal.
I do not see any indication that the proposal to allow "widely used romanization[s]" only has gained traction with anyone beyond you and possibly Dan. As you note, quite a lot of momentum is against you: AFAIK, there has never been a language for which we allowed romanizations for only some words according to some threshold of exceptional commonness. AFAIK, there has never even been an alphabetic or abugidic language for which we allowed romanizations for only some words according to the threshold of any citations at all. (If you discovered that one of our Gothic romanizations had 0 attestations at Google Books, Groups, etc, we'd still keep it as long as it was derived from an attested native-script form according to the rules of Wiktionary:Gothic transliteration.)
You could keep trying to overturn this momentum, but — especially given that the only people who still seem to be participating in this discussion are you, me, Ungoliant, and Dan, and we don't seem to be changing each others' minds — I think it would be more productive to grasp the support for allowing all romanized Sanskrit, and run with it. - -sche (discuss) 17:58, 3 June 2014 (UTC)Reply

We generally decide whether any unbroken string of letters is "a word" by looking to see if it is used in print to convey a consistent meaning. We do this because the existence of the word in print is what makes it likely that a reader will come across it and want to know how it is defined, or possibly how it is pronounced, derived, or translated into other languages. There are now a half dozen citations of mahā at Citations:mahā, including several where the word is used in English running text without italicization. In some previous discussions we have used the compromise position of declaring the word to be English, but derived from the language of its original script. I think this is absurd. Is tovarich English, really? bd2412 T 18:33, 3 June 2014 (UTC)Reply

I have posted this at the Beer Parlour. bd2412 T 19:04, 3 June 2014 (UTC)

Yes tovarich is indeed English if it's used in running English text as an English word (for which a citation is provided). Same with mahā - the word originates from Sanskrit but it's not a Sanskrit word in the context of provided citations - it's an English word now because it's used in English. --09:57, 27 July 2014 (UTC)

The above unsigned comment seeks to make the case:

it's an English word now because it's used in English.

That alone is a wholly inadequate reason. I say how natsukashii a certain time of year makes me; that doesn't make natsukashii suddenly English. The whole context must be taken into account: to whom am I speaking? Do I assume that my intended audience is familiar enough with Japanese to understand this term? Or am I being deliberately obtuse in using a word that my audience probably won't know? Or perhaps I introduced this term earlier, and explicitly explained it then. All of this must be taken into account before deciding how "English" any given term is.

Past there, I just had a look at Citations:mahā page. There are currently six citations listed. The first one mentions mahā where it's used as part of a title (the w:Mahabharata), rendering that invalid. The second, third, fifth, and sixth all feel the need to add a gloss for the term in parentheses, clearly indicating that this is not an English word. The fourth citation is the only one that might pass muster, but it's from a quite esoteric text about Tibetan Buddhism. The deeply specialized nature of this text assumes that the reader is intimately familiar with many things related to Tibetan Buddhism and related terminology, and as such, I would characterize this as a case of using Sanskrit terms in an English context where the audience is expected to know the term, and not a use of the term as English.

Delete as an English entry. Per Dan below, possibly keep as an IAST transliteration of Sanskrit महा (mahā), similar to our various other transliteration entries for non-Latin-alphabet languages, like Japanese or Gothic. ‑‑ Eiríkr Útlendi │ Tala við mig 07:49, 24 November 2014 (UTC)

I don't see any reason to exclude any word used in English as not being English. The use of a gloss doesn't clearly indicate that it's not an English word; it's clearly indicating that the word is precise but not necessarily clear. As for the fourth citation, if a term is used in English, even in a specialist context, it's still English.

"This time of year makes me feel natsukashii" does use natsukashii as an English word. Chasing down every bit of code switching is not a fruitful pursuit and we probably do need to have some lines, but I think you're confusing the map for the terrain there.--Prosfilaes (talk) 15:17, 26 November 2014 (UTC)Reply

"This time of year makes me feel natsukashii" does use natsukashii as an English word. I can only say that you and I have very different ideas about the criteria by which any given word belongs to any given language. 17:58, 26 November 2014 (UTC)

I regard "criteria by which any given word belongs to any given language" as problematically treating "word" and "language" as platonic entities. The fundamental question is flawed. If we have a sentence that uses a word unmarked in the English language, then it's using that word as an English word. I have a book before me that says "On Agasha, these include horse, gressh, sleth and skink." Those aren't exactly English words, in the sense that an English speaker would understand them, but what else are they? They, just like natsukashii, are being used in English as English words.--Prosfilaes (talk) 13:42, 28 November 2014 (UTC)Reply

Keep mahā as an IAST transliteration of the Sanskrit महा. (To make my stance clear to a prospective closing admin; my reasoning is above.) --Dan Polansky (talk) 08:46, 27 July 2014 (UTC)

Follow-up question. What do we do with references like:

2014, M. A. Center, Archana Book: with English Translation, page 40:
214 Om mahā pātaka nāśinyai namaḥ
...Who destroys even the greatest of sins.
215 Om mahāmāyāyai namaḥ
...Who is the Great Illusion.
216 Om mahā sattvāyai namaḥ
...Who possesses great sattva.
2010, Anne M. Blackburn, Locations of Buddhism: Colonialism and Modernity in Sri Lanka, page xvii:
Tibbotuvāvē Śrī Siddhartha Sumangala Mahā Nāyaka Thera of the Malvatu Vihāraya and the Ven. Aggamahāpandita Ahungallē Vimalanandatissa Mahā Nāyaka Thera of the Amarapura Mahā Sangha Sabhā is remembered with gratitude.
1975, A. C. Bhaktivedanta Swami Prabhupada, Sri Caitanya-caritamrta, Antya-lila: The Pastimes of Lord Caitanya Mahaprabhu, Text 3.62:
Text 3.62 taṁ nirvyājaṁ bhaja guṇa-nidhe pāvanaṁ pāvanānāṁ śraddhā-rajyan-matir atitarām uttamaḥ-śloka-maulim prodyann antaḥ-karaṇa-kuhare hanta yan-nāma-bhānor ābhāso 'pi kṣapayati mahā-pātaka-dhvānta-rāśim

The latter work also provides a later phrase-by-phrase translation, but only after showing the entire passage as a single block of running text. Also, it does not individually translate "mahā", but translates the phrase of which it is part. Cheers! bd2412 T 20:57, 2 February 2015 (UTC)Reply

In light of the foregoing, I have modified Wiktionary:Votes/pl-2014-07/Allowing well-attested romanizations of Sanskrit to accommodate strings of text rather than individual words, and made it live. bd2412 T 20:43, 4 February 2015 (UTC)Reply

What does that mean for the current [[mahā]] entry? This is now structured as English, which I strongly oppose. Moreover, the quotes in the entry do not illustrate usage as English -- of the four, three are immediately followed by glosses in a way that clearly indicates foreignness and that the author does not expect the reader to know this term, and the fourth shows use in the title of a work of literature, the Mahābhārata, with no other meaning apparent.

Will this be converted into a romanized Sanskrit entry? ‑‑ Eiríkr Útlendi │ Tala við mig 01:04, 5 February 2015 (UTC)Reply

Strictly speaking, there is no reason that a term can not at the same time be a romanization from a different language, and a term that has made its way into English (see sayonara, listed as both). bd2412 T 13:00, 5 February 2015 (UTC)Reply

I agree (with Eirikr) that the citations being used to support the English entry in this case are woefully inadequate; it could be RFVed. But bd2412 is right that in general things can be both romanizations and loanwords; another good example is yin; like sayonara, it even pluralizes in English: yins. - -sche (discuss) 16:31, 5 February 2015 (UTC)Reply

There is one citation at Citations:mahā that uses the word in running English text with no gloss (several times, click "view all" to see). I would imagine that there would be a few more among the 100,000+ Google Books results. This brings us back to the original dilemma. It seems absurd to say that a space-delineated set of characters used to convey a specific meaning across that many sources is not a "word"; so what kind of word is it? bd2412 T 16:46, 5 February 2015 (UTC)Reply

As I previously mentioned, I do not think this particular text (Luminous Essence: A Guide to the Guhyagarbha Tantra) is a good illustration of this term in use as English. This is an esoteric text discussing deep details of certain Buddhist philosophy and practice, and it presupposes that the reader is deeply familiar with the subject matter. There are numerous examples of Sanskrit (or possibly Pali, or Hindi, or ...?) terms that are dropped into the text with no explanation, in ways that are completely opaque to anyone not well-versed in this subject matter. For instance, on page 1, we are apparently greeted with the text Namo gurumañjughoṣāya! I have no idea what this means, and no way of discovering the meaning from the text. Per Prosfilaes' arguments above, this is "English". I argue again, as I did above, that context is key in determining the language of a term in a given text, and that this context goes beyond just the sentence itself to include the work as a whole and the social circumstances of that work, such as the intended audience. Considering the intended audience of this specific text (Luminous Essence: A Guide to the Guhyagarbha Tantra), it is clear that the author assumes that readers have a high level of familiarity with Buddhist terminology, much of which is Sanskrit or Pali. In light of this, it follows that the terms namo and gurumañjughoṣāya (and, indeed, many other domain-specific terms that the broader population of English speakers are unlikely to know) are not English, and are instead Sanskrit, etc. used in a kind of code switching within an otherwise English text.

In short, yes, the book Luminous Essence: A Guide to the Guhyagarbha Tantra does use the term [[mahā]] in otherwise-English text without italics and without any gloss, but no, I do not think this particular work contains valid illustrations of the term [[mahā]] in use as an English term. ‑‑ Eiríkr Útlendi │ Tala við mig 22:17, 5 February 2015 (UTC)Reply

If it's used in running English text, conforming to English syntactical structure, it's English. To what extent embedded and isolated phrases are genuinely foreign language, as opposed to being wholesale loanwords, is debatable, though the opinion of Average English Speaker not well-versed in the subject matter is no measure of wordiness when discussing the work specialized in the very same subject matter. Citations are for attestation purposes i.e. CFI passing, usexes are for illustrative usages. --Ivan Štambuk (talk) 20:51, 6 February 2015 (UTC)Reply

I look forward to your explanation of how "taṁ nirvyājaṁ bhaja guṇa-nidhe pāvanaṁ pāvanānāṁ śraddhā-rajyan-matir atitarām uttamaḥ-śloka-maulim prodyann antaḥ-karaṇa-kuhare hanta yan-nāma-bhānor ābhāso 'pi kṣapayati mahā-pātaka-dhvānta-rāśim" is a sentence in English, and your English-language entries for these words. bd2412 T 20:54, 6 February 2015 (UTC)Reply

That is not English. OTOH, the citations at [[mahā]] are from English works, and are indeed English. --Ivan Štambuk (talk) 21:06, 6 February 2015 (UTC)Reply

That is a citation for mahā - fourth word from the end there. So now mahā is both English and not English? Every one of these words, if found in three "English" texts, can be put in as "English", in your view? bd2412 T 21:08, 6 February 2015 (UTC)Reply

In that particular excerpt, mahā is not a word. It's a part of the compound which is morphologically decomposed with hyphens, which is by itself a single word, and mahā in that spelling is an inseparable part of it. mahā never appears on its own in Sanskrit. What you found in English text is Sanskrit excerpts. That kind of format is common with classical and religious languages where "original and proper" enunciation plays an important role in setting the stage for supernatural mysticism in the minds of followers, as well as for reducing the possibility for mistranslation by enhancing the reader's understanding of the text in its original form. You can find countless similar bilingual excerpts for Latin, Ancient Greek, Hebrew and Arabic in the case of the former, and for almost any other major language for the latter case, particularly in poetry. --Ivan Štambuk (talk) 21:24, 6 February 2015 (UTC)Reply

So if those last three citations are not English, but are "Sanskrit excerpts", then this is Sanskrit, isn't it? Are these not citations for Sanskrit entries, then? bd2412 T 22:19, 6 February 2015 (UTC)Reply

They are Sanskrit, but the current practice is to use Devanagari, so they'd have to be transliterated. --Ivan Štambuk (talk) 22:27, 6 February 2015 (UTC)Reply

I'm not sure what you mean by "they'd have to be transliterated". By whom? By the authors who wrote those books? Is it Wiktionary's role to police authors and admonish them for using the wrong transliteration, or is it our role to accurately report and define terms as they are used in the real world? There are many more authors independently using the same transliteration scheme to represent Sanskrit words in the Latin alphabet. Unless we want to call these Translingual or make up some new category of words to put them in, how are we to represent real-world usage? bd2412 T 22:44, 6 February 2015 (UTC)Reply

They'd have to be transliterated by Wiktionary editors. These excerpts are not real-world usages. Rather, they are transcriptions for the benefit of readers familiar with the Latin script. The author, Rupa Goswami, certainly didn't use IAST. Contemporary speakers of Sanskrit in India don't use it either. --Ivan Štambuk (talk) 23:06, 6 February 2015 (UTC)Reply

How would you define "real-world usages" in a way that excludes usages in books that have actually been published in print? bd2412 T 23:12, 6 February 2015 (UTC)Reply

A related question: how would you treat "voolay" as found in numerous renditions of w:Lady Marmalade lyrics? Is it English? Is it an alternative spelling of French? Chuck Entz (talk) 00:11, 7 February 2015 (UTC)Reply

I don't know how related that is, since we don't have transliterations of words written in the Latin alphabet in the first place. I would consider "voolay" used in English running text to be eye dialect no different from likee or dayum. bd2412 T 00:30, 7 February 2015 (UTC)Reply

Except that some of the people using it aren't trying to caricature anything, they're honestly representing the sounds of French using their own methods, just as someone writing "mahā" is representing the sounds of Sanskrit. If we allow the second, how do we distinguish it from the first? Chuck Entz (talk) 07:24, 9 February 2015 (UTC)Reply

I note that [[mahā]] includes no indication that the entry has been nominated here for RFD. Should we add {{rfd}}? ‑‑ Eiríkr Útlendi │ Tala við mig 02:16, 9 February 2015 (UTC)Reply
Is the entry currently nominated for deletion? If you look at the very top of this thread, it's about undeleting the Sanskrit entry. This thread is also very long and old, so I would suggest beginning a new thread if you want to discuss deleting the English entry. - -sche (discuss) 05:20, 9 February 2015 (UTC)Reply
For the record, I continue to think that the Sanskrit entry should be undeleted. There are, as I have noted a few lines above, lengthier passages that use this transliteration, such that the passage can hardly be called English. We could call half the words in all the world's languages "English" if our standard was that they appeared a few times in English running text. bd2412 T 05:25, 9 February 2015 (UTC)Reply
Has anyone proposed treating those passages as English? Not even Ivan has, as far as I can see.
I think mahā should stay deleted until and unless we formulate a general policy on romanized Sanskrit that allows it, there being nothing unique about this one word that would cause it to deserve special treatment. Formulating such a policy is, of course, complicated by the fact that we appear to have not two factions ("yes, have romanizations", "no, don't have romanizations") but at least four ("romanize Sanskrit systematically, like Gothic", "have ad-hoc, attestation-dependent entries for ad-hoc romanizations", "don't have romanizations", and "don't have romanizations, lemmatize Sanskrit in Latin script because Devanagari is POV"). - -sche (discuss) 06:36, 9 February 2015 (UTC)Reply

@bd2412: We could call half the words in all the world's languages "English"... -- that's precisely the situation that I'm trying to avoid. It's both unhelpful to the user, and patently ridiculous.

@-sche: FWIW, I think we should have very low barriers to having romanizations of any language that does not typically use the Latin script. And if we have romanizations, it should be consistent and language-wide, as with Gothic, or Chinese, or Japanese. ‑‑ Eiríkr Útlendi │ Tala við mig 07:44, 9 February 2015 (UTC)Reply

At this point, it seems fairly obvious that this should be closed as no consensus to restore. If there is no objection, I will close it accordingly. bd2412 T 16:53, 2 March 2015 (UTC)Reply

No consensus to restore the Sanskrit entry. bd2412 T 14:33, 4 March 2015 (UTC)Reply

May 2015 deletion discussion

The following information has failed Wiktionary's deletion process.

It should not be re-entered without careful consideration.

~~mahā~~

rfd-sense: "as a part of Sanskrit compounds with separated parts"

This is in RFV (WT:RFV#mahā), and there was no unanimity about whether the quotations provided for the sense and currently in the entry meet WT:ATTEST, or whether the sense should be deleted. One editor even opined that attestation in use is not in question. To resolve the matter, I am opening an RFD. --Dan Polansky (talk) 07:24, 9 May 2015 (UTC)Reply

Delete. The quotations provided do not attest the sense in use (WT:ATTEST). Furthermore, as an analogy, the use of the word "New York" in a Czech sentence attests "New York" as a Czech word, full with inflection, but does not attest "New" as a Czech word; thus, "Mahā Bhārata" does not attest "mahā" as an English word. --Dan Polansky (talk) 07:26, 9 May 2015 (UTC)Reply

Delete per Dan Polansky. We don't have an English entry for vivre because of joie de vivre. Where would it end? Also, the citations cannot possibly convey meaning, since it doesn't mean anything. Renard Migrant (talk) 11:57, 9 May 2015 (UTC)Reply

The nominated sense does not actually appear to be a definition. The citations could be redistributed to support the first two senses. bd2412 T 22:45, 9 May 2015 (UTC)Reply

Delete as nominated. Re: bd2412's comment above, I continue to argue that the citations as provided are not sufficient to demonstrate that the term in question is actually being used as English, regardless of which sense the citations are added to. ‑‑ Eiríkr Útlendi │ Tala við mig 22:57, 12 May 2015 (UTC)Reply

RFD for the whole English entry. As already amply discussed in the earlier portion of the thread at RFV#mahā and at Talk:mahā. Much like for the nominated sense 3 above, the provided citations fail to demonstrate sufficient Englishness for this to qualify as an English-language term entry. ‑‑ Eiríkr Útlendi │ Tala við mig 05:21, 11 May 2015 (UTC)Reply

Are you proposing to create such an RfD? The editor who nominated the specific sense has not chosen to do so. Also, what then would we do about citations like:

2004, Sushil Mittal, Gene Thursby, The Hindu World, Ch. V:
Classifying texts into mahā and upa appears to be a convenient device to organize the texts in a schematic order.
2008, S. Bodhesako, Beginnings: Collected Essays of S. Bodhesako, page 21:
The evidence is twofold. First, we would expect the Cullavagga to have, if not fewer, at least not more Khandhakas than the Mahāvagga. In the Suttas we often encounter Mahā/Culla pairs, and the Mahā is invariably the longer.
2013, V. Ravi, Understanding and Worshiping Sri Chakra, page 92:
1 represents Mahā Kāmeśvarī (not Parāśakti; Mahā and Kāmeśvarī are two separate words here) and she protects the eastern corner, the point of triangle facing down.

These are all new to the discussion; I will add them to Citations:mahā now. Cheers! bd2412 T 12:21, 11 May 2015 (UTC)Reply

@Eirikr: I propose the following, if I may: 1) please post boldface delete here if you agree that the third sense, nominated above, should be deleted. This makes it so much easier for the prospective closing admin. 2) Postpone any discussion of the whole English entry to a later RFD nomination. This nomination is only to the third sense, the one that lingered in RFV so long. Expanding the nomination here creates confusion, IMHO, decreases focus, and thus creates a disincentive for people to join the discussion. --Dan Polansky (talk) 18:37, 11 May 2015 (UTC)Reply

@Dan: Fair enough. You're right to keep things on target. I've struck my comment above, and will propose a separate RFD in due time. ‑‑ Eiríkr Útlendi │ Tala við mig 22:07, 12 May 2015 (UTC)Reply
- I would prefer to have it as Sanskrit, but thus far the community has insisted that if it is in the Latin alphabet, it must be English. bd2412 T 23:32, 12 May 2015 (UTC)Reply
  - I would also prefer to have it as Sanskrit, defined simply as {{sa-romanization of|महा}}. —Aɴɢʀ (talk) 23:37, 12 May 2015 (UTC)Reply
  - Ditto bd2412 and Angr above. ‑‑ Eiríkr Útlendi │ Tala við mig 04:36, 13 May 2015 (UTC)Reply
    - That was my original suggestion, in the first discussion of this topic. bd2412 T 02:46, 15 May 2015 (UTC)Reply
      - Re: "the community has insisted": That was not the community, that was a significant minority, as per Wiktionary:Votes/pl-2014-07/Allowing well-attested romanizations of Sanskrit. --Dan Polansky (talk) 14:49, 15 May 2015 (UTC)Reply
      - I am wrong; you mean something else. You mean romanizations in the middle of English text. --Dan Polansky (talk) 16:11, 15 May 2015 (UTC)Reply
        The community is schizophrenic on this issue. A majority want some kind of entry to exist for mahā, but there is not a "consensus" majority to have an English entry and there is not a "consensus" majority to have a Sanskrit entry. What we need is an either/or vote to decide which one it is. bd2412 T 16:24, 15 May 2015 (UTC)Reply
        Only one sense is being challenged. Nobody has yet challenged the other two English senses. Renard Migrant (talk) 13:26, 23 May 2015 (UTC)Reply

Sense deleted. bd2412 T 19:35, 31 May 2015 (UTC)Reply

RFV discussion: February–July 2015

The following discussion has been moved from Wiktionary:Requests for verification (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

~~mahā~~

RFV of adjective sense 3. (Other senses may be RFVed later.) As far as I can tell, it would be a departure from previous practice to have sense-lines (especially ones consisting of more than {{only in}}) for parts of the proper names of works, even if we weren't talking about transliterations. For example, we don't and shouldn't have an English entry for Mein (or mein), or Soyuz or soyuz, based on instances of people mentioning Mein Kampf and Soyuz nerushimy in English; we don't even have an English (or German/Russian) entry for either full title. Hence, the one citation currently under the RFVed sense, if it is using any English term at all, seems to be using Mahā Bhārata, not mahā. But it is italicized as if it were not English at all, but only a transliteration of the Sanskrit title — compare google books:Zhonghua renmin gongheguo xianfa — so it seems useful to see if it can be cited in English at all before beginning an RFD. (Otherwise, someone at RFD would say "shouldn't this be at RFV first to see if it's attested?") - -sche (discuss) 21:45, 6 February 2015 (UTC)Reply

Our current entry for Mahā Bhārata defines the term as "Alternative spelling of Mahabharata", and our entry for Mahabharata defines the term as "A Sanskrit epic concerning some text of Bhagavad Gita plus elaborations on theology and morality". None of that provides the reader with any guidance when they come across "mahā" and decide to look it up here. bd2412 T 22:26, 6 February 2015 (UTC)Reply

[3] - There are 3 cites for the specified definition, take it to RFD if you dispute inclusion in the first place. It's pointless to discuss the far-fetched and disanalogous parallels with mein and soyuz here. --Ivan Štambuk (talk) 22:33, 6 February 2015 (UTC)Reply
On the contrary, Soyuz in particular is directly analogous. - -sche (discuss) 22:47, 6 February 2015 (UTC)Reply
soyuz is a standalone word so it can't be analogous. soyuz also doesn't appear in hundreds of compounds (or phrases), and it's not a result of people mistakenly spelling it on its own just because it's phonetically a single word, as is the case with maha. Fundamentally there is no difference between it and ordinary English affixes, and the opposition seems to stem from the fact that the former only admits Sanskrit basewords. Since we have entries on English affixes with as little as 2-3 derivations, with maha having hundreds it surely deserves its entry. --Ivan Štambuk (talk) 01:04, 7 February 2015 (UTC)Reply
- Ivan, the two citations you recently added were both illustrations of the term [[maha]], not [[mahā]], and I have therefore removed them. ‑‑ Eiríkr Útlendi │ Tala við mig 00:06, 7 February 2015 (UTC)Reply
  - Ivan, I see that you've just reverted my removal. Rather than edit warring, I rebut below:

I understand that [[maha]] and [[mahā]] are related. However, citations showing use of [[maha]] do not suffice as evidence for [[mahā]] as an English term.
Both of your added citations also don't show use of [[mahā]] as an individual term, but only as part of larger compounds. We already have an entry for maharaja, and I would argue that maha raja (regardless of capitalization) is an alternate form of maharaja and not an example of [[mahā]] as an independent English term.
Your usage note still makes no real sense, and is incorrect in characterizing Sanskrit [[mahā]] as having no independent meaning.

Please remove the incorrect citations and incorrect usage note. ‑‑ Eiríkr Útlendi │ Tala við mig 00:51, 7 February 2015 (UTC)Reply

It's an alternative form, and citations are valid for the normalized spelling as well. It's the same word.
It's an independent term by the virtue of being demarcated with whitespace. It's how the notion of "word" is defined.
mahā is not a lexical word in Sanskrit and has no meaning. The usage note is correct, just because you don't understand it doesn't mean it makes no sense. --Ivan Štambuk (talk) 00:58, 7 February 2015 (UTC)Reply

On the first point, I was under the impression that separate spellings are treated here on the EN WT as separate entries. As such, [[maha]] and [[mahā]] are separate. Could any third party chime in with clarification?
On the second point, I'm not arguing that [[maha]] is not a word. I'm arguing that [[maha]] as illustrated by the compound term [[maha raja]], as an alternate form of [[maharaja]], is not an independent usage of [[maha]] as an English word.
Per your last point, care to explain the महा (mahā) entry, then? inter- is not a lexical word in English either, but it definitely has a meaning. ‑‑ Eiríkr Útlendi │ Tala við mig 01:19, 7 February 2015 (UTC)Reply

A 1779 citation for maha encompasses both modern maha and mahā, since at that time there was no way to make a typographic distinction between the two. Today when you write mahā it's a choice. So a 1779 citation for maha is a proper citation for mahā as well.

What is an "independent word" ? The definition line for the third sense states that it carries no inherent meaning. It can't be classified as an affix either, so it's best left as an adjective.

I suggest that you read the entry on महा (mahā). It says that it doesn't mean anything. That word never occurs in that form on its own so that entry can be safely deleted. Sanskrit has formalized rules of compounding so any word can have a bunch of such "combining forms". inter- doesn't have a meaning either. It modified the meaning of the baseword, but it has no meaning of its own. That's why all of the affixes have non-gloss definitions. --Ivan Štambuk (talk) 02:56, 7 February 2015 (UTC)Reply

If महा (mahā) doesn't mean anything, then perhaps mahā as used in at least some English texts appears to have been given a meaning by those writers beyond its Sanskrit origins. bd2412 T 04:21, 7 February 2015 (UTC)Reply

As it is explained in the usage notes, its usage as a separate word in English in the third sense is due to the fact that it's phonetically a separate word. In Sanskrit compounds as a rule have a single accent. Modern usage dictates a single-word spelling for all Sanskrit compounds. --Ivan Štambuk (talk) 20:14, 7 February 2015 (UTC)Reply

I already read the महा (mahā) entry. The महा (mahā) entry does not say that it has no meaning. The lack of a gloss on the definition line is not a positive statement that the term has no definition. Clicking through to महत् (mahat) further explains that महा (mahā) is the combining form of महत् (mahat), i.e. it has the same definition(s) as महत् (mahat), albeit a different lexical role.

If inter- has no meaning, why does it have a definition line?

Ivan, I honestly can't tell if you're trolling me, or if you're bending the logic of your argument, or if you and I just have profoundly different understandings of Wiktionary. ‑‑ Eiríkr Útlendi │ Tala við mig 05:23, 7 February 2015 (UTC)Reply

... or perhaps I'm just really not understanding you? ‑‑ Eiríkr Útlendi │ Tala við mig 07:00, 7 February 2015 (UTC)Reply

The entry says that it's a combining form of another word, it doesn't provide any definition. On Wiktionary only words without meanings lack definitions, excepting alt-form redirects and non-lemma entries. महा never occurs on its own (in terms of "separated with whitespace in writing", or "pronounced separately when speaking") in that form. It's not a word in Sanskrit. Sanskrit compounding forms are not like inter- and other affixes in English and other languages - basically every single word can have a bunch of these forms depending on word sandhi.

The definition line of inter- is now encapsulated with {{n-g}}. Even in this form it's deficient because inter- does not mean among, between, amid, during, since these all are not lexical words either. It should be something along "Prefix used to form nouns and adjectives indicating this-and-that type of relationship, corresponding to the usage of prepositions among, between, amid etc.".

Now suppose that all of those derivations with inter- were overwhelmingly written separately, as inter governmental or inter state, until relatively recently (C20), and admitted only stems of Latin origin. Would it be justifiable to have a separate entry on inter? This is such a case. --Ivan Štambuk (talk) 20:27, 7 February 2015 (UTC)Reply

Eirikr, you are correct that different spellings are treated as different entries, and citations of maha do not verify mahā. Iff they are not invalidated by other factors such as italics, or being mentions rather than uses, etc, citations of maha verify maha. Macrons have been in use for hundreds of years, so Ivan, your assertion that there was historically no way to make a typographic distinction between maha and mahā is simply mistaken. I agree with Eirikr's comments of 00:51, 7 February 2015, including that maha raja is an alternative form of maharaja, not a use of *maha#English + *raja#English. (Does anyone other than Ivan feel otherwise?) - -sche (discuss) 03:48, 7 February 2015 (UTC)Reply

A reader coming across the phrase maha raja is likely to see these as two separate words, and will (correctly) conclude that "maha" and "raja" each contribute some different meaning to the whole phrase. This is particularly so if the same reader also sees phrases like maha bharata or maha yogi. Of course, the same applies for examples of each of these phrases using mahā. bd2412 T 04:21, 7 February 2015 (UTC)Reply

Indeed, a reader will likely conclude that each discrete space-delimited string of characters conveys some discrete unit of meaning. However, does the maha in [[maha raja]] parse as English? Would the reader infer that they can then refer to a maha deal, a maha examination, a maha big mess, and expect other English readers to understand? ‑‑ Eiríkr Útlendi │ Tala við mig 05:23, 7 February 2015 (UTC)Reply
How is maha in maha raja any less English than inter in international ? Answer: none of these are English. But it's a separate word, spelled separately for a reason that it is pronounced separately, reflecting what is now an obsolete orthographical practice. People will look up raja, see that it means something, and they will look up maha, and won't find anything (relevant). --Ivan Štambuk (talk) 20:33, 7 February 2015 (UTC)Reply

It is my position that a citation of "maha" does not attest "mahā"; the two are different spellings, to be attested separately. In this I support -sche and Eirikr. --Dan Polansky (talk) 09:48, 8 February 2015 (UTC)Reply
A third quote for the mahā spelling has been added. --Ivan Štambuk (talk) 17:11, 8 February 2015 (UTC)Reply

Neither of the additional citations adequately show use of [[mahā]] as English. One of the additions is [[mahā]] not as a term, but as part of a title: Mahā Purusha is apparently the title of a 1985 film. The other addition is as part of the compound term mahā mudrā, which is apparently a yoga position. These do not illustrate use of [[mahā]] as an independent English term. ‑‑ Eiríkr Útlendi │ Tala við mig 22:22, 8 February 2015 (UTC)Reply
Maha Purusha is not the name of a movie in that particular citation (note the date), but an anglicized spelling for the Sanskrit term mahā-puruṣa. mahā is a word by the virtue of being separated with whitespace. I'm still waiting for your definition of independent term. It's not a lexical word, but the definition line for mahā doesn't even claim that it is. --Ivan Štambuk (talk) 00:07, 9 February 2015 (UTC)Reply

You have not demonstrated that [[mahā]] ever appears in an English text in a way that is 1) not an untranslated term used as code-switching in a text targeted at readers likely familiar with Sanskrit, Pali, etc.; 2) not part of a compound term that has been used as an integral whole in a way where [[mahā]] has no clearly independent meaning in English; 3) actually used as English, such as with English modifiers like more or less, and where [[mahā]] is used to modify a common English term. None of your examples serve as adequate evidence that [[mahā]] is being used as English.

I am perfectly happy for EN WT to have an entry at [[mahā]]. Given the evidence to date, I am strongly opposed to any [[mahā]] entry that lists [[mahā]] as an English term. ‑‑ Eiríkr Útlendi │ Tala við mig 00:17, 9 February 2015 (UTC)Reply

I am not supposed to demonstrate anything of that because the definition line for the third meaning doesn't require it. Whether it should be kept or not is a different matter (for RFD). Note also that there is a sufficiently large number of attestations for English of both X and maha X forms (e.g. maha raja and maha purusha mentioned in this very discussion), which is arguably in favor of the claim that maha is in fact a native English adjective used within these constructs meaning "great", but that is already covered by the preceding definitions, and precluded by our knowledge of the origin of such constructs (direct borrowings from sa). --Ivan Štambuk (talk) 00:48, 9 February 2015 (UTC)Reply

Limiting discussion just to the third sense (which I admit I was not doing -- I was still writing in reference to the entire English entry for [[mahā]]), there are currently five citations listed. These five are, in order:

Invalid: wrong spelling ([[maha]]).
Invalid: part of the untranslated title of a literary work (the w:Mahabharata).
Invalid: apparently part of a proper noun (Mahā Purusha), and also clearly delineated in a way to indicate use of an untranslated non-English term (italicized). This is also the only appearance of this term in the entire cited book.
Invalid: code switching in a text targeted at an audience already familiar with various Sanskrit and/or Pali terminology, and also clearly delineated in a way to indicate use of an untranslated non-English term (underlined). If Google Books is to be believed, the word [[mahā]] appears six times in this book, and only as part of the compound term mahā mudrā.
Invalid: wrong spelling ([[maha]]).

Analysis indicates that none of the provided citations are valid or sufficient to illustrate use of this term, with this particular sense, as English. Delete.

Once we are done beating this dead horse, I would like to nominate the entire English entry for RFV / RFD. ‑‑ Eiríkr Útlendi │ Tala við mig 02:09, 9 February 2015 (UTC)Reply

mahā and maha are the same words in English. The difference is in a macron which is not a part of the English alphabet.

Sanskrit words used in English are not "untranslated" Sanskrit words. They are English words.

The frequency by which a term appears in a work is irrelevant for the purposes of a single attestation. Yes it's a part of the noun (there is no concept of a proper noun in Sanskrit) - but it's spelled and pronounced as a separate word in English, which is not the case in the Sanskrit original. That is both explained in the definition line and the usage note. I'm glad that you've reached that conclusion on your own.

This is not code switching. These are not snippets of Sanskrit used in English. Those are ordinary English words fitting into English syntactical structure. They are used as objects, qualified with articles, pluralized and so on. --Ivan Štambuk (talk) 00:47, 26 February 2015 (UTC)Reply

Could anyone else chime in? I feel like Ivan and I are going in circles. More discussion also at [[Talk:mahā]].

Specific issues that I'm hoping others can help address:

Can examples of Spelling A be used as attestations of Spelling B? In this case, are quotes containing maha sufficient to attest the term mahā?

From my reading of past discussions about citations, I arrived at the understanding that citations must demonstrate use of the relevant word with exactly the same spelling. As such, attestations of ate cannot be used to verify the existence of et. Users -sche and Dan Polansky seem to agree with my position, that any citations used to verify the existence of mahā must use the same spelling, diacritics and all.

Are terms that are clearly set off in a text (using italics, reverse italics, bold, quotes, etc.) sufficient to demonstrate the non-foreign-ness of that term?

Other discussions suggest that italics and the like are used by authors to indicate the non-nativeness of a term. Ivan above clearly disagrees.

Are uses in transliterated titles and names sufficient for attestation?

Two of the citations at [[mahā]] appear to be titles, one the title of a literary work, the other a personal epithet.

There are other issues at hand as well, but for starters, I would appreciate input on the above two points.

TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 01:36, 26 February 2015 (UTC)Reply

The form mahā now has three attestations on its own.

"Foreign-ness" and "non-nativeness" (whatever that means) are not criteria for excluding words. The criterion is usage in English. The problem is that you don't define usage on semantic grounds (i.e. words being used in their meaning alongside other English words to construct a complete English sentence), but on how they are formatted and where they originate from. All of these are irrelevant points. --Ivan Štambuk (talk) 01:51, 26 February 2015 (UTC)Reply

Note: There is no entry at maha reflecting the senses reported at mahā. It is quite likely that citations supporting additions of those senses to maha could be found, given the usage of the less convenient diacritic form. bd2412 T 04:35, 26 February 2015 (UTC)Reply

The spelling with the diacritic is the more proper one, and that's where the senses and citations should be located. The only exceptions should be relatively common terms (e.g. Shiva not Śiva). --Ivan Štambuk (talk) 14:27, 28 February 2015 (UTC)Reply

As for transliterated titles (of literary works and such), my position is that "... are the two great epics, the "Rāmāyana" and the Mahā Bhārata" is not an attestation of "mahā" as an English word conveying meaning, and should be removed from mahā page. --Dan Polansky (talk) 13:15, 28 February 2015 (UTC)Reply
But it's not even defined as a lexical word with a meaning. Its definition is surrounded with the {{n-g}} template. Having a meaning is not a criterion for inclusion, otherwise we wouldn't have entries on affixes, prepositions and so on. Furthermore, it's a part of the title only in that specific citation by sheer coincidence - in others it is not. --Ivan Štambuk (talk) 14:27, 28 February 2015 (UTC)Reply
It's not void of linguistic effect, though. A sentence with "mahā" at a certain place means something different than a sentence with nothing (or four letters of random gibberish) at the same place. bd2412 T 21:18, 1 April 2015 (UTC)Reply
Consider the following Czech sentence: Přiletěla do New Yorku se zpožděním. ("She arrived in New York with a delay.") Does it attest "New York" as a Czech term? I'd say yes. Does it attest "New" as a Czech term? I'd say no. --Dan Polansky (talk) 21:46, 1 April 2015 (UTC)Reply
I have opened WT:RFD#mahā to help resolve this issue. If anyone feels comfortable closing this RFV in any manner, they can still do so. --Dan Polansky (talk) 07:25, 9 May 2015 (UTC)Reply
RFV closed: the sense nominated here for RFV was deleted via RFD (Talk:mahā#May 2015 deletion discussion). RFV no longer relevant. Again, the sense was "as a part of Sanskrit compounds with separated parts". --Dan Polansky (talk) 10:33, 12 July 2015 (UTC)Reply

RFD discussion: September 2018–July 2019

The following discussion has been moved from Wiktionary:Requests for deletion (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

~~mahā#English~~

Following on a similar discussion at Wiktionary:Tea_room/2018/September#tiru, determining that that term is not English, I would like to nominate the entry at mahā#English for deletion, on the grounds that this is also "clearly never productive in English", and is also not English. There was considerable discussion about this term in the past, as recorded at Talk:mahā. Said discussion included a refutation of the various citations intended to support the validity of the term's English-ness listed at Citations:mahā#English_citations_of_mahā, pointing out that none of the provided citations actually supports that position.

Looking forward to a thoughtful and reasoned discussion. ‑‑ Eiríkr Útlendi │^{Tala við mig} 16:35, 5 September 2018 (UTC)Reply

I've already perused the mahā talk page several times in the past, and I'll issue a tentative delete: just as I do not believe osthya to be an English word, I don't believe this to be an English word. But we'll see.

The problem is that (in my view) quotations such as "All are classed among the eighteen mahā or ‘great’ purāṇas." or "hence in spite of its labio-dentality, it came to be listed as an oṣṭhya sound." are useless for our purposes: they cannot be used to attest the words in English, nor can they really be used to attest the words in Sanskrit. They simply aren't quality quotes / good for anything. Per utramque cavernam 16:55, 5 September 2018 (UTC)Reply

Delete the adjective. Abstain on the noun sense. —Μετάknowledge^{discuss/deeds} 19:04, 10 September 2018 (UTC)Reply
Keep in some form. This is a word that appears in print often enough that a reader may want to learn what it actually means. There are a small but concrete number of instances of this word appearing in English running text which are presented without italics or other formatting to distinguish it as a word in a different language. We should not delete words based on catch-22 reasoning, which seems to presume that words are bad, and should be eliminated from the dictionary if we can find a technical reason to justify their removal. Rather, we should consider how we can help readers define words they may reasonably come across. bd2412 T 13:23, 11 September 2018 (UTC)Reply

I have no judgment on words being "good" or "bad", that is entirely beside the point.

I am also not pushing to "eliminate" words from Wiktionary. I am much more concerned with accurate description.

As stated before, I am fine with the existence of an entry at [[mahā]]. What I am nominating for deletion is [[mahā#English]], and as noted at [[Talk:mahā]], those (exceedingly few) instances of mahā in running text without any gloss or special formatting are also in works that treat a broad array of Buddhist- or yoga-related terminology the same way: essentially as untranslated Sanskrit sprinkled through the body of the text. If inclusion in an otherwise English sentence, without regard for context or domain, is our only criterion for "English-ness", then it follows that we must also create English entries for ... a truly vast array of terms, so many that the significance of the "English" language label would be severely diluted. That, I argue, would do our readers more of a disservice. ‑‑ Eiríkr Útlendi │^{Tala við mig} 19:10, 11 September 2018 (UTC)Reply

I would welcome your proposal of what form this entry should take, if [[mahā#English]] (which is currently the entire entry) is removed. bd2412 T 19:51, 11 September 2018 (UTC)Reply

In the past, the idea was floated (perhaps even by you?) to have romanized Sanskrit entries. I still support this option, as we also currently have for Gothic, Japanese, and Chinese (and perhaps others too). ‑‑ Eiríkr Útlendi │^{Tala við mig} 20:20, 11 September 2018 (UTC)Reply

It was. I am not opposed to having this presented as something other than an English term. My concern is that different groups of editors will oppose different solutions, so that the end result is no solution, and the benefit to the reader of knowing what "mahā" means will be lost. I would prefer a process to determine how it should be included, rather than one which risks excluding an attested term from the dictionary entirely. bd2412 T 00:51, 12 September 2018 (UTC)Reply

FWIW, I don't share the assumption that there must be an entry here if this string appears in print. Even a remit as broad as "all words in all languages" is not "all representations of all words or portions of words". There are enough works on German and its dialects that contain blocks of text transcribed in IPA or even other pronunciation systems that I could probably "cite" words like zaɪn or diː or ʃə, but I don't think we need an entry at [[zaɪn]] or [[diː]] or [[ʃə]]; the entries at [[sein]] and [[die]] and [[-sche]] cover the words as they exist in the language to which they belong. In this case, it's arguable (there is a case to be made) that there should be (soft) redirects of sorts at romanizations for Sanskrit as there are for Gothic, but I don't share what seems to be the underlying assumption. - -sche (discuss) 01:22, 12 September 2018 (UTC)Reply

You say, "I don't share what seems to be the underlying assumption." Could you unpack that? What underlying assumption? (Honest question, I feel a bit confused and am seeking clarity.) ‑‑ Eiríkr Útlendi │^{Tala við mig} 04:07, 12 September 2018 (UTC)Reply

(I hope this doesn't sound curt,) Would it clarify things if I said the clause you quote, from the last sentence of my comment, is merely restating my first sentence? The assumption I'm referring to is the assumption (embedded in bd's comment about "what form this entry should take") that there should be an entry at this title because (quoting again) "this is a word that appears in print often enough". - -sche (discuss) 04:47, 12 September 2018 (UTC)Reply

@-sche I feel that you have either misunderstood or misrepresented my position. I have been consistent in opposing the inclusion of neologisms and brand names even where these appear in print "often enough". In this case, the term in question not only appears in print often enough, but has for a long time, as a freestanding word (not just a particle of another word), perhaps having a meaning unique in some subtle sense to this specific presentation of the word. bd2412 T 19:19, 30 September 2018 (UTC)Reply

Delete the adjective as it stands, or (if kept at RFD) send to RFV to seek better citations, as every one currently under the adjective section is inadmissible: under the first sense the 1980 and 2014 Shiva cites clearly set it off as a foreign language term, the 2012 cite doesn't use this spelling (in addition to other problems), the 2013 cite doesn't seem to be an adjective (in addition to other concerns), the 2014 Mohr cite is clearly a mention of a foreign language term and not a use, and not even a mention of this adjective but rather of a prefix with a hyphen; the cites under the second adjective sense suffer similar problems. It is also very questionable to use even a valid use of a compound word as an argument that its elements are also independently English; as I wrote recently in the Tea Room, the ability to say "I visited Bad Kreuznach and Bad Kissingen" doesn't in and of itself make "Bad" an English word meaning "spa" (although someone may now seek out better citations which do). Use in collocations that aren't viewable as wholesale borrowings/transliterations, e.g. "a mahā leader", "the mahā teachings of the ascetics", would be more convincing evidence of the existence of "mahā" as an English word. It is conceivable that the string might exist as an English word the way e.g. verboten does, but it would need to be demonstrated. Abstain for now on the noun. Some investigation should be done to determine if the noun (or adjective) is more commonly spelled maha. - -sche (discuss) 19:47, 11 September 2018 (UTC)Reply

@-sche, @Μετάknowledge: regarding the noun form, we currently only have one citation given for the purported noun sense, from the work Luminous Essence: A Guide to the Guhyagarbha Tantra. As can be seen here, if Google Books search is working correctly, the term mahā only appears five times in this whole book, in three separate sentences (formatting kept as in the original):

This is also the reasoning behind the subdivisions of the Nyingma School's mantra scriptures, such as the classification of mahāyoga into three parts, starting with the mahā of mahā. -- page 3
The Tantra of the Secret Essence is the ati of mahā, which is the same as the mahā of ati in terms of the three divisions of the great perfection. -- page 5
The liberating paths of the supramundane vehicles explained above can also be classified into nine vehicles: the three vehicles that guide through renunciation (the vehicles of the listeners, self-realized buddhas, and bodhisattvas), the three vehicles of Vedic austerities (krīya, ubhaya, and yoga), and the three vehicles of mastery in means (mahā, anu, and ati). -- page 23

The book's topic appears to be esoteric Tibetan Buddhism. No definitions are given anywhere for the terms mahā, ati, anu, krīya, or ubhaya. Yoga I only know as the common exercise practice of stretching and controlling one's breathing and posture; if it has any other meaning in this book, that is wholly lost on me. I would argue that these terms are untranslated Sanskrit, used on the assumption that the intended audience is sufficiently familiar with the Sanskrit terminology.

Considering the overall context of the work -- the subject matter, the intended audience, usage of other esoteric terms -- I would argue that this work is using untranslated Sanskrit as Sanskrit and not as English, and that this is thus not a useful citation to show use of an English term. And without this one citation, we have no citations at all for the noun sense, and should therefore strike that from the EN entry. ‑‑ Eiríkr Útlendi │^{Tala við mig} 20:20, 11 September 2018 (UTC)Reply

If that's the case, I recommend you RFV the noun sense. By the way, I also support romanisation soft redirects for Sanskrit. —Μετάknowledge^{discuss/deeds} 22:17, 11 September 2018 (UTC)Reply

@Metaknowledge: suppose we had a policy of allowing romanisation soft redirects for Sanskrit. In that case, what would we do for an entry like this one that has a sense in another language? We can't use the template that says Wiktionary has no entry at this title, because it has one for Pali. bd2412 T 15:47, 13 February 2019 (UTC)Reply

@BD2412: The same way we do it for any other language treated thus, e.g. kara#Japanese. I don't see the relevance, now that the vote to implement such entries has failed. —Μετάknowledge^{discuss/deeds} 18:43, 13 February 2019 (UTC)Reply

I have no objection to such a change for the current English entry at mahā. bd2412 T 19:01, 13 February 2019 (UTC)Reply

@BD2412: My mistake; the vote was in fact extended and is still ongoing, although I don't expect it to pass. See Wiktionary:Votes/pl-2018-12/Allowing attested romanizations of Sanskrit. —Μετάknowledge^{discuss/deeds} 20:36, 13 February 2019 (UTC)Reply

For the record, the vote was not extended but rather was created to run for 3 months from the start. What made me do so was the knowledge that vote extensions had been accused of fishing for results in the past, and at the same time, it had previously taken people a long time to cast a vote in a Sanskrit-related vote. --Dan Polansky (talk) 10:18, 16 February 2019 (UTC)Reply

Closed as no consensus. No votes or comments in five months. Should have been closed a long time ago. Purplebackpack89 15:04, 22 July 2019 (UTC)Reply
