Wiktionary talk:Language treatment/Discussions

WT:LTD redirects to this talk page, to ease using the aWa tool to archive discussions here. For more archived discussions, see the corresponding Wiktionary:-space page, Wiktionary:Language treatment/Discussions.

Rename Old East Slavic to Old Russian (orv) edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The term "Old East Slavic" must have been created to avoid offending Ukrainians and Belarusians and to please nationalists who use "Old Ukrainian" and "Old Belarusian", since "Old Russian" may imply to some that Belarusian/Ukrainian languages are derivations of Russian. However, "Old East Slavic" (Russian: древнеру́сский язы́к (drevnerússkij jazýk)) translates into Belarusian as старажытнару́ская мо́ва (staražytnarúskaja móva) and into Ukrainian as давньору́ська мо́ва (davnʹorúsʹka móva) (alternative names also exist, including "Old Ukrainian"). All three terms are literally translated as "Old/Ancient Russian/Rusian", not "Old East Slavic". Note that in Ukrainian ру́ський (rúsʹkyj) refers more specifically to Rus rather than modern Russia. (Unlike Russian and Belarusian, the Ukrainian term росі́йський (rosíjsʹkyj) means "Russian" both referring to ethnicity and the country. Cf. Russian ру́сский (rússkij) / росси́йский (rossíjskij) and Belarusian ру́скі (rúski) / расі́йскі (rasíjski)). Despite the current tensions between Russia and Ukraine, linguistically it makes much more sense to rename "Old East Slavic" to "Old Russian". Call me biased, whatever, but respectable East Slavic linguists all use "Old Russian", it's of the Ancient Rus, not modern Russia, the centre of which was in Kiev, modern Ukraine. :) --Anatoli (обсудить/вклад) 02:47, 12 March 2014 (UTC)Reply

I don't think the names that modern speakers use should really count. They're bound to have some kind of national pride in them, like discussions on the Wikipedia article show. (And I'm hoping to avoid a straw man, but by the same logic we'd call Dutch "Netherlandish", German "Dutch", Greek "Hellenic", Armenian "Hayeric", Hittite "Neshite", and so on...) I think we should focus purely on what modern English-language scholarship calls the language. —CodeCat 03:23, 12 March 2014 (UTC)Reply
Re: English language usage. Ngram, Ngram 2 can't even find "Old East Slavic (language)", not sure if I interpret this correctly or if it is possible to check the English usage via Google.
Being half-Russian, half-Ukrainian, I take pride in belonging to both but pride can take various ways. E.g. ру́ський (rúsʹkyj) is sometimes translated by Ukrainians as давньоукраї́нський (davnʹoukrajínsʹkyj, Old Ukrainian) and some Ukrainian nationalists claim they have nothing to do with Russian but others take pride in being the cradle of all Slavs. We all know the Serbo-Croatian language story and Israeli-Palestinian conflict doesn't make Hebrew and Arabic belong to different language families. Modern Ukrainian, Russian, Belarusian are distinct languages but they are all derived from one, so your comparison is not valid.
In any case, Belarusians (educated), considering the name of the language, are not even trying to deny that Belarusian is derived from Old Russian (=Rusian, Old East Slavic), the language of Rus, in Ukrainian, the term ру́ська мо́ва (rúsʹka móva) is also used, without "Old", since ру́ський (rúsʹkyj) has this meaning ((modern) Russian language is called росі́йська мо́ва (rosíjsʹka móva) in Ukrainian). Ukrainians also know that Украї́на (Ukrajína) is quite a new term, although there are some ridiculous claims about so-called "Ukr" tribes, derived directly from Proto-Slavic, bypassing Old Russian stage. I wonder what discussion you're referring to?
BTW, I disapprove the policy of Russian government and respect the right of Ukrainians to decide their fate but we are talking about language names and linguistics here. Bad timing for this discussion, considering the Ukrainian and Crimean crisis. --Anatoli (обсудить/вклад) 04:22, 12 March 2014 (UTC)Reply
We should probably invite our Ukrainian editors to this discussion to get a broader picture. --Anatoli (обсудить/вклад) 04:49, 12 March 2014 (UTC)Reply
I think this should be decided solely by the most commonly used name in English. Thus, I tentatively support "Old Russian" based on the Ngrams above, and based on a hunch that English speakers are less likely to have to look up the language if it is called "Old Russian" than if it is called "Old East Slavic". However, there needs to be a more thorough investigation into the evidence, since Ngrams are often inaccurate and/or skewed by unexpected factors. --WikiTiki89 04:55, 12 March 2014 (UTC)Reply
Agreed but Google books count of "Old Russian language" is also significantly larger than "Old East Slavic language" (even though "Old Russian language" may not always have the same sense as "древнерусский язык"). I'm happy if someone else can analyse it better using Google or other sources. --Anatoli (обсудить/вклад) 05:02, 12 March 2014 (UTC)Reply
Re: solely by the most commonly used name in English. We should also take into consideration the literal names in the affected languages. I think "Old East Slavic" is a result of some kind of political correctness or fairness, ignoring the fact that Ukrainian and Belarusian don't use this fairness in the name, also not note French "vieux russe", Estonian "vanavene keel", Dutch "Oudrussisch" and many Slavic names. The rest of names must have followed English Wikipedia in their namings. --Anatoli (обсудить/вклад) 05:21, 12 March 2014 (UTC)Reply
I disagree, whatever the origin of its usage, if "Old East Slavic" became were shown to be used more commonly in English, then that is the name we should use. The only other factors we should consider are the ambiguity of the name (for example, if "That Language There" were the most commonly used term for some language, we probably still shouldn't use it), or possibly some other minor issues that I can't think of at the moment. --WikiTiki89 05:30, 12 March 2014 (UTC)Reply
I meant it seems it was coined out of some considerations but I agree that if it were more common we should use it. I can't think of another example but in Japanese ハングル (Hanguru-go) "Hangeul language" was specifically coined, out of political correctness, to avoid giving preferences to either Korea or their languages (North and South Koreas have different names in East Asian languages, see 韓国 and 朝鮮). --Anatoli (обсудить/вклад) 05:38, 12 March 2014 (UTC)Reply
If you agree that if it were more common then we should use it, then isn't that the same thing as only relying on which is more common? --WikiTiki89 05:43, 12 March 2014 (UTC)Reply
--Anatoli (обсудить/вклад) 05:25, 12 March 2014 (UTC)@Mzajac, @Alexdubr, anyone else is active?Reply
It is not easy to get a sense of which name is most common in English. Chaff like "an old Russian car" skews counts of "Old East Slavic" vs "Old Russian", but if "language" is added, the number of hits drops so low that it is statistically insignificant / unreliable. However, based on manual review of a search with "in", I think "Old Russian" is the more common of the two names. - -sche (discuss) 06:25, 12 March 2014 (UTC)Reply
I support the rename to Old Russian per the following Google Scholar Data, showing that "Old Russian" is more than 10 times as common as all other names I could think of, combined. —Aɴɢʀ (talk) 19:20, 12 March 2014 (UTC)Reply

“Old Russian” reflects systemic prejudices that go back to times of the empires, when Encyclopedia Britannica defined wrote about White Russians and Little Russians, some of whom were Ruthenians. These prejudices were still felt in academics and journalism when the Soviet Union broke up, and are only going away now. I don’t think Wiktionary is part of an establishment that tries to wilfully reinforces these backward practices.

Furthermore, the moment when the Russian regime is exploiting such ideas in its anti-Ukrainian propaganda is the shittiest possible time to start this discussion. Michael Z. 2014-03-12 19:18 z

I feel kind of the same about it. I can't motivate the choice with widespread usage, but I feel that "Old East Slavic" just is a better, more descriptive name for the language. So I oppose. —CodeCat 19:42, 12 March 2014 (UTC)Reply
@Mzajac I agree about bad timing and with the criticism of the Russian regime and I'm very sad about the latest events - Putin, in a short period, has created a situation, which was unimaginable for hundred years - Russians fighting Ukrainians. I wish we didn't mix politics in here, though. The political split of Yugoslavia caused the artificial language split but we all know that Serbo-Croatian is one language, despite the tensions. Jewish and Arabic are still semitic languages and Ukrainian and Russian are still derived from the same source (after a big injection of Old Church Slavonic into Russian and Polish injection into Ukrainian). I'm not stating any Russian supremacy or support any name-calling. Regarding the names, I disagree that "White Russians" and "Little Russians" were a result of prejudice, just as "Белару́сь" (Бе́лая Русь) and "Малоро́ссия" are just historical words, names among others like "Great Rus", "Red Rus", "Black Rus", "Carpathian Rus". Nikolay Gogol, native of modern Ukraine, also used "Малороссия" when referring to his homeland. The "offensive" meaning (if "Малоро́ссия" ("Little Rus(sia)" sounds offensive to Ukrainians) was acquired mistakenly in the modern times. In any case, Русь (Rusĭ) originated on the territory of modern Ukraine by people who lived there and this is where all East Slavs originate from. I don't see or Ukrainians don't see anything offensive in terms "давньору́ська мо́ва" and "ру́ська мо́ва" when referring to Old East Slavic language but I won't insist on continuing this discussion, if it hurts someone's feelings. --Anatoli (обсудить/вклад) 02:10, 14 March 2014 (UTC)Reply
Guys, call it Old Bulgarian and that's it. Though modern Bulgarian belongs to the southern branch, the early Bulgarian and Russian texts do show their being one language. Alexdubr (talk) 14:31, 17 March 2014 (UTC)Reply
You are confusing two different languages. Old Bulgarian, which we call Old Church Slavonic, was also a South Slavic language, but it was used even in Russia/Ukraine/etc. as the main written and liturgical language. The main spoken language, however was Old Russian or Old East Slavic, which was an East Slavic language and the ancestor of modern Russian/Ukrainian/etc. --WikiTiki89 16:14, 17 March 2014 (UTC)Reply
Result: not renamed; no consensus for renaming. - -sche (discuss) 02:16, 26 January 2015 (UTC)Reply


Komi language edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge Komi-Zyrian (kpv), Komi-Permyak (koi), Komi-Yodzyak (no code) with Komi (kv). Komi-Zyrian is dominating over others. Currently, they all use the same alphabet but there were differences in the past. There's very little information on the grammar differences but they are mostly considered dialects, not separate languages. Merging kv and kpv should be straightforward, Zyrian is the language of Komi Republic.-Anatoli (обсудить/вклад) 10:39, 6 April 2014 (UTC)Reply

Actually, it has been planned as far as Komi-Zyrian and Komi-Permyak are concerned, see WT:LANGTREAT: "kv and kpv refer to the same lect; one will eventually be deleted.". Apparently, there are differences with Komi-Permyak but those can be labeled as "Permyak", similar Serbo-Croatian or Albanian varieties. --Anatoli (обсудить/вклад) 01:54, 7 April 2014 (UTC)Reply
Added: Wiktionary:Komi transliteration, Module:kv-translit, which handles both Zyrian and Permyak. --Anatoli (обсудить/вклад) 02:22, 7 April 2014 (UTC)Reply
Komi-Permyak has a distinct written tradition and should not and cannot be merged. Your statement that Komi-Zyrian is dominating shows your pro-Russian and anti-minority POV which should never be a basis for consensus on this wiki. Komi-Zyrian can be handled under {{kv}}, yes. -- Liliana 12:34, 7 April 2014 (UTC)Reply
Do you have paranoia or something? I'm not Putin and not suppressing any minorities. Komi-Zyrian is a language of majority of Komi people in Komi Republic, has a much larger number of speakers and much more materials written in this variety of Komi. Komi-Permyak is used by a minority in Perm Krai and has only 63,000 speakers. By all means, I'm not forcing anyone to merge, this is only a suggestion. They are mutually comprehensible and each of them has dialects. It's possible to merge Komi varieties like it's possible to have one L2 header for Albanian, Norwegian or Serbo-Croatian, marking varieties accordingly: тӧлысь (Zyryan)/тӧлісь (Permyak), выль (Zyryan)/виль (Permyak). --Anatoli (обсудить/вклад) 12:58, 7 April 2014 (UTC)Reply
That was completely rediculous and has no place on this wiki. It shows your anti-Russian bias more than Anatoli ever showed any pro-Russian bias. —CodeCat 13:00, 7 April 2014 (UTC)Reply
Thanks. I can't help to have some Russian bias, though. This is my language, my culture. I can help with the Russian language and Russia-related topics, even if Russian may now be interesting perhaps as a "language of enemy" for some. I don't blame people for criticizing Russian politicians looking at Russia with suspicion. I don't support Russian politics and propaganda but I don't have to apologize for being Russian either. :) Certainly I shouldn't be blamed for being against minorities. Why would I add contents for minority languages, if I wanted to supress them? --Anatoli (обсудить/вклад) 13:19, 7 April 2014 (UTC)Reply
There is one problem: as WT:DATACHECK reveals, "Komi-Zyrian language (kpv) has a canonical name that is not unique, it is also used by the code kv." Presumably one of the two needs to be retired (or, failing that, at least renamed). - -sche (discuss) 18:52, 2 May 2014 (UTC)Reply
Komi-Zyrian is less ambiguous and clear. Although I favoured "Komi", perhaps we should retire it and leave Komi-Zyrian for kpv and kv and Komi-Permyak for koi. Komi-Zyrian is implied if Komi is used. --Anatoli (обсудить/вклад) 00:12, 3 May 2014 (UTC)Reply
I switched about half of our kv entries to use kpv, and then I got distracted. I'll try to finish sometime soon. (See also User talk:-sche#kv_language_code and Category talk:Komi language.) - -sche (discuss) 04:29, 27 May 2014 (UTC)Reply
Everything that needed to be done seems to have been done. - -sche (discuss) 02:19, 26 January 2015 (UTC)Reply


Sardinian languages edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Category:Campidanese_Sardinian_language Category:Gallurese_Sardinian_language Category:Logudorese_Sardinian_language Category:Sassarese_Sardinian_language

merged into

Category:Sardinian_language

We are not supposed to treat dialects as independent languages. How are these not dialects? --Æ&Œ (talk) 20:21, 6 May 2014 (UTC)Reply

I support a merger of Campidanese (sro) and Logudorese (src) into Sardinian (sc), for reasons I outlined on my talk page and repeat here for others' benefit: those two lects differ from each other, quoth WP, "mostly in phonetics, which does not hamper intelligibility among the speakers". They are perhaps comparable to the dialects of Irish, with standard Sardinian (sc) existing as a unification of the dialects (again, comparable to Irish). Notably, we already include standard Sardinian (sc) — which means the additional inclusion of sro and src is quite schizophrenic.
There is some disagreement over whether Gallurese (sdn) and Sassarese (sdc) are dialects of Sardinian, dialects of Corsican, or languages separate from both Sardinian and Corsican. I would not merge them at this time. - -sche (discuss) 03:19, 12 May 2014 (UTC)Reply
I've merged Campidanese (sro) and Logudorese (src) into Sardinian (sc). - -sche (discuss) 08:14, 17 August 2015 (UTC)Reply


The Asu and Pare languages edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


@-sche (and anyone else who loves language-name conflicts), we currently call aum "Abewa" and asa "Asu". Firstly, it seems that practically nobody calls aum by that name; Wikipedia and Ethnologue both titles their entries for it "Asu", and Roger Blench, who may be the only person ever to study it, calls it "Asu" as well. As for asa, which currently occupies that name, Wikipedia and most hits on Google Books call it "Pare" (the remainder call it "Asu" or "Chasu", for the most part). As you may have guessed, there is already a language that we call "Pare", namely ppt, but this New Guinean language is also called "Akium-Pare" and (Wikipedia's choice) "Pa", which thankfully appears to be untaken. The chain of changing language names does seem rather silly, but the overall purpose of this is to move the only language out of these three that actually has any literature on it (and thus the one I just added a translation in), namely asa, to its most commonly used name. —Μετάknowledgediscuss/deeds 05:40, 9 September 2015 (UTC)Reply

I'm down with renaming aum away from Abewa.
On a balance, I'm also OK with renaming asa away from Asu. I can find a decent amount of references to "Asu": it's hard to say whether more or less than "Pare" because both terms turn up so much chaff. There is some documentation of its vocabulary available, incidentally; The Making of a Mixed Language: The Case of Ma'a/Mbugu by Maarten Mous mentions "Pare (Chasu)" muruke "sweat" and tika "lift" (in the context of their having been borrowed without change into Normal Mbugu, and then glottalized into Inner Mbugu muru'u and ti'i); Mbugu also borrowed Pare ku-kasha "hunt", Zigua ku-kala "to hunt", and Shambaa u-kalá "hunting" and ngwilizi "eagle" (source of Normal + Inner Mbugu ngwirizi, variation of l/r being dialectal in Shambaa; contrast Pare ngwirini).
Isaria N. Kimambo's Political history of the Pare of Tanzania, c. 1500-1900 (1969) implies Pare and Asu are different: "Other naming procedures include the use of u- for territorial names, e.g. Upare for the Pare country; and ki- for language, e.g. Kipare for the Pare language. The only exception here is Chasu which refers to the Asu language." John D. Kesby's Rangi of Tanzania: an introduction to their culture (1981) seems to clarify, however: "in the northeast of Tanzania, [...] the people called Pare in Swahili refer to themselves as Asu". (And Maarten Mous provides a bit more detail, that Pare/Asu has at least two dialects, north and south, with the north one apparently also being called Vudee and several spelling variations thereof.)
As for ppt, some works refer to it as "Pari", e.g. The Abandoned Narcotic: Kava and Cultural Instability refers to "the Pari (Pa) language" (not to be confused with lkr Päri - Lokoro). I guess we can rename it "Pa" for now, and switch to "Pari" later if something else called Pa comes up. - -sche (discuss) 05:48, 10 September 2015 (UTC)Reply
Renamed as proposed: aum from Abewa to Asu, asa from Asu to Pare, ppt from Pare to Pa. - -sche (discuss) 00:09, 27 September 2015 (UTC)Reply


Chidigo to Digo (dig) edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming dig

From "Chidigo" to just "Digo", partly because we should try to purge prefixes from our language names where appropriate and partly because the latter name is vastly more used by linguists. —Μετάknowledgediscuss/deeds 19:35, 5 September 2015 (UTC)Reply

Support. - -sche (discuss) 16:55, 8 September 2015 (UTC)Reply
Renamed. - -sche (discuss) 00:26, 27 September 2015 (UTC)Reply


Pol Pomo edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently call pmm "Pomo", which makes it sound like the Pomoan "language" which some people hypothesize exists, and for this reason I almost added a translation (with nested translations for Northern and Central Pomo) using the code in this way. I propose it be renamed "Pol" or "Pol Pomo", the name Wikipedia and the International Encyclopedia of Linguistics use. - -sche (discuss) 13:57, 19 June 2015 (UTC)Reply

Damn, that's very rightfully confusing. Support "Pol", since that's what Wikipedia uses. —Μετάknowledgediscuss/deeds 07:17, 11 August 2015 (UTC)Reply
Renamed. - -sche (discuss) 22:25, 27 September 2015 (UTC)Reply


Aramanik (aam), Aasax (aas) edit

The ISO merged aam "Aramanik" into aas "Aasax", saying here "Aramanik [aam] is listed as a Southern Nilotic language of the Nandi group, presumably because the Aramanik people assimilated to the Nandi. The original Aramanik language was a Cushitic language (or a non-Nilotic language with heavy Cushitic overlay) usually called Aasax (Fleming 1969) and is already included in a separate Aasax [aas] entry. Maarten Mous, in A Grammar of Iraqw, also gives them as synonyms."
- -sche (discuss) 22:11, 2 November 2015 (UTC)Reply

Gallurese (sdn), Sassarese (sdc) edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


These are currently named "Gallurese Sardinian" and "Sassarese Sardinian", which has led to them sometimes being nested under Sardinian in translations tables, but this is erroneous because they are (transitional) dialects of Corsican spoken on Sardinia, not dialects of Sardinian. I propose to drop "Sardinian" from their names. See also WT:RFM#Sardinian_templates. - -sche (discuss) 17:52, 17 August 2015 (UTC)Reply

Or we could just merge them into co. —Aɴɢʀ (talk) 19:09, 17 August 2015 (UTC)Reply
We could. They're subject to the same LDL CFI whether we keep them independent or merge them into Corsican (as contrasted with dialects of Italian, for example, which would find themselves subject to much higher CFI if merged into it), and a merger would reduce duplication while we could still note differences with {{label}}s and {{qualifier}}s... so perhaps we even should merge them. But they occupy grey areas. Gallurese is transitional between Corsican and Sardinian; Sassarese is transitional between Corsican, Sardinian and Tuscan (which I guess we consider it?). Ah, dialect continua... - -sche (discuss) 01:45, 18 August 2015 (UTC)Reply
Renamed. Not merged at this time. - -sche (discuss) 21:23, 19 October 2015 (UTC)Reply


Merging cbk-zam (Zamboanga Chavacano) into cbk (Chavacano) edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Pretty much every source I could find referred to Zamboangueño et al. as varieties or dialects, not languages. Two of them in particular make it very clear:

“The result of the study showed that while there are observable differences in certain language features beween and among the four variants, they are nonetheless, mutually intelligible with each other even among native speakers who do not have any special language training. Thus, for the purpose of this pilot study, al four variants were identified as dialects of PCS.” Sister María Isabelita O. Riego de Dios, A Pilot Study on the Dialects of Philippine Creole Spanish

“The two variants of PCS share enough distinctive differences from regular Spanish or regular Philippine usage that they must be considered historicaly related dialects of the same language” Charles O. Frake, Lexical Origins and Semantic Structure in Philippine Creole Spanish

Ungoliant (falai) 07:19, 15 March 2014 (UTC)Reply

Support. - -sche (discuss) 08:22, 2 November 2015 (UTC)Reply
  Done. - -sche (discuss) 08:57, 1 February 2016 (UTC)Reply


Kiyaka language edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


There are two problems here: First, we currently have the code yaf referred to a language we call Kiyaka, but which Wikipedia calls Yaka (without the noun class prefix) and which authors on Google Books seem to agree to call Yaka as well. There are a couple other languages sometimes called Yaka, but fortunately they all have other names that are more common and therefore there is no conflict, and the principal name of yaf should therefore be modified; there are very few categories associated with this one, so it should be easy to change.
Secondly, the Wikipedia article states that the codes noq, ppp, and lnz refer to its dialects, but Glottolog seems to consider them separate languages (possibly just following the ISO rather than actually making a judgement). Therefore we should consider merging these codes, if in fact there are not enough differences (some data would be helpful). @-scheΜετάknowledgediscuss/deeds 02:16, 18 June 2015 (UTC)Reply

axk currently goes by "Yaka"; if yaf were renamed "Yaka", yaf would need to use a disambiguator or axk would need to be renamed, but to what? axk's alt name of "Aka" is taken (by soh), and I haven't offhand found evidence that anyone calls it by its alt name "Beka".
Ethnologue, although it grants Ngoongo a separate code (noq), labels it a dialect of Yaka in its entry on Yaka. I can find a German reference stating "Eng mit den Yaka verwandt sind die Lonzo, Pelende und Suku." ("Closely related to the Yaka are the Lonzo, Pelende and Suku", the last of which WP and Ethnologue consider to speak a separate language.) A French reference says "Les Pelende ont un accent linguistique propre, mais ils s'entendent avec les Yaka, Suku, Lonzo, Luwa, Hungana, Tsamba, Ngongo, Mbala et Kongo." ("The Pelendes have their own linguistic accent, but they get along with / can understand the Yaka, Suku, Lonzo, Luwa, Hungana, Tsamba, Ngongo, Mbala and Kongo.") Another says "Le kipelênde comme le kiyaka est un dialecte du kikongo commun. Plus répandu, «Le kiyaka comprend quelques neuf dialectes distincts, présentant parfois des variantes assez considérables.»" ("The Kipelênde like Kiyaka is a dialect of the common Kikongo. More widespread, "The Kiyaka includes some nine separate dialects, sometimes with quite considerable variations.")
I can find a small Yaka corpus, but not any comparison of the different [might-be-]dialects.
A conservative approach might leave the codes separate until such time as someone comes along with words in them. - -sche (discuss) 16:20, 18 June 2015 (UTC)Reply
Hmm, re names: soh is also called (Jebel) Silak, or (Jebel) Sillok; Wikipedia uses the name Sillok, but I'm not finding many resources to assess how common that is (and some refer it by a hyphenated string of dialects). If we can move soh (and perhaps should anyway), then we could move the rest down without disambiguating (so axk would be Aka, and yaf would be Yaka). —Μετάknowledgediscuss/deeds 23:22, 18 June 2015 (UTC)Reply
I sometimes wonder if we should start preferring disambiguators to alt names: they'd be annoying to type, but I think the mere fact that we're discussing a three-link chain of renames (language A takes B's name, B takes C's name, C takes D's name) shows how much clearer they'd be. I pity the new user who e.g. adds aja content under an ==Adja== header, and I pity the veteran user who has to notice that that has happened.
In this case, I can't find evidence of soh being called Sillok, but the people who speak it and the place they live are called Sillok, so at least it wouldn't be unclear. Perhaps we could just rename axk and soh to have disambiguators, though: "Aka (Congo)" and "Aka (Sudan)". - -sche (discuss) 02:02, 19 June 2015 (UTC)Reply
I suggest renaming axk and soh to "Aka (Central Africa)" and "Aka (Sudan)", and then renaming yaf as originally proposed. - -sche (discuss) 06:52, 4 July 2015 (UTC)Reply
Renamed as proposed. What to do with the [might-be-]dialects remains to be determined. - -sche (discuss) 03:47, 12 July 2015 (UTC)Reply
@-sche: I've generally followed the guideline that we avoid such parenthetical geographic locators; were we to use them in general, it would change a great deal of our names. I know few others care, but perhaps we ought to put this to the community at large in the BP? —Μετάknowledgediscuss/deeds 19:56, 28 July 2015 (UTC)Reply
In this specific case, I think there is compelling reason to deviate from the general practice/guideline of preferring alt names to parentheticals even without changing that guideline: in order to use alt names here, we'd have to chain-rename ≥3 languages such that the name each one most often went by was assigned to a different one, and one of them would end up with an unattested name, which would all be extremely confusing.
As for whether/how to change the general guideline: I'll think the matter through more thoroughly before I post anything in the BP. I don't think I'd propose switching to parentheticals in all cases (I think, for instance, that Pyu/Tircul and Riang/Reang use different scripts and so are unlikely to be mixed up). I would only prefer parentheticals where people would be likely to mix up which language was meant by a given name, and where the mix-up would be likely to go unnoticed (e.g. because the script was the same). - -sche (discuss) 21:23, 28 July 2015 (UTC)Reply
Convenient links to previous discussions:
--WikiTiki89 21:38, 28 July 2015 (UTC)Reply


The Tonga languages edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


There are far more Tonga languages than anyone would want to have to deal with, but I am excluding all but two of them in this discussion for the sake of ease. We currently call toi and tog "Tonga" and "Chitonga", but both languages are called by both names, and use of alternative names seems to be vanishingly rare. Our current system is leaving me (and at least one person who tried to give a translation in one of the languages) thoroughly confused, so as much as I find them ungainly, I'd much rather we use parenthetical geographic identifiers than have to go through this madness. (Pinging @-sche as usual (and you ought to take a look at the other ones I've posted recently on this page when you get a chance, if you are so inclined.) —Μετάknowledgediscuss/deeds 05:28, 20 September 2015 (UTC)Reply

I'm all in favor of parenthetical disambiguators, but what should they be? Wikipedia calls toi Tonga language (Zambia and Zimbabwe) and tog Tonga (Nyasa) language, but that seems suboptimal to me since the parentheticals aren't parallel. Ethnologue suggests the majority of toi speakers are in Zambia and all tog speakers are in Malawi, so how about "Tonga (Zambia)" and "Tonga (Malawi)"? —Aɴɢʀ (talk) 13:25, 20 September 2015 (UTC)Reply
Support a rename of toi to "Tonga (Zambia)" and of tog to "Tonga (Malawi)". While we're at it, I think we prefer (do we?) to drop "ki-", "chi-", "gi-" and such African language-name prefixes, so toh could be renamed from "Gitonga" to "Tonga (Mozambique)". Happily, to is distinct as Tongan, and we don't have tnz yet, but it seems to be consistently called "Ten'edn" or "Maniq" (the latter being properly an ethnonym) by its speakers, who SIL says are totally unfamiliar with "Tonga" as the name of a language. - -sche (discuss) 23:50, 26 September 2015 (UTC)Reply
@-sche: I don't know if we've talked about it before, but I think in general we should avoid language-name prefixes. However, there are some exceptions; I prefer "Luganda" to "Ganda", for example, because it's far more commonly used. I'm fine with renaming Gitonga as you suggest. By the way, thanks for dealing with some of these language issues; Kikuyu and Rwanda-Rundi are still lingering on this page, so please give them some love/research when you have a chance.Μετάknowledgediscuss/deeds 04:07, 27 September 2015 (UTC)Reply
  Done. - -sche (discuss) 10:08, 18 February 2016 (UTC)Reply


Wolof edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Gambian Wolof (wof) should be merged into Wolof (wo), IMO. Ethnologue says "Senegalese Wolof [wol] intelligible by speakers of Gambian Wolof but with significant enough differences to require adaptation of materials", which seems to have been their motive in splitting these lects (like so many others). - -sche (discuss) 06:51, 25 February 2016 (UTC)Reply

Support. We can make Gambian Wolof a regional dialect of Wolof and tag relevant words {{lb|wo|Gambia}}. —Aɴɢʀ (talk) 08:38, 25 February 2016 (UTC)Reply
  Done (no entries used the code). - -sche (discuss) 05:11, 29 February 2016 (UTC)Reply


Raga or Hano edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming lml

It seems to be more common to call this language "Raga", as Wikipedia does. Compare e.g. google books:"Raga" "vavine" (several books mentioning the language) vs google books:"Hano" "vavine" (no relevant hits). - -sche (discuss) 09:11, 27 February 2016 (UTC)Reply

That's a good way to demonstrate commonness of use. Support renaming to Raga. —Μετάknowledgediscuss/deeds 01:44, 4 March 2016 (UTC)Reply
Renamed. Compare also google books:"Raga" "wai" "water" vs the same with "Hano". - -sche (discuss) 07:23, 22 March 2016 (UTC)Reply


Sauk or Ma Manda edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming skc

Currently called "Sauk", but Wikipedia, SIL publications on the language, and work by Alexandra Aikhenvald all call it "Ma Manda". A happy side effect of the move would be that we could add "Sauk" as an alias for sac, as it is probably the second most common name for that language. —Μετάknowledgediscuss/deeds 06:34, 3 November 2015 (UTC)Reply

Renamed per nom. Sure enough, the only entries with "Sauk" translations are referring to sac. - -sche (discuss) 09:07, 27 February 2016 (UTC)Reply


Waray-Waray and Warray edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Considering war and wrz

Our current situation is to call war "Waray-Waray" and wrz "Waray"; this is not necessarily an optimal solution. Wikipedia chooses to call war "Waray" and wrz "Warray"; although "Warray" is less common than "Waray" to refer to wrz (as far as I can tell), this gives the commonest name of war to that language, which probably deserves priority due to being much more studied. At Template talk:war, you can see that the idea to rename war to "Winaray" was rejected and Liliana's choice of "Waray-Waray" won out. However, it's clear that our current situation has caused some confusion (User:DTLHS/cleanup/mismatched translation codes shows a lot of misuse of wrz when war was intended). Basically, what we have now isn't bad, but the fact is that it's resulted in mismatched codes, so we might want to try a different approach. —Μετάknowledgediscuss/deeds 20:46, 14 September 2015 (UTC)Reply

I'd prefer minimizing ambiguity by calling war "Waray-Waray" and wrz "Warray" so that no language at all is called by the ambiguous name "Waray". —Aɴɢʀ (talk) 14:56, 15 September 2015 (UTC)Reply
To be fair I think most of the mistakes were caused when the language was renamed but the translations weren't edited, not by whoever added them in the first place. DTLHS (talk) 23:49, 26 September 2015 (UTC)Reply
I've renamed wrz in the manner Angr suggested. - -sche (discuss) 09:04, 27 February 2016 (UTC)Reply


Sura or Mwaghavul edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming sur

The name "Sura" is ambiguous (as noted on Talk:am), and the name "Mwaghavul" seems to be more common (compare google books:"Sura language", google books:"Mwaghavul language") — and is, in any case, quite common — so I propose to rename sur from "Sura" to "Mwaghavul". - -sche (discuss) 20:25, 23 August 2015 (UTC)Reply

Support. —Μετάknowledgediscuss/deeds 04:40, 31 August 2015 (UTC)Reply
Renamed. - -sche (discuss) 05:06, 28 February 2016 (UTC)Reply


Tingal and Tegali edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


In 2011, the ISO retired the code for Tingal [tie], merging it into Tegali [ras]. I think we should follow suit. Wikipedia notes that there is dialectal variation in Tegali, but it's not between Tegali proper and Tingal, it is rather between Tegali proper and Rashad (but even those dialects are "nearly identical"). - -sche (discuss) 06:15, 11 August 2015 (UTC)Reply

  Done. - -sche (discuss) 03:26, 29 February 2016 (UTC)Reply


Batak or Palawan Batak edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming bya

-sche has pointed out that we call this language "Batak"; that is an awful idea, due to the existence of the Batak languages. To reduce confusion, we should do as Wikipedia does, and call it "Palawan Batak". —Μετάknowledgediscuss/deeds 06:18, 29 February 2016 (UTC)Reply

@Μετάknowledge: I support this renaming. — I.S.M.E.T.A. 01:30, 4 March 2016 (UTC)Reply
Support, naturally. "Palawan Batak" still sounds like it might be a Batak language, but at least it stops people from seeing one of the Batak languages and entering it into Wiktionary as bya (compare the recent rename of "Pomo" to "Pol Pomo"). - -sche (discuss) 02:03, 4 March 2016 (UTC)Reply
  Done. - -sche (discuss) 07:09, 14 April 2016 (UTC)Reply


Shuadit edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Shuadit (sdt) aka Judeo-Occitan or Judeo-Provençal is most definitely not an independent language. The literature seems to be quite sure on this point: Banitt refers to it as a "langue fantôme", Vouland as a "non-langue", and Alessio as a "langue imaginaire", to quote three especially scathing francophone scholars. One main scholar is Szajkowski, who seems to have made up a great deal about it (including the name Shuadit, of which there is no evidence of use) and who "was no linguist, and his knowledge of Occitan was quite poor" (Strich and Jochnowitz). Moreover, the so-called last speaker, and a chief primary source, Armand Lunel, was evidently a semi-speaker who was not actually fluent in the "language". Glottolog sums all this up by saying: "This entry [Shuadit] is spurious. This means either that the language denoted cannot be asserted to be/have been a language distinct from all others, or that the language denoted is covered in another entry." To the extent that anyone wants to enter the paltry Hebrew-script text, it can be done as Old Provençal or Occitan, depending on how old it is. —Μετάknowledgediscuss/deeds 06:33, 14 April 2016 (UTC)Reply

@Metaknowledge: Good to know. Is there any evidence that it was a dialect, or was it basically just a script variant like Judeo‐French? --Romanophile (contributions) 06:41, 14 April 2016 (UTC)Reply
Pretty much just a script variant. That was a popular thing to do, because using the Latin script was seen as too associated with the Church and distant from the Jewish educational tradition. There have been many other isolated incidences of languages like Urdu and Samogitian being written in Hebrew script as well. —Μετάknowledgediscuss/deeds 06:45, 14 April 2016 (UTC)Reply
@Metaknowledge: do you think that much of Judaeo‐Romance contains only superficial differences? Most of them are considered ‘extinct’ by Wikipedia, bearing Ladino and Judeo‐Italian. Judaeo‐Italian notwithstanding, Ladino appears to be the language of Latin Jews around the world. --Romanophile (contributions) 07:13, 14 April 2016 (UTC)Reply
I think so, but I'd like to read through the chapter on each in the Handbook and consult some other sources before deciding. I obviously like Jewish languages, but I think that the line between language and dialect is being abused, and that we should find better ways to document these. —Μετάknowledgediscuss/deeds 07:20, 14 April 2016 (UTC)Reply
Merge into (Old) Provençal / Occitan per nom. - -sche (discuss) 03:23, 19 April 2016 (UTC)Reply
Merged. - -sche (discuss) 02:56, 28 May 2016 (UTC)Reply


Zarphatic edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This name is used rather uncommonly, and really only by Jewish philologists, not mainstream linguists. It should be changed to the much more common "Judeo-French".

Unrelated to the name change, I'm not fully sure that we should have this as a separate language. Kiwitt and Dörr (2015) say: "It should be noted, however, that the major part of linguistic data attested in Judeo-French sources is simply common Old French written in Hebrew script, with some texts showing little to no register variation in comparison with Christian Old French sources." They go on to discuss one extensive text, a biblical glossary, where only 6% of words were not attested in Christian Old French texts. Basically, this is similar to Hindi and Urdu — is it worth keeping separate? (And if you think it'd be strange to have Hebrew-script entries under ==Old French==, remember that we have Arabic-script entries under ==Afrikaans==.) @-sche, Renard Migrant, Wikitiki89Μετάknowledgediscuss/deeds 04:52, 13 April 2016 (UTC)Reply

Interesting. I’d be fine with a merge or a rename. Script variants and dialectisms can simply be marked with Judeo-French. I would love to work on this dialect, but I have no idea where to find any texts. --Romanophile (contributions) 05:08, 13 April 2016 (UTC)Reply
Before commenting here, I was planning to read the Judeo-French chapter of my new copy of the Handbook of Jewish Languages, but I now realize that Metaknowledge also recently acquired this book and has probably already this chapter and probably only started this discussion because of that. So I'll just assume that his conclusion is the same that I would have drawn and say that I agree that this should be merged with Old French. --WikiTiki89 21:10, 13 April 2016 (UTC)Reply
Precisely. I intend to work through the entire book, improving how we cover Jewish languages. —Μετάknowledgediscuss/deeds 01:10, 14 April 2016 (UTC)Reply
Rename per nom; see also ngrams. Also merge per nom; perusing the references that turn up if I just search for references that mention both lects (google books:"Judeo-French" "Old French"), I find that they agree:
  • Raphael Patai, Encyclopedia of Jewish Folklore and Traditions (2015, →ISBN, page 316: "Judeo-French is Medieval (Old) French as spoken and written by French and Rhenish Jews. It differs from the other “Judeo” languages in that there were no dialectal differences between it and the Old French spoken by the non-Jews[.]"
  • Aaron D. Rubin, ‎Lily Kahn, Handbook of Jewish Languages (2015, →ISBN, page 139: "However, this term does not imply the existence of a set of linguistic features common to these sources that would allow identifying a 'Judeo-French' language or dialect distinct from the varieties of Old French encountered in Christian sources."
Yes, this will result in Hebrew-script Old French (alternative-form-of) entries, and that is OK. - -sche (discuss) 00:16, 14 April 2016 (UTC)Reply
If I remember correctly, a number of Old French words are attested first in Rashi's writings, and those writings been used in recent years by scholars of Old French to fill in gaps in knowledge of other aspects of the language. I'm sure some of our etymologies already include those Zarphatic words as Old French, so we might as well make it official. I agree we should both rename the lect and merge it with Old French. Chuck Entz (talk) 02:29, 14 April 2016 (UTC)Reply
Merged. - -sche (discuss) 03:01, 28 May 2016 (UTC)Reply


Kichwa edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Hello everyone. I'm curious as to how one would go about requesting an exceptional code for the standard Kichwa language? There are several SIL/Ethnologue codes for various Kichwa dialects (Imbabura (qvi), Chimborazo (qug), Cañar Highland (qxr), etc), but not one is for the standard Kichwa language that's taught in schools and used by the government in Ecuador. There is a common Quechua code (qu) is currently used for Quechua Wiktionary and Wikipedia, but both projects are exclusively written in standard Southern Quechua. Both Kichwa and Southern Quechua are part of the Quechua II branch of the Quechuan languages, but they are different dialects with different standardized grammars and different standardized writing conventions. --Dijan (talk) 18:28, 12 May 2014 (UTC)Reply

Hmm... how mutually intelligible are Kichwa and Quechua? It wasn't that long ago that the various Quechua dialects' codes were removed from Module:languages, though I'm having trouble locating the discussion(s) that led to that.
If Kichwa is to be included, we could either (1) pick the code of one of the Kichwa dialects and use it for all of Kichwa (the way we use gcf for all of Antillean Creole), or (2) design our own code, like qwe-kch, according to the system outlined in WT:LANGCODE, point 3.3. - -sche (discuss) 04:25, 27 May 2014 (UTC)Reply
They are mutually intelligible for the most part (similar to differences between Danish and Swedish), but they are also significantly different from each other to be considered separate. Linguistically, Kichwa belongs to a different subgroup of Quechua. The grammar of Kichwa is more simplified (loss of possissive suffixes, loss of the voiceless uvular fricative, etc) and the vocabulary is affected by native languages spoken before the Incan conquest of the territory of today's Ecuador and Colombia (meaning, Quechua was imposed as a foreign language, whereas it is a native language in the regions where Southern Quechua is spoken - in southern Peru and Bolivia).
I was referring to designing our own code and using that as an umbrella for all the Kichwa varieties - which now all use a standardized alphabet different from the Peruvian varieties (such as Southern Quechua), but I couldn't find the procedure for it.
There was an attempt to create a separate Kichwa Wikipedia, but apparently no one got around to it and it got complicated as somone pointed out that an official ISO code must be requested specifically for the standard variety. And for some the problem was that it was trying to use the Chimborazo (qug) code (which is one of the most widely spoken varieties of Kichwa). Apparently the issuing of codes is very strict on Wikipedia. --Dijan (talk) 20:57, 27 May 2014 (UTC)Reply
@Liliana-60, what are your thoughts on this? The Quechua dialects themselves were merged into qu, though I can't find the discussion that led to that. - -sche (discuss) 03:08, 29 February 2016 (UTC)Reply
If none of the ISO codes cover this we should create our own code. The languages seem different enough that they warrant separate treatment, at least. -- Liliana 21:43, 29 February 2016 (UTC)Reply
OK, I have added qwe-kch "Kichwa". @Dijan, I'm sorry neither I nor anyone else acted upon this for so long. The contents of Category:Standard Kichwa will presumably need to be re-headered and moved into Category:Kichwa language. - -sche (discuss) 22:52, 29 February 2016 (UTC)Reply


Persian, Dari, etc edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Although LANGTREAT notes that they should be subsumed into fa, the following codes still exist in Module:languages:

  • pes, for "Western Persian", the most common variety of Persian. We should merge it into fa because it *is* fa.
  • prs, Eastern Persian / Dari, the variety of Persian spoken in Iran Afghanistan. Its status as a separate language, and its very name 'Dari', were products of Afghanistani politics. Not even Afghani speakers of the language call it 'Dari' or consider it separate from Persian; we shouldn't consider it separate, either.
  • aiq, Aimaq, a variety spoken by nomads in Afghanistan and Iran. It is sometimes considered specifically a variety of Dari (which is itself little more than another name for Persian, as explained). It differs from standard Persian mainly in matters of pronunciation, something we usually handle with {{a}} rather than separate L2s.
  • haz, Hazaragi, another Afghan variety. WP summarizes scholarly opinion (with citations, for which look here): "The primary differences between Standard Persian and Hazaragi are the accent and Hazaragi's greater array of Mongolic loanwords. Despite these differences, Hazaragi is mutually intelligible with other regional Persian dialects."
  • deh and phv, Dehwari and Pahlavani, which it is hard to find information on because even WP simply redirects the words to "Persian".

As indicated above, my opinion is that we should merge all of those codes into fa.
Incidentally, LANGTREAT originally also banned Tajik (tg), but this was not supported by scholarship or by our own practice (we had hundreds of Tajik entries), so after two discussions, I updated the page to note that Tajik is allowed. LANGTREAT made no mention of Judeo-Persian (jpr), Bukhari (bhh), Judeo-Tat (jdt) or Tat (ttt), and past discussions of them have assumed they were separate languages, so I also updated the page to reflect that. - -sche (discuss) 05:16, 20 November 2013 (UTC) (fixed think-o)Reply

The Persian lects are an interesting issue; they are on the whole pretty similar, but Persian and Tajik have separate literary and cultural traditions, and I believe Dari does too. I think it is best to keep them separate. The Jewish varieties are often written in the Hebrew script and have a separate cultural tradition, so I think it would probably be handy to keep them separate as well. All the rest probably ought to be merged into their macrolanguages, unless there are script conflicts I'm unaware of (it is, of course, easier to keep with one script per language). —Μετάknowledgediscuss/deeds 05:23, 20 November 2013 (UTC)Reply
The difference between Dari and Persian is not great AFAIK, there are some references on Wikipedia. Tajik should stay separate, not just because it's in Cyrillic. It's very different from Persian and has many Russian and Turkic loanwords. There is also a significant difference in phonology (vowels). Persian e, o and â are usually i, u and o in Tajik. ZxxZxxZ (talkcontribs) may be able to say a bit more. --Anatoli (обсудить/вклад) 05:36, 20 November 2013 (UTC)Reply
We also currently include "Parsi" (prp) and "Parsi-Dari" (prd), which Wikipedia suggests are spurious(!). - -sche (discuss) 05:58, 3 December 2013 (UTC)Reply
Regarding Dari, to merge Dari into Persian, we would have to figure out what to do with transliterations. In standard Persian, ē merged with ī into what we transliterate as i (no diacritic) and ō merged with ū into what we transliterate as u. In Dari, ē and ī and ō and ū are still differentiated. Furthing complicating the problem, standard Persian e, o, ey, and ow are pronounced i, o, ay and aw in Dari, although these differences are not phonemic. The other differences are not as much of a problem, but see a brief description at w:Dari#Phonology. --WikiTiki89 14:37, 4 December 2013 (UTC)Reply

The followed was copied here by User:-sche from User talk:Dijan:

Hi, Do you think Dari can be merged with Persian? See Wiktionary:Requests_for_moves,_mergers_and_splits#Persian, Dari, etc. {{context|Dari|lang=fa}} can still be used. CC ZxxZxxZ (talkcontribs). --Anatoli (обсудить/вклад) 04:56, 4 December 2013 (UTC)Reply

That has been the practice on Wiktionary. Dari should be under the Persian heading with {{context|Dari|lang=fa}} label. That is what we do already. Take a look at فاکولته, for example. Regarding the transliteration of long vowels, ZxxZxxZ (talkcontribs) has been trying to implement a classical Persian transliteration (which is pretty much used for Dari by scholars) as a standard for all Persian entries. So far, it's been a slow and selective process. I'm not opposed to it and it can easily be indicated as the standard Wiktionary practice in the Appendix:Persian transliteration. --Dijan (talk) 06:35, 5 December 2013 (UTC)Reply
Thanks. Do you mind posting your answer on Wiktionary:Requests_for_moves,_mergers_and_splits#Persian, Dari, etc and perhaps address Persian/Dari vowel differences raised? Where can I look at the classical transliteration? --Anatoli (обсудить/вклад) 00:18, 6 December 2013 (UTC)Reply
By the way, if you happen to know anything about Parsi or Parsi-Dari (and about their relationship to Farsi and Dari), your input on that subject would also be appreciated over in the same section. :) - -sche (discuss) 00:53, 6 December 2013 (UTC)Reply
I support keeping Dari (and Western Persian) under Persian heading, but I'm not sure what to do about transliteration. By the way, Eastern Persian / Dari (prs) is the variety of Persian spoken in Afghanistan, not Iran, that of Iran is the first one, Western Persian (pes). --Z 17:48, 12 December 2013 (UTC)Reply
Oh, yep, I've fixed my think-o. - -sche (discuss) 21:25, 12 December 2013 (UTC)Reply
I've merged pes and prs into fa. - -sche (discuss) 22:42, 27 May 2014 (UTC)Reply
I've deleted "Parsi" (prp) and "Parsi-Dari" (prd). - -sche (discuss) 06:36, 5 February 2015 (UTC)Reply
James Minahan says "The Hazara language, called Hazaragi, is a Farsi dialect, although the Hazaras are physically Mongol. The intermixture of the Indo-European and Mongol linguistic groups resulted in a dialect of Dari Persian that contains extensive words and forms from Farsi, Turkic, and Mongol. [...] Most Hazaras also speak Dari Persian [...] as a second language. In Iran most are bilingual, speaking both Hazaragi and Farsi. [...] Until the 1980s educated Hazaras used Farsi or Arabic as the literary language, but a movement to create a Hazaragi literary language has gained momentum." This suggests that Hazaragi is a separate language; we also already have a few translations into it. Therefore, I've kept it separate and updated WT:LT accordingly.
Barbara West, in the Encyclopedia of the Peoples of Asia and Oceania, writes that the Aimaq speak "Aimaq, a dialect of Dari or Afghan Persian"; Brian Williams writes that "Aimaqs also have a strong Persian admixture, and their language is Dari or Farsi", Jake Kircher writes that "Once there was a generally used, common Aimaq language but now, few seem to speak it anymore. Dialects spoken today resemble Dari (Afghan eastern Farsi) admixed with words of Mongolian and Turkic origin." In general it seems that it should be handled the same way as Dari, via {{lb}} and {{a}}.
- -sche (discuss) 03:26, 4 March 2016 (UTC)Reply
Several references consider Pahlavani an alternative term for Pahlavi, but the 2003 International Encyclopedia of Linguistics says Pahlavani is still spoken today in a village in Afghanistan, Haji Hamza Khan, where it is "similar to Dari Persian but still distinct" (earlier editions of the IEL are explicit that this is the only village where the language is spoken). I suppose it can be left unresolved for now.
Dehwari is spoken by Persians in Baluchistan; several references seem to consider it to be Persian, but Denys Bray's 1934 The Brāhūī problem writes as if it and Persian are separate languages; I suppose it too can be left unresolved for now.
- -sche (discuss) 04:01, 4 March 2016 (UTC)Reply


the Lega lects edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently have separate codes for three lects considered part of the Lega macrolanguage: lea, lgm, and khx. These are reasonable to separate into two languages, lgm (which should be called Lega-Ntara) and lea (which should be called Lega-Malinga) as there is 67% mutual intelligibility, but khx is clearly a variety of lea. All this is per the treatment of the Beya dialect of lea in The Bantu Languages. —Μετάknowledgediscuss/deeds 04:23, 22 November 2015 (UTC)Reply

Support; merge khx into lea, with lgm remaining separate, and name everything per nom. I'm not sure why Wikipedia names the two main dialects using placenames. The full array of alternate names I encountered in (cursorily) researching the matter:
  • Mwenga Lega = Lega-Ntara / Lega Ntara (variously translated in refs as "Lower Lega", "Upper Lega" or "Eastern/Northern Lega") = Isile, Ishile, Kisile; Mwenda-Liga
  • Shabunda Lega = Lega-Malinga / Lega Malinga (variously translated in refs as "Upper Lega", "Lower Lega" or "Forest Lega" or "Western/Southern Lega") = Lega (Kilega) / Liga (Kiliga) proper; dialects: Kanu (Kikanu), Gala (Kigala), Yoma (Kiyoma), Sede (Kisede), Gonzabale, Beya (Beia), and possibly (Ki)Nyamunsange and Banagabo and Kabango and Bene
- -sche (discuss) 22:22, 25 November 2015 (UTC)Reply
I'v merged khx into lea. However, as names go, "Lega-Shabunda" and "Lega-Mwenga" seems to be more common than "Lega-Malinga" or "Lega-Ntara". - -sche (discuss) 05:06, 29 February 2016 (UTC)Reply
(Re)named "Lega-Shabunda" and "Lega-Mwenga". - -sche (discuss) 18:55, 22 March 2016 (UTC)Reply


Kikuyu edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming ki

This is a hard one, and I'm not advocating one way or the other, just wishing to raise the issue. Ngrams reveal that in 1992, "speaking Gikuyu" and "Gikuyu language" became more common than "speaking Kikuyu" and "Kikuyu language" (as well as becoming the linguistic standard), yet overall the spelling "Kikuyu" is still more common, presumably in speaking of the people, who are less obscure than their language. We follow Wikipedia in using the spelling "Kikuyu", but this spelling is clearly no longer favoured for the language (if you're curious, the "g" is to reflect the etymon, Kikuyu Gĩkũyũ). Should we change it? —Μετάknowledgediscuss/deeds 19:55, 5 September 2015 (UTC)Reply

As you note, there are phrases where ngrams suggets Gikuyu is now more common as a name for the language — but there are also phrases (1, 2) where Kikuyu is still more common even as a language name. I'd stick with the current spelling. - -sche (discuss) 05:00, 28 February 2016 (UTC)Reply
Not renamed at this time. - -sche (discuss) 23:40, 2 July 2016 (UTC)Reply


Kristang edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Judging by Ngrams, "Malaccan Creole Portuguese" is the least common name for this language; more common is "Malacca Creole Portuguese" with no n, and most common is "Kristang". In Glottolog's list of materials on it, I note that most of the modern material on it (by Baxter and Marbeck) calls it Kristang. I suggest renaming. That entails updating several entries and moving several categories. - -sche (discuss) 19:01, 20 March 2016 (UTC)Reply

Support. —Μετάknowledgediscuss/deeds 00:19, 2 April 2016 (UTC)Reply
Kristang gets a misleadingly high number of hits because it’s also the name of the people that speaks it, and part of the synonym Papia/Papiah/Papiá Kristang. — Ungoliant (falai) 01:05, 2 April 2016 (UTC)Reply
@Ungoliant MMDCCLXIV: Given that you're the most knowledgeable about PT-based creoles, what would you prefer? —Μετάknowledgediscuss/deeds 17:36, 3 April 2016 (UTC)Reply
Kristang, but Papia Kristang or Malacca Creole Portuguese are also good options. The most important works about this language use Kristang more prominently. — Ungoliant (falai) 19:30, 3 April 2016 (UTC)Reply
Done. - -sche (discuss) 07:30, 3 July 2016 (UTC)Reply


Loreto-Ucayali Spanish edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently have a code spq for this dialect of Spanish; Wikipedia has an article for it at Amazonic Spanish which states that "Ethnologue's reasons for doing this [making a separate code for it] are poorly documented." Although it has some mild differences, it is clearly a dialect of es and should be merged into it. (There are no entries, but we should record the merger.) —Μετάknowledgediscuss/deeds 17:35, 3 April 2016 (UTC)Reply

Support. — Ungoliant (falai) 20:27, 3 April 2016 (UTC)Reply
Merge, IMO. (The only thing that gives me pause is the difference in attestation requirements: if Amazonian Spanish isn't well documented, then treating it as a separate lect allows its entries to be held to a lower attestation requirement. But the same could be said of a lot of varieties that don't have their own codes, e.g. New Mexico and Southern Colorado Spanish, for which there are several published references.) - -sche (discuss) 00:50, 4 April 2016 (UTC)Reply
  Done. - -sche (discuss) 19:23, 2 July 2016 (UTC)Reply


Shempire Senoufo edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The Senoufo lects are a mess, and we have no consistency in naming them (some are Senoufo, some Sénoufo, some without the Senoufo at all). This one, seb seems not even to be a separate lect at all, but instead what Supyire spp is called in Côte d'Ivoire. —Μετάknowledgediscuss/deeds 20:25, 3 April 2016 (UTC)Reply

Yes, Wikipedia and Omniglot agree they are the same. (The old 2003 International Encyclopedia of Linguistics said their "Relationship [was] undetermined" at that time.) As for the other Senoufo lects: I noticed one while checking translations at water, and removed "Senoufo" from its name before I added entries in it because I saw how rarely it was actually referred to with "Senoufo" in the name. Supyire too seems to be mostly referred to without "Senoufo". - -sche (discuss) 00:45, 4 April 2016 (UTC)Reply
Eventually we'll have to get to renaming them. For now, I just wanted to excise duplicates. Also, thanks for the archiving.Μετάknowledgediscuss/deeds 01:55, 4 April 2016 (UTC)Reply
Merged. - -sche (discuss) 19:26, 2 July 2016 (UTC)Reply


Brythonic to Brittonic edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


@Angr According to w:Brittonic languages, the name "Brittonic" is far more common than "Brythonic", which is apparently rather outdated. We should use the more common name. —CodeCat 17:30, 12 April 2016 (UTC)Reply

@CodeCat This is pretty funny. I'm fairly certain I used to use "Brittonic" and then you corrected me to "Brythonic" (rightfully, since it is the one we use currently), but it's humorous to me that we're now suggesting the change. I definitely think we should change it to "Brittonic". —JohnC5 18:21, 12 April 2016 (UTC)Reply
@CodeCat, JohnC5: What about the (unsourced) "Some authors reserve the term Brittonic for the modified later Brittonic languages after about AD 600." statement? — I.S.M.E.T.A. 14:45, 13 April 2016 (UTC)Reply
Google Books Ngrams shows what looks to me like virtually a statistical tie since 1950, though Brythonic has been more common since the turn of the century. I really don't have a strong preference either way. —Aɴɢʀ (talk) 09:28, 14 April 2016 (UTC)Reply
@Angr: That Ngrams result is very interesting and makes me lean towards keeping it as “Brythonic.” —JohnC5 14:56, 14 April 2016 (UTC)Reply
@CodeCat, JohnC5, Angr: Yes, keep as “Brythonic”. — I.S.M.E.T.A. 15:27, 14 April 2016 (UTC)Reply
Yes, based on the ngram, keep as "Brythonic". - -sche (discuss) 03:02, 28 May 2016 (UTC)Reply
Not renamed at this time. - -sche (discuss) 04:49, 3 July 2016 (UTC)Reply


Extinct languages of the Marañón River basin edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I reckon that we should add exceptional codes for the ones that have words directly recorded from them, even though it's precious few for each.

  • Palta (separate article) could be qfa-jiv-pal, in imitation of Linguist list's jiv-pal, but given that its family assignment is not one hundred percent certain and a three-part code is annoying and abnormal, we could also settle for sai-pal. I don't know why Jivaroan's language family uses qfa rather than sai, which seems to be our default for unsorted South American languages.
  • Rabona could be sai-rab.
  • Patagón could be sai-pat.
  • Bagua could be sai-bag.
  • Copallén could be sai-cop.
  • Tabancale could be sai-tab.
  • Chirino could be sai-chi.
  • Sácata could be sai-sac.

I'm not sure I see a point in adding codes for languages where the only words are elements in toponyms or names, rather than directly recorded. However, Puruhá language, Cañari language, Panzaleo language, Caranqui language, and others can be added if there is interest. After all, ISO already has codes for European languages like Dacian that aren't much better attested, and I suppose we could have an entry or two. @-scheΜετάknowledgediscuss/deeds 05:13, 26 May 2016 (UTC)Reply

Re "why Jivaroan's language family uses qfa rather than sai": presumably human error. It could be updated to "sai-jiv". Re whether to use "sai-pal" or "sai-jiv-pal": for consistency with other codes ("nai-yuc-yav", etc) and with the schema described in WT:LANG, we should use "sai-jiv-pal" if we accept that it was a Jivaroan language. But as you say, the family identification is speculative (although the evidence which does exist is consistent with it). I suppose we could use "sai-pal" to be 'safe' / 'conservative' about the family identification and get a shorter code... this also lets us add the others as "sai-" codes without worrying we ought to reassign them if we later create a family code for the families they belong to. (Indeed, we already have a code for the Cariban family which Campbell and Grondona say Patagón belonged to.) - -sche (discuss) 22:27, 26 May 2016 (UTC)Reply
@-sche: I think it's better not to make any assumptions for these languages' genetic affiliations. And do you want to add the languages I mentioned in my last paragraph? —Μετάknowledgediscuss/deeds 04:02, 27 May 2016 (UTC)Reply
I've added Palta as sai-pal, added Rabona, Patagón, Bagua, Copallén Tabancale, Chirino and Sácata, and also renamed the Jivaroan code to sai-jiv and changed Esmeralda's code from qfa-und-esm to sai-esm to fit the usual naming scheme. - -sche (discuss) 15:42, 29 June 2016 (UTC)Reply
I have added Puruhá as sai-prh. There was at one point a grammar of it (though it has been lost), so we know it existed as a discrete lect. And based on personal- and place-names, words in it have been reconstructed by scholars. Even if the only words in it we can add are words in the Reconstruction: namespace, that does seem worth having a code for (a code also lets us reference it when giving the etymologies of those personal- and place-names). The other languages could probably be given codes on the same basis (if words in them have been reconstructed). - -sche (discuss) 00:56, 1 July 2016 (UTC)Reply
I've also added codes and entries for Cañari, Panzaleo and Caranqui. Everything here is done, I think. - -sche (discuss) 20:34, 2 July 2016 (UTC)Reply


Saliba languages edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently have a language called Saliba (sbe) and one called Sáliba (slc). In my mind, having two languages' names only differ in a diacritic is not acceptable. It confuses automated programs as much as human editors. I'm not sure there are any acceptable alternative names, so I propose using geographic disambiguation for sbe as "Saliba (Papua New Guinea)". @-scheΜετάknowledgediscuss/deeds 04:15, 26 June 2016 (UTC)Reply

And slc should be "Saliba (Colombia)". As a side note, w:Sáliba language redirects to w:Saliba language. —Aɴɢʀ (talk) 13:30, 26 June 2016 (UTC)Reply
Alternately we could call slc "Saliva". This seems to get about 10× more Ghits for both the ethnic group and the language (though interference from saliva is possible). --Tropylium (talk) 04:34, 27 June 2016 (UTC)Reply
Indeed the current names are so confusing that I got them backwards when I added the only three entries we have in the two languages. I think that disambiguating them both with parentheticals is clearer than allowing one to keep the ambiguous name (either while renaming the other to "Saliva" or while disambiguating the other). Current practice suggests that we should rename slc to "Saliva" or "Sáliva", but I wouldn't mind if we started making more frequent use of disambiguators instread. - -sche (discuss) 08:50, 27 June 2016 (UTC)Reply
My first choice is "Saliba (Colombia)", my second choice is "Sáliva". It's bad enough we have Anus language to protect from puerile vandalism without also having Saliva language. —Aɴɢʀ (talk) 09:40, 27 June 2016 (UTC)Reply
Alright, I'll rename them both to use parenthetical disambiguators. PS, don't forget Category:Anal language. - -sche (discuss) 05:16, 28 June 2016 (UTC)Reply
  Done, except that per previous discussion on using geographic disambiguators rather than national ones, I used "New Guinea" rather than "Papua New Guinea". (Other languages also use "New Guinea" as a parenthetical disambiguator in their canonical names, and only mention "Papua New Guinea" in alt names because SIL uses national disambiguators like that.) - -sche (discuss) 05:24, 28 June 2016 (UTC)Reply


Loarki and Gade Lohar edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Ethnologue encoded this language twice: once as lrk "Loarki" (the name its 20,000 Pakistani speakers call it), and a second time as gda "Gade Lohar", the name its ~500 speakers on the Indian side of the border call it. (The International Encyclopedia of Linguistics entry on Gade Lohar conservatively only says the languages "may be the same" as Loarki, and notes its long list of other names: Gaduliya Lohar, Lohpitta Rajput Lohar, Bagri Lohar, Bhubaliya Lohar, Lohari, Gara, Domba, Dombiali, Chitodi Lohar, Panchal Lohar, Belani, and Dhunkuria Kanwar Khati. The IEL entry on Loarki is more explicit, breaking down the population by country and countain Gade Lohar's 500 speakers as Loarki speakers, because Loarki is "probably the same as Gade Lohar in Rajasthan, India, a Rajasthani language.") I propose to merge gda into lrk. - -sche (discuss) 20:50, 11 August 2015 (UTC)Reply

  Done - -sche (discuss) 03:23, 11 July 2016 (UTC)Reply


Dhuwal edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The ISO has retired the code duj and replaced it with dwu (in the process of splitting off dwy "Dhuwaya"). If we want to follow the ISO, all our Dhuwal entries and categories need to be switched from duj to dwu, which seems like a lot of unnecessary bother. (Whoever does this should also add dwy to the module.) - -sche (discuss) 05:54, 24 February 2016 (UTC)Reply

On one hand, it's unnecessary; on the other hand, I think it would be ideal to follow ISO in all cases where we don't have a well articulated reason not to do so. —Μετάknowledgediscuss/deeds 06:00, 24 February 2016 (UTC)Reply
Support deprecation of duj. —Μετάknowledgediscuss/deeds 06:19, 29 February 2016 (UTC)Reply
  Done. (I also recoded Elfdalian from the nonstandard dlc to the standard ovd, per a BP discussion.) - -sche (discuss) 02:56, 7 July 2016 (UTC)Reply


Kiriri edit

 

The following discussion has been moved from the page User talk:-sche.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


What do you think about adding a code for this language, and under what name? Wikipedia describes it at Katembri language; they cite Fabre for the claim that this language is only preserved in a single brief wordlist, where it is called Kiriri (the wordlist is on page 22 (section 3.4) of this pdf). Regardless, that document does seem to be a good place to find more words for water. —Μετάknowledgediscuss/deeds 03:24, 2 March 2016 (UTC)Reply

There's a Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Katembri language, and another Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Xukuru language? Well, that's confusing.
The fact that there's only a limited number of words is no reason not to include the language, but it would be good to avoid the ambiguous name Kiriri. How about calling them Katembri and Xukuru, like Wikipedia does? Wait, (as a minor point of curiosity,) if the wordlist is labelled Kiriri, where'd the alternative name come from?
The difficult part will be assigning codes, given that the family affiliation is unclear. - -sche (discuss) 03:45, 2 March 2016 (UTC)Reply
Given the naming issues, I'm suddenly confused about whether I have correctly identified the wordlist being referred to. I don't know anything about any of these languages, so I feel lost (it's so much better in Austronesian, for example, where I at least feel like I have a hold on what goes where). Anyway, that naming scheme makes sense; we can use qfa codes and not worry about the families, no? —Μετάknowledgediscuss/deeds 05:01, 2 March 2016 (UTC)Reply
qfa is the prefix for exceptional family codes. All of our exceptional language codes which start with qfa do so because they start with a family code that starts with qfa, like qfa-ctc-col. There's been at least one case where we've created a family code for an accepted family (qfa-len, the Lencan languages) in order to use it in constructing a language code (qfa-len-slv for Salvadoran Lencan), but Wikipedia notes that scholars aren't certain what family either Kiriri belonged to, so we couldn't do that here because we couldn't accurately, confidently assign either one a family code (even an exceptional family code). I suppose we could construct codes starting with qfa-und, like qfa-und-ktm for Katembri. I wouldn't want to use bare qfa-___ (e.g. qfa-ktm for Katembri) because it would look like a family code. - -sche (discuss) 08:21, 3 March 2016 (UTC)Reply


More languages without ISO codes, part 1 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I have gone through w:Category:Languages without ISO 639-3 code but with Linguist List code (thanks, Angr), and the languages listed below still need exceptional codes. I have not listed those that have no recorded material or toponyms, or those that are treated as a dialect of another language in the linguistic literature (like Akokisa). I have put suggested codes after them, and notes where I'm unsure (please correct me if I made any mistakes). —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)Reply


Comments:
  • Amarizana: add per nom. Alexandra Y. Aikhenvald in Languages of the Amazon cautions that "many languages in Amazonia have 'namesakes' [and] more than one group may hide behind the same name", and more than one language has been called Amarizana: "one of the clans of the Piapoco is the Amarizanes (and the name is sometimes applied to the whole group). A now extinct language, also called Amarizana, and from the same Arawak family, used to be spoken in the Meta territory of modern Colombia." Nonetheless, the only references I find unambiguously mean this language. Čestmír Loukotka, Johannes Wilbert, Classification of South American Indian Languages (1968), page 131, lists some Amarizana words alongside and hence obviously distinct from Piapoco, including nuita "head", notuy "eye", nukagi "hand", kaxü "house", sietai "water", eriepi "fire" and keybin "sun". Julian Granberry's A Grammar and Dictionary of the Timucua Language even provides some etymology, connecting Amarizana eri(-...) "fire" to Achagua eri "sun, day", Arekena ale "sun".
  • Amasi: this happens to highlight what a mess our African language family codes are. Several codes use the prefix nic- even though their most immediate superfamily is alv, e.g. nic-vco should be alv-vco. Fortunately, fixing the nic- codes should not require updating very many pages. One that is done, precedent would have us use alv-bco-... rather than alv-... (compare nai-yuc-tip, qfa-ctc-cat), although the argument in favor of a shorter code is obvious. :-/ Some words are listed in a 1973 article in Africana Marburgensia ('AM') and in a pre-draft working paper cited by WP ('B'), including (AM) / bu (B) "dog", ázɔ́lí (AM) / azɔle (B) "tree", ɣà-nēm (AM) / ɣanim (B) "man", ɣà-zhyī (-zhyì?) (AM) / ɣaʒɛ (B) "woman", mwɔ̄ (AM) / muɔ (B) "water".
  • Anauyá: add per nom. Also called Anauya, but the version with diacritic is more common. [3] has uni "water" and ahiri "sun", the latter confirmed by I Simposio Antonio Tovar sobre Lenguas Amerindias: Tordesillas... (Emilio Ridruejo Alonso, ‎Mara Fuertes, ‎Carlos González-Espresati; 2003) and both are seemingly in the aforementioned Classification of South American Indian Languages, although I can't see the exact snippet.
  • Atanque(s): out of the various names WP mentions, namely "Atanque (Atanques) or Cancuamo (Kankuamo), also known as Kankwe and Kankuí", plus others I ran across (Atanke), "Atanques" seems to be most common, at least as the name of the language. ("Kankuamo" is quite common as a placename(?) that forms part of the designation of a tribe.) A 1962 article in Anthropological Linguistics has some words, including jo̱ke "gourd cup", cognate to cho̱kue (gourd cup), and mo̱ga "two", cognate to mo̱ga (two), and the 1981 Comparative Chibchan Phonology has more words (and may drop the underline from the os of those words; it is hard to see, because all words are underlined), including ji "worm", jinua "six".
  • Ayomán (rarely also Ayoman): I've added a code for Jirajaran, sai-jir, so this language's code should be sai-jir-ayo.
- -sche (discuss) 07:32, 4 July 2016 (UTC)Reply
@-sche: Can't we just do two-part codes so we don't have to feel obligated to create these horribly long ones? It wouldn't clash with all of our preëxisting practice, despite there being some precedent. Also, I'm worried that your careful work on this is going to make this RFM section far too long, and also cause you to burn out. Perhaps this should be a user page that this section links to? —Μετάknowledgediscuss/deeds 07:57, 4 July 2016 (UTC)Reply
Where an existing ISO family code like alv exists, I suppose we could go with two-part codes, but then what should be done with languages that have no ISO family code but instead belong to families for which we've had to create qfa- codes? I suppose they can be treated the same as they are now. But if we accept nai and sai as family codes for this purpose, I suppose that means some qfa- things like Salvadoran Lenca and Catacao can be re-coded. I will update the existing three-part non-proto-language codes if we go that route. I've started Wiktionary:Beer_parlour/2016/July#Shortening_some_.27exceptional.27_language_codes.
Yes, I considered as probably should start storing long comments and information on addable vocabulary in userspace. I wouldn't worry too much about burnout; we can take time; the only reason there would be a rush to add these codes ASAP is if we wanted to add words in particular ones of them, and if we wanted to add words, we'd need to do some research to find words to add. - -sche (discuss) 17:40, 4 July 2016 (UTC)Reply
Regarding Burgundian, what should we do about Burgundian language (Oïl)? Add it, too? - -sche (discuss) 01:48, 5 July 2016 (UTC)Reply
I'm not sure. I assumed that it could be subsumed under fr, as the Oïl languages usually are, but we probably ought to address it separately. —Μετάknowledgediscuss/deeds 09:01, 5 July 2016 (UTC)Reply
I mean, is it separate from fr? I don't know. In any case, I guess on further consideration it doesn't stop us from adding the Germanic language as "Burgundian", because if the Romance language needs to be added, it can be Bourguignon. (And since they're from different (sub)families, it should be easy to tell which one was meant if someone enters a word from one incorrectly as the other.) I'll collect information about them at User:-sche/Burgundian (Wiktionary:About Bourguignon). - -sche (discuss) 16:50, 5 July 2016 (UTC)Reply
Btw, I found and added another one we were missing, Macoris, attested in one word (baeza) and some placenames. - -sche (discuss) 06:27, 5 July 2016 (UTC)Reply
Oh, there are tons more. This is only the low-hanging fruit; a lot more languages with paltry data are waiting to be dealt with. I'm avoiding the Bantu ones for now, because pretty much all of them are in dialect continua and probably should be left alone unless good scholarship on their mutual intelligibility can be found (which I suppose I should go about finding). There's a pile of Australian ones that I'll get around to listing at some point (I thought maybe I'd give you time to digest all this first), and then even more messier ones from South America. —Μετάknowledgediscuss/deeds 09:01, 5 July 2016 (UTC)Reply
FWIW, re 'time to digest', I'd feel free to post any others you have the data to post (except the ones you mention are parts of dialect continua ... might as well leave them alone, as long as some part of the continuum has a code, although if no part does, then we should probably rectify that). There's no harm in it sitting around on the site unattended-to for a while, whereas letting it sit on one's computer sometimes (at least for me) means forgetting where one put it. (I can no longer find the information I thought I had collected on the separability vs mergeability of Haida dialects.) - -sche (discuss) 16:50, 5 July 2016 (UTC)Reply
Regarding Ch'olti': The oldest stage of the language (written in so-called hieroglyphs), sometimes confusingly just called "Ch'olti'" but more often called by other names, has its own code emy. The Colonial- and post-Colonial-era stage is considered distinct from emy, and also considered distinct from the more recent stages of Ch'orti', e.g. one reference says "In the Mayan classificatory tradition, the Ch'olti' language, as recorded in the 1695 grammar of Pedro Moran, is generally held to be related to but separate from the modern language of Ch'orti' (see Kaufman's 1976 classification, for example)." Post-Epigraphic-era Ch'olti' and modern-era Ch'orti' (caa) are theoretically distinguished from each other as different branches of Eastern Ch'olan (and from ctu, as it is a Western Ch'olan language), but the size of the difference between Ch'olti' and Ch'orti' is hard to ascertain, especially because, quoth WP, "the post-colonial stage of the language is only known from a single manuscript written between 1685 and 1695" (as afore-mentioned). For that matter, the size of the difference between Epigraphic Ch'olti'an and Colonial Ch'olti' is not obvious to me; Søren Wichmann, The Linguistics of Maya Writing (2004), page 271, says "In this section we show how Classic Ch'olti'an became seventeenth-century Ch'olti'. The chief grammatical difference between the grammars of Classic Ch'olti'an and Ch'olti' is the difference between straight- and split-ergativity." As an example, mi "father" is used in Classic and Colonial Ch'olti'(an) and in Ch'orti'. Nonetheless, given that the corpus of post-emy Ch'olti' is small and well-defined, it shouldn't be that hard to include it separately from emy and caa. - -sche (discuss) 21:42, 5 July 2016 (UTC)Reply
I also added Wanham. - -sche (discuss) 22:07, 10 July 2016 (UTC)Reply
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)Reply


Yokutsan languages edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We seem to have this as a single macrolanguage, yok, despite the fact that the constituent lects seem to constitute at least a few languages. -sche added some entries, but it's all tagged by (dia)lect, so it will be easy to separate them. I think we should retire yok and replace it with exceptional codes to reduce confusion, but I am not sure what those divisions should be. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)Reply

Yes, the difficulty is in deciding how to divide it. Christopher Loether cautions that although e.g. "Gayton (1948) listed 26 named groups or 'tribelets'[,] many of these named groups speak dialects which are nearly identical phonologically, lexically and syntactically, while others speak varieties which are indeed quite distinct from their neighbours' speech. Kroeber [...] divided [what he considered a single] language into two main divisions: Valley and Foothill Yokuts. He further divided the Valley Branch into Northern and Southern, and the Foothill branch into Kings, Tule-Kaweah, Poso, and Buena Vista. Newman (1944: 5-3) agreed with Kroeber's analysis of a single Yokuts language and stated that his data corroborated Kroeber's dialect divisions."
Kroeber lists 20+ dialects, of which 21 are named in [[ˀilik']]. Wikipedia has a tree/bush diagram from Whistler and Golla of 23-28 dialects, including all of those 21 plus Koyeti, Merced "(?)", Noptinte (Nopchinche(s), Nopthrinthre(s), Nopṭinṭe, Nopthrinte, Noptinci), Yachikumne a.k.a. Chulamni, Lower San Joaquin Yokuts, and Lakisamni "(?)", and Tawalimni. (Several have multiple names, e.g. Ayticha is also called Kocheyali as well as Ayitcha; Palewyami is also Altinin and Poso Creek Yokuts in addition to Paleuyami. And Hometwoli is also Taneshach?)
For Yawelmani and Chukchansi, decent resources exist; in addition to those two, Wikchamni and Tachi are also being taught according to WP, and in addition to those four, Choinimni and Kechayi also have at least some speakers according to WP.
WP says the Yokutsan family consists of "half a dozen" languages, but evidently not the six just named, because those six leave out several major branches that Kroeber, Newman, and Whistler and Golla all agree on.
I suggest we create a family code nai-yok for the Yokutsan languages, and then distinguish the following branches which Kroeber, Newman, and Whistler and Golla all consider distinguishable, without splitting them further at this time:
  1. Palewyami (nai-ply — or putting the y at the start so the codes sort together and are more apparently connected — on further thought, nah) a.k.a. Poso a.k.a. Poso Creek
  2. Buena Vista Yokuts (nai-bvy) a.k.a. Tulamni-Hometwoli
  3. Tule-Kaweah Yokuts (nai-tky) a.k.a. Wikchamni, Yawdanchi
  4. Kings River Yokuts (nai-kry) a.k.a. Choinimni, etc
  5. Gashowu (nai-gsy), which Kroeber and Whislter/Golla agree is intermediate between Kings River and Northern Valley, though Kroeber considers it ultimately/genetically Kings
  6. Southern Valley Yokuts (nai-svy) a.k.a. Yawelmani, Tachi, etc
  7. Northern Valley Yokuts (nai-nvy) a.k.a. Chukchansi, Kechayi, etc
  8. Delta Yokuts (nai-dly) a.k.a. Far Northern Valley Yokuts
A more conservative approach would keep Southern Valley Yokuts, Northern Valley Yokuts, and Delta Yokuts together as "Valley Yokuts", but Delta Yokuts is relatively divergent. - -sche (discuss) 22:16, 3 July 2016 (UTC)Reply
Pinging @Chuck Entz in case you have insight or input on this Californian language family. - -sche (discuss) 04:01, 5 July 2016 (UTC)Reply
Not much, I'm afraid. A couple of decades ago I read everything I could find on the Wikchamni, who used to live in the area where my brother lives now. I read all of the sources you mentioned above, but I was more interested in the ethnobotany of the Yokuts than their languages, per se, and that was a long time ago.
On a side note, I remember one of my professors at UCLA back in the 80s saying that Yawelmani was one of the best-understood languages in the world at the time from a theoretical perspective, because so many linguists had been publishing papers on it- it was sort of the linguistic equivalent of a model organism. Chuck Entz (talk) 04:25, 5 July 2016 (UTC)Reply
  Done [4]. - -sche (discuss) 06:51, 15 July 2016 (UTC)Reply


various places where WT:LANGTREAT and Module:languages are inconsistent edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Module:languages includes codes from the individual members of several language/dialect groups which WT:LANGTREAT says, without citing any discussion, should be merged. Something needs to change: either WT:LANGTREAT should be updated to note that the individual varieties are allowed, or their codes should be removed from the module. The following language/dialect groups are affected:
(Note 1: whenever the merging of a particular dialect group had been discussed and that discussion was cited by LANGTREAT, I simply updated the module.)
(Note 2: Haida, Cree and Kalenjin face the same issue; I expect to write about them later.)
- -sche (discuss) 05:16, 20 November 2013 (UTC)Reply

the Gondi lects edit

Stephen Tyler (not the singer), in his oft-cited works on Gondi, states "Though I have no real evidence, the general pattern seems to be for geographically adjacent Koya and Gondi populations to speak different, but mutually intelligible Gondi dialects. Where these populations are geographically non-contiguous, the dialects are not mutually intelligible. This same pattern probably prevails among all Gondi dialects." WP says "The more important dialects are Dorla, Koya, Maria, Muria, and Raj Gond." Ethnologue, meanwhile, as encoded only two varieties, ggo (Southern Gondi) and gno (Northern Gondi). Should we deprecate those two codes? Or deprecate the macro-code gon and recognise those dialects? Or allow all three? - -sche (discuss) 05:16, 20 November 2013 (UTC)Reply

Merged (whereas, the ISO, while retaining gon, split ggo into two new codes which have not been added). - -sche (discuss) 06:27, 24 February 2016 (UTC)Reply


Merging Malay edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Malay lects

I propose to merge the following Malay lects into Malay [ms]:

  • Jakun [jak]
  • Orang Kanaq [orn]
  • Orang Seletar [ors]
  • Temuan [tmw]

These are all mere dialects of Malay with no written tradition and perfectly mutually intelligible. Even Ethnologue says they should be considered dialects of Malay rather than separate languages. -- Liliana 23:39, 28 February 2016 (UTC)Reply

Support. - -sche (discuss) 02:36, 29 February 2016 (UTC)Reply
Support. — I.S.M.E.T.A. 01:32, 4 March 2016 (UTC)Reply
  Done. - -sche (discuss) 04:20, 3 July 2016 (UTC)Reply


Ngadha edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming nxg

This language is far more commonly called "Ngadha" than "Ngad'a"; the latter spelling is so rare that when I was trying to verify our "Ngad'a" translation of water using that spelling for the language name, I couldn't find any references at all (they all spell it "Ngadha"). This rename entails moving a few categories and updating a handful of entries. - -sche (discuss) 02:32, 29 February 2016 (UTC)Reply

  Done. Should Eastern Ngad'a be merged? - -sche (discuss) 04:47, 3 July 2016 (UTC)Reply


Bontoc edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently include both the macrolanguage Bontoc (code: bnc) and its dialects, particularly Eastern Bontok (not even using the same spelling, you notice! code: ebk) which we have about ~35 translations in. IMO, it rarely makes sense to include both a macrolanguage and also all of its dialects; we should usually have one or the other but not both. Ethnologue says the dialects are "reportedly similar", as if they split bnc into dialects in 2010 without without knowing enough about them to tell whether they were similar or distinct. The International Encyclopedia of Linguistics considers Central Bontoc to be only 56% intelligible with Eastern Bontoc, which is only a few percentage points better than the intelligibility of the various Bontocs with Ilocano, suggesting that at least Central and Eastern Bontocs, if not the others, are different languages. Our ~15 "Bontoc" (bnc) entries seem to be Central (Igorot) Bontoc and could be relabelled accordingly if we deprecated bnc. - -sche (discuss) 07:48, 29 February 2016 (UTC)Reply

@-sche: Relabelled "Central Bontoc" or "Igorot Bontoc"? And is it "Bontoc" or "Bontok"? Whatever the details, I support the idea of reducing the macrolanguage Bontoc (bnc) to an etymology-only language in favour or having translations and entries for the various Bontoc languages. — I.S.M.E.T.A. 01:30, 4 March 2016 (UTC)Reply
"Central Bonto(c|k)" is more common than "Igorot Bonto(c|k)" or "Bontok Igorot", and "Bontoc" is more common than "Bontok". I've tweaked the canonical spelling of Central Bontoc (lbk) accordingly; I suppose the other Bontocs which are currently spelled with k should also be updated. - -sche (discuss) 01:58, 4 March 2016 (UTC)Reply
@-sche: "Cental Bontoc" it is, then. — I.S.M.E.T.A. 02:14, 4 March 2016 (UTC)Reply
A number of works refer to "the Bontoc language" without specifying which of the Bontoc languages they mean, and we couldn't easily include words from these works if we deprecated bnc; there are even books like Clapp's Vocabulary of the Igorot Language as Spoken by the Bontok Igorots which conflate all the languages of the Igorot people (perhaps understandably, given the point above that the Bontocs and e.g. Ilocano are equally different from each other). However, if we accept that being barely halfway mutually intelligible makes Central vs Eastern Bontoc separate languages, then we're not losing anything of quality by not following (and not being able to easily add content from) books that fail to distinguish such different lects. - -sche (discuss) 03:01, 4 March 2016 (UTC)Reply
@-sche: Quite. Though, there will be occasions when it will not possible to work out easily from which of the Bontoc languages a given term in a borrowing language will have been derived. — I.S.M.E.T.A. 03:28, 4 March 2016 (UTC)Reply
  Split. - -sche (discuss) 04:15, 15 July 2016 (UTC)Reply


Matanawi edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Another language we need a code for, presumably sai-mat. Is there a more efficient way to find languages we've missed than my current method, which is simply happening upon them? —Μετάknowledgediscuss/deeds 08:33, 28 June 2016 (UTC)Reply

I suppose you could go through w:Category:Languages without ISO 639-3 code but with Linguist List code. —Aɴɢʀ (talk) 13:54, 28 June 2016 (UTC)Reply
  Done - -sche (discuss) 04:51, 15 August 2016 (UTC)Reply


Btw, Native South Americans: Ethnology of the Least Known Continent lays out the case from Nimuendaju (who documented Mura) that Rivet, at least, if not others, was too hasty in grouping this with Mura. Nimuendaju considers Mat. and Mur to be isolates. - -sche (discuss) 05:12, 15 August 2016 (UTC)Reply

The Rwanda-Rundi lects edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We currently separate {{rw}}, {{rn}}, {{haq}}, {{suj}}, {{han}}, and {{vin}}, which can all be treated as a single language and often are; their separation seems largely political. The Wikipedia page Rwanda-Rundi language catalogues the differences between rw and rn: they are pretty minor, and a lot seem to have to do with regular spelling differences and tones, which are not even reflected in the standard orthography and could thus be relegated to pronunciation sections. To quote Zorc and Nibagwire's Kinyarwanda and Kirundi comparative grammar (2007):

 

The terms dialect and language are used loosely in everyday communication. In linguistic terms, the two are bound together in the same definition: a language consists of all the dialects that are connected by a chain of mutual intelligibility. Thus, if a person from Bronx, New York can speak with someone from Mobile, Alabama, and these two can converse with someone from Sydney, Australia without significant misunderstandings, then they all form part of the English language. Kigali [the capital of Rwanda] and Bujumbura [the capital of Burundi] are similarly connected within a chain of dialects that collectively make up the Rwanda-Rundi language.

 

Kimenyi's A Relational Grammar of Kinyarwanda (1980) explains that:

 

This language [Kinyarwanda, rw] is very close to both Kirundi [rn], the national language of Burundi, and Giha [haq], a language spoken in western Tanzania. The three languages are really dialects of a single language, since they are mutually intelligible to their respective speakers.

 

That seems like a strong case for merger to me, although I'd like to see if any academic sources disagree. (Pinging User:-sche as well, just to try it out.) —Μετάknowledgediscuss/deeds 01:52, 17 December 2013 (UTC)Reply

I got the ping, thanks. :) I've just been busy. I'll look into this more closely soon, but on the face of it, it does seem like we could merge them. (And that reminds me that en.Wikt really needs to have a discussion about merging Nynorsk and Bokmal. It's bizarre that we manage to accept that Drents and Twents — and, to use the example above, the English of Alabama and the English of Australia — are not separate languages, but haven't managed to accept that Nynorsk and Bokmal aren't. But that's for another discussion...) - -sche (discuss) 06:04, 19 December 2013 (UTC)Reply
But cf. Appendix:Swadesh lists for Bantu languages, which shows large differences between the two. -- Liliana 14:41, 14 February 2014 (UTC)Reply
I was just talking to a native speaker today to get their perspective on this. They said that the vocabulary varies a lot dialectally, but not along national lines, and it's still easy to understand people on the other side. They claimed that the biggest differences were in the tones, but that's not even marked in the orthography. I think that's a pretty strong ase for merger. —Μετάknowledgediscuss/deeds 19:22, 28 December 2015 (UTC)Reply
R. David Zorc and ‎Louise Nibagwire have an entire Comparative Grammar (2007, →ISBN devoted to the differences between the two, which Google unfortunately only shows snippets of, including the TOC which lays out spelling differences, noun class differences, "word pairs, one matched, the other completely different", and false friends, as well as dialect-specific tonal marking. However, vocabulary differences and false friends exist even between English dialects (luck out), and tonal marking and other pronunciation differences which aren't reflected in the orthography can be handled in pronunciation sections, as with Iranian Persian vs Dari. The only thing that gives me pause is the point that dialects on the extremes of the continuum "may not" be intelligible with each other per WP, but then, if the dialects aren't split up along national lines / along the same lines as the codes, then that's not a good argument against merger. - -sche (discuss) 00:14, 7 March 2016 (UTC)Reply
Elena Zinovʹevna Dubnova, in The Rwanda Language (1984), page 15, writes: "A. Coupez maintains that "as a matter of fact, Rwanda, Rundi and Ha ( and maybe other languages spoken east of the latter two areas) are so close that they can be regarded as dialects of one language" (17. p. 11). According to the other view they are closely related but still different languages (11, p. 26)." Other sources concur, "In linguistic terms, Kirundi and Kinyarwanda are mutually intelligible dialects of the Rwanda-Rundi language." Merged under the code rw. - -sche (discuss) 19:17, 14 August 2016 (UTC)Reply


w:Huarpean languages edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


These are three extinct languages of Argentina that lack ISO codes, but two of them have recorded material (the third, Puntano, seems pointless to add). The only problem is that some linguists consider these to be dialects of the same language, although that is debated and cannot be satisfactorily resolved with the limited preserved lexica from each. I would prefer we follow es.wiktionary's lead in adding separate codes for Allentiac (sai-all) and Millcayac (sai-mil). —Μετάknowledgediscuss/deeds 05:36, 28 June 2016 (UTC)Reply

  Done. - -sche (discuss) 06:11, 15 August 2016 (UTC)Reply


renaming Atong edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I suggest renaming

  • ato from Atong to Atong (Africa) or Atong (Cameroon)
  • aot from A'tong to Atong (Asia) or Atong (India) if we accept the non-inclusion in the latter name of the border areas of Bangladesh where it is also spoken

References seem to prefer the spelling "Atong" to "A'tong" for aot, and Wikipedia says "The correct spelling Atong is based on the way the speakers themselves pronounce the name of their language. There is no glottal stop in the name and it is not a tonal language." - -sche (discuss) 04:37, 29 July 2016 (UTC)Reply

@-sche: I support renaming the languages to "Atong (Cameroon)" and "Atong (India)", respectively; in the latter, "India" can be taken to refer to the subcontinent, rather than the country. — I.S.M.E.T.A. 18:55, 31 July 2016 (UTC)Reply
  Done. - -sche (discuss) 20:36, 15 August 2016 (UTC)Reply


More languages without ISO codes, part 2 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


{—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC))Reply


Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)Reply
As of now, with the addition of Tataviam, we include codes for 7900 languages, and will soon have codes for over 8000, including artificial languages and proto-languages. Wiktionary talk:Milestones#We_have_words_from_21.25_of_languages. - -sche (discuss) 07:48, 11 July 2016 (UTC)Reply
As to how different the dialects of Dorasque are: A. L. Pinart's Vocabulario Castellano-Dorasque, Dialectos Chumulu, Gualaca y Changuina gives si (and ji) as the Chumulu form(s), and gives ti as the Gualaca form and ji as the Changuina form for "water". Overall, it's hard to tell how intelligible the dialects would be; some words are quite similar (Chumulu utká, Gualaca utkál "yellow"; Chumulu katuvá, Gualaca katavá "bow"), others are very different: Chumulu sagúsaña, Gualaca θake "blue"; Chumulu sérkala, Gualaca okiyigua "asiento" (OTOH, all three dialects use sérkala for "bench"). "Woman" is biá in Chumulu and Changuina, ωiá in Gualaca. I suppose we can follow Pinart in entering it as one language. - -sche (discuss) 01:12, 10 August 2016 (UTC)Reply
I've also added a code for Yuri language (Amazon) and Culle language. - -sche (discuss) 21:08, 13 August 2016 (UTC)Reply
I've also added three Shastan languages:
  • Okwanuchu language (nai-okw): Berkeley.edu's Indian Languages project says some (<100) words were recorded in the early 20th century; the Handbook of North American Indians: California refers readers to Kroeber (1925) and Voegelin (1942); Glottolog cites Victor Golla California Indian languages (2011). Kroeber says "The dialect is peculiar. Many words are practically pure Shasta; others are distorted to the very verge of recognizability, or utterly different." Victor Golla, California Indian Languages, speculates at length that Okwanchu may have been "a bilingual mix of Shasta and some other language". There was a people "whom the McCloud River Wintu considered Wintu and called Waymaq ('north people') [whom] Du Bois believed [...] were closely related to, if not identical with, the Shastan Okwanuchu; the survivor of the group whom she interviewed gave her a short vocabulary that included words of Shasta origin (Du Bois 1935:8)." These words included atsa 'water', au-u 'wood', katisuk 'bring'. Golla also says "Okwanuchu speech may also be attested in [eighteen] words identified as 'Wailaki on McCloud' (cf. Wintu waylaki 'north people') that Jeremiah Curtin recorded" in 1884, namely: gü'ru 'man', ki'rikega 'woman', hänumaqa 'old man', apci 'old' (in ki'rikega apci 'old woman'), ä'toqe'äqa 'young man', kewatcaq 'young woman', tse'gwa 'one', hoka 'two', qätsi 'four', tseapka 'five' = 'one hand', hukaapka 'six', tsuwara 'sun', kapqu'[r]wara 'moon', kau 'snow', atsa 'water', gri'tuma 'thunder', itsa 'rock', tarak (Golla's note: "[terak?]") 'earth'. Golla notes: "Curtin employed the BAE transcription system, in which q represents a velar fricative, not a velar stop." "Of these forms, five (man, old man, four, thunder) are not attested in any other variety of Shasta."
  • Konomihu language (nai-knm): The Handbook of North American Indians: California says "An unpublished Konomihu word list was collected by Angulo (1928a)." Glottolog cites Shirley Silver Shasta and Konomihu (1980), Roland B. Dixon The Shasta-Achumawi: A New Linguistic Stock, with Four New Dialects (1905), Lars J. Larsson Who Were the Konomihu? (1987). Kroeber says "Kroeber says "it is still questionable whether their speech is more properly a highly specialized aberration of Shasta or of an ancient and independent but moribund branch of Hokan from which Karok and Chimariko are descended together with Shasta. [...] Konomihu is their own name." Silver's work documents a number of words, including kihínàpxī́k "woman".
  • New River Shasta language (nai-nrs): "the language is attested only in a few wordlists" per Berkeley.edu's Indian Languages project. Kroeber, who mentions the alternative exonym Amutahwe, says they were "perhaps rather nearest to the major group in speech, although at that their tongue as a whole must have been unintelligible to the Shasta proper. [...T]he tribe melted away without a survivor, leaving only a fragmentary vocabulary."
- -sche (discuss) 23:11, 15 August 2016 (UTC)Reply


Bole edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming bvx

The di- in its current name, Dibole, is one of those language prefixes, but curiously enough, the most commonly used name for this language is actually Babole, with a different prefix. (Luckily, unprefixed Bole isn't used, because it's taken up already by bol.) We should rename this to Babole accordingly. —Μετάknowledgediscuss/deeds 17:32, 29 June 2016 (UTC)Reply

If I know anything about Bantu languages, "Babole" refers to the people who speak it rather than the language itself. Is this really what they call it? —CodeCat 18:24, 29 June 2016 (UTC)Reply
Yes, that's why I noted that it was odd. You're welcome to compare google books:"Dibole language" and google books:"Babole language" yourself. Not much work is done on it, but when it is, it's called Babole. —Μετάknowledgediscuss/deeds 22:46, 29 June 2016 (UTC)Reply
Indeed, searching google books:"Dibole" language turns up nothing relevant, and searching google books:"Bole" language and google books:"Bole" language Nigeria OR Congo finds only references to the Nigerian Bole. Whereas, google books:"Babole" language is well attested. Rename. - -sche (discuss) 03:11, 30 June 2016 (UTC)Reply
  Done - -sche (discuss) 00:08, 17 August 2016 (UTC)Reply


Gandhari edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming pgd

We currently have this as "Gāndhārī". This spelling is indeed used frequently in the literature, but we try to avoid difficult diacritics, and this spelling can be seen as using IAST to render the native name of the language, whereas "Gandhari" is the corresponding English. I think that switching to "Gandhari" would be the better choice. @Aryamanarora, -scheΜετάknowledgediscuss/deeds 08:52, 5 July 2016 (UTC)Reply

I agree, the IAST diacritics are unnecessary. —Aryamanarora (मुझसे बात करो) 15:34, 5 July 2016 (UTC)Reply
Yes, rename. - -sche (discuss) 22:16, 5 July 2016 (UTC)Reply
  Done - -sche (discuss) 00:45, 17 August 2016 (UTC)Reply


Ersu languages edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


We follow the ISO in covering this as a single macrolanguage, ers. Following Yu (2012), we should keep ers as Ersu, but also create sit-tos for Tosu and sit-liz for Lizu. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)Reply

  Done. And wow! What a writing system — the colour of the writing changes the meaning! - -sche (discuss) 20:52, 28 August 2016 (UTC)Reply


More languages without ISO codes, part 3 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


{—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC))Reply


Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)Reply

Quimbaya edit

Loukotka, and Campbell citing him (and Wikipedia citing Campbell), says only a single word is attested, but none of them say what that word is. Other sources, none of which strike me as reliable, provide different(!) words, of which the most common is quindio (one book has quindus, one has kindiyo, parallelling the spellings Quimbaya - Kimbaya), meaning "paradise", but other words mentioned include chascará, batatabatí, and fihisca "spirit" (the last of which may actually be from some other language). proel.org says eight words are attested, but doesn't say what they are. Several books provide placenames which may attest the language. - -sche (discuss) 23:57, 20 August 2016 (UTC)Reply

Yupua edit

An old source (Daniel Garrison Brinton, 1898) says "The Jupua and Curetu dialects are properly one and the same, the difference which appears in their vocabularies arising simply from inequality in the ears and the orthographies of observers. This is evident by the following comparison..." and then proceeds to offer a comparison which IMO doesn't actually conclusively demonstrate anything. In any case, Cueretú / Curetu itself does not currently have a code. - -sche (discuss) 04:20, 14 July 2016 (UTC)Reply
I can find a number of sources attesting Yupua words, but none seem like reliable primary sources: Peruvia Scythica: The Quichua Language of Peru mentions "wui" as the word for house, but does so as part of trying to connect a huge number of unrelated languages based on chance sound correspondances. Brinton's 1898 Studies in South American Native Languages has a list of words which he sources to Martius (surely referring to the Wörtersammlung Brasilianischer Sprachen) and compares to Curetu words sourced to Wallace (surely referring to A Narrative of Travels on the Amazon and Rio Negro), but I haven't located the Yupua words in the originals; the words Brinton gives are: blood: thik (Yupua), dü (Wallace); bow: patopai, patueipei; earth: thitta, ditta; flesh, ga'hi', se'hea'; finger, moh-asoing, mu-etshu; fire, pieri, piure; flower, pagari, bagaria; foot, göaphoe, giapa; hair, poa, phoa; hand, moho, muhu; head, co'ëre, cuilri; house, wu'i, wee; mouth, thischüh, dishi; sun, hauvä, aoué; tongue, toro, dolo; tooth, gobâckaa', gophpecuh; water, thäco, deco; woman, nomöa, nomi; he also offers the additional Yupua words hóggoa "water" (sic), göaphae "foot", ga'hi "meat", jih "jaguar", ikama "deer", jocheo' "star". Ruhlen has manapẹ "husband / man", apara "we two", ti "this", -mai- "we", tsīngeē "boy", pilo "fire", poa "feather". - -sche (discuss) 05:29, 14 July 2016 (UTC)Reply
Martius confirms thäco is "water" and pieri is "fire", and has wúi "house", pohjá "feather". - -sche (discuss) 07:54, 20 August 2016 (UTC)Reply


Merging Berawan edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Berawan

It does not make sense to have both the macrolanguage (lod, which the ISO retired) and the dialects (zbc, zbe, zbw). We should either merge the dialects or retire lod. However, Berawan is composed of more than three varieties, which the literature often speaks of, though there are some works that use the same coarser division as Ethnologue. - -sche (discuss) 09:57, 1 September 2016 (UTC)Reply

Being a "lumper" rather than a "splitter", I support merging the dialects into the macrolanguage. Dialect labels can be added with {{lb}} as needed. —Aɴɢʀ (talk) 11:24, 1 September 2016 (UTC)Reply
Oddly, Ethnologue does not provide a claim of how mutually intelligible vs unintelligible they are. Blust, in The Consonants of Long Terawan, asserts four dialects, all spoken within 25 miles of each other: Long Terawan (Ethnologue's zbw) and Batu Belah (part of zbc), and Long Teru (part of zbc) and Long Jegan (zbe); he considers e.g. Long Pata (which WP names, and which is noted by Ray's 1913 Languages of Borneo) to be a variant of one of these, though he is nonspecific as to which one. When I compare Blust's Terawan data and Burkhardt's The historic evolvement of true triphthong phonemes in Long Jegan Berawan, the differences are not large, especially considering the obvious differences in style of notation: e.g. Blust has Terawan gitoh "hundred", Burkhardt has Jegan getoʔ (Blust noted some uncertainty in his transcription of vowels); Terawan dimmeh "five", Jegan dimmiᵊy; T buluh "feathers or body hair", J bullǝw, T iciw "day", J iciᵊw, T iko "tail", J eko, T puté "white", J pote, T depeh "fathom", J dǝppiǝ̈. A larger difference is Terawan lebbih "two" vs Jegan duβiᵊy, T puʔ "head hair", J uk, T manoʔ "bird", J manǎuk. The only entry we have in the dialects is pi "water", which is (per our entry) the same in all of them. It does seem that these could be merged. - -sche (discuss) 18:19, 1 September 2016 (UTC)Reply
  Merged. - -sche (discuss) 05:33, 11 September 2016 (UTC)Reply


Sanglechi [sgy] and Warduji [wrd] edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


It appears from Sanglechi language that these names are synonymous. As "Sanglechi" is vastly more common, I move that we remove Warduji. —Μετάknowledgediscuss/deeds 06:12, 6 July 2016 (UTC)Reply

  Done. - -sche (discuss) 17:17, 18 September 2016 (UTC)Reply


Even more languages without ISO codes, part 1 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)Reply

  Done as sai-cno since -chn was taken. In addition to the catechism, Fitz-Roy (1839) has three words, and Jose Garcia (1889) has three words, per Loukotka. - -sche (discuss) 06:18, 17 August 2016 (UTC)Reply
  • Damin (art-dam) — this might be better off considered as a natural language instead so it can be in mainspace (--Meta)
    If we want Damin in mainspace, I'd rather consider it a dialect of Lardil than a separate language. —Aɴɢʀ (talk) 20:43, 13 August 2016 (UTC)Reply
    Hmm, this is an interesting case. On the one hand, they seem to be mutually unintelligible, and speakers seem to refer to them as separate languages, and outside linguists seem to treat them as separate things (compare Eskayan)... and we do consider even such very similar, often-overlapping things as Norwegian and a slightly different orthography of Norwegian to be entirely separate languages. On the other hand, Damin is said — by researchers with access to its full vocabulary, as opposed to access to only a limited surviving corpus — to have a vocabulary of "perhaps no more than 250 words in all", which makes it hard to consider it a language, and does make it seem similar to pandanus avoidance registers or especially thick cant or jargon. According to Wikipedia, some markers are different, but others are the same, e.g. genitive -kan and future -ur. I suppose treating it as a dialect of sorts does make the most sense. - -sche (discuss) 17:35, 14 August 2016 (UTC)Reply
    Not done per the above discussion. - -sche (discuss) 08:51, 9 September 2016 (UTC)Reply

also this: - -sche (discuss) 05:21, 29 August 2016 (UTC)Reply

  • Mamluk-Kipchak (see w:fr:Kiptchak mamelouk) (trk-mam? -mmk?)   Done
    Water is سو (su); the word also means 'tempering (of a sword)'. 'Ebçi' (apparently containing the occupational suffix '-çi'; ç = č) is 'woman', mentioned in one military treatise when it says that Indian swords are "useful only for hanging on the neck of a woman who cannot give birth to a son". 'sovuq' is 'cold' and 'sol' is 'left': andın songra sol egining ūzārā salgıl taqı boynuñ ūzārā tezgindūrgil "after that, place it over your left shoulder and make it rotate over your neck". See Munytu'l-Ghuzāt: a 14th-century Mamluk-Kipchak military treatise and Vocabulaire arabe-kiptchak de l'époque de l'État mamelouk. - -sche (discuss) 00:47, 31 August 2016 (UTC)Reply


splitting rGyalrong edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Based on their limited mutual intelligibility, and following the references on these languages (see w:Rgyalrong languages' bibliography), we should split jya, and its category and any entries, into four languages:

  • Situ (sometimes called Eastern rGyalrong) (sit-sit?)
  • Japhug (sit-jap?)
  • Tshobdun (Caodeng, Sidaba) (sit-tsh)
  • Zbu (Rdzong'bur, Showu, Sidaba) (sit-zbu)

- -sche (discuss) 05:11, 29 August 2016 (UTC)Reply

  Done. - -sche (discuss) 04:44, 4 October 2016 (UTC)Reply


Merge Qashqai and Sonqor into Azeri? edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Sonqor Turkic is an Oghuz lect spoken in northwestern Iran. The Turkic Languages by Fuchs Christian et al speaks of it, Qashqai/Kashkay (code qxq) and Afshar as "deviating" dialects of Azeri. The Oxford Handbook of Inflection, edited by Matthew Baerman, speaks of Sonqor as "a minority language of Iran heavily influenced by Kurdish". (The Handbook provides some Sonqor example words: ušaġ-ækæ-le' "child-SPEC-PL" "those children" (compare ušaġ to Azeri uşaq), mincuġ-ækæ-re "bead-SPEC-ABL" "of those pearls" (Azeri muncuq-), šéʕr-eke-sin-ne "poem-SPEC-POSS-ABL" "about that poem by him" (Azeri şeir-).) Other sources note that "Sonqor and others" "transitional dialects". How should Sonqor and Qashqai be handled? Ethnologue claims there are "significant differences [between North Azeri and] South Azerbaijani [azb] in phonology, lexicon, morphology, syntax, and loanwords", but we merged them (but not qxq) after this discussion, based on references saying "[the] dialects of Azerbaijani do not differ substantially. Speakers of various dialects normally do not have problems understanding each other" (heck, even Azeri and Turkish speakers do not have that much difficulty understanding one another). Gilles Authier, in New strategies for relative clauses in Azeri and Apsheron Tat, in Clause Linkage in Cross-Linguistic Perspective (2012, →ISBN, similarly says "There is an almost perfect mutual intelligibility between Azeri and Kashkai speakers. I tested this personally by submitting recordings to both audiences." It sounds to me as if Kashkay should be merged into Azeri, and as if we don't need a code for Sonqor. However, I'm pinging our only recently active editor who speaks some Azeri, User:123snake45. - -sche (discuss) 03:33, 11 September 2016 (UTC)Reply

As you noted, only political/orthographic considerations compel us to keep Turkish and Azeri apart. I'd keep all these merged in Azeri. While we're at it, have we ever discussed whether the name "Azeri" should be changed to "Azerbaijani"? —Μετάknowledgediscuss/deeds 03:46, 11 September 2016 (UTC)Reply
BGC Ngrams shows a statistical dead heat between the two over the years. I personally prefer the shorter name. —Aɴɢʀ (talk) 12:51, 12 September 2016 (UTC)Reply
I do not speak anything Turkic; but I keep getting the general impression we make too little use of the "X is a dialect of Y" functionality. This would be well-suited for handling cases where one variety is mostly intelligible with a bigger standard language, but has its share of its own quirks (loanwords, occasional diverging phonetics, etc). --Tropylium (talk) 13:32, 12 September 2016 (UTC)Reply
  Done. - -sche (discuss) 05:06, 4 October 2016 (UTC)Reply


Belize Kriol edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming bzj

The name we use, "Belize Kriol English", is certainly among the more uncommon names of this language. Much better would be just "Belize Kriol"; there might also be argument to be made for the more dated term "Belizean Creole" (which is the title of the Wikipedia entry). There is one entry which would have to be changed. —Μετάknowledgediscuss/deeds 20:25, 3 April 2016 (UTC)Reply

On Ngrams and in the 'raw' Google Books results, and when I check academic sites on Google (site:.edi) and academic papers (filetype:pdf — most results for most non-famous languages' names are academic papers, dictionaries, etc), the most common name is "Belizean Creole" (which doesn't seem dated), then "Belize Creole", with "Belize Kriol" bringing up the rear. Approximately the same pattern holds when I page through the results looking only for primary reference works about this language in particular, as opposed to works that just mention it. There is some interference from the fact that "Belizean Creole" and "Belize Creole" are apparently also used to name the people who speak it and not just the language, but I suggest a rename to "Belizean Creole". - -sche (discuss) 18:27, 20 August 2016 (UTC)Reply
Renamed to "Belizean Creole". - -sche (discuss) 17:40, 11 September 2016 (UTC)Reply


Even more languages without ISO codes, part 2 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)Reply

Gallaecian and Ivernic

@Angr, since you are knowledgeable of Celtic languages: do you agree that the Gallaecian language and Ivernic language should be added? Should "Ivernic" be called that, or another name? And do you know if any words in it are attested? (It's not clear to me if ond and fern, which Wikipedia mentions, are Ivernic words, or words borrowed by some other language from (potentially slightly different) Ivernic words.) - -sche (discuss) 20:18, 12 August 2016 (UTC)Reply

@-sche:, AFAIK "Ivernic" is unattested; ond and fern are Old Irish words which Cormac mac Cuilennáin (who lived in the 9th century) believed to be of "Ivernic" origin. I don't think we need to add it. Gallaecian, on the other hand, is attested. If it were to be subsumed under anything else, it would be Celtiberian (xce), but I'd rather keep it separate. —Aɴɢʀ (talk) 20:38, 13 August 2016 (UTC)Reply


Here are other languages we might need codes for: - -sche (discuss) 05:21, 29 August 2016 (UTC)Reply

  • Light Warlpiri, a small mixed language (350 speakers, all also fluent in Warlpiri). I'm not sure if we should have this or not.
    It's gotten a lot of press, but I lean toward saying it's not worth including — mixed languages are always a little hard to justify, and one that's just come into being is covered by Warlpiri and English sections well enough, I reckon. —Μετάknowledgediscuss/deeds 05:30, 4 October 2016 (UTC)Reply


Bushi [buc] edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Apparently, we merged all the Malagasy dialects except this one. I don't see any reason to keep it separate at this point (and I'm not even sure that this oversight merits an RFM). @-scheΜετάknowledgediscuss/deeds 06:19, 28 September 2016 (UTC)Reply

It apparently has its own orthography and is spoken on a different island, which might have motivated a non-merger, although I can find no discussion of it so it seems most likely that it simply went entirely unnoticed. Reconstruction:Proto-Austronesian/daNum and water make use of it. The only references I can find that mention it do speak of it as a dialect of Malagasy. Merged. - -sche (discuss) 01:24, 4 October 2016 (UTC)Reply


renaming Pu Xian [cpx] edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I propose renaming cpx to either "Pu-Xian", "Puxian", "Pu-Xian Min", or "Puxian Min". (at least remove the space, per google books:"pu xian" "fujian"). I do not know which name is most common, but Wikipedia's article is titled Pu-Xian Min. —suzukaze (tc) 09:46, 15 August 2016 (UTC)Reply

On Google Books, "Puxian" and "Pu-Xian" seem to be the most common names, about equally common: but on Google Scholar, about 3.5 times as many works use "Puxian" as "Pu-Xian", so I will rename the language to "Puxian". "Pu Xian" does get a few hits. Incorporating "Min" ("Puxian Min") seems to be uncommon. Another attested spelling (not common) is "Putian". Wikipedia says Xinghua and Hinghwa are additional alt names. - -sche (discuss) 13:50, 1 November 2016 (UTC)Reply
  Done. - -sche (discuss) 13:57, 1 November 2016 (UTC)Reply


Renaming Timne edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Searches for "speak X" and "X language" on Google Books reveal that [tem], which we currently call "Timne", is referred to as "Temne" far more frequently than by any other spelling. The native spelling, Themnɛ (or the ASCII-friendly Themne), sees extremely limited use. See also w:Temne language. —Μετάknowledgediscuss/deeds 05:40, 4 February 2017 (UTC)Reply

I support changing what we call the language to Temne. — I.S.M.E.T.A. 15:32, 15 February 2017 (UTC)Reply
  Done. - -sche (discuss) 17:56, 8 March 2017 (UTC)Reply


Some more missing South American languages, 1 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Here are a few more South American languages for which we could add codes:

  • Cacán (Kakán, Diaguita) was documented, but the document was then lost, so it seems unnecessary to add.
    Update:   added: Some words are recorded elsewhere and collected by Nardi (1979). - -sche (discuss) 14:28, 29 August 2016 (UTC)Reply
  • Coroado Puri language (sai-crd), of which Revista de la Universidad Nacional de Cordoba, volume 6, includes two quite disparate wordlists, with "sun" alternately given as Hope or Aaam, "moon" as pethara or kesha, "water" as something illegible (nhanan?) or gioi. The Viagem a Curitiba e Província de Santa Catarina also has some words, including goio "water". Greenberg, from who knows where, sources ope as "sun", oron as "new moon", and also e.g. čama "animal", puara "chest", teke "belly", bo, ambo "tree", ke "wood, fire", tong, ton "nape of the neck" (compare Greenberg's Puri thong), uka, høka "stone" (compare Puri ukhua), tima "love/want/enjoy" (compare Puri tamathi), baj "to live".
  • Menien (sai-men), attested in a Vocabulario Menien by Príncipe von Wied. He wrote in German, was translated with only a few alterations into French, and then was translated badly with a number of serious errors into Portuguese (says Loukotka, giving some examples from all three editions). Words (agreed on by the German and French editions) include hioi "wax", koinin [sic not h-] "child", and kohira [sic not c-] "red".

- -sche (discuss) 21:18, 16 August 2016 (UTC)Reply

More languages without ISO codes, part 4 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


{—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC))Reply


Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)Reply
Note that some discussion of Yumana, Mariaté (awd-mrt), Wainumá, Wiriná, Guinau, Baré, and Marawá is found on User talk:-sche#Proto-Nawiki. - -sche (discuss) 04:40, 21 November 2016 (UTC)Reply

Corobici edit

Whether Corobici existed as a separate language, and whether words in it are attested, is unclear to me (and apparently to a lot of writers).
Native-languages.org, based on who knows what, identifies Corobicí as bzd. They give four words, which I would not trust without other confirmation: tun "man", si "water", unsat "earth" and guá "house". (For comparison, Wikipedia gives the following Rama words: kiikna "man", kumaa "woman", nguu "house", sii "water".)
The 2013 Native Languages of the Americas, volume 2, citing Mason, envisions a "Rama-Corobici" family which includes the two languages "Rama" and "Guatuso (incl. Corobici)"; the Classification and Index of the World's Languages by Charles Frederick Voegelin and Florence Marie Voegelin also includes Corobici in Guatuso. Guatuso is also called Maléku Jaíka, and Ethnologue says it's specifically not intelligible with bzd.
However, the 1950 Handbook of South American Indians says "Guatuso, with its variety Corobici or Corbesi, and Rama with its dialect Melchora, are obviously very different from each other and from other Central American Chibchan languages, and Mason (1940) was evidently in error in making a Rama-Corobici subfamily." It goes on to place Guatuso and Rama in separate (sub)families.
Comparative Chibchan Phonology (1981), speaking about someone who has "gives a small sample of" Corobici, says "The 'Corobici' words are [actually] words from the dialect of Rama that was spoken in the region of Upala, Costa Rica, up to the 1920s. Really, not a single word of the language of the Corobicies (an extinct group) was recorded". An 1890s report by Daniel Brinton to the US Congress agrees, "Nothing remains of the Corobicies or Corvesies except the name Corobici or Curubici" used in placenames.
Other authors say the Corobici were the ancestors of both the Rama and Guatuso. Tozzer says "The modern Guatuso are probably descendants of, and synonymous with, the ancient Corobici." Lothrop says "It is generally assumed that the Rama were once a tribe identical in language and speech with the Corobici."
- -sche (discuss) 21:55, 9 August 2016 (UTC)Reply
Therefore, not added. - -sche (discuss) 22:00, 2 November 2016 (UTC)Reply

Cumana edit

It's hard to find information on this Chapacuran language, firstly because it's not discussed much in the literature, and secondly because it has the same name, in the same range of spellings, as the Cariban language of the Cumanagotos (cuo). Some sources speak of it as a dialect of Abitana (Wanham), and Loukotka's wordlist is quite similar to his Abitana wordlist; indeed, the two are more similar than some wordlists of the same language (e.g., of Peba) are to each other. Glottolog only has resources on Kuyubi / Kaw Ta Yo, and although the equation of that with Cumana is non-obvious, it would at least be a less confusing name (any spelling of Cumana is likely to be misinterpreted as cuo). - -sche (discuss) 03:36, 31 August 2016 (UTC)Reply

Guachí edit

There may be two lects called Guachí. Wikipedia places a Guachí / Wachí, which it considers possibly Guaicuruan, in Argentina. Glottolog places a possibly-Guaicuruan Guachí in southwestern Brazil, in Mato Grosso do Sul, where Opaie is also spoken. The index to Čestmír Loukotka, Johannes Wilbert, Classification of South American Indian Languages (1968), says they describe Guachí is two places: pages 51-52, which I can't see, and page 66, which says: "Opaie or Ofaie-Chavante - spoken on the Ivinhema, Pardo and Nhandui Rivers in the Brazilian state of Mato Grosso, now by only a few individuals. [Nimuendaju in Ihering 1912, pp. 256-260; Nimuendaju 1932b, pp. 567-573; Ribeiro ms.a.] The so-called language 'Guachi' on the Vaccaria River in the same state is only a dialectal version of Opaie. [Nimuendaju 1932b, pp. 567-573.]" Opaie is not Guaicuruan. It is possible that pages 51-52 describe a different lect. It is separately possible that the more recent (2004) Guaicurú no, macro-Guaicurú sí: Una hipótesis sobre la clasificación de la lengua Guachí (Mato Grosso do Sul, Brasil) knows more than Nimuendaju in 1932 and Loukotka in 1968 about the separateness and familiar relationships of Guachí. [5] lists two Guachí words, [iava] ‘diente(s)’ and [iacté] ‘pierna’. - -sche (discuss) 04:46, 11 July 2016 (UTC)Reply

Guanaca edit

I'm having trouble finding evidence that this lect was recorded. Greenberg says "In Paezan, we find Guambiana, Guanaca, and Totoro with the noun plurals -ele, -el, and -le, respectively", but existence of the Paezan family has been discredited and it's not clear where Greenberg gets his Guanaca data; Glottolog apparently doesn't list the lect and I have not yet found anyone other than Greenberg who mentions words in it. Loukotka cites Castillo y Orozco's 1877 Vocabulario páez-castellano as a source for the lect, but I haven't found it in that work; maybe it's under an alternative name or a spelling I couldn't think of to check? I've found the spellings Guanaca, Guanáca, Guanaco, Guanuco, Guanukco(?), Wuanaka mentioned or used in various sources. Huanca and Huanuco are names of a variety of Quechua (whereas WP says Guanaca is "perhaps Barbacoan"). - -sche (discuss) 15:51, 18 September 2016 (UTC)Reply

Kuwani edit

I'm ambivalent about this. WP says it's known only from one wordlist "and even its exact location is unknown. [And] Smits and Voorhoeve (1998) assumed it to be equivalent to Kalabra". There are, as WP notes, lexical differences, but compare other languages attested (with more certainty) in multiple wordlists that show great differences, including Tapachultec and Coroado Puri. OTOH, we could include it and just merge it later if more information came out that it was Kalabra. - -sche (discuss) 01:44, 20 August 2016 (UTC)Reply

With wordlist-only languages that thus have small, finite lexica, it's best to be conservative and keep them separate when there's no clear case for merger. —Μετάknowledgediscuss/deeds 04:39, 13 September 2016 (UTC)Reply
OK, added. - -sche (discuss) 22:00, 2 November 2016 (UTC)Reply


Baïnounk Gubëeher edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This obscure lect does seem like a separate language, but wasn't described until Cobbinah (2013). The Wikipedia article follows his lead and calls it thus, but it seems odd for us to do so, given that all the other Bainouk languages are given names starting with Bainouk. We also need to decide on a code, perhaps alv-bgu. —Μετάknowledgediscuss/deeds 00:19, 2 April 2016 (UTC)Reply

Added. Wikipedia says Gubëeher (which I can also find referred to as that single word, or with the alternative "prefixes"/first-elements Nyun Gubëeher or Nun Gubëeher) is merely closely related to the two Bainouks, so it's probably tolerable that the spelling is Baïnounk here and Bainouk there. Incidentally, WP citing Glottolog says bcb is a duplicate code for an alternative name of bab. Should they be merged? and should we drop the "Bainouk"/"Baïnounk" prefixes? - -sche (discuss) 03:25, 13 September 2016 (UTC)Reply
Incidentally, there's a book Le gúbaher, parler baïnouck de Djibonker by Noël Bernard Biagui which apparently uses yet another spelling. - -sche (discuss) 14:52, 18 September 2016 (UTC)Reply


Antigua and Barbuda Creole English edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming aig

We currently call this "Antigua and Barbuda Creole English" — names containing and are suspect, in my opinion. Wikipedia calls it "Leeward Caribbean Creole English". The most common name in literature seems to be "Antiguan Creole", which is what I suggest renaming it. - -sche (discuss) 03:22, 19 April 2016 (UTC)Reply

The country is called Antigua and Barbuda, so I don't see anything suspicious about the name per se. Nevertheless, I prefer Wikipedia's name, since it's also spoken in Montserrat, Saint Kitts and Nevis, and Anguilla. —Aɴɢʀ (talk) 17:53, 19 April 2016 (UTC)Reply
I concur with Angr, although I have hesitations about changing it. The use of "Antiguan Creole" seems to be mainly by scholars who are just studying the lect spoken on that island and not worrying about what is and isn't considered the same language. —Μετάknowledgediscuss/deeds 04:12, 26 June 2016 (UTC)Reply
Meh, left as-is. - -sche (discuss) 06:45, 27 March 2017 (UTC)Reply


Kela-Yela language edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This language got encoded twice, once as Kela [kel] and once as Yela [yel], based on its two names in two different provinces of DR Congo. Wikipedia opts for "Kela-Yela" as the compromise name, and I can't find a better one, unfortunately. (This also opens up "Kela" for [kcl], which we currently call "Kala", though that name seems to be less commonly used.) —Μετάknowledgediscuss/deeds 06:00, 4 October 2016 (UTC)Reply

I've merged yel into kel. - -sche (discuss) 04:20, 27 March 2017 (UTC)Reply


Sena language merger edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Sena is mutually intelligible with Chichewa, so it really shouldn't be considered a separate language, but that's probably not worth a merger, as the speakers seem to disagree. What the speakers do seem to agree with is that there is one Sena language, despite it having gotten two ISO codes, seh in Mozambique and swk in Malawi, which we call "Sena" and "Malawi Sena" respectively. We should probably just do away with swk. —Μετάknowledgediscuss/deeds 23:48, 8 October 2016 (UTC)Reply

I've merged swk into seh. No action taken with regard to Chichewa at this time. - -sche (discuss) 04:15, 27 March 2017 (UTC)Reply


Old Uighur language (oui) to Old Uyghur edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Spelling is inconsistent (we spell the modern language Uyghur). DTLHS (talk) 23:28, 26 December 2016 (UTC)Reply

Renamed. It seems to be as common or perhaps more common to spell it with a y, although searchign is complicated by chaff from the SOP phrases "old Uighur", "old Uyghur". - -sche (discuss) 01:53, 28 March 2017 (UTC)Reply


Chimwiini edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This is pretty decidedly too distinct to be considered a dialect; I was just reading a grammatical description and it's heavily different from Swahili. Maho, Kisseberth, and others seem to agree that it is a separate language. As for the name, it seems that "Chimwiini" (sadly, with the language prefix) is a better option than "Bravanese", used at w:Bravanese dialect. I suppose the code could be bnt-mwi. @-sche, I noticed this because of muke; what other Chimwiini entries did you add? —Μετάknowledgediscuss/deeds 03:28, 25 March 2017 (UTC)Reply

I know what your reasoning is for choosing bnt-mwi as the language code, but I think that it's better if the code matches the English name, thus bnt-chi or bnt-cmw. —CodeCat 18:56, 25 March 2017 (UTC)Reply
I agree it merits separation; indeed, I see articles going back to at least the 1960s pointing out that "Certainly Bravanese and standard Swahili are not mutually intelligible" (Morris Goodman's 1967 Prosodic Features of Bravanese, J. of African Languages 6). And yes, for better or worse codes have tended to be named based on the English names. Did you search for "Mwini" with one i? I find roughly comparable numbers of Google Books hits for "Mwini" Swahili, "Chimwiini" Swahili and "Bravanese" Swahili. (Ngrams, catching I-don't-know how much chaff, finds Mwini to be the most common spelling recently, followed by Bravanese — the most common spelling until recently — and then Mwiini, with Chimwiini not common enough for it to plot.) It looks like we could avoid the prefix if you wanted, though I defer to you if you checked more literature than the cross-section Google has digitized. Google Scholar has ~120 articles on "Chimwiini" Swahili compared to ~90 for "Mwini" Swahili, some of which are the irrelevant last name Mwini, and also ~90 for "Bravanese" Swahili, but many are bibliographies mentioning the same few articles. Other spellings I see, besides Chimwiini, Bravanese, Mwiini, and Mwini, include Chimwini, Chimini, and Brava. - -sche (discuss) 23:37, 25 March 2017 (UTC)Reply
In my own files, and likely elsewhere, the Chichewa word mwini gets in the way. But in any case, most of the works I see dedicated to the language that aren't on the older end use "Chimwiini". Anyway, bnt-cmw is fine. —Μετάknowledgediscuss/deeds 23:43, 25 March 2017 (UTC)Reply
I've added the code bnt-cmw. Btw, I didn't add muke: that was User:Muke themself all the way back in 2006. :-p - -sche (discuss) 03:06, 27 March 2017 (UTC)Reply
I'm shocked that it wasn't part of your water-and-woman effort. I guess I'll add the word for water now. —Μετάknowledgediscuss/deeds 03:25, 27 March 2017 (UTC)Reply


Aramaic edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The same language in two different scripts with two different literary standards, and yet each quite similar to each other. I think I see a pattern. —Μετάknowledgediscuss/deeds 07:49, 21 December 2012 (UTC)Reply

"Arc" should only be used for "Imperial Aramaic" (aka "Official Aramaic"), ideally written in the Old Aramaic script rather than Hebrew. Current usage does not reflect that though and there are a whole mix of dialects intertwined within the "arc" code, so that one at least should be split. "Syc" should stay as it is. --334a (talk) 08:04, 21 December 2012 (UTC)Reply
I think SIL has done a really bad job with the classification of Aramaic languages. ARC was an umbrella code that was used to describe all later Aramaic varieties in ISO-639-2 in ISO 639-3 they introduced SYC for classical Syriac, by far the most widespread form of literary Aramaic.--Rafy (talk) 17:38, 28 December 2012 (UTC)Reply


East Frisian lects edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This is an issue that needs to be resolved soon: there are 90 module errors related to it.

First let me lay out the linguistic details:

East Frisian, a language of Lower Saxony in Germany, is, along with West Frisian in the Netherlands and North Frisian in Germany, one of the w:Frisian languages. It has historically consisted of two dialect groups: one near the Ems River, and the other near the Weser River. The Weser dialects are all now extinct, with the w:Wursten Frisian dialect surviving into at least the 1700s, and the last speaker of w:Wangerooge Frisian dying in 1953. Although most of the Ems dialects died out by the 18th century, Saterland Frisian has survived to the present day. The extinct dialects were absorbed into Low German to become East Frisian Low Saxon.

The ISO has typically made a massive mess out of the Germanic languages, and they really screwed up in this case- until recently, it was impossible to tell if their code for "East Frisian", frs, referred to Frisian East Frisian or to East Frisian Low Saxon. If the first were true, it would overlap with Saterland Frisian, stq. If the second were true, it would be just another of the Low German lects that we decided earlier to treat as German Low German, nds-de (not Dutch Low Saxon, nds-nl, in spite of having "Saxon" in the name). Because of this ambiguity, the frs code was pressed into service for Frisian East Frisian in upwards of 140 etymologies and translations (there might have been entries, too, but I have no way to tell).

A few weeks ago, I mentioned to User:-sche that the online description of another lect had been updated. In the process of checking this out, he discovered that frs was now unambiguously described as East Frisian Low Saxon, and thus redundant to nds-de. After making the usual checks of the categories and changing uses of frs that he knew about, he removed frs from the data module. Unfortunately, East Frisian is mostly only mentioned in etymologies as cognates, and cognates don't show up in any of the categories, nor do redlinked translations. User:Leasnam (who added most of the frs references in the first place), -sche and I have been able to whittle it down from 137 entries in Category:Pages with module errors, to the present 90 just by getting rid of unnecessary ones and by changing recognizable instances of Saterland Frisian and East Frisian Low Saxon to the correct language codes.

As I see it, there are two halfway-decent options:

  1. Merge all of Frisian East Frisian into Saterland Low Frisian, stq, since the latter is the only surviving dialect of the former.
  2. Create an exception code for Frisian East Frisian, such as gmw-efr or gmw-fre.

There's also the possibilty of restoring frs as Frisian East Frisian, but that would put us in direct contradiction to the ISO standard, and leave things open for all kinds of confusion.

The first probably fits the linguistic facts best, but the second may be more practical, at least in the short run.

I've found no references on East Frisian Low Saxon, and very little on Saterland Frisian (there's a Saterland Frisian Wiktionary, but most of the remaining terms aren't mentioned there). There is, at least, one good dictionary of Frisian East Frisian available online. Those with better sources apparently don't have the time to work on this right now. Chuck Entz (talk) 02:41, 11 April 2015 (UTC)Reply

I don't think restoring frs is a good idea for the confusion you mentioned. I think option 1 is the best; we would then treat Saterland Frisian as the main dialect of East Frisian. Option 2 would just introduce more ambiguity, since one language would suddenly become a part of another. This is why we eliminated the Low German varieties in the first place. —CodeCat 17:28, 11 April 2015 (UTC)Reply
Yes, it wouldn't make sense to have both "East Frisian" and "Saterland Frisian" (they are not sufficiently distinct), so option 2 would only make sense if we changed instances of stq to the new pan-East-Frisian code and {{label}}-ed them. But we do need to recognize that there are inclusible East Frisian words which are not Saterland Frisian (namely, all the words and forms that we know — from records — existed in the non-Saterland varieties of East Frisian). Precedent exists of us repurposing codes to refer to slightly more things than the ISO intended them to refer to, e.g. we use gcf to refer to both gcf and acf and we used en to refer to both en-proper and hwc (Hawai'ian Creole English) and pld (Polari). Hence, I would add "East Frisian", "Eastern Frisian" etc as alternative names of stq, and then change all remaining uses of frs to stq to solve the module errors. Then, at leisure, we can go back through the affected pages and specify, whenever possible, which precise dialect they are from. (It may look a bit ugly to have e.g. "Wursten Saterland Frisian foo" or "Saterland Frisian foo (Wursten dialect)", but it's probably the best we can do, since changing the canonical name of stq to "East Frisian" would just invite people to become confused about it and misuse it again.) - -sche (discuss) 16:58, 13 April 2015 (UTC)Reply
By the way, with all these module errors on highly visited pages, we can't really wait for this to go through RFM procedure. I think whoever sees this next, if (s)he has the time, should implement -sche's temporary solution. (I myself will do it if I can get my work done in meatspace quickly enough.) —Μετάknowledgediscuss/deeds 06:36, 16 April 2015 (UTC)Reply


Weyto language edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This language's Wikipedia article helpfully explains that this extinct language of Ethiopia is unattested. However, since Amharic is an LDL and wordlists of the Weyto dialect of Amharic have been made that contain words purported to derive from it, we should make it an etymology-only language code. —Μετάknowledgediscuss/deeds 02:23, 4 April 2017 (UTC)Reply

I wonder why the SIL/ISO assigned a full code in the first place to an unattested extinct language of unclear family association that is not even certain to have existed... - -sche (discuss) 03:16, 4 April 2017 (UTC)Reply
I dunno! But after my efforts above on this page to identify new codes that need adding, I reckon I need to do some work identifying codes that need removing. —Μετάknowledgediscuss/deeds 03:22, 4 April 2017 (UTC)Reply
  Done. - -sche (discuss) 06:59, 12 May 2017 (UTC)Reply


Dupaningan Agta edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


At present, Dupaningan Agta is the language evoked by language code duo, but Dupaninan Agta is also a valid spelling. User:Mar vin kaiser has been adding entries in this language under Dupaninan Agta. —Stephen (Talk) 09:57, 5 April 2017 (UTC)Reply

The only reason though why I used Dupaninan Agta instead of Dupaningan Agta is because that's the spelling that came out of Wiktionary when I inputted duo. --Mar vin kaiser (talk) 10:18, 5 April 2017 (UTC)Reply
That's true: {{subst:\|duo}} yields "Dupaninan Agta", as listed at Module:languages/data3/d. —Aɴɢʀ (talk) 11:38, 5 April 2017 (UTC)Reply
Which, in turn, is because that's the spelling the Ethnologue/SIL/ISO used when we copied codes over, years ago. But the spelling with g seems slightly more common. Should it be made the canonical spelling, with the other one retained as another name? The names without "Agta" (‎i.e. "Dupaninan", ‎"Dupaningan") should also be otherNames, since they are sometimes encountered. - -sche (discuss) 17:04, 5 April 2017 (UTC)Reply
@Mar vin kaiser, Stephen G. Brown I have updated things so that the more common spelling is the one used in all the entries and translations tables and categories now: Dupaningan Agta. I'm sorry if this causes you to have to relearn how to type the header, etc. The "joys" of muscle memory! :-p - -sche (discuss) 07:27, 10 May 2017 (UTC)Reply


Gail/Gayle (gic) edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The Gayle language is a gay argot lexicon that can be used in English or Afrikaans. Words should be under whatever language they are found in, as argots are not independent languages, and this code should be removed. We have precedent for excluding gay argots based on our deletion of the code for Polari (discussion, most of which is off-topic). —Μετάknowledgediscuss/deeds 06:09, 6 July 2016 (UTC)Reply

Removed. It can be reinstated as an etymology-only code if the need arises. —Μετάknowledgediscuss/deeds 02:52, 16 May 2017 (UTC)Reply


the Kewa lects edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I propose to merge [kjy] "South Kewa"/"Erave Kewa", [kjs] "East Kewa" and [kew] "West Kewa"/"Pasuma Kewa" into [kew] as "Kewa". AFAICT most literature treats Kewa as a single language, and the only effect having three codes has had upon us so far is that our Kewa content is duplicated under several headers (as in ipa and utyali). There seems to be a far more marked difference between normal Kewa and its pandanus avoidance register than between South, East and West. - -sche (discuss) 06:00, 11 August 2015 (UTC)Reply

  Done. - -sche (discuss) 05:37, 14 May 2017 (UTC)Reply


(Proto-)Western Malayo-Polynesian into (Proto-)Malayo-Polynesian edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


[pqw-pro] → [poz-pro]

As discussed earlier this year. Western Malayo-Polynesian is a solely geographic group, it is not recognized by our language categorization system, and a proto-language appendix seems to be superfluous.

In addition to the mentioned durian issue, we have currently 13 Proto-WMP lemmas, with a breakdown as follows:

  • 10 entries are fully identical to corresponding Proto-MP entries (e.g. *huaji = *huaji, *wada = *wada)
  • 2 entries (*huaji-ŋ, *qari-mauŋ) are reconstructed from very scarce data, and the most likely situation is that they were just randomly lost in Central-Eastern MP.
  • 1 entry (*azak) is, per the cited source (Blust's dictionary), probably a late Wanderwort originating in Malay(ic) and not inherited.

--Tropylium (talk) 15:03, 29 June 2015 (UTC)Reply

Even if there are regional differences, you don't have to resort to separate protolanguages to explain them: for one thing, the substratum languages encountered off of Southeast Asia had to have been vastly different from those farther east. As for animal (and to a lesser extent plant) names, there's the matter of the w:Wallace Line and other such boundaries: the farther east you go, the fewer non-marine Asian species you find. By the time you get to Polynesia, the only flightless land animals (New Zealand is an exception, or course) are human-transported creatures such as pigs, chickens, dogs and rats, and the only widely-distributed plants are those with seeds that can drift on the currents, or Polynesian canoe plants- if you don't have wild beasts, you're not likely to preserve inherited names for them. Chuck Entz (talk) 02:30, 30 June 2015 (UTC)Reply
I've started to merge these. - -sche (discuss) 05:14, 29 May 2016 (UTC)Reply
  Done. I've moved the remaining 8 PWMP appendices and updated the entries that referred to PWMP in their etymologies. - -sche (discuss) 09:39, 15 May 2017 (UTC)Reply


Renaming Kxoe edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


There's a bit of noise on BGC, but it seems that "Khwe" is more common that "Kxoe" for [xuu], and continues to be used in modern books. See w:Khwe language. —Μετάknowledgediscuss/deeds 06:12, 4 February 2017 (UTC)Reply

Perusing Glottolog's large bibliography for this language, I note that Kxoe predominated and Kxoe was little used until about the year 2000, after which Khwe has predominated and Kxoe has been little used. Wikipedia uses Khwe and mentions that it is the preferred spelling. Renamed. - -sche (discuss) 07:51, 22 May 2017 (UTC)Reply


More languages without ISO codes, part 5 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


{—Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC))Reply


Comments:
I've split the discussion so that ones that are done can be archived. - -sche (discuss) 19:28, 10 July 2016 (UTC)Reply
I've added Pai-lang and Nicola. - -sche (discuss) 06:31, 30 March 2017 (UTC)Reply
I've added Moran. - -sche (discuss) 04:15, 1 May 2017 (UTC)Reply

Sinúfana edit

The variety of names this language has and their homography with the names of the people and the place they lived, has made it hard to search for information on this language. And some scholars say there is nothing to find: The Indigenous Languages of South America (edited by Lyle Campbell and Verónica Grondona) quotes "Adelaar and Muysken (2004: 624): [Sinu] 'cannot be classified for absence of data.' Loukotka (1968: 257) grouped Zenú (Senú) with the Chocó Stock, though nothing was known of the language." However, I've been poking around for quite a while now, and finally found a copy of A. Oyuela-Caycedo's full article in the Handbook of South American Archaeology (edited by Helaine Silverman, William Isbell), which I'd previously only been able to see a page of. On the page I'd seen, Oyuela-Caycedo says "the descendants of the Sinú have lost their language, making it impossible to classify them in terms of known linguistic families (Adelaar and Muysken 2004). However, taking toponyms into consideration the area seems to have been occupied by Chibchan speakers." On the next page, however, Oyuela-Caycedo goes on to discuss Spanish records and mentions that the "name or title" of the Finzenú chief was "Tota", with which clue I tracked down Jaime Alberto Castro Núñez's Historia de la medicina en Córdoba: notas preliminares (2002), mentioning a few other words and names: "A la llegada de los cappunia, como los indios llamaban en su lengua a los españoles, la cacica de Finzenú era Tota; el Zenúfana era Nutibara y el Panzenú era Yapé." "At the arrival of the cappunia, as the Indians called in their language the Spanish, the chief of Finzenu was Tota, that [of] Zenúfana was Nutibara and that [of] Panzenú was Yapé." (Other mentioned names are Anunaybe and Quinuchu.) The ambiguity over whether Tota was a proper name or title appears to be original; I find what seems to be a copy of Sebastián de Benalcázar's writings, saying "[...] Tota, nombre que no sabemos si era el de su cargo o propio de la última reina de este riquísimo pueblo". - -sche (discuss) 05:21, 11 May 2017 (UTC)Reply

Even more languages without ISO codes, part 3 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


04:54, 6 July 2016 (UTC)

Australian languages
other codes

Alungul, Angkula edit

Alungul and Angkula are very similar (but, OTOH, so are other related languages), with Barry Alpher mentioning the following words: Alungul oth:o, Angkula otho "liver" (but e.g. Ikarranggal otho is also similar); Alungul mbolvm, Angkula ombolvm "mosquito" (but also e.g. Ogunyjan ombolvm); Alungul orormv, Angkula otil "nape" (the latter "likely a loan from Uw-Olkola odel", and other languages use different words); Alungul obmo(gng), Angkula obmu "nose" (but also e.g. Athima ubmu); Alungul atïl, Angkula atï "see" (Ikarranggal ara); Alungul amadhv, Angkula amadh "shin" (Athima amadhv); Alungul anggul, Angkula angkul "tooth" (Athima angkul, Ogunyjan enggul). Many words were even recorded from the same informant, with one scholar writing "West recorded considerable material in Ogh Alungul and Ogh Anggula from the now deceased Mr Jack Burton, but subsequently l was able to record and transcribe not only a few fragments more of these but a quantity of another dialect." It is difficult to say if they should be treated as one language, and what it would be called. They have separate codes for now. They could always be merged later. - -sche (discuss) 18:35, 25 May 2017 (UTC)Reply

Other comments edit

I also added a code for Yangkaal, which Ethnologue had conflated into nny for some reason. - -sche (discuss) 03:25, 26 May 2017 (UTC)Reply

Languages of France edit

For discussion about the addition of codes for Angevin (roa-ang), Champenois (roa-cha), Lorrain (roa-lor), and Franc-Comtois (roa-fcm), Orléanais (roa-orl), Poitevin-Saintongeais (roa-poi), and Tourangeau (roa-tou), and discussion of Bourbonnais and Berrichon, Mayennais and Sarthois and Percheron, see Wiktionary:Beer parlour/2017/May#Language_codes_for_Bourbonnais_and_Poitevin. - -sche (discuss) 21:57, 1 June 2017 (UTC)Reply

Renaming Standard Moroccan Amazigh (zgh) edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The preferred form in the ISO 639-2 is: Moroccan Amazigh. — This unsigned comment was added by YesIn (talkcontribs) at 17:34, 24 June 2017‎.

I have moved this from Wiktionary talk:Language treatment/Discussions, the out-of-the-way place where finished discussions are archived. - -sche (discuss) 18:00, 24 June 2017 (UTC)Reply
In general, we do try to avoid using "Standard" in language names (e.g. we have "Estonian", not "Standard Estonian"), so a rename would be good, and "Moroccan Amazigh" does seem to be more common than "Moroccan Tamazight". - -sche (discuss) 18:00, 24 June 2017 (UTC)Reply
Support per -sche. This language is currently our only one using "Standard" of all the coded languages with categories. —Μετάknowledgediscuss/deeds 14:33, 27 June 2017 (UTC)Reply
You can find in the official Request for new language code:

"Name(s) of language (English): (Required)

Moroccan Amazigh, Amazigh, Common Moroccan Amazigh, Standard Moroccan Amazigh, Moroccan Berber

If giving variant names, indicate preferred form first
Name(s) of language (French):

Amazighe marocain, amazighe, amazighe marocain commun, amazighe marocain normalisé, berbère marocain, amazighe standard.

If giving variant names, indicate preferred form first
Reference where found:

The French name is used by Institut royal de la culture amazighe (IRCAM). Amazighe is the name found in the French version of the constitution approved by a referendum on the 1st of July 2011.
http://lematin.ma/Events/discours-royal/constitution-referundum.pdf (article 5) and http://www.sgg.gov.ma/bo5952F.pdf?cle=42 (Official Government Gazette)
The qualificative "marocain"/"Moroccan" is used to make sure it does not refer to a specific Moroccan dialect of Berber [tzm] (Moroccan being broader than the Central Atlas qualitificative used for [tzm]) or to the Berber/Amazigh family as a whole [ber] (Moroccan being narrower than the whole family’s coverage area).

Name(s) of language (indigenous):

ⵜⴰⵎⴰⵣⵉⵖⵜ". ‒ YesIn (discuss) 03:08, 5 September 2017 (UTC)Reply


Renaming Banggarla (bjb) edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Judging by Google Books and Scholar searches, and based on the references cited by Glottolog and Wikipedia, the most common name for this language in recent literature (since the 1990s?) seems to be Barngarla, while the most common name historically/overall (still found in some recent references) is Parnkalla. Some old references have been updated from Parnkalla to Barngarla, e.g. Mark Clendon's Clamor Schürmann’s Barngarla grammar: A commentary on the first section (where the referred-to original had Parnkalla), which argues based on recordings as well as etymology that the name is /parnkarla/, with the /ŋ/-form Banggarla as a northern dialect form "or even an exonymic pronunciation". Banggarla, and Barngala with no second 'r' and Parnkarla with 'P' and two 'r's, seem relatively uncommon, and many other spellings exist (see Glottolog). I suggest a rename to Barngarla, or Parnkalla (this entails moving the categories). (Fr.Wikt has "banggarla" and "barngala" as separate languages, but this seems to be an oversight and I will let them know.) - -sche (discuss) 15:43, 26 June 2017 (UTC)Reply

I support a rename to something, with "Barngarla" being my preference, but we seem to vacillate in general on whether we should use the more traditional spelling or the one that is becoming the standard. Compare "Kikuyu" vs "Gikuyu". —Μετάknowledgediscuss/deeds 19:22, 26 June 2017 (UTC)Reply
Renamed. - -sche (discuss) 03:14, 9 July 2017 (UTC)Reply


East Franconian (vmf), High Franconian (gmw-hfr) edit

Noting here that gmw-hfr was retired and vmf restored after becoming usable; see Wiktionary:Beer parlour/2017/November#Restoring_vmf_and_eliminating_gmw-hfr. - -sche (discuss) 23:45, 29 December 2017 (UTC)Reply

Splitting Monguor into Mangghuer and Mongghul edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This was already discussed at Wiktionary:Beer_parlour/2016/December#Splitting_Monguor, but seeing as it's been a few days I've decided to use the circumstance of this being the preferred place for such requests to bump the topic. Crom daba (talk) 05:19, 12 December 2016 (UTC)Reply

@Crom daba, what would happen to words in older literature such as de Smedt / Mostaert 1933 or Todaeva 1973 which are not clearly marked as being either Minhe or Huzhu? Now they can be added under Monguor. Someone can later label them as Mangghuer or Mongghul using {{lb|mjg|}}. After splitting, a lot of useful stuff from older sources will hang in the air. --Vahag (talk) 14:49, 22 December 2016 (UTC)Reply

Most sources are identifiable as either Mangghuer or Mongghul, we could specify which resource contains what in the about page. A bigger problem for me is how do we even enter data from the old sources? Do we put in Todaeva's Cyrillic and Mostaert's pre-IPA phonetic symbols or do we transcribe it into Pinyin? I wasn't around long enough to see what the precedent here is. Crom daba (talk) 22:08, 22 December 2016 (UTC)Reply
If you can specify which resource contains what in the about page, then I support the split. Otherwise, I would like to have three codes, like we do with ku (Kurdish macrolanguage), kmr (Northern Kurdish variety), ckb (Central Kurdish variety).
As for entering words from older sources, you can normalize them into Pinyin, as long as the rules of normalization are clearly defined. Look at what I have done with Udi at WT:AUDI. --Vahag (talk) 06:56, 23 December 2016 (UTC)Reply
After some research, it appears that correspondence of Pinyin spellings (as written by Dpal-ldan-bkra-shis et al) and older attestations is non-trivial, so I will put off transcribing anything which isn't already in Pinyin, at least until I see an example of it supporting the full range of dialectal and historical Monguor variation. Crom daba (talk) 00:16, 24 December 2016 (UTC)Reply
@Crom daba, Vahagn Petrosyan, Angr, Metaknowledge: I added codes for Mangghuer (xgn-mgr) and Mongghul (xgn-mgl). I suggest that we move as much content as possible to the new codes and update the orthography as we go, and that will give us an idea of whether or not it is feasible to retire the macrolanguage code yet. - -sche (discuss) 00:58, 2 June 2017 (UTC)Reply


Splitting Evenki and Solon edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Solon is a language spoken by a Tungusic people living in Manchuria and Inner Mongolia, they consider themselves Evenks, but their language is somewhat different and is not usually counted among Evenki dialects in literature. We follow Ethnologue in categorizing it as Evenki, but most Russian Tungusic literature (Tsintsius, Vasilevich, ...) counts it as a separate language and I too think this would be preferable.

Also worth mentioning is that we have the language of Oroqen as a separate language already for whom Janhunen claims are " in fact much closer to the "Ewenki proper" (i.e., the Evenks of Siberia) than the Solon are" (quote from Wikipedia). Crom daba (talk) 08:37, 7 August 2017 (UTC)Reply

Sadly (as the lack of response shows), I think you may be the only editor with knowledge of Tungusic. The English-language works I can find discussing the lects, while mostly general rather than specialist, also seem to mostly speak of them as separate languages, even though Wikipedia redirects "Solon language" to Evenki. I see that Solon currently has an etymology-only code ("evn-sol"); do you want it to have a "full" code and its own header and entries like дяви, @Crom daba? That seems reasonable, although the code should be "tuw-sol", I think, to fit the customary naming scheme describe in WT:Languages; if that's OK, ping me to add it or add it yourself to Module:languages/datax (and then update the entries that refer to it and remove the etymology-only code "evn-sol"). - -sche (discuss) 06:44, 29 December 2017 (UTC)Reply
Yes thank you, I'd like it to have its own header @-sche. Crom daba (talk) 17:02, 29 December 2017 (UTC)Reply
  Done: [6], [7]. - -sche (discuss) 23:17, 29 December 2017 (UTC)Reply


Renaming (A)Ngas, anc edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming anc

A pretty clear case. We call it "Angas", but the name "Ngas" has been more common for quite some time now. Compare google books:"Angas language" with google books:"Ngas language" for an example. —Μετάknowledgediscuss/deeds 20:53, 14 August 2017 (UTC)Reply

Yes, perusing those searches and Glottolog's list of reference works about it, I see that some books do still use "Angas" but "Ngas" does seem to have been more common for at least a couple decades. (And you have access to better resources on African languages than I do and I trust your judgment.) I recall that we renamed the related Mwaghavul (from "Sura") only a couple years ago. Renamed. - -sche (discuss) 06:15, 29 December 2017 (UTC)Reply


Removing Jorto, jrt edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Removing jrt

An anon has pointed out that Jorto language is apparently a spurious invention. There is a wordlist, and a case could be made for including it, but given that we chose to exclude the Oropom language, I don't see how this is any different. —Μετάknowledgediscuss/deeds 00:57, 28 December 2017 (UTC)Reply

Remove before it lays eggs. Palaestrator verborum sis loquier 🗣 01:07, 28 December 2017 (UTC)Reply
  Done. - -sche (discuss) 23:39, 29 December 2017 (UTC)Reply


Even more languages without ISO codes, part 4 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Even more languages without ISO codes, part 4

Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)Reply

Others:

  • Alazapa/Alasapa/Pinto (nai-ala), a fragmentarily attested language of northern Mexico, scarcely described in English and not that much better described in Spanish. Considered to be related to Quinigua and Cotoname by Norman A. McQuown's 1968 Handbook of Middle American Indians (volume 5: Linguistics, page 100), it is sometimes identified with or considered a dialect of Coahuilteco, apparently as part of the former belief that the "the Coahuiltecans belonged to a single language family and that the Coahuiltecan languages were related to the Hokan languages of California, Arizona, and Baja California. Most modern linguists, however, discount this theory for lack of evidence and believe that the Coahuiltecans were diverse in both culture and language. At least seven different languages are known to have been spoken[.]" Some of the scholars who responded to and followed up on del Hoyo's vocabulary of Quinigua provide a few Alazapa words, like axi "tobacco" (compared to Karuk úuh "tobacco", Esselen ka'a "tobacco"). - -sche (discuss) 03:48, 6 June 2017 (UTC)Reply
    • I'm away from my books at the moment, but I seem to remember Yuman languages having something along the lines of /up/ for tobacco. Chuck Entz (talk) 04:01, 6 June 2017 (UTC)Reply
      • Yes, Cocopa has ˀu·p "tobacco", and ˀu·p xyay "smoke tobacco". Quechan/Yuma itself has axta/ak’sa’ for a tobacco pipe; in a short search, I didn't find the [Quechan] word for tobacco itself, but the list I found the Alazapa word in was comparing it to mostly words for "tobacco" but some words for "pipe". (I updated the spelling of the Karuk and Esselen words to the spellings given in dedicated works on those languages.) - -sche (discuss) 04:30, 6 June 2017 (UTC)Reply
      Added. - -sche (discuss) 19:48, 16 June 2017 (UTC)Reply


Ossetian: RFM discussion: December 2012–January 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Template:os

The Digor and Iron dialects of Ossetian seem quite different, and already many (most?) of our entries distinguish which is meant. It seems to me that there is a fair chance that the two are separate enough to deserve being called different languages here. —Μετάknowledgediscuss/deeds 04:46, 19 December 2012 (UTC)Reply

I've seen them referred to as separate languages before, but there's still some debate over that. Doesn't matter to me. But would there still be plain Ossetian language entries or would all be sorted into the new languages? There are some that aren't labelled as either Iron or Digor.Word dewd544 (talk) 17:58, 20 December 2012 (UTC)Reply
Iron is by far the more common dialect, and the literary Ossetian language is based on Iron. However, Digor is different enough that it could be considered a separate language. The main things against it here are the relatively small number of speakers and that it does not yet have a written standard, as far as I know. But there is now a Digor dictionary out there, and it’s probably just a matter of time before Digor develops a literary standard of its own. I think it’s unlikely that we will get enough Digor contributions to make a difference, but it is always possible that someone will start entering words from a Digor dictionary. The Digor language code is oss-dig. We could use os for Ossetian proper (and Iron), and oss-dig for Digor. —Stephen (Talk) 02:20, 21 December 2012 (UTC)Reply
In that case, I support. — Ungoliant (Falai) 03:40, 21 December 2012 (UTC)Reply
A standard language based on one widely-spoken dialect, and another lect sometimes considered a dialect and sometimes considered its own language? This reminds me of Tosk vs Gheg Albanian: some references say they're mutually unintelligible separate languages, speakers say their differences present no impediment to communication. Unfortunately, we lack speakers of the Ossestian lects, and the dictionary of Digor is said to waffle, the author calling it a language and the editor calling it a dialect. Stephen is probably right that it's just a matter of time before Digor develops its own standard (and merits separation as much as Luxembourgish and Limburgish do from each other and from German); OTOH, Wiktionary, like Wikipedia, is not a crystal ball. My preference would be to wait and not split them for now. If we do split them, I agree with using {{os}} for Iron (compare {{lt}} and {{sgs}}), and we should devise an exceptional code for Digor that fits our usual naming scheme (ira-odg or ira-dig), rather than using Linguist List's ersatz "oss-dig". - -sche (discuss) 03:42, 21 December 2012 (UTC)Reply
Etymology-only codes have been created for Digor and Iron, per the last part of this January 2018 WT:ES thread. - -sche (discuss) 03:21, 10 January 2018 (UTC)Reply
Not merged at this time; cf the WT:ES discussion; but especially if the developments with regard to resources in the lects over the past few years suggest they should, in fact, have separate codes, please feel free to reopen (or start a new, non-stale) discussion. - -sche (discuss) 14:47, 12 January 2018 (UTC)Reply


Renaming (Tshi)Luba, lua edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming lua

This language is currently called "Tshiluba", which is a really awful choice. First of all, tshi- is that good ol' language prefix that we often try not to have in language names (which I think is ci- in modern orthography), and there are in fact two Luba languages (the other is lu "Luba-Katanga"). To avoid confusion, we rightfully give neither the name Luba, but this is not much better, and we should rename it to "Luba-Kasai", as Wikipedia does. —Μετάknowledgediscuss/deeds 19:28, 27 November 2015 (UTC)Reply

On the one hand, we do try to avoid prefixes. On the other hand, "Luba-Kasai" seems to more often be a placename and an ethnonym than a language name, and "Tshiluba" seems to be about twice as common. - -sche (discuss) 21:15, 8 January 2016 (UTC)Reply
The issue is that, AFAICT, "Tshiluba" is more commonly used because it refers to both Luba languages! This is not so much about prefixes so much as the issue of the name being exceedingly ambiguous in its referent. —Μετάknowledgediscuss/deeds 01:28, 9 January 2016 (UTC)Reply
OK, renamed. - -sche (discuss) 21:19, 14 January 2018 (UTC)Reply


Taishanese and Teochew edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


@-sche I think Taishanese dialect of Cantonese and Teochew dialect of Min Nan both need language codes. They are covered by Chinese pronunciation modules but theire transliterations is very different from Cantonese and Min Nan accordingly. E.g. in 鉛筆铅笔 (qiānbǐ) "yon3 bit2" is not standard Jyutping and "ing5 big4" is Teochew Peng'im, not POJ. @Kc_kennylau, Wyang, Suzukaze-c, Justinrleung. Perhaps these subdialects needs nesting in translations and numbered tone marks (also Gan, Jin, Xiang) also need superscripts, just like Cantonese Jyutping. --Anatoli T. (обсудить/вклад) 11:32, 12 November 2016 (UTC)Reply
typo fix→@Kc_kennylau, Wyang, Suzukaze-c, Justinrleungsuzukaze (tc) 12:05, 13 November 2016 (UTC)Reply
Thanks, suzukaze! --Anatoli T. (обсудить/вклад) 12:08, 13 November 2016 (UTC)Reply
They seem to have language codes already (yue-tai for Taishanese and nan-teo for Teochew); at least they work in the etymology templates. — justin(r)leung (t...) | c=› } 00:58, 14 November 2016 (UTC)Reply
@Justinrleung Thanks but if I try add a translation using 'yue-tai' I get the error:
Lua error in Module:languages/templates at line 28: The language code 'yue-tai' is not valid.: Lua error in Module:parameters at line 110: The parameter "<strong class" is not used by this template. --Anatoli T. (обсудить/вклад) 06:15, 14 November 2016 (UTC)Reply
Yes, those are etymology-only codes. They would standardly need different codes if we are to treat them as full languages, though. The point is that you guys need to decide what status you want these lects to have. @Atitarev, Justinrleung, suzukaze-c, WyangΜετάknowledgediscuss/deeds 06:24, 14 November 2016 (UTC)Reply
If there's a need to enter them into translations tables (on account of their different transliteration and, according to Wikipedia, sometimes different vocabulary), they could be given full codes, which as Meta notes would be named a little differently (using the system described in WT:LANG): zhx-tai and zhx-teo. I await Wyang's input. As an aside, we should consider taking a Chinese approach to Arabic, i.e. not have separate headers for each dialect, but retain the option of listing each dialect's pronunciation and maybe having each dialect in translations tables. - -sche (discuss) 16:36, 14 November 2016 (UTC)Reply

@Kc_kennylau, Wyang, Suzukaze-c, Justinrleung, -sche, Metaknowledge Thanks all. Will further nesting open a Pandora's box of nesting subdialects if we do it like this? (pls note new rows for Taishanese and Teochew):

* Chinese:
*: Cantonese: {{t|yue|鉛筆|sc=Hani}}, {{t|yue|铅笔|tr=jyun4 bat1|sc=Hani}}
*:: Taishanese: {{t|zhx-tai|鉛筆|sc=Hani}}, {{t|zhx-tai|铅笔|tr=yon3 bit2|sc=Hani}}
*: Gan: {{t|gan|鉛筆}}, {{t|gan|铅笔|tr=nyyon4 'bit6}}
*: Hakka: {{t|hak|鉛筆}}, {{t|hak|铅笔|tr=yèn-pit}}
*: Jin: {{t|cjy|鉛筆}}, {{t|cjy|铅笔|tr=qie1 bieh4}}
*: Mandarin: {{t+|cmn|鉛筆|sc=Hani}}, {{t+|cmn|铅笔|tr=qiānbǐ|sc=Hani}}
*: Min Dong: {{t|cdo|鉛筆}}, {{t|cdo|铅笔|tr=iòng-bék}}
*: Min Nan: {{t+|nan|鉛筆}}, {{t|nan|铅笔|tr=iân-pit}}
*:: Teochew: {{t|zhx-teo|鉛筆|sc=Hani}}, {{t|zhx-teo|铅笔|tr=ing5 big4|sc=Hani}}
*: Wu: {{t|wuu|鉛筆}}, {{t|wuu|铅笔|tr=khe piq}}
*: Xiang: {{t|hsn|鉛筆}}, {{t|hsn|铅笔|tr=yan2 bi6}}

There's some work for translation adder as well. --Anatoli T. (обсудить/вклад) 21:27, 15 November 2016 (UTC)Reply

In this case I would really prefer:

* Chinese: {{zh-l|鉛筆}}

where {{zh-l}} extracts and displays the simplified form from the entry, as well as extracts all the readings of the word from the entry, and (if on a computer) displays the readings on hover-over or (if on a mobile device) something. Additional topolect-specific translations can be added as:

* Chinese: {{zh-l|鉛筆}}
*: Cantonese: {{zh-l|鉛鉛筆}}
*: Mandarin: {{zh-l|筆鉛鉛}}

Wyang (talk) 21:34, 15 November 2016 (UTC)Reply

We'd need a full-blown {{zh-t}} that can serve the function of linking to zh.wikt if an entry exists, and you'd also need to run a bot to update those every now and then (or convince Ruakh to do it). This would imply some changes to the translation adder as well, and possibly other gadgets and bits of code lying around. In short, that's a big jump from what Anatoli proposed. It does sound like a good idea, though, if you want to put the requisite work into it. —Μετάknowledgediscuss/deeds 00:10, 16 November 2016 (UTC)Reply
@Wyang: so, do we need to add language codes for these varieties (in which case, please add them), or are they adequately covered by zh? - -sche (discuss) 23:19, 11 January 2018 (UTC)Reply
@-sche I think having full language codes for these would be useful in translation tables. @Justinrleung, Atitarev, Suzukaze-c, Tooironic, Dokurrat What thinkest thou? Wyang (talk) 13:12, 12 January 2018 (UTC)Reply
@Wyang: Thank you for pinging me but I'm not familiar with Cantonese and Banlamgu and their regional speeches and I currently have no idea about this... Dokurrat (talk) 13:35, 12 January 2018 (UTC)Reply
@Dokurrat Thank you for your frankness... :) Wyang (talk) 13:42, 12 January 2018 (UTC)Reply
Support. — justin(r)leung (t...) | c=› } 13:35, 12 January 2018 (UTC)Reply
OK, thank you all for your input.   Done. :) - -sche (discuss) 14:44, 12 January 2018 (UTC)Reply
Hmm, having translations is nice but do we really want Category:Teochew lemmas? —suzukaze (tc) 23:18, 13 January 2018 (UTC)Reply
I presume such categories would be populated the same way as Category:Min Nan lemmas (and Category:Old Chinese lemmas, etc), i.e. the entries use the consolidated "Chinese" L2 header... at which point, having Category:Teochew lemmas seems no better or worse to have than Category:Min Nan lemmas. (There are some other languages where it could be good to do similar, e.g. Arabic and possibly Romani.) - -sche (discuss) 23:38, 13 January 2018 (UTC)Reply
That's true. It seems weird though, because Teochew is Min Nan, and Taishanese is Cantonese. —suzukaze (tc) 05:18, 14 January 2018 (UTC)Reply


Diegueño edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Diegueño

I thought I might dig out a few references I have and add a few entries in one of the lects that go by this name, but I'm a bit confused by the way we have the language codes set up.

Diegueño is the name that anthropologists have traditionally used for the language originally spoken around Mission San Diego in the southwestern corner of California. Older references referred to it as single language covering most of San Diego County, California and northern Baja California, Mexico, but I always understood it to be at least three languages:

  1. Northern Diegueño, known to its speakers as 'Iipay 'aa, and generally called Ipai in the literature
  2. Central Diegueño, now called Kumeyaay or Kumiai
  3. Southern Digueño, now calley Tipai or Tipay

Just to confuse things, Kumeyaay is sometimes used to refer to all three, and there are some scholars who have merged Tipay and Kumeyaay into a single language that they call Tipai. Then there's Kamia, which has been used in older literature for a variety of groups who all seem to have been Diegueño of one sort or another. The ISO has a single code, dih (which we call Kumiai) and our lone entry using that code is the 'Iipay 'aa word for water. That would make sense if we treated all of Diegueño as one language, but we have an exception code for Tipai: nai-tip, and a single entry. As far as I know, no one currently considers Ipai to be part of Kumiai unless they consider Tipai to be part of it, too- hence my confusion.

I've only studied Ipai (perhaps I should say "dabbled in"), so I can't judge for myself how different the lects are from each other. Based on what little I have read, it would seem to me that we have just a few credible options:

  1. Treat Diegueño as a single language, keeping dih and retiring nai-tip
  2. Treat Ipai as separate, but merge the rest of Diegueño into Tipai
  3. Treat each of the three lects as independent, preferably all with exception codes (nai-ipa, nai-kum and nai-tip, perhaps?).

I would recommend against using dih for anything but the single-language option- this isn't a macro-language with a standard or prestige lect, and it would probably just confuse things. Chuck Entz (talk) 06:12, 19 June 2017 (UTC)Reply

(After digging into the history of the codes, I've refreshed myself that) I added the code for Tipai in diff, so I could add diff, after seeing that we called dih "Kumiai" and taking that to mean that it referred to the Central dialect and the others needed codes. (Apparently, the ISO/Ethnologue's actual reason for calling it "Kumiai" is that some people use "Kumiai" instead of "Diegueño" as the cover term, as you note.) I probably didn't add a code for the Northern variety at that time because I didn't want to bother figuring out what to call it ("'Iipay 'aa" struck me as a suboptimal name; do we have other names with spaces in them where the part after the space isn't capitalized?) at a time when no-one was coming forward with content that needed to be added in it. :p
Victor Golla, California Indian Languages (2011, →ISBN, page 120, says: "While Kroeber (1925) and others treated the Kamia as a Diegueño subgroup, there is no firm evidence in support of this approach, although the name they are known by [Kamia] appears to be a variant of Kumeyaay (Langdon 1975a). With this possible exception, all of the groups definitely known to have spoken varieties of Diegueño were located west of the present San Diego-Imperial County line or in Baja California west of the Sierra de Juarez. [...] Although Ipai and Tipai are to some extent mutually intelligible, they show numerous differences in vocabulary and structure (for a comparison of Mesa Grande Ipai and Jamul Tipai see A. Miller 2001:359-363) and have sometimes been treated as separate languages. Winter [...] judged [Tipai] to be no closer to (Northern) Diegueño than to Cocopa. The most recent classification (Langdon 1991; Miller 2001:1-4) distinguishes [Ipai, Tipai and Kumeyaay]."
Amy Miller's referred-to work, A Grammar of Jamul Tipay, says "A comparison of descriptions of Mesa Grande [...] with the results of my own research reveals that differences between Mesa Grande and Jamul pervade the phonology, lexicon, derivational morphology, inflectional morphology, syntactic morphology, syntax, and discourse."
OTOH, Tipai Ethnographic Notes: A Baja California Indian Community (2001, →ISBN, edited by Langdon et al, says: "These Mexican Diegueno, who call themselves Ipai or Tipai 'people,' cannot be described as now forming a tribe; they are a group of Indian families speaking mutually intelligible dialects (Northern and Southern) of a language[.]"
- -sche (discuss) 18:51, 24 June 2017 (UTC)Reply
Amy Miller has a comparison of Ipai and Tipai in her work cited above; I have put a shorter comparison of various words at User:-sche/Diegueño. - -sche (discuss) 20:19, 24 June 2017 (UTC)Reply
@Chuck Entz I have split (and retired) dih into three codes as proposed above. - -sche (discuss) 22:02, 19 January 2018 (UTC)Reply


Renaming Azeri to Azerbaijani edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming az

I propose to rename the language name Azeri to Azerbaijani. The reasons are (1) to distinguish it from the Iranian language Azeri, (2) Ethnologue, Glottolog and Wikipedia call the language Azerbaijani, (3) Azerbaijani is closer the native form Azərbaycan dili, (4) Azeri has pejorative connotations in Armenian. @Allahverdi Verdizade, currently the most active Azerbaijani contributor, agrees. --Vahag (talk) 14:01, 24 January 2018 (UTC)Reply

Support. Allahverdi Verdizade (talk) 15:52, 24 January 2018 (UTC)Reply
Support. And to summon evidence more appropriate to our dictionarying efforts, Google Books searches for "speaking X" or "the X language" have invariably returned more hits with "Azerbaijani" than with "Azeri" when I have tried them. —Μετάknowledgediscuss/deeds 17:22, 24 January 2018 (UTC)Reply
Pinging several potentially interested users for more opinions: @Atitarev, ZxxZxxZ, Crom daba, Anylai. --Vahag (talk) 15:23, 25 January 2018 (UTC)Reply
Support. It took me some getting used to the English "Azeri" at Wiktionary, which also has negative connotations in Russian and many Azerbaijanis speak Russian. а́зер (ázer) is pejorative for азербайджа́нец (azerbajdžánec) in Russian. --Anatoli T. (обсудить/вклад) 15:28, 25 January 2018 (UTC)Reply
Support. Azeri is a little "vague" while Azerbaijani is kind of long and sounds only limited to Azerbaijan, but I support the change, "Azeri" is used for people rather than the language itself in Turkish as well. --Anylai (talk) 17:33, 25 January 2018 (UTC)Reply
Support --Z 20:06, 25 January 2018 (UTC)Reply
Support for the reasons Vahag and Meta outlined. Someone with a bot will need to implement the rename, because at least three thousand entries will need to be updated, and that's just the ones where the L2 header needs to be changed; in other entries, translations tables and descendants lists and etymologies where the name is spelled out will need updating. - -sche (discuss) 23:51, 25 January 2018 (UTC)Reply
Support. I never liked that we used that name. PseudoSkull (talk) 00:16, 26 January 2018 (UTC)Reply
Looks like this is uncontroversial, so   Done. I made a request at the Grease Pit for a rename by a bot. --Vahag (talk) 17:52, 9 February 2018 (UTC)Reply

Has anyone modified the tool that assists in adding translations to translation tables so that it uses the word “Azerbaijani”? — SGconlaw (talk) 04:57, 10 February 2018 (UTC)Reply

That happens automatically. DTLHS (talk) 05:04, 10 February 2018 (UTC)Reply
Oh, good. — SGconlaw (talk) 05:54, 10 February 2018 (UTC)Reply


Removing spurious Dororo, Guliguli, Yarsun, and Wares edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Some spurious languages to merge or remove, 1
merge Dororo [drr] and Guliguli [gli] into [kzk]

In 1953, Lanyon-Orgill provided short wordlists of two languages he called Dororo and Guliguli. His lists are so similar to Kazukuru that subsequent scholars have suspected they are dialects, if not alternative transcriptions, if not jokes. Karen Davis, in A grammar of the Hoava language, Western Solomons, notes "there was no one in present day Kusaghe who had heard of [Dororo or Guliguli], and Lanyon-Orgill does not identify his informant. As one of the names of the dialects, Guliguli, can mean 'masturbate' in Hoava-Kusaghe, I have my doubts about the existence of this language." Michael Dunn and Malcolm Ross expand on Davis's point in their 2007 article Is Kazukuru really non-Austronesian, with the view (accepted also by e.g. Harald Hammarström and Sebastian Nordhoff, in Melanesian Languages on the Edge of Asia: Challenges for the 21st Century) that if the lects were real, they were the same language as Kazukuru, which I propose to merge them into. (Sample words in Kazukuru, Guliguli, and Dororo: pito, bito, bito "arrow"; vinovo, vino, bino "banana"; viniti, vini, vinitini "body"; minata, minate, minate "die"; meta, mata, mata / meta "eye"; rano, rano, rano "head"; muni, moni, muni / moni "night".) - -sche (discuss) 06:44, 12 May 2017 (UTC)Reply

  Done. - -sche (discuss) 21:03, 19 January 2018 (UTC)Reply
remove Yarsun [yrs] and Wares [wai]

As noted by Hammarström, Melanesian Languages on the Edge of Asia: Challenges for the 21st Century, the existence of a Yarsun language seems have been based on the confusion of language names with place names which is not uncommon in Papua (Yarsun is near Anus Island, and that's not a joke); "no such language is attested". Likewise with Wares. - -sche (discuss) 06:44, 12 May 2017 (UTC)Reply

  Done. - -sche (discuss) 23:26, 29 December 2017 (UTC)Reply


RFM discussion: July 2016–January 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Bunurong

This Pama-Nyungan lect seems to have been given an exceptional code with no discussion, for its use at one transwikied entry, ngargee. It's by no means clear that we should be giving it a separate code rather than treating it as a dialect of Woiwurrung wyi; Wikipedia claims they are 90% mutually intelligible, but doesn't cite that claim directly. —Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)Reply

I noticed this a while ago, but left it alone because I could not at the time find a reference (i.e. outside WP) that confirmed that they were mutually intelligible, probably because the huge variety of spellings made searching for information difficult. However, I can now find references that suggest we should be merging more than just these two. Leigh Boucher and ‎Lynette Russell's Settler Colonial Governance in Nineteenth-Century Victoria (2015, →ISBN, page 8, speaks of "the Woiwurrung (Wurundjeri), Boonwurrung, Wathaurung, Taungurong and Dja Dja Wurrung [being] mutually intelligible languages that share up to 80 per cent of their terminology." A paper by Barry Blake and Julie Reid on Sound Change in Kulin, in the La Trobe Working Papers in Linguistics, v 6-8 (1993), speaks of a single Kulin language, with "material available on three dialects: Boonwurrung (B), Woiwurrung (W) and Thagungwurrung (T)" (emphasis mine). Dja Dja Wurrung = dja, and Taungurong / Thagungwurrung = dgw, and Wathaurung = wth, all of which we currently treat as separate. Kulin suggests that at least the four eastern ones, if not also Wathaurung, could be merged. - -sche (discuss) 06:48, 6 July 2016 (UTC)Reply
I've merged Bunurong; the others still need to be merged. - -sche (discuss) 16:52, 18 September 2016 (UTC)Reply
Bunurong was merged at the time this discussion was current. The discussion has since gone stale, and I am going to leave the others alone (unmerged). - -sche (discuss) 06:26, 29 January 2018 (UTC)Reply


RFM discussion: January–March 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


ISO code changes for 2017

Here is a document summarising the changes to ISO codes that have been made in 2017. I have gone through all the approved request documents and briefly summarised their findings and my conclusions about what I propose we should do. —Μετάknowledgediscuss/deeds 10:36, 31 January 2018 (UTC)Reply

Deprecated codes edit

  • Mosiro [mwy]
    Only a clan name, confirmed with fieldwork. I concur with removal.
  • Ndaktup [ncp]
    Merge into Kwaja [kdz], confirmed with fieldwork. I concur with removal.
  • Lyons Sign Language [lsg]
    Apparently spurious. I concur with removal.
  • Mediak [mwx]
    Only a clan name, confirmed with fieldwork. I concur with removal.
  I also agree, and have removed them all, adding Mosiro and Mediak and several other things as alt names of oki (our spelling of the canonical name, Okiek, seems more common than Wikipedia's Ogiek), and adding Ndaktup and several other forms as alt names of kdz. - -sche (discuss) 16:56, 31 January 2018 (UTC)Reply

Added codes edit

  • Gyalsumdo [gyo]
    Confirmed with fieldwork and dictionary creation is underway. I concur with creation.
      Done, tentatively with Latn, Deva and Tibt scripts listed, based on the two scripts Ethnologue lists for broader Manang [which WP thinks it is a dialect of] and how I see linguists documenting it. - -sche (discuss) 23:35, 7 February 2018 (UTC)Reply
    But is this consistent with how we treat other Tibetan/Tibetic varieties? E.g. kte is subsumed into bo, somewhat like with Chinese varieties. This may need further discussion, maybe separate from this big list of codes, with Wyang. - -sche (discuss) 23:58, 7 February 2018 (UTC)Reply
    I realised this when reading your comment at Talk:གསར་པ. We decided on a Chinese-style merger, but this compounds the lack of interest and expertise that has plagued documenting anything other than Lhasa Tibetan (which is admittedly the only one I have formally studied either). —Μετάknowledgediscuss/deeds 00:35, 8 February 2018 (UTC)Reply
  • Ngen [gnj]
    Confirmed with a Swadesh list as worthy to be split from Beng [nhb]. I cannot find any documentation after a quick search, nor do I find confirming fieldwork. I abstain on creation.
    Glottlog, citing Anna Maloletnyaya's 2014 Brief presentation of Ngen language, agrees that "Ngen, a Southeastern Mande language of the Ben-Gban group not intelligible to Beng". Therefore, added. - -sche (discuss) 03:55, 7 March 2018 (UTC)Reply
  • Western Armenian [hyw]
    Request document is chiefly concerned with social needs (e.g. a new Wikipedia). There are significant dialectal differences, but Western and Eastern Armenian are known to be mutually intelligible and have been successfully treated as [hy] at Wiktionary (though there has been little to no prior discussion). There is currently a parallel discussion of this issue in the Beer parlour. I disagree with creation.
    The BP discussion will take care of this (and seems to be reaching the conclusion that the code should not be added). - -sche (discuss) 23:27, 7 February 2018 (UTC)Reply

Name changes edit

  • Helambu Sherpa [scp]
    Changed to Hyolmo, based on the current name being a misnomer (the speakers are not Sherpas and the language is not closely related to Sherpa) and preference of native speakers. I find that Yolmo is in fact the most-used spelling on Google Books and Google Scholar, so I suggest we use that name instead.
      Done (Yolmo). - -sche (discuss) 05:25, 13 February 2018 (UTC)Reply
  • Dzodinka [add]
    Changed to Lidzonka, based on preference of native speakers. This name has almost never been used in the linguistic literature, whereas Dzodinka has and the request document itself admits that "both are correct". I disagree with the name change.
    Ergo, not renamed. (But struck, for the sake of tracking which ones have been looked at.) - -sche (discuss) 05:25, 13 February 2018 (UTC)Reply
  • Shixing [sxg]
    Changed to Shuhi, based on local usage; the requester actually uses the name Xumi when documenting the language. All other use in the linguistic literature appears to follow the original name, Shixing. I disagree with the name change.
    Ha, weird.  N Not done at this time. - -sche (discuss) 05:25, 13 February 2018 (UTC)Reply
  • Palor [fap]
    Changed to Paloor, based on the language's new orthographic norms. There is very little linguistic literature on it, most of it in French, and it is difficult to tell what name is or will be more common. I abstain on the name change.
    The new spelling may well take hold, but it hasn't yet (and we have no contributors in the language who might clamor to use the new spelling when adding entries), so, not renamed at this time. We should reexamine this in a few years. - -sche (discuss) 16:50, 22 February 2018 (UTC)Reply
  • Australian Sign Language [asf]
    Changed to Auslan, based on popular and linguistic usage. This does seem to be more common as the standard name, and is preferred by native speakers. I support the name change.
    Has been done, apparently.

The name of [khm] was also changed from "Khmer, Central" to "Khmer", but we have chosen to exclude this code, so it does not affect us.

Μετάknowledgediscuss/deeds 10:36, 31 January 2018 (UTC)Reply

Where should discussions like this be archived? — SGconlaw (talk) 03:40, 22 February 2018 (UTC)Reply
They are archived to Wiktionary:Language treatment/Discussions. If you're asking so you can archive them, you'd be better off leaving them for -sche. —Μετάknowledgediscuss/deeds 03:48, 22 February 2018 (UTC)Reply
*Thumbs up* — SGconlaw (talk) 03:57, 22 February 2018 (UTC)Reply
Oh, no, the more people who know how to do this kind of thing, the better! :) You just have to be sure all the requests have been handled (done/accepted, or rejected). But yes, WT:LTD is the catch-all for anything that doesn't have a more logical place to be archived; and then, if there's a change in how we treat a language (e.g. a language has been split between two codes), update WT:LT and link to the discussion. If a language code has been removed (e.g. merged into something else), I find that it's helpful (and prevents accidental re-creation) to leave a --placeholder comment in the module (as I did for e.g. "frc" and am slowly doing for other codes). - -sche (discuss) 16:42, 22 February 2018 (UTC)Reply


RFM discussion: March 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Songhai to Songhay

Our spelling for the name of the Songhay languages is simply much less common, on Google, Google Books, or in the literature. Note that we will also need to change the name of Proto-Songhai [son-pro] if this changes. —Μετάknowledgediscuss/deeds 01:10, 14 March 2018 (UTC)Reply

  Done. - -sche (discuss) 18:12, 17 March 2018 (UTC)Reply


Proto-Sarmato-Alanic [Old Ossetic] edit

Copied from WT:Etymology scriptorium/2018/April.

OK, let me try this again. I'd like to have Alanic xln renamed to Sarmato-Alanic and have entries use xln-pro. Alanic and Sarmatian (which has no language code otherwise) occupy a dialect continuum, and neither might be the direct ancestor of Ossetian or Jassic. Alanic would then be made into an etymology-only code, xln-ala. Alternatively, a new language code could be created, ira-sma-pro, and xln removed all together, but I think the former the better option. @-sche, Tropylium --Victar (talk) 20:30, 3 April 2018 (UTC)Reply

There are a couple problems here that I see just from a brief reading of the discussion. You want us to change to a name that is very rarely used instead of a more common one, and also switch to considering it a protolanguage rather than an attested one, despite the fact that it is actually attested (yes, we do that for Proto-Norse, but it is not ideal and is in large part because, as is relevant here, we try to use the most commonly used names where possible). —Μετάknowledgediscuss/deeds 01:19, 5 April 2018 (UTC)Reply
Thanks for the reply, @Metaknowledge. 99% of the entries I'll be entering will be reconstructions, and most will be derived from Ossetian, not Alanic or Sarmatian borrowing. I think there is a ton of precedence for using alternative names for codes on wikt, but I'll concede that I can't think of any example of using the language code of a dialect to refer to a whole dialect family. I'm not opposed to using ira-sma-pro instead, but I do still think then the xln code should be discontinued, because Alanic, Sarmatian and Proto-Ossetic should all be under the same code. --Victar (talk) 04:13, 5 April 2018 (UTC)Reply
To give an example of what I had in mind for formatting descendant trees:
* Sarmato-Alanic {{l|xln-pro}}
*: Alanic: {{l|xln-ala}}
*: Sarmatian: {{l|xln-sar}}
** Ossetian: {{l|os}}
--Victar (talk) 04:22, 5 April 2018 (UTC)Reply
The proportion that are reconstructions is irrelevant unless it's 100%. When we have mainspace entries, we should avoid assigning them to protolanguages (which are technically hypotheses) wherever possible. We always try to use to the most common, unambiguous name possible, and if you know of any exceptions, we should see if they ought to be fixed. Basically, I think you're conflating the needs of descendant trees and the criteria for determining what ought to be a separate language. Bear in mind that regardless of what codes and names are, you can always structure descendant trees to show distinct dialects or sublects (Crom daba has done this quite fruitfully). —Μετάknowledgediscuss/deeds 04:48, 5 April 2018 (UTC)Reply
@Metaknowledge, I think you're missing the point of my need. I want to create reconstructions of Proto-Ossetic and Sarmatian. Sarmatian and Alanic are well established as two separate dialects. --Victar (talk) 04:59, 5 April 2018 (UTC)Reply
And if they're dialects, they shouldn't have separate codes. Remember, you can still give them separate lines and reconstructions, when and where those are supported by scholarly sources, regardless of the situation with codes. —Μετάknowledgediscuss/deeds 06:18, 5 April 2018 (UTC)Reply
Exactly, @Metaknowledge, which is why Alanic shouldn't have its own code. --Victar (talk) 07:37, 5 April 2018 (UTC)Reply
Then why did your example above have them with two different codes? —Μετάknowledgediscuss/deeds 16:22, 5 April 2018 (UTC)Reply
Sorry, I tried to indicate xln-ala and xln-sar where etymology-only codes by the dash in them, but I guess that wasn't clear (though I did make that point in my opening statement). We also have oos for Old/Proto-Ossentic, which we could used instead for parent of Alanic/Sarmatian/Ossentic. We currently list it below xln, and I've always considered it a stage between, yet MultiTree seems to use it as their catch-all. Again though, I'm not opposed to using a new ira-sma-pro code. --Victar (talk) 17:34, 5 April 2018 (UTC)Reply
We create etymology-only codes for etymology sections. If there's a language that derives terms from both Alanic and Sarmatian with a meaningful difference between the two, then those codes should exist, but we shouldn't create them just for descendant lists, which can be freely formatted. —Μετάknowledgediscuss/deeds 18:39, 5 April 2018 (UTC)Reply
  • Here's a different idea. Alanic and Sarmatian are both barely attested; from my brief reading of the literature, it seems to be unclear whether or not they represent dialects or fully separate speech communities, and whether they represent the ancestor of Ossetian or a close relative (and given the timescales over which they are attested, they cannot be the same thing as a protolanguage, which is a hypothesis of the most recent common ancestor). The resultant action would be to have separate codes for Alanic, Sarmatian, and Proto-Ossetic, with the former two only in mainspace (in original script, e.g. Greek) and the latter only in Reconstruction space (in normalised form). As always, you can format descendant lists however you like. Does that seem like it would make sense? —Μετάknowledgediscuss/deeds 18:39, 5 April 2018 (UTC)Reply
@Metaknowledge: No, I don't agree with that solution. Whether they exhibit the same exact timeframe is irrelevant and labeling Sarmatian and Ossetic as Alanic is inaccurate. What my sources are reconstructing is a common ancestor of all three of these dialectal branches. See https://ibb.co/jFEnSH. --Victar (talk) 19:01, 5 April 2018 (UTC)Reply
Your response is confusing; I did not suggest labelling Sarmatian and Ossetic as Alanic (in fact, I suggested separating all three), and I was under the impression that Proto-Ossetic is the unattested ancestor of all these languages. —Μετάknowledgediscuss/deeds 19:30, 5 April 2018 (UTC)Reply
Am I understanding this correctly that it is similar to the problem that he have had/are having with Sanskrit and the Prakrits, namely that there is a dialectal continuum between Sarmatian, Alanic, and the unattested ancestor of Ossetian? —*i̯óh₁n̥C[5] 19:24, 5 April 2018 (UTC)Reply
@JohnC5: Not exactly. Sarmatian, Alanic and Ossentic are all largely unattested dialects form a single language of the Middle Iranian period. So what I'm suggesting is that we unify them under a single code and name, and have the dialects differentiated only by etymology-only codes. I'm recommending the name Sarmato-Alanic, which is what I mostly see in literature when referring to them as a whole, but if I had to choose to unify them under one name out of the three, it would be Ossetic, being the only one with modern descendents. What code we use, be it a repurposed one, or a new one, doesn't matter to me. --Victar (talk) 20:26, 5 April 2018 (UTC)Reply
@Metaknowledge: Would you support the idea of unifying Sarmatian and Alanic under Old Ossetic oos (as per MultiTree) with both reconstructed and mainspain entries, and making xln an etymology only code? That should resolve your Proto-Norse argumentment. --Victar (talk) 22:37, 5 April 2018 (UTC)Reply
I see that the name "Old Ossetic" is broadly attested (when spelt correctly), and many sources seem to equate it with Alanic, but that doesn't mean Sarmatian should necessarily be merged as well. If they are indeed dialects as you claimed, then that would be perfectly fine. WP cites EB for the following: "The languages of the Scytho-Sarmatian inscription may represent dialects of a language family of which Modern Ossetian is a continuation, but does not simply represent the same language at an earlier time." If that is true, then Sarmatian should be kept separate. —Μετάknowledgediscuss/deeds 23:29, 5 April 2018 (UTC)Reply
@Metaknowledge, there is no code for Sarmatian. If we put Alanic under Old Ossetic as part of a dialectal continuum, Sarmatian, as a dialect thought to be very similar to Alanic, should unequivocally be included. Otherwise it defeats the point. --Victar (talk) 23:49, 5 April 2018 (UTC)Reply
I would be in favor of Old Ossetic for the continuum of Alanic and Ossetic and, if it can be demonstrated as true, Sarmatian. In what way can we adjudicate this Sarmatian situation? As mentioned before, I'm getting a little bit frustrated at continuously running across this "continuum problem" (attested languages descending from unattested near neighbors). There's been a fair amount of research saying that the phylogeny of language change tends to be binary in nature, but that depends on how you look at language continua versus dialectally diverse super-languages. I'd be interested to think about the principled use of language continua in our language data (like "oss-cnt" or the like), not just the "substrata" we use in the etymology-only language data. The question is in the utility of such a demarcation, but the inherent assumption of our current n-ary (or perhaps my theoretically binary) branching system tends to omit this subtlety of language change because frequently these continua are not protolanguages and exist clearly in the data... I dunno. —*i̯óh₁n̥C[5] 00:07, 6 April 2018 (UTC)Reply
Ignoring John's tangent... I know there is no code for Sarmatian. That's immaterial; we can make one if we deem it necessary. You claim that Sarmatian is very similar to Alanic, to the point of being a continuum; I know little about this, but found a scholarly source that claims otherwise. Can you respond to that with actual evidence? —Μετάknowledgediscuss/deeds 00:16, 6 April 2018 (UTC)Reply
@JohnC5, Metaknowledge:
  1. "Sarmatian and Alanic represent a dialect continuum" and "it is difficult to draw the line between Sarmatian and Alanic".[S 1]
  2. "Ossetic [...] is the last remnant of the essentially unknown Middle Iranian dialect area that included Sarmatian, and is said to descend from Alanic."[S 2]
  3. "[Ossetic] is the sole surviving descendant of the Northeast Iranian dialects of the ancient Scythians and Sarmatians and medieval Alans".[S 3]
  4. "Deine klare linguistische Scheidung zwischen Sarmatisch und Alanisch aufgrund der Materiallage nicht möglich ist"[S 4]
Even if Alanic and Sarmatian were divergent enough to call separate languages, that distinction isn't apparent in the little material we have, so to reconstruct them separately at this time would be folly. --Victar (talk) 01:20, 6 April 2018 (UTC)Reply
Thanks, that seems like good evidence for merger. I am now satisfied with having "Old Ossetic" as an L2 header with categorising context labels for the dialects. I would like to wait a couple of days just in case anyone raises an objection, so please ping me with a reminder. Also, please clarify if there are any etymology sections that need to distinguish between the dialects; if not, we can dispense with etymology-only codes and simply remove xln. —Μετάknowledgediscuss/deeds 04:44, 6 April 2018 (UTC)Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── @Metaknowledge, if you have a moment, I would appreciate you making these changes. Thanks. --Victar (talk) 03:10, 16 April 2018 (UTC)Reply

@Victar, could you please respond to the query in the last sentence of my last comment? —Μετάknowledgediscuss/deeds 04:39, 16 April 2018 (UTC)Reply
@Metaknowledge, yes, need the etymology-only codes as well, not for the linguist distinction, really, but for the historical one, i.e. the names of Alanic kings. --Victar (talk) 04:44, 16 April 2018 (UTC)Reply
  Done. —Μετάknowledgediscuss/deeds 05:33, 16 April 2018 (UTC)Reply
Changes look good. Thanks, @Metaknowledge! --Victar (talk) 05:35, 16 April 2018 (UTC)Reply

References edit

  1. ^ Novák, Ľubomír (2013) Problem of Archaism and Innovation in the Eastern Iranian Languages (PhD dissertation)[1], Prague: Univerzita Karlova v Praze, filozofická fakulta
  2. ^ Fortson, Benjamin W. (2004) Indo-European Language and Culture: An Introduction, first edition, Oxford: Blackwell
  3. ^ Lua error in Module:quote at line 2380: |title= is an alias of |chapter=; cannot specify a value for both
  4. ^ Schmitt, Rüdiger, editor (1989), Compendium Linguarum Iranicarum[2], Wiesbaden: Reichert Verlag, →ISBN

RFM discussion: August 2016–June 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Some more missing South American languages, 2

Here are a few more South American languages for which we could add codes:

  • Coeruna / Koeruna (sai-coe), said to be attested.
  • Conambo language (sai-cnb) is sometimes considered a dialect of the Záparo language, but Loukotka has samples, which show differences: "head" is ku-anak in Zaparo, ku-anaka in Conambo, "eye" is nu-namits (Z) / ku-iyamixa (C), "fire" unamisok (Z) / umani (C), "woman" itumu (Z) / maxi (C). OTOH, telling it apart from what a number of references refer to as the Zaparo spoken in the Conambo river could be non-trivial, so perhaps treating it as a dialect would be easier...
    Meh, left as a dialect. - -sche (discuss) 20:28, 4 June 2018 (UTC)Reply
  • Koihoma language should probably be put off until we can confirm it as a distinct lect; its alt names are all the names of other languages...
  • Yao language (Trinidad) (sai-yao), attested in a single wordlist.

- -sche (discuss) 21:18, 16 August 2016 (UTC)Reply

RFM discussion: March–June 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming siz

We currently call this "Siwa". That's the name of the oasis, but the proper name for the language, which sees far more use in the literature and in general, is "Siwi". —Μετάknowledgediscuss/deeds 01:26, 14 March 2018 (UTC)Reply

  Done. - -sche (discuss) 19:05, 4 June 2018 (UTC)Reply


RFM discussion: March–June 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming cak

We currently call this "Cakchiquel". Language-learning materials and linguistic literature generally prefer the de-hispanicised spelling "Kaqchikel", as can be seen at Google Books. —Μετάknowledgediscuss/deeds 04:41, 17 March 2018 (UTC)Reply

  Done, although I note seems to have Kaqchikel only recently (circa 1995) displaced Cakchiquel. Note also the naming of ckz. - -sche (discuss) 19:19, 4 June 2018 (UTC)Reply


RFM discussion: July 2019 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Converting Proto-Tibeto-Burman to an etymology-only language

I propose that we treat Proto-Tibeto-Burman as an etymology-only variant of Proto-Sino-Tibetan as it appears that the consensus among linguists in the field is that the set of non-Sinitic Sino-Tibetan languages is not monophyletic. (For a parallel, we treat Proto-Baltic as an etymology-only variant of Proto-Balto-Slavic for the same reason.) By the same token, I propose that we change all languages and families currently listed as family = "tbq" to family = "sit" instead. Wyang has said on his talk page that he isn't opposed to the idea, and I don't know who else here is working on Sino-Tibetan issues. What do other people think? —Mahāgaja · talk 14:32, 19 July 2019 (UTC)Reply

Three days with no response; I'm doing it now. —Mahāgaja · talk 19:05, 22 July 2019 (UTC)Reply


RFM discussion: July–August 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Wanji languages

Currently the canonical name of ira-wnj is Wanji which is also used by wbi. One of them should be changed. We also have wny which is Wanyi. DTLHS (talk) 18:27, 22 July 2018 (UTC)Reply

(Also similar: wdd Wanzi / Wandji.) I've renamed ira-wnj to "Vanji", which is the spelling Wikipedia uses anyway. We don't have much content in either language, just a few entries in descendant trees for the Iranian one and a few translations for the Bantu one, but yes, it causes problems (starting with conflated categories) if they have the same name. - -sche (discuss) 15:08, 24 July 2018 (UTC)Reply
I was wondering what the heck happened to this. --Victar (talk) 22:43, 3 August 2018 (UTC)Reply


RFM discussion: September–December 2019 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59042277#Renaming_[ik]_from_Inupiak_to_Inupiaq|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [ik] from Inupiak to Inupiaq

Searches for "speak X" and "X language" show that the spelling Inupiaq is much more common than Inupiak, which is what we currently use. —Μετάknowledgediscuss/deeds 20:59, 12 September 2019 (UTC)Reply

  DoneMahāgaja · talk 10:09, 24 October 2019 (UTC)Reply

I've moved all the categories to the new spelling, but it would be great if someone with a bot could change all L2 headers from ==Inupiak== to ==Inupiaq==. —Mahāgaja · talk 10:12, 24 October 2019 (UTC)Reply
Also all instances of "Inupiak" in Translation sections (including |langname=Inupiak inside {{t-simple}}). —Mahāgaja · talk 22:31, 24 October 2019 (UTC)Reply
@Mahagaja, Erutuon: Has this been done, and if not, can someone with a bot make it so? —Μετάknowledgediscuss/deeds 07:24, 24 December 2019 (UTC)Reply
@Metaknowledge, Erutuon: I don't know if there are any L2 heading left using the old spelling, but there are definitely still translation sections calling it "Inupiak". I don't have a bot (or the remotest idea how to write one); @DTLHS, Benwing2, Ruakh, is this something one of y'all's bots would be interested in doing? —Mahāgaja · talk 08:02, 24 December 2019 (UTC)Reply
Happily all the "Inupiak" headers were caught quickly because they showed up on my incorrect headers page. I went and fixed all cases in translation sections with JWB because there were less than a hundred. — Eru·tuon 10:15, 24 December 2019 (UTC)Reply
Glad to hear it! I'm still hoping a bot runner will take care of the translations sections. —Mahāgaja · talk 15:10, 24 December 2019 (UTC)Reply
Erutuon in his last comment: "I went and fixed all cases in translation sections". —Μετάknowledgediscuss/deeds 18:26, 24 December 2019 (UTC)Reply
Oh right. Never mind.Mahāgaja · talk 18:52, 24 December 2019 (UTC)Reply


RFM discussion: August–September 2018 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Kamkata-viri merger

We have language codes for both Kamviri xvi and Kati/Kata-vari bsh, despite being dialects of Kamkata-viri.[8] I'd like to merge them under either code and rename the canonical name to Kamkata-viri. @-sche, Metaknowledge --Victar (talk) 22:31, 3 August 2018 (UTC)Reply

Support. The phrasing in Liljegren (2016) that you linked to leads me to think that perhaps bsh is the code we should keep, and xvi the one we should retire, although it's wholly arbitrary. Thanks for paying attention to Nuristani, in any case. —Μετάknowledgediscuss/deeds 23:07, 3 August 2018 (UTC)Reply
Kata-vari is the larger of the two by at least five-fold, which on one hand would make it the logical encompassing one, but bsh is actually named for its eastern subdialect Bashgali (taken from Dardic), making it a somewhat inaccurate code to begin with, but I don't really care either way. Here is Strand's tree. --Victar (talk) 23:39, 3 August 2018 (UTC)Reply
And here is from Strand's 1974 paper, back when he used to refer to Kati as being the parent language of all three dialects: "Kati (Bašgalī) has three major dialects: Katə́viri, Kamvíri, and Mumvíri. Katə́viri is spoken by members of the Katə́ tribe. It is divided into two major subdialects: Western Katə́viri and Eastern Katə́viri. Western Katə́viri is further subdivided into the dialects of Ramgə́l, Kulám, Ktívi (Kantivo), and Pə́řuk (Papruk)". I'm also fine keeping it named Kati, despite it perhaps being outdated, if that's preferable to others. --Victar (talk) 00:05, 4 August 2018 (UTC)Reply
Support the merger. As for the name: on one hand, "Kati" seems to be about twice as common even when I search for the names together with "Nuristani" to filter out the New Guinea-area language, but on the other hand the absolute number of words that mention either name is small, so if "Kati" is dated and "Kamkata-viri" is preferred these days, and "Kamkata-viri" would also make clearer to readers, etc that we're considering all the dialects under that one header (and that we're not talking about kti, Muyu / Kati), then go ahead and use "Kamkata-viri", (with the other names as alt names). - -sche (discuss) 00:58, 7 August 2018 (UTC)Reply

Done. --Victar (talk) 16:57, 5 September 2018 (UTC)Reply

RFM discussion: December 2019–January 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59042471#Renaming_[rnd]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [rnd]

Currently "Ruund", which is also is on WP, but "Ruwund" is more common in the literature. —Μετάknowledgediscuss/deeds 07:20, 24 December 2019 (UTC)Reply

Renamed. Searching the site, the only pages I see using "Ruund" are categories (in family trees) and Reconstruction pages (using {{desc}}), which should update automatically; I think there is, therefore, nothing that needs changing(?) besides the three modules I just changed. - -sche (discuss) 08:35, 11 January 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59531101#Renaming_[zmp]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [zmp]

We currently call this "Mpuono"; "Mbuun" is much more common in the (admittedly limited) scholarly literature. For example, compare google books:"Mbuun" grammar and google books:"Mpuono" grammar. —Μετάknowledgediscuss/deeds 05:16, 10 March 2020 (UTC)Reply


RFM discussion: July 2016–January 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Dungan is technically Mandarin, or a dialect of Mandarin

Hi, I'm not sure whether this is the right venue for this discussion, but I would like to bring up Dungan, which is spoken in Central Asia. According to Wikipedia, Ethnologue, and Glottolog, this is a Chinese language, specifically Mandarin. There are only 20 entries in Wiktionary that are for Dungan, and all of them are Mandarin words, with some from the Gansu and Shaanxi dialects. The difference, however, is that, Dungan is written in Cyrillic. In Wiktionary, all Chinese dialects are merged into one single Chinese entry, and pronunciations are listed. Shouldn't we do that, or at least partially, for Dungan? Please feel free to comment. Thanks. --Mar vin kaiser (talk) 14:54, 16 July 2016 (UTC)Reply

Re venue: this is indeed a venue where discussions of merging/splitting language codes and categories and entries take place. I tend to put my "biggest" proposals (ones that needed votes in the past, or that concern major or controversial languages that I suspect will need votes) in the WT:BP, but here is OK.
Wiktionary:About Chinese#The_Chinese_lects and Wiktionary:Votes/pl-2014-04/Unified_Chinese intentionally left lects that don't use Hanzi separate from Unified Chinese, but I don't know if that was because Chinese editors felt they should never be merged, or just felt that merging them would be difficult and best attempted after everything else had been merged. It would obviously be possible to merge Dungan and other such lects if Chinese editors wanted to; we have plenty of other languages which use multiple scripts (e.g. Afrikaans). However, the various Chinese lects which are distinctive to the point of potentially being not-mutually-intelligible when spoken were able to be unified here because they share a written form in which they are theoretically mutually intelligible. If Dungan is potentially not intelligible with lects from other areas (lects that differ from Mandarin enough that speakers don't understand it without study) in either speech or writing, then what would be the basis for unifying them? - -sche (discuss) 15:42, 16 July 2016 (UTC)Reply
Actually, Shaanxi and Gansu Mandarin is mutually intelligible with Dungan. Furthermore, a large majority of Dungan vocabulary is from Chinese, which therefore, has Chinese equivalent entries written in Chinese characters. There are Russian and Turkic vocabulary. My suggestion is to leave Dungan loanwords from Russian and Turkic as written in Cyrillic, and merge the Dungan Chinese words with Chinese entries, and perhaps leaving the Cyrillic entry of those words like how Chinese pinyin and Japanese romaji are left. --Mar vin kaiser (talk) 15:53, 16 July 2016 (UTC)Reply
Also, I speak Mandarin, and I tried listening to Dungan videos in Youtube. They're actually understandable for the most part. As in I can write down what they're saying in Chinese, except for some words though (presumably Russian and Turkic loanwords). --Mar vin kaiser (talk) 16:04, 16 July 2016 (UTC)Reply
  • I am not opposed to this, but it would require that Dungan orthography be incorporated into the relevant Chinese templates, so @Wyang's aid and support will be critical. —Μετάknowledgediscuss/deeds 16:21, 16 July 2016 (UTC)Reply
  • I support this. Cyrillic Dungan forms can be added to {{zh-pron}}, under Mandarin. Wyang (talk) 00:25, 17 July 2016 (UTC)Reply
    • The main caveats are that Cyrillic is apparently the standard script for Dungan, unlike with Pinyin or Romaji, and there may be some vocabulary that only exists in Cyrillic. I suppose the writing systems for Hokkien might be analogous, though. Chuck Entz (talk) 01:07, 17 July 2016 (UTC)Reply
  • This doesn't feel right to me. I think it would be like folding Maltese into Arabic, or merging Hindi and Urdu. I foresee a lot of complaints from anons if entries like дянхуа and شِيَوْ عَر دٍ have a ==Chinese== heading, and I would find it disconcerting myself, too. And what would the definition then say? {{lb|zh|Dungan}} {{form of|Cyrillic script|電話|lang=zh}}? I think readers would find that more confusing than helpful. And then what about the Russian and Turkic loanwords that don't exist in China? They would have to have full definitions without a link to a Hanzi entry, and that would probably baffle readers even more, despite the {{lb|zh|Dungan}} tag. —Aɴɢʀ (talk) 14:14, 17 July 2016 (UTC)Reply
    Hindi and Urdu should be merged — they are separate for political reasons. Remember, we allow for Afrikaans entries in Arabic script, Old French entries in Hebrew script, and other odd happenstances of historical script usage. We can continue to use the Dungan header for words only existing in Cyrillic form, just like I believe we do for Min Nan. —Μετάknowledgediscuss/deeds 16:10, 17 July 2016 (UTC)Reply
Undecided for now. There are pros and cons. Cyrillic and Arabic spellings could potentially be added to each Mandarin standard pinyin syllable (non-standard could also be considered if confirmed). Multisyllabic only for confirmed ones.
Mandarin pinyin (with tone marks and monosyllabic tone numbered syllables), Min Nan POJ, zhuyin characters have not been "unified" under the Chinese umbrella for various reasons. Some are described above.--Anatoli T. (обсудить/вклад) 07:39, 18 July 2016 (UTC)Reply
I think it would be more convenient for editors if Dungan were part of unified Chinese, since it would be easier to edit. It would just feel like too repetitive if I made a new Dungan entry that technically already has an equivalent Chinese entry. --Mar vin kaiser (talk) 08:32, 18 July 2016 (UTC)Reply
Should I bring this somewhere else to a vote whether Dungan should be merged into the unified Chinese? --Mar vin kaiser (talk) 16:08, 22 July 2016 (UTC)Reply
@Wyang: if you still support folding Dungan into Chinese, how best could that be accomplished? Make the attested Cyrillic (and Arabic?) forms soft redirects...? (Would the L2 header of e.g. фонзы be "Chinese" or remain "Dungan"?) What is to be done if, as several users worry above, there are loanwords that don't have Han-character representations? - -sche (discuss) 04:48, 21 January 2018 (UTC)Reply
@-sche I'm now also concerned about the complexity of this... a possible solution is to have {{zh-pron}} link to the Dungan word, but keep the Dungan L2 heading, and treat Dungan as a full language with the Cyrillic entry showing full usage examples, etc. and linking back to the Hanzi form, perhaps on the headword line. Wyang (talk) 07:03, 21 January 2018 (UTC)Reply
@Wyang, -sche, Justinrleung, Suzukaze-c, Tooironic: Perhaps attested Dungan Cyrillic and Arabic syllables and multisyllabic words could be allowed into {{cmn-pron}}? E.g. йүян (yüi͡an) and يُوْيًا (not sure if the latter is right) in 語言语言 (yǔyán)? Each or almost each standard pinyin syllable should have a corresponding Dungan Cyrillic syllable, so monosyllabic entries may have them by default. --Anatoli T. (обсудить/вклад) 07:28, 21 January 2018 (UTC)Reply
I think it is simple. For special words w/o hanzi, do it like the Min Nan lo͘-lài-bà entry, otherwise do it like the Min Nan phōng-kó entry.
{{zh-pron}} > Mandarin > Dungan (Cyrillic), Dungan (Xiao'erjin) doesn't seem problematic to me.
Cyrillic quotations are welcome on the hanzi entries. —suzukaze (tc) 07:45, 21 January 2018 (UTC)Reply
(I just noticed, phōng-kó uses the Chinese header. I think it should use the Min Nan header, similar to how pinyin entries use the Mandarin header. But that's a separate discussion, I think. —suzukaze (tc) 07:46, 21 January 2018 (UTC))Reply
We had a discussion on this but I don't remember where. I'm OK to use the Chinese L2 header on all romanised entries if they link to (and consequently, have a written form in) hanzi. --08:03, 21 January 2018 (UTC)
@Wyang, Suzukaze-c, Atitarev, Mar vin kaiser I have made some trials to Module:zh-pron. But currently there're some problem:
  1. As Cyrillic script does not imply tone and merges some consonant, the Cyrillic form is generated from pinyin-like romanisation instead. We need Wiktionary:About Dungan to document it. Also, current pinyin-like romanisation include some redundant vowels.
  2. IPA value are from Wikipedia and needs checking.
  3. I don't know how neutral tone work, nor whether there're tone sandhi or erhua in Dungan.
  4. Some values can not be generated by current pinyin-like romanisation. e.g. чў=q+u (not ü) which is not a valid Pinyin syllable, and ңыйлу (="ng+er+y" lou).
  5. We need a module to transcript Cyrillic Dungan entries.

--Zcreator (talk) 21:17, 23 January 2018 (UTC)Reply

My thoughts:
  • Instead of Template:docparam, it should be Template:docparam for consistency.
  • I don't think we should make a "pinyin". I think we should input the original Cyrillic directly, annotated with tones (somehow).
  • There is definitely erhua.
suzukaze (tc) 05:13, 24 January 2018 (UTC)Reply
I have removed pinyin-like romanisation. However point 2 and 3 still needs to be solved.--Zcreator (talk) 09:05, 24 January 2018 (UTC)Reply
I oppose the use of tones in the transliteration for Dungan if it's reintroduced. The Cyrillic spelling have no tones and there should be no tone numbers or marks in the romanisation. Further, the transliteration should show what's actually written, not matching the Mandarin pinyin, e.g. йүян (yüi͡an) is actually "yüyan" (without a space), not "yu3 yan2" but perhaps, there's no need to provide any additional transliteration. --Anatoli T. (обсудить/вклад) 10:56, 24 January 2018 (UTC)Reply
We have tone and length annotations for every other language whose orthography does not reflect relevant phonemic differences. Why should this be different? Korn [kʰũːɘ̃n] (talk) 12:08, 24 January 2018 (UTC)Reply
Just to be clear, I was talking about the transliteration, not IPA. Anatoli T. (обсудить/вклад) 20:19, 24 January 2018 (UTC)Reply
@Atitarev: I don't think the Cyrillic in Dungan should be referred to as a transliteration, but as a script of its own. Therefore, I agree with you that tones shouldn't be put in the Cyrillic writing precisely because it is a script, and it is written without the tone, but the IPA should definitely provide the tone, since the script traditionally doesn't write it down. --Mar vin kaiser (talk) 06:30, 25 January 2018 (UTC)Reply
@Mar vin kaiser: I don't think Atitarev is referring to the Cyrillic as transliteration. He's talking about the romanization of the Cyrillic. I found out from Omniglot that there are two conventions used for indicating tones: -, ъ, ь or I, II, III. We should probably adopt one of these two. — justin(r)leung (t...) | c=› } 07:05, 25 January 2018 (UTC)Reply
@Justinrleung: I see. Then it's settled then, I guess. For cognates with other Chinese languages, a unified Chinese entry will be used with the Cyrillic entry still existing the same way POJ entries in Min Nan still exist. The Dungan portion of the pronunciation would then contain the Cyrillic entry or the Cyrillic script with tone marks? What do you think? Maybe the Cyrillic script with tone marks, but it would link to the Cyrillic entry without tone marks? --Mar vin kaiser (talk) 07:20, 25 January 2018 (UTC)Reply

The module can't handle variants, separated by commas, e.g. dg=Җун1гуй1,Җун1гуә2 in 中國中国 (Zhōngguó). --Anatoli T. (обсудить/вклад) 14:01, 25 January 2018 (UTC)Reply

@Atitarev I found a new source for Dungan [here]. It shows more dialects and more tones compared to the Russian source. --Mar vin kaiser (talk) 16:32, 13 February 2018 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536613#Renaming_[lel]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [lel]

We currently refer to the other languages most commonly referred to as "Lele" with parenthetical geographic disambiguators (Lele (Guinea), Lele (New Guinea), Lele (Chad)). I suggest we do the same for [lel] and change "Bashilele" to "Lele (Congo)". There is an argument that "Lele (Congo)" should be "Lele (Democratic Republic of the Congo)", but that seems too long and none of these languages are spoken in the Republic of the Congo, so I think it's safe.Μετάknowledgediscuss/deeds 05:30, 10 March 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming the Sangu languages

We currently call [snq] "Chango", which is quite infrequently used. We could go for "Isangu" (with the language prefix), because there is another Bantu language called Sangu — or we could bite the bullet and go for geographical disambiguation with "Sangu (Gabon)" for [snq] and "Sangu (Tanzania)" for [sbp], which is what Wikipedia does, and is probably least misleading. —Μετάknowledgediscuss/deeds 04:29, 31 March 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536623#Merge_[xma]_into_[ziw]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [xma] into [ziw]

This is a case where a single language spoken in two countries has been given two codes. Mushungulu [xma] is the dialect of Zigula spoken in Somalia. Maarten Mous says: "I interviewed some of the refugees [from Somalia into the Zigula homeland in Tanzania]. They claim there is no difference between the way they speak Zigula and the Zigula of those who never left the area; the [Tanzanian] Zigula speakers present agreed to this. Noting down some expressions, I had the same impression when comparing this to my Zigula field notes from 1994." —Μετάknowledgediscuss/deeds 19:45, 31 March 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Split Bafaw and Balong

Currently, the code [bwt] is called "Bafaw-Balong" in an attempt to cover the two dialects. However, these "dialects" are quite divergent, though fairly closely related. Maho (2009) separates them, as does the Linguistic Atlas of Cameroon. Emmanuel Chia suggests that heavy borrowing has confused linguists into considering them the same language in The Bafaw Language (Bantu A10) (which treats Bafaw and not Balong, considering it completely separate). I suggest that we rename [bwt] to "Bafaw" and make a new code for Balong, maybe bnt-bal. —Μετάknowledgediscuss/deeds 20:12, 31 March 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Adding Lwel and Mpiin

The Boma-Dzing languages are a bit of a mess, as Dzing was given a code but none of its sister languages were initially. Lwel is considered a distinct language by Maho (2009) and by {{R:nzd:Crane}}, and seems to have a grammar (Éléments de grammaire morphologique de la langue lwel) that I can't find anywhere, but citations to it seem like a distinct language. I suggest a new code, bnt-lwl. —Μετάknowledgediscuss/deeds 21:02, 31 March 2020 (UTC) As for Mpiin, it's also considered distinct by Maho (2009) (and, I should add, these opinions are upheld by Van de Velde et al. (ch.2, by Hammarström)), and a selection of words in Koni Muluwa (2010) seem similar to, but consistently distinct, from other Boma-Dzing languages. I suggest a new code, bnt-mpi. —Μετάknowledgediscuss/deeds 05:44, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536629#Retire_[mct]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Retire [mct]

This code for "Mengisa" is listed as a duplicate code by WP. Van de Velde et al. (ch.2, by Hammarström) say: "The Eton that most Mengisa now speak is not linguistically remarkable and therefore we count it as an Eton variety". (The Mengisa originally spoke [leo].) —Μετάknowledgediscuss/deeds 00:46, 1 April 2020 (UTC)Reply

Support. - -sche (discuss) 06:15, 6 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536632#Merge_[nzu]_and_[ebo]_as_Central_Teke|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [nzu] and [ebo] as Central Teke

Van de Velde et al. (ch.2, by Hammarström) say: "are commonly enumerated separately but the recent comparison by Raharimanantsoa (2012) [which I cannot find online] shows that such a distinction is untenable, wherefore we count them as the same". I am agnostic as to which code to choose, but the language should be called "Central Teke". —Μετάknowledgediscuss/deeds 00:55, 1 April 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536634#Renaming_[tiv]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [tiv]

Almost everybody calls it "Tiv", except us (we have "Tivi"). Compare google books:"Tiv language" and google books:"Tivi language" (which has almost nothing except Library of Congress categorisation), for example. —Μετάknowledgediscuss/deeds 06:10, 30 March 2020 (UTC)Reply

Support. - -sche (discuss) 06:16, 6 April 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536636#Renaming_[auj]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [auj]

Currently "Augila"; many spellings exist, but all modern linguistic work on the language in English (and most of the material on the town itself) refer to it as "Awjila". —Μετάknowledgediscuss/deeds 23:10, 19 March 2020 (UTC)Reply

@Benwing2: There are a bunch of language renames on this page that I'd like to do (this being one of them). If I just ignore the categories, will your regular bot runs create the new ones and delete the old, or should I move the categories instead? —Μετάknowledgediscuss/deeds 04:17, 11 May 2020 (UTC)Reply
@Metaknowledge I do a bot run every 3 days to create new categories, but it doesn't delete old ones. I have periodically deleted old categories listed in CAT:Empty categories, but I don't do it on a regular basis. I *think* all the old categories will eventually end up on that page. It's probably better to move the categories and manually fix the lemmas belonging to the renamed languages to have the new language name in their headers. For languages with very few lemmas it's probably not a big deal to do it manually; otherwise, let me know and I'll do it by bot. Benwing2 (talk) 04:29, 11 May 2020 (UTC)Reply
Yeah, none of them are a pain in the ass individually, I was just hopeful due to how many languages there are to rename. (Awjila only has one entry, but I had to move 10 categories for it!) Thanks for the offer, I'll let you know if any have a significant number of entries. —Μετάknowledgediscuss/deeds 04:36, 11 May 2020 (UTC)Reply


RFM discussion: March–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming the Kota languages

The name "Kota" is commonly used for two very small languages, so this seems like another good opportunity to use parenthetical disambiguators rather than force one to use a name with a language prefix, rarely encountered in the literature. I suggest that we rename [kfe] from "Kota" to "Kota (India)", and [koq] from "Ikota" to "Kota (Gabon)". —Μετάknowledgediscuss/deeds 05:35, 10 March 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536643#Merge_[lal]_into_[nxd]_and_rename_[nxd]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [lal] into [nxd] and rename [nxd]

Van de Velde et al. (ch.2, by Hammarström) say that "studies in the field emphasise that Lalia is simply a variety in the Bangando area with no special status vis-a-vis other Bongando varieties". While we're at it, we currently call [nxd] "Longandu", which seems to be one of the rarer names. In the tradition of lopping off language prefixes and matching the usual final vowel, I suggest we rename it to "Ngando". —Μετάknowledgediscuss/deeds 01:21, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536650#Renaming_[ngc]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [ngc]

We currently call this "Longandu". I assume the reason for avoiding "Ngandu" was originally potential conflict with [bgf], which we call "Bangandu". Given that "Bangandu" seems like a perfectly fine name, I suggest we rename [ngc] to "Ngandu". —Μετάknowledgediscuss/deeds 01:27, 1 April 2020 (UTC)Reply

  • Well, I seem to have gotten these confused while attempting to clean them up, which is as good an argument as any for better names. We actually call [ngc] "Lingombe", from which we should remove the language prefix. Unfortunately, that brings it into potential conflict with the poorly named "Bangando-Ngombe" [nmj], variously considered a dialect of Bangandu [bgf]. I am instead renaming it to "Ngombe (Congo)" for maximal clarity, and will create a new thread to discuss [nmj]. —Μετάknowledgediscuss/deeds 23:17, 11 May 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536654#Merge_[gti]_into_[nyc]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [gti] into [nyc]

Van de Velde et al. (ch.2, by Hammarström) say: "the only source on this language (Hackett and van Bulck 1956:74) has it as a dialect of Nyanga-li [nyc]." If that's the sole source, I have no idea why it ever got a code. —Μετάknowledgediscuss/deeds 01:33, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536657#Merge_[ppp]_and_[lnz]_into_[yaf]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [ppp] and [lnz] into [yaf]

Van de Velde et al. (ch.2, by Hammarström) say: "Pelende and Lonzo denote political rather than ethnolinguistic sub-entities of Yaka [yaf] ... as explicit linguistic data, whenever available, confirms". —Μετάknowledgediscuss/deeds 01:38, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536659#Merge_[njd]_into_[kde]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [njd] into [kde]

Van de Velde et al. (ch.2, by Hammarström) say: "the field research of Kraal (2005 1–7) finds this distinction [between Ndonde and Makonde] untenable from a linguistic and ethnographic point of view". "Makonde" [kde] is the usual term. —Μετάknowledgediscuss/deeds 02:02, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536661#Merge_Shona_dialects_[twl],_[mxc],_[twx]_into_Shona_[sn]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge Shona dialects [twl], [mxc], [twx] into Shona [sn]

Ehret and Kinsman (1981) "Shona Dialect Classification and Its Implications for Iron Age History in Southern Africa" have helpful trees of the Shona dialects, which confirm that [twl], [mxc], [twx] are all mutually intelligible to a high degree with the other dialects of Shona [sn]. (In fact, [twl] and [twx] are subdialects of the larger dialects that most Shona dictionaries treat, and [mxc] isn't even a valid clade, but merely a geographic convenience.) —Μετάknowledgediscuss/deeds 02:17, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Lala languages

The Bantu Lala language (South Africa) is certainly not the same as Zulu, as should be clear just from the sample in the WP article. Unfortunately, our current "Lala language" will have to be moved to make way. In the emerging tradition of geographic parentheticals, I suggest that [nrz] be renamed to "Lala (New Guinea)" and a new code, maybe [bnt-lal], be created for "Lala (South Africa)". (As a side note, there are also Lala-Roba and Lala-Bisa, which thankfully have hyphenated disambiguators.) —Μετάknowledgediscuss/deeds 02:33, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536668#Merging_[nyr]_into_[nih]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merging [nyr] into [nih]

There's a lengthy SIL report on the Nyiha languages, and skimming it suggests that the Malawian ([nyr]) and Tanzanian ([nih]) dialects are mutually intelligible. Chances are that the codes haven't been used to separate the two countries' dialects in a consistent way anyway, because they've just been called "Shinyiha" (with the language prefix) and "Nyiha" (without) — the latter is what we should use. —Μετάknowledgediscuss/deeds 05:21, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536671#Renaming_[kmw]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [kmw]

We currently call [kmw] "Kikumu"; the language is most commonly referred to as "Komo", but we already have [xom] "Komo", which is not easily disambiguated (it's spoken in three countries and belongs to an obscure group). We can still do better than "Kikumu", though — I suggest we lop off the language prefix and call it "Kumu", which is more common than the name we currently use. —Μετάknowledgediscuss/deeds 06:10, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536673#Renaming_[lwg]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [lwg]

We currently call this "Oluwanga"; we should drop the language prefix, as we normally do and as matches the scholarly literature, and call it "Wanga". (Note that we need to reconsider the Luhya codes altogether, but that can be dealt with later.) —Μετάknowledgediscuss/deeds 20:23, 1 April 2020 (UTC)Reply

Support removing the prefix. Comparing google books:"Wanga" Bantu OR Kenya language to google books:"Hanga" Bantu OR Kenya language, I notice that although the tribe seems to be mostly called the Wanga, the language is not infrequently "Hanga". (Another alt name is "Luwanga"). - -sche (discuss) 06:04, 6 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536677#Renaming_[blv]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [blv]

We currently call [blv] "Bolo", which refers to a peripheral dialect of this language. This SIL report says that a conference decided on "Kibala", and Angola's Institute of National Languages followed suit (the alternative suggestion of "Kibala-Ngoya" having been rejected due to its historically derogatory usage). I suggest we do the same and use "Kibala". —Μετάknowledgediscuss/deeds 20:47, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536679#Renaming_[syx]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [syx]

We currently call [syx] "Shamay", which is a rather rare name for it. WP uses "Samay", but I suggest we follow Van de Velde et al. (ch.2, by Hammarström) and use "Osamayi". —Μετάknowledgediscuss/deeds 20:57, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536681#Merge_[xdo]_into_[mho]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [xdo] into [mho]

Van de Velde et al. (ch.2, by Hammarström) and indeed every linguistic source I can find has this as synonymous with or a dialect of Mashi [mho]. I can only see a bit of the relevant part in Laranjo Medeiros (1981) VaKwandu on BGC, where he seems rather confused about the language in general, but there may be useful information there if someone can see it (or if anyone still has academic libraries operating in their part of the world). —Μετάknowledgediscuss/deeds 21:16, 1 April 2020 (UTC)Reply


RFM discussion: April–May 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59536683#Merge_[yax]_into_[mck]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [yax] into [mck]

The Yauma are ethnographically distinct, but speak a Mbunda dialect according to Fleisch (2000). I see a journal article in 1970 that says Yauma is synonymous with "newer Mbunda" (in contrast with "Old Mbunda"), and I'm not really sure what's going on there, but it also supports a merger. I cannot find Yauma mentioned in the usual sources otherwise. —Μετάknowledgediscuss/deeds 21:32, 1 April 2020 (UTC)Reply


RFM discussion: December 2017 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming fay

"Southwestern Fars" is a really awful name. Fars is not a macrolanguage as the name would suggest, but a province of Iran, and there are other lects spoken in southwestern Fars. We would do much better to use the unambiguous name "Kuhmareyi", as used by Wikipedia. —Μετάknowledgediscuss/deeds 19:44, 9 December 2017 (UTC)Reply

@Eeranee, who has been adding entries in it, may have an opinion. —Μετάknowledgediscuss/deeds 20:53, 9 December 2017 (UTC)Reply
hi, Metaknowledge. At the beginning when I wanted to create the category for the language I searched online to see if I can find Kuhmareyi. I could not find online sources mentioning Kuhmareyi other than Wikipedia. Online sources mostly mention Southwestern Fars. I know Fars is a province. ethnologue and some other sites use this name. I am trying to find the name Kuhmareyi in the book that I have, A treasury of the Dialectology of fars. I should also mention that I am very cautious in adding the new words and double check the words and have not recorded some words that can not be found in other sources. However the book is written by a professional linguist and is funded or guided by Iranian Academy of Persian Language and literature. I think I saw Kuhmareyi in one online source. I have not found the name "Kuhmareyi" So far but the name seems correct.--Eeranee (talk) 21:09, 9 December 2017 (UTC)Reply
@Eeranee: It sounds like you're being a very conscientious editor, so thank you! Do you think "Southwestern Fars" really is the most used name? If so, I guess we'd be wrong to do otherwise. —Μετάknowledgediscuss/deeds 21:29, 9 December 2017 (UTC)Reply
I can find online sources in Persian using Kuhmareyi which might not be reliable but I could not find a source written in English using Kuhmareyi. If anyone thinks kuhmareyi is more correct we can use it--Eeranee (talk) 22:06, 9 December 2017 (UTC)Reply
Searching Google Books for both names, I find nothing that would help us much to decide one way or the other: nothing using "Kuhmareyi", and only a couple of general reference works on world languages that mention "Southwestern Fars" in giant lists of languages, probably just copying Ethnologue. (Wikipedia curiously says "the southwestern [Fars] dialects can be divided into three families of dialects according to geographical distribution and local names: Southwestern (Lori), South-central (Kuhmareyi) and Southeastern (Larestani)", as if calling it "southwestern" might be slightly confusing/misnomial.) @ZxxZxxZ, Vahagn Petrosyan, do either of you have a preference or knowledge of which name is more common or appropriate? - -sche (discuss) 19:19, 17 December 2017 (UTC)Reply
I don't know anything about this. --Vahag (talk) 18:38, 18 December 2017 (UTC)Reply


RFM discussion: May–June 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59618578#Renaming_[chw]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [chw]

Currently called "Chuwabu". There are quite a lot of spellings, influenced by English, Swahili, Portuguese, and Africanist notation, and none of them are unquestionably more common, but this shows that the case for "Chuabo" is the best, which is supported by paging through the Google Books results as well. —Μετάknowledgediscuss/deeds 17:11, 5 May 2020 (UTC)Reply

Support. - -sche (discuss) 01:39, 12 May 2020 (UTC)Reply


RFM discussion: May–June 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59618613#Renaming_[nmj]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [nmj]

We currently call [nmj] "Bangando-Ngombe". It is variously considered a dialect of Bangandu [bgf]: Wikipedia cites Glottolog to support a claim that the code is "spurious", and the Atlas linguistique de l'Afrique centrale calls it "une variété très voisine" to Bangandu. Without a clearer statement about mutual intelligibility or specific fieldwork, I am hesitant to merge it. However, the current name, which hyphenates a (differently spelt!) version of the macrolanguage with the dialect's name, is clearly undesirable. I suggest we rename it to "Ngombe (Central African Republic)" (the national parenthetical serving to disambiguate from [ngc] "Ngombe (Congo)"). @-scheΜετάknowledgediscuss/deeds 23:24, 11 May 2020 (UTC)Reply

Seems reasonable; go for it. - -sche (discuss) 01:38, 12 May 2020 (UTC)Reply


RFM discussion: June 2018–June 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming the Manda languages

We currently call [mgs] "Nyasa", and it came to my attention due to the entry mtua, which User:Wikitiki89 mistakenly created as a "Nyasa" entry, based on an obsolete dictionary of Chichewa that calls it "Kiniassa". Nowadays, Nyasa is usually a name for a group of closely related languages including Chichewa (Nyasa languages) but WP claims it is also used for mjh. Following WP, we could call it "Manda (Tanzania)", rendered problematic by the two other Manda language possibilities, which we call "Manda" and "Australian Manda" (!). I propose that we go all in for national disambiguation and make those parenthetical as well, as "Manda (India)" and "Manda (Australia)". @-scheΜετάknowledgediscuss/deeds 06:18, 28 June 2018 (UTC)Reply

Just for the record, I did a lot of research to determine what the correct language code was for the language described in that dictionary, even referring to this map (source) and strongly considering "mjh". Thanks for finally sorting it out! Note also, there is a "Nyasa" translation at water, which does not align with Chichewa "madzi" (which the dictionary I used gave as "madsi"). --WikiTiki89 15:07, 28 June 2018 (UTC)Reply
@Wikitiki89: Next time, try Google instead of poring over maps. :) I was already aware of this dictionary (and its miserable orthography), but searching "Rebman Kiniassa" gets you the Wikipedia article for Johannes Rebmann, which in turn tells you this is Chichewa. As for the "Nyasa" translation, masi... I really don't know what language that should be. The word for "water" in mgs is máchi. —Μετάknowledgediscuss/deeds 17:16, 28 June 2018 (UTC)Reply
I did Google quite a bit, and I probably did look at the w:Chichewa Wikipedia article, but didn't trust that it was accurately citing the dictionary. --WikiTiki89 17:21, 28 June 2018 (UTC)Reply
WP says Australian Manda (zma) is also called "Menhthe", but the only published resources on it I could find used Manda. To distinguish it and [mgs] and [mha], the proposal of [mgs]="Manda (Tanzania)" and [mha]="Manda (India)" and [zma]="Manda (Australia)" sounds good. (As an aside, there is also a "Ma Manda" language.) - -sche (discuss) 03:11, 21 January 2020 (UTC)Reply


RFM discussion: June 2018–June 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming mjh > Merging mjh

Tangentially relevant to the discussion of languages named "Nyasa" above, mjh is not only one of those, but is also called "Mwera" (a name currently occupied by mwe, which can't even by distinguished by a parenthetical, because both languages are from Tanzania!). Maho's Guthrie List, the standard list used by Bantuists, calls it "Mbamba Bay Mwera", but the only hits for that string on Google Books are of that very list. WP chooses to simplify this as just "Mbamba Bay", which is both the most unambiguous option, and also an option that seems to be used only there. I'm really unsure what to call it — anything but our current name of "Nyanza", which refers to the lake and offers only more confusion. —Μετάknowledgediscuss/deeds 06:27, 28 June 2018 (UTC)Reply

@-sche, Mahagaja: Any thoughts on this (or the above discussion)? —Μετάknowledgediscuss/deeds 03:19, 17 January 2020 (UTC)Reply
@Metaknowledge Can they be [called by whatever name is most common, even though homographic to another language, and then] distinguished by linguistic family, like WT:RFM#Canonical_name_of_"fan"? Say, "Mwera (Nyasa)" vs "Mwera (Ruvuma)" or whatever? Btw, we should standardize on putting the family in parentheses (where we also put geographic disambiguators), which will require renaming some languages that were named with the family first, like Papuan Mor (but not Sepik Iwam, which is apparently normally called that, to distinguish it from the other Iwam that is also Sepik). - -sche (discuss) 07:05, 17 January 2020 (UTC)Reply
I don't know anything about these languages, but if indeed both mjh and mwe are usually called Mwera and both are spoken in Tanzania, then I'd support "Mwera (Nyasa)" and "Mwera (Ruvuma)". —Mahāgaja · talk 07:29, 17 January 2020 (UTC)Reply


RFM discussion: January 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merging Gaulish and Lepontic

I'd like to address merging Lepontic with Gaulish because it is already included in the English article provided and the terms are frequently the same except for some differences in endings development without any necessary declension change. Comparing this to Latin, as it is merged with Old Latin, this may also be extended to Gaulish and Lepontic. Furthermore, Cisalpine Gaulish is already encompassed by Gaulish even though it was written in the Old Italic alphabet just like Lepontic. HeliosX (talk) 19:55, 15 January 2020 (UTC)Reply

Having read both Lepontic language and Cisalpine Gaulish, I'm not convinced they're similar enough to warrant merging. —Mahāgaja · talk 21:06, 15 January 2020 (UTC)Reply
I generally push for a conservative treatment for extinct languages with small, finite corpora. We can afford to over-split a little, if that even is the case here, and be no worse off for doing so. —Μετάknowledgediscuss/deeds 03:13, 17 January 2020 (UTC)Reply


RFM discussion: February 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Adding Old Omagua

This helpful source strongly suggests that early texts written are sufficiently distinct from Omagua (omg) to be considered a separate language. I don't know much about it, but if this is right, we need a new code for Old Omagua, maybe tup-oom. @Lvovmauro, -sche. —Μετάknowledgediscuss/deeds 06:24, 14 February 2020 (UTC)Reply

Meh. I don't object, but I'm not convinced it's necessary. We're talking about a time difference less than that between modern and Middle English, many of the differences seem to be in whether various morphemes remain attested into the present in various positions (or, conversely, whether modern ones happened to be found in the sparse early corpus), spelling variations are clearly attributable to texts being written by potentially non-fluent outsiders vs modern linguists and natives, and modern Omagua is small and dying. (Fox is another language which underwent changes in a relatively short time period that might be considered substantial, but it seems to still be treated as one language.) It seems to me like we could adequately, and perhaps even more sensibly, handle the differences by labelling words and spellings obsolete where necessary, lemmatizing the modern forms (and conversely labelling words/forms as modern developments in cases where we can tell that they are as opposed to that they just weren't attested in the old texts). - -sche (discuss) 21:12, 18 February 2020 (UTC)Reply
Some languages do change a lot faster than others due to social pressures (hence why glottochronology doesn't really work). I found the differences in morphology and lexicon significant, especially in the hopes that someday, someone will build the infrastructure for (modern) Omagua here. But if you're not convinced, I'm fine with shelving this; I certainly won't be working on Tupian stuff myself. —Μετάknowledgediscuss/deeds 03:11, 19 February 2020 (UTC)Reply


RFM discussion: April 2017–August 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Misc language code retirements (2017)
  • Moksela language [vms]Μετάknowledgediscuss/deeds 04:15, 4 April 2017 (UTC)Reply
    Charles Grimes says in Spices from the East: Papers in Languages of Eastern Indonesia: "This speech variety has been extinct since 1974, when the last speaker died. No clues other than the name of a stream east of Kayeli called Moksela, give any indication as to where it was spoken or what it was like. If it was spoken from the stream by that name eastward, then chances are likely that it was also a variety of the Kayeli language. People in the Kayeli area remember nothing more than the name of the language, who in the community spoke it [] " (I cannot view beyond this in the Google Books preview.)
    "...who in the community spoke it before they died, and that it was somehow different enough to have its own identity." is the rest of the sentence, I managed to coax Google into telling me. The name seemed familiar, as if it had been in one of the wordlists I've been looking at recently, but I just went back over them and searched through various other sources and indeed the only mentions of it I find all just say it's extinct and not recorded; how sad. Removed. - -sche (discuss) 08:19, 10 May 2017 (UTC)Reply
  • In this vein, Makolkol [zmh] is claimed to be extinct (per Wurm 2003, after having 7 speakers in 1988) and apparently unattested (per Stebbins 2010). Harald Hammarström and Sebastian Nordhoff accept this conclusion in Melanesian Languages on the Edge of Asia, but it may be a cautionary tale instead, because an article in LoopPNG from 2016 says five Makolkol still live, and even provides words(!), saying it is related to Simbali: "mam, meaning father, and nan, meaning mother". A 2005 article in Anthropological Linguistics (volume 47, page 77) agrees on the relation to Simbali: " [] Makolkol (extinct), [] is locally understood to have been a 'mixed language' combining Simbali and Nakanai (an Austronesian language on the northern side of New Britain)." I suppose the code should be left alone for now, pending further data. (There were widely varying estimates of how many speakers it had earlier in the 20th century, and fanciful tales of who they were, "headhunters" or "giants" who "lived in trees" and who no white person had survived meeting at first.) - -sche (discuss) 08:43, 10 May 2017 (UTC)Reply
    (Ergo, code kept.) - -sche (discuss) 01:30, 6 August 2020 (UTC)Reply
  • Maramba
Also Maramba (myd)? (And many more at Spurious languages need to be checked, but some are not spurious, like Ammonite.) - -sche (discuss) 09:51, 3 June 2017 (UTC)Reply
myd has been retired by the ISO and hence now also by us. - -sche (discuss) 07:47, 23 February 2019 (UTC)Reply


RFM discussion: July 2016–August 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Even more languages without ISO codes, part 5

This next batch is of languages from lists other than Ethnologue and LinguistList. As before, I've tried to vet them all beforehand, but I will have doubtlessly made some mistakes. NB if you want to find more: I've avoided dealing with most of the Loloish languages, because all the literature seems to be in Chinese. —Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)Reply

Australian languages
Tasmanian languages
Western Tasmanian:
Northern Tasmanian:
Eastern Tasmanian:
Oyster Bay (Big River, Paredarerme/Paritarami, Lairmairrener, Lemerina)? - -sche (discuss)   Done as aus-par
Little Swanport? - -sche (discuss)   Done as aus-lsw
comments

@-sche, back when I suggested these Australian languages, I included the codes for the Tasmanian languages that Bowern (2012) teased out of various wordlists. At the time, I was ignorant of the fact that there is an ISO code, xtz, for a language called "Tasmanian", and we have a few words in it. There was no single Tasmanian language, so I think this code should be retired and the words sorted into their respective languages by Bowern's scheme. —Μετάknowledgediscuss/deeds 05:28, 3 May 2017 (UTC)Reply

Other needed codes edit

Here are other languages we might need codes for: - -sche (discuss) 05:21, 29 August 2016 (UTC)Reply

  • Indanga (Kɔlɔmɔnyi, Kɔlɛ, Kasaï Oriental) (bnt-ind?)
    It lacks a Wikipedia article but is documented by Jacobs, Texte et lexique indanga (2002). fr.Wikt already has a word from it. OTOH, fr.WP considers it a regional variant of Tetela. And fr.Wikt does have a tendency to treat dialects as language, also splitting e.g. Alsatian German from Alemannic German, Hoanya from Papora, etc. - -sche (discuss) 05:21, 29 August 2016 (UTC)Reply
    Well, it's definitely part of the dialect continuum known in Guthrie as C.70, which has 8 ISO codes that cover it rather poorly (this is a typical situation with Bantu languages, which really need their own overhaul at some point). I see that its word for "water" is bash in that reference, rather different than Tetela proper ashi. We have to draw lines somewhere, and I can't figure out where Indanga would be merged, so I suppose a new code is in order. —Μετάknowledgediscuss/deeds 05:30, 4 October 2016 (UTC)Reply
      DoneΜετάknowledgediscuss/deeds 01:58, 5 April 2018 (UTC)Reply


RFM discussion: April–July 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/59942051#Merge_[dov]_into_[toi]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [dov] into [toi]

Van de Velde et al. (ch.2, by Hammarström) say: "Dombe is a derogatory nickname for Tonga found in Hwange district of Zimbabwe". This raises an issue with [toi], which we currently call "Tonga (Zambia)" (to distinguish it from "Tonga (Malawi)" and "Tonga (Mozambique)") — it's spoken in Zimbabwe too. The WP article is called "Tonga (Zambia and Zimbabwe)", which is awfully awkward name for a language spoken by 1.5 million people. I think we may have to cut our losses and stick with "Tonga (Zambia)" even after the merger. —Μετάknowledgediscuss/deeds 01:51, 1 April 2020 (UTC)Reply

@-sche, Smashhoof, anyone have feelings about "Tonga (Zambia)" rather than "Tonga (Zambia and Zimbabwe)"? —Μετάknowledgediscuss/deeds 20:59, 11 May 2020 (UTC)Reply
@-sche, Smashhoof, Mahagaja: Reminder that I'm still looking for input on this one. When you see a country name in a parenthetical disambiguation in a language's canonical name, do you assume that the language is exclusively spoken in that country? —Μετάknowledgediscuss/deeds 06:35, 15 June 2020 (UTC)Reply
At least I would assume it's spoken primarily in Zambia, as in maybe at least 67% of speakers are in Zambia. But why not use subfamilies instead of countries as disambiguators? [toi]/[dov] could be "Tonga (Botatwe)", [toh] could be "Tonga (Southern Bantu)", and [tog] could be "Tonga (Nyasa)". —Mahāgaja · talk 06:41, 15 June 2020 (UTC)Reply
@Mahagaja: "Primarily" is certainly correct here. But the idea of using subfamilies is an intriguing suggestion that I hadn't considered. My only fear is that those subfamilies are not very well known by any terms, as such classification is relatively recent, and to someone unfamiliar with the classification system, the word "Botatwe" would be meaningless and the concept of "Southern Bantu" would be ambiguous. —Μετάknowledgediscuss/deeds 06:51, 15 June 2020 (UTC)Reply
I would personally prefer the more accurate label Tonga (Zambia and Zimbabwe), even if it's a bit clumsy. But I'm not opposed to keeping it as Tonga (Zambia) if you prefer. What I'm confused about is why you want to merge [dov] and [toi]. The chapter you're referencing splits [dov] as a separate language Toka-Leya, which suggests that language isn't mutually intelligible with Tonga. Smashhoof (Talk · Contributions) 17:25, 15 June 2020 (UTC)Reply
@Smashhoof: Thank you for checking! I read that footnote as meaning that the "Toka-Leya dialects" were to be thought of as dialects of Tonga, but I see that Hammarström does give the sum of Toka, Leya, and "Dombe" its own line as a language. I went to check Hachipola (1991), who did a study on eight Tonga-group lects, including Toka, and says: " [] Toka informants all insisted that they were all legitimately Tonga. Valley speakers in particular consider their speech as the 'purer' form of Tonga of which Plateau is the corrupted version [MK: note that Plateau is standard Zambian Tonga]. The speakers of Plateau, Valley and Toka can also understand speakers of Ila, but with some difficulty. However, conversation can be carried out between, say, a Plateau speaker and an Ila speaker each using their own speech form. This is also true of Lenje on the one hand and Plateau on the other [] ". I glanced through the lexical items in the second half of the dissertation and found that these three lects were quite similar, and Toka might be closer to Valley Tonga than to Plateau Tonga. I'm really not sure where to draw the line, especially since we (by default) consider Valley and Plateau to both be [toi]. What do you think? —Μετάknowledgediscuss/deeds 19:17, 15 June 2020 (UTC)Reply
@Metaknowledge I'm not sure what "Valley" refers to specifically. But I would follow Hammarström (and Glottolog) by classifying Toka, Leya, and "Dombe" as [dov]. Smashhoof (Talk · Contributions) 20:50, 19 June 2020 (UTC)Reply


RFM discussion: August 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Adding Ashraaf

Ashraaf is the sister language to Somali, and currently included in its code despite being mutually unintelligible. Wikipedia has an entry at Ashraf dialect, despite the fact that they quote two sources that make the case for it being its own language, and their references use the usual spelling with a double a. It is somewhat poorly documented (there is no dedicated grammar), but multiple scholarly sources include vocabulary. I suggest the code cus-ash. —Μετάknowledgediscuss/deeds 01:33, 11 August 2020 (UTC)Reply

I am inclined to agree/support; Green and Jones' piece outright says, elsewhere from the snippet WP quotes, that "despite all classifications of Ashraaf treating it as a dialect of Somali, our Marka speakers have intimated to us that both Marka/Somali and Marka/Maay intelligibility presents a challenge" and "despite the fact that some classifications treat Ashraaf as a dialect of Somali, Marka and Somali appear not to have a high degree of mutual intelligibility"; they say plurals are almost all different from Somali, and singulars also show some differences. (They mention the otherNames = "Af-Ashraaf", and the varieties {"Marka, Lower Shabelle"} and "Shingani", saying — as you probably saw, but as I will mention here for easier findability later — that for Shingani there is one 'theoretical article' on 'theme construction', Ajello 1984, one short grammatical sketch, Moreno 1953, and one book of pedagogical material, Abo 2007, and for Marka there is an unpublished grammatical sketch, Lamberti 1980, and an article on verb inflection, Ajello 1988.) And I see that Maay already has a separate code, reasonably (the IEL says standard Somali is "difficult or unintelligible to Maay and Digil speakers" unless they've learned it). The only thing that gives me pause is that, as they admit, many prior (albeit non-Ashraaf-specific) classifications have viewed Somali as a single language, as WP claims speakers also view it, despite the lack of mutual intelligibility. - -sche (discuss) 20:46, 11 August 2020 (UTC)Reply
That's a matter of linguistic nationalism, coupled with an unfortunate habit of scholars like Lamberti to count everything spoken by people who identify as belonging to Somali clans as "Somali dialects". Maay, Jiddu, Garre, and many others were in this boat, but have all gotten ISO codes, except for Ashraaf. —Μετάknowledgediscuss/deeds 21:07, 11 August 2020 (UTC)Reply
Added (as cus-ash with an initial entry at ilig; you may want to add diacritic stripping rules to the code and gender to the entry). - -sche (discuss) 04:20, 18 August 2020 (UTC)Reply


RFM discussion: March 2014 – August 2015 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Meänkieli (fit) and Kven (fkv) into Finnish (fi)
 
Finnish dialects

I think that linguists consider these to be dialects of Finnish, so that would make these pluricentric standards of a single language. I don't know if keeping them separate would hold any value? —CodeCat 14:05, 15 March 2014 (UTC)Reply

Let's ping our active Finnish speakers to see if they have input: User:Hekaheka and User:Makaokalani. 23:16, 23 March 2014 (UTC) (updated - -sche (discuss) 06:09, 6 April 2014 (UTC))Reply
The impression I get from the example at w:Meänkieli is that the differences are very minor, no more than there might be between Croatian and Serbian. I notice systematic loss of -d- and Finnish -ts- corresponds to -tt- in Meänkieli. They definitely look mutually intelligible. Kven looks a little more different, but it might also just be the spelling; I don't know how hard it would be to the average Finnish speaker. —CodeCat 23:26, 23 March 2014 (UTC)Reply
I know maybe a dozen words of Finnish, so I can't judge for myself, but the impression I get from the Wikipedia articles is that there's an equal or greater range of variation between dialects in Finland as there is with these dialects- if these dialects were on the other side of the Finnish border, they would probably be considered just part of the normal dialectal variation (I'm sure there are some differences due to their isolation from the influence of standard Finnish, as well). They have special status because they're in Sweden and Norway surrounded by Swedish and Norwegian. Chuck Entz (talk) 23:53, 23 March 2014 (UTC)Reply
Finnish wasn't even a single language to begin with originally. There's several dialect groups that form a continuum, but it's not easy to draw clear lines. Savonian (eastern) dialects for example might well be closer to Karelian (considered a separate language) than they are to western Finnish. —CodeCat 00:10, 24 March 2014 (UTC)Reply
My impression is the same as Chuck's, that these could be merged. By my (quick) count, we have 11 Meänkieli entries and 14 English entries with Meänkieli translations, and 19 Kven entries and 8 entries with Kven translations. - -sche (discuss) 02:45, 24 March 2014 (UTC)Reply
For more information see w:Finnish dialects and also w:Peräpohjola dialects. The map to the right may also help. —CodeCat 03:33, 24 March 2014 (UTC)Reply
 
Blue indicates areas where Finnish is spoken by the majority, and green indictes minority. Meänkieli and Kven are considered Finnish on this map

I somehow missed this discussion when it was active, but better later than never. I have the following comments:

  • The map is outdated. There's practically no Finnish-speaking population left in the areas which were annexed by the Soviet Union during and after the WWII. The map on the right is more up-to-date.
  • There's some Ingrian population left in the St. Petersburg area, but their number and share of population (less than 0,5‰ in Leningrad oblast) is drastically reduced due to 1) inflow of Russians to St. Petersburg, 2) Stalin's terror in the 1930's and 3) emigration to Finland between 1990 and 2011.
  • I'm not sure of Kven-speakers, but the speakers of Meänkieli tend to be quite strong in their opinion that they are not Finnish-speakers. It is probably true that if the border were in another place, Meänkieli would be considered a Finnish dialect. But then again, it would hardly be the same language as it is today - it would have preserved less archaic features and there would be much less Swedish influence in it. If ISO regards it a language, how could we be wiser?
  • Meänkieli is an official minority language in Sweden, and is regarded as distinct from Finnish which also has a (separate) minority language status there.
  • "Finnish wasn't even a single language to begin with originally." -- Show me one that was!

--Hekaheka (talk) 12:27, 25 January 2015 (UTC)Reply

Let's take a look at our current 15 Meänkieli and 20 Kven lemmas:
  • Meänkieli:
    • Six words indistinguishable from Standard Finnish
    • Two words indistinguishable in shape from Standard Finnish but with dialect-specific meanings
    • Four words with some phonetic peculiarities specific to Northern dialects
    • Two words widespread across Finnish dialects
    • One word that might be specific to the variety, or might be one of the previous
  • Kven:
    • Seven words indistinguishable from Standard Finnish
    • Seven words widespread across Finnish dialects
    • Five words with some phonetic peculiarities specific to Northern dialects
    • One narrow-distribution loanword from Norwegian
So yes,   Support. We could well treat these as Finnish dialects, though I think to account for any local neologisms and such, they would deserve categories of their own under Category:Regional Finnish. --Tropylium (talk) 19:55, 28 February 2015 (UTC)Reply
I've merged Kven into Finnish, relabelling the handful of Kven entries we had, except nelje and kahðeksen, yhðeksen and yhðeksentoista, which don't seem to be attested in any language. (kahdeksen and yhdeksen do seem to be attested as regional variants of the usual Finnish terms.) - -sche (discuss) 05:16, 18 August 2015 (UTC)Reply


RFM discussion: April 2018–October 2019 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming svm

From ‘Molise Croatian’ to ‘Molise Slavic’. It seems most scholars in the field, excluding some in Croatia itself, have been switching to the latter name; see, for instance, the papers of Krzysztof Borowski or the more recent works of Walter Breu. The endonym is naš jezik (literally our language) or na-našo (literally in our (manner)), without any specific ethnonational designation; the other names are apparently recent impositions. For this see for instance Sujoldžić 2004:

 

Along with the institutional support provided by the Italian government and Croatian institutions based on bilateral agreements between the two states, the Slavic communities also received a new label for their language and a new ethnic identity — Croatian — and there have been increasing tendencies to standardize the spoken idiom on the basis of Standard Croatian. It should be stressed, however, that although they regarded their different language as a source of prestige and self-appreciation, these communities have always considered themselves to be Italians who in addition have Slavic origins and at best accept to be called Italo-Slavi, while the term »Molise Croatian« emerged recently as a general term in scientific and popular literature to describe the Croatian-speaking population living in the Molise.

 

Information about current scholarly usage is given by Walter Breu in the request for an ISO code here:

 

Slavomolisano: [] In scientific work, this name is predominantly used, either in its Italian form or in translations. As the language is "genetically" affiliated to the Serbo-Croatian macrolanguage with its dialectal continuum and the problems of its segmentation, a denomination, referring to one of the individual Standard languages of this group, e.g. Croatian or Bosnian, should be avoided, the more so, as its individual character is mainly due to the language contact with Italian and its dialects, especially that of Lower Molise.

 

Ethnologue, Glottolog, and SIL (as well as Wikipedia) all followed the ISO’s lead and list the language under ‘Slavomolisano’, the Italian form of ‘Molise Slavic’ (which would also be fine). AFAIK I’m the only contributor to Wiktionary in this language to date. — Vorziblix (talk · contribs) 19:57, 15 April 2018 (UTC)Reply

If most scholars as well as Ethnologue, Glottolog, SIL, and Wikipedia all call it Slavomolisano, shouldn't we do the same, rather than call it Molise Slavic? —Mahāgaja (formerly Angr) · talk 11:46, 19 April 2018 (UTC)Reply
Sure, that would also be fine. The two are used pretty much interchangeably in scholarly publications (so the ISO note says ‘either in its Italian form or in translations’). It’s a bit of an odd situation, given the most prominent scholar in the field (Breu) submitted the language under ‘Slavomolisano’ to the ISO (hence the adoption by all the other organizations) but uses ‘Molise Slavic’ in his own recent English publications. But if you think it’s preferable to directly follow Ethnologue and the others I wouldn’t object. — Vorziblix (talk · contribs) 12:57, 21 April 2018 (UTC)Reply
I’ll start switching this over to ‘Slavomolisano’ one of these days if no one has any objections. — Vorziblix (talk · contribs) 03:04, 24 September 2019 (UTC)Reply
Done.Vorziblix (talk · contribs) 16:05, 17 October 2019 (UTC)Reply


RFM discussion: March 2019–February 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I believe that Category:Nama language should be renamed to Category:Khoekhoe language. The Nama are an ethnic group, but other ethnic groups like the Damara, Haiǁom, and ǂĀkhoe also speak the same language. It's more accurate and inclusive to call it Khoekhoe. Smashhoof (talk) 18:44, 18 March 2019 (UTC)Reply

On Google Books, "Nama language" and "Khoekhoe language" are about equally common. This is misleading, however, because many of the hits refer one of multiple Khoekhoe languages, as the term is used somewhat more broadly by some authors. Further evidence of this is supplied by "speak Nama" being much more common than "speak Khoekhoe" on Google Books. If we restrict ourselves to linguistic literature, "Nama language" is significantly more common than "Khoekhoe language".
In short, we do sometimes use a less common name for a language in order to disambiguate or avoid a name widely considered offensive. The name "Nama" is less inclusive, but also slightly less ambiguous, and seems to be the most common name in usage, pace Wikipedia. —Μετάknowledgediscuss/deeds 18:52, 18 March 2019 (UTC)Reply
The native name is Khoekhoegowab (literally "Khoekhoe language"). "Nama language" may have been more common in the past, but the standardized language today is called "Khoekhoe" or "Khoekhoegowab". The dictionary I have says that "Khoekhoegowab is the Language of mainly the Damara, Haiǁom and Nama." In The Khoesan Languages (2013, Routledge Language Family series), they exclusively refer to the language as Khoekhoe; however, they distinguish between Namibian Khoekhoe (Nama/Damara) and Haiǁom/ǂĀkhoe. "Khoekhoe(gowab)" does seem to be a more accurate and preferred term to me. Smashhoof (talk) 21:43, 18 March 2019 (UTC)Reply
If you search a digital copy of The Khoesan Languages, you'll see that different authors use different terminology in the book. The sections concerning the language in question call it Khoekhoegowab [well, usually just Kh. for short], which is far less common than either Khoekhoe or Nama and little used in non-scholarly English. —Μετάknowledgediscuss/deeds 00:58, 19 March 2019 (UTC)Reply
Looking in my copy of The Khoesan Languages, I see a few usages of "Khoekhoegowab", but "Khoekhoe" seems to be used more. "Khoekhoe (N.Kh.) is strictly a suffixing language...", "Khoekhoe categorizes nouns according to ...". The whole morphology section on the language seems to use "Khoekhoe". The syntax section uses "N.Kh." for "Namibian Khoekhoe(gowab)". Regardless, since the standard language is called Khoekhoegowab, I think it would be best to call it Khoekhoe, as that seems to be equivalent and more common than the full Khoekhoegowab in English. Also, Wikipedia uses the term Khoekhoe (see Khoekhoe language), so it would also make sense to keep the same terminology between wikis. Though, that article does use the term "Nama" more often than "Khoekhoe", which is a bit odd given the title is "Khoekhoe language". Smashhoof (talk) 02:49, 19 March 2019 (UTC)Reply
I asked someone who knows more about this and they said that Khoekhoegowab is the standard language, taught in schools and used in media, but Nama/Damara is the colloquial spoken language. Locals refer to it as Damaranama, Namadamara, Namataal, or Namagowab. Smashhoof (talk) 23:06, 18 March 2019 (UTC)Reply


RFM discussion: September–November 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Recreation of Template:hwc (Hawaiian Pidgin)

I would like to propose splitting HWC (Hawaiian Pidgin) off of EN (English) due to the following reasons:

1. The discussion in Template talk:hwc is outdated. It took place back in 2012/2013. This is before the US Census recognized Pidgin as a separate language in 2015 (2-3 years later). As such, the general linguistic view of HWC's status may have changed as a result of federal recognition.

2. Many of the arguments these people used were flawed (at least in my opinion).

- They implied Hawaiian Pidgin shouldn't be classified as a creole due to its high levels of mutual intelligibility with Standard English, even going as far as to compare it with "gangsta slang" from NYC or LA. The problem with this argument is that intelligibility doesn't define a creole. Creoles are naturalized pidgins, and pidgins are: "an amalgamation of two disparate languages, used by two populations having no common language as a lingua franca to communicate with each other, lacking formalized grammar and having a small, utilitarian vocabulary and no native speakers". That perfectly defines the origins of Hawaiian Pidgin. It formed as a result of many immigrant groups learning English. In fact, Tok Pisin has a very similar origin.

- The central claim that Hawaiian Pidgin can be understood by English speakers is only a half-truth to begin with. Many English speakers say they understand "Pidgin", but that's because the majority of Pidgin speakers mix a lot of their speech with Standard English, even more so in formal contexts. Full-on Pidgin can be a tough nut to crack. Furthermore, it's interesting that they use the argument of mutual intelligibility on Wiktionary, the same website that lists Bulgarian and Macedonian as separate languages. Those two languages are more similar to each other than Standard English is to Hawaiian Pidgin, yet for some reason, the BUL-MAC languages are separated meanwhile HWC is listed as dialect of EN.

3. Even in Wiktionary, it gets referred to as a "creole language". In the page Hawaiian Pidgin, it literally says "A creole language based in part on English used by most "local" residents of Hawaii. It's kind of odd that Wiktionary defines it as a creole language, but doesn't actually list it as one.

These are my main reasons why I do think Wiktionary should split hwc off of en. — Coastaline (talk) 00:25, 11 September 2020 (UTC)Reply

I started that discussion in 2012, and I don't really agree with how I approached it then. Hopefully I can do a better job now. Your first and third points aren't really relevant here: the political status of hwc doesn't have any practical bearing on its linguistic status, and our entry has no authority over our language treatment; it could well be changed. But your second point gets to the crux of the matter, and I am no longer certain how to proceed. Ultimately, we need to identify texts or recordings that we can agree are "pure" hwc, as opposed to being mixed with (colloquial) English. You mention Bulgarian and Macedonian; one reason it's easy to separate them is that there are lots of texts in each, clearly belonging to one or the other. Mutual intelligibility may be one of our best guides if we can't even decide what to put in which category. —Μετάknowledgediscuss/deeds 02:45, 11 September 2020 (UTC)Reply
My point wasn’t that the political status changes the linguistic status. I tried to say that federal recognition was a result of linguists changing their attitudes towards HWC, since the government wouldn’t randomly recognize it out of nowhere if it weren’t for linguistic perception. That’s why I thought the discussion needed to be revamped because there’s likely been a change in linguistic consensus from 2012 to 2020, if we’re seeing different developments like this.
And fair point about the third argument. But regarding the second one, I do think there are texts that are in pure hwc, such as the Hawaiian Pidgin Bible (Da Jesus Book). The book “Pidgin Grammar: An Introduction to the Creole English of Hawaii” also showcases pure Pidgin (although the explanations are in English). — Coastaline (talk) 04:27, 11 September 2020 (UTC)Reply
Just to clarify: even if there's consensus to reinstate Hawaiian Pidgin as a separate language, the template {{hwc}} will not be recreated, because that's not how we work with languages anymore. Rather, an entry for hwc will be added to Module:languages/data3/h. I don't know anything about HP beyond what Wikipedia tells me, but if the entire Bible is written in the language illustrated by File:Hawaii Pidgin crop.jpg, then I do think it's appropriate to call that a separate language rather than a variety of English. It may not be as different from standard English as Bislama is (as Meta said 8 years ago), but I do get the impression that it's about as different from it as Jamaican Creole is, one big difference being that Jamaican spells things phonetically while Hawaiian sticks closer to standard English orthography (Jamaican Creole fuud vs. HP food), which makes HP easier to read for English speakers. —Mahāgaja · talk 15:24, 11 September 2020 (UTC)Reply
Poking around the copy Google Books has snippets of, it does seem to all be like that, including e.g. "But wen da police guys wen come to da jail, dey neva find Jesus guys ova dea. So da police guys wen go back an tell da leadas: 'we wen find da jail door still yet stay lock, an da security guard standing by da doors, but wen we wen open um, nobody stay inside.' Wen da leada guys hear dat, da captain fo da temple guard an da main priest guys wen come mix up, an dey wen tink plenny wat goin happen." (I also found a book called "Pidgin to Da Max" which mentions a number of loanwords from Samoan, etc, and has its own usexes like "aalas dollahs means no mo' kala", and suggests the pronunciation of some words differs from English even when the spelling does not.) On one hand, it seems mostly intelligible, and there are substantially less intelligible dialects of English that we clearly treat as such; on the other hand, I see the argument above that it has a different origin. I have no strong opinion, but will say that on a practical level we do a poor job of covering most things we've merged into English, often having only a little vocabulary, little information on e.g. how verbs conjugate, and few usexes, since putting "wen da police guys wen come to da jail, dey neva find peopo" as a usex under police#English (as long as Hawaiian Pidgin is considered English) would probably be frowned on as inappropriately niche. - -sche (discuss) 09:42, 12 September 2020 (UTC)Reply
Thank you for sharing your thoughts. I’d just like to mention that earlier, Metaknowledge said Bulgarian and Macedonian are separate languages since both lects can be easily identified in writing. But I do in fact believe Pidgin can be identified as well. Sure, not all Pidgin books spell things the same way but it generally keeps basic rules (such as “where” being “wea” and “the kind” being “da kine”). Pidgin writing as a result should be easily identifiable from American English. It’s kind of like Neapolitan, since it’s easy to identify it from Standard Italian but not everyone spells its words the same way, even among those who speak the same dialect of Neapolitan. Much like Pidgin, it’s not standardized but it still can be differentiated. — Coastaline (talk) 01:06, 13 September 2020 (UTC)Reply

HWC Recognition (take 2) edit

@Mahagaja, @-sche In September, I requested that Wiktionary should add hwc as a language. However, the discussion died down and the status quo continued. Once again, I'd like to propose recognizing hwc as a separate code from en. The main counter-argument to my first post was that there's no agreed-upon way of separating pure Pidgin from Standard English. I actually disagree with this claim, because material in pure Pidgin does exist. The entire new testament was translated into Hawaiian Pidgin, and they're actually close to finishing the old testament. If you go to a Hawaii bookstore or search in Google Books, you can find more literature in pure Pidgin. The spelling tends to be uniform for most common words, with "where" being "wea" or "the" being "da". "Hamajang" and "da kine" are also uniform, and so are most Pidgin-exclusive terms. Spelling variations may occur but I don't think it stops it from being listed as a language. Wiktionary lists Neapolitan as one, yet spelling in that language is not standardized and many variations of the same word/pronounciation occur.

So what are your thoughts? Also, in case there's any confusion, Coastaline was my previous account. I switched to this current one since I recently recovered it. ― Haimounten (talk) 21:15, 16 October 2020 (UTC)Reply

@Haimounten: These discussions can sometimes take a while. I'd encourage you to merge this into the original conversation and ping people who contributed, rather than creating a new section, which may confuse people new to this conversation. I think I now lean toward reinstating the code, but I still want to hear others' positions. —Μετάknowledgediscuss/deeds 21:44, 16 October 2020 (UTC)Reply
I apologize. I will merge it with the previous discussion. ― Haimounten (talk) 22:24, 16 October 2020 (UTC)Reply
I’m going to ping others just in case this discussion got missed. @-sche @MahagajaHaimounten (talk) 01:57, 3 November 2020 (UTC)Reply
It's a grey area: much of the vocabulary is either identical to, or readily intelligible to a speaker of, standard English, and because Wiktionary is a dictionary (with entries for individual words) rather than a collection of long texts written in the lect (the way a hwc.Wikisource or even hwc.Wikipedia would be), differences in things other than vocabulary (such as grammar) are relatively less prominent / significant. However, I'm inclined to treat it as its own language (a) because of the argument that its origin suggests it's a distinct lect rather than a variant of English, and because treating it as its own thing would be consistent with how we treat Jamaican Creole or West African pidgin(s) as their own lects, and (b) because we're not meaningfully covering it as English, and I can't see us starting to: I can't see anyone putting up with wen da police guys wen come to da jail, dey neva find peopo as a usex on police, and we don't do a good job of indicating when a "general English" word is also used in a dialect, so while peopo might have its own entry with {{lb|en|Hawaiian Pidgin}}, it's unclear how anyone would know that e.g. police was used in Pidgin, since we wouldn't label it {{lb|en|US|ncluding|Hawaiian Pidgin|and|Midwest|and|NYC|and|New England|and|Southern US|and|AAVE|Canada|UK|Ireland|Australia|NZ|India}}. And we have someone who wants to add content in it and it does have its own ISO code, which we're no longer sure of our decision to retire. - -sche (discuss) 05:26, 3 November 2020 (UTC)Reply
I'm still in favor of including hwc as a separate language, if for no other reason than that "wen da police guys wen come to da jail, dey neva find peopo" is not a sentence of English, not even a nonstandard variety. —Mahāgaja · talk 12:48, 4 November 2020 (UTC)Reply
OK, I restored it, under code hwc and name "Hawaiian Creole" (since Ngrams suggests that's more common, and WP suggests it's also more correct, than Hawaiian Pidgin). I will say, though, there are things less "English" than "wen da police..." which we treat as en, like various dialects of England. (To pluck two examples out of the texts in the EDD, "A dooat like t'thowts a bin ower gyversum an hankeran eftre it", "Lükee zee tü 'er, 'er'th agot a rat! My eymers, 'ow 'er shak'th 'n!" These are far from the most extreme examples, just two examples I found in a quick search.) - -sche (discuss) 17:57, 4 November 2020 (UTC)Reply
(Notifying @Coastaline, Haimounten that the code has been restored.) - -sche (discuss) 17:59, 4 November 2020 (UTC)Reply
I would not necessarily be opposed to creating language codes for some traditional dialects of England as they are certainly as different from standard English as Scots and Yola are. —Mahāgaja · talk 19:22, 4 November 2020 (UTC)Reply


RFM discussion: October 2018–February 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Rename yrl

We currently call this "Nhengatu", but Nheengatu is where we've put our actual lemma for the language name, and it does seem to be more common in English per BGC. —Μετάknowledgediscuss/deeds 19:09, 21 October 2018 (UTC)Reply

Support. - -sche (discuss) 17:59, 14 November 2018 (UTC)Reply


RFM discussion: August–October 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


The names of most languages which belong to the Northern branch of the Jê family as they are now are not the ones actually in use (either by linguists or by their speakers), and for a couple of languages even Glottolog (let alone Ethnologue) gets it right. I suggest:

I would also be happy to know if I can make any of the above changes myself without being an admin (I am new to Wiktionary so not everything is clear to me yet). Degoiabeira (talk) 01:19, 23 August 2020 (UTC)Reply

In answer to the last question, on a technical level you can't make these changes without being an admin or Template Editor, since they require changing Module:languages. But at least three admins have eyes on this now (User:Mahagaja, User:Metaknowledge, and me) so any changes that need to be made will get made after discussion here. :)
As for the suggested renames: AFAICT Suyá is much more common than Kĩsêdjê (or Kisedje) overall, though most occurrences refer to the tribe, and many of the occurrences which refer to the language are from 25+ years ago, as you brought up on Mahagaja's talk page. Even then, judging by google books:Suyá OR Kĩsêdjê OR Kisedje language, the two names seem roughly equally(?) common in books from the last 25 years. On one hand, Kĩsêdjê being the autonym speaks in favor of it; on the other hand, Suyá being historically more common speaks in favor of it. (From the perspective of ease of input, both names contain diacritics, so neither is per se easy to type.) The situation with Kayapó / Mẽbêngôkre is much the same.
For Parkatêjê, it seems like that name might indeed be most common, aided by the fact that there are so many alternatives ("Gavião Perkatêjê", bare "Gavião", "Gavião do Pará", etc) that no one name is that common AFAICT.
For Apinayé, judging by both books and scholarly journal articles, Apinayé seems to be the most common name even in recent sources, and it is also most common among the sources Glottolog lists which are specifically about the language and which use one spelling or the other in their names, so that is fine as-is and should not be renamed.
I will look into Pykobjê and Tapayuna.
Creating a code for Proto-Northern Jê seems like a good idea (there are reconstructions of it), especially if you will be adding entries in it or mentioning it in etymologies of other words. :)
Pinging User:Ungoliant_MMDCCLXIV, in case you have anything to add.
- -sche (discuss) 07:41, 29 August 2020 (UTC)Reply
I assume the other admins I mentioned above are following this, having participated in it when it was on the user talk page of one of them, but I'm going to re-ping User:Ungoliant MMDCCLXIV because I recall that pings must(?) be added in the same line as signatures in order to work. - -sche (discuss) 07:48, 29 August 2020 (UTC)Reply
@-sche: The mention doesn't have to be in the same line as the signature, but mw:Manual:Echo seems to say that it has to be in a chunk of added lines that contains a signature. — Eru·tuon 19:22, 31 August 2020 (UTC)Reply
Not much more than what you’ve researched yourself. Ultimately each language will have to be researched individually to find its ideal canonical name. I do recall reading that our canonical name of a native Brazilian language was considered an ethnic slur by the people who speak it. I think it was one with a bird’s name, maybe Gavião or Urubu Kaapor.
I’ll add that I believe the main criterion in choosing a canonical name should be the name used in English-language media only. If multiple English-language names exhibit roughly the same amount of use, only then should its status as an autonym or not be considered as a “tiebreaker” (as should practical concerns such as the non-use of diacritics). — Ungoliant (falai) 22:21, 29 August 2020 (UTC)Reply
Thanks for looking into it. The situation with Kayapó vs. Mẽbêngôkre is arguably not the same, however: the label Kayapó has been ambiguously used for designating both (i) a specific Mẽbêngôkre-speaking people (that is, to the exclusion of the Xikrin) as well as the dialect they speak and (ii) as a synonym of Mẽbêngôkre (that is, encompassing both the Kayapó and the Xikrin). In this sense, Mẽbêngôkre is preferrable because it helps avoid this ambiguity. Kayapó and Xikrin could be listed as varieties of Mẽbêngôkre (perhaps as different between each other as are Quebec French and Hexagonal French). Regarding the remaining comments, I realize I might be an interested party (I've been doing comparative and some descriptive work on Northern Jê for ~6 years and I've heard too many linguists' and native speakers' comments regarding their preferred ways to spell the names of these languages), so I guess I'll just wait for a consensus among unbiased Wiktionarians to form. I can also confirm that the Proto-Northern Jê reconstructions you might have seen out there are most probably mine and yes, I would be willing to add the respective entries (guess it's not a problem as long as the reconstructions are published in peer-reviewed outlets). Degoiabeira (talk) 18:43, 4 September 2020 (UTC)Reply
@Degoiabeira I got a bit distracted, but returning to this, I added a code for the family Northern Jê, sai-nje, and a code for the reconstructed language Proto-Northern Jê, sai-nje-pro.
You may already be familiar with how to use them, I don't know, but if not: reconstructions in it would go the Reconstruction namespace (Reconstruction:Proto-Algonquian/askyi is an example entry in that language), and are mentioned in etymologies in the way you see in e.g. hàki, while the family could can be used if you don't know the precise language (as in sagamité).
- -sche (discuss) 10:12, 12 September 2020 (UTC)Reply
I've also added a code for Tapayuna, sai-tap and Pykobjê language as sai-pyk. - -sche (discuss) 10:41, 12 September 2020 (UTC)Reply
Brilliant, thank you! Degoiabeira (talk) 04:56, 5 October 2020 (UTC)Reply


RFM discussion: March–July 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Chuvantsy and Chuvan

Hi, I propose renaming Chuvantsy (xcv) to Chuvan. The pro of this renaming is the consistency between the WM projects (both the English wikipedia and Wikidata favour this name). Furthermore it's a more simple name, that is easily deductible from the native speakers' ethnonym. I don't personally see any downside to this, since all the primary (descriptive) sources on this language are written in Russian, and the term Chuvan seems to be just as, if not more, often in use as Chuvantsy. Thadh (talk) 23:43, 22 March 2021 (UTC)Reply


Judeo-Arabic edit

(See Category_talk:Arabic_language#RFM_discussion:_September_2016–August_2021.) - -sche (discuss) 04:24, 3 October 2021 (UTC)Reply

RFM discussion: August 2020–July 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Adding Fogaha

It's time to start chipping away at the mess that is our Berber codes. Wikipedia considers the Sokna language to be spoken in three oases: Sokna, Fogaha, and Tmessa. According to van Putten, the claim of any Berber variety spoken in Tmessa is wholly unsubstantiated. That leaves Sokna and Fogaha, which were each documented by different Italian scholars using their problematic style of notation, but do seem to be significantly different nonetheless. I suggest that we add a code ber-fog for Fogaha (variously spelled "Fuqaha", "Foqaha"; the definite article is often tacked on, including by its only documenter Paradisi, who wrote "El-Fogaha", but this seems contrary to one's expectations when it comes to alphabetisation). —Μετάknowledgediscuss/deeds 07:39, 10 August 2020 (UTC)Reply

Support. (I can even find such spellings as "Fojaha", "Al-Fojaha", "Fodjaha",...)
Off of the immediate topic, but re Tmessa: I can find quite a few overviews which mention Tmessa, but AFAICT none provide detail about it. The Oxford Handbook of African Languages (Rainer Vossen, Gerrit J. Dimmendaal, 2020), page 285, says it is merely undocumented, not nonexistent, as part of a lengthy lament about Berber ("an unconvincing classification has been provided by Aikhenvald and Militarev (1991), [...] at many points this classification seems to be arbitrary. Moreover, at points it classifies dialects which are fully undocumented (e.g. Tmessa in Libya) or which are not Berber at all (e.g. Tadaksahak, which is Northern Songhay, and the Kufra oases, which are Teda-speaking). Unfortunately, some of the main lines of this classification have been taken over by the Ethnologue"). I reckon these changes are fine, though, since if any documentation subsequently emerges attesting Tmessa, it'll either show that it should be (re)included in an existing code or given its own. - -sche (discuss) 21:28, 11 August 2020 (UTC)Reply
Here's a blogpost that explains the deal with Tmessa. As for the spellings with (d)j, they merely betray people who have made a faulty transcriptional assumption because they can't actually read the Arabic script... —Μετάknowledgediscuss/deeds 22:17, 11 August 2020 (UTC)Reply


RFM discussion: August 2020–August 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Overhaul of Berber language codes

The current state of our Berber (Amazigh) codes is the ISO codes without any modifications made to their scheme, depsite its many problems. The Berber languages have comparable diversity to Romance, but specialists are distinctly uncomfortable with the idea of dividing up what is mostly a dialect continuum into over 25 languages, as Ethnologue does, despite the fact that nobody disputes this practice for Romance (even though on grounds of mutual intelligibility, it's hard to say why, say, Galician and Portuguese should be considered separate). For the most part, I think we should follow the Romance practice and be (mild) splitters; in cases like Tashelhit and Central Atlas Tamazight, you wouldn't gain much by merging (as most entries would still be on separate pages), and the remarkable internal dialectal diversity in these "languages" means that good coverage will still require a great deal of dialect-tagging. I have worked through the Berber classification used by Kossmann (2013) below, using his numbered "blocks", with my notes on changes to our scheme that I recommend.

  1. Two languages, Zenaga [zen] and Tetserret [tez], both separate from the continuum and each other. No changes needed.
  2. The Tuareg group, which has a macrolanguage code [tmh], but also codes for Tahaggart [thv], Tawallammat [ttq], Tayart [thz], Tamasheq [thq]. Tamasheq is meant to encompass both Adagh and Taneslemt, and the sole sedentary Tuareg lect, Ghat (spoken in Libya) is missing a code. I struggled to find data on mutual intelligibility; Adam (2017) indicated that speakers from Ghat had a lot of trouble understanding non-Libyan dialects, but that this might be a recent phenomenon. There is a written Tuareg standard using Tifinagh, although it does not indicate vowels and does not seem widely used any more. Clearly, the current state is redundant, and I think collapsing into a single "Tuareg" would be workable, but perhaps not ideal.
  3. South-Central Moroccan group, Tashelhit [shi] and Central Atlas Tamazight [tzm], both written in Tifinagh. There is also a poorly defined Judeo-Berber [jbe], which I don't want to get into (but reminds me of the issues with Judeo-Arabic). These are all fine, but ISO also gave a politically-motivated code for "Standard Moroccan Tamazight" [zgh], which is a redundant literary koiné that I have already posted a request to delete above.
  4. Northwestern Moroccan group, Ghomara [gho] and Senhaja de Srair [sjs]. No changes needed.
  5. The Zenatic group, and here is where things get messy.
    • The existing codes in the West are Tarifit [rif], Chenoua [cnu], Tachawit [shy]. Missing are Iznasen (sometimes considered "Eastern Riffian", so presumably merged under [rif]) and Eastern Central Atlas Tamazight (a kind of contact language between [tzm] and [rif], presumably merged under [tzm]).
    • In the Moroccan and Algerian oases, Tagargrent [oua], Tugurt/Temacine [tjo], Taznatit [grr], Tumzabt [mzb], and Tidikelt [tia] (which is almost undocumented!) all have codes, thus excluding Sud-oranais (principally the Figuig dialect, which has a grammar). Mutual intelligibility in this cluster of dialects is known to be high, and this seems ridiculously oversplit; I would recommend one code, although I'm not sure what to call it. Ethnologue calls it "Mzab–Wargla", which are only two of the dialects and is obviously not great, while Kossmann calls it "Northern Saharan oasis" Berber, which is highly clunky, and my "North Saharan Berber" is a protologism.
    • In the East, there is a single code for Sened [sds], an extinct dialect in Tunisia, and no codes for the extant Tunisian varieties (principally Djerbi), nor for the closely related Zuwara dialect over the border in Libya. I recommend that [sds] be repurposed as "Tunisian Berber", and a new code be added for Zuwara (also known as "Zuaran"), which has a grammar by Mitchell; this could be considered oversplitting, but it would be odder for "Tunisian Berber" to have entries that are almost all from Libya.
  6. The remaining languages of the East belong to various blocks, and are mostly covered appropriately: Kabyle [kab], Nafusi [jbn], Siwi [siz], Sokna [swn], Ghadames [gha], and Awjila [auj]. The only one missing is (the presumably extinct, and thus finite) Fogaha, which I recommended for a new code in its own section above.

Feedback would be appreciated, especially on the naming of North Saharan Berber. Note that I intend to give etymology-only codes for all the dialects that end up getting merged. —Μετάknowledgediscuss/deeds 22:04, 10 August 2020 (UTC)Reply

Re North Saharan Berber: I think naming it "Northern Saharan Berber" is OK; we're not inventing the grouping, it's a descriptive name, and it's similar to and simply trims an excess word off of an existing name used in literature. - -sche (discuss) 21:06, 11 August 2020 (UTC)Reply
Reviewing Tuareg further, I see that Heath (2005) refuses to take a stand: "it is very difficult to decide whether we are dealing with a single "Tuareg" language (with many dialects), or two or more languages (each with some internal dialectal variation)." That said, he endorses the four-way distinction (where "Tahaggart" should be renamed to "Tamahaq" to include the Kel Ajjer), supports the bundling of Tamasheq, and makes no mention of Ghat, presumably lumping it into Tamahaq as well. The 4 lects with codes all have good dictionaries (ttq and thz share a dictionary, but all forms are marked for which lect they belong to), and 3 of the 4 have grammars, so they won't be hard to sort out or attest. As a result, I now definitively lean toward retiring the macrolanguage and keeping the split codes. —Μετάknowledgediscuss/deeds 06:26, 26 August 2020 (UTC)Reply
I still haven't dealt with this, and I found reason to merge Tuareg instead of splitting it. Sudlow (2008) says: "Tamasheq has split into three major dialects which the Tamasheq themselves refer to as ‘sha’, ‘za’ and ‘ha’, or Tamasheq, Tamajeq, and Tamahaq." His grammar itself supports the utility of that view, as it treats two dialects spoken together (often in a mixed form) in Burkina Faso, which would have to go under two different language codes in the splitter scenario. We would have to dialect-tag our entries carefully, of course. —Μετάknowledgediscuss/deeds 23:33, 8 March 2021 (UTC)Reply
Kossmann, who is a leading Berberist, offers this judgement on Tuareg: "In spite of important differences, it would be exaggerated to consider these variants distinct languages." This makes me feel very secure in merging them, so I will finally get all of this in working order soon. —Μετάknowledgediscuss/deeds 19:41, 24 August 2021 (UTC)Reply
  • I have made all the changes recommended in this section. I chose [mzb] as the code for Northern Saharan Berber, and preserved the Tuareg dialect codes as etymology-only codes (adding an etymology-only code [ber-ght] for Ghat). I added Zuwara as [ber-zuw]. I also merged Judeo-Berber into Tashelhit, following our practice with Judeo-Arabic (see Category:Judeo-Berber). —Μετάknowledgediscuss/deeds 05:37, 25 August 2021 (UTC)Reply


RFM discussion: July–October 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Renaming [dba]

We currently call this language "Bangi Me"; the only grammar of it, and most recent scholarly work, uses the spelling "Bangime" instead. —Μετάknowledgediscuss/deeds 19:14, 14 July 2021 (UTC)Reply

Support. (I notice Appendix:Bangime word list already uses the unspaced form, unlike the entry Bangi Me and categories.) - -sche (discuss) 04:57, 15 July 2021 (UTC)Reply


RFD discussion: August–October 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for deletion (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Proto-Dardic entries

Currently there are 2, but none are sourced. ·~ dictátor·mundꟾ 19:03, 26 August 2021 (UTC)Reply

That alone is not a reason to delete them. Perhaps @Victar will comment. —Μετάknowledgediscuss/deeds 03:30, 6 September 2021 (UTC)Reply
So long as the child entries are on the Sanskrit entry, I don't mind if they're deleted, not because they're unsourced, but because Proto-Dardic is controversial. --{{victar|talk}} 06:33, 6 September 2021 (UTC)Reply
Delete per
Wiktionary:Beer_parlour/2020/August#Unsourced_and_poorly_formatted_Proto-Dardic_entries_by_2401:4900:448F:3291:0:3F:8DB4:7F01_(talk)
Also, Reconstruction:Proto-Dardic/múkʰa- is poorly formatted. Kutchkutch (talk) 11:25, 6 September 2021 (UTC)Reply
To clarify, I did not nominate them for deletion simply because of them being unsourced, but because of the language: seeing that some Proto-Indo-Aryan entries have also been deleted lately. ·~ dictátor·mundꟾ 06:56, 7 September 2021 (UTC)Reply
@Inqilābī, Victar, Kutchkutch, Svartava2, Bhagadatta: Does this mean that we should not be reconstructing Proto-Dardic at all? If so, the real solution would be to remove the protolanguage code entirely — is that desired? Also, before they can be deleted, someone has to move the descendants. —Μετάknowledgediscuss/deeds 22:05, 8 October 2021 (UTC)Reply
@Metaknowledge I would be supportive of that. No problem in removing it. I'll move the descendants. Svartava2 (talk) 02:54, 9 October 2021 (UTC)Reply
@Metaknowledge: The descendants had already been moved one month ago, but I am not sure if we should get rid of the lang code, as it’s used in the Descendants section. @Bhagadatta, Kutchkutch, Victar: While the Proto-Dardic entries are to be deleted, should we keep the reconstructed forms in the Descendants section? @Svartava2 has removed the terms without prior discussion. ·~ dictátor·mundꟾ 06:48, 9 October 2021 (UTC)Reply
I did see when I thought of moving the descendants that they were already moved. We can simply show Dardic in the descendants, name of the family (cat:Dardic languages) instead of {{desc|inc-dar-pro|-}}. Svartava2 (talk) 06:54, 9 October 2021 (UTC)Reply
@Inqilābī, Svartava2: Even if the protolanguage code is removed and the reconstructed forms in the Descendants section are not needed, the Dardic label would still be necessary to separate those languages from other Sanskrit descendants. Kutchkutch (talk) 13:23, 9 October 2021 (UTC)Reply
@Metaknowledge: Proto Dardic did most likely exist but there aren't enough published sources to cite while creating reconstructions. The study in the area is quite murky, with some authors even confusing them with Nuristani languages (Nuristani languages are geographically close to the Dardic languages but form a separate subfamily within Indo-Iranian unlike Dardic which is part of the Indo-Aryan subfamily). So yeah, Wiktionary would presently lose nothing by removing the inc-dar-pro code. It's like having Proto-Bengali-Assamese or Proto-Tamil-Kannada. If some good reliable sources that deal in Proto-Dardic become available later on, we can always add the code back. -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴) 13:02, 23 October 2021 (UTC)Reply
  • I have removed the Proto-Dardic language from the module and its entries. Note to archiver: please archive this discussion to WT:LTD. —Μετάknowledgediscuss/deeds 20:38, 23 October 2021 (UTC)Reply
    @Bhagadatta, Inqilābī, Svartava2 The errors in Category:Pages with module errors that need to be addressed. Should every instance of the code inc-dar-pro in descendants trees be replaced by Dardic: or should this stage be removed altogether? Kutchkutch (talk) 12:43, 25 October 2021 (UTC)Reply
    @Kutchkutch for now it's ok to change all these to Dardic:, so that for is little change (just Proto- and recons term/[Term ?] removed) and because we wouldn't have to change the ordering like if there is "{{desc|inc-dar-pro}}" before Gujarati we'd have to move the, say, Kashmiri descendant after Gujarati one in alphabetical order; hence its easier to be done by bot (most of these replacements were done by NadandoBot). From now it's up to the editor what to do; like with Bengali-Assamese:diff and * {{desc|as}} * {{desc|bn}} * {{desc|syl}} are both valid and neither is wrong. However I do think that if there is only one desc like it would be better to show that one separate without label Dardic:. Svartava2 (talk) 15:09, 25 October 2021 (UTC)Reply


RFM discussion: April 2020–March 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


An absurd language name apparently copied from glottologue. This is not even attestable, nor its synonym Levantine Bedawi Arabic, and no native would ever use this to create an entry. “Bedawi” is = bedouin. The “characteristics” in the Wikipedia article Northwest Arabian Arabic linked are in every or many Arabic dialects. There are difference in urban and desert-dweller speech everywhere but one sorts these speeches under the regiolects, so it would be Egyptian Arabic, South Levantine Arabic, South Levantine Arabic, Najdi Arabic. Fay Freak (talk) 13:11, 5 April 2020 (UTC)Reply


RFM discussion: April 2020–March 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Remove aao “Algerian Saharan Arabic

Unattestable or SOP language name fudged by the for-profit language database Ethnologue; the linked Wikipedia pages will always stay stubs. The Verkehrsanschauung sorts this under Algerian Arabic. There is a continuum with Moroccan Arabic. Pinging @Fenakhay following our considerations at Talk:زرودية; nobody has ever thought such a category. One can attest the term “Saharan Arabic” but this is of course not meant to denote one language. As distinguished from Hassānīya Arabic one could need codes eastwards for some dialects spoken in the Sahara and Sahel, like for Malian and Nigerien Arabic – Chadian Arabic we have as shu. Fay Freak (talk) 13:31, 5 April 2020 (UTC)Reply

I think the idea here is that Algerian Arabic as spoken in Algiers belongs on a continuum that speakers in the southern and western desert of Algeria (inclusing many who are not ethnically Arab) do not belong on. I don't see the use of this code, though, because that speech definitely counts as Hassaniya. (As for the other countries on the fringe, we seem to count Nigeria and Cameroon under Chadian Arabic, which is less than ideal, but then again, I think that our whole system needs to be more like Chinese to be functional at all.) —Μετάknowledgediscuss/deeds 18:18, 5 April 2020 (UTC)Reply


RFM discussion: April 2020–March 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/62589227#Merge_[aec]_“Saidi_Arabic”_into_[arz]_“Egyptian_Arabic”|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [aec] “Saidi Arabic” into [arz] “Egyptian Arabic

Hardly attestable language name. Occurrences for it turn out as “Saʿiidi Arabic” for example. “Upper Egyptian Arabic” (I don’t see use of “Upper Egypt Arabic”, but this is a detail) is sometimes posited, variously ill-defined because of superstratum influence from the Nile Delta, but this is usually not accepted as distinct language and can well be an sum-of-parts term; Arabic Wikipedia clearly says Ṣaʿīdīy Arabic is “within the Egyptian Arabic dialects”. There are but some isoglosses dividing Arabic-speaking Egypt latitudinally, but so one can shed urban and bedouin speech and Muslim and Christian speech. English Wikipedia says “the realisation of /q/ as [ɡ] is retained (normally realised in Egyptian Arabic as [ʔ]” but this is the speech of the desert vs. the speech in the largest cities and [ɡ] for ق (q) can be found in northern Egyptian bedouins. Most natives of Upper Egypt would use the code for Egyptian Arabic, and I would too sometimes with label “Upper Egypt” or more specific labels. I have created the single entry in “Saidi Arabic” and only because there was this code. Fay Freak (talk) 13:54, 5 April 2020 (UTC)Reply


RFM discussion: April 2020–March 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/62589227#Merge_[ayn]_“Sanaani_Arabic”,_[acq]_“Ta'izzi-Adeni_Arabic”,_[ayh]_“Hadrami_Arabic”_into_a_new_“Yemeni_Arabic”|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merge [ayn] “Sanaani Arabic”, [acq] “Ta'izzi-Adeni Arabic”, [ayh] “Hadrami Arabic” into a new “Yemeni Arabic”

The categories are ill-defined; the second one is unattestable. Yemen is a dialectologically complicated area – the material is even partially collected outside of Yemen, as for instance the Dictionary of Post-Classical Yemeni Arabic 1990 is surveyed in Israel – with many isoglosses and the number of distinct languages is multiplied at will if one is oblivious of SOP designations of lects so for this reason, apart from nobody being able to safely use these categories, it is not wise to distinguish at the L2 level already. If you look into the Dialect Atlas of North Yemen and Adjacent Areas published by Brill four years ago and probably also Saint Elbakyan you will forget those categories and won’t be able to map content onto the codes. Apart from that the distinction is inconsistent with the fact that we have a code for “Judeo-Yemeni Arabic” jye – as if Jewish speech in Yemen were one language while Muslim speech were three! I just mention here that I find “Judeo-Arabic” and its sub-languages suspicious, it could all be only Classical Arabic or Arabic dialects with some peculiarities written in Hebrew script, remaining from a time when Wiktionary did not use {{spelling of}} for this purpose. The Jews might deal with it themselves. Fay Freak (talk) 14:23, 5 April 2020 (UTC)Reply

I pinged you in a separate discussion above about Judeo-Arabic. As for Yemen... it is true that the codes are oversplit, but it is also true that the WAD attests to the fact that when Hejazi and Omani speakers disagree about a word, chances are that North Yemen agrees with the Hejaz or Egypt, and that South Yemen agrees with Oman. The existence of Piamenta's dictionary gives me hope for treating Yemen as a unit, but I don't know how he does it (do you have a PDF?). —Μετάknowledgediscuss/deeds 18:33, 5 April 2020 (UTC)Reply
Nay, I do not. Fay Freak (talk) 18:36, 5 April 2020 (UTC)Reply
I find this “South Yemen” and “North Yemen” part difficult and think that one needs an elevation profile of Southwest Arabia to dissect the Arabic dialects of and close to Yemen. Fay Freak (talk) 18:41, 5 April 2020 (UTC)Reply
Deboo's Jemenitisches Wörterbuch is another good example of treating all Yemeni varieties together, with scrupulous dialect-tagging. —Μετάknowledgediscuss/deeds 01:54, 29 March 2021 (UTC)Reply


Adding Bwala edit

This doesn't seem to merit a discussion, but I am putting this here for posterity: I am adding Bwala, a previously undocumented Bantu language, as [bnt-bwa] based on Bollaert et al. (2021). —Μετάknowledgediscuss/deeds 20:05, 21 December 2021 (UTC)Reply

RFM discussion: August 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Yavapai vs. Havasupai-Walapai-Yavapai

We are currently treating Yavapai nai-yav as a separate language from Havasupai-Walapai-Yavapai yuf, but some of our yuf entries (e.g. 'ha) are tagged as Yavapai. Is there any difference between nai-yav and the Yavapai dialect of yuf? If not, can we merge nai-yav into yuf? And if we do want them separate, can we rename yuf "Havasupai-Hualapai" (as per Wikpiedia) or "Havasupai-Walapai" rather than keeping a different language's name in there? We only have one entry for nai-yav, namely ke#Yavapai. @-sche seems to be the only person who's dealt with these languages in the past. —Mahāgaja · talk 11:07, 30 August 2021 (UTC)Reply

Interesting; I dug into the edit histories, and I added Yavapai in February 2016, at which time I also added a word. I added Walapai-and Yavapai-specific content under the trinitarian name Havasupai-Walapai-Yavapai a couple months later; I must've been looking to add the Walapai content, saw we had a code for Havasupai-Walapai-Yavapai, since Stephen G Brown had previously added Havasupai-Walapai-Yavapai content like T:yuf-personal pronouns and entries for those pronouns without dialectal labelling, and forgot about the Yavapai-specific code.
Wikipedia has Havasupai-Walapai and Yavapai as distinct but similar languages (a 1978 United States Indian Claims Commission decision says "the Walapai and Havasupai [...] spoke a language and had a culture very similar to the Yavapai"); OTOH, Thomas Sebeok's Native Languages of the Americas, volume 1 (2013), page 467, cites them as an example of ethnic groups that fought "even though [...they] spoke the same Yuman language". Christopher Moseley's Encyclopedia of the World's Endangered Languages (2008) calls the language "Upland Yuman" ("Upland Yuman is a Yuman language"), spoken by the "Hualapai, the Havasupai, and the Yavapai, the last traditionally divided into four regional subtribes. Each community speaks a distinct variety, with the Yavapai varieties forming a well defined dialect, although all varieties are mutually intelligible with little difficulty." I suppose they should be merged. - -sche (discuss) 18:53, 30 August 2021 (UTC)Reply
@-sche: OK, I moved our one nai-yav entry to yuf, deleted all the nai-yav categories, and deleted nai-yav from the language modules. —Mahāgaja · talk 10:39, 31 August 2021 (UTC)Reply


RFD discussion: November 2021–January 2022 edit

 

The following discussion has been moved from Wiktionary:Requests for deletion (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Knaanic hasn't been definitively proven to exist as a distinct language yet, and its few attestations (and entries here) seem indistinguishable from standard Slavic languages in Hebrew transcription. Here's a pretty good article by Dovid Katz on the topic. airy—zero (talk) 03:22, 4 November 2021 (UTC)Reply

Knaanic is a useful holding category for the scattered attestations of West Slavic words in Hebrew script, which have to be included in the dictionary some way or another. This discussion doesn't belong at RFDO, because there is no chance we will delete all those entries, which pass CFI and definitely belong here. The question is whether we want to merge Knaanic into another language code, presumably Old Czech, although in all honesty, I rather doubt that all the Knaanic attestations can really be called Old Czech in good faith. The difficulty of deciding where they ought to go, and whether people would even think to look for them there, is why Knaanic exists as a separate language here in the first place. —Μετάknowledgediscuss/deeds 06:11, 4 November 2021 (UTC)Reply
Oh, okay — thanks for the explanation! airy—zero (talk) 12:57, 4 November 2021 (UTC)Reply

RFD-resolved. If you want to merge lects, propose something in WT:BP. — Fytcha T | L | C 12:13, 22 January 2022 (UTC)Reply

RFM discussion: January–February 2022 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Adding Early Modern Korean (ko-ear) as an L2

(Notifying TAKASUGI Shinji, Atitarev, HappyMidnight, Tibidibi, Quadmix77, Kaepoong, Mahagaja, Metaknowledge): @Fish bowl, @Solarkoid, @Lunabunn, @Echo Heo (feel free to tag anyone else). Currently, Early Modern Korean (EMK) is listed as an etymology-only language, limiting its usage as a header and leading all EMK terms to be listed under the "Korean" header with an "Early Modern" label. However, through several discussions on the matter (thank you @ Tibidibi), it's been made more and more clear, that it needs to be separated out from Modern Korean in order for its coverage to be done well. Here are a few reasons (feel free to make any corrections if needed):

  1. Contemporary Korean Linguistics and Korean dictionaries almost consistently separate out EMK as a separate language. For Korean linguistics done in Korean, the word 근대국어 (geundaegugeo) is used exclusively for it, and it's analyzed as its own. As for English sources, Cho, Sungdai; Whitman, John (2019), Korean: A Linguistic Introduction and Lee, Ki-Moon; Ramsey, S. Robert (2011), A History of the Korean Language along with several other resources also make the distinction.
  2. Early Modern Korean used a rather different orthography compared to Korean with letters like (st) and (yoi) becoming obsolete after that time period, leading Modern Korean speakers without training in the language to struggle to comprehend it.
  3. There are terms that are solely used in EMK such as 툐상 (tyosang) that deserve to have their own space and coverage rather than thrown under the "Korean" header and left alone.
  4. Currently our etymologies are unclear in terms of showing the development of words from Middle Korean (MK) to EMK to Modern Korean (MoK) such as in ᄃᆞ리〮다〮 (tòlítá), or often skipping EMK altogether due to the lack of clarity. Additionally, with words such as 구무 (kwumwu), we'd be able to show more clearly the relation between EMK terms and their dialectal descendants.
  5. Separating EMK out will allow us to more easily create pronunciation & transcription modules for the language (such as what was done with Jeju in Module:jje-pron & Module:jje-translit), rather than the somewhat inconsistent transcriptions that have been used in the past or attempting to include all the EMK use-cases within the current system.

Overall, having Early Modern Korean as its own L2 would greatly improve its coverage if done well, make life easier for the Koreanic editor community, make our content clearer and more precise for readers, and finally put Wiktionary in line with Koreanic Linguistics as a whole. AG202 (talk) 09:06, 23 January 2022 (UTC)Reply

@AG202Strong support on all points. As a way of division, I suggest that:
  • The split between Middle Korean and EMK is generally a hard line at 1600, but the few texts that use MK-only characters may be treated as MK.
  • The split between EMK and modern Korean is more fluid.
    • Texts made by foreign learners of the language in the late nineteenth century and onwards are considered modern Korean, despite being written in an EMK orthography, because they are often the first attestations of archetypally modern Korean words not attested in EMK, such as 때문 (ttaemun) and 언니 (eonni).
    • 개화기 국어 (開化期國語, gaehwagi gugeo), the language of “modernizing” texts from the period from c. 1890 to 1910, is considered modern Korean despite having EMK orthography, because they are the first attestations of many of the Japanese orthographic borrowings (wasei kango) totally absent in EMK but a defining feature of modern Korean vocabulary.
    • “Traditional” texts from the 1890–1910 period, such as personal letters of provincial people less affected by Westernizing trends or traditional novels untouched by Western and Japanese influence, are still considered EMK. The quotation at 섬섬옥수 (seomseomoksu), despite the late date, is a typically EMK material and would hence be quoted at Early Modern Korean 셤셤옥슈 (syemsyemwoksyu).
  • For IPA, I suggest aiming at a 1750 pronunciation, with the assumption that the vowel qualities were the same as in MK.--Tibidibi (talk) 09:39, 23 January 2022 (UTC)Reply

Being that it's been three weeks, the only active EMK editor is in strong support, and there hasn't been any opposition (along with tagging all of the Korean editors in the working group), I'd like to close this as RFM-Split. @Erutuon, @Surjection, if you have the time, please. Edit: Apologies for forgetting to sign, and thank you @ J3133, for pinging them for me. AG202 (talk) 01:07, 13 February 2022 (UTC)Reply

  Done. Entries need to be moved over manually though — SURJECTION / T / C / L / 11:32, 14 February 2022 (UTC)Reply


RFM discussion: April–May 2022 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Formosan Kulon-Pazeh (uun) split to Kulon (uon) and Pazeh (pzh)

According to this ISO change request (Kulon-Pazeh Change Request Documentation), Kulon (uon) and Pazeh (pzh) are now separate languages. How do we update Wiktionary to add data specifically for pzh from now on (given that data is nonexistent for the long-extinct Kulon language). I'm in the process of adding lemmas and example sentences for several of the western Formosan languages. Kangtw (talk) 19:57, 3 April 2022 (UTC)Reply

@Austronesier Chuck Entz (talk) 04:20, 4 April 2022 (UTC)Reply

Given that all (uun)-lemmas are actually Pazeh lemmas, the easiest thing would be to just to rename "Kulon-Pazeh" to "Pazeh". –Austronesier (talk) 19:06, 6 April 2022 (UTC)Reply
Well, and to recode entries to use the right code. (Is there really no data on Kulon?) Other ISO code changes discussed at Wiktionary:Beer_parlour/2022/March#2021_ISO_code_changes btw. - -sche (discuss) 22:16, 6 April 2022 (UTC)Reply
Follow-up: Wiktionary:Etymology scriptorium/2022/May#ISO_639-3_code_uun_split_into_uon_an_pzh. - -sche (discuss) 17:52, 7 May 2022 (UTC)Reply
See that discussion for more, but: I added uon and pzh; once instances of uun are updated, it can be removed. - -sche (discuss) 20:38, 8 May 2022 (UTC)Reply


RFM discussion: Moving Finno-Ugric families to Uralic families edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Moving Finno-Ugric families to Uralic families

We don't recognize Finno-Ugric as a valid family; just Uralic. Hence urj is a valid code, but fiu isn't. Nevertheless, we're using fiu- as the prefix for four branches: fiu-fin for Finnic, fiu-mdv for Mordvinic, fiu-prm for Permic, and fiu-ugr for Ugric. I propose we use urj- for these instead, thus moving as follows:

  • fiu-finurj-fin
  • fiu-mdvurj-mdv
  • fiu-prmurj-prm
  • fiu-ugrurj-ugr

At the same time, we should move the codes for the corresponding protolanguages:

  • fiu-fin-prourj-fin-pro
  • fiu-mdv-prourj-mdv-pro
  • fiu-prm-prourj-prm-pro
  • fiu-ugr-prourj-ugr-pro

as well as the code for the etymology-only lect Proto-Finno-Permic:

  • fiu-fpr-prourj-fpr-pro

I suppose we can keep fiu-pro as an etymology-only variant of urj-pro if it's important. What do others think? —Aɴɢʀ (talk) 14:29, 23 December 2016 (UTC)Reply

That seems like a lot of disruption for a small theoretical benefit: we've always used codes like aus, cau, nai and sai that we don't recognize as families for making exception codes, so it's not a huge violation of our naming logic. In this case, though, it looks to me like we don't recognize fiu more because it's too much like urj, not because it's invalid, per se (though I don't know a lot on the subject). We do have gmw-fri rather than gem-fri, for instance. Of course, I'd rather follow those who actually work in this area- especially @Tropylium. Chuck Entz (talk) 20:59, 23 December 2016 (UTC)Reply
I have no opinion on moving around family codes either way (it doesn't seem they actually come up much, whatever they are), but if we start moving around the proto-language codes, I would like to suggest simple two-part codes. Proto-Samic and Proto-Samoyedic are already smi-pro and syd-pro, so is there any reason we couldn't make do with e.g. fin-pro, fpr-pro, ugr-pro etc.?
Also, as long as we're on this topic, at some point we are going to need the following:
  • Proto-Mansi: (ugr-/urj-?)mns-pro
  • Proto-Khanty: (ugr-/urj-?)kca-pro
  • Proto-Selkup: (ugr-/urj-?)sel-pro
No rush though, since so far we do not even have separate codes for their subdivisions. The only distinction that comes up in practice is distinguishing Northern Khanty from Eastern Khanty (Mansi and Selkup only have one main variety that is not extinct or nearly extinct). --Tropylium (talk) 10:53, 24 December 2016 (UTC)Reply
There is, actually, a reason: our exception codes are designed to avoid conflict with the ISO 639 codes, so they start with an existing ISO 639 code or a code in the qaa-qtz range set aside by ISO 639 for private use. fin is one of the codes for the Finnish language. fpr and ugr are apparently unassigned- for now. As for the three proto-language codes, those don't need a family prefix because they already start with an ISO 639 code. Chuck Entz (talk) 13:06, 24 December 2016 (UTC)Reply
The only codes in the qaa-qtz range we actually use are qfa as a prefix for otherwise unclassified families and qot for Sahaptin (a macrolanguage that wasn't given an ISO code of its own), right? —Aɴɢʀ (talk) 07:55, 26 December 2016 (UTC)Reply
Right. And I didn't notice that we used qot although it is not an ISO code; it seems we followed Linguist List in using it. For consistency, I suggest changing it to fit our usual scheme, so nai-spt or similar (nai-shp is already in use as the family code). - -sche (discuss) 05:10, 27 December 2016 (UTC)Reply
Support, for consistency. fiu is different from nai, because fiu is [a supposedly genetic grouping which is] agreed to be encompassed by a higher-level genetic family which also has an ISO code (urj), and that code can be used if we drop fiu. nai and sai are placeholders rather than genetic groupings, and they're useful ones, because If we dropped them we'd had to recode everything as qfa- (and might conceivably run out of recognizable/mnemonic codes at that point). - -sche (discuss) 05:10, 27 December 2016 (UTC)Reply
Support per -sche, both the main issue being suggested here, as well as recoding Sahaptin. —Μετάknowledgediscuss/deeds 05:57, 19 January 2017 (UTC)Reply
I've recoded Sahaptin and all the Finno-Ugric lects except fiu-fin-pro which requires moving a lot of categories, which I will get to later. - -sche (discuss) 17:20, 8 March 2017 (UTC)Reply
I can volunteer to carry out the move from fiu-fin-pro to urj-fin-pro if there's still consensus that it should be done (which there is, as far as I can tell). — SURJECTION / T / C / L / 09:19, 12 January 2023 (UTC)Reply

Now there are lots of module errors in Cat:E as a result of these language code changes. It might be easiest to fix them by bot. @DTLHS, what do you think? — Eru·tuon 22:45, 9 March 2017 (UTC)Reply

I'll see what I can do. DTLHS (talk) 23:09, 9 March 2017 (UTC)Reply
@Erutuon I've done a bunch of them- I think the reconstructions should be fixed by hand. DTLHS (talk) 23:22, 9 March 2017 (UTC)Reply
@DTLHS: Three incorrect language codes remain, I think: fiu-ugr, fiu-fpr-pro, urj-fin-pro. I couldn't figure out what fiu-fpr-pro should be; it seems to refer to Proto-Finno-Permic, but I searched various language data modules and didn't find a match. Is there someone who can look through and fix the remaining module errors that relate to incorrect language codes? @Tropylium, @Angr, @-sche? — Eru·tuon 04:52, 11 March 2017 (UTC)Reply
Proto-Finno-Permic is an etymology-only language (and a kind of a legacy concept) that we encode as a variety of Proto-Uralic, if that helps. --Tropylium (talk) 13:58, 11 March 2017 (UTC)Reply
The categories were easy to deal with: you just change the {{derivcatboiler}} to {{auto cat}} and the template plugs in the correct language code, if it exists. That also makes it a quick way to check whether there is a correct language code. by the time I finished that, there were only a dozen or so entries left in CAT:E due to everyone else's efforts, so I finished off the remainder by hand. It would have been easier if there hadn't been hundreds of other module errors cluttering up CAT:E- yet another reason for you to be more careful. Chuck Entz (talk) 21:10, 11 March 2017 (UTC)Reply
I apologise for not catching and fixing those uses at the time I renamed the codes. I searched all pages on the site for each of the old codes, and some pages turned up [including pages where the codes were used inside some templates, and I fixed those pages], so I forgot to also do an "insource:" search to catch other uses inside templates like {{m}}. We so rarely change language codes compared to changing language names. - -sche (discuss) 23:36, 11 March 2017 (UTC)Reply
Moved, some earlier, but Finnic is now also in process of being moved from fiu-fin to urj-fin. — SURJECTION / T / C / L / 17:38, 14 January 2023 (UTC)Reply


RFM discussion: New Permic codes edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


New Permic codes

Hi, I would like to request three new codes:

  • Perm for Anbur
  • prm-koo for Old Komi (Writing systems: Perm and Cyrs; Family: urj-prm)
  • prm-kya for Komi-Yazva (Writing system: Cyrl; Ancestor: prm-koo; Family: urj-prm)

Furthermore, the newly made prm-koo should be given as the ancestor of Komi-Permyak (koi) and Komi-Zyrian (kpv). The writing system Perm should be removed from both these languages.

The specifics of the codes' namings can be tweaked if there are any concerns. Pinging @Tropylium. Thadh (talk) 20:18, 16 November 2021 (UTC)Reply

We already have Perm for the Old Permic script. Are you requesting that we change its canonical name to "Anbur"? —Mahāgaja · talk 08:04, 17 November 2021 (UTC)Reply
Oh, couldn't find it for some reason... No , in that case it's okay. Thadh (talk) 08:20, 17 November 2021 (UTC)Reply
Done. Thadh (talk) 13:17, 15 January 2023 (UTC)Reply


RFM discussion: Mari phylogeny edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Mari phylogeny

It seems Eastern Mari (chm) is treated as an ancestor of Western Mari (mrj). Historically, that makes little sense though, because both are different written standards of the same continuum, so this is about akin to setting Nynorsk as a descendant of Bokmål, Komi-Permyak as a descendant of Komi-Zyrian or setting Livvi as a descendant of Karelian. I think we should set both as direct descendants of Proto-Uralic (urj-pro), or, alternatively, create a code for Proto-Mari. I know too little about the history of Mari languages to say anything sensible on that topic. Anyway, pinging some editors that might be interested, @Tropylium, Atitarev, -sche. Thadh (talk) 18:03, 20 February 2022 (UTC)Reply

Previous discussions at Wiktionary:Beer parlour/2013/September#Merging Mari and Buryat varieties and Wiktionary:Language treatment/Discussions#Merging Buryat dialects; also, merging Mari dialects. —Mahāgaja · talk 18:25, 20 February 2022 (UTC)Reply
I changed the one line in Module:languages/data3/m that derived mrj from chm. AFAICT this is all that needed to be changed? - -sche (discuss) 04:31, 1 June 2022 (UTC)Reply
Yeah, that was the minimum requirement. I thought maybe if there were any solid Proto-Mari reconstructions, we could add a code for that, but seeing as nobody answered it's better to leave that for the future. Thadh (talk) 07:18, 1 June 2022 (UTC)Reply

Making a family code edit

@-sche, Tropylium, Mahagaja Would anyone oppose moving chm to mean "Mari" (as a family), creating chm-pro for Proto-Mari and adding the iso-intended code mhr for Eastern Mari? If I understand correctly, @Surjection has volunteered to handle the bot jobs if needed. Thadh (talk) 18:52, 7 November 2022 (UTC)Reply

No objection from me. —Mahāgaja · talk 18:57, 7 November 2022 (UTC)Reply
No objection from me. Looking into the history, it seems we used the macrolanguage code chm for the standard variety (Eastern/Standard Mari) because of this discussion years ago, so I'll ping User:Atitarev who was involved in that old discussion. - -sche (discuss) 19:23, 13 November 2022 (UTC)Reply
No objection from me either. Anatoli T. (обсудить/вклад) 22:03, 13 November 2022 (UTC)Reply
I'll begin the move probably later today, since there are no objections. — SURJECTION / T / C / L / 08:57, 11 January 2023 (UTC)Reply
Move ongoing. Please archive this discussion to Wiktionary talk:Language_treatment/Discussions, under the heading "Mari phylogeny". — SURJECTION / T / C / L / 11:34, 11 January 2023 (UTC)Reply


RFM discussion: January 2023 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Change L2 language name: Kurtop to Kurtöp

The Kurtöp language is a small Sino-Tibetan language spoken in Bhutan. We currently omit the umlaut from the name, but I suspect this is simply because ISO language names don't have diacritics in them. On the other hand, academic texts such as A Grammar of Kurtöp and An Overview of Kurtöp Morphophonemics do use it, and I think we should follow that trend. Theknightwho (talk) 15:57, 4 January 2023 (UTC)Reply

Support. Thadh (talk) 16:58, 4 January 2023 (UTC)Reply
Support. Vininn126 (talk) 17:01, 4 January 2023 (UTC)Reply
Support. — Fenakhay (حيطي · مساهماتي) 17:03, 4 January 2023 (UTC)Reply

RFM moved. Theknightwho (talk) 10:51, 14 January 2023 (UTC)Reply

RFM discussion: November–December 2022 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Change "Middle Mongolian" to "Middle Mongol" (language code xng)
Discussion moved from WT:BP.

Literature on the form of common Mongolic spoken between the 13th and 16th centuries increasingly uses the term "Middle Mongol" rather than "Middle Mongolian". This is primarily because it was the ancestor to several extant Mongolic languages, not all of which are spoken by people who we would usually call Mongolians (though they are sometimes considered part of the Mongol people - especially historically). For example, Buryat (spoken in Buryatia, just north of Mongolia) and Kalmyk (spoken in Kalmykia, around 4,000km to the west of Mongolia). There are several others.

I also feel that this would (slightly) reduce the faulty implication that Mongolian is the "primary" descendant of Middle Mongol, too. To give a comparison, using the name "Middle Mongolian" is roughly equivalent to using the name "Old Russian" for Old East Slavic: obviously less than ideal.

This is supported by Glottolog[9], and I can provide recent academic sources using the term if needed. Theknightwho (talk) 18:08, 4 November 2022 (UTC)Reply

Support, though these proposals are usually done in WT:RFM. AG202 (talk) 18:15, 4 November 2022 (UTC)Reply
  Support Hromi duabh (talk) 14:32, 25 November 2022 (UTC)Reply
  Support Just wondering when we're changing Old Portuguese to Old Galician-Portuguese for similar reasons...--Ser be être 是talk/stalk 21:16, 10 December 2022 (UTC)Reply

RFM moved. Theknightwho (talk) 15:26, 23 December 2022 (UTC)Reply

RFM discussion: July 2022–June 2023 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Rename Ulukwumi (ulb) to Olukumi, add Yoruba as an ancestor for Lucumí (luq), & the case of Itsekiri (its)

As we're moving towards the creation of a more robust coverage of Yoruboid languages, see: agutan for a rough start, we'll need some changes made earlier rather than later.

  • "Olukumi" is the name often, if not most used, to describe the language, unsure why Ethnologue and SIL went with "Ulukwumi". Some sources: Entry for Olùkùmi in the Olùkùmi Talking Dictionary, Olukumi Bilingual Dictionary, Ngram.
  • Lucumi (luq) should be changed to have its name have the accent mark on the "i" becoming "Lucumí", and then being that it is a liturgical language derived from Yoruba, it should have Yoruba as its ancestor.
  • Additionally, there's the case of Isekiri (its), which looks as if it's spelled "Itsekiri" the most in English, including by the main driver of teaching the language nowadays "Learn Itsekiri". The spelling "Isekiri" seems to come from the spelling in the language which involves a , but being as that does not exist in English, it's often been adapted as a "ts". At the moment, I am leaning towards the "Itsekiri" spelling, though I would like more input if possible.

Those are the changes that we need for now, and hopefully we can continue increasing the coverage of these languages. AG202 (talk) 20:30, 6 July 2022 (UTC)Reply

(Notifying Oníhùmọ̀, Oniwe, Egbingíga): AG202 (talk) 15:12, 21 July 2022 (UTC)Reply
Sorry ooo, I don't usually pay attention to the notifications from Wiktionary and Wikipedia, I always think that it'll be difficult to use. About Ìṣẹkírì, in my opinion I think that you write it as Ìtsẹkírì because it isn't normal for them to write it with "ṣ" like us Yorùbá do. Egbingíga (talk) 02:44, 16 September 2022 (UTC)Reply
I went ahead and changed Olukumi, per the persuasive evidence above. Not sure whether Lucumí really needs the accent; ngrams might suggest Lucumi was recently overtaken by Lucumí but it's not as overwhelming as with Ulukwumi not even getting enough hits to plot. - -sche (discuss) 23:25, 23 July 2022 (UTC)Reply
@-sche Thank you for the changes! Hmmm for Lucumí, I'm still leaning towards including the accent, but I see your point. I do wish that there were more folks active here to comment on this though. AG202 (talk) 19:52, 13 August 2022 (UTC)Reply
@-sche, seeing as though I'm just now seeing Wiktionary:Beer parlour/2019/September § Isekiri or Itsekiri? which had unanimous support, I'd think it'd be good to go ahead and make the change for Itsekiri. AG202 (talk) 15:19, 11 September 2022 (UTC)Reply
I think Lucumí should be used as it more accurately transcribes how it is said Egbingíga (talk) 02:52, 16 September 2022 (UTC)Reply
With this, I'm going to push forward with the changes. AG202 (talk) 22:26, 5 March 2023 (UTC)Reply
@Benwing2, is it possible that we could move forward on these ones as well? The Olukumi change has already been made, but the others have not. AG202 (talk) 22:16, 21 March 2023 (UTC)Reply
Isekiri → Itsekiri is now   done. — SURJECTION / T / C / L / 12:17, 30 May 2023 (UTC)Reply
Thanks! Lucumi → Lucumí is also now   done. Closing this discussion since I think everything was covered, and will archive in a week. AG202 (talk) 03:11, 1 June 2023 (UTC)Reply


RFM discussion: October 2019–August 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Canonical name of "fan"

Our canonical name for fan is "Fang (Guinea)", which is unfortunate since it isn't spoken in Guinea. It's spoken primarily in Gabon and Equatorial Guinea. I'd recommend calling it "Fang (Gabon)" since there seem to be more speakers in Gabon than in Eq.G. and since the name of Gabon is shorter. —Mahāgaja · talk 11:53, 17 October 2019 (UTC) I'd recommend calling it "Fang (Equatorial Guinea)" since according to Ethnologue there are more speakers in that country than in Gabon. —Mahāgaja · talk 12:09, 22 October 2019 (UTC)Reply

The fact that there are even some speakers in Cameroon (according to Wikipedia) but it's not the same as "Fang (Cameroon)" is icing on the confusion-cake... and they're both Bantoid languages, so disambiguating by family doesn't help, and both spoken in Central Africa, so we can't disambiguate by mere region as we sometimes do. - -sche (discuss) 16:00, 24 October 2019 (UTC)Reply
@-sche: They are different branches of Bantoid, though. We could call [fak] "Fang (Beboid)" and [fan] "Fang (Bantu)". —Mahāgaja · talk 22:04, 24 October 2019 (UTC)Reply
Ah, true; I saw that Wikipedia classified Fang language (Cameroon) as "Western Beboid" but caveated that it was not necessarily a valid family, but if Beboid overall is valid, then that works and is clearer (IMO) than picking just one of the countries to list. I would support renaming them in that way. The only other languages that come to mind which are disambiguated by family are "Austronesian Mor" and "Papuan Mor" both spoken in West Papua), which are mentioned on WT:LANG; probably we should change those to "Mor (Austronesian)" and "Mor (Papuan)" for consistency. - -sche (discuss) 18:00, 25 October 2019 (UTC)Reply
I was going to rename Austronesian and Papuan Mor to use the 'parenthetical, postpositive' naming format for consistency with the Fangs and with how languages with country-name disambiguators are named, but I notice we also have e.g. Austronesian and Papuan Gimi, Austronesian and Sepik Mari (and some other Maris), and several other such languages, and I don't have time to rename all of those, so consistency will have to wait. (There is also "Sepik Iwam", but it appears to actually get called that, to distinguish it from the other Sepik language called Iwam which is spoken in the same place and belongs to the same Iwam subfamily of Upper Sepik. Confusing!) - -sche (discuss) 08:41, 28 November 2019 (UTC)Reply
The reason I haven't archived this is that several other languages, mentioned above, still need to be renamed to fit the format of the other languages. I or someone else just need(s) to find the time... - -sche (discuss) 01:48, 6 August 2020 (UTC)Reply


RFM discussion: May–September 2023 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Proto-Witotoan and Proto-Huitoto-Ocaina

Currently, the way they are treated makes no sense: Proto-Huitoto-Ocaina is handled as the parent of Proto-Witotoan and nothing else. In our word list, Proto-Witotoan is used to denote Proto-Bora-Witoto, which is a macroreconstruction that is very speculative. I propose we merge these into one language, Proto-Witotoan, which seems to be the more common term.

Notifying @-sche, Mahagaja. Thadh (talk) 15:32, 9 May 2023 (UTC)Reply

No objection here. —Mahāgaja · talk 15:36, 9 May 2023 (UTC)Reply
If I recall and understand correctly (?) the current setup is based on your request here, where I noted the issue of the broader/older grouping having been set in some entries as the child of a smaller/younger grouping. No objection to changing it. I see that a handful of Murui Huitoto entries use Proto-Witotoan in their etymologies, but these should be unaffected by removing Boran from its scope. - -sche (discuss) 16:53, 9 May 2023 (UTC)Reply
Oof, I didn't even recall that discussion - I guess I was a bit too hasty, sorry. If there are no further objections, I'll just manually remove and/or change the reconstructions from the Murui Huitoto etymologies and fix this. Thadh (talk) 17:05, 9 May 2023 (UTC)Reply
Merged. Thadh (talk) 20:34, 8 September 2023 (UTC)Reply


RFM discussion: August 2015 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


West African Pidgin English varieties

Ethnologue has assigned codes to some but not all of the varieties of West African Pidgin English, and we in turn have incorporated some (e.g. pcm) but not all (e.g. not gpe) of those codes. As WP notes, the "contemporary English-based pidgin and creole languages are so similar that they are sometimes grouped together under the name 'West African Pidgin English'" (a name which also denotes their predecessor which developed in the 1700s). WP's examples are illustrative, particularly in that its Ghanaian and Nigerian Pidgin English examples are identical. I propose to merge at least the following three varieties into wes, renaming it "West African Pidgin English":

  1. Ghanaian Pidgin English (gpe)
  2. Nigerian Pidgin English (pcm)
  3. Cameroonian Pidgin English (wes)

We could also discuss whether or not to merge Sierra Leone Krio (kri, which WP notes its often mistaken for English slang due to its similarity to English, but which has a somewhat distinct alphabet), Pichinglis / Fernando Po Creole (fpe), and Liberian Kreyol / Liberian Pidgin English (lir). - -sche (discuss) 21:11, 11 August 2015 (UTC)Reply

The question is a very complex one. Firstly (but of least importance), scholars are divided on which lects have creolised and which have not, but it is generally agreed upon that at least some of the language you mentioned are not pidgins, which would make the name "West African Pidgin English" somewhat of a misnomer (the more neutral name "Wes-Kos" have been suggested as an alternative, but even linguists haven't fully adopted it). Secondly, all these lects are remarkably similar on a lexical level, but that's unsurprising; after all, they resulted from separate but very similar language contact events, and then probably modified each other (one scholar posits that Krio and Cameroonian Pidgin English relexified each other to some degree after pidginisation). The similarities are also obscured by the fact that there is nothing close to an agreed orthography for most of these, and pronunciation does differ a bit across West Africa. Linguistically, I'd probably merge them all, but practically that may not be the best decision. I know we have entries in pcm, but probably next to nothing for the rest, and if somebody wants to add them, given how each lect is very neatly assigned to a certain West African country, at least it won't be confusing for them to do so. Conclusion: the literature is schizophrenic, the lects mutually intelligible, and the existing situation remarkably unproblematic. Therefore I abstain. —Μετάknowledgediscuss/deeds 21:19, 16 August 2015 (UTC)Reply

{{look}}

RFM discussion: July 2016–December 2020 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I see no evidence that this exists as a separate language, and move that it be merged with tr. The literature which references it seems to describe the dialect of Turkish which may be spoken by Gagauz people in the Balkan Peninsula. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)Reply

Wikipedia, citing Ethnologue, insists that Balkan Gagauz Turkish, Gagauz, and Turkish are all separate, and a few sources do seem to take that view, e.g. Cem Keskin, Subject agreement-dependency of accusative case in Turkish, or, Jump-starting grammatical machinery (2009) speaks of "Balkan Gagauz Turkish, Gagauz, Turkish, Iraqi Turkmen, North and South Azerbaijani, Salchuq, Aynallu, Qashqay, Khorasan Turkic, Turkmen, Oghuz Uzbek, Afshar, and possibly Crimean Tatar". Other references speak of Balkan Gagauz Turkish as a variety of Gagauz, e.g. James Minahan's Encyclopedia of the Stateless Nations says "The Gagauz speak a Turkic language [...] also called Balkan Gagauz or Balkan Turkic, [which] is spoken in two major dialects, Central and Southern, with the former the basis of the literary language. Other dialects [include] Maritime Gagauz" (which comports with w:Gagauz's list of its dialects). Matthias Brenzinger's Language Diversity Endangered also treats Balkan Gagauz "or slightly misleading, Balkan Turkic" in his entry on Gagauz, but says it that the Balkan "varieties might deserve the status of outlying languages but very little information is available about them." (A few generalist references seem to subsume all gag into tr.) I would leave them all separate, pending more conclusive evidence that they should be merged. - -sche (discuss) 23:58, 3 July 2016 (UTC)Reply
I think there's some confusion about what exactly we're talking about, and whether it's Gagauz or Turkish. Just because they use the term "Balkan Gagauz Turkish" doesn't mean that they're referring to the language with ISO 639-3 code bgx. When I look at who's citing the references listed for bgx at Glottolog, Manević (the reference for its classification) is cited in papers clearly talking about the dialects of tr. These are the only actual words attributed to this lect that I can find. —Μετάknowledgediscuss/deeds 00:33, 4 July 2016 (UTC)Reply
@Tropylium, on the subject of Turkic languages spoken in Europe, do you know anything about this one, and about its differences or similarity to Gagauz and standard Turkish? - -sche (discuss) 01:08, 11 May 2017 (UTC)Reply
I'm not previously familiar with this dispute, but here are a few handbooks on the topic:
  • Menges in The Turkic Languages and Peoples has the following slightly complicated quote (p. 11): "The Turkic languages spoken farthest west are the Balkanic dialects of Osman and Gagauz in Bosnia, Bulgaria and Macedonia. These seem to form two groups, one of possibly pre-Osman origin, and a later Osman one. To the former belong the Gaǯaly in Deli-Orman (Eastern Bulgaria), who, according to V. A. Moškov, are descended from the Päčänäg, Uz, and Torci (?), the Surguč, numbering about 7000 people in the district (vilājät) of Edirnä, who call themselves Gagauz. In Moškov's opinion, they, too, go back to the Päčänägs (?) and the Macedonian Gagauz; they number ca. 4000 people in southeastern Macedonia." — It seems clear that some group(s) corresponding to "Balkan Gagauz" is being identified here, but I am not even sure how to parse the sentence structure; e.g. are "Uz" and "Torci" some of the pre-Osman Turkic groups, or some of the alleged ancestors of the Gaǯaly? ("Osman" is, of course, Turkish.)
  • Hendrik Boeschoten in a classificatory chapter in Routledge's The Turkic Languages mentions that "a few speakers [of Gagauz] in northern Bulgaria, Romania and Greece, adhere to the Orthodox faith, and have their own history." This again seems to refer to "Balkan Gagauz", but with no indication of being its own language.
So far I would gather from this that "Balkan Gagauz" is at most a sister language of "non-Balkan Gagauz", and perhaps indeed just a different dialect group (perhaps one whose features are not reflected in written standard Gagauz). But the Manević 1954 paper would be more informative on this topic, if anyone wants to hunt it down. --Tropylium (talk) 11:55, 11 May 2017 (UTC)Reply
I think Balkan Gagauz should be merged with gag, especially since it contains no entries. The few terms that would be specific for Gagauz spoken outside of the traditional Gagauz area in Moldova/Romania/Bulgaria can be dealt with within gag entries. The only thing is that some etymologies of other Turkic languages sometimes refer to Balkan Gagauz instead of Gagauz, because editors didn't know the difference between two. Otherwise I don't see any problems with merging them two.
On the other hand, Gagauz should definitely NOT be merged with Turkish, that is pretty obvious to me.Allahverdi Verdizade (talk) 05:09, 9 September 2018 (UTC)Reply
@Metaknowledge This is a hard question, I can offer only guesswork.
I can't find any good maps for the distribution of Gagauz and (Muslim) Turks proper in the Balkans, most don't show Balkan Gagauz at all although we know they exist at least in Bulgaria and Macedonia.
It seems that they are not easily separated geographically from Muslim Turks although they presumably live in different localities. I'm guessing this means that their languages ("Balkan Gagauz Turkish" and "Rumelian Turkish") could be the same, although maybe only the latter call their language "Turkish", so I guess that they (would?) use Standard Turkish in education and administration.
This would be a good argument to merge Balkan Gagauz into Turkish, except that this paper shows that Balkan Turkic (if this really is a single language) is quite distinct from Anatolian Turkish and perhaps worth considering a different language. Baskakov also considers Balkan Turkish and (Moldovan) Gagauz to form a clade within Oghuz and Anatolian Turkish and Azerbaijani to form another. Crom daba (talk) 21:35, 30 September 2018 (UTC)Reply
@Anylai, can you find anything in Turkish on the possible differences between Balkan Gagauz and Rumelian Turkish? Crom daba (talk) 21:38, 30 September 2018 (UTC)Reply
Merge / delete it. The distribution of the name, the way it is “mentioned”, points towards it being a ghost language. The name is not attestable as used by anyone having particular information about it; nobody can add anything under it either in such a situation where it is a content-filled concept for nobody. Its alleged synonyms “Balkan Turkish” and “Rumelian Turkish” show it is just an SOP term for Turkish as spoken on the Balkans respectively Rumelia, i.e. remnant speakers of the Ottoman rule. German Balkantürkisch, distinguished from Türkeitürkisch as a regiolect. Fay Freak (talk) 13:38, 2 December 2020 (UTC)Reply

{{look}}

RFM discussion: July 2016 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Marrithiyel

Maridan [zmd], Maridjabin [zmj], Marimanindji [zmm], Maringarr [zmt], Marithiel [mfr], Mariyedi [zmy], Marti Ke [zmg]: should these be merged? References speak of a singular Marrithiyel language. - -sche (discuss) 21:30, 20 July 2016 (UTC)Reply

RFM discussion: February–March 2017 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Chinese Pidgin English (cpi)

This is not a separate language at all, it's just English with different grammar and some loanwords, but other than that it's completely intelligible with standard English. As such, it should be moved to Category:Chinese English. -- Pedrianaplant (talk) 15:19, 8 February 2017 (UTC)Reply

That's not at all the impression I get from Chinese Pidgin English. It seems to be a distinct language to me, as much as any other English-based pidgin. —Aɴɢʀ (talk) 16:45, 8 February 2017 (UTC)Reply
We did delete Hawaiian Pidgin English in the past though (see Template talk:hwc). I don't see how this case is any different. -- Pedrianaplant (talk)
I know we did, but I didn't participate in that discussion (only 3 people did), and I disagree with it too, probably even more strongly than I disagree with merging cpi. —Aɴɢʀ (talk) 17:02, 8 February 2017 (UTC)Reply
  • Basically, this is a terminological problem. There may have been a true pidgin in each of these cases, but it has not been recorded. What is called a pidgin in many descriptive works is instead a dialect of English that is very easy to understand, nothing like the real English-based pidgins and creoles that I have studied. If you look at the actual quotations used to support lemmas in Chinese Pidgin English, you find that it is Chinese English. Support merge, but leave [cpi] as an etymology-only code. —Μετάknowledgediscuss/deeds 23:16, 8 February 2017 (UTC)Reply
  • At least some texts seem very distinct, to the point of unintelligibility; consider "Joss pidgin man chop chop begin" (Whedon's translator begins chopping things? or "god's businessman begins right away"?). On the other hand, other sentences given by Wikipedia are quite intelligible...and possibly not attestable under the stricter CFI to which English is subject. I'm not sure what to do. (Our short previous discussion also didn't reach a firm resolution.) - -sche (discuss) 17:46, 8 March 2017 (UTC)Reply
    I mean, I use joss and chop chop in English normally (having grown up in a fairly Chinese environment likely has something to do with that)... and I think that was chosen as an especially extreme example. —Μετάknowledgediscuss/deeds 03:32, 25 March 2017 (UTC)Reply


RFM discussion: November 2018–December 2023 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Cleanup suggestions for some badly attested Semitic languages, needing admin action
Discussion moved from Wiktionary:Grease_pit/2018/November#Cleanup suggestions for some badly attested Semitic languages, needing admin action.
  1. Pray somebody add |scripts = {"Narb"} to Module:languages/data3/x after line 1026 for xna. (Otherwise mentions of words in it are shown in slanted letters.)
    Added. DTLHS (talk) 03:17, 14 November 2018 (UTC)Reply
    It seems that even MediaWiki:Common.css needs a new class for Narb added, to get font-style: normal; Sarb is there and has it, Narb is not there. If the mention of a North Arabian word in عَنْكَبُوت (ʕankabūt) works then it is complete. Also I see that in Module:scripts/data Narb does not have direction = "rtl" while Sarb has. Fay Freak (talk) 14:43, 15 November 2018 (UTC)Reply
    Good catch. I've updated Common.css and Mobile.cc and set it to display rtl. Sadly, it seems there are no fonts that display it. If you or I could find a good image of what the letters are supposed to look like, I might have time to make a basic font iff the letters don't have to be joined the way they do in Arabic. - -sche (discuss) 22:08, 18 November 2018 (UTC)Reply
    I as an Archfag recently had a great update three weeks ago that adds displaying support for Old North Arabian, amongst other things like which improved Arabic and Syriac script rendering everywhere. gucharmap calls the name of the font by “Noto Sans Old North Arabian”, which I find in the filelist of the noto-fonts package. @-sche Fay Freak (talk) 22:29, 18 November 2018 (UTC)Reply
  2. I think everything under Category:Old North Arabian script languages should be “Ancient North Arabian” (xna), it is to wonder that Dadanitic (sem-dad), Hismaic (sem-his), Safaitic (sem-saf), Taymanitic (sem-tay), Dumaitic (sem-dum), Hasaitic (sem-has), Thamudic (sem-tha) are separate languages on Wiktionary (some also with no script assigned). (Prolly someone went through some lects and added all he found.) Those lects are at a level of attestion or study where it does not even matter whether they are dialects or languages, and “Thamudic” is even a collective term for any of the Ancient North Arabian lects not further classified. Many inscriptions cannot be classified unto more specific lects anyway (you know, people also were nomads and wrote graffiti here and there) and they can only be entered as “Ancient North Arabian”. With words being found randomly and in concise consonantal writing I don’t see why one would pursue separation other than by stating the find spot.
  3. Also, “Qatabanian” (xqt), “Sabaean” (xsa), “Minaean” (inm), “Harami” (xha, redirects to “Minaean” on Wikipedia), Hadrami (xhd) – likewise otiose distinctions, regarding form and amount of attestion of Epigraphic South Arabian, as the name says only epigraphically attested, without any vowels –, have been unpopular in use already, entries and etymologies use the header “Old South Arabian” (sem-srb). I suggests to cross out those. Etymology-only is possible so one can use those in {{cog}} when in an individual case a word is known to be attested as of one of the dialects. North Arabian epigraphy categorization is more complex and it is better anyway to mention in each etymology where a lexeme has been encountered.
    1. Himyaritic (sem-him), as an attested language, is rather mythical because the Ḥimyarites wrote Sabaean. Wikipedia mentions “three Himyaritic texts”, at the same time in the Encyclopedia of Arabic Language and Lingustics s.v. we read about two: “It is not even possible to establish whether they were written in the same language. The first text dates from around 100 C.E. and the second from around 300 C.E.” And about the secondary material from Early Medieval Arabs: “It is easy to see that quotations from Himyaritic offer very different readings according to the manuscripts.” Or according to others, mentioned in the EALL, Ḥimyarite is the same as Arabic, only with peculiar features (which might as well derive from Arabicized transmission, or later language fusion or whatever, much that could fool us). It could be grouped with those spurious languages if this category held languages from Antiquity.
  4. Gurage is according to Wolf Leslau, it’s most eminent scholar, one language with twelve dialects; others share this view. The material for this language, particularly by Leslau across his works, only lists words as “Gurage”, without qualifying if they are “Inor”, “Mesqan” or some other Gurage, so on Wiktionary one cannot simply give “Gurage” words (which has recently been done in Semitic comparisons by abusing the code of the largest dialect Sebat Bet Gurage, in spite of the source saying “Gurage”). The following dialects I find on en.Wiktionary as languages: Kistane/Soddo (gru), Mesqan (mvz), Sebat Bet Gurage (sgw), Silt'e (stv), Inor (ior), Muher (sem-mhr), Mesmes (mys), Chaha (sem-cha), Wolane (wle), Zay (zwa); some of these are considered subdialects of Sebat Bet Gurage. There are more I don’t find on Wiktionary. It’s perhaps like with the Aramaic dialects yore or the Low German dialects today. People publish Westphalian dictionaries but it’s still Low German and so treated by Wiktionary. I suspect that instead of holding controversial subdivisions deriving from Ethnologue we should, holding to the sources, keep the Wiktionary-language level higher. The source for a certain word can be further qualified by labels as with Coptic. I mean that with language, unlike with biological taxonomy, one cannot simply assume that distinctiveness of a taxon is ascertained by experiments and then authoritatively published in some reference. As the individual forms are described in this dictionary, one must weigh if the data allows distinction at all. Currently it looks to me that hence Gurage must be lumped; I don’t know if, with new data or emerging different literary standards, separating the lects with separate codes will later be convenient (the increase in language material will be disappointing and unlikely someone will come and add Gurage in thousands of entries anyway, let’s be realistic), but I doubt that it would be comfortable. See also Why is Old Novgorodian a separate language in Wiktionary? This is the question: Is the difference in data enough to justify separation? The actual language-dialect distinction does not matter, it must be seen functionally, for dictionary purposes, for dictionary purposes. And if linguists publish material as “Gurage” the distinction is probably not good for Wiktionary headers. Isn’t it out of scope of Wiktionary to distinguish lect clusters when they are generally unwritten and chiefly written by and variously lumped and splitted by linguists? That’s a difficult question. Also I fear that such distinctions might be precisely the cause why nobody comes and pours out his rich Gurage knowledge. An adept would not be sure to distinguish, pendulating between two extremes, not witting if he should split as much as he can by all kinds of criteria or if to standardize and to abstract. To help though first all mentioned codes need the Ge'ez and Latin script both assigned, and the macrolanguage created. Maybe there will be late order from early ambiguity. Though I would perhaps do the order by lumping and labelling by location, were I that certain aficionado.
The obese Wiktionary:List of languages currently comprising 8055 lects needs cuts however. Fay Freak (talk)
This discussion really belongs at rfm, because that's where we normally discuss changes to whether or how we recognize a language. The Grease pit is for discussing how to implement something along those lines- not whether it should be implemented. The other option would be at the Beer parlour, but this seems like something that would benefit from the more specialized focus of rfm. Chuck Entz (talk) 03:39, 14 November 2018 (UTC)Reply
Good distinction. I hesitated at 4:13 AM where to put it because of the mixed content. Moved. Fay Freak (talk) 14:16, 14 November 2018 (UTC)Reply
Some prior discussion of Thamudic et al is on Category talk:Hismaic language; IIRC they were separated because literature does mention them as distinct entities, but if they were very similar or often treated as one language, and especially if there's difficulty in assigning specific texts to specific ones due to similarity, that would be an argument for reversing that decision and going back to the conservative approach of treating them all as one language with 'dialect'/'region' labels where appropriate.
(As to the venue, yes, these discussions tend to happen on RFM for quirky historical reasons — originally the discussions entailed actually merging or splitting language templates — although some have proposed the Beer Parlour as a more logical venue. There are minor benefits and drawbacks to either venue; this venue does have the advantage that discussions stay on the page until resolved.) - -sche (discuss) 17:20, 14 November 2018 (UTC)Reply
I avoided Beer Parlour because I thought it is only for matters already affecting people, but it would not affect anyone we know now. Fay Freak (talk) 14:43, 15 November 2018 (UTC)Reply
Who is likely to have access to resources on Africa's Semitic languages that could help judge what to do with Gurage? User:Metaknowledge, User:Wikitiki89? Wikipedia insists "The Gurage languages do not constitute a coherent linguistic grouping", which seems incompatible with merging them. William A. Shack, in his book on The Gurage, writes that "each Gurage dialect is usually understood only by its own speakers, and there is a rough correlation between the contiguity of dialect groups and the extent to which their dialects are mutually intelligible." (Steven Danver, in his (general-focus) encyclopedia, says "the languages of the different groups of Ethiopian Gurage are seldom mutually intelligible.") Marvin Lionel Bender, in his 1976 Language in Ethiopia, says "Although seventeen varieties of Gurage dialects are listed, mutual intelligibility reduces this to four languages and three dialect clusters as follows (Hetzron classification):
  Gogot, Misqan, Muxir, Soddo
  East Gurage (Inneqor, Silti, Urbareg, Weleni, Zway)
  Central West Gurage (Chaha, Gumer, Gura Izha)
  Peripheral West Gurage (Ener, Geto, Indegegn, Innemor)"
However, his very next sentence is: "Gogot, Muxir, Soddo comprise a geographical (non-genetic) grouping of non-mutually-intelligible languages known as 'North Gurage'", all of which seeems to suggest that merging all of the Gurages would not be sound.
- -sche (discuss) 17:28, 14 November 2018 (UTC)Reply
The cited grouping of course adds to the confusion. Three languages, but four dialects clusters, not mentioning their intersections? Well, we will not find out how one should see them without deep-diving. But the question is which direction Wiktionary should go: likely the current division is not correct. Should Wiktionary just add all possible splits so they can be cleaned up later when someone would commit himself to add the whole Gurage and judge about which distinctions are most convenient or should we have one macro-code because distinction is hopeless? The reason why I have even mentioned Gurage is that for example Leslau’s Etymological Dictionary of Geʿez which I like to use just gives words as “Gurage”, which sounds like there is a common vocabulary. Fay Freak (talk) 14:43, 15 November 2018 (UTC)Reply
Perhaps you can deduce from Leslau's literature list which Gurage language he gets his data from? He seems to have written an etymological dictionary of Gurage as well, presumably its foreword could clear things up.
His own field studies. I hade linked his Etymological Dictionary of Gurage (“according to Wolf Leslau” etc.). Fay Freak (talk) 15:23, 17 November 2018 (UTC)Reply
As a volunteer project (run on fancy), we really have no other choice than to wait for someone to investigate the matter deeply and order the languages in a manner that facilitates their lexicographical work.
Maybe we need non-genetic language group categories and ways to give forms in unindentified languages belonging to language groups. Crom daba (talk) 15:49, 15 November 2018 (UTC)Reply
  • @Fay Freak, -sche: A bit late, but here are my responses to the three outstanding problems (your #2–4):
    1. It is fairly evident that Ancient North Arabian is not a single language, and I advocate that sem-xna be abolished rather than the specific language codes; read Al-Jallad (2018), "What is Ancient North Arabian?". He sees Safaitic (which he has written a grammar of) and Hismaic as being of the same continuum as Old Arabic, but they are obviously too distinct from Classical Arabic for lexicographical purposes. He supports the distinctness of the others as languages, and of the various "Thamudic" lects. Based on Al-Jallad, I would prefer we split Thamudic B, C, D, etc as necessary; each language will have a very small corpus, but it seems like the most honest way to do it, and if more inscriptions are found, the lettered Thamudic wastebaskets will probably get their own names as the others did.
    2. Old South Arabian is also not a single language, though Sabaean was the standard that the other lects imitated, and I advocated that sem-srb be abolished as well. Multhoff (2019) in The Semitic Languages makes the case for four distinct languages: Sabaean, Minaean, Qatabanian, and Hadrami. She makes no mention, however, of Harami. Macdonald (2000), "Reflections on the linguistic map of pre-Islamic Arabia" explains that "Harami" is a name given to a few Sabaean texts that seem to have been contaminated by other Semitic languages, which is not at all an unusual feature and not unique to that site, so I suggest we remove that code.
      1. As for Himyaritic, I now think I was wrong to include it. There are three texts often attributed to it, but see Stein (2008), "The ‘Himyaritic’ Language in Pre-Islamic Yemen", which makes a strong argument to consider these as simply very late examples of Sabaean, which is indisputably the language of the other texts of the region in that script.
    3. Finally, for Gurage, the chief problem is that some scholars follow Hetzron in saying that Gurage is polyphyletic, in which case lumping would be committing a grave error (and the same charge has been levelled for Aramaic, with perhaps more evidence). Meyer (2011) in the International Handbook does seem to support the unity of Gurage, and treats the lects together, which gives me hope for lumping, but he is unwilling to commit to whether they should be considered dialects or languages. I think your Gurage-adding genius is mythical, so we have to choose which is least bad: many languages with scanty coverage, because their forms may be similar to forms entered under a different L2 header; or one Gurage language with decent coverage, but many forms that are not marked for what dialect they belong to and therefore a poor resource. I hesitantly support merger, given those choices. —Μετάknowledgediscuss/deeds 03:13, 10 August 2020 (UTC)Reply
    4. An addendum: "Hadrami" is a terrible name for xhd, and invites confusion with Hadrami Arabic. Wikipedia uses "Hadramautic", but N-grams and a quick literature review suggests that "Hadramitic" is more common. @Fay Freak, -sche again (yes, I know I'm pestering, but I don't want to move forward on all this alone, both because I am fallible and because some of these, particularly splitting OSA, would require a bit of work, although in that case there is an online corpus that will help immensely). —Μετάknowledgediscuss/deeds 02:40, 17 August 2020 (UTC)Reply
Re North Arabian: Many works I browsed through speak of Old North Arabian as a unit with dialects, but also carefully specify what lects (including Thamudic B vs C, etc) words are attested in. Some imply, in their presentations, that a large number of words are identical between dialects, at least in the sample of vocabulary that they're treating (e.g., the pronouns treated in Roger D. Woodard, The Ancient Languages of Syria-Palestine and Arabia (2008), pages 197-198), though this seems to be because the authors are presenting 'normal', normalized and romanized forms, given Al-Jallad's evidence that words (even the supposedly distinctive definite article) varied not just among dialects but even within the writings of individual speakers. The native script also loses many possible differences in pronunciation, but then, we are a written, writing-based dictionary. I find slightly more works speaking of "Ancient North Arabian dialects" than "Ancient North Arabian languages", and the fact that some authors have argued the varieties are the same language not only as each other but even as Arabic itself does suggest a high degree of similarity (or that the scholars in question are lumpers). As we're dealing with small, extinct and apparently clearly delineated corpora, it seems like the conservative approach of treating each under its own L2 could be better, and we could retire xna ... unless we need it as a wastebasket for unsorted things, which Al-Jallad (and Fay Freak, above) suggests we would. (Bah, It's messy business, deciding what's a language and what's a dialect...) I will try to dig into the rest later. - -sche (discuss) 04:10, 19 August 2020 (UTC)Reply

Well myself I have added Sabaean, Minaean, Qatabanian entries meanwhile, understanding and quoting a few inscriptions, although apart from some occasional features I noticed little how such an inscription can be classified as either, other than by provenience or rulers or gods mentioned—but that must be due to my blasé comparative approach that also makes me read Romance without recognizing the individual language. So somehow the volition to a merge is gone, though the lumping codes “Old South Arabian” and “Old North Arabian” must be kept for inscriptions no one has classified. Both are useful.

For Himyaritic, however, nothing is left. As here said already, the three alleged Himyaritic inscriptions don’t even need to be in the same language, and they aren’t even from anything to be called Ḥimyar (there are “Lesser Himyarites” and “Greater Himyarites” and the ethnic identity is fragmentary, too, by the way). In the “Critical Reevaluation” of the Ḥimyaritic language – cited by Wikipedia on Himyaritic language one does not know what for: their “undeciphered-k language” header recently introduced is surely a made-up term, oddly suggesting that these inscriptions are yet another language when those “k-language” inscriptions are exactly those otherwise claimed for Himyaritic, so we see Wikipedia editors had no clue and phantasize together languages due to their disdain for primary sources – helpfully includes a map, also coming to the conclusion “we have no reason to assume the existence of some “non-Ṣayhadic” language in pre-Islamic Yemen that was spoken besides the (Late) Sabaic idiom known from the inscriptions.” That from the fact that “Himyaritic” words typically given from Arabic sources are all also found in Sabaic, and the grammar found in the three inscriptions, including the prefixed instead of postfixed article which is only found in two of them, is too either found in Sabaean or can well be ascribed to their being poetry, which is also the reason for their being poorly understood. Many Arabic poems are also hard to understand and mostly helped by the copious material for the language which is not the case for languages with so limited a corpus, like Old South Arabian. Even in the Digests, Latin prose, not all passages are of discoverable meaning.

What would hinder man though to add understood words with quotes from the ominous inscriptions as Sabaean? Or anything from Arabic sources transmitted as Himyaritic instead of Arabic as Sabaean? For there is no evidence for it being a particular language. You see, from the corpus-based standpoint Wiktionary takes Himyaritic must go. Nothing can get the header “Himyaritic”, it can only be mentioned at Sabaean or Old South Arabian entries that Himyaritic nature is suggested by those who have come to believe in this extraordinary claim for which extraordinary evidence is not provided. Fay Freak (talk) 04:18, 6 August 2021 (UTC)Reply

I went on and moved our only “Ḥimyaritic” entry after that famous sentence to Yemeni Arabic in which the word طَيِّب (ṭayyib) for “gold” turns out otherwise known, and to be nothing else than Classical Arabic طَيِّب (ṭayyib, good) meaning “refined” and therefore gold, while Old South Arabian could not have developed such sense, so it is clear the famous quote one has been so inept to classify is at best only macaronic Sabaean-Yemenite Arabic. It is well put by Marijn van Putten:
The Arab grammarians were interested in describing correct usage of language of Classical Arabic. It is quite clear that Himyaritic (and by extension Yemeni Arabic) did not fall in the category of 'correct usage'. Within this context, it is of course not surprising that anything that is "wrong" and from Yemen might be denoted as Himyaritic. This would then include both varieties of Yemeni Arabic and some surviving vestiges of Ancient South Arabian. Fay Freak (talk) 04:59, 13 September 2021 (UTC)Reply
Now also in a new article by Koutchoukali like communis opinio, though his blogs transpire by him stalking Wiktionary: later Muslim historians would refer to anything related to South Arabia’s pre-Islamic history as “Himyaritic,” all memory of its other states having passed away. Fay Freak (talk) 01:01, 18 September 2021 (UTC)Reply
@-sche, Theknightwho: So are we gonna delete Himyaritic now? I promise there is nothing to add in it, and I have ever been averse to use the code, as on بطيخ; everything one needs to know about it is in the dictionary entry Himyaritic. Everything else in this five-year old topic is resolved:
Old South Arabian and Ancient North Arabian are both necessary evils in addition to individual languages of the Old South Arabian and Ancient North Arabian families because inscriptions are all over the place and sometimes not informative enough for specific identification, so we can’t remove the languages like we have removed Nahuatl, but like for Nahuatls the language treatments use to be sharp. Gurage nobody will reliably untangle in the near future; we may have a family under sem-eth for the “Gurage cluster”. Fay Freak (talk) 03:07, 6 December 2023 (UTC)Reply
OK, per the discussion above I have removed Himyaritic. - -sche (discuss) 14:43, 7 December 2023 (UTC)Reply
I struck the thread then. Fay Freak (talk) 00:52, 8 December 2023 (UTC)Reply


RFM discussion: September 2018–February 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Arawak and Island Carib

Any objections to me renaming Lokono arw (4 entries) and Kalinago crb (0 entries) to Lokono and Kalhiphona, respectively? Arawak is easily confused with the Arawak/Arawakan proto language and family, and Carib is one of two often confounded languages, the Carib language and the Island Carib language. --Victar (talk) 04:03, 6 September 2018 (UTC)Reply

No objection to renaming Arawak, but I'm not sure about Kalhiphona, which seems to be quite rare even on a Google web search, and which seems to invite as much possible confusion (in its various spellings) with the various spellings of Garifuna as it avoids with other "Carib"s. - -sche (discuss) 06:56, 19 September 2018 (UTC)Reply

Arawak → Lokono   Done. Island Carib → Kalinago (rather than Kalhiphona, after further discussion at #Arawak and Carib again)   Done. — Vorziblix (talk · contribs) 20:38, 2 February 2024 (UTC)Reply

RFM discussion: July–August 2021 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Lenape

The decision to deprecate Lenape as a canonical language needs to be revisited. It seems to me that removing Lenape as a canonical language is like removing French as a canonical language and instead having only Quebecois and Parisian. Classifying Wënami and Munsee as seperate canonical languages is a narrow linguistic differentiation that quickly spills into the political. In a 2013 discussion (archived at Category talk:Unami language) the difference between Wënami amd Munsee was compared to English and Scotts. This is a faulty comparison because the political and cultural situation is in no way analogous (and language versus dialect is always a deeply political not just academic subject). The continuum of dialects once spoken in Southern New York, New Jersey, Delaware, and Pennsylvania were on the brink of disappearing forever. What we are witnessing now is a process of standardization that could revive the Lenape language. I hope the Wiktionary community would not want to contribute to fracturing such efforts. Unnecessary rigidity about what is to be considered a canonical language could jeopardize such standardization and revival. I strongly urge everyone to reconsider having a canonical language category for Lenape, and specifying the dialect of origin for specific terms, usages, and grammatical conventions. Note that all orthographies for Lenape are relatively recent inventions based on various European languages, and that standardization here is more than warranted. Andreas.b.olsson (talk) 05:17, 29 July 2021 (UTC)Reply

@-sche Chuck Entz (talk) 05:45, 29 July 2021 (UTC)Reply
Also pinging @DCDuring as a participant in the previous discussion. —Mahāgaja · talk 06:06, 29 July 2021 (UTC)Reply
This is tricky. Some people want to document each language as such, especially with an eye to their past as mutually unintelligible languages (Marianne Mithun, The Languages of Native North America, 2001, page 331; Siebert even seems to suggest Munsee was more closely related or similar to Mahican than to Unami) with extensive differences in orthography and phonology (e.g. in having l vs r in their reflexes of PA's *r/*l phoneme). (Archaeological as well as linguistic evidence suggests the distinction between the two groups goes back to prehistoric times.) Merging northerly Munsee into the now more dominant southern Unami could complicate documentation of that critically endangered language, which some people are studying and learning from its remaining speakers in Canada. On the other hand, the most prominent revitalization efforts in the United States, based mostly on southern Unami, do seem to speak of Lenape as a single language (and FWIW, Canadian Munsee efforts, albeit with a different spelling, also seem to speak of a Lunaape language), apparently aiming to standardize the two into one language with an eye towards keeping it alive into the future. We have to think carefully about what to do here. - -sche (discuss) 17:52, 29 July 2021 (UTC)Reply
I have nothing to contribute beyond the wish for some precision in the etymologies of toponyms, which can easily be accomplished with qualifiers or labels without complicating the creation of good language entries. DCDuring (talk) 18:33, 29 July 2021 (UTC)Reply
You say on Talk:mochipwis that "I'm a learner of modernized Lenape, a language in the process of being revived. I am not an expert in the differences between Wënami versus Munsee. What I do know is that there are no first-language speakers of either dialect left." This goes to the heart of the issue, I think: Wiktionary does not only cover modern languages as they are currently spoken, but also covers languages that existed in the past. Although (as noted in the earlier discussion) I initially created "Lenape" content under a unified code, following the "lumper" approach of the US revivalists, the differences between the lects are (as Chuck noted in that same disscusion) historically extensive, to the point that the modern linguistic literature I've been able to find that says anything about their intelligibility accepts Mithun's statement that they were mutually unintelligible, which militates against combining them. (OTOH, if modern speakers are trying to merge them, that militates in favor of a merger.) Wiktionary does merge e.g. many Sinitic languages even when they're mutually unintelligible when spoken (and conversely we split e.g. Bokmal and Nynorsk even though they're not merely mutually intelligible but the same language), so either approach could be made to work. - -sche (discuss) 19:29, 29 July 2021 (UTC)Reply
@-sche: What about recognizing all three codes? We could have Lenape (either using del or creating a new code for it, e.g. del-len, so as to keep del for the family) for the modern language undergoing revitalization, and also have unm and umu for the historical stages. —Mahāgaja · talk 19:57, 29 July 2021 (UTC)Reply
Adding a link here to the important Lënape Talking Dictionary maintained by the Tribe of Delaware Indians. It should be recognized that the code del (for Delaware) is potentially problematic and controversial. If I say in New York City “I speak a little Delaware” I’m likely to get quizzical looks. But if I say “I’m learning Lënape” an educated resident of the city is likely to remember social studies classes in elementary school, contextualize, and understand what I mean. Wouldn’t it be better to have a new code for modernized Lënape that contains the letters in Lënape? Andreas.b.olsson (talk) 21:56, 29 July 2021 (UTC)Reply
We use the ISO 693-3 codes for languages, so that's out of our control. They're not going to change the code; they've refused for www, which has some significant real-life problems. In practice, the codes may sometimes look like abbreviations, but in theory, there's no necessary connection; it's just a three letter code.--Prosfilaes (talk) 01:30, 30 July 2021 (UTC)Reply
@Prosfilaes there is no ISO designation for the Lënape variants. So what language does the ISO code del stand for? Old Wënami? Mondern Lënape? Munsee dialects still spoken in Moraviantown? Andreas.b.olsson (talk) 02:50, 30 July 2021 (UTC)Reply
@Andreas.b.olsson: In the ISO itself, del stands for a macrolanguage called Delaware whose individual languages are Munsee umu and Unami unm (see https://iso639-3.sil.org/code/del). We're not obliged to use ISO's names, though, so we are free to call del Lenape or Lënape rather than Delaware. Don't get hung up on the letters used for the code; they don't have to bear any relation to the name of the language (xcl stands for Old Armenian, for example, and the code for Mapudungun is arn). —Mahāgaja · talk 07:40, 30 July 2021 (UTC)Reply
@Mahagaja:Fair enough about the abstraction. I did not know that ISO had a more complex sub-categorization. Thank you for providing the link and helping my knowledge about these standardizations grow. My concern was related to the fact that del is derived from Delaware – the European name for the region – and not Lënapehokink, the Lënape name. However, as far as I understand the Lënape are themselves comfortable with the syncretic designation and retain it in their name (Nation of Delaware Indians). I should therefore not assume any offending connotation in the designation del. I’m trying to reach out to the Oklahoma branch of the Lënape. Let’s see what they say about this issue. As a dominant constituency they should have a say about this actively evolving linguistic situation. On an additional note, there is a significant difference in the orthography being actively taught by the mainly Munsee branch in Moraviantown Canada.Andreas.b.olsson (talk) 08:49, 30 July 2021 (UTC)Reply

I’m not sure there is anything wrong in the abstract splitting of the dialects/languages. It’s the naming in Wiktionary for these lects that does not reflect how these lects are referred to in actuality (where there is unfortunately ambiguity). A proper and respectful name is needed for the following variants:

The modern variant in Moraviantown seems to be referred to alternatively as Lunaape and Munsee Language. The modern variant based on Rementer et al work is consistently referred to as Lënape or alternatively without the schwa (Lenape). Andreas.b.olsson (talk) 13:06, 30 July 2021 (UTC)Reply

How about the following naming:
  • Lënape
  • Lunaape
  • Old Lenape
The distance between the lects spoken by the Unami and Munsee tribes at the time of Zeisberger and prior seems largely speculative. Andreas.b.olsson (talk) 16:27, 30 July 2021 (UTC)Reply
For the purpose of historical linguistics Old Lenape can be divided into:
  • Southern Unami
  • Northern Unami
  • Munsee
Note here that Unami seems to be a term derived from historical linguistics. I believe it should not be used to denote any of the revitalized lects currently in use. However, the Stock-Bridge Munsee do refer to their Lunaape variant in English as the “Munsee language”. Andreas.b.olsson (talk) 14:54, 31 July 2021 (UTC)Reply
Unless the Stockbridge-Munsee Community object, I would suggest using Lunaape as the name of their orthographic variant and dialect.Andreas.b.olsson (talk) 15:04, 31 July 2021 (UTC)Reply

Here is what I propose in terms of naming, categorization, and ISO encoding.

  1. Old Lenape: del
    1. Munsee: umu
    2. Unami: unm
  2. Lënape: lnp
  3. Lunaape: lne

LNP and LNE are not used yet by ISO 639-3. Andreas.b.olsson (talk) 15:29, 1 August 2021 (UTC) North and South Unami can be treated as minor variants (i.e. dialects). Munsee is treated as distinct, leaving the question open about whether it was more closely related to Mohican. Andreas.b.olsson (talk) 15:35, 1 August 2021 (UTC)Reply

We can't be making up our own three-letter codes, but we can put them after a family code, separated by a hyphen. So if we wanted to, we could in theory create alg-lnp, alg-lne, etc. —Mahāgaja · talk 10:08, 2 August 2021 (UTC)Reply
Why can’t we be making up codes if they are not used? This can be in conjunction with ra request to ISO that these lects be added with those codes. It’s one thing to request a change, another the treat the ISO 639 as extensional. With its increased influence Wiktionary ought have some influence on ISO. Andreas.b.olsson (talk) 16:40, 2 August 2021 (UTC)Reply
We can't make up codes because ISO might assign them to some other languages in the future. I think you might be overestimating Wiktionary's influence; also, the ISO committee would take a whole lot of convincing before they agreed to two more codes. Considering you started out this discussion arguing against treating Unami and Munsee as two different languages, it seems like it would be quite a stretch to convincingly argue that they are, in fact, five different languages. From everything you've explained above, I'm starting to think we should recognize only del as a canonical language and reassign unm and umu to be etymology-only codes for subvarieties of del; if necessary, we can add more etymology-only codes (since these are local to Wiktionary and have nothing to do with ISO) for other chronolects and regiolects. —Mahāgaja · talk 18:38, 2 August 2021 (UTC)Reply
@Mahagaja I think your low-balling Wiktionary’s growing influence, and a few people’s ability to influence seemingly immobile institutions. My point has evolved based on @-sche’s input about linguistic historicity and the strong difference between orthography used by those who learn Lunaape versus Lënape. Importantly, there is extremely strong merit in treating the language recorded by Zeisberger et al in the 18th century as a different language. It is grammatically quite distant from Lënape and Lunaape. Note that the del, unm, and umu seem based on the historical study by Moravian missionaries (who referred to the language as “Delawarische Sprache”). There needs to be a clear distinction between these historical lects (which I’m referring to as “Old Lenape”) and the languages spoken today. Andreas.b.olsson (talk) 03:26, 3 August 2021 (UTC)Reply
Seeming immobile to whom? They've seemed relatively responsive to me; they offer a final decision on most requests in an annual wrapup. I certainly question any need to bluster ahead and then expect SIL to follow; that seems likely to be counterproductive.--Prosfilaes (talk) 09:31, 4 August 2021 (UTC)Reply
I would also note here that the German entry in Wikipedia is titled “Delawarisch” and talks about it first as a single language, and then alternatively two languages depending on one’s view. The equivalent English entry is “Delaware languages”. However, in both entries the focus is on the language(s) studied by Moravian missionaries et al, and then linguists studying those initial European studies (which haphazardly invented wildly different orthographies based on the authors own mother tongues). @-sche is right: it’s complicated. Here are the facts:
  1. Old Lenape is grammatically very different from modern variants.
  2. There are two very different modern orthographical variants.
If this were Norwegian, you would without hesitation have Gammelnorsk, Bokmål, and Nynorsk. Andreas.b.olsson (talk) 04:01, 3 August 2021 (UTC)Reply
So if we are going to base decisions on precedent – which one ought to - then the way Wiktionary has approached the orthographically distinct Norwegian Bokmål and Nynorsk speaks for having distinct language codes for Lënape and Lunaape regardless of whether they should be treated as the same language or not. Unfortunately, del really in the end refers to Old Lenape since the ISO further subdivides it into Unami and Munsee, which are really references to the language(s) studied by Zeisberger’s et al and those who continue to study their studies. To use it for Lënape and Lunaape would be confusing and improper.Andreas.b.olsson (talk) 04:25, 3 August 2021 (UTC)Reply
This is not Norwegian, and there's no other language that is presented like Norwegian. The closest are Ottoman Turkish and modern Turkish and Urdu and Hindi, and in both of those cases you're looking at different scripts, Arabic and Latin or Devanagari plus vocabulary divergence. One has to look at all the precedent, not just one example.--Prosfilaes (talk) 09:31, 4 August 2021 (UTC)Reply
On a last note for tonight, the codes alg-lnp and alg-lne proposed by @Mahagaja for Lënape and Lunaape could be a reasonable compromise.Andreas.b.olsson (talk) 04:34, 3 August 2021 (UTC)Reply
IMO, as someone who as added a good portion on the Munsee entries, unless someone can demonstratively show the need, I would say the way things are set up right now is perfectly fine. More work should be made to show the linguistic differences before any new codes are created. This all seems like hypothetical conjecture to me. --{{victar|talk}} 05:25, 3 August 2021 (UTC)Reply
Having looked closer at Zeisberger’s work, I should say that how distant Old Lenape is from Lënape and Lunaape is uncertain to me. There seems to have been a lot of simplification in how verbs are used, with certain modes falling greatly out of favor (e.g the subordinate mode). The issue is complicated by the very different orthography used by Zeisberger, who used German inspired phonological conventions. If Wiktionary were to want to document these changes though, it ought to leave a space for Old Lenape. The issue reminds me of Swedish, which went through a drastic change in orthography in the late 19th and early 20th century.
I think what I get back to is that Unami is not the proper L2 name for a language. The proper name for this language is Lënape. So even if we keep the coding structure in tact, the name of that lect needs to be changed (Unami => Lënape). Andreas.b.olsson (talk) 09:24, 3 August 2021 (UTC)Reply
Out of curiosity @Victar, how come you chose Munsee as the name of the L2 lect and not Lunaape? I see that the community in Moraviantown uses either designation. Nonetheless, what made you chose one over the other? If you are a member of the Stockbridge-Munsee community, please help me better understand and my sincerest apologies for potentially overstepping. Andreas.b.olsson (talk) 09:36, 3 August 2021 (UTC)Reply
We (Wiki/Wikt) base much of our language nomenclature and classifications on the recommendations of Ethnologue.. --{{victar|talk}} 19:15, 3 August 2021 (UTC)Reply

I guess another thing that bothers me is that the code unm – however abstract in the mind of the Wiktionary community – carries the stigma of a split. It’s continued use will cause continued sclerosis. It’s does not seem a term that all parties – Lënape in Oklahoma, Munsee in Canada, and enthusiasts like myself in New York City – can all rally around. It seems to me that there is something quite new going on, an amalgamation and standardization that should be captured in vivo by Wiktionary. Andreas.b.olsson (talk) 10:14, 3 August 2021 (UTC)Reply

(Sorry, I was busy/distracted.) If we could reach out to Lenape folks, including Munsee speakers or learners, for clarification of whether they consider the lects one language (in particular, if Munsee consider their language to be the same as Unami, or if it is only the US-based folks running the Unami-based Lenape Talking Dictionary who name it as if it speaks for all Lenape), that'd provide useful evidence: if speakers want to consider the lects one language, it'd be evidence that we should merge (and distinguish with qualifiers, etc); otherwise, the current setup is fine. It seems some of the objection comes down to just the names, preferring "Lenape" to "Unami"; in another discussion Andreas notes that speakers would sooner say the equivalent of "I speak Lenape" than "I speak Unami", but this is not evidence that they are one language (consider various Dogon languages, which are all Dogons though not all mutually intelligible, including Bangime language whose speakers consider themselves Dogon who "seem unaware that [their language] is not mutually intelligible with any Dogon language"). Simply renaming "Unami" to "Unami Lenape" (and "Munsee" to "Munsee Lenape") is an option, although we usually try to use the most commonly used names for languages, which may be the current names. The shape of the codes is a non-issue since they're mostly not visible to readers, only editors. This discussion does make me realize that they're reader-facing in the names of certain categories, and that this is probably confusing, but that's a general issue not specific to Lenape. (How many people looking at our Khanty content would guess that the list of berries is at "Category:kca:Berries"? Shouldn't we always use language names?) There's no linguistic basis for introducing more than one new code (e.g. a five-way split!), as floated above. - -sche (discuss) 02:24, 5 August 2021 (UTC)Reply

I’m the newcomer here @-sche. Though I have used Wiktionary for a long time, I did not prior to this participate in its maintenance. I will defer to you all on the best back-end encoding, and whether proliferating codes for anticipated future linguistic differentiating makes little sense. My objection – and this was not clear to me before either – seems reduced to the fact that “Lënape” was stripped from how the lect(s) are referred to. I think there is sufficient online evidence to make the affirmation that the so called Unami branch is almost exclusively referred to as Lënape. Therefore I think it would be warranted to make this change as soon as possible. Having looked at quite a few of the “Unami” entries it seems to me that they are almost all derived from the work of the Talking Lënape Dictionary project. Hence the issue of the distance between the lect recorded by Zeisberger et al versus the more recent collected speech samples of the Talking Lënape team is mute, and can be resolved later. Note though this may mean having to encode for another lect at some point in the future (especially since the spelling used by the Moravian missionaries was based on German and very different). I agree fully that the Munsee community should have a say in whether to treat their lect as the same as that taught by the Talking Lënape Dictionary project. It should be noted though that it can be factually stated that they have to date used a very different orthography in lessons and materials they have posted online. Andreas.b.olsson (talk)

@-sche, @Mahagaja, @Victar can we split out the decision about the naming for the lect with the code unm and rename it from Unami to Lënape? There is overwhelming evidence that Lënape should be the proper name of the language, and not Unami. The code issue and whether to treat the Munsee branch as a separate language (and what to call it) can be dealt with separately and later.

@-sche, @Mahagaja, @Victar asking again about splitting out the issue of the language's proper name and making an initial narrower decision. I would like to continue building out Wiktionary's Lënape entries, but first I would like to make sure the language currently referred to as "Unami" is referred to by its proper name. Andreas.b.olsson (talk) 13:01, 12 August 2021 (UTC)Reply
I have no objection (other than that it's hard to type) to renaming unm Lënape. —Mahāgaja · talk 13:10, 12 August 2021 (UTC)Reply
AFAICT "Lenape" is a far more common name than "Lënape", so renaming "Lenape" to "Lënape" does not seem appropriate. And because Unami is only one variety of Lenape, renaming Unami to Lenape/Lënape would be confusing. The difficulty seems to be that the main US-based language project produces its Talking Dictionary seemingly based on Unami but named as if it's one overarching "Lenape"/"Lënape" language. (I know this kind of thing has come up before, where a prominent dictionary of what it considers "a language" doesn't specify which of two actually-distinct lects its words are, or conflates them; I will track down other examples if necessary, but it seems tangential.) I don't want to stand in the way of a Native language revitalization effort, so the above-discussed option of leaving Unami and Munsee for the historical (and historically not mutually intelligible) varieties and making Lenape a language code (rather than a family code) for the "unified" language has some appeal, but it would entail duplicating a lot of content which is currently Unami. There is also the option of renaming Unami to "Unami Lenape" to get the "Lenape" name in there. - -sche (discuss) 19:19, 30 August 2021 (UTC)Reply
Again, I agree -- renaming Unami to Lenape would simply cause confusion whilst adding no benefit and renaming it to "Unami Lenape" does not serve to disambiguate. --{{victar|talk}} 20:13, 30 August 2021 (UTC)Reply
I’m not sure I understand for whom this would be confusing @Victar. I agree @-sche that Lenape may be simpler to type than Lënape. Anyway, the schwa indicator is optional in the dominant modern dialect of Lënape. I have a deep hunch here that Linnaean taxonomic orthodoxy and perpetuation of erroneous past anthropological claims are at fault for the strict linguistic distinction between Unami and Munsee. The more I study Lënape, the fewer distinctions I see in the various dialects except for in the choice of orthography (which noticeably stems from quite recent European attempts to record an unwritten language). Andreas.b.olsson (talk)

I suspect that Wiktionary is perpetuating a European construct erected between the late 18th and early 20th century. Of course, these faulty past errors of European linguists have had real effect, seemingly causing “two languages” to appear to exist. I need to further study the Munsee branch to substantiate my claims.

I continue to believe Unami is a culturally and linguistically inappropriate term for any dialect of Lënape. If the language spoken by the descendants of the Unami tribe of the Lënape were to have any other name than Lenape – which I insist is the appropriate designation – it might be Delaware. This is a name I believe the (predominantly) Unami descendants living in Oklahoma accept and sometimes use in English despite its European origins.

Wanishi Andreas.b.olsson (talk)

RFM discussion: January–February 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Arawak and Carib again

Resurrecting an old discussion on the same subject above, I propose we rename Arawak (arw) and Island Carib (crb) to Lokono and Kalinago, respectively. As User:Victar noted in that discussion, Arawak is easily confused with the Arawak/Arawakan family. (Indeed, I have run into many etymologies where someone has mislabelled a word from a different Arawakan language as Arawak). Island Carib, meanwhile, is (1) not a Cariban language, making etymological discussions occasionally confusing; (2) no longer generally called by that name, since the people are now officially called the Kalinago by the Dominican government; (3) easily confused with the unrelated 'true' Carib language, which we call Galibi Carib.

I would also add this latter language (the one we currently call Galibi Carib (car)) to the discussion, and suggest it should be renamed either Kari'na or Kari'nja, names that are both more common in the current literature and closer to the native name (karìna in Courtz’s orthography). (The most common name for this language in the literature is actually simply Carib, but this may itself be confusing.) — Vorziblix (talk · contribs) 15:56, 3 January 2024 (UTC)Reply

  Support: Obviously still support renaming both to Lokono and Kalinago, respectively. I can't speak to Galibi Carib as I'm not familiar with the academic literature. --{{victar|talk}} 16:22, 5 January 2024 (UTC)Reply
I hope that the entries Arawak, Island Carib, Cariban, Kalinago, Galibi Carib, Kari'na, Kari'nia, and Carib have or will have all the definitions and attestation to help relatively normal users through this. DCDuring (talk) 18:54, 5 January 2024 (UTC)Reply
I’ve added most of those missing entries (and improved one or two others), and have also changed Galibi Carib (car) to Kari'na, since I am actively working on that language and would rather make the change while our coverage is still small. I will leave the discussion to run for a few more weeks before changing the others, and we can always also revert/change Kari'na to something else if the disucssion turns that way. — Vorziblix (talk · contribs) 13:42, 16 January 2024 (UTC)Reply
All three of these moves are now complete. The change away from ‘Arawak’ came not a moment too soon; a large amount of references to ‘Arawak’ (arw) in our entries were entirely wrong and needed to be corrected to ‘Arawakan’ (awd), and indeed there may be some remaining references to arw that are still wrong. — Vorziblix (talk · contribs) 06:42, 2 February 2024 (UTC)Reply

Arawak → Lokono, Island Carib → Kalinago, and Galibi Carib → Kari'na   Done. — Vorziblix (talk · contribs) 20:38, 2 February 2024 (UTC)Reply

RFM discussion: January–March 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Mansi Languages Split

Since the Mansi language represented in Wiktionary isn't a thing, since it is several languages compiled under one name and code, I would like to propose a split. It would be beneficial to split it along the 4 main dialects, (the ones still extant) Northern, Eastern and (extinct) Western, Southern. The subdialectal markings can be represented with labels, already in place in the current category for Mansi. Ewithu (talk) 18:19, 9 January 2024 (UTC)Reply

  DoneEwithu (talk) 13:22, 9 March 2024 (UTC)Reply


RFM discussion: January–March 2023 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Rename Pomeranian to Proto-Pomeranian

"Pomeranian" is essentially a term for the family consisting of the Kashubian and Slovincian languages. In fact, Pomeranian (or its Polish counterpart, język pomorski) has always been used as a synonym of Kashubian (Polish język kaszubski), as witnessed by the 1893 dictionary called Słownik języka pomorskiego czyli kaszubskiego ("Dictionary of the Pomeranian a.k.a. Kashubian language").

There are no written records of an ancestor of both Kashubian and Slovincian, and any attestation of a Pomeranian lect will automatically fall into either of the two categories according to our current handling. However, the two languages do share an ancestor, and this ancestor did influence other languages. As such, it seems only logical to set this language as a proto-language, and rename it to Proto-Pomeranian in reference to its being the unattested ancestor of a language family, rather than being an attested language.

Pinging @Sławobóg, Vininn126, Gnosandes. Thadh (talk) 13:55, 9 January 2023 (UTC)Reply

  Support. Vininn126 (talk) 13:59, 9 January 2023 (UTC)Reply
  Support. Sławobóg (talk) 14:10, 9 January 2023 (UTC)Reply
  Support. // Silmeth @talk 15:09, 9 January 2023 (UTC)Reply
  Oppose. Gnosandes ✿ (talk) 16:47, 9 January 2023 (UTC)Reply
  Support. —Mahāgaja · talk 08:24, 11 January 2023 (UTC)Reply
  Support. — Fenakhay (حيطي · مساهماتي) 11:56, 11 January 2023 (UTC)Reply
  Support I have looked into usage and it appears that this ancestor—not only in English, but also Russian and other current tongues of linguistic science—can only be called “Pomeranian” in the same way as Proto-Slavic can be called “Slavic” language, and it the same way as “Slavic” is any Slavic language, or “Turkic” any Turkic language etc., “Pomerian” is any language of the said group. Fay Freak (talk) 12:35, 11 January 2023 (UTC)Reply
  Done. @Vininn126, Thadh, Sławobóg, Silmethule, Gnosandes, Mahagaja, Fenakhay, Fay Freak Currently there is no family corresponding to Proto-Pomeranian and the code for this language is 'zlw-pom', which is exceptional in lacking the '-pro' suffix normally given to proto-languages. I'm thinking we should add a 'Pomeranian' family with code 'zlw-pom' and give Proto-Pomeranian the code 'zlw-pom-pro'. Thoughts? Benwing2 (talk) 07:25, 28 February 2023 (UTC)Reply
Yes, absolutely. zlw-pom should be a family code and zlw-pom-pro the protolanguage code. —Mahāgaja · talk 08:13, 28 February 2023 (UTC)Reply
@ZomBear, Mahagaja Can you take a look at Reconstruction:Proto-Slavic/žarъ? @ZomBear listed a non-reconstructed descendant for Proto-Pomeranian and I'm not quite sure how to fix this as I'm not sure where that term comes from. Benwing2 (talk) 23:18, 28 February 2023 (UTC)Reply
@Benwing2 error corrected by adding * (in *žarъ). Pomeranian term I met here — Martynaŭ, V. U., editor (1985), “жар”, in Этымалагічны слоўнік беларускай мовы [Etymological Dictionary of the Belarusian Language] (in Belarusian), volumes 3 (га! – інчэ́), Minsk: Navuka i technika, page 210 -- ZomBear (talk) 03:16, 1 March 2023 (UTC)Reply
BTW I renamed the codes. Benwing2 (talk) 23:24, 28 February 2023 (UTC)Reply
Obviously, there is no “Proto-Pomeranian”. There are no reconstructions, no comparisons of paradigmatic morphology, and so on. Gnosandes ❀ (talk) 08:41, 1 March 2023 (UTC)Reply
The little reality of the field of study does not equal the irreality of the language itself. Subbranch proto forms are often left unreconstructed for being too small a fish when a larger one is available. Fay Freak (talk) 22:28, 4 March 2023 (UTC)Reply


RFM discussion: January–February 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Merging Proto-Pomeranian

As it stands, Proto-Pomeranian is next to useless and I don't see it becoming any more useful. There is a single lemma that we could treat as Pre-Kashubian. I don't see any reconstructions coming in the future as a lot of linguistics consider Slovincian a dialect of Kashubian (this needs further analysis and it's something I'm working on). The next highest level that linguists agree on is something like Pomeranian-Polabian, they share a lot of sound changes in common, but I think it would make more sense to treat descendants sections with something like:

* Pomeranian-Polabian:

** {{desc|csb}}

** {{desc|pox}}

** {{desc|zlw-slv}}

I propose we remove Proto-Pomeranian and merge with Kashubian. Vininn126 (talk) 12:42, 30 January 2024 (UTC)Reply

Support This was worth a try but didn't turn out well. Thadh (talk) 12:53, 30 January 2024 (UTC)Reply
There is no such thing as "Pomeranian-Polabian", hard no to that. Sławobóg (talk) 14:06, 30 January 2024 (UTC)Reply
And to the other, more pressing issue? Vininn126 (talk) 14:43, 30 January 2024 (UTC)Reply
I support removing Proto-Pomeranian, adding it was a mistake. We have nothing to make it work. It is probably not different from Proto-Polish and stuff like that. Sławobóg (talk) 15:00, 30 January 2024 (UTC)Reply
  Support. I wrote a long time ago that "Proto-Pomeranian" is nonsense. I also now write that "Pomeranian-Polabian" is nonsense. ɶLerman (talk) 18:28, 30 January 2024 (UTC)Reply
I'm going to speedy delete this tomorrow - I think everyone who has an opinion has spoken. As to the grouping, I have a new idea that I will bring up elsewhere. Vininn126 (talk) 13:27, 5 February 2024 (UTC)Reply

Deleted. Vininn126 (talk) 13:42, 6 February 2024 (UTC)Reply

@Vininn126: It turns out there were at least half a dozen entries linking to Proto-Pomeranian words in either the etymology or the descendants. There were a couple of those where the proto-form was followed by its alleged descendants, which I fixed by simply replacing it with the word "Pomeranian:". The rest need to be dealt with. Chuck Entz (talk) 04:56, 7 February 2024 (UTC)Reply
@Chuck Entz I thought I had caught them - can you provide a list? Vininn126 (talk) 08:12, 7 February 2024 (UTC)Reply
@Vininn126 See CAT:E. Benwing2 (talk) 09:33, 7 February 2024 (UTC)Reply
Ah, those hadn't popped yet when I checked, guess the server needed time to refresh. Vininn126 (talk) 09:37, 7 February 2024 (UTC)Reply
@Vininn126: Sometimes it takes a few days (widely transcluded modules can take langer- there are still errors popping up from a mistake someone made and immediately fixed last week).
A better method is to use insource:"zlw-pom-pro" in Special:Search with the option for searching all namespaces checked. The single remaining one is in the Reconstruction namespace (There's also the archived rfv discussion for the reconstruction you deleted, but I'm not sure what to do about that). Chuck Entz (talk) 15:42, 7 February 2024 (UTC)Reply
@Chuck Entz I don't see any reconstruction? Vininn126 (talk) 15:47, 7 February 2024 (UTC)Reply
@Vininn126: Reconstruction:Proto-Slavic/vydati Chuck Entz (talk) 15:52, 7 February 2024 (UTC)Reply


RFM discussion: February–March 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Splitting Khanty Languages

A long-standing issue with Ob-Ugric Languages as a whole (which I tried pointing out with the request for splitting Mansi as well) is that they are categorized under one language, while they could be considered separate languages, one fact is due to mutual unintelligibility.

But this talk is considering Khanty languages. We could do the way that it is on glottolog with the Khantic being the root of the all other Khanty languages. You can also find other sources for this on the List of languages of Russia (v2023) (in russian) site with the 143-146 numbers on the list

@Nyuhn I know you've only started to work on the Khanty language not so long ago, but I especially would love to hear your thoughts on this matter :] Ewithu (talk) 13:02, 6 February 2024 (UTC)Reply

@Thadh Can you comment (or ping the appropriate editors, if you know them)? You seem to know a lot about obscure languages. Benwing2 (talk) 03:28, 7 February 2024 (UTC)Reply
I'm not read-up on Mansi and Khanty and don't have the time to at the moment, but maybe @Tropylium, Rua, Surjection may have a clue. Thadh (talk) 08:37, 7 February 2024 (UTC)Reply
Yes almost all western researchers have been on board for decades now with "Khanty" as being at least three languages. Russian scholars are catching up to it too (e.g. a study was recently advertized as "Khanty dialects found to differ more than Slavic languages"). The minimal division would be Northern Khanty, Southern Khanty, Eastern Khanty as still used by e.g. Salminen in Routledge's recent (2023) The Uralic Languages, 2nd Edition handbook. In terms of practicality for someplace like Wiktionary, there's three or four essentially different literary standards for Northern (Kazym, Shuryshkar, Obdorsk; also the by now little used Middle Ob) and two or three for Eastern (Surgut, Vakh-Vasyugan, nascently Salym) which each could be treated as different languages too. The division of Northern is not highly in line with what comparative studies would have to say (many field doculects could not be nicely fit to it) but for anything documented from literary use, this would save a lot of "(Vakh dialect)" "(Obdorsk dialect)" comments. --Tropylium (talk) 12:48, 7 February 2024 (UTC)Reply
If the literary standards represent mutually intelligible dialects I would suggest keeping them unified. This is somewhat similar to the situation with Galician, where there are two competing literary standards ("standard" and "reconstructionist", with the former made to look like Spanish and the latter to look like Portuguese), and we do use labels to handle the differences (and if the same spelling is used for both, there are two verb tables under Conjugation). The main issue I see with creating separate languages is it leads to duplication as you can't simply point the word in one literary standard to another using {{alt form}}. Benwing2 (talk) 21:18, 7 February 2024 (UTC)Reply
Compare also Occitan, handled as a single language with six or so dialects. Benwing2 (talk) 21:21, 7 February 2024 (UTC)Reply
I don't know too much about Galician nor Occitian, but if the vocabulary difference would be our biggest problem between the Khanty dialects, we would be ok to keep it unified.
But since between Khanty dialects, even the grammar differs. I don't know how we could handle the different ways nouns decline or verbs conjugate if we keep it unified. Ewithu (talk) 22:10, 7 February 2024 (UTC)Reply
@Ewithu What I mean is, split based on mutual intelligibility, hence maybe a 3-way split or however-many-way split, but not split dialects just because they have different literary standards, if the literary standards refer to mutually intelligible dialects. Grammar also differs among Galician and Occitan standards and it's handled just fine. Benwing2 (talk) 22:49, 7 February 2024 (UTC)Reply
Coptic is another example where we handle multiple (4) literary standards under a single L2. Benwing2 (talk) 22:56, 7 February 2024 (UTC)Reply
Komi is splitted into three languages, Udmurt is in two, Selkup is also in two, Mari is in three (should be Northwestern one two by the way), Koibal has its own section despite being essentially a Kamassian dialect. I think splitting it into big chunks would really help reduce messiness, especially if there are also going to be dialects written down.
Two of the Mansi languages are extinct, to me it just feels wrong to mix everything together. Kaarkemhveel (talk) 23:27, 7 February 2024 (UTC)Reply
@Kaarkemhveel I am not proposing mixing everything together but splitting based on mutual intelligibility. Glottolog for example has 5 Khantyic languages and 3 Mansic languages. What I'm concerned about is splitting every proposed literary standard into its own microlanguage; this would give us 8+ Khanty languages and I don't know how many Mansi ones. Things like "Koibal has its own section despite being essentially a Kamassian dialect" IMO should not be considered precedents for micro-splitting. Benwing2 (talk) 00:57, 8 February 2024 (UTC)Reply
Right, that sound reasonable. In my head at least it looks like Northern/Eastern/(Salym?)/Southern Khanty and for Mansi either Nothern/Eastern/Western/Southern or Northern/Eastern/Western&Southern (though the last suggestion based rather on their current use status).
Kaarkemhveel (talk) 06:29, 8 February 2024 (UTC)Reply

Dear EnWiktionary bosses! I must keep you informed that so called 'Khanty', 'Koryak', 'Selkup', 'Nenets' and some other languages which are used to be considered as united languages have never been a matter of fact of the reality but only a product of early-Stalin national policy. Speaking about Khanty subfamily, the main sources have been already provided above. Khanty subfamily consists of at least 4 languages: Vakh-Vasyugan with Vakh literary norm; Surgut with its literary norm; Southern with no actual written use afaik; and Northern with previously used Middle Ob literary norm and currently used Kazym, Shuryshkary and Lower Ob literary norms. All literary standards I mentioned have, for instance, schoolbooks presently being released, some of them I have in paper on my right hand. The languages of Khanty group (as well as Sami to give an understandable parallel) are not mutually intelligible. For instance, Northern Khanty morphology is very close to the Northern Mansi one with 3-4 noun cases while Vakh-Vasyugan has 10-13 cases like Southern Selkup which is believed to be the remaining of the archaic state because of the close contact with Selkups. So people applying for splitting Khanty just range things as the things are, moreover they are ready to work. Grigoriy Korotkih (talk) 09:50, 8 February 2024 (UTC)Reply

@Grigoriy Korotkih Hi Grigoriy, thanks for your comments. I don't think anyone is objecting to splitting Khanty or Mansi, it's just a case of working out what the splits will be. Benwing2 (talk) 21:42, 8 February 2024 (UTC)Reply

Khanty languages could be considered separate languages. Most of the dictionaries describing them through dialects or subdialects (говоры). Those division already presented in the Wiktionary using regional categories.
How it would be beneficial to the Wiktionary to have three Khanty L-2 sections on the "тур" page instead of one?
Also we will need to somehow handle translations and etymology links from other languages, that refer to unspecified Khanty. Nyuhn (talk) 09:23, 9 February 2024 (UTC)Reply

Should we merge East Slavic languages because they have words like нога or зуб then? Don’t different Khanty lects have different inflection paradigms, pronunciation, even use sometimes?
Speaking of refering to "unspecified Khanty": wouldn’t it be beneficial to the Wiktionary to specify it, rather than pile everything into "Khanty"? Kaarkemhveel (talk) 12:47, 10 February 2024 (UTC)Reply

So in light of all that has been discussed, we should be good to split both languages, according to the cardinal directions of the North East South, and West. We can use labels to further specify which dialect it came from. Naturally each separate language (represented by the 4 cardinal directions) will have to (if possible to find) have an {{alt form}} section to make switching between dialects (represented by the river they lived beside traditionally) easier. Support or Oppose? Ewithu (talk) 09:59, 24 February 2024 (UTC)Reply

@Ewithu Northern/Southern/Eastern/Western seems correct for Mansi but for Khanty maybe it should be Northern/Southern/Eastern per Wikipedia; Western Khanty appears to be a larger grouping consisting of Southern and Northern. I agree with the idea of identifying dialects by rivers. (We also ended up doing this for the CAT:Ye'kwana language, inspired by the Khanty and Mansi situation, as using river names was the only unambiguous and accurate way of referring to different dialects that we could come up with.) Benwing2 (talk) 10:20, 24 February 2024 (UTC)Reply
@Benwing2 Alright, that's sounds good. Now we just need to get to it haha Ewithu (talk) 13:52, 26 February 2024 (UTC)Reply
@Ewithu I think what we need to do is as follows (for Khanty, similarly for Mansi):
  1. Assign language codes kca-nor, kca-sou and kca-eas to the split-out languages.
  2. Assign a temporary family code urj-kca to the new Khantyic (Khantic?) family; see below.
  3. Set up tracking for all uses of kca as a language code. (This will make it easier to find the references to this code so they can be changed.)
  4. Split the current {{kca-*}} templates into per-new-language templates. Thankfully there are only 4 templates. I assume the pronouns listed in {{kca-table-ppron}} are Kazym pronouns. I'm not sure if all the Khantyic languages use the same set of cases; if not we need to modify the new declension templates to have the appropriate sets of cases in them.
  5. Split the Khanty modules. There are only three of them: Module:kca:Dialects, Module:labels/data/lang/kca and Module:kca-translit. I think the translit module can stay as-is.
  6. Move all the lemmas and non-lemma forms to the new languages, and change template references accordingly.
  7. Change all references to language kca to use the appropriate code. This will principally be in Etymology, Translations and Descendants sections.
  8. Delete the Khanty language from Module:languages/data/3/k.
  9. Repurpose the kca code for the Khantyic family.
Benwing2 (talk) 03:41, 27 February 2024 (UTC)Reply
@Ewithu Steps 1-3 are done. Tracking can be found here for Khanty: Special:WhatLinksHere/Template:tracking/languages/kca. There is also tracking for Mansi here: Special:WhatLinksHere/Template:tracking/languages/mns. I have not done the corresponding split for Mansi yet because Glottolog only has a three-way split for Mansic (Southern, Northern, Central) instead of a four-way split (Southern, Northern, Eastern and Western), preferring to consider East Mansi and West Mansi as dialects of Central Mansi. This means we need a bit more discussion. Benwing2 (talk) 05:05, 28 February 2024 (UTC)Reply
@Benwing2 Wonderful Thank you!
So I would say split it 4 way, because on one hand, the western dialect is extinct, while the eastern (is also extinct in theory, but not proven yet) is extant, and western has a lack of Cyrillic dictionaries or sources (except that one Gospel of Mathew written in Lower Lozva most likely), while Eastern even has audio pronunciations on Lingvodoc. While I would also say keep it as Central Mansi, because those two languages have very few resources but I don't this is a really valid argument. Ewithu (talk) 06:17, 28 February 2024 (UTC)Reply
@Ewithu My general take is we should go by mutual intelligibility unless there's really a good reason to do it otherwise. Since Glottolog seems to think the two varieties are mutually intelligible (at least that's what I assume by the Central grouping) and there are few resources overall, I think it's easier to have fewer splits; we can always split later if the need arises. But I don't feel super strongly about it; maybe someone else can chime in. Benwing2 (talk) 06:23, 28 February 2024 (UTC)Reply
In any case let me know if you need help with the Khanty splitting effort (in particular anything best done by bot). Benwing2 (talk) 06:24, 28 February 2024 (UTC)Reply
@Benwing2 Minor point, but are we sure about the name "Khantyic"? It sounds really odd to me, and I think "Khantic" is more common going by GBooks. Theknightwho (talk) 06:51, 28 February 2024 (UTC)Reply
@Theknightwho Good point. I used "Khantyic" because this is what Glottolog uses. Benwing2 (talk) 07:06, 28 February 2024 (UTC)Reply
I can't actually confirm your assertion about Google Books; "Khantic" does seem to occur more often but most of the hits seem to be garbage. Benwing2 (talk) 07:09, 28 February 2024 (UTC)Reply
@Benwing2 I was just going by the number of confirmed hits I could see. I'll have a proper look later today. Theknightwho (talk) 12:21, 28 February 2024 (UTC)Reply
@Benwing2 Can't we just call it "Khanty languages"? Everyone calls the protolanguage "Proto-Khanty", nobody calls it "Proto-Khantyic" or "Proto-Khantic". — SURJECTION / T / C / L / 13:42, 28 February 2024 (UTC)Reply
I agree with this.
Also steps 4 and 5 are done for Khanty. Also redirected the reference templates according o their language used. Ewithu (talk) 14:29, 28 February 2024 (UTC)Reply
@Surjection I renamed Khantyic -> Khanty. I see you split all the lemmas and carried out step 9 above but step 7 isn't done. Benwing2 (talk) 22:03, 28 February 2024 (UTC)Reply
7 should be done. A few cases were left with etymology templates that used kca as a source, but those should work, as it now means "from one of the Khanty languages". Most references were updated; some others were indiscriminately changed to kca-nor with attention templates, as I was requested to do. — SURJECTION / T / C / L / 22:06, 28 February 2024 (UTC)Reply
@Surjection See e.g. роман. There is an error in the Descendants section due to the use of language code kca. See the tracking link above for all remaining references. Benwing2 (talk) 22:12, 28 February 2024 (UTC)Reply
Done. The remaining instances in mainspace at least are valid uses of the family code. I'll check the other namespaces (at least the ones that matter) as well. — SURJECTION / T / C / L / 22:22, 28 February 2024 (UTC)Reply
OK thanks. Benwing2 (talk) 22:26, 28 February 2024 (UTC)Reply
@Ewithu: If the differences in grammar, phonology and morphology of Eastern Mansi and Western Mansi are not too large, then the fact one is little-attested and extinct is all the more reason to keep the two together. Thadh (talk) 09:49, 28 February 2024 (UTC)Reply
@Thadh Alright, keep 'em one. Could you do the honors if you are free? Also should we add Proto-Mansi and Proto-Khanty as well? Ewithu (talk) 09:54, 28 February 2024 (UTC)Reply
@Ewithu I have added the Mansi family along with Northern/Southern/Central Mansi languages. Benwing2 (talk) 22:35, 28 February 2024 (UTC)Reply
Alright thank you! I'll move the templates to the Northern one. Ewithu (talk) 22:39, 28 February 2024 (UTC)Reply
Considering how often etymological sources distinguish between Western and Eastern Mansi though, we may want to consider adding them as etymology-only languages. — SURJECTION / T / C / L / 08:13, 29 February 2024 (UTC)Reply
@Surjection Sounds good, I'll go ahead and do that. Benwing2 (talk) 08:15, 29 February 2024 (UTC)Reply
  Done Into: Eastern Khanty, Northern Khanty, Southern Khanty, Central Mansi, Northern Mansi, Southern Mansi Ewithu (talk) 13:29, 9 March 2024 (UTC)Reply


RFM discussion: February 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Some missing Sino-Tibetan languages

Hi, this is a proposal to add three new language codes, relating to a small family of Bai(-adjacent) languages which have been dubbed the Cai-Long languages (from Chinese 蔡龍語支), spoken in western Guizhou.

@Stevey7788 very kindly created an expansive vocabulary list of Greater Bai Macro-Bai languages several years ago, which covers all of these in detail, including a number of sources (all of which are in Chinese, unfortunately).

Caijia (蔡家話) sit-cai
  • The only extant member of the family, with approximately 1,000 speakers in western Guizhou, with the numbers having declined from over 10,000 in the 1980s. Just over 950 terms are listed in the vocabulary list, including those of the dialects spoken in Shuicheng and Yangjiazhai.
Longjia (龍家語) sit-lnj
  • Extinct from around 2010. Just under 600 terms are listed in the vocabulary list, including those of the dialects spoken in Anshun and Dafang).
Luren (盧人語) sit-lrn
  • Extinct from sometime between 1960 and 1980. 61 terms are listed in the vocabulary list.

Theknightwho (talk) 18:59, 3 February 2024 (UTC)Reply

No objections from me although I know little-to-nothing about these languages. Benwing2 (talk) 03:52, 7 February 2024 (UTC)Reply
  Support. What should their ancestor/family be listed as? – wpi (talk) 17:28, 11 February 2024 (UTC)Reply
@Wpi Macro-Bai, and then giving the core Bai languages (i.e. the ones we already have language codes for) their own language family within that simply called "Bai languages". I've checked, and I can't find any evidence the term "Greater Bai" is actually used outside of Wiktionary or Wikipedia, so we should definitely fix that. Theknightwho (talk) 17:17, 16 February 2024 (UTC)Reply
Created. Theknightwho (talk) 02:05, 17 February 2024 (UTC)Reply


RFM discussion: February 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Splitting Northern Bai properly

Currently, we use the code bfc for Northern Bai, classifying it as one of the Bai languages. However, in 2014, bfc was split into Panyi Bai and Lama Bai ([10] [11] [12]). Lama Bai was given the new code lay, while Panyi Bai retained the old code bfc. At some point since then we added Lama Bai to our list of languages, but we never renamed bfc to Panyi Bai, which seems to have been an oversight. From the documents, it seems like they still do form a "Northern Bai" grouping, however. Theknightwho (talk) 02:25, 17 February 2024 (UTC)Reply

We only have two lemmas for Northern Bai, so I don't think this would be a difficult job. Theknightwho (talk) 02:25, 17 February 2024 (UTC)Reply

@Theknightwho   Support. Benwing2 (talk) 02:56, 17 February 2024 (UTC)Reply

Split. Having checked the source, both current lemmas should be under Panyi Bai. Theknightwho (talk) 15:35, 24 February 2024 (UTC)Reply

RFM discussion: March 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Anglicising the names we use for the branches of Min Chinese

On the back of the recent split of Min Nan into a family, I thought it would be a good idea to consider renaming the various branches of Min to the names usually used in English. With the exception of Coastal Min and Inland Min, the names we currently use are transliterations from Mandarin, whereas most sources - including academic sources - prefer to translate them into English. I have no doubt we only do this because the ISO uses the transliterated names, for whatever reason. As such, I propose the following:

  1. mnp: Min Bei → Northern Min
  2. cdo: Min Dong → Eastern Min
  3. nan: Min Nan → Southern Min (note: currently being converted into a family)
  4. czo: Min Zhong → Central Min
  5. cpx: Puxian → Puxian Min (edit: added per the discussion below)

It's difficult to use Google Ngrams with this, since there are a ton of false positives, but the general trend is very strongly towards the English names historically, with the transliterations increasing in popularity in recent publications. However, taking a look at Google Books results, a lot of the transliterations come from poor quality sources, or in many cases texts that simply contain the ISO language names. Google Scholar is somewhat more helpful, though, and shows a clear trend towards the English names even with very recent papers. Theknightwho (talk) 18:02, 4 March 2024 (UTC)Reply

Support; this is also consistent with general practice here at Wiktionary to prefer English terminology rather than native-language terminology. Benwing2 (talk) 00:15, 5 March 2024 (UTC)Reply
  Support; this is what i normally do outside of wiktionary anyways — 義順 (talk) 21:21, 7 March 2024 (UTC)Reply
Neutral / SupportFish bowl (talk) 05:43, 10 March 2024 (UTC)Reply

Question: Should cpx "Puxian" also be moved to "Pu–Xian Min" (cf. wikipedia:Pu–Xian Min?) —Fish bowl (talk) 05:43, 10 March 2024 (UTC)Reply

@Fish bowl Oppose use of em-dash or en-dash in the name of the language. Benwing2 (talk) 05:54, 10 March 2024 (UTC)Reply
Yeah that's sensible. I copy-and-pasted it from Wikipedia and didn't notice it was there. Imagine it's an ASCII hyphen. —Fish bowl (talk) 05:58, 10 March 2024 (UTC)Reply
Not sure here. Wikipedia uses dashes in the names of many languages (e.g. most of the Yue languages) where Glottolog uses closed compounds. Glottolog uses the form Pu-Xian with a hyphen not en-dash (although Omniglot uses Puxian). In general I prefer less punctuation than more so my gut would say Puxian but properly you'd have to look at scholarly sources to see which one is most common. Benwing2 (talk) 06:19, 10 March 2024 (UTC)Reply
It is because the name 莆仙 Pu(-)Xian is formed from two placenames, 莆田 Putian and 仙遊 Xianyou. This is also the case for the Yue dialects listed on Wikipedia. —Fish bowl (talk) 06:24, 10 March 2024 (UTC)Reply
@Fish bowl "Puxian Min" seems to be a lot more common than "Pu-Xian Min" in English-languages sources, but I agree it should be renamed as well. I'll add it to the list at the top. Theknightwho (talk) 06:34, 10 March 2024 (UTC)Reply
  Support. (Also the Bei, Dong, Nan, Zhong are not native anyway; they're Mandarin). — justin(r)leung (t...) | c=› } 18:17, 10 March 2024 (UTC)Reply

Moved, given this isn't controversial. Theknightwho (talk) 14:12, 12 March 2024 (UTC)Reply

RFM discussion: February 2022–March 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits ([[Special:PermanentLink/78966576#Change_"Chinese"_[zh]_from_descendant_of_"Middle_Chinese"_[ltc]_to_ancestor_of_"Old_Chinese"_[och]|permalink]]).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Change "Chinese" [zh] from descendant of "Middle Chinese" [ltc] to ancestor of "Old Chinese" [och]
  1. The family tree looks weird.
  2. Old Chinese is subsumed under ==Chinese==.

@Justinrleung, RcAlex36, 沈澄心

Fish bowl (talk) 06:34, 6 February 2022 (UTC)Reply

I wonder if we should push it even further and have Proto-Sino-Tibetan be the ancestor? Even Old Chinese being the ancestor of Old Chinese may be weird. — justin(r)leung (t...) | c=› } 06:54, 6 February 2022 (UTC)Reply

I note that Category:English language similarly belongs to Category:Middle English language, although a major difference in Wiktionary treatment is that ==English== does not cover Category:Middle English language or Category:Old English language. —Fish bowl (talk) 11:19, 7 February 2022 (UTC)Reply

@Fish bowl I'm not a Chinese editor, but from a outside perspective, that'd feel more weird to see (how can Chinese be the ancestor of Old Chinese?). It's also make the Chinese lects go like "Chinese -> Old Chinese -> Middle Chinese -> lect", which seems more confusing to me. Honestly, at this rate. I'd just remove Chinese from the family tree entirely with its current treatment. Or, at least make it on the same level as Old Chinese, rather than an ancestor. AG202 (talk) 04:33, 4 March 2022 (UTC)Reply
I don't see a problem with the current setup...? As AG says, making modern Chinese an ancestor of Old Chinese would be weird (and wrong). I think the current setup is fine...? The only awkwardness is that we group all kinds of Chinese under one L2. I'm going to <s>strike</s> this as no action taken so it can be archived soon, since the discussion is very old and this page is very large, but you can reopen it if you object... - -sche (discuss) 05:33, 31 March 2024 (UTC)Reply


RFM discussion: December 2023–February 2024 edit

 

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


Language codes for the proto forms of upper branches of Dravidian (south, south-central, central, north)

Has been discussed many times and was also agreed up to add them like here and here

The proposal was because there are tons of terms which are restricted to certain branches, sometime being only attested in like 4 languages of 1 branch (*cinki for example) in a family of 80+ languages with 4 branches but are reconstructed to the proto stage, {{R:dra:DL}} has many reconstructions for the inner branches. {{R:dra:Southworth}} mentions many cases where BK reconstructs terms to PD when they are restricted to certain branches like PSD or PCD further saying it should be reconstructed only to the proto branch only. AleksiB 1945 (talk) 13:21, 21 December 2023 (UTC)Reply

Support. Theknightwho (talk) 23:06, 30 December 2023 (UTC)Reply

Created (several days ago now): @AleksiB 1945 we have:

  • Proto-Central Dravidian dra-cen-pro
  • Proto-North Dravidain dra-nor-pro
  • Proto-South Dravidian dra-sou-pro
    • Proto-South Dravidian I dra-sdo-pro (covering what we formerly called South Dravidian)
    • Proto-South Dravidian II dra-sdt-pro (covering former South-Central Dravidian)

This is to properly align with the format our Proto-Dravidian entries have been following anyway. Theknightwho (talk) 00:03, 24 February 2024 (UTC)Reply

Return to the project page "Language treatment/Discussions".