WT:LTD redirects to this talk page, to ease using the aWa tool to archive discussions here. For more archived discussions, see the corresponding Wiktionary:-space page, Wiktionary:Language treatment/Discussions.
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
The term "Old East Slavic" must have been created to avoid offending Ukrainians and Belarusians and to please nationalists who use "Old Ukrainian" and "Old Belarusian", since "Old Russian" may imply to some that Belarusian/Ukrainian languages are derivations of Russian. However, "Old East Slavic" (Russian: древнеру́сский язы́к(drevnerússkij jazýk)) translates into Belarusian as старажытнару́ская мо́ва(staražytnarúskaja móva) and into Ukrainian as давньору́ська мо́ва(davnʹorúsʹka móva) (alternative names also exist, including "Old Ukrainian"). All three terms are literally translated as "Old/Ancient Russian/Rusian", not "Old East Slavic". Note that in Ukrainian ру́ський(rúsʹkyj) refers more specifically to Rus rather than modern Russia. (Unlike Russian and Belarusian, the Ukrainian term росі́йський(rosíjsʹkyj) means "Russian" both referring to ethnicity and the country. Cf. Russian ру́сский(rússkij) / росси́йский(rossíjskij) and Belarusian ру́скі(rúski) / расі́йскі(rasíjski)). Despite the current tensions between Russia and Ukraine, linguistically it makes much more sense to rename "Old East Slavic" to "Old Russian". Call me biased, whatever, but respectable East Slavic linguists all use "Old Russian", it's of the Ancient Rus, not modern Russia, the centre of which was in Kiev, modern Ukraine. :) --Anatoli(обсудить/вклад) 02:47, 12 March 2014 (UTC)Reply[reply]
I don't think the names that modern speakers use should really count. They're bound to have some kind of national pride in them, like discussions on the Wikipedia article show. (And I'm hoping to avoid a straw man, but by the same logic we'd call Dutch "Netherlandish", German "Dutch", Greek "Hellenic", Armenian "Hayeric", Hittite "Neshite", and so on...) I think we should focus purely on what modern English-language scholarship calls the language. —CodeCat 03:23, 12 March 2014 (UTC)Reply[reply]
Re: English language usage. Ngram, Ngram 2 can't even find "Old East Slavic (language)", not sure if I interpret this correctly or if it is possible to check the English usage via Google.
Being half-Russian, half-Ukrainian, I take pride in belonging to both but pride can take various ways. E.g. ру́ський(rúsʹkyj) is sometimes translated by Ukrainians as давньоукраї́нський(davnʹoukrajínsʹkyj, “Old Ukrainian”) and some Ukrainian nationalists claim they have nothing to do with Russian but others take pride in being the cradle of all Slavs. We all know the Serbo-Croatian language story and Israeli-Palestinian conflict doesn't make Hebrew and Arabic belong to different language families. Modern Ukrainian, Russian, Belarusian are distinct languages but they are all derived from one, so your comparison is not valid.
In any case, Belarusians (educated), considering the name of the language, are not even trying to deny that Belarusian is derived from Old Russian (=Rusian, Old East Slavic), the language of Rus, in Ukrainian, the term ру́ська мо́ва(rúsʹka móva) is also used, without "Old", since ру́ський(rúsʹkyj) has this meaning ((modern) Russian language is called росі́йська мо́ва(rosíjsʹka móva) in Ukrainian). Ukrainians also know that Украї́на(Ukrajína) is quite a new term, although there are some ridiculous claims about so-called "Ukr" tribes, derived directly from Proto-Slavic, bypassing Old Russian stage. I wonder what discussion you're referring to?
BTW, I disapprove the policy of Russian government and respect the right of Ukrainians to decide their fate but we are talking about language names and linguistics here. Bad timing for this discussion, considering the Ukrainian and Crimean crisis. --Anatoli(обсудить/вклад) 04:22, 12 March 2014 (UTC)Reply[reply]
We should probably invite our Ukrainian editors to this discussion to get a broader picture. --Anatoli(обсудить/вклад) 04:49, 12 March 2014 (UTC)
I think this should be decided solely by the most commonly used name in English. Thus, I tentatively support "Old Russian" based on the Ngrams above, and based on a hunch that English speakers are less likely to have to look up the language if it is called "Old Russian" than if it is called "Old East Slavic". However, there needs to be a more thorough investigation into the evidence, since Ngrams are often inaccurate and/or skewed by unexpected factors. --WikiTiki89 04:55, 12 March 2014 (UTC)Reply[reply]
Agreed but Google books count of "Old Russian language" is also significantly larger than "Old East Slavic language" (even though "Old Russian language" may not always have the same sense as "древнерусский язык"). I'm happy if someone else can analyse it better using Google or other sources. --Anatoli(обсудить/вклад) 05:02, 12 March 2014 (UTC)Reply[reply]
Re: solely by the most commonly used name in English. We should also take into consideration the literal names in the affected languages. I think "Old East Slavic" is a result of some kind of political correctness or fairness, ignoring the fact that Ukrainian and Belarusian don't use this fairness in the name; also note French "vieux russe", Estonian "vanavene keel", Dutch "Oudrussisch" and many Slavic names. The rest of the names must have followed English Wikipedia in their naming. --Anatoli(обсудить/вклад) 05:21, 12 March 2014 (UTC)
I disagree: whatever the origin of its usage, if "Old East Slavic" were shown to be used more commonly in English, then that is the name we should use. The only other factors we should consider are the ambiguity of the name (for example, if "That Language There" were the most commonly used term for some language, we probably still shouldn't use it), or possibly some other minor issues that I can't think of at the moment. --WikiTiki89 05:30, 12 March 2014 (UTC)
I meant it seems it was coined out of some considerations but I agree that if it were more common we should use it. I can't think of another example but in Japanese ハングル語 (Hanguru-go) "Hangeul language" was specifically coined, out of political correctness, to avoid giving preferences to either Korea or their languages (North and South Koreas have different names in East Asian languages, see 韓国 and 朝鮮). --Anatoli(обсудить/вклад) 05:38, 12 March 2014 (UTC)Reply[reply]
If you agree that if it were more common then we should use it, then isn't that the same thing as only relying on which is more common? --WikiTiki89 05:43, 12 March 2014 (UTC)
It is not easy to get a sense of which name is most common in English. Chaff like "an old Russian car" skews counts of "Old East Slavic" vs "Old Russian", but if "language" is added, the number of hits drops so low that it is statistically insignificant / unreliable. However, based on manual review of a search with "in", I think "Old Russian" is the more common of the two names. - -sche(discuss) 06:25, 12 March 2014 (UTC)Reply[reply]
-sche's data
On BGC, "Old Russian language" gets 4 relevant hits, 7 chaff hits of things like "thirty-three-year-old Russian-language specialist" or "the old [=venerable] Russian language", and 2 hits alongside "Old East Slavic language" in Library of Congress subject heading catalogues.
"Old East Slavic language" gets 3 relevant hits, 6 hits in LOC subject heading catalogues or printed editions of Wikipedia, and 1 hit that actually uses both terms: "the purely vernacular, Old Russian — or, more precisely, Old East Slavic — language".
"in Old East Slavic" gets 13 relevant hits, 2 in books that also use "Old Russian".
"in Old Russian" gets 32 clearly relevant hits, 25 clearly irrelevant hits (like "dressed in old Russian uniforms") and 13 hits where it's not clear whether it means Old Russian or just Russian that is old, and that's just the first 7 pages: I stopped counting after that because it was clear at that point that "Old Russian" was more common than "Old East Slavic".
I support the rename to Old Russian per the following Google Scholar data, showing that "Old Russian" is roughly 10 times as common as all other names I could think of, combined. —Aɴɢʀ (talk) 19:20, 12 March 2014 (UTC)
Google Scholar data
"Old Russian" language – 13,200 hits
"Old Ukrainian" language – 847 hits
"Old East Slavic" language – 167 hits
"Old Belarusian" language – 139 hits
"Old Ruthenian" language – 96 hits
"Old Byelorussian" language – 38 hits
"Old East Slavonic" language – 26 hits
"Old Eastern Slavic" language – 9 hits
"Old Eastern Slavonic" language – 4 hits
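As a quick arithmetic check on the figures listed above, the following Python sketch sums the hit counts for every name other than "Old Russian" and compares that total with the "Old Russian" count. The dictionary only transcribes the numbers from the list; the function and variable names are illustrative and are not part of any Wiktionary tooling.

```python
# Google Scholar hit counts, transcribed from the list in this discussion.
hits = {
    "Old Russian": 13200,
    "Old Ukrainian": 847,
    "Old East Slavic": 167,
    "Old Belarusian": 139,
    "Old Ruthenian": 96,
    "Old Byelorussian": 38,
    "Old East Slavonic": 26,
    "Old Eastern Slavic": 9,
    "Old Eastern Slavonic": 4,
}


def ratio_to_rest(counts, name):
    """Return counts[name] divided by the sum of all the other counts."""
    rest = sum(value for key, value in counts.items() if key != name)
    return counts[name] / rest


# Total for every name except "Old Russian", and the resulting ratio.
others_total = sum(hits.values()) - hits["Old Russian"]
ratio = ratio_to_rest(hits, "Old Russian")

print(others_total)      # 1326
print(round(ratio, 2))   # 9.95
```

On these figures the other names total 1,326 hits, so "Old Russian" comes out at about ten times their combined count. Raw search-engine counts are noisy, so this is only a sanity check on the arithmetic and not evidence in itself.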
“Old Russian” reflects systemic prejudices that go back to the times of the empires, when Encyclopedia Britannica wrote about White Russians and Little Russians, some of whom were Ruthenians. These prejudices were still felt in academia and journalism when the Soviet Union broke up, and are only going away now. I don’t think Wiktionary is part of an establishment that wilfully tries to reinforce these backward practices.
Furthermore, the moment when the Russian regime is exploiting such ideas in its anti-Ukrainian propaganda is the shittiest possible time to start this discussion. —MichaelZ. 2014-03-12 19:18 z
I feel kind of the same about it. I can't motivate the choice with widespread usage, but I feel that "Old East Slavic" just is a better, more descriptive name for the language. So I oppose. —CodeCat 19:42, 12 March 2014 (UTC)Reply[reply]
@Mzajac I agree about the bad timing and with the criticism of the Russian regime, and I'm very sad about the latest events: Putin, in a short period, has created a situation which was unimaginable for a hundred years, Russians fighting Ukrainians. I wish we didn't mix politics in here, though. The political split of Yugoslavia caused the artificial language split but we all know that Serbo-Croatian is one language, despite the tensions. Hebrew and Arabic are still Semitic languages and Ukrainian and Russian are still derived from the same source (after a big injection of Old Church Slavonic into Russian and a Polish injection into Ukrainian). I'm not asserting any Russian supremacy or supporting any name-calling. Regarding the names, I disagree that "White Russians" and "Little Russians" were a result of prejudice; "Белару́сь" (Бе́лая Русь) and "Малоро́ссия" are just historical words, names among others like "Great Rus", "Red Rus", "Black Rus", "Carpathian Rus". Nikolay Gogol, native of modern Ukraine, also used "Малороссия" when referring to his homeland. The "offensive" meaning (if "Малоро́ссия" ("Little Rus(sia)") sounds offensive to Ukrainians) was acquired mistakenly in modern times. In any case, Русь(Rusĭ) originated on the territory of modern Ukraine by people who lived there and this is where all East Slavs originate from. I don't see, and Ukrainians don't see, anything offensive in the terms "давньору́ська мо́ва" and "ру́ська мо́ва" when referring to the Old East Slavic language, but I won't insist on continuing this discussion if it hurts someone's feelings. --Anatoli(обсудить/вклад) 02:10, 14 March 2014 (UTC)
Guys, call it Old Bulgarian and that's it. Though modern Bulgarian belongs to the southern branch, the early Bulgarian and Russian texts do show their being one language. Alexdubr (talk) 14:31, 17 March 2014 (UTC)Reply[reply]
You are confusing two different languages. Old Bulgarian, which we call Old Church Slavonic, was also a South Slavic language, but it was used even in Russia/Ukraine/etc. as the main written and liturgical language. The main spoken language, however was Old Russian or Old East Slavic, which was an East Slavic language and the ancestor of modern Russian/Ukrainian/etc. --WikiTiki89 16:14, 17 March 2014 (UTC)Reply[reply]
Result: not renamed; no consensus for renaming. - -sche(discuss) 02:16, 26 January 2015 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Merge Komi-Zyrian (kpv), Komi-Permyak (koi), Komi-Yodzyak (no code) with Komi (kv). Komi-Zyrian is dominating over others. Currently, they all use the same alphabet but there were differences in the past. There's very little information on the grammar differences but they are mostly considered dialects, not separate languages. Merging kv and kpv should be straightforward, Zyrian is the language of Komi Republic.-Anatoli(обсудить/вклад) 10:39, 6 April 2014 (UTC)Reply[reply]
Actually, it has been planned as far as Komi-Zyrian and Komi-Permyak are concerned, see WT:LANGTREAT: "kv and kpv refer to the same lect; one will eventually be deleted.". Apparently, there are differences with Komi-Permyak but those can be labeled as "Permyak", similar to Serbo-Croatian or Albanian varieties. --Anatoli(обсудить/вклад) 01:54, 7 April 2014 (UTC)
Komi-Permyak has a distinct written tradition and should not and cannot be merged. Your statement that Komi-Zyrian is dominating shows your pro-Russian and anti-minority POV which should never be a basis for consensus on this wiki. Komi-Zyrian can be handled under {{kv}}, yes. -- Liliana• 12:34, 7 April 2014 (UTC)Reply[reply]
Do you have paranoia or something? I'm not Putin and not suppressing any minorities. Komi-Zyrian is a language of majority of Komi people in Komi Republic, has a much larger number of speakers and much more materials written in this variety of Komi. Komi-Permyak is used by a minority in Perm Krai and has only 63,000 speakers. By all means, I'm not forcing anyone to merge, this is only a suggestion. They are mutually comprehensible and each of them has dialects. It's possible to merge Komi varieties like it's possible to have one L2 header for Albanian, Norwegian or Serbo-Croatian, marking varieties accordingly: тӧлысь (Zyryan)/тӧлісь (Permyak), выль (Zyryan)/виль (Permyak). --Anatoli(обсудить/вклад) 12:58, 7 April 2014 (UTC)Reply[reply]
That was completely ridiculous and has no place on this wiki. It shows your anti-Russian bias more than Anatoli ever showed any pro-Russian bias. —CodeCat 13:00, 7 April 2014 (UTC)
Thanks. I can't help having some Russian bias, though. This is my language, my culture. I can help with the Russian language and Russia-related topics, even if Russian may now be interesting perhaps as a "language of enemy" for some. I don't blame people for criticizing Russian politicians or looking at Russia with suspicion. I don't support Russian politics and propaganda but I don't have to apologize for being Russian either. :) Certainly I shouldn't be blamed for being against minorities. Why would I add contents for minority languages, if I wanted to suppress them? --Anatoli(обсудить/вклад) 13:19, 7 April 2014 (UTC)
There is one problem: as WT:DATACHECK reveals, "Komi-Zyrian language (kpv) has a canonical name that is not unique, it is also used by the code kv." Presumably one of the two needs to be retired (or, failing that, at least renamed). - -sche(discuss) 18:52, 2 May 2014 (UTC)Reply[reply]
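The kind of uniqueness check reported above can be sketched as follows. This is not the actual WT:DATACHECK code, only a minimal Python illustration of the idea; the table is a hypothetical three-entry excerpt that uses the codes and canonical names mentioned in this discussion, and the function name is made up for the example.

```python
from collections import defaultdict

# Hypothetical excerpt of a code -> canonical name table (assumption:
# both kv and kpv carry the name "Komi-Zyrian", as the check reports).
languages = {
    "kv": "Komi-Zyrian",
    "kpv": "Komi-Zyrian",
    "koi": "Komi-Permyak",
}


def find_duplicate_names(table):
    """Return {canonical name: sorted codes} for names used by more than one code."""
    codes_by_name = defaultdict(list)
    for code, name in table.items():
        codes_by_name[name].append(code)
    return {
        name: sorted(codes)
        for name, codes in codes_by_name.items()
        if len(codes) > 1
    }


duplicates = find_duplicate_names(languages)
print(duplicates)  # {'Komi-Zyrian': ['kpv', 'kv']}
```

Retiring one of the two codes, or renaming one of them, would make the result empty for this excerpt.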
Komi-Zyrian is less ambiguous and clearer. Although I favoured "Komi", perhaps we should retire it and leave Komi-Zyrian for kpv and kv and Komi-Permyak for koi. Komi-Zyrian is implied if Komi is used. --Anatoli(обсудить/вклад) 00:12, 3 May 2014 (UTC)
We are not supposed to treat dialects as independent languages. How are these not dialects? --Æ&Œ (talk) 20:21, 6 May 2014 (UTC)Reply[reply]
I support a merger of Campidanese (sro) and Logudorese (src) into Sardinian (sc), for reasons I outlined on my talk page and repeat here for others' benefit: those two lects differ from each other, quoth WP, "mostly in phonetics, which does not hamper intelligibility among the speakers". They are perhaps comparable to the dialects of Irish, with standard Sardinian (sc) existing as a unification of the dialects (again, comparable to Irish). Notably, we already include standard Sardinian (sc) — which means the additional inclusion of sro and src is quite schizophrenic.
There is some disagreement over whether Gallurese (sdn) and Sassarese (sdc) are dialects of Sardinian, dialects of Corsican, or languages separate from both Sardinian and Corsican. I would not merge them at this time. - -sche(discuss) 03:19, 12 May 2014 (UTC)
I've merged Campidanese (sro) and Logudorese (src) into Sardinian (sc). - -sche(discuss) 08:14, 17 August 2015 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
@-sche (and anyone else who loves language-name conflicts), we currently call aum "Abewa" and asa "Asu". Firstly, it seems that practically nobody calls aum by that name; Wikipedia and Ethnologue both title their entries for it "Asu", and Roger Blench, who may be the only person ever to study it, calls it "Asu" as well. As for asa, which currently occupies that name, Wikipedia and most hits on Google Books call it "Pare" (the remainder call it "Asu" or "Chasu", for the most part). As you may have guessed, there is already a language that we call "Pare", namely ppt, but this New Guinean language is also called "Akium-Pare" and (Wikipedia's choice) "Pa", which thankfully appears to be untaken. The chain of changing language names does seem rather silly, but the overall purpose of this is to move the only language out of these three that actually has any literature on it (and thus the one I just added a translation in), namely asa, to its most commonly used name. —Μετάknowledgediscuss/deeds 05:40, 9 September 2015 (UTC)
I'm down with renaming aum away from Abewa.
On a balance, I'm also OK with renaming asa away from Asu. I can find a decent amount of references to "Asu": it's hard to say whether more or less than "Pare" because both terms turn up so much chaff. There is some documentation of its vocabulary available, incidentally; The Making of a Mixed Language: The Case of Ma'a/Mbugu by Maarten Mous mentions "Pare (Chasu)" muruke "sweat" and tika "lift" (in the context of their having been borrowed without change into Normal Mbugu, and then glottalized into Inner Mbugu muru'u and ti'i); Mbugu also borrowed Pare ku-kasha "hunt", Zigua ku-kala "to hunt", and Shambaa u-kalá "hunting" and ngwilizi "eagle" (source of Normal + Inner Mbugu ngwirizi, variation of l/r being dialectal in Shambaa; contrast Pare ngwirini). Isaria N. Kimambo's Political history of the Pare of Tanzania, c. 1500-1900 (1969) implies Pare and Asu are different: "Other naming procedures include the use of u- for territorial names, e.g. Upare for the Pare country; and ki- for language, e.g. Kipare for the Pare language. The only exception here is Chasu which refers to the Asu language." John D. Kesby's Rangi of Tanzania: an introduction to their culture (1981) seems to clarify, however: "in the northeast of Tanzania, [...] the people called Pare in Swahili refer to themselves as Asu". (And Maarten Mous provides a bit more detail, that Pare/Asu has at least two dialects, north and south, with the north one apparently also being called Vudee and several spelling variations thereof.)
As for ppt, some works refer to it as "Pari", e.g. The Abandoned Narcotic: Kava and Cultural Instability refers to "the Pari (Pa) language" (not to be confused with lkr Päri - Lokoro). I guess we can rename it "Pa" for now, and switch to "Pari" later if something else called Pa comes up. - -sche(discuss) 05:48, 10 September 2015 (UTC)Reply[reply]
Renamed as proposed: aum from Abewa to Asu, asa from Asu to Pare, ppt from Pare to Pa. - -sche(discuss) 00:09, 27 September 2015 (UTC)
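A chain of renames like this one has to be applied in an order that never leaves two codes sharing a name. The Python sketch below shows one way such an order could be worked out; it is an illustration under assumptions, not a description of how the renames were actually carried out, and the function name `order_renames` is hypothetical.

```python
def order_renames(current, renames):
    """Order renames so that no two codes share a name at any step.

    current: dict mapping code -> current name
    renames: dict mapping code -> new name
    Raises ValueError if no safe order exists (e.g. two codes swapping
    names), in which case a temporary name would be needed.
    """
    names_in_use = dict(current)
    pending = dict(renames)
    ordered = []
    while pending:
        taken = set(names_in_use.values())
        # Pick any pending rename whose target name is currently free.
        code = next((c for c, new in pending.items() if new not in taken), None)
        if code is None:
            raise ValueError("no safe order; a temporary name is needed")
        names_in_use[code] = pending.pop(code)
        ordered.append((code, names_in_use[code]))
    return ordered


current = {"aum": "Abewa", "asa": "Asu", "ppt": "Pare"}
renames = {"aum": "Asu", "asa": "Pare", "ppt": "Pa"}

steps = order_renames(current, renames)
print(steps)  # [('ppt', 'Pa'), ('asa', 'Pare'), ('aum', 'Asu')]
```

For this chain the safe order runs from the end of the chain backwards: ppt frees "Pare", then asa frees "Asu", then aum can take it.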
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Renaming dig
From "Chidigo" to just "Digo", partly because we should try to purge prefixes from our language names where appropriate and partly because the latter name is vastly more used by linguists. —Μετάknowledgediscuss/deeds 19:35, 5 September 2015 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
We currently call pmm "Pomo", which makes it sound like the Pomoan "language" which some people hypothesize exists, and for this reason I almost added a translation (with nested translations for Northern and Central Pomo) using the code in this way. I propose it be renamed "Pol" or "Pol Pomo", the name Wikipedia and the International Encyclopedia of Linguistics use. - -sche(discuss) 13:57, 19 June 2015 (UTC)Reply[reply]
Damn, that's very rightfully confusing. Support "Pol", since that's what Wikipedia uses. —Μετάknowledgediscuss/deeds 07:17, 11 August 2015 (UTC)
The ISO merged aam "Aramanik" into aas "Aasax", saying here "Aramanik [aam] is listed as a Southern Nilotic language of the Nandi group, presumably because the Aramanik people assimilated to the Nandi. The original Aramanik language was a Cushitic language (or a non-Nilotic language with heavy Cushitic overlay) usually called Aasax (Fleming 1969) and is already included in a separate Aasax [aas] entry. Maarten Mous, in A Grammar of Iraqw, also gives them as synonyms." - -sche(discuss) 22:11, 2 November 2015 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
These are currently named "Gallurese Sardinian" and "Sassarese Sardinian", which has led to them sometimes being nested under Sardinian in translations tables, but this is erroneous because they are (transitional) dialects of Corsican spoken on Sardinia, not dialects of Sardinian. I propose to drop "Sardinian" from their names. See also WT:RFM#Sardinian_templates. - -sche(discuss) 17:52, 17 August 2015 (UTC)Reply[reply]
Or we could just merge them into co. —Aɴɢʀ (talk) 19:09, 17 August 2015 (UTC)
We could. They're subject to the same LDL CFI whether we keep them independent or merge them into Corsican (as contrasted with dialects of Italian, for example, which would find themselves subject to much higher CFI if merged into it), and a merger would reduce duplication while we could still note differences with {{label}}s and {{qualifier}}s... so perhaps we even should merge them. But they occupy grey areas. Gallurese is transitional between Corsican and Sardinian; Sassarese is transitional between Corsican, Sardinian and Tuscan (which I guess we consider it?). Ah, dialect continua... - -sche(discuss) 01:45, 18 August 2015 (UTC)Reply[reply]
Definitely support dropping the "Sardinian" in their names; abstain on whether or not to merge them into co (but oppose any merger of anything into it). —Μετάknowledgediscuss/deeds 20:50, 10 September 2015 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Pretty much every source I could find referred to Zamboangueño et al. as varieties or dialects, not languages. Two of them in particular make it very clear:
“The result of the study showed that while there are observable differences in certain language features between and among the four variants, they are nonetheless, mutually intelligible with each other even among native speakers who do not have any special language training. Thus, for the purpose of this pilot study, all four variants were identified as dialects of PCS.” Sister María Isabelita O. Riego de Dios, A Pilot Study on the Dialects of Philippine Creole Spanish
“The two variants of PCS share enough distinctive differences from regular Spanish or regular Philippine usage that they must be considered historically related dialects of the same language” Charles O. Frake, Lexical Origins and Semantic Structure in Philippine Creole Spanish
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
There are two problems here: First, we currently have the code yaf referred to a language we call Kiyaka, but which Wikipedia calls Yaka (without the noun class prefix) and which authors on Google Books seem to agree to call Yaka as well. There are a couple other languages sometimes called Yaka, but fortunately they all have other names that are more common and therefore there is no conflict, and the principal name of yaf should therefore be modified; there are very few categories associated with this one, so it should be easy to change.
Secondly, the Wikipedia article states that the codes noq, ppp, and lnz refer to its dialects, but Glottolog seems to consider them separate languages (possibly just following the ISO rather than actually making a judgement). Therefore we should consider merging these codes, if in fact there are not enough differences (some data would be helpful). @-sche —Μετάknowledgediscuss/deeds 02:16, 18 June 2015 (UTC)Reply[reply]
axk currently goes by "Yaka"; if yaf were renamed "Yaka", yaf would need to use a disambiguator or axk would need to be renamed, but to what? axk's alt name of "Aka" is taken (by soh), and I haven't offhand found evidence that anyone calls it by its alt name "Beka".
Ethnologue, although it grants Ngoongo a separate code (noq), labels it a dialect of Yaka in its entry on Yaka. I can find a German reference stating "Eng mit den Yaka verwandt sind die Lonzo, Pelende und Suku." ("Closely related to the Yaka are the Lonzo, Pelende and Suku", the last of which WP and Ethnologue consider to speak a separate language.) A French reference says "Les Pelende ont un accent linguistique propre, mais ils s'entendent avec les Yaka, Suku, Lonzo, Luwa, Hungana, Tsamba, Ngongo, Mbala et Kongo." ("The Pelendes have their own linguistic accent, but they get along with / can understand the Yaka, Suku, Lonzo, Luwa, Hungana, Tsamba, Ngongo, Mbala and Kongo.") Another says "Le kipelênde comme le kiyaka est un dialecte du kikongo commun. Plus répandu, «Le kiyaka comprend quelques neuf dialectes distincts, présentant parfois des variantes assez considérables.»" ("The Kipelênde like Kiyaka is a dialect of the common Kikongo. More widespread, "The Kiyaka includes some nine separate dialects, sometimes with quite considerable variations.")
I can find a small Yaka corpus, but not any comparison of the different [might-be-]dialects.
A conservative approach might leave the codes separate until such time as someone comes along with words in them. - -sche(discuss) 16:20, 18 June 2015 (UTC)Reply[reply]
Hmm, re names: soh is also called (Jebel) Silak, or (Jebel) Sillok; Wikipedia uses the name Sillok, but I'm not finding many resources to assess how common that is (and some refer to it by a hyphenated string of dialects). If we can move soh (and perhaps should anyway), then we could move the rest down without disambiguating (so axk would be Aka, and yaf would be Yaka). —Μετάknowledgediscuss/deeds 23:22, 18 June 2015 (UTC)
I sometimes wonder if we should start preferring disambiguators to alt names: they'd be annoying to type, but I think the mere fact that we're discussing a three-link chain of renames (language A takes B's name, B takes C's name, C takes D's name) shows how much clearer they'd be. I pity the new user who e.g. adds aja content under an ==Adja== header, and I pity the veteran user who has to notice that that has happened. In this case, I can't find evidence of soh being called Sillok, but the people who speak it and the place they live are called Sillok, so at least it wouldn't be unclear. Perhaps we could just rename axk and soh to have disambiguators, though: "Aka (Congo)" and "Aka (Sudan)". - -sche(discuss) 02:02, 19 June 2015 (UTC)Reply[reply]
I suggest renaming axk and soh to "Aka (Central Africa)" and "Aka (Sudan)", and then renaming yaf as originally proposed. - -sche(discuss) 06:52, 4 July 2015 (UTC)Reply[reply]
Renamed as proposed. What to do with the [might-be-]dialects remains to be determined. - -sche(discuss) 03:47, 12 July 2015 (UTC)
@-sche: I've generally followed the guideline that we avoid such parenthetical geographic locators; were we to use them in general, it would change a great deal of our names. I know few others care, but perhaps we ought to put this to the community at large in the BP? —Μετάknowledgediscuss/deeds 19:56, 28 July 2015 (UTC)Reply[reply]
In this specific case, I think there is compelling reason to deviate from the general practice/guideline of preferring alt names to parentheticals even without changing that guideline: in order to use alt names here, we'd have to chain-rename ≥3 languages such that the name each one most often went by was assigned to a different one, and one of them would end up with an unattested name, which would all be extremely confusing. As for whether/how to change the general guideline: I'll think the matter through more thoroughly before I post anything in the BP. I don't think I'd propose switching to parentheticals in all cases (I think, for instance, that Pyu/Tircul and Riang/Reang use different scripts and so are unlikely to be mixed up). I would only prefer parentheticals where people would be likely to mix up which language was meant by a given name, and where the mix-up would be likely to go unnoticed (e.g. because the script was the same). - -sche(discuss) 21:23, 28 July 2015 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
There are far more Tonga languages than anyone would want to have to deal with, but I am excluding all but two of them in this discussion for the sake of ease. We currently call toi and tog "Tonga" and "Chitonga", but both languages are called by both names, and use of alternative names seems to be vanishingly rare. Our current system is leaving me (and at least one person who tried to give a translation in one of the languages) thoroughly confused, so as much as I find them ungainly, I'd much rather we use parenthetical geographic identifiers than have to go through this madness. (Pinging @-sche as usual (and you ought to take a look at the other ones I've posted recently on this page when you get a chance, if you are so inclined).) —Μετάknowledgediscuss/deeds 05:28, 20 September 2015 (UTC)
I'm all in favor of parenthetical disambiguators, but what should they be? Wikipedia calls toi Tonga language (Zambia and Zimbabwe) and tog Tonga (Nyasa) language, but that seems suboptimal to me since the parentheticals aren't parallel. Ethnologue suggests the majority of toi speakers are in Zambia and all tog speakers are in Malawi, so how about "Tonga (Zambia)" and "Tonga (Malawi)"? —Aɴɢʀ (talk) 13:25, 20 September 2015 (UTC)
Support a rename of toi to "Tonga (Zambia)" and of tog to "Tonga (Malawi)". While we're at it, I think we prefer (do we?) to drop "ki-", "chi-", "gi-" and such African language-name prefixes, so toh could be renamed from "Gitonga" to "Tonga (Mozambique)". Happily, to is distinct as Tongan, and we don't have tnz yet, but it seems to be consistently called "Ten'edn" or "Maniq" (the latter being properly an ethnonym) by its speakers, who SIL says are totally unfamiliar with "Tonga" as the name of a language. - -sche(discuss) 23:50, 26 September 2015 (UTC)Reply[reply]
@-sche: I don't know if we've talked about it before, but I think in general we should avoid language-name prefixes. However, there are some exceptions; I prefer "Luganda" to "Ganda", for example, because it's far more commonly used. I'm fine with renaming Gitonga as you suggest. By the way, thanks for dealing with some of these language issues; Kikuyu and Rwanda-Rundi are still lingering on this page, so please give them some love/research when you have a chance. —Μετάknowledgediscuss/deeds 04:07, 27 September 2015 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Gambian Wolof (wof) should be merged into Wolof (wo), IMO. Ethnologue says "Senegalese Wolof [wol] intelligible by speakers of Gambian Wolof but with significant enough differences to require adaptation of materials", which seems to have been their motive in splitting these lects (like so many others). - -sche(discuss) 06:51, 25 February 2016 (UTC)Reply[reply]
Support. We can make Gambian Wolof a regional dialect of Wolof and tag relevant words {{lb|wo|Gambia}}. —Aɴɢʀ (talk) 08:38, 25 February 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Renaming skc
Currently called "Sauk", but Wikipedia, SIL publications on the language, and work by Alexandra Aikhenvald all call it "Ma Manda". A happy side effect of the move would be that we could add "Sauk" as an alias for sac, as it is probably the second most common name for that language. —Μετάknowledgediscuss/deeds 06:34, 3 November 2015 (UTC)Reply[reply]
Renamed per nom. Sure enough, the only entries with "Sauk" translations are referring to sac. - -sche(discuss) 09:07, 27 February 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Considering war and wrz
Our current situation is to call war "Waray-Waray" and wrz "Waray"; this is not necessarily an optimal solution. Wikipedia chooses to call war "Waray" and wrz "Warray"; although "Warray" is less common than "Waray" to refer to wrz (as far as I can tell), this gives the commonest name of war to that language, which probably deserves priority due to being much more studied. At Template talk:war, you can see that the idea to rename war to "Winaray" was rejected and Liliana's choice of "Waray-Waray" won out. However, it's clear that our current situation has caused some confusion (User:DTLHS/cleanup/mismatched translation codes shows a lot of misuse of wrz when war was intended). Basically, what we have now isn't bad, but the fact is that it's resulted in mismatched codes, so we might want to try a different approach. —Μετάknowledgediscuss/deeds 20:46, 14 September 2015 (UTC)Reply[reply]
I'd prefer minimizing ambiguity by calling war "Waray-Waray" and wrz "Warray" so that no language at all is called by the ambiguous name "Waray". —Aɴɢʀ (talk) 14:56, 15 September 2015 (UTC)Reply[reply]
To be fair I think most of the mistakes were caused when the language was renamed but the translations weren't edited, not by whoever added them in the first place. DTLHS (talk) 23:49, 26 September 2015 (UTC)Reply[reply]
I've renamed wrz in the manner Angr suggested. - -sche(discuss) 09:04, 27 February 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
In 2011, the ISO retired the code for Tingal [tie], merging it into Tegali [ras]. I think we should follow suit. Wikipedia notes that there is dialectal variation in Tegali, but it's not between Tegali proper and Tingal, it is rather between Tegali proper and Rashad (but even those dialects are "nearly identical"). - -sche(discuss) 06:15, 11 August 2015 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Renaming bya
-sche has pointed out that we call this language "Batak"; that is an awful idea, due to the existence of the Batak languages. To reduce confusion, we should do as Wikipedia does, and call it "Palawan Batak". —Μετάknowledgediscuss/deeds 06:18, 29 February 2016 (UTC)Reply[reply]
Support, naturally. "Palawan Batak" still sounds like it might be a Batak language, but at least it stops people from seeing one of the Batak languages and entering it into Wiktionary as bya (compare the recent rename of "Pomo" to "Pol Pomo"). - -sche(discuss) 02:03, 4 March 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Shuadit (sdt) aka Judeo-Occitan or Judeo-Provençal is most definitely not an independent language. The literature seems to be quite sure on this point: Banitt refers to it as a "langue fantôme", Vouland as a "non-langue", and Alessio as a "langue imaginaire", to quote three especially scathing francophone scholars. One main scholar is Szajkowski, who seems to have made up a great deal about it (including the name Shuadit, of which there is no evidence of use) and who "was no linguist, and his knowledge of Occitan was quite poor" (Strich and Jochnowitz). Moreover, the so-called last speaker, and a chief primary source, Armand Lunel, was evidently a semi-speaker who was not actually fluent in the "language". Glottolog sums all this up by saying: "This entry [Shuadit] is spurious. This means either that the language denoted cannot be asserted to be/have been a language distinct from all others, or that the language denoted is covered in another entry." To the extent that anyone wants to enter the paltry Hebrew-script text, it can be done as Old Provençal or Occitan, depending on how old it is. —Μετάknowledgediscuss/deeds 06:33, 14 April 2016 (UTC)Reply[reply]
@Metaknowledge: Good to know. Is there any evidence that it was a dialect, or was it basically just a script variant like Judeo‐French? --Romanophile♞ (contributions) 06:41, 14 April 2016 (UTC)Reply[reply]
Pretty much just a script variant. That was a popular thing to do, because using the Latin script was seen as too associated with the Church and distant from the Jewish educational tradition. There have been many other isolated incidences of languages like Urdu and Samogitian being written in Hebrew script as well. —Μετάknowledgediscuss/deeds 06:45, 14 April 2016 (UTC)Reply[reply]
@Metaknowledge: do you think that much of Judaeo‐Romance contains only superficial differences? Most of them are considered ‘extinct’ by Wikipedia, barring Ladino and Judeo‐Italian. Judaeo‐Italian notwithstanding, Ladino appears to be the language of Latin Jews around the world. --Romanophile♞ (contributions) 07:13, 14 April 2016 (UTC)
I think so, but I'd like to read through the chapter on each in the Handbook and consult some other sources before deciding. I obviously like Jewish languages, but I think that the line between language and dialect is being abused, and that we should find better ways to document these. —Μετάknowledgediscuss/deeds 07:20, 14 April 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
This name is used rather uncommonly, and really only by Jewish philologists, not mainstream linguists. It should be changed to the much more common "Judeo-French".
Unrelated to the name change, I'm not fully sure that we should have this as a separate language. Kiwitt and Dörr (2015) say: "It should be noted, however, that the major part of linguistic data attested in Judeo-French sources is simply common Old French written in Hebrew script, with some texts showing little to no register variation in comparison with Christian Old French sources." They go on to discuss one extensive text, a biblical glossary, where only 6% of words were not attested in Christian Old French texts. Basically, this is similar to Hindi and Urdu — is it worth keeping separate? (And if you think it'd be strange to have Hebrew-script entries under ==Old French==, remember that we have Arabic-script entries under ==Afrikaans==.) @-sche, Renard Migrant, Wikitiki89 —Μετάknowledgediscuss/deeds 04:52, 13 April 2016 (UTC)Reply[reply]
Interesting. I’d be fine with a merge or a rename. Script variants and dialectisms can simply be marked with Judeo-French. I would love to work on this dialect, but I have no idea where to find any texts. --Romanophile♞ (contributions) 05:08, 13 April 2016 (UTC)Reply[reply]
Before commenting here, I was planning to read the Judeo-French chapter of my new copy of the Handbook of Jewish Languages, but I now realize that Metaknowledge also recently acquired this book and has probably already read this chapter and probably only started this discussion because of that. So I'll just assume that his conclusion is the same that I would have drawn and say that I agree that this should be merged with Old French. --WikiTiki89 21:10, 13 April 2016 (UTC)
Precisely. I intend to work through the entire book, improving how we cover Jewish languages. —Μετάknowledgediscuss/deeds 01:10, 14 April 2016 (UTC)Reply[reply]
Rename per nom; see also ngrams. Also merge per nom; perusing the references that turn up if I just search for references that mention both lects (google books:"Judeo-French" "Old French"), I find that they agree:
Raphael Patai, Encyclopedia of Jewish Folklore and Traditions (2015), page 316: "Judeo-French is Medieval (Old) French as spoken and written by French and Rhenish Jews. It differs from the other “Judeo” languages in that there were no dialectal differences between it and the Old French spoken by the non-Jews[.]"
Aaron D. Rubin, Lily Kahn, Handbook of Jewish Languages (2015), page 139: "However, this term does not imply the existence of a set of linguistic features common to these sources that would allow identifying a 'Judeo-French' language or dialect distinct from the varieties of Old French encountered in Christian sources."
Yes, this will result in Hebrew-script Old French (alternative-form-of) entries, and that is OK. - -sche(discuss) 00:16, 14 April 2016 (UTC)Reply[reply]
A note: I do think we should allow this code to be used for etymology-only uses, however. —Μετάknowledgediscuss/deeds 01:10, 14 April 2016 (UTC)Reply[reply]
If I remember correctly, a number of Old French words are attested first in Rashi's writings, and those writings have been used in recent years by scholars of Old French to fill in gaps in knowledge of other aspects of the language. I'm sure some of our etymologies already include those Zarphatic words as Old French, so we might as well make it official. I agree we should both rename the lect and merge it with Old French. Chuck Entz (talk) 02:29, 14 April 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Hello everyone. I'm curious as to how one would go about requesting an exceptional code for the standard Kichwa language? There are several SIL/Ethnologue codes for various Kichwa dialects (Imbabura (qvi), Chimborazo (qug), Cañar Highland (qxr), etc), but not one is for the standard Kichwa language that's taught in schools and used by the government in Ecuador. A common Quechua code (qu) is currently used for Quechua Wiktionary and Wikipedia, but both projects are exclusively written in standard Southern Quechua. Both Kichwa and Southern Quechua are part of the Quechua II branch of the Quechuan languages, but they are different dialects with different standardized grammars and different standardized writing conventions. --Dijan (talk) 18:28, 12 May 2014 (UTC)
Hmm... how mutually intelligible are Kichwa and Quechua? It wasn't that long ago that the various Quechua dialects' codes were removed from Module:languages, though I'm having trouble locating the discussion(s) that led to that.
They are mutually intelligible for the most part (similar to differences between Danish and Swedish), but they are also sufficiently different from each other to be considered separate. Linguistically, Kichwa belongs to a different subgroup of Quechua. The grammar of Kichwa is more simplified (loss of possessive suffixes, loss of the voiceless uvular fricative, etc) and the vocabulary is affected by native languages spoken before the Incan conquest of the territory of today's Ecuador and Colombia (meaning, Quechua was imposed as a foreign language, whereas it is a native language in the regions where Southern Quechua is spoken - in southern Peru and Bolivia).
I was referring to designing our own code and using that as an umbrella for all the Kichwa varieties - which now all use a standardized alphabet different from the Peruvian varieties (such as Southern Quechua), but I couldn't find the procedure for it.
There was an attempt to create a separate Kichwa Wikipedia, but apparently no one got around to it and it got complicated as someone pointed out that an official ISO code must be requested specifically for the standard variety. And for some the problem was that it was trying to use the Chimborazo (qug) code (which is one of the most widely spoken varieties of Kichwa). Apparently the issuing of codes is very strict on Wikipedia. --Dijan (talk) 20:57, 27 May 2014 (UTC)
@Liliana-60, what are your thoughts on this? The Quechua dialects themselves were merged into qu, though I can't find the discussion that led to that. - -sche(discuss) 03:08, 29 February 2016 (UTC)Reply[reply]
If none of the ISO codes cover this we should create our own code. The languages seem different enough that they warrant separate treatment, at least. -- Liliana• 21:43, 29 February 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Although LANGTREAT notes that they should be subsumed into fa, the following codes still exist in Module:languages:
pes, for "Western Persian", the most common variety of Persian. We should merge it into fa because it *is* fa.
prs, Eastern Persian / Dari, the variety of Persian spoken in Afghanistan. Its status as a separate language, and its very name 'Dari', were products of Afghanistani politics. Not even Afghani speakers of the language call it 'Dari' or consider it separate from Persian; we shouldn't consider it separate, either.
aiq, Aimaq, a variety spoken by nomads in Afghanistan and Iran. It is sometimes considered specifically a variety of Dari (which is itself little more than another name for Persian, as explained). It differs from standard Persian mainly in matters of pronunciation, something we usually handle with {{a}} rather than separate L2s.
haz, Hazaragi, another Afghan variety. WP summarizes scholarly opinion (with citations, for which look here): "The primary differences between Standard Persian and Hazaragi are the accent and Hazaragi's greater array of Mongolic loanwords. Despite these differences, Hazaragi is mutually intelligible with other regional Persian dialects."
deh and phv, Dehwari and Pahlavani, which it is hard to find information on because even WP simply redirects the words to "Persian".
As indicated above, my opinion is that we should merge all of those codes into fa.
Incidentally, LANGTREAT originally also banned Tajik (tg), but this was not supported by scholarship or by our own practice (we had hundreds of Tajik entries), so after two discussions, I updated the page to note that Tajik is allowed. LANGTREAT made no mention of Judeo-Persian (jpr), Bukhari (bhh), Judeo-Tat (jdt) or Tat (ttt), and past discussions of them have assumed they were separate languages, so I also updated the page to reflect that. - -sche(discuss) 05:16, 20 November 2013 (UTC) (fixed think-o)Reply[reply]
The Persian lects are an interesting issue; they are on the whole pretty similar, but Persian and Tajik have separate literary and cultural traditions, and I believe Dari does too. I think it is best to keep them separate. The Jewish varieties are often written in the Hebrew script and have a separate cultural tradition, so I think it would probably be handy to keep them separate as well. All the rest probably ought to be merged into their macrolanguages, unless there are script conflicts I'm unaware of (it is, of course, easier to keep with one script per language). —Μετάknowledgediscuss/deeds 05:23, 20 November 2013 (UTC)Reply[reply]
The difference between Dari and Persian is not great AFAIK, there are some references on Wikipedia. Tajik should stay separate, not just because it's in Cyrillic. It's very different from Persian and has many Russian and Turkic loanwords. There is also a significant difference in phonology (vowels). Persian e, o and â are usually i, u and o in Tajik. ZxxZxxZ (talk • contribs) may be able to say a bit more. --Anatoli(обсудить/вклад) 05:36, 20 November 2013 (UTC)Reply[reply]
We also currently include "Parsi" (prp) and "Parsi-Dari" (prd), which Wikipedia suggests are spurious(!). - -sche(discuss) 05:58, 3 December 2013 (UTC)Reply[reply]
Regarding Dari, to merge Dari into Persian, we would have to figure out what to do with transliterations. In standard Persian, ē merged with ī into what we transliterate as i (no diacritic) and ō merged with ū into what we transliterate as u. In Dari, ē and ī and ō and ū are still differentiated. Further complicating the problem, standard Persian e, o, ey, and ow are pronounced i, o, ay and aw in Dari, although these differences are not phonemic. The other differences are not as much of a problem, but see a brief description at w:Dari#Phonology. --WikiTiki89 14:37, 4 December 2013 (UTC)
That has been the practice on Wiktionary. Dari should be under the Persian heading with {{context|Dari|lang=fa}} label. That is what we do already. Take a look at فاکولته, for example. Regarding the transliteration of long vowels, ZxxZxxZ (talk • contribs) has been trying to implement a classical Persian transliteration (which is pretty much used for Dari by scholars) as a standard for all Persian entries. So far, it's been a slow and selective process. I'm not opposed to it and it can easily be indicated as the standard Wiktionary practice in the Appendix:Persian transliteration. --Dijan (talk) 06:35, 5 December 2013 (UTC)Reply[reply]
By the way, if you happen to know anything about Parsi or Parsi-Dari (and about their relationship to Farsi and Dari), your input on that subject would also be appreciated over in the same section. :) - -sche(discuss) 00:53, 6 December 2013 (UTC)Reply[reply]
I support keeping Dari (and Western Persian) under Persian heading, but I'm not sure what to do about transliteration. By the way, Eastern Persian / Dari (prs) is the variety of Persian spoken in Afghanistan, not Iran, that of Iran is the first one, Western Persian (pes). --Z 17:48, 12 December 2013 (UTC)Reply[reply]
I've deleted "Parsi" (prp) and "Parsi-Dari" (prd). - -sche(discuss) 06:36, 5 February 2015 (UTC)
James Minahan says "The Hazara language, called Hazaragi, is a Farsi dialect, although the Hazaras are physically Mongol. The intermixture of the Indo-European and Mongol linguistic groups resulted in a dialect of Dari Persian that contains extensive words and forms from Farsi, Turkic, and Mongol. [...] Most Hazaras also speak Dari Persian [...] as a second language. In Iran most are bilingual, speaking both Hazaragi and Farsi. [...] Until the 1980s educated Hazaras used Farsi or Arabic as the literary language, but a movement to create a Hazaragi literary language has gained momentum." This suggests that Hazaragi is a separate language; we also already have a few translations into it. Therefore, I've kept it separate and updated WT:LT accordingly.
Barbara West, in the Encyclopedia of the Peoples of Asia and Oceania, writes that the Aimaq speak "Aimaq, a dialect of Dari or Afghan Persian"; Brian Williams writes that "Aimaqs also have a strong Persian admixture, and their language is Dari or Farsi", Jake Kircher writes that "Once there was a generally used, common Aimaq language but now, few seem to speak it anymore. Dialects spoken today resemble Dari (Afghan eastern Farsi) admixed with words of Mongolian and Turkic origin." In general it seems that it should be handled the same way as Dari, via {{lb}} and {{a}}.
Several references consider Pahlavani an alternative term for Pahlavi, but the 2003 International Encyclopedia of Linguistics says Pahlavani is still spoken today in a village in Afghanistan, Haji Hamza Khan, where it is "similar to Dari Persian but still distinct" (earlier editions of the IEL are explicit that this is the only village where the language is spoken). I suppose it can be left unresolved for now.
Dehwari is spoken by Persians in Baluchistan; several references seem to consider it to be Persian, but Denys Bray's 1934 The Brāhūī problem writes as if it and Persian are separate languages; I suppose it too can be left unresolved for now.
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
We currently have separate codes for three lects considered part of the Lega macrolanguage: lea, lgm, and khx. These are reasonable to separate into two languages, lgm (which should be called Lega-Ntara) and lea (which should be called Lega-Malinga) as there is 67% mutual intelligibility, but khx is clearly a variety of lea. All this is per the treatment of the Beya dialect of lea in The Bantu Languages. —Μετάknowledgediscuss/deeds 04:23, 22 November 2015 (UTC)Reply[reply]
Support; merge khx into lea, with lgm remaining separate, and name everything per nom. I'm not sure why Wikipedia names the two main dialects using placenames. The full array of alternate names I encountered in (cursorily) researching the matter:
Mwenga Lega = Lega-Ntara / Lega Ntara (variously translated in refs as "Lower Lega", "Upper Lega" or "Eastern/Northern Lega") = Isile, Ishile, Kisile; Mwenda-Liga
Shabunda Lega = Lega-Malinga / Lega Malinga (variously translated in refs as "Upper Lega", "Lower Lega" or "Forest Lega" or "Western/Southern Lega") = Lega (Kilega) / Liga (Kiliga) proper; dialects: Kanu (Kikanu), Gala (Kigala), Yoma (Kiyoma), Sede (Kisede), Gonzabale, Beya (Beia), and possibly (Ki)Nyamunsange and Banagabo and Kabango and Bene
I've merged khx into lea. However, as names go, "Lega-Shabunda" and "Lega-Mwenga" seem to be more common than "Lega-Malinga" or "Lega-Ntara". - -sche(discuss) 05:06, 29 February 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Renaming ki
This is a hard one, and I'm not advocating one way or the other, just wishing to raise the issue. Ngrams reveal that in 1992, "speaking Gikuyu" and "Gikuyu language" became more common than "speaking Kikuyu" and "Kikuyu language" (as well as becoming the linguistic standard), yet overall the spelling "Kikuyu" is still more common, presumably in speaking of the people, who are less obscure than their language. We follow Wikipedia in using the spelling "Kikuyu", but this spelling is clearly no longer favoured for the language (if you're curious, the "g" is to reflect the etymon, Kikuyu Gĩkũyũ). Should we change it? —Μετάknowledgediscuss/deeds 19:55, 5 September 2015 (UTC)
I really don't have a strong opinion on this. Personally, I still think of the language as Kikuyu, which makes it difficult for me to come out and say "Yes, we should rename it", but the reasons you mention make it difficult for me to come out and say "No, we shouldn't rename it". So I abstain. I'll be happy if we continue to call it Kikuyu, but I won't be unhappy if we start calling it Gikuyu. (I will be unhappy if we start calling it Gĩkũyũ, though, since that really isn't an English word.) —Aɴɢʀ (talk) 08:19, 27 February 2016 (UTC)Reply[reply]
I am very familiar with the English name Kikuyu for both the language and the people (especially for the language), but I have never seen Gikuyu used in English. I think Gikuyu is the native name (more properly Gĩkũyũ). —Stephen(Talk) 22:50, 27 February 2016 (UTC)Reply[reply]
As you note, there are phrases where ngrams suggest Gikuyu is now more common as a name for the language, but there are also phrases (1, 2) where Kikuyu is still more common even as a language name. I'd stick with the current spelling. - -sche(discuss) 05:00, 28 February 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Judging by Ngrams, "Malaccan Creole Portuguese" is the least common name for this language; more common is "Malacca Creole Portuguese" with no n, and most common is "Kristang". In Glottolog's list of materials on it, I note that most of the modern material on it (by Baxter and Marbeck) calls it Kristang. I suggest renaming. That entails updating several entries and moving several categories. - -sche(discuss) 19:01, 20 March 2016 (UTC)Reply[reply]
Kristang gets a misleadingly high number of hits because it’s also the name of the people who speak it, and part of the synonym Papia/Papiah/Papiá Kristang. — Ungoliant(falai) 01:05, 2 April 2016 (UTC)
Kristang, but Papia Kristang or Malacca Creole Portuguese are also good options. The most important works about this language use Kristang more prominently. — Ungoliant(falai) 19:30, 3 April 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
We currently have a code spq for this dialect of Spanish; Wikipedia has an article for it at Amazonic Spanish which states that "Ethnologue's reasons for doing this [making a separate code for it] are poorly documented." Although it has some mild differences, it is clearly a dialect of es and should be merged into it. (There are no entries, but we should record the merger.) —Μετάknowledgediscuss/deeds 17:35, 3 April 2016 (UTC)Reply[reply]
Merge, IMO. (The only thing that gives me pause is the difference in attestation requirements: if Amazonian Spanish isn't well documented, then treating it as a separate lect allows its entries to be held to a lower attestation requirement. But the same could be said of a lot of varieties that don't have their own codes, e.g. New Mexico and Southern Colorado Spanish, for which there are several published references.) - -sche(discuss) 00:50, 4 April 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
The Senoufo lects are a mess, and we have no consistency in naming them (some are Senoufo, some Sénoufo, some without the Senoufo at all). This one, seb, seems not even to be a separate lect at all, but instead what Supyire (spp) is called in Côte d'Ivoire. —Μετάknowledgediscuss/deeds 20:25, 3 April 2016 (UTC)
Yes, Wikipedia and Omniglot agree they are the same. (The old 2003 International Encyclopedia of Linguistics said their "Relationship [was] undetermined" at that time.) As for the other Senoufo lects: I noticed one while checking translations at water, and removed "Senoufo" from its name before I added entries in it because I saw how rarely it was actually referred to with "Senoufo" in the name. Supyire too seems to be mostly referred to without "Senoufo". - -sche(discuss) 00:45, 4 April 2016 (UTC)Reply[reply]
Eventually we'll have to get to renaming them. For now, I just wanted to excise duplicates. Also, thanks for the archiving. —Μετάknowledgediscuss/deeds 01:55, 4 April 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
@Angr According to w:Brittonic languages, the name "Brittonic" is far more common than "Brythonic", which is apparently rather outdated. We should use the more common name. —CodeCat 17:30, 12 April 2016 (UTC)Reply[reply]
@CodeCat This is pretty funny. I'm fairly certain I used to use "Brittonic" and then you corrected me to "Brythonic" (rightfully, since it is the one we use currently), but it's humorous to me that we're now suggesting the change. I definitely think we should change it to "Brittonic". —JohnC5 18:21, 12 April 2016 (UTC)Reply[reply]
@CodeCat, JohnC5: What about the (unsourced) "Some authors reserve the term Brittonic for the modified later Brittonic languages after about AD 600." statement? — I.S.M.E.T.A. 14:45, 13 April 2016 (UTC)Reply[reply]
Google Books Ngrams shows what looks to me like virtually a statistical tie since 1950, though Brythonic has been more common since the turn of the century. I really don't have a strong preference either way. —Aɴɢʀ (talk) 09:28, 14 April 2016 (UTC)Reply[reply]
@Angr: That Ngrams result is very interesting and makes me lean towards keeping it as “Brythonic.” —JohnC5 14:56, 14 April 2016 (UTC)Reply[reply]
I reckon that we should add exceptional codes for the ones that have words directly recorded from them, even though it's precious few for each.
Palta (separate article) could be qfa-jiv-pal, in imitation of Linguist list's jiv-pal, but given that its family assignment is not one hundred percent certain and a three-part code is annoying and abnormal, we could also settle for sai-pal. I don't know why Jivaroan's language family uses qfa rather than sai, which seems to be our default for unsorted South American languages.
Rabona could be sai-rab.
Patagón could be sai-pat.
Bagua could be sai-bag.
Copallén could be sai-cop.
Tabancale could be sai-tab.
Chirino could be sai-chi.
Sácata could be sai-sac.
I'm not sure I see a point in adding codes for languages where the only words are elements in toponyms or names, rather than directly recorded. However, Puruhá language, Cañari language, Panzaleo language, Caranqui language, and others can be added if there is interest. After all, ISO already has codes for European languages like Dacian that aren't much better attested, and I suppose we could have an entry or two. @-sche —Μετάknowledgediscuss/deeds 05:13, 26 May 2016 (UTC)Reply[reply]
Re "why Jivaroan's language family uses qfa rather than sai": presumably human error. It could be updated to "sai-jiv". Re whether to use "sai-pal" or "sai-jiv-pal": for consistency with other codes ("nai-yuc-yav", etc) and with the schema described in WT:LANG, we should use "sai-jiv-pal" if we accept that it was a Jivaroan language. But as you say, the family identification is speculative (although the evidence which does exist is consistent with it). I suppose we could use "sai-pal" to be 'safe' / 'conservative' about the family identification and get a shorter code... this also lets us add the others as "sai-" codes without worrying we ought to reassign them if we later create a family code for the families they belong to. (Indeed, we already have a code for the Cariban family which Campbell and Grondona say Patagón belonged to.) - -sche(discuss) 22:27, 26 May 2016 (UTC)Reply[reply]
@-sche: I think it's better not to make any assumptions for these languages' genetic affiliations. And do you want to add the languages I mentioned in my last paragraph? —Μετάknowledgediscuss/deeds 04:02, 27 May 2016 (UTC)Reply[reply]
I've added Palta as sai-pal, added Rabona, Patagón, Bagua, Copallén, Tabancale, Chirino and Sácata, and also renamed the Jivaroan code to sai-jiv and changed Esmeralda's code from qfa-und-esm to sai-esm to fit the usual naming scheme. - -sche(discuss) 15:42, 29 June 2016 (UTC)
I have added Puruhá as sai-prh. There was at one point a grammar of it (though it has been lost), so we know it existed as a discrete lect. And based on personal- and place-names, words in it have been reconstructed by scholars. Even if the only words in it we can add are words in the Reconstruction: namespace, that does seem worth having a code for (a code also lets us reference it when giving the etymologies of those personal- and place-names). The other languages could probably be given codes on the same basis (if words in them have been reconstructed). - -sche(discuss) 00:56, 1 July 2016 (UTC)Reply[reply]
I've also added codes and entries for Cañari, Panzaleo and Caranqui. Everything here is done, I think. - -sche(discuss) 20:34, 2 July 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
We currently have a language called Saliba (sbe) and one called Sáliba (slc). In my mind, having two languages' names only differ in a diacritic is not acceptable. It confuses automated programs as much as human editors. I'm not sure there are any acceptable alternative names, so I propose using geographic disambiguation for sbe as "Saliba (Papua New Guinea)". @-sche —Μετάknowledgediscuss/deeds 04:15, 26 June 2016 (UTC)Reply[reply]
Alternately we could call slc "Saliva". This seems to get about 10× more Ghits for both the ethnic group and the language (though interference from saliva is possible). --Tropylium (talk) 04:34, 27 June 2016 (UTC)Reply[reply]
Indeed, the current names are so confusing that I got them backwards when I added the only three entries we have in the two languages. I think that disambiguating them both with parentheticals is clearer than allowing one to keep the ambiguous name (either while renaming the other to "Saliva" or while disambiguating the other). Current practice suggests that we should rename slc to "Saliva" or "Sáliva", but I wouldn't mind if we started making more frequent use of disambiguators instead. - -sche(discuss) 08:50, 27 June 2016 (UTC)
My first choice is "Saliba (Colombia)", my second choice is "Sáliva". It's bad enough we have Anus language to protect from puerile vandalism without also having Saliva language. —Aɴɢʀ (talk) 09:40, 27 June 2016 (UTC)Reply[reply]
Done, except that per previous discussion on using geographic disambiguators rather than national ones, I used "New Guinea" rather than "Papua New Guinea". (Other languages also use "New Guinea" as a parenthetical disambiguator in their canonical names, and only mention "Papua New Guinea" in alt names because SIL uses national disambiguators like that.) - -sche(discuss) 05:24, 28 June 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Ethnologue encoded this language twice: once as lrk "Loarki" (the name its 20,000 Pakistani speakers call it), and a second time as gda "Gade Lohar", the name its ~500 speakers on the Indian side of the border call it. (The International Encyclopedia of Linguistics entry on Gade Lohar conservatively only says the languages "may be the same" as Loarki, and notes its long list of other names: Gaduliya Lohar, Lohpitta Rajput Lohar, Bagri Lohar, Bhubaliya Lohar, Lohari, Gara, Domba, Dombiali, Chitodi Lohar, Panchal Lohar, Belani, and Dhunkuria Kanwar Khati. The IEL entry on Loarki is more explicit, breaking down the population by country and counting Gade Lohar's 500 speakers as Loarki speakers, because Loarki is "probably the same as Gade Lohar in Rajasthan, India, a Rajasthani language.") I propose to merge gda into lrk. - -sche(discuss) 20:50, 11 August 2015 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
The ISO has retired the code duj and replaced it with dwu (in the process of splitting off dwy "Dhuwaya"). If we want to follow the ISO, all our Dhuwal entries and categories need to be switched from duj to dwu, which seems like a lot of unnecessary bother. (Whoever does this should also add dwy to the module.) - -sche(discuss) 05:54, 24 February 2016 (UTC)Reply[reply]
On one hand, it's unnecessary; on the other hand, I think it would be ideal to follow ISO in all cases where we don't have a well articulated reason not to do so. —Μετάknowledgediscuss/deeds 06:00, 24 February 2016 (UTC)Reply[reply]
Done. (I also recoded Elfdalian from the nonstandard dlc to the standard ovd, per a BP discussion.) - -sche(discuss) 02:56, 7 July 2016 (UTC)Reply[reply]
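A recode of this kind amounts to applying a small mapping from retired codes to their replacements wherever a code is stored. The following is only a minimal Python sketch of that idea, not how the module or any bot run actually works: the `RECODES` table holds just the two changes mentioned above, and the function names and the entry records are hypothetical placeholders.

```python
# Retired code -> replacement code. Only the two recodings mentioned in
# this discussion are listed; the table name is an assumption.
RECODES = {
    "duj": "dwu",  # Dhuwal: ISO retired duj in favour of dwu
    "dlc": "ovd",  # Elfdalian: nonstandard dlc -> standard ovd
}


def recode(code):
    """Return the replacement for a retired code, or the code unchanged."""
    return RECODES.get(code, code)


def recode_entries(entries):
    """Apply recode() to a list of (language_code, term) pairs."""
    return [(recode(code), term) for code, term in entries]


# Placeholder records, purely for illustration (not real entries).
entries = [
    ("duj", "example-term-1"),
    ("dlc", "example-term-2"),
    ("dwy", "example-term-3"),  # newly split-off code, left untouched
]
updated = recode_entries(entries)
print(updated)
```

Codes that are not in the table pass through unchanged, so the same pass can be run safely over entries that were never affected.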
The following discussion has been moved from the page User talk:-sche.
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
What do you think about adding a code for this language, and under what name? Wikipedia describes it at Katembri language; they cite Fabre for the claim that this language is only preserved in a single brief wordlist, where it is called Kiriri (the wordlist is on page 22 (section 3.4) of this pdf). Regardless, that document does seem to be a good place to find more words for water. —Μετάknowledgediscuss/deeds 03:24, 2 March 2016 (UTC)Reply[reply]
There's a Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Katembri language, and another Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Xukuru language? Well, that's confusing.
The fact that there's only a limited number of words is no reason not to include the language, but it would be good to avoid the ambiguous name Kiriri. How about calling them Katembri and Xukuru, like Wikipedia does? Wait, (as a minor point of curiosity,) if the wordlist is labelled Kiriri, where'd the alternative name come from?
The difficult part will be assigning codes, given that the family affiliation is unclear. - -sche(discuss) 03:45, 2 March 2016 (UTC)Reply[reply]
Given the naming issues, I'm suddenly confused about whether I have correctly identified the wordlist being referred to. I don't know anything about any of these languages, so I feel lost (it's so much better in Austronesian, for example, where I at least feel like I have a hold on what goes where). Anyway, that naming scheme makes sense; we can use qfa codes and not worry about the families, no? —Μετάknowledgediscuss/deeds 05:01, 2 March 2016 (UTC)Reply[reply]
qfa is the prefix for exceptional family codes. All of our exceptional language codes which start with qfa do so because they start with a family code that starts with qfa, like qfa-ctc-col. There's been at least one case where we've created a family code for an accepted family (qfa-len, the Lencan languages) in order to use it in constructing a language code (qfa-len-slv for Salvadoran Lencan), but Wikipedia notes that scholars aren't certain what family either Kiriri belonged to, so we couldn't do that here because we couldn't accurately, confidently assign either one a family code (even an exceptional family code). I suppose we could construct codes starting with qfa-und, like qfa-und-ktm for Katembri. I wouldn't want to use bare qfa-___ (e.g. qfa-ktm for Katembri) because it would look like a family code. - -sche(discuss) 08:21, 3 March 2016 (UTC)Reply[reply]
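The convention described above (a three-part exceptional language code is a two-part family code plus a language suffix, and a bare two-part qfa- code reads as a family code) can be sketched roughly as follows. This is an illustrative Python sketch under those assumptions only; the function names are hypothetical and this is not the logic of Module:languages.

```python
def split_exceptional_code(code):
    """Split a hyphenated code into (family_code, language_suffix).

    Assumes the convention discussed above: a three-part code such as
    'qfa-len-slv' is the family code 'qfa-len' plus the suffix 'slv'.
    A two-part code is returned whole, with suffix None.
    """
    parts = code.split("-")
    if len(parts) == 3:
        return "-".join(parts[:2]), parts[2]
    if len(parts) == 2:
        return code, None
    raise ValueError(f"not a two- or three-part code: {code!r}")


def looks_like_family_code(code):
    """True if the code is a bare two-part code starting with 'qfa'."""
    parts = code.split("-")
    return len(parts) == 2 and parts[0] == "qfa"


# 'qfa-ktm' would be mistaken for a family code, which is the objection
# raised above; 'qfa-und-ktm' would not.
print(split_exceptional_code("qfa-len-slv"))
print(looks_like_family_code("qfa-ktm"))
print(looks_like_family_code("qfa-und-ktm"))
```

Under this reading, qfa-und-ktm parses as the placeholder prefix qfa-und plus the suffix ktm, which keeps it from being confused with a family code.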
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
I have gone through w:Category:Languages without ISO 639-3 code but with Linguist List code (thanks, Angr), and the languages listed below still need exceptional codes. I have not listed those that have no recorded material or toponyms, or those that are treated as a dialect of another language in the linguistic literature (like Akokisa). I have put suggested codes after them, and notes where I'm unsure (please correct me if I made any mistakes). —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)Reply[reply]
Burgundian language (Germanic) (gme-bur) — the name "Burgundian" is unoccupied but could cause confusion. Done as gem-bur due to uncertainty over whether or not it is East Germanic.
Ch'olti' language (myn-cho) — I wasn't 100% sure that this lect is a separate language. Done (notes below) as myn-chl for maximal distinction from Ch'orti'.
Amarizana: add per nom. Alexandra Y. Aikhenvald in Languages of the Amazon cautions that "many languages in Amazonia have 'namesakes' [and] more than one group may hide behind the same name", and more than one language has been called Amarizana: "one of the clans of the Piapoco is the Amarizanes (and the name is sometimes applied to the whole group). A now extinct language, also called Amarizana, and from the same Arawak family, used to be spoken in the Meta territory of modern Colombia." Nonetheless, the only references I find unambiguously mean this language. Čestmír Loukotka, Johannes Wilbert, Classification of South American Indian Languages (1968), page 131, lists some Amarizana words alongside and hence obviously distinct from Piapoco, including nuita "head", notuy "eye", nukagi "hand", kaxü "house", sietai "water", eriepi "fire" and keybin "sun". Julian Granberry's A Grammar and Dictionary of the Timucua Language even provides some etymology, connecting Amarizana eri(-...) "fire" to Achagua eri "sun, day", Arekena ale "sun".
Amasi: this happens to highlight what a mess our African language family codes are. Several codes use the prefix nic- even though their most immediate superfamily is alv, e.g. nic-vco should be alv-vco. Fortunately, fixing the nic- codes should not require updating very many pages. Once that is done, precedent would have us use alv-bco-... rather than alv-... (compare nai-yuc-tip, qfa-ctc-cat), although the argument in favor of a shorter code is obvious. :-/ Some words are listed in a 1973 article in Africana Marburgensia ('AM') and in a pre-draft working paper cited by WP ('B'), including bú (AM) / bu (B) "dog", ázɔ́lí (AM) / azɔle (B) "tree", ɣà-nēm (AM) / ɣanim (B) "man", ɣà-zhyī (-zhyì?) (AM) / ɣaʒɛ (B) "woman", mwɔ̄ (AM) / muɔ (B) "water".
Anauyá: add per nom. Also called Anauya, but the version with diacritic is more common. [4] has uni "water" and ahiri "sun", the latter confirmed by I Simposio Antonio Tovar sobre Lenguas Amerindias: Tordesillas... (Emilio Ridruejo Alonso, Mara Fuertes, Carlos González-Espresati; 2003) and both are seemingly in the aforementioned Classification of South American Indian Languages, although I can't see the exact snippet.
Atanque(s): out of the various names WP mentions, namely "Atanque (Atanques) or Cancuamo (Kankuamo), also known as Kankwe and Kankuí", plus others I ran across (Atanke), "Atanques" seems to be most common, at least as the name of the language. ("Kankuamo" is quite common as a placename(?) that forms part of the designation of a tribe.) A 1962 article in Anthropological Linguistics has some words, including jo̱ke "gourd cup", cognate to cho̱kue(“gourd cup”), and mo̱ga "two", cognate to mo̱ga(“two”), and the 1981 Comparative Chibchan Phonology has more words (and may drop the underline from the os of those words; it is hard to see, because all words are underlined), including ji "worm", jinua "six".
Ayomán (rarely also Ayoman): I've added a code for Jirajaran, sai-jir, so this language's code should be sai-jir-ayo.
@-sche: Can't we just do two-part codes so we don't have to feel obligated to create these horribly long ones? It wouldn't clash with all of our preëxisting practice, despite there being some precedent. Also, I'm worried that your careful work on this is going to make this RFM section far too long, and also cause you to burn out. Perhaps this should be a user page that this section links to? —Μετάknowledgediscuss/deeds 07:57, 4 July 2016 (UTC)Reply[reply]
Where an existing ISO family code like alv exists, I suppose we could go with two-part codes, but then what should be done with languages that have no ISO family code but instead belong to families for which we've had to create qfa- codes? I suppose they can be treated the same as they are now. But if we accept nai and sai as family codes for this purpose, I suppose that means some qfa- things like Salvadoran Lenca and Catacao can be re-coded. I will update the existing three-part non-proto-language codes if we go that route. I've started Wiktionary:Beer_parlour/2016/July#Shortening_some_.27exceptional.27_language_codes. Yes, I considered that, and I probably should start storing long comments and information on addable vocabulary in userspace. I wouldn't worry too much about burnout; we can take time; the only reason there would be a rush to add these codes ASAP is if we wanted to add words in particular ones of them, and if we wanted to add words, we'd need to do some research to find words to add. - -sche(discuss) 17:40, 4 July 2016 (UTC)
I'm not sure. I assumed that it could be subsumed under fr, as the Oïl languages usually are, but we probably ought to address it separately. —Μετάknowledgediscuss/deeds 09:01, 5 July 2016 (UTC)Reply[reply]
I mean, is it separate from fr? I don't know. In any case, I guess on further consideration it doesn't stop us from adding the Germanic language as "Burgundian", because if the Romance language needs to be added, it can be Bourguignon. (And since they're from different (sub)families, it should be easy to tell which one was meant if someone enters a word from one incorrectly as the other.) I'll collect information about them at User:-sche/Burgundian (Wiktionary:About Bourguignon). - -sche(discuss) 16:50, 5 July 2016 (UTC)Reply[reply]
Btw, I found and added another one we were missing, Macoris, attested in one word (baeza) and some placenames. - -sche(discuss) 06:27, 5 July 2016 (UTC)Reply[reply]
Oh, there are tons more. This is only the low-hanging fruit; a lot more languages with paltry data are waiting to be dealt with. I'm avoiding the Bantu ones for now, because pretty much all of them are in dialect continua and probably should be left alone unless good scholarship on their mutual intelligibility can be found (which I suppose I should go about finding). There's a pile of Australian ones that I'll get around to listing at some point (I thought maybe I'd give you time to digest all this first), and then even messier ones from South America. —Μετάknowledgediscuss/deeds 09:01, 5 July 2016 (UTC)
FWIW, re 'time to digest', I'd feel free to post any others you have the data to post (except the ones you mention are parts of dialect continua ... might as well leave them alone, as long as some part of the continuum has a code, although if no part does, then we should probably rectify that). There's no harm in it sitting around on the site unattended-to for a while, whereas letting it sit on one's computer sometimes (at least for me) means forgetting where one put it. (I can no longer find the information I thought I had collected on the separability vs mergeability of Haida dialects.) - -sche(discuss) 16:50, 5 July 2016 (UTC)Reply[reply]
Regarding Ch'olti': The oldest stage of the language (written in so-called hieroglyphs), sometimes confusingly just called "Ch'olti'" but more often called by other names, has its own code emy. The Colonial- and post-Colonial-era stage is considered distinct from emy, and also considered distinct from the more recent stages of Ch'orti', e.g. one reference says "In the Mayan classificatory tradition, the Ch'olti' language, as recorded in the 1695 grammar of Pedro Moran, is generally held to be related to but separate from the modern language of Ch'orti' (see Kaufman's 1976 classification, for example)." Post-Epigraphic-era Ch'olti' and modern-era Ch'orti' (caa) are theoretically distinguished from each other as different branches of Eastern Ch'olan (and from ctu, as it is a Western Ch'olan language), but the size of the difference between Ch'olti' and Ch'orti' is hard to ascertain, especially because, quoth WP, "the post-colonial stage of the language is only known from a single manuscript written between 1685 and 1695" (as afore-mentioned). For that matter, the size of the difference between Epigraphic Ch'olti'an and Colonial Ch'olti' is not obvious to me; Søren Wichmann, The Linguistics of Maya Writing (2004), page 271, says "In this section we show how Classic Ch'olti'an became seventeenth-century Ch'olti'. The chief grammatical difference between the grammars of Classic Ch'olti'an and Ch'olti' is the difference between straight- and split-ergativity." As an example, mi "father" is used in Classic and Colonial Ch'olti'(an) and in Ch'orti'. Nonetheless, given that the corpus of post-emy Ch'olti' is small and well-defined, it shouldn't be that hard to include it separately from emy and caa. - -sche(discuss) 21:42, 5 July 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
We seem to have this as a single macrolanguage, yok, despite the fact that the constituent lects seem to constitute at least a few languages. -sche added some entries, but it's all tagged by (dia)lect, so it will be easy to separate them. I think we should retire yok and replace it with exceptional codes to reduce confusion, but I am not sure what those divisions should be. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)Reply[reply]
Yes, the difficulty is in deciding how to divide it. Christopher Loether cautions that although e.g. "Gayton (1948) listed 26 named groups or 'tribelets'[,] many of these named groups speak dialects which are nearly identical phonologically, lexically and syntactically, while others speak varieties which are indeed quite distinct from their neighbours' speech. Kroeber [...] divided [what he considered a single] language into two main divisions: Valley and Foothill Yokuts. He further divided the Valley Branch into Northern and Southern, and the Foothill branch into Kings, Tule-Kaweah, Poso, and Buena Vista. Newman (1944: 5-3) agreed with Kroeber's analysis of a single Yokuts language and stated that his data corroborated Kroeber's dialect divisions."
Kroeber lists 20+ dialects, of which 21 are named in [[ˀilik']]. Wikipedia has a tree/bush diagram from Whistler and Golla of 23-28 dialects, including all of those 21 plus Koyeti, Merced "(?)", Noptinte (Nopchinche(s), Nopthrinthre(s), Nopṭinṭe, Nopthrinte, Noptinci), Yachikumne a.k.a. Chulamni, Lower San Joaquin Yokuts, and Lakisamni "(?)", and Tawalimni. (Several have multiple names, e.g. Ayticha is also called Kocheyali as well as Ayitcha; Palewyami is also Altinin and Poso Creek Yokuts in addition to Paleuyami. And Hometwoli is also Taneshach?)
Kroeber's and Newman's division is:
Valley
    1. Northern Valley
    2. Southern Valley
Foothill
    3. Kings River (including Gashowu)
    4. Tule-Kaweah
    5. Poso [Creek]
    6. Buena Vista
Whereas, Whistler and Golla's division is:
1. Poso [Creek] (Palewyami)
General Yokuts
    2. Buena Vista (Tulamni, Hometwoli)
    Nim-Yokuts
        3. Tule-Kaweah (Wikchamni, Yawdanchi)
        Northern Yokuts
            4. Kings River (Chukaimina, Michahay, Ayticha, Choynimni)
            5. Gashowu
            Valley Yokuts
                6. Southern Valley (Yawelmani, Tachi, etc)
                7. Northern Valley (Chukchansi, Kechayi, etc)
                8. Far Northern Valley (misc dialects)
For Yawelmani and Chukchansi, decent resources exist; in addition to those two, Wikchamni and Tachi are also being taught according to WP, and in addition to those four, Choinimni and Kechayi also have at least some speakers according to WP.
WP says the Yokutsan family consists of "half a dozen" languages, but evidently not the six just named, because those six leave out several major branches that Kroeber, Newman, and Whistler and Golla all agree on.
I suggest we create a family code nai-yok for the Yokutsan languages, and then distinguish the following branches which Kroeber, Newman, and Whistler and Golla all consider distinguishable, without splitting them further at this time:
Palewyami (nai-ply — or putting the y at the start so the codes sort together and are more apparently connected — on further thought, nah) a.k.a. Poso a.k.a. Poso Creek
Buena Vista Yokuts (nai-bvy) a.k.a. Tulamni-Hometwoli
Kings River Yokuts (nai-kry) a.k.a. Choinimni, etc
Gashowu (nai-gsy), which Kroeber and Whistler/Golla agree is intermediate between Kings River and Northern Valley, though Kroeber considers it ultimately/genetically Kings
Southern Valley Yokuts (nai-svy) a.k.a. Yawelmani, Tachi, etc
Northern Valley Yokuts (nai-nvy) a.k.a. Chukchansi, Kechayi, etc
Delta Yokuts (nai-dly) a.k.a. Far Northern Valley Yokuts
A more conservative approach would keep Southern Valley Yokuts, Northern Valley Yokuts, and Delta Yokuts together as "Valley Yokuts", but Delta Yokuts is relatively divergent. - -sche(discuss) 22:16, 3 July 2016 (UTC)Reply[reply]
Pinging @Chuck Entz in case you have insight or input on this Californian language family. - -sche(discuss) 04:01, 5 July 2016 (UTC)Reply[reply]
Not much, I'm afraid. A couple of decades ago I read everything I could find on the Wikchamni, who used to live in the area where my brother lives now. I read all of the sources you mentioned above, but I was more interested in the ethnobotany of the Yokuts than their languages, per se, and that was a long time ago.
On a side note, I remember one of my professors at UCLA back in the 80s saying that Yawelmani was one of the best-understood languages in the world at the time from a theoretical perspective, because so many linguists had been publishing papers on it- it was sort of the linguistic equivalent of a model organism. Chuck Entz (talk) 04:25, 5 July 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Module:languages includes codes from the individual members of several language/dialect groups which WT:LANGTREAT says, without citing any discussion, should be merged. Something needs to change: either WT:LANGTREAT should be updated to note that the individual varieties are allowed, or their codes should be removed from the module. The following language/dialect groups are affected: (Note 1: whenever the merging of a particular dialect group had been discussed and that discussion was cited by LANGTREAT, I simply updated the module.) (Note 2: Haida, Cree and Kalenjin face the same issue; I expect to write about them later.) - -sche(discuss) 05:16, 20 November 2013 (UTC)Reply[reply]
Stephen Tyler (not the singer), in his oft-cited works on Gondi, states "Though I have no real evidence, the general pattern seems to be for geographically adjacent Koya and Gondi populations to speak different, but mutually intelligible Gondi dialects. Where these populations are geographically non-contiguous, the dialects are not mutually intelligible. This same pattern probably prevails among all Gondi dialects." WP says "The more important dialects are Dorla, Koya, Maria, Muria, and Raj Gond." Ethnologue, meanwhile, has encoded only two varieties, ggo (Southern Gondi) and gno (Northern Gondi). Should we deprecate those two codes? Or deprecate the macro-code gon and recognise those dialects? Or allow all three? - -sche(discuss) 05:16, 20 November 2013 (UTC)
Merged (whereas, the ISO, while retaining gon, split ggo into two new codes which have not been added). - -sche(discuss) 06:27, 24 February 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Malay lects
I propose to merge the following Malay lects into Malay [ms]:
Jakun [jak]
Orang Kanaq [orn]
Orang Seletar [ors]
Temuan [tmw]
These are all mere dialects of Malay with no written tradition and perfectly mutually intelligible. Even Ethnologue says they should be considered dialects of Malay rather than separate languages. -- Liliana• 23:39, 28 February 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Renaming nxg
This language is far more commonly called "Ngadha" than "Ngad'a"; the latter spelling is so rare that when I was trying to verify our "Ngad'a" translation of water using that spelling for the language name, I couldn't find any references at all (they all spell it "Ngadha"). This rename entails moving a few categories and updating a handful of entries. - -sche(discuss) 02:32, 29 February 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
We currently include both the macrolanguage Bontoc (code: bnc) and its dialects, particularly Eastern Bontok (not even using the same spelling, you notice! code: ebk), in which we have ~35 translations. IMO, it rarely makes sense to include both a macrolanguage and also all of its dialects; we should usually have one or the other but not both. Ethnologue says the dialects are "reportedly similar", as if they split bnc into dialects in 2010 without knowing enough about them to tell whether they were similar or distinct. The International Encyclopedia of Linguistics considers Central Bontoc to be only 56% intelligible with Eastern Bontoc, which is only a few percentage points better than the intelligibility of the various Bontocs with Ilocano, suggesting that at least Central and Eastern Bontocs, if not the others, are different languages. Our ~15 "Bontoc" (bnc) entries seem to be Central (Igorot) Bontoc and could be relabelled accordingly if we deprecated bnc. - -sche(discuss) 07:48, 29 February 2016 (UTC)
@-sche: Relabelled "Central Bontoc" or "Igorot Bontoc"? And is it "Bontoc" or "Bontok"? Whatever the details, I support the idea of reducing the macrolanguage Bontoc (bnc) to an etymology-only language in favour of having translations and entries for the various Bontoc languages. — I.S.M.E.T.A. 01:30, 4 March 2016 (UTC)
"Central Bonto(c|k)" is more common than "Igorot Bonto(c|k)" or "Bontok Igorot", and "Bontoc" is more common than "Bontok". I've tweaked the canonical spelling of Central Bontoc (lbk) accordingly; I suppose the other Bontocs which are currently spelled with k should also be updated. - -sche(discuss) 01:58, 4 March 2016 (UTC)Reply[reply]
A number of works refer to "the Bontoc language" without specifying which of the Bontoc languages they mean, and we couldn't easily include words from these works if we deprecated bnc; there are even books like Clapp's Vocabulary of the Igorot Language as Spoken by the Bontok Igorots which conflate all the languages of the Igorot people (perhaps understandably, given the point above that the Bontocs and e.g. Ilocano are equally different from each other). However, if we accept that being barely halfway mutually intelligible makes Central vs Eastern Bontoc separate languages, then we're not losing anything of quality by not following (and not being able to easily add content from) books that fail to distinguish such different lects. - -sche(discuss) 03:01, 4 March 2016 (UTC)Reply[reply]
@-sche: Quite. Though, there will be occasions when it will not be possible to work out easily from which of the Bontoc languages a given term in a borrowing language will have been derived. — I.S.M.E.T.A. 03:28, 4 March 2016 (UTC)
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
Another language we need a code for, presumably sai-mat. Is there a more efficient way to find languages we've missed than my current method, which is simply happening upon them? —Μετάknowledgediscuss/deeds 08:33, 28 June 2016 (UTC)Reply[reply]
Btw, Native South Americans: Ethnology of the Least Known Continent lays out the case from Nimuendaju (who documented Mura) that Rivet, at least, if not others, was too hasty in grouping this with Mura. Nimuendaju considers Mat. and Mur to be isolates. - -sche(discuss) 05:12, 15 August 2016 (UTC)Reply[reply]
This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.
We currently separate {{rw}}, {{rn}}, {{haq}}, {{suj}}, {{han}}, and {{vin}}, which can all be treated as a single language and often are; their separation seems largely political. The Wikipedia page Rwanda-Rundi language catalogues the differences between rw and rn: they are pretty minor, and a lot seem to have to do with regular spelling differences and tones, which are not even reflected in the standard orthography and could thus be relegated to pronunciation sections. To quote Zorc and Nibagwire's Kinyarwanda and Kirundi comparative grammar (2007):
The terms dialect and language are used loosely in everyday communication. In linguistic terms, the two are bound together in the same definition: a language consists of all the dialects that are connected by a chain of mutual intelligibility. Thus, if a person from Bronx, New York can speak with someone from Mobile, Alabama, and these two can converse with someone from Sydney, Australia without significant misunderstandings, then they all form part of the English language. Kigali [the capital of Rwanda] and Bujumbura [the capital of Burundi] are similarly connected within a chain of dialects that collectively make up the Rwanda-Rundi language.
Kimenyi's A Relational Grammar of Kinyarwanda (1980) explains that:
This language [Kinyarwanda, rw] is very close to both Kirundi [rn], the national language of Burundi, and Giha [haq], a language spoken in western Tanzania. The three languages are really dialects of a single language, since they are mutually intelligible to their respective speakers.
That seems like a strong case for merger to me, although I'd like to see if any academic sources disagree. (Pinging User:-sche as well, just to try it out.) —Μετάknowledgediscuss/deeds 01:52, 17 December 2013 (UTC)Reply[reply]
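The definition quoted above from Zorc and Nibagwire (a language is all the dialects connected by a chain of mutual intelligibility) is, in graph terms, a connected component. As a rough illustration only, here is a minimal Python sketch: the function name `language_groups` is hypothetical, and the intelligibility pairs are toy data loosely modelled on the cities named in the quote, not measured results.

```python
from collections import defaultdict


def language_groups(varieties, intelligible_pairs):
    """Group varieties into 'languages' by chains of mutual intelligibility.

    Each group is a connected component of the graph whose edges are the
    given mutually intelligible pairs.
    """
    neighbours = defaultdict(set)
    for a, b in intelligible_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    groups = []
    for start in varieties:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        group = set()
        stack = [start]
        while stack:
            variety = stack.pop()
            if variety in group:
                continue
            group.add(variety)
            stack.extend(neighbours[variety] - group)
        seen |= group
        groups.append(group)
    return groups


# Toy data (an assumption for illustration): each pair is taken to be
# mutually intelligible.
varieties = ["Bronx", "Mobile", "Sydney", "Kigali", "Bujumbura"]
pairs = [("Bronx", "Mobile"), ("Mobile", "Sydney"), ("Kigali", "Bujumbura")]
groups = language_groups(varieties, pairs)
print(groups)
```

Note that Bronx and Sydney end up in the same group even though the toy data has no direct pair between them; the chain through Mobile is enough, which is exactly the point of the quoted definition.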
I got the ping, thanks. :) I've just been busy. I'll look into this more closely soon, but on the face of it, it does seem like we could merge them. (And that reminds me that en.Wikt really needs to have a discussion about merging Nynorsk and Bokmal. It's bizarre that we manage to accept that Drents and Twents — and, to use the example above, the English of Alabama and the English of Australia — are not separate languages, but haven't managed to accept that Nynorsk and Bokmal aren't. But that's for another discussion...) - -sche(discuss) 06:04, 19 December 2013 (UTC)Reply[reply]
I was just talking to a native speaker today to get their perspective on this. They said that the vocabulary varies a lot dialectally, but not along national lines, and it's still easy to understand people on the other side. They claimed that the biggest differences were in the tones, but that's not even marked in the orthography. I think that's a pretty strong case for merger. —Μετάknowledgediscuss/deeds 19:22, 28 December 2015 (UTC)
R. David Zorc and Louise Nibagwire have an entire Comparative Grammar (2007, →ISBN) devoted to the differences between the two, which Google unfortunately only shows snippets of, including the TOC which lays out spelling differences, noun class differences, "word pairs, one matched, the other completely different", and false friends, as well as dialect-specific tonal marking. However, vocabulary differences and false friends exist even between English dialects (luck out), and tonal marking and other pronunciation differences which aren't reflected in the orthography can be handled in pronunciation sections, as with Iranian Persian vs Dari. The only thing that gives me pause is the point that dialects on the extremes of the continuum "may not" be intelligible with each other per WP, but then, if the dialects aren't split up along national lines / along the same lines as the codes, then that's not a good argument against merger. - -sche(discuss) 00:14, 7 March 2016 (UTC)
Elena Zinovʹevna Dubnova, in The Rwanda Language (1984), page 15, writes: "A. Coupez maintains that "as a matter of fact, Rwanda, Rundi and Ha (and maybe other languages spoken east of the latter two areas) are so close that they can be regarded as dialects of one language" (17. p. 11). According to the other view they are closely related but still different languages (11, p. 26)." Other sources concur, "In linguistic terms, Kirundi and Kinyarwanda are mutually intelligible dialects of the Rwanda-Rundi language." Merged under the code rw. - -sche(discuss) 19:17, 14 August 2016 (UTC)
These are three extinct languages of Argentina that lack ISO codes, but two of them have recorded material (the third, Puntano, seems pointless to add). The only problem is that some linguists consider these to be dialects of the same language, although that is debated and cannot be satisfactorily resolved with the limited preserved lexica from each. I would prefer we follow es.wiktionary's lead in adding separate codes for Allentiac (sai-all) and Millcayac (sai-mil). —Μετάknowledgediscuss/deeds 05:36, 28 June 2016 (UTC)Reply[reply]
I suggest renaming
ato from Atong to Atong (Africa) or Atong (Cameroon)
aot from A'tong to Atong (Asia) or Atong (India) if we accept the non-inclusion in the latter name of the border areas of Bangladesh where it is also spoken
References seem to prefer the spelling "Atong" to "A'tong" for aot, and Wikipedia says "The correct spelling Atong is based on the way the speakers themselves pronounce the name of their language. There is no glottal stop in the name and it is not a tonal language." - -sche(discuss) 04:37, 29 July 2016 (UTC)Reply[reply]
@-sche: I support renaming the languages to "Atong (Cameroon)" and "Atong (India)", respectively; in the latter, "India" can be taken to refer to the subcontinent, rather than the country. — I.S.M.E.T.A. 18:55, 31 July 2016 (UTC)Reply[reply]
Glottolog mentions a dialect Chánguena (Changuina); Linguist List gives separate codes to Chumulu and Gualaca. Do we know how distinct these are? - -sche(discuss) 07:31, 6 July 2016 (UTC)Reply[reply]
Do we need a code for the fourth of the languages Oramas covers (in the same work where he covers Gayon, Ayaman, and Jirajira), Ajagua (Axagua, Jagua)? Loukotka says: "once spoken on the Tocuyo River near Carera, state of Lara, Venezuela. [A. Espinosa (Vazquez de Espinosa) 1948, p. 35, only two words: Oramas 1916, pp. 49-57, only patronyms.]" - -sche(discuss) 23:02, 10 August 2016 (UTC)Reply[reply]
Done. This Ajagua is to be distinguished from Achagua, which is also called Ajagua and Achawa, but is spoken 1500+ kilometers away. - -sche(discuss) 05:07, 15 August 2016 (UTC)Reply[reply]
Cf User talk:-sche#Kiriri, where a few other needed additions are mentioned. This language is also known as Catrimbi or Kariri de Mirandela, or Kiriri, and is described by Loukotka as the "lost language of the ancient mission of Saco dos Morcegos, now the city of Mirandela". AFAICT, we should add this language (which is also known as Kariri), we already have Xukuru (which is also known as Kirirí and Kirirí-Xokó), and we should possibly add Xukuru-Kariri (postscript: Done), which is also known as Xocó. Keeping them all straight is going to be difficult and is complicated by the fact that each is known only from short words elicited from elders in 1961. - -sche(discuss) 20:21, 14 July 2016 (UTC)
Done but without the accent, since English-language sources seem to drop it at least as often as they retain it. Cabrera (1929) is said to record a few words and Serrano (1945) five more. I've also added a code for Comechingón / Comechingon / Comechingona. - -sche(discuss) 17:51, 15 August 2016 (UTC)Reply[reply]
Btw, Matthias Urban lays out a number of arguments that Spruce's semi-well-known wordlist is definitely Sechura and not Tallan. - -sche(discuss) 06:03, 14 August 2016 (UTC)Reply[reply]
Also called Shebaye, Shebayo, per Campbell, American Indian Languages: The Historical Linguistics of Native America; he says David Payne "adduces persuasive evidence from the scant fifteen words recorded in extinct Shebayo (Shebaye) of Trinidad to show that it belongs with the Caribbean group (for example, it appears to have da- ‘my’, and these languages are the only ones which have an alveolar stop and not a nasal for 'first person singular')". - -sche(discuss) 05:04, 14 August 2016 (UTC)Reply[reply]
Done as "Shebayo", the most common spelling. Douglas MacRae Taylor, Languages of the West Indies (1977), page 15, has: The Shebayo list, taken from De Laet's Novus Orbis, is as follows (divergent spellings found in different editions are shown in parentheses): heia (heja) 'pater', hamma 'mater', wackewijrrij 'caput', wackenoely (wackenoey) 'auris', noeyerri (noeyerii) 'oculus', wassibaly (wassi) 'nasus', darrymaily 'os', wadacoely 'dentes', watabaye 'crura', wackehyrry 'pedes', ataly 'arbor', hoerapallii 'arcus', hewerry 'sagittae', kyrizyrre 'luna', and wecoelije 'sol'. - -sche(discuss) 19:22, 15 August 2016 (UTC)Reply[reply]
Loukotka says nothing of Voto is attested (which is ironic, because the subfamily is apparently named after it). The Indigenous Languages of South America: A Comprehensive Guide considers it a variety of Rama, possibly based on the fact that the Ramas were also called Votos. - -sche(discuss) 05:04, 14 August 2016 (UTC)Reply[reply]
Glottolog doesn't list it or have any resources on it, either. Therefore, not added until content in it can be found, or at least determined to exist. - -sche(discuss) 05:45, 15 August 2016 (UTC)Reply[reply]
As to how different the dialects of Dorasque are: A. L. Pinart's Vocabulario Castellano-Dorasque, Dialectos Chumulu, Gualaca y Changuina gives si (and ji) as the Chumulu form(s), and gives ti as the Gualaca form and ji as the Changuina form for "water". Overall, it's hard to tell how intelligible the dialects would be; some words are quite similar (Chumulu utká, Gualaca utkál "yellow"; Chumulu katuvá, Gualaca katavá "bow"), others are very different: Chumulu sagúsaña, Gualaca θake "blue"; Chumulu sérkala, Gualaca okiyigua "asiento" (OTOH, all three dialects use sérkala for "bench"). "Woman" is biá in Chumulu and Changuina, ωiá in Gualaca. I suppose we can follow Pinart in entering it as one language. - -sche(discuss) 01:12, 10 August 2016 (UTC)Reply[reply]
Okwanuchu language (nai-okw): Berkeley.edu's Indian Languages project says some (<100) words were recorded in the early 20th century; the Handbook of North American Indians: California refers readers to Kroeber (1925) and Voegelin (1942); Glottolog cites Victor Golla California Indian languages (2011). Kroeber says "The dialect is peculiar. Many words are practically pure Shasta; others are distorted to the very verge of recognizability, or utterly different." Victor Golla, California Indian Languages, speculates at length that Okwanuchu may have been "a bilingual mix of Shasta and some other language". There was a people "whom the McCloud River Wintu considered Wintu and called Waymaq ('north people') [whom] Du Bois believed [...] were closely related to, if not identical with, the Shastan Okwanuchu; the survivor of the group whom she interviewed gave her a short vocabulary that included words of Shasta origin (Du Bois 1935:8)." These words included atsa 'water', au-u 'wood', katisuk 'bring'. Golla also says "Okwanuchu speech may also be attested in [eighteen] words identified as 'Wailaki on McCloud' (cf. Wintu waylaki 'north people') that Jeremiah Curtin recorded" in 1884, namely: gü'ru 'man', ki'rikega 'woman', hänumaqa 'old man', apci 'old' (in ki'rikega apci 'old woman'), ä'toqe'äqa 'young man', kewatcaq 'young woman', tse'gwa 'one', hoka 'two', qätsi 'four', tseapka 'five' = 'one hand', hukaapka 'six', tsuwara 'sun', kapqu'[r]wara 'moon', kau 'snow', atsa 'water', gri'tuma 'thunder', itsa 'rock', tarak (Golla's note: "[terak?]") 'earth'. Golla notes: "Curtin employed the BAE transcription system, in which q represents a velar fricative, not a velar stop." "Of these forms, five (man, old man, four, thunder) are not attested in any other variety of Shasta."
Konomihu language (nai-knm): The Handbook of North American Indians: California says "An unpublished Konomihu word list was collected by Angulo (1928a)." Glottolog cites Shirley Silver Shasta and Konomihu (1980), Roland B. Dixon The Shasta-Achumawi: A New Linguistic Stock, with Four New Dialects (1905), Lars J. Larsson Who Were the Konomihu? (1987). Kroeber says "it is still questionable whether their speech is more properly a highly specialized aberration of Shasta or of an ancient and independent but moribund branch of Hokan from which Karok and Chimariko are descended together with Shasta. [...] Konomihu is their own name." Silver's work documents a number of words, including kihínàpxī́k "woman".
New River Shasta language (nai-nrs): "the language is attested only in a few wordlists" per Berkeley.edu's Indian Languages project. Kroeber, who mentions the alternative exonym Amutahwe, says they were "perhaps rather nearest to the major group in speech, although at that their tongue as a whole must have been unintelligible to the Shasta proper. [...T]he tribe melted away without a survivor, leaving only a fragmentary vocabulary."
Renaming bvx
The di- in its current name, Dibole, is one of those language prefixes, but curiously enough, the most commonly used name for this language is actually Babole, with a different prefix. (Luckily, unprefixed Bole isn't used, because it's taken up already by bol.) We should rename this to Babole accordingly. —Μετάknowledgediscuss/deeds 17:32, 29 June 2016 (UTC)Reply[reply]
If I know anything about Bantu languages, "Babole" refers to the people who speak it rather than the language itself. Is this really what they call it? —CodeCat 18:24, 29 June 2016 (UTC)Reply[reply]
Renaming pgd
We currently have this as "Gāndhārī". This spelling is indeed used frequently in the literature, but we try to avoid difficult diacritics, and this spelling can be seen as using IAST to render the native name of the language, whereas "Gandhari" is the corresponding English. I think that switching to "Gandhari" would be the better choice. @Aryamanarora, -sche —Μετάknowledgediscuss/deeds 08:52, 5 July 2016 (UTC)Reply[reply]
We follow the ISO in covering this as a single macrolanguage, ers. Following Yu (2012), we should keep ers as Ersu, but also create sit-tos for Tosu and sit-liz for Lizu. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)Reply[reply]
Done as sit- because there is apparently disagreement over whether Tibeto-Burman comprises a monophyletic group, and if we later remove it in favor of sit, the fewer things we have to rename, the better. - -sche(discuss) 03:13, 29 August 2016 (UTC)Reply[reply]
Done, although this is another of those languages that is attested only in two (or in this case three) disparate wordlists, by Sapper, Ricke, and Johnston, all published by Lehmann, who considers them nonetheless the same language and presents them in two columns, "Johnston" and "Sapper-Ricke". - -sche(discuss) 19:45, 18 August 2016 (UTC)Reply[reply]
Loukotka, and Campbell citing him (and Wikipedia citing Campbell), says only a single word is attested, but none of them say what that word is. Other sources, none of which strike me as reliable, provide different(!) words, of which the most common is quindio (one book has quindus, one has kindiyo, parallelling the spellings Quimbaya - Kimbaya), meaning "paradise", but other words mentioned include chascará, batatabatí, and fihisca "spirit" (the last of which may actually be from some other language). proel.org says eight words are attested, but doesn't say what they are. Several books provide placenames which may attest the language. - -sche(discuss) 23:57, 20 August 2016 (UTC)Reply[reply]
An old source (Daniel Garrison Brinton, 1898) says "The Jupua and Curetu dialects are properly one and the same, the difference which appears in their vocabularies arising simply from inequality in the ears and the orthographies of observers. This is evident by the following comparison..." and then proceeds to offer a comparison which IMO doesn't actually conclusively demonstrate anything. In any case, Cueretú / Curetu itself does not currently have a code. - -sche(discuss) 04:20, 14 July 2016 (UTC)Reply[reply]
I can find a number of sources attesting Yupua words, but none seem like reliable primary sources: Peruvia Scythica: The Quichua Language of Peru mentions "wui" as the word for house, but does so as part of trying to connect a huge number of unrelated languages based on chance sound correspondences. Brinton's 1898 Studies in South American Native Languages has a list of words which he sources to Martius (surely referring to the Wörtersammlung Brasilianischer Sprachen) and compares to Curetu words sourced to Wallace (surely referring to A Narrative of Travels on the Amazon and Rio Negro), but I haven't located the Yupua words in the originals; the words Brinton gives are: blood: thik (Yupua), dü (Wallace); bow: patopai, patueipei; earth: thitta, ditta; flesh, ga'hi', se'hea'; finger, moh-asoing, mu-etshu; fire, pieri, piure; flower, pagari, bagaria; foot, göaphoe, giapa; hair, poa, phoa; hand, moho, muhu; head, co'ëre, cuilri; house, wu'i, wee; mouth, thischüh, dishi; sun, hauvä, aoué; tongue, toro, dolo; tooth, gobâckaa', gophpecuh; water, thäco, deco; woman, nomöa, nomi; he also offers the additional Yupua words hóggoa "water" (sic), göaphae "foot", ga'hi "meat", jih "jaguar", ikama "deer", jocheo' "star". Ruhlen has manapẹ "husband / man", apara "we two", ti "this", -mai- "we", tsīngeē "boy", pilo "fire", poa "feather". - -sche(discuss) 05:29, 14 July 2016 (UTC)
Martius confirms thäco is "water" and pieri is "fire", and has wúi "house", pohjá "feather". - -sche(discuss) 07:54, 20 August 2016 (UTC)Reply[reply]
Berawan
It does not make sense to have both the macrolanguage (lod, which the ISO retired) and the dialects (zbc, zbe, zbw). We should either merge the dialects or retire lod. However, Berawan is composed of more than three varieties, which the literature often speaks of, though there are some works that use the same coarser division as Ethnologue. - -sche(discuss) 09:57, 1 September 2016 (UTC)Reply[reply]
Being a "lumper" rather than a "splitter", I support merging the dialects into the macrolanguage. Dialect labels can be added with {{lb}} as needed. —Aɴɢʀ (talk) 11:24, 1 September 2016 (UTC)Reply[reply]
Oddly, Ethnologue does not provide a claim of how mutually intelligible vs unintelligible they are. Blust, in The Consonants of Long Terawan, asserts four dialects, all spoken within 25 miles of each other: Long Terawan (Ethnologue's zbw) and Batu Belah (part of zbc), and Long Teru (part of zbc) and Long Jegan (zbe); he considers e.g. Long Pata (which WP names, and which is noted by Ray's 1913 Languages of Borneo) to be a variant of one of these, though he is nonspecific as to which one. When I compare Blust's Terawan data and Burkhardt's The historic evolvement of true triphthong phonemes in Long Jegan Berawan, the differences are not large, especially considering the obvious differences in style of notation: e.g. Blust has Terawan gitoh "hundred", Burkhardt has Jegan getoʔ (Blust noted some uncertainty in his transcription of vowels); Terawan dimmeh "five", Jegan dimmiᵊy; T buluh "feathers or body hair", J bullǝw, T iciw "day", J iciᵊw, T iko "tail", J eko, T puté "white", J pote, T depeh "fathom", J dǝppiǝ̈. A larger difference is Terawan lebbih "two" vs Jegan duβiᵊy, T puʔ "head hair", J pǝuk, T manoʔ "bird", J manǎuk. The only entry we have in the dialects is pi "water", which is (per our entry) the same in all of them. It does seem that these could be merged. - -sche(discuss) 18:19, 1 September 2016 (UTC)Reply[reply]
→ I would prefer qfa-und-capund-cap to avoid it appearing to be a family (qfa-) code, or ine-cap... or und-cap^? And are words in the language attested? WP says no texts survive. - -sche(discuss) 06:08, 11 July 2016 (UTC)- -sche(discuss) 17:57, 14 August 2016 (UTC)Reply[reply]
I'm fine with three-part codes for a couple rare lects. As for attestation, I chose to put this language on the list chiefly based on The Expositor's Bible, vol. 35, p. 218, where it says: "Thus the ancient Cappadocian language is discussed and a lexicon of it compiled in a monograph which appeared in the Museum of the Evangelical school at Smyrna (1880-84), pp. 47—265." —Μετάknowledgediscuss/deeds 06:28, 11 July 2016 (UTC)Reply[reply]
Let's not add this, at least until we find positive evidence that it needs to be added. In addition to the statement that no texts survive, I see scholars such as Holger Pedersen savaging claims (by Karolidis, Kretschmer, et al) to have found relics of the language in Cappadocian Greek. - -sche(discuss) 18:13, 14 August 2016 (UTC)Reply[reply]
Done as sai-cno since -chn was taken. In addition to the catechism, Fitz-Roy (1839) has three words, and Jose Garcia (1889) has three words, per Loukotka. - -sche(discuss) 06:18, 17 August 2016 (UTC)Reply[reply]
For a language that's known only from a single wordlist, this sure has a lot of names... Jeikó, Geicó, Jeicó, Jaikó, Geikó, Yeikó, all of those with unaccented 'o', and Eyco... - -sche(discuss) 02:02, 20 August 2016 (UTC)Reply[reply]
Mimi of Decorse (qfa-und-mim) — another wordlist language lacking a good name
Presumably we also need Mimi of Nachtigal, so I suggest und-mmd and und-mmn rather than -mim to keep them maximally distinct (even if Nachtigal's Mimi is considered Maban). - -sche(discuss) 17:51, 6 August 2016 (UTC)Reply[reply]
Damin (art-dam) — this might be better off considered as a natural language instead so it can be in mainspace (--Meta)
If we want Damin in mainspace, I'd rather consider it a dialect of Lardil than a separate language. —Aɴɢʀ (talk) 20:43, 13 August 2016 (UTC)Reply[reply]
Hmm, this is an interesting case. On the one hand, they seem to be mutually unintelligible, and speakers seem to refer to them as separate languages, and outside linguists seem to treat them as separate things (compare Eskayan)... and we do consider even such very similar, often-overlapping things as Norwegian and a slightly different orthography of Norwegian to be entirely separate languages. On the other hand, Damin is said (by researchers with access to its full vocabulary, as opposed to access to only a limited surviving corpus) to have a vocabulary of "perhaps no more than 250 words in all", which makes it hard to consider it a language, and does make it seem similar to pandanus avoidance registers or especially thick cant or jargon. According to Wikipedia, some markers are different, but others are the same, e.g. genitive -kan and future -ur. I suppose treating it as a dialect of sorts does make the most sense. - -sche(discuss) 17:35, 14 August 2016 (UTC)
Water is سو (su); the word also means 'tempering (of a sword)'. 'Ebçi' (apparently containing the occupational suffix '-çi'; ç = č) is 'woman', mentioned in one military treatise when it says that Indian swords are "useful only for hanging on the neck of a woman who cannot give birth to a son". 'sovuq' is 'cold' and 'sol' is 'left': andın songra sol egining ūzārā salgıl taqı boynuñ ūzārā tezgindūrgil "after that, place it over your left shoulder and make it rotate over your neck". See Munytu'l-Ghuzāt: a 14th-century Mamluk-Kipchak military treatise and Vocabulaire arabe-kiptchak de l'époque de l'État mamelouk. - -sche(discuss) 00:47, 31 August 2016 (UTC)Reply[reply]
Based on their limited mutual intelligibility, and following the references on these languages (see w:Rgyalrong languages' bibliography), we should split jya, and its category and any entries, into four languages:
Situ (sometimes called Eastern rGyalrong) (sit-sit?)