User talk:-sche


ubersexual / including non-durable citations

Translations of attributive use of nouns

Add replacements to edit summary

In AWB Options > Normal setting uncheck 'Add replacements to edit summary' and it'll make the edit summaries only what you put in the 'Default Summary' box. Makes edit summaries shorter and more 'human'. Mglovesfun (talk) 18:38, 11 October 2012 (UTC)

Aha! Thanks for the tip. :) - -sche (discuss) 18:45, 11 October 2012 (UTC)


I'd like to take over WOTD — at least for now. I've already set up new words for October 28-31 to get the ball rolling again. Looking over diffs to see what others had done allowed me to figure out the basics, but there's still many other things I need to know about the process, especially what I need to do to create an archive, set up a new month, and polish the entry pages for words before they appear. Thanks! Astral (talk) 00:43, 28 October 2012 (UTC)

I'm glad you're interested!
The front-end part is simple—pick words and plug them into the templates. You're already doing a good job of that; I like your Halloween pick. As you seem to have gathered, the last definition doesn't end with a full stop/period (though if a word has multiple definitions, the preceding definitions do), because the template already adds one: double-dotted vs fixed. Featured words should have pronunciation info (either IPA or audio); the template will automatically notice and include an audio pronunciation if one is present.
The more additional info an entry has, like etymology, illustration or examples of usage, the more interesting it is likely to be to users who click through to it; on the other hand, trying to cite and find a picture for every word you feature on WOTD is a recipe for burning out. Strategise.
Once you've set a word, add the was-wotd template to the entry, so that it won't be featured again (mostly).
To create an archive, do what Ruakh did here, changing {{wotd archive|PREVIOUS|NEXT|YEAR|DAYS}} to the previous month, the next month, the year (four digits) and the number of days in the month (28, 29, 30, 31), and updating the pagename to the relevant month and year. An easy way of creating an archive is to copy-and-paste the relevant month's Recycled Page, e.g. Wiktionary:Word of the day/Recycled pages/October, simply changing {{wotd recycled}} to {{wotd archive}} and adding the YEAR and DAYS parameters.
At the end of the month, subst: all of the templates by changing each day's {{Wiktionary:Word of the day to {{subst:Wiktionary:Word of the day. The reason for not subst:ing a day before it's done is that someone might tweak the definition or fix a typo, etc.
- -sche (discuss) 04:41, 28 October 2012 (UTC)
Thanks. This is very helpful. I've got a couple of questions. First, I'm not good with IPA, so is there a way I could arrange for someone who is to add pronunciation data to entries before they appear? Second, is it okay to occasionally select words I've nominated myself? I already did this with trainiac, because I wanted something "fun" between mulct and peri-urban, but I don't want to do it again if it's something that should be avoided. Astral (talk) 03:33, 30 October 2012 (UTC)
Also, exactly how far back does the prohibition against using words featured as WOTDs on other sites go? It makes sense not to copy words other sites have featured recently, but three, four, five years back seems like another matter. I need a verb, and wanted to use photobomb, but it was featured on Urban Dictionary in 2009, and more recently as a noun on September 28 of this year. Astral (talk) 03:49, 30 October 2012 (UTC)
So, I chose ambuscade instead, only to discover it was a Merriam Webster WOTD in 2010. Can't win. :( Astral (talk) 04:27, 30 October 2012 (UTC)
Disclaimer: I'm not Sche (@Sche: feel free to correct me on anything I say). Anyway, I think that choosing words that you nominate is fine, and that if you find a concise way to list all the entries you want IPA for pronto (on a subpage, maybe?) I would be happy to help out, as would Sche, Angr, et al. (probably) given their past contributions in that regard (and they're probably more trustworthy than I am). —Μετάknowledgediscuss/deeds 05:14, 30 October 2012 (UTC)
Yes, you can just comment that you'd like to feature a word but it lacks pronunciation info. Many users watch that page, and someone should take care of it. And yes, you can feature words you've nominated—at least, I did. It's probably best to let a couple days pass between when you nominate a word and when you use it, in case anyone comments with objections, but I doubt anything you nominate will be objectionable (you know not to nominate redlinks or offensive words). As for other sites' words of the day: personally, I never paid much attention to that rule; I checked if a word had been featured on another site in the past few months, and if not, looked no further. Sometimes, people would strike words that had been featured by other sites years ago, and in those cases, I respected the strikings and didn't use those words, but I didn't strike words that had been featured by other sites years ago myself. - -sche (discuss) 05:45, 30 October 2012 (UTC)
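The archive and subst: steps described earlier in this thread can be sketched in code. This is only a minimal Python sketch, not an existing tool: the function names are hypothetical, and it assumes the PREVIOUS and NEXT parameters of {{wotd archive}} are given as English month names.

```python
import calendar

# English month names, listed explicitly so the result does not depend on locale.
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def wotd_archive_call(year: int, month: int) -> str:
    """Build a {{wotd archive|PREVIOUS|NEXT|YEAR|DAYS}} call (hypothetical helper)."""
    previous_month = MONTHS[(month - 2) % 12]  # month before, wrapping January -> December
    next_month = MONTHS[month % 12]            # month after, wrapping December -> January
    days = calendar.monthrange(year, month)[1]  # 28, 29, 30 or 31
    return "{{wotd archive|%s|%s|%d|%d}}" % (previous_month, next_month, year, days)


def subst_wotd(wikitext: str) -> str:
    """Turn each {{Wiktionary:Word of the day... call into a subst: call."""
    return wikitext.replace("{{Wiktionary:Word of the day",
                            "{{subst:Wiktionary:Word of the day")
```

For October 2012, wotd_archive_call(2012, 10) gives the template call with September, November, 2012 and 31 filled in. subst_wotd leaves already-substituted calls alone, so running it twice does no harm.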

Inscriptions and whatnot

Discussion moved to WT:T:ALA.

̶s̶̶c̶̶h̶̶r̶̶i̶̶e̶̶f̶̶s̶̶t̶̶a̶̶n̶, ̶s̶̶k̶̶r̶̶i̶̶if̶̶s̶̶t̶̶a̶,̶s̶̶c̶̶h̶̶r̶̶i̶̶e̶̶w̶̶s̶...Spelling standards for Low German.

Ahoy. Please refer to this, leave a comment and maybe distribute it to people you know might have an interest in this. We can do it! Korn (talk) 19:14, 29 August 2013 (UTC)


WT:LANGTREAT doesn't mention Slovincian. I was wondering whether we made the decision not to treat it as a dialect of Kashubian, or whether it just happened that way. I have no preference one way or the other, since I don't know much about it anyway. --WikiTiki89 16:04, 9 December 2013 (UTC)

It looks like it just happened that way. I mean, both Slovincian and Pomeranian have exceptional codes, so someone made the conscious decision to treat them, Kashubian, and Polish as distinct from each other. But both codes were created by the same user who also created separate exceptional codes for the Pitcairn and the Norfolk varieties of Pitcairn-Norfolk, which subsequent discussions all agreed to re-merge, so it's possible (and indeed, apparently the case) that it was just that one user who got the idea that they should be split. There does not seem to have been any community discussion of Slovincian, Kashubian or Pomeranian, but Wiktionary:About Slovincian has been created. I have updated LANGTREAT to note that "in practice,..." they are currently distinct. - -sche (discuss) 18:50, 9 December 2013 (UTC)

Data consistency checking module

Kephir wrote Module:data consistency check which performs a check on all the data modules, and makes sure there aren't any discrepancies. There are some, so I thought you might like to know. —CodeCat 23:45, 17 December 2013 (UTC)

Among other things, aus, sai, and cai ought to go, stupid geographic categories that they are. —Μετάknowledgediscuss/deeds 01:50, 18 December 2013 (UTC)
@CodeCat: thank you for the link. (And @Kephir, if you're reading this, thanks for designing that module!) @Metaknowledge: Indeed, and nai (which several things currently list as their family!). qfa-ame should also go, IMO, or at least be voted upon like Altaic, and Zuni needs to be updated not to list qfa-ame as its family even if it is kept. (If qfa-ame is kept, we should reconsider having deleted Penutian.) I've been meaning to start Requests for Deletion, but I've been busy. Feel free to beat me to it. - -sche (discuss) 09:19, 18 December 2013 (UTC)
Wiktionary:Requests for deletion/Others#Certain_geographic_language_families. - -sche (discuss) 02:02, 20 December 2013 (UTC)

Removing scripts

Some entries may specify a script with sc= even if no language has that script specified. When you remove the scripts, those entries will eventually trigger script errors. —CodeCat 14:44, 21 December 2013 (UTC)

I checked for such entries. When they existed, I added the script code to the relevant language code rather than removing it. - -sche (discuss) 20:06, 21 December 2013 (UTC)

A barnstar for you!

For your continuous work to improve coverage and consistency of languages, families and such. —CodeCat 03:16, 24 December 2013 (UTC)
Thank you! :) - -sche (discuss) 06:29, 24 December 2013 (UTC)

Re: jewing / using labels on inflected forms


jade#Etymology 2 mentions the language Mordvin, but we consider Mordvin to be two different languages: Erzya (myv) and Moksha (mdf). Is it worth creating a small language family for Mordvinic languages (probably not)? If not, how can I determine which language was meant in the etymology? --WikiTiki89 19:58, 4 August 2014 (UTC)

As a first step, I'd declare the term's language to be "und", and say it's from "either Erzya or Moksha". This obviates the need to create a code for Mordvinic (though one could still be created if there happened to be other reasons why it would be useful). Next, knowing that Moksha and Erzya are both written in Cyrillic, I'd test various possible Cyrillic spellings of the term combined if possible with various possible Russian translations, to see if I could find any Russian linguistic texts that mentioned the term — I've been able to verify the identity of some Lak and other Caucasian terms that way.
PS #1: that reminds me of how useful it would be if we had entries for the Russian abbreviations of various languages' names. I've added some (д.-в.-н.), but I think I stupidly didn't record the Caucasian abbreviations at the time I had them in front of me, even though it took me a while to figure them all out with the help of ru.wikipedia. Maybe I'll go looking for them again; shouldn't be too hard to find them again, and you and Anatoli can help verify what they're abbreviations of.
PS #2: do you think it's redundant to say "obviates the need" or "obviates the requirement", since obviate already specifies "bypass a requirement" in its definition? I've never been sure... - -sche (discuss) 20:54, 4 August 2014 (UTC)
After taking a look at the languages' orthographies, the only Cyrillic spelling of al'd'a that makes sense is альдя, for which Google shows several results in some strange language that might be either Moksha or Erzya, or might be something else entirely. I don't know nearly enough about these languages to be able to identify them, and none of the results are dictionaries. RE PS #1: Russian abbreviations always confuse me too. I'm not even sure whether the language abbreviations are standardized enough between dictionaries for it to make sense to add them. RE PS #2: I think the definition is supposed to be "bypass [a requirement]" in other words the requirement (or the word requirement itself) is meant to be the direct object of "obviate". --WikiTiki89 02:12, 5 August 2014 (UTC)
I checked such spellings as алда, ал'д'а and альдьа after I posted, and I couldn't find anything in a Uralic language, either. Some hits were Kazakh(!).
Per Thorson's 1936 Anglo-Norse studies: an inquiry into the Scandinavian elements in the modern English dialects, volume 1, derives dialectal English yad / yaad / yaud (used in "Sc Nhb Lakel Yks Lan", which I take to be Scotland, North Humberside?, Lakeland?, Yorkshire, and Lancashire) from Old Norse jalda (dialectal Swedish jäldä), from a Finnish word "elde" (citing "FT p. 319, Torp. p 156 fol."), but says "Eng. jade is not related." Likewise the Saga Book of the Viking Society for Northern Research, page 18, says "There is thus no etymological connection between ME. jāde MnE. jade and ME. jald MnE. dial. yaud etc. But the two words have influenced each other mutually, both formally and semantically." I'll see about expanding jade and yaud with this information. - -sche (discuss) 03:04, 5 August 2014 (UTC)
One last question, though. Should "Mordvin" be added as an alternative name for both Moksha and Erzya? --WikiTiki89 13:45, 5 August 2014 (UTC)
Yeah, enough references (especially old ones but even some modern ones) speak of "Mordvin" as a language made up of Moksha and Erzya dialects, rather than as a family, that recording that as an alt name would be helpful. - -sche (discuss) 15:26, 5 August 2014 (UTC)

It's эльде both in Moksha and Erzya. See Имяреков, Мокшанско-русский словарь, 1953, page 124b, and Серебренникова Б. А., Бузакова Р. Н., Мосина М. В. (ред.), Эрзянско-русский словарь, 1993, page 781b. If you can't find a spelling for any Uralic, Altaic or a Caucasian language, ask me, I have a lot of sources. --Vahag (talk) 08:44, 7 August 2014 (UTC)

Awesome, that's good to know. I knew you had resources on Caucasian languages, but didn't know about Finno-Ugric. I'll add a Moksha section to [[эльде]]. :) - -sche (discuss) 17:47, 7 August 2014 (UTC)

Haida languages

Have we thought out the treatment of these yet? We have both the macrolanguage code hai (and a category for terms derived from it, including the entry gwaai that I think I'll go and RFV) as well as the two sublects, hdn and hax, the latter of which I just unwittingly made a terms derived from category for. —Μετάknowledgediscuss/deeds 20:35, 13 August 2014 (UTC)

I recall looking into the Haida lects, but it seems from my "Note 2" in this RFM that I held off on posting about them for some reason, and then got distracted by events in real life. WT:LANGTREAT says to treat only the macrolanguage as a language, but like the pronouncements I mentioned in that RFM, it seems there was never discussion about that. There are noticeable phonological differences between the Northern and Southern lects. Each of those lects is in turn made up of its own (sub-)dialects, but the sub-dialects within each group are mutually intelligible, so it doesn't seem to be a problem to merge those (into hax and into hdn), and it seems most references do. I looked at a number of North Haida, South Haida and plain "Haida" materials (Enrico's Northern Haida Songs, etc) and references before I posted the above-linked RFM last year and planned to comment about Haida; I'll see if I can find the notes I made then. - -sche (discuss) 22:30, 13 August 2014 (UTC)
A tad more research on the matter suggests to me that we should deprecate the use of the macrolanguage and reassign it, then create categories for the sublects. If you've notes on it, though, I'll wait for you to start the RFM instead of blowing ahead myself. —Μετάknowledgediscuss/deeds 23:01, 13 August 2014 (UTC)
Ok, here are my notes, which I'd be happy to summarize in any RFM on the subject, or which you can feel free to pull from.
- -sche (discuss) 05:49, 14 August 2014 (UTC)
By the way, for entries I would suggest using Enrico's orthography (or maybe Bringhurst's), so as to avoid characters like x̱ that are hard to input and liable to display incorrectly. - -sche (discuss) 06:07, 14 August 2014 (UTC)
All sounds good, and you can feel free to copy my Support over to the RFM for splitting and deprecating hai, but I'm not on board with the orthography. In British Columbia, I've only seen the orthography with x̱ used, so I would presume it is standard among speakers and linguists. —Μετάknowledgediscuss/deeds 07:21, 14 August 2014 (UTC)
Wikipedia says SHIP's orthography "is the usual orthography used in Skidegate", while Enrico's is what I saw in my (limited) review for Northern Haida—but perhaps the set of materials I have access to is not representative of all materials. Are the texts you see in British Columbia Southern Haida, or are some Northern Haida? Meh, it would be undesirable to use two different orthographies... I suppose we can normalize both (South and North) on the SHIP spellings and mention the other spellings as alternative forms. (Cf this subthread, if you're bored.) - -sche (discuss) 23:45, 14 August 2014 (UTC)
OK, after waiting a few days for some other discussions to settle down, I started Wiktionary:Beer_parlour/2014/August#Haida_lects. - -sche (discuss) 19:09, 22 August 2014 (UTC)

The power of 'and'

We COULD have both fixing AND fixing to be intelligible on their own, something we do with many comparable situations. DCDuring TALK 21:57, 14 August 2014 (UTC)

I've replied at WT:RFM so as to keep discussion in one place. Cheers! - -sche (discuss) 22:31, 14 August 2014 (UTC)

Attestability of "yellowman"

The search for attestability seems to yield mostly references to a White Jamaican reggae artist. Purplebackpack89 04:41, 22 August 2014 (UTC)

Thanks for looking. I tried searching for the plural, "yellowmen", and although that turned up some scannos, it also turned up enough valid hits that I've now created yellowman. - -sche (discuss) 21:56, 22 August 2014 (UTC)
Two is enough? DCDuring TALK 23:55, 24 August 2014 (UTC)
The search turned up more than the two hits I typed up. CFI doesn't require that citations be typed up and put in entries unless the entries are challenged, but I have typed up a third citation. Incidentally, it also contains "whitemen" and "blackmen". - -sche (discuss) 00:56, 25 August 2014 (UTC)

Appendix:Place names in New York area with possible native American origins

I gathered these from the book mentioned in the Appendix at a WP edit-a-thon held today at a local library. You provided an etymology for Mamaroneck that was better than that in the book, by Richard Lederer (or his father?). A few of the toponyms in the Appendix (eg, Osceola, Mohegan) are taken from native American tribes not from the immediate area, a few from neighbors on the west side of the Hudson, Connecticut, farther north in New York, or possibly from Long Island, but at least 80% are from tribes that lived in what are now Westchester, Putnam, or Bronx counties. The spellings are the only ones Lederer had. I assume he rejected some for good reason. He seems to have taken many of them from land purchase records of the 17th century. DCDuring TALK 23:51, 24 August 2014 (UTC)

Oh, neat. I will look over the list and see if I can clarify / expand any of the etymologies. Should I remove placenames from the list once we have entries for them with complete etymologies (as in the case of Ossining), or what? - -sche (discuss) 00:58, 25 August 2014 (UTC)
Let's keep them as examples of what can be achieved, at least for now.
Lederer seems to have worked fairly diligently through his sources, which include hundreds of primary documents and secondary works. I didn't see any works in the bibliography that seemed to be specifically books or articles on the native languages themselves, but I ran out of time so I didn't look all that carefully. I'll be able to take a closer look soon. I may also extract the Dutch origin names. The English ones are fairly uninteresting, even to locals.
Why are Germans so fascinated by native Americans? DCDuring TALK 04:27, 25 August 2014 (UTC)
BTW, I have the towns there to provide a hint where in the county these places are, in case geography might have a bearing on the language of the toponym. There are a few from the Long Island Sound area, more from Bronx and Yonkers and along the Hudson to Peekskill, and others inland in northern Westchester. HTH. DCDuring TALK 04:33, 25 August 2014 (UTC)
I'm sure there are books written on that subject. I think it's partly the earlier European Noble Savage myths, combined with the lack of territorial conflicts that might have provided motivation for negative stereotypes, but also just the lure of the exotic and safely far away. Chuck Entz (talk) 05:05, 25 August 2014 (UTC)
I wonder that myself sometimes.
If you're asking why so many materials on native American tribes and languages were compiled by Germans, a large part of the answer is prosaic. Germany has long produced large numbers of ethnographers and linguists. A lot of materials on Pacific and African peoples and languages were also compiled by Germans.
If you're asking why so many non-linguists love "Indian" things ... well, that's Karl May's doing. He bought into and sold others on the romanticized notion Chuck mentions of simple and noble, exotic people living "authentic lives".
The town names should be helpful. - -sche (discuss) 07:01, 26 August 2014 (UTC)

Neologisms and "Web Words"

Personally, I've always taken "Web words" happily so long as they met certain criteria. I've always been particularly fond of Germanic or otherwise native ones, due to my love of writing "native" poetry.

Anent neologisms... it's been somewhat iffy. I am accepting of some, but not of others. For instance, "selfie" is a term that I never use; opting for the fairer "self-snapshot" or "snapshot of oneself". On the other hand, "troll" (as in the sense of "to bait and wait so as to start trouble" or the like) is one that I have happily accepted with open arms (mayhap due to its origins in angling terminology, though I honestly can't say for sure).

Now, the reason why I bring this up is because it seems that Wiktionary's methods of determining which "web words" and which neologisms are acceptable for inclusion are somewhat murkily composed. Whilst terms like "halgi" are included, others are not. I can't really tell what the "criteria for inclusion" entails sometimes, because it seems a bit vague.

Might you be able to shed some light on this? Tharthan (talk) 17:02, 31 August 2014 (UTC)

Yeah, numerous discussions have made it apparent that Wiktionary's policy on citing the internet is not as clear as it could be; in particular, it can take a while to unpack the ramifications of the words "durably archived" / "permanently recorded" in WT:CFI. But once those words are unpacked, "web words" and "print words" are subject to the same criteria for inclusion. Words in major languages have to be used, as in "he took a selfie", and not just mentioned, as in "he used the word 'selfie' to describe the picture he took of himself". (Lines like "he took what he called a 'selfie'" fall into a grey area of debatable use-vs-mention-ness.) The uses have to span a year, to weed out fad words that are only popular for a month, like the Russian translation of "pink slime" (which was only somewhat less of a fad in English). And the uses have to be in "durably archived"/"permanently recorded" media.
What is durably archived? Books, newspapers, journals and magazines are durably archived. (Google Books and Issuu are good ways of using the internet to search through those media.) Websites are not durable, because they go offline (and moreover are edited and reworded) without warning. Even articles on the websites of news organizations can be taken down — a Wikipedia article I just edited discussed a story which was removed at the request of the journalist, allegedly after he was intimidated. Even the Internet Archive, which has been discussed in the past, is not a durable archive, because it removes pages if site owners request that. The only online corpus which is durably archived is Usenet, because it is decentrally archived, and attempts to censor things from it have indeed failed (e.g. someone at one point tried to delete alt.religion.scientology, and failed). This failure of most web sources to be durably archived can make it harder to cite "web words" (cf. this). However, if a web word is attested per those criteria, it can have an entry just like any other attested word.
Does this clear things up any?
Note that because of the nature of Wiktionary (it's a work in progress, and it's a wiki anyone can edit in real-time), some unattested words may have entries (you can RFV those), and some attested words may not have entries yet (you can create those). Also note that strings that are analysable as misspellings (e.g. strings like licencise, and probably also uncommon strings from lolcat-speak or doge-speak) may be excluded as such. - -sche (discuss) 23:02, 31 August 2014 (UTC)
Yes. That clears up a lot. I now have a more adequate understanding of how the process works. Thanks much.
So citations from Usenet are considered to be among those of the "durably archived" / "permanently recorded" variety? Or, are they only somewhat so, and are thusly taken with a grain of salt? Tharthan (talk) 23:55, 31 August 2014 (UTC)
Usenet is as durably archived as print media, so a use of a word on Usenet is 'worth' as much as a use of a word in a book. But Usenet is more likely than print media to contain typos/misspellings, so if a string is analysable as a typo/misspelling, and it is only supported by Usenet citations, people may be more likely to analyse it as a misspelling and not an intentional use of a certain spelling/word. (For example, book citations might have done more to convince people of the word-hood of licencise than these Usenet citations did in this discussion.) But even books contain typos: I can't find an example offhand, but in RFV, if a book uses an unusual spelling sometimes and the usual spelling other times, it's usually assumed that the instances of the unusual spelling are typos. And when it's clear that something isn't a typo/misspelling, like "Rightpondian" or the video-game sense of "pull", then it doesn't matter whether the citations come from books or Usenet. There seem to be about 1100 entries that cite Usenet. - -sche (discuss) 01:56, 1 September 2014 (UTC)
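The attestation criteria described in this thread (uses rather than mentions, spanning at least a year, in durably archived media) can be sketched as a small check. This is a minimal Python sketch, not how WT:CFI is actually applied: the Citation type, the medium labels and the function names are all hypothetical, and the sketch ignores any minimum number of citations.

```python
from dataclasses import dataclass
from datetime import date

# Media the thread describes as durably archived (labels are illustrative).
DURABLE_MEDIA = {"book", "newspaper", "journal", "magazine", "usenet"}


@dataclass
class Citation:
    when: date      # date of the quotation
    medium: str     # e.g. "book", "usenet", "website"
    is_use: bool    # True for a use, False for a mere mention


def qualifying(citations):
    """Keep only uses (not mentions) found in durably archived media."""
    return [c for c in citations if c.is_use and c.medium in DURABLE_MEDIA]


def spans_a_year(citations) -> bool:
    """True if the qualifying citations span at least one year."""
    dates = sorted(c.when for c in qualifying(citations))
    if not dates:
        return False
    earliest, latest = dates[0], dates[-1]
    try:
        one_year_later = earliest.replace(year=earliest.year + 1)
    except ValueError:
        # earliest is 29 February and the following year is not a leap year
        one_year_later = earliest.replace(year=earliest.year + 1, day=28)
    return latest >= one_year_later
```

A fad word cited only within one month fails the check, as does a word whose only early citation is a mention or comes from an ordinary website.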

Rollback in error at told

I believe this rollback was done in error. The alternate pronunciations that were there were intentional. I intend to restore them. - Gilgamesh (talk) 13:52, 26 September 2014 (UTC)

I should have undone your edit with a more informative summary, I'm sorry. The pronunciations you added are unattested and dubious, per discussion on Angr's talk page, so I've removed them until such time as evidence of them comes along. Rollback is sometimes/often used as a quick way of undoing edits around here (if the edits are merely felt to make the entry worse, without the implication 'rollback' has on Wikipedia that the edits are vandalism), since Wiktionary's relatively small number of admins tend to be a lot busier than Wikipedia's larger number of admins... but it can tend to cause confusion, like now, when the edit was intentional and in good faith, but still made the entry worse. - -sche (discuss) 14:18, 26 September 2014 (UTC)
I've started a thread at Wiktionary:Tea_room/2014/September. It's important that this be sorted out, because bowl-bull, cull-coal, etc. have indeed become homophones, and it affects even General American for most people certainly my age (34) and younger. - Gilgamesh (talk) 14:21, 26 September 2014 (UTC)

Hey, erm...

I would have e-mailed you this or sent this message to you via a more private method if I could have, because I feel posting this here might come off as rude to the person in question (though I do not intend it as such).

User:Angr and I seem to be in disagreement over what should be allowed transcription-wise for a certain word, and we seem to be at a deadlock. As such, I thought that maybe a third party could be brought in so as to maybe give their opinion on the matter.

Now, I don't know really anything about your dialect, -sche, (and I don't mind being blissfully ignorant on that subject, since I think it's irrelevant to most parts of editing on Wiktionary) [though I remember seeing a reference to you at some point being in the Inland-North, though I don't really know the relevance of that] so I don't know where you'd fall anent this matter, but I would honestly hope (and truly do think) that that wouldn't (and shouldn't) matter, considering the argument here is transcription, and any linguist worth their salt knows how to properly transcribe vowel phonemes, and knows the difference between two different phonemes, whether monophthong, diphthong, or otherwise, irrespective of whether or not the vowel phonemes in question occur in his or her dialect.

Now, I firmly trust your knowledge and expertise in this field, hence why I have come to you. I think you may be able to help in settling this issue. So, if you'd be willing to offer your tuppence-worth on this matter, I'd be very grateful.

The aforementioned discussion can be found here: Tharthan (talk) 16:13, 28 September 2014 (UTC)

My "e-mail this user" link should be enabled (in the toolbar on the side of this page, a few items below "what links here"); if it's not, let me know. (Not that I check my e-mail with any frequency at all...)
My "expertise in this field" is amateur compared to Angr's. But since I've been asked, I'll give my thoughts:
I remember noticing during a previous Tea Room discussion of the M-m-m merger that one of the problems one faces if one wants to transcribe 'marry', 'merry' or 'Mary', or for that matter 'air' or 'ear', is that the IPA doesn't have symbols that denote these sounds perfectly, so one is left using approximate transcriptions. That's not automatically problematic — if a language's "e" sound is actually 15% closer to canonical /ɛ/ than canonical /e/ is, it's fine to nonetheless transcribe it as /e/, or if necessary /e̞/; one needn't invent a whole new letter for it. It does, however, mean that discussions of whether or not sounds are distinct (and discussions of how to transcribe them) are more difficult. For example, according to our entry and, 'merry' is /ˈmɛɹi/ and 'Mary' is /ˈmɛəɹi/ for speakers who don't have the M-m-m merger, while both are /ˈmɛɹi/ for speakers who do. However, both our audio clips and's contain a vowel that is distinct from the /ɛ/ in 'bet' (i.e., the Vr sequences in the audio clips aren't just /ɛ/ followed by /ɹ/). That means that someone who was trying to figure out whether her pronunciation of 'Mary' used /ɛə/ or /ɛ/ would run into trouble if she tried pronouncing 'Mary' and then pronouncing words with /ɛ/ in them like 'bet' to see if she used the same vowel in both — she'd probably conclude that she didn't use the same vowel for the two words, even if the vowel she used in 'Mary' was the one we transcribe as /ɛ/.
However, setting that issue to the side...
According to our entry and, 'air' is /ɛəɹ/*, with the same vowel as unmerged 'Mary'. Our audio clip is curt and sounds like it contains only a single (non-diphthong) vowel, but's has more of a /ə/. Likewise, 'affair' is /əˈfɛəɹ/ per, and the vowel in the audio file is the same as the vowel (diphthong) in's 'Mary' audio file.
That means it would be reasonable to transcribe the sound as /ɛəɹ/ (or /ɛɚ/, which is synonymous) for (some) American accents. But is /ɛɹ/ wrong? Well, is there an American accent that contrasts /ɛɹ/ and /ɛəɹ/ in this (non-intervocalic) context? If not, then the worst one can say is that /ɛɹ/ is potentially confusing, but as long as there's a page explaining how the symbols are used, it's not wrong, and it's possibly not even any more confusing (or any less accurate) than our use of /ɛ/ to mean one thing in merry and another in bet.
Merriam-Webster and Random House use the same transcription for 'merry' and 'affair', but also for 'Mary' (apparently they treat the M-m-m merger as standard). The various dictionaries that make up transcribe 'merry' and 'affair' differently.
You can raise the issue in the Tea Room for broader discussion if you think the default transcription of the 'air'/'affair' sound should be switched from /ɛɹ/ to /ɛəɹ/ (or /ɛɚ/). I have no strong preference, since I don't think either transcription is ideal (I don't think there is any ideal transcription of the sound).
Note that transcribing 'air' (and 'affair', etc) narrowly, in square brackets, as [ɛəɹ] or [ɛɚ] is another matter entirely, and probably a lot more straightforward.
(* Our entry also lists /ɛːɹ/ as a possible US pronunciation of 'air', but this is suspect, since vowel length is not phonemic in American English. Actually, that's another case where a small distinction is glossed over and one symbol is used for two slightly different but non-contrastive things: /i/, /u/, etc is longer in some words than in others in American English, but they're not distinguished as having /i/ vs /iː/ because vowel length is not actually contrastive.)
- -sche (discuss) 09:02, 29 September 2014 (UTC)
Oh, you're right. I didn't notice that on the sidebar.
I actually agree with you there, because I initially transcribed the /ɛə/ vowel as /e/, because that's how my mind thought of it (this might be due to plain /ɛ/ indeed being a plain /ɛ/ in my dialect, whilst /ɛə/ is more of an /ɛ̝ə/ in my dialect). Nevertheless, I agreed that the sound was far closer to /ɛə/ than /e/, so I changed my transcription practices accordingly.
Then is it fine to list both pronunciations /ɛɚ/ and /ɛɹ/? You're right to say that there is probably no English dialect that contrasts /ɛɹ/ and /ɛɚ/ (my non-mMm merger dialect doesn't, since, as far as I know, /ɛɹ/ doesn't end any word in the language [with the possible exception of "err", as I mentioned on Angr's talk page]), but it's still better to list both pronunciations /ɛɚ/ and /ɛɹ/ than to list just /ɛɹ/ and have people say "Wait a minute... "affair" has the same vowel as "fairy", which is /ɛɚ/ for me in my non-mMm merger dialect, but yet the only pronunciation listed here is /ɛɹ/. Am I wrong in pronouncing it /ɛɚ/?" Furthermore, it couldn't do any harm to have both pronunciations listed. So could we at least have both /ɛɚ/ and /ɛɹ/ pronunciations given for affair? Tharthan (talk) 11:03, 29 September 2014 (UTC)


Thanks for resolving the mini-contretemps at "bear"... AnonMoos (talk) 17:17, 29 September 2014 (UTC)


I hadn't realized I had accidentally put words in the wrong category. Thanks for the heads up. — LlywelynII 23:50, 30 September 2014 (UTC)


Discussion moved to Talk:lebendig.

Lewis and Clark

I've borrowed Lewis and Clark: Pioneering Naturalists, which has two appendices of plants and animals "discovered" by Lewis and Clark. For my purposes the listed species name(s) and vernacular names are of greatest interest. The appendices don't have non-English names. But the discoveries have references to the volume and page in Thwaites's edition and most have a date and location for the discovery. Have you already mined Lewis and Clark for native names? Do you intend to do so? Are there other sources for that? DCDuring TALK 20:15, 2 November 2014 (UTC)

I've only 'spot mined' Lewis and Clark, i.e. when Google Books let me know that a page of their journals mentioned pasheco, I checked the surrounding pages for other native / native-derived words. I haven't mined the whole work. If you'd like me to (try to) find and add native names for any of the species or vernacular names you add, I'll see what I can do. I've been rather distracted from my Native American word documentation project. - -sche (discuss) 01:40, 24 November 2014 (UTC)


I've started a page User:DCDuring/Geology and copied your items there, as well as a WP table. It suggests some lines for improving our entries as well as showing redlinks. I also came across the Geowhen Database, which is a convenient source of confirmation of the meaning of some of these terms. DCDuring TALK 17:01, 3 November 2014 (UTC)

Just so you know, although I haven't had much time for editing lately, I'm still available to help with geological terms, as I have some training in the field. If you leave me a message on my talkpage or tag me in relation to any issue you have when adding geological jargon or etymologies thereof, I'll be sure to respond. —Μετάknowledgediscuss/deeds 20:57, 3 November 2014 (UTC)
If I can find the time, I'll check out which terms are (a) most-linked to within Wiktionary or, probably more usefully, Wikipedia (I wonder if there's a toolserver/wmflabs tool that does that), and/or (b) most common in ngrams. It would make sense to tackle those first. - -sche (discuss) 01:40, 24 November 2014 (UTC)


Hi. I saw you reverted some of my edits on this word. I was mistaken to change the etymology in the way I did. I thought the theory of its deriving from Slavic was outdated, so I put that one into a "postscript". I've since seen that Kluge is also of this opinion and I was about to make that revert myself. -- As to the quotation I deleted, I just think that it misleads people to believe the word is obsolete and there are no more current quotations to be found. I don't think such quotations are very useful, but I will refrain from deleting them from now on. Sorry! And best regards! Kolmiel (talk) 00:20, 4 November 2014 (UTC)

Yeah, and I made a little edit on the wording of your version, because I thought it might suggest that German Schmetten is from English (which of course you didn't intend). Kolmiel (talk) 00:22, 4 November 2014 (UTC)

think of the children

Hi there -sche, you had previously pitched in and helpfully formatted an entry I improved, Streisand effect, as Word of the day.

Equinox (talkcontribs) created the entry on think of the children and I recently improved it.

I nominated it at Wiktionary:Word of the day/Nominations, however Ungoliant MMDCCLXIV (talkcontribs) mentioned at user talk:Equinox that unfortunately these days most of those that appear on the Main Page are recycled entries from prior years because it's pretty inactive.

I was wondering if you could add it to one of the upcoming dates for Word of the day?

Thank you,

-- Cirt (talk) 20:57, 5 November 2014 (UTC)

I was able to get help from others, but thanks for your time. :) -- Cirt (talk) 18:56, 16 November 2014 (UTC)
I'm glad someone helped you, and glad a new word will be featured. I'm sorry I didn't respond sooner. Perhaps over the upcoming holidays, when people have time off from work and school, someone will have time to set a bunch more Words of the Day. 01:40, 24 November 2014 (UTC)

Non-Oxford British English standard spelling

Why put this at all? The fact that Oxford University Press uses the z spelling has nothing to do with the usage of the word. But I know you must have some reason for putting it in. What is it? Renard Migrant (talk) 21:12, 15 November 2014 (UTC)

Hi; sorry for not responding sooner. It seemed like the best way of distinguishing the two British spellings. Everyone (in Britain) spells flavour the same, but with something like actuali[sibilant]e, some Brits (most noticeably those affiliated with the OUP) spell it actualize, while many others spell it actualise. As I mentioned to an IP on Stephen's talk page, there have been a few discussions of how to describe the spellings that are used by British people, and other people throughout the Commonwealth, and all of the wordings have problems. Calling the spelling Oxford uses "Oxford British", and the other by elimination "non-Oxford", seemed best to me, but I'm open to being persuaded that another wording would be better. - -sche (discuss) 01:56, 24 November 2014 (UTC)


Re diff: I do think this is "more worthy of an 'uncommon' label than other -es genitives vs -s ones", because Archives really is virtually unknown in any German written in the past 175 years. That's why I wanted to label it "archaic", but the anon changed it to "rarer" because of a single cite on b.g.c from 2006 (which I think is simply a mistake on the author's part, but I can't prove it). —Aɴɢʀ (talk) 21:38, 17 December 2014 (UTC)

As the user points out on WT:RFD, there are more modern cites than just the one in the entry. And ngram data for both eines Archivs vs eines Archives and the compound Staatsarchivs vs Staatsarchives show that the -es version is still about half as common now as it was in the past (i.e. there does not seem to have been any sharp drop-off in usage), and it is about 1/25th as common in the modern era as the -s version, which is not an unusual ratio for an -es vs an -s form. Compare how, in the other direction, Geschäftsfreunds is now about 1/25th as common as Geschäftsfreundes, and Jubiläumsjahrs is about 1/15th as common as Jubiläumsjahres. (Those are two of the words the Duden cites in explaining how euphony helps decide which genitive ending to use.) - -sche (discuss) 22:39, 17 December 2014 (UTC)
But those are compounds, which are always skewed toward using the e-less form (eines Hofes is 15× more common than eines Hofs, but Hauptbahnhofes is only half as common as Hauptbahnhofs). The fact that Archiv isn't a compound would lead us to expect Archives to be more common than Archivs, not 25× rarer. —Aɴɢʀ (talk) 23:37, 17 December 2014 (UTC)


I'd be shocked if you found this as the imperfect subjunctive is a literary tense and fucker is new and extremely informal. Previous discussions have been favourable to creating all hypothetical verb forms because RFVing them would be a monstrously time consuming issue. See for example défragmentassions and the definition of défragmenter. Renard Migrant (talk) 20:36, 24 December 2014 (UTC)


See WT:RFV#Schlackenlosigkeit. The discussion has advanced beyond my extremely modest knowledge of German and may even need a native speaker. DCDuring TALK 23:01, 13 January 2015 (UTC)

Αγαρηνών et al

The "misused" templates were put there for a purpose - if you want to change any more Greek entries please let me know.   — Saltmarshσυζήτηση-talk 11:11, 16 January 2015 (UTC)


Could you check the codes on this page? Thanks. DTLHS (talk) 22:08, 23 January 2015 (UTC)

Meh. Someone changed the header, but not the codes, from nds-de to plain nds (rather than adding a separate section for the Dutch Low Saxon term). >.>   The entry could be band-aided by either changing the header or the codes, but the general disagreement and slow-motion edit-warring about how to handle the various Low German lects makes for so much ugliness that I am losing interest in editing them. - -sche (discuss) 03:59, 24 January 2015 (UTC)

Why the "hmm..."?

I agree that the previously-listed meaning of that was odd, but... what is the meaning of your edit summary? Are you doubtful of something? Or...? Tharthan (talk) 21:52, 24 January 2015 (UTC)

Mostly I was doubting the previously-listed meaning, but I also wonder if the wording I introduced really covers the citations, and/or if there are actually two senses, one used of people, and the other of places (the latter presumably similar to shire#Verb). - -sche (discuss) 23:22, 24 January 2015 (UTC)
I share your doubts. Also, are you sure that parish is a verb? Parished could easily be interpreted as a denominal adjective. DCDuring TALK 23:55, 24 January 2015 (UTC)
The 1972 citation and the second sentence of the 1992 citation seem very verbal to me. I'll see if I can find other inflected forms. - -sche (discuss) 01:41, 25 January 2015 (UTC)
Check out the 1917 and 1991 citations (the latter technically of re-parish). There's also the citation below, which I can't make sense of. - -sche (discuss) 01:49, 25 January 2015 (UTC)
  • 1903, Maxwell Gray, Richard Rosny, page 210:
    "You will take pleasure in parishing. Mother used to parish."
    "How do you know I like parishing?"
    "Your uncle said so."
    "Oh! did he?"
    "And you may like the rectory people; it's a fine old house, and often full of visitors."
after e/c
I'm not hostile to the verb view for the sense, just uncertain. I've looked for the parishing form, but just found it with certainty for what is now a new intransitive sense, for a distinct etymology of parish#Etymology 2 ("perish"), and for a noun sense. I may just have a block for the verb sense. There was a book title that seemed to be the sense I've been doubting.
The citation above is of the definition I added: "To visit residents of a parish". It's used of parish priests and also of women doing socializing possibly under color of visiting the sick, aged, shut-ins etc. DCDuring TALK 01:57, 25 January 2015 (UTC)
OK, I'll add it to that sense, which is now well-cited. - -sche (discuss) 01:59, 25 January 2015 (UTC)
The 1917 cite is syntactically though not semantically intransitive. The "re-parishing" cite is helpful. It's tough with a word that shows up so uncommonly in what are to me somewhat alien contexts. The word is certainly used with a meaning that is at least nearly verbal. I doubt anyone would challenge it on grounds such as mine. DCDuring TALK 02:08, 25 January 2015 (UTC)


As was revealed in a discussion that I had previously with Dbfirs, it seems the distribution of /ɛəɹ/ and /æɹ/ words differs between British English and dialects of North American English that do not possess the merry, Mary, marry merger.


"vary" is often /væɹi/ in non-merry,Mary,marry merger dialects (though, I will admit, its traditional /vɛəɹi/ pronunciation is still heard amongst the older generation. My mother, for instance, uses /vɛəɹi/, whilst my father and I use /væɹi/ [as does much of the younger generation]. Similarly, parent for myself, my family, and most of my peers is /ˈpæɹənt/, whilst /pɛəɹənt/ is the pronunciation I have heard in church and by some others. It seems to be about a 50-50 distribution.

In conclusion, some words that have a traditional /ɛəɹ/ in British English and old fashioned North American English seem to have shifted to /æɹ/ in the younger generations.

Do you (or anyone else visiting your talk page) have any idea as to why this might be? Tharthan (talk) 16:34, 25 January 2015 (UTC)

Generic phonetic simplification? Influence from GenAm, where the sounds aren't distinguished? I don't know. North American English regional phonology#New_England says "Western New England [... and] Connecticut and western Massachusetts in particular show the same general phonological system as the Inland North, and some speakers show a general tendency in the direction of the Northern Cities Vowel Shift—for instance, an /æ/ that is somewhat higher and tenser than average[.]" The phoneme that's next higher than /æ/ is /ɛ/. You're describing things going in the opposite direction, but I can imagine how a reduction in the contrast between the two sounds in non-Mary-merging dialects, combined with an outright merger of the sounds in the surrounding dialects, could lead people who tried to maintain a distinction between the words (Mary, marry, merry) to use a new / un-original sound to do so. In English, I've heard people maintain the pen/pin distinction backwards, and in German people mix up [ɛː] and [eː] if they try to maintain a distinction between them. - -sche (discuss) 21:45, 25 January 2015 (UTC)
Hmm... it seems to me to be more of a specific hypercorrection than anything else, though, because other words besides the previous two seem to retain their correct pronunciations. I dunno. I just hope that we don't have another Great Vowel Shift or anything like that any time soon, because that seems to be the direction being headed towards. Tharthan (talk) 21:55, 25 January 2015 (UTC)


Hi there. I wanted to ask you about the [phonetic] transcription of the German /phonem/ /ʃ/. Should it be [ʃʷ] because of the lip rounding, or should we not use [ʷ] just as we've decided not to use [ʰ]? I personally would be in favour of [ʃʷ] because unlike aspiration there seems to be little regional/idiolectal variation and, even more importantly, there would be no wondering when and when not to use it since /ʃ/ would just always become [ʃʷ]... But I don't know. What do you think? Kolmiel (talk) 17:41, 25 January 2015 (UTC)

I would treat it like aspiration, and so I wouldn't use it. I note that de.Wikt, which only uses narrow transcriptions, doesn't use [ʷ]. You could ask on WT:T:ADE, though. This is not entirely here or there, but ... people occasionally propose "diaphonemes" around here (ultra-broad transcription); this seems like the opposite, ultra-narrow transcription. Perhaps one day we'll start adding both and have a sequence of //ultra-broad//, /broad/, [narrow] and [[ultra-narrow]] transcriptions. - -sche (discuss) 21:53, 25 January 2015 (UTC)
No it's fine, just wanted to check if you were in favour of using it. It's not that important I guess, and it's not a "Herzensangelegenheit" of mine.
I just think we shouldn't base our decision on the German Wiktionary. Their transcriptions aren't narrow, they're just given between square brackets because most traditional dictionaries do that. They would be very wrong if understood literally, especially things like [pakn̩] which don't exist in the German language and which I suspect might be almost physically impossible for the human mouth. Kolmiel (talk) 21:48, 27 January 2015 (UTC)

The names= field in the data modules

I'm looking at changing this now, and I already made a few initial modifications. But I'd like to confirm just what the plan was again. If I remember correctly, the idea was to split it into three fields:

  1. canonicalName
  2. otherNames
  3. Some field for the things that are subsumed under this name, but are not just alternative names.

I'm not sure what to call that third field, though, so do you have suggestions? Also, what should be done in ambiguous cases where there is no agreement whether something should be classified a subvariety or not? Perhaps, I could only split off number 1 for now, leaving 2 and 3 together until we sort that out more completely. —CodeCat 22:21, 25 January 2015 (UTC)

Oh, great! :)
Perhaps the third field could be called "varieties" or "varietyNames"?
I assume that when you say "no agreement whether something should be classified a subvariety or not", the alternative to classifying it as a subvariety is classifying it as an alternative name for the whole language. (If there's disagreement about whether or not something is a dialect of one language or a separate language, that's a question we're going to settle at an earlier stage, namely the stage of granting it a code or not, before we ever get to any of these names fields. Right?) There are cases where certain names refer both to dialects and to the whole language; in the earlier discussion I suggested that in such cases we could either (1) list the name in both places, or (2) decide that anything listed in a higher field will not be repeated in a lower field (so, anything listed in "otherNames" will not be repeated in "varietyNames"). - -sche (discuss) 22:37, 25 January 2015 (UTC)
The question is mostly relevant to reconstructed languages, at least in the way I intended it. Proto-Uralic for example has Proto-Finno-Ugric as a subvariety, but some linguists contend that they are one and the same. Austroasiatic is often considered synonymous with Mon-Khmer (both share a Wikipedia article too). And there are probably similar situations for other languages.
I'm not sure if "varieties" is clear enough. I would like to have "sub" in the name so that it's clear in what way it's distinct from "otherNames". So "subvarieties"? I've also seen "sublects" used by some people. —CodeCat 22:53, 25 January 2015 (UTC)
Well, I would handle proto-language cases the same as other cases, either always list such names in both fields, or decide one field always has priority. The first approach might more accurately convey that some authorities use _(whatever)_ as an alt name for the whole language and other authorities use it as the name of a "dialect", and keeps us from having to pick which field to list the name in. If we went with the second approach, my gut reaction would be to "prioritize" the "higher" field, and so list "Proto-Finno-Ugric" as an alternative to "Proto-Uralic" and not list it as a dialect.
As for the name: well, how about "subvarieties"/"subvarietyNames"? All but one of the hits of google books:"sublect" OR "sublects" are scannos of "subject". Or perhaps something like "subsumedVarieties", to convey that the main purpose is to list cases where ISO-code-having subvarieties have been subsumed, rather than e.g. to start listing every non-code-having dialect of English. - -sche (discuss) 23:22, 25 January 2015 (UTC)
(edit conflict) Of the two, I like "sublects"- it sounds more neutral. Actually, it's the "sub" part that makes me nervous. Except in the case of pluricentric languages, we don't explicitly mention the standard lect at all, which is every bit as much a sublect as all the things we call the sublects. More often than not, the only difference between the "standard" and the "sublects" is an accident of history: In Old English, for instance, the Wessex dialect is generally treated as standard, but eventually the East Midlands dialect took its place. That means a sublect became the standard and the standard became a sublect. In reality, though, they're still just two sublects, with the main difference being that the standard sublect tends to influence and crowd out the other sublects.
Of course, it would look funny to include "Standard xyz" in the list of sublects, so I guess we're stuck with the current arrangement. Still, I wonder if there's a way to distinguish the language as a whole from its sublects without implying that only those lects different from the standard are sublects. Chuck Entz (talk) 00:08, 26 January 2015 (UTC)
This raises the question of what we want to list in sub[variety/lect] field. Initially, when subvariety names were included in languages' lists of alt names, it was because the named subvarieties had previously been considered languages (generally by the ISO, but in some cases merely by us via granted and then revoked exceptional codes); the subvariety names were listed so that people who thought they were languages would know where they went.
However, I can see how we might find it useful to make comprehensive lists of languages' dialects (including dialects which have never been considered their own languages); such lists could in some far-future version of Wiktionary be meshed with the context labels so that entries could be put in cleanup categories if they were categorized as belonging to another language's dialect, for instance.
I'd still use "subvarieties" for the name since "sublects" doesn't appear to be a word; even the Google Scholar hits are scannos for "subjects", lol. - -sche (discuss) 19:54, 26 January 2015 (UTC)
I think it would be a good idea to make a list of dialects. But it would be very hard to manage because there are so many, and there will always be a need to specify a particular variety that is more fine-grained than any we've defined so far. So if we want to add something like that, we would have to take the possibility of unrecognised dialects into account, like the label template already does. —CodeCat 20:05, 26 January 2015 (UTC)
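For illustration only, the split discussed in this thread could be sketched roughly as follows. The real data modules are written in Lua, so this Python version is just a model of the idea, not the actual implementation. The function name `split_names`, the parameters `subvariety_names`, `ambiguous_names` and `repeat_ambiguous`, and the assumption that the first entry of the old list is the canonical name are all hypothetical. The third field is called "subvarieties" here, which is only one of the names proposed above.

```python
def split_names(names, subvariety_names=frozenset(),
                ambiguous_names=frozenset(), repeat_ambiguous=False):
    """Split a flat list of names into three fields.

    Assumption: names[0] is the canonical name. Names in
    `subvariety_names` go to "subvarieties". Names in `ambiguous_names`
    (used both for the whole language and for a subvariety) always go to
    "otherNames"; they are repeated under "subvarieties" only when
    `repeat_ambiguous` is True (approach 1). Otherwise the higher field
    takes priority (approach 2).
    """
    if not names:
        raise ValueError("names must contain at least the canonical name")

    canonical, rest = names[0], names[1:]
    other_names, subvarieties = [], []

    for name in rest:
        if name in ambiguous_names:
            other_names.append(name)
            if repeat_ambiguous:
                subvarieties.append(name)
        elif name in subvariety_names:
            subvarieties.append(name)
        else:
            other_names.append(name)

    return {
        "canonicalName": canonical,
        "otherNames": other_names,
        "subvarieties": subvarieties,
    }


# Approach 2: "Proto-Finno-Ugric" is listed only as an alternative name.
result = split_names(
    ["Proto-Uralic", "Proto-Finno-Ugric"],
    ambiguous_names={"Proto-Finno-Ugric"},
)
```

With `repeat_ambiguous=True` the same call would list "Proto-Finno-Ugric" in both fields, which corresponds to the first approach. Names that match neither set fall through to "otherNames", so unsorted cases can be left there until they are classified.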


Hi! If this is a real French verb, could you define it? If it's not a real verb, I'll need to delete all the inflected forms someone created for it (Special:WhatLinksHere/surbasser). - -sche (discuss) 09:20, 27 January 2015 (UTC)

Most often, it's a typo for surpasser or surbaisser. However: 1. it seems that, in architecture, surbassé has been used as well as surbaissé (but I cannot find citations clearly showing that it was used as a verb). 2. I also find surbassé used for music, and very few uses clearly using a verb surbasser (try to Google "il surbasse" and "qui surbasse"). I think I can guess the sense (make music overbassed), but I'm not a specialist. Lmaltier (talk) 21:36, 27 January 2015 (UTC)
I see. Thanks for checking! - -sche (discuss) 21:42, 27 January 2015 (UTC)

Flood flag

Hi, could you give me the flood flag for about 20 minutes, please? --Type56op9 (talk) 18:41, 28 January 2015 (UTC)

Nah, you're not supposed to be operating a bot. - -sche (discuss) 19:36, 28 January 2015 (UTC)
Actually, it's not a bot. It is WT:ACCEL, which looks like a bot. --Type56op9 (talk) 11:40, 29 January 2015 (UTC)
Fair enough. I just went through and patrolled your latest batch. - -sche (discuss) 20:01, 29 January 2015 (UTC)


Hi, could you create a language module for Proto-Ta-Arawakan as well? --Victar (talk) 19:17, 30 January 2015 (UTC)

I've created a family code for Ta-Arawakan, "awd-taa". However, neither "Proto-Ta-Arawakan" nor "Proto-Ta-Arawak", nor "Proto-Ta-Maipurean", "Proto-Ta-Maipuran", or any of the other alt names I tried gets any Google Books or Scholar hits, or even raw web hits. Are you sure it's a valid proto-language? - -sche (discuss) 19:52, 30 January 2015 (UTC)
Thanks. Yeah, what happens is it usually just gets called Proto-Arawak. Incidentally, Arawak is also a language within Ta-Arawak, otherwise known as Lokono. It's all very convoluted, but consequently I have these reconstructions that shouldn't be called Proto-Arawakan since they aren't attested outside of Ta-Arawak. --Victar (talk) 22:17, 30 January 2015 (UTC)
I've also seen it awkwardly called "proto-Caribbean Northern Arawak". --Victar (talk) 23:07, 30 January 2015 (UTC)
OK, thanks for the clarification. In general, I would say "meh, if someone wants to create entries for such-and-such proto-language that existed, go for it". However, User:Tropylium has recently been arguing against creating separate codes and appendices for cases where things are reconstructible only to certain dialects of proto-languages, and if other linguistic works just treat Proto-Ta-Arawak as Proto-Arawak (and AFAICT never mention or confirm the existence of Proto-Ta-Arawak at all), that does make me question if we really need a code for it. Tropylium, do you have an opinion on this? - -sche (discuss) 03:48, 31 January 2015 (UTC)
Looking at Wikipedia's classification, it seems that Ta-Arawakan is a fairly deep subgroup within the wider Arawakan family, and accepted by each of the three otherwise very different classification schemes. Sounds like good enough grounds for separate treatment. Cleanup will still be possible later, if it turns out that there exists a better way to define a subgroup comprising these languages (but AFAIK Arawakan is not one of those families where a micro-detailed family tree is known yet). --Tropylium (talk) 04:07, 31 January 2015 (UTC)
OK, I have created "Proto-Ta-Arawakan" with the code "awd-taa-pro". - -sche (discuss) 04:36, 31 January 2015 (UTC)
Thanks to you both! Yeah, the whole Arawak tree is outdated, based on a paper from 1991. I'm working on a draft for a new version based on various published works, w:User:Victar/Template:Arawakan languages. --Victar (talk) 17:37, 31 January 2015 (UTC)

I wonder where the heck the D came from in awd and in Taíno the Q in tnq? I think the Arawak languages just got the bottom of the barrel. If I had some say, I would rename Arawak to lcn for Lokono/Locono and use arw for the language family. --Victar (talk) 01:27, 31 January 2015 (UTC)

Yeah, some forethought would have done the ISO good. Particularly strange are the cases where languages which have three-letter names have been given codes that aren't those three letters, e.g. Abu is ado (while abu is Abure), Col is liw, and so on. - -sche (discuss) 03:49, 31 January 2015 (UTC)


"sınalgı" was deleted but they (88.XXX.XXX.XXX) added again! --123snake45 (talk) 02:13, 1 February 2015 (UTC)

CodeCat has deleted it. The IP seems to be correct that there are citations of the word on Usenet now, but there are only two of them, and they're from only a few months apart; the word would need three citations spanning over a year to meet WT:CFI. - -sche (discuss) 02:22, 1 February 2015 (UTC)
The author (Arslan Tekin) says: "Look at it, it is using sınalgı for television and ünalgı for telephone at Kyrgyzstan"

So, it is Kyrgyz. It isn't Turkish. --123snake45 (talk) 03:00, 1 February 2015 (UTC)
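The attestation threshold mentioned above (three citations spanning a year) can be sketched as a simple check. This is a minimal illustration only: the function name and parameters are hypothetical, treating "a year" as 365 days is a simplification, and the real criteria in WT:CFI involve more than counting dates.

```python
from datetime import date


def meets_attestation_threshold(citation_dates, min_citations=3,
                                min_span_days=365):
    """Return True if there are at least `min_citations` citations and
    the earliest and latest are at least `min_span_days` apart.

    Simplified sketch: it only looks at the number and dates of the
    citations, not at their source or independence.
    """
    if len(citation_dates) < min_citations:
        return False
    span = max(citation_dates) - min(citation_dates)
    return span.days >= min_span_days


# Hypothetical dates: two citations a few months apart do not pass.
too_few = meets_attestation_threshold([date(2014, 9, 1), date(2014, 12, 1)])
```

Two citations from a few months apart fail on both counts, which is the situation described for "sınalgı" above.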


Can you take a look at the RFV page? I've added the citations with Azerbaijani adaptations so you may compare them. -- 17:13, 3 February 2015 (UTC)

I invited three Turkish-speaking users to take a look at the citations. One of them, User:Dijan, is the one who said the previous citations were Azeri. The Azeri versions you've provided do look consistent with Dijan's comment that "every single one of them is a Turkish rendition of the Azeri language (literature and poetry) that was not translated into Turkish", but I will wait for the other users to comment. I'm at a disadvantage here because I (and most other Wiktionarians) don't speak Turkish or Azeri, and it's clear there are people with axes to grind on both sides of this issue — in some cases it seems pretty clear that people have made up words that aren't actually in use, and in other cases people seem to be refusing to believe words that seem real (e.g. Citations:haydamak, where it looks like other print dictionaries are confirming that the citations are using haydamak to mean "drive"). - -sche (discuss) 17:26, 3 February 2015 (UTC)


I don't agree that languages are proper nouns, but if that is Wiktionary policy, I'm not going to upset the apple cart, but just let you know that not everyone agrees. Donnanz (talk) 17:51, 3 February 2015 (UTC)

I don't know that there's a policy, but it certainly seems to be common practice; all the other language names I can think of are currently categorized as proper nouns: Portuguese, Spanish, Basque, French, English, Dutch, German, Danish, Norwegian, Chinese, Navajo, etc. However, there has been some discussion in the past about how some of the things that are commonly categorized as proper nouns, such as personal names, fail to meet some of the usual tests of proper-noun-ness (names are countable; "there are two Johns in my class"). You could bring the matter up in the BP and see what others think. Languages do seem to meet more tests of proper-noun-ness than personal names, though (and there wasn't even consensus to stop treating names as proper nouns). - -sche (discuss) 18:09, 3 February 2015 (UTC)
Hmm, OK, I'll think about the Beer Parlour. I would categorise names such as Gertrude, the Houses of Parliament, the White House, and the Black Sea as proper nouns, and surnames of course, and stop there. But as you point out there can be a problem with people's names; the Browns and the Joneses spring to mind. Also place names, two Bristols, two Birminghams, two Londons (maybe more), but place names and people's names are really proper nouns despite that. Donnanz (talk) 18:39, 3 February 2015 (UTC)
If you are thinking about the matter, consider that taxa are considered proper nouns, because they are names of individual natural kinds (old-style Linnaean taxonomy) or lineages. This is somewhat similar to the Roman gens, or other groups of descendants of a common ancestor. Organization names, toponyms of all kinds, brands/trademarks are all proper nouns, whatever word class their components are. DCDuring TALK 18:50, 3 February 2015 (UTC)
No, I wouldn't argue with taxa (taxas?), brands, trademarks, names of organisations etc. I think it's just languages as proper nouns I disagree with. Donnanz (talk) 19:02, 3 February 2015 (UTC)
The argument, I think, is that a language is a singular thing that a community speaks, just like e.g. a country is a singular place that a community lives. Of course, both can be pluralized: one can speak of Germanies, Americas, and even Frances, and one can speak of "various Englishes" (American, British, Indian, etc), "Norwegians" (Bokmål, Nynorsk, Riksmål, etc), "Germans", etc (though our entries currently don't, except in the first case). It may well be as technically inaccurate to label countries and languages as proper nouns as it is to label personal names as proper nouns. On the other hand, it seems to be common, among those dictionaries which use the label "proper noun", to label all of those types of thing as proper noun, and they do generally fit tests of proper-noun-ness. - -sche (discuss) 19:41, 3 February 2015 (UTC)
There's quite a few examples of plural place names: Aleutians (that entry needs splitting), Falklands, Faroes (Faroe Islands), Netherlands (Nederland in Dutch) to name a few. But languages (in my opinion) are mass nouns, instead of Englishes and Norwegians (the people are Norwegians), we should refer to forms of English, forms of Norwegian and so on. Donnanz (talk) 20:42, 3 February 2015 (UTC)
I think Netherlands (where the word for a singular country happens to be plural) is different from Frances (the plural of France, used to talk about e.g. different temporal or social incarnations of France). I can find several instances of Netherlands being pluralized, both invariantly ("the two Netherlands", a la "the two fish") and, rarely (and only "in the wild", not in places that meet CFI), as Netherlandses.
Hmm, mass nouns... that's plausible. Well, we have a fair few grammarians here, let's see what they think. Would you like to bring it up in the BP, or would you like me to?
DCDuring, does CGEL say anything about whether languages are nouns or proper nouns or mass nouns? For that matter, does it say anything about whether given names are proper nouns or not? (Apologies if you've answered the latter question previously and I'm forgetting.) - -sche (discuss) 21:50, 3 February 2015 (UTC)
I think the name Netherlands may be historical as it also took in all the Low Countries including Belgian Flanders at one time. It is still referred to as het Koninkrijk der Nederlanden (qv Nederlanden). Anyway, I suppose I had better start a thread in the BP. Donnanz (talk) 22:20, 3 February 2015 (UTC)
@-sche: I don't see any explicit statement in CGEL that a name of a language is a proper noun, nor that it is any other type of noun. There is no reason why a proper name couldn't have a homonym that is a mass noun. Or rather isn't that just one of the generic secondary uses of many proper names, eg, "We've had too little Ruakh in our discussions lately." (The "too much" examples would cause trouble.) DCDuring TALK 23:53, 3 February 2015 (UTC)


Sorry for all the deletion requests. I was basing the original reconstructions on some outdated material. Thanks. --Victar (talk) 07:04, 5 February 2015 (UTC)

No problem. With wt:AWB, it's not that hard to delete a bunch of pages. - -sche (discuss) 07:08, 5 February 2015 (UTC)

dative -e

Discussion moved to Template talk:de-decl-noun-n.

A friendly request to enable AWB use

And also, could you remove edit protected status for CheckPage? I can't edit it. --Dixtosa (talk) 12:41, 7 February 2015 (UTC)

Sure, I can add you to the checkpage. :) I'm not going to unprotect it, though; it's supposed to be protected, as a safeguard against people who don't know what they're doing adding themselves to it. - -sche (discuss) 18:20, 7 February 2015 (UTC)

Using passer and sortir with être

Do passer and sortir use être under exactly the same circumstances? Their usage notes are a little different, and I'm not sure if that's meant to imply that the terms use être under different circumstances or not. If they use être under the same circumstances, I'd like to reword Template:U:fr:may take être as much as needed and deploy it on both entries; otherwise, there doesn't seem to be a use for that template (it's currently unused and there's no point in templatizing usage notes that only apply to a single entry) and I'd like to delete it, unless you know of other entries that could use it. - -sche (discuss) 20:51, 9 February 2015 (UTC)

Yes, this template is OK, it applies to both entries, but a more complete list is (at least) descendre, monter, passer, redescendre, remonter, rentrer, repasser, rerentrer, rerepasser, reressusciter, reretourner, ressortir, ressusciter, retourner, sortir. This list is not limitative (when you add re- to a verb, this is the same rule). Actually, avoir or être is used depending on the meaning, and this is best explained with examples, but the template seems to be a good summary: when used transitively (or with a transitive sense, even when the complement is omitted), it's always avoir. Otherwise, it's être. Lmaltier (talk) 21:15, 9 February 2015 (UTC)
Thanks for the clarification! I'll clean the template up a bit and add it to those entries. - -sche (discuss) 22:53, 9 February 2015 (UTC)
Also note that using être is also systematic for pronominal uses of verbs: cf. je me suis trompé vs j'ai trompé. But this is a different issue, it's not limited to a few verbs. Lmaltier (talk) 06:58, 11 February 2015 (UTC)
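The rule summarized in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `auxiliary` and the set `ETRE_WHEN_INTRANSITIVE` are hypothetical, the verb set is a subset of Lmaltier's non-exhaustive list, and nothing here is part of Template:U:fr:may take être or any Wiktionary module.

```python
# Hypothetical sketch of the auxiliary rule discussed above; not Wiktionary code.
# Verbs from the (non-exhaustive) list in this thread that take être
# when intransitive and avoir when transitive.
ETRE_WHEN_INTRANSITIVE = {
    "descendre", "monter", "passer", "redescendre", "remonter",
    "rentrer", "repasser", "ressortir", "ressusciter", "retourner", "sortir",
}


def auxiliary(verb: str, transitive: bool = False, pronominal: bool = False) -> str:
    """Return the auxiliary ("avoir" or "être") under the simplified rule."""
    if pronominal:
        # Pronominal uses always take être, whatever the verb.
        return "être"
    if verb not in ETRE_WHEN_INTRANSITIVE:
        # The sketch only covers the verbs listed in the discussion.
        raise ValueError(f"{verb!r} is not covered by this sketch")
    # Transitive use (even with the complement omitted) takes avoir.
    return "avoir" if transitive else "être"


print(auxiliary("sortir"))                      # intransitive: je suis sorti
print(auxiliary("sortir", transitive=True))     # transitive: j'ai sorti la poubelle
print(auxiliary("tromper", pronominal=True))    # pronominal: je me suis trompé
```

The `transitive` flag has to be supplied by the caller, because, as noted above, the choice depends on the meaning and cannot be read off the verb alone.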

Crucially important question

From which episode of QI do those words on your main page come? It's snowy in Tennessee, and there's nothing to do. JohnC5 05:20, 17 February 2015 (UTC)

@JohnC5 I believe it was the J series episode 13 on "Jobs". Those were all occupations people said they had in old British censuses. - -sche (discuss) 05:47, 17 February 2015 (UTC)
I have seen that episode! Probably deserves a rewatching... JohnC5 05:50, 17 February 2015 (UTC)

Questionable revert

I would appreciate it if your reverts were a bit more careful. For instance here, I think that edit would have been fine since many people confuse UUers for a religious denomination. However, most academics refer to it as a distinct religion. By highlighting the coordinate terms, it would have been clearer that this is a distinct religion. I'm disappointed with your knee-jerk, trigger-finger-like reactions. 16:16, 21 February 2015 (UTC)

The merits aside, someone with your long, ugly history of questionable and often downright awful edits (yes, it's obvious who you are, whatever IP you happen to be using at the moment) is in no position to criticize the people who have to clean up after you. Chuck Entz (talk) 16:45, 21 February 2015 (UTC)
I think it's better to put the coordinate terms, synonyms, etc in the lemma entry, rather than in all the various possible abbreviated forms (UUers, UUs, etc). - -sche (discuss) 17:48, 21 February 2015 (UTC)

allosexual entry

Many thanks for your improvements, which were far above my Wiktionariological or semantic capabilities. Looks great! FourViolas (talk) 15:03, 24 February 2015 (UTC)

Trans and frequencies

You must have the frequencies for transman, transwoman, etc. wrong; please check Google Ngram Viewer. --Dan Polansky (talk) 08:46, 7 March 2015 (UTC)

No, Ngram Viewer clearly shows that the spaced form is more common in the case of trans woman (link, which looks like this to me — is it different for you?). For trans man, the unspaced form was still slightly more common at the time Google's data cut off (2008), but the spaced form was becoming more common while the unspaced form was becoming less common, so it seems likely that more recent data would show the same situation as with trans woman, i.e. that the spaced form is more common (especially in light of the proscription of the unspaced form by some authorities). - -sche (discuss) 08:54, 7 March 2015 (UTC)
For transwoman, my mistake: I used the default Ngram settings, which end in 2000[1], but when one extends the graph to 2008[2], the picture changes.
For transman, you are making the less common form[3] (factor 1.6) the main dictionary entry, with justification that relies on extrapolation rather than actual situation. When one combines this with the proscriptions expressed online, I am not sure what to think of this. --Dan Polansky (talk) 09:09, 7 March 2015 (UTC)
Well, to assume that the actual situation matches the situation a decade ago would also be making an assumption. It would be a reasonable assumption for most words, which have many decades of use, which have consistent (parallel) trendlines, and which the events of 2008-2015 can't be expected to have had much of an impact upon. (For example, couch and sofa.) In this case, however, the trendlines are divergent (and only go back about 15 years anyway), and increasing awareness by the general public of trans people's preferences can be expected to have influenced usage in the same direction as the trendlines were going when the data cut off. (Consistency with trans woman also plays a role.) - -sche (discuss) 09:48, 7 March 2015 (UTC)
You actually have a good point; the 2008 data is 7 years old. To bet that the trend for transman has after 2008 developed in a way parallel to trends seen even before 2008 for transwoman seems reasonable enough. Fair enough. --Dan Polansky (talk) 15:06, 7 March 2015 (UTC)


Concerning this. Just a thought, but I'm not convinced that it's sensible to split the definitions. This is because it seems not clear in many citations (especially earlier ones) which sense exactly is meant, and more generally I suspect that the precise meaning lies on a continuum between the two rather than being neatly split into one or the other. At any rate that was my impression when I was working (briefly) on the word. Ƿidsiþ 07:52, 9 March 2015 (UTC)

A lot of citations are ambiguous, yes. However, enough are unambiguous that I don't think conflating them is appropriate, particularly because the distribution of meanings seems to have a temporal component, i.e. the meaning seems to have changed over time. Citations that refer to the past often explicitly refer to hijras as eunuchs, defined by anatomy, while contemporary uses often (mostly?) refer to the third-gender people, defined by social role/presentation. Some of the latter works even explicitly specify that (modern) hijras are not necessarily eunuchs: google books:"uncastrated hijra|hijras" gets a few hits, and google books:"castrated hijra|hijras" (which would be redundant if hijras were necessarily eunuchs) gets several more, including some like "the not-yet-castrated hijra", "they were indistinguishable from castrated hijras when crossdressed - clearly, becoming hijra as a livelihood required neither castration nor gharana affiliation", and "[they] may or may not be castrated. Hijra is a developed stage." Perhaps the solution is to make the two specific senses into subsenses of a broad 'coverall' sense? - -sche (discuss) 08:27, 9 March 2015 (UTC)
And then there are google books:"female hijras", who most of the citations make clear have attained hijra status by adopting a third-gender role and not by castration. These would be especially hard to work into a 'coverall' sense: they would require it to be very broad indeed, to cover both eunuchs and women. Perhaps the solution is to have a {{qualifier}} or usage note explain that some uses don't distinguish male eunuchs from male-bodied third-gender people? - -sche (discuss) 08:44, 9 March 2015 (UTC)

Upper Franconian language

User:Purodha added user boxes that triggered the creation of a whole bunch of bad language categories and redlinks by Babel AutoCreate- pretty much the gamut of nds-nl & nds-de lects. I've gotten rid of most of the redlinks by replacing the narrow-lect category link with the appropriate broader-language category link in the User categories that were created. The one holdout is Upper Franconian, code vmf (see Category:User vmf): I'm not really sure whether it's nds-nl or nds-de. Any suggestions? Chuck Entz (talk) 03:55, 18 March 2015 (UTC)

@Chuck Entz See here. —Μετάknowledgediscuss/deeds 04:12, 18 March 2015 (UTC)
It's been a while since I was looking into this, so I forgot some important details. Yes, it's High German, not Low German. If you follow the link to the Wikimedia discussion, it turns out that after we had deleted the vmf code, Ethnologue came out with corrections that led to vmf being deemed eligible for a wiki after all. Now that Ethnologue is no longer claiming that vmf applies to Mainz and Frankfurt, we may need to revisit the issue. Chuck Entz (talk) 07:03, 18 March 2015 (UTC)
Thanks for the heads-up. I have been busy, but I will look into it. (I wonder if they have also clarified frs any.) - -sche (discuss) 01:27, 19 March 2015 (UTC)
Apparently they have, it's now called "Saxon, East Frisian Low". (But the population count is still wrong, hmph.) -- Liliana 01:33, 19 March 2015 (UTC)
While you're here, what are your thoughts on the newly-redefined Upper Franconian? do you think it should be included? All the varieties of German are such a mess to pick apart into discrete lects... - -sche (discuss) 02:41, 19 March 2015 (UTC)
Ethnologue does a horrible job at the German dialects. It appears to cover some, but not all of them and it's generally a huge mess to work with. (I hope you've seen my newest BP topic regarding the Swiss German lects.)
Have you seen the current vmf entry? It says "Hessen state: mostly River Main area, east of Mainz and Frankfurt." How much Hesse is there at the Main east of Frankfurt? lol. They really can't figure out what they want with this code, and it doesn't help that it's called "Mainfränkisch" with "Ostfränkisch" being a supposed alternate name, even though Mainfränkisch is just one of many subdivisions of Ostfränkisch.
I mean, we could theoretically use it for the Franconian lects, but... eh. -- Liliana 00:06, 20 March 2015 (UTC)

frs Module errors

These have been hanging around since you removed the frs code. There were 146 to start with. I've chipped away at a few of the obvious ones, but there are still about 135. The problem is, I don't know which ones are Saterland Frisian, which ones are East Frisian Low Saxon, and which are some unspecified extinct Frisian East Frisian dialect.

It won't do to have all of those module errors for an extended period- there's already been one unrelated module error that I only found out about by going through all 136 entries in the category (there's an error in a Korean module that's since brought the total up to 199). Do you think you'll be able to fix them soon? Is there anything I can do to help? Maybe User:Leasnam, who added most of them, might be able to help. Chuck Entz (talk) 03:50, 23 March 2015 (UTC)

I've been changing them as I see them...but the majority of those I've added, by the looks of them, represent a sampling of various unspecified extinct East Frisian dialects. Where I can connect them to a modern Saterland Frisian word I am updating them, but not universally. Sometimes I just change the code to stq to get rid of the error short term. Leasnam (talk) 04:41, 23 March 2015 (UTC)
Ugh, this is one of the few downsides to our use of language modules rather than language templates: I thought I had cleaned up all the uses of frs. (I should have waited for and searched an updated database dump to be sure.) I would temporarily reinstate the code, except that Ethnologue clarified that it refers to the Low German lect, which means I'd be replacing missing information (module errors) with potentially incorrect information (it's often unclear whether uses of the code on here are meant to refer to Frisian or Low German), which I am not sure would be an improvement. I'll chip away at what I can. If an entry simply lists an East Frisian word as a cognate (not an etymon), and it's not possible to determine which precise Frisian-ic or Low-German-ic lect it belongs to, it can simply be dropped, IMO. - -sche (discuss) 04:52, 23 March 2015 (UTC)
I have no qualms about dropping a non-essential cognate. We can fix it later if need be. Leasnam (talk) 06:06, 23 March 2015 (UTC)
Here is the reference cited in the first appendix entry I looked at. It seems to be treating East Frisian as a whole, which would include not just Saterland Frisian, but also at least a couple of the extinct dialects. Maybe we need an exception code for Frisian East Frisian as a whole, or maybe we should make stq the code for the whole language. Chuck Entz (talk) 07:01, 23 March 2015 (UTC)
It would be sensible to do one of those things, yes. In the past I had proposed creating gmw-fre or gmw-efr for East Frisian, but there was insufficient support for that because it was at the time still unclear if frs really referred to the Low German lect. - -sche (discuss) 14:03, 23 March 2015 (UTC)


"not convinced that this form is German and not Latin, but w/e" -- even states that there's a vocative for Jesus and Jesus Christus: "Jesus [...] Anredefall: Jesus und Jesu", "Jesus Christus [...] Anredefall: [...] Jesu Christe" ("Anredefall" is German for English vocative). There most likely would still be an ablative (cf. "von dem Nomine" [Nomen], "von dem Corpore" [Corpus], "von dem/der Radice" [Radix]), but the ablative of (Latin) Jesus and Christus equals the dative and so duden only mentions a dative. Also, though it should be obvious: the vocative of Jesus and Christus can especially be found in religious song books and most likely religious prayers etc. -13:48, 19 April 2015 (UTC)

Changing the parent language of Yiddish from MHG to OHG

(Pinging people who may be interested) @Metaknowledge, CodeCat, Angr

It is not clear that Yiddish branched strictly after the beginning of the MHG period. See for example section 7.25 in Max Weinreich's History of the Yiddish Language, where he concludes "Hence we have to postulate that Yiddish began to take shape as early as the Old High German period" (p. 424). Is this enough of a reason to change Yiddish's ancestors = from "gmh" to "goh"?

Another more difficult question would be whether to add Hebrew, Aramaic, Yevanic, and/or Judeo-Romance as ancestors (which in some sense they are), but then again we don't put Frankish as an ancestor of French (perhaps we should?).

--WikiTiki89 18:34, 20 April 2015 (UTC)

I'd say the second question is the easier one: No. Languages that are the sources of loanwords—even large numbers of them—are not considered ancestral. Anglo-Norman is not an ancestor of English; Latin is not an ancestor of Albanian and Welsh; Italian is not an ancestor of Maltese; and Hebrew, Aramaic, Slavic, etc., are not ancestors of Yiddish. I have no objection to changing the parent language of Yiddish to OHG. —Aɴɢʀ (talk) 18:53, 20 April 2015 (UTC)
But they're not exactly loanwords, they're more like kept-words. Jews that spoke other languages and settled in German-speaking areas slowly and gradually adopted more and more German words and grammar, keeping many words and grammatical structures from their former languages, especially from Hebrew. This had already happened several times before and so the Hebrew words and grammatical structures were direct continuations from when Hebrew was their native language. This is different from loanwords, which speakers of one language simply borrow from another language. I presume that there was a similar situation with French and Frankish, although I have never read about this and far fewer Frankish words survived in French for it to be significant. --WikiTiki89 19:08, 20 April 2015 (UTC)
Contact languages of any kind are going to be impossible to represent accurately in terms of choosing a language as a "parent". MHG seems no less (in)accurate to me as compared to OHG; during both time periods, there was an attested Jewish form of the language written in Hebrew script that had a lot of Semitic vocabulary. Yiddish has some differences in sound changes that allow us to estimate its general point of divergence, but the differences do not seem to be particular to Yiddish so much as features of some of the High German lects (not the one(s) that led to Modern Standard German). In the meantime, I think keeping it as MHG is perfectly fine, considering that MHG already represents a span of varying lects within certain parameters of time and space which arguably include the Jewish varieties. —Μετάknowledgediscuss/deeds 19:19, 20 April 2015 (UTC)
Well Weinreich says on the same page as the quote above "Yiddish speakers were in close contact with German speakers, and it need not occasion surprise had the German component of Yiddish, although already part of an independent language, continued to be affected by changes that took place in the German determinant." I don't know whether you find that contradictory to your point or not. --WikiTiki89 19:33, 20 April 2015 (UTC)
The ancestry of Yiddish is the subject of some disagreement. Wikipedia calls the view of a MHG origin a "prevailing" view. Bernard Spolsky (The Languages of the Jews: A Sociolinguistic History, 2014, page 157) says "The basis for Yiddish was a Middle High German dialect, for Yiddish often agrees with Middle High German rather than with modern German[.]" And Paul Wexler (Two-tiered Relexification in Yiddish, 2002, page 133) goes so far as to say "there are no specific Old High German phonological or lexical features in Yiddish (see Simon 1991: 253)." But Wexler believes the ultimate origin of Yiddish is actually Slavic, and the Germanic content is the result of relexification in the 9th to 12th centuries; indeed, his full sentence (emphasis mine) is "The first relexification to German took place in the Middle High German period, to judge from the fact that there are no specific Old High German phonological or lexical features in Yiddish." In turn, Weinreich says what you quote, but Wikipedia says that his model also posits that "Jewish speakers of Old French or Old Italian, who were literate in Hebrew or Aramaic, migrated to the Rhine Valley, [...] encountered and were influenced by Jewish speakers of High German" and that the ultimate origin of Yiddish is the fusion of all this, not simply OHG.
Perhaps we shouldn't list a parent at all?
De facto, we more often give OHG words than MHG words as the etyma of Yiddish words. (In the past, some entries gave modern High German forms as etyma, but this was known to be problematic and has for the most part been addressed.)
- -sche (discuss) 22:08, 20 April 2015 (UTC)
The way I see it is that listing MHG as a parent implies also OHG, but listing OHG as a parent does not imply MHG. So if we are unsure about MHG, then listing OHG is not wrong. But what actual consequences does listing the parent in the module have? What got me thinking about this was when I was adding פֿאָרן (forn) to *faraną and was unsure whether to put it under MHG or under OHG. Perhaps this should be decided on a word-by-word basis. If we know a word came from MHG, then we will list it under MHG, if we know it did not, then we would list it under OHG, and if it is unclear, that is where we need to choose a default and where I think OHG would be a better choice. --WikiTiki89 22:33, 20 April 2015 (UTC)
Frankish isn't really an ancestor of French: there were an awful lot more of the Romance-speaking Celts than there were Franks, so the Franks were somewhat like the Mongols in China- more important historically than linguistically. Chuck Entz (talk) 03:32, 21 April 2015 (UTC)
Ok, then my comparison to French/Frankish was wrong. My point remains about Yiddish/Hebrew. --WikiTiki89 14:15, 21 April 2015 (UTC)
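WikiTiki89's point that "listing MHG as a parent implies also OHG, but listing OHG as a parent does not imply MHG" can be illustrated with a small Python sketch. The language codes (yi, gmh, goh) come from the discussion, but the dictionaries and the `ancestors` function are hypothetical stand-ins, not the actual Wiktionary language-data module, and the chain deliberately stops at goh.

```python
# Hypothetical illustration of how a single "parent" field determines the
# whole ancestor chain; this is not the real Wiktionary language data.

def ancestors(code, parent_of):
    """Follow parent links from `code` and return the ancestors, nearest first."""
    chain = []
    while code in parent_of:
        code = parent_of[code]
        chain.append(code)
    return chain


# Current setup described in the thread: Yiddish (yi) descends from MHG (gmh).
with_mhg_parent = {"yi": "gmh", "gmh": "goh"}
# Proposed setup: Yiddish descends directly from OHG (goh).
with_ohg_parent = {"yi": "goh", "gmh": "goh"}

print(ancestors("yi", with_mhg_parent))  # MHG and, by implication, OHG
print(ancestors("yi", with_ohg_parent))  # OHG only; MHG is no longer implied
```

Under the second mapping a Yiddish term could still be listed under MHG on a word-by-word basis, as suggested above; the mapping would only decide the default.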


Pertain, which pertaining is just a modified version of, is defined here on English Wiktionary as "Verb: pertain (third-person singular simple present pertains, present participle pertaining, simple past and past participle pertained)

(intransitive) to belong (intransitive) to relate, to refer, be relevant to" The "to belong" sense of pertaining is already covered by "of pedophilia", the "to relate" sense is already covered by "related to pedophilia", so it is redundant. Although it's not necessary to be as simple here as on Simple English Wiktionary, it's still important. It's best when writing to write in simple language, not complex. There is a book about this topic by H.W. Fowler called The King's English, you should read it. His first points in the book are, prefer simple words to complex words, prefer short words to long words, prefer common words to unusual words, and prefer Germanic words to Romance words. He would agree with me that pertaining would need to go in this case. --PaulBustion88 (talk) 02:12, 30 April 2015 (UTC)


Howdy-doo! I was just curious where you found the meaning of incomplete. It seems closely related to the meanings I've seen, but not quite the same. Just thought I'd ask. —JohnC5 04:01, 8 June 2015 (UTC)

I saw it in The Century Dictionary (1914) defined as "characterized by incomplete metamorphosis", and that sense is suggested by citations like "cockroaches, grasshoppers, lice, true bugs, and so on, undergo paurometabolous or incomplete development" (Foundations of Wildlife Diseases, 2014, ISBN 0520958950, page 126). That citation is why I offered the shorter gloss "incomplete" before the semicolon, btw (since "paurometabolous development" is not "development characterized by incomplete development"). It's probably not a separate sense, and could be removed if sense 1 were expanded a bit. Btw, Century has a second sense, "of or belonging to the Paurometabola", which is defined as "in Brauer's system of classification, those insects in which the metamorphoses are slow, inconspicuous, and very incomplete, as the Orthoptera". The former looks like a candidate for Category:mul:Taxonomic names (obsolete). - -sche (discuss) 05:10, 8 June 2015 (UTC)
Based on the wiki page for w:Hemimetabolism, I believe the word incomplete is used to mean "not executing all of the normal stages of metamorphosis," as opposed to "failing to complete metamorphosis." The ambiguity lies in that the members of Paurometabola succeed at their form of metamorphosis, but this metamorphosis does not conform to the standard metamorphic pattern. I might suggest abridged or atypical as opposed to incomplete because the latter most sounds like the bugs never succeed at maturing, which is certainly not true. Does this sound reasonable to you? —JohnC5 21:26, 8 June 2015 (UTC)
Good point about w:Hemimetabolism. Actually, why don't we just link to that page? See what you think of my change to the entry, and feel free to undo or expand upon it. - -sche (discuss) 00:23, 9 June 2015 (UTC)
Looks good to me! :)JohnC5 00:56, 9 June 2015 (UTC)

Partition verb senses by grammar, semantics, register/topic/context?

Looking at your excellent, extensive work on take reminded me of a question that bothered me about sense division, especially in verbs (though it comes up in other word classes).

Which of the various possibilities should take precedence in grouping definitions? For verbs, most dictionaries divide definitions into transitive and intransitive and, as a result, have some redundancy and obscure some semantic relationships. I often feel that certain groups of registers/topics, eg, sports, games, nautical, belong together no matter whether there are semantic reasons to split them. Some would group all archaic and obsolete senses.

We already split some semantically analogous senses by PoS eg, adjectives and adverbs, conjunctions and adverbs, conjunctions and pronouns, prepositions and adverbs, adverbs and nouns (eg, home). These splits make it harder to see the semantic similarities. Have we written off that kind of semantic visibility? Do we have to?

My natural inclination is to have grammar take precedence, but I'd be happy to hear arguments for the other possibilities. DCDuring TALK 20:31, 10 June 2015 (UTC)

Working on take got me to thinking about sense grouping, too. I don't desire to adopt other dictionaries' practice of separating transitive and intransitive verbs, I only separated them on take to make the entry easier to work on. Now that I'm finished adding senses, I'll probably go back and interweave the transitive and intransitive ones, since I think it's better to group definitions/senses according to meaning. Separating transitive and intransitive senses often obscures the fact that some senses are ambitransitive (as here, where it resulted in what was basically the same sense being listed twice) or ergative.
Separating different parts of speech seems to me like a good practice to continue. The cases where it proves difficult (however) or could be regarded as obscuring semantic connections (home) are too few and far between to justify abandoning the practice.
- -sche (discuss) 05:20, 11 June 2015 (UTC)
If an English L2 section is to be read as some kind of structured, terse essay on a term, then it certainly makes sense to group somewhat semantically.
OTOH, if an English L2 section is intended to help an ordinary user find a definition, at least some users would benefit from a transitive/intransitive split, which would support faster scanning for the possible definition. (This argument also favors topical labels, which I have, perhaps wrongly, opposed.)
Another consideration is entry maintainability. Of course, to tinker with your efforts would be gilding the lily, but it is easier to assess, analyze, and repair the range of coverage of a set of definitions, if the set can be made smaller on some easy-to-determine grounds, like the hard grammatical distinction of transitivity/intransitivity. DCDuring TALK 12:41, 11 June 2015 (UTC)
You are right that transitivity is an (possibly the only) easy-to-determine hard-and-fast distinction, and that segregating senses according to it could help people find specific senses. I'm not strongly opposed to it, I simply think semantic grouping is better. Where would ambitransitive and ergative senses go if senses were split by transitivity? In sections all their own, e.g. between the transitive and intransitive senses? (That would seem a bit awkward, but not outright problematic.) Or would they be duplicated and placed in both the transitive and the intransitive section? That would seem unhelpful to English-speakers, though perhaps helpful to translators (if they have distinct translations in some languages, which seems likely).
Other ways of sorting verb senses are by age (oldest—or newest—senses first) and by commonness (most—or least—common senses first). I suppose those are not mutually exclusive with grouping senses by meaning or transitivity.
Perhaps someone will devise a gadget that will give users buttons, similar to the "show/hide quotations" buttons but located e.g. at the top of each POS section, which will allow users to optionally hide senses with certain tags, e.g. obsolete, archaic, transitive, intransitive, even US (if a user knows they're searching for a sense Brits use), UK, etc.
- -sche (discuss) 21:22, 11 June 2015 (UTC)
We could have sortable tables of definitions! Ugly, and needing a lot of artificial data to generate what we think is appropriate. Or we could let users run SQL queries against a database of definitions.
I've never been convinced of the utility of ergative and other high-falutin' linguists' labels for the supposed 'normal' users, if indeed we have any 'normal' users. Those mostly seem good for making sure that someone working on an entry checks to make sure that the appropriately reworded definition appears in both transitive and intransitive sections, ie, duplicate underlying semantics.
After grouping by the hardest of grammatical distinctions, I would group semantically, preferably using subsenses, ordering the senses by date of attestation of the sense (in principle) or degree of concreteness (which might coincide with date of attestation for the definition in the language or an ancestor). Subsenses would follow the same ordering principle within the sense. But recourse to attestation actually means relying on the OED for many words, though not so much for more recent sense development.
As we don't really have a clearly dominant approach, I think we can still let contributors do it the way they want to. I would not impose my ideal grouping and order on an entry that was a good example of another set of organizing principles and hope that no-one would waste time merely reordering and regrouping mine, unless there was a good reason (clear error, reorganizer actually working from the OED, etc). DCDuring TALK 22:11, 11 June 2015 (UTC)
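The gadget idea (hiding senses by tag) and the ordering proposal (transitivity first, then date of attestation) from this thread can be sketched as follows. Everything here is an assumption made for illustration: the `Sense` class, the function names, the placeholder glosses and the attestation years are invented, and no such gadget or data structure exists in the discussion.

```python
# Hypothetical sketch of tag-based hiding and grammar-first ordering of senses.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sense:
    gloss: str
    labels: set = field(default_factory=set)
    first_attested: Optional[int] = None  # year; invented for the example


def visible(senses, hidden_labels):
    """Keep only senses that carry none of the labels the user chose to hide."""
    return [s for s in senses if not (s.labels & hidden_labels)]


def transitivity_group(sense):
    """0 = transitive only, 1 = ambitransitive or unlabelled, 2 = intransitive only."""
    t = "transitive" in sense.labels
    i = "intransitive" in sense.labels
    if t and not i:
        return 0
    if i and not t:
        return 2
    return 1


def ordered(senses):
    """Group by transitivity, then order by date of attestation (undated last)."""
    def key(s):
        year = s.first_attested if s.first_attested is not None else float("inf")
        return (transitivity_group(s), year)
    return sorted(senses, key=key)


senses = [
    Sense("sense A", {"intransitive"}, 1500),
    Sense("sense B", {"transitive", "obsolete"}, 1200),
    Sense("sense C", {"transitive"}, 1400),
    Sense("sense D", {"transitive", "intransitive"}, 1300),
]

print([s.gloss for s in visible(senses, {"obsolete"})])
print([s.gloss for s in ordered(senses)])
```

Placing ambitransitive senses in a group of their own, between the transitive and intransitive ones, is just one of the options raised above; duplicating them in both groups would need a different key.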

Orange links and ACCEL

Hi. Is there any way to combine the orange link gadget with the WT:ACCEL one? --Type56op9 (talk) 17:36, 13 June 2015 (UTC)

Not that I'm aware of (I think people have asked about that before). It would be useful, though. You could ask in the Grease Pit. - -sche (discuss) 18:18, 13 June 2015 (UTC)
(edit conflict) Not as such. Acceleration works by adding preloads to a redlink, which requires that there be nothing there. One would have to have an app to add a language section to an existing entry, which would require different methods. It may be possible (bots certainly have no trouble with it), but it wouldn't be a trivial exercise. Chuck Entz (talk) 18:21, 13 June 2015 (UTC)
My illegal bot made such additions all the time. But then it got blocked, so I had to hide the fact I was using a bot by changing the code. Then people figured out I was still using a bot. However, if this new orange-accel tool was around, I could use the illegal bot again, and pretend I was using the tool. Everyone's a winner! --Type56op9 (talk) 18:26, 13 June 2015 (UTC)


On the "Greek" page, that was a filter that I put on my computer that did that. I'll have to make sure to check that in the future to make sure that it doesn't sneak into my edits by accident. The filter replaces words with "[word deleted]". I installed the filter because too many people were swearing left and right on many of the websites that I visit, and I grew tired of seeing it.

But yes. xD

That was pretty funny. My bad. Tharthan (talk) 18:11, 15 June 2015 (UTC)

Ah, thanks for the explanation; I had wondered why it flagged "clit" but not "anal sex", haha. Thankfully people around here don't swear that much (not that I mind) — I guess it's to be expected that dictionary-editors know more articulate ways of expressing themselves. - -sche (discuss) 18:17, 15 June 2015 (UTC)
Yeah, frankly I would have set it to change each word to a clean synonym, but the filter in question only allows for one all-encompassing replacement (which kind of stinks, because it reminds me of those old IRC-type chatrooms that just replaced vulgarities with asterisks rather than creatively write around them). But it's the best I can find for Firefox.

By the way, I have to ask:

You said that the main criterion for cited sources is that they must be durably archived. Are there any exceptions to that? Do we allow citations of tabloids or other "buzzword books" that may indeed use a neologism or retronym for over a year but be truly the only ones to do so? Tharthan (talk) 18:47, 15 June 2015 (UTC)

Durably-archived tabloids are allowed; they aren't prestigious, but their vocabulary is part of the great big grab-bag which is the English (or German, etc) language. Terms which are "neologisms", "slang", "informal", "rare", etc should certainly be marked with those labels, however, and in exceptional cases one can write usage notes.
What kind of "buzzword books" do you mean? Books that define and then give made-up examples of slang are disallowed by WT:CFI#Conveying_meaning, which "filters out [...] made-up examples of how a word might be used". But authors who like to work as many words from those kinds of books into their own literature, well, they're allowed. I got the impression that Georgette Heyer copied words from the 1811 Dictionary of the Vulgar Tongue and pasted them into her dialogues, sometimes clumsily. In fact, that makes me realize [4].
If a work is of such low quality that one can't be sure it is in fact using a given word (as opposed to unintentionally containing a string as a typo or misspelling), it is generally excluded, however (because CFI requires evidence of use). So, a citation like "Berlin, Germany has many ihstoric stires, as do most other cities in Germanny." would probably not be accepted as evidence that "Germanny" is an alternative spelling of "Germany". (But a book from 1600 that said "Southern Germanny is a Land of mannifold historickal Constructions, of a Roman Charackter" would suggest that "Germanny" is an obsolete spelling of "Germany".)
- -sche (discuss) 21:11, 15 June 2015 (UTC)


Hello -sche,
after quite a while I have once again made a larger edit, expanding the above-mentioned entry in the process. Could you please look it over and correct any formatting, wording and translation errors? Thanks in advance and kind regards, Caligari ƆɐƀïиϠ 06:02, 16 June 2015 (UTC)

Of course; and kind regards to you too! PS, there must be something in the air (as they say) causing people to undertake big multilingual projects, since I just attempted one in the other direction, expanding (take and then) de:take. - -sche (discuss) 09:16, 16 June 2015 (UTC)
I guess my English got a bit rusty. So again, many thanks for your swift corrections. Each and every correction will improve my future edits... hopefully :-).
@de:take: Wow! Indeed. Great job so far with the massive content expansion. Let me know when you think you have finished expanding "take". There are some formatting issues that I'll point out on your German user talk page once you're done expanding. The format needs some fine-tuning ("Feintuning"). As a piece of advice, I would recommend that you take a look at articles in de:Kategorie:Polnisch, de:Kategorie:Tschechisch or de:Kategorie:Schwedisch. If you need specific help, don't hesitate to let me know.
Kind regards, Caligari ƆɐƀïиϠ 15:33, 16 June 2015 (UTC)

Moinsen. WT: ANDS.

Moinsen. Here is what I propose: User_talk:Korn/sandkist Korn [kʰʊ̃ːæ̯̃n] (talk) 13:57, 20 June 2015 (UTC)

Merging the German and Dutch lects... bleh. I don't oppose it, or support it. (As I wrote further up on this page, "the general disagreement and slow-motion edit-warring about how to handle the various Low German lects makes for so much ugliness that I am losing interest in editing them" at all.) I strongly suggest, almost to the point of insisting, that one orthography should be chosen for forms to be lemmatized on / normalized to (I don't know if this is what you intended the "consonants" and "vowels" sections to do), so that we don't end up with five entries lemmatized five different ways, representing the same diphthong five different ways, as if all the words were pronounced differently, when in fact they just use different orthographies or have predictable dialectal variation. Nouns should uniformly begin with majuscule letters, or uniformly not do that, for the same reason.
I've made a few typofixes and other small changes, e.g. dropping the Dutch spellings of "coïnciding" and "reëmergence". Also note that merging Plautdietsch would need discussion quite apart from merging GLG and DLS, because people (e.g. Angr, and me) in past discussions have supported keeping it separate on account of its separate history and development on another continent.
I also suggest either dropping the "During Middle Low German [...] Central and Upper German" line, or rewriting it to give native forms (we'll have to suck it up, bite the bullet, and perform whatever other idioms are necessary to give one dialect's forms as examples) so that it doesn't imply Low Germans actually used the words "German", "Low Landic", etc, especially given that "Low Landic" gets all of four Google hits. (Alternatively, a phrasing like During Middle Low German times, the language was known by cognates of the terms "Dutch", "Saxon", "Netherlandish" or "Netherdutch" would technically be accurate, but confusing to the uninitiated.)
- -sche (discuss) 17:26, 20 June 2015 (UTC)
Easy now. I think you misunderstand my intention. The text I wrote was meant as a starting point for a conversation between the two of us about changing the ANDS. It was not yet supposed to touch the currently existing ANDS-DE, -NL and PDT at all, which is why they are not mentioned. The section on consonants and vowels is only meant to point out to interested authors and users that one spelling does not mean the same pronunciation prevails everywhere, and possibly to prompt further entries in the Pronunciation L3. (Or at least any at all.) I am not convinced by the Plautdietsch argument, since Plautdietsch differs hardly or not at all from other dialects. And I simply do not understand the part about the native forms. It sounds as if you fear that readers would wrongly think that the Dutch actually referred to themselves with English words. Korn [kʰʊ̃ːæ̯̃n] (talk) 18:22, 20 June 2015 (UTC)

Old Italic display help

Hello! Remember this discussion from way back in which you mentioned you make fonts? Well, this is not exactly that, but I have been working on making Appendix:Old Italic script with all of the relevant Old Italic languages (I still need to add Raetic, Camunic, Lepontic, etc.). I will then use this table as a reference to create Module:Ital-translit which will service all of the Old Italic languages. I thought that it would be very nice to be able to show all the different letter forms that would map to any given Unicode letter. The documentation for how the Unicode block is defined is here and contains descriptions of all the different letter forms for each sub-script (in section 3). I was hoping you (or someone you could suggest) might be able to create PNGs for use in {{t2i}} so that we could display all the Old Italic letter forms both in this appendix and potentially in the mainspace for quoting inscriptions. I know that this isn't a high priority for anyone, but now that I've started, I've gotten quite excited about the whole business. Below are some other reference materials for all the scripts. I'm not hoping for every little variation of every character, but if you make PNGs for the major ones, I'll do all the rest. Also, if this is just too much work, just tell me. —JohnC5 21:11, 24 June 2015 (UTC)

Hmm, I'll see what I can do. Btw, I notice the Glagolitic t2i images are a mix of svgs and gifs, although svg versions exist for at least some of the gifs and could be swapped in. - -sche (discuss) 20:01, 26 June 2015 (UTC)
Yeah, that is rather weird. I have no idea why that is the case. Also, the behavior for which I am asking you is a little different than the normal t2i behavior, because I would want {{t2i|a|a2|a3|a4|a5}} to be different versions of the same letter. Just making sure you understand that for which you signed up.
Also, thanks! —JohnC5 23:19, 26 June 2015 (UTC)
Hey again. Sorry to pester, but is there any progress on this? I want to have a discussion/take a vote to solidify the mapping of characters used in Module:Ital-translit and Appendix:Old Italic script since some of the character transcriptions (specifically those in South Picene and Camunic) are very odd. Having these for the discussion would be very useful. And again, if this is too annoying to do, please tell me. —JohnC5 06:29, 8 July 2015 (UTC)
Thanks for the poke.
The various letter-forms in the images you showed me are all, for lack of a better word, very line-y (as opposed to calligraphic like pen- or quill-and-ink handwriting, which is what I'm more used to designing fonts based on). I did mock up variants of the A in a style somewhat like the images of the Glagolitic letters, but finishing all the alphabets in that style would take quite a while. I was going to try jotting all the letters on paper and scanning it and autotracing it into a png or svg, and then post an update, but I've been busy. Hmm, you could try it yourself — and I hope that doesn't sound rude; I'm not saying "grr, do it yourself", I just mean that you could probably do that as well as I could. And if I do later find time to make more calligraphic letters, they could always be swapped in. - -sche (discuss) 07:02, 8 July 2015 (UTC)
No worries. I guess the whole making-png's-and-formatting-them-and-uploading-them thing would have somewhat of a learning curve for me. I didn't really need calligraphic versions―I was more hoping for just boring, old line versions of the different letterforms so I could disambiguate them in the appendix. It's kind of frustrating how many ways each character can appear, and having them all in a row would be useful. Is there anyone else you could recommend for this because I understand how making an all-lines-all-the-time font could be kind of dull? —JohnC5 07:19, 8 July 2015 (UTC)
@JohnC5 OK, I've made a batch of letters and variants, which can be found at commons:Category:Italic letters. I traced a picture of an inscription, which is why the 'C' for instance is not a perfect circle; I will probably go back and make geometric 'perfect circle' variants at some point. I haven't done the whole alphabet yet. - -sche (discuss) 22:53, 10 July 2015 (UTC)
You're the coolest! —JohnC5 00:32, 11 July 2015 (UTC)
@JohnC5 Uploaded some more. Sorry this is taking a while. Think we should make a table to show all the forms (a bit like Wiktionary:Gothic transliteration but probably vertical rather than horizontal)? - -sche (discuss) 02:19, 4 August 2015 (UTC)
Thanks for your help with this; they look great. I seem to have bitten off more than I can chew at the moment. Feel free to add them to the table as you see fit, or keep pestering me. Please keep pestering me. —JohnC5 02:52, 4 August 2015 (UTC)
For now, I'm storing these in Appendix:Italic script. By the way, I notice commons:Category:Etruscan letters and commons:Category:Oscan alphabet already have some letterforms in them. - -sche (discuss) 05:39, 8 August 2015 (UTC)
@JohnC5 Let me know if anything you need is missing from Appendix:Italic script. In each section, the first gallery / row are letter-forms I drew and the other rows are letter-forms which I discovered already existed on Commons. - -sche (discuss) 00:30, 9 August 2015 (UTC)
Wowzers. Thanks so much for all this work. My next task will be to load them all into {{Ital2img}} and then use that to populate Appendix:Old Italic script with the appropriate letterforms. Both steps may take a while in turn. I feel, however, that this will greatly clarify the equivalency of the different symbols across sub-alphabets.
PS: Is there an abbreviation for the Appendix namespace like there is for Wiktionary (WT)? I feel like I've wasted several years of my life writing out the word Appendix. Just think if you could write out APP:AITAL. That would be magical. —JohnC5 00:41, 9 August 2015 (UTC)
There is not, but we do have a few cross-namespace redirects using the WT: shortcut. You could create WT:AITAL pointing to the appendix namespace (or even move the appendix into the Wiktionary namespace). Feel free to change the format of that page, btw. - -sche (discuss) 01:25, 9 August 2015 (UTC)

Other resources


Hey, there's probably a better way to put it, but at double-team I wanted to express that it suggests two people penetrating. One person can double penetrate with fingers and/or dildos, but one person can't double-team, AFAIK. WurdSnatcher (talk) 03:03, 10 July 2015 (UTC)


I direct you to Special:AbuseFilter/41 and Wiktionary:Requests for verification#agyrophobia. Also, aWa will not automatically recognise the discussion result if you forget to embolden it. Keφr 11:56, 15 July 2015 (UTC)

Duly noted, thank you.
The filter says "Of the last 8,991 actions, this filter has matched (0.00%)", is that just because it's turned off? I've turned it on, but set it to only flag edits. We can see how that works and then potentially upgrade it to warn or stop editors. - -sche (discuss) 00:22, 17 July 2015 (UTC)


Irritating Wikipedians is a feature, not a bug. It prompts them either to drop the assumption that this project is run like Wikipedia, or leave. (Well, it did the former for me at least. And surely there are some that cannot do either, which means they should be blocked.) —Keφr 06:37, 25 July 2015 (UTC)

People shouldn't be importing e.g. navboxes from sister projects (and I don't think we need {{reflist}}). Having a redirect from the name that every other project (Commons, Meta, en.WP, Simple English Wiktionary, Voyage, Source, Quote) uses to the name we use for the same thing just seems helpful, not only to users from everywhere else but also potentially for those users here who complain about every keystroke they have to type... since tl is shorter. - -sche (discuss) 09:16, 25 July 2015 (UTC)

Lean keep

You wrote "Lean keep per Equinox". What does it mean? Are you leaning towards a keep vote (but not quite sure), or is it an adjective, a sort of "lean" or thin/skinny/ephemeral keep, like a "weak keep"? Equinox 08:28, 25 July 2015 (UTC)

Leaning towards keeping. Ah, the terseness and ambiguity of our RFD jargon. The phrase "RFD-failed" is worse; a passing 'pedian at one point questioned me why I had deleted something if the "request for deletion failed". - -sche (discuss) 09:19, 25 July 2015 (UTC)
Yep everything is bloody awful. Thank you for explaining. Equinox 09:24, 25 July 2015 (UTC)
If it's not annoying, maybe I could suggest "weak __" for "lean __". I don't like placing the "vote" (weak or otherwise) if I'm not convinced, so I don't use it. But. I think I've occasionally written "weak oppose" etc. where I didn't like something but couldn't be bothered to explain why. It just needs a few of us to kill change by apathy. Hurrah. Equinox 09:26, 25 July 2015 (UTC)
Why not use "RFD deleted"? --Dan Polansky (talk) 10:01, 25 July 2015 (UTC)
I used to write "deleted" and someone scolded me and told me to write "failed". Equinox 10:05, 25 July 2015 (UTC)
They should not have scolded you; did they perhaps confuse RFD with RFV? Many RFDs are closed as "deleted"; it is a common practice, and one that makes sense. I prefer to write "RFD deleted" rather than just "Deleted", in keeping with "RFV passed", "RFV failed", and "RFD kept", in boldface; the point is to make the closure clear and distinct as a closure, and indicate which process is being closed. But again, "deleted" is fine, and multiple people used it quite recently, including bd2412. I actually think "RFD failed" should be banned as a closure. --Dan Polansky (talk) 11:13, 25 July 2015 (UTC)
I agree that "deleted" is clearer than and preferable to "failed". I suspect uses of "failed" are due to thinking of RFD (and RFV) as a process for deciding whether or not to keep an entry (an entry is deleted pursuant to the process = it fails to be kept). The deletion summary "Failed RFD, RFDO; do not re-enter" seems to conceptualize it in this way. I've boldly changed it. Several other deletion reasons in that list are redundant or need cleanup, IMO. - -sche (discuss) 22:39, 25 July 2015 (UTC)
You mentioned the two "No usable content given" lines: I added the one with "Please see WT:ELE" because there were enough cases where I was adding it by hand, but there are also plenty of cases where ELE wouldn't have helped. Chuck Entz (talk) 03:14, 26 July 2015 (UTC)

Wiktionary:Vietnamese transliteration

By creating this page, you caused all instances of {{vi-noun}} that include Nôm transcriptions to display a link to this page. Where in Wikipedia is the reader expected to look? The Nôm script predates the Latin-based Vietnamese alphabet, so I want to make sure it doesn't sound like the given Nôm characters are derived from the alphabetic words somehow. – Minh Nguyễn 💬 06:39, 29 July 2015 (UTC)

I created several such pages following Wiktionary:Grease pit/2015/July#remove_junk_from_Special:WantedPages. It was my impression that a (black) link was already present even before the page existed, so my edit was just to clear it off of Special:WantedPages, where it sat because of how many entries linked to it even without it existing. Feel free to add more informative content or even delete the page. Ideally, the template/module that inserts the link should be rewritten the way Module:IPA was recently, to only add links for the small number of languages which have transliteration schemes documented on Wiktionary, rather than performing an expensive check (as it does now) to see whether or not the dot (which, as an aside, I doubt very many people notice in any language) should have a blue link or be black. - -sche (discuss) 06:59, 29 July 2015 (UTC)
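The allowlist idea in the reply above can be sketched roughly as follows. This is only an illustration in Python, not the actual template or module code: the function name, the allowlist contents and the "•" separator character are all assumptions made for the example. The point is that membership in a small fixed set replaces a per-page existence check.

```python
# Hypothetical allowlist: languages assumed to have a transliteration page
# documented on Wiktionary. The real set would be maintained by editors.
DOCUMENTED_TRANSLIT = {"Gothic"}


def translit_dot(language_name):
    """Return wikitext for the separator dot.

    The dot is linked only when the language is on the allowlist, so no
    page-existence lookup is needed for any other language.
    """
    if language_name in DOCUMENTED_TRANSLIT:
        return f"[[Wiktionary:{language_name} transliteration|•]]"
    return "•"


print(translit_dot("Gothic"))
print(translit_dot("Vietnamese"))
```

With this approach, a language whose transliteration page does not exist never produces a link in the first place, so it would not show up on Special:WantedPages.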


Can we change the primary name of tmh (in Module:languages/data3/t) from "Tamashek" to "Tuareg"? tmh is the macrolanguage containing thv ("Tahaggart Tamahaq"), taq ("Tamasheq"), ttq ("Tawallammat Tamajaq"), and thz ("Tayart Tamajeq"). "Tamashek" is just an alternative spelling of "Tamasheq" and makes it very confusing. Also, "Tuareg" is simply a much more widely used name for these languages. --WikiTiki89 15:40, 6 August 2015 (UTC)

Yes, "Tuareg" would be a clearer name for it. Should we even have tmh at all, though, if we include its subvarieties as separate languages? (I note that ber, the macro-macro-language code containing tmh, was deprecated in favour of its subdivisions.) - -sche (discuss) 19:20, 6 August 2015 (UTC)
I personally feel that Berber is overdivided. I'm not an expert, but it seems Tuareg languages are all relatively mutually intelligible (see here, for example) even if they have different realizations of some consonants (evident in the language names I listed above). So maybe we should merge all of Tuareg into one? The simplest thing for now, though, is to just rename tmh to Tuareg. --WikiTiki89 19:44, 6 August 2015 (UTC)
Yes, deprecating the sub-dialect codes in favour of tmh would also work. (And yes, Berber is quite over-divided...) - -sche (discuss) 19:57, 6 August 2015 (UTC)

northern fur seal translations for WOTD?

k'oon is soon (10 August) to be a foreign WOTD. I have added entries for Callorhinus ursinus and northern fur seal. Could you take a look? Also, if you can find any Native American translations, they would make northern fur seal more interesting. The seals apparently ranged as far south as Baja. I've also left a note for Chuck Entz, as this might really be in his wheelhouse. DCDuring TALK 16:09, 6 August 2015 (UTC)

I tend to know more about the languages on the other (Atlantic) coast, but I'll see what I can do. - -sche (discuss) 19:46, 6 August 2015 (UTC)
We are lucky if we get folks to click through at all, let alone look at translations, let alone be impressed. So only modest effort, with high likelihood of success, is worthwhile. Thanks. DCDuring TALK 19:58, 6 August 2015 (UTC)
There's a Tlingit translation here, which I think might be x̲'ún or x'ún in the orthography used by the current entries. Also, I wonder about the "hair seal" and "big seal" in this Yurok reference- could one of those be the northern fur seal? Chuck Entz (talk) 21:19, 8 August 2015 (UTC)
I made an assumption, based on the distribution of fur seal species, that in any native northern Pacific language a word for fur seal had as its original referent the northern fur seal, whatever else might now be covered by the word. Hair seal seems likely. I could not venture a guess about big seal, as I don't know what seals have been extant on the Pacific coast of North America. DCDuring TALK 21:30, 8 August 2015 (UTC)
The northern elephant seal could easily be the referent for a term that glosses as "big seal". DCDuring TALK 21:34, 8 August 2015 (UTC)
Yurok, since it is Algic, I know a bit about: chkweges, which that work translates as "hair seal", is indeed the northern fur seal, Callorhinus ursinus. As for Tlingit, we do seem to use x̱ in pagetitles, so I think x̱'ún is the orthography to go with (some of our entries currently use , but this strikes me as wrong). - -sche (discuss) 21:32, 8 August 2015 (UTC)
Take a look at our entry for hair seal, and my revision of it. It is confusing that several references (not just the Yurok one) gloss as "hair seal" words that mean "fur seal". - -sche (discuss) 21:40, 8 August 2015 (UTC)
Maybe I was too hasty on hair seal. I can't imagine that any people that depended on seals for food, clothing, etc could fail to make a distinction between seals with fur and those with only hair, the latter being good for storage, portage, kayaks etc, more than for clothing, where animal fur would be valued for warmth. But I couldn't find in the Yurok reference a distinction between "hair" and "fur". Human hair, at least, seems to be the referent for words that included the morpheme "lep". It may be that the Yurok "big seal"/"sea lion" vs "hair seal" distinction (or at least that of the author of the lexicon) is close to ours between eared seals (Otariidae, which include the fur seals, but also include sea lions, which do not have fur) and earless seals (Phocidae). DCDuring TALK 23:33, 8 August 2015 (UTC)

Whitelist nominations

(tried responding back at the Whitelist, but I apparently don't have permission to do so – I apologise for posting here)

I checked Redboywild's edits and they seem to be ok – formatting is correct and I couldn't find a single mistake or bad translation. So I see no reason why he shouldn't be whitelisted. Thank you for consulting me about it :-)

PS: Just found out that this user has been warned a couple of times in the Romanian Wikipedia and blocked once for introducing obscenities. This happened some time ago and he hasn't done it since. He has probably – and hopefully – matured, but I'll keep an eye on his edits so they're up to par. --Robbie SWE (talk) 15:46, 10 August 2015 (UTC)

Oh, apologies, I forgot you were only a sysop on ro.Wikt and not here. Thanks for the input. - -sche (discuss) 17:39, 10 August 2015 (UTC)

Two spellings

I have a question: Are außlegen and meßen pre-1996 spellings? --Lo Ximiendo (talk) 02:58, 17 August 2015 (UTC)

In one sense, yes — they were used in the 1600s, and the 1600s are before 1996. But in practical terms, no — when it comes to categorization or the like, "pre-1996" refers to spellings which were still standard right up until 1996, which these weren't. - -sche (discuss) 07:18, 17 August 2015 (UTC)

Sardinian translations

If you weren't already (painfully) aware of this: see Category:Pages with module errors, which seem to be all translation and descendants sections. I've cleared a few, but it's slow going with the translation sections hidden. Also, I noticed that there were also a couple of minor Sardinian lects that weren't affected. Chuck Entz (talk) 13:01, 17 August 2015 (UTC)

Sigh. As I lamented about Frisian, these translations went un-updated because good translations have been invisible (short of searching a database dump) ever since we switched from templates to Module:languages, as opposed to ttbc and t-check translations, which are categorized. Perhaps all {{t}}s should put entries into hidden categories like "Entries with Sardinian translations".
Now that they're all in Category:Pages with module errors, I'll just plug that into AWB and go through them.
If you're referring to Gallurese and Sassarese, I didn't merge them because (as I wrote here) they are despite their names not unequivocally considered dialects of Sardinian; rather, they're often considered dialects of Corsican (co) or transitional between Sardinian and Corsican. I'll propose renaming them soon for that reason, and move any I find nested below Sardinian.
- -sche (discuss) 17:45, 17 August 2015 (UTC)
In the recent reclassification of Kölsch, I used a database dump to find and fix entries in translations tables before deprecating the code, so only a half dozen residual things made their way into Category:Pages with module errors. Progress! - -sche (discuss) 01:05, 3 September 2015 (UTC)
Great! I would also suggest using "insource:{{t|xxx" in the search box to find any that weren't in the dumps. Chuck Entz (talk) 03:32, 3 September 2015 (UTC)
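The dump-scanning step mentioned above could look something like the following sketch. It is an assumption-laden illustration in Python, not the script that was actually used: the function name is hypothetical, the wikitext is assumed to be already extracted from the dump as a plain string, and only the `{{t}}`, `{{t+}}` and `{{t-}}` template names are matched.

```python
import re


def find_translations(wikitext, lang_code):
    """Return the terms given in {{t|CODE|...}}, {{t+|CODE|...}} and
    {{t-|CODE|...}} calls for one language code.

    Only the first parameter after the code is captured; nested templates
    and named parameters are deliberately ignored in this sketch.
    """
    pattern = r"\{\{t[+-]?\|" + re.escape(lang_code) + r"\|([^|}]+)"
    return re.findall(pattern, wikitext)


# Placeholder wikitext; "foo", "bar" and "baz" are not real translations.
sample = "* Sardinian: {{t|sc|foo}}, {{t+|sc|bar}}\n* French: {{t|fr|baz}}"
print(find_translations(sample, "sc"))
```

Running such a check over every page before deprecating a code gives a list of entries to fix in advance, which is the effect described for the Kölsch reclassification.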

Talossan (tzl)

I see you edited this file many times before. Could you update the variable for Talossan (tzl) here and replace it with the following:

m["tzl"] = {
	canonicalName = "Talossan",
	type = "appendix-constructed",
	scripts = {"Latn"},
	family = "art",
	sort_key = {
		from = {"[àáâäå]", "ç", "ð", "[ëèéê]", "[ìíîï]", "ñ", "[öòóô]", "ß", "[üùúû]", "þ"},
		to   = {"a", "c", "d", "e", "i", "n", "o", "s", "u", "z"}},  -- the copyright sign is used to guarantee that ð and þ will always be sorted after all other words with respectively d and z
}

¡Graschcias, Robin van der Vliet (talk) (contribs) 18:42, 27 August 2015 (UTC)!
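To illustrate what the requested sort_key does, here is a minimal sketch in Python rather than Lua. It assumes that each `from` pattern is replaced in turn by the `to` entry at the same position, and that the text is lowercased first; both are assumptions made for the example and not a description of how Module:languages actually applies the table. The two lists are copied as they appear in the request, so ð and þ map to plain "d" and "z" here.

```python
import re

# Copied from the from/to lists in the request above.
SORT_FROM = ["[àáâäå]", "ç", "ð", "[ëèéê]", "[ìíîï]", "ñ", "[öòóô]", "ß", "[üùúû]", "þ"]
SORT_TO = ["a", "c", "d", "e", "i", "n", "o", "s", "u", "z"]


def make_sort_key(text):
    """Build a sort key by applying each from/to substitution in order.

    Lowercasing first is an assumption of this sketch, so that capitalised
    words sort together with their lowercase forms.
    """
    key = text.lower()
    for pattern, replacement in zip(SORT_FROM, SORT_TO):
        key = re.sub(pattern, replacement, key)
    return key


words = ["Glheþ", "àl", "Compläts"]
print(sorted(words, key=make_sort_key))
```

Sorting on the key instead of the raw string keeps accented words next to their unaccented neighbours, which is the purpose of a sort key for a Latin-script language with many diacritics.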

  Done. Are publications like the Guizua Compläts àl Glheþ Talossan copyrighted? If so, I would caution you not to add more than a couple dozen words in the language, because including too much of a copyrighted language (like Klingon) poses legal problems/risks for Wiktionary (for which reason the Klingon appendix was greatly condensed by me a while ago, following this BP thread). - -sche (discuss) 01:04, 28 August 2015 (UTC)
Ün Guizua Compläts àl Glheþ Talossan is (as far as I can tell) a copyrighted book, but it is not the source of the language. I am also not sure if the Talossan language is copyrighted and if languages can be copyrighted in the first place, as a language is a gigantic list of facts and facts can not be copyrighted. Robin van der Vliet (talk) (contribs) 16:52, 29 August 2015 (UTC)
Individual facts, no. A compilation of facts can be copyrighted, though. With a bit of work, any creative work can be analyzed as a collection of facts, but the way the facts are assembled by the creator of the work makes them copyrightable. Chuck Entz (talk) 23:31, 29 August 2015 (UTC)

Updates to Template:WOTD

Hi, I updated Template:WOTD at Template:WOTD/sandbox, essentially adding a new parameter |comment= (or {{{6}}}) which allows editors to add a comment: see Template:WOTD/testcases. If that is all right, could you update Template:WOTD? I can't do it myself as I'm not an administrator. If this isn't the correct procedure for proposing changes to the template, please advise. Thanks. Smuconlaw (talk) 14:35, 31 August 2015 (UTC)

  Done. Neat idea; I had noticed your addition of it to manumission (28 August). - -sche (discuss) 16:51, 31 August 2015 (UTC)
Great! Thanks. Smuconlaw (talk) 21:53, 31 August 2015 (UTC)

Preventing long tags

In the unlikely case that you haven't noticed my edit at mir#German_Low_German, have a look. With something as splintered as Low German, do you think it would make sense to install an L4 for "Dialects using this word" or something instead of context labels? The pronunciation sections can simply go into a collapse. Korn [kʰʊ̃ːæ̯̃n] (talk) 15:23, 1 September 2015 (UTC)

@Korn: thanks for bringing this up. I thought about it a while ago when I saw anguañu, which specifies twenty different dialects that the term is used in. Perhaps in such cases the individual dialects can be specified under ====Usage notes====, leaving the definition line to just say "many|_|dialects". (According to templatetiger, there are three other entries which use 9 or more parameters of {{label}}: recondite, quindecillion, and tu; and there are also a few entries which use 10 or more parameters of {{context}}: pardı and Mischief Night.) - -sche (discuss) 00:56, 3 September 2015 (UTC)
However, {{label}} adds categories which would need to be added manually or in another way if we moved away from using {{label}} on the definition lines of such terms... - -sche (discuss) 01:01, 3 September 2015 (UTC)

Unprotection of Word of the Day pages

Could you please unprotect "Wiktionary:Word of the day/September 29" and "Wiktionary:Word of the day/September 30" so I can update them? Thanks. (If you have time, perhaps you can also go through other days of the year and unprotect them as well.) Smuconlaw (talk) 16:08, 2 September 2015 (UTC)

  Done. I wonder why some, but only some, of the pages were protected in the first place. - -sche (discuss) 00:58, 3 September 2015 (UTC)
Thanks. No idea why this was done. Perhaps it was before there was cascading CSS protection of material on the Home Page? Smuconlaw (talk) 06:40, 3 September 2015 (UTC)

Updating of Template:quote-book/source

I have created an updated version of {{quote-book/source}} at {{quote-book/source/sandbox}} to address the three issues mentioned at "Template talk:quote-book#Some suggested changes". Could you replace the contents of {{quote-book/source}} with those of {{quote-book/source/sandbox}}? Thanks. Smuconlaw (talk) 17:26, 3 September 2015 (UTC)

  Done and I left a slightly longer comment on that talk page. - -sche (discuss) 23:58, 3 September 2015 (UTC)


This is actually wrong. See the documentation for {{pl-decl-phrase}}. I realize that this interface is somewhat hacky, but I could not find a different way to pass keyword parameters to the declension patterns. --Tweenk (talk) 22:31, 3 September 2015 (UTC)

Oh, OK. At the time I made that edit, the template was just a big module error, and my edit (upon preview) made it resolve into a normal-looking table, so I figured the exclamation marks were an odd typo. - -sche (discuss) 23:53, 3 September 2015 (UTC)

German capitalisation

Isn’t it about time for some archiving?

Anyway, could you please tell me if German always had the ‘capitalise all nouns’ rule? --Romanophile (talk) 03:25, 7 September 2015 (UTC)

Yeah, you're right, I need to archive.
No, German and its predecessors (Old/Middle High German) didn't always capitalize nouns. In the medieval period, capitals were generally only used at the beginning of sentences. Even after capitalization of nouns and names became standard in the Baroque period, some authorities (such as the Brothers Grimm, authors of the major Deutsches Wörterbuch) were opposed to it and persisted in writing in minuscule.
- -sche (discuss) 20:46, 7 September 2015 (UTC)
So, would it be permitted to include minuscule forms as obsolete forms? --Romanophile (talk) 21:11, 7 September 2015 (UTC)
I think that would be a bad idea, since the difference isn't specific to the word, but a general rule. You would end up with a lowercase entry for just about every noun attested before a certain date, which would be about as useful as entries for italicized or underlined forms. Chuck Entz (talk) 21:26, 7 September 2015 (UTC)
Okay, fair enough. But what if the word is not attested in a capital form? Do we capitalise it anyway? --Romanophile (talk) 21:55, 7 September 2015 (UTC)
I would. Otherwise, you imply that there's some inherent difference from all the other nouns which were also lowercase back then. Of course, Middle High German and Old High German would be uppercase or lowercase by their own rules, since we consider them separate languages. Chuck Entz (talk) 22:19, 7 September 2015 (UTC)
I agree with Chuck. We do similarly for English: old capitalized Nouns don't have Entries, and we've tended not to capitalize common Nouns even if they're more common in old Works in capitalized Form, although there are a few Exceptions (like Admiraless, which I only just moved). - -sche (discuss) 22:29, 7 September 2015 (UTC)
"We do similarly for English: old capitalized Nouns don't have Entries" -- Does that mean there shouldn't be capitalized entries, or does it simply mean that they're missing? Also, what about other European languages, like Latin and Danish, in which nouns were also (sometimes) capitalized? If capitalized spellings are discriminated against, shouldn't there at least be some note somewhere? For example, there could be a page somewhere explaining English habits, such as the differences between US English and UK English and English capitalization habits. If a single page would be too long, there could be sub-pages like "English habits/Dialects" and "English habits/Capitalisation". - 12:31, 25 October 2015 (UTC)
Wiktionary has decided to exclude old Capitalizations of ordinary Nouns as a Matter of Course, along with sentence-initial Capitals and all-caps (the usual Examples cited in Discussions are "The" and "THE", Variants of "the"), long-s, and various typographic Ligatures (e.g. Talk:fisherwoman). I proposed last Year that we should write these Exclusions down in some central Place, but nothing happened; perhaps I'll suggest it anew.
Wiktionary:About Latin#Orthography_for_Latin_entries documents how we handle Latin, although some Things (like that we don't include "EQVVS") seem to be so basic that they're not spelled out but only implied by e.g. the Note that the Form which we do have an Entry for is "equus".
I don't recall if we've discussed old Danish Capitalization or not, but I see no Reason it wouldn't be handled like old English Capitalization. - -sche (discuss) 19:10, 25 October 2015 (UTC)
Can the decisions be found somewhere? The exclusion of sentence-initial capitals, modern all-caps and typographic ligatures makes sense. But in the case of capitalised nouns and normal antique Latin forms in all caps, the exclusion is doubtful.
In the case of long-s, the exclusion would even be against Wiktionary's aim "to describe all words". While it's easy to change "winter" into "Winter" when one knows that "winter"/"Winter" is a noun, it's not easy to change some s into long-s. In some cases, it's nearly impossible to know where long-s's are put if one doesn't know the rules concerning long-s's. (Simplified basic rules like "s" is used at a word's end and long-s is used elsewhere are often incorrect.)
Also, old Latin abbreviations like "IMP" for "IMPERATOR" can be found in (special) dictionaries and can't be changed into a pseudo-modern spelling like "Imp" or "imp", because a modern abbreviation would be "imp." or "Imp.", which is another word as it's written with a dot.
And in some of these cases, I have doubts whether decisions were made at all, or whether they were real decisions and not just opinions uttered somewhere. For example, it's possible that nobody thought of old Latin abbreviations like "IMP" and thus no decision was made.
Also, what about Modern Greek? The about page clearly states "This is a draft under discussion.". On the discussion page Katharevousa forms (Modern Greek spelled with diacritics etc.) are mentioned, and some Katharevousa words have their own entries (e.g. καρβονικόν). But the about page and the discussions don't clarify where to mention Katharevousa forms. Is e.g. καρβονικόν a related word, a synonym or an alternative form of καρβονικό? (Comparing it with other languages, like English prae- and pre-, Katharevousa forms should be alternative forms.) What about καρβονικός and its declension: where should the neuter Katharevousa form καρβονικόν be mentioned? Under alternative forms with the addition "neuter", in the declension section, in the header?
Other questions in some way related to this:
* In the case of Wiktionary:About Latin it seems that some things weren't discussed - at least not at Wiktionary talk:About Latin - but rather made up by some authors. E.g. in the case of the edit from 27 November 2011 with the comment "→Quotations: Adds rule for marks over final a for disambiguation of ablatives from nominatives", it doesn't seem that there was a discussion. On the talk page there was no edit around that time and the author of that change didn't make a change which would indicate a discussion about it. (There was a discussion with another topic at "Wiktionary:Grease pit" and a discussion on his talk page which he commented on. But neither would be a fair place for a discussion of his edit.)
* The About Latin page states that words with j should point to words with i. But what if only a form with j is attested? That doesn't mean that the form with i is not attestable, but when it's not attested (no one has found a quote), the word with j can't point to a form with i, at least not without making up words and word forms.
* What if there is a term without an English translation? E.g. Swedish "tankstreck" and German "Gedankenstrich" refer to the mark "–", but usually just when used in certain contexts, and sometimes the term is not restricted to that smaller dash but might also refer to "—". Thus, neither term belongs in the translations of "dash" or "en dash". But still it would be nice to see translations for these terms. The current practice is like this: either the words are incorrectly given as translations of an English word, or there is no translation section with words like that. Possible 'solutions' which should be better: (a) One could mention "tankstreck" under the Etymology of German "Gedankenstrich" and vice versa, as they are formed similarly. (b) One could create a translation template like "template:translation - thoughts stroke" which then can be embedded in the entries of the foreign words.
So regarding your old, unanswered question at "Beer parlour": Those decisions should be collected somewhere. Also, maybe those decisions should be checked to see whether they still make sense. One could also check whether all so-called decisions really were decisions. E.g. a user once wrote that as far as he knows About Latin is rather a collection of ideas than of actual rules. The part, "... think tank, working to develop a formal policy.", should support his attitude.
- 20:08, 27 October 2015 (UTC)
Questions about Wiktionary's policies towards Latin should be directed towards Latin-speaking and Latin-editing editors on WT:T:ALA. Likewise, questions about Ancient Greek should be directed to WT:T:AGRC; people there are more qualified than I am to tell you about Katharevousa. I've started a BP thread about long s and ligatures: Wiktionary:Beer parlour/2015/October#Documenting_how_to_handle_long_s_and_ligatures. - -sche (discuss) 02:20, 28 October 2015 (UTC)

Broken usage tracking in MediaWiki:Gadget-RegexMenuFramework.js

Hello -sche. You changed a link in MediaWiki:Gadget-RegexMenuFramework.js to remove it from Category:Pages with broken file links. The broken page links are a common hack to track global usage via Special:GlobalUsage or the usage tool. Unfortunately your change broke the tracking, so the page will no longer receive maintenance updates as needed. Would you consider excluding JavaScript pages from Category:Pages with broken file links instead? (I can put together the code to do that via MediaWiki:Broken-file-category for you.) Pathoschild (talk) 02:59, 11 September 2015 (UTC)

fickern seems to have become an autonym...

Rather than create Category:Palatine German and Category:Kölsch German, I wanted to instead fix this entry, which is the sole entry in both of those- but I don't know enough about either language to do it even half right. I suspect you'll also want to remove some things from Module:labels/data. Thanks! Chuck Entz (talk) 04:40, 11 September 2015 (UTC)

The labels in the module are largely OK, because there are (almost certainly) terms used in standard German which are specific to the Palatinate / Köln, although the details of labels like those are under discussion on the module's talk page. This entry, on the other hand, is odd... the Pfälzisches Wörterbuch only has "fickeln" and "ficken"; the Rheinisches Wörterbuch doesn't have this sense; and Google Books hits all seem to be scannos or the noun. Even raw Google hits for "zu fickern" are mostly Google Books scannos. - -sche (discuss) 21:49, 11 September 2015 (UTC)


Could you take a look at Talk:Knabe. DCDuring TALK 12:16, 22 September 2015 (UTC)

Thanks for helping me with this sweet memory of my deceased parents. DCDuring TALK 06:46, 9 October 2015 (UTC)

American black bear

Do you know what language "Dene" refers to here? DTLHS (talk) 18:08, 8 October 2015 (UTC)

If you go to WT:LOL and press Ctrl+F and type "Dene", you will find that the Chipewyan language (code chp) has "Dene" as one of its alternative names. --WikiTiki89 18:16, 8 October 2015 (UTC)
That's true, but "Dene" can also refer to a whole family of languages, so I don't know what was meant. DTLHS (talk) 18:38, 8 October 2015 (UTC)
You seem to be right. In WT:LOF, all I see is "Na-Dene", not "Dene", but the Wikipedia page on Na-Dene languages mentions that the "Athabaskan" family can also be called "Dene". Anyway, the Chipewyan language is in the North Athabaskan family, which is in the Athabaskan family. Chipewyan is also the only single language I can find that goes by the name "Dene", and the Wikipedia page on the Chipewyan language says "Most Chipewyan people now use Dene and Dënesųłiné to refer to themselves and their language, respectively." Based on all this, I think Chipewyan is the correct choice. If it turns out to be wrong, it would be within our expected margin of error and we would know it's in the same family of languages anyway, so the actual Chipewyan would be similar enough. --WikiTiki89 18:56, 8 October 2015 (UTC)
That's a reasonable assumption, although in this case I think the rug is pulled out from under it because the gloss (=the claim that tsah means "black bear") seems to be mistaken. Desjarlais gives sas as the Dënesųłiné (Chipewyan) word for "bear", and an old article in the Transactions of the Canadian Institute clarifies the species by saying [in old orthography] "the Déné word for Black Bear is s̀əs or s̀as according to the dialect". For comparison, Hargus gives səs as the word for "black bear" in either Sekani or Babine-Witsuwit'en (without reading her whole chapter I can't tell which), and Krauss gives x̯ešʷ as the Proto-Athabaskan word for "black bear". By contrast, Desjarlais says tsá is the word for "beaver", and Morritt citing Haas agrees (compare Sekani tsàʔ and Slave tsáʔ, both "beaver"). - -sche (discuss) 21:44, 8 October 2015 (UTC)
Historically, brown bear species almost certainly ranged over the lands of the Chipewyans. Is there a term that included brown bear? DCDuring TALK 00:14, 9 October 2015 (UTC)
I can't find a Chipewyan term for "brown bear", although I can find sources which gloss sas as just "bear", so it may have functioned as a generic term. I can find the term in other languages: Ruhlen has Haida xúuts "brown bear", Tlingit xúts (= /xúːc/, also written xoots) "brown bear" (/"grizzly bear"), Tsetsaut "grizzly bear". Athabaskan languages and the schools: a handbook for teachers (1984) notes "in Kutchin, shih means 'brown bear' but shìh (with lowered tone) means 'food', and these words are not grammatically or etymologically related." The Proto-Athabaskan term for "brown bear" was x̯...c per Krauss (he is unsure of the middle vowel). - -sche (discuss) 01:21, 9 October 2015 (UTC)
The problem is that the w:Na-Dene languages are called that because some variation of "dene" means "people" in the vast majority of at least the Athabaskan languages. More often than not, the word for "people" gets used in the language name (at least the one native speakers use for their own language), so there could be a number of candidates. The Chipewyan term is pretty close, so it would make sense to concentrate on that part of Northern Athabaskan. Or, better yet, get @DCDuring to tell us what source he used for his mass addition of American Indian translations to that page, and we might be able to figure it out that way. Given the rather poor understanding of American Indian languages and their orthography in most general sources, I'm not so sure that was a good idea. Off the top of my head, the Hopi looks plausible based on what I know of other Northern Uto-Aztecan languages, and the Southern Uto-Aztecan ones all seem to use reflexes of the same ancestral form, which is a good sign, but "close" isn't close enough for dictionary purposes. Chuck Entz (talk) 03:41, 9 October 2015 (UTC)
Why is that a "problem"? --WikiTiki89 19:22, 9 October 2015 (UTC)
It makes it difficult to tell which language a work that refers to "Dene" is referring to. Indeed, older generalist works (as they tend to do with a lot of languages, e.g. also Great Russian) often impressionistically consider whole swathes of the Dene family to be a "Dene" language divided into e.g. Northern and Southern dialects. - -sche (discuss) 20:30, 9 October 2015 (UTC)


You removed "songster" from Sänger. But then shouldn't "songstress" be removed from Sängerin too, or shouldn't it be replaced with "singeress" ("female person who sings" instead of "female person who sings (songs)")? - 12:23, 25 October 2015 (UTC)

Thanks for catching that. Yes, it's sufficient for Sängerin to say "female singer", IMO. If I heard someone say "singeress" it'd be a dead giveaway that English wasn't their native language. - -sche (discuss) 18:45, 25 October 2015 (UTC)

Request for Zipser German Granted

User -sche, þy wish haþ been granted. See here. --Lo Ximiendo (talk) 18:52, 25 October 2015 (UTC)

Þanks! - -sche (discuss) 19:18, 25 October 2015 (UTC)
I also added few words of Sathmar Swabian and Silesian German. --Lo Ximiendo (talk) 19:38, 26 October 2015 (UTC)
Great! Wiktionary's coverage of Germanic languages is slowly increasing. - -sche (discuss) 08:05, 27 October 2015 (UTC)
I added the white and yellow flag for the language header of Silesian German. --Lo Ximiendo (talk) 06:46, 30 October 2015 (UTC)


If that's "nonstandard", then please fix it. It's simply a fact that there are two opinions about the part of speech:

  • Some say that Berliner and similar words are adjectives. This is also supported by dated spellings like berliner.
  • Some say that Berliner etc. are nouns in the genitive plural: der Berliner, gen. pl. der Berliner - so Berliner Mauer literally means "Wall of the Berliners". This is also supported by German spelling rules: nouns begin with a capital letter, adjectives do not (nominalised adjectives aren't adjectives anymore, but nouns too).

- 18:24, 27 October 2015 (UTC)


What’s wrong with the samples on Google Groups? --Romanophile (contributions) 20:56, 29 October 2015 (UTC)

Oh, there are some, that's great! They weren't there when I searched back in 2013, which is odd, since the posts were made before 2013... but Cloodcuckoolander (I think) has remarked upon how oddly unreliable Google's Groups search is. Thanks for revisiting the entry / noticing. (I have a short list of entries that just need one more citation that I check up on periodically, but it's woefully incomplete.) I'll turn it into an entry. - -sche (discuss) 05:06, 30 October 2015 (UTC)


This rollback is in error. As said before (cf. revision history): Google doesn't distinguish between ſ and s, so antiqua "daſs" (around 1871-1902) incorrectly becomes "dass" in Google's data, and thus ngram is not a reliable source. In case you don't know: daſs is not the same as dass, but an alternative form of daß used in antiqua when ß was not available (this usage was deprecated in some spelling rules). daſs could also be a Heyse spelling of daß, but then Heyse's spellings (including his antiqua spellings) are (said to be) different from 20th/21st century spellings, as Heyse also used ſ in antiqua (the rules from 1902 should deprecate the use of ſ in antiqua and only allow s and ß, which also holds for the 1996 reform, though the use of ß and s was changed).
Also: daſs is an alternative form of daß/dass; older (antiqua) spellings with ſ can't easily be derived from modern antiqua spellings, and there's no bijection between older (antiqua) spellings and modern antiqua spellings, e.g. both Wachſtube (Wach-stube) and Wachstube (Wachs-tube) become Wachstube in antiqua without long s. Thus it makes sense to add spellings with ſ. (It maybe makes no sense to have a separate entry for it, as it's not easy to input the character ſ, as most users don't know the difference between ſ and s, and as ſ and s might be treated alike in encoding etc., but there was no link anyway.)
P.S.: In the German spelling rules (Berlin, 1908) it says: "In lateiniſcher Schrift ſteht s für ſ und s, ss für ſſ, ß (besser als ſs) für ß, für ß tritt in großer Schrift sz ein, z. B. MASZE (Maße), aber MASSE (Masse)." (antiqua ß and fraktur ß actually look different in the text and the text itself is printed in fraktur). In early Duden editions (late 19th century) it was: "Zu merken iſt, daß man in lateiniſcher Schrift s für ſ und s ohne Unterschied, ss für ſſ und ſs für ß anwendet. Statt ſs ist auch ß zulässig." So daſs, which can be found in antiqua texts from ca. 1871 till 1902 and is OCRed as "dass" by Google, is an alternative form of "daß", which was more common in antiqua after 1902.
daſs was also a real Heyse spelling. But I'm not sure whether it was used in fraktur or in antiqua. If it was used in antiqua, then it's obviously different from dass. If it was used in fraktur, then the traditional fraktur-antiqua transcription rules from early Duden editions and the rules from 1902 could say that it has to be transcribed as dass, but even then it's a different form, as it's transcribed. But it's very likely that Heyse's spellings were not as common as Adelung's spellings.
Anyway, as Google doesn't distinguish between ſ and s (or between fraktur and antiqua), it can't be used to cite a statement like "dass was more common than daß in 1871-1902". For 1950 till nowadays, ngram maybe can be used, as the Nazis banned fraktur and it never became popular again, and as ſ became unpopular in antiqua (cf. traditional spelling rules from 1902 and reformed spelling rules from 1996).
- 14:18, 6 November 2015 (UTC) and 14:40, 6 November 2015 (UTC), P.S. 17:47, 6 November 2015 (UTC)
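The non-bijection point above can be illustrated with a small Python sketch. It assumes, purely for illustration, that an index folds ſ (U+017F) to s before counting, which is how the comment describes Google's data; the function names are hypothetical and this is not Google's actual pipeline.

```python
from collections import defaultdict


def fold_long_s(text: str) -> str:
    """Replace long s (U+017F) with round s, as an s-insensitive index would."""
    return text.replace("\u017f", "s")


def group_by_folded(forms):
    """Group spellings by their folded form to show which ones collapse together."""
    groups = defaultdict(list)
    for form in forms:
        groups[fold_long_s(form)].append(form)
    return dict(groups)


# Spellings taken from the comment above.
forms = ["daß", "daſs", "dass", "Wachſtube", "Wachstube"]
groups = group_by_folded(forms)

for folded, originals in groups.items():
    print(folded, "<-", originals)
```

Under this assumption daſs and dass are counted under one key while daß stays separate, and Wachſtube and Wachstube likewise merge, so the folded form alone cannot tell which original spelling a count refers to.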

Where is your evidence that daſs should count as daß and not "dass"? To the extent that "ess-zett" is treated as a separate thing from "two esses", "ſs" is two unligatured esses, one long and one short according to the usual (translingual) rule of long- vs short-s placement. - -sche (discuss) 21:16, 6 November 2015 (UTC)
Older Duden editions and German spelling rules from 1902 (see above). Both state that fraktur ß (actually more like a ligature of ſz) can be written as ſs in antiqua ("ſs für ß" and "ß (besser als ſs) für ß"). So antiqua daſs can be and often was an alternative form of daß and not of dass.
Without Duden and the German spelling rules (which say that in antiqua s is used instead of single fraktur ſ), antiqua daſs would still be another form than dass. That is, one would have to distinguish between three forms in antiqua: daß (traditional spelling, also preferred by the 1902 orthography), daſs (older antiqua spelling), dass (1996 reform spelling). It could be, and shouldn't be unlikely, that authors who used daſs (which could also be a Heyse spelling used in antiqua) would prefer daß over dass if they could only choose between these two forms. In the case of the real Heyse spelling, at least the one used in fraktur, many arguments used against the 1996 reform spelling are invalid, e.g. sss shouldn't occur in a real Heyse spelling used in fraktur, and maybe in antiqua too.
daſs (not daſz (= daß)) in fraktur by Heyse could be dass in antiqua. But: 1. The older Duden and the spelling rules from 1902 can't be used to derive that spelling, as fraktur daſs is an incorrect form to begin with. So, some other source is needed that says that Heyse's fraktur daſs can be an antiqua dass, as it could also be that his correct antiqua form would be daſs too or that he proscribed the use of antiqua. 2. It's more likely that Heyse's spelling was rarer anyway, as it was younger (Adelung came before him), was deprecated in several German countries (e.g. in Prussia) and wasn't used in the 1902 orthography (if dass was preferred in 1871-1902, then it should be more likely that that spelling would be used in the 1902 orthography). 3. As Google doesn't distinguish between ſ and s or between fraktur and antiqua, it is not a reliable source. And to interpret Google's ngram or Google Books would be OR too.
(Regarding the quotes: It's hard to quote, in an antiqua text, a fraktur text which distinguishes between fraktur and antiqua. Maybe it would be better with pseudo-HTML like "<antiqua>ſs</antiqua> can be used for <fraktur>ß</fraktur>", but maybe that would be harder to read.)
- 12:39, 7 November 2015 (UTC)


Discussion moved to Wiktionary:Tea room/2015/November#Neger.
(let's try to keep the discussion in one or two places rather than three)

Dative -e in German strong declensions

Discussion moved to Template talk:de-decl-noun-n.

German ordinal numbers

Presently their lemmas are the forms in -e. Our general practice, I think, is to put adjectives without a bare form at -er. The ordinal numbers do have a bare form, which is used with zu: zu siebt, zu acht. But these seem to be separate idioms. So I guess -er would be the right place. And if you agree: Should I move them manually, or is there a better way? Kolmiel (talk) 14:22, 9 December 2015 (UTC)

I think you are correct about the general practice (or more accurately, the general desire — in practice a lot of entries were created at the wrong title and still need to be moved), which also applies to substantivized adjectives. But about this particular set of entries... why do de.Wikt and the Duden, which do lemmatize e.g. Verletzter m rather than Verletzte m (cf. this thread), both lemmatize siebte ([13]) rather than siebter ([14])? (The DWDS seems divided on the matter; there's no entry found if you search for "siebter", but if you search for "siebte", the DWDS-Wörterbuch entry lemmatizes that form, while the Etymologisches Wörterbuch entry that comes up lemmatizes the form that ends in r.) Do you know if there's any logic behind lemmatizing the -e forms? If not, then yes, for consistency they could be moved. I suppose AutoWikiBrowser could be used to speed up the process somewhat. - -sche (discuss) 01:03, 10 December 2015 (UTC)
I think Duden might lemmatize the er-form only in nouns, but the e-form in adjectives. For example "oberer" is given "obere, oberer, oberes" at Otherwise I don't think there's a special reason concerning ordinal numbers, except possibly that these are often preceded by the definite article. But that's true of others as well, and the er-forms of ordinal numbers do occur and aren't particularly rare at all (ein zweiter Versuch, zehnter Dezember). So I think they should be moved. Kolmiel (talk) 14:19, 10 December 2015 (UTC)
OK, I will find time to move and standardize most of them in AWB if there are too many for us to do by hand. AFAICT (from the category and the bluelinks in siebte) we're dealing with <50 entries, right? There's a lot of inconsistency in what part of speech the lemmas and non-lemmas use in their headers and headword-lines; achtzehnte uses 'ordinal number' and zweiter uses 'numeral', but siebte uses 'adjective'; siebtes uses 'adjective form', so I tentatively just appended 'form' to the headword line of neunzehnte and got 'numeral form'. They should all be 'adjectives' (the lemmas) and 'adjective forms' (the inflected forms), right? (This needs to be sorted out regardless of which forms we lemmatize.) Also pinging @CodeCat, who has helped sort out Wiktionary's labelling of numerals vs numbers vs adjectives. (The Duden goes with "Zahlwort"; de.Wikt with the double header "Adjektiv, Numerale".) - -sche (discuss) 22:14, 11 December 2015 (UTC)
These are straightforwardly adjectives. "Numeral" is a special kind of part of speech whose definition I'm still not sure of, but see w:Numeral (linguistics). That page mentions in particular that "not all words for cardinal numbers are necessarily numerals", so not everything with a number meaning is, part-of-speech wise, a numeral. That's why we have Category:German numbers, which exists outside the POS category tree. In fact, I believe numerals are a kind of determiner, closely allied to non-cardinal quantifiers like "all", "some" or "no". —CodeCat 22:50, 11 December 2015 (UTC)

Old Picard

Should we have a language code, or at least an etymology-only code, for Old Picard? Otherwise, I assume it is currently treated as a dialect of Old French, and without an etymology code, I had to use some awkward phrasing at Rosine#Etymology. --WikiTiki89 19:37, 28 December 2015 (UTC)

I do find references (more than to "Old Italian"!) to Old Picard translations of texts, and to Old Picard words — including in dictionaries that give Old Picard words as etyma. Let's give it an etymology-only code, so that those etymologies which need to can cite it. Distinguishing it in general from Old French and Old Northern French might be messy, so I wouldn't grant it a full code and its own language sections until such time as someone makes a case that it needs/merits that. Based on "fro-nor", I guess the thing to add to Module:etymology languages/data would be "fro-opc". - -sche (discuss) 03:40, 29 December 2015 (UTC)
Maybe "fro-pic" instead? The oldness is already implied by "fro-", and we do have "fro-nor" rather than "fro-onr". --WikiTiki89 16:46, 29 December 2015 (UTC)
Sure. - -sche (discuss) 18:44, 29 December 2015 (UTC)
Maybe we should be consistent and use the language code for modern Picard: fro-pcd. Of course, we have fro-nor, instead of fro-nrf, so maybe I'm just playing host to the "hobgoblin of little minds". Chuck Entz (talk) 02:45, 30 December 2015 (UTC)
I thought about that, but then I wondered if using the modern language's code for that element would suggest that this was another code for the same thing. "de-AT" is (a variety of) the same language as "de", whereas "fro-pic" is not "pcd" but rather (a variety of) "fro". - -sche (discuss) 05:47, 30 December 2015 (UTC)
I already went with "fro-pic", I figured if there was a strong enough argument to change it, I would, but I don't see such a strong enough argument. --WikiTiki89 17:01, 30 December 2015 (UTC)
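The distinction drawn above (an etymology-only code names a variety of its parent language, not a separate language) can be sketched in Python. Everything here is a hypothetical illustration: the dictionary layout, the function name and the name given to "de-AT" are assumptions, and this is not the real format or content of Module:etymology languages/data.

```python
# Hypothetical data, loosely modelled on the discussion above; NOT the real
# contents or structure of Module:etymology languages/data.
FULL_LANGUAGES = {"fro": "Old French", "pcd": "Picard", "de": "German"}

ETYMOLOGY_ONLY = {
    "fro-nor": {"name": "Old Northern French", "parent": "fro"},
    "fro-pic": {"name": "Old Picard", "parent": "fro"},
    "de-AT": {"name": "Austrian German", "parent": "de"},  # name is assumed
}


def entry_language(code: str) -> str:
    """Return the full language code under which entries for `code` would live."""
    if code in FULL_LANGUAGES:
        return code
    if code in ETYMOLOGY_ONLY:
        return ETYMOLOGY_ONLY[code]["parent"]
    raise KeyError(f"unknown language code: {code}")


print(entry_language("fro-pic"))
print(entry_language("pcd"))
```

In this sketch "fro-pic" resolves to "fro" rather than to "pcd", which mirrors the reason given above for not reusing the modern language's code inside the etymology-only code.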
Gosh! And here I was expecting a discussion of one of the roles Patrick Stewart played in the Star Trek: The Next Generation series finale. We could have split up into Team Middle-Aged Picard and Team Old Picard a la the Twilight Saga. Oh well...
Seriously, though, I seem to remember reading somewhere that a few of the "Normans" that invaded England were really Picards, and that there are traces of Old Picard in English. Chuck Entz (talk) 07:25, 29 December 2015 (UTC)

Source access

I have no access to the PDF documents of Cambridge Ancient History. Do you know how to get access to it? --UK.Akma (talk) 21:02, 10 January 2016 (UTC)

I don't; I'm sorry. I just took the text that had been added to Subarian and trimmed out the speculation on ethnic identity and other things that belonged in Wikipedia rather than in a dictionary. - -sche (discuss) 21:29, 10 January 2016 (UTC)
See the discussion on the talk page of w:Subartu about what seem to be the same set of references. I have my doubts whether any of this should be allowed in the entry. Chuck Entz (talk) 22:13, 10 January 2016 (UTC)

all heart

Why did you delete that page? Do you think that it’s sum‐of‐parts? --Romanophile (contributions) 00:03, 13 January 2016 (UTC)

@Romanophile: Because it was just nonsense, the page contained only the text "i love my family and everybody else around me". --WikiTiki89 00:30, 13 January 2016 (UTC)
@Wikitiki89 Ah! All right. Still, do you think that this entry would be acceptable if it were properly designed? --Romanophile (contributions) 00:36, 13 January 2016 (UTC)
Depends on what meaning you have in mind. --WikiTiki89 00:37, 13 January 2016 (UTC)
Hmm. I'm familiar with the collocation in sayings like "She was all heart" (=was very loving and/or compassionate, that kind of thing), but it does seem like it might just be "all" + "heart", and one can also say things like "She was all brain(s)" (=was very smart, perhaps without things like social awareness, hence the "all"), or "She was all legs" (=had long legs). - -sche (discuss) 00:50, 13 January 2016 (UTC)
"Definitions" 2 and 4 of heart#Noun cover the range of meanings of heart as used in "all heart" in my experience and in a review of the phrase at COCA. DCDuring TALK 02:09, 13 January 2016 (UTC)

Thank you

For finishing cleaning out RFV, especially given that it had become too large to archive. The page that really needs your help, though, is WT:RFM (and I suppose to a lesser extent WT:RFDO), because I simply can't close many of those. Some of them are language mergers etc.; the ones that you haven't looked at need some expert attention, and even those upon which we've come to a consensus need to be executed. I still don't know all the steps one ought to go through to handle mergers and name changes (is there a manual somewhere?). The other hitch is that I don't have AWB, so it's a lot harder to find all the uses of a language's name or go through every page in a category to change it, which especially slows down requests at RFDO. Anyway, I appreciate the help! Cheers —Μετάknowledgediscuss/deeds 00:01, 1 February 2016 (UTC)

I will take a look. As for AWB, you could download it; it's not that hard to learn. There is Wiktionary:Guide to adding and removing languages; changing a language's name is not handled too differently from removing a language (you have to find the same things — old uses). - -sche (discuss) 00:55, 1 February 2016 (UTC)
I have a Mac, so using AWB would require virtual Windows AFAICT. In any case, I should've known about that guide — thanks. I'm not always sure where to archive the discussions, though. In any case, I guess all that I really need is for you to weigh in on long-ignored discussions. —Μετάknowledgediscuss/deeds 03:55, 1 February 2016 (UTC)

Update of "Template:quote-book/source"

Hi! I have done an update of {{quote-book/source}}, which is at {{quote-book/source/sandbox}} (see {{quote-book/testcases}} for sample uses). The main changes are these:

  • Improved handling of |format=, |genre=, |language= and |doi=.
  • Addition of |archiveurl= and |archivedate=.

If it looks all right, could you please replace {{quote-book/source}} with the contents of the sandbox? Thanks. Smuconlaw (talk) 09:03, 3 February 2016 (UTC)

  Done. I'm monitoring Category:ParserFunction errors to see if errors arise, and actually, it looks like the category is losing a few pages, though that may be due to Kenny's edits. :) Still, thanks for all your hard work sorting out these quotation templates! - -sche (discuss) 17:46, 3 February 2016 (UTC)
Thanks, and you're most welcome. Let me know if anything in the template needs fixing. Smuconlaw (talk) 20:02, 3 February 2016 (UTC)
Whoops, there seems to be a space missing if |title= and |publisher= are specified, but |location= is not: see "freedom of speech". (I don't think this was an issue created by me.) Fixed it at {{quote-book/source/sandbox}}. Smuconlaw (talk) 20:11, 3 February 2016 (UTC)

Cool Beans

I have added a recording of the phrase at the discussion page below; can it be added to the actual page? —This comment was unsigned.

Collocations data

I don't think our discussions of collocation space included anything about what the data might actually look like. I have a 1.7Mb file of sample (free) data from COCA. We could try using it for a few polysemic words for a demo, to determine its actual value to us, etc. The cost of getting more complete sets would not be prohibitive. I don't think it could be part of Wiktionary because of licensing. There are really only a handful or two of Wiktionarians that could and would make good use of the data anyway. DCDuring TALK 02:22, 5 February 2016 (UTC)

The mock up that you provided would add little value to English, for which we would want frequency data at the very least. I guess I am thinking principally of improving the quality of English definitions, not just of providing a home for translation targets out of principal namespace. Perhaps I should add the kind of table I am thinking of. DCDuring TALK 21:14, 6 February 2016 (UTC)
Hmm, yes, that would help me to understand what you're thinking of. Recording which collocations are most common? (Some usage notes already do that — added by you, I think; thanks!) - -sche (discuss) 21:34, 6 February 2016 (UTC)
See Talk:goods. DCDuring TALK 22:51, 6 February 2016 (UTC)
Is "frequency of collocation" how often that collocation occurs in the corpus? Then what are "total frequency" and "mutual info"?
Access to this kind of information seems like it would be helpful to Wiktionary in determining which collocations to list (and in what order). The frequencies of the different collocations might also be of interest to some readers. Is COCA copyrighted? Wiktionary should consider whether it would be infringing COCA's copyright to repeat such information in a large number of entries.
The table is quite large; obviously, it could be made collapsible. Another possibility would be condensing it radically into the list format a few entries (usage notes) already use: Words which collocate with goods: (goods and) services ([data from whichever field best indicates how often this collocation is relative to other collocations goes here]), consumer (goods) ([data]), [...]. It would also be possible to combine the table's data with translation tables, by putting the information from each row of the table into the "gloss" atop each translation table.
- -sche (discuss) 08:22, 7 February 2016 (UTC)
"Frequency of collocation" is frequency of the collocation (occurrence within 4 words of the main term, before or after) in the corpus, "total frequency" is the frequency of the individual term (eg, services) in the corpus. "w:Mutual information" is a measure of the strength of the association between the terms (ie, between goods and services).
I've started using this. (See sheer.) It is most helpful for polysemic words, but also helps determine whether a term is polysemic. I have had a brief e-mail discussion with Mark Davies at BYU who has pulled this together. I think we can experiment with this for quite a while before we would have to consider our options. I know that he would not be happy with our license terms.
Ultimately the presentation would probably be most useful if we grouped the collocates by the definition(s) most appropriate for them and their part of speech and presented them in decreasing order of frequency, first by PoS, then by term.
But the table that would be most useful for a contributor is one exactly like what is in [[goods]], but with a fuller list of collocates of the headword. We need such a voluminous table if we hope to cover all (well, most, actually) of the definitions in some of our polysemic entries. So I don't think they could fit in translation table headers. DCDuring TALK 00:12, 9 February 2016 (UTC)
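For anyone who wants to experiment with the three columns described above, here is a minimal sketch in Python. It assumes a corpus that is already tokenized into a flat list of words, and it uses plain pointwise mutual information, log2(f_xy * N / (f_x * f_y)), as the association score. That formula is one common formulation and is an assumption here; it is not confirmed to be how COCA computes its "mutual info" figure. The function name `collocation_stats` and its parameters are hypothetical.

```python
import math
from collections import Counter


def collocation_stats(tokens, node, window=4):
    """Return {collocate: (collocation_freq, total_freq, mutual_info)}.

    collocation_freq: how often the collocate occurs within `window`
        words before or after an occurrence of `node`.
    total_freq: how often the collocate occurs anywhere in the corpus.
    mutual_info: pointwise mutual information (an assumed formula,
        not necessarily the one COCA uses).
    """
    total = Counter(tokens)
    n = len(tokens)
    colloc = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        for j in range(lo, hi):
            if j != i:
                colloc[tokens[j]] += 1

    stats = {}
    for word, f_xy in colloc.items():
        mi = math.log2(f_xy * n / (total[node] * total[word]))
        stats[word] = (f_xy, total[word], mi)
    return stats


def ranked(stats):
    """Collocates in decreasing order of collocation frequency."""
    return sorted(stats.items(), key=lambda kv: kv[1][0], reverse=True)
```

Sorting by the first field gives the "decreasing order of frequency" presentation; grouping by part of speech or by definition would need tagged data, which this sketch does not attempt.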

adverbs Monday, Tuesday etc.

Yep, that's fine what you have done. I considered doing that, but I thought some "wise guy" would come along and revert my edits denying the truth, hence the wording I used. Cheers! Donnanz (talk) 23:32, 8 February 2016 (UTC)

Do Brits not say that? --WikiTiki89 23:40, 8 February 2016 (UTC)
Nope. On Tuesday, on Tuesdays etc. It's not the same as on the west side of the pond. Donnanz (talk) 23:46, 8 February 2016 (UTC)
I mean, we also say "on Tuesday", but not always. In fact I would probably say that "on Tuesday" is more common than just "Tuesday" adverbially. I'm trying to think whether there is a pattern to when the "on" is dropped. --WikiTiki89 23:54, 8 February 2016 (UTC)
I have heard it minus "on" on American media, also read it in American literature. I tend to notice the usage when I see or hear it. There are also entries in Oxford saying this. Donnanz (talk) 00:14, 9 February 2016 (UTC)
I'm not saying it doesn't exist. It is pretty common, but what I'm saying is that I don't think it is the primary usage. And I'm wondering whether there's a pattern to how it's used. --WikiTiki89 01:31, 9 February 2016 (UTC)
FWIW I ran the numbers and ngrams confirm the regional divide; it's about twice as common to say "work on Tuesday(s)" as "work Tuesday(s)" in US books, whereas in UK books the "on"-less forms are too rare to register. - -sche (discuss) 06:06, 9 February 2016 (UTC)


Could you look at the IPA for Michäas? This is just my guess. —JohnC5 03:50, 9 February 2016 (UTC)

I've actually never encountered this form. In Ngrams, it seems to have been about 1/20th as common as Micha until the 1870s, thereafter about 1/50th as common (with a spike in 1952) until the 1980s, and thereafter trending sharply downwards towards about 1/1000th as common by 2008. I would pronounce it the way you noted. - -sche (discuss) 04:56, 9 February 2016 (UTC)
Thanks! —JohnC5 05:51, 9 February 2016 (UTC)


So the term girl can't be used in a platonic context, but only in a romantic? Ubuntuuser13 (talk) 03:00, 16 February 2016 (UTC)

Nvm, I've opened up an RfV for it. Ubuntuuser13 (talk) 03:10, 16 February 2016 (UTC)
Definition one ("A young female human") and two ("Any woman, regardless of her age") and five cover non-romantic use, do they not? Are there citations where "girl" means "a female friend" as opposed to "a [young] female (who may or may not be a friend)"? That might clarify matters. As it is, it seems like someone calling a female friend a "girl" is comparable to someone calling a blond-haired friend a "blond" — it doesn't cause "blond" to mean "a blond-haired friend", it's just the general definition. Usage like "girl, let's go see Andy!" is sense 5, the term of endearment. - -sche (discuss) 03:12, 16 February 2016 (UTC)
Just so you know I've opened up an RfV. Ubuntuuser13 (talk) 03:26, 16 February 2016 (UTC)
Thanks. - -sche (discuss) 02:13, 17 February 2016 (UTC)

Category:French Translingual

Category:Regional Translingual --Romanophile (contributions) 13:52, 16 February 2016 (UTC)

Thanks for noticing those. - -sche (discuss) 02:13, 17 February 2016 (UTC)

ISO codes

Hey thanks for updating according to the new ISO standards. I was noticing when adding ancestors that we are missing some of the more newly added ISO codes. Is there anywhere I could look for a list of the new, merged, and deleted codes? —JohnC5 05:32, 24 February 2016 (UTC)

You're welcome. Changes to the standard are published here. :) I'm going through the 2015 changes now. - -sche (discuss) 05:41, 24 February 2016 (UTC)
I'll let you do it then! Tell me if you need any help. I'm a little terrified of sorting Austronesian. —JohnC5 05:50, 24 February 2016 (UTC)
The key to understanding the structure of the Austronesian languages is admitting that there isn't much: you have Malayo-Polynesian and the various Formosan branches, but within Malayo-Polynesian there are no widely accepted proto-languages outside of Oceanic and an assortment of local groupings. There are areal phenomena and substrata that let you classify Malayo-Polynesian into major subgroups, but those subgroups aren't genetic at all. Blust reconstructs all kinds of things, but his approach is to plug word lists into cladistics software designed for plant and animal taxonomy to produce trees, then use comparative reconstruction on the branches. His Austronesian Comparative Dictionary has an extremely impressive amount of data, but he regularizes the orthography, so you need to check the spelling against other sources, and you can't trust the comparative stuff due to his methodology. Chuck Entz (talk) 08:18, 24 February 2016 (UTC)
...So we're doomed until better research is done? :( JohnC5 15:29, 24 February 2016 (UTC)


You added the entry for "woman" here for Makalero, which is incorrect. Huber, the source you cited, simply states that it is Tongan, and that entry already existed there. I think that in your haste to create entries in a maximal number of languages, you may have made more errors that won't be caught for quite a while (you got lucky in that this one happened to turn up on my watchlist, and I felt it very unlikely that Makalero would borrow a vocabulary item like that from Polynesian, so I checked). In any case, I appreciate your project, but I think you need to take a lot more precaution to avoid these kinds of mistakes. —Μετάknowledgediscuss/deeds 06:51, 26 February 2016 (UTC)

Oh, you're right about Huber; I'm glad you caught that. I've been going through and checking my previous additions ever since Chuck's caution in the previous section that the Comparative Austronesian Dictionary (which had been recommended to me as a valid reference on Talk:water, when I was trying to verify the translations people had added there) normalizes orthography and so has to be checked against other sources. So, I hope to uncover any other errors. - -sche (discuss) 16:42, 26 February 2016 (UTC)
I see, Liliana should not have said that. Yeah, you can't really use Blust as a primary source for something serious, although the orthographic concerns run deeper; some of these languages are well nigh unwritten, and linguists might just put them in IPA. Thanks for going through them, anyhow. —Μετάknowledgediscuss/deeds 17:04, 26 February 2016 (UTC)
Looking at diffs of all my edits to water and woman, and so checking not only anything I added but any word I changed the spelling of and any language I updated the name of, and sometimes spot-checking things I had nothing to do with, I've found other references for the translations of water and woman into Äiwoo, Aklanon, Alaba, Alune, Antillean Creole (and added Guianese Creole and a usex to Haitian Creole), Anuta[n] (and Tikopia), and Arosi, Batad Ifugao, [Palawan] Batak (which we should possibly rename to avoid confusion with the Batak languages like Karo Batak), Bauro, Biak, Biloxi, Binukid, Bontoc (we probably shouldn't have both the macrolanguage code and the dialect codes there), Bughotu, Buli, Casiguran Dumagat Agta, Cebuano, Chewong, Dobu, Dupaninan Agta, Futuna, Fuyug, Gapapaiwa, Gedaged, and Gilbertese (should that language be renamed?). I had to fix Blust's spelling of several things, and fix Arosi and Bauro where he had the 'wrong' word, but the only translations for which I couldn't find any more-reliable references are Bukitan, Embaloh or Ende.
Several days ago, I removed the Ajië and Amurdag translations (not added by me — removed as part of the original project of checking the translations at water) because I couldn't find any references for them.
Abua and Abung things would benefit from more references: the only ref I find for the Abua translation of water (added by someone long ago) and of woman is R. Blench's work on the Central Delta languages; I'd prefer if there were additional sources. The Abung translation of water (likewise added long ago) is only in the Austronesian Basic Vocabulary Database (and in placenames, but ABVD is the only reference to define it as a common noun); likewise the translation of woman.
That's all the languages that start with A through G; I'll be going through the rest.
PS other people long ago added translations into several of these languages to the tables of a handful of other entries such as dog, which it may also be useful to check (in case they were working from Blust).
- -sche (discuss) 08:13, 27 February 2016 (UTC)
I've found other references for Hiligaynon (in several spellings, some dated), Isnag, Itawit, Jarai, Jola-Fonyi; Kambera, Kankanaey and Kapampangan (both even with citations of use), Kala Lagaw Ya (the spellings are all attested in dead-tree references, but the division into different dialects is per WP), Kedang and Kumak, Lamaholot, Lamboya, Lavukaleve and Lou; likewise Wandamen, Waray, Waropean, Wedau, Western Bukidnon Manobo, Wogeo, Woleaian, Wuvulu-Aua, Yami, Zaghawa, Zangskari, Zangwal. The Kua-nsi and Kuamasi and Sonaga translations are from the scholar who recently documented those languages and successfully petitioned for them to have ISO codes.
The K. Blaan translation is in ABVD and the word itself is used in Kibo Kbulung dad Fdas, but not glossed there (it might mean "sister" in addition to "woman", like a few other languages' words do).
I can't find [better] references for Kanowit (not added to the table by me).
The Komodo translation I can find a reference for, but it's in Indonesian and only glosses the term as part of longer sentences; likewise Waropean; it would be nice to find a better reference than Blust confirming or denying the spelling. Li'o is only in ABVD. Lawangan and Loniu I find only general references mentioning.
That's all the languages H through L (postscript: through R) and U through Z. - -sche (discuss) 19:51, 28 February 2016 (UTC)
  • You are so wonderfully diligent. If this were Wikipedia, I'd give you some annoying barnstar, but since it's here, you just get my gratitude. As for the points you've raised: the languages you've bolded are obscure enough that it may not be possible to do better for now; I see that Ende is discussed in a book called Deskripsi naskah dan sejarah perkembangan aksara Ende, Flores, Nusa Tenggara Timur, but finding that online appears to be no easy feat. As for the renames, it makes sense not to have a language called "Batak" alone. Google Ngrams show "speak Kiribati" as being insignificant as compared to "speak Gilbertese", but Google Books show more results for "speak Kiribati"; I for one have always called it Gilbertese, and it does seem that the switch has only happened in perhaps the last decade. On the whole, it doesn't seem worth changing. —Μετάknowledgediscuss/deeds 06:15, 29 February 2016 (UTC)
    • I'll second the gratitude. As for Kiribati, the name isn't any more aboriginal than Gilbertese- it actually is Gilbertese (or Gilbert, anyway) modified by the phonotactics of the language. Chuck Entz (talk) 06:49, 29 February 2016 (UTC)
      •  :)
        I learned the other day about the etymology of Kiribati — it makes me wonder what the language was called before its speakers met Gilbert!
        Plain "Kiribati" is considerably more common than "Gilbertese", but I suppose that's due to the fact that the former is also an often-mentioned placename. I'm fine with leaving the language name as is. By the way, I didn't keep a count, but I think (ignoring the hyphens he adds) Blust's spellings turned out more often than not to be the spellings other scholars used. - -sche (discuss) 07:30, 29 February 2016 (UTC)

Category:fr:Mythological locations

I noticed you added this to Champs-Élysées, while removing the entry from Category:fr:Fictional locations. Just thought I'd let you know that no such category currently exists. Purplebackpack89 19:06, 29 February 2016 (UTC)

Thanks. I've created it. - -sche (discuss) 21:31, 29 February 2016 (UTC)


Discussion moved to WT:LTD.

zerkreuzen and other things

Thanks for the catch! I am indeed aware of the fact that we don't use the IPA ligature, and I do indeed copypaste from de.Wikt. I normally will catch those, but I also forget, as you saw. By the way, if you'd like to take a break from your wonderful work updating the mod:languages data, I could use the help of some more German editors. Kenny and I have written a new mod:de-headword that is already running the nouns, proper nouns, and adjectives, and has the verb logic written but not in use. The new logic allows us to detect the inflection type (strong, weak, irregular, etc.) of verbs automatically, which means that they may all be merged under {{de-verb}}. It also means, however, that we need to manually sort the current transclusions of de-verb into {{de-verb-weak}}, {{de-verb-strong}}, or {{de-verb-irregular}}. Once that is done, we'll switch {{de-verb}} to the new module, then have a bot merge all the other templates into it. If you'd be willing to help move the remainder of de-verb's transclusions, I would appreciate the help, but only if you have the time. Regardless, thanks for the IPA fix! :) JohnC5 06:15, 2 March 2016 (UTC)

It's been a while since I used the de-verb templates, so I'll have to refresh myself on all the parameters, but I'll try to help out. - -sche (discuss) 08:22, 3 March 2016 (UTC)


You wrote "as you have been told previously, such obsolete invariant forms aren't listed in entries' tables". Where have I been told that? -Random187056 (talk) 22:51, 8 March 2016 (UTC)

When you've added RFCs to other entries requesting that such invariant forms be added to the tables. - -sche (discuss) 23:02, 8 March 2016 (UTC)

Languages that use the IPA

Are there seriously languages that adopted the international phonetic alphabet? I know that it’s possible, but I thought that every language with writers simply had its own alphabet. --Romanophile (contributions) 05:19, 9 March 2016 (UTC)

It’s not that they adopt IPA, it’s that the only documentation available for a lot of languages is articles published by linguists. These tend to use IPA out of convenience, without any intention of establishing it as the language’s official writing system. — Ungoliant (falai) 05:29, 9 March 2016 (UTC)
Right. That said, a few languages have adopted IPA or IPA-like alphabets. - -sche (discuss)


Hi -sche. I'm not sure whether the ping on Talk:sy³³ worked properly - I'm having doubts about the use of IPA to write languages mostly unwritten or lacking a writing system. Could you point me to the policy on this? Wyang (talk) 05:19, 9 March 2016 (UTC)

Holy shit! We both made the same topic simultaneously! --Romanophile (contributions) 05:20, 9 March 2016 (UTC)
As a side note, Bai language has a Latin-based writing system: see for example *g-sum, *ts(j)i(j) ~ tsjaj, *(s/r)-ma(ŋ/k) and *k-m-raŋ ~ s-raŋ. Wyang (talk) 05:23, 9 March 2016 (UTC)
In the general case: if we are to include words from languages which have been written down using IPA and which have not been written down in another way -- and they meet the criteria for inclusion, so I don't see a basis for excluding them -- how else would you suggest including them, if not in the way that other references do? (That's not a rhetorical question.)
I don't know that we have (m)any policy pages spelling out which scripts to include languages in (except some language-specific policies allowing multiple scripts, e.g. allowing Cyrillic Romanian). De facto we've had entries like this for years, e.g. naːnʔ³³, paʔ²⁴, and wã³nũ³tũ̱³ka̱³txi³su².
In this particular case, if there is another script we can match these entries to (either Chinese, or a Latin script), and you want to make the argument that these should be mapped to and moved to that script even if it's not the one they're attested in, that's OK by me.
- -sche (discuss) 05:44, 9 March 2016 (UTC)
For the first question, I think it would be best to hold off on creating entries in that language, until a substantial amount of studies have been done on that language. The status of having a writing system, or at least achieving transcriptional consistency in scholarly studies, should be used to assess whether transcriptions for a rarely attested language have become relatively stable. I don't have a strong opinion on this, however.
Regarding Bai language, here is a picture of the word "water" in Latin-scripted Bai: I think the Bai languages should be grouped together, and recorded using this writing system. Wyang (talk) 06:21, 9 March 2016 (UTC)
We can't exactly "hold off"; that's antithetical to "all words in all languages". As for the Bai lects, do you have a source that supports grouping them? I'm inclined to agree with you just because you are so much more knowledgeable about that part of the world, but evidence helps. —Μετάknowledgediscuss/deeds 06:37, 9 March 2016 (UTC)
This suggests that though they grade into one another, there are enough differences across the groups of lects that there is unintelligibility. That suggests that multiple centres of intelligibility may be a better way to capture what's going on, even if the ones the ISO uses are slightly arbitrary. —Μετάknowledgediscuss/deeds 06:41, 9 March 2016 (UTC)
"All words in all languages" is a simple enough catchline that summarises this project reasonably succinctly. We, however, do not aim to record all words in all languages, for example all the words in agglutinative languages, or transcribing words in a previously-undocumented language singlehandedly. We record lemmata and certain non-lemmata in as many languages, in forms these words are usually recorded in. (cf. the policy on neologisms)
Bai languages have a fairly well-conserved set of basic vocabulary across varieties, and are perceived by speakers and usually handled in studies and dictionaries as varieties of a single Bai language, which is the reason I'm in favour of the amalgamation. Wyang (talk) 07:11, 9 March 2016 (UTC)
Like Metaknowledge, I'm disinclined to exclude some languages, especially ones about which we have modern (often detailed and careful) documentation, sometimes in the form of entire dictionaries, grammars, and compendia of transcribed stories. We include some old languages from which only one old text or even only one word survives; that is arguably less useful or more prone to error: maybe it happens that the one word was spelled lazily; whereas, ɕy³³ was carefully recorded as the exact word used in 8 of 9 places. We can always move the content later if the community settles on a certain orthography; we do that even when a community of speakers changes from one established, non-IPA orthography to another (e.g. German entries use the currently-used currently-prescribed spellings — not the spellings from a mere 20 years ago — as the lemmas).
I find some PDFs that say they are examples of Bai, and that use xuix (27, 32); the difficulty they pose is that they don't contain glosses/translations, so it's difficult to figure out how to map the scholarly transcriptions into that orthography, or tell what the words in the texts mean.
Most references I can find do speak of "Bai" or "the Bai language" (or "Bai Dialect[s]") as if it were a unitary thing, so I'm not opposed to merging and making liberal use of {{label}}s and {{a}}s. I would have entered the words as only one language if there had been a single code for that. (Interestingly, a lot of its "fairly well-conserved set of basic vocabulary across varieties" is borrowed from Chinese — 47 of 100 Swadesh items.)
If we merge the Bai varieties, do you think it's better to repurpose one of the varieties' codes as the code for the whole language (as we tended to do in the past, e.g. with acf/gcf), or create a new (longer) code from scratch (as we've tended to do recently)? - -sche (discuss) 08:55, 9 March 2016 (UTC)


We've already blocked Willy2000 for mass creation of entries in languages they don't know based on non-English Wikipedia entries. They just created some more as an IP, at least some in languages you've worked with. Could you check those entries? I would also appreciate your opinion on whether to start mass-deleting their entries in an attempt to get them to stop. Chuck Entz (talk) 13:34, 16 March 2016 (UTC)

He could also just nuke his work, but that’s kind of a lazy thing to do (in my view). Still, I can understand why he’d prefer that. --Romanophile (contributions) 13:42, 16 March 2016 (UTC)
By "mass-deleting", I was referring to what we call nuking. As for laziness: volunteer time from people who are knowledgeable enough to check Willy2000's edits is a precious resource that shouldn't be wasted on following people around to clean up their messes, unless those volunteers want to do it. Chuck Entz (talk) 13:55, 16 March 2016 (UTC)
The Pennsylvania German words were correct. @Kolmiel can probably shed light on which Central Franconian spellings should be made the lemmas and which should be alternative forms, but the spellings this user entered are at least correct as alternative forms. Entries created based on Wikipedia articles could be wrong (in the BP we're discussing how some Wikipedias make up words), but when it comes to basic concepts like these, they're probably correct (which is probably why the user thinks it's OK). At least, it will normally be possible to find out what the correct terms are and move the entries (for well-documented languages, anyway; I'm having trouble finding out about Lombard), so I wouldn't nuke all the entries, but maybe the ones that it's not possible to find independent confirmation of (like Lombard). - -sche (discuss) 16:11, 16 March 2016 (UTC)
The days of the week? They look okay. Only in Sambsdaach I don't see any need for the -b-; the normal spelling would be Samsdaach, but Ripuarian wikipedia seems to use Sambsdaach, too, for whatever reason. Kolmiel (talk) 15:10, 17 March 2016 (UTC)
Oh yeah, and Freidaach is Moselle Franconian, while all the other forms entered are Kölsch. Kölsch for Friday would be Friedaach. Kolmiel (talk) 15:15, 17 March 2016 (UTC)


Hi. What prompted your removal of my edit to the homoflexible page? You asked me to move it to the homoflexible page from the homoflexibility page, and I did, yet you took it down again. What, may I ask, was wrong with it? I very much want my edit to stay, so if there is anything I can do to make it right, please let me know.

Amuzgo entries

These need some cleanup after your rename of the language. DTLHS (talk) 03:16, 2 April 2016 (UTC)

Indeed. I tried to go through them with AWB yesterday but I couldn't log in (perhaps the same problem Semper mentions that his bot is having). - -sche (discuss) 06:32, 2 April 2016 (UTC)
(Now that AWB is working again,) I think I've fixed all the Amuzgo entries, and now only have to fix a couple dozen translations-table entries. - -sche (discuss) 16:04, 17 April 2016 (UTC)


Did you mean to add this word to the translation table of woman or water? The definition you put says water and I was curious as to whether something had gone wrong. Tulros (talk) 10:52, 17 April 2016 (UTC)

@Tulros Thanks for catching that! I added the Yanomamö translation of water and an entry for it, and then copied and pasted that to create this entry, but forgot to change the gloss (despite updating the references!). - -sche (discuss) 16:02, 17 April 2016 (UTC)


Currently the Russian Wiktionary has a mixed use of lowercase (20%) and uppercase (80%) Palochka. I'm trying to understand what is right. In this discussion on en.wiktionary in August 2012, you stated that "we should use the lower case", but was there any reason or documentation behind this recommendation? I think the Wikimedia project that uses this character the most is the Chechen Wikipedia, and it is totally dominated by the uppercase Palochka. I found 2.7 million occurrences in the latest XML dump. It would be nice if we could find a consensus covering all WMF projects on how to handle this special character. --LA2 (talk) 21:50, 20 April 2016 (UTC)

There's an ongoing discussion at User talk:Stephen G. Brown#Palochka. @LA2: It would make things easier if we didn't have millions of discussions in different places on the same topic. If you feel that someone's input is needed, it's better to direct them to an existing discussion than to start a new one on their talk page. --WikiTiki89 21:10, 21 April 2016 (UTC)

Leftovers from Zarphatic merge

Just on the off chance these were overlooked: see CAT:E. If you just haven't gotten around to fixing them, never mind. Chuck Entz (talk) 15:16, 29 May 2016 (UTC)

Thanks; I had searched for pages containing the language code, but I got the impression that the site was in the middle of updating (to reflect the removal of the code) at the time, which apparently means those few pages were in limbo and didn't show up. - -sche (discuss) 16:24, 29 May 2016 (UTC)
son got missed, probably because I said Zarphatic and the translation in question is Shuadit... Chuck Entz (talk) 03:27, 30 May 2016 (UTC)


Hello, -sche.

  • Just removing ik from the "German Low German" section is not justified and not sufficient as long as there is ik#Plautdietsch. And the proper way to get that entry removed should be to use WT:RfV. Dit un jant opp Plautdietsch has the form ik (e.g. in "Ut de Nacht bün ik kamen") besides ekj. So maybe it's a valid Plautdietsch form.
    Please use WT:RfV if you think that it is not a Plautdietsch word.
  • Wikipedia says that Plautdietsch is an East Low German dialect. So it should be a dialect and not "a separate language". As the German East isn't next to the Netherlands, it should rather be a German Low German dialect and not a Dutch Low German dialect. But well, as the dialect spread through the world, one maybe could argue that it's not German anymore but a (World) Low German dialect.

Greetings, Ikiaika (talk) 04:34, 7 June 2016 (UTC)

==Plautdietsch== (pdt) and ==German Low German== (nds-de) are currently treated as separate languages which are related, like Danish and Norwegian, and for that matter ==Norwegian== and ==Norwegian Nynorsk== and ==Norwegian Bokmal==. Separate languages are not obliged to be linked to each other, and are not supposed to be linked as ===Alternative forms===; they are often mentioned in etymology sections, and sometimes linked in ===See also===.
If you think Plautdietsch should not be considered a separate language, that's another matter; you can see my comments on WT:T:ANDS and Wiktionary:Beer_parlour/2016/April#Let.27s_kill_nds-de.2Fnds-nl. about it.
If you can find Plautdietsch works which use ik, that's great, and means ik#Plautdietsch doesn't need to be RFVed.
I can find German Low German works from Oldenburg and Münsterland which use ik, so ik#German_Low_German is fine, too (and was never removed, despite your comment).
- -sche (discuss) 19:52, 7 June 2016 (UTC)
  • So it rather was "is treated by the English Wiktionary as a separate language" than "is a separate language". Ok, that's a different thing.
    — Although, I have the impression that the German Low German label "in most dialects, including Low Prussian" includes Plautdietsch as a Low Prussian variety. But well, maybe there are no German Low German labels which clearly include Plautdietsch.
  • In this edit you moved Plautdietsch from "Alternative forms" into "See also" (I don't object to this), and also removed the mentioning of the (purportedly) Plautdietsch form ik. But the removal is not justified and not sufficient as long as there is ik#Plautdietsch.
    Well, Dit un jant opp Plautdietsch is just one book, so it wouldn't give three cites which usually are needed to attest a word. Also I can't read the whole book, so the usages of ik could be dated or maybe aren't Plautdietsch as the book could also include German Low German. That is, I don't say Plautdietsch ik is attestable or exists. I'm just saying that it might exist and that Wiktionary says it exists (ik#Plautdietsch).
  • ik occurs in many dialects. But I can't say in which dialects it is attestable for the English Wiktionary (three durably archived cites). For example, this poem has ik too and is from Ravensberg which is in the East of Westphalia. So ik should also appear in East Westphalian. "Niu lustert mol! Plattdeutsche Erzählungen und Anekdoten im Paderborner Dialekt" (1870) from ein Sohn der rothen Erde (a son of the red earth) has ik too, and maybe also ick. But it's just one book, which usually is not sufficient to attest the Paderbornish form.
Greetings, Ikiaika (talk) 00:59, 9 June 2016 (UTC)
Minor note @Ikiaika: Some of this issue stems from your misunderstanding of CFI. We only need one cite to attest a word in a Low German lect, and it can be in a dictionary, not necessarily a use. —Μετάknowledgediscuss/deeds 01:05, 9 June 2016 (UTC)
"Low Prussian" is the variety which was spoken inside Prussia. (It seems to be mentioned frequently because comprehensive references on it are readily available, probably in turn because it had some prestige as the variety of a large leading state.) Plautdietsch is the variety developed outside Prussia among certain (largely Mennonite) emigrants. Wiktionary has tended to keep lects with such different geographic and hence historical development separate, especially among Germanic languages (as I note on WT:T:ANDS) — indeed, we even keep cases spoken in the same place separate (as we keep Nynorsk, Bokmal, Riksmal, and other rural dialects of Norwegian separate under three codes). Merger proposals often prove controversial and get squicky fast; e.g., what would be the rationale for merging the separate(d) lects of GLG and Plautdietsch, but keeping Luxembourgish and T Saxon separate from not only other Moselle Franconian but also all the other varied things we group under gmw-cfr? But what would be the rationale for merging Luxembourgish? The rationale for keeping them all separate is of course the separate geographic/national linguistic development. - -sche (discuss) 15:02, 9 June 2016 (UTC)
@Metaknowledge: Yes, I didn't know how many cites were needed for Low German, and "usually" above referred to the practice of e.g. English and German, not to e.g. Latin and dead languages like Gothic and Old High German. Thanks for the info! That makes many things much easier.
@-sche: (1.) Well, I didn't have the impression that Low Prussian has any more prestige or was or is more common than other dialects (though that might be a wrong impression), and one can find statements like this:
  • "Plautdietsch, das Niederdeutsch aus Westpreußen mit einer über 200 Jahre alten Migrationsgeschichte" (Plautdietsch, the Low German from West Prussia ...)
  • "Plautdietsch ist [...] eine niederpreußische Mundart" (Plautdietsch is [...] a Low Prussian dialect)
  • German Wikipedia: "Plautdietsch ist [...] eine niederpreußische Varietät" (Plautdietsch is [...] a Low Prussian variety), "den ostniederdeutschen Dialekt Plautdietsch" (the East Low German dialect Plautdietsch)
  • English Wikipedia: "Plautdietsch, a Low German variety, is included within Low Prussian by some observers"
That's why I (incorrectly) added Plautdietsch forms as German Low German alternative forms. Similarly some sources or some user could have labeled Plautdietsch terms East Low German or Low Prussian. Here at Wiktionary this label would be incorrect as Plautdietsch is not treated as a part of German Low German or Low German (like you said, thanks for that!), but nevertheless it could be present in some entries. For me that seemed more plausible (again, it might just be a wrong impression).
Just for clarification: It might just be a wrong impression, and I'm not saying that there is any error in an entry and I'm not saying that any or even all Low Prussian labels here should be checked.
(2.) Well, Bosnian, Croatian and Serbian here are merged into Serbo-Croatian. The rationale for this surely was the linguistic similarity, even though there are different nations. So maybe for the same reason Plautdietsch could be merged into Low German as a Low German dialect. I don't have enough knowledge of Luxembourgish, Serbo-Croatian, Low Prussian and Plautdietsch to argue for or against any of this, and I have no intention of making a split or merger proposal.
(3.) I re-added Plautdietsch ik next to the qualifier Plautdietsch in diff as there is the Plautdietsch entry ik (ik#Plautdietsch). I'm okay with a re-removal of it, but please use WT:RfV first. Then both, the mention next to the qualifier and the entry ik#Plautdietsch, can either stay or be removed.
Thanks, and greetings, Ikiaika (talk) 01:24, 13 June 2016 (UTC)
Oh, yes, mislabelling certainly could be present in some entries. Ideally, all nds, nds-de and nds-nl entries' labels would be checked and expanded (and perhaps replaced with a table as discussed on WT:T:ANDS), because many are far too short even when they contain only correct things and no incorrect things: the people who added them apparently weren't sure which dialects besides their own a word or spelling was found in, and so only listed the few dialects they were sure of, which is not entirely unhelpful, but is insufficient.
It is possible that some Plautdietsch-only things have been entered under a wrong header; we've certainly had a few Kashubian words entered as Polish because older dictionaries treat Kashubian as Polish. And quite a few apparently Middle-English- and/or Scots-only words have been entered as English, because some dictionaries (including the OED) don't distinguish those three languages.
Bosnian, Croatian and Serbian are copies of the same Eastern Herzegovinian subdialect of the same Shtokavian dialect, and are so identical that their mutual intelligibility "exceeds that between the standard variants of English, French, German, or Spanish" (per WP, quoting Paul-Louis Thomas). I don't think they provide an argument for merging anything else, heh. :-p
I'll assume that ik is used in Plautdietsch based on the book you found above. It'd be nice to figure out more specifically who uses it, because standard references all seem to have only ekj / etj, but it's not a pressing concern. - -sche (discuss) 19:06, 13 June 2016 (UTC)
Well, I only saw a snippet of the book, and the book could have a Low German text in it (maybe similar to this, which has Plautdietsch with a Dutch translation) or an older Plautdietsch text, like from some kind of *Proto-Plautdietsch when Dutch and Low German were mixing and creating Plautdietsch.
In diff the Plautdietsch entry ik got extracted from the Low German entry. In older versions, like from 25th December 2010, there is no Low German entry but a Low Saxon entry. There it was "Ik kwam, ik zag, ik overwon (nl), Ik keem, ik keek, ik wun (pd)". In diff the nl example got replaced by nds. pd could have meant Low German (Plautdeutsch or Plattdütsch/Plattdüütsch), including both Dutch and German Low German. So it once could have been a Dutch or German Low German example, while later someone misinterpreted the abbreviation and it developed into a Plautdietsch entry. nl:ik#Nedersaksisch has the example as Nedersaksisch.
Based on this I'm using WT:RfV, see WT:RFV#ik.23Plautdietsch.
Greetings, Ikiaika (talk) 12:53, 14 June 2016 (UTC)

Old Italian module errors

Hi, -sche. It seems your edit to roa-oit in the modules caused a bunch of module errors. Please see CAT:E. --Daniel Carrero (talk) 00:33, 30 June 2016 (UTC)

Thanks. I think I've fixed them all. - -sche (discuss) 02:08, 30 June 2016 (UTC)

All those languages

I've added another stack of them to RFM, which is now positively flooded. I'll probably stop adding anything for some time now; I've a planned wikibreak coming up somewhat soon. I'm happy to help add them, but for the most part, a) it's good to have another set of eyes check things and b) it's even better when that set of eyes is as good at being scholarly as yours are. Please ping me whenever you want an opinion or need access to any research materials, etc.

On a related topic, I remembered that I'm still responsible for FWOTD for an indefinite period of time. I set a "barely attested languages" focus week recently and I was annoyed how Eurocentric it turned out to be, especially considering how many barely attested languages there are around the world. I think this merits focus weeks for the barely attested languages of North America, South America, and Australia respectively, probably spaced out over the next 6 months or so. Considering you're making the entries now, I'd really appreciate if you could keep a list of words with particularly interesting meanings (i.e. not just your typical "boy" or "fire") from such languages, either here or at WT:FWOTD/FW or on a userpage, so they can become focus weeks in the future. Thank you so much for all your hard work! —Μετάknowledgediscuss/deeds 05:27, 6 July 2016 (UTC)

auto cat

I don't know if you are aware, but it may make your life easier just using "{{auto cat}}", without the need to use a language code. See diff for example.

The auto cat should be able to work in all POS and derivation categories, but it does not work in categories that use {{langcatboiler}}, like Category:English language. --Daniel Carrero (talk) 01:31, 7 July 2016 (UTC)

Thanks. I was just running a script that replaced the deprecated codes one-to-one with the modern codes, though. I figure replacing explicit templates/parameters with auto-cat can be done by a bot whenever desired. - -sche (discuss) 18:54, 7 July 2016 (UTC)
The nice thing about auto cat is that you can move categories using it without editing them, since it gets the codes from the page name. Changing it to auto cat now means you won't ever have to mess with the wikitext ever again. Chuck Entz (talk) 02:34, 8 July 2016 (UTC)


Just curious, why did you have to do a bunch of fancy deletions and moving rather than just editing the page? --WikiTiki89 17:32, 7 July 2016 (UTC)

When I tried to save the page with updated content, it had no effect - it brought me back to the edit window and hadn't changed the page (and hadn't given an error message, either — and if you look at the content I was trying to add, it was well-formed, AFAICT). This persisted in two browsers and after logging out, but didn't affect my ability to edit other pages or even other modules(!), hence I could tell it wasn't the result of the database being locked. I tried a workaround, and along the way learned that pages have 'types' set, and when moving a non-Module-space page to Module-space, it retains its classification as a non-module... :/ - -sche (discuss) 17:36, 7 July 2016 (UTC)
Hmm... Very strange. Our practice has been to create userspace modules as Module:User:-sche/x, that way it's in the module namespace and still works as a module. --WikiTiki89 18:08, 7 July 2016 (UTC)
Btw I'm making a run of null edits to the pages in CAT:E to clear them. - -sche (discuss) 18:09, 7 July 2016 (UTC)

Smuconlaw vote indentation

Ah, I see what you mean. My issue was that indenting WF's vote under mine made it look somehow "attached" to mine, like a child or subsidiary vote. Not a big deal, though; we can leave it as it is. Equinox 23:29, 12 July 2016 (UTC)

Yeah, it has that negative side-effect, but it seems to be the usual way of handling ineligible votes. I'm not aware of any other way of doing it, unfortunately, short of entirely removing the vote or separating it at the end of all the other votes. - -sche (discuss) 00:51, 13 July 2016 (UTC)


Do you think there is enough evidence for it? I've found some scholarly references for this, but I don't believe the language itself has been reconstructed, only elements. —JohnC5 21:23, 29 August 2016 (UTC)

LaPolla reconstructs a Proto-Qiangic first-person actor suffix *-ŋa and second-person suffix *-na and some other things; other references mention other words, like *pram "white" (whence the Prinmi / Pimi / Pumi name, apparently); but several scholars such as Matisoff note that a systematic reconstruction has not been undertaken. There is extensive information on various sub-branch proto-languages (I'm not sure if that's what you mean by "elements" or if you're referring to reconstructing e.g. the lack of tone, or the phoneme *f as opposed to a word *foobar). Dominic Yu reconstructs Proto-Ersuic (I may add it later), and Guillaume Jacques and Alexis Michaud reconstruct Proto-Naic in connection with their argument that Qiangic should be called "Na-Qiangic" with Ersuic and Naic being considered separate branches alongside Qiangic rather than branches of Qiangic per se. One scholar (Chirkova) argues that Qiangic is not a family at all but rather a diffusion area, but more other scholars support a genetic relationship. But I added the family code not so much to reconstruct a proto-language as because it seemed odd to have a large number of languages sorted directly into the highest-possible family (Sino-Tibetan), lol. - -sche (discuss) 22:06, 29 August 2016 (UTC)
I'll admit that I kind of want this so that Kenny's Module will sort them into a subgroup. —JohnC5 02:31, 30 August 2016 (UTC)
That module ought to be (and I thought at one point it was) adapted to accept families and not just proto-languages. :-p As a side effect, that might encourage people to add families, the way the module's initial creation led to a rush of adding "ancestors" (which were, however, often redundant to the family info). - -sche (discuss) 02:37, 30 August 2016 (UTC)
You're right: it had been suggested! (@kc kennylau *hint, hint, nudge, nudge, wink, wink, cough, cough, gasp, gasp, asphyxiate, asphyxiate*)JohnC5 02:42, 30 August 2016 (UTC)
@JohnC5: Please edit my ANC yourself; I'll be quite busy for the next few months. --kc_kennylau (talk) 02:50, 30 August 2016 (UTC)
Hmmm, I'm not sure I'm clever enough to do so. @CodeCat, might you be free? :DJohnC5 03:09, 30 August 2016 (UTC)

About the "Visual description" section

I added separate support/oppose/abstain options for each name in Wiktionary:Beer parlour/2016/August#Poll: Description section.

I suppose your comment "This name is too long, IMO." about "Visual description" should be counted as an oppose vote? --Daniel Carrero (talk) 00:03, 30 August 2016 (UTC)

Cascading protection of main page

I remember the BP discussion about this where I voiced my concerns, and I didn't remember that you ignored said concerns and proceeded to remove cascading protection. As a result, we had vandalism on the main page for more than an hour today because an anon edited the FWOTD template. The main page needs cascading protection unless you (or someone else) takes it upon themselves to find a way to protect everything that goes on the main page individually. —Μετάknowledgediscuss/deeds 00:43, 16 September 2016 (UTC)

FWOTD templates should be protected the way we protect WOTD templates, which is to say, individually (protect only those pages which need protection) so that all the modules that transclude them aren't restricted as collateral damage. For what it's worth, I didn't ignore your concerns, you voiced them after I acted and you were the only editor to voice concerns. - -sche (discuss) 00:55, 16 September 2016 (UTC)
Thank you. Sorry for my tone; I was rather angry about the vandalism, but I shouldn't have been so accusatory.
Obviously, protecting all the FWOTDs individually is not reasonable, given that a new one is created for each day. However, there is still a scenario where the next day's FWOTD can be vandalised and nobody will notice before it appears on the main page (as happened in this case). Can we get that one protected as well? —Μετάknowledgediscuss/deeds 02:54, 16 September 2016 (UTC)
It's alright; I understand how easy it is to get stressed when something goes wrong, like vandalism. I'm sorry that my reply was also snippy. I think we could create a page that would load the next day's FWOTD (and then protect that page the way Template:FWOTD main is protected), if we can find the magic words to do it (I mean, the Mediawiki magic words). I've tried here (btw I am open to any suggestions if there are better names for these pages); I haven't got it working yet, but it seems like it should be possible because the preload templates for creating new votes seem to manage it. However, there doesn't seem to be a pre-made magic word for "tomorrow" (or is there?) the way there is one for "current day", so we might have to have the page load several possible templates to account for "tomorrows" at the end of months, years, etc. - -sche (discuss) 02:49, 17 September 2016 (UTC)
Yep, looks like you'll have to special-case them all (or ask at the GP to get further advice). —Μετάknowledgediscuss/deeds 03:54, 17 September 2016 (UTC)
Or you could use #time: instead of magic words. I've fixed Template:FWOTD tomorrow so it would show tomorrow's FWOTD- if there was one. Chuck Entz (talk) 06:20, 17 September 2016 (UTC)
Thanks! And yes, you've noticed that I tend to wait until the last moment to set FWOTDs. It's a bad habit, but I'm always hoping for something brilliant to be nominated. —Μετάknowledgediscuss/deeds 06:46, 17 September 2016 (UTC)
Aha, thank you! Are the workings of #time documented anywhere (presumably on mediawiki-wiki)? - -sche (discuss) 06:56, 17 September 2016 (UTC)
Yes, here, which is linked to from the magic words page you linked to above. When I looked at the code in {{timestamp}}, I saw it used there, so I knew we had the extension installed. Chuck Entz (talk) 07:29, 17 September 2016 (UTC)
  • Not sure why, but your fix didn't work... an anon went and created the next FWOTD before I got a chance to (Special:Contributions/ I'm still pretty perturbed by how easily anons can edit stuff that ends up on our front page — luckily this one isn't a vandal. —Μετάknowledgediscuss/deeds 00:57, 7 November 2016 (UTC)
    @Metaknowledge I'm sorry for the late response. I believe the reason that edit wasn't stopped is that the protected page we're using (and even the main page, if we moved the code and protection back to it) loads the upcoming FWOTD and then cascades protection down onto it. If the FWOTD doesn't exist (if no November 7th FWOTD has been created yet), the code can't load it and hence can't protect it. - -sche (discuss) 16:34, 14 November 2016 (UTC)
    Well, that's pretty screwy. Any ideas? @Ungoliant MMDCCLXIVΜετάknowledgediscuss/deeds 19:00, 14 November 2016 (UTC)
    It's probably like -sche described. Not sure how to avoid it; perhaps we should make the template load missing FWOTDs from the FWOTD of the corresponding day in 2013. — Ungoliant (falai) 19:08, 14 November 2016 (UTC)
    Actually, that's probably a good idea anyway, as a failsafe. —Μετάknowledgediscuss/deeds 19:22, 14 November 2016 (UTC)
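The logic discussed in this thread (work out tomorrow's date by date arithmetic rather than special-casing the ends of months and years, then fall back to the 2013 entry when tomorrow's FWOTD has not been created) can be sketched outside wikitext. This is only an illustration in Python, not the code in Template:FWOTD tomorrow; the title pattern `FWOTD/YEAR/MONTH DAY` and the function names are assumptions made for the example.

```python
from datetime import date, timedelta

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def fwotd_title(day):
    """Build a page title for a given day (hypothetical naming pattern)."""
    return f"FWOTD/{day.year}/{MONTHS[day.month - 1]} {day.day}"


def tomorrows_fwotd(today, existing_titles, fallback_year=2013):
    """Return the title to load for tomorrow, with a failsafe.

    Adding one day handles month and year boundaries automatically,
    so no special cases are needed for 31 December and the like.
    """
    tomorrow = today + timedelta(days=1)
    title = fwotd_title(tomorrow)
    if title in existing_titles:
        return title
    # Failsafe: reuse the entry for the same calendar day in the fallback year.
    if tomorrow.month == 2 and tomorrow.day == 29:
        # 29 February may not exist in the fallback year, so reuse the 28th.
        fallback_day = date(fallback_year, 2, 28)
    else:
        fallback_day = tomorrow.replace(year=fallback_year)
    return fwotd_title(fallback_day)
```

With this approach the page being loaded always exists (assuming the fallback year is fully populated), which is the precondition for cascading protection to reach it.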

RTL or LTR marks

Hi, what is the best way to see these marks? The diff does not show them. I'm sure they were added by copy and paste, but how does one copy/paste without adding these marks? --Panda10 (talk) 22:42, 16 September 2016 (UTC)

I don't know how to avoid copying them when copying text, but one can delete them after pasting text by using the delete key on the places where one "knows" they're hiding, either from experience (e.g., I know Google Books uses them before and sometimes after authors' names) or — I use a Firefox extension that lets me highlight text and then "identify characters", showing the Unicode values and names of each character and revealing any hidden marks like those; Chrome probably has a similar extension. Obviously, it's inconvenient to check any and every text one pastes, so I wouldn't worry about doing that: it's easy enough to do periodic AutoWikiBrowser runs to eliminate them from a database-dump list of all pages that use them. Ideally we would resurrect the AutoFormat bot and it could probably remove the marks automatically.
PS there's a (RTL-mark-free) arrow → in the "Miscellaneous" field of the "Edittools" below the edit window when you edit a page.
- -sche (discuss) 03:11, 17 September 2016 (UTC)
Ok, thanks. --Panda10 (talk) 12:34, 17 September 2016 (UTC)
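For anyone who wants to check pasted text without a browser extension, the "identify characters" step and the cleanup can be sketched in a few lines of Python. This is a minimal sketch, not the AutoWikiBrowser or AutoFormat procedure; it assumes the marks in question are U+200E (left-to-right mark) and U+200F (right-to-left mark), and the function names are made up for the example.

```python
import unicodedata

# Assumption: only the two invisible directional marks are of interest here.
DIRECTIONAL_MARKS = {"\u200e", "\u200f"}


def find_marks(text):
    """List (position, code point, Unicode name) for each hidden mark."""
    return [
        (index, f"U+{ord(char):04X}", unicodedata.name(char))
        for index, char in enumerate(text)
        if char in DIRECTIONAL_MARKS
    ]


def strip_marks(text):
    """Return the text with the directional marks removed."""
    return "".join(char for char in text if char not in DIRECTIONAL_MARKS)


pasted = "\u200eJohn Smith\u200f"  # e.g. an author's name copied from a web page
print(find_marks(pasted))
print(strip_marks(pasted))
```

Running `find_marks` first shows where the marks are hiding; `strip_marks` then leaves all visible characters untouched.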


How many genders does Nasioi have? --Romanophile (contributions) 22:18, 17 September 2016 (UTC)

Technically maybe no genders, although it has a lot of noun classes. Poking around, I see that John McWhorter claims in two different books that it has "a hundred" or "two hundred genders", but he is apparently oversimplifying/mislabelling a system of noun classes. (In fairness, Ranko Matasović, in Gender in Indo-European, while mentioning various non-IE languages for comparison, says "languages such as Nasioi (of southern Bougainville) have a nominal classification category which is intermediate between a true gender system and" a noun-class system. And what's the difference to a layman?) Nasioi attaches enclitics to adjuncts in a sentence, based on the class and enclitic of the head noun, and these classes are very specific, e.g. raampu "tens of sago shingles", ruʡ "fluid", ruta "eye", va "house", vari "tree", vo "mother and children" (per William Foley, The Papuan Languages of New Guinea, saying Hurd 1977 has a comprehensive list). - -sche (discuss) 03:24, 18 September 2016 (UTC)
Noun classes are genders. This sounds significantly different from that, though, whereas Bantu noun classes function remarkably similarly to SAE genders (although SAE doesn't make verbs agree for gender as well). —Μετάknowledgediscuss/deeds 06:26, 18 September 2016 (UTC)
Conceptually, most of the time. But certainly the works above make a distinction, at least on the level of terminology. Probably "gender" is more likely to be used when there are only a few classes. - -sche (discuss) 22:16, 18 September 2016 (UTC)


Hi -sche, thanks for your attention to this page. Rather than go back and forth with edits to the page, I wanted to bring up a couple of points here…I'm happy with "now sometimes proscribed" (though personally I still think that's giving too much ground, especially outside the US), but I really don't think citations support an "informal" tag. It's used in all kinds of formal English. The only reason I think this is so important in this case is that – the idea that this is "informal" usage is put forward as a matter of assertion by certain sides of the debate; it can't be justified historically and I'm doubtful it can be justified now. I'd also ask you to reconsider the Guardian quote that you removed, which I think is pretty clearly talking about "men" and "women" as separate classes of people, not as different social constructs.

Part of the problem, maybe, is that any use of the "sex" meaning is now inevitably influenced by the "social construct" meaning, so that assigning citations to one definition or the other is quite hard – in many people's heads the word probably exists in a vague superposition between the two. I'm just looking at the OED's entry, and under the "males or females viewed as a group; sex" definition, they add the following note: "Originally extended from the grammatical use at sense 1 (sometimes humorously), as also in Anglo-Norman and Old French. In the 20th cent., as sex came increasingly to mean sexual intercourse (see sex n.1 4b), gender began to replace it (in early use euphemistically) as the usual word for the biological grouping of males and females. It is now often merged with or coloured by sense 3b." Which sums up the difficulty, I guess, although their own entry has no problem putting citations of the "both genders" type under the "sex" definition. Any thoughts? Ƿidsiþ 13:21, 27 September 2016 (UTC)

Thank you for your edits to the entry, especially adding the missing electrical sense. (Does one also speak of the "sex" of plugs as well as the "gender"? I see one book saying "For every cable connection, the cables that plug into a connector must be of the opposite sex." But it's hard to find any more because the other meanings are so common, hah.)
Re the "informal" tag, you're right that it wasn't informal historically; the usage notes touch on that. Perhaps "now chiefly informal and sometimes proscribed"? Or I guess just "now sometimes proscribed" would work, inasmuch as people would probably realize that formal works would not be likely to use proscribed words/senses. As you note, uses of "gender" to mean "sex" are usually ambiguous now, because the other reading is usually also possible.
In my view, the Guardian citation more likely refers to social categories, but is ambiguous in any case. If I say I think reviewers of Sherlock are judging Vinette Robinson based only on looks and not acting, someone might say: "well, Rupert Graves faces the same problem, I think both genders are judged on looks." And if I say I think reviewers of #Hashtag are judging Jen Richards only on looks, the same person might say: "but Ryan Crice is judged the same way: again, both genders are judged on looks." Occam's razor suggests the person means "gender" the same way each time, and the second example makes clear that they're talking about the visible social categories of 'men' and 'women' to which people belong based on how they live their lives, present themselves, etc, and not the genital/gonadal/chromosomal/etc ('sex') categories. (If the speaker meant "both sexes", the second sentence wouldn't make sense, since it's my understanding that the actress Jen Richards and the actor Ryan Circe are of the same sex.)
Maybe the "biological category" and "social category" senses should be subsenses of a broad sense to the effect of "a category such as 'male' or 'female'" (", to which organisms belong based on biological or social factors]"?), paralleling how the grammatical senses were made into subsenses...? Then ambiguous citations — and pre-1900s ones! — could go under that broad sense. When writing the sense, we would need to keep in mind that there are works that use this word in talking about real and fictional species with more than two genders/sexes and societies with more than two genders.
- -sche (discuss) 06:52, 28 September 2016 (UTC)

North American native languages project

At this page is information about a native language preservation project based at the computer science department at Southern Oregon University. Have you heard of them?

One of their projects involves dictionary software. I wonder whether there is any kind of cooperation that makes sense with respect to:

  1. a smooth interface between their software and ours (both ways)
  2. getting content from projects associated with them
  3. using their software to encourage users to create specialized glossaries from our content.

- DCDuring TALK 12:17, 3 October 2016 (UTC)

I was not aware of that site; thanks for linking it. (I have seen a few similar sites run by other universities.) They say their project "provides web-based export of information", but I don't know how they / the dictionary-authors who are using their format would feel about letting us import their content (which would release it under our licence). They might be interested in importing our data into their format, since they could do so for free as long as they attributed us per our licence, but there are not many North American languages that we document more extensively than other online (single-language) dictionaries, e.g. our coverage of Cheyenne is tiny compared to the online Cheyenne Dictionary. - -sche (discuss) 02:09, 4 October 2016 (UTC)
Willingness to share depends on attitude and situation. Death, retirement, loss of funding, etc might cause some of these databases to become available. Facilitating export to their format might yield future benefit in the form of increased willingness of dictionary makers to let us have their data when the time comes. DCDuring TALK 12:36, 4 October 2016 (UTC)

Verb form of "king" at article on "kings"

I agree that there was already a definition of the verb on the page for king, but shouldn't the page for kings have information that the word "kings" is the third-person singular simple present form of the word "king" ? The lack of any mention of the verb form was why I put the verb in at "kings". Is this kind of back pointer common for verbs? I know the back pointer is there for nouns. ie: "plural of king" Bcent1234 (talk) 21:16, 18 October 2016 (UTC)

You put the definition of the noun "kinging" ("the action of promoting...") into the verb [[kings]] and used {{head|en|verb}} as if [[kings]] were a lemma, which was incorrect on a number of levels. If you look at an entry for any third-person verb form, e.g. looks, you can see how they are formatted: {{head|en|verb form}} # {{en-third-person singular of|king}}, like so. :)
Incidentally, I'm not convinced "(poker slang) a pair of kings" is really a sense of "kings" distinct from "plural of king"...
- -sche (discuss) 22:59, 19 October 2016 (UTC)
  • By the logic for inclusion that we follow. Because one could be confused as to whether kings in poker meant "2, 3, 4 kings", we would include it. DCDuring TALK 10:10, 20 October 2016 (UTC)
    • But is it really limited to two kings? Wouldn't you say "there were kings and aces scattered around the room" if poker players threw a couple of packs of cards in a fit of rage? How is this different from defining "rackets" as "(tennis slang) a pair of rackets, or (doubles tennis slang) a foursome of rackets" because that's how many rackets tennis is played with? - -sche (discuss) 19:27, 20 October 2016 (UTC)
You could be right. It's just a matter of fact. Unfortunately (?) I don't spend much time playing poker or watching/listening to others playing poker. DCDuring TALK 19:45, 20 October 2016 (UTC)
In poker, out of the final five-card hand, if you say you have "kings" that means you have a pair of kings and not more than that. If you have three kings, you would say you have "trip kings", or if you have four, "quad kings". It's of course not limited to kings and applies to all cards. And of course if someone dropped the deck, you can still say "there are aces and kings scattered on the floor", because having a poker-specific sense does not imply that all other senses suddenly don't exist. Having said that, I still don't think we necessarily need to have this sense. --WikiTiki89 19:53, 20 October 2016 (UTC)


You said to leave a post on your talk page about the Zigeuner entry, but I just started and want to follow Wiktionary recommendations, so we should talk about this in the [Room]

Dzungalo77 (talk) 21:38, 28 October 2016 (UTC)


Hi @-sche, Could you please add a Proto-Nawiki language code, perhaps nwk-pro, which is based on PNwk from this paper. Thanks. Pinging @Metaknowledge as well. --Victar (talk) 16:30, 16 November 2016 (UTC)

Also related, I have a few Arawak family languages that need codes as well. Thanks!

--Victar (talk) 18:25, 16 November 2016 (UTC)

As I told you, awd-nwk-pro will be necessary. Also, we've been discussing some of those languages already (search for their name at WT:RFM). Specifically, I remember being unsure about whether Wainumá and Mariaté are actually separate languages, and I think -sche may have added notes about some of the others. —Μετάknowledgediscuss/deeds 21:51, 16 November 2016 (UTC)
No need for nasty "I told you"s. You said awd-nwk-pro was necessary if NWK is of my own invention. As I stated above, it is not. If Wainumá and Mariaté are considered one and the same, that's fine, but I'm still lacking a code regardless. --Victar (talk) 22:20, 16 November 2016 (UTC)
Sorry, I didn't intend that to be nasty. I suppose I was unclear earlier; we are trying to make codes that complement ISO standards, rather than conflict with them. It's not about who made it up, but simply that it's not in an ISO standard. And I don't know enough and haven't done the research to judge whether those should be merged or separated; I would trust your judgement either way, but I wanted to bring up the issue. —Μετάknowledgediscuss/deeds 22:25, 16 November 2016 (UTC)
Thanks a ton for the explanation. I didn't realize that ISO was the only accepted source for new un-hyphenated codes. It makes sense though; cuts down on arguments and possible future conflicts.
I think quite a few indigenous American languages are actually dialects of one another, but because they have more research, ISO assigns them separate codes. I figured since these are just hyphenated codes anyway, it doesn't much matter. Either way, I just need some way to add them. --Victar (talk) 23:41, 16 November 2016 (UTC)
OK, I've added Nawiki as a family and Proto-Nawiki as a language, with the code as above (hopefully I did it all correctly). We'll have to populate the family; which codes ought to be in Nawiki? —Μετάknowledgediscuss/deeds 22:34, 16 November 2016 (UTC)
Thank you so much! Proto-Nawiki corresponds to the parent of Western Nawiki and Eastern Nawiki on Wikipedia, so the descendant codes would be: awd-pas, rgr, cbb, awd-kaw, ycn, mht, gae, bwi, kpc, tae, pio, along with the proposed awd-ymn, awd-wmt and awd-wrn. --Victar (talk) 23:41, 16 November 2016 (UTC)
@Metaknowledge: could you add Proto-Newiki as an alternative name? --Victar (talk) 22:17, 17 November 2016 (UTC)
I added the alternative name for the family and the protolanguage, and added the family to all the already existing codes. I haven't created the new codes you requested yet because I'm waiting on -sche's input. —Μετάknowledgediscuss/deeds 00:00, 18 November 2016 (UTC)
You rock! Thanks once again. --Victar (talk) 00:07, 18 November 2016 (UTC)
@Metaknowledge if you could, I also have another related proto language, proto-Piro-Apurinã. I figure if I'm reconstructing them for PAwk, I might as well be creating entries for proto-Piro-Apurinã as well. I've also seen it called proto-Apurinã-Piro-Iñapari and proto-Purus in one case, but proto-Piro-Apurinã is most common. awd-pia-pro would be perfect. The derived language codes would be apu, inp, pib and mpd. --Victar (talk) 19:42, 18 November 2016 (UTC)
I can't find linguistic works with much to say about Mariaté and Wainumá (Wai, Waima, Wainumi, Wainambí, Waiwana, Waipi, Yanuma) beyond that they exist; has short wordlists which are quite similar. Do we want to take a conservative approach and give each its own code as we often do, or are we confident enough that WP is right to group them?
My preference is to keep them separated, but I don't have a strong opinion. --Victar (talk) 07:11, 19 November 2016 (UTC)
Marawá needs to be distinguished from Marawán. And do we want to give it, Guinau, and "Baré" (a terribly ambiguous name) three codes, or merge them as Barawana, as Aikhenvald suggests?
It's quite annoying that they're so similarly named, but they are indeed two separate languages: mara1408, mara1409.
Guinau and Baré are actually quite different in many ways, from what I've seen. --Victar (talk) 07:11, 19 November 2016 (UTC)
I thought Uirina had already been discussed somewhere, but [after searching] I guess not (must have been a language with a similar name). - -sche (discuss) 05:09, 19 November 2016 (UTC)
I've added Yumana. - -sche (discuss) 05:20, 19 November 2016 (UTC)
Thanks! --Victar (talk) 07:11, 19 November 2016 (UTC)
@-sche: What code did you use, because the proposed awd-ymn didn't work? --Victar (talk) 07:21, 19 November 2016 (UTC)
I've added Wainumá as awd-wai, which was suggested on WT:RFM and which is more clearly distinguished from sai-wnm (Wanham); it's a small thing, but it seems better not to have a sai- ("South American languages") code and an awd- (South American "Arawakan languages") code be identical except in their potentially mentally-interchangeable prefixes. And I've added Mariaté as awd-mrt, as proposed here, rather than -mar (as proposed on RFM), to better distinguish it from sai-mar. Yumana was added as awd-yum (as proposed on WT:RFM). I added Guinau and noticed we already had a word in it, and a code for Bare despite the ambiguousness of that term (Bare). - -sche (discuss) 04:37, 21 November 2016 (UTC)
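
The concern about codes that differ only in their prefix can be checked mechanically. What follows is a minimal Python sketch, not Wiktionary's actual module code: the helper names `split_code` and `prefix_only_clashes` are hypothetical, and the code list contains only codes mentioned in this thread.

```python
def split_code(code):
    """Split 'awd-mar' into ('awd', 'mar'); codes without a hyphen get no prefix."""
    prefix, sep, rest = code.partition("-")
    return (prefix, rest) if sep else (None, code)


def prefix_only_clashes(candidate, existing_codes):
    """Return existing codes identical to the candidate except for the prefix."""
    cand_prefix, cand_rest = split_code(candidate)
    clashes = []
    for code in existing_codes:
        prefix, rest = split_code(code)
        # Only hyphenated codes with the same tail and a different prefix count.
        if prefix is not None and cand_prefix is not None:
            if rest == cand_rest and prefix != cand_prefix:
                clashes.append(code)
    return clashes


# Hypothetical sample drawn from the codes discussed above.
existing = ["sai-wnm", "sai-mar", "awd-yum"]
print(prefix_only_clashes("awd-mar", existing))
print(prefix_only_clashes("awd-mrt", existing))
```

Under these assumptions, the candidate awd-mar is flagged against sai-mar, while awd-mrt is not flagged against anything.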


Hello, do you happen to know whether espan is also a descendant of the Munsee word (or whether a related language is more probable or whatever else)? Lingo Bingo Dingo (talk) 11:19, 26 November 2016 (UTC)

I can't find any references that say anything explicit about the Swedish term (you've outdone me by just finding so many citations of it! excellent work!), but the word is clearly Algonquian, and based on where the Swedish settlements in America were it's most plausible that it came from Munsee, Unami or possibly Nanticoke. Of those, Nanticoke echsup and the (clearly unrelated) Unami nahënëm are phonologically implausible, but Munsee é·span (where é· is /ɛː/) is a great fit. Combined with the sure derivation of the very similar Jersey Dutch word from Munsee, I'd say the Swedish word is from Munsee, or if one wanted to be conservative, "From an Eastern Algonquian language (from Proto-Algonquian *e·hsepana), most likely Munsee é·span." - -sche (discuss) 22:42, 26 November 2016 (UTC)
Thanks, I'll add the more conservative version to be on the safe side. Lingo Bingo Dingo (talk) 13:19, 30 November 2016 (UTC)

Quick question

Tagchen [tʰa̝x̠ʝɪ̈n] - You've gotten around quite a bit in the wide library of German regionalisms. I recently came across a study which mentioned that in Mecklenburg-Vorpommern /ɛː/ is perceived as a feature of a foreign-language (Low German) accent, and that only the pronunciation of ⟨Ä⟩ as /eː/ is perceived as correct. Unfortunately I no longer remember which study it was. It was probably Dahl 1974, the norddeutsche Sprachatlas (North German language atlas), or that other work on language in Mecklenburg-Vorpommern from the 70s which I just can't recall. You don't happen to know where this quotation can be found, or where these works can be consulted, or at least who wrote the work that isn't Dahl? Korn [kʰũːɘ̃n] (talk) 21:11, 29 November 2016 (UTC)

Illion numbers

You can add these larger numbers to Wiktionary.

Millillion - 10^3003

Dumillillion - 10^6003

Myrillion - 10^30003

Micrillion - 10^3000003

Nanillion - 10^3000000003

Picillion - 10^3000000000003

Femtillion - 10^3000000000000003

Attillion - 10^3000000000000000003

Zeptillion - 10^3000000000000000000003

Yoctillion - 10^3000000000000000000000003

Xonillion - 10^3000000000000000000000000003

Cyrus noto3at bulaga (talk) 10:03, 4 December 2016 (UTC)
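
The exponents in the list above appear to follow the short-scale rule that the n-th "-illion" equals 10^(3n+3) (million is n = 1, billion is n = 2). The following Python sketch only illustrates that arithmetic; the function name `illion_exponent` is hypothetical, and the value of n behind each coined name (for example n = 1000 for the first entry) is an inference from the prefixes, not something stated in the post.

```python
def illion_exponent(n):
    """Return e such that the n-th -illion is 10**e on the short scale."""
    return 3 * n + 3


# n = 1 and n = 2 are the ordinary million and billion; the larger values of n
# are assumed readings of the prefixes used in the list above.
for n in (1, 2, 10**3, 10**6, 10**9):
    print("n =", n, "-> 10^" + str(illion_exponent(n)))
```

Printing only the exponent avoids building the numbers themselves, which would have billions of digits for the larger entries.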

A proposal on splitting Monguor into Mangghuer and Mongghul

Hey, a proposal I've made at Wiktionary:Requests_for_moves,_mergers_and_splits#Splitting_Monguor_into_Mangghuer_and_Mongghul seems to be stuck for a long time now, could you perhaps take a look at it, share your thoughts and vote? Crom daba (talk) 00:35, 21 December 2016 (UTC)

//NOTE: This message was crossposted to multiple talk pages. Crom daba (talk) 00:35, 21 December 2016 (UTC)


Hey, I see you've made a Salar entry in Arabic script, do you have any resources on the orthography? I'm making some Proto-Mongolic entries and it will feature in descendants. Crom daba (talk) 01:51, 26 December 2016 (UTC)

Finno-Ugric to Uralic

Note that a bunch of module errors have been generated as a result. —Μετάknowledgediscuss/deeds 03:40, 11 March 2017 (UTC)

Coorne citation

Please explain why you removed the OED Coorne citation. Note in particular that though it gives no actual quotation, it explicitly supplies the necessary citation. JonRichfield (talk) 12:04, 18 March 2017 (UTC)

That would be the type of citation that Wikipedia requires, but not the type that Wiktionary requires. It's one thing to list the OED as a source for more information in a separate section, but it's not a good idea to use an inline citation as if that verified the existence of the word to Wiktionary standards (we have an entire appendix of dictionary-only terms that can't be regular entries). If the OED gives examples of use, cite the usage in the original sources. See WT:CFI for details. Chuck Entz (talk) 18:52, 18 March 2017 (UTC)
Chuck Entz thank you. I followed some of your refs. Would you care to comment on the looseness (by current standards) of earlier English spelling? Eg, Tyndale's bible had at least three spellings for corn, two of which occur in the quote I subsequently supplied. Coorne itself is labelled (correctly IMO) as obsolete in English. JonRichfield (talk) 19:31, 18 March 2017 (UTC)
It's a definite problem, because we treat Early Modern English as English, and that means 3 cites/quotes for verification. Wiktionary is structured around precise spellings, so coorn and coorne have to be verified separately. Lemmas, as our stand-in for the term as a whole, can be verified by inflected forms- so a quote for corns will verify our entry for corn- but these are alternative forms. Chuck Entz (talk) 20:54, 18 March 2017 (UTC)
Chuck Entz I had thought of that independently, but as it happens Tyndale not only has been republished in editions by various editors who retained his spellings (which I have verified as accurate by looking very carefully at an image of the original) but also has been quoted more or less correctly (though at least one misquoted "corn" as "corne", but retained "coorne" correctly). Now, corn/corne does not look like much of a problem for most readers, in searching for corne in WKt the reader might well notice "corn" and make the connection, and besides spelling in those days (16th c and earlier) was pretty arbitrary (GBS would have LOVED it). But it is perfectly possible for someone reading coorne out of context, or in a different context, to read "coorne" and wonder what the bleep it meant ("coronet" for goodness' sake!!!) without making the corn connection. For such a reader that is IMO a definite justification for such an entry whether it is a ghost word or not. Furthermore, check the 15th century entry I have just added to the Dutch; in those days, English being what it was, it is quite conceivable that the form 'coorne' from the Dutch was known in southern England; and don't forget that the rejected OED entry, though it does not supply sources for that one, does give it for a variation of kernel, coornel, which should count as being as much of a cite as any other book, even if not as a satisfactory dictionary entry. (Other books also have their ghost words :D ) JonRichfield (talk) 11:28, 19 March 2017 (UTC)

Lake Chargoggagoggmanchauggagoggchaubunagungamaugg

A language code was added in this diff. Is it correct? I also wonder if the rest of the etymology is correct, after having looked at the Wikipedia article and a couple of its sources. Thanks! Chuck Entz (talk) 01:48, 19 March 2017 (UTC)

Check Chaubunagungamaug, where I haven't tampered with the etym. — AWESOME meeos * ([nʲɪ‿nəʐɨˈmajtʲe sʲʊˈda]) 02:32, 19 March 2017 (UTC)
Your involvement is why I noticed it, but isn't really the reason I asked. I noticed your edit when you made it, and made a mental note to ask about it later when I had time and when -sche seemed to have time to be asked. Discussing your edits jogged my memory, so I posted this today. My question still remains, though Angr's involvement does ease my concerns a bit. Chuck Entz (talk) 03:01, 19 March 2017 (UTC)
The base name is generally agreed to ultimately be Nipmuck. I consider it suboptimal to use the same code both in situations like this, where the language being referred to or speculated about is identifiable as the language of the Nipmucks, and also when dealing with the wordlist that is known by the obvious placeholder name "Loup A" that is merely assumed to be the same language. Nonetheless, sources do treat them the same, to such an extent that the Grammar of the Nipmuck Language identifies itself as "a grammatical sketch of Loup A". (Maybe I'll give Nipmuck an etymology-only code like New Latin, though. Or we could change the canonical name of Loup A.)
Thanks for adding the pronunciation, Awesomemeeos.
I've added some references explaining that the longer name is a 1920s hoax. - -sche (discuss) 03:34, 19 March 2017 (UTC)
It reminds me of the first assignment in the American Indian Languages class I took at UCLA thirty years ago: we were asked to look up the origin of a list of US place names (in books- the World Wide Web hadn't been invented yet). It was eye-opening how much bogus information there was in respectable references.
By the way, I added an archive link to the Webster Lake Association website cite, since the site is now apparently defunct. Thanks, -sche! Chuck Entz (talk) 04:01, 19 March 2017 (UTC)
Thanks; I copied that one over from Wikipedia and, as you see, ultimately removed it as unnecessary/redundant. - -sche (discuss) 16:29, 19 March 2017 (UTC)


diff I think you made a mistake here? You removed a bunch of language codes. —CodeCat 01:24, 28 March 2017 (UTC)

Yes; I've reverted myself except for the one change I was trying to make, to fix Chono's code. - -sche (discuss) 01:30, 28 March 2017 (UTC)

Intersection-al and inter-sectional

Is it perhaps wise to divide intersectional into Etymology 1 and Etymology 2? The dominant sense today relates to intersectionality, which seems to have been coined by Kimberlé Williams Crenshaw in or about 1989. But there is also a literature on inter-sectional politics during the American Civil War, and I've seen mention (in twentieth century texts) that this notion of relationships across "sections" was popular among eighteenth- or nineteenth-century American thinkers – maybe Federalists? (See the two different noun senses at intersectionalist.)

In Google Books, before 1989 intersectional mainly seems to refer to scientific meetings or sports tournaments, or to the Civil War-era intersectionalism. After 1989, it seems to refer mainly to (theories or treatments of) gender, race, discrimination, etc.

Any road, do you think it is worthwhile to think about dividing the word into senses with different etymologies? or is it fine as-is? Cnilep (talk) 00:52, 6 April 2017 (UTC)

The page could certainly be split that way. The only reason I didn't make the effort to do that at the time is that the two etymologies are pretty similar and ultimately both break down to the same inter- section -al. - -sche (discuss) 02:41, 6 April 2017 (UTC)
Agreed. Maybe I'll think more about it if I have some free time in the future. Thanks, Cnilep (talk) 03:42, 6 April 2017 (UTC)


Hope this is where you wanted me to reply/discuss this.

Yes, I do adamantly feel the rollback is in error. Please kindly revert to my 4-10 updated mombie definition. There is currently no proper usage cited. Defined as a 'mombie' (even though male) myself I understand what it means fully. My definition had rave reviews by many mombie parents. The proper usage is as stated. This word is not always derogatory. The proper definition is becoming more popular and should be listed first. 9/10 actual mombies agree. Parenting is not easy by any means. Mombies give their all and then some. There needs to be a definition listed that is non-derogatory as well as the derogatory usage.

Let's get the accurate proper use of the word back online for all, please. It would mean a lot to myself and 'mombies' around the world. Thank you very much for your time.

Joshua Crum (talk) 14:41, 10 April 2017 (UTC)

@Joshua Crum: First of all, don't add it back without discussion first. It wasn't reverted by mistake. Your definition obviously can't stand even if the non-derogatory sense is fine; if you look elsewhere in the dictionary, you will see that definitions are never self-aggrandising paragraph-long spiels. More importantly, though, not just any definition can be included. Only those that pass WT:ATTEST are allowed in English. Even if you personally use a word every day, if it hasn't entered into the popular lexicon of durably archived English, we can't accept it here. —Μετάknowledgediscuss/deeds 17:31, 10 April 2017 (UTC)
The long, aggrandizing paragraph was over the top, but I appreciate the point that usage is not always derogatory; some books and other uses I see are just referring to the sleep-deprivation-induced mindlessness you mentioned. I've tried to expand the definition a bit. Let me know if any key elements are still missing. - -sche (discuss) 21:15, 10 April 2017 (UTC)

das seine - pronunciation


One of the Russian opposition journalists said that in "jedem das seine" there is no /z/ sound. While I agree that Russian /z/ is much more voiced than the German, I'd say that the final s in "das" is voiceless but the next "s" is slightly voiced, isn't it? Or is it completely devoiced? What do you think? --Anatoli T. (обсудить/вклад) 00:40, 16 April 2017 (UTC)

That's an unusual thing for Russian journalists to be discussing! The "s" in "seine" is /z/ in standard German, even after the /s/ of "das". - -sche (discuss) 01:55, 16 April 2017 (UTC)
What? Voicing of /z/ ⟨s⟩ is entirely facultative in every register of German. And progressive devoicing is completely normal. Korn [kʰũːɘ̃n] (talk) 12:13, 16 April 2017 (UTC)
The canonical sound is /z/, however. Compare for example aussöhnen, which the Duden (and e.g. Viëtor's Deutsches Aussprachewörterbuch) transcribes [ˈaʊ̯szøːnən]; our colleagues at de.Wikt are missing the verb (as are we) but have Aussöhnung [ˈaʊ̯sˌzøːnʊŋ], as well as e.g. es sich [ɛs zɪç]. - -sche (discuss) 18:01, 16 April 2017 (UTC)
But even the section 'Genormte Lautung' (as opposed to 'Umgangssprache') in the current Duden Aussprachewörterbuch only says something along the lines of /s/ can also be voiced in these positions (same goes for vocalisation of /r/) and then explicitly states that amongst these they pick one, but the others are always an equally standard alternative. I'd prefer if we use our (basically) infinite space to afford some wider precision over a brevity which might accidentally turn us non‐descriptivist. Also, I feel a bit impolite for barging into your talk page, sorry, but it's on my watchlist for some reason, and with German there is a lot of prescriptivist spirit floating around the Wiki projects. I also use the [sz] pronunciation in these cases and I'm all for differentiating /ß/ from /s/, but we shouldn't make it wrongly sound like it's the one right way. Korn [kʰũːɘ̃n] (talk) 10:42, 17 April 2017 (UTC)

Spellings in dialectal categories

(This is not another dispute on which English forms deserve lemma.)

I’ve noticed that there are quite a few alternative spellings which are placed in the same category as regionalisms. A few examples include Euroskeptic, favor, gigametre and humourless, amongst others. Considering that we have categories for this purpose, e.g. category:Canadian English forms, all that it does is clutter. As well, template:standard spelling of appears to be using the wrong categories.

Is there any chance that we can fix up this? (@Daniel Carrero can also weigh in, since he seems familiar with this kind of thing.) — (((Romanophile))) (contributions) 04:38, 19 April 2017 (UTC)

Switching from "British" to "British spelling" seems to bring about correct categorization, both in {{label}} and in {{standard spelling of}} (see my edits to gigametre), but going through all the entries in the regionalism categories and making sure they use the right labels will take a fair bit of work. I'm sorry for this delayed and probably disappointing response. I may try going through the dialect categories and checking labels with AWB sometime.
A tangentially related question is whether a word that's limited to, say, Shetland, should use {{lb|en|UK|dialectal|Shetland}} as some entries do, which puts them into "British English" although the words are not used in all or even most British English varieties. The lack of consistency about whether all entries which are dialectal go into "Category:English dialectal terms" or only some random entries do is also unfortunate.
- -sche (discuss) 03:52, 8 May 2017 (UTC)

Categorizing Categories

Hi -sche, I haven't had a lot of user discussions yet, so if this needs to be at another place or done in a different way - let me know. A while ago I created category:English_false_friends_for_German_speakers. Your bot left a note for clean-up, also a while ago. But I do not really understand how to add a language category to a category. Can you help? Thx. (talk) 09:35, 20 April 2017 (UTC)

I'm not sure the current name is the best name for the category, but the main issue is that I'm not sure where "false friends" categories like it fit into Wiktionary's system of categories. - -sche (discuss) 03:43, 8 May 2017 (UTC)

hommesse "woman"

I found the very rare French word hommesse, "woman", on the multilingual website: [[15]].

Genèse 2:23 French: Martin (1744): - Alors Adam dit : A cette fois celle-ci est os de mes os, et chair de ma chair; on la nommera hommesse, parce qu'elle a été prise de l'homme. -- New International Version: - The man said, "This is now bone of my bones and flesh of my flesh; she shall be called 'woman,' for she was taken out of man."

The noun hommesse is listed on the fr.Wikt under Dérivés "Derived terms" for homme [[16]]. Therefore I think the word hommesse "woman" should not be deleted from the English Wiktionary, even if it is very rare. I think that separate Wiktionary pages for hommesse, "woman" ought to be created to show that this word was used in an earlier French translation of the Bible. Cf. German Mann / Manne "man" and Männin --

1 Mose 2:23 German: Luther (1912): Da sprach der Mensch: Das ist doch Bein von meinem Bein und Fleisch von meinem Fleisch; man wird sie Männin heißen, darum daß sie vom Manne genommen ist. [[17]] Have an excellent day! Hans-Friedrich Tamke (talk) 03:24, 8 May 2017 (UTC)
It would be fine to have an entry for it, which could note how rare a bit of wordplay is. But it's downright misleading to list it as a translation in the translations table at woman, which is why I removed it from there. Incidentally, it seems to more often mean something different, along the lines of "effeminate/androgynous man". - -sche (discuss) 03:39, 8 May 2017 (UTC)

Nice 'crattage

Great admin work, -sche. So much so, I may have to nom you as our next bureaucrat. —This unsigned comment was added by Celui qui crée ébauches de football anglais (talkcontribs).

Thank you for prompting me to take care of so many of those old RFMs. Do you have aWa enabled? (Can non-admins enable it?) It would be preferable to archive old discussions to talk pages rather than just deleting them. - -sche (discuss) 22:30, 13 May 2017 (UTC)
I can't use aWa, no. It's only for autopatrollers and above. I know it is preferable to archive them, but I'm not known to do things the "right way". In fact, I'm not even checking my spelling, punctuation or signing my posts these days, which really pisses people off.
Ah, that's silly, that you can't use it. Oh well! :p - -sche (discuss) 22:50, 16 May 2017 (UTC)

Old Northwest

FYI, you just added it to a category that doesn't exist. Will you create the category? Purplebackpack89 23:21, 14 May 2017 (UTC)

I believe there is a bot that creates wanted categories that fit naming patterns. I know there is one for POS and derivation categories. - -sche (discuss) 23:37, 14 May 2017 (UTC)
Now there's Category:en:Regions of the United States as well as Category:en:Regions of the United States of America... —CodeCat 00:05, 15 May 2017 (UTC)
The "cities" and "towns" categories use the full name, so presumably "regions" should, too, at least until we decide on an overarching policy. (A shorter name like just US would be easier to type, but some have felt the full name is more professional.) - -sche (discuss) 00:16, 15 May 2017 (UTC)
There's still a data entry in one of the modules of topic cat somewhere. —CodeCat 00:17, 15 May 2017 (UTC)
Good catch (that was my own error and apparently also someone else's, since it was added twice; hah). Btw, do you know why we have both Module:category tree/topic cat/data/Place names and Module:category tree/topic cat/data/Place names old? Does the split serve a purpose or could I merge them, for example at Module:category tree/topic cat/data/Places? - -sche (discuss) 00:20, 15 May 2017 (UTC)
I think it was because I made some changes to the module that someone else didn't like so they forked it? —CodeCat 00:22, 15 May 2017 (UTC)
Do you happen to notice any reason why it would cause errors if I started moving labels from the "old" module into the plain module? I don't; the formatting of the labels looks identical. Do you have a preference for whether they be un-forked at Module:category tree/topic cat/data/Places or at Module:category tree/topic cat/data/Place names? - -sche (discuss) 00:33, 15 May 2017 (UTC)
Just "Places" would make more sense, since that's the new head of the tree. I don't know if the data entry for "Place names" would even belong in there anymore. —CodeCat 00:43, 15 May 2017 (UTC)
OK; I've started centralizing the labels there. I am inclined to leave "place names" as the only label in its module, for a while, while people adjust. - -sche (discuss) 01:51, 15 May 2017 (UTC)
If I remember correctly, User:Daniel Carrero did that when he created {{place}} and wanted to add a gazillion new items. He copied the existing module to Module:category tree/topic cat/data/Place names old and put his stuff in Module:category tree/topic cat/data/Place names. Chuck Entz (talk) 02:21, 16 May 2017 (UTC)
@Chuck Entz: Actually, my contribution to {{place}} is very minor. I basically just created a worthless stub template and then @Ungoliant MMDCCLXIV created a full module and made it work. Which is awesome. (I also checked the results and gave feedback while he did the hard work.) You mentioned my contributions in Module:category tree/topic cat/data/Places so I'll reply to that: You're welcome. --Daniel Carrero (talk) 19:48, 19 May 2017 (UTC)
You were more influential than you may think: the main reason I chose to do it was to avoid the proliferation of decentralised templates like {{place:Brazil/municipality}}. — Ungoliant (falai) 20:02, 19 May 2017 (UTC)
I'm happy that you did it. I believe my ability in editing modules at the time was close to 0%, so I created the municipality thing because it was better than nothing. But it's way better to use {{place}} than decentralised templates like {{place:Brazil/municipality}}. --Daniel Carrero (talk) 20:14, 19 May 2017 (UTC)
I’d like to say that this sort of situation is exactly where I felt that something like {{place}} would be useful: people adding placenames only need to worry about the correctness of the information they are adding, and let the data module worry about its categorisation.
I am the first to admit that it became a kitchen sink, though. — Ungoliant (falai) 20:02, 19 May 2017 (UTC)
I'm hoping to get place name information from Wikidata in the future. --Daniel Carrero (talk) 20:28, 19 May 2017 (UTC)

United States of America categories

There are names that are just "United States" without "of America":

If you can move these and the subcategories, that would be great. —Justin (koavf)TCM 04:04, 15 May 2017 (UTC)

Thanks for finding these. The "state capitals" one has been stagnating at RFC/RFM since 2009, I see! I'm going to go with the suggestion made there of "State capitals of..." [the United States of America]. I wonder what motivated the weird capitalization on it and the nicknames category. It'd be nice if category-redirects would also take any pages put into them and put them into the "main" category, so we could use short forms like "US" when typing the names out in entries, but I guess that's just wishful thinking! - -sche (discuss) 04:23, 15 May 2017 (UTC)
No problem. It would be easy (for someone who makes bots) to make a bot to do this. —Justin (koavf)TCM 10:07, 15 May 2017 (UTC)
Also, Category:Georgia (State). In addition to the caps, "state" means both an independent state (like the republic) and a subdivision of the U.S. On en.wp, we use "(U.S. state)". —Justin (koavf)TCM 18:29, 15 May 2017 (UTC)
Shorter category names are much easier to enter. That's my tuppence worth. DonnanZ (talk) 23:58, 15 May 2017 (UTC)
I agree with that. How about we create category-redirects at shorter names, maybe systematically replacing "[the] United States of America" with "US". Then we could empty out the redirects periodically by bot. I've created Category:en:Georgia (US) and Category:en:Cities in Georgia (US). By the way, Hotcat is a great help with adding categories without having to spell out their full names. - -sche (discuss) 02:19, 16 May 2017 (UTC)
Someday we might decide to make the "US" short forms the main categories, but so far there has been disagreement in the various discussions at WT:RFM and elsewhere whenever this has come up, between those who want short names and those who want "professional" full names on the finished product. - -sche (discuss) 02:22, 16 May 2017 (UTC)
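
The idea of systematically replacing "[the] United States of America" with "US" to generate category-redirects can be sketched as follows. This is only an illustration under stated assumptions: the function names `short_category_name` and `redirect_table` are hypothetical, no real bot or MediaWiki API is used, and the sample category is the one named earlier in this thread.

```python
import re

# Matches the full country name, with or without a leading "the ".
FULL_NAME = re.compile(r"(?:the )?United States of America")


def short_category_name(full_name):
    """Replace the full country name with 'US' in a category title."""
    return FULL_NAME.sub("US", full_name)


def redirect_table(full_names):
    """Map each shortened title to its full-name target, skipping unchanged titles."""
    table = {}
    for full in full_names:
        short = short_category_name(full)
        if short != full:
            table[short] = full
    return table


categories = [
    "Category:en:Regions of the United States of America",
    "Category:en:Dogs",  # no country name, so no redirect is generated
]
for short, full in redirect_table(categories).items():
    print(short, "->", full)
```

A bot emptying the redirects periodically could use such a table in the other direction, moving any page found in a short-named category to the full-named target.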


I think we need an actual idea of how to approach handling macrolanguage codes that coexist with codes for the constituent lects. Guaraní has an open RFM on this issue at the moment: the macrolanguage code gn has been used almost entirely, if not entirely, for Paraguayan Guaraní, which has its own code, gug. The RFM drew overwhelming interest and support for a merger (well, overwhelming by our usual standards), but some confusion about which code/name to keep and which to merge. There are other cases, like Kurdish, that I want to bring up at RFM, but first I wanted to get your thoughts on how best to solve these: allow one dialect to take the macrolanguage status, or retire the macrolanguage code altogether? —Μετάknowledgediscuss/deeds 03:06, 16 May 2017 (UTC)

If we consider all the dialects of Guaraní to be one language, then it makes sense to merge all the codes into gn (obviously).
And if we consider all the dialects of a language to be distinct enough to keep separate, then IMO it's clearer to retire the macrolanguage code, even where one dialect is more prominent than the others ... unless there is a clear tendency for the unmarked language name to refer to that dialect even when other dialects are being talked about. (For example, we merged ekk into et rather than vice versa.)
There is only one case that comes to mind where we let a dialect have a macrolanguage code although the situation was arguably not that clear, namely mhr = chm, which is silly because we still call the language "Eastern Mari" ... that should probably be revisited. (Als into sq, although not as clear as ekk into et, is probably fine.)
In this case, "Guaraní" does seem to usually be identified with gug (almost certainly helped by the fact that gug has many orders of magnitude more speakers), so it would probably be fine to merge gug into gn even if other dialects are kept separate, especially because we probably want to keep the name as "Guaraní".
For Kurdish, it seems like it might make more sense to retire the macrolanguage code, since no one dialect seems to have an overwhelming case for taking it on (Kurmanji has the most speakers, but only 2-3 times as many as Sorani, and Sorani is standard in Iraq), although I only took a brief look into it and could be wrong.
Guillermo's comment "[nhd] is similar and very close to [gug] but it's slightly different and always confused with [gug]" actually supports merging those codes, IMO, although he's arguing the opposite. Wikipedia is ambivalent about what should be done with them, and I'm not sure yet either.
- -sche (discuss) 04:30, 16 May 2017 (UTC)

Properly splitting topic and set categories

I would like to work on a proposal for this, but there's several issues to sort out first. I'm hoping you can help with this. There are two other issues which are also at play with these categories, which have come up before. First is the matter of naming the "by language" topical categories. They have literally no naming scheme, and we've occasionally run into naming conflicts with these, so adding something to the names so that they are clearly set apart as topic/set categories is useful. Second is the matter of the language codes in the names. All our other categories use language names, and people have complained about the presence of codes in user-facing parts of the dictionary before. If we're going to rename the categories, we might as well tackle all issues together, so that we don't have to rename the categories multiple times. —CodeCat 17:47, 17 May 2017 (UTC)

Yes, this is a tricky tangle of issues. I will look back over previous discussions to refresh myself on what potential problems have been pointed out with some of the previously-proposed solutions. I agree with you that it would be useful (necessary, really) to add something to the names of both types of categories, so that they can finally be told apart, and so that we avoid naming conflicts. Maybe we could have a poll to gauge if people would prefer quick-to-type prefixes like "t:" and "s:"/"l:", or spelled-out prefixes "topic:" and "set:"/"list:", and also if they would prefer spelled-out language names or codes. I know some people dislike language codes, but other people dislike long names, and codes are shorter (and code-based categories don't have to be moved when we rename languages, a minor benefit). If we used spelled-out names, we should probably set them off by colons (maybe someone has already suggested this), because renaming CAT:en:Dogs to "Category:s:English dogs" or "Category:list:English dogs" or even "Category:English dogs" makes it seem like it's for England's breeds only. But should the subcategory of "Category:Dogs" be "set:English:Dogs" or "English:set:Dogs"? I guess the second one is maybe more logical from a sorting perspective?
- -sche (discuss) 03:00, 18 May 2017 (UTC)
I wouldn't like shortcuts like t: and s: because these categories are meant to be understandable for the average user. —CodeCat 15:20, 21 May 2017 (UTC)