User talk:-sche


ubersexual / including non-durable citationsEdit

Translations of attributive use of nounsEdit

Add replacements to edit summaryEdit

In AWB Options > Normal setting uncheck 'Add replacements to edit summary' and it'll make the edit summaries only what you put in the 'Default Summary' box. Makes edit summaries shorter and more 'human'. Mglovesfun (talk) 18:38, 11 October 2012 (UTC)

Aha! Thanks for the tip. :) - -sche (discuss) 18:45, 11 October 2012 (UTC)


I'd like to take over WOTD — at least for now. I've already set up new words for October 28-31 to get the ball rolling again. Looking over diffs to see what others had done allowed me to figure out the basics, but there's still many other things I need to know about the process, especially what I need to do to create an archive, set up a new month, and polish the entry pages for words before they appear. Thanks! Astral (talk) 00:43, 28 October 2012 (UTC)

I'm glad you're interested!
The front-end part is simple—pick words and plug them into the templates. You're already doing a good job of that; I like your Halloween pick. As you seem to have gathered, the last definition doesn't end with a full stop/period (though if a word has multiple definitions, the preceding definitions do), because the template already adds one: double-dotted vs fixed. Featured words should have pronunciation info (either IPA or audio); the template will automatically notice and include an audio pronunciation if one is present.
The more additional info an entry has, like etymology, illustration or examples of usage, the more interesting it is likely to be to users who click through to it; on the other hand, trying to cite and find a picture for every word you feature on WOTD is a recipe for burning out. Strategise.
Once you've set a word, add the was-wotd template to the entry, so that it won't be featured again (mostly).
To create an archive, do what Ruakh did here, changing {{wotd archive|PREVIOUS|NEXT|YEAR|DAYS}} to the previous month, the next month, the year (four digits) and the number of days in the month (28, 29, 30, 31), and updating the pagename to the relevant month and year. An easy way of creating an archive is to copy-and-paste the relevant month's Recycled Page, e.g. Wiktionary:Word of the day/Recycled pages/October, simply changing {{wotd recycled}} to {{wotd archive}} and adding the YEAR and DAYS parameters.
At the end of the month, subst: all of the templates by changing each day's {{Wiktionary:Word of the day to {{subst:Wiktionary:Word of the day. The reason for not subst:ing a day before it's done is that someone might tweak the definition or fix a typo, etc.
- -sche (discuss) 04:41, 28 October 2012 (UTC)
Thanks. This is very helpful. I've got a couple of questions. First, I'm not good with IPA, so is there a way I could arrange for someone who is to add pronunciation data to entries before they appear? Second, is it okay to occasionally select words I've nominated myself? I already did this with trainiac, because I wanted something "fun" between mulct and peri-urban, but I don't want to do it again if it's something that should be avoided. Astral (talk) 03:33, 30 October 2012 (UTC)
Also, exactly how far back does the prohibition against using words featured as WOTDs on other sites go? It makes sense not to copy words other sites have featured recently, but three, four, five years back seems like a another matter. I need a verb, and wanted to use photobomb, but it was featured on Urban Dictionary in 2009, and more recently as a noun on September 28 of this year. Astral (talk) 03:49, 30 October 2012 (UTC)
So, I chose ambuscade instead, only to discover it was a Merriam Webster WOTD in 2010. Can't win. :( Astral (talk) 04:27, 30 October 2012 (UTC)
Disclaimer: I'm not Sche (@Sche: feel free to correct me on anything I say). Anyway, I think that choosing words that you nominate is fine, and that if you find a concise way to list all the entries you want IPA for pronto (on a subpage, maybe?) I would be happy to help out, as would Sche, Angr, et al. (probably) given their past contributions in that regard (and they're probably more trustworthy than I am). —Μετάknowledgediscuss/deeds 05:14, 30 October 2012 (UTC)
Yes, you can just comment that you'd like to feature a word but it lacks pronunciation info. Many users watch that page, and someone should take care of it. And yes, you can feature words you've nominated—at least, I did. It's probably best to let a couple days pass between when you nominate a word and when you use it, in case anyone comments with objections, but I doubt anything you nominate will be objectionable (you know not to nominate redlinks or offensive words). As for other sites' words of the day: personally, I never paid much attention to that rule; I checked if a word had been featured on another site in the past few months, and if not, looked no further. Sometimes, people would strike words that had been featured by other sites years ago, and in those cases, I respected the strikings and didn't use those words, but I didn't strike words that had been featured by other sites years ago myself. - -sche (discuss) 05:45, 30 October 2012 (UTC)

Inscriptions and whatnotEdit

Discussion moved to WT:T:ALA.

̶s̶̶c̶̶h̶̶r̶̶i̶̶e̶̶f̶̶s̶̶t̶̶a̶̶n̶, ̶s̶̶k̶̶r̶̶i̶̶if̶̶s̶̶t̶̶a̶,̶s̶̶c̶̶h̶̶r̶̶i̶̶e̶̶w̶̶s̶...Spelling standards for Low German.Edit

Ahoy. Please refer to this, leave a comment and maybe distribute it to people you know might have an interest in this. We can do it! Korn (talk) 19:14, 29 August 2013 (UTC)


WT:LANGTREAT doesn't mention Slovincian. I was wondering whether we made the decision not to treat it as a dialect of Kashubian, or whether it just happened that way. I have no preference one way or the other, since I don't know much about it anyway. --WikiTiki89 16:04, 9 December 2013 (UTC)

It looks like it just happened that way. I mean, both Slovincian and Pomeranian have exceptional codes, so someone made the conscious decision to treat them, Kashubian, and Polish as distinct from each other. But both codes were created by the same user who also created separate exceptional codes for the Pitcairn and the Norfolk varieties of Pitcairn-Norfolk, which subsequent discussions all agreed to re-merge, so it's possible (and indeed, apparently the case) that it was just that one use who got the idea that they should be split. There does not seem to have been any community discussion of Slovencian, Kashubian or Pomeranian, but Wiktionary:About Slovincian has been created. I have updated LANGTREAT to note that "in practice,..." they are currently distinct. - -sche (discuss) 18:50, 9 December 2013 (UTC)

Data consistency checking moduleEdit

Kephir wrote Module:data consistency check which performs a check on all the data modules, and makes sure there aren't any discrepancies. There are some, so I thought you might like to know. —CodeCat 23:45, 17 December 2013 (UTC)

Among other things, aus, sai, and cai ought to go, stupid geographic categories that they are. —Μετάknowledgediscuss/deeds 01:50, 18 December 2013 (UTC)
@CodeCat: thank you for the link. (And @Kephir, if you're reading this, thanks for designing that module!) @Metaknowledge: Indeed, and nai (which several things currently list as their family!). qfa-ame should also go, IMO, or at least be voted upon like Altaic and Zuni needs to be updated not to list qfa-ame as its family even if it is kept. (If qfa-ame is kept, we should reconsider having deleted Penutian.) I've been meaning to start Requests for Deletion, but I've been busy. Feel free to beat me to it. - -sche (discuss) 09:19, 18 December 2013 (UTC)
Wiktionary:Requests for deletion/Others#Certain_geographic_language_families. - -sche (discuss) 02:02, 20 December 2013 (UTC)

Removing scriptsEdit

Some entries may specify a script with sc= even if no language has that script specified. When you remove the scripts, those entries will eventually trigger script errors. —CodeCat 14:44, 21 December 2013 (UTC)

I checked for such entries. When they existed, I added the script code to the relevant language code rather than removing it. - -sche (discuss) 20:06, 21 December 2013 (UTC)

A barnstar for you!Edit

Original Barnstar Hires.png Barnstar
For your continuous work to improve coverage and consistency of languages, families and such. —CodeCat 03:16, 24 December 2013 (UTC)
Thank you! :) - -sche (discuss) 06:29, 24 December 2013 (UTC)

Re: jewing / using labels on inflected formsEdit


jade#Etymology 2 mentions the language Mordvin, but we consider Mordvin to be two different languages: Erzya (myv) and Moksha (mdf). Is it worth creating a small language family for Mordvinic languages (probably not)? If not not, how can I determine which language was meant in the etymology? --WikiTiki89 19:58, 4 August 2014 (UTC)

As a first step, I'd declare the term's language to be "und", and say it's from "either Erzya or Moksha". This obviates the need to create a code for Mordvinic (though one could still be created if there happened to be other reasons why it would be useful). Next, knowing that Moksha and Erzya are both written in Cyrillic, I'd test various possible Cyrillic spellings of the term combined if possible with various possible Russian translations, to see if I could find any Russian linguistic texts that mentioned the term — I've been able to verify the identity of some Lak and other Caucasian terms that way.
PS #1: that reminds me of how useful it would be if we had entries for the Russian abbreviations of various languages' names. I've added some (д.-в.-н.), but I think I stupidly didn't record the Caucasian abbreviations at the time I had them in front of me, even though it took me a while to figure them all out with the help of ru.wikipedia. Maybe I'll go looking for them again; shouldn't be too hard to find them again, and you and Anatoli can help verify what they're abbreviations of.
PS #2: do you think it's redundant to say "obviates the need" or "obviates the requirment", since obviate already specifies "bypass a requirement" in its definition? I've never been sure... - -sche (discuss) 20:54, 4 August 2014 (UTC)
After taking a look at the languages' orthographies, the only Cyrillic spelling of al'd'a that makes sense is альдя, for which Google shows several results in some strange language that might be either Moksha or Erzya, or might be something else entirely. I don't know nearly enough about these languages to be able to identify them, and none of the results are dictionaries. RE PS #1: Russian abbreviations always confuse me too. I'm not even sure whether the language abbreviations are standardized enough between dictionaries for it to make sense to add them. RE PS #2: I think the definition is supposed to be "bypass [a requirement]" in other words the requirement (or the word requirement itself) is meant to be the direct object of "obviate". --WikiTiki89 02:12, 5 August 2014 (UTC)
I checked such spellings as алда, ал'д'а and альдьа after I posted, and I couldn't find anything in a Uralic language, either. Some hits were Kazakh(!).
Per Thorson's 1936 Anglo-Norse studies: an inquiry into the Scandinavian elements in the modern English dialects, volume 1, derives dialectal English yad / yaad / yaud (used in "Sc Nhb Lakel Yks Lan", which I take to be Scotland, North Humberside?, Lakeland?, Yorkshire, and Lancashire) from Old Norse jalda (dialectal Swedish jäldä), from a Finnish word "elde" (citing "FT p. 319, Torp. p 156 fol."), but says "Eng. jade is not related." Likewise the Saga Book of the Viking Society for Northern Research, page 18, says "There is thus no etymological connection between ME. jāde MnE. jade and ME. jald MnE. dial. yaud etc. But the two words have influenced each other mutually, both formally and semantically." I'll see about expanding jade and yaud with this information. - -sche (discuss) 03:04, 5 August 2014 (UTC)
One last question, though. Should "Mordvin" be added as an alternative name for both Moksha and Erzya? --WikiTiki89 13:45, 5 August 2014 (UTC)
Yeah, enough references (especially old ones but even some modern ones) speak of "Mordvin" as a language made up of Moksha and Erzya dialects, rather than as a family, that recording that asan alt name would be helpful. - -sche (discuss) 15:26, 5 August 2014 (UTC)

It's эльде both in Moksha and Erzya. See Имяреков, Мокшанско-русский словарь, 1953, page 124b, and Серебренникова Б. А., Бузакова Р. Н., Мосина М. В. (ред.), Эрзянско-русский словарь, 1993, page 781b. If you can't find a spelling for any Uralic, Altaic or a Caucasian language, ask me, I have a lot of sources. --Vahag (talk) 08:44, 7 August 2014 (UTC)

Awesome, that's good to know. I knew you had resources on Caucasian languages, but didn't know about Finno-Ugric. I'll add a Moksha section to [[эльде]]. :) - -sche (discuss) 17:47, 7 August 2014 (UTC)

Haida languagesEdit

Have we thought out the treatment of these yet? We have both the macrolanguage code hai (and a category for terms derived from it, including the entry gwaai that I think I'll go and RFV) as well as the two sublects, hdn and hax, the latter of which I just unwittingly made a terms derived from category for. —Μετάknowledgediscuss/deeds 20:35, 13 August 2014 (UTC)

I recall looking into the Haida lects, but it seems from my "Note 2" in this RFM that I held off on posting about them for some reason, and then got distracted by events in real life. WT:LANGTREAT says to treat only the macrolanguage as a language, but like the pronouncements I mentioned in that RFM, it seems there was never discussion about that. There are noticeable phonological differences between the Northern and Southern lects. Each of those lects is in turn made up of its own (sub-)dialects, but the sub-dialects within each group are mutually intelligible, so it doesn't seem to be a problem to merge those (into hax and into hdn), and it seems most references do. I looked at a number of North Haida, South Haida and plain "Haida" materials (Enrico's Northern Haida Songs, etc) and references before I posted the above-linked RFM last year and planned to comment about Haida; I'll see if I can find the notes I made then. - -sche (discuss) 22:30, 13 August 2014 (UTC)
A tad more research on the matter suggests to me that we should deprecate the use of the macrolanguage and reassign it, then create categories for the sublects. If you've notes on it, though, I'll wait for you to start the RFM instead of blowing ahead myself. —Μετάknowledgediscuss/deeds 23:01, 13 August 2014 (UTC)
Ok, here are my notes, which I'd be happy to summarize in any RFM on the subject, or which you can feel free to pull from.
- -sche (discuss) 05:49, 14 August 2014 (UTC)
By the way, for entries I would suggest using Enrico's orthography (or maybe Bringhurst's), so as to avoid characters like that are hard to input and liable to display incorrectly. - -sche (discuss) 06:07, 14 August 2014 (UTC)
All sounds good, and you can feel free to copy my Support over to the RFM for splitting and deprecating hai, but I'm not on board with the orthography. In British Columbia, I've only seen the orthography with x̱ used, so I would presume it is standard among speakers and linguists. —Μετάknowledgediscuss/deeds 07:21, 14 August 2014 (UTC)
Wikipedia says SHIP's orthography "is the usual orthography used in Skidegate", while Enrico's is what I saw in my (limited) review for Northern Haida—but perhaps the set of materials I have access to is not representative of all materials. Are the texts you see in British Columbia Southern Haida, or are some Northern Haida? Meh, it would be undesirable to use two different orthographies... I suppose we can normalize both (South and North) on the SHIP spellings and mention the other spellings as alternative forms. (Cf this subthread, if you're bored.) - -sche (discuss) 23:45, 14 August 2014 (UTC)
OK, after waiting a few days for some other discussions to settle down, I started Wiktionary:Beer_parlour/2014/August#Haida_lects. - -sche (discuss) 19:09, 22 August 2014 (UTC)

The power of 'and'Edit

We COULD have both fixing AND fixing to be intelligible on their own, something we do with many comparable situations. DCDuring TALK 21:57, 14 August 2014 (UTC)

I've replied at WT:RFM so as to keep discussion in one place. Cheers! - -sche (discuss) 22:31, 14 August 2014 (UTC)

Attestability of "yellowman"Edit

The search for attestability seems to yield mostly references to a White Jamaican reggae artist. Purplebackpack89 04:41, 22 August 2014 (UTC)

Thanks for looking. I tried searching for the plural, "yellowmen", and although that turned up some scannos, it also turned up enough valid hits that I've now created yellowman. - -sche (discuss) 21:56, 22 August 2014 (UTC)
Two is enough? DCDuring TALK 23:55, 24 August 2014 (UTC)
The search turned up more than the two hits I typed up. CFI doesn't require that citations be typed up and put in entries unless the entries are challenged, but I have typed up a third citation. Incidentally, it also contains "whitemen" and "blackmen". - -sche (discuss) 00:56, 25 August 2014 (UTC)

Appendix:Place names in New York area with possible native American originsEdit

I gathered these from the book mentioned in the Appendix at a WP edit-a-thon held today at a local library. You provided an etymology for Mamaroneck that was better than that in the book, by Richard Lederer (or his father?). A few of the toponyms in the Appendix (eg, Osceola, Mohegan) are taken from native American tribes not from the immediate area, a few from neighbors on the west side of the Hudson, Connecticut, farther north in New York, or possibly from Long Island, but at least 80% are from tribes that lived in what are now Westchester, Putnam, or Bronx counties. The spellings are the only ones Lederer had. I assume he rejected some for good reason. He seems to have taken many of them from land purchase records of the 17th century. DCDuring TALK 23:51, 24 August 2014 (UTC)

Oh, neat. I will look over the list and see if I can clarify / expand any of the etymologies. Should I remove placenames from the list once we have entries for them with complete etymologies (as in the case of Ossining), or what? - -sche (discuss) 00:58, 25 August 2014 (UTC)
Let's keep them as examples of what can be achieved, at least for now.
Lederer seems to have worked fairly diligently through his sources, which include hundreds of primary documents and secondary works. I didn't see any works in the bibliography that seemed to be specifically books or articles on the native languages themselves, but I ran out of time so I didn't look all that carefully. I'll be able to take a closer look soon. I may also extract the Dutch origin names. The English ones are fairly uninteresting, even to locals.
Why are Germans so fascinated by native Americans? DCDuring TALK 04:27, 25 August 2014 (UTC)
BTW, I have the towns there to provide a hint where in the county these places are, in case geography might have a bearing on the language of the toponym. There are a few from the Long Island Sound area, more from Bronx and Yonkers and along the Hudson to Peekskill, and others inland in northern Westchester. HTH. DCDuring TALK 04:33, 25 August 2014 (UTC)
I'm sure there are books written on that subject. I think it's partly the earlier European Noble Savage myths, combined with the lack of territorial conflicts that might have provided motivation for negative stereotypes, but also just the lure of the exotic and safely far away. Chuck Entz (talk) 05:05, 25 August 2014 (UTC)
I wonder that myself sometimes.
If you're asking why so many materials on native American tribes and languages were compiled by Germans, a large part of the answer is prosaic. Germany has long produced large numbers of ethnographers and linguists. A lot of materials on Pacific and African peoples and languages were also compiled by Germans.
If you're asking why so many non-linguists love "Indian" things ... well, that's Karl May's doing. He bought into and sold others on the romanticized notion Chuck mentions of simple and noble, exotic people living "authentic lives".
The town names should be helpful. - -sche (discuss) 07:01, 26 August 2014 (UTC)

Neologisms and "Web Words"Edit

Personally, I've always taken "Web words" happily so long as they met certain criteria. I've always been particularly fond of Germanic or otherwise native ones, due to my love of writing "native" poetry.

Anent neologisms... it's been somewhat iffy. I am accepting of some, but not of others. For instance, "selfie" is a term that I never use; opting for the fairer "self-snapshot" or "snapshot of oneself". On the other hand, "troll" (as in the sense of "to bait and wait so as to start trouble" or the like) is one that I have happily accepted with open arms (mayhap due to its origins in angling terminology, though I honestly can't say for sure).

Now, the reason why I bring this up is because it seems that Wiktionary's methods of determining which "web words" and which neologisms are acceptable for inclusion are somewhat murkily composed. Whilst terms like "halgi" are included, others are not. I can't really tell what the "criteria for inclusion" entails sometimes, because it seems a bit vague.

Might you be able to shed some light on this? Tharthan (talk) 17:02, 31 August 2014 (UTC)

Yeah, numerous discussions have made it apparent that Wiktionary's policy on citing the internet is not as clear as it could be; in particular, it can take a while to unpack the ramifications of the words "durably archived" / "permanently recorded" in WT:CFI. But once those words are unpacked, "web words" and "print words" are subject to the same criteria for inclusion. Words in major languages have to be used, as in "he took a selfie", and not just mentioned, as in "he used the word 'selfie' to describe the picture he took of himself". (Lines like "he took what he called a 'selfie'" fall into a grey area of debatable use-vs-mention-ness.) The uses have to span a year, to weed out fad words that are only popular for a month, like the Russian translation of "pink slime" (which was only somewhat less of a fad in English). And the uses have to be in "durably archived"/"permanently recorded" media.
What is durably archived? Books, newspapers, journals and magazines are durably archived. (Google Books and Issuu are good ways of using the internet to search through those media.) Websites are not durable, because they go offline (and moreover are edited and reworded) without warning. Even articles on the websites of news organizations can be taken down — a Wikipedia article I just edited discussed a story which was removed at the request of the journalist, allegedly after he was intimidated. Even the Internet Archive, which has been discussed in the past, is not a durable archive, because it removes pages if site owners request that. The only online corpus which is durably archived is Usenet, because it is decentrally archived, and attempts to censor things from it have indeed failed (e.g. someone at one point tried to delete alt.religion.scientology, and failed). This failure of most web sources to be durably archived can make it harder to cite "web words" (cf. this). However, if a web word is attested per those criteria, it can have an entry just like any other attested word.
Does this clear things up any?
Note that because of the nature of Wiktionary (it's a work in progress, and it's a wiki anyone can edit in real-time), some unattested words may have entries (you can RFV those), and some attested words may not have entries yet (you can create those). Also note that strings that are analysable as misspellings (e.g. strings like licencise, and probably also uncommon strings from lolcat-speak or doge-speak) may be excluded as such. - -sche (discuss) 23:02, 31 August 2014 (UTC)
Yes. That clears up a lot. I now have a more adequate understanding of how the process works. Thanks much.
So citations from Usenet are considered to be among those of the "durably archived" / "permanently recorded" variety? Or, are they only somewhat so, and are thusly taken with a grain of salt? Tharthan (talk) 23:55, 31 August 2014 (UTC)
Usenet is as durably archived as print media, so a use of a word on Usenet is 'worth' as much as a use of a word in a book. But Usenet is more likely than print media to contain typos/misspellings, so if a string is analysable as a typo/misspelling, and it is only supported by Usenet citations, people may be more likely to analyse it as a misspelling and not an intentional use of a certain spelling/word. (For example, book citations might have done more to convince people of the word-hood of licencise than these Usenet citations did in this discussion.) But even books contain typos : I can't find an example offhand, but in RFV, if a book uses an unusual spelling sometimes and the usual spelling other times, it's usually assumed that the instances of the unusual spelling are typos. And when it's clear that something isn't a typo/misspelling, like "Rightpondian" or the video-game sense of "pull", then it doesn't matter whether the citations come from books or Usenet. There seem to be about 1100 entries that cite Usenet. - -sche (discuss) 01:56, 1 September 2014 (UTC)

Rollback in error at toldEdit

I believe this rollback was done in error. The alternate pronunciations that were there were intentional. I intend to restore them. - Gilgamesh (talk) 13:52, 26 September 2014 (UTC)

I should have undone your edit with a more informative summary, I'm sorry. The pronunciations you added are unattested and dubious, per discussion on Angr's talk page, so I've removed them until such time as evidence of them comes along. Rollback is sometimes/often used as a quick way of undoing edits around here (if the edits are merely felt to make the entry worse, without the implication 'rollback' has on Wikipedia that the edits are vandalism), since Wiktionary's relatively small number of admins tend to be a lot busier than Wikipedia's larger number of admins... but it can tend to cause confusion, like now, when the edit was intentional and in good faith, but still made the entry worse. - -sche (discuss) 14:18, 26 September 2014 (UTC)
I've started a thread at Wiktionary:Tea_room/2014/September. It's important that this be sorted out, because bowl-bull, cull-coal, etc. have indeed become homophones, and it effects even General American for most people certainly my age (34) and younger. - Gilgamesh (talk) 14:21, 26 September 2014 (UTC)

Hey, erm...Edit

I would have e-mailed you this or sent this message to you via a more private method if I could have, because I feel posting this here might come off as rude to the person in question (though I do not intend it as such).

User:Angr and I seem to be in disagreement over what should be allowed transcription-wise for a certain word, and we seem to be at a deadlock. As such, I thought that maybe a third party could be brought in so as to maybe give their opinion on the matter.

Now, I don't know really anything about your dialect, -sche, (and I don't mind being blissfully ignorant on that subject, since I think it's irrelevant to most parts of editing on Wiktionary) [though I remember seeing a reference to you at some point being in the Inland-North, though I don't really know the relevance of that] so I don't know where you'd fall anent this matter, but I would honestly hope (and truly do think) that that wouldn't (and shouldn't) matter, considering the argument here is transcription, and any linguist worth their salt knows how to properly transcribe vowel phonemes, and knows the difference between two different phonemes, whether monophthong, diphthong, or otherwise, irrespective of whether or not the vowel phonemes in question occur in his or her dialect.

Now, I firmly trust your knowledge and expertise in this field, hence why I have come to you. I think you may be able to help in settling this issue. So, if you'd be willing to offer your tuppence-worth on this matter, I'd be very grateful.

The aforementioned discussion can be found here: Tharthan (talk) 16:13, 28 September 2014 (UTC)

My "e-mail this user" link should be enabled (in the toolbar on the side of this page, a few items below "what links here"); if it's not, let me know. (Not that I check my e-mail with any frequency at all...)
My "expertise in this field" is amateur compared to Angr's. But since I've been asked, I'll give my thoughts:
I remember noticing during a previous Tea Room discussion of the M-m-m merger that one of the problems one faces if one wants to transcribe 'marry', 'merry' or 'Mary', or for that matter 'air' or 'ear', is that the IPA doesn't have symbols that denote these sounds perfectly, so one is left using approximate transcriptions. That's not automatically problematic — if a language's "e" sound is actually 15% closer to canonical /ɛ/ than canonical /e/ is, it's fine to nonetheless transcribe it as /e/, or if necessary /e̞/; one needn't invent a whole new letter for it. It does, however, mean that discussions of whether or not sounds are distinct (and discussions of how to transcribe them) are more difficult. For example, according to our entry and, 'merry' is /ˈmɛɹi/ and 'Mary' is /ˈmɛəɹi/ for speakers who don't have the M-m-m merger, while both are /ˈmɛɹi/ for speakers who do. However, both our audio clips and's contain a vowel that is distinct from the /ɛ/ in 'bet' (i.e., the Vr sequences in the audio clips aren't just /ɛ/ followed by /ɹ/). That means that someone who was trying to figure out whether her pronunciation of 'Mary' used /ɛə/ or /ɛ/ would run into trouble if she tried pronouncing 'Mary' and then pronouncing words with /ɛ/ in them like 'bet' to see if she used the same vowel in both — she'd probably conclude that she didn't use the same vowel for the two words, even if the vowel she used in 'Mary' was the one we transcribe as /ɛ/.
However, setting that issue to the side...
According to our entry and, 'air' is /ɛəɹ/*, with the same vowel as unmerged 'Mary'. Our audio clip is curt and sounds like it contains only a single (non-diphthong) vowel, but's has more of a /ə/. Likewise, 'affair' is /əˈfɛəɹ/ per, and the vowel in the audio file is the same as the vowel (diphthong) in's 'Mary' audio file.
That means it would be reasonable to transcribe the sound as /ɛəɹ/ (or /ɛɚ/, which is synonymous) for (some) American accents. But is /ɛɹ/ wrong? Well, is there an American accent that contrasts /ɛɹ/ and /ɛəɹ/ in this (non-intervocalic) context? If not, then the worst one can say is that /ɛɹ/ is potentially confusing, but as long as there's a page explaining how the symbols are used, it's not wrong, and it's possibly not even any more confusing (or any less accurate) than our use of /ɛ/ to mean one thing in merry and another in bet.
Merriam-Webster and Random House use the same transcription for 'merry' and 'affair', but also for 'Mary' (apparently they treat the M-m-m merger as standard). The various dictionaries that make up transcribe 'merry' and 'affair' differently.
You can raise the issue in the Tea Room for broader discussion if you think the default transcription of the 'air'/'affair' sound should be switched from /ɛɹ/ to /ɛəɹ/ (or /ɛɚ/). I have no strong preference, since I don't think either transcription is ideal (I don't think there is any ideal transcription of the sound).
Note that transcribing 'air' (and 'affair', etc) narrowly, in square brackets, as [ɛəɹ] or [ɛɚ] is another matter entirely, and probably a lot more straightforward.
(* Our entry also lists /ɛːɹ/ as a possible US pronunciation of 'air', but this is suspect, since vowel length is not phonemic in American English. Actually, that's another case where a small distinction is glossed over and one symbol is used for two slightly different but non-contrastive things: /i/, /u/, etc is longer in some words than in others in American English, but they're not distinguished as having /i/ vs /iː/ because vowel length is not actually contrastive.)
- -sche (discuss) 09:02, 29 September 2014 (UTC)
Oh, you're right. I didn't notice that on the sidebar.
I actually agree with you there, because I initially transcribed the /ɛə/ vowel as /e/, because that's how my mind thought of it (this might be due to plain /ɛ/ indeed being a plain /ɛ/ in my dialect, whilst /ɛə/ is more of an /ɛ̝ə/ in my dialect). Nevertheless, I agreed that the sound was far closer to /ɛə/ than /e/, so I changed my transcription practices accordingly.
Then is it fine to list both pronunciations /ɛɚ/ and /ɛɹ/? You're right to say that there is probably no English dialect that contrasts /ɛɹ/ and /ɛɚ/ (my non-mMm merger dialect doesn't, since, as far as I know, /ɛɹ/ doesn't end any word in the language [with the possible exception of "err", as I mentioned on Angr's talk page]), but it's still better to list both pronunciations /ɛɚ/ and /ɛɹ/ than to list just /ɛɹ/ and have people say "Wait a minute... "affair" has the same vowel as "fairy", which is /ɛɚ/ for me in my non-mMm merger dialect, but yet the only pronunciation listed here is /ɛɹ/. Am I wrong in pronouncing it /ɛɚ/?" Furthermore, it couldn't do any harm to have both pronunciations listed. So could we at least have both /ɛɚ/ and /ɛɹ/ pronunciations given for affair? Tharthan (talk) 11:03, 29 September 2014 (UTC)


Thanks for resolving the mini-contretemps at "bear"... AnonMoos (talk) 17:17, 29 September 2014 (UTC)


I hadn't realized I had accidentally put words in the wrong category. Thanks for the heads up. — LlywelynII 23:50, 30 September 2014 (UTC)


Discussion moved to Talk:lebendig.

Lewis and ClarkEdit

I've borrowed Lewis and Clark: Pioneering Naturalists, which has two appendices of plants and animals "discovered" by Lewis and Clark. For my purposes the listed species name(s) and vernacular names are of greatest interest. The appendices don't have non-English names. But the discoveries have references to the volume and page in Thwaite's edition and most have a date and location for the discovery. Have you already mined Lewis and Clark for native names? Do you intend to do so? Are there other sources for that? DCDuring TALK 20:15, 2 November 2014 (UTC)

I've only 'spot mined' Lewis and Clark, i.e. when Google Books let me know that a page of their journals mentioned pasheco, I checked the surrounding pages for other native / native-derived words. I haven't mined the whole work. If you'd like me to (try to) find and add native names for any of species or vernacular names you add, I'll see what I can do. I've been rather distracted from my Native American word documentation project. - -sche (discuss) 01:40, 24 November 2014 (UTC)


I've started a page User:DCDuring/Geology and copied your items there, as well as a WP table. It suggests some lines for improving our entries as well as showing redlinks. I also came across the Geowhen Database, which is a convenient source of confirmation of the meaning of some of these terms. DCDuring TALK 17:01, 3 November 2014 (UTC)

Just so you know, although I haven't had much time for editing lately, I'm still available to help with geological terms, as I have some training in the field. If you leave me a message on my talkpage or tag me in relation to any issue you have when adding geological jargon or etymologies thereof, I'll be sure to respond. —Μετάknowledgediscuss/deeds 20:57, 3 November 2014 (UTC)
If I can find the time, I'll check out which terms are (a) most-linked to within Wiktionary or, probably more usefully, Wikipedia (I wonder if there's a toolserver/wmflabs tool that does that), and/or (b) most common in ngrams. It would make sense to tackle those first. - -sche (discuss) 01:40, 24 November 2014 (UTC)


Hi. I saw you reverted some of my edits on this word. I was mistaken to change the etymology in the way I did. I thought the theory of its deriving from Slavic was outdated, so I put that one into a "postscript". I've since seen that Kluge is also of this opinion and I was about to make that revert myself. -- As to the quotation I deleted, I just think that it misleads people to believe the word is obsolete and there are no more current quotations to be found. I don't think such quotations are very useful, but I will refrain from deleting them from now on. Sorry! And best regards!Kolmiel (talk) 00:20, 4 November 2014 (UTC)

Yeah, and I made a little edit on the wording of your version, because I thought it might suggest that German Schmetten is from English (which of course you didn't intend).Kolmiel (talk) 00:22, 4 November 2014 (UTC)

think of the childrenEdit

Hi there -sche, you had previously pitched in and helpfully formatted an entry I improved, Streisand effect, as Word of the day.

Equinox (talkcontribs) created the entry on think of the children and I recently improved it.

I nominated it at Wiktionary:Word of the day/Nominations, however Ungoliant MMDCCLXIV (talkcontribs) mentioned at user talk:Equinox that unfortunately these days most of those that appear on the Main Page are recycled entries from prior years because it's pretty inactive.

I was wondering if you could add it to one of the upcoming dates for Word of the day?

Thank you,

-- Cirt (talk) 20:57, 5 November 2014 (UTC)

I was able to get help from others, but thanks for your time. :) -- Cirt (talk) 18:56, 16 November 2014 (UTC)
I'm glad someone helped you, and glad a new word will be featured. I'm sorry I didn't respond sooner. Perhaps over the upcoming holidays, when people have time off from work and school, someone will have time to set a bunch more Words of the Day. 01:40, 24 November 2014 (UTC)

Non-Oxford British English standard spellingEdit

Why put this at all? The fact that Oxford University Press uses the z spelling has nothing to do with the usage of the word. But I know you must have some reason for putting it in. What is it? Renard Migrant (talk) 21:12, 15 November 2014 (UTC)

Hi; sorry for not responding sooner. It seemed like the best way of distinguishing the two British spellings. Everyone (in Britain) spells flavour the same, but with something like actuali[sibilant]e, some Brits (most noticeably those affiliated with the OUP) spell it actualize, while many others spell it actualise. As I mentioned to an IP on Stephen's talk page, there have been a few discussions of how to describe the spellings that are used by British people, and other people throughout the Commonwealth, and all of the wordings have problems. Calling the spelling Oxford uses "Oxford British", and the other by elimination "non-Oxford", seemed best to me, but I'm open to being persuaded that another wording would be better. - -sche (discuss) 01:56, 24 November 2014 (UTC)


Re diff: I do think this is "more worthy of an 'uncommon' label than other -es genitives vs -s ones", because Archives really is virtually unknown in any German written in the past 175 years. That's why I wanted to label it "archaic", but the anon changed it to "rarer" because of a single cite on b.g.c from 2006 (which I think is simply a mistake on the author's part, but I can't prove it). —Aɴɢʀ (talk) 21:38, 17 December 2014 (UTC)

As the user points out on WT:RFD, there are more modern cites than just the one in the entry. And ngram data for both eines Archivs vs eines Archives and the compound Staatsarchivs vs Staatsarchives show that the -es version is still about half as common now as it was in the past (i.e. there does not seem to have been any sharp drop-off in usage), and it is about 1/25th as common in the modern era as the -s version, which is not an unusual ratio for an -es vs an -s form. Compare how, in the other direction, Geschäftsfreunds is now about 1/25th as common as Geschäftsfreundes, and Jubiläumsjahrs is about 1/15th as common as Jubiläumsjahres. (Those are two of the words the Duden cites in explaining how euphony helps decide which genitive ending to use.) - -sche (discuss) 22:39, 17 December 2014 (UTC)
But those are compounds, which are always skewed toward using the e-less form (eines Hofes is 15× more common than eines Hofs, but Hauptbahnhofes is only half as common as Hauptbahnhofs). The fact that Archiv isn't a compound would lead us to expect Archives to be more common than Archivs, not 25× rarer. —Aɴɢʀ (talk) 23:37, 17 December 2014 (UTC)


I'd be shocked if you found this as the imperfect subjunctive is a literary tense and fucker is new and extremely informal. Previous discussions have been favourable to creating all hypothetical verb forms because RFVing them would be a monstrously time consuming issue. See for example défragmentassions and the definition of défragmenter. Renard Migrant (talk) 20:36, 24 December 2014 (UTC)


See WT:RFV#Schlackenlosigkeit. The discussion has advanced beyond my extremely modest knowledge of German and may even need a native speaker. DCDuring TALK 23:01, 13 January 2015 (UTC)

Αγαρηνών et alEdit

The "misused" templates were put there for a purpose - if you want to change any more Greek entries please let me know.   — Saltmarshσυζήτηση-talk 11:11, 16 January 2015 (UTC)


Could you check the codes on this page? Thanks. DTLHS (talk) 22:08, 23 January 2015 (UTC)

Meh. Someone changed the header, but not the codes, from nds-de to plain nds (rather than adding a separate section for the Dutch Low Saxon term). >.>   The entry could be band-aided by either changing the header or the codes, but the general disagreement and slow-motion edit-warring about how to handle the various Low German lects makes for so much ugliness that I am losing interest in editing them. - -sche (discuss) 03:59, 24 January 2015 (UTC)

Why the "hmm..."?Edit

I agree that the previously-listed meaning of that was odd, but... what is the meaning of your edit summary? Are you doubtful of something? Or...? Tharthan (talk) 21:52, 24 January 2015 (UTC)

Mostly I was doubting the previously-listed meaning, but I also wonder if the wording I introduced really covers the citations, and/or if there are actually two senses, one used of people, and the other of places (the latter presumably similar to shire#Verb). - -sche (discuss) 23:22, 24 January 2015 (UTC)
I share your doubts. Also, are you sure that parish is a verb? Parished could easily be interpreted as a denominal adjective. DCDuring TALK 23:55, 24 January 2015 (UTC)
The 1972 citation and the second sentence of the 1992 citation seem very verbal to me. I'll see if I can find other inflected forms. - -sche (discuss) 01:41, 25 January 2015 (UTC)
Check out the 1917 and 1991 citations (the latter technically of re-parish). There's also the citation below, which I can't make sense of. - -sche (discuss) 01:49, 25 January 2015 (UTC)
  • 1903, Maxwell Gray, Richard Rosny, page 210:
    "You will take pleasure in parishing. Mother used to parish."
    "How do you know I like parishing?"
    "Your uncle said so."
    "Oh! did he?"
    "And you may like the rectory people; it's a fine old house, and often full of visitors."
after e/c
I'm not hostile to the verb view for the sense, just uncertain. I've looked for the parishing form, but just found it with certainty for what is now a new intransitive sense, for a distinct etymology of parish#Etymology 2 ("perish"), and for a noun sense. I may just have a block for the verb sense. There was a book title that seemed to be the sense I've been doubting.
The citation above is of the definition I added: "To visit residents of a parish". It's used of parish priests and also of women doing socializing possibly under color of visiting the sick, aged, shut-ins etc. DCDuring TALK 01:57, 25 January 2015 (UTC)
OK, I'll add it to that sense, which is now well-cited. - -sche (discuss) 01:59, 25 January 2015 (UTC)
The 1917 cite is syntactically though not semantically intransitive. The "re-parishing" cite is helpful. It's tough with a word that shows up so uncommonly in what are to me somewhat alien contexts. The word is certainly used with a meaning that is at least nearly verbal. I doubt anyone would challenge it on the same grounds such as my doubts. DCDuring TALK 02:08, 25 January 2015 (UTC)


As was revealed in a discussion that I had previously with Dbfirs, it seems the distribution of /ɛəɹ/ and /æɹ/ words differs between British English and dialects of North American English that do not possess the merry, Mary, marry merger.


"vary" is often /væɹi/ in non-merry,Mary,marry merger dialects (though, I will admit, its traditional /vɛəɹi/ pronunciation is still heard amongst the older generation. My mother, for instance, uses /vɛəɹi/, whilst my father and I use /væɹi/ [as does much of the younger generation]. Similarly, parent for myself, my family, and most of my peers is /ˈpæɹənt/, whilst /pɛəɹənt/ is the pronunciation I have heard in church and by some others. It seems to be about a 50-50 distribution.

In conclusion, some words that have a traditional /ɛəɹ/ in British English and old fashioned North American English seem to have shifted to /æɹ/ in the younger generations.

Do you (or anyone else visiting your talk page) have any idea as to why this might be? Tharthan (talk) 16:34, 25 January 2015 (UTC)

Generic phonetic simplification? Influence from GenAm, where the sounds aren't distinguished? I don't know. North American English regional phonology#New_England says "Western New England [... and] Connecticut and western Massachusetts in particular show the same general phonological system as the Inland North, and some speakers show a general tendency in the direction of the Northern Cities Vowel Shift—for instance, an /æ/ that is somewhat higher and tenser than average[.]" The phoneme that's next higher than /æ/ is /ɛ/. You're describing things going in the opposite direction, but I can imagine how a reduction in the contrast between the two sound in non-Mary-merging dialects, combined with an outright merger of the sounds in the surrounding dialects, could lead people who tried to maintain a distinction between the words (Mary, marry, merry) to use a new / un-original sound to do so. In English, I've heard people maintain the pen/pin distinction backwards, and in German people mix up [ɛː] and [eː] if they try to maintain a distinction between them. - -sche (discuss) 21:45, 25 January 2015 (UTC)
Hmm... it seems to me to be more of a specific hypercorrection than anything else, though, because other words besides the previous two seem to retain their correct pronunciations. I dunno. I just hope that we don't have another Great Vowel Shift or anything like that any time soon, because that seems to be the direction being headed towards. Tharthan (talk) 21:55, 25 January 2015 (UTC)


Hi there. I wanted to ask you about the [phonetic] transcription of the German /phonem/ /ʃ/. Should it be [ʃʷ] because of the lip rounding, or should we not use [ʷ] just as we've decided not to use [ʰ]? I personally would be in favour of [ʃʷ] because unlike aspiration there seems to be little regional/idiolectal variation and, even more importantly, there would be no wondering when and when not to use it since /ʃ/ would just always become [ʃʷ]... But I don't know. What do you think?Kolmiel (talk) 17:41, 25 January 2015 (UTC)

I would treat it like aspiration, and so I wouldn't use it. I note that de.Wikt, which only uses narrow transcriptions, doesn't use [ʷ]. You could ask on WT:T:ADE, though. This is not entirely here or there, but ... people occasionally propose "diaphonemes" around here (ultra-broad transcription); this seems like the opposite, ultra-narrow transcription. Perhaps one day we'll start adding both and have a sequence of //ultra-broad//, /broad/, [narrow] and [[ultra-narrow]] transcriptions. - -sche (discuss) 21:53, 25 January 2015 (UTC)
No it's fine, just wanted to check if you were in favour of using it. It's not that important I guess, and it's not a "Herzensangelegenheit" of mine.
I just think we shouldn't base our decision on the German wiktionary. Their transcriptions aren't narrow, they're just given between squared brackets because most traditional dictionaries do that. They would be very wrong if understood literally, especially things like [pakn̩] which don't exist in the German language and which I suspect might be almost physically impossible to the human mouth.Kolmiel (talk) 21:48, 27 January 2015 (UTC)

The names= field in the data modulesEdit

I'm looking at changing this now, and I already made a few initial modifications. But I'd like to confirm just what the plan was again. If I remember correctly, the idea was to split it into three fields:

  1. canonicalName
  2. otherNames
  3. Some field for the things that are subsumed under this name, but are not just alternative names.

I'm not sure what to call that third field, though, so do you have suggestions? Also, what should be done in ambiguous cases where there is no agreement whether something should be classified a subvariety or not? Perhaps, I could only split off number 1 for now, leaving 2 and 3 together until we sort that out more completely. —CodeCat 22:21, 25 January 2015 (UTC)

Oh, great! :)
Perhaps the third field could be called "varieties" or "varietyNames"?
I assume that when you say "no agreement whether something should be classified a subvariety or not", the alternative to classifying it as a subvariety is classifying it as an alternative name for the whole language. (If there's disagreement about whether or not something is a dialect of one language or a separate language, that's a question we're going to settle at an earlier stage, namely the stage of granting it a code or not, before we ever get to any of these names fields. Right?) There are cases where certain names refer both to dialects and to the whole language; in the earlier discussion I suggested that in such cases we could either (1) list the name in both places, or (2) decide that anything listed in a higher field will not be repeated in a lower field (so, anything listed in "otherNames" will not be repeated in "varietyNames"). - -sche (discuss) 22:37, 25 January 2015 (UTC)
The question is mostly relevant to reconstructed languages, at least in the way I intended it. Proto-Uralic for example has Proto-Finno-Ugric as a subvariety, but some linguists contend that they are one and the same. Austronesian is often considered synonymous with Mon-Khmer (both share a Wikipedia article too). And there are probably similar situations for other languages.
I'm not sure if "varieties" is clear enough. I would like to have "sub" in the name so that it's clear in what way it's distinct from "otherNames". So "subvarieties"? I've also seen "sublects" used by some people. —CodeCat 22:53, 25 January 2015 (UTC)
Well, I would handle proto-language cases the same as other cases, either always list such names in both fields, or decide one field always has priority. The first approach might more accurately convey that some authorities use _(whatever)_ as an alt name for the whole language and other authorities use it as the name of a "dialect", and keeps us from having to pick which field to list the name in. If we went with the second approach, my gut reaction would be to "prioritize" the "higher" field, and so list "Proto-Finno-Ugric" as an alternative to "Proto-Uralic" and not list it as a dialect.
As for the name: well, how about "subvarieties"/"subvarietyNames"? All but one of the hits of google books:"sublect" OR "sublects" are scannos of "subject". Or perhaps something like "subsumedVarieties", to convey that the main purpose is to list cases where ISO-code-having subvarieties have been subsumed, rather than e.g. to start listing every non-code-having dialect of English. - -sche (discuss) 23:22, 25 January 2015 (UTC)
(edit conflict) Of the two, I like "sublects"- it sounds more neutral. Actually, it's the "sub" part that makes me nervous. Except in the case of pluricentric languages, we don't explicitly mention the standard lect at all, which is every bit as much a sublect as all the things we call the sublects. More often than not, the only difference between the "standard" and the "sublects" is an accident of history: In Old English, for instance, the Wessex dialect is generally treated as standard, but eventually the East Midlands dialect took its place. That means a sublect became the standard and the standard became a sublect. In reality, though, they're still just two sublects, with the main difference being that the standard sublect tends to influence and crowd out the other sublects.
Of course, it would look funny to include "Standard xyz" in the list of sublects, so I guess we're stuck with the current arrangement. Still, I wonder if there's a way to distinguish the language as a whole from its sublects without implying that only those lects different from the standard are sublects.Chuck Entz (talk) 00:08, 26 January 2015 (UTC)
This raises the question of what we want to list in sub[variety/lect] field. Initially, when subvariety names were included in languages' lists of alt names, it was because the named subvarieties had previously been considered languages (generally by the ISO, but in some cases merely by us via granted and then revoked exceptional codes); the subvariety names were listed so that people who thought they were languages would know where they went.
However, I can see how we might find it useful to make comprehensive lists of languages' dialects (including dialects when have never been considered own languages); such lists could in some far-future version of Wiktionary be meshed with the context labels so that entries could be put in cleanup categories if they were categorized as belonging to another language's dialect, for instance.
I'd still use "subvarieties" for the name since "sublects" doesn't appear to be a word; even the Google Scholar hits are scannos for "subjects", lol. - -sche (discuss) 19:54, 26 January 2015 (UTC)
I think it would be a good idea to make a list of dialects. But it would be very hard to manage because there are so many, and there will always be a need to specify a particular variety that is more fine-grained than any we've defined so far. So if we want to add something like that, we would have to take the possibility of unrecognised dialects into account, like the label template already does. —CodeCat 20:05, 26 January 2015 (UTC)


Hi! If this is a real French verb, could you define it? If it's not a real verb, I'll need to delete all the inflected forms someone created for it (Special:WhatLinksHere/surbasser). - -sche (discuss) 09:20, 27 January 2015 (UTC)

Most often, it's a typo for surpasser or surbaisser. However: 1. it seems that, in architecture, surbassé has been used as well as surbaissé (but I cannot find citations clearly showing that it was used as a verb). 2. I also find surbassé used for music, and very few uses clearly using a verb surbasser (try to Google "il surbasse" and "qui surbasse"). I think I can guess the sense (make music overbassed), but I'm not a specialist. Lmaltier (talk) 21:36, 27 January 2015 (UTC)
I see. Thanks for checking! - -sche (discuss) 21:42, 27 January 2015 (UTC)

Flood flagEdit

Hi, could you give me the flood flag for about 20 minutes, please? --Type56op9 (talk) 18:41, 28 January 2015 (UTC)

Nah, you're not supposed to be operating a bot. - -sche (discuss) 19:36, 28 January 2015 (UTC)
Actually, it's not a bot. It is WT:ACCEL, which looks like a bot. --Type56op9 (talk) 11:40, 29 January 2015 (UTC)
Fair enough. I just went through and patrolled your latest batch. - -sche (discuss) 20:01, 29 January 2015 (UTC)


Hi, could you create a language module for Proto-Ta-Arawakan as well? --Victar (talk) 19:17, 30 January 2015 (UTC)

I've created a family code for Ta-Arawakan, "awd-taa". However, neither "Proto-Ta-Arawakan" nor "Proto-Ta-Arawak", nor "Proto-Ta-Maipurean", "Proto-Ta-Maipuran", or any of the other alt names I tried gets any Google Books or Scholar hits, or even raw web hits. Are you sure it's a valid proto-language? - -sche (discuss) 19:52, 30 January 2015 (UTC)
Thanks. Yeah, what happens is it usually just gets called Proto-Arawak. Incidentally, Arawak is also a language within Ta-Arawak, otherwise known as Lokono. It's all very convoluted, but consequentially I have these reconstructions that shouldn't be called Proto-Arawakan since they aren't attested outside of Ta-Arawak, ex. Lua error in Module:translations at line 37: Translations must be for attested and approved main-namespace languages.. --Victar (talk) 22:17, 30 January 2015 (UTC)
I've also seen it awkwardly called "proto-Caribbean Northern Arawak". --Victar (talk) 23:07, 30 January 2015 (UTC)
OK, thanks for the clarification. In general, I would say "meh, if someone wants to create entries for such-and-such proto-language that existed, go for it". However, User:Tropylium has recently been arguing against creating separate codes and appendices for cases where things are reconstructible only to certain dialects of proto-languages, and if other linguistic works just treat Proto-Ta-Arawak as Proto-Arawak (and AFAICT never mention or confirm the existence of Proto-Ta-Arawak at all), that does make me question if we really need a code for it. Tropylium, do you have an opinion on this? - -sche (discuss) 03:48, 31 January 2015 (UTC)
Looking at Wikipedia's classification, it seems that Ta-Arawakan is a fairly deep subgroup within the wider Arawakan family, and accepted by each of the three otherwise very different classification schemes. Sounds like good enough grounds for separate treatment. Cleanup will still be possible later, if it turns out that there exists a better way to define a subgroup comprising these languages (but AFAIK Arawakan is not one of those families where a micro-detailed family tree is known yet). --Tropylium (talk) 04:07, 31 January 2015 (UTC)
OK, I have created "Proto-Ta-Arawakan" with the code "awd-taa-pro". - -sche (discuss) 04:36, 31 January 2015 (UTC)
Thanks to you both! Yeah, the whole Arawak tree is outdated, based on paper from 1991. I'm working on a draft for a new version based on various published works, w:User:Victar/Template:Arawakan languages. --Victar (talk) 17:37, 31 January 2015 (UTC)

I wonder where the heck the D came from in awd and in Taíno the Q in tnq? I think the Arawak languages just got the bottom of the barrel. If I had some say, I would rename Arawak to lcn for Lokono/Locono and use arw for the language family. --Victar (talk) 01:27, 31 January 2015 (UTC)

Yeah, some forethought would have done the ISO good. Particularly strange are the cases where languages which have three-letter names have been given codes that aren't those three letters, e.g. Abu is ado (while abu is Abure), Col is liw, and so on. - -sche (discuss) 03:49, 31 January 2015 (UTC)


"sınalgı" was deleted but they (88.XXX.XXX.XXX) added again! --123snake45 (talk) 02:13, 1 February 2015 (UTC)

CodeCat has deleted it. The IP seems to be correct that there are citations of the word on Usenet now, but there are only two of them, and they're from only a few months apart; the word would need three citations spanning over a year to meet WT:CFI. - -sche (discuss) 02:22, 1 February 2015 (UTC)
The author (Arslan Tekin) says: "Look at it, it is using sınalgı for television and ünalgı for telephone at Kyrgyzstan"

So, it is Kyrgyz. It isn't Turkish. --123snake45 (talk) 03:00, 1 February 2015 (UTC)


Can you take a look at rfv page? I've added the citations with Azerbaijani adaptations so you may compare them. -- 17:13, 3 February 2015 (UTC)

I invited three Turkish-speaking users to take a look at the citations. One of them, User:Dijan, is the one who said the previous citations were Azeri. The Azeri versions you've provided do look consistent with Dijan's comment that "every single one of them is a Turkish rendition of the Azeri language (literature and poetry) that was not translated into Turkish", but I will wait for the other users to comment. I'm at a disadvantage here because I (and more other Wiktionarians) don't speak Turkish or Azeri, and it's clear there are people with axes to grind on both sides of this issue — in some cases it seems pretty clear that people have made up words that aren't actually in use, and in other cases people seem to be refusing to believe words that seem real (e.g. Citations:haydamak, where it looks like other print dictionaries are confirming that the citations are using haydamak to mean "drive"). - -sche (discuss) 17:26, 3 February 2015 (UTC)


I don't agree that languages are proper nouns, but if that is Wiktionary policy, I'm not going to upset the apple cart, but just let you know that not everyone agrees. Donnanz (talk) 17:51, 3 February 2015 (UTC)

I don't know that there's a policy, but it certainly seems to be common practice; all the other language names I can think of are currently categorized as proper nouns: Portuguese, Spanish, Basque, French, English, Dutch, German, Danish, Norwegian, Chinese, Navajo, etc. However, there has been some discussion in the past about how some of the things that are commonly categorized as proper nouns, such as personal names, fail to meet some of the usual tests of proper-noun-ness (names are countable; "there are two Johns in my class"). You could bring the matter up in the BP and see what others think. Languages do seem to meet more tests of proper-noun-ness than personal names, though (and there wasn't even consensus to stop treating names as proper nouns). - -sche (discuss) 18:09, 3 February 2015 (UTC)
Hmm, OK, I'll think about the Beer Parlour. I would categorise names such as Gertrude, the Houses of Parliament, the White House, and the Black Sea as proper nouns, and surnames of course, and stop there. But as you point out there can be a problem with people's names; the Browns and the Joneses spring to mind. Also place names, two Bristols, two Birminghams, two Londons (maybe more), but place names and people's names are really proper nouns despite that. Donnanz (talk) 18:39, 3 February 2015 (UTC)
If you are thinking about the matter, consider that taxa are considered proper nouns, because they are names of individual natural kinds (old-style Linnaean taxonomy) or lineages. This is somewhat similar to the Roman gens, or other groups of descendants of a common ancestor. Organization names, toponyms of all kinds, brands/trademarks are all proper nouns, whatever word class their components are. DCDuring TALK 18:50, 3 February 2015 (UTC)
No, I wouldn't argue with taxa (taxas?), brands, trademarks, names of organisations etc. I think it's just languages as proper nouns I disagree with. Donnanz (talk) 19:02, 3 February 2015 (UTC)
The argument, I think, is that a language is a singular thing that a community speaks, just like e.g. a country is a singular place that a community lives. Of course, both can be pluralized: one can speak of Germanies, Americas, and even Frances, and one can speak of "various Englishes" (American, British, Indian, etc), "Norwegians" (Bokmal, Nynorsk, Riksmal, etc), "Germans", etc (though our entries currently don't, except in the first case). It may well be as technically inaccurate to label countries and languages as proper nouns as it is to label personal names as proper nouns. On the other hand, it seems to be common, among those dictionaries which use the label "proper noun", to label all of those types of thing as proper noun, and they do generally fit tests of proper-noun-ness. - -sche (discuss) 19:41, 3 February 2015 (UTC)
There's quite a few examples of plural place names: Aleutians (that entry needs splitting), Falklands, Faroes (Faroe Islands), Netherlands (Nederland in Dutch) to name a few. But languages (in my opinion) are mass nouns, instead of Englishes and Norwegians (the people are Norwegians), we should refer to forms of English, forms of Norwegian and so on. Donnanz (talk) 20:42, 3 February 2015 (UTC)
I think Netherlands (where the word for a singular country happens to be plural) is different from Frances (the plural of France, used to talk about e.g. different temporal or social incarnations of France). I can find several instances of Netherlands being pluralized, both invariantly ("the two Netherlands", a la "the two fish") and, rarely (and only "in the wild", not in places that meet CFI), as Netherlandses.
Hmm, mass nouns... that's plausible. Well, we have a fair few grammarians here, let's see what they think. Would you like to bring it up in the BP, or would you like me to?
DCDuring, does CGEL say anything about whether languages are nouns or proper nouns or mass nouns? For that matter, does it say anything about whether given names are proper nouns or not? (Apologies if you've answered the latter question previously and I'm forgetting.) - -sche (discuss) 21:50, 3 February 2015 (UTC)
I think the name Netherlands may be historical as it also took in the all the low countries including Belgian Flanders at one time. It is still referred to as het Koninkrijk der Nederlanden (qv Nederlanden). Anyway, I suppose I had better start a thread in the BP. Donnanz (talk) 22:20, 3 February 2015 (UTC)
@-sche: I don't see any explicit statement in CGEL that a name of a language is a proper noun nor that is any other type of noun. There is no reason why a proper name couldn't have a homonym that is a mass noun. Or rather isn't that just one of the generic secondary uses of many proper names, eg, "We've had too little Ruakh in our discussions lately." (The "too much" examples would cause trouble.) DCDuring TALK 23:53, 3 February 2015 (UTC)


Sorry for all the deletion requests. I was basing the original reconstructions on some outdated material. Thanks. --Victar (talk) 07:04, 5 February 2015 (UTC)

No problem. With wt:AWB, it's not that hard to delete a bunch of pages. - -sche (discuss) 07:08, 5 February 2015 (UTC)

dative -eEdit

Discussion moved to Template talk:de-decl-noun-n.

A friendly request to enable AWB useEdit

And also, could you remove edit protected status for CheckPage? I can't edit it. --Dixtosa (talk) 12:41, 7 February 2015 (UTC)

Sure, I can add you to the checkpage. :) I'm not going to unprotect it, though; it's supposed to be protected, as a safeguard against people who don't know what they're doing adding themselves to it. - -sche (discuss) 18:20, 7 February 2015 (UTC)

Using passer and sortir with êtreEdit

Do passer and sortir use être under exactly the same circumstances? Their usage notes are a little different, and I'm not sure if that's meant to imply that the terms use être under different circumstances or not. If they use être under the same circumstances, I'd like to reword Template:U:fr:may take être as much as needed and deploy it on both entries; otherwise, there doesn't seem to be a use for that template (it's currently unused and there's no point in templatizing usage notes that only apply to a single entry) and I'd like to delete it, unless you know of other entries that could use it. - -sche (discuss) 20:51, 9 February 2015 (UTC)

Yes, this template is OK, it applies to both entries, but a more complete list is (at least) descendre, monter, passer, redescendre, remonter, rentrer, repasser, rerentrer, rerepasser, reressusciter, reretourner, ressortir, ressusciter, retourner, sortir. This list is not limitative (when you add re- to a verb, this is the same rule). Actually, avoir or être is used depending on the meaning, and this is best explained with examples, but the template seems to be a good summary: when used transitively (or with a transitive sense, even when the complement is omitted), it's always avoir. Otherwise, it's être. Lmaltier (talk) 21:15, 9 February 2015 (UTC)
Thanks for the clarification! I'll clean the template up a bit and add it to those entries. - -sche (discuss) 22:53, 9 February 2015 (UTC)
Also note that using être is also systematic for pronominal uses of verbs: cf. je me suis trompé vs j'ai trompé. But this is a different issue, it's not limited to a few verbs. Lmaltier (talk) 06:58, 11 February 2015 (UTC)

Crucially important questionEdit

From which episode of QI do those words on your main page come? It's snowy in Tennessee, and there's nothing to do. JohnC5 05:20, 17 February 2015 (UTC)

@JohnC5 I believe it was the J series episode 13 on "Jobs". Those were all occupations people said they had in old British censuses. - -sche (discuss) 05:47, 17 February 2015 (UTC)
I have seen that episode! Probably deserves a rewatching... JohnC5 05:50, 17 February 2015 (UTC)

Questionable revertEdit

I would appreciate it if your reverts were a bit more careful. For instance here, I think that edit would have been fine since many people confuse UUers for a religious denomination. However most academics refers to it as a distinct religion. By highlighting the coordinate terms, it would have been clearer that this is a distinct religion. I'm disappointed with your knee-jerk finger-trigger like reactions. 16:16, 21 February 2015 (UTC)

The merits aside, someone with your long, ugly history of questionable and often downright awful edits (yes, it's obvious who you are, whatever IP you happen to be using at the moment) is in any position to criticize the people who have to clean up after you. Chuck Entz (talk) 16:45, 21 February 2015 (UTC)
I think it's better to put the coordinate terms, synonyms, etc in the lemma entry, rather than in all the various possible abbreviated forms (UUers, UUs, etc). - -sche (discuss) 17:48, 21 February 2015 (UTC)

allosexual entryEdit

Many thanks for your improvements, which were far above my Wiktionariological or semantic capabilities. Looks great! FourViolas (talk) 15:03, 24 February 2015 (UTC)

Trans and frequenciesEdit

You must have the frequencies form transman, transwoman, etc. wrong; please check Google Ngram Viewer. --Dan Polansky (talk) 08:46, 7 March 2015 (UTC)

No, Ngram Viewer clearly shows that the spaced form is more common in the case of trans woman (link, which looks like this to me — is it different for you?). For trans man, the unspaced form was still slightly more common at the time Google's data cut off (2008), but the spaced form was becoming more common while the unspaced form was becoming less common, so it seems likely that more recent data would show the same situation as with trans woman, i.e. that the spaced form is more common (especially in light of the proscription of the unspaced form by some authorities). - -sche (discuss) 08:54, 7 March 2015 (UTC)
For transwoman, my mistake: I used the default Ngram settings which ends in 2000[1], but when one extends the graph to 2008[2], the picture changes.
For transman, you are making the less common form[3] (factor 1.6) the main dictionary entry, with justification that relies on extrapolation rather than actual situation. When one combines this with the proscriptions expressed online, I am not sure what to think of this. --Dan Polansky (talk) 09:09, 7 March 2015 (UTC)
Well, to assume that the actual situation matches the situation a decade ago would also be making an assumption. It would be a reasonable assumption for most words, which have many decades of use, which have consistent (parallel) trendlines, and which the events of 2008-2015 can't be expected to have had much of an impact upon. (For example, couch and sofa.) In this case, however, the trendlines are divergent (and only go back about 15 years anyway), and increasing awareness by the general public of trans people's preferences can be expected to have influenced usage in the same direction as the trendlines were going when the data cut off. (Consistency with trans woman also plays a role.) - -sche (discuss) 09:48, 7 March 2015 (UTC)
You actually have a good point; the 2008 data is 7 ears old. To bet that the trend for transman has after 2008 developed in a way parallel to trends seen even before 2008 for transwoman seems reasonable enough. Fair enough. --Dan Polansky (talk) 15:06, 7 March 2015 (UTC)


Concerning this. Just a thought, but I'm not convinced that it's sensible to split the definitions. This is because it seems not clear in many citations (especially earlier ones) which sense exactly is meant, and more generally I suspect that the precise meaning lies on a continuum between the two rather than being neatly split into one or the other. At any rate that was my impression when I was working (briefly) on the word. Ƿidsiþ 07:52, 9 March 2015 (UTC)

A lot of citations are ambiguous, yes. However, enough are unambiguous that I don't think conflating them is appropriate, particularly because the distribution of meanings seems to have a temporal component, i.e. the meaning seems to have changed over time. Citations that refer to the past often explicitly refer to hijras as eunuchs, defined by anatomy, while contemporary uses often (mostly?) refer to the third-gender people, defined by social role/presentation. Some of the latter works even explicitly specify that (modern) hijras are not necessarily eunuchs: google books:"uncastrated hijra|hijras" gets a few hits, and google books:"castrated hijra|hijras" (which would be redundant if hijras were necessarily eunuchs) gets several more, including some like "the not-yet-castrated hijra", "they were indistinguishable from castrated hijras when crossdressed - clearly, becoming hijra as a livelihood required neither castration nor gharana affiliation", and "[they] may or may not be castrated. Hijra is a developed stage." Perhaps the solution is to make the two specific senses into subsenses of a broad 'coverall' sense? - -sche (discuss) 08:27, 9 March 2015 (UTC)
And then there are google books:"female hijras", who most of the citation make clear have attained hijra status by adopting a third-gender role and not by castration. These would be especially hard to work in to a 'coverall' sense — they would require it to be very broad indeed, to cover both eunuchs and women. Perhaps the solution is to have a {{qualifier}} or usage note explain that some uses don't distinguish male eunuchs from male-bodied third-gender people? - -sche (discuss) 08:44, 9 March 2015 (UTC)

Upper Franconian language‏‎Edit

User:Purodha added user boxes that triggered the creation of a whole bunch of bad language categories and redlinks by Babel AutoCreate- pretty much the gamut of nds-nl & nds-de lects. I've gotten rid of most of the redlinks by replacing the narrow-lect category link with the appropriate broader-language category link in the User categories that were created. The one holdout is Upper Franconian, code vmf (see Category:User vmf): I'm not really sure whether it's nds-nl or nds-de. Any suggestions? Chuck Entz (talk) 03:55, 18 March 2015 (UTC)

@Chuck Entz See here. —Μετάknowledgediscuss/deeds 04:12, 18 March 2015 (UTC)
It's been a while since I was looking into this, so I forgot some important details. Yes, it's High German, not Low German. If you follow the link to the Wikimedia discussion, it turns out that after we had deleted the vmf code, Ethnologue came out with corrections that led to vmf being deemed eligible for a wiki after all. Now that Ethnologue is no longer claiming that vmf applies to Mainz and Frankfurt, we may need to revisit the issue. Chuck Entz (talk) 07:03, 18 March 2015 (UTC)
Thanks for the heads-up. I have been busy, but I will look into it. (I wonder if they have also clarified frs any.) - -sche (discuss) 01:27, 19 March 2015 (UTC)
Apparently they have, it's now called "Saxon, East Frisian Low". (But the population count is still wrong, hmph.) -- Liliana 01:33, 19 March 2015 (UTC)
While you're here, what are your thoughts on the newly-redefined Upper Franconian? do you think it should be included? All the varieties of German are such a mess to pick apart into discrete lects... - -sche (discuss) 02:41, 19 March 2015 (UTC)
Ethnologue does a horrible job at the German dialects. It appears to cover some, but not all of them and it's generally a huge mess to work with. (I hope you've seen my newest BP topic regarding the Swiss German lects.)
Have you seen the current vmf entry? It says "Hessen state: mostly River Main area, east of Mainz and Frankfurt." How much Hesse is there at the Main east of Frankfurt? lol. They really can't figure out what they want with this code, and it doesn't help that it's called "Mainfränkisch" with "Ostfränkisch" being a supposed alternate name, even though Mainfränkisch is just one of many subdivisions of Ostfraänkisch.
I mean, we could theoretically use it for the Franconian lects, but... eh. -- Liliana 00:06, 20 March 2015 (UTC)

frs Module errorsEdit

These have been hanging around since you removed the frs code. There were 146 to start with. I've chipped away at a few of the obvious ones, but there are still about 135. The problem is, I don't know which ones are Saterland Frisian, which ones are East Frisian Low Saxon, and which are some unspecified extinct Frisian East Frisian dialect.

It won't do to have all of those module errors for an extended period- there's already been one unrelated module error that I only found out about by going through all 136 entries in the category (there's an error in a Korean module that's since brought the total up to 199). Do you think you'll be able to fix them soon? Is there anything I can do to help? Maybe User:Leasnam, who added most of them, might be able to help. Chuck Entz (talk) 03:50, 23 March 2015 (UTC)

I've been changing them as I see them...but the majority of those I've added, by the looks of them, represent a sampling of various unspecified extinct East Frisian dialects. Where I can connect them to a modern Saterland Frisian word I am updating them, but not universally. Sometimes I just change the code to stq to get rid of the error short term Leasnam (talk) 04:41, 23 March 2015 (UTC)
Ugh, this is one of the few downsides to our use of language modules rather than language templates: I thought I had cleaned up all the uses of frs. (I should have waited for and searched an updated database dump to be sure.) I would temporarily reinstate the code, except that Ethnologue clarified that it refers to the Low German lect, which means I'd be replacing missing information (module errors) with potentially incorrect information (it's often unclear whether uses of the code on here are meant to refer to Frisian or Low German), which I am not sure would be an improvement. I'll chip away at what I can. If an entry simply lists an East Frisian word as a cognate (not an etymon), and it's not possible to determine which precise Frisian-ic or Low-German-ic lect it belongs to, it can simply be dropped, IMO. - -sche (discuss) 04:52, 23 March 2015 (UTC)
I have no qualms about dropping a non-essential cognate. We can fix later if need be Leasnam (talk) 06:06, 23 March 2015 (UTC)
Here is the reference cited in the first appendix entry I looked at. It seems to be treating East Frisian as a whole, which would include not just Saterland Frisian, but also at least a couple of the extinct dialects. Maybe we need an exception code for Frisian East Frisian as a whole, or maybe we should make stq the code for the whole language. Chuck Entz (talk) 07:01, 23 March 2015 (UTC)
It would be sensible to do one of those things, yes. In the past I had proposed creating gmw-fre or gmw-efr for East Frisian, but there was insufficient support for that because it was at the time still unclear if frs really referred to the Low German lect. - -sche (discuss) 14:03, 23 March 2015 (UTC)


"not convinced that this form is German and not Latin, but w/e" -- even states that there's a vocative for Jesus and Jesus Christus: "Jesus [...] Anredefall: Jesus und Jesu", "Jesus Christus [...] Anredefall: [...] Jesu Christe" ("Anredefall" is German for English vocative). There most likely would still be an ablative (cf. "von dem Nomine" [Nomen], "von dem Corpore" [Corpus], "von dem/der Radice" [Radix]), but the ablative of (Latin) Jesus and Christus equals the dative and so duden only mentions a dative. Also, though it should be obvious: the vocative of Jesus and Christus can especially be found in religious song books and most likely religious prayers etc. -13:48, 19 April 2015 (UTC)

Changing the parent language of Yiddish from MHG to OHGEdit

(Pinging people who may be interested) @Metaknowledge, CodeCat, Angr

It is not clear that Yiddish branched strictly after the beginning of the MHG period. See for example section 7.25 in Max Weinreich's History of the Yiddish Language, where he concludes "Hence we have to postulate that Yiddish began to take shape as early as the Old High German period" (p. 424). Is this enough of a reason to change Yiddish's ancestors = from "gmh" to "goh"?

Another more difficult question would be whether to add Hebrew, Aramaic, Yevanic, and/or Judeo-Romance as an ancestors (which in some sense they are), but then again we don't put Frankish as an ancestor of French (perhaps we should?).

--WikiTiki89 18:34, 20 April 2015 (UTC)

I'd say the second question is the easier one: No. Languages that are the sources of loanwords—even large numbers of them—are not considered ancestral. Anglo-Norman is not an ancestor of English; Latin is not an ancestor of Albanian and Welsh; Italian is not an ancestor of Maltese; and Hebrew, Aramaic, Slavic, etc., are not ancestors of Yiddish. I have no objection to changing the parent language of Yiddish to OHG. —Aɴɢʀ (talk) 18:53, 20 April 2015 (UTC)
But they're not exactly loanwords, they're more like kept-words. Jews that spoke other languages and settled in German-speaking areas, slowly and gradually adopted more and more German words and grammar, keeping many words and grammatical structures from their former languages, especially from Hebrew. This had already happened several times before and so the Hebrew words and grammatical structures were direct continuations from when Hebrew was their native language. This is different from loanwords, which speakers of one language simply borrow from another language. I presume that there was similar situation with French and Frankish, although I have never read about this and far fewer Frankish words survived in French for it to be significant. --WikiTiki89 19:08, 20 April 2015 (UTC)
Contact languages of any kind are going to be impossible to represent accurately in terms of choosing a language as a "parent". MHG seems no less (in)accurate to me as compared to OHG; during both time periods, there was an attested Jewish form of the language written in Hebrew script that had a lot of Semitic vocabulary. Yiddish has some differences in sound changes that allow us to estimate its general point of divergence, but the differences do not seem to be particular to Yiddish so much as features of some of the High German lects (not the one(s) that led to Modern Standard German). In the meantime, I think keeping it as MHG is perfectly fine, considering that MHG already represents a span of varying lects within certain parameters of time and space which arguably include the Jewish varieties. —Μετάknowledgediscuss/deeds 19:19, 20 April 2015 (UTC)
Well Weinreich says on the same page as the quote above "Yiddish speakers were in close contact with German speakers, and it need not occasion surprise had the German component of Yiddish, although already part of an independent language, continued to be affected by changes that took place in the German determinant." I don't know whether you find that contradictory to your point or not. --WikiTiki89 19:33, 20 April 2015 (UTC)
The ancestry of Yiddish is the subject of some disagreement. Wikipedia calls the view of a MHG origin a "prevailing" view. Bernard Spolsky (The Languages of the Jews: A Sociolinguistic History, 2014, page 157) says "The basis for Yiddish was a Middle High German dialect, for Yiddish often agrees with Middle High German rather than with modern German[.]" And Paul Wexler (Two-tiered Relexification in Yiddish, 2002, page 133) goes so far as to say "there are no specific Old High German phonological or lexical features in Yiddish (see Simon 1991: 253)." But Wexler believes the ultimate origin of Yiddish is actually Slavic, and the Germanic content is the result of relexification in the 9th to 12th centuries; indeed, his full sentence (emphasis mine) is "The first relexification to German took place in the Middle High German period, to judge from the fact that there are no specific Old High German phonological or lexical features in Yiddish." In turn, Weinreich says what you quote, but Wikipedia says that his model also posits that "Jewish speakers of Old French or Old Italian, who were literate in Hebrew or Aramaic, migrated to the Rhine Valley, [...] encountered and were influenced by Jewish speakers of High German" and that the ultimate origin of Yiddish is the fusion of all this, not simply OHG.
Perhaps we shouldn't list a parent at all?
De facto, we more often give OHG words than MHG words as the etyma of Yiddish words. (In the past, some entries gave modern High German forms as etyma, but this was known to be problematic and has for the most part been addressed.)
- -sche (discuss) 22:08, 20 April 2015 (UTC)
The way I see it is that listing MHG as a parent implies also OHG, but listing OHG as a parent does not imply MHG. So if we are unsure about MHG, then listing OHG is not wrong. But what actual consequences does listing the parent in the module have? What got me thinking about this was when I was adding פֿאָרן ‎(forn) to *faraną and was unsure whether to put it under MHG or under OHG. Perhaps this should be decided on a word-by-word basis. If we know a word came from MHG, then we will list it under MHG, if we know it did not, then we would list it under OHG, and if it is unclear, that is where we need to choose a default and where I think OHG would be a better choice. --WikiTiki89 22:33, 20 April 2015 (UTC)
Frankish isn't really an ancestor of French: there were an awful lot more of the Romance-speaking Celts then there were Franks, so the Franks were somewhat like the Mongols in China- more important historically than linguistically. Chuck Entz (talk) 03:32, 21 April 2015 (UTC)
Ok, then my comparison to French/Frankish was wrong. My point remains about Yiddish/Hebrew. --WikiTiki89 14:15, 21 April 2015 (UTC)


Pertain, which pertaining is just a modified version of, is defined here on English wiktionary as "Verb[edit] pertain (third-person singular simple present pertains, present participle pertaining, simple past and past participle pertained)

(intransitive) to belong (intransitive) to relate, to refer, be relevant to" The "to belong" sense of pertaining is already covered by "of pedophilia", the "to relate" sense is already covered by "related to pedophilia", so it is redundant. Although its not necessary to be as simple here as on simple English wiktionary, its still important. Its best when writing to write in simple language, not complex. There is a book about this topic by H.W. Fowler called The King's English, you should read it. His first points in the book are, prefer simple words to complex words, prefer short words to long words, prefer common words to unusual words, and prefer Germanic words to Romance words. He would agree with me that pertaining would need to go in this case. --PaulBustion88 (talk) 02:12, 30 April 2015 (UTC)


Howdy-doo! I was just curious where you found the meaning of incomplete. It seems closely related to the meanings I've seen, but not quite the same. Just thought I'd ask. —JohnC5 04:01, 8 June 2015 (UTC)

I saw it in The Century Dictionary (1914) defined as "characterized by incomplete metamorphosis", and that sense is suggested by citations like "cockroaches, grasshoppers, lice, true bugs, and so on, undergo paurometabolous or incomplete development" (Foundations of Wildlife Diseases, 2014, ISBN 0520958950, page 126). That citation is why I offered the shorter gloss "incomplete" before the semicolon, btw (since "paurometabolous development" is not "development characterized by incomplete development"). It's probably not a separate sense, and could be removed if sense 1 were expanded a bit. Btw, Century has a second sense, "of or belonging to the Paurometabola", which is defined as "in Brauer's system of classification, those insects in which the metamorphoses are slow, inconspicuous, and very incomplete, as the Orthoptera". The former looks like a candidate for Category:mul:Taxonomic names (obsolete). - -sche (discuss) 05:10, 8 June 2015 (UTC)
Based on the wiki page for w:Hemimetabolism, I believe the word incomplete is used to mean "not executing all of the normal stages of metamorphosis," as opposed to "failing to complete metamorphosis." The ambiguity lies in that the members of Paurometabola succeed at their form of metamorphosis, but this metamorphosis does not conform to the standard metamorphic pattern. I might suggest abridged or atypical as opposed to incomplete because the latter most sounds like the bugs never succeed at maturing, which is certainly not true. Does this sound reasonable to you? —JohnC5 21:26, 8 June 2015 (UTC)
Good point about w:Hemimetabolism. Actually, why don't we just link to that page? See what you think of my change to the entry, and feel free to undo or expand upon it. - -sche (discuss) 00:23, 9 June 2015 (UTC)
Looks good to me! :)JohnC5 00:56, 9 June 2015 (UTC)

Partition verb senses by grammar, semantics, register/topic/context?Edit

Looking at your excellent, extensive work on take reminded me of a question that bothered me about sense division, especially in verbs (though it comes up in other word classes).

Which of the various possibilities should take precedence in grouping definitions? For verbs, most dictionaries divide definitions into transitive and intransitive and, as a result, have some redundancy and obscure some semantic relationships. I often feel that certain groups of registers/topics, eg, sports, games, nautical, belong together no matter whether there are semantic reasons to split them. Some would group all archaic and obsolete senses.

We already split some semantically analogous senses by PoS eg, adjectives and adverbs, conjunctions and adverbs, conjunctions and pronouns, prepositions and adverbs, adverbs and nouns (eg, home). These splits make it harder to see the semantic similarities. Have we written off that kind of semantic visibility? Do we have to?

My natural inclination is to have grammar take precedence, but I'd be happy to hear arguments for the other possibilities. DCDuring TALK 20:31, 10 June 2015 (UTC)

Working on take got me to thinking about sense grouping, too. I don't desire to adopt other dictionaries' practice of separating transitive and intransitive verbs, I only separated them on take to make the entry easier to work on. Now that I'm finished adding senses, I'll probably go back and interweave the transitive and intransitive ones, since I think it's better to group definitions/senses according to meaning. Separating transitive and intransitive senses often obscures the fact that some senses are ambitransitive (as here, where it resulted in what was basically the same sense being listed twice) or ergative.
Separating different parts of speech seems to me like a good practice to continue. The cases where it proves difficult (however) or could be regarded as obscuring semantic connections (home) are too few and far between to justify abandoning the practice.
- -sche (discuss) 05:20, 11 June 2015 (UTC)
If an English L2 section is to be read as some kind of structured, terse essay on a term, then it certainly makes sense to group somewhat semantically.
OTOH, if an English L2 section is intended to help an ordinary user find a definition, at least some users would benefit from a transitive/intransitive split, which would support faster scanning for the possible definition. (This argument also favors topical labels, which I have, perhaps wrongly, opposed.)
Another consideration is entry maintainability. Of course, to tinker with your efforts would be gilding the lily, but it is easier to assess, analyze, and repair the range of coverage of a set of definitions, if the set can be made smaller on some easy-to-determine grounds, like the hard grammatical distinction of transitivity/intransitivity. DCDuring TALK 12:41, 11 June 2015 (UTC)
You are right that transitivity is an (possibly the only) easy-to-determine hard-and-fast distinction, and that segregating senses according to it could help people find specific senses. I'm not strongly opposed to it, I simply think semantic grouping is better. Where would ambitransitive and ergative senses go if senses were split by transitivity? In sections all their own, e.g. between the transitive and intransitive senses? (That would seem a bit awkward, but not outright problematic.) Or would they be duplicated and placed in both the transitive and the intransitive section? That would seem unhelpful to English-speakers, though perhaps helpful to translators (if they have distinct translations in some languages, which seems likely).
Other ways of sorting verb senses are by age (oldest—or newest—senses first) and by commonness (most—or least—common senses first). I suppose those are not mutually exclusive with grouping senses by meaning or transitivity.
Perhaps someone will devise a gadget that will give users buttons, similar to the "show/hide quotations" buttons but located e.g. at the top of each POS section, which will allow users to optionally hide senses with certain tags, e.g. obsolete, archaic, transitive, intransitive, even US (if a user knows they're searching for a sense Brits use), UK, etc.
- -sche (discuss) 21:22, 11 June 2015 (UTC)
We could have sortable tables of definitions! Ugly, and needing a lot of artificial data to generate what we think is appropriate. Or we could let users run SQL queries against a database of definitions.
I've never been convinced of the utility of ergative and other high-falutin' linguists' labels for the supposed 'normal' users, if indeed we have any 'normal' users. Those mostly seem good for making sure that someone working on an entry checks to make sure that the appropriately reworded definition appears in both transitive and intransitive sections, ie, duplicate underlying semantics.
After group by the hardest of grammatical distinctions, I would group semantically, preferrably using subsenses, ordering the senses by date of attestation of the sense (in principle) or degree of concreteness (which might coincide with date of attestation for the definition in the language or an ancestor. Subsenses would follow the same ordering principle within the sense. But recourse to attestation actually means relying of OED for many words, though not so much for more recent sense development.
As we don't really have a clearly dominant approach, I think we can still let contributors do it the way they want to. I would not impose my ideal grouping and order on an entry that was a good example of another set of organizing principles and hope that no-one would waste time merely reordering and regrouping mine, unless there was a good reason (clear error, reorganizer actually working from the OED, etc). DCDuring TALK 22:11, 11 June 2015 (UTC)

Orange links and ACCELEdit

Hi. Is there any way to combine the orange link gadget with the WT:ACCEL one? --Type56op9 (talk) 17:36, 13 June 2015 (UTC)

Not that I'm aware of (I think people have asked about that before). It would be useful, though. You could ask in the Grease Pit. - -sche (discuss) 18:18, 13 June 2015 (UTC)
(edit conflict) Not as such. Acceleration works by adding preloads to a redlink, which requires that there be nothing there. One would have to have an app to add a language section to an existing entry, which would require different methods. It may be possible (bots certainly have no trouble with it), but it wouldn't be a trivial exercise. Chuck Entz (talk) 18:21, 13 June 2015 (UTC)
My illegal bot made such additions the time. But then it got blocked, so I had to hide the fact I was using a bot by changing the code. Then people figured out I was still using a bot. However, if this new orange-accel tool was around, I could use the illegal bot again, and pretend I was using the tool. Everyone's a winner! --Type56op9 (talk) 18:26, 13 June 2015 (UTC)


On the "Greek" page, that was a filter that I put on my computer that did that. I'll have to make sure to check that in the future to make sure that it doesn't sneak into my edits by accident. The filter replaces words with "[word deleted]". I installed the filter because too many people were swearing left and right on many of the websites that I visit, and I grew tired of seeing it.

But yes. xD

That was pretty funny. My bad. Tharthan (talk) 18:11, 15 June 2015 (UTC)

Ah, thanks for the explanation; I had wondered why it flagged "clit" but not "anal sex", haha. Thankfully people around here don't swear that much (not that I mind) — I guess it's to be expected that dictionary-editors know more articulate ways of expressing themselves. - -sche (discuss) 18:17, 15 June 2015 (UTC)
Yeah, frankly I would have set it to change each word to a clean synonym, but the filter in question only allows for one all-encompassing replacement (which kind of stinks, because it reminds me of those old IRC-type chatrooms that just replaced vulgarities with asterisks rather than creatively write around them). But it's the best I can find for Firefox.

By the way, I have to ask:

You said that the main criterion for cited sources is that they must be durably archived. Are there any exceptions to that? Do we allow citations of tabloids or other "buzzword books" that may indeed use a neologism or retronym for over a year but be truly the only ones to do so. Tharthan (talk) 18:47, 15 June 2015 (UTC)

Durably-archived tabloids are allowed; they aren't prestiguous, but their vocabulary is part of the great big grab-bag which is the English (or German, etc) language. Terms which are "neologisms", "slang", "informal", "rare", etc should certainly be marked with those labels, however, and in exceptional cases one can write usage notes.
What kind of "buzzword books" do you mean? Books that define and then give made-up examples of slang are disallowed by WT:CFI#Conveying_meaning, which "filters out [...] made-up examples of how a word might be used". But authors who like to work as many words from those kinds of books into their own literature, well, they're allowed. I got the impression that Georgette Heyer copied words from the 1811 Dictionary of the Vulgar Tongue and pasted them into her dialogues, sometimes clumsily. In fact, that makes me realize [4].
If a work is of such low quality that one can't be sure it is in fact using a given word (as opposed to unintentionally containing a string as a typo or misspelling), it is generally excluded, however (because CFI requires evidence of use). So, a citation like "Berlin, Germany has many ihstoric stires, as do most other cities in Germanny." would probably not be accepted as evidence that "Germanny" is an alternative spelling of "Germany". (But a book from 1600 that said "Southern Germanny is a Land of mannifold historickal Constructions, of a Roman Charackter" would suggest that "Germanny" was once an obsolete spelling of "Germany".)
- -sche (discuss) 21:11, 15 June 2015 (UTC)


Hallo -sche,
nach längerer Zeit habe ich mal wieder eine größere Bearbeitung getätigt und dabei den oben genannten Eintrag erweitert. Könntest du mal bitte drüberschauen und etwaige Format-, Formulierungs- und Übersetzungsfehler korrigieren. Danke im Voraus und lieben Gruß dir, Caligari ƆɐƀïиϠ 06:02, 16 June 2015 (UTC)

Natürlich; und lieben Gruß auch dir! PS, there must be something in the air (as they say) causing people to undertake big multilingual projects, since I just attempted one in the other direction, expanding (take and then) de:take. - -sche (discuss) 09:16, 16 June 2015 (UTC)
I guess my English got a bit rusty. So again, many thanks for your swift corrections. Each and every correction will improve further editings...hopefully :-).
@de:take: Wow! Indeed. Great job so far with regards to the massive content expansion. Let me know when you think you completed expanding "take". There are some formatting issues that I'll let you know on your German user talk once you've done with expanding. There need to be some "Feintuning" with regards to the format. As an advice I would recommend that you take a look at articles in de:Kategorie:Polnisch, de:Kategorie:Tschechisch or de:Kategorie:Schwedisch. If you need specific help, don't hesitate to let me know.
Lieben Gruß dir, Caligari ƆɐƀïиϠ 15:33, 16 June 2015 (UTC)

Moinsen. WT: ANDS.Edit

Moinsen. Ich biete dies: User_talk:Korn/sandkist Korn [kʰʊ̃ːæ̯̃n] (talk) 13:57, 20 June 2015 (UTC)

Merging the German and Dutch lects... bleh. I don't oppose it, or support it. (As I wrote further up on this page, "the general disagreement and slow-motion edit-warring about how to handle the various Low German lects makes for so much ugliness that I am losing interest in editing them" at all.) I strongly suggest, almost to the point of insist, that one orthography should be chosen for forms to be lemmatized on / normalized to (I don't know if this is what you intended the "consonants" and "vowels" sections to do), so that we don't end up with five entries lemmatized five different ways, representing the same diphthong five different ways, as if all the words were pronounced differently, when in fact they just use different orthographies or have predictable dialectal variation. Nouns should uniformly begin with majuscule letters, or uniformly not do that, for the same reason.
I've made a few typofixes and other small changes, e.g. dropping the Dutch spellings of "coïnciding" and "reëmergence". Also note that merging Plautdietsch would need discussion quite apart from merging GLG and DLS, because people (e.g. Angr, and me) in past discussions have supported keeping it separate on account of its separate history and development on another continent.
I also suggest either dropping the "During Middle Low German [...] Central and Upper German" line, or rewriting it to give native forms (we'll have to suck it up, bite the bullet, and perform whatever other idioms are necessary to give one dialect's forms as examples) so that it doesn't imply Low Germans actually used the words "German", "Low Landic", etc, especially given that "Low Landic" gets all of four Google hits. (Alternatively, a phrasing like During Middle Low German times, the language was known by cognates of the terms "Dutch", "Saxon", "Netherlandish" or "Netherdutch" would technically be accurate, but confusing to the uninitiated.)
- -sche (discuss) 17:26, 20 June 2015 (UTC)
Ganz ruhig. Ich glaube, Du verstehst meine Intention falsch. Der von mir geschriebene Text sollte ein Ausgangspunkt für ein Gespräch zwischen uns beiden über die Änderung des ANDS sein. Die derzeit existenten ANDS-DE, -NL und PDT sollte das noch gar nicht berühren, weshalb sie auch nicht erwähnt sind. Die Sektion über die Konsonanten und Vokale soll interessierte Autoren und Nutzer nur darauf hinweisen, dass eine Schreibung nicht bedeutet, dass überall dieselbe Aussprache vorherrscht und ggf. zu weiteren Eintragungen im Pronunciation-L3 anregen. (Oder wenigstens überhaupt welchen.) Von der Plautdietsch-Geschichte bin ich nicht überzeugt, da sich Plautdietsch kaum bis gar nicht von anderen Dialekten unterscheidet. Und den Teil mit den native forms verstehe ich ganz einfach nicht. Es klingt, als würdest Du befürchten, dass die Leser fälschlicherweise denken, dass die Holländer sich tatsächlich mit englischen Worten benannt hätten. Korn [kʰʊ̃ːæ̯̃n] (talk) 18:22, 20 June 2015 (UTC)

Old Italic display helpEdit

Hello! Remember this discussion way in which you mentioned you make fonts? Well, this is not exactly that, but I have been working on making Appendix:Old Italic script with all of the relevant Old Italic languages (I still need to add Raetic, Camunic, Lepontic, etc.). I will then use this table as a references to create Module:Ital-translit which will service all of the Old Italic languages. I thought that it would be very nice to be able to show all the different letter forms that would map to any given Unicode letter. The documentation for how the Unicode block is defined is here and contains descriptions of all the different letter forms for each sub-script (in section 3). I was hoping you (or someone you could suggest) might be able to create PNG's for the use in {{t2i}} so that we could display all the Old Italic letter forms both in this appendix and potentially in the mainspace for quoting inscriptions. I know that this isn't a high priority for anyone, but now that I've started, I've gotten quite excited about the whole business. Below are some other reference materials for all the scripts. I'm not hoping for every little variation of every character, but if you make PNG's for the major ones, I'll do all the rest. Also, if this is just too much work, just tell me. —JohnC5 21:11, 24 June 2015 (UTC)

Hmm, I'll see what I can do. Btw, I notice the Glagotic t2i images are a mix of svgs and gifs, although svg versions exist for at least some of the gifs and could be swapped in. - -sche (discuss) 20:01, 26 June 2015 (UTC)
Yeah, that is rather weird. I have not idea how why that is the case. Also, the behavior for which I asking you is a little different than the normal t2i behavior, because I would want {{t2i|a|a2|a3|a4|a5}} to be different versions of the same letter. Just making sure you understand that for which you signed up.
Also, thanks! —JohnC5 23:19, 26 June 2015 (UTC)
Hey again. Sorry to pester, but is there any progress on this? I want to have a discussion/take a vote to solidify the mapping of characters used in Module:Ital-translit and Appendix:Old Italic script since some of the character transcriptions (specifically those in South Picene and Camunic) are very odd. Having these for the discussion would be very useful. And again, if this is too annoying to do, please tell me. —JohnC5 06:29, 8 July 2015 (UTC)
Thanks for the poke.
The various letter-forms in the images you showed me are all, for lack of a better word, very line-y (as opposed to calligraphic like pen- or quill-and-ink handwriting, which is what I'm more used to designing fonts based on). I did mock up variants of the A in a style somewhat like the images of the Glagolitic letters, but finishing all the alphabets in that style would take quite a while. I was going to try jotting all the letters on paper and scanning it and autotracing it into a png or svg, and then post an update, but I've been busy. Hmm, you could try it yourself — and I hope that doesn't sound rude; I'm not saying "grr, do it yourself", I just mean that you could probably do that as well as I could. And if I do later find time to make more calligraphic letters, they could always be swapped in. - -sche (discuss) 07:02, 8 July 2015 (UTC)
No worries. I guess the whole making-png's-and-formatting-them-and-uploading-them thing would have somewhat of a learning curve for me. I didn't really need calligraphic versions―I was more hoping for just boring, old line versions of the different letterforms so I could disambiguate them in the appendix. It's kind of frustrating how many ways each character can appear, and having them all in a row would be useful. Is there anyone else you could recommend for this because I understand how making an all-lines-all-the-time font could be kind of dull? —JohnC5 07:19, 8 July 2015 (UTC)
@JohnC5 OK, I've made a batch of letters and variants, which can be found at commons:Category:Italic letters. I traced a picture of an inscription, which is why the 'C' for instance is not a perfect circle; I will probably go back and make geometric 'perfect circle' variants at some point. I haven't done the whole alphabet yet. - -sche (discuss) 22:53, 10 July 2015 (UTC)
You're the coolest! —JohnC5 00:32, 11 July 2015 (UTC)
@JohnC5 Uploaded some more. Sorry this is taking a while. Think we should make a table to show all the forms (a bit like Wiktionary:Gothic transliteration but probably vertical rather than horizontal)? - -sche (discuss) 02:19, 4 August 2015 (UTC)
Thanks for your help with this; they look great. I seem to have bitten off more than I can chew at the moment. Feel free to add them to the table as you see fit, or keep pestering me. Please keep pestering me. —JohnC5 02:52, 4 August 2015 (UTC)
For now, I'm storing these in Appendix:Italic script. By the way, I notice commons:Category:Etruscan letters and commons:Category:Oscan alphabet already have some letterforms in them. - -sche (discuss) 05:39, 8 August 2015 (UTC)
@JohnC5 Let me know if anything you need is missing from Appendix:Italic script. In each section, the first gallery / row are letter-forms I drew and the other rows are letter-forms which I discovered already existed on Commons. - -sche (discuss) 00:30, 9 August 2015 (UTC)
Wowzers. Thanks so much for all this work. My next task will be to load them all into {{Ital2img}} and then use that to populate Appendix:Old Italic script with the appropriate letterforms. Both steps may take a while in turn. I feel, however, that this will greatly clarify the equivalency of the different symbols across sub-alphabets.
PS: Is there an abbreviation for the Appendix namespace like there is for Wiktionary (WT). I feel like I've wasted several years of my like writing out the word Appendix. Just think if you could write out APP:AITAL. That would be magical. —JohnC5 00:41, 9 August 2015 (UTC)
There is not, but we do have a few cross-namespace redirects using the WT: shortcut. You could create WT:AITAL pointing to the appendix namespace (or even move the appendix into the Wiktionary namespace). Feel free to change the format of that page, btw. - -sche (discuss) 01:25, 9 August 2015 (UTC)

Other resourcesEdit


Hey, there's probably a better way to put it, but at double-team I wanted to express that it suggests two people penetrating. One person can double penetrate with fingers and/or dildos, but one person can't double-team, AFAIK. WurdSnatcher (talk) 03:03, 10 July 2015 (UTC)


I direct you to Special:AbuseFilter/41 and Wiktionary:Requests for verification#agyrophobia. Also, aWa will not automatically recognise the discussion result if you forget to embolden it. Keφr 11:56, 15 July 2015 (UTC)

Duly noted, thank you.
The filter says "Of the last 8,991 actions, this filter has matched (0.00%)", is that just because it's turned off? I've turned it on, but set it to only flag edits. We can see how that works and then potentially upgrade it to warn or stop editors. - -sche (discuss) 00:22, 17 July 2015 (UTC)


Irritating Wikipedians is a feature, not a bug. It prompts them either to drop the assumption that this project is run like Wikipedia, or leave. (Well, it did the former for me at least. And surely there are some that cannot do either, which means they should be blocked.) —Keφr 06:37, 25 July 2015 (UTC)

People shouldn't be importing e.g. navboxes from sister projects (and I don't think we need {{reflist}}). Having a redirect from the name that every other project (Commons, Meta, en.WP, Simple English Wiktionary, Voyage, Source, Quote) uses to the name we use for the same thing just seems helpful, not only to users from everywhere else but also potentially for those users here who complain about every keystroke they have to type... since tl is shorter. - -sche (discuss) 09:16, 25 July 2015 (UTC)

Lean keepEdit

You wrote "Lean keep per Equinox". What does it mean? Are you leaning towards a keep vote (but not quite sure), or is it an adjective, a sort of "lean" or thin/skinny/ephemeral keep, like a "weak keep"? Equinox 08:28, 25 July 2015 (UTC)

Leaning towards keeping. Ah, the terseness and ambiguity of our RFD jargon. The phrase "RFD-failed" is worse; a passing 'pedian at one point questioned me why I had deleted something if the "request for deletion failed". - -sche (discuss) 09:19, 25 July 2015 (UTC)
Yep everything is bloody awful. Thank you for explaining. Equinox 09:24, 25 July 2015 (UTC)
If it's not annoying, maybe I could suggest "weak __" for "lean __". I don't like placing the "vote" (weak or otherwise) if I'm not convinced, so I don't use it. But. I think I've occasionally written "weak oppose" etc. where I didn't like something but couldn't be bothered to explain why. It just needs a few of us to kill change by apathy. Hurrah. Equinox 09:26, 25 July 2015 (UTC)
Why not use "RFD deleted"? --Dan Polansky (talk) 10:01, 25 July 2015 (UTC)
I used to write "deleted" and someone scolded me and told me to write "failed". Equinox 10:05, 25 July 2015 (UTC)
They should not have scolded you; did they perhaps confuse RFD with RFV? Many RFDs are closed as "deleted"; it is a common practice, and one that makes sense. I prefer to write "RFD deleted" rather than just "Deleted", in keeping with "RFV passed", "RFV failed", and "RFD kept", in boldface; the point is to make the closure clear and distinct as a closure, and indicate which process is being closed. But again, "deleted" is fine, and multiple people used it quite recently, including bd2412. I actually think "RFD failed" should be banned as a closure. --Dan Polansky (talk) 11:13, 25 July 2015 (UTC)
I agree that "deleted" is clearer than and preferable to "failed". I suspect uses of "failed" are due to thinking of RFD (and RFV) as a process for deciding whether or not to keep an entry (an entry is deleted pursuant to the process = it fails to be kept). The deletion summary "Failed RFD, RFDO; do not re-enter" seems to conceptualize it in this way. I've boldly changed it. Several other deletion reasons in that list are redundant or need cleanup, IMO. - -sche (discuss) 22:39, 25 July 2015 (UTC)
You mentioned the two "No usable content given" lines: I added the one with "Please see WT:ELE" because there were enough cases where I was adding it by hand, but there are also plenty of cases where ELE wouldn't have helped. Chuck Entz (talk) 03:14, 26 July 2015 (UTC)

Wiktionary:Vietnamese transliterationEdit

By creating this page, you caused all instances of {{vi-noun}} that include Nôm transcriptions to display a link to this page. Where in Wikipedia is the reader expected to look? The Nôm script predates the Latin-based Vietnamese alphabet, so I want to make sure it doesn't sound like the given Nôm characters are derived from the alphabetic words somehow. – Minh Nguyễn 💬 06:39, 29 July 2015 (UTC)

I created several such pages following Wiktionary:Grease pit/2015/July#remove_junk_from_Special:WantedPages. It was my impression that a (black) link was already present even before the page existed, so my edit was just to clear it off of Special:WantedPages, where it sat because of how many entries linked to it even without it existing. Feel free to add more informative content or even delete the page. Ideally, the template/module that inserts the link should be rewritten the way Module:IPA was recently, to only add links for the small number of languages which have transliteration schemes documented on Wiktionary, rather than performing an expensive check (as it does now) to see whether or not the dot (which, as an aside, I doubt very many people notice in any language) should have a blue link or be black. - -sche (discuss) 06:59, 29 July 2015 (UTC)


Can we change the primary name of tmh (in Module:languages/data3/t) from "Tamashek" to "Tuareg"? tmh is the macrolanguage containing thv ("Tahaggart Tamahaq"), taq ("Tamasheq"), ttq ("Tawallammat Tamajaq"), and thz ("Tayart Tamajeq"). "Tamashek" is just an alternative spelling of "Tamasheq" and makes it very confusing. Also, "Tuareg" is simply a much more widely used name for these languages. --WikiTiki89 15:40, 6 August 2015 (UTC)

Yes, "Tuareg" would be a clearer name for it. Should we even have tmh at all, though, if we include its subvarieties as separate languages? (I note that ber, the macro-macro-language code containing tmh, was deprecated in favour of its subdivisions.) - -sche (discuss) 19:20, 6 August 2015 (UTC)
I personally feel that Berber is overdivided. I'm not an expert, but it seems Tuareg languages are all relatively mutually intelligible (see here, for example) even if they have different realizations of some consonants (evident in the language names I listed above). So maybe we should merge all of Tuareg into one? The simplest thing for now, though, is to just rename tmh to Tuareg. --WikiTiki89 19:44, 6 August 2015 (UTC)
Yes, deprecating the sub-dialect codes in favour of tmh would also work. (And yes, Berber is quite over-divided...) - -sche (discuss) 19:57, 6 August 2015 (UTC)

northern fur seal translations for WOTD?Edit

k'oon is soon (10 August) to be a foreign WOTD. I have added entries for Callorhinus ursinus and northern fur seal. Could you take a look? Also, if you can find any Native American translations, they would make northern fur seal more interesting. The seals apparently ranged as far south as Baja. I've also left a note for Chuck Entz, as this might really be in his wheelhouse. DCDuring TALK 16:09, 6 August 2015 (UTC)

I tend to know more about the languages on the other (Atlantic) coast, but I'll see what I can do. - -sche (discuss) 19:46, 6 August 2015 (UTC)
We are lucky if we get folks to click through at all, let alone look at translations, let alone be impressed. So only modest effort, with high likelihood of success, is worthwhile. Thanks. DCDuring TALK 19:58, 6 August 2015 (UTC)
There's a Tlingit translation here, which I think might be x̲'ún or x'ún in the orthography used by the current entries. Also, I wonder about the "hair seal" and "big seal" in this Yurok reference- could one of those be the northern fur seal]? Chuck Entz (talk) 21:19, 8 August 2015 (UTC)
I made an assumption, based on the distribution of fur seal species, that in any native northern Pacific language a word for fur seal had as its original referent the northern fur seal, whatever else might now be covered by the word. Hair seal seems likely. I could not venture a guess about big seal, as I don't know what seals have been extant on the Pacific coast of North America. DCDuring TALK 21:30, 8 August 2015 (UTC)
The northern elephant seal could easily be the referent for a term that glosses as "big seal". DCDuring TALK 21:34, 8 August 2015 (UTC)
Yurok, since it is Algic, I know a bit about: chkweges, which that work translates as "hair seal", is indeed the northern fur seal, Callorhinus ursinus. As for Tlingit, we do seem to use x̱ in pagetitles, so I think x̱'ún is the orthography to go with (some of our entries currently use , but this strikes me as wrong). - -sche (discuss) 21:32, 8 August 2015 (UTC)
Take a look at our entry for hair seal, and my revision of it. It is confusing that several references (not just the Yurok one) gloss as "hair seal" words that mean "fur seal". - -sche (discuss) 21:40, 8 August 2015 (UTC)
Maybe I was too hasty on hair seal. I can't imagine that any people that depended on seals for food, clothing, etc could fail to make a distinction between seals with fur and those with only hair, the latter being good for storage, portage, kayaks etc, more than for clothing, where animal fur would be valued for warmth. But I couldn't find in the Yurok reference a distinction between "hair" and "fur". Human hair, at least, seems to be the referent for words that included the morpheme "lep". It may be that the Yurok "big seal"/"sea lion" vs "hair seal" distinction (or at least that of the author of the lexicon) is close to ours between eared seals (Otaridae, which include the fur seals, but also include sea lions, which do not have fur) and earless seals (Phocidae). DCDuring TALK 23:33, 8 August 2015 (UTC)

Whitelist nominationsEdit

(tried responding back at the Whitelist, but I apparently don't have permission to do so – I apologise for posting here)

I checked Redboywild's edits and they seem to be ok – formatting is correct and I couldn't find a single mistake or bad translation. So I see no reason why he shouldn't be whitelisted. Thank you for consulting me about it :-)

PS: Just found out that this user has been warned a couple of times in the Romanian Wikipedia and blocked once for introducing obscenities. This happened some time ago and he hasn't done it since. He has probably – and hopefully – matured, but I'll keep an eye on his edits so they're up to par. --Robbie SWE (talk) 15:46, 10 August 2015 (UTC)

Oh, apologies, I forgot you were only a sysop on ro.Wikt and not here. Thanks for the input. - -sche (discuss) 17:39, 10 August 2015 (UTC)

Two spellingsEdit

I have a question: Are außlegen and meßen pre-1996 spellings? --Lo Ximiendo (talk) 02:58, 17 August 2015 (UTC)

In one sense, yes — they were used in the 1600s, and the 1600s are before 1996. But in practical terms, no — when it comes to categorization or the like, "pre-1996" refers to spellings which were still standard right up until 1996, which these weren't. - -sche (discuss) 07:18, 17 August 2015 (UTC)

Sardinian translationsEdit

If you weren't already (painfully) aware of this: see Category:Pages with module errors, which seem to be all translation and descendents sections. I've cleared a few, but it's slow going with the translation sections hidden. Also, I noticed that there were also a couple of minor Sardinian lects that weren't affected. Chuck Entz (talk) 13:01, 17 August 2015 (UTC)

Sigh. As I lamented about Frisian, these translations went un-updated because good translations have been invisible (short of searching a database dump) ever since we switched from templates to Module:languages, as opposed to ttbc and t-check translations, which are categorized. Perhaps all {{t}}s should put entries into hidden categories like "Entries with Sardinian translations".
Now that they're all in Category:Pages with module errors, I'll just plug that into AWB and go through them.
If you're referring to Gallurese and Sassarese, I didn't merge them because (as I wrote here) they are despite their names not unequivocally considered dialects of Sardinian; rather, they're often considered dialects of Corsican (co) or transitional between Sardinian and Corsican. I'll propose renaming them soon for that reason, and move any I find nested below Sardinian.
- -sche (discuss) 17:45, 17 August 2015 (UTC)
In the recent reclassification of Kölsch, I used a database dump to find and fix entries in translations tables before deprecating the code, so only a half dozen residual things made their way into Category:Pages with module errors. Progress! - -sche (discuss) 01:05, 3 September 2015 (UTC)
Great! I would also suggest using "insource:{{t|xxx" in the search box to find any that weren't in the dumps. Chuck Entz (talk) 03:32, 3 September 2015 (UTC)

Talossan (tzl)Edit

I see you edited this file many times before. Could you update the variable for Talossan (tzl) here and replace it with the following:

m["tzl"] = {
	canonicalName = "Talossan",
	type = "appendix-constructed",
	scripts = {"Latn"},
	family = "art",
	sort_key = {
		from = {"[àáâäå]", "ç", "ð", "[ëèéê]", "[ìíîï]", "ñ", "[öòóô]", "ß", "[üùúû]", "þ"},
		to   = {"a", "c", "d", "e", "i", "n", "o", "s", "u", "z"}} ,  -- the copyright sign is used to guarantee that ð and þ will always be sorted after all other words with respectively d and z


¡Graschcias, Robin van der Vliet (talk) (contributions) 18:42, 27 August 2015 (UTC)!

Yes check.svg Done. Are publications like the Guizua Compläts àl Glheþ Talossan copyrighted? If so, I would caution you not to add more than a couple dozen words in the language, because including too much of a copyrighted language (like Klingon) poses legal problems/risks for Wiktionary (for which reason the Klingon appendix was greatly condensed by me a while ago, following this BP thread). - -sche (discuss) 01:04, 28 August 2015 (UTC)
Ün Guizua Compläts àl Glheþ Talossan is (as far as I can tell) a copyrighted book, but it is not the source of the language. I am also not sure if the Talossan language is copyrighted and if languages can be copyrighted in the first place, as a language is a gigantic list of facts and facts can not be copyrighted. Robin van der Vliet (talk) (contributions) 16:52, 29 August 2015 (UTC)
Individual facts, no. A compilation of facts can be copyrighted, though. With a bit of work, any creative work can be analyzed as a collection of facts, but the way the facts are assembled by the creator of the work makes them copyrightable. Chuck Entz (talk) 23:31, 29 August 2015 (UTC)

Updates to Template:WOTDEdit

Hi, I updated Template:WOTD at Template:WOTD/sandbox, essentially adding a new parameter |comment= (or {{{6}}}) which allows editors to add a comment: see Template:WOTD/testcases. If that is all right, could you update Template:WOTD? I can't do it myself as I'm not an administrator. If this isn't the correct procedure for proposing changes to the template, please advise. Thanks. Smuconlaw (talk) 14:35, 31 August 2015 (UTC)

Yes check.svg Done. Neat idea; I had noticed your addition of it to manumission (28 August). - -sche (discuss) 16:51, 31 August 2015 (UTC)
Great! Thanks. Smuconlaw (talk) 21:53, 31 August 2015 (UTC)

Preventing long tagsEdit

In the unlikely case that you haven't noticed my edit at mir#German_Low_German, have a look. With something as splintered as Low German, do you think it would make sense to install an L4 for "Dialects using this word" or something instead of context labels? The pronunciation sections can simply go into a collapse. Korn [kʰʊ̃ːæ̯̃n] (talk) 15:23, 1 September 2015 (UTC)

@Korn: thanks for bringing this up. I thought about it a while ago when I saw anguañu, which specifies twenty different dialects that the term is used in. Perhaps in such cases the individual dialects can be specified under ====Usage notes=====, leaving the definition line to just say "many|_|dialects". (According to templatetiger, there are three other entries which use 9 or more parameters of {{label}}: recondite, quindecillion, and tu; and there are also a few entries which use 10 or more parameters of {{context}}: pardı and Mischief Night.) - -sche (discuss) 00:56, 3 September 2015 (UTC)
However, {{label}} adds categories which would need to be added manually or in another way if we moved away from using {{label}} on the definition lines of such terms... - -sche (discuss) 01:01, 3 September 2015 (UTC)

Unprotection of Word of the Day pagesEdit

Could you please unprotect "Wiktionary:Word of the day/September 29" and "Wiktionary:Word of the day/September 30" so I can update them? Thanks. (If you have time, perhaps you can also go through other days of the year and unprotect them as well.) Smuconlaw (talk) 16:08, 2 September 2015 (UTC)

Yes check.svg Done. I wonder why some, but only some, of the pages were protected in the first place. - -sche (discuss) 00:58, 3 September 2015 (UTC)
Thanks. No idea why this was done. Perhaps it was before there was cascading CSS protection of material on the Home Page? Smuconlaw (talk) 06:40, 3 September 2015 (UTC)

Updating of Template:quote-book/sourceEdit

I have created an updated version of {{quote-book/source}} at {{quote-book/source/sandbox}} to address the three issues mentioned at "Template talk:quote-book#Some suggested changes". Could you replace the contents of {{quote-book/source}} at {{quote-book/source/sandbox}}? Thanks. Smuconlaw (talk) 17:26, 3 September 2015 (UTC)

Yes check.svg Done and I left a slightly longer comment on that talk page. - -sche (discuss) 23:58, 3 September 2015 (UTC)


This is actually wrong. See the documentation for {{pl-decl-phrase}}. I realize that this interface is somewhat hacky, but I could not find a different way to pass keyword parameters to the declension patterns. --Tweenk (talk) 22:31, 3 September 2015 (UTC)

Oh, OK. At the time I made that edit, the template was just a big module error, and my edit (upon preview) made it resolve into a normal-looking table, so I figured the exclamation marks were an odd typo. - -sche (discuss) 23:53, 3 September 2015 (UTC)

German capitalisationEdit

Isn’t it about time for some archiving?

Anyway, could you please tell me if German always had the ‘capitalise all nouns’ rule? --Romanophile (talk) 03:25, 7 September 2015 (UTC)

Yeah, you're right, I need to archive.
No, German and its predecessors (Old/Middle High German) didn't always capitalize nouns. In the medieval period, capitals were generally only used at the beginning of sentences. Even after capitalization of nouns and names became standard in the Baroque period, some authorities (such as the Brothers Grimm, authors of the major Deutsches Wörterbuch) were opposed to it and persisted in writing in minuscule.
- -sche (discuss) 20:46, 7 September 2015 (UTC)
So, would it be permitted to include minuscule forms as obsolete forms? --Romanophile (talk) 21:11, 7 September 2015 (UTC)
I think that would be a bad idea, since the difference isn't specific to the word, but a general rule. You would end up with a lowercase entry for just about every noun attested before a certain date, which would be about as useful as a entries for italicized or underlined forms. Chuck Entz (talk) 21:26, 7 September 2015 (UTC)
Okay, fair enough. But what if the word is not attested in a capital form? Do we capitalise it anyway? --Romanophile (talk) 21:55, 7 September 2015 (UTC)
I would. Otherwise, you imply that there's some inherent difference from all the other nouns which were also lowercase back then. Of course, Middle High German and Old High German would be uppercase or lowercase by their own rules, since we consider them separate languages. Chuck Entz (talk) 22:19, 7 September 2015 (UTC)
I agree with Chuck. We do similarly for English: old capitalized Nouns don't have Entries, and we've tended not to capitalize common Nouns even if they're more common in old Works in capitalized Form, although there are a few Exceptions (like Admiraless, which I only just moved). - -sche (discuss) 22:29, 7 September 2015 (UTC)
"We do similarly for English: old capitalized Nouns don't have Entries" -- Does that mean there shouldn't be capitalized entries or does it simply mean that they're missing? Also what's in case of other European languages, like Latin and Danish in which nouns were also (sometimes) capitalized? If capitalized spellings are discrimited against, shouldn't there at least be some note somewhere? For example, there could be a page somewhere explaining English habits, like explainining differences between US English and UK English and explaining English capitalization habits. If a single page would be too long, there could be sub-pages like "English habits/Dialects" and "English habits/Capitalisation". - 12:31, 25 October 2015 (UTC)
Wiktionary has decided to exclude old Capitalizations of ordinary Nouns as a Matter of Course, along with sentence-initial Capitals and all-caps (the usual Examples cited in Discussions are "The" and "THE", Variants of "the"), long-s, and various typographic Literatures (e.g. Talk:fisherwoman). I proposed last Year that we should write these Exclusions down in some central Place, but nothing happened; perhaps I'll suggest it anew.
Wiktionary:About Latin#Orthography_for_Latin_entries documents how we handle Latin, although some Things (like that we don't include "EQVVS") seem to be so basic that they're not spelled out but only implied by e.g. the Note that the Form which we do have an Entry for is "equus".
I don't recall if we've discussed old Danish Capitalization or not, but I see no Reason it wouldn't be handled like old English Capitalization. - -sche (discuss) 19:10, 25 October 2015 (UTC)
Can the descisions be found somewhere? The exclusion of sentence-inital capitals and modern all-caps and typographic ligatures makes sense. But in case of capitalised nouns and normal antique Latin forms in all caps the exclusion is doubtful.
In case of long-s the exclusion would even be against Wiktionary's aim "to describe all words". While it's easy to change "winter" into "Winter" when one knows that "winter"/"Winter" is noun, it's not easy to change some s into long-s. In some cases, it's more like impossible to know where long-s's are put, if one doesn't know the rules concerning long-s's. (Simplified basic rules like "s" is used at a word's end and long-s is used elsewhere often are incorrect.)
Also old Latin abbreviations like "IMP" for "IMPERATOR" can be found in (special) dictionaries and can't be changed into a pseudo-modern spelling like "Imp" or "imp", because a modern abbreviation would be "imp." or "Imp." which is another word as it's written with a dot.
And in some of these cases, I have doubts whether descisions were made or not, or whether they were real descisions and not just some uttered opinions somewhere. For example, it's possible that nobody thought of old Latin abbreviations like "IMP" and thus no descision was made.
Also, what's in case of Modern Greek? The about page clearly states "This is a draft under discussion.". On the discussion page Katharevousa forms (Modern Greek spelled with diacritics etc.) are mentioned and some Katharevousa words have own entries (e.g. καρβονικόν). But the about page and the discussions don't clearify where to mention Katharevousa forms. Is e.g. καρβονικόν a related word, a synonym or an alternative form of καρβονικό? (Comparing it with other languages, like English prae- and pre-, Katharevousa forms should be alternative forms.) What about καρβονικός and it's declension, where should the neuter Katharevousa form καρβονικόν be mentioned? Under alternative forms with the addition "neuter", in the declension section, in the header?
Other questions in some way related to this:
* In case of Wiktionary:About Latin it seems that some things weren't discussed - at least not at Wiktionary talk:About Latin - but rather made up by some authors. E.g. in case of the edit from 27 November 2011 with the comment "→‎Quotations: Adds rule for marks over final a for disambiguation of ablatives from nominatives", it doesn't seem that there was a discussion. On the talk page there was no edit around that time and the author of that change didn't make a change which would indicate a discussion about it. (There was a discussion with another topic at "Wiktionary:Grease pit" and a discussion on his talk page which he commented. But both wouldn't be fair places for a discussions of his edit.)
* The About Latin page states that words with j should point to word with i. But what's if only a form with j is attested? Well, that doesn't mean that the form with i is not attestable, but when it's not attested (no one found a quote), then the word with j can't point to a form with i. Well, at least not, if one doesn't make up words and words forms.
* What's if there is a term without an English translation? E.g. Swedish "tankstreck" and German "Gedankenstrich" refer to the mark "–", but usually just when used in certain contexts and sometimes not it's not restricted to that smaller dash but might also refer to "—". Thus, both terms do not belong into a translations of "dash" or "en dash". But still it would be nice to see translations for these terms. The current practice is like this: The words are incorrectly given as translations of an English word or there is no translation section with words like that. Possible 'solutions' which should be better: (a) One could mention "tankstreck" under the Etymology of German "Gedankenstrich" and vice verse, as they are formed similary. (b) One could create an translation template like "template:translation - thoughts stroke" which than can be embedded in the entries of the foreign words.
So regarding your old, unanswered question at "Beer parlour": Those descisions should be collected somewhere. Also, maybe those descisions should be checked whether they still make sense or not. One could also check if all so-called descisions really were descisions. E.g. a user once wrote that as far as he knows About Latin is rather a collections of ideas than of actual rules. The part, "... think tank, working to develop a formal policy.", should support his attitude.
- 20:08, 27 October 2015 (UTC)
Questions about Wiktionary's policies towards Latin should be directed towards Latin-speaking and Latin-editing editors on WT:T:ALA. Likewise, questions about Ancient Greek should be directed to WT:T:AGRC; people there are more qualified than I am to tell you about Katharevousa. I've started a BP thread about long s and ligatures: Wiktionary:Beer parlour/2015/October#Documenting_how_to_handle_long_s_and_ligatures. - -sche (discuss) 02:20, 28 October 2015 (UTC)

Broken usage tracking in MediaWiki:Gadget-RegexMenuFramework.jsEdit

Hello -sche. You changed a link in MediaWiki:Gadget-RegexMenuFramework.js to remove it from Category:Pages with broken file links. The broken page links are a common hack to track global usage via Special:GlobalUsage or the usage tool. Unfortunately your change broke the tracking, so the page will no longer receive maintenance updates as needed. Would you consider excluding JavaScript pages from Category:Pages with broken file links instead? (I can put together the code to do that via MediaWiki:Broken-file-category for you.) Pathoschild (talk) 02:59, 11 September 2015 (UTC)

fickern seems to have become an autonym...Edit

Rather than create Category:Palatine German and Category:Kölsch German, I wanted to instead fix this entry, which is the sole entry in both of those- but I don't know enough about either language to do it even half right. I suspect you'll also want to remove some things from Module:labels/data. Thanks! Chuck Entz (talk) 04:40, 11 September 2015 (UTC)

The labels in the module are largely OK, because there are (almost certainly) terms used in standard German which are specific to the Palatinate / Köln, although the details of labels like those are under discussion on the module's talk page. This entry, on the other hand, is odd... the Pfälzisches Wörterbuch only has "fickeln" and "ficken"; the Rheinisches Wörterbuch doesn't have this sense; and Google Books hits all seem to be scannos or the noun. Even raw Google hits for "zu fickern" are mostly Google Books scannos. - -sche (discuss) 21:49, 11 September 2015 (UTC)


Could you take a look at Talk:Knabe. DCDuring TALK 12:16, 22 September 2015 (UTC)

Thanks for helping me with this sweet memory of my deceased parents. DCDuring TALK 06:46, 9 October 2015 (UTC)

American black bearEdit

Do you know what language "Dene" refers to here? DTLHS (talk) 18:08, 8 October 2015 (UTC)

If you go to WT:LOL and press Ctrl+F and type "Dene", you will find that the Chipewyan language (code chp) has "Dene" as one of its alternative names. --WikiTiki89 18:16, 8 October 2015 (UTC)
That's true, but "Dene" can also refer to a whole family of languages, so I don't know what was meant. DTLHS (talk) 18:38, 8 October 2015 (UTC)
You seem to be right. In WT:LOF, all I see is "Na-Dene", not "Dene", but the Wikipedia page on Na-Dene languages mentions that there the "Athabaskan" family can also be called "Dene". Anyway, the Chipewyan language is in the North Athabaskan family, which is in the Athabaskan family. Anyway, Chipewyan is the only single language I can find that goes by the name "Dene" and the Wikipedia page on the Chipewyan language says "Most Chipewyan people now use Dene and Dënesųłiné to refer to themselves and their language, respectively." Based on all this, I think Chipewyan is the correct choice. If it turns out to be wrong, it would be within our expected margin of error and we would know it's in the same family of languages anyway, so the actual Chipewyan would be similar enough. --WikiTiki89 18:56, 8 October 2015 (UTC)
That's a reasonable assumption, although in this case I think the rug is pulled out from under it because the gloss (=the claim that tsah means "black bear") seems to be mistaken. Desjarlais gives sas as the Dënesųłiné (Chipewyan) word for "bear", and an old article in the Transactions of the Canadian Institute clarifies the species by saying [in old orthography] "the "Déné word for Black Bear is s̀əs or s̀as according to the dialect". For comparison, Hargus gives səs as the word for "black bear" in either Sekani or Babine-Witsuwit'en — without reading her whole chapter I can't tell which — and Krauss gives x̯ešʷ as the Proto-Athabaskan word for "black bear". Whereas, Desjarlais says tsá is the word for "beaver", and Morritt citing Haas agrees (compare Sekani tsàʔ and Slave tsáʔ, both "beaver"). - -sche (discuss) 21:44, 8 October 2015 (UTC)
Historically, brown bear species almost certainly ranged over the lands of the Chipewyans. Is there a term that included brown bear? DCDuring TALK 00:14, 9 October 2015 (UTC)
I can't find a Chipewyan term for "brown bear", although I can find sources which gloss sas as just "bear", so it may have functioned as a generic term. I can find the term in other languages: Ruhlen has Haida xúuts "brown bear", Tlingit xúts (= /xúːc/, also written xoots) "brown bear" (/"grizzly bear"), Tsetsaut "grizzly bear". Athabaskan languages and the schools: a handbook for teachers (1984) notes "in Kutchin, shih means 'brown bear' but shìh (with lowered tone) means 'food', and these words are not grammatically or etymologically related." The Proto-Athabaskan term for "brown bear" was x̯...c per Krauss (he is unsure of the middle vowel). - -sche (discuss) 01:21, 9 October 2015 (UTC)
The problem is that the w:Na-Dene languages are called that because some variation of "dene" means "people" in the vast majority of at least the Athabaskan languages. More often than not, the word for "people" gets used in the language name (at least the one native speakers use for their own language), so there could be a number of candidates. The Chipewyan term is pretty close, so it would make sense to concentrate on that part of Northern Athabaskan. Or, better yet, get @DCDuring to tell us what source he used for his mass addition of American Indian translations to that page, and we might be able to figure it out that way. Given the rather poor understanding of American Indian languages and their orthography in most general sources, I'm not so sure that was a good idea. Off the top of my head, the Hopi looks plausible based on what I know of other Northern Uto-Aztecan languages, and the Southern Uto-Aztecan ones all seem to use reflexes of the same ancestral form, which is a good sign, but "close" isn't close enough for dictionary purposes. Chuck Entz (talk) 03:41, 9 October 2015 (UTC)
Why is that a "problem"? --WikiTiki89 19:22, 9 October 2015 (UTC)
It makes it difficult to tell which language a work that refers to "Dene" is referring to. Indeed, older generalist works (as they tend to do with a lot of languages, e.g. also Great Russian) often impressionistically consider whole swathes of the Dene family to be a "Dene" language divided into e.g. Northern and Southern dialects. - -sche (discuss) 20:30, 9 October 2015 (UTC)


You removed "songster" from Sänger. But than shouldn't "songstress" be removed from Sängerin too, or shouldn't it be replaced with "singeress" ("female person who sings" instead of "female person who sings (songs)")? - 12:23, 25 October 2015 (UTC)

Thanks for catching that. Yes, it's sufficient for Sängerin to say "female singer", IMO. If I heard someone say "singeress" it'd be a dead giveaway that English wasn't their native language. - -sche (discuss) 18:45, 25 October 2015 (UTC)

Request for Zipser German GrantedEdit

User -sche, þy wish haþ been granted. See here. --Lo Ximiendo (talk) 18:52, 25 October 2015 (UTC)

Þanks! - -sche (discuss) 19:18, 25 October 2015 (UTC)
I also added few words of Sathmar Swabian and Silesian German. --Lo Ximiendo (talk) 19:38, 26 October 2015 (UTC)
Great! Wiktionary's coverage of Germanic languages is slowly increasing. - -sche (discuss) 08:05, 27 October 2015 (UTC)
I added the white and yellow flag for the language header of Silesian German. --Lo Ximiendo (talk) 06:46, 30 October 2015 (UTC)


If that's "nonstandard", then please fix it. It's simply a fact, that there are two opinions about the part of speech:

  • Some say that Berliner and similar words are adjectives. This is also supported by dated spellings like berliner.
  • Some say that Berliner etc. are nouns in gentive plural: der Berliner, gen. pl. der Berliner - so Berliner Mauer literally means "Wall of the Berliners". This is also supported by German spelling rules: nouns begin with a capital letter, adjectives not (nominalised adjectives aren't adjectives anymore, but nouns too).

- 18:24, 27 October 2015 (UTC)


What’s wrong with the samples on Google Groups? --Romanophile (contributions) 20:56, 29 October 2015 (UTC)

Oh, there are some, that's great! They weren't there when I searched back in 2013, which is odd, since the posts were made before 2013... but Cloodcuckoolander (I think) has remarked upon how oddly unreliable Google's Groups search is. Thanks for revisiting the entry / noticing. (I have a short list of entries that just need one more citation that I check up on periodically, but it's woefully incomplete.) I'll turn it into an entry. - -sche (discuss) 05:06, 30 October 2015 (UTC)


This rollback is in error. As said before (cf. revision history): google doesn't differ between ſ and s, so antiqua "daſs" (around 1871-1902) incorrectly becomes "dass" by google and thus ngram is no reliable source. Maybe in case you don't know: daſs is not the same as dass, but an alternative form of daß used in antiqua when ß was not available (this usage was deprecated in some spelling rules). daſs could also be a Heyse spelling of daß, but then Heyse's spellings (including his antiqua spellings) are (said to be) different from 20th/21st century spellings as Heyse also used ſ in antiqua (rules from 1902 should deprecate the use of ſ in antiqua and only allow s and ß, which also holds for the 1996 reform though the use of ß and s were changed).
Also: daſs is an alternative form of daß/dass, older (antiqua) spellings with ſ can't easily be derived from modern antiqua spellings and there's no bijection between older (antiqua) spellings and modern antiqua spellings, e.g. both Wachſtube (Wach-stube) and Wachstube (Wachs-tube) become Wachstube in antiqua without long s. Thus it makes sense to add spellings with ſ. (It maybe makes no sense to have an own entry for it as it's not easy to input the character ſ, as most users don't know the difference between ſ and s, and as ſ and s might be similar in case of encoding etc., but there was no link anyway.)
P.S.: In the German spelling rules (Berlin, 1908) it is: "In lateiniſcher Schrift ſteht s für ſ und s, ss für ſſ, ß (besser als ſs) für ß, für ß tritt in großer Schrift sz ein, z. B. MASZE (Maße), aber MASSE (Masse)." (antiqua ß and fraktur ß actually look differently in the text and the text itself is printed in fraktur). In early Duden editions (late 19th century) it was: "Zu merken iſt, daß man in lateiniſscher Schrift s für ſ und s ohne Unterschied, ss für ſſ und ſs für ß anwendet. Statt ſs ist auch ß zulässig." So, daſs which can be found in antiqua texts from ca. 1871 till 1902 and is OCRed as "dass" by google is an alternative form of "daß" which was more common in antiqua after 1902.
daſs was also a real Heyse spelling. But I'm not sure whether it was used in fraktur or in antiqua. If it was used in antiua than it's obviously different from dass. If it was used in fraktur, then the traditional fraktur-antiqua transcription rules from early Duden editions and the rules from 1902 could say that it has to be transcribed as dass, but even than it's a different form as it's transcribed. But it's very likely that Heyse's spellings were not as common as Adelung's spellings.
Anyway, as google doesn't differ between ſ and s (and between fraktur and antiqua), it can't be used to cite a statement like "dass was more common than daß in 1871-1902". In case of 1950 till nowaydays, ngram maybe can be used as the nazis banned fraktur and it never became popular again and as ſ became unpopular in antiqua (cf. traditional spelling rules from 1902 and reformed spellings rules from 1996).
- 14:18, 6 November 2015 (UTC) and 14:40, 6 November 2015 (UTC), P.S. 17:47, 6 November 2015 (UTC)

Where is your evidence that daſs should count as daß and not "dass"? To the extent that "ess-zett" is treated as a separate thing from "two esses", "ſs" is two unligatured esses, one long and one short according to the usual (translingual) rule of long- vs short-s placement. - -sche (discuss) 21:16, 6 November 2015 (UTC)
Older Duden editions and German spelling rules from 1902 (see above). Both state that fraktur ß (actually more like a ligature of ſz) can be written as ſs in antiqua ("ſs für ß" and "ß (besser als ſs) für ß"). So antiqua daſs can be and often was an alternative form of daß and not of dass.
Without Duden and the German spelling rules (which say that in antiqua s is used instead of single fraktur ſ), antiqua daſs would still be another form than dass. That is, one would have to differ between three forms in antiqua: daß (traditional spelling, also prefered by the 1902 orthography), daſs (older antiqua spelling), dass (1996 reform spelling). It could be, and shouldn't be unlikely, that authors who used daſs (which could also be a Heyse spelling used in antiqua) would prefer daß over dass if they could only choose between these two forms. In case of the real Heyse spelling, at least the one used in fraktur, many arguments used against the 1996 reform spelling are invalid, e.g. sss shouldn't occur in a real Heyse spelling used in fraktur, and maybe in antiqua too.
daſs (not daſz (= daß)) in fraktur by Heyse could be dass in antiqua. But: 1. The older Duden and the spelling rules from 1902 can't be used to derive that spelling, as fraktur daſs is an incorrect form in the beginning. So, some other source is needed that says that Heyses fraktur daſs can be an antiqua dass, as it could also be that his correct antiqua form would be daſs too or that he proscribed the use of antiqua. 2. It's more likely that Heyse's spelling was rarer anway, as it was younger (Adelung came before him), was depracted in several German countries (e.g. in Prussia) and as it wasn't used in the 1902 orthography (if dass was prefered in 1871-1902, than it should be more likely that that spelling would be used in the 1902 orthography). 3. As google doesn't differ between ſ and s and between fraktur and antiqua, it is no reliable source. And to interpret google's ngram or google's books would be OR too.
(Regarding the quotes: It's hard to quote a fraktur text which differs between fraktur and antiqua in an antiqua text. Maybe it would be better with pseudo-HTML like "<antiqua>ſs</antiqua> can be used for <fraktur>ß</fraktur>", but maybe that would be harder to read.)
- 12:39, 7 November 2015 (UTC)


Discussion moved to Wiktionary:Tea room/2015/November#Neger.
(let's try to keep the discussion in one or two places rather than three)

Dative -e in German strong declensionsEdit

Discussion moved to Template talk:de-decl-noun-n.

German ordinal numbersEdit

Presently their lemmas are the forms in -e. Our general practice, I think, is to put adjectives without a bare form at -er. The ordinal numbers do have a bare form, which is used with zu: zu siebt, zu acht. But these seems to be separate idioms. So I guess -er would be the right place. And if you agree: Should I move them manually, or is there a better way? Kolmiel (talk) 14:22, 9 December 2015 (UTC)

I think you are correct about the general practice (or more accurately, the general desire — in practice a lot of entries were created at the wrong title and still need to be moved), which also applies to substantivized adjectives. But about this particular set of entries... why do de.Wikt and the Duden, which do lemmatize e.g. Verletzter m rather than Verletzte m (cf. this thread), both lemmatize siebte ([13]) rather than siebter ([14])? (The DWDS seems divided on the matter; there's no entry found if you search for "siebter", but if you search for "siebte", the DWDS-Wörterbuch entry lemmatizes that form, while the Etymologisches Wörterbuch entry that comes up lemmatizes the form that ends in r.) Do you know if there's any logic behind lemmatizing the -e forms? If not, then yes, for consistency they could be moved. I suppose AutoWikiBrowser could be used to speed up the process somewhat. - -sche (discuss) 01:03, 10 December 2015 (UTC)
I think Duden might lemmatize the er-form only in nouns, but the e-form in adjectives. For example "oberer" is given "obere, oberer, oberes" at Otherwise I don't think there's a special reason concerning ordinal numbers, except possibly that these are often preceded by the definite article. But that's true of others as well, and the er-forms of ordinal numbers do occur and aren't particularly rare at all (ein zweiter Versuch, zehnter Dezember). So I think they should be moved. Kolmiel (talk) 14:19, 10 December 2015 (UTC)
OK, I will find time to move and standardize most of them in AWB if there are too many for us to do by hand. AFAICT (from the category and the bluelinks in siebte) we're dealing with <50 entries, right? There's a lot of inconsistency in what part of speech the lemmas and non-lemmas use in their headers and headword-lines; achtzehnte uses 'ordinal number' and zweiter uses 'numeral', but siebte uses 'adjective'; siebtes uses 'adjective form', so I tentatively just appended 'form' to the headword line of neunzehnte and got 'numeral form'. They should all be 'adjectives' (the lemmas) and 'adjective forms' (the inflected forms), right? (This needs to be sorted out regardless of which forms we lemmatize.) Also pinging @CodeCat, who has helped sort out Wiktionary's labelling of numerals vs numbers vs adjectives. (The Duden goes with "Zahlwort"; de.Wikt with the double header "Adjektiv, Numerale".) - -sche (discuss) 22:14, 11 December 2015 (UTC)
These are straightforwardly adjectives. "Numeral" is a special kind of part of speech whose definition I'm still not sure of, but see w:Numeral (linguistics). That page mentions in particular that "not all words for cardinal numbers are necessarily numerals", so not everything with a number meaning is, part-of-speech wise, a numeral. That's why we have Category:German numbers, which exists outside the POS category tree. In fact, I believe numerals are a kind of determiner, closely allied to non-cardinal quantifiers like "all", "some" or "no". —CodeCat 22:50, 11 December 2015 (UTC)

Old PicardEdit

Should we have a language code, or at least an etymology-only code, for Old Picard? Otherwise, I assume it is currently treated as a dialect of Old French, and without an etymology code, I had to use some awkward phrasing at Rosine#Etymology. --WikiTiki89 19:37, 28 December 2015 (UTC)

I do find references (more than to "Old Italian"!) to Old Picard translations of texts, and to Old Picard words — including in dictionaries that give Old Picard words as etyma. Let's give it an etymology-only code, so that those etymologies which need to can cite it. Distinguishing it in general from Old French and Old Northern French might be messy, so I wouldn't grant it a full code and its own language sections until such time as someone makes a case that it needs/merits that. Based on "fro-nor", I guess the thing to add to Module:etymology languages/data would be "fro-opc". - -sche (discuss) 03:40, 29 December 2015 (UTC)
Maybe "fro-pic" instead? The oldness is already implied by "fro-", and we do have "fro-nor" rather than "fro-onr". --WikiTiki89 16:46, 29 December 2015 (UTC)
Sure. - -sche (discuss) 18:44, 29 December 2015 (UTC)
Maybe we should be consistent and use the language code for modern Picard: fro-pcd. Of course, we have fro-nor, instead of fro-nrf, so maybe I'm just playing host to the "hobgoblin of little minds". Chuck Entz (talk) 02:45, 30 December 2015 (UTC)
I thought about that, but then I wondered if using the modern language's code for that element would suggest that this was another code for the same thing. "de-AT" is (a variety of) the same language as "de", whereas "fro-pic" is not "pcd" but rather (a variety of) "fro". - -sche (discuss) 05:47, 30 December 2015 (UTC)
I already went with "fro-pic", I figured if there was a strong enough argument to change it, I would, but I don't see such a strong enough argument. --WikiTiki89 17:01, 30 December 2015 (UTC)
Gosh! And here I was expecting a discussion of one of the roles Patrick Stewart played in the Star Trek: The Next Generation series finale. We could have split up into Team Middle-Aged Picard and Team Old Picard a la the Twilight Saga. Oh well...
Seriously, though, I seem to remember reading somewhere that a few of the "Normans" that invaded England were really Picards, and that there are traces of Old Picard in English. Chuck Entz (talk) 07:25, 29 December 2015 (UTC)

Source accessEdit

I have no access to the PDF documents of Cambridge Ancient History. Do you know how to get access to it? --UK.Akma (talk) 21:02, 10 January 2016 (UTC)

I don't; I'm sorry. I just took the text that had been added to Subarian and trimmed out the speculation on ethnic identity and other things that belonged in Wikipedia rather than in a dictionary. - -sche (discuss) 21:29, 10 January 2016 (UTC)
See the discussion on the talk page of w:Subartu about what seem to be the same set of references. I have my doubts whether any of this should be allowed in the entry. Chuck Entz (talk) 22:13, 10 January 2016 (UTC)

all heartEdit

Why did you delete that page? Do you think that it’s sum‐of‐parts? --Romanophile (contributions) 00:03, 13 January 2016 (UTC)

@Romanophile: Because it was just nonsense, the page contained only the text "i love my family and everybody else around me". --WikiTiki89 00:30, 13 January 2016 (UTC)
@Wikitiki89 Ah! All right. Still, do you think that this entry would be acceptable if it were properly designed? --Romanophile (contributions) 00:36, 13 January 2016 (UTC)
Depends on what meaning you have in mind. --WikiTiki89 00:37, 13 January 2016 (UTC)
Hmm. I'm familiar with the collocation in sayings like "She was all heart" (=was very loving and/or compassionate, that kind of thing), but it does seem like it might just be "all" + "heart", and one can also say things like "She was all brain(s)" (=was very smart, perhaps without things like social awareness, hence the "all"), or "She was all legs" (=had long legs). - -sche (discuss) 00:50, 13 January 2016 (UTC)
"Definitions" 2 and 4 of heart#Noun cover the range of meanings of heart as used in "all heart" in my experience and in a review of the phrase at COCA. DCDuring TALK 02:09, 13 January 2016 (UTC)

Thank youEdit

For finishing cleaning out RFV, especially given that it had become too large to archive. The page that really needs your help, though, is WT:RFM (and I suppose to a lesser extent WT:RFDO), because I simply can't close many of those. Some of them are language mergers etc.; the ones that you haven't looked at need some expert attention, and even those upon which we've come to a consensus need to be executed. I still don't know all the steps one ought to go through to handle mergers and name changes (is there a manual somewhere?). The other hitch is that I don't have AWB, so it's a lot harder to find all the uses of a language's name or go through every page in a category to change it, which especially slows down requests at RFDO. Anyway, I appreciate the help! Cheers —Μετάknowledgediscuss/deeds 00:01, 1 February 2016 (UTC)

I will take a look. As for AWB, you could download it; it's not that hard to learn. There is Wiktionary:Guide to adding and removing languages; changing a language's name is not handled too differently from removing a language (you have to find the same things — old uses). - -sche (discuss) 00:55, 1 February 2016 (UTC)
I have a Mac, so using AWB would require virtual Windows AFAICT. In any case, I should've known about that guide — thanks. I'm not always sure where to archive the discussions, though. In any case, I guess all that I really need is for you to weigh in on long-ignored discussions. —Μετάknowledgediscuss/deeds 03:55, 1 February 2016 (UTC)

Update of "Template:quote-book/source"Edit

Hi! I have done an update of {{quote-book/source}}, which is at {{quote-book/source/sandbox}} (see {{quote-book/testcases}} for sample uses). The main changes are these:

  • Improved handling of |format=, |genre=, |language= and |doi=.
  • Addition of |archiveurl= and |archivedate=.

If it looks all right, could you please replace {{quote-book/source}} with the contents of the sandbox? Thanks. Smuconlaw (talk) 09:03, 3 February 2016 (UTC)

Yes check.svg Done. I'm monitoring Category:ParserFunction errors to see if errors arise, and actually, it looks like the category is losing a few pages, though that may be due to Kenny's edits. :) Still, thanks for all your hard work sorting out these quotation templates! - -sche (discuss) 17:46, 3 February 2016 (UTC)
Thanks, and you're most welcome. Let me know if anything in the template needs fixing. Smuconlaw (talk) 20:02, 3 February 2016 (UTC)
Whoops, there seems to be a space missing if |title= and |publisher= are specified, but |location= is not: see "freedom of speech". (I don't think this was an issue created by me.) Fixed it at {{quote-book/source/sandbox}}. Smuconlaw (talk) 20:11, 3 February 2016 (UTC)

Cool BeansEdit

Have added a recording of the phrase at the discussion page below, can it be added to the actual page? —This comment was unsigned.

Collocations dataEdit

I don't think our discussions of collocation space included anything about what the data might actually look like. I have a 1.7Mb file of sample (free) data from COCA. We could try using it for a few polysemic words for a demo, to determine its actual value to us, etc. The cost of getting more complete sets would not be prohibitive. I don't think it could be part of Wiktionary because of licensing. There are really only a handful or two of Wiktionarians that could and would make good use of the data anyway. DCDuring TALK 02:22, 5 February 2016 (UTC)

The mock up that you provided would add little value to English, for which we would want frequency data at the very least. I guess I am thinking principally of improving the quality of English definitions, not just of providing a home for translation targets out of principal namespace. Perhaps I should add the kind of table I am thinking of. DCDuring TALK 21:14, 6 February 2016 (UTC)
Hmm, yes, that would help me to understand what you're thinking of. Recording which collocations are most common? (Some usage notes already do that — added by you, I think; thanks!) - -sche (discuss) 21:34, 6 February 2016 (UTC)
See Talk:goods. DCDuring TALK 22:51, 6 February 2016 (UTC)
Is "frequency of collocation" how often that collocation occurs in the corpus? Then what are "total frequency" and "mutual info"?
Access to this kind of information seems like it would be helpful to Wiktionary in determining which collocations to list (and in what order). The frequencies of the different collocations might also be of interest to some readers. Is COCA copyrighted? Wiktionary should consider whether it would be infringing COCA's copyright to repeat such information in a large number of entries.
The table is quite large; obviously, it could be made collapsible. Another possibility would be condensing it radically into the list format a few entries (usage notes) already use: Words which collocate with goods: (goods and) services ([data from whichever field best indicates how often this collocation is relative to other collocations goes here]), consumer (goods) ([data]), [...]. It would also be possible to combine the table's data with translation tables, by putting the information from each row of the table into the "gloss" atop each translation table.
- -sche (discuss) 08:22, 7 February 2016 (UTC)
"Frequency of collocation" is frequency of the collocation (occurrence within 4 words of the main term, before or after) in the corpus, "total frequency" is the frequency of the individual term (eg, services) in the corpus. "w:Mutual information" is a measure of the strength of the association between the terms (ie, between goods and services)
I've started using this. (See sheer.) It is most helpful for polysemic words, but also helps determine whether a term is polysemic. I have had a brief e-mail discussion with Mark Davies at BYU who has pulled this together. I think we can experiment with this for quite a while before we would have to consider our options. I know that he would not be happy with our license terms.
Ultimately the presentation would probably be most useful if we grouped the collocates by the definition(s) most appropriate for them and their part of speech and presented them in decreasing order of frequency, first by PoS, then by term.
But the table that would be most useful for a contributor is one exactly like what is in [[goods]], but with a fuller list of collocates of the headword. We need such a voluminous table if we hope to cover all (well, most, actually) of the definitions in some of our polysemic entries. So I don't think they could fit in translation table headers. DCDuring TALK 00:12, 9 February 2016 (UTC)

adverbs Monday, Tuesday etc.Edit

Yep, that's fine what you have done. I considered doing that, but I thought some "wise guy" would come along and revert my edits denying the truth, hence the wording I used. Cheers! Donnanz (talk) 23:32, 8 February 2016 (UTC)

Do Brits not say that? --WikiTiki89 23:40, 8 February 2016 (UTC)
Nope. On Tuesday, on Tuesdays etc. It's not the same as on the west side of the pond. Donnanz (talk) 23:46, 8 February 2016 (UTC)
I mean, we also say "on Tuesday", but not always. In fact I would probably say that "on Tuesday" is more common than just "Tuesday" adverbially. I'm trying to think whether there is a pattern to when the "on" is dropped. --WikiTiki89 23:54, 8 February 2016 (UTC)
I have heard it minus "on" on American media, also read it in American literature. I tend to notice the usage when I see or hear it. There are also entries in Oxford saying this. Donnanz (talk) 00:14, 9 February 2016 (UTC)
I'm not saying it doesn't exist. It is pretty common, but what I'm saying is that I don't think it is the primary usage. And I'm wondering whether there's a pattern to how it's used. --WikiTiki89 01:31, 9 February 2016 (UTC)
FWIW I ran the numbers and ngrams confirm the regional divide; it's about twice as common to say "work on Tuesday(s)" than "work Tuesday(s)" in US books, whereas in UK books the "on"-less forms are too rare to register. - -sche (discuss) 06:06, 9 February 2016 (UTC)


Could you look at the IPA for Michäas? This is just my guess. —JohnC5 03:50, 9 February 2016 (UTC)

I've actually never encountered this form. In Ngrams, it seems to have been about 1/20th as common as Micha until the 1870s, thereafter about 1/50th as common (with a spike in 1952) until the 1980s, and thereafter trending sharply downwards towards about 1/1000th as common by 2008. I would pronounce it the way you noted. - -sche (discuss) 04:56, 9 February 2016 (UTC)
Thanks! —JohnC5 05:51, 9 February 2016 (UTC)


So the term girl can't be used in a platonic context, but only in a romantic? Ubuntuuser13 (talk) 03:00, 16 February 2016 (UTC)

Nvm, I've opened up an RfV for it. Ubuntuuser13 (talk) 03:10, 16 February 2016 (UTC)
Definition one ("A young female human") and two ("Any woman, regardless of her age") and five cover non-romantic use, do they not? Are there citations where "girl" means "a female friend" as opposed to "a [young] female (who may or may not be a friend)"? That might clarify matters. As it is, it seems like someone calling a female friend a "girl" is comparable to someone calling a blond-haired friend a "blond" — it doesn't cause "blond" to mean "a blond-haired friend", it's just the general definition. Usage like "girl, let's go see Andy!" is sense 5, the term of endearment. - -sche (discuss) 03:12, 16 February 2016 (UTC)
Just so you know I've opened up an RfV. Ubuntuuser13 (talk) 03:26, 16 February 2016 (UTC)
Thanks. - -sche (discuss) 02:13, 17 February 2016 (UTC)

Category:French TranslingualEdit

Category:Regional Translingual --Romanophile (contributions) 13:52, 16 February 2016 (UTC)

Thanks for noticing those. - -sche (discuss) 02:13, 17 February 2016 (UTC)

ISO codesEdit

Hey thanks for updating according to the new ISO standards. I was noticing when adding ancestors that we are missing some of the more newly added ISO codes. Is there anywhere I could look for a list of the new, merged, and deleted codes? —JohnC5 05:32, 24 February 2016 (UTC)

You're welcome. Changes to the standard are published here. :) I'm going through the 2015 changes now. - -sche (discuss) 05:41, 24 February 2016 (UTC)
I'll let you do it then! Tell me if you need any help. I'm a little terrified of sorting Austronesian. —JohnC5 05:50, 24 February 2016 (UTC)
The key to understanding the structure of the Austronesian languages is admitting that there isn't much: you have Malayo-Polynesian and the various Formosan branches, but within Malayo-Polynesian there are no widely accepted proto-languages outside of Oceanic and an assortment of local groupings. There are areal phenomena and substrata that let you classify Malayo-Polynesian into major subgroups, but those subgroups aren't genetic at all. Blust reconstructs all kinds of things, but his approach is to plug word lists into cladistics software designed for plant and animal taxonomy to produce trees, then use comparative reconstruction on the branches. His Austronesian Comparative Dictionary has an extremely impressive amount of data, but he regularizes the orthography, so you need to check the spelling against other sources, and you can't trust the comparative stuff due to his methodology. Chuck Entz (talk) 08:18, 24 February 2016 (UTC)
...So we're doomed until better research is done? :(JohnC5 15:29, 24 February 2016 (UTC)


You added the entry for "woman" here for Makalero, which is incorrect. Huber, the source you cited, simply states that it is Tongan, and that entry already existed there. I think that in your haste to create entries in a maximal number of languages, you may have made more errors that won't be caught for quite a while (you got lucky in that this one happened to turn up on my watchlist, and I felt it very unlikely that Makalero would borrow a vocabulary item like that from Polynesian, so I checked). In any case, I appreciate your project, but I think you need to take a lot more precaution to avoid these kinds of mistakes. —Μετάknowledgediscuss/deeds 06:51, 26 February 2016 (UTC)

Oh, you're right about Huber; I'm glad you caught that. I've been going through and checking my previous additions ever since Chuck's caution in the previous section that the Comparative Austronesian Dictionary (which had been recommended to me as a valid reference on Talk:water, when I was trying to verify the translations people had added there) normalizes orthography and so has to be checked against other sources. So, I hope to uncover any other errors. - -sche (discuss) 16:42, 26 February 2016 (UTC)
I see, Liliana should not have said that. Yeah, you can't really use Blust as a primary source for something serious, although the orthographic concerns run deeper; some of these languages are well nigh unwritten, and linguists might just put them in IPA. Thanks for going through them, anyhow. —Μετάknowledgediscuss/deeds 17:04, 26 February 2016 (UTC)
Looking at diffs of all my edits to water and woman, and so checking not only anything I added but any word I changed the spelling of and any language I updated the name of, and sometimes spot-checking things I had nothing to do with, I've found other references for the translations of water and woman into Äiwoo, Aklanon, Alaba, Alune, Antillean Creole (and added Guianese Creole and a usex to Haitian Creole), Anuta[n] (and Tikopia), and Arosi, Batad Ifugao, [Palawan] Batak (which we should possibly rename to avoid confusion with the Batak languages like Karo Batak), Bauro, Biak, Biloxi, Binukid, Bontoc (we probably shouldn't have both the macrolanguage code and the dialect codes there), Bughotu, Buli, Casiguran Dumagat Agta, Cebuano, Chewong, Dobu, Dupaninan Agta, Futuna, Fuyug, Gapapaiwa, Gedaged, and Gilbertese (should that language be renamed?). I had to fix Blust's spelling of several things, and fix Arosi and Bauro where he had the 'wrong' word, but the only translations for which I couldn't find any more-reliable references are Bukitan, Embaloh or Ende.
Several days ago, I removed the Ajië and Amurdag translations (not added by me — removed as part of the original project of checking the translations at water) because I couldn't find any references for them.
Abua and Abung things would benefit from more references: the only ref I find for the Abua translation of water (added by someone long ago) and of woman is R. Blench's work on the Central Delta languages; I'd prefer if there were additional sources. The Abung translation of water (likewise added long ago) is only in the Austronesian Basic Vocabulary Database (and in placenames, but ABVD is the only reference to define it as a common noun); likewise the translation of woman.
That's all the languages that start with A through G; I'll be going through the rest.
PS other people long ago added translations into several of these languages to the tables of a handful of other entries such as dog, which it may also be useful to check (in case they were working from Blust).
- -sche (discuss) 08:13, 27 February 2016 (UTC)
I've found other references for Hiligaynon (in several spellings, some dated), Isnag, Itawit, Jarai, Jola-Fonyi; Kambera, Kankanaey and Kapampangan (both even with citations of use), Kala Lagaw Ya (the spellings are all attested in dead-tree references, but the division into different dialects is per WP), Kedang and Kumak, Lamaholot, Lamboya, Lavukaleve and Lou; likewise Wandamen, Waray, Waropean, Wedau, Western Bukidnon Manobo, Wogeo, Woleaian, Wuvulu-Aua, Yami, Zaghawa, Zangskari, Zangwal. The Kua-nsi and Kuamasi and Sonaga translations are from the scholar who recently documented those languages and sucessfully petitioned for them to have ISO codes.
The K. Blaan translation is in ABVD and the word itself is used in Kibo Kbulung dad Fdas, but not glossed there (it might mean "sister" in addition to "woman", like a few other languages' words do).
I can't find [better] references for Kanowit (not added to the table by me).
The Komodo translation I can find a reference for, but it's in Indonesian and only glosses the term as part of longer sentences; likewise Waropean; it would be nice to find a better reference than Blust confirming or denying the spelling. Li'o is only in ABVD. Lawangan and Loniu I find only general references mentioning.
That's all the languages H through L (postscript: through R) and U through Z. - -sche (discuss) 19:51, 28 February 2016 (UTC)
  • You are so wonderfully diligent. If this were Wikipedia, I'd give you some annoying barnstar, but since it's here, you just get my gratitude. As for the points you've raised: the languages you've bolded are obscure enough that it may not be possible to do better for now; I see that Ende is discussed in a book called Deskripsi naskah dan sejarah perkembangan aksara Ende, Flores, Nusa Tenggara Timur, but finding that online appears to be no easy feat. As for the renames, it makes sense not to have a language called "Batak" alone. Google Ngrams show "speak Kiribati" as being insignificant as compared to "speak Gilbertese", but Google Books show more results for "speak Kiribati"; I for one have always called it Gilbertese, and it does seem that the switch has only happened in perhaps the last decade. On the whole, it doesn't seem worth changing. —Μετάknowledgediscuss/deeds 06:15, 29 February 2016 (UTC)
    • I'll second the gratitude. As for Kiribati, the name isn't any more aboriginal than Gilbertese- it actually is Gilbertese (or Gilbert, anyway) modified by the phonotactics of the language. Chuck Entz (talk) 06:49, 29 February 2016 (UTC)
      •  :)
        I learned the other day about the etymology of Kiribati — it makes me wonder what the language was called before its speakers met Gilbert!
        Plain "Kiribati" is considerably more common than "Gilbertese", but I suppose that's due to the fact that the former is also an often-mentioned placename. I'm fine with leaving the language name as is. By the way, I didn't keep a count, but I think (ignoring the hyphens he adds) Blust's spellings turned out more often than not to be the spellings other scholars used. - -sche (discuss) 07:30, 29 February 2016 (UTC)

Category:fr:Mythological locationsEdit

I noticed you added this to Champs-Élysées, while removing the entry from Category:fr:Fictional locations. Just thought I'd let you know that no such category currently exists. Purplebackpack89 19:06, 29 February 2016 (UTC)

Thanks. I've created it. - -sche (discuss) 21:31, 29 February 2016 (UTC)


What do you think about adding a code for this language, and under what name? Wikipedia describes it at Katembri language; they cite Fabre for the claim that this language is only preserved in a single brief wordlist, where it is called Kiriri (the wordlist is on page 22 (section 3.4) of this pdf). Regardless, that document does seem to be a good place to find more words for water. —Μετάknowledgediscuss/deeds 03:24, 2 March 2016 (UTC)

There's a Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Katembri language, and another Kiriri attested in a single wordlist collected from an elder from the 1960s that's a Xukuru language? Well, that's confusing.
The fact that there's only a limited number of words is no reason not to include the language, but it would be good to avoid the ambiguous name Kiriri. How about calling them Katembri and Xukuru, like Wikipedia does? Wait, (as a minor point of curiosity,) if the wordlist is labelled Kiriri, where'd the alternative name come from?
The difficult part will be assigning codes, given that the family affiliation is unclear. - -sche (discuss) 03:45, 2 March 2016 (UTC)
Given the naming issues, I'm suddenly confused about whether I have correctly identified the wordlist being referred to. I don't know anything about any of these languages, so I feel lost (it's so much better in Austronesian, for example, where I at least feel like I have a hold on what goes where). Anyway, that naming scheme makes sense; we can use qfa codes and not worry about the families, no? —Μετάknowledgediscuss/deeds 05:01, 2 March 2016 (UTC)
qfa is the prefix for exceptional family codes. All of our exceptional language codes which start with qfa do so because they start with a family code that starts with qfa, like qfa-ctc-col. There's been at least one case where we've created a family code for an accepted family (qfa-len, the Lencan languages) in order to use it in constructing a language code (qfa-len-slv for Salvadoran Lencan), but Wikipedia notes that scholars aren't certain what family either Kiriri belonged to, so we couldn't do that here because we couldn't accurately, confidently assign either one a family code (even an exceptional family code). I suppose we could construct codes starting with qfa-und, like qfa-und-ktm for Katembri. I wouldn't want to use bare qfa-___ (e.g. qfa-ktm for Katembri) because it would look like a family code. - -sche (discuss) 08:21, 3 March 2016 (UTC)

zerkreuzen and other thingsEdit

Thanks for the catch! I am indeed aware of the fact that we don't use the IPA ligature, and I do indeed copypaste from de.Wikt. I normally will catch those, but I also forget, as you saw. By the way, if you'd like to take a break from your wonderful work updating the mod:languages data, I could use the help of some more German editors. Kenny and I have written a new mod:de-headword that is already running the nouns, proper nouns, and adjectives, and has the verb logic written but not in use. The new logic allows us to detected the inflection type (strong, weak, irregular, etc.) of verbs automatically, which means that they may all be merged under {{de-verb}}. It also means, however, that we need to manually sort the current trnasclusions of de-verb into either {{de-verb-weak}}, {{de-verb-strong}}, and {{de-verb-irregular}}. Once that is done, we'll switch {{de-verb}} to the new module then have a bot merge all the other templates into it. If you'd be willing to help move the remainder of de-verb's tranclusions, I would appreciate the help, but only if you have the time. Regardless, thanks for the IPA fix! :)JohnC5 06:15, 2 March 2016 (UTC)

It's been a while since I used the de-verb templates, so I'll have to refresh myself on all the parameters, but I'll try to help out. - -sche (discuss) 08:22, 3 March 2016 (UTC)


You wrote "as you have been told previously, such obsolete invariant forms aren't listed in entries' tables". Where have I been told that? -Random187056 (talk) 22:51, 8 March 2016 (UTC)

When you've added RFCs to other entries requesting that such invariant forms be added to the tables. - -sche (discuss) 23:02, 8 March 2016 (UTC)

Languages that use the IPAEdit

Are there seriously languages that adopted the international phonetic alphabet? I know that it’s possible, but I thought that every language with writers simply had its own alphabet. --Romanophile (contributions) 05:19, 9 March 2016 (UTC)

It’s not that they adopt IPA, it’s that the only documentation available for a lot of languages is articles published by linguists. These tend to use IPA out of convenience, without any intention of establishing it as the language’s official writing system. — Ungoliant (falai) 05:29, 9 March 2016 (UTC)
Right. That said, a few languages have adopted IPA or IPA-like alphabets. `- -sche (discuss)


Hi -sche. I'm not sure whether the ping on Talk:sy³³ worked properly - I'm having doubts about the use of IPA to write languages mostly unwritten or lacking a writing system. Could you point me to the policy on this? Wyang (talk) 05:19, 9 March 2016 (UTC)

Holy shit! We both made the same topic simultaneously! --Romanophile (contributions) 05:20, 9 March 2016 (UTC)
As a side note, Bai language has a Latin-based writing system: see for example *g-sum, *ts(j)i(j) ~ tsjaj, *(s/r)-ma(ŋ/k) and *k-m-raŋ ~ s-raŋ. Wyang (talk) 05:23, 9 March 2016 (UTC)
In the general case: if we are to include words from languages which have been written down using IPA and which have not been written down in another way -- and they meet the criteria for inclusion, so I don't see a basis for excluding them -- how else would you suggest including them, if not in the way that other references do? (That's not a rhetorical question.)
I don't know that we have (m)any policy pages spelling out which scripts to include languages in (except some language-specific policies allowing multiple scripts, e.g. allowing Cyrillic Romanian). De facto we've had entries like this for years, e.g. naːnʔ³³, paʔ²⁴, and wã³nũ³tũ̱³ka̱³txi³su².
In this particular case, if there is another script we can match these entries to (either Chinese, or a Latin script), and you want to make the argument that these should be mapped to and moved to that script even if it's not the one they're attested in, that's OK by me.
- -sche (discuss) 05:44, 9 March 2016 (UTC)
For the first question, I think it would be best to hold off on creating entries in that language, until a substantial amount of studies have been done on that language. The status of having a writing system, or at least achieving transcriptional consistency in scholarly studies, should be used to assess whether transcriptions for a rarely attested language have become relatively stable. I don't have a strong opinion on this, however.
Regarding Bai language, here is a picture of the word "water" in Latin-scripted Bai: I think the Bai languages should be grouped together, and recorded using this writing system. Wyang (talk) 06:21, 9 March 2016 (UTC)
We can't exactly "hold off"; that's antithetical to "all words in all languages". As for the Bai lects, do you have a source that supports grouping them? I'm inclined to agree with you just because you are so much more knowledgeable about that part of the world, but evidence helps. —Μετάknowledgediscuss/deeds 06:37, 9 March 2016 (UTC)
This suggests that though they grade into one another, there are enough differences across the groups of lects that there is unintelligibility. That suggests that multiple centres of intelligibility may be a better way to capture what's going on, even if the ones the ISO uses are slightly arbitrary. —Μετάknowledgediscuss/deeds 06:41, 9 March 2016 (UTC)
"All words in all languages" is a simple enough catchline that summarises this project reasonably succinctly. We, however, do not aim to record all words in all languages, for example all the words in agglutinative languages, or transcribing words in a previously-undocumented language singlehandedly. We record lemmata and certain non-lemmata in as many languages, in forms these words are usually recorded in. (cf. the policy on neologisms)
Bai languages have a fairly well-conserved set of basic vocabulary across varieties, and are perceived by speakers and usually handled in studies and dictionaries as varieties of a single Bai language, which is the reason I'm in favour of the amalgamation. Wyang (talk) 07:11, 9 March 2016 (UTC)
Like Metaknowledge, I'm disinclined to exclude some languages, especially ones about which we have modern (often detailed and careful) documentation, sometimes in the form of entire dictionaries, grammars, and compendia of transcribed stories. We include some old languages from which only one old text or even only one word survives; that is arguably less useful or more prone to error: maybe it happens that the one word was spelled lazily; whereas, ɕy³³ was carefully recorded as the exact word used in 8 of 9 places. We can always move the content later if the community settles on a certain orthography; we do that even when a community of speakers changes from one established, non-IPA orthography to another (e.g. German entries use the currently-used currently-prescribed spellings — not the spellings from a mere 20 years ago — as the lemmas).
I find some PDFs that say they are examples of Bai, and that use xuix (27, 32); the difficulty they pose is that they don't contain glosses/translations, so it's difficult to figure out how to map the scholarly transcriptions into that orthography, or tell what the words in the texts mean.
Most references I can find do speak of "Bai" or "the Bai language" (or "Bai Dialect[s]") as if it were a unitary thing, so I'm not opposed to merging and making liberal use of {{label}}s and {{a}}s. I would have entered the words as only one language if there had been a single code for that. (Interestingly, a lot of its "fairly well-conserved set of basic vocabulary across varieties" is borrowed from Chinese — 47 of 100 Swadesh items.)
If we merge the Bai varieties, do you think it's better to repurpose one of the varieties' codes as the code for the whole language (as we tended to do in the past, e.g. with acf/gcf), or create a new (longer) code from scratch (as we've tended to do recently)? - -sche (discuss) 08:55, 9 March 2016 (UTC)


We've already blocked Willy2000 (talkcontribsglobal account infodeleted contribsnukeedit filter logpage movesblockblock logactive blocks) for mass creation of entries in languages they don't know based on non-English Wikipedia entries. They just created some more as an IP ( (talkcontribswhoisdeleted contribsnukeedit filter logblockblock logactive blocksglobal blocks), at least some in languages you've worked with. Could you check those entries? I would also appreciate your opinion on whether to start mass-deleting their entries in an attempt to get them to stop. Chuck Entz (talk) 13:34, 16 March 2016 (UTC)

He could also just nuke his work, but that’s kind of a lazy thing to do (in my view). Still, I can understand why he’d prefer that. --Romanophile (contributions) 13:42, 16 March 2016 (UTC)
By "mass-deleting", I was referring to what we call nuking. As for laziness: volunteer time from people who are knowledgeable enough to check Willy200's edits is a precious resource that shouldn't be wasted on following people around to clean up their messes- unless those volunteers want to do it. Chuck Entz (talk) 13:55, 16 March 2016 (UTC)
The Pennsylvania German words were correct. @Kolmiel can probably shed light on which Central Franconian spellings should be made the lemmas and which should be alternative forms, but the spellings this user entered are at least correct as alternative forms. Entries created based on Wikipedia articles could be wrong (in the BP we're discussing how some Wikipedias make up words), but when it comes to basic concepts like these, they're probably correct (which is probably why the user thinks it's OK). At least, it will normally be possible to find out what the correct terms are and move the entries (for well-documented languages, anyway; I'm having trouble finding out about Lombard), so I wouldn't nuke all the entries, but maybe the ones that it's not possible to find independent confirmation of (like Lombard). - -sche (discuss) 16:11, 16 March 2016 (UTC)
The days of the week? They look okay. Only in Sambsdaach I don't see any need for the -b-; the normal spelling would be Samsdaach, but Ripuarian wikipedia seems to use Sambsdaach, too, for whatever reason. Kolmiel (talk) 15:10, 17 March 2016 (UTC)
Oh yeah, and Freidaach is Moselle Franconian, while all the other forms entered are Kölsch. Kölsch for Friday would be Friedaach. Kolmiel (talk) 15:15, 17 March 2016 (UTC)


Hi. What prompted my removal of my edit to the homoflexible page? You asked me to move it to the homoflexible page from the homofelxibility page, and I did, yet you took it down again. What, may I ask was wrong with it? I very much want my edit to stay, so if there is anything I can do to, to make it right, please let me know.

Amuzgo entriesEdit

These need some cleanup after your rename of the language. DTLHS (talk) 03:16, 2 April 2016 (UTC)

Indeed. I tried to go through them with AWB yesterday but I couldn't log in (perhaps the same problem Semper mentions that his bot is having). - -sche (discuss) 06:32, 2 April 2016 (UTC)
(Now that AWB is working again,) I think I've fixed all the Amuzgo entries, and now only have to fix a couple dozen translations-table entries. - -sche (discuss) 16:04, 17 April 2016 (UTC)


Did you mean to add this word to the translation table of woman or water? The definition you put says water and I was curious as to whether something had gone wrong. Tulros (talk) 10:52, 17 April 2016 (UTC)

@Tulros Thanks for catching that! I added the Yanomamö translation of water and an entry for it, and then copied and pasted that to create this entry, but forgot to change the gloss (despite updating the references!). - -sche (discuss) 16:02, 17 April 2016 (UTC)


Currently the Russian Wiktionary has a mixed use of lowercase (20%) and uppercase (80%) Palochka. I'm trying to understand what is right. In this discussion on en.wiktionary in August 2012, you stated that "we should use the lower case", but was there any reason or documentation behind this recommendation? I think the Wikimedia project that uses this character the most is the Chechen Wikipedia, and it is totally dominated by the uppercase Palochka. I found 2.7 million occurrences in the latest XML dump. It would be nice if we could find a consensus covering all WMF projects on how to handle this special character. --LA2 (talk) 21:50, 20 April 2016 (UTC)

There's an ongoing discussion at User talk:Stephen G. Brown#Palochka. @LA2: It would make things easier if we didn't have millions of discussions in different places on the same topic. If you feel that someone's input is needed, it's better to direct them to an existing discussion than to start a new one on their talk page. --WikiTiki89 21:10, 21 April 2016 (UTC)

Leftovers from Zarphatic mergeEdit

Just on the off chance these were overlooked: see CAT:E. If you just haven't gotten around to fixing them, never mind. Chuck Entz (talk) 15:16, 29 May 2016 (UTC)

Thanks; I had searched for pages containing the language code, but I got the impression that the site was in the middle of updating (to reflect the removal of the code) at the time, which apparently means those few pages were in limbo and didn't show up. - -sche (discuss) 16:24, 29 May 2016 (UTC)
son got missed, probably because I said Zarphatic and the translation in question is Shuadit... Chuck Entz (talk) 03:27, 30 May 2016 (UTC)
Return to the user page of "-sche".