Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

Beer parlour archives

May 2017

Blocking is too easy?

I'm of the opinion that blocking should really be a last resort. I see too many editors too willing to block other users. We need all the editors we can get, and blocking users is really counterproductive in this respect.

One example: User:Stubborn Pen, who was blocked Jan 2016 by User:Metaknowledge for having an "unacceptable username" (???). This user had put in 5,000+ edits, most of which look fine to me, and had clearly become a regular contributor. I don't for the life of me see what's wrong with this user name, and even if something were wrong, this is hardly grounds for blocking -- instead, suggest to the user that they rename their user account (which doesn't seem to have been done, AFAICT). Benwing2 (talk) 02:56, 2 May 2017 (UTC)

Hmm, OK, I see from another message that this was an alias of Wonderfool, so maybe a block was reasonable, but people seem to tolerate Wonderfool, so who knows. Benwing2 (talk) 02:58, 2 May 2017 (UTC)
If you look at Stubborn Pen's last contributions you can see they requested to be blocked. DTLHS (talk) 03:04, 2 May 2017 (UTC)
Actually, someone should indefinitely block him rather than block him for 14 years. @Benwing2: do you have other examples? —Justin (koavf)TCM 03:23, 2 May 2017 (UTC)
  • Firstly, we don't need "all the editors we can get". We need editors who contribute well. If an editor's edits are half constructive and half requiring cleaning or attention from others and they refuse to improve their quality after thorough explanations of the problems and warnings, then a block is the right thing to do for the project. Secondly, I understand the source of your confusion, but I don't appreciate this subject being brought up with a block that you did not even look into, but you used anyway to paint me as abusive. —Μετάknowledgediscuss/deeds 03:24, 2 May 2017 (UTC)
Sorry, I missed his request to be blocked. Metaknowledge, I'm not trying to paint you or anyone as abusive, I'm just questioning whether we need to block as quickly as we do. In the short term, it might be pragmatically the right thing for the project, but in the long term, it might not be, because it fosters the impression that Wiktionary is intolerant of new editors. (Wiktionary's syntax is definitely not so easy to learn, and not so well documented, so it's already not very easy for new editors to get involved.) As for other examples, User:Wikitiki89 blocked User:D1gggg for edit warring, and I then unblocked him because the edit war in question wasn't really an edit war, just a revert. Now, granted, User:D1gggg is very annoying and makes edits that need cleaning up after, but IMO it's not clear that blocking is the right solution. In any case, D1gggg seems to have gone away on his own for the moment, which is probably a better outcome for such users. Similarly, it's not clear to me that User:Awesomemeeos should have been blocked for a year, although he was similarly annoying. Benwing2 (talk) 03:35, 2 May 2017 (UTC)
Excuse me? When one user makes an edit, is reverted, and then reverts the reversion, that's an edit war. --WikiTiki89 06:06, 2 May 2017 (UTC)
I didn't block Awesomemeeos for being annoying. I blocked him for two weeks for repeatedly adding incorrect information in languages he doesn't know after being warned, and upped it to a year for block evasion. Are you trying to say that when somebody gets around a block by editing as an IP or creating a new account, we should let them off lightly and encourage them to keep doing this? —Μετάknowledgediscuss/deeds 03:41, 2 May 2017 (UTC)
No, but a year seems excessive to me. A year might as well be indefinite; I doubt you'll see Awesomemeeos back after a year. I would do something like double the block every time you need to block him anew, rather than jumping directly from 2 weeks to a year. Benwing2 (talk) 05:07, 2 May 2017 (UTC)
It's common to give an indefinite block for abusing multiple accounts, actually. Remember that the length of a block is about the offences a user has committed, not about the length of the last block they happened to get. —Μετάknowledgediscuss/deeds 06:12, 2 May 2017 (UTC)
@Benwing2, I agree with you. For dealing with editors who are not admins, I have always preferred to protect a page for a few days rather than blocking the persistent editor. However, an odd thing usually happens when I protect the page instead of blocking the editor ... the editor becomes enraged that the page was locked. OTOH, if I leave the page unprotected, but block the editor for a week, then when he comes back he does not seem upset by it. It's the weirdest thing ... if I act unreasonably and block him, he doesn't mind. But if I act with restraint and treat him with respect, protecting the page instead of blocking him, he goes insane. In spite of this, I still prefer to protect a page if possible and not block a good editor.
The problem is, most people have certain talents, certain areas or subjects they're good at, and they're not so good at other things. Some are great at programming and writing applications and templates. Others are great at formatting a page or categorizing. Some are good at writing or translating or researching or citing. Some people are good at surgery, cooking, undertaking, or mechanics. On Wiktionary, there are a lot of different areas that require expertise and experience, and few if any editors are good at everything. One of the areas that most people suck at is policing the people: the regular editors, the occasional editors, the visitors, the jokers, the vandals. Some of us get upset with Awesomemeeos for adding information in languages that he does not know, yet we think nothing of it when an editor who is a great writer but who hasn't a clue when it comes down to policing the wetware, the human beings who work here or come to visit. Ideally, dealing with the people here should be left up to certain admins who have that talent. —Stephen (Talk) 09:14, 2 May 2017 (UTC)
Sounds like you'd prefer to formally introduce subtypes of admindom with powers only assigned areas. Korn [kʰũːɘ̃n] (talk) 10:31, 2 May 2017 (UTC)
Well, I think it would be nice if people could recognize areas where they lack experience, knowledge, and talent. Everyone (except Awesomemeeos) seems to be able to recognize Awesomemeeos's weakness in languages that he does not know, yet it does not occur to anyone that they themselves may not have experience and wisdom when it comes to guiding human behavior, and what tools should be used when, and how to make reasoned and constructive decisions on severity and length of punishments. What we do on Wiktionary is like Donald Trump naming himself to the Supreme Court. We each know our strengths and talents, but, like Awesomemeeos, we don't know our weaknesses. I have no answers. —Stephen (Talk) 11:18, 2 May 2017 (UTC)
Thanks for saying that Stephen; I think we could do a lot of work on establishing what it means to be an administrator here, and, as importantly, what it does not mean. The roles and responsibilities of various account flags are poorly defined. That is not the whole of the problem, but I think it might be a part of it. - TheDaveRoss 14:58, 2 May 2017 (UTC)
As I recently said on my own talk page: "Maybe we need a lot of trusted people with reverting/patrolling rights, template editing rights, page deleting rights, but I don't think we need so many people with blocking and page protecting rights." --WikiTiki89 15:14, 2 May 2017 (UTC)
I started a draft regarding what administrators are meant to be here, if anyone would like to contribute: Wiktionary:Administrators/About. - TheDaveRoss 20:56, 2 May 2017 (UTC)
In order to try to improve our retention of non-vandal newcomers, a few years ago I started using a new patrolling strategy based on three pillars:
  • no undoing or non-minor changing of the newcomer’s edits will go unexplained, even if it is obvious why it is being undone/changed;
  • if the newcomer adds new content, I will try to incorporate it into the entry. If an edit adds an unformatted entry or a definition in the middle of nowhere, I will try to fix the formatting rather than undoing;
  • express my gratitude to newcomers who make contributions that show a relatively big effort whenever possible. This means using the thanks feature for registered newcomers, and using “thanks, but [] ” when undoing. Unfortunately there is no similarly simple way to thank IPs unless their contribution requires a follow-up edit.
Another thing I now avoid is giving someone the {{welcome}} template. I feel that that impersonal wall of text is more likely to scare than help. I think it’s better to send someone a message explaining what they are specifically doing wrong.
It’s not always possible to follow this strategy, but it is a long-term investment: I spend more time dealing with each individual edit than I would if I just freely smashed that good old revert button (leading to fewer edits being patrolled by me), but hopefully even a slight increase in retained editors will, after some years, increase the pool of users who are able to help with patrolling and building the dictionary. — Ungoliant (falai) 13:28, 2 May 2017 (UTC)
I agree with you on this, especially {{welcome}}. The current template is just too long and one-size-fits-all, better to leave a short personal note directly addressing issues and options w.r.t. the areas/languages they've been contributing in. A link to a page with info similar to that in the current template could still easily be included in a post like that. — Kleio (t · c) 19:30, 2 May 2017 (UTC)
I agree as well. I've generally been following those three pillars, although I should try harder on the first one. If anyone notices me breaking those pillars, feel free to remind me, because I may not have realized it. --WikiTiki89 19:37, 2 May 2017 (UTC)
@Ungoliant MMDCCLXIV And thank you for that. I made some pretty stupid edits when I started off here, and your patient handling of the situation made me feel a lot more welcome than if you had just reverted my edits with no explanation (and made me actually feel badly about it, rather than angry, so I was motivated to watch myself a bit more). The same goes for Μetaknowledge and Wikitiki89. You all did a good job not scaring me away. ;) Andrew Sheedy (talk) 21:07, 2 May 2017 (UTC)

'character info' box

As used in Pa and when using tabbed languages this box becomes intrusive - occupying up to 2/5 of my page width - leaving 2/5 for the rest of the entry (and 1/5 for the tabs). Two points:

  • Do we need these boxes?
  • If we must have them, they shouldn't be above the top language heading.

And there is also a supplementary point: what use are the "㎩" type characters in everyday life! — Saltmarsh. 05:11, 2 May 2017 (UTC)

If you don't like the boxes, one possible solution would be using CSS to hide them just for you. We don't seem to have a CSS class for the whole character box right now, but it can be easily added.
I like them above the top language heading because the box is not something tied to a single language, but I'm OK with discussing it. It could be in the Translingual section.
Maybe you already know about what I'm going to say, but since asked anyway, the boxes are important because they provide images for people who don't have the fonts (Pa doesn't have an image because nobody added it yet), and also hexadecimal/decimal codes for typing/displaying them easily (the hexa links to, with lots of technical information about the character), and it automatically places the entry in the respective "block" category, and links them to the block appendix. It's a good navigation tool. I consider the previous/next links a nice touch, too. As a future project, I'd like the boxes to automatically include codepoints for encodings other than Unicode.
Maybe some characters are more important or more commonly used than others, but I think all characters with entries need boxes, yes, not just some characters which are more important. Case in point: if we didn't have the character box in the entry Pa, I think I would insist on adding a "Usage notes" section in the entry, explaining that the single-character "㎩" exists. --Daniel Carrero (talk) 05:31, 2 May 2017 (UTC)
Thanks for replying. Brief - not argumentative - replies :)
(1) "one possible solution would be using CSS …" - I've avoided using CSS and would like to continue. The problem with much tweaking of our appearance is that editors don't see what the casual punter sees. (OK I know I'm using the tabbed/gadget).
(2) "It could be in the 'Translingual' section." I think that this might be a good idea. As you suggest, it could be repeated in the 'English' section under 'Usage notes' or similar. I'm afraid that I already moved some (before raising the subject) to a position below the Translingual heading.
(3) "the boxes are important because they provide images for people who don't have the fonts". But surely only specialists would require these characters. I was a chemist and studied physics - Pa would be typed from the keyboard! But it is good to know these characters exist (I didn't). — Saltmarsh. 15:51, 2 May 2017 (UTC)
You're welcome. :)
Maybe we should have a poll/discussion/vote with the proposals: 1) always keep the character box above all language sections vs. 2) always keep the characters box in the first language section.
For context, it seems almost all entries have the box above all language sections because at least three people consistently added it this way in many entries: @Visviva (the creator of the original character box with lots of manual parameters), @Kephir (the creator of the Lua module that gets some of the info automatically) and myself (I believe I could add the box in all existing single-character entries that didn't have it yet, including thousands of Chinese characters).
Yes, "Pa" (comprised of two separate letters) would be typed from the keyboard! But that box is specifically about the single-character "㎩". The box is on the normal Wiktionary entry "Pa" (two characters) because it seems we don't need separate entries for different codepoints that present exactly the same characters. ( redirects to Pa) Admittedly, I don't think anybody on Earth is greatly interested in typing "㎩" (single character). But maybe someone who doesn't have the right fonts to see it copies the symbol here to find the information. --Daniel Carrero (talk) 17:13, 2 May 2017 (UTC)
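For readers who want to check the distinction drawn above between the single-character "㎩" and the two-letter "Pa", here is a minimal sketch using Python's standard unicodedata module. The helper name describe is hypothetical and only for illustration; nothing here reflects how the character box module itself works.

```python
import unicodedata

single = "\u33a9"  # the single-character symbol ㎩
typed = "Pa"       # two separate letters typed from the keyboard

def describe(text):
    """Return (U+XXXX, Unicode name) for each codepoint in text (illustrative helper)."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

single_info = describe(single)
typed_info = describe(typed)

# Compatibility normalization (NFKC) folds the square symbol into plain letters,
# which is one way to see that the two spellings present the same characters.
folded = unicodedata.normalize("NFKC", single)

print(single_info)
print(typed_info)
print(folded == typed)
```

The symbol is one codepoint with its own name, while the keyboard spelling is two ordinary Latin letters; only compatibility normalization treats them as equivalent.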
I think all these boxes should be removed. We are not a Unicode database. --WikiTiki89 15:54, 2 May 2017 (UTC)
Do you mean that we shouldn't have character codepoint names, abbreviations and corrected names, Unicode hexadecimal codepoints starting with U+ (like: U+00C7), character entities formatted with &# ; (like: &#199;), LaTeX inputs, lists of languages where a character is used, Unicode appendices and links from entries to appendices, character images, lists of different codepoints for basically the same character (!, ❕, ❗, ❢, etc.), lists of alternate codepoints, Unicode block categories, character compositions (Ç = C + ¸; 기 = ㄱ + ㅣ), Dubeolsik inputs, hieroglyph codes, previous/next Unicode characters, and links to and w:List of XML and HTML character entity references?
Or should we place that information elsewhere in the entries?
Besides, we should be a Unicode database. That would be great. But that's just what I proposed in a past discussion (link), and I know you opposed the idea. --Daniel Carrero (talk) 17:26, 2 May 2017 (UTC)
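As a rough illustration of a few of the items listed above (the U+ codepoint, the numeric character entity, and the character composition), here is a minimal Python sketch using only the standard unicodedata module. It is not the character box's implementation, and the variable names are made up for the example. Note that Unicode's canonical decomposition uses combining marks and conjoining jamo, so the parts it reports are not literally the spacing "¸" or the compatibility jamo "ㄱ" and "ㅣ" shown in the comment.

```python
import unicodedata

ch = "\u00c7"  # Ç

codepoint = f"U+{ord(ch):04X}"  # hexadecimal codepoint notation
entity = f"&#{ord(ch)};"        # decimal numeric character entity

# Canonical decomposition (NFD) splits a precomposed character into its parts.
latin_parts = [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]
hangul_parts = [unicodedata.name(c) for c in unicodedata.normalize("NFD", "\uae30")]  # 기

print(codepoint, entity)
print(latin_parts)
print(hangul_parts)
```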
I think such information should go in an appendix at best, not in entries. Entries should be for lexicographical data. —CodeCat 17:30, 2 May 2017 (UTC)
Yes exactly, if we are to have this information, it should be in an appendix. --WikiTiki89 17:33, 2 May 2017 (UTC)
A character entry would be incomplete without it. Maybe not all dictionaries would have an entry for a letter, punctuation mark, diacritical mark or symbol but since we have them, let's say how it's composed (Á should link somehow both to A and ´), what are their codepoints, etc.
Should we have a vote with the proposal "Have the character box in all character entries"? I realize it's not part of WT:EL yet. --Daniel Carrero (talk) 17:34, 2 May 2017 (UTC)
I've never seen a dictionary that gives Unicode codepoints for character entries. To see an example of what normal dictionaries have in entries for letters, see the Oxford Dictionaries' entry for the letter A, which is a fairly long and detailed entry with 26 senses/subsenses, but has absolutely none of the information supplied by these character boxes. That's the sort of information we should have in the main namespace for letters. --WikiTiki89 17:40, 2 May 2017 (UTC)
It's fairly long and detailed, because it includes long things like lists of definitions for the article a and abbreviations, not to mention phrases like "from A to B". It barely says anything about just the letter. I'm all for having all that information, but we have definitions for symbols that Oxford doesn't have, like + and ´ (it does have & and %, but they are too short, they don't have the etymology or mostly anything). Most dictionaries were published before Unicode or other codepoints existed, and they don't have a great list of symbols. We are already a symbol database, as long as the symbols are attested. Feel free to try and change that (maybe by deleting a lot of symbols?) but that's the status quo. (as you know, my proposal in the previous discussion was to have unattested symbols too) --Daniel Carrero (talk) 17:54, 2 May 2017 (UTC)
@Daniel Carrero: Note that I'm only talking about the entry for the letter (not the article or abbreviations) which is A3 and only starts at the anchor link I provided. And this particular online dictionary is continually updated long after Unicode. And my point exactly is that it talks about the uses of the letter, and not how it's encoded by computers. --WikiTiki89 18:00, 2 May 2017 (UTC)
I take your points: this particular online dictionary is continually updated long after Unicode, and you are talking about only the letter, A3.
If we were allowed to copy the senses directly from Oxford, we would insert most of 3 into a "Noun" section rather than a "Letter" section. (I'm not saying this is right or wrong, this is just how our senses are currently formatted.) The letter section would still have the etymology, multiple pronunciations, place in the alphabet, usage notes concerning orthography, links to versions with accents when they exist, etc.
I still support having the character box, since we're a symbol database already. Maybe we should have a vote to see if there's consensus from other people? --Daniel Carrero (talk) 18:11, 2 May 2017 (UTC)
What section it's in doesn't matter so much, that's not the point I was trying to make. I'm only talking about which information we should have versus which information we don't need. You say "we're a symbol database already", but what I'm saying is that we shouldn't be. --WikiTiki89 18:57, 2 May 2017 (UTC)
It goes without saying, but feel free to discuss/vote/talk to people about that. I'm not convinced that we shouldn't be a symbol database (especially because we're pretty awesome at being a symbol database), but presumably whatever we collectively decide is fine. --Daniel Carrero (talk) 19:06, 2 May 2017 (UTC)
So just curious, do we have the slightest shred of evidence that users come here looking for "Dubeolsik inputs"? Equinox 17:32, 2 May 2017 (UTC)
I don't know, can we ask our Korean speakers here? Are there a lot of ways to type in Korean other than knowing Dubeolsik? Is having Dubeolsik here useful? I started to learn a little about Dubeolsik specifically because it was already there when I was editing the character box module. --Daniel Carrero (talk) 17:56, 2 May 2017 (UTC)
I go to Wikipedia for Unicode-related information. If you think about it, to get to the entry for a particular character, you already need to know its code point (or rather, the computer needs to). At that point, the entry is really just telling you the code point of the character you just typed in. There's lots of sites that do that job way better than Wiktionary, like say which does it for an entire string at once. —CodeCat 18:13, 2 May 2017 (UTC)
I like that does it for an entire string at once. But I wouldn't ask whether other websites already do a better job, I would ask if we can do a better job than them. Besides, Wiktionary is better than in everything other than full string searches and presenting UCS-2 binaries. That website does not have any images for the characters, especially not any legally reusable images. And I didn't find any appendices listing all the characters there (with legally reusable images). It does not have lists of different codepoints for the same character. It does not have most of the things I listed above. Wiktionary is much better than that. --Daniel Carrero (talk) 18:23, 2 May 2017 (UTC)
Toasters can also toast bread better than Wiktionary. That's not an invitation for us to try to outdo a toaster. —CodeCat 18:25, 2 May 2017 (UTC)
Toasters can toast bread better than Wiktionary, but Wiktionary is a better symbol database than, except for full string searches and presenting UCS-2 binaries. --Daniel Carrero (talk) 18:27, 2 May 2017 (UTC)
Even if we are a better symbol database than, that doesn't mean that we should be. Perhaps we should create a new Wikimedia project called Wikisymbols or something and you can go and make that the best symbol database around, but I don't think that belongs at Wiktionary. --WikiTiki89 19:00, 2 May 2017 (UTC)
I don't know, maybe I could support having a separate Wikimedia project like "Wikicharacters" if other people want to create it. But I'd most certainly still think that it's good for Wiktionary to keep being a dictionary of characters, even if that separate database existed.
"Wikisymbols" looks like it could become something about pictographic associations. A project to list all the abstract "meanings" of pearls, dragons, the color red, the ocean and the stars and so on. I used to have a pretty thick "dictionary of symbols" that worked like this. --Daniel Carrero (talk) 19:18, 2 May 2017 (UTC)
Why? Wikipedia kicked out its dictionary-like entries into a separate project called Wiktionary. Now Wikipedia no longer needs dictionary-like entries. Same thing here with symbols. And I don't particularly care what this new project would be called. --WikiTiki89 19:21, 2 May 2017 (UTC)
I gave a few reasons above for keeping symbol entries in Wiktionary. Anyway, if other people decide they want to move all our symbol entries to a new symbol project, and Wikimedia accepts it, I guess it shall be done. It just doesn't have my support. --Daniel Carrero (talk) 19:28, 2 May 2017 (UTC)
I also wouldn't mind if it were moved to an appendix. --WikiTiki89 19:41, 2 May 2017 (UTC)
My two cents:
Wikipedia is more than an encyclopedia, and includes things that no other encyclopedia would ever include. Why shouldn't Wiktionary be much more than a dictionary? We already have synonyms and a thesaurus, which I reference regularly (I still have to supplement it with, etc., but it's getting there), as well as translations, non-lemma entries, and any number of things that normal dictionaries don't have, and which I find extremely useful. Why not include symbols as well, since they're so closely linked to language? Language boxes may present more than lexicographical information, but, among other things, they do help a person understand what they're reading if they can't see the symbol, and isn't that the point of a dictionary? Now, I don't feel too strongly about whether we should have symbols here or not, but I have used Wiktionary a number of times to figure out what certain characters were when my browser didn't support the fonts needed to see them, so I certainly don't think there's any harm in keeping them. Maybe the boxes could be made less obtrusive, though. Andrew Sheedy (talk) 20:19, 2 May 2017 (UTC)
I can understand having an image of the various shapes of the character and possibly a description. But why is the Unicode information, HTML encoding, and Dubeolsik input necessary here? --WikiTiki89 20:24, 2 May 2017 (UTC)
I can see HTML coding maybe being useful to someone, and possibly Unicode as well (assuming that's what you use with the ALT key when typing characters that aren't on your keyboard, but I don't do that, so I wouldn't know...). I wouldn't object to those being removed, however, unless someone spoke up and said they actually use that information. Andrew Sheedy (talk) 20:50, 2 May 2017 (UTC)
Nevermind, the ALT key thing doesn't seem to have anything to do with Unicode, unless I'm doing it wrong. That being said, I wouldn't be opposed to adding that information as well, if someone found it useful. Andrew Sheedy (talk) 20:55, 2 May 2017 (UTC)
Well I'm not even saying that the information isn't useful, just that it's not dictionary material. There are other places where you can look up this kind of information. --WikiTiki89 21:04, 2 May 2017 (UTC)
I guess what I mean is "useful to have on Wiktionary." It might be useful, but easier/more logical to look up elsewhere, in which case Wiktionary isn't the place for it. But if there are people who use Wiktionary to find that information, then I have absolutely no problem keeping it here. Andrew Sheedy (talk) 21:10, 2 May 2017 (UTC)
The main benefit of the boxes, AFAICT, is the images; the main harm is the amount of space they take up, which could be reduced. Perhaps the links to the "preceding" and "following" characters could be removed (I don't see what use they are), and the other things could be collapsed, and the width of the box could be reduced. The Unicode character names are sometimes potentially helpful (when descriptive), sometimes harmful (when they are wrong but Unicode insists on them for "stability"), and usually neither (neutral). In the specific entry that sparked this post, Pa, I think it's probably unhelpful and possibly misleading to give the Unicode info of a character that is not the one the viewer is viewing. - -sche (discuss) 21:03, 2 May 2017 (UTC)
See ό, which has a "technical reasons" message in the character box. As was discussed before (link), I would support consistently adding short explanations in the character boxes like "This is the entry for Pa, this is the character box for ." or whatever wording explains it best. See Wiktionary:Character variations for a tentative think-tank page concerning the character variations and their character boxes.
See also Category:Character boxes with corrected names, which is automatically populated with 10 entries whose character boxes have a "corrected name". I don't know the source of the corrected names, but presumably they are meant to fix the problem of Unicode keeping a wrong codepoint name for "stability". --Daniel Carrero (talk) 21:13, 2 May 2017 (UTC)
I personally find the character boxes useful. It allows me to see the Unicode name and number of the character, when the character is visually similar or almost identical to other symbols (see, for instance, ), or to see the character's canonical decomposition.
Regarding the proposal to move this information to an appendix: I like having the information readily available in the entry, but having it in an appendix would be okay as long as it's quick to navigate there. Any proposals for how to make that possible? A link near the "see also" links at the top of the entry? — Eru·tuon 01:53, 3 May 2017 (UTC)
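To make the point about near-identical characters and canonical decomposition concrete, here is a small Python sketch with the standard unicodedata module. The choice of the two omicron codepoints is an assumption made for illustration, picked because ό was mentioned earlier in the thread; it is not taken from any specific character box.

```python
import unicodedata

tonos = "\u03cc"  # ό as usually typed in Modern Greek
oxia = "\u1f79"   # a visually near-identical codepoint from the Greek Extended block

# The Unicode names are what tell the two lookalikes apart.
names = [unicodedata.name(c) for c in (tonos, oxia)]

# The second codepoint has a canonical decomposition pointing at the first,
# so canonical normalization (NFC) maps both to the same character.
mapping = unicodedata.decomposition(oxia)
same_after_nfc = unicodedata.normalize("NFC", oxia) == tonos

print(names)
print(mapping, same_after_nfc)
```

Software that normalizes text cannot keep the two codepoints apart, which is the kind of situation where seeing the name and number in the entry helps.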
I oppose moving them to appendices. I mean, are we talking about how many characters per appendix? Maybe each appendix would list lots of characters, or the appendix would have one character only, or somewhere in between? We already have Appendix:Unicode, listing lots of tables with character information. They are nice as navigation tools, but finding a single character there is a hassle. Single-character appendices, like maybe Appendix:∌ or Appendix:な would feel like unneeded duplication, since we'll still have the symbol entries. Linking the appendix at the top would be more trouble to find the desired information. Maybe not everyone would think of finding and clicking the link. --Daniel Carrero (talk) 04:16, 3 May 2017 (UTC)

I created Wiktionary:Votes/2017-06/Allowing character boxes. --Daniel Carrero (talk) 09:55, 5 June 2017 (UTC)


@JohnC5, Chuck Entz, Metaknowledge, Vahagn Petrosyan: User:Djkcel continues to add spurious etymologies, habitually mis-formats entries, using plain text and incorrect template code, and doesn't know how to properly cite references. Users have repeatedly gotten on his case for it, but he just shrugs it off and deletes the comments from his talk page. At the very least, can we please remove his autopatrol status? --Victar (talk) 01:20, 3 May 2017 (UTC)

I've pointed out errors several times over the years, and they apologize for the specific error, but continue to add etymologies from sources that they don't understand. Here is me cleaning up a spectacularly bad Greek etymology added shortly after they removed Victar's request that they be more careful. They're trying very hard to be accurate, but they simply don't have the background.
You'll notice that the first Ancient Greek word they gave has no accent- that's because they just transliterated it from a Latin-script mention somewhere. The accent is phonemic in Ancient Greek, so you can't link to the right entry without it- not that you'd want to in this case, because our lemmas are at the first-person present indicative, not the infinitive. The second one is a first-person present indicative form, and it does have an accent- after an extra "ιζ" that doesn't belong there. After years of adding Greek terms to etymologies, they still don't know enough Ancient Greek to spot these obvious errors. They also included a Proto-Indo-European root without any explanation for the missing *k and *n ("ζ" from *d seems plausible, though). This isn't a one-off fluke, either- they simply don't know what they're doing, and it shows. Chuck Entz (talk) 04:19, 3 May 2017 (UTC)
Exactly. Djkcel simply wanders where he pleases and edits entries in languages he demonstrably has absolutely no knowledge in. He has clearly shown over the course of several years that he is unwilling and/or incapable of improving his edit quality. --Victar (talk) 04:58, 3 May 2017 (UTC)
I have been continuously frustrated by this user. I have no further commentary to add except that this is not the first appeal I have received to look over this user's etymologies for accuracy, which is a nigh insurmountable undertaking. I have no plan for remediation but would register the continual irritation of having to rewrite these minor untruths. —JohnC5 05:01, 3 May 2017 (UTC)
I too would like to register my irritation with Djkcel, who apparently acts in good faith but in my opinion is harming the project. --Vahag (talk) 07:30, 3 May 2017 (UTC)
I figured there’d be a gathering of the gods about me at some point or another, but I at least appreciate the ability to partake in it. A few things:
  • I promise you that all the edits, good or bad, are in good faith. I have no malicious intent and I just want to help.
  • I sincerely do apologize for the issues in formatting and sourcing. I am making a conscious and daily effort to improve across the board. As someone mentioned further up on this page, Wiktionary’s syntax is difficult to master and its documentation has left something to be desired (e.g. I can’t find any entry on ‘templunk’, or how to replace the word ‘unknown’ with ‘uncertain’ and keep it tagged as unknown. I had to find it through other edits); even when I think I’ve done it well, it’s apparently wrong. In the past I’ve been guilty of forsaking time spent formatting in favor of rapid editing, because I felt that the formatting got in the way of the true goal of expanding knowledge and discussion. This was a mistake, and I’m going to pay more careful attention to it in the future. Also, regarding sources, I’m going to consult direct ones. I’ve used ‘middle-man’ sources like Harper’s etymOnline, but, according to the editors/reverters on here, over 50% of his entire site is wrong. Also, you're right, my Greek sucks. I'll use Template:rfscript for words I'm not able to copy and paste.
  • While I acknowledge these learning curves and mistakes, I do think you guys are being a little too harsh on me by saying I have no idea what I’m doing. With the 6000+ edits I’ve made I do like to think that I am choosing interesting stuff to work on, meticulously tracing word lineage and at least creating some unique dialogue/discussion. 95% of what’s said to me on here is criticism and I’m rarely thanked for any edits, so it is hurtful. I do want to be accepted for my contributions instead of being seen as a burden or some idiot who’s doing more harm than good.
  • I didn’t know deleting posts off of your talk page was frowned upon. User:Victar’s post was a request for more careful sourcing, something I processed, followed by a threat to remove me from the whitelist. I just didn’t want to see it anymore. I won’t delete posts again.
  • Finally, @Victar, I appreciate your recent turn of attention toward me and the quality of my edits. Since fsojic left, I haven’t had a primary scolder. Although, the things he said to me bordered on downright abusive and constantly violated Wiktionary’s rules of civility, while your discussion has been a little more constructive and concerned with the overall good of the project's quality. It’s helpful and it keeps me from going on autopilot.
Thank you all for the patience, advice, prodding, etc. over the years. For personal reasons, this has been an extremely difficult year for me, and I can’t lose wiktionary. I’m very passionate about exploring this site - for every 1 entry I edit I’m looking at 30 other pages out of sheer curiosity. Please realize that I’m going to make every effort to continue to improve, expand, stimulate, and quality-check the knowledge and discussion I contribute to this project. I hear you, and I'm with you. Djkcel (talk) 14:24, 3 May 2017 (UTC)
@Djkcel:, I can appreciate your intentions, but the problem is we've been dealing with the same issue over and over again since 2012 and we haven't seen much of any improvement. We don't need apologies, we need change, and 6000+ edits means nothing if the majority are bad. I had asked you to be more mindful of sources, particularly for reconstructions, and the next day you went and did that same thing and then deleted my post from your talk page. That's not indicative of someone willing/capable of changing. We simply can't have you editing languages you don't have knowledge of, e.g. Greek, because doing so, as Vahagn says, is harming the project. --Victar (talk) 15:13, 3 May 2017 (UTC)

Voting has begun in 2017 Wikimedia Foundation Board of Trustees elections

19:14, 3 May 2017 (UTC)

List of redlinks

Is there any way to generate a list of redlinked uses of {{l}} and {{m}} for a certain language? It would be useful to add e.g. words that already have cognates listed in another language's entry. —Aryamanarora (मुझसे बात करो) 22:57, 3 May 2017 (UTC)

Yes; Category:Redlinks by language. I don't remember who was running it, though. @DTLHS, was it you? —Μετάknowledgediscuss/deeds 23:21, 3 May 2017 (UTC)
That's an automatically generated category. I could generate a list for a specific language from specific templates if requested. DTLHS (talk) 23:23, 3 May 2017 (UTC)
There seem to be such categories already: Category:Ancient Greek redlinks/l. — Eru·tuon 00:02, 4 May 2017 (UTC)
But only for certain languages, it seems. Andrew Sheedy (talk) 02:43, 4 May 2017 (UTC)
It's an expensive thing to check, especially if we did it for every language. DTLHS (talk) 02:46, 4 May 2017 (UTC)
It's User:Daniel Carrero. Please help him make this job less expensive.--Anatoli T. (обсудить/вклад) 02:52, 4 May 2017 (UTC)
If it helps, you can enable/disable redlink checking for a certain language by editing the list of languages at {{redlink category}}. --Daniel Carrero (talk) 02:57, 4 May 2017 (UTC)
Thanks for all the help! —Aryamanarora (मुझसे बात करो) 14:10, 4 May 2017 (UTC)

Cognate temporarily disabled

Hello all,

Since yesterday evening, the extension that provides the interwikilinks is temporarily disabled, to solve a performance problem. We hope that the extension will be back during the present day, and we apologize for the inconvenience caused. Lea Lacroix (WMDE) (talk) 09:07, 4 May 2017 (UTC)

Update: we're still working on fixing the problems of the extension. To spare you further inconvenience, we reactivated Cognate in "read-only" mode: that means that the interwikilinks are back, but if new pages are created, or some pages deleted, the links are not currently updated. Thanks again for your understanding, I'll keep you posted as soon as everything is running correctly again. Lea Lacroix (WMDE) (talk) 18:33, 4 May 2017 (UTC)
Update: Cognate is back :) If you spot some errors, links that are missing or should not be there, etc. please ping me.
I'll come back to you soon with a proposal to deal with the redirects. Lea Lacroix (WMDE) (talk) 08:17, 11 May 2017 (UTC)

May Lexisession: flower

A mayflower.

This month's suggested collective task is to take care of flowers. The topic was suggested by some people who aim to create a thesaurus on French Wiktionary. Here, Wikisaurus:flower already exists but may be expanded as well. Also, there are plenty of pictures on Commons to illustrate entries, but quotations are quite hard to find and good sources are precious. The Category of flowers contains 233 entries, but garden plants on Wikipedia already lists 624 pages!

Let's celebrate spring and populate our virtual garden with some flowers!

Lexisession is a collaborative experiment without any guide or direction. You're free to participate as you like and to suggest next month's topic. If you do something this month, please report it here, to let people know you are involved in one way or another. I hope there will be some people interested in flowers   Noé 09:38, 4 May 2017 (UTC)

Removing a block reason

Hi. I propose removing "Adding nonsense/gibberish" as a block reason. If the nonsense is being added by way of troublemaking, then "Vandalism" already covers this. If not, or if we're not sure, then "Disruptive edits" would seem to cover it. Equinox 18:57, 4 May 2017 (UTC)

Fine with me. To be fair, disruptive edits and abusing multiple accounts cover pretty much everything I ever block for. - TheDaveRoss 19:19, 4 May 2017 (UTC)
I like it. I think it makes the vandal sound childish, whereas they might be pleased to be "disruptive" or have their actions labelled as "vandalism". That's pure speculation, of course. —Μετάknowledgediscuss/deeds 19:47, 4 May 2017 (UTC)
Mmm. Just a thought. I suppose my reasoning could also support removing "Vandalism" etc. because they are a subset of "Disruptive edits". To me, "disruptive" is usually something that the editor thinks is good but that actually causes big trouble, like single-handedly trying to reorganise Serbo-Croatian. Equinox 20:02, 4 May 2017 (UTC)
Personally, I miss having "Stupidity" as a block reason. —Aɴɢʀ (talk) 20:16, 4 May 2017 (UTC)
Oh, I would think "disruptive edits" were intentionally disruptive. I do think we have too many and redundant block reasons, and page deletion reasons. Maybe we can't merge all four of "adding nonsense/gibberish", "disruptive edits", "unacceptable conduct", and "vandalism", but some of them could be combined. How about combining as "Disruptive or unacceptable conduct", and (at the very least) "Adding nonsense/gibberish/vandalism"? And how often do we have to block for "unacceptable username"? And do we still have to block Open Proxies often enough that it needs to be a prewritten option in the list, or can we remove it? (Btw, on the subject of things vandals might be pleased by: being described as "intimidating"!) - -sche (discuss) 20:17, 4 May 2017 (UTC)
I suspect vandals are ultimately pleased by causing disruption, regardless of the particular words that the "disruptee" community uses to describe it. But yeah it seems funny to me sometimes to use "intimidation" when I am basically reverting "John Smith is a dick" and he could be anyone anywhere. Equinox 22:24, 4 May 2017 (UTC)
Support. People might add nonsense by means of not understanding the site or by error. The block reason should be the malign intention, which would make it vandalism. Korn [kʰũːɘ̃n] (talk) 22:22, 4 May 2017 (UTC)
Per Equinox's comment in the section below this, I've put a copy of the list of common block reasons at MediaWiki talk:Ipbreason-dropdown, where (I think) everyone can comment on each reason (maybe indented under each reason?) with what they use it for, whether they think it could be combined with something else, etc. - -sche (discuss) 02:40, 8 May 2017 (UTC)

Preset page deletion reasons

Related to the discussion above, I think there are too many preset page deletion reasons, which pushes useful page deletions down far enough that you have to scroll to find them, which is an admittedly minor annoyance. I don't think "bad redirect" and "residual from move" needed to be split and I suggest recombining them: "bad redirect or residual from move". And do we still delete transwikis often enough that it needs to be a preset delete reason? Related to the comments above about how "Vandalism" might be taken as a badge of honor, perhaps it could be folded into "incomprehensible, meaningless, or empty"? I would also suggest combining "not a citation; see Citations:hydrogen as an example" and "improper use of a talk or citations page". And I don't think we delete orphaned talk pages anymore (we put e.g. the entry's RFD on them), so I suggest removing that deletion reason and combining "orphaned documentation" with "unused template subpage". - -sche (discuss) 20:31, 4 May 2017 (UTC)

I think a bunch of them could be merged into a catchall "Technical deletion" line (redirect, created in error, residual from move, transwiki, technical). - TheDaveRoss 20:34, 4 May 2017 (UTC)
A milder solution could just be "unwanted redirect". —CodeCat 20:45, 4 May 2017 (UTC)
Okay. Can we create a temporary discussion page that lists all of the current deletion reasons, and have people comment under each one (possibly keep/delete, but preferably a proper comment with their understanding of what it means, and how they use it)? Then we could get a "big picture" of usage and probably collapse a few of them into others. Equinox 22:23, 4 May 2017 (UTC)
Good idea. I've put a copy of the list of common deletion reasons at MediaWiki talk:Deletereason-dropdown, where (I think) everyone can comment on each reason (maybe indented under each reason?) with what they use it for, whether they think it could be combined with something else, etc. - -sche (discuss) 02:41, 8 May 2017 (UTC)
I boldly combined the two "bad redirect" deletion lines and the "transwiki" line into an "unwanted redirect or transwiki" line. I also noticed there were two lines for in effect "deleted due to move/merge", which I combined. And I removed "orphaned talk page" as a delete reason. - -sche (discuss) 02:58, 8 May 2017 (UTC)

The late Robert Ullmann

The userpage of prolific Wiktionary contributor, Robert Ullmann--who passed in March 2011, and whose user page was agreed to have a link to his obituary back in September 2011--currently has a broken link.

The discussion with support for fixing his user page in the first place is here.

I did a quick google search and apparently valid links to his obit may still be found online. I suggest that the Wiktionary editor community come to consensus and agree to fix the (now broken) link on his user page. I would, of course, support that outcome and see his user page updated once more, after his death. N2e (talk) 00:33, 6 May 2017 (UTC)

There's no need for consensus, or even for a post in the Beer parlour. I've fixed the link, and in the future, you can just ask me or any other admin to do it. —Μετάknowledgediscuss/deeds 03:13, 6 May 2017 (UTC)
What should we do about the "email this user" feature in such cases? SemperBlotto (talk) 05:14, 6 May 2017 (UTC)
Nothing, in my opinion. If someone is stupid enough to try to email him despite the big obituary notice on the top of the page and lack of a talk-page, that's their problem. —Μετάknowledgediscuss/deeds 05:50, 6 May 2017 (UTC)
I am also not sure that we could change that setting. The developers could I suppose, but I agree with Meta that it is not a big deal. - TheDaveRoss 12:58, 6 May 2017 (UTC)
Thanks folks. I agree that I could have asked an admin; but I'm fairly inexperienced on Wiktionary as Wikipedia is my primary gig and I could not find an easy link to get me to an admin on Wiktionary. Sorry for the trouble. Cheers. N2e (talk) 02:27, 7 May 2017 (UTC)

Illyrian language

There are a number of module errors in Cat:E after User:Metaknowledge changed the language code "xil" in the data modules to an etymology-only language. As I've said in the past, there doesn't seem to be any justification for having mainspace entries in this language: it has no direct attestation and extremely limited indirect attestation in Ancient Greek and Latin texts, so I'm not advocating keeping it as unrestricted. I do note, however, that there are a number of etymological references which cite Illyrian words, and not all of them are fringe sources. Aside from terms actually mentioned in Greek and Roman texts, these seem to be reconstructed by analyzing those terms and looking for possible Indo-European cognates to possible morphemes within them.

I would recommend Illyrian being marked as a reconstruction-only language, but our current practice seems to be to reserve the Reconstruction namespace for terms derived via the comparative method from descendants. Although there's some speculation about the Albanian language itself and various substrata in other Balkan languages being descended from Illyrian, these don't really have any undisputed descendants. Any suggestions for how to resolve this without removing all Illyrian terms from our etymologies? Chuck Entz (talk) 05:08, 7 May 2017 (UTC)

Unfortunately, I think these are needful module errors (errors that should not have existed in the first place, and need to be cleaned up by hand). These entries should not be referencing Illyrian entries the way they are, especially those terrible Albanian etymologies. Reconstructions would be pretty unsupportable, but I'm not sure how (or if) hypothetical Illyrian terms should be mentioned. Unfortunately, I don't have the background or the resources to do a worthy job of fixing these; @JohnC5, Vahagn Petrosyan might be able to fix them. —Μετάknowledgediscuss/deeds 05:16, 7 May 2017 (UTC)
Since Illyrian is not attested in inscriptions, I support disallowing it in the mainspace. The glosses and onomastics attested in Ancient Greek and Latin texts can be handled under Ancient Greek and Latin. But I don't see any problem in allowing Illyrian reconstructions, as long as they are sourced. --Vahag (talk) 16:58, 7 May 2017 (UTC)
A user (you know the one) has been simply removing the {{l}} template around Illyrian terms as a way to fix the error that it now gives, apparently not getting the hint. —CodeCat 21:57, 7 May 2017 (UTC)
Yeah, this user is getting on my nerves. I would prefer to have Illyrian be a reconstructed language. As for Messapic and Phrygian, I can't decide whether I'm fine with this user using the Greek alphabet for the entries. Paleo-Phrygian and Messapic were in Greek-ish alphabet (ignoring Neo-Phrygian, which is in the Greek alphabet), but that is something that would need to be discussed. Honestly, I don't know what to do about this situation, since the user does cite real, if slightly misread, antiquated, or outdated, sources. —JohnC5 22:41, 7 May 2017 (UTC)
Sorry for not responding quicker to your pinging me on the user's talkpage; I wasn't sure what should be done with this language. Almost all of our "etymology languages" are attested varieties of other languages, or in a few cases, substrates where we're not even sure if they're one language, whereas this is a specific language which we know existed but haven't found direct attestations of. I like Vahag's idea of making it a type = "reconstructed", language. (Conversely, there are some "exceptional codes" for conlangs, for example Dothraki, which I wonder if we could move to "etymology languages".) - -sche (discuss) 02:19, 8 May 2017 (UTC)

In case anyone missed it, the user has now added the entry Terittituniš from the language "Hayasan" with a great deal of non-dictionary information. —JohnC5 17:28, 8 May 2017 (UTC)

All these entries are still in CAT:E. It's kind of annoying, since it makes it somewhat hard to find the errors that relate to other things. What should be done about them? If there's no immediate solution, what about restoring Illyrian to Module:languages/data3/x and marking it as deprecated? — Eru·tuon 17:07, 11 May 2017 (UTC)

Per the discussion above, I have restored Illyrian to a 'full code' (in Module:languages/data3/x rather than an etymology-only code), but as a "reconstructed" language. Out of curiosity, is this the first ISO-coded language that we've classified as "reconstructed"? It's a sort of counterpart to Proto-Norse (which is, unusually, a directly-attested proto-language). - -sche (discuss) 19:17, 11 May 2017 (UTC)
Proto- is really just a name, and it has other names such as "Primitive Norse" (parallel to "Primitive Irish") as well. However, since it's the common ancestor of the Norse/North Germanic languages, it fits. —CodeCat 19:23, 11 May 2017 (UTC)
@-sche: Thanks! I've gone through with AWB and added asterisks to Illyrian terms. That should clear up more module errors. — Eru·tuon 19:46, 11 May 2017 (UTC)

PIE root clean-up

I suggest that we develop and enforce a policy on the writing of PIE roots to avoid duplication. My understanding is that at present the mainstream view and/or convention is that PIE did not contain /a/, that a minimal phonological scheme contains four vowel sounds /e/, /o/, /ē/, /ō/, and that roots previously written with /a/ should be written using one of the laryngeals (/h₂/, usually) plus /e/. The evidence that /a/ is necessary for an explanation of derived forms, independently of /h₂/, does not have broad acceptance. There are other schemes, other conventions, and all of this remains a matter of dispute, but at the moment this dictionary uses a mixture of various schemes. *átta is included in our current PIE root list as distinct from *ph₂tḗr and it is not clear to me how this can be justified. Ordinary Person (talk) 10:41, 7 May 2017 (UTC)

Most of what you say makes sense, even if some of it is debatable (see WT:AINE, which covers most of it, though it allows for rare original a). The last sentence, though, is bizarre: ignoring the vowels and laryngeals, you still have tt equated with ptr, which is often analyzed as p + tr. Huh?. Chuck Entz (talk) 15:50, 7 May 2017 (UTC)
Inconsistent transcription is mostly backlog on cleaning up etymologies from before we had standards, or added by inexperienced editors. @Sobreira has been collecting some working lists on this, while I've imported a list of "standard roots" at User:Tropylium/Proto-Indo-European/Verbs.
Roots containing *a are not very popularly supported (it remains a minority opinion), but allowing *a in isolated lemmas such as *átta or *ǵʰans would probably be less of a problem. (I suggested some time back writing a-coloring out even in derived complete words, so e.g. *h₂antíos and not *h₂entíos, but this went unsupported.) --Tropylium (talk) 17:09, 7 May 2017 (UTC)
Fair rejoinder, Chuck Entz, that was not a good example. I'm fairly new to Wiktionary, is there a clean-up project for this that I could join? Ordinary Person (talk) 02:28, 8 May 2017 (UTC)
@Ordinary Person @Chuck Entz @Tropylium I suggest User:Sobreira/*pele- for a view of the situation using an example PIE:pele-, pala-, etc. I'm not an expert, but I guess this is a mess. IMO, yes, it needs a clean-up, but my knowledge does not go that far. I can help with gathering info, but not with making decisions (unless quite basic). Sobreira ►〓 (parlez) 13:08, 8 May 2017 (UTC)
OK, let's go over this as an example. The variation in writing any of these with or without a dash should be a trivial issue I hope, but the rest is indeed a mess:
  1. *pelh₂- is a basic root, meaning roughly 'flat, flatness, flatten'.
    • The reconstruction is, however, wrong! We currently have the lemma for this root instead at *pleh₂- (compare the derivative *pleh₂k-, number 6 below).
  2. *pele- is (one version of) the corresponding pre-laryngealist transcription.
  3. *pla- appears to be the root in zero grade in pre-laryngealist transcription, corresponding to modern *pl̥h₂-. (But we do not create separate root entries just for the zero grade.)
  4. *pl̥h₂t- is a derived noun stem, for which we could create either *pl̥h₂tós or *pl̥h₂téh₂ as the lemma.
  5. *plat- and *plÁt- are different pre-laryngealist transcriptions of the previous.
  6. *pleh₂k- is another derived stem; this time a verb 'to flatten', I think. However, it is also a variant of a separate unrelated root *pleh₂g-, meaning '(to) beat'. (This covers Latin plaga, sense 1)
  7. *plāk- is the previous in pre-laryngealist notation.
  8. *plak- is the zero-grade of this stem in pre-laryngealist notation, corresponding to modern *pl̥h₂k-.
--Tropylium (talk) 22:36, 8 May 2017 (UTC)
What's your argument for treating *pelh₂- as the root instead, when all the given descendants agree with *pleh₂-? —CodeCat 22:42, 8 May 2017 (UTC)
No, sorry, I mean that *pelh₂- as given at flan is the wrong reconstruction. --Tropylium (talk) 22:45, 8 May 2017 (UTC)
Ah, I see. There's also another issue with that etymology: *pl̥h₂t- could never possibly give *flaþ- directly, it would become *fulþ- or *fuld-. —CodeCat 23:00, 8 May 2017 (UTC)
Hmm, right. Per Kroonen, *flaþô is apparently instead from the unrelated root *pleth₂-; which also means 'flat', but which per our current knowledge cannot be connected with *pleh₂-. --Tropylium (talk) 23:19, 8 May 2017 (UTC)
It's likely that in the post-PIE period, when laryngeals had already vocalised, that a was reanalysed as the zero-grade vowel of such formations. Consider for roots of CeH shape, that you'd have an alternation between ē/ā/ō, ō and a in early Germanic. It would not be surprising then if this pattern was also applied to roots of CReH shape, whose inherited ablaut was Rē/ā/ō, Rō, uR, which is rather irregular. Analogical pressure to treat a as the zero-grade of such "long vowel roots" would not be surprising. I think this could be the origin of *flaþô, and also some others such as *bladą. With *flaþô there's the additional question of why a zero grade had an accent, which was very rare in PIE. —CodeCat 19:31, 11 May 2017 (UTC)
So, @Tropylium, would it be right if we place?:
  1. *pelh₂-/*pleh₂- (I didn't get it) with Alternate spelling *pele-, which would include *pla- as Alternate spelling for 0-grade *pl̥h₂-
  • Verb root *pleh₂k- (which AFAIU is actually *pleh₂ + k-) with Alternate spelling *plāk-, rooted from and to be included in 1, which would include *plak- as Alternate spelling for 0-grade *pl̥h₂k-.
  • Noun root *pl̥h₂t- (which AFAIU is actually *pl̥h₂ + t-; but why not *pleh₂t- from *pleh₂- + t? we do create entries only for E grade, innit?) with Alternate spellings *plat- and *plÁt-, rooted from and to be included in 1
    1. *pl̥h₂tós or *pl̥h₂téh₂ to be created as origin for some nouns and rooted from 1 pointing to section 1.2 Noun root
  • all Alternate spellings redirected (as per WT:AINE; but do we include it like we include Alternative reconstructions?) and/or removed and changed from the respective Etym sections. Sobreira ►〓 (parlez) 10:58, 15 May 2017 (UTC)

Name of Wikisaurus namespace

Anyone else think this namespace might be better named simply Thesaurus or Synonyms? I mean, we have Talk, Citations and User as namespaces, which are all generic terms. Wikisaurus might work as a project name, but seems strange as a namespace name. Equinox 21:42, 7 May 2017 (UTC)

I kind of agree. "Synonyms" is more straightforward, so that has my vote. —CodeCat 21:53, 7 May 2017 (UTC)
At least, I think Thesaurus: should automatically redirect to Wikisaurus:, like this: Thesaurus:good -> Wikisaurus:good. This wouldn't do any harm, even if no one uses it. --Daniel Carrero (talk) 01:01, 8 May 2017 (UTC)
This can be done easily by using Thesaurus as an alias for Wikisaurus (or the opposite), as it is already done for WT -> Wiktionary (e.g. WT:BP). — Dakdada 14:22, 9 May 2017 (UTC)
The pages house more than synonyms (for example, they include antonyms and coordinate terms), so maybe "Thesaurus" is a better name than "synonyms"; I'm not sure. The whole thing seems a bit redundant to the synonyms (etc) sections in entries. - -sche (discuss) 04:27, 8 May 2017 (UTC)
I agree. I also think it seems somewhat redundant, but I'd rather see synonyms, antonyms, holonyms, etc. moved to Wikisaurus to declutter entries than get rid of Wikisaurus. Andrew Sheedy (talk) 04:32, 8 May 2017 (UTC)
Or we could try to do it using categories. DTLHS (talk) 04:36, 8 May 2017 (UTC)
Can you elaborate on this? I like the idea of using Mediawiki structures to store data, but I am not sure how this might actually be accomplished. One problem is that not all terms which are allowable in Wikisaurus are allowable in the main namespace. Beyond that, I am guessing the idea would be to create a category which is something along the lines of "Terms semantically related to X", and then have subcategories which are "Synonyms of X", "Antonyms of X", etc. Is that the kind of thing you are imagining? I have always thought that it would be much harder to maintain that sort of structure, but it might provide some interesting opportunities. - TheDaveRoss 15:12, 8 May 2017 (UTC)
That's basically it. We would add a template like {{synonyms|xyz}}, where xyz is the canonical word that we've chosen to list synonyms under for a particular concept. Each page would then categorize into "Synonyms of xyz". We could use the categorytree extension to create expandable lists of synonyms in the way that {{prefixsee}} works. The downside like you said is if there are thesaurus entries that can't be mainspace entries. DTLHS (talk) 15:17, 8 May 2017 (UTC)
Color me intrigued. This and the current incarnation of Wikisaurus are not mutually exclusive, so perhaps we should apply this style to a couple of terms to see how it looks and what the drawbacks might be? It would certainly cut down on some redundant effort, and it would consolidate more information about a term to that term's entry. - TheDaveRoss 17:57, 8 May 2017 (UTC)
This is already possible with {{syn}}, albeit that there's nothing particular about the order of terms listed in it, currently. —CodeCat 17:59, 8 May 2017 (UTC)
It also lets you link to Wikisaurus pages: Synonyms: Wikisaurus:good. It currently wraps the link in language/script tags, which is not ideal, but it doesn't break. —CodeCat 18:02, 8 May 2017 (UTC)
I like the idea of changing the namespace description to "Thesaurus". - TheDaveRoss 15:12, 8 May 2017 (UTC)
I am fine with "Thesaurus". What do you think of "Wikinyms"? I oppose "Synonyms" as inaccurate. --Dan Polansky (talk) 19:45, 8 May 2017 (UTC)
While I don't dislike Wikinyms, I think it suffers from the same problems as Wikisaurus. I think I might like Wikinyms better than Wikisaurus though. - TheDaveRoss 20:21, 8 May 2017 (UTC)
I would prefer, in order: "Thesaurus" > "Wikisaurus" > "Synonyms" > "Wikinyms". --Tropylium (talk) 22:42, 8 May 2017 (UTC)

The word thesaurus has two meanings: one in a lexicographic context, for terms semantically related (Q179797), and one in a documentation or database context, for information retrieval (Q17152639). I am not sure English Wiktionary and French Wiktionary agree on which one we want to develop in our projects. I think, for most of the contributors, both are quite similar, but the traditional objects and the purpose of each one are very different. In a lexicographic one, we can offer much more content without defining referential descriptors. The other use is much more formal, with rigid categories. So, which one do you think Wikisaurus is? (as usual, sorry for my mistakes, as English is not my mother tongue, please correct any mistakes leading to misunderstanding)   Noé 13:33, 9 May 2017 (UTC)

Categorization of dialectal terms

Many entries include not only which dialect they belong to, but also the categories of the broader dialects to which those specific dialects belong, as well as the vague label "dialect". For example, scaurie is labelled "(Britain, dialect, Shetland)", so it is in CAT:British English, even though I take the label to mean it is not found pan-Britannically but rather just in Shetland, and it is also in CAT:English dialectal terms. But scattald is labelled "(Orkney and Shetland)", so it is only in categories for Orkney and Shetland English, and not CAT:British English nor CAT:English dialectal terms, which is inconsistent.

Do we want all terms which belong to any dialect to also go in CAT:English dialectal terms? If so, we could just have every dialectal label double-categorize into that category. Whereas, if not, a lot of entries need to be edited. Likewise, do we want terms from British (American, etc) sub-dialects to be double-categorized into the "big" dialect categories, or should such (manual) double-categorization be removed? - -sche (discuss) 04:25, 8 May 2017 (UTC)

Optimally I think we would prefer to have zero words categorized directly under categories like CAT:English dialectal terms or CAT:Regional English, instead placing every such word under some more specific dialect category. (Additionally, I have no idea what purpose the distinction between these two categories is supposed to serve.) A lumpencategory "English dialectal terms" is no more useful than something like "lemmas in Slavic languages that lack cognates in other Slavic languages".
I have no immediate opinion on dialect group double categorization: it seems like just a particular dialect group's equivalent of the mother categories like CAT:English lemmas. On the other hand, "American English" and especially "British English" are more geographical groups than dialectological ones, which may be a point in favor of reserving them just for pan-Americanisms and pan-Britishisms. --Tropylium (talk) 10:46, 8 May 2017 (UTC)
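
The double-categorization question above can be made concrete with a small Python sketch. This is only an illustration, not how the label modules actually work: the LABELS table, its "parent" links and the umbrella option are all hypothetical. It shows how a single switch could decide whether a sub-dialect label also feeds the broader dialect category and the catch-all category.

```python
# Hypothetical label data: each dialect label names its own category and,
# optionally, a broader parent label. None of this mirrors real module data.
LABELS = {
    "Shetland": {"category": "Shetland English", "parent": "Britain"},
    "Orkney": {"category": "Orkney English", "parent": "Britain"},
    "Britain": {"category": "British English", "parent": None},
}


def categories_for(labels, double_categorize=True, umbrella=None):
    """Return the categories produced by a list of dialect labels.

    With double_categorize=True, every label also adds the categories of
    its parent labels. If umbrella is given, it is appended whenever at
    least one dialect label is present.
    """
    cats = []
    for label in labels:
        current = label
        while current is not None:
            entry = LABELS[current]
            if entry["category"] not in cats:
                cats.append(entry["category"])
            if not double_categorize:
                break
            current = entry["parent"]
    if umbrella and cats and umbrella not in cats:
        cats.append(umbrella)
    return cats
```

With double categorization on, "Shetland" alone yields both the Shetland and the British category; with it off, "Orkney and Shetland" yields only the two specific categories, which matches the current state of scattald.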

Etymology summaries for intermediate sources

It is often the case that we can reliably trace a given word's etymology to a particular source language (e.g. a reconstructed intermediate proto-language, or an influential attested language such as Latin or Arabic), while the further etymology remains disputed or uncertain. We often give information on these less certain earlier stages anyway. However, if new information comes up, this may end up requiring much duplicated work in fixing mainspace entries. For example, I recently added a new source for the etymology of Proto-Germanic *swerdą (sword). This, however, has several dozen descendants, many of which still refer to the older etymology.

I propose we create a template for printing a small lay summary of the etymology of a word in some given "chokepoint" language. The intended output would be something like: {{etymsumm|en|gem-pro|*swerdą}} → "possibly from {{der|en|ine-pro|*seh₂w-||sharp}}"

To keep this accessible to editors, i.e. to avoid codewalling etymologies so that only editors versed in data modules can edit them, I think the lay summary should be stored in some accessible format on its own subpage or the like, e.g. Reconstruction:Proto-Germanic/swerdą/summary. This could simply have the content "possibly from {{der|XXX|ine-pro|*seh₂w-||sharp}}". The {{etymsumm}} template would then do the actual data massaging, substituting the appropriate language code for "XXX" (or whatever other keyword we choose), or performing other functions as desired. (For example, perhaps {{PIE root|XXX|feh₃}} could be auto-inserted whenever the summary contains {{der|XXX|ine-pro|*feh₃-}}.)

Thus, whenever new etymological information comes up, instead of having to emend dozens of descendant etymologies by hand every time, we could simply centrally emend the lay summary and end up with all affected entries automatically updated.

Admittedly there will not be much immediate benefit to this, since editing our current etymologies to actually use {{etymsumm}} and writing said summary entries would take manual work anyway… but in the long run this seems likely to be valuable.

Thoughts? --Tropylium (talk) 12:16, 8 May 2017 (UTC)
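
For illustration, the substitution step in the proposal above could look roughly like the following Python sketch. Everything here is an assumption made for the example: the SUMMARIES table stands in for the proposed summary subpages, and etymsumm is a plain function rather than a real template or module.

```python
# Hypothetical store of lay summaries, keyed by (source language code, term).
# In the proposal these would live on subpages such as
# Reconstruction:Proto-Germanic/swerdą/summary.
SUMMARIES = {
    ("gem-pro", "*swerdą"): "possibly from {{der|XXX|ine-pro|*seh₂w-||sharp}}",
}


def etymsumm(entry_lang, source_lang, term, placeholder="XXX"):
    """Return the stored summary with the placeholder replaced by entry_lang.

    Only occurrences delimited by pipes are replaced, so that the keyword
    is treated as a template parameter and not as arbitrary text.
    """
    summary = SUMMARIES[(source_lang, term)]
    return summary.replace("|" + placeholder + "|", "|" + entry_lang + "|")
```

Calling etymsumm("en", "gem-pro", "*swerdą") gives the wikitext from the example in the proposal, and changing the single stored summary changes the output for every entry that uses it.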

I'd rather wait for a better solution to come along. I will say though that I think non-English lemmas should only point to their immediate parent in the etymology, if that parent already has an entry. That would drastically cut down on duplication, e.g. Old Saxon swerd. --Victar (talk) 16:36, 8 May 2017 (UTC)
The issue of duplication and (non)synchronization is problematic even without chokepoints, so it may be better to solve it by putting the whole chain into Template:findetym or similar templates. However, even that may be unable to handle cases where there is a long list of possibilities after a chokepoint (see the etymology of the "group of people distinguished from others" sense of race, for an example), where an approach like you suggest would be good, except with the whole etymology (rather than a summary) put in the template.
An alternative idea for the second case is to centralize on "lemma" entries, so if e.g. Japanese borrowed English "race", the Japanese entry would just say "from English race, see there for more", and maybe the Italian entry would also say "uncertain, see English" (since this is the English wiki and English is privileged). Some (like Victar) might suggest moving the etymology to the Italian entry and having English say "see Italian", but that removes the (IMO desirable) categorization into "English terms derived from Lombardic" (though regarding the category name, maybe it would be good to have a set of categories saying more specifically "...possibly derived..." for cases of great uncertainty like this, or maybe not).
- -sche (discuss) 16:45, 8 May 2017 (UTC)
{{findetym}} is horrible, don't use it. —CodeCat 16:54, 8 May 2017 (UTC)
@-sche: I caveated saying non-English lemmas. English lemmas should be governed by their own set of guidelines. Though yes, I too lament that using this method is also killing Category:Old French terms derived from Frankish. --Victar (talk) 17:34, 8 May 2017 (UTC)
I definitely wouldn't support "lemmatizing etymologies", exactly because it's going to make it difficult to find X terms that have made it into a "non-etymology-lemma" language Y. Also we'd need to set up a whole damn hierarchy of languages for lemmatizing this or that. If we lemmatize English as primary, we still need some way to handle cases where no English reflex exists, and probably even in many cases in which it does. It would be absurd to refer, say, the Dravidian cognates for the Tamil source of mung to the English entry.
In cases like "race" though, the only reasonable etymology summary would probably still come out as "of disputed origin, see race for details". --Tropylium (talk) 18:07, 8 May 2017 (UTC)
I just remove cognates when they're already listed on one of the ancestors. Ideally, cognates would not need to be listed at all. —CodeCat 18:18, 8 May 2017 (UTC)
Oops, and I put them in a separate box... beech Sobreira ►〓 (parlez) 09:46, 15 May 2017 (UTC)

Beta Feature Two Column Edit Conflict View

Birgit Müller (WMDE) 14:28, 8 May 2017 (UTC)

That looks like a lovely feature, and I plan to enable it when it's available. (It doesn't yet appear in Preferences, even though it's May 9th already.) Thanks for letting us know. — Eru·tuon 21:02, 9 May 2017 (UTC)

Eponym categories

I have noticed that all the eponym categories, such as Category:Czech eponyms, include a description like "Czech terms derived from names of real or fictitious people." However, I think that only a noun can be an eponym (English Wikipedia says: "An eponym is a person, place, or thing for whom or for which something is named, or believed to be named.") and that e.g. verbs derived from names of people are not eponyms. So I suggest changing the word terms to nouns in the description of the categories. --Jan Kameníček (talk) 15:44, 8 May 2017 (UTC)

Our definition says that an eponym is a word formed from a name - so could be adjective as well (e.g. quixotic). SemperBlotto (talk) 15:48, 8 May 2017 (UTC)
Yes, I can see it now, but the definition is not accompanied by quotations attesting such a broad understanding of the term eponym. I have also never heard it in connection with e.g. verbs.
Another possible solution could be renaming the categories to "Terms derived from personal names". --Jan Kameníček (talk) 15:55, 8 May 2017 (UTC)
I have just done a quick Google search and found some examples where the term eponym is used in connection with adjectives, but I am not sure how widely this is accepted. I cannot find any verb described as an eponym. Is švejkovat an eponym? --Jan Kameníček (talk) 16:08, 8 May 2017 (UTC)
Well, "Bowdlerize is an eponym after Dr Thomas Bowdler", for example, to quote one of the many books at google books:bowdlerize eponym link updated which call that verb an eponym. And agrees that an eponym is "a word based on or derived from a person's name". Whereas, several other dictionaries don't even recognize any words as eponyms, but only the people who give their names to things (our entry's sense 1). - -sche (discuss) 16:50, 8 May 2017 (UTC)
As for "many books at Google books", the search that you have provided shows me just Wordcraft dictionary. Is it a source acceptable to attest the meaning in the entry eponym?
As for, it does not say that bowdlerize is an eponym. It has got an entry eponym, but a dictionary entry is not acceptable to attest a meaning in our entry. --Jan Kameníček (talk) 17:21, 8 May 2017 (UTC)
There are more than enough books at the search I linked to (perhaps they are only visible in some countries? Google is picky about who it shows what; I can type up three if needed) which attest that the usual English meaning of eponym includes verbs, adjectives, etc. I only mentioned since you appealed to Wikipedia's definition, which weakly implied that the term was restricted to nouns, whereas affirms that it applies to any word. - -sche (discuss) 17:25, 8 May 2017 (UTC)
Hm, it is quite possible that some search results are not visible in my location. Are there any publications suitable for our attestation purposes? --Jan Kameníček (talk) 17:29, 8 May 2017 (UTC)
Yes, I've typed up a few in the entry and its citations page. But also my link was off, which is probably why it didn't work: it shouldn't have included the quotation marks, I'm sorry! - -sche (discuss) 17:47, 8 May 2017 (UTC)
That looks good. Thank you! --Jan Kameníček (talk) 18:05, 8 May 2017 (UTC)

Atmospheric phenomena

We are currently lacking a category for atmospheric phenomena, like sun dog, rainbow, heiligenschein, the newly-discovered Steve, and so on. So I'd like to create Category:en:Atmospheric phenomena for these. We also don't have a category for different weather phenomena, like the basic rain, snow, fog, and so on. I considered creating Category:en:Meteorological phenomena for these, but I'm actually not sure if these two categories are clearly delineated. I'm assuming that all meteorological phenomena are also atmospheric phenomena and so the former is a subcategory of the latter. But I can't really establish clearly whether something is "meteorological" or not, so that makes the categorisation difficult, and I can imagine others will have similar difficulties. So should we just put them all in the one "atmospheric" category, or create both? —CodeCat 21:16, 8 May 2017 (UTC)

We do already have a category for some weather phenomena: Category:en:Weather, with "rain" and "snow" belonging specifically to its subcategories Category:en:Rain and Category:en:Snow.
As for where to put Steve, I would just create the one "atmospheric" category; given the noted difficulty of deciding whether a term is 'atmospheric' or 'meteorological', I might only worry about splitting the category up if it ended up with more than ~200 entries in it, by which point perhaps some other subdivisions might suggest themselves.
- -sche (discuss) 20:20, 9 May 2017 (UTC)
The "weather" category is topical, though, whereas I am proposing a set category. One for just the names of phenomena, leaving terms related to weather to another category. I like to keep these things separate so that the "related" terms don't muddle up the rest. —CodeCat 21:37, 9 May 2017 (UTC)
I've now created Category:en:Atmospheric phenomena and managed to fill it up with quite a few entries pilfered from other categories. I also categorised "Clouds" under it, since all clouds are atmospheric phenomena and therefore form a subset. —CodeCat 22:03, 9 May 2017 (UTC)


How am I supposed to order the languages when the language name is no longer present in the wikitext? —CodeCat 13:47, 9 May 2017 (UTC)

I suppose you have to look back and forth between the preview and the wikitext, but that's annoying. Perhaps a module could do it, but that would make it difficult to put inherited words first. Or a bot. But again, I'm not sure if a bot would be smart enough. I see the disadvantage to using {{desc}}. — Eru·tuon 19:13, 9 May 2017 (UTC)
This is probably a bad idea, but the template could be modified to use canonical names instead of codes. That would be inconsistent with all the other templates, though. — Eru·tuon 19:15, 9 May 2017 (UTC)
I don't see any problem with having to refer to the preview. I'd rather that than typos and wrong language codes, which are common. --Victar (talk) 19:28, 9 May 2017 (UTC)
That is a distinct advantage, being able to automatically check that a canonical name and code are valid. And when one is listing a form in a subvariety of a language, there was, till now, no way to check the subvariety against Module:etymology languages/data to see that it exists. — Eru·tuon 20:24, 9 May 2017 (UTC)
@Erutuon: Yeah, previously I was using {{etym}} when I wasn't sure if my lang code was correct, so it saves me from having to do that now. --Victar (talk) 20:51, 9 May 2017 (UTC)
You don't see a problem, I do have a problem. What's going to be done about it? Or is your purpose to make it so tedious for me to edit descendants that I just won't bother anymore? —CodeCat 21:38, 9 May 2017 (UTC)
We have two problems: no checking for validity of a canonical name when the language name is written out, and the tediousness of alphabetizing when the language name is not written out. A solution that deals with both these problems would be best. — Eru·tuon 22:07, 9 May 2017 (UTC)
Plus the tediousness of having to write out both the language name and canonical name. --Victar (talk) 22:36, 9 May 2017 (UTC)
I find having to figure out the order of all the language codes much more tedious and mentally heavy. —CodeCat 22:43, 9 May 2017 (UTC)
I just don't see it being a problem if you use preview, but two possible solutions to {{desc}} and {{desctree}} are 1. add a comment automatically before each use, ex. <!--Old French-->, or 2. that when you save, it automatically hard-codes the prefix, → Old French: . I'm not sure if preview would work though, which kills the validity check benefit in preview. --Victar (talk) 23:01, 9 May 2017 (UTC)
I however would rather see a technical solution, like {{sortdesc}}. --Victar (talk) 23:42, 9 May 2017 (UTC)
And really, we should have a sortdescBot doing such work. --Victar (talk) 21:50, 10 May 2017 (UTC)
@Victar: The difficulty with using a bot is that we typically put inherited languages first. There would have to be a way to signal to the bot either that those languages are supposed to be first (arbitrarily) or that they are inherited and therefore the bot can infer that they should be first. — Eru·tuon 22:49, 10 May 2017 (UTC)
@Erutuon: Right, it can't be a simple alphabetizer -- it would have to crosscheck against all the language trees. --Victar (talk) 00:22, 11 May 2017 (UTC)
On another note, @Erutuon, I was thinking about adding |noalt= (default true) to {{desc}} as well. Good idea? --Victar (talk) 20:51, 9 May 2017 (UTC)
@Victar: I think the name you proposed initially, before your edit changing it, |alts=, would be better. Generally, it is preferred to have the default value of a boolean parameter be false, and to name the parameter accordingly. — Eru·tuon 21:14, 9 May 2017 (UTC)
Sure, whichever is fine with me. Good idea though? As I've been converting descendant lists, I've come across times I could use it, instead of manually inputting alternatives. --Victar (talk) 21:18, 9 May 2017 (UTC)
Actually, I'm confused, since you're talking about {{desc}} rather than {{desctree}}. What would this parameter do in {{desc}}? — Eru·tuon 21:25, 9 May 2017 (UTC)
I added |noalts= to {{desctree}}, but I'm talking about adding auto alternative forms to {{desc}}, with an |alts=1 parameter. --Victar (talk) 21:38, 9 May 2017 (UTC)
Ahh. That sounds like a neat idea. — Eru·tuon 22:07, 9 May 2017 (UTC)
Cool, I'll move forward with that then. --Victar (talk) 22:36, 9 May 2017 (UTC)

I think {{desc}} should not be deployed widely until an easier solution to the problem of alphabetization is developed. Just think if we tried to do a similar thing for {{t}}: have it automatically generate the language name. That would be a nightmare for an editor to alphabetize by glancing back and forth between preview and wikitext, when there are as many translations as in the Translations section of water.

Admittedly, automatically alphabetizing a translations table would be easier, because everything in a translations table is alphabetized alike, rather than there being some languages that are put first.

But until a bot or JavaScript or Lua can alphabetize descendants lists, I don't think it's a good idea to remove the written-out language names. It adds too much tedium to editors' work. — Eru·tuon 20:02, 11 May 2017 (UTC)

If you mean mass conversions, sure, but I disagree that people shouldn't continue using it. If we ever want to change how we use {{desc}} in the future, we can always run a bot on entries that employ it. Also, again, I don't think the alphabetizing of entries is a big deal at all. --Victar (talk) 20:30, 11 May 2017 (UTC)
Well, I dunno. You say you don't mind it, @CodeCat says she does. I'm on the fence. We need more editors to comment here. — Eru·tuon 20:47, 11 May 2017 (UTC)
Also, I don't see what translations have to do with this. {{t}} is its own template and unrelated to {{desc}}. --Victar (talk) 20:35, 11 May 2017 (UTC)
Translations are just another thing with unlinked language names that must be alphabetized. — Eru·tuon 20:44, 11 May 2017 (UTC)
Right, but it's a bit of a fallacious argument. What works in one place isn't going to work everywhere. Translation lists will always be huge, because you're comparing against all languages in the world. Descendant trees don't even come close to that. Also, because we nest descendant lists, we don't have to deal with one long list. It's simply not a good comparison. --Victar (talk) 20:50, 11 May 2017 (UTC)
I wasn't really intending it as a logical argument. A translation list has the similarity to descendant trees that I mentioned, but, as you mention, it has the difference of being much longer (when it is nearer to complete) and not being nested. The tedium of alphabetizing when only language codes are visible in the wikitext is proportional to the length of the list. The longer, the more difficult and, I suspect, the more likely editors will not want to deal with it. With an automatic tool for alphabetization, {{desc}} would be much easier to use (when handling words with a lot of descendants). — Eru·tuon 21:36, 11 May 2017 (UTC)
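
As a rough illustration of the automatic tool discussed in this thread, here is a minimal Python sketch that orders language codes with inherited languages first and then alphabetically by canonical name. The CANONICAL_NAMES table is a tiny sample standing in for the real language data, and passing the inherited set by hand is an assumption; as noted above, a real bot would have to work that out from the language trees.

```python
# Small sample mapping of language codes to canonical names (illustrative
# only; the real data lives in the language data modules).
CANONICAL_NAMES = {
    "ang": "Old English",
    "goh": "Old High German",
    "non": "Old Norse",
    "fi": "Finnish",
}


def sort_descendants(codes, inherited=()):
    """Sort codes: inherited languages first, then by canonical name.

    An unknown code raises KeyError, which doubles as the validity check
    on language codes mentioned in the discussion.
    """
    inherited = set(inherited)
    return sorted(codes, key=lambda code: (code not in inherited, CANONICAL_NAMES[code]))
```

For a list given as non, fi, goh, ang with ang, goh and non marked as inherited, the sketch returns ang, goh, non, fi: the inherited languages in alphabetical order of their names, followed by the borrowing.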

Subdividing categories for parts of the body

Currently we have the very generic Category:Body, and Category:Anatomy, which really shouldn't be used for parts of the body but rather for terms used in the field of anatomy. I've been puzzling for a while about how to make these categories more specific and thus useful. My primary view on topical categories is that they should be anthropocentric and reflect everyday life, not necessarily scientific analyses. To that end, I think a primary distinction should be made between outward features of the body and inner features. This then leads to a distinction between Category:Internal organs and Category:External organs.

Not every feature on the outside (or inside) is an organ, though, such as the arms, legs, head, hair etc. A Category:Limbs couldn't possibly ever have many members, so that doesn't seem like a good idea, but I have no idea where else they might go. Perhaps a more general category for something like divisions of the body? It could then also include abdomen, head and foot. I don't know what such a category might be named.

Does anyone have feedback or suggestions? —CodeCat 23:53, 9 May 2017 (UTC)

I don't have suggestions, but I would like to point out that Category:Anatomy contains non-human anatomy as well. Technically that should go in Category:Zootomy and Category:Phytotomy, but those are terribly named and therefore understandably underused. —Μετάknowledgediscuss/deeds 00:11, 10 May 2017 (UTC)
Humans obviously have a lot of overlap with other animals, so it's hard to have a category tree that doesn't apply to other species as well. It certainly would not be desirable to have one set of categories for human anatomy only, and another set just for the parts in other animals that are not applicable to humans. A single category for all non-human body terms would probably become too unspecific to be useful. —CodeCat 00:16, 10 May 2017 (UTC)

French Wiktionary monthly news - Actualités


I am glad to inform you that the 25th issue of Wiktionary Actualités just came out!
As usual, it is a short page about French Wiktionary and lexicography in general.
This time: a focus on a dictionary about regionalisms in French, a workpiece on rhymes and a review of a talk given about Wiktionary and ideology!

Sorry that it is not perfectly translated into English despite all our efforts. Please note that we have not received any money for this publication since the beginning! We are eager to receive feedback to know whether we should continue translating it or whether you can imagine ways to improve it. Thank you   Noé 13:13, 10 May 2017 (UTC)

Policy for categorizing certain Vulgar or Late Latin terms

If a term is attested in something like the Appendix Probi, a third or fourth century Latin work listing common mistakes arising in the written Latin of the time, should it be counted as an actual attested (Late) Latin term, and hence not a reconstructed Vulgar Latin one, using the preceding asterisk (*)? Many seem to think that the terms in there were examples of Vulgar Latin, the spoken speech, creeping into Late Latin written forms as used by people who weren't educated in the use of the Classical version of the language. How, then, should these be handled? I ask because I've gotten to the descendants of Latin frigidus, which passed through a few intermediate forms before reaching the Romance forms (there's also the later learned borrowings, but that's a different story). For example, frigdus and fricdus seem to have been attested in this Appendix, and the form fridus was found on some Pompeian inscriptions, basically graffiti. Word dewd544 (talk) 20:36, 10 May 2017 (UTC)

Yep, they're attested, so you might as well refer to them when they fit the form you'd reconstruct anyway. Surprisingly enough, it seems that nobody's gone through and added all the words from the Appendix Probi to Wiktionary, but it should be done. —Μετάknowledgediscuss/deeds 21:51, 10 May 2017 (UTC)
Okay, I figured that would be the case. Thanks. There are also the Reichenau Glosses, which were written later. Word dewd544 (talk) 15:37, 12 May 2017 (UTC)

Improvements to get people to the main lemma faster

We use {{inflection of}} to point people to the lemma of which a term is an inflection, but if that lemma is itself an alternative form, then they have to click through a second time to get to the page with the actual definitions. Consider for example capitalises, which is a form of capitalise, itself an alternative spelling of capitalize. It can get worse too: Middle Dutch beseekede is an alternative spelling of the normalised spelling besekede, which in turn is an inflection of beseken, itself an alternative form lemmatised at beseiken. I'm wondering what we could do to make it quicker and easier for people to get to the main lemma. —CodeCat 21:13, 10 May 2017 (UTC)

We could add another parameter to say something like "verb form of capitalise (alternative form of capitalize)". DTLHS (talk) 00:23, 11 May 2017 (UTC)
Perhaps we should just add a parameter like "lemma" to display "(see x)" without additional text. That way, even beseekede could link straight to the lemma. - -sche (discuss) 02:44, 11 May 2017 (UTC)
In this case, both capitalize and capitalise are categorized as English lemmas, so perhaps the parameter should be named something like |main entry= or |main=. — Eru·tuon 03:11, 11 May 2017 (UTC)
main= is probably preferable; it's shorter and makes it easier to create main t=, main tr= and such if needed. —CodeCat 19:49, 11 May 2017 (UTC)
OK, that's a good name, if we do this. - -sche (discuss) 07:18, 12 May 2017 (UTC)
Another case: Middle Dutch voer is an alternative spelling of voor, whose definitions are in turn at vore. There are no inflections involved this time. A case could be made that, as currently, voer should simply be defined as an alternative form of vore directly. But this misses out on the fact that voor is in the normalised orthography for Middle Dutch whereas voer is not. There are, really, only two possible forms, vore and voor, both of which are in normalised orthography. Both in turn have further variations in spelling, such as voere for the former, and voer and voir for the latter. —CodeCat 20:20, 11 May 2017 (UTC)
Yes, the issue is not uncommon, especially when something can be an inflected form of an {{altcaps}} of an {{altform}} of something. We should, however, consider the drawback that if "inflected forms of" or "alternative form of" allows also linking to a main entry and not just the nearest lemma-form, but then we decide (e.g. for reasons of commonness) that another entry should be the main entry, we have a lot of inflected forms of alt forms to forget to update and leave pointing to the wrong place. - -sche (discuss) 07:18, 12 May 2017 (UTC)
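
To illustrate the chains discussed in this section, here is a minimal Python sketch that follows "form of" links until it reaches an entry that points nowhere else. The POINTS_TO table is hypothetical and hand-filled from the examples above; nothing like it exists as a single data source, and the proposed |main= parameter would instead store the result by hand.

```python
# Hypothetical "is a form of" links, taken from the examples in this thread.
POINTS_TO = {
    "capitalises": "capitalise",
    "capitalise": "capitalize",
    "beseekede": "besekede",
    "besekede": "beseken",
    "beseken": "beseiken",
}


def main_entry(term):
    """Follow form-of links from term until an entry with no further link.

    Raises ValueError if the links loop back on themselves.
    """
    seen = [term]
    while term in POINTS_TO:
        term = POINTS_TO[term]
        if term in seen:
            raise ValueError("circular form-of chain: " + " > ".join(seen + [term]))
        seen.append(term)
    return term
```

Computing the target this way, rather than storing it on every inflected form, would sidestep the drawback raised just above: if the main entry is later moved, only one link changes and nothing is left pointing to the wrong place.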

cannot open translation boxes

What do you do when the "open" link for the translation boxes does not appear when you load a Wiktionary page? When it happened to me before I would clear the cache and it would go back to normal, but this time it's not working (on any browser). Any ideas? ---> Tooironic (talk) 04:44, 11 May 2017 (UTC)

I got this error too, but on other boxes. I think it is caused by the Nav classes not working. --Octahedron80 (talk) 04:50, 11 May 2017 (UTC)

  • Ditto. All show/hide options are not working. SemperBlotto (talk) 04:53, 11 May 2017 (UTC)
Let's consolidate discussion (currently spread across many fora) in Wiktionary:Grease pit/2017/May#Show-and-hide_templates_not_working_properly. - -sche (discuss) 08:00, 11 May 2017 (UTC)

Wiktionary:Votes/2012-01/Renaming requests for verification

For once I actually AGREE with Dan Polansky on something. This vote was totally unfair. It had an 8 majority vote for support, and was "failed" by an opposer. I demand some kind of explanation because Liliana sure didn't give one! PseudoSkull (talk) 12:42, 11 May 2017 (UTC)

I agree that Liliana probably shouldn't have been the one to close it, but also agree that the vote did not pass. 8-5-3 is not a consensus as far as I am concerned. - TheDaveRoss 12:53, 11 May 2017 (UTC)
@User:TheDaveRoss Why not? I've been told time and time again that abstentions do not count at all whatsoever! Plus, one of the votes is a "weak delete". PseudoSkull (talk) 05:41, 12 May 2017 (UTC)
You're missing the point. We pass votes with a supermajority ~70%; this only had 61% support of non-abstainers. You can always try running the vote again, and it might pass. I'd still support it. —Μετάknowledgediscuss/deeds 05:50, 12 May 2017 (UTC)
Usually 2/3s, I think, but yes, this one was shy of having enough support. - -sche (discuss) 07:19, 12 May 2017 (UTC)
Re: "We pass votes with a supermajority ~70%": Not true; many people support 2/3 and we now have multiple votes passed with less than 70%. --Dan Polansky (talk) 19:35, 12 May 2017 (UTC)
A fact which I find objectionable, we should drift in the direction of more consensus rather than less consensus. - TheDaveRoss 19:57, 12 May 2017 (UTC)
@TheDaveRoss: You seem to have changed your mind, compared to your proposal at Wiktionary:Beer_parlour/2010/June#Vote_Closure_Rubric. --Dan Polansky (talk) 20:20, 12 May 2017 (UTC)
So it seems! I have become a soft, community-minded liberal in my old age I guess. I blame the 2017 election. Also, that is some impressive sleuth-work. - TheDaveRoss 20:35, 12 May 2017 (UTC)
It so happened to be linked from User:Dan Polansky/Voting :). Secret revealed. --Dan Polansky (talk) 20:49, 12 May 2017 (UTC)
@PseudoSkull: Let me suggest that you put a late support to the vote if you in fact support its proposal. This you can do by indenting the support by :, and prefixing "support" with "late", or the like. Late support does not change the vote result, but helps informally track the evolving support/oppose situation which can serve as a motivation for a new vote. --Dan Polansky (talk) 19:38, 12 May 2017 (UTC)
Considering the previous vote is over 5 years old, I think that would be silly. Consensus can change in 5 years; it makes more sense to start a new vote than to cast a meaningless support in a long-stale vote. —Aɴɢʀ (talk) 20:33, 12 May 2017 (UTC)
I don't think it silly or meaningless; it avoids starting a new vote or poll while still enabling some sort of tracking and collecting of input on a subject, in one convenient place. It certainly does not harm anything. --Dan Polansky (talk) 20:49, 12 May 2017 (UTC)
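
For reference, the arithmetic behind the 8-5-3 result discussed above can be checked with a short Python sketch. The function names are made up for the example, and the threshold is left as a parameter because, as the thread shows, editors disagree on whether it is 2/3 or about 70%.

```python
from fractions import Fraction


def support_share(support, oppose):
    """Share of support among non-abstaining votes (abstentions ignored)."""
    return Fraction(support, support + oppose)


def passes(support, oppose, threshold=Fraction(2, 3)):
    """True if the support share reaches the given threshold."""
    return support_share(support, oppose) >= threshold
```

With 8 supports and 5 opposes, the share is 8/13, roughly 61.5%, which falls short of both a 2/3 threshold and a 70% threshold; the 3 abstentions do not enter the calculation.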

Prakrit merger

I once again suggest we merge Sauraseni (psu), Maharastri (pmh) and Ardhamagadhi (pka) Prakrits into one language. Most descriptive grammars ([2] [3]) treat them as a whole, with several regional dialects. The differences in orthography are regular; see the incomplete User:Aryamanarora/Prakrit. After all, we treat Sanskrit as a language when it was spoken from Gandhar to Trivandrum, and even Classical Sanskrit ended up diverging into different dialects. Middle Indic is no different. —Aryamanarora (मुझसे बात करो) 15:52, 11 May 2017 (UTC)

They may be grammatically homogeneous, but really the more relevant question for us should be, how homogeneous are they lexically? There are three angles that would be relevant:
  • plain old lexical divergence;
  • "orthographic divergence": even if differences are regular, if almost all words end up having different spellings, we'd have to add separate entries for all of them anyway, so merging them as a single language would not save us any work on that front. I notice your working list so far has seven word pairs with differing orthography, versus one identical pair.
  • attestation: how well recorded are the different Prakrits to begin with? IIUC there is quite a bit of variation.
--Tropylium (talk) 00:11, 12 May 2017 (UTC)
@Tropylium: The three main Dramatic Prakrits (Sauraseni, Maharastri, [Ardha]Magadhi) are very well recorded ([4] [5] Old plays and dramas, and the texts of the Jain religion), and the unusual Gandhari Prakrit is surprisingly well recorded ([6]). Lexical divergence is very little, as in the words retain their meanings, since Panini's standardized Classical Sanskrit was still in use with the Prakrits and Prakrit writers increasingly emulated Sanskrit styles of writing. The orthographic divergence is least among the three Dramatic Prakrits; really, it was a difference between, say, American and British English. They were even used in the same text together; certain characters would use certain Prakrits to reflect their socioeconomic status. ([7] source). It is harder to make a case for Gandhari and Elu Prakrit (the other two attested Prakrits) to be merged as well because they diverged significantly due to their geographic isolation. Gandhari even used the Kharoshthi script and borrowed Greek vocabulary from the Indo-Greeks of Alexander. —Aryamanarora (मुझसे बात करो) 14:31, 12 May 2017 (UTC)
If a merger were to go forward (which I am opposed to), I would definitely say that Gandhari and Elu be separate due to their having many many more differences than the Dramatic Prakrits. DerekWinters (talk) 18:33, 12 May 2017 (UTC)

Delete or rename all Vulgar Latin reconstruction pages

Right now, we have a bunch of Latin reconstructions that are duplicates of actually attested terms. I am opposed to treating these as different terms from their attested Latin ancestors. The descendants should be placed on the attested Latin entry, not on a separate reconstruction page. For example, *cinque is exactly the same term as quinque; the former is merely an unattested form of the latter. Why would the descendants then be placed at this unattested form instead of a form that actually is attested? It's not our practice to place descendants on alternative forms anyway. Either these entries should all be merged into their attested counterparts, or an entirely separate reconstructed language should be created for them. My preference is towards the former, because Vulgar Latin/Proto-Romance is not sufficiently different from regular Latin to warrant a separate language. —CodeCat 19:47, 11 May 2017 (UTC)

I agree that reconstructions of alternate forms should not be created, and examples like *cinque be deleted/merged, but we have a good number of reconstructed Latin terms that probably cannot be treated in this way, e.g. *baccinum or *matrastra: derivatives that have not been attested and which it would be counterproductive to merge with their root words. --Tropylium (talk) 11:44, 13 May 2017 (UTC)
I agree that reconstructed terms that are completely unattested are ok, but I do think they should be treated like any other Latin term in terms of inflection. —CodeCat 12:15, 13 May 2017 (UTC)

product name

Is this the right place to ask whether the second picture in the jam article violates any relevant Wiktionary policy? If it doesn't, I suggest we at least change the confusing caption A "London Traffic Jam" strawberry jam and peanut butter sandwich to A sandwich with strawberry jam and peanut butter. --Espoo (talk) 22:07, 11 May 2017 (UTC)

I imagine someone thought it was clever because it worked in two senses (or wordplay based on two senses) of jam, but I think it'd be better to have a jar of nonspecific jam, and separately a picture of a traffic jam. - -sche (discuss) 14:48, 12 May 2017 (UTC)
Yes, puns are likely to confuse learners. Equinox 23:12, 12 May 2017 (UTC)

WT:RFV is now split

Pursuant to WT:Beer parlour/2017/April#Splitting WT:RFV, I have now divided it into WT:Requests for verification/English and WT:Requests for verification/Non-English. The biggest change is that now {{rfv}} and {{rfv-sense}} require language codes. Please feel free to offer feedback or improve/prettify it in any way.

This idea met with a lot of support, but it may turn out that we liked the old way better. Never fear, we can regard this as a testing period, and change it back later if there's consensus to do so. —Μετάknowledgediscuss/deeds 19:12, 12 May 2017 (UTC)

Note: if no language code is given, {{rfv}} and {{rfv-sense}} default to Non-English. I think the default should probably be an explanatory error, but I don't know the best way of doing that. —Μετάknowledgediscuss/deeds 19:18, 12 May 2017 (UTC)
Someone should update aWa to work on the new subpages, and probably just on all subpages of any of the request pages (I think there are some unresolved-request subpages of RFC or RFM it also doesn't work on). - -sche (discuss) 19:20, 12 May 2017 (UTC)
Was there a consensus for creating two new pages? —CodeCat 19:21, 12 May 2017 (UTC)
There was a consensus for splitting RFV, not splitting hairs. —Μετάknowledgediscuss/deeds 19:24, 12 May 2017 (UTC)
It could have been split by creating just one new page. I already stated that that was my preference. I don't see a particular agreement for the contrary. —CodeCat 19:25, 12 May 2017 (UTC)
Metaknowledge and Wikitiki89 supported doing it with two new pages, whereas you supported making one page the default. No one else expressed a preference as far as I can tell. For what it's worth, I prefer doing it with two new pages. —Granger (talk · contribs) 19:30, 12 May 2017 (UTC)
I support the splitting after the fact. It looks good. --Daniel Carrero (talk) 05:26, 13 May 2017 (UTC)
I like having the two separate pages, because it would be confusing figuring out which page to go to, if the topic (English or Non-English) were not included in the title. — Eru·tuon 05:30, 13 May 2017 (UTC)

Poll: putting "nyms" directly under definition lines

CodeCat has been using templates like {{hypo}} to put various nyms directly under definition lines, as at maths. I support placing them there, but I'm not a fan of the clutter they are likely to create in entries with a lot of different nyms. I quite like Simple English Wiktionary's solution, which is to collapse them in the same way as quotations (see simple). I'd like to see whether there's some consensus for this, as well as finding out whether people support this alternate placement of nyms.

Feel free to improve my wording or add another proposal. Andrew Sheedy (talk) 23:02, 12 May 2017 (UTC)

Can we have a demo page that shows the various options in action? Equinox 23:13, 12 May 2017 (UTC)
Unfortunately, I have no idea how to write whatever code is required to make the nyms collapsible, but I would greatly appreciate it if someone else were willing to do it. I'm kind of hoping that either someone will be willing to alter the nym templates so they collapse the nyms, or that CodeCat will leave nyms formatted the standard way if people are opposed to having them directly under individual definitions.... Andrew Sheedy (talk) 00:09, 13 May 2017 (UTC)
I am supportive of this effort, but I would like to broaden it. I think it would be good if we reviewed the layout as a whole and came up with a new design which had a coherent user experience. This suggestion is a great first step, but our treatment of nyms should be aesthetically and functionally related to our treatment of everything else in the design. - [The]DaveRoss 12:28, 16 May 2017 (UTC)

Placing various nyms directly under corresponding senses

Support regardless of whether nyms are hidden or not

  1.   Support CodeCat 22:52, 12 May 2017 (UTC)
  2.   Support As long as they are wrapped inside a template. We will easily be able to change the layout whenever we want. Matthias Buchmeier (talk) 10:48, 13 May 2017 (UTC)
  3.   Support Ungoliant (falai) 19:20, 13 May 2017 (UTC)
  4.   Support Aryamanarora (मुझसे बात करो) 22:03, 19 May 2017 (UTC)
  5.   Support DTLHS (talk) 22:19, 19 May 2017 (UTC)
  6.   Support --Barytonesis (talk) 00:38, 28 May 2017 (UTC)

Support only if nyms are hidden

  1.   Support. — Andrew Sheedy (talk) 22:50, 12 May 2017 (UTC)
  2.   Support Wyang (talk) 23:27, 12 May 2017 (UTC)
  3.   Support Saltmarsh. 05:46, 13 May 2017 (UTC)
  4.   Support I recognize that for editors, putting nyms directly under senses offers both benefits (maybe it helps when re-ordering senses?) and drawbacks (it makes the edit window more cluttered when revising senses). For readers it's probably helpful, especially for terms with many senses. It would also avoid the perennial confusion many readers have regarding the glosses that are supplied to {{sense}} before antonyms — I am assuming that, like at present in entries that use {{synonyms}}, the nyms will be given on lines starting with "Synonyms:", "Antonyms:", etc. And if nyms are put underneath senses, I would prefer they be hidden like quotations. - -sche (discuss) 21:57, 13 May 2017 (UTC)
  5.   Support DerekWinters (talk) 20:51, 20 May 2017 (UTC)


  1.   Oppose PseudoSkull (talk) 03:36, 13 May 2017 (UTC)
  2.   Oppose How hard is it to use {{sense}}? DCDuring (talk) 17:54, 13 May 2017 (UTC)
    It's not hard to use as an editor, but it's hard on the user viewing the page. If you've ever had to go back and forth to find which sense goes with which synonym, like I have on many pages, then yes, it's hard. —CodeCat 19:18, 13 May 2017 (UTC)
    Also, I always find {{sense}} bugging when it's used next to antonyms (I understand the logic, but it's counterintuitive nonetheless): an example. Having the antonyms directly below the relevant sense removes all confusion. --Barytonesis (talk) 00:38, 28 May 2017 (UTC)
  3.   Oppose. The priority should be making definitions easy to skim. Even a collapsible text "Synonyms" makes it harder. Furthermore, for senses with multiple -nyms sections, there would have to be multiple collapsible items, like Synonyms, Antonyms, Hyponyms. When you consider that there would also be collapsible Quotations, so many items make for a cluttered user interface. --Dan Polansky (talk) 22:18, 19 May 2017 (UTC)


  1.   Abstain I have voted for enough good-sounding things that turned out to be awful (hi COALMINE!) not to want to support unconditionally, but I don't know that I want them hidden either. I support having nyms under their individual senses but I don't think either of the support-options gives enough flexibility about how. Equinox 03:27, 13 May 2017 (UTC)
    "Unconditionally" may not be the best word, since I didn't quite mean it to be a "no matter what" kind of option, just a way of saying that it didn't matter if the nyms were hidden or not. I think I'll change the wording, since not many people have weighed in yet. Feel free to add another option if you wish. Andrew Sheedy (talk) 05:15, 13 May 2017 (UTC)

Hiding all nyms the same way as quotations

Suggested in a Beer Parlour discussion from February.


  1.   Support Andrew Sheedy (talk) 22:50, 12 May 2017 (UTC)
  2.   Support Saltmarsh. 05:46, 13 May 2017 (UTC)
  3.   Support Ungoliant (falai) 19:21, 13 May 2017 (UTC)
  4.   Support, but I would like to see a complete design review rather than a piecemeal process. - [The]DaveRoss 12:28, 16 May 2017 (UTC)
  5.   Support, but under a nyms dropdown of course. —Aryamanarora (मुझसे बात करो) 22:03, 19 May 2017 (UTC)
  6.   Support – I kind of like how the Simple English Wiktionary does synonyms and antonyms. — Eru·tuon 22:05, 19 May 2017 (UTC)


  1.   Oppose as a default, but there's nothing against making this a preference that users can opt into. —CodeCat 22:55, 12 May 2017 (UTC)
  2.   Oppose. Why the hell do we need to hide these? They're fine as they are! PseudoSkull (talk) 03:35, 13 May 2017 (UTC)
    I think this is about hiding the nyms next to definitions, not the Synonyms, Antonyms, or other -nyms sections below the definitions. — Eru·tuon 04:04, 13 May 2017 (UTC)
    Okay, that makes a little more sense. But still, why? PseudoSkull (talk) 05:03, 13 May 2017 (UTC)
    Because if they're under senses, they add extra information between definitions, making it harder to skim an entry for a sense you're looking for, as well as just adding clutter. Hiding the nyms keeps them tucked out of the way until you actually want to look at them (and if there were eventually entries with synonyms, antonyms, hyponyms, and hypernyms all in different lines under a single definition, it would likely look pretty messy). Andrew Sheedy (talk) 06:05, 13 May 2017 (UTC)


Hiding only those nyms that take up more than one line

As suggested by Equinox.


  1.   Support CodeCat 22:53, 12 May 2017 (UTC)
  2.   Support Saltmarsh. 05:47, 13 May 2017 (UTC)


  1.   Oppose. I'd rather they all be hidden. Andrew Sheedy (talk) 22:50, 12 May 2017 (UTC)
  2.   Oppose. That would be inconsistent. Either show them all, or hide them all. Do, or don't do, something with them all. Besides, what takes up more than one line often varies by screen size. --Daniel Carrero (talk) 00:06, 13 May 2017 (UTC)
    Then have it detect the screen size before making the decision. Equinox 00:41, 13 May 2017 (UTC)
    And behaving differently in different circumstances often makes sense. Perhaps an e-mail client shows two separate panels (folders and messages) if the screen/window is big enough, but collapses it all into one space if the screen is very small, like a phone. Such decisions are common. Equinox 00:42, 13 May 2017 (UTC)
    It still looks like a bad idea to me. If sense #1 has many synonyms and sense #2 has few synonyms, then only the synonyms of #2 are shown, so at first glance it would look like #1 had no synonyms at all.
    I would support doing this: show at most X synonyms for all senses, but if there are more than X synonyms, display a "show more" button. Like this:
    1. good
      Synonyms: acceptable, agreeable, commendable, decent[show more] (this button is not working at this moment, but you get the point)
    --Daniel Carrero (talk) 04:58, 13 May 2017 (UTC)
    Well, ya might be right, or not :) How do I arm-twist someone into making a demo page so we can decide without so much hypothesising? Equinox 05:04, 13 May 2017 (UTC)
    Just to note, I voted support because I believed this would be implemented by hiding every line but the first if the list became too long. That was the idea originally posted in the talk page discussion linked above. I'm opposed to hiding the nyms completely under any circumstance; if there are nyms, then at least some should be visible. —CodeCat 21:06, 13 May 2017 (UTC)
  3.   Oppose, simply because I oppose the entire thing. Keep all nyms where they were before. PseudoSkull (talk) 03:37, 13 May 2017 (UTC)
  4.   Oppose for the same reason as Daniel. - -sche (discuss) 21:01, 13 May 2017 (UTC)


  1.   Abstain, actually a weak oppose. —Aryamanarora (मुझसे बात करो) 22:03, 19 May 2017 (UTC)
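
In the absence of a demo page, Daniel Carrero's suggestion above (show at most X synonyms per sense, with a "show more" control for the rest) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `splitNyms` and `renderNymLine` and the bracketed "[show N more]" label are hypothetical, nothing here is taken from an existing Wiktionary template or gadget, and a real implementation would produce a clickable element rather than plain text.

```javascript
// Hypothetical sketch: split a list of nyms into a visible part and a hidden remainder.
function splitNyms(nyms, maxVisible) {
  return {
    visible: nyms.slice(0, maxVisible),
    hidden: nyms.slice(maxVisible),
  };
}

// Render one "Synonyms: ..." line as plain text. The "[show N more]" suffix
// stands in for a real button and is only added when something is hidden.
function renderNymLine(label, nyms, maxVisible) {
  const { visible, hidden } = splitNyms(nyms, maxVisible);
  const suffix = hidden.length > 0 ? " [show " + hidden.length + " more]" : "";
  return label + ": " + visible.join(", ") + suffix;
}

console.log(
  renderNymLine(
    "Synonyms",
    ["acceptable", "agreeable", "commendable", "decent", "fine"],
    4
  )
);
```

Because every sense with nyms always shows at least its first few, a reader would not mistake a long list for an empty one, which was the objection raised to hiding only the multi-line lists.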

Poll: collapse usage examples

Since the argument was made that the -nyms take up too much vertical space, this same argument can be used for usage examples as well. So I'm creating this poll to see how people feel about that. —CodeCat 22:58, 12 May 2017 (UTC)


  1.   Support. Look at ձի (ji). --Vahag (talk) 19:36, 13 May 2017 (UTC)
    Why so many? DCDuring (talk) 20:26, 13 May 2017 (UTC)
    IMHO that's way too many usexes. Some should be turned into derived terms and others should be removed, leaving 2-4 at most. —Granger (talk · contribs) 20:38, 13 May 2017 (UTC)
    I imported the usexes from an out-of-copyright source. It is a pity to lose so much free information in a language so badly documented on the Internet. --Vahag (talk) 07:31, 14 May 2017 (UTC)
    The ones that aren't SOP should be added as derived terms. The ones that are translations of one-word or idiomatic English terms should be added to the appropriate translation tables. For the ones that are SOP translations of SOP English phrases, I imagine we wouldn't usually be losing that much information. —Granger (talk · contribs) 12:34, 14 May 2017 (UTC)
  2.   Support Seriously, usexes are useful, even if they are SOP —Aryamanarora (मुझसे बात करो) 22:04, 19 May 2017 (UTC)


  1.   Oppose. I assume ideally we should have 1 usage example per sense, maybe 2 usage examples per sense sometimes. If this is correct, maybe this doesn't take up a lot of space. Quotations absolutely need to be hidden because we can have more quotations per sense, with additional lines for title/author/year/etc information. Usage examples can appear normally, without needing to be hidden. That's my opinion. --Daniel Carrero (talk) 12:55, 13 May 2017 (UTC)
  2.   Oppose Sometimes the usage example is better than the definition to help someone identify the sense they seek. DCDuring (talk) 17:56, 13 May 2017 (UTC)
    I suppose that my concern is for English, so I am open to other views regarding usage examples for other languages. DCDuring (talk) 20:25, 13 May 2017 (UTC)
  3.   Oppose per DCDuring. — Ungoliant (falai) 19:11, 13 May 2017 (UTC)
  4.   Oppose also per DCDuring. A quick peek at an example is sometimes all you need. Leasnam (talk) 15:57, 16 May 2017 (UTC)


  1.   Abstain. Oddly enough, I don't really care either way about usage examples. I think it's because it's mainly links that look messy to me. — Andrew Sheedy (talk) 23:25, 12 May 2017 (UTC)

According to Wiktionary talk:Whitelist#Rollbacker, this will be very quick and simple.

I desire to have rollbacking rights, but do not desire admin rights. The simple reason is I need something else to do around here, and rollbacking rights would be useful to me whenever I feel like looking through the RC all day and reverting vandals. I've been here for a few years now, and have a fairly decent case here, though a few people might disagree. I don't desire to rollback anything except for obvious and blatant vandalism (which seems to happen a lot around here). I've been reluctant to do the whole RC patrolling game simply because the undo process is slow sometimes, etc., you know how that is. It'd just feel good to be that person who took down some vandalism quickly and on sight. Sometimes, rollbacking is even good to revert my own edits quickly if I make some mistake and realize it directly afterwards. Will someone please give me the rollback rights? PseudoSkull (talk) 04:03, 13 May 2017 (UTC)

And no, I'm not switching accounts anymore, ever. PseudoSkull (talk) 04:07, 13 May 2017 (UTC)
You might as well go for the admin vote and ignore any tools you don't care about. I'd probably support. Equinox 04:18, 13 May 2017 (UTC)
(In case you weren't aware, there is a rollback-ish gadget in preferences. It does not act instantaneously like the real thing though. —suzukaze (tc) 04:27, 13 May 2017 (UTC))
  • I'm not sure I'd trust you to be an admin, but you'd be fine having the rollback tool. I'll nominate you. —Μετάknowledgediscuss/deeds 04:31, 13 May 2017 (UTC)
@User:Metaknowledge Thanks. I don't think I'm ready to be an admin yet. I used to think I'd never want to be. I'd definitely fail an admin vote as of now, but someday it'd be nice to have the admin tools. There are a lot of people I've pissed off, including and not limited to User:Dan Polansky. I must clarify that, though it doesn't matter anyway, I do indeed feel bad for past mistakes. If I were to go back in time, I'd probably support his admin vote, because he mostly makes good contributions to the site and acts in a civil manner most of the time. I no longer hold any grudges against him, because I don't really know him, and we all get angry sometimes. Also, with the whole Danish inflection incident that was caused by me, I think that's another major turnoff to the other editors. But I feel like I'm learning as I go, and someday I see it as a possibility that I may get the admin tools. I kid with the occasional pranks like what once every 5 months(?), but I genuinely dream of this site one day becoming the most popular and useful dictionary in the world, and to know that I'm contributing to such a masterpiece is so amazing.
Anyway, anyway, I don't know about starting an admin vote yet. Would waiting 2 more years be reasonable? Or does User:Equinox or other people think I should just go ahead and give it a shot? PseudoSkull (talk) 04:58, 13 May 2017 (UTC)
Okay. Want some boring brutal honesty? Of course you do. I think people often go through an initial phase of wanting to be an admin ('cause it's like an elite club with brandy and cigars?? -- anyway it reminds me of people wanting operator status on IRC, back when anyone cared about IRC; or being a forum moderator, or whatever) and later they forget about that nonsense, and just do a ton of work, and when someone finally says "how come X has been doing all this stuff for 5 years and isn't an admin yet?" they kind of shrug, "well I'll do it anyway, but vote if you like, that's nice". Pseudoskull, you have been around a while, and okay I do disagree with some of your entries (Transformers came up recently!) but I am pretty sure I would trust you to revert vandalism etc. Frankly the fact that you just wrote a huge paragraph (yeah look who's talking), and deleted the "don't admin me" bit off your user page, suggests that you are quite keen to be an admin but trying to be all secret-like and hide it. This seems to place you in the first camp (wanting the brandy and cigars). Not judging. Just saying. P.S. I'm not allowed to hand out rollback or I'd just do it instead of writing this. Equinox 05:10, 13 May 2017 (UTC)
Actually, you can just hand it out. —Μετάknowledgediscuss/deeds 05:21, 13 May 2017 (UTC)
@User:Equinox Well, it's true, I admit, that the title of admin itself is tempting. I did delete that bit off of my user page, because I changed my mind. I never really thought about it that much until recently. But you know, sometimes I come across a page that's obviously vandalism or has failed RFD/RFV and I'm thinking it'd be so much more useful if I could just delete it myself rather than speedying or something, if I was an admin that is. I've always considered adminship as a remote possibility, but never thought anyone would actually suggest it, which got me thinking. Then again, maybe I'm just not ready for it yet, and need more time to mature, which is what I fear. (And not that it has anything to do with this, but I write long paragraphs about a lot of things. lol)
On a different note, can someone even nominate themselves as admin? You know...if that's even allowed or if it's even my place to do that? I usually see other people nominating someone for adminship rather than them nominating themselves, but WT:ADMIN doesn't really seem to cover the steps. I imagine it is a courtesy to be nominated by someone else. PseudoSkull (talk) 05:53, 13 May 2017 (UTC)

Phrases with interchangeable pronouns (English at least for now)

I've been bothered by this for quite some time, and I think it's time we finally do something about it. We need some way on Wiktionary to show people how "oneself" is inflected in specific phrases and include entries for all of such combinations. For instance, I recently created express oneself. In my ideal scenario, we'd have entries for express myself, express yourself, express ourselves, etc.

Okay, so I'm not good with templates, but I have created a basic outline of my proposal here by changes to two example entries. See User:PseudoSkull/express oneself for example entries. The "Inflection" section will not be so primitive as in my example, with a bulleted list, but will instead be represented by some sort of template. This is where you can find most of the pronouns.

What I propose is that we list the main lemma's inflections "express oneself" separately (i.e. expresses oneself, expressing oneself, etc.), with the "Inflection" section or some similar section, like maybe "Usage notes", to link to all the common interchangings. Yes, the template will include those notes to establish that some of the forms are archaic, or some are considered nonstandard but are still somewhat common. Then, for all the links to "express myself", "express himself", etc., those will be separate entries. They will individually have definitions that tell what "form" of the "express oneself" term they are; for instance, "express myself" would be the "first person singular" form of "express oneself", according to the entry. The entry would treat the "express myself" form as a lemma in itself, though, sort of, like how French past participles are treated in their entries...sort of. Anyway, so express myself will still use Template:en-verb, to list off all of its conjugated forms (expresses myself, expressing myself, expressed myself), and those will have entries too.

MOTIVE: Now, I understand that in the past people have generally favored doing the "myself", "yourself" entries as redirects to the main entry, but if you've known me long enough, I don't believe in redirects, and am pretty big on the whole entry inclusionism thing. I believe that any term that technically is not a sum of its own parts and meets our attestation criteria should be included as an entry here. Also, the "oneself" forms are coincidentally a lot less common than the "myself" or "yourself" forms anyway.

QUESTIONS: In the case that this passes Wiktionary's legislation, I wonder if there's a way to semi-automatically create these entries, because it'd be a real pain in the butt to sit there manually creating all those myself, himself, themselves entries for thousands of phrases. I wonder if there's a bot that we could feed that could do a lot of this for us.

Also, as most non-lemma forms don't need individual attestation necessarily, well, how are we supposed to treat these "pseudo-lemmas" (i.e. express himself would be the lemma of expresses himself, but express himself is apparently an inflected form of express oneself)?

I do want to go as far as to start a vote on this, but I need some input first. How can we best implement the inclusion of myself, ourselves, yourself, etc. phrase entries? PseudoSkull (talk) 07:13, 13 May 2017 (UTC)

I don't think this is useful. This isn't lexical information but grammatical. Pronouns can be replaced by other pronouns, or by actual nouns ("talk to oneself" can also be "talk to Bob" etc.). Equinox 21:53, 16 May 2017 (UTC)
I agree, although I don't think such pronouns can often be replaced with a wide variety of pronouns or nouns when it's a reflexive verb. You can't say "express him" or "express Bob", for example. Andrew Sheedy (talk) 22:04, 16 May 2017 (UTC)
Redirects would be much better than this suggestion. We need better entries more than we need additional entries, especially ones that add so little. DCDuring (talk) 22:02, 16 May 2017 (UTC)
Well these forms are attested in most cases, so I guess I'd be okay with redirects, but only if we also include their inflections as redirects (i.e. expresses himself would redirect to express oneself?) PseudoSkull (talk) 22:32, 16 May 2017 (UTC)
It's a weird situation because it's a phrase in which both the verb is inflected and the reflexive pronoun changes its person and number to match the subject of the verb. I guess it's sort of inflection. I wouldn't object to having a table of the various grammatical combinations of verbal form and reflexive pronoun, and creating redirects, because it would help readers find the entries, but I don't know what I think about giving these things entries.
I'm not pleased with the free-form list in your sample entry. It should be structured somehow, like verb inflection tables. And it could contain a lot more forms: as of yet, it just has bare verb forms, no inflections (expresses, expressing, expressed). What is the rationale behind this? — Eru·tuon 20:22, 17 May 2017 (UTC)
@ User:Erutuon I only used a list because I do not know how to create such a template, but wanted to give a basic idea of what would be on the template. PseudoSkull (talk) 20:43, 17 May 2017 (UTC)
How should this be handled in other languages? Dutch has no impersonal possessive, the closest is impersonal "you". —CodeCat 20:29, 17 May 2017 (UTC)
How do Dutch dictionaries normally handle that? --WikiTiki89 20:57, 17 May 2017 (UTC)
I don't know. —CodeCat 20:59, 17 May 2017 (UTC)
Can you try to find out? --WikiTiki89 21:22, 17 May 2017 (UTC)

links from 'User contributions' section directly to user's actual post

Hi, when following a link from 'User contributions' it takes me to the top of the corresponding month's page, yet I'd like to go directly to my post. Thanks in advance. --Backinstadiums (talk) 16:59, 13 May 2017 (UTC)

@Backinstadiums: What do you mean by "month"? Are you talking about the dated revision links (with a full date on them), or about links to this or another discussion page? You can click on the arrow before the section name in the edit summary, if there is one, to go to the section header. — Eru·tuon 17:04, 13 May 2017 (UTC)
@Erutuon: Exactly, the arrow, but instead I want to be redirected directly to my post. --Backinstadiums (talk) 17:10, 13 May 2017 (UTC)
@Backinstadiums: That would be nice, but there is no way to do that, at the moment. The arrow only works because there's a section heading with an id="section title" in the HTML source code. There is no id in your posts. — Eru·tuon 17:16, 13 May 2017 (UTC)
  • There is an ugly and crude way to achieve that to some extent tho: from the contribution page we can get section name and timestamp that most certainly identify one post. The hardest part is bounding the post in HTML. In the simplest cases like we are having here it is possible to do that. I will try to make a small proof of concept script. --Dixtosa (talk) 18:05, 13 May 2017 (UTC)
    Here's the script (newer version here) to highlight the post, and this is to scroll to the appropriate section (no longer needed). It is actually not too bad. --Dixtosa (talk) 20:08, 13 May 2017 (UTC)
    @Dixtosa: I imported those scripts, but I'm not seeing anything. I go to a diff page for a discussion page comment, and the comment isn't highlighted. Not sure how to go to the comment either. — Eru·tuon 16:47, 14 May 2017 (UTC)
    1. It only works if you click the page title not the diff button. 2. It does not scroll to the post, but only highlights it (in the lamest color possible - lime). 3. some edits like this one have one minute difference between the edit timestamp and the signature time. On those edits it won't work (yet). --Dixtosa (talk) 18:58, 14 May 2017 (UTC)
    @Dixtosa: Huh. I realized I wasn't clicking from the right page (it is meant for User contributions), but still I'm not seeing any highlighting when clicking on page titles for contributions that have added a timestamp. (I also checked to see that the revision date was the same as the timestamp in the signature: it was.) Eru·tuon 19:04, 14 May 2017 (UTC)
    I wonder if some browser or operating system difference is causing it not to work on my machine. I'm using Chrome on Windows 10. — Eru·tuon 19:10, 14 May 2017 (UTC)
    1. Do you have your actual time zone set correctly in preferences? 2. Could you see if there are any js errors in your browser's console?--Dixtosa (talk) 19:57, 14 May 2017 (UTC)
    1. Yes, my timezone is set. 2. I checked the console, and there were no errors, aside from some deprecated stuff in other scripts. But then I looked for your scripts in the list of "sources", and couldn't find them. So I must not have managed to load them correctly. Huh. — Eru·tuon 20:05, 14 May 2017 (UTC)
    Oops, they are shown in the HTML source code. Not sure why they don't show up in the list of "sources". — Eru·tuon 20:08, 14 May 2017 (UTC)
    Then it must be a date format issue. I hope you use a format different from this template "20:16, 14 May 2017"?--Dixtosa (talk) 20:18, 14 May 2017 (UTC)
    Strangely not. That is my current date format. I turned off the "change dates to be relative to local time" gadget when I enabled your scripts. — Eru·tuon 20:37, 14 May 2017 (UTC)
  • The script sends a popup error now, saying the timestamp was not found. It has weirdly transformed the hour and minute for this edit, from 22:07 to 05:7. — Eru·tuon 22:23, 14 May 2017 (UTC)
    Should be fixed now. Why was that weird though? Minute was not zero padded. --Dixtosa (talk) 22:45, 14 May 2017 (UTC)
    Yay!!! It works now!!! Awesome! Okay, the minute was not all that bizarre. But I don't know what happened with the hour. — Eru·tuon 23:11, 14 May 2017 (UTC)
    I'm doubly pleased now, because your tool works even when I turn the local time gadget back on. — Eru·tuon 01:05, 15 May 2017 (UTC)
    You can remove User:Dixtosa/RushToSubsection.js from your common.js. RushToSubsection.js basically replaces the link associated to page title with the one from →. But it also works on watchlist and "Related changes" page. --Giorgi Eufshi (talk) 06:38, 15 May 2017 (UTC)
    Hmm, okay. It doesn't really work to do that from the watchlist, since (in my display) there are multiple posts on the same page grouped together. — Eru·tuon 19:30, 15 May 2017 (UTC)
  • @Dixtosa: Would it be possible to make this work for dated revision links from page histories? Also, can you make it so the CSS applied to the post can be customized? — Eru·tuon 19:30, 15 May 2017 (UTC)
    First of all, please use this script. My sandbox.js is for sandboxing purposes. Yes, history pages could also have this implemented. The highlighted post is wrapped in a div with the id equal to "SkipToPostJs-highlighted-post". Bear in mind you'd have to add " !important" to your styles. --Giorgi Eufshi (talk) 06:15, 16 May 2017 (UTC)
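
For readers following the "05:7" problem above: the script has to rebuild the signature text from the edit's timestamp, and the minute was not zero-padded. A minimal sketch of that formatting step is shown below. It is not taken from skipToPost.js; the helper names `pad2` and `formatSignatureTimestamp` are hypothetical, and it assumes the default signature format "HH:MM, D Month YYYY (UTC)" with an unpadded day.

```javascript
// Hypothetical sketch, not the actual skipToPost.js code.
const MONTHS = [
  "January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December",
];

// Zero-pad a number to two digits, so that 7 becomes "07".
function pad2(n) {
  return String(n).padStart(2, "0");
}

// Build the text a default signature would contain for a given Date,
// always reading the UTC fields so the viewer's local timezone is ignored.
function formatSignatureTimestamp(date) {
  return (
    pad2(date.getUTCHours()) + ":" + pad2(date.getUTCMinutes()) + ", " +
    date.getUTCDate() + " " + MONTHS[date.getUTCMonth()] + " " +
    date.getUTCFullYear() + " (UTC)"
  );
}

console.log(formatSignatureTimestamp(new Date(Date.UTC(2017, 4, 14, 22, 7))));
```

Reading the UTC fields rather than the local ones is one plausible way to avoid an hour shifting with the viewer's timezone, though the thread does not say what actually caused the hour to change.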

So is it feasible or what? --Backinstadiums (talk) 20:41, 27 May 2017 (UTC)

@Backinstadiums: Yes. You have to install User:Dixtosa/skipToPost.js in your common.js. To install, add the code importScript("User:Dixtosa/skipToPost.js"); to the page. — Eru·tuon 00:09, 28 May 2017 (UTC)
@Erutuon: It doesn't work; this message pops up: "skipToPost.js: timestamp not found on the page". Could you please check whether I have proceeded properly? --Backinstadiums (talk) 07:45, 28 May 2017 (UTC)
It does work but has a lot of assumptions: You should be using this format for dates: "20:16, 14 May 2017" and you should have your timezone set correctly. Dixtosa (talk) 07:52, 28 May 2017 (UTC)
@Dixtosa: Could you please elaborate a bit on how to set everything correctly? --Backinstadiums (talk) 09:00, 28 May 2017 (UTC)
@Backinstadiums, I have updated the script to handle all of the date formats. As for the timezone, see if it is correct here. P.S. It does not work for posts that have more than 1 minute of difference between the signature time and the official timestamp for the edit, and obviously it cannot highlight modifications to a post. --Dixtosa (talk) 13:16, 28 May 2017 (UTC)
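
The one-minute limitation mentioned above can be illustrated with a minimal sketch. This is not the code of User:Dixtosa/skipToPost.js: the function names parseSignatureTimestamp and matchesRevision, the regular expression, and the ISO format of the revision timestamp are all assumptions, and only signatures of the form "20:16, 14 May 2017 (UTC)" are handled.

```javascript
// Hypothetical sketch, not the actual skipToPost.js implementation.
const MONTHS = ["January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December"];

// Parse a signature such as "20:16, 14 May 2017 (UTC)" into a UTC epoch (ms).
// Returns null when no timestamp in that format is found.
function parseSignatureTimestamp(text) {
  const m = /(\d{2}):(\d{2}), (\d{1,2}) ([A-Z][a-z]+) (\d{4}) \(UTC\)/.exec(text);
  if (!m) return null;
  const month = MONTHS.indexOf(m[4]);
  if (month === -1) return null;
  return Date.UTC(Number(m[5]), month, Number(m[3]), Number(m[1]), Number(m[2]));
}

// True when the signature time and the revision time differ by at most
// toleranceMs (one minute by default, as described in the thread).
function matchesRevision(signatureText, revisionIso, toleranceMs = 60 * 1000) {
  const signed = parseSignatureTimestamp(signatureText);
  const revised = Date.parse(revisionIso);
  if (signed === null || Number.isNaN(revised)) return false;
  return Math.abs(revised - signed) <= toleranceMs;
}
```

Under these assumptions a post signed at 13:16 and saved at 13:16:45 would match, while one saved at 13:18:00 would not, which is consistent with the "timestamp not found on the page" message reported earlier.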

think tank and draft proposal?

What the heck is the difference between Template:policy-TT and Template:policy-DP? --Celui qui crée ébauches de football anglais (talk) 18:47, 13 May 2017 (UTC)

Perhaps such a distinction had meaning early in Wiktionary's history, but at this point I think we could survive with just one class for "draft policies". - -sche (discuss) 01:09, 14 May 2017 (UTC)
  Support merging both. Either the page is a full policy or it isn't. --Daniel Carrero (talk) 09:32, 15 May 2017 (UTC)

Proposal: remove "; usually, the English translation is given, instead of a definition" from WT:EL - Translations


Remove "; usually, the English translation is given, instead of a definition" from WT:EL#Translations.

Current text:

"Translations should be given in English entries, and also in Translingual entries for taxonomic names. Entries for languages other than English and Translingual should not have Translations sections; usually, the English translation is given, instead of a definition. Any translation between two foreign languages is best handled on the Wiktionaries in those languages."

End result:

"Translations should be given in English entries, and also in Translingual entries for taxonomic names. Entries for languages other than English and Translingual should not have Translations sections. Any translation between two foreign languages is best handled on the Wiktionaries in those languages."

Procedural note:

  • This is a minor edit. It is not intended to change any regulations.


Rationale:

  • The text to be removed is at WT:EL#Translations, but it does not concern the "Translations" section. It concerns definitions/senses, which are the subject of WT:EL#Definitions.
  • Not every foreign language word has an English translation.
  • That rule about definitions/senses does not seem to cover non-gloss definitions, not to mention form-of entries.
  • That full paragraph was voted on and approved at Wiktionary:Votes/pl-2016-01/Translations of taxonomic names, but even there at least one person disagreed with the text now proposed for removal.

--Daniel Carrero (talk) 08:55, 14 May 2017 (UTC)

I went ahead and did it without a vote. Feel free to revert or discuss. --Daniel Carrero (talk) 22:50, 17 May 2017 (UTC)

Category:Requests for attention in Proto-Indo-European entries

This title is inaccurate. The great majority of these requests for attention are not in Proto-Indo-European entries. I suspect they have been placed in Etymology sections containing a PIE term, since I just placed one in bison. Perhaps the category name should be Requests for attention to Proto-Indo-European terms, though that sounds a little stilted. (The catfix should probably be removed too.) — Eru·tuon 18:56, 14 May 2017 (UTC)

@Erutuon, Wikitiki89, This, that and the other: What about Category:Requests for attention concerning Proto-Indo-European, and doing the same for the "attention" categories in all languages? It's normal to use the template {{attention}} in etymologies, which could be about any language, not necessarily the language of the current entry. --Daniel Carrero (talk) 16:33, 18 May 2017 (UTC)
I agree it should be changed. Not sure to what though. --WikiTiki89 16:58, 18 May 2017 (UTC)
I don't really like that title either. It's hard to come up with a good name because the template is so general that we don't really know what it's being used for. — Eru·tuon 21:18, 18 May 2017 (UTC)
Why not just "Requests for attention (Proto-Indo-European)"? It seems sufficiently vague. This, that and the other (talk) 00:14, 19 May 2017 (UTC)
Oh, yeah. I'd be fine with that. — Eru·tuon 00:36, 19 May 2017 (UTC)
I'm not a huge fan of that. All the other requests categories are nice descriptive phrases, so I was hoping the attention categories could be too. --Daniel Carrero (talk) 19:55, 19 May 2017 (UTC)
As I recall, we moved away from or decided not to move to categories with a name along the lines of "Requests (German)", because they sounded like categories for "Können Sie...?" "Requests for attention (Proto-Indo-European)" suffers from the same issue. I think Category:Requests for attention concerning Proto-Indo-European is a fine, broad name. - -sche (discuss) 20:24, 19 May 2017 (UTC)
Hmm, good point. I withdraw my objection to the option containing "concerning" then. — Eru·tuon 20:54, 19 May 2017 (UTC)
That's good. The "concerning" name also gets extra points because it'll be one of the subcategories of Category:Requests concerning Proto-Indo-European entries. --Daniel Carrero (talk) 21:04, 19 May 2017 (UTC)
I started to (re-)rename the attention categories now. (example: Category:Requests for attention in Swedish entries → Category:Requests for attention concerning Swedish) --Daniel Carrero (talk) 18:00, 20 May 2017 (UTC)
Actually, per this discussion, this other category should be renamed too for the same reasons: Category:Requests concerning Proto-Indo-European entries → Category:Requests concerning Proto-Indo-European. (this is the top-level request category, but not all request categories are about entries as demonstrated here) I'll do this later if it's OK. --Daniel Carrero (talk) 18:25, 22 May 2017 (UTC)
That sounds good to me. — Eru·tuon 03:25, 23 May 2017 (UTC)

Replacement heading for External links in taxonomic name entries

I missed the vote and today was notified that "External links" is to be phased out.

"Further reading" does not well characterize the links to external databases, few of which contain much running text, which to me is what is conveyed by "Further reading". What are under the External links heading are links to databases, containing entries most commonly structured in ways not unlike our own in gross terms. Sometimes there are images as well. The databases often reflect hypernyms and hyponyms differing from or supplementing what we show. In accordance with a 2011 vote these links were to appear under "External links" instead of "See also".

I would like some thoughts about which of the remaining legal headings is satisfactory: "See also", "References", other, no heading at all. DCDuring (talk) 23:51, 14 May 2017 (UTC)

I'd use 'References'. —Μετάknowledgediscuss/deeds 00:06, 15 May 2017 (UTC)
References is for, well, references in the entry. Like the <references /> tag. "See also" is for other entries within the same language that may be relevant to the user, not for links to other languages, and definitely not links outside Wiktionary. —CodeCat 00:07, 15 May 2017 (UTC)
I made a comment to this effect on the vote page. @Daniel Carrero argued that Further reading was fine since the template I mentioned, {{R:EOL}}, contains readable text in some of its entries. I didn't feel like pursuing the issue further. — Eru·tuon 00:21, 15 May 2017 (UTC)
To summarize Dan C.'s argument: Because some of these links SOMETIMES have text, all entries using those databases as well as the numerous databases that have no such text, eg WikiCommons, MycoBank, ITIS, NCBI, should have "Further reading" headings. Seems like flawed reasoning to me. Not to mention the lack of a fact base. DCDuring (talk) 00:31, 15 May 2017 (UTC)
Your summary of my argument is wrong. In the discussion that Erutuon linked, I never said anything to this effect: "Because some of these links SOMETIMES have text, all entries using those databases as well as the numerous databases that have no such text, eg WikiCommons, MycoBank, ITIS, NCBI, should have 'Further reading' headings."
I just said that EOL, specifically does fit in a "Further reading" section. I even asked him: "If there are any other examples of websites without much readable text, I'd be happy to discuss about them."
I agree that Wikimedia Commons does NOT fit in a "Further reading" section. If the purpose is linking to a large list of images at Commons, maybe we should use the "Gallery" section that a few entries have. For example, ดังโงะ has a "Gallery" section with 4 images. It could have a link saying "For more images, see Commons." An empty "Gallery" section could have a link saying "For images, see Commons." (without the word "more") --Daniel Carrero (talk) 09:25, 15 May 2017 (UTC)
AFAICT four different headings could be required for many taxonomic entries and three for English entries for corresponding vernacular names. Any single heading of those available would be a procrustean bed for the common range of items that have appeared under "External links".
The vote evidently had an opinion base, not a fact base, as if it were some kind of adversarial proceeding, in which the only relevant facts are those brought to the discussion by adversaries. And if parties with some particular knowledge were not present, so much the worse for any situations they were familiar with.
"Further Readings" fits EOL (which SOMETIMES has running text, almost always taken from WP), Tree of Life, the Angiosperm Phylogeny site.
"Gallery" fits Wikicommons, but nothing else.
"References" fits most other external sites (but not WP, Wikispecies and Commons, which, as with all wikis AFAICT, we don't treat as valid, authoritative references).
This latter point should have come up, even without my participation in the discussion, as it relates to the use of the heading for any entry that puts the links to other MW projects under a heading.
The upshot is that virtually all taxonomic entries would need to have three headings: "Further readings" for WP, etc, "Gallery" for Wikicommons, and "References" for everything else. Wikispecies doesn't fit under any of the three headings, so perhaps we could draft "See also" into the mix. DCDuring (talk) 02:29, 16 May 2017 (UTC)
"Additional resources" might capture all of those. I think the sister project links should be done with the float boxes like Wikipedia, which would eliminate them from consideration here. - [The]DaveRoss 12:46, 16 May 2017 (UTC)
Many have objected to the floating boxes. They are noxious for pages that have images or a right-hand table of contents, especially on long pages, such as those with multiple Etymology sections or multiple languages. I place the sister project links next to the other "external links" for the convenience of contributors (often needed to finish the entry) and other users in selecting sources for additional information. DCDuring (talk) 15:14, 16 May 2017 (UTC)
"Additional resources" is not a permitted header. DCDuring (talk) 15:16, 16 May 2017 (UTC)
When I see "Additional" in a bibliographic section of a book or article, it is usually for some kind of list that supplements a primary list. For example, it might include certain kinds of online resources whereas the bibliography is limited to books and articles. I suppose we could try a vacuous header like "Sources" (of what? for the entry as it is or for its improvement?) or "Resources", though "Resources" sounds too inclusive to me. Would it include seeds, taxidermy supplies, books or online courses on taxonomy? DCDuring (talk) 15:29, 16 May 2017 (UTC)
Additional to Wiktionary. And what is permitted is up to us, if nothing fits the bill we can do something different. I also understand the objections with the boxes, it might be nice to merge them into a single box which can include links to all projects. We should also figure out a consistent ordering of images, project links and tables of contents. All of that could benefit from an overall design policy. - [The]DaveRoss 15:37, 16 May 2017 (UTC)
In the meantime someone did the service of inserting "Further readings" into 6,222 Translingual entries. In virtually every case the heading is misleading for the reasons discussed.
Are we capable of having a good consensus universal design policy? Do we really need one? Given our inability to make a good consensus decision on the simple matter of a single heading, is it really best that we double down and go for something vastly more comprehensive? DCDuring (talk) 12:41, 18 May 2017 (UTC)
That is a mischaracterization, there was a consensus reached regarding this issue. You opposed it after the fact, but that is not an indictment of the process. I think we are able to make consensus decisions, and very often do. I also think we would benefit from having coherent design policy. Coming to a common understanding about what to collapse and where to place things on the page are both valuable. For instance, I think that we should do our best to ensure that the most relevant information is the most readily accessible information, e.g. you shouldn't have to scroll to get to definitions. I don't know if everyone agrees with that, but it is a discussion worth having. - [The]DaveRoss 12:51, 18 May 2017 (UTC)
That the consensus should have been on a heading ("Further reading") that was unsuited to the nature of most of the material included under "External links", eg, Wikicommons, Wikispecies, a dictionary portal, and other dictionaries, makes manifest that the process is insufficient. "Further" to what? I'd be shocked if anyone could produce evidence that (normal) dictionary users view their use of a dictionary as "reading". In fact "reading" is such an exceptional thing for dictionary use that the word was used to mark the distinctiveness of Reading the OED: One Man, One Year, 21,730 Pages by Ammon Shea. The unsuitability of "reading" to characterize the items actually included has already been discussed above, with apologies for but not a successful defense of the consensus.
The consensus decision seems simply wrong. Its wrongness should be humbling. My vote wouldn't have changed the outcome, but no one sought my input either, nor did anyone analyze the nature of the material included under External links. The fact-free, but opinion- and anecdote-laden, nature of the discussion suggests that we can reach consensus, but that the thought processes are defective. IMO the defects of the process are probably attributable to the limited commitment of most contributors (including me) to matters outside their direct interests. Even when someone ventures into the larger matters, it is difficult for them to do all the fact-gathering that would be necessary to do a good job by, eg, not trampling on areas that have special requirements. The 6,222 instances of "External links" in Translingual (that have been "corrected" to "Further Readings") were made pursuant to an earlier vote (Wiktionary:Votes/2011-07/External links) declaring WP (presumably all languages), Commons, and Wikispecies to be "External links". That vote was also a consensus. Was that consensus incorrect? By what standard? Why should I expect this consensus to be any less changeable? DCDuring (talk) 00:32, 19 May 2017 (UTC)
Things change over time, this should not be surprising. Nor should it be surprising that, from time to time, the community reaches decisions with which you find fault. Both of these are expected outcomes of collaborative/wiki projects. Is "Further reading" the most precise heading for all types of information which are included under it? Probably not. Is it equally as imprecise as "External links" is in some cases? Probably so. - [The]DaveRoss 00:53, 19 May 2017 (UTC)
  • I suppose that the best header for the items included formerly under External links is "References", not "Further reading." The important content for Wiktionary is the correspondence or lack thereof to the semantic relations and definitions included in the entry. The "reading" content is mostly incidental. Even if it were not, I can't see any good from having both "References" and "Further reading" in our already heading-heavy entries. I wish that footnoted references did not appear in a way so distinct from other references. I am sure that our technical mavens will resolve this sometime in this millennium, possibly even this decade.
Now there is just the question of how to get a bot to change the "Further reading" header to "References". DCDuring (talk) 22:58, 19 May 2017 (UTC)

IP user not using proper templates

Not sure if this is the place to post about this.

The IP user has been creating letter entries, among other things. Lately they have been working on A, adding a bunch of new language sections. They sometimes use the proper templates ({{head}}, {{Latn-def}}), but sometimes not. They also improperly capitalize the See also section as See Also.

I've corrected their edits, and posted on their talk page advising them to use these templates and to correct the capitalization, but they don't respond and don't seem to change their behavior. For instance, see the twelve edits they've just made today. They've used {{Latn-def}}, but created a bunch of POS sections without {{head}} and used the capitalization "See Also" consistently.

I don't want to ask for this person to be blocked as a vandal, since they're clearly trying to do useful work, but what can be done here? — Eru·tuon 23:01, 15 May 2017 (UTC)

It's possible that they don't speak English. Are there any clues about what specific languages they speak? DTLHS (talk) 23:14, 15 May 2017 (UTC)
None that I can see. The alphabet templates they're adding are for so many different languages. — Eru·tuon 23:18, 15 May 2017 (UTC)

@Leasnam has gone with the solution of simply reverting the edits. Shrug. — Eru·tuon 21:38, 18 May 2017 (UTC)

I am afraid that that is probably the best way to handle it at this point. They don't want to cooperate, and their contributions are sometimes problematic in and of themselves. —Μετάknowledgediscuss/deeds 21:48, 18 May 2017 (UTC)

A note about a future change

Discussion moved from Wiktionary:Beer parlour/2016/May.

Hi everyone. I'm sure you've already noticed this in Tech News as it's been already mentioned twice, but I wanted to flag again that because of an upcoming change, there's some code that may need to be fixed on this wiki:

You can find the list of the pages you'll want to check in a table on In most cases those are false positives, and you only really need to fix as explained on when the "defective" syntax is inside a template or a wikilink, basically. When you have completed the checks, please edit the Notes field of the table accordingly! If you have questions, please reach the team on the talk page. Hope this helps, and thanks for your attention and your help, --Elitre (WMF) (talk) 10:29, 16 May 2017 (UTC)

@Elitre (WMF) You might want to check the batteries on your calendar... ;) Chuck Entz (talk) 13:17, 16 May 2017 (UTC)
LOL, thanks. It's because I followed the link to the page I edited last year and then forgot to reach the current one! --Elitre (WMF) (talk) 15:48, 16 May 2017 (UTC)
I have also mentioned this change today in an email to tech ambassadors. So far the table on doesn't reflect any fixes on this wiki. I'll just add that fixes are of course possible also after the change happens, although it may be suboptimal, and that you may want to look into automated means of solving the problem (come ask how en:Wikipedia:AutoWikiBrowser was used on the big wikis, for example). Hope this helps. Elitre (WMF) (talk) 13:05, 22 May 2017 (UTC)


Birgit Müller (WMDE) 14:39, 16 May 2017 (UTC)

Thanks, it looks pretty cool. —Aryamanarora (मुझसे बात करो) 22:09, 19 May 2017 (UTC)

Using the script code for Nastaʿlīq rather than Arabic

I'm puzzled by the fact that languages that generally use Nastaliq script are coded as using variations of the Arab (Arabic) code, when they should use the code Aran (Nastaliq). For instance, Persian and Urdu text is marked with the classes fa-Arab and ur-Arab. Seems like this should be fa-Aran and ur-Aran. I'm not sure which other languages generally use Nastaliq, but they should use the code Aran rather than Arab too. Or was Arab used for a reason? — Eru·tuon 19:57, 16 May 2017 (UTC)

Nastaʿlīq is a style not a script. Just because a script code exists, doesn't mean we should use it. If we just want to use a different font for a particular language, then script codes are not the way to do it. --WikiTiki89 20:15, 16 May 2017 (UTC)
Well, the code is listed in Module:scripts/data, but not used (as far as I know). Should it be deleted? — Eru·tuon 20:33, 16 May 2017 (UTC)
I think so. --WikiTiki89 20:44, 16 May 2017 (UTC)

Join the next cycle of Wikimedia movement strategy discussions (underway until June 12)

21:08, 16 May 2017 (UTC)

((d)) template - entries marked for imminent deletion

Could/should we change this template to enforce a "reason" parameter? It isn't helpful when people tag pages for deletion with no reason given. Equinox 22:52, 16 May 2017 (UTC)

I don't know that it should require a reason. It's often obvious, like if the page is a redirect left after a move, and typing {{d}} is quick. In cases where it's not obvious, I see people convert it to {{rfd}} or sometimes {{rfv}}. - -sche (discuss) 23:19, 16 May 2017 (UTC)
Support. PseudoSkull (talk) 23:20, 16 May 2017 (UTC)
If you're incapable of figuring out why it should be deleted, let someone else deal with it. (Or if it requires a specific skillset, like speaking Russian, ping someone who speaks Russian.) —Μετάknowledgediscuss/deeds 01:47, 17 May 2017 (UTC)

Subcategorize names in English countable proper nouns

As the category description says, "Proper nouns are generally not countable. This category contains those that are. They may be actually common nouns that have been miscategorized." First and last names are countable as a rule, however, so they swamp the otherwise-useful category, obscuring its other entries: a quick list check using AWB's list-comparer suggests that out of its 6,887 entries, 209 are not names, but they are needles in a haystack. I think it'd be useful to move names to a subcategory, perhaps by passing a name=1 parameter to {{en-proper noun}}. I am not sure what to call the subcategory; maybe "English countable names" or "English countable proper nouns (names)"? Or given and last names could go in different subcategories. It might also be useful to subcategorize place names. - -sche (discuss) 16:52, 17 May 2017 (UTC)

To take one example: Hatfield is readily used in its plural Hatfields, as in "Let's invite the Hatfields over for dinner."/"Not the Mingo County Hatfields, I hope."/"No, just our local Hatfields, the Harold Hatfields, Senior, Junior and Trey."/"Trey isn't a real Hatfield; he's adopted, you know."
Isn't that countable use of a proper noun Hatfield? DCDuring (talk) 17:25, 17 May 2017 (UTC)
Yes, and all names are like that, so names are swamping out the other things in CAT:English countable proper nouns. That's why I want to move them to a subcategory. - -sche (discuss) 03:57, 18 May 2017 (UTC)
What bothered me was the category description, the wrongness or poor wording of which seemed to me to have escaped your notice. Creating the new subcategory would be an opportunity to insert better information in the "category description". The facts seem to be that, 1., virtually any proper noun/proper name could be used in the plural, but that, 2., most usage of proper nouns is in the singular, 3., uncountable use is rare, and 4., there is not gross deviation from this pattern, excepting for certain toponyms, like Rockies/Rocky Mountains. I have come to believe that we have little justification for making assertions about the (un)countability of nouns other than perhaps to say whether countable or uncountable use is the more common. Even that presupposes that we have done some quantitative work to justify the assertion. DCDuring (talk) 12:22, 18 May 2017 (UTC)
I support categorising them as just nouns. —CodeCat 17:36, 17 May 2017 (UTC)
Yes, I think it would be helpful to separate out all names. Andrew Sheedy (talk) 05:53, 18 May 2017 (UTC)
If it is true that all names are countable, the categories of countable given names and countable surnames would be redundant with the existing given name and surname categories. Seems to me the simplest solution would be to put these existing name categories under countable proper nouns and remove all names from the countable proper nouns category. — Eru·tuon 07:44, 18 May 2017 (UTC)
Good point about redundancy. One can find individual names where the plural is not used often enough to meet CFI (usually only if the name itself is not used often). Maybe someone would find it interesting to know which names' plurals do and don't meet our (arbitrary, in the scheme of things) criteria; we do have Category:English nouns with unattested plurals. But yes, as a rule, names pluralize. I wouldn't mind what you suggest. And/or we could just have name=1 prevent categorization into CAT:English countable proper nouns without adding any category. - -sche (discuss) 08:43, 18 May 2017 (UTC)
Isn't the categorization into Category:English proper nouns sufficient? Topical subcategorization into given names, surnames, toponyms, fictional character names, company/brand names, etc seems much more potentially useful than the vacuous or misleading grammatical subcategorization by the relative prevalence of countable usage. DCDuring (talk) 12:22, 18 May 2017 (UTC)
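
One of the options floated above, having name=1 keep an entry out of CAT:English countable proper nouns without adding any other category, can be sketched roughly as follows. This is only an illustration of the rule: {{en-proper noun}} is a wikitext template, and the function properNounCategories and its countable and name options are hypothetical names invented here.

```javascript
// Hypothetical sketch of the proposed categorization rule; the real
// {{en-proper noun}} template is not written in JavaScript.
function properNounCategories({ countable = false, name = false } = {}) {
  const categories = ["English proper nouns"];
  // Names pluralize as a rule, so name=1 keeps them out of the
  // countable category to stop them swamping the other entries.
  if (countable && !name) {
    categories.push("English countable proper nouns");
  }
  return categories;
}

// A surname with a plural (e.g. Hatfield) under the proposal:
console.log(properNounCategories({ countable: true, name: true }));
// A countable proper noun that is not a name:
console.log(properNounCategories({ countable: true }));
```

Whether names should instead land in a dedicated subcategory, or simply rely on the existing given name and surname categories, is the open question in the thread; the sketch covers only the suppression variant.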

Links to headings in WT:EL

A little proposal that I don't think will be controversial. Under the "List of headings" there's a block of code in a fixed-width font that lists possible headers. I propose adding links to each of those headers, pointing to the section on the page that describes those headers in more detail. —CodeCat 17:09, 18 May 2017 (UTC)

I don't mind linking, but I don't like the fact that we have two lists of headings: the one in a fixed-width font, and the other one just below it. --Daniel Carrero (talk) 17:20, 18 May 2017 (UTC)

Vote: Simplifying CFI about constructed languages

FYI, I created Wiktionary:Votes/pl-2017-05/Simplifying CFI about constructed languages.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 09:43, 19 May 2017 (UTC)

Categorization and use of subcategories

On Wiktionary:Categorization, I changed some wording and also solicited feedback on the talk page but I figured I would also post here. I have been informed that--following my intuition and the use of categories on many other projects--entries shouldn't be in a parent and its child categories simultaneously. There may be some rare exceptions but I certainly can't think of any. I changed the wording to reflect that generally you should use the most applicable and narrow category only and not put an entry in (e.g.) Category:English lemmas (edit: I knew better than that--but I'll let the example stand to show my haste and ignorance) and Category:English nouns and Category:English proper nouns or Category:en:Cities and Category:en:Cities in France, etc. Again, I don't think this is particularly controversial but I want clarification from the community because at least one other editor feels otherwise and I have solicited his feedback here. —Justin (koavf)TCM 16:53, 19 May 2017 (UTC)

Certainly for the grammatical categories, we do have things in both parent and child categories, although Proper nouns are not considered a child of Nouns. But dog is in both CAT:English nouns and CAT:English lemmas, and Germany is in both CAT:English proper nouns and CAT:English lemmas. For the topical categories, you're probably right that we should avoid duplication and only put things in the most specific category available, thus Toulouse should be in CAT:en:Cities in France but not in CAT:en:Cities or CAT:en:France. —Aɴɢʀ (talk) 17:03, 19 May 2017 (UTC)

You will find the discussion with koavf here. Generally he is right with other categories, but with Category:en:Counties of the United States of America I think there is a unique situation. DonnanZ (talk) 17:16, 19 May 2017 (UTC)

@Donnanz: How is this unique? This is the question that I keep on asking but you're not answering: How are counties any different than Category:Cities in the United States of America or Category:Rivers in the United States of America? On your talk, you explicitly said that, "A compromise would be to have categories for both country and states. But don't expect me to create the categories." Which I did and then populated and you undid to populate the parent category. You have been told on that talk page as well by a bureaucrat to not do this, I have confirmed that other admins said to not do this, and you've gone behind my back and undone my edits to re-populate the main category. I didn't want to make this a discussion about the two of us bickering but you're really forcing my hand here: please tell me why you think American counties are somehow distinct from all of the other topical categories. If you can't answer this question, then you will have to stop editing like this. I've tried to meet you half-way here by doing the work of making the 50 subcats and populating them from hundreds of edits. I've also additionally made Appendix:Counties of the United States of America so you can have a list if that really matters to you that much. Your editing makes no sense to me here. —Justin (koavf)TCM 17:47, 19 May 2017 (UTC)
Why on earth should I use that appendix? You can lead a horse to water.... DonnanZ (talk) 18:04, 19 May 2017 (UTC)
@Donnanz: Your reason for categorizing them all together was that you wanted to see a list of all of them at once. So I started one. You're still not answering the fundamental question and you've been told numerous times that this is not how the topical categories function. —Justin (koavf)TCM 18:06, 19 May 2017 (UTC)
You are the only one ordering me about, I would like to hear from others. DonnanZ (talk) 18:09, 19 May 2017 (UTC)
@Donnanz: So, you want county names to be in the "counties of the US" category and the "counties of a given state" category? Or am I misunderstanding? — Eru·tuon 18:18, 19 May 2017 (UTC)
@Erutuon: You are correct. —Justin (koavf)TCM 18:33, 19 May 2017 (UTC)
@Koavf: I would rather hear @Donnanz say it. I have not seen a clear indication of what he wants. — Eru·tuon 18:35, 19 May 2017 (UTC)
@Erutuon: Without prejudice to Donnanz--whom I sincerely wish would explain himself--here is his justification. Again, I created an appendix to meet this same need but he evidently doesn't find this suitable (he refuses to explain why). —Justin (koavf)TCM 18:40, 19 May 2017 (UTC)
@Koavf: Aha, I see. I don't understand why he didn't reply to me; perhaps because you answered... — Eru·tuon 18:42, 19 May 2017 (UTC)
  • @Erutuon Gimme a chance. You will find the discussion on my talk page via the link above. DonnanZ (talk) 18:47, 19 May 2017 (UTC)
@Donnanz: No, Chuck Entz on your talk and Angr above have both said the same thing. See also, e.g. User:DTLHS writing on my talk. All of them are admins or bureaucrats. Also, in practice, do you know of any other topical categories where entries are in both child and parent categories? Since you're refusing to give any examples or explain why this one is unique, what do you expect to happen? I tried to help you here by making the appendix--I don't know what more you want. —Justin (koavf)TCM 18:20, 19 May 2017 (UTC)
My reasons for wanting entries to be only in leaf categories are purely practical. If we put into common use a template for topical categorization (such as {{C}}), we could easily make it automatically categorize up the category tree to a certain point (so that for example entries in Category:Crickets and grasshoppers are also in Category:Insects). Until we have this capability I believe categorizing only in the most specific category is reasonable. DTLHS (talk) 18:49, 19 May 2017 (UTC)
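
The "categorize up the category tree to a certain point" idea in the post above might look something like the following sketch. Nothing here reflects how {{C}} works: the function categorizeUpTree, the PARENT map, and every parent link other than "Crickets and grasshoppers" under "Insects" are assumptions made for illustration.

```javascript
// Hypothetical parent links; the real category tree lives on the wiki.
// Only "Crickets and grasshoppers" -> "Insects" comes from the discussion.
const PARENT = {
  "Crickets and grasshoppers": "Insects",
  "Insects": "Arthropods",
  "Arthropods": "Animals",
};

// Return the leaf category followed by up to maxLevels of its ancestors.
function categorizeUpTree(leaf, parentOf, maxLevels) {
  const result = [leaf];
  const seen = new Set(result);
  let current = leaf;
  for (let level = 0; level < maxLevels; level++) {
    const parent = parentOf[current];
    // Stop at the top of the tree, or if the data contains a cycle.
    if (parent === undefined || seen.has(parent)) break;
    result.push(parent);
    seen.add(parent);
    current = parent;
  }
  return result;
}

console.log(categorizeUpTree("Crickets and grasshoppers", PARENT, 1));
```

With maxLevels set to 1, an entry tagged only with the leaf category would also be placed in "Insects", so editors would only ever need to add the most specific category by hand.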
  • At the moment Koavf is reverting all of my edits. He needs to be stopped. DonnanZ (talk) 18:30, 19 May 2017 (UTC)
  • I am not reverting all of your edits... I am undoing the addition of the parent category when the child categories are present. —Justin (koavf)TCM 18:32, 19 May 2017 (UTC)
  • You have no consensus for that. I can always create entries without any categories at all if you like. DonnanZ (talk) 18:37, 19 May 2017 (UTC)
  • @Donnanz: Why would you think I would prefer that? Also, entries are supposed to be categorized. Can you please address the several questions directed to you above? —Justin (koavf)TCM 18:40, 19 May 2017 (UTC)
So if you would rather see entries categorised, surely they can be categorised as the user sees fit, right? What you are doing in deleting the master category is satisfying your own view on the matter, which is flawed. DonnanZ (talk) 18:55, 19 May 2017 (UTC)
@Donnanz: "surely they can be categorised as the user sees fit, right?" No, since categories have to be exclusive: we can't have every conceivable scheme at one time. We can generate database reports or use bots that can create infinite trivial categories ("All towns in India with a 'x' in the name" or "All rivers which are also words in Estonian", etc.) "which is flawed" How is it flawed? No one else seems to think it's flawed and you are not explaining why is this category scheme unique? It's less than useless for you to even write this. You refuse to answer any direct and simple questions for yourself and are acting petulant. Why are you doing this? —Justin (koavf)TCM 19:01, 19 May 2017 (UTC)
Then I will have no choice but not categorise entries, as you are still reverting my edits. That way my time isn't wasted. DonnanZ (talk) 19:05, 19 May 2017 (UTC)
Another solution: add only the "county in a given state" category. The "counties in the US" category can easily be added after that, if it is resolved it should be added, I would think. — Eru·tuon 19:14, 19 May 2017 (UTC)
@donnanz Or you could just categorize by state and not put them in the parent category. Why are you acting like this? —Justin (koavf)TCM 19:20, 19 May 2017 (UTC)
  • Umm. Okay, I will add my voice to begin constructing the requested consensus. I agree that entries should not be put in both the parent and child topic category. There are other ways to generate the list of US county names that you desire. — Eru·tuon 18:46, 19 May 2017 (UTC)

@Donnanz: Okay, now you are being outright disruptive and vandalizing entries. Stop it. —Justin (koavf)TCM 19:28, 19 May 2017 (UTC)

And again. Stop removing categories. —Justin (koavf)TCM 19:33, 19 May 2017 (UTC)
This is just petty. What's the point of this, @Donnanz:? What is the endgame here? —Justin (koavf)TCM 19:34, 19 May 2017 (UTC)
Do you want county entries or don't you? I positively refuse to be dictated to by you. DonnanZ (talk) 19:41, 19 May 2017 (UTC)
@Donnanz: It's not me: it's the entire community and common sense and the usage of categories elsewhere. You know that what you are doing is inappropriate so I'm not going to continue playing this game with you. You're acting childish, so please just quit it. Make the entries with the categories by state and not the parent category but don't actually remove just Indiana as a state category: what could possibly be the reason for that? You're not answering any of these questions, you're not making sense, you're contradicting the community's wishes, and you're just being bullish. Grow up. —Justin (koavf)TCM 19:46, 19 May 2017 (UTC)
Huh? Bullish? That's the pot calling the kettle black. DonnanZ (talk) 19:51, 19 May 2017 (UTC)
@Donnanz: Come on. You're being childish in the above-linked diffs where you remove all "county in a given state" categories, or just one. It looks like retaliation for @Koavf's edits that you find to be overly controlling. But I agree @Koavf is being controlling. It would be best if both of you would refrain from removing any categories until this discussion has finished. Removing or adding the parent category can easily be done by bot, I would imagine. — Eru·tuon 19:58, 19 May 2017 (UTC)
I removed one or two categories to prove a point, but I strongly object to the master category being removed before this discussion is resolved. That is childish, I'm throwing that comment back. DonnanZ (talk) 20:12, 19 May 2017 (UTC)
Fine, @Koavf is also behaving childishly. Is that satisfactorily impartial? — Eru·tuon 20:17, 19 May 2017 (UTC)
Childish or not, I don't normally indulge in edit warring, which is senseless and fruitless. I leave that to other users like Koavf. DonnanZ (talk) 21:31, 19 May 2017 (UTC)
@Donnanz: Honestly, man just get over it. Please just answer the above questions and maybe someone can agree with your proposal. As it stands, no one is agreeing with you and you refuse to justify your actions. —Justin (koavf)TCM 21:33, 19 May 2017 (UTC)
@Erutuon: See here. Adding categories is trivially done and removing them is slightly harder. Do you suggest that this just stop for a few days altogether or should he continue making pages and then if consensus stands as it is, I remove them at that point? —Justin (koavf)TCM 22:21, 19 May 2017 (UTC)
@Koavf: I would suggest that you and DonnanZ both continue adding whatever categories you would like to, and do not remove categories that the other person has added. — Adding categories is trivial? I would think removing them would be quite trivial too. @DTLHS could remove all "counties of the US" categories from entries containing a "counties in a given state" category quite easily with his bot, I would imagine. — Eru·tuon 23:10, 19 May 2017 (UTC)
@Erutuon: Or I could with AWB--that is actually exactly why I got AWB access in the first place--diffusing large topical categories. —Justin (koavf)TCM 23:20, 19 May 2017 (UTC)

Today I learned a lot about pedantry. —Aryamanarora (मुझसे बात करो) 22:07, 19 May 2017 (UTC)

Koavf is still deleting, despite being asked to stop. This is irresponsible behaviour. Can admin intervene please? DonnanZ (talk) 22:33, 19 May 2017 (UTC)
@Donnanz: You were also asked to stop adding to the parent category. I left the 108 entries in Category:Counties_of_the_United_States_of_America as a sign of good faith. Why are you adding more? —Justin (koavf)TCM 22:38, 19 May 2017 (UTC)
All right, I won't use Category:en:Counties of the United States of America for the moment as long as you keep your side of the bargain. Hopefully it can be restored to all county entries soon. DonnanZ (talk) 22:59, 19 May 2017 (UTC)
Actually there should be about 500 entries in the category by now, not 108, but no one can tell now. Of the 254 counties in Texas, only a fifth of them have entries, so there's a lot to be done yet, but due to Koavf's violent opposition the job may never be finished. DonnanZ (talk) 11:07, 20 May 2017 (UTC)
@Donnanz: You could maybe get some consensus for this if you were to actually explain yourself. Barring that, I don't see anyone agreeing with you. —Justin (koavf)TCM 23:07, 19 May 2017 (UTC)
You have obviously forgotten that I explained my reasons on my talk page. I needn't repeat them here. I'm still going to create more entries despite the unsatisfactory situation you created all by yourself. I must be barmy. Don't bother pinging me again, it's bedtime. DonnanZ (talk) 23:14, 19 May 2017 (UTC)
@Donnanz: You did not. E.g. you said this is "unique"--how? You said that my understanding of categories is "flawed"--again, please enlighten me. You claimed on your talk that you just want a list of all U.S. counties in one place; I made one for you and then you didn't like it. I have no problem giving 48/72 hours for more feedback but I'd be very surprised if anyone else thought that it was a good idea to use categories the way you suggest and several others have said that we shouldn't. —Justin (koavf)TCM 23:18, 19 May 2017 (UTC)
That's a difficult question. I feel it is different from other categories, especially when some county names are shared by several states e.g. Washington County in 30 out of 48 states (and 1 parish), on the other hand many names occur in only one state. But I feel that a master category is useful for users, although city names can also be shared by more than one state too, maybe not to the same extent. You will also find that categories for cities don't exist in 22 out of 48 states, so entries for US cities can appear under a state heading as well as in Category:en:Cities and Category:en:Cities in the United States of America. Some so-called US cities are ridiculously small and some counties have very few people, but that's digressing. DonnanZ (talk) 10:10, 20 May 2017 (UTC)
In addition, Koavf didn't do the job properly when he first created the Indiana county category, and entries for other states with the same county name have been left in the state category, which was fine before, and not moved to a county category. At the moment I can't be bothered correcting them. Apathy rules. DonnanZ (talk) 13:35, 20 May 2017 (UTC)
@Donnanz: "Koavf didn't do the job properly... I can't be bothered." and "categories for cities don't exist in 22 out of 48 states". Well, Wiktionary isn't done yet. Things take time. "Entries for US cities can appear under a state heading as well as in Category:en:Cities and Category:en:Cities in the United States of America" They can but should they? The answer from everyone else is "no". The fact that there are county names shared across states is no different than cities, towns, villages, rivers, or any other geographic feature. "I feel that a master category is useful for users"--how is a category somehow better than the appendix which I started for your benefit? In what scenario would a category listing be superior? —Justin (koavf)TCM 15:51, 20 May 2017 (UTC)
I am merely pointing out that there are US cities in Category:en:Cities. Am I allowed to mention that without being jumped on? And I am not going to be led on a noose to an appendix created by guess who. You must be joking, I didn't ask for the damned appendix. DonnanZ (talk) 16:20, 20 May 2017 (UTC)
@Donnanz: ...led by a noose? Honestly, you need to step back several steps and chill out. You said that you wanted a centralized listing of all of the counties and this is it. I'm asking what is the problem with the appendix? I didn't say that you explicitly asked for it, I'm asking how does this not meet what you wanted? —Justin (koavf)TCM 17:09, 20 May 2017 (UTC)
The problem would not arise if too granular categories were not created. I for one would be happy if Category:en:Cities had no subcategories; we are a dictionary and not an encyclopedia and not an atlas. However, I am sure multiple people disagree with me. Maybe some more people would agree that Category:en:Cities in Kyoto (Prefecture) is too granular; I don't know. --Dan Polansky (talk) 19:34, 20 May 2017 (UTC)
As @SemperBlotto said at WT:RFD#McClain County (to be archived at Talk:McClain County), we are a geographical dictionary. We may have entries for a lot of place names based on this recent vote, so it's only natural to have categories for them. Category:en:Cities is too broad. Having Category:en:Cities in Kyoto (Prefecture) is fine, but I'm thinking of proposing cleaning up the names for all place name categories at some point, they are a mess. --Daniel Carrero (talk) 20:20, 20 May 2017 (UTC)
@Daniel Carrero: We need to systematize the "X of Y" categories and "X in Y" categories. No idea why we use both prepositions seemingly randomly. —Justin (koavf)TCM 20:28, 20 May 2017 (UTC)
I didn't think I would ever find myself agreeing with you. I would suggest standardising all such categories to "X in Y" which seems more natural to me. DonnanZ (talk) 20:52, 20 May 2017 (UTC)
@Donnanz: Are you talking to me? You would never agree with me about what...? For what it's worth, I think that "X in Y" is also more sensible as well, especially for a geographic category since a city will quite literally be in a state. —Justin (koavf)TCM 21:12, 20 May 2017 (UTC)
I agree with Daniel though, Category:en:Cities is far too broad and covers the whole world. No, I'm not hoisting myself by my own petard, but suitable subcategories (and in some cases sub-subcategories) already exist for most countries. Maybe a lot of entries were categorised before subcategories were created. DonnanZ (talk) 23:52, 20 May 2017 (UTC)
Strangely there is no Category:en:Counties as a parent category, not that it would need any entries. Similarly for Category:en:Municipalities, yet one exists for Category:en:Towns. DonnanZ (talk) 07:42, 21 May 2017 (UTC)

@Koavf: I have made 4 trial additions to the appendix. Although it looks good, it has to be edited manually. I don't think etymology should be included as this usually appears in the main entry anyway, and makes editing unduly complex for multiple state entries such as Adams County. No tally of entries is provided, and the longer the appendix gets the more unwieldy it will be, and the more difficult it will become to insert new entries. The only way I can find into the appendix is by scrolling past all the subcategories in the parent category, the addition of the parent category to entries is a much easier and far more direct option. All I was basically looking for is the basic information that is provided in a category such as Category:en:Counties of Indiana and Category:en:Counties of the United States of America including a tally of the number of entries. I think I have given the appendix a fair trial, but I don't think it is a good substitute for replacing entries in the parent category. DonnanZ (talk) 09:17, 22 May 2017 (UTC)

@Donnanz: Well, I respect you trying it out and giving it a shot. It's unfortunate that it isn't a workable solution for you. Otherwise, category scanning tools or AWB may be a good fit for essentially generating a report of what you want to see. —Justin (koavf)TCM 15:40, 22 May 2017 (UTC)
@Koavf: I think that with potentially up to 3000 entries (allowing for duplicates) in the appendix it would be unworkable, and take a helluva lot of editing. On the other hand Category:en:Counties of the United States of America manages itself, no editing is required apart from allocating the category. I notice in the intro to the appendix it is intended for former counties also, and some counties are still being renamed in the 21st century (Dade County is an example) so maybe the appendix can be kept for county names that are no longer used. Category:Traditional Scottish Counties (still incomplete) is an example of this. DonnanZ (talk) 19:54, 22 May 2017 (UTC)
@Donnanz: Appendices are definitely very flexible in what they allow, so a list of former U.S. county names would probably fly but would be redundant to an actual article at Wikipedia. (Edit:) But yes, you are correct, that generating the full list in the first place would be substantial editing. —Justin (koavf)TCM 20:20, 22 May 2017 (UTC)
@Koavf: So do we still have an impasse, or can we reach agreement? DonnanZ (talk) 09:10, 24 May 2017 (UTC)
@Donnanz: I respect the fact that you gave the appendix namespace a good faith try and that it's not what you think meets your needs but no, we are not on the same page. I still think that parent topical categories need to be emptied into their children and the consensus above and on my talk is in favor of what I'm suggesting. There are scripts and tools which can generate the kind of lists you would like to see and so I think that's the only way you're going to be able to get a report of the kind that you would like. Thanks for talking this thru with me; I hope we can have constructive dialogue in the future but in this case, I think we have incompatible perspectives and I think that the community is on my side as it were. —Justin (koavf)TCM 17:07, 24 May 2017 (UTC)
@Koavf: Actually, it's a little more complicated. @DTLHS said that the wikitext should only have the "counties in a given state" category, but the parent "counties in the US" category should also be added automatically (when the category-adding function is able to do that). So, you probably have consensus on not adding the parent category in addition to the children categories, but on having the parent category empty, there is disagreement. — Eru·tuon 18:15, 24 May 2017 (UTC)
I don't know if anyone else wants that to happen. It's just an idea of mine. DTLHS (talk) 18:18, 24 May 2017 (UTC)
Well, I like the idea. It could make category-adding simpler. Not sure how it could be done, though. — Eru·tuon 19:28, 24 May 2017 (UTC)
@Erutuon: Adding "Counties in Indiana" and "Counties in the USA" to the same entry is trivially easy. The point is that it's redundant and out of step with how the category structure works in general. I see that DTLHS is suggesting that this is a possibility but still I think my point remains that between solicited comments in this case, comments on similar cases, and general practice here and on other WMF wikis, there's a very broad and long-standing pattern of not putting an article in a parent and child category at the same time (without some clear reasoning and unusual circumstance). —Justin (koavf)TCM 19:45, 24 May 2017 (UTC)
I don't think Koavf has the support of the whole community on this issue. When Dan Polansky states "I for one would be happy if Category:en:Cities had no subcategories" that is disagreement from a different angle, and a stance I can't support. I can see the value of entries in this case (US counties) in both parent category and subcategory. If the parent category can be added automatically that would be great, but I'm not averse to adding the parent category myself. So far I have resisted the temptation of adding both categories to brand new entries, but I won't rule it out. DonnanZ (talk) 20:33, 24 May 2017 (UTC)
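
For illustration only, DTLHS's idea above (the wikitext names only the most specific category, and a template adds ancestors automatically up to a certain point) might be sketched as follows. This is a minimal sketch, not how {{C}} or any existing module works: the `PARENTS` table, the function name `categories_for`, and the `max_levels` parameter are all assumptions made up for this example.

```python
# Hypothetical parent links between topical categories. On the wiki these
# would come from the category tree itself; they are hard-coded here.
PARENTS = {
    "Counties of Indiana": "Counties of the United States of America",
    "Crickets and grasshoppers": "Insects",
}


def categories_for(leaf, max_levels=1):
    """Return the leaf category followed by up to max_levels ancestors."""
    result = [leaf]
    current = leaf
    for _ in range(max_levels):
        parent = PARENTS.get(current)
        if parent is None:
            # Reached the top of the (hypothetical) tree.
            break
        result.append(parent)
        current = parent
    return result
```

With `max_levels=0` an entry lands only in its leaf category, which matches the practice Koavf describes; with `max_levels=1` the parent is added without anyone typing it into the entry, which is the outcome DonnanZ asks for.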

CFI and sum of part numbers

In diff, text about sum of part numbers was added, apparently without a vote:

The consensus of the community is that numbers, numerals, and ordinals over 100 should not be included in the dictionary, unless the number, numeral, or ordinal in question has a separate idiomatic sense that meets the CFI.

Pursuant to "Any substantial or contested changes require a VOTE", this should be made via a vote: it is probably not contested but it is substantial in so far as it impacts policy. --Dan Polansky (talk) 08:41, 20 May 2017 (UTC)

I removed the substantial change from CFI. I created Wiktionary:Votes/pl-2017-05/Numbers, numerals, and ordinals. --Daniel Carrero (talk) 12:02, 20 May 2017 (UTC)

Wiktionary talk:Votes/pl-2017-05/Numbers, numerals, and ordinals --Octahedron80 (talk) 12:26, 20 May 2017 (UTC)

lexicographic approach to code languages (C++, python, etc.) and mathematical logic

Hi, I won't specify too much so that the gist of it can be understood. I'd like to know whether code languages can follow the lexicographic principles applied to natural languages. I only know some basic regex, which nonetheless leads me to think it may be very similar. Similarly, mathematical logic is a formal language (a notion defined in linguistic terms in its entry). --Backinstadiums (talk) 18:37, 20 May 2017 (UTC)

What do you mean by lexicographic paradigm? DTLHS (talk) 19:03, 20 May 2017 (UTC)
Perhaps lexicographic methodology/principles is clearer --Backinstadiums (talk) 19:06, 20 May 2017 (UTC)
I am not sure if this helps but I'll say it anyway: programming languages are formal languages too. --Dixtosa (talk) 19:11, 20 May 2017 (UTC)
@Backinstadiums: I realize that you don't want to get too far in the weeds on this but an example might help. Or maybe what are you planning to do if the answer is yes? —Justin (koavf)TCM 19:12, 20 May 2017 (UTC)
Maybe it would help you to read about how programming languages are created. Usually someone will create a specification, which is a document that describes the language in a formal way. Someone will then implement the language according to this specification. Any ambiguities in this specification become undefined behavior for that language. You can compare this to the way dictionaries work, which is to describe the language as it is created by the people who use it. Some natural languages have language regulators that prescribe words that can officially enter the language, but they still don't create the vocabulary or grammar from scratch. DTLHS (talk) 20:05, 20 May 2017 (UTC)
@DTLHS Thanks for the info. about the specification. --Backinstadiums (talk) 20:22, 20 May 2017 (UTC)

Proposal: Remove text from EL - Variations for languages other than English


Remove the text from WT:EL#Variations for languages other than English.

Entries for terms in other languages should follow the standard format as closely as possible regardless of the language of the word. However, a translation into English should normally be given instead of a definition, including a gloss to indicate which meaning of the English translation is intended. Also, the translations section should be omitted.


  • This is obvious:
    "Entries for terms in other languages should follow the standard format as closely as possible regardless of the language of the word."
  • This describes a universal rule concerning how to format definitions, but it seems to be controversial (we haven't agreed on that universal rule):
    "However, a translation into English should normally be given instead of a definition, including a gloss to indicate which meaning of the English translation is intended. "
  • This is false per Wiktionary:Votes/pl-2016-01/Translations of taxonomic names, which allowed translations sections in Translingual taxonomic entries as well as English entries:
    "Also, the translations section should be omitted."

Procedural note:

  • I don't know if this needs a vote. This is not intended to change any regulations. The text to be removed does nominally have some regulations, but they are not followed anyway per the reasons above.

--Daniel Carrero (talk) 14:42, 21 May 2017 (UTC)

Sauraseni Apabhramsa, Gurjar Apabhramsa, etc.

I'd like to propose Sauraseni Apabhramsa and Gurjar Apabhramsa for codes. Maybe sap-inc and gup-inc, with the Apabhramsa standard being something like XXp-inc. @Aryamanarora If you have any other ones you want added, or even the Jain Prakrits. DerekWinters (talk) 02:30, 22 May 2017 (UTC)

I think Jain Prakrits should be treated as dialects of the Prakrits we have. I think inc-sap and inc-gup would be better. —Aryamanarora (मुझसे बात करो) 12:13, 22 May 2017 (UTC)
Sounds good. @Atitarev can you add inc-sap and inc-gup. Both use Devanagari and are descendants of Sauraseni Prakrit (psu). Sauraseni Apabhramsa is the ancestor of Old Hindi (inc-ohi), and probably some more. And Gurjar Apabhramsa is the ancestor of Old Gujarati (inc-ogu) and Marwari (mwr). DerekWinters (talk) 17:45, 22 May 2017 (UTC)
Could you post your request on Grease pit. I won't get to my desktop for a while.--Anatoli T. (обсудить/вклад) 21:50, 22 May 2017 (UTC)
I can add them. [Edit: done and done. Let me know if there are any errors.]Eru·tuon 22:18, 22 May 2017 (UTC)
Thanks! --Anatoli T. (обсудить/вклад) 11:17, 23 May 2017 (UTC)

@Erutuon Thanks! Could Magadhi Prakrit be added as well? inc-mgd would work. Its ancestor is Ardhamagadhi Prakrit (pka). —Aryamanarora (मुझसे बात करो) 13:03, 23 May 2017 (UTC)

@Aryamanarora: Sure, but I need some more details. Is it also written with Devanagari? And does it have any descendants? — Eru·tuon 17:10, 23 May 2017 (UTC)
@Erutuon: No, it's Brahmi script. The descendants of Ardhamagadhi Prakrit (pka) should be transferred to it. —Aryamanarora (मुझसे बात करो) 18:32, 23 May 2017 (UTC)
@Aryamanarora: Done! It sure has a lot of descendants. — Eru·tuon 19:02, 23 May 2017 (UTC)
@Erutuon: Thanks! —Aryamanarora (मुझसे बात करो) 19:03, 23 May 2017 (UTC)

egregious syllable category errors

Why, for example is "absolute of enfleurage" in both English 1-syllable words and English 3-syllable words, when both are obviously wrong? How many of these errors do we have? Will the error-maker be going through and fixing them? Equinox 17:14, 22 May 2017 (UTC)

Does {{IPA}} provide a way to suppress syllable categories? — Ungoliant (falai) 17:18, 22 May 2017 (UTC)
Not currently. Clearly, that would be a good idea. Have any ideas for what the parameter should be called? — Eru·tuon 19:20, 22 May 2017 (UTC)
Obviously the syllable count is going to be wrong when every word is in its own IPA instance. DTLHS (talk) 17:19, 22 May 2017 (UTC)
Well, I find it very obnoxious for someone to introduce a new technology (auto syllable bot) and know that it will "obviously" create errors, and allow it to do so anyway. Equinox 17:23, 22 May 2017 (UTC)
Not a bot error. How should this class of errors be detected? DTLHS (talk) 17:24, 22 May 2017 (UTC)
The existing entry may be unconventionally formatted but it is not informationally wrong. Once the bot adds those categories, it is wrong. Therefore the bot is at fault. Equinox 17:36, 22 May 2017 (UTC)
There is no bot that adds syllable categories. DTLHS (talk) 17:37, 22 May 2017 (UTC)
The syllables are counted by Module:syllables, which is used by Module:IPA to add the syllable-count categories. I will add a |nocat= parameter to turn off categorization. — Eru·tuon 19:15, 22 May 2017 (UTC)
Actually, that's not a good name. It's misleading, since not all categories would be turned off. Any suggestions for a clearer parameter name? — Eru·tuon 19:17, 22 May 2017 (UTC)
What if Module:syllables never categorized pages with spaces in the title? "absolute of enfleurage" is a phrase rather than a single word, and the syllable categories have "words" in the name. --Daniel Carrero (talk) 19:20, 22 May 2017 (UTC)
I personally would rather have multi-word terms categorized. (And technically, many supposed multi-word terms in English are actually single words syntactically, since they are compounds with some or all of the components separated by spaces.) However, since someone might want to find a list of all things spelled as words with a certain syllable count, perhaps there should be a category for terms with a certain syllable count and inside it a category for words with a certain syllable count. But misleadingly, the categories say word and not term, so perhaps, to be strictly correct, multi-word terms should be removed. — Eru·tuon 22:43, 22 May 2017 (UTC)
|noct=, or |nosylct= if you want to be 100% explicit. Chuck Entz (talk) 03:56, 23 May 2017 (UTC)
The error here, as pointed out, was that someone nonstandardly input each word into its own instance of {{IPA}}. - -sche (discuss) 20:23, 22 May 2017 (UTC)
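
As a rough illustration of the problem discussed above, the sketch below shows how counting syllables separately for each {{IPA}} instance puts a three-word phrase into both the 1-syllable and 3-syllable categories, and how the two proposed remedies would suppress that. It is not the code of Module:syllables or Module:IPA. The vowel-run counter is a crude stand-in, the transcriptions are simplified placeholders, `nosylct` is borrowed from Chuck Entz's suggested parameter name, and `skip_multiword` is a hypothetical name for Daniel Carrero's suggestion.

```python
# Crude stand-in for a syllable counter: one syllable per run of vowel symbols.
VOWELS = set("aeiouæəɑɒɔɛɪʊʌ")
LENGTH_MARK = "ː"


def count_syllables(ipa):
    """Count maximal runs of vowel symbols in an IPA string."""
    count = 0
    in_vowel = False
    for ch in ipa:
        if ch in VOWELS:
            if not in_vowel:
                count += 1
            in_vowel = True
        elif ch == LENGTH_MARK and in_vowel:
            # A length mark extends the current vowel run.
            continue
        else:
            in_vowel = False
    return count


def syllable_categories(title, transcriptions, nosylct=False, skip_multiword=False):
    """Return syllable-count categories, one per distinct count found.

    Each transcription is counted on its own, which is what goes wrong
    when every word of a phrase sits in a separate IPA instance.
    """
    if nosylct:
        return []
    if skip_multiword and " " in title:
        return []
    counts = {count_syllables(t) for t in transcriptions}
    return sorted(f"English {n}-syllable words" for n in counts)
```

Calling `syllable_categories("absolute of enfleurage", ["/ˈæb.sə.luːt/", "/əv/", "/ɑn.flu.ɹɑʒ/"])` with these placeholder transcriptions yields both the 1-syllable and the 3-syllable category, reproducing the error Equinox reported; passing `nosylct=True` or `skip_multiword=True` yields no syllable category at all.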

Changes to {{alter}}

I'd like to propose some changes to {{alter}}. Right now we can only add labels to the end of the alternative form lists by using a blank |4= and |5=, |6= etc., ex. {{alter|grc|παραγίνομαι||ion|post-Classical}} >> παραγίνομαι (paragínomai)Ionic, post-Classical.

I find this confusing, non-standard, and limiting. I propose adding |dN= (dialect) and |qN= (qualifier) parameters.

Any objections, suggestions? Tagging: @Atelaes, Erutuon, Wikitiki89, ObsequiousNewt --Victar (talk) 19:37, 22 May 2017 (UTC)

Alternatively, they could be combined into one parameter, ex. {{alter|grc|παραγίνομαι|q=ion,post-Classical}}. --Victar (talk) 20:45, 22 May 2017 (UTC)
@Victar: So, the purpose of {{alter}} as it has traditionally been conceived is that each instance of alter represents a separate set of alternative forms. So in Ancient Greek, you would have the Ionic forms on one line and the Doric forms on another. As such, I would be opposed to the first set of changes you have proposed. The second set could be fine. —JohnC5 00:42, 23 May 2017 (UTC)
@JohnC5: If people want to keep dialects on separate lines, they'd still be able to do so. The only change is that they would be using actual parameters instead of the unusual and confusing blank |4= system. When they're scraped and parsed by {{desc}} and {{desctree}}, they'll be placed in a single line though. --Victar (talk) 01:07, 23 May 2017 (UTC)
@Victar: So wait, are you arguing for adding new parameters, or adding them and also removing the old parameter format? — Eru·tuon 17:15, 23 May 2017 (UTC)
@Erutuon, JohnC5: Both -- either as two parameters, |dN= and |qN= or as one single parameter. --Victar (talk) 19:56, 28 May 2017 (UTC)
@Victar: So, you are saying that, yes, you want to remove the old parameter format, in which a blank parameter separates the terms from the labels? — Eru·tuon 19:59, 28 May 2017 (UTC)
@Erutuon: Yes. I think I've repeated that 3 times here and just as many on the template talk page. --Victar (talk) 20:52, 28 May 2017 (UTC)
@Victar: Okay. Well, actually, your first post here said something about adding parameters, not about what to do with the old parameters, so I was unsure. But now it is clear. — Eru·tuon 21:01, 28 May 2017 (UTC)
@Erutuon: Glad we're on the same page in understanding now. =) --Victar (talk) 21:54, 28 May 2017 (UTC)
I think there must be a way to supply labels that apply to all the forms. There cannot just be numbered labels that apply to the numbered forms. It seems you are proposing the parameters |d= and |q= (with no number after them) for this purpose, and also that the labels be separated by commas in this parameter: |d=Doric, Boeotian. That will only work if there are no labels that contain commas. Then, for instance, what is currently coded as {{alter|grc|ᾱ̔μός|ᾱ̓μός||epi|dor|tra}}, from the entry ἡμέτερος (hēméteros), would be coded as {{alter|grc|ᾱ̔μός|ᾱ̓μός|d=epi, dor, tra}}. Is that correct? — Eru·tuon 20:11, 28 May 2017 (UTC)
@Erutuon: Yes, as I exampled above, it would be as {{alter|grc|ᾱ̔μός|ᾱ̓μός|d=epi,dor,tra|q1=archaic}} for multiple parameters for ᾱ̔μός (hāmós) (archaic), ᾱ̔μός (hāmós) (Homeric Greek, Doric, Tragic), or perhaps Homeric Greek, Doric, Tragic: ᾱ̔μός (hāmós) (archaic), ᾱ̔μός (hāmós) if people use |d= as opposed to |dN=, keeping them on separate lines. --Victar (talk) 20:52, 28 May 2017 (UTC)
@Victar: Okay. Well, it's quite different to have individual labels in the same parameter separated by commas. That's not currently done in any labeling templates. But I don't know what I think about it yet. It would help if more people would comment. — Eru·tuon 21:05, 28 May 2017 (UTC)
Isn't that what it used to be, with the previous |dial= parameter? I'd say it's even more unconventional using a blank |4= and |5=, |6=. --Victar (talk) 22:10, 28 May 2017 (UTC)
No, I think |dial= was just an alternative to |dial1= when there was only one dialect, and could only contain one label, not a list of labels separated by commas. — Eru·tuon 22:31, 28 May 2017 (UTC)
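
To make the two formats under discussion concrete, here is a small sketch of how the positional arguments after the language code could be split into terms and labels. It is an assumption-laden illustration, not the template's actual implementation: the function names `parse_old` and `parse_new` are invented here, and `d` stands for the proposed comma-separated |d= parameter.

```python
def parse_old(args):
    """Current format: terms, then one blank parameter, then labels.

    `args` is the list of positional arguments after the language code.
    """
    args = list(args)
    if "" in args:
        split = args.index("")
        terms = args[:split]
        labels = [a for a in args[split + 1:] if a]
        return terms, labels
    return args, []


def parse_new(args, d=""):
    """Proposed format: every positional argument is a term, and the
    labels come from a single comma-separated d= value."""
    labels = [label.strip() for label in d.split(",") if label.strip()]
    return list(args), labels
```

Under these assumptions, `parse_old(["ᾱ̔μός", "ᾱ̓μός", "", "epi", "dor", "tra"])` and `parse_new(["ᾱ̔μός", "ᾱ̓μός"], d="epi, dor, tra")` give the same terms and labels. The sketch also shows the caveat Erutuon raises: splitting on commas cannot represent a label that itself contains a comma.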

Start of the 2017 Wikimedia Foundation Funds Dissemination Committee elections

21:05, 23 May 2017 (UTC)

Appendix:Ancient Greek words with English derivatives

What are we to do with this page? It seems unmanageable and useless to me. --Barytonesis (talk) 15:58, 24 May 2017 (UTC)

Probably superfluous in function compared to CAT:English terms derived from Ancient Greek, but we should at least check if we do give all listed English words as Greek-derived. --Tropylium (talk) 16:28, 24 May 2017 (UTC)

Alternative spellings for all parts of speech

Currently there's the entry eenich. Its format isn't exactly ideal; it's just repeating the same thing three times, just for three different parts of speech. It would be useful if this could be said just once, that eenich is an alternative spelling of ênich for all and any part of speech. For Chinese, we have entries like , which contain no POS information and just point to the traditional spelling. What's going on here is similar, ênich is a normalised spelling (following WT:ADUM) while eenich is an unnormalised spelling found in manuscripts. —CodeCat 19:40, 24 May 2017 (UTC)

One problem with saying that something applies to "all" of something else is that it isn't future-proof, because the set of all things can grow. Maybe a new PoS will appear some day (not for Middle Dutch, admittedly) and won't take the alt spelling. Equinox 19:51, 24 May 2017 (UTC)
It's as future-proof as the Chinese entries are. It will only ever change if Wiktionary changes its rules, whether to have Chinese entries at their simplified spelling, or to normalise Middle Dutch in any other way. With the rules as they are now, will always be an alternative spelling of , and eenich will always be an alternative spelling of enich. There are other old languages that use normalised spellings; Old Norse is a notable example. They could also benefit from a solution. —CodeCat 19:56, 24 May 2017 (UTC)

Disable usexes and quotes for reconstructions

Our usual French IP vandal has been adding usage examples for Proto-Germanic terms on occasion. I don't think this is desirable. Can Module:usex be edited so that it throws an error whenever it's used in the Reconstruction: namespace? —CodeCat 14:01, 25 May 2017 (UTC)

@CodeCat: Sounds like a good idea. I've added an error that shows up in two cases: if the language is reconstructed, and if the namespace is Reconstruction. — Eru·tuon 18:33, 25 May 2017 (UTC)
I disabled the error after @Chuck Entz posted on my talk page. There needs to be more discussion on this. — Eru·tuon 04:19, 27 May 2017 (UTC)
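For readers following along, the check described above (an error when the language is reconstructed, or when the page is in the Reconstruction namespace) could look roughly like the sketch below. It is written in Python purely for illustration and is not the actual Module:usex code. The function names are hypothetical, and treating any code ending in "-pro" as a reconstructed language is a simplifying assumption.

```python
RECONSTRUCTION_NAMESPACE = "Reconstruction"


def is_reconstructed_language(lang_code: str) -> bool:
    """Assumption for this sketch: proto-language codes end in '-pro'."""
    return lang_code.endswith("-pro")


def check_usex_allowed(lang_code: str, namespace: str) -> None:
    """Raise an error in either of the two cases mentioned in the thread."""
    if is_reconstructed_language(lang_code):
        raise ValueError(
            f"usage examples are not allowed for reconstructed language '{lang_code}'"
        )
    if namespace == RECONSTRUCTION_NAMESPACE:
        raise ValueError(
            "usage examples are not allowed in the Reconstruction namespace"
        )
```

Under these assumptions, a call such as check_usex_allowed("en", "") passes silently, while check_usex_allowed("gem-pro", "Reconstruction") raises an error.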

Admin Rights Removals

As WF points out, the vote has ended and we will now begin removing the admin role from folks who have not used admin tools within the past five years.

Here is a list of the current admins who fall into that category:

An interesting case: Paul G has not used the tools in five years, but he is also a 'crat. The vote was specific to admins, so we ought to have another vote for 'crat and checkuser rights as well.

Thanks, - [The]DaveRoss 16:01, 25 May 2017 (UTC)

I have created the vote you suggested: Wiktionary:Votes/2017-05/Removing bureaucrat and checkuser rights for inactivity. Feedback is welcome. —Μετάknowledgediscuss/deeds 03:34, 26 May 2017 (UTC)
What's even different about 'crats from admins anyway? PseudoSkull (talk) 04:56, 26 May 2017 (UTC)
See WT:B. —Μετάknowledgediscuss/deeds 05:08, 26 May 2017 (UTC)
Paul G has been active on Facebook within the last few weeks. Somebody might like to ask him. (He is the Paul Giaccone with a beard) SemperBlotto (talk) 05:11, 26 May 2017 (UTC)
I don't see a need to seek him out on social media. If he wanted to retain his rights, he'd notice all the pings and the message on his talk-page for a start. —Μετάknowledgediscuss/deeds 05:19, 26 May 2017 (UTC)
I don't think it is necessary, but I also think it would be nice to reach out to him, perhaps we will regain a good contributor in the process. - [The]DaveRoss 12:11, 26 May 2017 (UTC)

rfv, rfd and rfc -- language categories populated

@Chuck Entz, Erutuon, Justinrleung:

About {{rfv}}, {{rfd}} and {{rfc}}.

I edited the templates and populated the categories below.

I added the language code where needed in all instances of {{rfv}}, {{rfd}} and {{rfc}} in entries, reconstruction and citation pages. I also made the langcode mandatory in these namespaces, or else the page will display a module error (but the langcode is not mandatory when you add one of these requests in a template, category, or appendix).

Now, you can use the 1st parameter instead of "lang=" in all these templates. In other words, you can type: {{rfv|en|write reason here}}. (You can still use "lang=" and type {{rfv|lang=en|write reason here}} but this is obviously longer. I removed the "lang=" from the rfv/rfd/rfc in all pages to discourage using that parameter.)
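The parameter rules described above can be sketched as follows. This is only an illustrative model in Python, not the real template or module code. The function and exception names are hypothetical, the main (entry) namespace is represented by an empty string as an assumption, and preferring the first parameter over "lang=" when both are given is also an assumption.

```python
from typing import Optional

# Namespaces where the language code is mandatory, per the description above.
# Representing the main (entry) namespace as "" is an assumption of this sketch.
LANG_REQUIRED_NAMESPACES = {"", "Reconstruction", "Citations"}


class MissingLanguageCode(ValueError):
    """Stands in for the module error shown when the code is missing."""


def resolve_language(
    namespace: str,
    first_param: Optional[str] = None,
    lang_param: Optional[str] = None,
) -> Optional[str]:
    """Return the language code from the 1st parameter or from 'lang='."""
    # Assumption: the 1st parameter wins if both are supplied.
    code = first_param or lang_param
    if code:
        return code
    if namespace in LANG_REQUIRED_NAMESPACES:
        raise MissingLanguageCode(
            "language code is required in this namespace"
        )
    # Templates, categories, appendices, etc.: the code stays optional.
    return None
```

In this model, resolve_language("", first_param="en") and resolve_language("", lang_param="en") both give "en", a request placed in a category without any code returns None, and a request in an entry without any code raises the error.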

Feel free to discuss or propose changes to that system.

(I also deleted Category:Language code missing/rfv, Category:Language code missing/rfd and Category:Language code missing/rfc for obvious reasons.)

--Daniel Carrero (talk) 23:48, 25 May 2017 (UTC)

IMO, a language code should not be mandatory for RFC or RFD, since those pages (WT:RFC, WT:RFD) are not split by language. - -sche (discuss) 20:46, 26 May 2017 (UTC)
Reply to @-sche: Yes, but the requests categories are split by language. Now, I'm OK with discussing as to whether these categories are helpful (I think they are!), but I didn't add the langcodes just for the benefit of linking to the discussion pages. --Daniel Carrero (talk) 22:18, 26 May 2017 (UTC)
I'm also not very happy with this change. It's totally unnecessary. --WikiTiki89 20:56, 26 May 2017 (UTC)
I too find this pointless. The addition of language codes was only for RFV, and only necessary for the splitting of the page. The categories you made are extra clutter. —Μετάknowledgediscuss/deeds 21:07, 26 May 2017 (UTC)
Reply to @Metaknowledge: What addition of language codes was only for RFV? {{rfc}} accepts langcodes for categorization purposes since 2008 (diff), {{rfd}} since 2011 (diff) and {{rfv}} since 2012 (diff). --Daniel Carrero (talk) 22:11, 26 May 2017 (UTC)
@Daniel Carrero: Why is this so complicated? I didn't say the addition of parameters for language codes, I said the addition of language codes. —Μετάknowledgediscuss/deeds 03:28, 28 May 2017 (UTC)
My reply above was meant to address basically your comment, which seems misguided: "The categories you made are extra clutter." My point is, there were categories before so they can't be extra clutter. But of course we can delete the categories if people want (even though I think the categories are great and should be kept, but what do I know). Many entries already had language codes before I added the rest of language codes. If we were not meant to add the language codes, then the "language code missing" categories themselves (Category:Language code missing/rfd, Category:Language code missing/rfc and maybe Category:Language code missing/rfv) were pointless by definition. --Daniel Carrero (talk) 03:33, 28 May 2017 (UTC)
Apart from the need to add the langcode (which most if not all request templates have: {{rfi}}, {{rfe}}, {{rfp}}, etc.) is it a bad thing to know what exactly are the French or Chinese or Spanish entries nominated for deletion? The categories can be deleted if unwanted of course. --Daniel Carrero (talk) 21:13, 26 May 2017 (UTC)
I forgot to mention: before I did anything, when you added the langcode in these templates, the entries were categorized in "attention" categories. So, I didn't introduce the concept of per-language categorization of entries with requests. I just moved them from Category:English terms needing attention (old category, deleted by vote) to "Category:Requests for (request type) in English entries", which is clearer. When the entry didn't have the language code, it was sent to "language code missing" categories like Category:Language code missing/rfv. So I added the code everywhere, thereby cleaning up the cleanup categories. --Daniel Carrero (talk) 21:23, 26 May 2017 (UTC)
The worst part of this is that it's no longer possible to add something like {{rfc|This is a mess}}, regardless of the namespace. We should not make reporting problems more complicated than it is, and we shouldn't be changing an important part of the behavior of a basic template without discussion, especially since it's been this way for over a decade. I'm getting tired of this "I'mgoingtochangetheorderofparametersineverytemplateintheprojectunlesssomeonehasanobjectioninthenext30seconds. OK?   Done" way of doing things. Chuck Entz (talk) 22:27, 26 May 2017 (UTC)
@Chuck Entz I'll remember that next time. Admittedly I never discussed before what we are discussing now. I apologize for that and take responsibility. (naturally, the recent vote about requests doesn't count -- it listed a lot of request categories but it didn't say anything about deletion/cleanup/verification)
Still, let's discuss this idea on its merits. Most {{rfc}} requests are in the main namespace (530+ entries) as opposed to other namespaces (53 pages + categories). Most of the request templates (not to mention templates overall) already use 1= as language code. Arguably, having consistency makes things less complicated, not more, because less often you have to remember the rules for each separate template. This assumes that we want the request templates to have similar results, including fully populated categories for entries of every language. (which should be automatic once Wiktionary:Votes/2015-12/Install Extension:Variables is implemented) --Daniel Carrero (talk) 23:46, 26 May 2017 (UTC)
You forgot to fix the error at Category:Design, which is a good illustration of why rfc isn't like rfv. Chuck Entz (talk) 00:08, 27 May 2017 (UTC)
No, it isn't a good illustration of that. rfc is exactly like rfv in entries (I'm talking about the design: they are two of the many request templates that allow a parameter for a language code and another for a reason). I fixed that category. --Daniel Carrero (talk) 03:52, 28 May 2017 (UTC)
rfc is used for a wide variety of problems that can't be assigned a language code: invalid or missing headers and headword templates, misplaced content, language sections in the wrong order, mislabeled or jumbled together, to name a few. It's also used in appendices, categories, templates and other namespaces. Yes, there's overlap, but they're not the same.
Of course you're talking about the design (if all you've got is a hammer, everything looks like a nail), but someone who's been using the same template for years isn't thinking about design, they're just going to be annoyed that they're getting a module error because someone redesigned some other templates and someone else wants to make everything a matched set. Every change comes with a cost, and you're not qualified to weigh that cost: coming up with ideas is fun, but having to adjust to changes that you didn't ask for usually isn't. For some reason, you often remind me of this scene from Jabberwocky... Chuck Entz (talk) 06:35, 28 May 2017 (UTC)
I'm sorry that adjusting to changes like this is (as you put it) not fun. I'll remember to discuss proposed changes in the future. The rfc/rfd/rfv changes can be undone if people want. (I realize I'm repeating myself.) The templates can be edited to use only "lang=" again as the language, even though that's longer and not exactly my personal preference. --Daniel Carrero (talk) 07:18, 28 May 2017 (UTC)
I think these new categories are useful. They allow editors to look for entries that need verification or cleanup, or are being deleted, in a language that they actually know something about, rather than having a bunch of entries of indeterminate language all jumbled together in the same category, some of which the editor will be able to help with and some that he or she will not. — Eru·tuon 03:41, 28 May 2017 (UTC)
Thank you. --Daniel Carrero (talk) 03:47, 28 May 2017 (UTC)

shorthand writing

Hi, I've just come across the method of shorthand writing, so I do not know much about it. I'd like to ask you whether you'd heard of it, and how it could be added to Wiktionary. --Backinstadiums (talk) 10:24, 26 May 2017 (UTC)

@Backinstadiums: There is only one form of shorthand in Unicode so the first step would be adding the individual characters like 𛰀 before adding any term written in the script. We had a similar discussion regarding Morse code as well. —Justin (koavf)TCM 16:48, 26 May 2017 (UTC)
@Koavf Thanks. I do not know the protocol to follow. Should it become a formal petition? --Backinstadiums (talk) 17:33, 26 May 2017 (UTC)
@Backinstadiums: Good question. Entries here need attestation and verification to show that they actually exist. What that means varies widely depending on the language, the term, etc. The guiding help page for this is Wiktionary:Criteria for inclusion which explains that individual characters are suitable for inclusion. I would find it unlikely that shorthand versions of words in standard scripts would be included, tho. —Justin (koavf)TCM 17:46, 26 May 2017 (UTC)
I am sceptical that we want entries in shorthand, other than entries for each shorthand character. Compare Deseret script; some discussion, which also mentions shorthand, is at Template talk:deseret. - -sche (discuss) 20:44, 26 May 2017 (UTC)
@Backinstadiums, -sche: Altho individual words which have been shortened to one character would be a good addition. E.g. & and which are individual characters representing and. I'm pretty ignorant of shorthand but I'm certain that some common words have been truncated to a single character. —Justin (koavf)TCM 21:18, 26 May 2017 (UTC)
Until approximately February 2014 we had many entries for words beginning in "ab" which had the header "Shorthand". One example is the entry for able, which had the header until this edit. The contribution had been made by one User:Wikigregg. (See his contributions.) I don't know on what grounds the material was deleted. I thought the entries were a great example of an abortive effort. DCDuring (talk) 00:59, 27 May 2017 (UTC)
@DCDuring Thanks for the info. But shouldn't it be just the other way round, 'a' the one standing for 'able'? Also, what defines an 'abortive effort'? --Backinstadiums (talk) 07:45, 27 May 2017 (UTC)
@DCDuring: We should 100% have transcodings of Latin characters into Braille, fingerspellings, Morse Code, Semaphore, and shorthand. Why not? It could be done by machine and is valuable. —Justin (koavf)TCM 08:30, 27 May 2017 (UTC)
@Backinstadiums, Koavf For encoding, including the various transcription schemes under a heading in a table would work. But so would an appendix that had the characters or symbols. For decoding, the search engine would find the coded word on the right page, probably placing it at the top of the "failed-search" page list. DCDuring (talk) 12:57, 27 May 2017 (UTC)
@DCDuring 100% agreed. A table like "{{subst:BASEPAGENAME}} in other encodings" that generates Braille, fingerspelling, Morse Code, semaphore, and shorthand would be useful. Are there any other variations that you can think of to add? —Justin (koavf)TCM 19:19, 27 May 2017 (UTC)

@DCDuring I am afraid I have no idea about such 'encoding', so I cannot be of much help here. --Backinstadiums (talk) 13:49, 27 May 2017 (UTC)

Too bad there's no convenient resource for discovering the possible intended meaning of a word whose meaning one doesn't know. If only someone would work on that.
In this case, "encoding" means producing the shorthand equivalent of an English word. "Decoding" means finding the English word that has the meaning of the shorthand. DCDuring (talk) 13:57, 27 May 2017 (UTC)

@DCDuring I meant its effective implementation in wiktionary, using the proper programming language. --Backinstadiums (talk) 14:53, 27 May 2017 (UTC)

I see. The programming language in use here for more-or-less real time applications is Lua. Automatic transliteration efforts have been undertaken with varying degrees of success and acceptance. I'm not the one to ask. DCDuring (talk) 18:05, 27 May 2017 (UTC)

Language codes for Bourbonnais and Poitevin

Hey, I was wondering if we could get some language codes for Bourbonnais (Oïl) and Poitevin (Poitevin-Saintongeais), perhaps roa-bbn and roa-poi (or roa-psg)? --Victar (talk) 19:48, 26 May 2017 (UTC)

The map of Oïl Languages in the French Wiktionary.
Another map.
Without intending to bring out "the pitchforks" as you put it in another thread, let's think about how these languages of France, and the others which are on these maps and mentioned at Wiktionary talk:About Bourguignon, would best be grouped. We could add ~20 codes, one for each variety on the maps, but is that best, or should some be grouped?
Our treatment of mutually intelligible lects is highly inconsistent, almost as badly so as the ISO's/SIL's — except that while they tend to split African, Asian and Oceanic lects even with very high mutual intelligibility (though with some notable exceptions like Australian lects with sub-45% intelligibility which share a code) and to not split European languages (again with some exceptions, like the entirely mutually intelligible varieties of Serbo-Croatian, Norwegian, or Dutch Low Saxon), we have often done the opposite, merging Fula etc but having far more codes for e.g. Romance and Germanic lects than the ISO. Like Picard, Walloon, Norman, Burgundian, Gallo, but not many other oïl languages (yet) like the ones you name; Rhine Franconian, Pennsylvania German, and Volga German; Asturian and Leonese; Bavarian, Mocheno, and Gottsheerisch; Norwegian, Norwegian, and slightly-differently-spelled Norwegian; Luxembourgish and Transylvanian Saxon; etc.
Should we keep doing that, and grant codes to most of the lects on those maps? I'm not actually opposed to that... although my understanding is that Poitevin should perhaps be grouped with Saintongeais, either as "Poitevin-Saintongeais", or just "Poitevin" with "Saintongeais" listed as an alt/subsumed name.
- -sche (discuss) 20:38, 26 May 2017 (UTC)
Heh, I knew it wasn't going to be a simple request. =) You're right, the inconsistency is pretty atrocious. We probably need a panel to go through all the ones we have, removing some and adding others. I think we can agree though that these two are more than mere dialects of French. I'm in total agreement on Poitevin-Saintongeais, which is actually how it's done on the French Wiktionary. This page might also be a good guide. --Victar (talk) 21:20, 26 May 2017 (UTC)
Ha, I'm sorry for broadening the thread so much :-p by bringing up Fula, Bavarian, etc. Let's see if we can decide which oïl languages to include, though, so we don't keep adding them piecemeal. We have codes for
  • Walloon,
  • Picard,
  • Norman,
  • Gallo,
  • and Bourguignon(-Morvandiau).
  • We can add a code for Poitevin(-Saintongeais).
  • How different is Bourbonnais from Berrichon? Some references group them, some don't.
  • How do Bourbonnais and Berrichon compare to Orléanais and/or Tourangeau?
  • Can Angevin be considered part of Gallo, or does it need its own code? (Some references group them, some don't.) What about Mayennais and Sarthois (fr.WP considers them part of Angevin)?
  • Do Champenois, Lorrain, and Franc-Comtois all need codes?
- -sche (discuss) 03:35, 27 May 2017 (UTC)
I have italicized the ones which have been added. - -sche (discuss) 20:17, 30 May 2017 (UTC)
No worries, might as well get them over with. I'm really inclined to go with the list the French Wiktionary team came up with. If some 11 French users agree to it, that's probably better than what you and I can determine on our own. They had a whole discussion on it and everything. --Victar (talk) 04:31, 27 May 2017 (UTC)
Well, I hope(d) other people might join this discussion, too! :p The French Wiktionary's list is a good starting point, although in general they tend to be very splittist (splitting e.g. Alsatian German from Alemannic German, Hoanya from Papora, etc). - -sche (discuss) 04:48, 27 May 2017 (UTC)
OK, pinging some people then. @JohnC5, Benwing, Chuck Entz, Romanophile, Wikitiki89 --Victar (talk) 12:55, 27 May 2017 (UTC)
Sure, we could have those codes. — (((Romanophile))) (contributions) 21:17, 27 May 2017 (UTC)
I'll admit to knowing little about the dialectology of modern languages. —JohnC5 01:28, 29 May 2017 (UTC)
I've added "roa-poi" for "Poitevin-Saintongeais", a name which is long and unwieldy but is clear (and is also what fr.Wikt uses); the canonical name could be shortened to "Poitevin" if desired.
If there are no more comments within a day or two, I'll add Bourbonnais next. (Do we need a code for "Arverno-bourbonnais"?) - -sche (discuss) 03:29, 29 May 2017 (UTC)
Cool. I could use one for Orléanais as well. --Victar (talk) 04:06, 29 May 2017 (UTC)

Hi! There is an ongoing project named Langues de France that works on this matter by contacting local groups of speakers, linguists and lexicographers on local languages (mainly Alsatian and Occitan at this point), and the Wikimedia France chapter has organized a dozen workshops with speakers all over France. Let's summon @Lyokoï and @Xenophôn; they may be able to help you better than I can. Noé 14:39, 30 May 2017 (UTC)

Hi all! I'm the author of this page! I will try to answer your questions (and sorry for my bad English):
  • and Bourguignon(-Morvandiau): Morvandiau is the most widely spoken dialect of the Bourguignon language. It is located in the center of the Bourguignon language region. Its speakers have a lot of political weight, so they have pushed the French government to add their dialect's name to the language name (as if their dialect were a different language).
  • We can add a code for Poitevin(-Saintongeais).: Yes, Poitevin and Saintongeais are the two largest dialects of the same language, Poitevin-Saintongeais. There are many sub-dialects (which I have not yet listed), and the language doesn't have a name of its own, so linguists and politicians simply merged the two dialect names. You should name the language Poitevin-Saintongeais.
  • How different is Bourbonnais from Berrichon? Some references group them, some don't.: Yeah, it's a difficult part. But Bourbonnais is a historical region with a strong identity. I chose to separate it from Berrichon because a new person who wants to contribute to the Wiktionnaire knows their language as patois, or maybe patois of Bourbonnais. They will never make any link with the Berrichon language (or the Berry region).
  • Can Angevin be considered part of Gallo, or does it need its own code? (Some references group them, some don't.): The references that group Gallo and Angevin are old. Now we make a distinction (see Henriette Walter and Marie-Rose Simoni-Aurembou). It needs its own code.
  • What about Mayennais and Sarthois (fr.WP considers them part of Angevin)?: Very complicated question. Some references group Mayennais and Sarthois under Manniot, some don't, and some just say Parler centraux de l'ouest (West Central Languages). I made the same choice as for Bourbonnais. Even if there is not much left, there are still people who speak these languages, so I chose to collect them under the name a speaker would most likely use.
  • Do Champenois, Lorrain, and Franc-Comtois all need codes?: Yes! Really, yes! Franc-Comtois already has its code (it's fc on the French Wiktionary, but I do not know where it comes from), and Lorrain and Champenois are well known to linguists.
If you have other questions, don't forget to ping me! :D --Lyokoï (talk) 17:19, 30 May 2017 (UTC)
@Lyokoï, Victar OK, I have added codes for Angevin (roa-ang), Champenois (roa-cha), Lorrain (roa-lor), and Franc-Comtois (roa-fcm, intentionally not "(roa-)frc" because that's too close to Cajun French's code), Orléanais (roa-orl) and Tourangeau (roa-tou).
Regarding Bourbonnais and Berrichon, Mayennais and Sarthois, and Percheron: I wouldn't presume to know more about the languages of France than the French do, but I get the impression that fr.Wikt has assigned codes on the basis of whether the speakers would want separate labels (and fr.Wikt is the wiki the speakers would probably choose to add words to), whereas en.Wikt, as a basic difference, is more interested in how mutually intelligible the lects are, and perhaps especially in how many words are spelled the same. If Bourbonnais and Berrichon have a lot of overlap and we would have a lot of pages with identical Bourbonnais and Berrichon sections, it's probably best to have one code for "Bourbonnais-Berrichon"; whereas, if all the words are spelled differently, in addition to the arguments above for keeping them separate, then we might as well have separate codes, although again, it depends on how mutually intelligible they are.
What is to be done about Francilien/Francien? How different is it (supposedly) from French? On en.WP and fr.WP, I see not only that it seems very similar, but also that there is debate over whether the lect existed as such at all.
- -sche (discuss) 21:08, 30 May 2017 (UTC)
Natural regions of France; each has its own identity, and very often its own language.
@-sche The big problem is that the oïl region is technically one big dialect continuum. Most parts of the region are mutually intelligible, but differences accumulate as one moves away from the reference point. There are lots of isoglosses, but few bundles of them. The second problem is that there are no, or too few, studies on the distinction between the different linguistic regions within the oïl region. Personally, I did not find any that could be used to simply draw a map. Since we cannot rely on pronunciation, we can look for differences in vocabulary and syntax (though that is difficult), but ultimately it is mostly the traditional divisions of regional identity that give us really viable information. The easily identifiable units are the natural regions (map); below them are the communes. If you search the documentation and dictionaries of oïl languages, you find many scales:
  • The municipalities : Le patois de (the patois of) <name of the commune>.
  • The valley/canton/land : Le patois de (the patois of) <name of the valley/canton/land>
  • The natural region : Le patois de (the patois of) <name of the natural region>
  • An assembly of natural regions : Le patois de (the patois of) <name of the natural region>, <name of the natural region>, <name of the natural region>, <name of the natural region>,... (I have a dictionary with 5 natural regions in its title).
So, when you need to describe oïl languages, you can't find purely linguistic information, because of these scales, because of history (regional languages in France have been very poorly regarded by politicians for four centuries), because of the disinterest of linguists, because of the disinterest of speakers, and more. I try to be precise with each reference (dictionaries, lexicons, glossaries) by creating localization models. See trigeasse with the Chef-Boutoune model. --Lyokoï (talk) 07:03, 31 May 2017 (UTC)
I've been looking at texts in and about Bourbonnais and Berrichon, and there seems to be much overlap, and as much variation within the dialects as between them. In Duchon's dictionary of Bourbonnais, the Berrichon forms are frequently identical to the Moulins forms. In keeping with en.Wikt's general efforts to not split up units (of mutually intelligible lects) into overly many codes, seen also in e.g. how the regional lects of Germany (Rhine Franconian, etc) have been handled, I have added a code for Bourbonnais-Berrichon (roa-bbn). - -sche (discuss) 20:09, 5 June 2017 (UTC)
Thanks, -sche. I'll take what I can get. --Victar (talk) 20:24, 5 June 2017 (UTC)
@-sche, Victar fr.Wikt doesn't have this "not split up units" rule (though I think it needs one), but I understand this position. Splitting Berrichon-Bourbonnais and not splitting it both make sense, but with different arguments. I am delighted to have been able to help you choose. --Lyokoï (talk) 07:58, 7 June 2017 (UTC)

@-sche, Victar, Lyokoï: I just found a very interesting article by Haspelmath, the person in charge of Glottolog, about the problem of naming languages. It is published in a free open-access journal, Language Documentation & Conservation. Another article, by Irina Wagner, on new technologies may also be interesting for us. And there is an article by Gooskens & Schneider on Testing mutual intelligibility between closely related languages in an oral society (but mostly written for fieldworkers). I'll try to do a review for Actualités at some point, but if you want to write it first, you are welcome! I think summarizing those articles for Wiktionarians can contribute to our understanding of these questions. Noé 07:13, 7 June 2017 (UTC)

Interesting! Thanks for the link! Lyokoï (talk) 07:58, 7 June 2017 (UTC)
Hah, Haspelmath's piece is remarkably contortionist. "You should never change a language's name! But if you do while writing a big book, we respect that. But you shouldn't have done it. But everyone should follow you in doing it." The claim that languages can't have two names in one language, and even the claim that cities can't, is amusing. Tell it to the people who speak Irish or Gaelic in Derry or Londonderry, to start with! - -sche (discuss) 21:51, 9 June 2017 (UTC)
To be fair, his focus is only on canonical names for languages to be used in scholarly contexts, and he says up front that he's proposing a prescriptive standard. I also find his contortions amusing, though. Basically, he's saying: "These prescriptive, cut-and-dried rules are the only way you should do things, but abandon them if authorities on the languages decide otherwise." Chuck Entz (talk) 04:47, 10 June 2017 (UTC)


@-sche, Lyokoï, Noé: So, somewhat related, what about Occitan? The varieties of Occitan are really just as diverse, ex. hairo#Descendants. In all truth, Occitan is just a language group. --Victar (talk) 15:57, 14 June 2017 (UTC)

@Victar Less so. At present the consensus is for a single domain, although independentists are struggling for their varieties to be considered separately. But it is a rather sensitive subject. On the Wiktionnaire, we followed the most widely represented recommendation: everything is in one language. Lyokoï (talk) 16:26, 14 June 2017 (UTC)
@Lyokoï: I'm sure you know as well as I do that calling Gascon and Provencal one language is like calling Norman and Walloon one and the same! It strikes me as very inconsistent. --Victar (talk) 17:15, 14 June 2017 (UTC)
@Victar: I'm not a linguist. You have reached the limit of my knowledge on this subject. You have the right to do that, but the Occitanists on fr:WT prefer to follow the consensus. I respect their choice. --Lyokoï (talk) 08:36, 15 June 2017 (UTC)
Victar, you wrote "It strikes me as very inconsistent." Yes, perceptions of language borders are inconsistent and vague. The frontiers of languages move with time and depend heavily on political authorities. We can't apply an easy pattern to every situation as if it were pure data; it is not. Sorry for that. Noé 08:28, 22 June 2017 (UTC)
Much of the literature on this language speaks of it as a language, rather than as many languages, so I would leave it unsplit. See also my comments in the subsection below. - -sche (discuss) 18:18, 22 June 2017 (UTC)


@-sche, Lyokoï, Noé: What about Franco-Provençal? For example: Late Latin cripia. --Victar (talk) 01:57, 22 June 2017 (UTC)

@-sche, Victar, Noé : This is wrong; Francoprovençal is a dialect continuum. We must be precise down to the commune. Because of the variation, which is very random in the pronunciation and vocabulary concerned, the most neutral way to describe Francoprovençal is to give the list of each village concerned by the form described, with differences in pronunciation if there are any.
And as a bonus, know that a large part of the language has no orthographic standard... So for each form described, we need to say where it can be found, in which orthography, and in which text. Without sources, Francoprovençal words (like those of other regional languages of France) have no value. --Lyokoï (talk) 09:31, 22 June 2017 (UTC)
@Lyokoï: You're obviously very opinionated on the subject, but the way it is listed is neither wrong nor lacking in value. There is no set standard in displaying Franco-Provençal entries and listing each town they are from, in many cases, is simply not an option due to the sources they're taken from.
My question though, as with French, is should each major branch of Franco-Provençal have a language code, ex. roa-sav for Savoyard? --Victar (talk) 11:57, 22 June 2017 (UTC)
This is one of a number of languages where every word needs to be tagged by dialect (as broadly or precisely as sources allow), but I would follow the ISO in leaving them all under one code, as one language; much of the literature speaks of a Franco-Provençal language with many dialects. If the ISO had granted a single code for "Oil language", rather than granting codes to some but not all of the major varieties, there would've been a case for doing the same thing with all the Oil languages, frankly! - -sche (discuss) 18:14, 22 June 2017 (UTC)
If we left ourselves purely to the mercy of the ISO, we'd be pretty hamstrung! Thankfully we don't, and already diverge from them considerably. We do also create language codes for families, so if we want to say these are dialect families, I think it stands to reason to give them language codes. Thoughts? --Victar (talk) 19:18, 22 June 2017 (UTC)

How to attest a word on Wiktionary without even trying

If there are only 2 citations of a term from like 5 years ago or more, just go on Google Groups and start a new topic where you use the word; then it's attested.

In the long-term, if you want to make up a word, you can also just make a Google Groups topic now using the word, then make your friend do that same thing in 2 years just using it in a different sentence/context, and do that again in 2 more years. You've just made a Wiktionary-attestable word! Even if that word is complete nonsense; it doesn't matter. Still attested, as long as it can classify as a POS and is used in English sentences. Hmmm... PseudoSkull (talk) 03:21, 27 May 2017 (UTC)

Time to take a look at RFV and attest some words. PseudoSkull (talk) 03:22, 27 May 2017 (UTC)
@PseudoSkull what does RFV stand for? --Backinstadiums (talk) 15:10, 27 May 2017 (UTC)
What's the point you're trying to make by asking that? requests for verification. PseudoSkull (talk) 15:52, 27 May 2017 (UTC)
For WT:LDLs you don't even need three citations. You just need to use a word on usenet or google groups, and then it's attested.
The only way to prevent such "attestations" is to exclude usenet and google groups citations for attesting a word. Searching here for usenet or google groups or google group gives some terms which then might have to be deleted.
(@Backinstadiums: Maybe see also WT:RFV.) - 16:14, 27 May 2017 (UTC)
@PseudoSkull I looked it up and it didn't make sense. Therefore, should the sense of 'requests for verification' be added? --Backinstadiums (talk) 16:35, 27 May 2017 (UTC)
PseudoSkull is referring to Wiktionary:Requests for verification, the pages we use to verify whether terms or senses exist. The use of the abbreviation "RFV" for these pages is limited to Wiktionary, and appears not to meet our criteria for inclusion (see a past discussion), so it probably shouldn't be added to the entry. —Granger (talk · contribs) 16:40, 27 May 2017 (UTC)

Restoring context template for page history legibility

Can we please restore {{context}} and {{cx}} to keep page histories legible? {{context}} was very widely used at one point. The further use of the template can be prevented by placing entries using it into Category:Pages using deprecated templates, which seems to work well with {{infl}}. The template "failed" RFDO but there were almost no people: Wiktionary:Requests_for_deletion/Others#Template:context: there's the nominator dMoberg, and Romanophile and that's it; the further supporter is probably the deleting editor, Μeτaknowledge. --Dan Polansky (talk) 07:43, 27 May 2017 (UTC)

I'd favor restoration. DCDuring (talk) 12:48, 27 May 2017 (UTC)
@Erutuon Is it possible to create a CSS class that shows an error if the template is used on a live entry but not if it is seen in an old version of the page (similar to the error message that appears in previews for invalid IPA characters)? DTLHS (talk) 17:40, 27 May 2017 (UTC)
@DTLHS: If there's a class that only appears in one of the wrapping <div> tags on diffs or old versions, it would be possible to add a CSS rule that would make a certain class display only on current versions. I can't seem to find anything like that in the HTML source code, though.
I wonder if a class for diffs or old versions could be added somehow in the MediaWiki software. @TheDaveRoss?
Another idea would be to add a JavaScript function that hides the messages only on diffs or old versions. JavaScript has ways to get the last revision of a current page and the revision currently being viewed. If those are not identical, then we are in a diff or old version. Not sure what the fallback would be for users without JavaScript. — Eru·tuon 18:01, 27 May 2017 (UTC)
{{infl}} renders the text in a different color, and it has a mouseover that says that the template is deprecated. This was done by Daniel Carrero. Again, Category:Pages using deprecated templates provides excellent control, and helps us see whether more drastic measures are required; my guess is that they are not. --Dan Polansky (talk) 20:52, 27 May 2017 (UTC)
Re diffs of old versions, I think it could be added to MW via a Phabricator request (not sure they would actually do it) but it could also be inserted via JS based on the page URL. It is beyond my JS experience to try and make that happen.
Re widely used historical templates, I am of the opinion that we should only ever deprecate such templates, rather than deleting or re-purposing. We can easily enough prevent them being added through abuse filters, but the number of historical revisions which are incomprehensible due to missing templates is frustrating. - [The]DaveRoss 12:20, 28 May 2017 (UTC)
I imagine I could create a JS function to add a class marking a non-current version. — Eru·tuon 00:31, 30 May 2017 (UTC)
I support restoring {{context}} and {{cx}}. Given that Category:Pages using deprecated templates is empty now, it shouldn't be too much trouble to keep it empty once these templates are restored. — Eru·tuon 00:31, 30 May 2017 (UTC)
  1. Restored completely {{context}} and {{cx}} to keep page histories legible.
  2. Added the ".deprecated" CSS class to the template, causing it to show text of a different color.
  3. Pages using the template are going to show up in Category:Pages using deprecated templates.
  4. Added the hover text explaining that the template is deprecated.
  5. Edited the documentation of the template to explain again the same thing.
I believe this is all exactly the same as I have done with {{infl}} before. Feel free to edit them further if necessary.
I support making the different color only appear on current revisions, but I don't know how to do that. The HTML "body" tag does not seem to have any different CSS classes when you are viewing an older revision. --Daniel Carrero (talk) 01:21, 30 May 2017 (UTC)

I've created User:Erutuon/scripts/oldRevisionClass.js, which adds the class old-revision to the <body> tag in all but the Special and Media namespaces when viewing an old revision. Really simple. Suggestions are welcome. There may be a better class name, for one.

@TheDaveRoss: It actually doesn't even use the page URL, rather the latest and current revision numbers (from the mw.config library). — Eru·tuon 06:12, 30 May 2017 (UTC)
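The check described above (compare the latest revision number with the one being viewed, and skip the Special and Media namespaces) can be sketched as follows. This is an illustration in Python only; the actual script is JavaScript and is not reproduced here. The key names wgRevisionId, wgCurRevisionId and wgNamespaceNumber are assumed to be the mw.config values involved, and the function names are hypothetical.

```python
# Illustrative sketch only: models the decision logic, not the real script.
SPECIAL_NS = -1  # MediaWiki's Special namespace
MEDIA_NS = -2    # MediaWiki's Media namespace


def is_old_revision(config):
    """Return True when the viewed revision is not the page's latest one.

    `config` is a plain dict standing in for mw.config (an assumption).
    """
    if config.get("wgNamespaceNumber") in (SPECIAL_NS, MEDIA_NS):
        return False
    viewed = config.get("wgRevisionId")
    latest = config.get("wgCurRevisionId")
    # Missing or zero IDs mean there is no revision to compare.
    if not viewed or not latest:
        return False
    return viewed != latest


def body_classes(existing, config):
    """Return the class list for <body>, adding 'old-revision' if needed."""
    classes = list(existing)
    if is_old_revision(config) and "old-revision" not in classes:
        classes.append("old-revision")
    return classes
```

With such a class on the body, a style rule could then show the deprecation color only when the class is absent, which is the behaviour requested earlier in the thread.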

Good thinking, thanks for doing that. - [The]DaveRoss 11:50, 30 May 2017 (UTC)

pocket as adjective

I was listening to Mandy Moore's Pocket philosopher and my question is: is its meaning somewhat pejorative, as in Spanish de pacotilla (cf. pacotilla s. 2)? Sobreira ►〓 (parlez) 10:39, 27 May 2017 (UTC)

No, not pejorative. It is like vade mecum (vademécum). —Stephen (Talk) 16:33, 27 May 2017 (UTC)
Btw, aren't these only attributive uses of the noun? --Barytonesis (talk) 18:12, 27 May 2017 (UTC)
Yes, or, more correctly, compound nouns formed with pocket. — Eru·tuon 18:20, 27 May 2017 (UTC)
@Stephen G. Brown I don't get it, do you mean by the sense? pocket n. as mecum pron.+prep.? Sobreira ►〓 (parlez) 10:36, 30 May 2017 (UTC)
He was using the English term vade mecum, which refers to something that can be taken with one. The Latin the term is derived from literally means the same as Spanish "va conmigo", which is what one would figuratively ask of a vade mecum. Chuck Entz (talk) 13:48, 30 May 2017 (UTC)
I know, I know what it is, my sister is a vet. I didn't get it because I was supposing Stephen was comparing the syntax and etymological evolution of VMC to Pocket + noun and for me they have nothing to do with each other. But I realise now that he is equating the meaning of attr. pocket to the VMC (~ portable, companion). Merci. Sobreira ►〓 (parlez) 22:44, 30 May 2017 (UTC)

how should we format etymologies of words like unshakability?

I just created this entry, and logically it seems to me it should be unshakable + -ty, along with the general rule that converts -able + -ty to -ability. However, this isn't how most such words are formatted, e.g. unflappability is given as un- + flappability, even though the latter word doesn't seem to exist. Do we need to mention the general rule, and if so how should it be mentioned? Benwing2 (talk) 20:03, 27 May 2017 (UTC)

In principle we could create an allomorph suffix like -ility, or segment it as based on -ability. The former is maybe kind of preferable in that forms like unshake don't necessarily exist (this one does, but e.g. unflap doesn't). On the other hand, the latter has the benefit of being neatly segmentable; maybe we could turn it into a circumfix based directly on shake. --Tropylium (talk) 22:04, 27 May 2017 (UTC)
One possibility is to write {{affix|en|unshakable|alt1=unshak(able)|-ability}}, which produces unshak(able) +‎ -ability. This is how I have formatted various etymologies in Russian, e.g. -ница (-nica) is the feminine of -ник (-nik), so under отли́чница (otlíčnica) I write e.g. {{affix|ru|отличник|alt1=отли́ч(ник)|-ница}}, which gives отли́ч(ник) (otlíč(nik)) +‎ -ница (-nica). (On the other hand, under certain circumstances I don't do this and instead just note under the suffix that it transforms words it's added to in certain ways, e.g. -ный (-nyj) is added onto the non-reduced stem of a word (which may show up only in the genitive plural) and palatalizes certain preceding consonants; see the "Usage notes" section of -ный (-nyj).) Benwing2 (talk) 22:11, 27 May 2017 (UTC)
I'll also add that trying to analyze unshakable into unshake + -able is clearly wrong, and likewise for unshakability; the meaning of "unshakable" is not "able to unshake" (where "unshake" means "retract" or "unfold"), but "not able to be shaken", hence un- + shake + -able. Benwing2 (talk) 22:14, 27 May 2017 (UTC)
This case seems to be much simpler than the putative general case. shakability, though it occurs in seismology, is very uncommon in other use. The semantics would seem to require that the etymology appear as unshakable + -ity. To me the more complicated renderings having to do with the transformation of able to ability don't belong in this entry, but rather in ability. Perhaps a "See ability (Etymology)" would suffice for the students of morphology among our users. DCDuring (talk) 15:33, 28 May 2017 (UTC)
I prefer the analysis {{affix|en|unshakable|alt1=unshak(able)|-ability}}, despite its clunkiness, because the word seems to me to be derived from unshakable, and {{affix|en|unshakable}} requires a weird transformation of the -able part and does not acknowledge the fact that the suffix -ability exists. — Eru·tuon 17:47, 28 May 2017 (UTC)

I suppose you know that these are known in linguistics as bracketing paradoxes --Backinstadiums (talk) 07:50, 28 May 2017 (UTC)

How about breaking it down into all of its parts: un- +‎ shake +‎ -able +‎ -ity? —Aɴɢʀ (talk) 17:19, 28 May 2017 (UTC)
I suppose one could analyse -abil- as a combining form of -able, but it looks more like unshakable + -ity, with a rule that transforms -ble to -bil by analogy with noble/nobility, stable/stability and able/ability. Chuck Entz (talk) 22:21, 28 May 2017 (UTC)
In response to Angr, I don't think it's useful to break it down into all its parts, as it has structure in it: something like [[un- + shake + -able] + -ity]. Benwing2 (talk) 00:53, 29 May 2017 (UTC)
@Benwing2: Or even [[un- + [shake + -able]] + -ity], but I don't see that as a reason not to include all the morphemes in the etymology. On the other hand un- +‎ shake +‎ -able +‎ -ity would put it into CAT:English words suffixed with -able, which seems undesirable. So maybe just un- +‎ shake +‎ -ability? —Aɴɢʀ (talk) 10:52, 29 May 2017 (UTC)
@Angr: I think the un- prefix belongs to the unshakable entry. The highest level of derivation should be either unshakable +‎ -ity or unshak(able) +‎ -ability, as it seems to me that if I were coining the word, I would be forming it from unshakable: taking an adjective and forming an abstract noun from it. — Eru·tuon 23:37, 29 May 2017 (UTC)

Vote for "Ideophone" in EL

I'd like to create a vote to confirm that we want "Ideophone" in WT:EL as per this unvoted diff. --Daniel Carrero (talk) 23:09, 29 May 2017 (UTC)

PIE root for affixes

I'm not 100% sure this is the right place to ask, but here goes.

Should the PIE root template include roots for affixes which have lost their semantic value?

I'm thinking of Italian -mente (adverb-forming suffix), derived from Latin mente, ablative singular form of mēns (mind), ultimately derived from the PIE root *men- (to think): should a word suffixed with -mente be categorized as (also) derived from the root *men- even though it doesn't bear the original meaning of “thinking”? –– GianWiki (talk) 13:42, 30 May 2017 (UTC)

Yes, and adding a note regarding the semantic loss would only improve it --Backinstadiums (talk) 18:26, 30 May 2017 (UTC)

I would say entries such as chiaramente should not include the root *men-, but the page for the affix -mente itself should. --WikiTiki89 18:35, 30 May 2017 (UTC)
I would tend to agree. Andrew Sheedy (talk) 05:57, 31 May 2017 (UTC)

About template:affix

Two questions:

  • I sometimes see people change etymologies that were using {{prefix}}, {{suffix}} and {{confix}} to {{affix}}, even though it seems less accurate to me; why is that?
  • Is volleyball a good example? Isn't it rather a compound? --Clitar Hero (talk) 23:05, 30 May 2017 (UTC)
About your first question: it doesn't really matter, it's just that {{affix}} can automatically figure out what kind of affix something is by looking at where the hyphens are. —CodeCat 23:32, 30 May 2017 (UTC)
{{affix}} can also be used for compounds, like {{compound}}. That's why {{affix|volley|ball}} is included on the documentation page. — Eru·tuon 00:19, 31 May 2017 (UTC)
Indeed, I hadn't noticed the category. I find this a little confusing, but whatever. --Clitar Hero (talk) 02:00, 31 May 2017 (UTC)
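The idea mentioned above, that the kind of affix can be worked out from where the hyphens are, can be sketched roughly as below. This is a minimal illustration in Python, not the code behind {{affix}}; the function names and the returned labels are hypothetical.

```python
# Rough sketch of hyphen-based classification; not the actual template logic.
def classify_part(part):
    """Guess what kind of morpheme a template argument is from its hyphens."""
    starts = part.startswith("-")
    ends = part.endswith("-")
    if starts and ends and len(part) > 2:
        return "interfix"   # e.g. "-o-"
    if starts and len(part) > 1:
        return "suffix"     # e.g. "-ability"
    if ends and len(part) > 1:
        return "prefix"     # e.g. "un-"
    return "base"           # no hyphen: an ordinary word


def classify_etymology(parts):
    """Treat several unhyphenated parts as a compound, anything else as affixation."""
    kinds = [classify_part(p) for p in parts]
    if len(kinds) > 1 and all(k == "base" for k in kinds):
        return "compound"
    return "affixation"
```

Under this sketch, the parts volley and ball carry no hyphens, so the pair is classed as a compound, which matches the documentation example discussed above.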

Use "Cognate" to link between citation pages

This is a really minor thing that won't get a chance to be used much at the moment, but I wonder if it's possible to make the Cognate extension generate automatic interwikis between citations pages like these:

What do you think, @Lea Lacroix (WMDE)? --Daniel Carrero (talk) 18:10, 31 May 2017 (UTC)

Thanks for your suggestion. Is Citations: a namespace? It seems a bit different. I'd like to know more about this.
Are the actual titles of the pages (excluding the Citations: prefix) always guaranteed to be the same as the word they are about? If yes, we can consider Cognate.
If it is a namespace, just as Wiktionary: Help: and the others, then it won't be linked by Cognate, but by the centralized links on Wikidata, which will be announced very soon :) If it is something else, then we have to think about a dedicated solution for these pages. Lea Lacroix (WMDE) (talk) 18:14, 31 May 2017 (UTC)
Why does the namespace make a difference? Cognate handles the main namespace just fine, but not any others? —CodeCat 18:27, 31 May 2017 (UTC)
@Lea Lacroix (WMDE): Please see Citations:they for an example of the "Citations:" namespace. It contains uses of the word "they", usually cited from books and running text.
Any entry can potentially have a "Citations:" page.
The Portuguese Wiktionary has 4,700+ Portuguese citations, like the page pt:Citações:trabalho.
I thought it might be a good idea to link the citations pages automatically when they have the same spelling, much like Cognate does for the main namespace.
Yes, the actual titles of the pages (excluding the Citations: prefix) are always guaranteed to be the same as the word they are about. --Daniel Carrero (talk) 18:30, 31 May 2017 (UTC)
Technically it is a custom namespace. On en.wikt {{ns:114}}, on pt.wikt {{ns:108}} with a localized canonical name not matching the English one as all canonical ones should do. Maybe not having the same namespace number and the same canonical name is a problem, either for Cognate or for Wikidata. --Vriullop (talk) 19:18, 31 May 2017 (UTC)
@Lea Lacroix (WMDE): You said "I'd like to know more about this.", so I'll talk about "lemmatizing" citations: the "lemma" is the main form, and "lemmatize" basically means treat and format as the main form.
  1. A page like Citations:be may have citations using the words "be", as well as "am", "is", "are", "been", "being". But in addition to that, we may have Citations:am with only "am", as well as Citations:is, Citations:are, Citations:been, etc.
  2. The page Citations:lamaism currently has citations for "Lamaism", "Lamaist" and other related words. We may want to create redirects from Citations:Lamaism, Citations:Lamaist, etc. or we may still be able to create these separate citations pages on their own merits.
--Daniel Carrero (talk) 11:10, 2 June 2017 (UTC)
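The proposal above, linking citations pages across wikis when the title after the namespace prefix is the same, can be sketched as follows. This is an illustration in Python under stated assumptions, not how the Cognate extension is implemented: the prefix table only covers the two prefixes named in this thread, matching is by exact title, and all function names are hypothetical.

```python
from collections import defaultdict

# Localized namespace prefixes taken from the pages named in this thread.
CITATION_PREFIXES = {"en": "Citations:", "pt": "Citações:"}


def bare_title(wiki, page):
    """Strip the wiki's citations prefix; return None for other pages."""
    prefix = CITATION_PREFIXES.get(wiki)
    if prefix and page.startswith(prefix):
        return page[len(prefix):]
    return None


def link_citation_pages(pages):
    """Group (wiki, page) pairs by bare title; keep titles found on 2+ wikis."""
    groups = defaultdict(dict)
    for wiki, page in pages:
        title = bare_title(wiki, page)
        if title is not None:
            groups[title][wiki] = page
    return {title: wikis for title, wikis in groups.items() if len(wikis) > 1}
```

The differing namespace numbers and localized names mentioned above are the reason the prefix has to be looked up per wiki rather than assumed to be identical everywhere.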

yay (accompanied by a hand gesture)

Hi, I've just come across the description "(accompanied by a hand gesture)", and I love the idea of adding this type of info where relevant, in more detail, with images or even video accompanying it. By the way, what realm of linguistics studies it? --Backinstadiums (talk) 20:12, 31 May 2017 (UTC)

The hand gesture is optional, though. I'm sure I've used expressions like "yay big" even when talking on the phone. —Aɴɢʀ (talk) 07:56, 1 June 2017 (UTC)
Thanks for answering. Then adding "usually/normally" would improve the note. I do not know how to organize this, but it would really make a difference for Wiktionary to add that kind of additional info, such as body expressions (nodding, grimaces, etc.). For the record, on which of the pages in Help:Contents should it be added? --Backinstadiums (talk) 09:53, 1 June 2017 (UTC)
Have you seen Appendix:Gestures? Equinox 11:20, 1 June 2017 (UTC)
@Equinox: Thanks. Creating a visual repository for Wiktionary would undoubtedly add great technical value. There are good academic papers about it, so I'd propose creating a working group to research it and create content. --Backinstadiums (talk) 13:20, 1 June 2017 (UTC)
You may also find meyakan interesting. - -sche (discuss) 18:33, 22 June 2017 (UTC)

June 2017

Last remaining private use area characters

The following pages have private use characters: , proposition, 슴새, bên, 배다, xǔxǔ, and 癩 (a redirect). Can we clean these up? DTLHS (talk) 05:06, 1 June 2017 (UTC)

Out of curiosity, did you check all namespaces, or just the mainspace? Just the other day, I changed a PUA character in a sortkey of Module:languages/data3/t! - -sche (discuss) 05:10, 1 June 2017 (UTC)
Just mainspace. DTLHS (talk) 05:15, 1 June 2017 (UTC)
@-sche Here's the entire site if you want to see it. DTLHS (talk) 05:36, 1 June 2017 (UTC)
Thanks! The uses in userspace and some other places can probably be left as-is, but e.g. the uses in Template:IPAsym look like ones we should check. In one case, I see that the character has since been added to Unicode. - -sche (discuss) 18:26, 1 June 2017 (UTC)
All done. Wyang (talk) 07:33, 1 June 2017 (UTC)
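A scan for private use area characters like the one discussed in this thread can be sketched as below. This is a minimal Python illustration, not the query that was actually run; the function name is hypothetical. It relies on the Unicode general category "Co", which covers the private use ranges.

```python
import unicodedata


def private_use_chars(text):
    """Return (index, code point) pairs for private use characters in text."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"  # Co = "Other, private use"
    ]
```

Running this over page titles or page text would flag candidates for cleanup; deciding on the correct replacement character remains a manual step.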

Conventions for Egyptian

@Furius, Hyarmendacil, Strabismus, CAmbrose I’ve been doing some work with Egyptian and have come across a number of problems that aren’t yet addressed or could do with changing in WT:About Egyptian, so I thought I’d ask about them to try to come to some sort of consensus before going ahead with implementing anything unilaterally. (Unfortunately I think all the Egyptologically inclined editors I pinged are inactive, but I figured it’s worth a try.) The proposals I would make are these:

  • Distinguish between s and z instead of merging them both into s like we (nominally) do now. The phonemes merged by Middle Egyptian, but they were still separate in Old Egyptian, and most authors still make the distinction, from Allen to the Wörterbuch. Even Faulkner’s dictionary of Middle Egyptian puts the variants with z in parentheses where appropriate. On Wiktionary, Egyptian covers Old Egyptian as well as Middle, so that separating z out makes more sense, and there are already a few entries that have z in defiance of the current policy.
  • Change 3 to Ꜣ in transliteration. This is the correct Unicode codepoint dedicated to Egyptological aleph, and font support has become good enough that I think most people can see it. (Is this actually the case?) The current 3 wreaks havoc with the linkify function in templates, which converts it to an unlinked superscript 3 whenever it comes at the end of a word.
  • Change ˤ to ꜥ in transliteration. Not so important, just a switch to the correct codepoint now that it’s probably supported for most people. (Again, can anyone confirm/deny?)
  • Standardize the hieroglyph Z4 (the two dual strokes) and the nisba adjective ending to be a variant of j rather than y. (Right now it is not defined either way.) Both conventions are common — Faulkner and Hoch call it y, while Allen, Loprieno, and the Thesaurus Linguae Aegyptiae call it j — but the convention in most of our entries was to call it j, presumably because most of our editors were reading Allen. I’m proposing that we standardize what already seemed most common.
  • Only use dots (.) to separate morphemes in the case of suffix pronouns etc., not with inflectional suffixes for singular, plural, etc. This is again the convention that already seems most common, presumably because it’s the one Allen follows. I’d be fine with the other common system, too, where equals signs (=) separate out the suffix pronouns and dots separate out all the other morphemes, but consensus seems to favor the one Allen uses.
  • Standardize the use of periods as dots, rather than using interpuncts (·). Again for this, I have no opinions either way, but this seems to be current consensus.

Thoughts? —Vorziblix (talk) 07:56, 1 June 2017 (UTC)

I don't remember when and why our current standards were established, but it is true that nobody working on them is active any longer. I think all your points seem very reasonable, although alternate transliterations should remain as soft redirects to the standardised lemma form using {{egy-alternative transliteration of}}. —Μετάknowledgediscuss/deeds 15:16, 1 June 2017 (UTC)
Most of these proposals seem reasonable; I do not know enough to comment on j vs y. I strongly support avoiding the equals sign =, which causes difficulties if provided in a template (link, etc), and possibly also if we were to devise a module that parses the content of our entries to generate something (as is done in some Thai entries, and recently proposed for Arabic), and possibly also for some re-users of our content, because of what it usually means in template syntax. - -sche (discuss) 18:37, 1 June 2017 (UTC)
@Vorziblix: I can confirm that the Ꜣ and ꜥ characters (as well as ˤ) all display on the machine I am currently using, which is a loaner that does not have obscure font support. —Justin (koavf)TCM 19:44, 1 June 2017 (UTC)
I don't edit Egyptian, but Ꜣ and ꜥ just show up as boxes for me... Andrew Sheedy (talk) 02:22, 2 June 2017 (UTC)

Thanks for all the feedback. Let’s consider the use of the equals sign definitively ruled out, and the soft redirect policy makes sense. Regarding lack of universal support for Ꜣ and ꜥ, I am inclined to think the template issues and Unicode compliance still make the changeover worthwhile, but will defer to consensus if the general opinion is otherwise. In any case I’ll wait a few more days to give more time for comments before making any changes/additions to WT:About Egyptian. —Vorziblix (talk) 04:20, 2 June 2017 (UTC)

The symbols also display correctly for me. If we switch away from , remember to update that entry, which currently mentions the use of ˤ. I wonder if it would help to switch egy from using Latn as its script, to using Latinx, which I think calls on more comprehensive fonts. - -sche (discuss) 05:20, 2 June 2017 (UTC)
That’s probably a good idea; the characters are in the LATIN-EXTENDED-D block, which is explicitly looked for by Latinx but not by Latn, if I’m interpreting that page correctly. —Vorziblix (talk) 20:26, 2 June 2017 (UTC)
@-sche Since I can’t edit Module:languages/data3/e without admin privileges, would you be willing to make the switch to Latinx? Thanks, —Vorziblix (talk) 20:24, 21 June 2017 (UTC)
  Done. - -sche (discuss) 21:24, 21 June 2017 (UTC)
I am very inactive, sadly, but all of these are reasonable. The first point (s/z) is the one that is most likely to cause issues, I think. What would the new rule mean for e.g. nts? So far as I'm aware (and I confess that I have never studied old Egyptian) the alternative reading only comes into existence once z and s had merged. Would we create a new article for ntz? Furius (talk) 15:57, 2 June 2017 (UTC)
The lemma form would definitely be at nts, since it derives from the fem. sing. 3p. suffix pronoun .s, which was s all through Old Egyptian; if ntz exists at all, it should just be an ‘alternative form of…’ entry, but it’s probably unneeded. In general words that first appear after OE and never had the s/z distinction would be lemmatized with ‘s’; words that had ‘z’ in OE would be at the ‘z’ variant, but with an ‘alternative form’ entry at the ‘s’ variant; and words that had ‘s’ in OE would straightforwardly be at the ‘s’ variant. That way anyone searching by ME form could always search with ‘s’, and anachronistic readings like ntz wouldn’t be necessary.
By the way, thanks for your work with verb conjugations; it helps a lot with making templates! —Vorziblix (talk) 20:26, 2 June 2017 (UTC)
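The lemmatization rule proposed above can be sketched as a small function. This is a Python illustration under stated assumptions: the function name and return shape are hypothetical, and the pair zꜣ/sꜣ is used only as an illustrative word assumed to have z in Old Egyptian.

```python
def egyptian_entries(s_form, old_egyptian_form=None):
    """Return (lemma, alternative_forms) under the proposed s/z convention.

    s_form: the spelling with s, as a Middle Egyptian reader would search it.
    old_egyptian_form: the Old Egyptian spelling, or None if the word first
        appears after Old Egyptian.
    """
    if old_egyptian_form is not None and "z" in old_egyptian_form:
        # Word had z in Old Egyptian: lemma at the z variant, with an
        # "alternative form" entry at the s variant.
        return old_egyptian_form, [s_form]
    # Word had s in Old Egyptian, or postdates it: lemma at the s variant.
    return s_form, []
```

For nts, which is discussed above as having s throughout, this returns the lemma nts with no alternative entries, so the anachronistic ntz is never generated.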
As further potentially useful information, at the bottom of this page there’s a discussion on the principles used by Allen for transliteration; regarding s vs. z they line up pretty well with my suggestion above: ‘If the hieroglyphic writing of a particular instance of a word has z but the original spelling had s, we transliterate the consonant as s,… but vice versa, if a particular instance has s but the original spelling had z, we transliterate the consonant as s.… We transliterate as z only if both the original spelling and the particular instance have z.’
Some of the other discussions there might also be good to consider, particularly the question of whether to hyphenate compound words, where we don’t seem to have any consistent policy. —Vorziblix (talk) 01:41, 3 June 2017 (UTC)
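Allen's principle for s versus z, as quoted above, reduces to a two-input decision. The following Python sketch only restates that quotation; the function and parameter names are hypothetical.

```python
def transliterate_sibilant(original, instance):
    """Pick s or z from the original spelling and the particular instance.

    Per the quoted rule, z is used only when both the original spelling
    and the particular instance have z; every other combination gives s.
    """
    for value in (original, instance):
        if value not in ("s", "z"):
            raise ValueError("expected 's' or 'z', got %r" % (value,))
    return "z" if original == "z" and instance == "z" else "s"
```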

Barring further objections, I’ll start moving ahead with standardizing entries. —Vorziblix (talk) 05:49, 5 June 2017 (UTC)

Enable sitelinks on Wikidata for Wiktionary pages (outside main namespace)

Hello all,

Here is some important information about how Wiktionary sitelinks will change over the next few weeks.

Short version: From June 20th, we are going to store the interwiki links of all the namespaces (except main, user and talk) in Wikidata. This will not break your Wiktionary, but if you want to use all the features, you will have to remove your sitelinks from wikitext and connect your pages to Wikidata.

Long version available and translatable here.

If you have any questions or concerns, feel free to ping me.

Thanks, Lea Lacroix (WMDE) (talk) 08:23, 1 June 2017 (UTC)

@Lea Lacroix (WMDE) What do you mean by "all the namespaces"? There are many custom namespaces (Appendix, Concordance, Index, Rhymes...) Note the discussion at Wiktionary:Beer parlour/2017/May#Use "Cognate" to link between citation pages. --Vriullop (talk) 08:59, 1 June 2017 (UTC)
The namespaces you mention can be stored in Wikidata as well. The namespace Rhymes, for example, has an equivalent on German Wiktionary, so it will be possible to make links between the pages. About Citations: we're going to investigate to see if it is more relevant to make automatic links with Cognate, or centralized links in Wikidata. Lea Lacroix (WMDE) (talk) 09:08, 1 June 2017 (UTC)
For what it's worth, the Portuguese Wiktionary has a Rhymes namespace too, but it's virtually unused. It has only one page, which is obviously a stub: pt:Rimas:Inglês. But I suppose it wouldn't hurt to link it to our Rhymes:English through interwikis. --Daniel Carrero (talk) 13:36, 1 June 2017 (UTC)
Citations pages match their connected main-namespace pages. An exception is when citations are centralized on one page; one wiki might choose to centralize citations of both "have got someone's back" and "have someone's back" on Citations:have someone's back, where another wiki might centralize them on its equivalent of Citations:have got someone's back ... but one wiki might choose to lemmatize one of those phrases in the main namespace, too, where another wiki might lemmatize the other phrase ... and hard or soft redirects might or might not exist ... much like if a certain discussion happened to take place only on Talk:have got someone's back and not Talk:have someone's back. So IMO it makes as much sense to handle Citations pages via the Cognate extension, as it does to handle main-namespace and talk pages. Interwiki links between citations pages seem not very useful, anyway. - -sche (discuss) 05:48, 2 June 2017 (UTC)
I believe Cognate is the better option for Citations pages. Please see my reply at Wiktionary:Beer parlour/2017/May#Use "Cognate" to link between citation pages. --Daniel Carrero (talk) 11:11, 2 June 2017 (UTC)
What about categories? Most Wiktionaries have the same category for each language, so adding each of them individually to Wikidata is a huge waste of effort. There's no difference between Category:English nouns and Category:Dutch nouns other than the language, they should be treated as the same thing. —CodeCat 18:39, 1 June 2017 (UTC)
This phase is the same as was done for Wikipedia and other projects, so any interwiki link existing on any page will be exported to Wikidata, except for user pages, talk pages and the main namespace, which is handled by Cognate. That means Wiktionary: pages, categories, templates, modules, etc., including the custom namespaces I asked about. You do not have to add them individually; it is a mass export. So Category:English nouns will continue linking to nl:Categorie:Zelfstandig naamwoord in het Engels, etc., but it will be easier to maintain in a centralized place. Any interwiki link in one language project will appear in all wikts. Any page renamed or deleted will update interwikis immediately (as Cognate currently does). See d:Q7923975. I suppose that, besides Wikipedia, Wikibooks, etc., Wiktionary should appear there with Category:English language, nl:Categorie:Woorden in het Engels and all wikt interwikis. Apart from interwiki links for Wiktionary, you can access equivalent links in other projects, if any. --Vriullop (talk) 06:53, 2 June 2017 (UTC)
About CodeCat's "There's no difference between Category:English nouns and Category:Dutch nouns other than the language ..." This only applies to a small minority of categories in the English Wiktionary. Yes, if we had a centralized database of languages, we could then generate a list of "Category:(language) nouns" in the English Wiktionary, but do many Wiktionaries use a predictable system for their nouns categories, and all other categories? Their category systems might unpredictably change at some point, and some existing and future Wiktionaries might not have figured standards for categories yet. Maybe Wikidata is not the most perfect conceivable system for listing our category interwikis using little storage space but at least it seems to be as flexible as needed. --Daniel Carrero (talk) 11:20, 2 June 2017 (UTC)
@Daniel Carrero: Individual Wiktionaries moving things is an argument for a central repository at Wikidata. If they get moved on the (e.g.) Dutch Wiktionary, then the links will be instantly and automatically updated at the (e.g.) Swahili Wiktionary. —Justin (koavf)TCM 15:53, 2 June 2017 (UTC)
It just occurred to me that we would probably want Category:English language directly connected to d:Q1860, and similar for the main category of other languages. Would this be done as part of this change? —CodeCat 19:55, 11 June 2017 (UTC)
d:Q7923975 is the concept "Category:English language" in multiple Wikimedia projects, including Wikiversity, Wikibooks and Wikisource. Its value "category's main topic" is d:Q1860. --Daniel Carrero (talk) 20:01, 11 June 2017 (UTC)
Hmm, I can't say I understand why there's this distinction. Wikipedia's page w:English language is really not about anything different than our Category:English language. —CodeCat 20:02, 11 June 2017 (UTC)
Probably just to store interwikis for the same category in multiple Wikipedia languages: w:en:Category:English language, w:nl:Categorie:Engels... You get the point. This, in addition to the "Category:English language" in Wikibooks, Wikisource etc. as mentioned above. --Daniel Carrero (talk) 20:06, 11 June 2017 (UTC)
But it has a much more prominent role on Wiktionary than on any of those other sites. It's our main page about English, just like w:English language is Wikipedia's. —CodeCat 20:11, 11 June 2017 (UTC)
See for example English Wikibooks. b:Subject:English language is connected to d:Q1860 and b:Category:Subject:English language to d:Q7923975. Accordingly, Wiktionary:About English should connect to Q1860 and Category:English language to Q7923975. --Vriullop (talk) 12:01, 12 June 2017 (UTC)
I agree with Vriullop. Our Category:English language is pretty awesome, with the description full of diverse English-related links, but its main purpose is still just being the main category about English. Wikidata controls interwikis, and the category interwikis should probably be between categories only. --Daniel Carrero (talk) 12:32, 12 June 2017 (UTC)
Is there a discussion page or something else about these "category" items, that might explain their purpose and thus help decide whether Category:English language should be in there? —CodeCat 20:13, 11 June 2017 (UTC)
@CodeCat: I don't know about any discussions, but as I'm sure you know, apparently anything that exists in separate Wikipedia pages also may have separate Wikidata items for the sake of storing interwikis. --Daniel Carrero (talk) 14:07, 14 June 2017 (UTC)
To be more exact, apparently any page in some Wikimedia projects can have its own Wikidata item to store interwikis. For example, d:Q30237873 is the recent Wikinews article "Theresa May's Conservative Party wins UK election but loses majority, leaving Brexit plan in question". --Daniel Carrero (talk) 06:58, 19 June 2017 (UTC)

Cognate & redirects

Hello again,

Several people mentioned the idea of Cognate linking to redirection pages. This is a complex issue that should be decided by consensus among the different language communities. To complement the discussions that have been running on Phabricator and gather more points of view, I created a discussion topic here. Feel free to add a comment. Thanks, Lea Lacroix (WMDE) (talk) 13:13, 1 June 2017 (UTC)

lexicographic approach to learning

One of the main uses of a lexicographic resource is educational, so it would enrich Wiktionary to create a discussion room where people can ask advanced users for advice about the learning process itself. The feedback would help uncover current weaknesses and so improve Wiktionary as a whole. --Backinstadiums (talk) 15:07, 1 June 2017 (UTC)

But it still leaves the learning process with an excessively narrow focus on people like us. We are already pretty good at taking ourselves as model users. We need more than a convenience sample of active, interested users of Wiktionary. Any thoughts on how to get that? DCDuring (talk) 16:45, 1 June 2017 (UTC)
@Backinstadiums, can you give an example of what kind of discussion you would expect to see in such a page? — Ungoliant (falai) 01:31, 2 June 2017 (UTC)

Personal advice from those who are already fluent: what they'd do differently knowing what they know now, mistakes to avoid, grammatical aspects that the entry for a certain term doesn't clarify, etc. For example, strategies to learn Chinese characters and their respective pronunciations (homophones), or Arabic broken plurals (unpredictability). --Backinstadiums (talk) 06:02, 2 June 2017 (UTC)

Parameters of Template:quote vs Template:quote-book etc.

Currently we have {{quote}}, which is just for formatting the text of a quote much like {{usex}}. Then we have {{quote-book}} and its relatives, which also show the source info above the quote, and are a lot more elaborate and contain many parameters. {{quote}} is quite easy to use, since it works the exact same way as {{usex}} and people are very familiar with that. I think it would be desirable if the other set of templates could be modified so that they are more compatible with {{quote}}, so that the more elaborate templates are easier to use for people (like me) who are used to {{quote}} and {{usex}}. I'm thinking mainly of the parameters: first parameter is language, second is quote, third is translation. Would this be ok? —CodeCat 18:56, 1 June 2017 (UTC)

I'm OK with that. You might have to do a lot of cleanup though, since language codes aren't required in {{quote-book}} right now, even if there is a translation. DTLHS (talk) 18:58, 1 June 2017 (UTC)
Yes, and I believe it accepts language names rather than language codes too. This is a drawback of these quote templates currently: they don't tag the quoted text, which {{quote}} does. —CodeCat 19:01, 1 June 2017 (UTC)
Pinging @Smuconlaw as a person who has worked a lot on these templates. —CodeCat 20:37, 1 June 2017 (UTC)
I'm not very clear as to what is being suggested. All the {{quote-}} family of templates can be used with positional parameters for the basic parameters; for example, {{quote-book|[year]|[author]|[title]|[url]|[page]|[passage]}}. I don't understand what "tagging the quoted text" entails, nor why adding a language code as the first parameter is needed. The {{quote-}} (and {{cite-}}) templates have not hitherto had a language code parameter, so if such a parameter is added a bot would need to update all the uses of the template. — SMUconlaw (talk) 20:29, 2 June 2017 (UTC)
All templates on Wiktionary that display words in another language will wrap text in a bit of HTML that indicates the language of the text, by convention using either the first parameter to indicate the language, or the lang= parameter, depending on the template. However, this is missing for the quote- templates. They have a language parameter named language=, but it's optional, and it's provided with a language name rather than a language code. It doesn't follow the conventions of other templates. What I would like is if the first three parameters of the quote- templates could be the same as those of {{quote}}, with the others being named parameters: {{quote-book|[language code]|[passage]|[translation]|title=[title]|url=[url]|author=[author]|page=[page]|year=[year]}}. The current transliteration= parameter would be renamed to tr=, again to match other templates. —CodeCat 20:36, 2 June 2017 (UTC)
I guess I have no strong objection to changing the first three positional parameters so long as a bot can come along and carry out all the required changes to pages where the templates have been used. However, I'm afraid I have no experience with adding language tagging to templates. What is such tagging for, actually, and which text is tagged – the quotation? — SMUconlaw (talk) 21:11, 2 June 2017 (UTC)
Script tagging goes along with language tagging, allows the text to be displayed in appropriate fonts, which are specified in MediaWiki:Common.css. This is particularly important for non-Latin scripts, or Latin scripts containing unusual characters. Language tagging (I hear) tells screen readers which language the text is written in, allowing them to read it correctly. Script tagging is generally added at the same time as language tagging is added, by Module:script utilities. Linking templates like {{l}} and {{m}} add script and language tagging using this module, as well as {{usex}} and {{lang}}. — Eru·tuon 22:00, 2 June 2017 (UTC)
If you want to see it in action, go to Special:ExpandTemplates and type in {{quote|en|this is a quote}}. —CodeCat 22:04, 2 June 2017 (UTC)
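For illustration only, here is a rough Python sketch of the conversion a bot would have to make if the parameter change discussed above went ahead. It assumes the current positional order quoted by SMUconlaw (year, author, title, url, page, passage) and the proposed layout (language code, passage, translation, then named parameters). The function name `convert_quote_book` is hypothetical, the language code has to be supplied by hand because existing calls do not carry one, and real wikitext parsing (nested templates, pipes inside links) is deliberately ignored.

```python
# Hypothetical sketch: rebuild a {{quote-book}} call in the proposed layout.
# Assumption: the old call used only the six positional parameters below.
OLD_POSITIONAL = ["year", "author", "title", "url", "page", "passage"]


def convert_quote_book(old_args, lang_code, translation=""):
    """Return the proposed-style template call as a wikitext string."""
    named = dict(zip(OLD_POSITIONAL, old_args))
    passage = named.pop("passage", "")

    # Proposed positional parameters: language code, passage, translation.
    parts = ["quote-book", lang_code, passage]
    if translation:
        parts.append(translation)

    # Everything else becomes a named parameter; empty values are dropped.
    parts += [f"{key}={value}" for key, value in named.items() if value]
    return "{{" + "|".join(parts) + "}}"


old_call = ["1892", "A. Author", "A Title", "", "12", "Some passage."]
new_call = convert_quote_book(old_call, "en")
print(new_call)
```

The sample values are invented. The point is only that the language code and passage move to the front while the bibliographic details become named parameters.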

A category for words like legged and learned

I’d like us to have a category for words with -ed pronounced a full -èd where -’d is expected (in other words, it wouldn’t include words like pitted and modded), but I can’t think of a good, accurate name. Any ideas? — Ungoliant (falai) 20:28, 1 June 2017 (UTC)

Words with unexpected syllabic -ed? — Eru·tuon 20:37, 1 June 2017 (UTC)
Are you talking about homographs with different standard pronunciations and distinct meanings, but not distinct etymologies? Some more: aged, cussed, dogged. Would you include words that just had different (presumably standard) pronunciations of the -ed, like alleged? DCDuring (talk) 22:58, 1 June 2017 (UTC)
Why not something like Category:English heteronyms (-ed). One could see how to generalize the category name to other morphemes or morpheme groups, though the scheme would probably not work for all heteronyms in all languages. DCDuring (talk) 23:02, 1 June 2017 (UTC)
Hmm, heteronym apparently means a word with a different pronunciation and meaning. The category we're speaking of should only relate to pronunciation. — Eru·tuon 23:33, 1 June 2017 (UTC)
legged and learned have common etymologies and related meanings, but not identical meanings. alleged is, I think, the only one of those so far mentioned with the same meaning for both pronunciations. DCDuring (talk) 23:49, 1 June 2017 (UTC)
I assumed from the original post that the category would only relate to pronunciation, because @Ungoliant MMDCCLXIV didn't mention meaning at all. — Eru·tuon 00:31, 2 June 2017 (UTC)
Yes, I’m talking about a category for words that have -ed but don’t follow the suffix’s usual pronunciation rule (as Erutuon explains below), regardless of the relationship between its senses and regardless of whether the same word is pronounceable in a way that follows the rule. — Ungoliant (falai) 01:30, 2 June 2017 (UTC)
For more context, the rule is that the suffix -ed is pronounced /d/ or /t/ after most consonants, but /ɪd/ or /əd/ after /d/ or /t/. (The difference between /ɪd/ and /əd/ is dialectal; some dialects have one, some have the other.) So the category would be for words in -ed that don't follow this rule, that should have /d/ or /t/, but have /ɪd/ or /əd/ instead. — Eru·tuon 00:31, 2 June 2017 (UTC)
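As a rough illustration of the rule just described, the membership test for such a category could be sketched in Python as below. The function names are hypothetical, and the stem-final phonemes and pronunciations are typed in by hand for a few words from this thread rather than looked up anywhere.

```python
# Hypothetical sketch of the -ed rule: the suffix is expected to be syllabic
# (/ɪd/ or /əd/) only when the stem ends in /t/ or /d/.
ALVEOLAR_STOPS = {"t", "d"}


def expects_syllabic_ed(stem_final_phoneme):
    """True if the regular rule predicts a syllabic -ed after this phoneme."""
    return stem_final_phoneme in ALVEOLAR_STOPS


def has_unexpected_syllabic_ed(stem_final_phoneme, suffix_is_syllabic):
    """True if -ed is syllabic although the rule predicts plain /d/ or /t/."""
    return suffix_is_syllabic and not expects_syllabic_ed(stem_final_phoneme)


# word -> (final phoneme of the stem, whether -ed is pronounced as a syllable)
examples = {
    "pitted": ("t", True),    # regular: stem ends in /t/
    "modded": ("d", True),    # regular: stem ends in /d/
    "legged": ("g", True),    # the two-syllable pronunciation
    "learned": ("n", True),   # the two-syllable adjective pronunciation
}

unexpected = [
    word
    for word, (phoneme, syllabic) in examples.items()
    if has_unexpected_syllabic_ed(phoneme, syllabic)
]
print(unexpected)
```

Only legged and learned are selected, since pitted and modded are syllabic exactly where the rule predicts.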

Here's a comprehensive summary for -edness, -edly . --Backinstadiums (talk) 05:58, 2 June 2017 (UTC)

Category:English terms with unexpected syllabic -ed it is. — Ungoliant (falai) 20:57, 7 June 2017 (UTC)
I don't like the word "unexpected". Is there a better way to exclude -ted and -ded? --WikiTiki89 21:30, 7 June 2017 (UTC)
As a native speaker I certainly expect them or at least allow for their possibility, except for the ones based on misspellings (wretch) or with one of the pronunciations being for a rare word, usually a verb (eg, sacre).
Do I understand correctly that the motivation for the category is that someone who encountered in writing the word being used in a sense which commonly required the separate syllable for -ed would mispronounce it, having insufficient experience hearing it? If so, this seems a much better Appendix than a category. A usage note containing a link to such an appendix would be more useful than a category. I think an appendix would allow for more flexibility (and length) in its title. DCDuring (talk) 00:34, 8 June 2017 (UTC)
I think the usefulness is simply that someone may be interested in seeing a list of words in this category, which is exactly what categories are for. --WikiTiki89 15:27, 8 June 2017 (UTC)
@Wikitiki89 feel free to rename it to anything you think will be accepted. — Ungoliant (falai) 15:23, 8 June 2017 (UTC)

template:univerbation and Category:Univerbations

I think we're missing this. For example, зачем (začem) is not a prefixation, it's really the preposition за (za) with its instrumental regimen чем (čem) (compare French pourquoi). --Barytonesis (talk) 12:05, 2 June 2017 (UTC)

Wouldn't that fall under {{compound}}? -- GianWiki (talk) 11:07, 16 June 2017 (UTC)
Sounds like it would be a subtype of compound: one in which the elements were formerly separate words in a phrase. — Eru·tuon 18:23, 16 June 2017 (UTC)
Sounds like a great idea. It would be useful in the Greek etymon of ephemeral, for instance. — Eru·tuon 18:23, 16 June 2017 (UTC)

Proposal: Clean up, rename and replace "en:" → "English" in all categories

(This obviously would need a vote to be implemented.)

In the past, some old categories like Category:es:Japanese derivations and Category:es:Derogatory were renamed to remove the language code. I think this was an improvement. (related votes: Derivations categories, Lexical categories)

Please check if the grammar is OK everywhere. Feel free to make any corrections, suggest any changes or ask any questions.

See also discussion: User talk:-sche#Properly splitting topic and set categories.

Place names (see also WT:Place names for naming conventions, edit that page if needed)

Proposal: ALWAYS add the country at the end when applicable.

English ... jargon

(to avoid questions like: is "Medicine" for medicine jargon or for terms relating to the medicine?)

English names of ... (for proper nouns?)
English terms relating to ... (or "pertaining to", or "involving", etc.)

Reason for "relating to" -- most or many of the description of these categories use "related to"

--Daniel Carrero (talk) 14:16, 2 June 2017 (UTC)

It seems somewhat ok, hesitantly. My biggest gripe at the moment is that it's not Category:English names of cities in Ontario, Canada. The addition of country names is a definite improvement. I'll abstain for the moment on the "relating to" categories. —CodeCat 14:43, 2 June 2017 (UTC)
Oops! I made a mistake and typed Category:English cities in Ontario, Canada without "names of". You made me realize that, and I fixed it in the list above. --Daniel Carrero (talk) 14:45, 2 June 2017 (UTC)
I don't know that I disagree as such but I think this will cause a lot more problems than it solves and it seems like a huge undertaking for very little benefit. —Justin (koavf)TCM 15:53, 2 June 2017 (UTC)
I see. What problems do you think it will cause? FWIW, see Wiktionary:Votes/2017-03/Request categories 2 for another large category renaming project that was voted and approved, and was successfully implemented. --Daniel Carrero (talk) 15:58, 2 June 2017 (UTC)
@Daniel Carrero: The wording of many of these categories. "English terms referring to [X]" or "English terms related to [X]"? "English words for [Y]" or "English language words for [Y]", etc. As it stands now, the scheme is very straight forward: code:Idea. —Justin (koavf)TCM 16:32, 2 June 2017 (UTC)
The current scheme is also ambiguous about whether a category is for words in a given set, or words related to a topic. This has been causing quite a few headaches lately. Is Category:Stars for names of individual stars, words for types of stars, or any words related to stars? What if we want categories for each of these types? Daniel's scheme at least avoids the ambiguity: Category:English names of stars, Category:English terms related to stars. I'm not sure where words for types of stars would go. —CodeCat 16:36, 2 June 2017 (UTC)
Shall we introduce a new type of category to the proposed list? I'm thinking of this:
There's a 2011 vote that concerned what to do with a few categories, including specifically Category:en:Stars, but the approved rule is not being followed right now. According to Wiktionary:Votes/2011-08/Categories of names 2, Category:en:Stars must contain only names of stars, and never other star-related terms like fixed star, quadruple star, quadruple star system, etc. The category description says the same thing. But the name Category:en:Stars does not help, it could easily contain any of those "forbidden" terms.
Another example of a currently confusing category name: In practice, Category:en:Internet contains two things: terms involving the internet (frame, FTP, e-mail...) and terms used on the internet (FWIW, IOW, IYKWIM...). The latter should actually be in Category:English internet slang, but once again the category name doesn't help -- "en:Internet" could mean either of the aforementioned possibilities. If this proposal passes, Category:en:Internet should be renamed to Category:English terms relating to the internet (I guess the "the" fits here, right?). Feel free to discuss different wordings other than "relating to", but I believe we can always say just "English", never "English language" -- after all, we use Category:English nouns, never Category:English language nouns.
Aside from that, the place name categories need to be cleaned up one way or another, naturally. --Daniel Carrero (talk) 16:53, 2 June 2017 (UTC)
What's the supposed benefit of using language names rather than language codes? Names can be awkwardly long, and are not guaranteed to be unique. Equinox 19:59, 2 June 2017 (UTC)
They are guaranteed to be unique, per WT:LANG. And the benefit is that users can't be expected to learn language codes merely to use Wiktionary. It's an unnecessary barrier. —CodeCat 20:00, 2 June 2017 (UTC)
Yes, they are guaranteed to be as unique in category names as they are in L2 headers. Which means, there could be cases where people assume "Riang" means the Bangladesh language ria when it actually means the Burma language ril, but that shouldn't cause any more headaches with categories than it does with L2 headers. - -sche (discuss) 22:50, 2 June 2017 (UTC)
There are two components to this proposal: changing language code to language name, and changing the names of the categories (the part after the language code or name). Hypothetically the second could be done without the other: that is, there could be a chimeric category name, Category:en:Names of cities in Kyoto Prefecture, Honshu, Japan. I suppose it makes sense to do both at the same time, since (I imagine) thousands of categories will have to be renamed when either change is made. However, some editors might want to keep language codes but have the other part of the category name be changed. (The other option, using language names but not changing the rest of the category name, will not work: Category:en:Sex → Category:English sex is nonsensical.)
On the whole, the category names are clearer, but they will be more difficult to find when one is categorizing, because they are longer. Admittedly, the existing categories are difficult to find too, especially when creating one that doesn't exist for a particular language yet. I wonder if a tool could be created to make this easier. HotCat isn't quite what I'm thinking of. Perhaps something where you could type in a phrase, like "cities kyoto" and find Category:English names of cities in Kyoto Prefecture, Honshu, Japan, or the umbrella category thereof.
Functionally, the only purpose of using language codes rather than names is to distinguish topic or set categories from the categories that use language names (part-of-speech categories especially). If language names are used, the only distinction will be the category names after the language name. So, the two types of categories will be harder to tell apart. — Eru·tuon 21:48, 2 June 2017 (UTC)
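The kind of lookup tool imagined above (type "cities kyoto", get the matching category) could be sketched very roughly as a keyword filter over category names. This is only an assumption about how such a tool might work: the function `find_categories` and the short sample list are hypothetical, and a real tool would need the full list of category names from the site.

```python
# Hypothetical sketch: find categories whose names contain every query word.
def find_categories(query, category_names):
    """Return the names that contain all words of the query, ignoring case."""
    words = query.lower().split()
    return [
        name
        for name in category_names
        if all(word in name.lower() for word in words)
    ]


# Sample names taken from this discussion; not a real category dump.
sample_categories = [
    "Category:English names of cities in Kyoto Prefecture, Honshu, Japan",
    "Category:English names of cities in Ontario, Canada",
    "Category:English names of stars",
]

matches = find_categories("cities kyoto", sample_categories)
print(matches)
```

Because the match is on whole-name substrings in any order, the length of the category name matters less than it does with prefix-based auto-completion.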
Category:English sex is nonsensical, but the colon doesn't have to be deleted, so Category:English:Sex. —CodeCat 22:00, 2 June 2017 (UTC)
I would indeed favour something like that over "sentence" categories like "English terms involving/about/relating to sex". The latter seem a bit wordy and would probably sound actively bad or strange in certain cases. BTW, regarding Daniel's proposal, I think "jargon" is a very poor choice: it is a loaded term, suggesting that this is needlessly complex language; we did in fact remove "jargon" glosses from a lot of entries at one stage. Equinox 22:21, 2 June 2017 (UTC)
I have similar concerns/dislike as Erutuon and Equinox towards the long names. "English names of cities in Kyoto Prefecture, Honshu, Japan" is horribly long, and probably too finely "granular" (but the latter is a separate issue). Long names are harder for users to write when adding or searching for categories.
Language codes have several benefits over language names, including being shorter and not needing to be moved when we rename a language (moving 100 categories is a major hassle which I suppose we can avoid now by mass-deleting the categories and letting bots create the new cats and WikiData sort the interwikis out). OTOH, I understand those who feel they are a barrier for less-adept users.
Can we avoid the unwieldy "sentence" names and make the topic category "English:Foobar", and the set/list category "English:List of foobars"? (Another idea, proposed on my talk page, is "Category:English:topic:Foobar" vs "Category:English:List:Foobar(s)".)
As others have said, we should avoid calling things "jargon". Maybe "terminology" would work. - -sche (discuss) 22:42, 2 June 2017 (UTC)
Re: "Functionally, the only purpose of using language codes rather than names is to distinguish topic or set categories from the categories that use language names (part-of-speech categories especially)." -- It's not been always like this in the past, like in the old categories mentioned in the op (Category:es:Japanese derivations and Category:es:Derogatory). Even if that rule is supposed to be followed now, we have Category:English female given names, Category:English surnames, etc., and if we rename any category to contain "names", "terms", "jargon" or "terminology" in the name, it technically becomes a "lexical" category and will need to start with "English..." as per this rule.
Are codes better than names always, or are names better than codes always? If we had no categories for derivations whatsoever, would we want to create Category:es:Japanese derivations? The proposed category names like Category:English terms related to sex are supposed to be straightforward -- the category contains what the title says, as in a normal English text.
The granularity of place name categories is optional. I proposed Category:English names of cities in Kyoto Prefecture, Honshu, Japan, but it could be Category:English names of cities in Japan, even though I prefer the longer name.
By the way, if we renamed all categories as proposed above we could remove "by language" from all categories. In the current naming scheme, we have this:
If this proposal passes, we could have this, without "by language" anywhere (unless we want to keep the "by language", of course, but that won't be a requirement like today):
Let's compare this: Category:English medicine terminology and Category:English medicine jargon. Is "terminology" really better? I wouldn't like the category to have common terms like disease, heal, doctor, check-up. So I feel that "jargon" is better for the purpose of avoiding these terms.
Additional proposal: we could have one module listing all categories with "names of", another module for "terms for", another for "terms relating to", and another for the list of place names. --Daniel Carrero (talk) 02:16, 3 June 2017 (UTC)

We seem to have many different ideas. Let's have a poll to get a rough tally of whether more people like names vs codes, and whether more people like shorter/condensed or longer/descriptive names. That way, we can see what kind of names we should focus on (like, if most people want descriptive names, then we can focus on deciding whether "pertaining to" or "involving" is better, but if most people want short names, then bikeshedding the format of hypothetical long names would be silly). - -sche (discuss) 22:42, 3 June 2017 (UTC)

Categories for terms which are limited to the "jargon" of medicine (if you are distinguishing them from "terms that pertain to the topic of medicine") are neither "topic" nor exactly "list" categories, IMO, but are in the same vein as categories for terms which are limited to British English. So, I don't think my poll covers them, because I think they should be addressed separately. - -sche (discuss) 22:42, 3 June 2017 (UTC)

There are way too many reasons not to like this, so I'll start with just one: all of these ideas move the interesting information farther and farther to the end of the name "blahblahblahblahblahblahblah fish" in newspaper jargon terminology talk this is known as burying the lede. The way sorting works means we can't rearrange the order, so we need to be as concise as possible in the left-hand parts. Take a look at the massive logjam of categories at the bottom of most multilingual entries. Now imagine it being twice the size without a corresponding increase in the size of the rightmost nodes. Someone mentioned HotCat: the auto-complete feature is going to be pretty much useless if you have to type in pretty much the whole category name before it gets to the part that it can fill in. And all the stuff about descriptive English sentences being easier for people to understand was the main selling point of COBOL- remember COBOL? I'm sure every programmer aspires to someday write code like the immortal "ADD A TO B GIVING C", and I'm sure any kid in grade school could debug three COBOL business applications before breakfast... right? >;p Chuck Entz (talk) 09:07, 4 June 2017 (UTC)
@Chuck Entz: Do you think you could at least agree with me that the current language-code categories need to be cleaned up one way or the other? I'm pretty sure we can't say that the current system is acceptable. One of Wikimedia's values is "We strive for excellence." Place name categories are probably the messiest of all.
I don't mean it in a sarcastic way, it's just a normal question: As you know, I prefer language names instead of codes and I gave my reasons in this discussion. But if language codes were the best option always, shouldn't all language-specific categories like Category:English nouns and Category:Italian terms derived from Ancient Greek be renamed to include language codes instead of names? When we use the auto-complete feature, I believe we have to type at least "Italian terms d" if we want to get Italian terms derived from other languages. Or we just navigate to Category:Italian terms derived from other languages (which is named using normal English text, like the other categories I'm proposing).
Based on this discussion and this vote, I think in 2011 I myself helped to introduce and cement our existing tradition that apparently "lexical" categories have language names and "topical" categories have language codes, but it was just because I wanted to clean up some categories and put forward the proposal of replacing, say, Category:es:Euphemisms with Category:Spanish euphemisms without necessarily having to change the whole system yet.
I don't think the logic of programming languages applies to our categories. COBOL's "ADD A TO B GIVING C" is not better than "A+B=C", but Category:English names of stars has some merits discussed elsewhere in this discussion, as opposed to Category:en:Stars. If we want a category for medicine slang (multileaf collimator? s/p? CINV?) which cannot contain everyday words like heal, disease and doctor, then maybe the word "jargon" (or slang, or terminology) somehow needs to be in the category name.
Just for the sake of discussion, if having short categories with the main idea first is important, I wonder if there's some merit to using names like Category:Medicine (English, jargon), Category:Stars (English, star names) or Category:Stars (English, related terms). I'm not saying I support having these names, I'm just discussing what their merits are. We could also talk about renaming Category:Italian terms derived from Ancient Greek to Category:grc→it and Category:Italian terms derived from other languages to Category:*→it, which is probably one of the shortest options available (not to mention these are shorthands that feel kind of related to programming language logic) where they could still be understood as derivation categories. Again, I'm not trying to be sarcastic. Obviously, I don't think the shortest names are always the better ones. I know "grc→it" is a bad category name (in my opinion, at least), but I'm fine with discussing multiple possibilities. --Daniel Carrero (talk) 10:37, 4 June 2017 (UTC)

Poll 1: Language names vs codes

On a new line, please indicate what you support:

  • (1) language names (like "French") to be used in naming topic categories (like categories for terms pertaining to the topic of religion);
  • (2) language names to be used in set/list categories (like lists of dog breeds); or
  • (3) language codes (like "fr") to be used in naming topic categories;
  • (4) language codes to be used in set categories.
Thus, if you want names to be used in all cases, you can indicate "1&2", or if you want codes in all cases, then indicate "3&4". But if you want a mix like "1&4", you can indicate that.
  1. Abstain for now; both codes and names have both benefits and drawbacks. - -sche (discuss) 22:42, 3 June 2017 (UTC)
  2. 1&2 —CodeCat 22:55, 3 June 2017 (UTC)
  3. Support 1&2.
    As I proposed above and then further commented in my reply to "Poll 2" below, I'd prefer category names that read like normal English, like Category:English terms relating to sex. In "real life", are many people even aware of different ISO codes for each language, not to mention our non-ISO made-up codes like "alv-pro", "nai-dly" and others found in Module:languages/datax?
    Currently, Category:Baseball contains 18 subcategories like "en:Baseball", "fr:Baseball", "ko:Baseball", etc. Sure, we all know what they mean, but navigating them requires learning how ISO handles codes and how we separately handle them. Language codes are gibberish if you don't know what they mean. — Sure, even "I like monkeys." is gibberish if you don't know its meaning, but you get the point. Using normal text with the actual language name should make the category contents immediately obvious to English speakers.
    Categories starting with our specific set of ISO and made-up language codes are an English Wiktionary signature, and therefore a barrier for reuse. Anyone is allowed to copy Wiktionary content - they can create mirrors, books, CDs with it. So, it's better to make the material as "generic" as possible. If a reader is navigating a new site called and finds a category name like Category:gmq-bot:Music, they will probably not understand what "gmq-bot" means. Even if we generously assume that the reader has the privilege of being knowledgeable about our language codes, then it's possible they will be compelled to think "oh so we're using that English Wiktionary system now, I'd better get my list of language codes that they use to start navigating categories like this". By contrast, if we renamed Category:gmq-bot:Music to Category:Westrobothnian terms relating to music, the new name should be sensible anywhere. --Daniel Carrero (talk) 05:07, 4 June 2017 (UTC)

Poll 2: Longer vs shorter names for set categories

On a new line, please indicate what basic type of name you support — for the part of the category that comes after the language name or code:

  • (1) Long descriptive names like "names of municipalities in São Paulo, Brazil" and "names of dog breeds" or "terms for dog breeds". (The precise format, like "names of" vs "terms for", can be worked out next if it's clear that people prefer long descriptive names to short names.)
  • (2) Short names that contain "list" or "set" to distinguish list vs topic categories: like "list:municipalities in São Paulo, Brazil" and "list:Dog breeds" or "set:Dog breeds".
  • (3) Very short names that don't distinguish list vs topic categories: like "municipalities in São Paulo, Brazil" and "Dog breeds".
  1. Prefer (2). Oppose (3) as too ambiguous. - -sche (discuss) 22:42, 3 June 2017 (UTC)
  2. I believe (1) is the best option, because names like Category:English terms relating to sex are to be read as normal English text, like other categories already are. Oppose (2) because the distinction between list, or set, or topic is arbitrary and not immediately obvious. And of course, oppose (3) as too ambiguous. See further comments below.
    If you get all terms relating to Christianity, this is equally a list, a set and the topic of Christianity-related terms. If we choose to pretend these words have unique meanings which clearly set them apart (which they don't), this would be an obvious kludge to avoid category overlap using as few characters as possible. Using "list:", "set:", "topic:" at the start of categories is like using obscure abbreviations to save a few characters. (Some people oppose using template abbreviations like {{der}} instead of {{derived}}, saying that the latter is easier to read. What to do with templates is a separate discussion, but that is a good point. Aside from that, I believe we have a consensus not to use abbreviations like q.v., L., Gr., esp., cf., &c. in etymologies.) If we implemented these arbitrary category prefixes, I fear we would probably have to constantly lecture anons and new editors about what is the correct category prefix to use.
    What I'm saying here is consistent with what I proposed above. The proposed category names are supposed to be read as normal English text. Currently, we have to wonder if categories like Category:en:Stars contains names of stars, and/or types of stars, and/or terms relating to stars. If this proposal passes, we may have this:
    I'm not saying all the three "star" categories above will have to be created, but I'd like any existing category to conform to a naming system like this. The wording may change if people want, naturally.
    In the past, other proposals that made category names read like normal English text were voted and approved:
    --Daniel Carrero (talk) 04:25, 4 June 2017 (UTC)
  3. Prefer (1). Category names should describe exactly what they contain. DTLHS (talk) 05:11, 4 June 2017 (UTC)
  4. 1&2. Oppose 3 as too ambiguous as well. —CodeCat 15:31, 4 June 2017 (UTC)

Poll 3: Longer vs shorter names for topic categories

On a new line, please indicate what basic type of name you support — for the part of the category that comes after the language name or code:

  • (1) Long descriptive names like "terms pertaining to Christianity" or "terms relating to Christianity". (The precise format can be worked out next if it's clear people prefer long descriptive names to short names.)
  • (2) Short names that contain "topic" to distinguish list vs topic categories: like "topic:Christianity".
  • (3) Very short names that don't distinguish list vs topic categories: like "Christianity".
  1. Prefer (2). Oppose (1) as too long (cf Chuck's comments further up this thread), and oppose (3) as too ambiguous. - -sche (discuss) 22:42, 3 June 2017 (UTC)
  2. I believe (1) is the best option, because names like Category:English terms relating to sex are to be read as normal English text, like other categories already are. Oppose (2) because the distinction between list, or set, or topic is arbitrary and not immediately obvious. And of course, oppose (3) as too ambiguous. See further comments in my response to the "Poll 2" above. --Daniel Carrero (talk) 04:25, 4 June 2017 (UTC)
  3. (1) DTLHS (talk) 05:12, 4 June 2017 (UTC)
  • 3: Category titles are supposed to be as short as possible. Purplebackpack89 15:30, 4 June 2017 (UTC)
  1. 1&2. Oppose 3 as too ambiguous as well. —CodeCat 15:31, 4 June 2017 (UTC)


Sorry, I'm being lazy (or "delegating"; there's other stuff I want to focus on). This should probably be a prefix though: we have enviro-friendly, environazi, environut, Enviropig, envirospeak, envirotard. Most currently have blend etymologies. Equinox 00:57, 3 June 2017 (UTC)

Done. —Vorziblix (talk) 03:23, 4 June 2017 (UTC)

June Lexisession: concert

This month's suggested collective task is to take care of concert. You already have a Wikisaurus:musical instrument and a Wikisaurus:musical composition but nothing about the show! Well, in June, there is the Fête de la Musique [World Music Day] and that's enough to plan a new Wikisaurus:concert, isn't it? Also, there are plenty of pictures on Commons to illustrate entries related to musical performances.

Show must go on!

By the way, Lexisession is a collaborative experiment without any guide or direction. You're free to participate as you like and to suggest next month's topic. If you do something this month, please report it here, to let people know you are involved in one way or another. I hope there will be some people interested in playing music. Noé 09:21, 3 June 2017 (UTC)

Where to place {{wikipedia}} templates?

@Atitarev, Cinemantique, Wikitiki89, CodeCat In new entries, I've been putting {{wikipedia|lang=ru}} templates just under the ==Russian== header, but some existing entries put it under the ===Noun=== or similar header. See атеи́зм (atɛízm) for an example where I moved it up. Not sure if this is correct; comments? Also, is there a difference for English-language and foreign-language Wikipedia references? An example where I put both is пого́ст (pogóst); this Russian term has several meanings specific to Russian culture and has an English-language entry under pogost, which is helpful in explaining some of the meanings. Benwing2 (talk) 20:50, 3 June 2017 (UTC)

For English terms I used to place them under the ==English== header too. However, another editor tended to remove and replace them with:
===Further reading===
* {{pedia}}
so that's what I do now. — SMUconlaw (talk) 20:59, 3 June 2017 (UTC)
IMO {{wikipedia}} should go under the language header; I find it a bit messy if it's under the POS, maybe unless it only applies to e.g. the noun section of an entry that lists a verb first, in which case I understand putting it underneath the relevant headword line template. Alternatively using {{pedia}} as Smuconlaw describes is also fine. - -sche (discuss) 01:01, 4 June 2017 (UTC)
The situation with Translingual (taxonomic) entries is very different from that for other languages: the pedia, species, and commons links are on all fours with other external links (which can be numerous). Taxonomic entries often have images so the right hand side can become cluttered and push into other language sections. Thus it makes more sense to me to put the sister-project links under "References".
Something similar applies to entries for English vernacular names of organisms, which often have either images or many sister-project links.
Finally I have the ToC on the right hand side, which further pushes right-hand side content down into other language sections.
I don't know how applicable this is to the use of such links in other L2s. DCDuring (talk) 05:04, 4 June 2017 (UTC)
Thanks for all the comments. Benwing2 (talk) 23:41, 11 June 2017 (UTC)
I'm one of the editors who tends to convert {{wikipedia}} to {{pedia}} in Further Reading. The main reason is consistency: why should Wikipedia links get special treatment? The box is also quite big and takes up a lot of space on the screen, especially when multiple links are stacked. Even the {{wikipedia}} documentation page says: "Consider instead using the inline version of this template". – Jberkel (talk) 06:15, 15 June 2017 (UTC)
I prefer all sister-project-link templates be placed immediately under the language heading. - [The]DaveRoss 13:35, 15 June 2017 (UTC)
Hmm, there's conflicting documentation and practice. Should we have a vote on this? – Jberkel (talk) 21:53, 15 June 2017 (UTC)

What are / were / should be the rules for anagrams in English and other languages?

I would be willing to run a bot to update anagram sections (in which subset of languages?) if I know the rules. "Rules" meaning, which characters to ignore / normalize, minimum length, etc. DTLHS (talk) 01:26, 4 June 2017 (UTC)

  • I assume that the rules are the same for usage of the {{also}} template (that also needs bottifying). SemperBlotto (talk) 05:57, 4 June 2017 (UTC)
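The questions raised above (which characters to ignore or normalize, and what minimum length to require) can be made concrete with a minimal sketch. Nothing here is an agreed rule: the choices to casefold, to strip diacritics, to drop every non-letter, and the `MIN_LENGTH` value are all assumptions made for illustration, and the function names are hypothetical.

```python
import unicodedata
from collections import defaultdict

MIN_LENGTH = 3  # assumption: shorter keys are not worth listing


def anagram_key(term: str) -> str:
    """Return a key shared by all terms that are anagrams of each other."""
    # NFD splits a precomposed letter into base letter plus combining mark.
    decomposed = unicodedata.normalize("NFD", term.casefold())
    # Keep letters only: combining marks, spaces, hyphens and apostrophes go.
    letters = [ch for ch in decomposed if ch.isalpha()]
    return "".join(sorted(letters))


def group_anagrams(terms):
    """Group terms by key; keep only groups with at least two members."""
    groups = defaultdict(set)
    for term in terms:
        key = anagram_key(term)
        if len(key) >= MIN_LENGTH:
            groups[key].add(term)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


result = group_anagrams(["listen", "silent", "enlist", "at", "ta", "dog"])
```

Whether diacritics should be ignored probably has to be decided per language, so a real bot would likely need a per-language version of `anagram_key` rather than this single function.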

French Wiktionary monthly news - Actualités


I am dazzled to inform you that the 26th issue of Wiktionary Actualités just came out in English!

As usual, Actualités is in English but talks about French Wiktionary and lexicography in general.

This time: a focus on proposals for Wikimania 2017 related to Wiktionaries, a presentation of two dictionaries about the body and some words about Guaraní language. There is also a stack of statistics, shorts and a game!

As usual, it is translated into English by non-native speakers, so it is not perfect, but can be improved by readers (wiki-spirit and all). Please note that we do not receive any money for this publication and we translate it because we are eager to read the same kind of publication about your project in the future and be inspired by your projects. Feel free to leave us comments! Noé 10:12, 4 June 2017 (UTC)

Fascinating! Hmm, what is en.Wikt doing that could be reported in such a publication?
Last time I recall us comparing the new words other dictionaries had added to our entries, we similarly already had most of them.
We've been working on templates/scripts that would enable an entry to specify its most recent etymon, and have the script find and display that term's etymon, and that etymon's etymon, to reduce content duplication and dissynchronization.
Efforts to create a module that can automatically transliterate vocalized Hebrew are continuing and may lead to a proposal to Unicode to encode separate codepoints for big and small shvas like big and small qamats.
We've been expanding our coverage of languages that (are not dialects of other languages and) do not have ISO codes (Module:languages/datax).
We've been expanding the number of languages we have referenced/verifiable entries in, using in some cases fr.Wikt's entries (which in many cases were based on en.Wikt's redlinked translations of water, ha).
- -sche (discuss) 02:19, 6 June 2017 (UTC)
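The etymon idea mentioned above (each entry records only its most recent etymon, and a script walks the chain) can be sketched as follows. This is not the actual template or module code; the `ETYMON` table, its placeholder entries, and the function name are hypothetical and exist only to illustrate the lookup.

```python
# Hypothetical data: each (language, term) pair stores only its immediate etymon.
ETYMON = {
    ("lang-c", "gamma"): ("lang-b", "beta"),
    ("lang-b", "beta"): ("lang-a", "alpha"),
}


def etymon_chain(term, etymon_of):
    """Follow immediate etymons until a term with no recorded etymon is reached."""
    chain = []
    seen = {term}
    current = term
    while current in etymon_of:
        current = etymon_of[current]
        if current in seen:  # stop on circular data instead of looping forever
            break
        seen.add(current)
        chain.append(current)
    return chain


chain = etymon_chain(("lang-c", "gamma"), ETYMON)
# chain == [("lang-b", "beta"), ("lang-a", "alpha")]
```

Because every entry stores a single link, correcting the etymology of one ancestor changes what all of its descendants display, which is the reduction in duplication described above.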
Sure, we can report hot topics of en.wikt! Easily, if someone from the project does a summary, just like you did. We tried in some old issues, but it was biased because discussions are split across several pages and we didn't know the names of the participants well enough to get the whole picture.
The comparison with other dictionaries seems to me very different in English than in French, because the editorial choice of French dictionaries is to select only around 60,000 entries, whereas English dictionaries select an average of 100,000. It is purely arbitrary. So in French dictionaries, you will not find any words for technical practices such as leather work, for example. Each year, they delete words from dictionaries to save space for the new ones, and I think it is definitely a plus for French Wiktionary, because we do not, and for readers of "old" books, definitions will only be available in Wiktionary, and no longer in dictionaries! So, French Wiktionary can claim to have better coverage than published dictionaries. I think it is more difficult to communicate on this matter for English Wiktionary. Well, I hope you will prove me wrong.
Templates for etymologies are a big deal. In French Wiktionary, we prefer to write paragraphs of etymology with large compilations of sources, in a similar way to Wikipedia writing. The policies Wiktionnaire:Étymologie and Wiktionary:Etymology show great differences. French Wiktionary promotes long etymologies, including folk etymologies and false ones, with sources. We do not want to have From latin to French in etymology because it is barely false considering the history of the language and the influence of the dialects. We want to trace the path for the forms and the meanings, mentioning regional uses when needed (for example: bataclan). Also, a lot of sources say "obscure origin" when words come from Arabic or Romani (Gitan); we prefer to quote these official sources alongside more recent analyses displaying better data, to show that old official sources can be politically biased. Our etymologies have to be more neutral than old ones. I do not judge one strategy better than the other. I think we need to do both. French Wiktionary needs to develop a template to display schematic trees with the basic history of words, but also to provide plenty of detail with the whole history and controversial hypotheses.
Hebrew and Unicode: Great news! I hope you will publish a press release when it is done!
Expanding coverage for underdescribed languages is great! Is it a global effort or a contribution by a few people? In French Wiktionary, it is mainly done by two people, including Pamputt, for water translations. Noé 08:44, 7 June 2017 (UTC)
  • Wiktionary:Milestones and Wiktionary:News for editors are generally used to announce new things. But they're generally not very interesting. I can't imagine many people going "Wow! How cool! Greek nouns have been reclassified from invariable to indeclinable!!!" -WF.

Wiktionary:Votes/2017-06/Allowing character boxes

Based on Wiktionary:Beer parlour/2017/May#'character info' box, I created Wiktionary:Votes/2017-06/Allowing character boxes. --Daniel Carrero (talk) 09:55, 5 June 2017 (UTC)

Language codes for East, South and West Slavic

I was wondering if it wouldn't be a good idea to create language codes for Proto East, South and West Slavic. They're well established and would make sense to reconstruct. --Victar (talk) 23:15, 5 June 2017 (UTC)

"Proto-East Slavic" is Old East Slavic, is it not? As for West and South, are you sure it's possible to reconstruct a single Proto-West Slavic and a single Proto-South Slavic? --WikiTiki89 23:18, 5 June 2017 (UTC)
I don't know myself what the differences would be between PES and OES, but if there is no distinction, then I don't think we should have a separate level in descendant trees. I do know that PWS has some pretty distinct features. Matasović argues that South Slavic is strictly a geographical grouping, not a genetic clade, so I'm not clear on that. @Benwing, Vahagn Petrosyan? --Victar (talk) 01:36, 6 June 2017 (UTC)
West Slavic is really not that distinct. Most of its features are retentions of things already present in Proto-Slavic. —CodeCat 17:23, 6 June 2017 (UTC)
Having distinct features does not mean it is possible to have a single consistent reconstruction. And OES is Proto-East Slavic. Don't forget that "Proto" doesn't mean reconstruction, but just that it is the ancestor of a language group. In this case OES is the ancestor of the East Slavic languages, so it is Proto-East Slavic. --WikiTiki89 17:33, 6 June 2017 (UTC)
I obviously understand what proto means, but if you follow the point I was making, if you're arguing that PES and OES are identical, which I generally disagree with, then we shouldn't have a level in descendant trees for PES above OES, but instead call that branch Old East Slavic. Otherwise that's like adding a Proto Norse above each entry of Old Norse in PGmc descendant trees. --Victar (talk) 18:03, 6 June 2017 (UTC)
I already do that. I label the line as "East Slavic" but put the OES term on the same line. —CodeCat 18:06, 6 June 2017 (UTC)
That's great, but I haven't seen that to be the case in the entries I've come across, as per my example above. If people are in agreement with that though, I'm satisfied. --Victar (talk) 18:17, 6 June 2017 (UTC)
First of all, why would it be obvious to me what you personally know or don't know? If I want to make sure we are on the same page in terms of terminology, you shouldn't take it personally. Second of all, OES is the ancestor of all East Slavic languages; that makes it by definition Proto-East Slavic. It's not something you can disagree with unless you want to say that OES is not the ancestor of all East Slavic languages (which you could maybe make a case for regarding the Old Novgorod dialect or North Russian, but in that case we probably can't reconstruct a single Proto-East Slavic anyway). If the tree is wrong, it should be fixed; don't impose incorrect descriptions of reality onto reality itself. --WikiTiki89 18:14, 6 June 2017 (UTC)

Manichaean Middle Persian

Manichaean Middle Persian is currently designated as a separate language from Middle Persian, but this isn't the case, as it's simply one of several scripts used. Shouldn't we delete it? @Vahagn Petrosyan? --Victar (talk) 03:29, 6 June 2017 (UTC)

The difference is not just in the script. Manichaean Middle Persian has systematic dialectal differences from Zoroastrian (Book) Middle Persian. For example, to Zoroastrian nd corresponds Manichaean nn, as in Book bnd (band) : Manichaean bn (bann, bond, link); Old Persian rd gives Zoroastrian l but often r in Manichaean, e.g. sāl vs sār ‘year’; Iranian gives Zoroastrian ar but often ir in Manichaean, e.g. mard vs mird ‘man’.
Even if we decide to merge both under Middle Persian, we should keep Manichaean Middle Persian as an etymology-only language. @ZxxZxxZ, what do you think? --Vahag (talk) 07:13, 6 June 2017 (UTC)
I believe some of those differences are due to limitations and idiosyncrasies of the scripts themselves. Note that even though the transcription BMP s’lyn- (to provoke) contains an l, it is pronounced /sārēn-/, as per Cheung. Both alphabets also lacked a full set of vowels. Even so, I think these are minor dialectal changes. --Victar (talk) 08:00, 6 June 2017 (UTC)
As Vahag said the differences are beyond the script, not just Manichaean, but also Zoroastrian (Pazend, Avestan alphabet). I've even read there are differences between the Middle Persian written in Inscriptional Pahlavi and the Middle Persian in Book Pahlavi (which was mostly written in Islamic period and is called "late Middle Persian", as opposed to the "early Middle Persian" of the inscriptions), though I'm not aware of any instances beyond spelling differences (e.g. in the arameograms used). Regarding the s’lyn- instance, it's true, though the letter "l" is also used for l, anyway the instance provided by Vahag is a different case: we know ŠNT was pronounced as sāl in Middle Persian, but it is recorded with r in Manichaean alphabet. I think we should keep them separate. --Z 11:30, 6 June 2017 (UTC)
This family of languages has fluctuated between l and r since the days of IIr. I see more variation in dialects of English than in forms of Middle Persian -- certainly not enough to be called a separate language -- and all these "differences" seem highly predictable to me. @-sche, this strikes me as one of those "splittist" cases you mentioned. --Victar (talk) 14:12, 6 June 2017 (UTC)
On second thought I changed my mind a bit regarding this. --Z 20:37, 6 June 2017 (UTC)
I am open to being persuaded otherwise by you all, who seem to have greater knowledge of this subject than I do, but based on this discussion and from what Wikipedia says, it does sound like we are dealing with dialects. They should of course have separate etymology codes (like Cajun French vs standard French). If l/r variation also exists within one or especially both varieties and not just as a distinction between them, that would suggest it should not be held up as a reason to separate them; likewise, if the appearance or absence of any particular variation is due to the constraints of script! Wikipedia speaks of using the documents which were written in more expressive/conservative scripts to understand the documents written in the other script. (It makes me think of the ISO granting separate codes to hieroglyphic vs cuneiform Luwian.) - -sche (discuss) 04:56, 8 June 2017 (UTC)

Re-add two vote references in Wiktionary:Criteria for inclusion/Well documented languages

In this diff, @Metaknowledge reverted my edit to Wiktionary:Criteria for inclusion/Well documented languages.

I'd like to do the same edit again, where I added two vote references. See the history of the page for further comments from him and myself. --Daniel Carrero (talk) 04:00, 6 June 2017 (UTC)

Addendum: Wiktionary:Votes/2011-04/Sourced policies is the vote where it was accepted to link every piece of text in EL and CFI to their supporting votes through the wiki technique of references. There are too many so-called policies created unilaterally without verifiable consensus. WT:EL and WT:CFI themselves are partially voted and partially non-voted. If we don't link the votes, we can't easily verify the fact Wiktionary:Criteria for inclusion/Well documented languages is thankfully almost 100% voted and approved, with a few unvoted changes concerning Arabic, Irish and Welsh. --Daniel Carrero (talk) 17:22, 6 June 2017 (UTC)

All words in all languages

In the very first paragraph of our main page we have "It aims to describe all words of all languages using definitions and descriptions in English.". This is manifestly false. We do not include all words (we omit some brand names for example) and we do not treat computer programming languages to be languages. There are two main ways we could improve this situation. The first (my preferred option, as I'm sure you know) is to make the statement true - to include all words, in all languages. A second option is either to rewrite the statement as " ... most words of most languages ... " or to follow it with an asterisk that somehow points to a "terms and conditions apply" section. What does everyone else think? SemperBlotto (talk) 04:56, 7 June 2017 (UTC)

What is this platonic definition of "word" that you seek to make us follow? Just because you say brand names are words doesn't mean we have to agree with you. DTLHS (talk) 04:59, 7 June 2017 (UTC)
Of course you don't have to agree with me. But we define brand name as a form of name, and we define name as a type of word. SemperBlotto (talk) 05:04, 7 June 2017 (UTC)
Then you can see that it's totally impossible to "make the statement true" since we will never agree on what a word is. DTLHS (talk) 05:07, 7 June 2017 (UTC)
This of course comes back to what constitutes a "word". If I stub my toe and say, "Yowzawhoawhoawhoa!" then that will certainly communicate some meaning to someone else ("that hurt a lot") but it's not really a word. There are a lot of signed, spoken, and written things which convey meaning but I think that anyone using common sense would realize that no dictionary could ever include all of these phenomena. —Justin (koavf)TCM 05:24, 7 June 2017 (UTC)
"aspirational". --Catsidhe (verba, facta) 05:26, 7 June 2017 (UTC)
Of course we would include "Yowzawhoawhoawhoa" - if it makes it into three different books by three different authors (like "Windows" has done). SemperBlotto (talk) 05:32, 7 June 2017 (UTC)
Sure, and it never will. That doesn't make it more or less of a "word", it's just not a "word" for our purposes. Someone else could rightly call a lot of things "words" which we don't: everyone would have some caveats on what would constitute "every word in every language" and I don't think ours are so unreasonable as to expect them to need a disclaimer on the front page. —Justin (koavf)TCM 05:34, 7 June 2017 (UTC)
For the record, maybe some dictionaries have a broader criteria for inclusion than us. is an English-Japanese dictionary where in addition to "normal" stuff, you can search for some people names, brand names and movie titles, among other things. --Daniel Carrero (talk) 08:58, 7 June 2017 (UTC)
  • Indeed, this is manifestly false: we only include attested words. Even if we relaxed attestation criteria even more, we still have to admit that we do not have an omnicorpus of all utterances of all languages that ever existed on the planet, and therefore, we will necessarily fail to cover some words. This is not just a hypothetical concern; we are very certain that we omit some words for lack of evidence, even though we do not necessarily have to know which words.

    One remedy is to do nothing and read the sentence as a slogan that does not contain the necessary qualifications. Another remedy is to inject "approximately", "basically" somewhere in the slogan. The asterisk mentioned above is also an option. Adding "as long as there is enough evidence", which occurred to me, is not so good since that would address the attestation requirement but not the other requirements and exclusions. --Dan Polansky (talk) 10:08, 7 June 2017 (UTC)

  • We could change
    As an international dictionary, Wiktionary is intended to include “all words in all languages”.
    to
    As an international dictionary, Wiktionary is intended to include basically “all words in all languages”, subject to certain conditions.
    --Dan Polansky (talk) 10:12, 7 June 2017 (UTC)
    I like the proposed change. It's an honest assessment of what words we actually accept. Though technically we also accept phrases, symbols and other things. --Daniel Carrero (talk) 10:24, 7 June 2017 (UTC)
  • How about "a bunch of words in a bunch of languages"? -WF
  • Footnoted motto/slogan? An example:
Our motto, annotated

All¹ words² in³ all⁴ languages⁵

The ordinary-word meaning of this slogan is somewhat misleading. The following notes explain the qualifications:

¹ Not every word is included at all, let alone in a meaningful way. Obviously we haven't gotten around to all of them. Attestation requirements exclude many. Due to the narrowness of our contributor base many languages are unrepresented and many specialized contexts are unrepresented, even in English.
² "Word" can include letters, numbers, symbols, abbreviations, proverbs, idiomatic expressions, some non-idiomatic expressions, clitics, affixes.
³ Some "words²" could fall between languages. A multi-word expression borrowed from a foreign language could be non-idiomatic in its original language and thereby not includable in that language. It may also only be found in italics or quotation marks in running text in other languages, indicating that authors and editors don't think it has entered the lexicon in that language.
⁴ See Vote on Serbo-Croatian.
⁵ Translingual is not a language. Many non-words are better characterized as things. Things that are not words are not part of languages.

This approach works in real life for more-or-less unchangeable statements of great importance, like those in the US Constitution. DCDuring (talk) 13:09, 7 June 2017 (UTC)

How about "Wiktionary: It's complicated." - [The]DaveRoss 13:31, 7 June 2017 (UTC)
It seems to me that it is worth expanding it at Wiktionary:All words in all languages, not as a disclaimer nor a policy but as an introduction to a pillar of Wiktionary. Comments in this thread are worth annotating in a project/essay page, including "it's complicated" :-) --Vriullop (talk) 13:42, 7 June 2017 (UTC)
I created Wiktionary:Votes/pl-2017-06/CFI leading sentence. --Dan Polansky (talk) 16:13, 16 June 2017 (UTC)
I think the slogan is an acceptable aspirational slogan (good word, Catsidhe). Vriullop's idea of a page explaining it is interesting, but might overlap heavily with WT:CFI. It seems to be obvious to many people who comment on it that it is not to be taken to Amelia Bedelia levels of literalism. Even if we started including brand names, book titles, nonces, etc as some Wiktionaries do, we are unable to include all words in all languages, because some words were never recorded by anyone before they passed out of memory (e.g. in the Khazar language, Ciguayo, or Jassic). We are even apparently prevented by law from including all words in all languages because (in previous discussions in which some of our users who are lawyers have participated, it has been noted that) languages like Dothraki probably constitute significant parts of commercial franchises, and it would probably violate copyright if a third-party dictionary like us included all or a substantial number of the words in such a language. Even the most permissive inclusionism will hit hard limits.
Personally, I expect that as we become more complete, we will include codes for over 9000(!!!!1) languages. And we might broadly guesstimate that poorly-attested languages and highly-inflected languages may average out to half a million entries per language, so perhaps our slogan could say we aim "to describe four and a half billion words in nine thousand languages"? ;D - -sche (discuss) 17:44, 16 June 2017 (UTC)

Wiktionary:Votes/2017-06/borrowing, borrowed

Sorry, I'm not sure now is the best time to create this vote since I had created another one two days ago. But we have an ongoing discussion about what to do with {{bor}} in all entries, so I created it anyway. Please check Wiktionary:Votes/2017-06/borrowing, borrowed. There are a few discussions linked there. Feel free to edit the vote or suggest any changes.

Aside from that, I intend to create a new Wikidata vote once the current one ends on June 11. --Daniel Carrero (talk) 14:10, 7 June 2017 (UTC)

category for past forms of verbs used in turn as verbs on their own

A category of these forms may help the learner very much, since if they are not acquainted with these forms, encountering them momentarily confuses understanding. For example, slew may mean "to veer", but it's also the simple past tense of "slay"; likewise, "lay" is a transitive verb, as well as the simple past tense of "to lie, when pertaining to position". I do not know how to overlap categories so that I can get the one I wish to create. --Backinstadiums (talk) 14:12, 7 June 2017 (UTC)

I think it would be useful to have a more general category for overlapping of verb forms. This category could also include set and read, which don't distinguish tense in writing. —CodeCat 14:17, 7 June 2017 (UTC)
I see them as two different cases, the so-called "irregular verbs" being more treated than the one I propose. --Backinstadiums (talk) 14:47, 7 June 2017 (UTC)
Could sb. please teach me how to proceed? --Backinstadiums (talk) 12:53, 8 June 2017 (UTC)

WT:ELE - How to alphabetize languages

ELE dictates that language sections should be in alphabetical order. Some languages have unusual characters in their English names; should they be alphabetized including those characters at face value, or without those characters? By way of an example, what is the correct order of "Ch'orti', Chachi, Cofán"? - [The]DaveRoss 14:16, 7 June 2017 (UTC)

I would say to just ignore the non-letters (from an English point of view). So the order would be Chachi, Ch'orti', Cofán. —CodeCat 14:17, 7 June 2017 (UTC)
I find it hard to believe we haven't had this discussion before somewhere. I support using code-point order sorting, i.e. not ignoring the non-letters, since it's the easiest to implement and it's the state all entries are in now. DTLHS (talk) 14:50, 7 June 2017 (UTC)
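The two orderings proposed so far can be compared on the example names with a short sketch. It is illustrative only: `letters_only_key` is a hypothetical helper, and its rule (decompose, drop everything that is not a letter, compare case-insensitively) is just one possible reading of "ignore the non-letters".

```python
import unicodedata


def letters_only_key(name: str) -> tuple[str, str]:
    """Sort key that ignores apostrophes, diacritics and other non-letters."""
    # NFD turns "á" into "a" plus a combining accent, which isalpha() rejects.
    decomposed = unicodedata.normalize("NFD", name)
    letters = "".join(ch for ch in decomposed if ch.isalpha())
    # The original name is a tie-breaker so that the order is deterministic.
    return (letters.casefold(), name)


names = ["Ch'orti'", "Chachi", "Cofán"]

# Plain sorted() compares strings code point by code point.
by_codepoint = sorted(names)
# → ["Ch'orti'", "Chachi", "Cofán"]  (the apostrophe sorts before "a")

by_letters = sorted(names, key=letters_only_key)
# → ["Chachi", "Ch'orti'", "Cofán"]
```

This sketch would not settle every case: characters with no Unicode decomposition, such as ł, æ or ŋ, pass through unchanged, and modifier letters and click letters count as letters for `isalpha()`, so a complete rule would need an explicit mapping for them.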
If we decide to do something about this issue, let's please update WT:EL#Languages. It just says that languages besides Translingual and English are "in alphabetical order". --Daniel Carrero (talk) 14:52, 7 June 2017 (UTC)
@DTLHS I figured the issue had been resolved and I was just not aware and couldn't find it readily. While I also like using the code-point ordering, I have found that not all entries are in that order (see: A).
@CodeCat If we go that way we will need to settle on what constitutes a non-letter (e.g. accents). Thankfully the set of allowable language names is limited so we can be comprehensive. - [The]DaveRoss 15:05, 7 June 2017 (UTC)
One other note, the translation sections should also follow the same policy, whatever it is. And presumably other sorted lists should have a policy. - [The]DaveRoss 15:47, 7 June 2017 (UTC)
I don't know about the examples you gave but I'd like to simply caution that those typographic characters actually are letters in some languages, e.g. ʻokina in Hawaiʻian. —Justin (koavf)TCM 16:03, 7 June 2017 (UTC)
Right, but our L2 headers, i.e. language names, are in English. We have ==French==, ==German==, and ==Spanish==, not ==Français==, ==Deutsch==, and ==Español==. For that reason, I'm in favor of ignoring things like apostrophes and diacritics when it comes to alphabetizing languages. —Aɴɢʀ (talk) 20:39, 7 June 2017 (UTC)
There is also a tendency to omit apostrophes in English when they don't seem to do anything (compare Mi'kmaq and Mikmaq), so the order might change depending on which spelling of a language we use, which is potentially confusing. Andrew Sheedy (talk) 20:49, 7 June 2017 (UTC)
The list of L2 headers we currently employ includes many non-English characters, perhaps that should not be the case but it is at the moment. I think we are relatively consistent in this regard within a single language, but I might be wrong on that. - [The]DaveRoss 21:04, 7 June 2017 (UTC)
I'm sympathetic to the argument that codepoint ordering is possibly the easiest to maintain (and is the one used by many entries now, due to bots sorting things), but it would not seem to be too difficult to define a more natural order, with Xârâcùù sorted before Xhosa, etc. It does appear as if other references ignore apostrophes, click letters, and diacritics when alphabetizing:
  • The International Encyclopedia of Linguistics lists in this order ’Akhoe, ǀAnda, Deti, ǁGana, Ganádi, ǀGwi, Hadza, Haiǀom, Hietshware, ǂHua (they print it as =/Hua), Juǀ’hoan [...] Nǀu, ǃOǃung, Sandawe, [...] ǀXam, ǁXegwi, Xiri, ǃXóõ.
  • Dalby's Dictionary of Languages sorts Larestani, Lārī, Lashi, Lāsī, Latgalian, [...] Mabwe-Lungu, Mača, Macao, [...] Māhārāshtri, Mahi.
  • The Ethnologue itself has Afrikaans, ||Ani, Birwa, [...] English, ||Gana, Gciriku, |Gwi, Hai||om, Herero, ‡Hua[sic], Ju|’hoansi, Kalanga [...] |Xam, ||Xegwi, ‡Ungkue.
  • Hodge's old Handbook of American Indians North of Mexico has Háami, Hāʼanaʟěnox, Haatse, Háatsü-háno, Habasopis, [...] Hailtsa, Haiʼ‘luntchi, Haiʼmāaxstō, Hai-ne-na-une, [...] Háiokalita, Haiowanni.
I agree that translations should be sorted in the same order.
Looking at WT:LOL, it appears that the list of characters besides A-Z used in language names and alt names (ignoring case) is
á, à, â, ä, ȁ, å, ã, ā : treat as a?
æ : treat as ae?
ɓ : treat as b?
ç, č : treat as c?
ḍ, ḏ : treat as d?
é, è, ê, ë : treat as e?
ɛ : also treat as e?
ğ : treat as g?
 : treat as h?
í, ì, î, ï, ĩ, ī, ɨ : treat as i?
ł : treat as l?
ñ : treat as n?
ŋ : treat as ng? At the moment, all languages with alt names using "ŋ" indeed use "ng" in their canonical names.
ó, ò, ô, ö, ȍ, õ, ō : treat as o?
ɔ, ɔ̃ : also treat as o?
š : treat as s?
 : treat as t?
ú, ù, ü, ũ, ŭ, ų : treat as u?
ŵ : treat as w?
ý : treat as y?
 : treat as z?
ə : what should happen to schwas?
() (as in "Kare (Africa)", "Yao (South America)") : parse as-is i.e. in codepoint order?
- (and , which was used in four alt names I just switched to use hyphens) : parse as-is?
. (as in "Mt. Iraya Agta") : parse as-is? or ignore i.e. discard?
', ʼ, ǀ, ǁ, ǃ, ǂ and ʻ and ˀ, ʔ (and the nonstandard ’, ‡) : ignore i.e. discard?
Note that many of these special characters only appear in alt names (which I included as a decent repository of what special characters might one day appear in canonical names of same languages), and are already normalized as above in their canonical names which we'd be dealing with, anyway! (Perhaps someone else feels like making a list of only those special chars which appear in canonical names.)
- -sche (discuss) 15:42, 8 June 2017 (UTC)
Can we simplify your suggestion to: "Use the natural ordering after removing all combining characters and punctuation, and splitting ligatures."? I am worried about converting similar characters in other scripts to their Latin counterparts, since that way lies incredible complexity and subjectivity. - [The]DaveRoss 19:05, 9 June 2017 (UTC)
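The simplified rule quoted above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name `simplified_key` and the small `LIGATURES` table are invented here, and "punctuation" is taken to mean Unicode general categories starting with P.

```python
import unicodedata

# Assumed ligature table for illustration; extend as needed.
LIGATURES = {"æ": "ae", "œ": "oe"}

def simplified_key(name):
    """Strip combining marks and punctuation, split ligatures, fold case."""
    text = "".join(LIGATURES.get(ch, ch) for ch in name.casefold())
    decomposed = unicodedata.normalize("NFD", text)
    kept = []
    for ch in decomposed:
        category = unicodedata.category(ch)
        if category.startswith("M"):   # combining marks (diacritics)
            continue
        if category.startswith("P"):   # punctuation such as ' ( ) - .
            continue
        kept.append(ch)
    return "".join(kept)

names = ["Xhosa", "Xârâcùù", "Cofán", "Ch'orti'", "Chachi"]
print(sorted(names))                      # raw code-point order
print(sorted(names, key=simplified_key))  # order under the simplified rule
```

Under this key Xârâcùù sorts before Xhosa. One caveat: Unicode classes the click letters (ǀ ǁ ǃ ǂ) and modifier letters such as ʼ and ʻ as letters, not punctuation, so this sketch keeps them; discarding them would need an explicit list.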
My post only spells out all of the diacritical letters that are in use so people can see which letters those are and see if they agree with the proposed normalization; I expect that an actual rule would be phrased in more general terms, yes. For example, much of it can be simplified to "ignore diacritics". Noe links to an article by the person in charge of Glottolog about naming languages, which agrees with replacing (and sorting) ɛ and ɔ as e and o. As for punctuation, do we want to remove it? Suppose we had a language called "Kala (Zimbabwe)", should it be sorted before or after "Kala Lagaw Ya"? - -sche (discuss) 19:26, 9 June 2017 (UTC)
I am certainly not the right person to make the calls about the best course here, I am merely hoping for a general rule if possible rather than a mapping system. - [The]DaveRoss 19:34, 9 June 2017 (UTC)
I think parenthesized qualifiers such as "(Zimbabwe)" should be entirely ignored unless it results in two languages having identical names, and only then should they be used to sort them. In other words, sort it as just "Kala" but if there happens to be another "Kala" then use the qualifiers to determine which goes first. —CodeCat 19:36, 9 June 2017 (UTC)
Wouldn't that be (effectively) just like sorting with the parenthesis left in place? Perhaps there are instances I am not considering. - [The]DaveRoss 19:48, 9 June 2017 (UTC)
Perhaps, but editors can't be expected to know Unicode code points, whereas they can be expected to know English alphabetical ordering. So even if any programmed implementation treats it as you say, a human-readable description of the process would have to describe it more as I did. —CodeCat 19:57, 9 June 2017 (UTC)
As far as I know, parentheses are only used when two languages do have the same name, and then, in almost all cases — the only exception that comes to mind is that we haven't yet added a qualifier to the million-strong language "Yao" just because the tiny, extinct language "Yao (South America)" exists — they are used on both languages. So I suppose we should leave the parentheses as-is and thus sort a hypothetical "Kala (Zimbabwe)" above "Kala Lagaw Ya". A rule that parentheses' contents "should be dropped unless X is true" where X is true 100% of the time would be needlessly confusing, IMO. - -sche (discuss) 20:00, 9 June 2017 (UTC)
Not really, because "Kala (Zimbabwe)" should be sorted before a hypothetical language called "Kalaza". —CodeCat 20:09, 9 June 2017 (UTC)
I think I may understand the source of the confusion. I'm assuming that a space doesn't count for sorting, since it's not an alphabetical character. So "Kala (Zimbabwe)" would be "Kalazimbabwe" for sorting purposes if we didn't take out the parenthetical part. Or, to take two real examples, I'm saying that "Tokelauan" would come before "Tok Pisin". —CodeCat 20:16, 9 June 2017 (UTC)
Aha, that's a place our assumptions differed; I assumed spaces would be counted. Poking around other reference works, I see that G. Cinque's Typological Studies: Word Order and Relative Clauses sorts "Tokelauan, Tok Pisin" (in the alphabetical index), while J. Lynch's Pacific Languages: An Introduction sorts "Tok Pisin, Tokelauan". I'm not sure which is better. But independent of whether or not spaces are counted, I would never sort "Kala (Zimbabwe)" as "Kalazimbabwe". And if we both want "Kala (Zimbabwe)" above "Kalaza", isn't the simplest way to obtain that to treat the parentheses as parentheses, which are sorted ahead of alphabetic characters? - -sche (discuss) 20:49, 9 June 2017 (UTC)
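The effect of counting or dropping spaces and parentheses can be checked with a small Python sketch. "Kala (Zimbabwe)" and "Kalaza" are the hypothetical names used in this thread, and `letters_only_key` is an invented helper.

```python
def letters_only_key(name):
    """Hypothetical key: compare letters only, ignoring spaces and parentheses."""
    return "".join(ch for ch in name.casefold() if ch.isalpha())

kala = ["Kalaza", "Kala Lagaw Ya", "Kala (Zimbabwe)"]
tok = ["Tokelauan", "Tok Pisin"]

# Code-point order: space (U+0020) and "(" (U+0028) sort before all letters.
print(sorted(kala))                        # ['Kala (Zimbabwe)', 'Kala Lagaw Ya', 'Kalaza']
print(sorted(tok))                         # ['Tok Pisin', 'Tokelauan']

# Letters only: "Kala (Zimbabwe)" is compared as "kalazimbabwe".
print(sorted(kala, key=letters_only_key))  # ['Kala Lagaw Ya', 'Kalaza', 'Kala (Zimbabwe)']
print(sorted(tok, key=letters_only_key))   # ['Tokelauan', 'Tok Pisin']
```

Leaving spaces and parentheses in place gives the "Kala (Zimbabwe)" above "Kalaza" result that both sides want, and it yields "Tok Pisin, Tokelauan" for the real pair.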
Here's a list of only those characters that are currently used in canonical names, i.e. the ones we'd actually have to sort right now: [a-z], (space), - (hyphen), ' (apostrophe), . (dot), () (parentheses), á à â ä ã å ç é è ê ë í ì î ï ñ ó ò ô ö õ ú ù ü (diacritics, which could be handled by a rule "treat letters with diacritics the same as their base letters"), and ǀ ǁ ǃ ǂ (click consonants, which could be handled by a rule "treat click consonants as if they are not there"). It's possible that we should even remove some of those from the canonical names themselves, i.e. rename the languages. - -sche (discuss) 20:49, 9 June 2017 (UTC)
Should hyphens be dropped? For example, how should Yan-nhangu, Yangkam, Yanomámi be sorted? - -sche (discuss) 22:52, 9 June 2017 (UTC)
If we are doing this, I suggest that we create a module function that outputs a list of every language name in whatever internal order we decide on. Bots can read that page and order language sections and translations accordingly. DTLHS (talk) 22:57, 9 June 2017 (UTC)
My suggestion- four classes of characters:
  1. Basic English letters
  2. Basic English letters with diacritics
  3. Non-English letters with no English counterpart (most or all of them clicks and glottal stops)
  4. Punctuation
Perform the following transformations to produce the sort key, in the order given:
  1. Convert apostrophes to one of the other glottal-stop characters so they won't be treated as punctuation.
  2. Convert all punctuation to spaces and then convert multiple spaces to single spaces.
  3. Prefix all letters having diacritics with the corresponding basic English letter.
  4. Swap non-English/no-counterpart letters with the following letter so the following letter comes first. If there are multiple such letters, swap all of those letters next to each other as a group.
This has the advantage of having things sorted first by basic English letters, but having the order of the diacritics followed as well, and having the order of the "ignorable" non-English/no-counterpart letters followed, too. Since spaces are sorted before basic English letters, that also honors the principle that "nothing comes before something".
I'm not positive about the second transformation, since that will mean "Abc (def)" will be the same as "Abc def", but it's something to start with- tweaking is welcome. Chuck Entz (talk) 03:55, 10 June 2017 (UTC)
@Chuck Entz, can you elaborate on transformation number four? - [The]DaveRoss 17:52, 16 June 2017 (UTC)
Sure. The idea is that there should be a basic English letter before other characters for sorting purposes, with the others following it to distinguish between cases distinguishable only by those other characters. "Swapping" isn't really the best choice of words: what I mean is that the first basic English letter following one or more non-English-no-counterpart characters should be moved in front of them. Now that I've had a chance to think about it, maybe it would be better to ensure that the diacriticed letter goes with it, probably by moving the fourth transformation before the third, and clarifying that both basic English letters and diacriticed letters should be treated the same by this new third transformation. Thus 'ábçd would become aá'bcçd. Using these rules on -sche's examples below: Gadang→Gadang, Ga'dang→Gad'ang, and Madi→Madi, Ma'di→Mad'i. Chuck Entz (talk) 21:53, 16 June 2017 (UTC)
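One possible reading of the revised transformations (move first, then prefix) is sketched below in Python. It is an interpretation, not a settled specification: the name `entz_key`, the choice of U+02BC as the glottal-stop stand-in for apostrophes, and the exact membership of the "no counterpart" class are all assumptions made for illustration.

```python
import re
import unicodedata

GLOTTAL = "\u02bc"  # MODIFIER LETTER APOSTROPHE, assumed stand-in for apostrophes
# Assumed class 3: ʼ ʻ ˀ ʔ and the click letters ǀ ǁ ǂ ǃ
NO_COUNTERPART = set("\u02bc\u02bb\u02c0\u0294\u01c0\u01c1\u01c2\u01c3")

def base_of(ch):
    """Return the first code point of the canonical decomposition."""
    return unicodedata.normalize("NFD", ch)[0]

def is_english_based(ch):
    """True for A-Z/a-z and for letters that decompose to one of them."""
    base = base_of(ch)
    return base.isascii() and base.isalpha()

def entz_key(name):
    text = unicodedata.normalize("NFC", name)
    # 1. Apostrophes become a glottal-stop letter so they survive step 2.
    text = text.replace("'", GLOTTAL).replace("\u2019", GLOTTAL)
    # 2. Punctuation becomes a space; runs of spaces collapse.
    text = "".join(" " if unicodedata.category(c).startswith("P") else c for c in text)
    text = re.sub(r" +", " ", text).strip()
    # 3. Move the letter that follows a run of class-3 characters in front of the run.
    moved, i = [], 0
    while i < len(text):
        if text[i] in NO_COUNTERPART:
            j = i
            while j < len(text) and text[j] in NO_COUNTERPART:
                j += 1
            if j < len(text) and is_english_based(text[j]):
                moved.append(text[j])
                moved.extend(text[i:j])
                i = j + 1
            else:
                moved.extend(text[i:j])
                i = j
        else:
            moved.append(text[i])
            i += 1
    # 4. Prefix each letter carrying a diacritic with its base letter.
    out = []
    for ch in moved:
        base = base_of(ch)
        out.append(base + ch if base != ch and is_english_based(ch) else ch)
    return "".join(out)

print(entz_key("'ábçd"))                               # aáʼbcçd
print(sorted(["Ga'dang", "Gadang"], key=entz_key))     # ['Gadang', "Ga'dang"]
```

With these assumptions the key for Ga'dang is Gadʼang, so Gadang sorts first, because the plain letter "a" has a lower code point than the glottal-stop letter.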
...? Why is having to sort Mad'i any better than having to sort Ma'di? It seems like it would be neater to say: ignore apostrophes (clicks, and diacritics) when sorting languages on the page, but if that causes two or more languages to have the same name, then sort those two or more amongst themselves with the apostrophes present (left where they are).
Should spaces also be subjected to such a process (resulting in "Tokelauan, Tok Pisin"), or left alone? I tend to think spaces should be left in per the "nothing comes before something" principle you mention (so, "Tok Pisin, Tokelauan"), at which point there's no reason to remove the parentheses (and indeed, removing them would only make things more complicated and difficult), since leaving them in ensures that a hypothetical "Foo (Bar)" comes before "Foo Bar", which seems appropriate because bare "Foo" should also come before "Foo (Bar)". - -sche (discuss) 04:09, 17 June 2017 (UTC)
You raise an important point, that some language names would be identical if diacritics and special characters were removed. These include gdk Gadang and gdg Ga'dang, and grg Madi and mhi Ma'di. What order should these be in: "Ga'dang, Gadang" or "Gadang, Ga'dang"? - -sche (discuss) 18:20, 16 June 2017 (UTC)
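The tie-breaking idea raised in this thread (ignore apostrophes, clicks and diacritics first, and only compare the untouched names when that produces a tie) maps naturally onto a two-part sort key. The Python sketch below is illustrative only; `primary`, `tie_break_key` and the `IGNORED` set are invented names, and the secondary comparison is assumed to be plain code-point order.

```python
import unicodedata

# Assumed set of characters ignored in the primary comparison:
# ' ’ ʼ ʻ ˀ ʔ and the click letters ǀ ǁ ǂ ǃ
IGNORED = set("'\u2019\u02bc\u02bb\u02c0\u0294\u01c0\u01c1\u01c2\u01c3")

def primary(name):
    """Name with case folded, diacritics removed and ignored characters dropped."""
    decomposed = unicodedata.normalize("NFD", name.casefold())
    return "".join(
        ch for ch in decomposed
        if ch not in IGNORED and not unicodedata.combining(ch)
    )

def tie_break_key(name):
    # The second element is consulted only when two primary keys are equal.
    return (primary(name), name)

names = ["Madi", "Gadang", "Ma'di", "Ga'dang"]
print(sorted(names, key=tie_break_key))
# ["Ga'dang", 'Gadang', "Ma'di", 'Madi']
```

With a code-point tie-break the apostrophe form comes first ("Ga'dang, Gadang"); getting the opposite order would require a different secondary key, which is exactly the open question.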

Proposal: a page to centralise the patrolling effort

For some four years we’ve been unable to keep up with the rate of unpatrolled changes. This means that a lot of inadequate edits, and sometimes even vandalism, get through. I’ve been trying to think of ways to improve the efficiency of our patrolling, because Special:RecentChanges is very hard to use: it lists all unpatrolled edits in all languages and all areas, but an individual patroller has the knowledge to patrol perhaps 10% of them. For example, right now most recent unpatrolled edits are changes to Hungarian entries and the addition of Galician, Ukrainian and Italian translations. I have enough knowledge to verify whether the Galician translations are good, and I could look up some published dictionaries to check the Italian and Ukrainian translations (but other users could do it faster and better), and I can’t possibly hope to check the content of the Hungarian entries (just the formatting).

In addition to the raw recent changes page, we could have a page with a list of users with unpatrolled edits, separated by language and topic. For example, if I am patrolling the recent changes and come across a user adding Japanese etymologies, I would add a new item to section ==Japanese==, subsection ===Etymology=== on this page, with a link to the user’s contribution page and perhaps an explanation as to why I think their contributions need special attention. Eventually a patroller who is more proficient in Japanese etymology will see this link and check the contributions. In order to keep the patrolling process invisible, as it already is, this page should have a mechanism to prevent users from being pinged (WT:VIP has such a mechanism, if I remember correctly). The advantages that such a page might bring include:

  • Encourage users who don’t have the time or patience to go through Special:RecentChanges to patrol.
  • Encourage patrollers to delegate edits to someone who feels more confident in the language and topic.
  • Provide a place where the correctness of someone’s edits can be discussed (like a pre-WT:RFC, without implying that their edits need to be cleaned up)
  • Make it easier to identify sockpuppets and patterns of odd behaviour.
  • Prevent unpatrolled edits from being lost in the Recent Changes limbo.

Ungoliant (falai) 16:38, 7 June 2017 (UTC)

This is a fantastic idea. I think that the problem of a piling up of unpatrolled edits has not been due to the difficulty of patrolling so much as to the fact that not that many people are patrolling. This kind of page could help make clear to admins who don't patrol as much how they can help out in the common effort. —Μετάknowledgediscuss/deeds 18:39, 7 June 2017 (UTC)
I think that this collation of edits by topic and language could be done mostly automatically if anyone wants to do it. DTLHS (talk) 18:41, 7 June 2017 (UTC)
I am not sure I am understanding the suggestion exactly, but it does seem like it could be useful. Is there any way you could mock something up which demonstrates what you are suggesting (even if in a limited way)? I think I would support this effort even if that wasn't possible, I am more curious about what the implementation might look like and be capable of. - [The]DaveRoss 13:27, 8 June 2017 (UTC)
@TheDaveRoss mockup. — Ungoliant (falai) 15:22, 8 June 2017 (UTC)
Thank you for that, it isn't exactly what I expected but it makes a lot of sense now. This seems like a great way to collaborate. - [The]DaveRoss 15:38, 8 June 2017 (UTC)
What did you have in mind, Dave? — Ungoliant (falai) 15:59, 8 June 2017 (UTC)
For some reason I pictured something which showed the edits to be patrolled, and I couldn't think of how that might work (without a lot of fancy coding). I get now that it is more of a WT:VIP analog. - [The]DaveRoss 17:21, 8 June 2017 (UTC)
@Ungoliant MMDCCLXIV: I patrol changes to existing Hungarian entries almost every day by checking my Watchlist. Since it doesn't show new entries, those are harder to find. It would be great to have a better method. We had a Recent Changes list by languages a long time ago, that worked well. I'm not sure what technical challenges it presents. --Panda10 (talk) 18:45, 8 June 2017 (UTC)

Appendix: Easily confused Chinese words

Hi, regarding the concept of Chinese anagram, I think it would be of great help to create an appendix similar to Easily_confused_Chinese_characters but for "Easily confused Chinese words", which in theory could be easily created from a corpus of words, just selecting those with the same number of the same characters yet in different positions. Furthermore, I'd like to know how the concept of anagram can be used for characters themselves, transposing radicals or even strokes. --Backinstadiums (talk) 13:14, 8 June 2017 (UTC)
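The selection step described above (words made of the same characters in different positions) can be sketched in a few lines of Python by grouping words on their sorted characters. The word list and the function name `confusable_groups` are illustrative assumptions; a real run would read a full corpus or headword list.

```python
from collections import defaultdict

# Illustrative word list only.
words = ["蜜蜂", "蜂蜜", "牛奶", "奶牛", "故事", "事故", "中文"]

def confusable_groups(word_list):
    """Group words that consist of the same characters in a different order."""
    groups = defaultdict(list)
    for word in word_list:
        signature = "".join(sorted(word))  # same multiset of characters -> same signature
        groups[signature].append(word)
    # Keep only signatures shared by at least two words.
    return [group for group in groups.values() if len(group) > 1]

for group in confusable_groups(words):
    print(" / ".join(group))
```

This handles whole characters only; transposing radicals or strokes would need character decomposition data, which this sketch does not attempt.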

Wikidata precautionary principle

Once the Wikidata vote ends, there's a chance we'll get Wikidata installed here.

If that happens, what do you think of restricting its use by implementing the rule below?

"Any and all edits using Wikidata shall be reverted on sight if they were not discussed or voted before."

Or we might demand all Wikidata uses to be voted, not just discussed. --Daniel Carrero (talk) 14:31, 8 June 2017 (UTC)

I agree that it is right to be cautious in any implementation of data from Wikidata, especially when that data will be presented directly to the user. I would suggest that, at least to begin with, any use of Wikidata data which is non-controversial (e.g. using the mapping between ISO codes and language names [in as much as they conform to our current use]) be discussed publicly and agreed upon, and any potentially controversial use (e.g. including a Wikidata identifier [Q12345] as a template parameter; presenting Wikidata data directly) be subject to a vote. - [The]DaveRoss 14:45, 8 June 2017 (UTC)
I'm not sure the language code→name mapping is 100% noncontroversial. I've been thinking it might be a good idea to create a separate vote with this proposal: "Moving all the data from Category:Language data modules, Category:Dialectal data modules, Module:families/data, Module:scripts/data and Category:Unicode data modules to Wikidata." --Daniel Carrero (talk) 15:01, 8 June 2017 (UTC)
Ehh. (Or maybe: LOL.) Treatment of languages (as dialects or macrolanguages, etc) is rather complex, and not a good candidate for quick Wikidata-fication. Treatment of ISO- as well as exceptionally- coded lects as separate or as dialects of one unit varies a lot between wikis; we merge even some lects that have separate wikis, like the Serbo-Croatian lects. Names also vary a lot not just between wikis of different languages (which obviously use their own native names for things), but also within different wikis that use the same language, if wikis have different priorities with respect to e.g. calling each language by its native name, or by the name that is most common in references on the language, vs calling it by a name that distinguishes it from other languages with the same name: hence we have "Austronesian Mor" and "Mbo (Congo)", where another wiki might prefer "Mor (Austronesian)" or "Mbo (Democratic Republic of the Congo)", or might even think the name should be plain "Mor" and damn the torpedoes.
Other things that pertain to languages also vary between wikis, e.g. some wikis might consider that a language that is documented by linguists in the Latin script, or in Cyrillic, but that has no natively-used script should not be said to have "Latn"/"Cyrl" as a script, whereas other wikis might feel otherwise. Since our templates need to know what scripts a language is written in so as to know whether it needs a transliteration, and since our CSS applies fonts based on that information as well, we wouldn't want other folks to pull the rug out from under us as to what script a language had.
Even family information is the source of both disagreement (is Tibeto-Burman different from Sino-Tibetan? is Finno-Ugric different from Uralic? etc) and different priorities (some wikis might want comprehensive family trees that listed every node; for us, they would be too finely granular and would ghettoize languages and derivations into tiny categories hidden at the ends of long trees).
To move only some language names / script infos / etc to Wikidata and handle others locally would be inefficient and unwise, IMO. And in order to move them all, we would need "infrastructure" to be added there that would handle e.g. "is called X on wiki Y", at which point, we'd just be doing the same thing we're doing well here, but doing it there for some reason, which seems inefficient and unwise.
IMO, treatment of languages is so central to what each Wiktionary does that, especially for a wiki with as many active linguistically-adept and technically-adept editors as en.Wikt, it makes sense to do it locally.
Peripheral things, things that are not core to our mission, are better candidates for moves: e.g., parsing that a certain city is in a certain county/province/etc in a certain country on a certain continent, or (re the recent discussion of eponyms) parsing that a certain person has a certain nationality. - -sche (discuss) 16:34, 8 June 2017 (UTC)
Here's an idea: if Wikidata allows it, each language could have an "English Wiktionary name" property. We would use "Austronesian Mor" even if for other purposes the same language is called by other names. --Daniel Carrero (talk) 16:54, 8 June 2017 (UTC)
There is a strong argument for bringing ourselves more into compliance with international standards where we can, but also I disagree that managing by exception would be overly complex. There are relatively few people who have any idea how/where to modify the existing language mapping structure, it is both obfuscated and has significant functionality issues (see water, man). As Daniel suggests there are also other paths for this (and virtually every) problem, e.g. creating new properties which would be controlled by this community. - [The]DaveRoss 17:01, 8 June 2017 (UTC)
I'd like to say something again about compliance with standards, which is important. I'm pretty sure all or most language codes listed at Module:languages/datax are not in compliance with ISO. I would support changing them all to be ISO-compliant, but some people may oppose doing that. (this was one of the points discussed at Wiktionary:Beer parlour/2017/February#Proposal: Implementing Wikidata access) --Daniel Carrero (talk) 17:10, 8 June 2017 (UTC)
  • The sentence in quotes is something that I would enthusiastically support. It would certainly assuage a lot of my (and others') personal fears with respect to Wikidata. —Μετάknowledgediscuss/deeds 15:19, 8 June 2017 (UTC)
I support requiring a vote for each distinguishable use case. DTLHS (talk) 16:47, 8 June 2017 (UTC)

Wiktionary:Votes/2017-05/Installing Wikidata passed. I created Wiktionary:Votes/pl-2017-06/Wikidata precautionary principle to implement the principle proposed here. In the vote, I rewrote the proposed text to be a bit longer and more policy-like in my opinion, but the idea is 100% the same as proposed here. Feel free to edit the vote or suggest any changes. --Daniel Carrero (talk) 00:30, 11 June 2017 (UTC)

1000 Middle Dutch entries!

With the appropriately-chosen fêeste, there are now 1000 entries for Middle Dutch. Some of them are alternative forms, but this is compensated by some entries that have more than one lemma. —CodeCat 20:52, 8 June 2017 (UTC)

Impressive! Our coverage of languages like this is an asset. - -sche (discuss) 19:16, 9 June 2017 (UTC)

Borrowed descendants

Is there a standard format for words whose descendants are all borrowed? Case in point, French hangar. Related, I wonder if we need a variant of {{see desc}} for examples like Frankish *haimgard that instead reads (see there for further borrowings). --Victar (talk) 00:53, 9 June 2017 (UTC)

What do you care for most? What are you concerned with? Take part in the strategy discussion


The more involved we are, the more ideas or wishes concerning the future of Wikipedia we have. We want to change some things, but other things we prefer not to be changed at all, and we can explain why for each of those things. At some point, we don’t think only about the recent changes or personal lists of to-dos, but also about, for example, groups of users, the software, institutional partners, money!, etc. When we discuss with other Wikimedians, we want them to have at least similar priorities that we have. Otherwise, we feel we wasted our time and efforts.

We need to find something that could be predictable, clear and certain to everybody. A uniting idea that would be more nearby and close to the every day’s reality than the Vision (every human can freely share in the sum of all knowledge).

But people contribute to Wikimedia in so many ways. The thing that should unite us should also fit various needs of editors and affiliates from many countries. What’s more, we can’t ignore other groups of people who care about or depend on us, like regular donors or “power readers” (people who read our content a lot and often).

That’s why we’re running the movement strategy discussions. Between 2019 and 2034, the main idea that results from these discussions, considered by Wikimedians as the most important one, will influence big and small decisions, e.g. in grant programs, or software development. For example: are we more educational, or more IT-like?

We want to take into account everybody’s voice. Really: each community is important. We don’t want you to be or even feel excluded.

Please, if you are interested in the Wikimedia strategy, follow these steps:

  • Have a look at this page. There are drafts of 5 potential candidates for the strategic priority. You can comment on the talk pages.
  • The last day for the discussion is June 12. Later, we’ll read all your comments, and shortly after that, there’ll be another round of discussions (see the timeline). I will give you more details before that happens.
  • If you have any questions, ask me. If you ask me here, mention me please.

Friendly disclaimer: this message wasn't written by a bot, a bureaucrat or a person who doesn't care about your project. I’m a Polish Wikipedian, and I hope my words are straightforward enough. SGrabarczuk (WMF) (talk) 11:01, 9 June 2017 (UTC)

Yikes, the use of marketing jargon here is horrible, the opposite of wikis' origins as common-sense technical tools. I think I agree with the Germans. Equinox 18:50, 9 June 2017 (UTC)
@Equinox: Excellent thanks for the link! --WikiTiki89 19:03, 9 June 2017 (UTC)
I wish they would keep us out of their propagandistic marketing agendas. --Victar (talk) 19:08, 9 June 2017 (UTC)
This isn't marketing. I'm not a marketer. And I'm not 'we'. Tell me you don't care about anything beyond English Wiktionary, fine, I'll understand (there are many users who 'just edit'), but don't imply I do things that I don't. For me, it's an utter lack of WT:AGF. In other words: are you interested in the movement strategy? do you have any questions? remember: questions to me personally, not to the entire WMF. And please, see my user page and read who I am. SGrabarczuk (WMF) (talk) 21:16, 10 June 2017 (UTC)
I apologize, SGrabarczuk (WMF), the principle of WT:AGF has never been very popular on en.wiktionary. Don't take it personally, though. Plenty of our first-time visitors get jumped on. Think of it as baptism by fire. —Stephen (Talk) 22:01, 10 June 2017 (UTC)
Stephen, I appreciate what you wrote, however, I'm not new at all. I got used to a daily MMA situation when my opponents get medieval on my arguments, and I can do the same with their words. That's a 'normal' wiki-way, but only when one doesn't like the opponent, or when one talks to someone from the other side (a newbie, a WMF staffer). But I'm not one of them, and there's no reason not to like me in advance. I, like 'Red' Redding, may get your community sth from the other side, provided I'm asked civilly. That propaganda, I wrote it myself. I tried to avoid corporate speech, added a friendly disclaimer, but yeah. Cheers. SGrabarczuk (WMF) (talk) 00:52, 11 June 2017 (UTC)
SGrabarczuk, one of our admins has indefinitely blocked w:Jimmy Wales himself. We are nothing if not ecumenical abusers. —Stephen (Talk) 20:58, 16 June 2017 (UTC)
No, nobody is attacking you for no reason. Stuff like "movement strategy ecosystems and actors" is opaque marketing-style jargon from the commercial world, not to be trusted on an open project. Equinox 21:05, 16 June 2017 (UTC)
@Equinox, SGrabarczuk (WMF) is Polish, and his English can be expected to be a bit off. It likely reflects the sort of English texts that he often reads. He indicates that his English level is en-3, so judgments such as "opaque marketing-style jargon from the commercial world, not to be trusted on an open project" are unfair, inappropriate, and very likely incorrect. —Stephen (Talk) 23:08, 16 June 2017 (UTC)
@Equinox, I bet both of us know well how to deal with the History tab. You can see that I didn't write that. You're welcome to react, for example by telling (err, where did you find this?) insert_the_author's_nick_here not to use that sort of words. I'll agree! I know that most users assume that guy has a (WMF) in his signature, so he's co-responsible for all that rubbish, but that's an incorrect, simplifying, and not exactly fair assumption. I'm not a boss, thus I'm responsible only for myself. SGrabarczuk (WMF) (talk) 22:05, 16 June 2017 (UTC)
@Stephen G. Brown Really, I always found en.Wikt to be very manageable and relaxed in terms of good faith. Make a round on German Wikipedia (and German editors here...). Quick to assume agendas and deceit and not as shy to say so. (Not excluding myself, I quietly accuse a lot of them of agenda pushing.) @SGrabarczuk (WMF) I agree that the headlines sound like suit buzzwords. The actual explanations are valid topics that might benefit the Wiki projects. I think editors, certainly here, have little interest in diverting time and attention to the peripherals of what they actually like doing. En.Wikt is, from what I see, populated not by people who wanted to be part of a commune and better the world but by people who use dictionaries a lot, do linguistics a lot or have a pet language they would like to propagate. Small wik-, big -tionary. Even the participation in these here talk pages is minuscule. This means for the 'other side' that things have to be made with the awareness/anticipation that they will be judged (and most likely discarded) after a cursory glance, so things like headlines should be very plain and on point, even if it doesn't have much ring as a motto. And to give you at least some form of feedback: The only things I ever see in these discussions which people would need other than 'more editors making more edits' are technical things. So maybe explicitly asking what's needed in that regard would get a better reply. Really just a guess, though, I don't deal in the Grease Pit. Korn [kʰũːɘ̃n] (talk) 22:45, 16 June 2017 (UTC)
@Korn, read the comments in this discussion by Equinox. Not what I would call manageable and relaxed in terms of good faith. —Stephen (Talk) 23:08, 16 June 2017 (UTC)
@Stephen G. Brown Might be personal bias as I'm partially affected, but I still find "Marketing jargon isn't to be trusted in an open project" less venomous than the things I find on the talk pages of Wikipedia, and I don't think that kind of attitude is common here towards editors editing. I could only tell you one person here who ever showed an actual lack of good faith [read: was instantaneously making blatant accusations] in dealing with such situations and it's a German. Even Wyang and CodeCat only think of each other as stubborn idiots, rather than people with ill intent. As an example of actual bad faith: My worst personal encounter with not-good-faith was moving an article called 'German dialects', dealing in equal parity and no little detail with Dutch, Low German and German dialects, to 'Continental West-Germanic dialects'. I was, IIRC, then accused of doing that for vandalism, by people who had never heard the term, and the fact that I had made a typo, as I do, and had to move the article twice, was taken as evidence that I was covering up my villainy. The article was then moved back and butchered from a perfectly good article about Continental West-Germanic into a disgrace about 'German'. Now that is real lack of good faith, the important kind, which affects the projects' contents. And I never saw that kind happening here. (Feel free to direct me to examples.) Equinox' distrust, while not a good sign of project spirit, doesn't actually make Wiktionary any worse to the user than it is, nor does it drive away able, good-willed editors. And I don't think he's actually thinking WMF is out to worsen the projects. I read it more as an accusation of being out of touch. Korn [kʰũːɘ̃n] (talk) 09:00, 17 June 2017 (UTC)
@Korn, yes, not everyone treats others that way. The number of abusive editors seems to have been decreasing over the years, very gradually. Nowadays, I think there are only a couple of them left. In the old days, half of our editors seemed to be looking for any reason to jump down someone's throat. The atmosphere was tense. It used to be common for some admins to block other admins over minor or weird reasons. A few years ago, a certain admin blocked a new editor for one year because he transcribed a Persian word using /sh/ instead of /š/. The situation is improving, but it is still nothing to brag about. —Stephen (Talk) 09:22, 17 June 2017 (UTC)
It's no insult. All the propaganda and stuff coming from "the Wikimedia foundation" uses a completely different tone and context than the ENTIRE rest of the Wiktionary site. I see it as inappropriate. Contributors here may be involved in the Wikipedia project or other WM projects as well, but are not necessarily and many aren't. Wikipedia and Wiktionary are two completely separate communities, other than the fact that the two often link to each other on a lot of pages, often share contributors, and that they're funded by the same foundation. In my opinion, besides possible funding issues, if Wiktionary ever broke off from its relations with WMF, there would be no differences. The people, the contributors, the tone, expectations of contributors, style of editing, etc., would all be the same. It is my opinion that the ads need to stop coming to places like the Beer Parlour because it's simply not a concern or priority of ours. PseudoSkull (talk) 10:43, 17 June 2017 (UTC)
Really, now? You're attacking the people who do the friendly work of making sure this project stays online because they offer you the choice to participate in the decisions of what they do with the funds they raised? Which you can just ignore and move on with your life? That's the type of thing you want to do? Korn [kʰũːɘ̃n] (talk) 12:28, 18 June 2017 (UTC)
As above, I'm not the only one holding the side you just stated. Also, I wouldn't say my statement is "attacking", but rather stating an opinion that the propaganda is somewhat inappropriate here. Maybe we should just make a page called Wiktionary:Place propaganda and advertisements here to keep it out of our site discussion space. Like I said, Wiktionary is a completely separate community with very little direct relations to Wikipedia, so the propaganda is not really a concern to us. That isn't an attack; that's just the stating that I'm annoyed by it like others above me, as it has always clearly stood out to me that the propaganda did not belong here. I never said anything, though, because I didn't want to possibly be that one egg who said something that pissed Wiktionary off. PseudoSkull (talk) 16:54, 18 June 2017 (UTC)
To be fair, I never said that the content of the propaganda was bad; I personally actually agree with most of what it said and think it's good that free knowledge is being highly encouraged by the foundation, etc., but the fact that there IS propaganda at all is the bad part, regardless of how much I agree or disagree with its statements. PseudoSkull (talk) 16:57, 18 June 2017 (UTC)

English orthographic categories

Hi, taking into account the educational characteristic inherent in lexicography, I'd like to propose creating categories for intricate issues related to orthography. Thus, a category for words with doubled letters, even twice (lemmas: aggress, accommodate, etc.) would help learners review lists and make fewer mistakes. --Backinstadiums (talk) 16:34, 9 June 2017 (UTC)

We have a few already: Category:English terms by orthographic property. — Ungoliant (falai) 16:48, 9 June 2017 (UTC)

Wikidata proposal: Add "English Wiktionary name" as a statement about all languages

@Lea Lacroix (WMDE): Here in the English Wiktionary, we use a single name for each language everywhere.


The chosen language name is used in etymologies, translations, descendants, categories, appendices, policies and other places. If we decide to change a language name, it will have to change everywhere. We often use modules and templates to convert language code→name information, such as "mhz"→"Austronesian Mor". We have large data modules containing language code→name information. The "mhz"→"Austronesian Mor" pair is one of the 621 language entries currently stored at Module:languages/data3/m. See also Category:Language data modules for all the language data. WT:LANG is our language policy.

Three questions:

  1. Can we move all the language data modules to Wikidata? (we don't know yet if there's consensus here to do it — it may or may not be done, depending on consensus)
  2. Can we add a new statement "English Wiktionary name" where its value is exactly the name we use? This way, the language name will be exactly as the English Wiktionary decides, even if a different synonym is chosen for other projects and purposes.
  3. Can we protect the new statement "English Wiktionary name" so that people are generally disallowed to edit it even if the rest of the data item is free to be edited? Maybe only Wikidata admins would edit it.

This concern was raised by @-sche in this discussion above: #Wikidata precautionary principle. --Daniel Carrero (talk) 17:23, 9 June 2017 (UTC)
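
For readers unfamiliar with the data modules, the code→name conversion described above amounts to a table lookup. Below is a minimal sketch in Python rather than the Lua that the actual modules use. The table `LANGUAGE_DATA_M`, its layout, and the function `canonical_name` are hypothetical names made up for illustration; only the "mhz"→"Austronesian Mor" pair comes from the post.

```python
# Hypothetical stand-in for one language data module (codes starting with "m").
# Only the "mhz" entry is taken from the discussion; the layout is assumed.
LANGUAGE_DATA_M = {
    "mhz": {"canonical_name": "Austronesian Mor"},
}


def canonical_name(code, data=LANGUAGE_DATA_M):
    """Return the single name used everywhere for a language code."""
    try:
        return data[code]["canonical_name"]
    except KeyError:
        # Unknown codes fail loudly instead of returning a guessed name.
        raise KeyError(f"unknown language code: {code!r}") from None


print(canonical_name("mhz"))
```

Because every template goes through one lookup like this, renaming a language means changing one table entry, which is the property the proposal wants to preserve.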

The canonical name of a language is the single most queried item across Wiktionary. This means that there are very strict requirements on speed. How fast can such names be queried from Wikidata? What if it's hundreds of them like on water? As for the name of the data item, something like en.wiktionary might be better, but I have very little knowledge of how Wikidata works in detail (could someone explain it to me on my talk page?). —CodeCat 17:35, 9 June 2017 (UTC)
@Daniel Carrero, Lea Lacroix (WMDE): If this foolish idea is to proceed, then as I noted in the previous discussion, you would also need to add "English Wiktionary script(s)" and "English Wiktionary family" fields, and corresponding "French Wiktionary script(s)", "English Wikipedia family", etc to handle cases where script and family information is disputed/controversial between wikis, e.g. where one wiki follows Ethnologue in declaring a language unwritten (Zxxx / Zyyy, because there is no natively-authored literature) but we regard it as being in Cyrl (if we have entries in Cyrl citing linguistic reference works in Cyrl, our templates and CSS rely on that being declared as a script of that language so they know to ask for transliteration, and to call on the right fonts for displaying it); and (respectively) where one wiki treats Uralic and Finno-Ugric as different (maybe a WP would), and another treats them as the same (cf Tibeto-Burman vs Sino-Tibetan, etc), or where one wiki wants every node in a family tree represented, while another wants pragmatic levels of categorization, not to ghettoize languages into a dozen nested levels of families. You would also need parameters for when one wiki considers several ISO or non-ISO coded lects to be distinct, but another wiki considers them one language (e.g., Serbian vs Croatian vs Serbo-Croatian, Rhine Franconian vs Central Franconian vs Kölsch, European French vs Cajun French, Bourbonnais-Berrichon vs Bourbonnais and separately Berrichon etc), as well as the basic capacity to handle lects which lack ISO codes. You would need, in effect, to duplicate everything we currently do here, but on another site, where the people who maintain language data on this site would need to have the full editing rights to keep it up to date there, or bother Wikidata editors about hundreds of things a month — see how many distinct codes I've changed in Module:languages's submodules recently — or, more likely, have things go unupdated.
And, as CodeCat notes, fetching the data from Wikidata would have to be as fast as fetching it from our local modules, unless you were going to make the existing problems large entries have (which seem to be due to the use of many complex auto-transliteration modules) worse. I continue to regard this fixation on outsourcing everything, despite the lack of benefits, as foolish; language data is so central to each wiki that it makes sense to do it locally, at least on a wiki with as many active technically- and linguistically-adept editors as en.Wikt. - -sche (discuss) 18:10, 9 June 2017 (UTC)
I'm curious as to whether using Wikidata would actually be faster than our local language modules. Is there any way to measure that? Currently, a template replacing "mhz"→"Austronesian Mor" would have to transclude the entirety of Module:languages/data3/m. With Wikidata, it would only have to access a single property (language name, or en.wikt language name) of d:Q2122792. --Daniel Carrero (talk) 20:03, 9 June 2017 (UTC)
Arbitrary access to Wikidata, that is, access to an item not connected with the current page, is an expensive function. See mw:Extension:Wikibase Client/Lua#mw.wikibase.getEntity. I think it is not an option for the water case. --Vriullop (talk) 20:51, 9 June 2017 (UTC)
That makes me wonder: can we have per-wiki Wikidatas? We don't want to outsource all this data, but the infrastructure of Wikidata could still be beneficial (whether in terms of speed remains to be seen) if it could be kept locally. —CodeCat 20:19, 9 June 2017 (UTC)
It's important that d:Q42365 (the Wikidata item for "Old English") already has identifier statements like "Quora topic ID" ("Old-English") and "Encyclopædia Britannica Online ID" ("topic/Old-English-language"), so using Wikidata to contain information about multiple sites seems the normal thing to do; it's not an earth-shattering new idea. So I think Wikidata might as well have "English Wiktionary name" (or "English Wiktionary ID") for all languages, it would simply fit the current system. --Daniel Carrero (talk) 20:26, 9 June 2017 (UTC)
Perhaps, but what about my idea of having some kind of Wikidata local to en.wiktionary alone? —CodeCat 20:41, 9 June 2017 (UTC)
OK, about your idea: Support. --Daniel Carrero (talk) 20:42, 9 June 2017 (UTC)
Sounds to me that what we need is actually WikiMetadata: a database of controlled vocabularies (and even ontologies) used by one Wiktionary to store and serve all linguistic descriptions, be it languages or parts of speech or scripts. No actual "data", only edited by recognized Wiktionarians, cached for fast Lua access, etc. — Dakdada 11:22, 14 June 2017 (UTC)
More and more I think the idea of a local Wikibase installation might be the better course. While I like the idea of sharing as much data as possible across Wiktionaries, I am afraid that the consensus to get there is not likely to arise any time soon. If we start with a local database we can get all of the benefits of a relational database to work with while eliminating the concerns of numerous individuals. - [The]DaveRoss 17:55, 16 June 2017 (UTC)


Thanks for bringing up this interesting issue. I'll try to provide a general answer; feel free to ping me if I forgot something.

I'm not sure that an "English Wiktionary name" property would be accepted through the community process on Wikidata. Of course I can't answer on behalf of the editors. In order to get the language names from Wikidata, maybe it would be more efficient to first improve Wikidata (we are aware that a lot of items about languages are missing), fix the labels if necessary (in collaboration with the Wikidata editors), and finally improve your module so you can use data from Wikidata, but also, in the cases where Wikidata doesn't fit your naming rules, use your own local labels.

About having part of the data protected: I'm pretty sure this will be rejected by the community, since Wikidata, like the other Wikimedia projects, has free and direct editing among its basic rules. We don't allow any editor or external organization to have specific rights over the data, and we try to solve potential issues through discussion and source-based information. I'm sure we can solve this as well within both communities.

About having a local database: it is technically possible for you to install your own instance of Wikibase, the free software that powers Wikidata. However, our goal here is to share knowledge within the Wikimedia projects as much as possible and to have less information split into silos; that's why we can't support this idea.

Now that I better understand your concerns about languages, I have the feeling that this is not the best topic to start experimenting with Wikidata and arbitrary access. These templates are used on almost all the pages of the main namespace. It would be wise to start with something with a smaller scale of change. Also, this topic appears quite controversial sometimes. I agree that we should find solutions together, but for a start, I would suggest something else.

I had a look at the citations namespace, for example this one, and noticed that you generally include with the quote the name of the work, its author, and the date of publication. This information could very easily be integrated from Wikidata. Instead of manually entering "1843 — Charles Dickens. A Christmas Carol", you could build a small module that needs only the ID of the work (Q62879) to automatically display the title, author, year of publication, and even more information. This seems a nice way to start experimenting with arbitrary access. What do you think? Lea Lacroix (WMDE) (talk) 13:57, 12 June 2017 (UTC)
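
The citation idea can be pictured as a function that takes only the work's item ID and builds the citation line from stored fields. The sketch below is in Python with a hand-written dictionary standing in for Wikidata. The field names `title`, `author` and `year`, the table `WORKS`, and the function `format_citation` are assumptions for illustration, not Wikidata's data model or any real API.

```python
# Hand-written stand-in for data that would be fetched from Wikidata.
# The field names are assumptions; only the ID and the values come from the post.
WORKS = {
    "Q62879": {
        "title": "A Christmas Carol",
        "author": "Charles Dickens",
        "year": 1843,
    },
}


def format_citation(item_id, works=WORKS):
    """Build a citation line in the same layout as the example in the post."""
    work = works[item_id]
    return f"{work['year']} — {work['author']}. {work['title']}"


print(format_citation("Q62879"))
```

An editor would then type only the ID, and a correction to the stored record would reach every citation that uses it.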

@Lea Lacroix (WMDE): Thanks for your reply. I understand. I agree with the citation idea, I've been thinking that it's a great idea to use Wikidata to fetch that information (and of course help to build Wikidata by adding data about more books when needed). --Daniel Carrero (talk) 02:05, 14 June 2017 (UTC)
About the local database: the idea is that this would not be a database of shareable data, but of metadata. This is very different. Also those metadata are already "split into silos" (in Lua modules), and for good reasons: they are community specific. — Dakdada 11:27, 14 June 2017 (UTC)

Request to add a new language code "rya" for rGyalrong a.k.a. Jiarong

I'm travelling in rGyalrong speaking areas of Sichuan right now. Danba county and now Kangding, both in Garze (Ganzi).

rGyalrong people identify as Tibetan and are classified as Tibetan by the Chinese government. But their language is more closely related to the Qiang language than to Tibetan. (The Qiang don't identify as Tibetan and are classified separately by the Chinese government.)

There is an ISO code 'jya'

rGyalrong is apparently very important in reconstructions of Old Chinese as it's considered to be a very conservative member of the Sino-Tibetan family.

Note that it's considered a group of languages (or dialects?) but a proposal to split it into individual language codes was rejected in 2011. This can be handled with labels and in any case I believe none have written forms so we'd have to use either IPA or whatever conventional orthography is used by linguists. One main linguist is known for the study of all these languages so my guess is there's a unified conventional orthography.

hippietrail (talk) 04:14, 10 June 2017 (UTC)

-sche split the code "jya" last year into Situ (sometimes called Eastern rGyalrong) (sit-sit?), Japhug (sit-jap?), Tshobdun (Caodeng, Sidaba) (sit-tsh), Zbu (Rdzong'bur, Showu, Sidaba) (sit-zbu). DTLHS (talk) 04:24, 10 June 2017 (UTC)
If you can obtain more data on how different or similar the lects are, perhaps especially when written, that will be very helpful. The limited data I found on the lects suggested they were not mutually intelligible; even the Ethnologue page says "Dialects are likely three separate mutually unintelligible languages" with low similarity. Guillaume Jacques says "Rgyalrong comprises at least four mutually unintelligible languages: Japhug, Tshobdun, Zbu, and Situ." That's why I proposed the split DTLHS links to, and (with no feedback for over a month) implemented it. - -sche (discuss) 05:58, 10 June 2017 (UTC)
Thanks for the feedback! None of them are regularly written, though I read that Situ had an orthography created before 1950, I think. I haven't found any info on that yet, so I guess it's something made up by linguists and/or missionaries. Situ is also by far the most spoken. It's also the one spoken in both Danba and Kangding, so it's the one I'm interested in.
Here is the best technical info I've found so far. I'm still reading through it: (talk) 11:45, 10 June 2017 (UTC)
I've been gathering some comparisons of the languages' pronouns, conjugation patterns and other words at User:-sche/Gyalrong. The last decade or two of literature seems to be in agreement that the lects are mutually unintelligible. The only user I can think of who hasn't commented but might know about these languages is @Wyang. - -sche (discuss) 22:59, 11 June 2017 (UTC)
@-sche I'm not an expert in Rgyalrong either, and only have a physical copy of the 2002 Chinese-Rgyalrong dictionary (in the Situ dialect). The western expert on Rgyalrong is definitely Dr. Guillaume Jacques, who also used to have an account and was an admin on the Chinese Wikipedia: w:zh:User:向柏霖, so it may be a wise idea to contact him re: the organisation of the varieties of Rgyalrong on Wiktionary. Dr. Jacques wrote the 472-page 《嘉绒语研究》 (Jiarongyu yanjiu, “A study on the Rgyalrong language”), which divided Rgyalrong into the four mutually unintelligible dialects above. I also vaguely remember there was a Rgyalrong-Chinese-French dictionary for the Japhug dialect circulating online, which may be handy. Wyang (talk) 08:16, 12 June 2017 (UTC)

Flag semaphore

I created three flag semaphore entries. Let me know if they look OK and if they should be kept. I also asked SemperBlotto (User talk:SemperBlotto#Flag semaphore).

I tried to imitate the notation used in Category:American Sign Language lemmas.

(It may be of interest that Morse code entries were created in 2016, they also fit the spectrum of "things you can use in place of letters and numbers". The Morse discussion is here: Wiktionary:Beer parlour/2016/August#Proposal: Creating entries for Morse code characters.) --Daniel Carrero (talk) 12:34, 10 June 2017 (UTC)

@Daniel Carrero: Thanks for this. With Braille, Morse Code, and semaphore, I think that covers most unusual encodings for the Latin alphabet other than fingerspelling and shorthand. —Justin (koavf)TCM 16:56, 10 June 2017 (UTC)
@Daniel Carrero, Koavf: Tactile Sign Language: at least its alphabet should be added to Wiktionary. I don't know whether it should be a type of fingerspelling or rather a new language as such.
Thanks in advance. --Backinstadiums (talk) 08:31, 11 June 2017 (UTC)
@Backinstadiums: I knew that tactile signing was a thing but not that there was a way to encode it in print. It seems like this is a pictorial chart and not a way to record it that we can use. And tactile signing is not a language itself but another encoding. —Justin (koavf)TCM 15:31, 11 June 2017 (UTC)


Should this be a redirect?--2001:DA8:201:3512:CD84:BF8E:5FA5:70A7 16:54, 11 June 2017 (UTC)

Yes, looks good to me. (single codepoint) and II (two instances of "I") are the same thing, even though there's a distinction at some level which is meaningful to computers. For the same reason, I had redirected say, to !. --Daniel Carrero (talk) 17:00, 11 June 2017 (UTC)
Ok, I won't revert anymore. It's a kind of instinct when you see an IP user doing wholesale deletion of stuff, you think something's fishy. —CodeCat 17:02, 11 June 2017 (UTC)
IP person, if it's not too much trouble, please add {{R character variation}} in these kinds of redirects! --Daniel Carrero (talk) 17:06, 11 June 2017 (UTC)
Could you write documentation and categorise the template, @Daniel Carrero? —CodeCat 17:07, 11 June 2017 (UTC)
Alright, done! --Daniel Carrero (talk) 17:19, 11 June 2017 (UTC)

How can we use Wikidata's existing data pool?

It's very controversial to adopt Wikidata for anything new that's specific to Wiktionary, but is there any data currently on Wikidata that we can already make use of? Data about species comes to mind. @DCDuring, what do you think? —CodeCat 17:03, 11 June 2017 (UTC)

Author / biographical information is what comes to mind for me. DTLHS (talk) 17:04, 11 June 2017 (UTC)
Certain "is-a" categories, e.g. Rome is a city, labrador is a dog. Equinox 17:06, 11 June 2017 (UTC)
  •   Support for types of place names. --Daniel Carrero (talk) 17:07, 11 June 2017 (UTC)
    • The place name thing is interesting. We might be able to modify {{place}} to make use of it. However, we'd presumably need some way to tell, within an entry, "use this Wikidata item". Would that mean that {{place}} would take a parameter to specify the Wikidata item code (Q...)? —CodeCat 17:10, 11 June 2017 (UTC)
      • I think we should think about if we can avoid using numeric codes directly in entries, if that's possible. DTLHS (talk) 17:13, 11 June 2017 (UTC)
        • If that's possible, then sure. But what is the alternative? —CodeCat 17:14, 11 June 2017 (UTC)
          • I suggest using a combination of numerical codes and hidden text comments that don't affect the entry. d:Q90 (Paris, France) d:Q79917 (Paris, Arkansas) could work like in the table below. --Daniel Carrero (talk) 17:29, 11 June 2017 (UTC)
Code:
# {{Wikidata place|Q90|capital of France... this is just a comment I can say anything here}}
# {{Wikidata place|Q79917|city of Arkansas... this is the same}}
Result:
  1. A city in Île-de-France, France and the capital and most populous city of France.
  2. A city in Arkansas, USA.
Bleh. If we're going to include comments, can we not put them in actual wikitext comments instead of a pointless template parameter? —CodeCat 17:37, 11 June 2017 (UTC)
I think I would be happy enough without any comments at all, just using the 1st parameter for the number code and that's it.
But if we want comments for all place names, I was hoping to do this: if the comment parameter is empty, the entry could be categorized in Category:Place names without comments. --Daniel Carrero (talk) 17:40, 11 June 2017 (UTC)
I like the idea of using comments since it gives an assurance that the wikidata item is actually the intended target- otherwise someone could add an incorrect number and there would be no way to tell what they meant or if it was wrong. DTLHS (talk) 17:45, 11 June 2017 (UTC)
The data itself might help. The module could check (somehow) if the item is in fact a city, town, river or some other kind of geographical thing. If it's not, then it could throw an error. —CodeCat 17:48, 11 June 2017 (UTC)
Two checks: 1) check if the item is a type of geographical location (presumably needed for description and categorization purposes, and a template called {{Wikidata place}} should be able to return a module error otherwise); 2) compare the current entry title with the accepted titles in the Wikidata item. When "Q90" and "Q79917" are used in the entry Paris, the module should be able to check if "Paris" is an acceptable name for both items as per Wikidata. --Daniel Carrero (talk) 17:57, 11 June 2017 (UTC)
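
The two checks proposed above can be sketched as follows. This is Python with hand-written records standing in for Wikidata items; the record layout (`kind`, `names`), the set `GEOGRAPHIC_KINDS` and the function `check_place` are all hypothetical, and the real data model differs. Q10943 (cheese), which comes up later in the thread, is included as an item that should fail.

```python
# Hand-written stand-ins for Wikidata items; the layout is assumed for illustration.
ITEMS = {
    "Q90": {"kind": "city", "names": {"en": "Paris", "nl": "Parijs"}},
    "Q79917": {"kind": "city", "names": {"en": "Paris"}},
    "Q10943": {"kind": "food", "names": {"en": "cheese"}},
}

# Assumed list of kinds that count as geographical locations.
GEOGRAPHIC_KINDS = {"city", "town", "river", "country"}


def check_place(item_id, entry_title, items=ITEMS):
    """Return a list of problems; an empty list means both checks pass."""
    item = items[item_id]
    problems = []
    # Check 1: the item must be some type of geographical location.
    if item["kind"] not in GEOGRAPHIC_KINDS:
        problems.append("not a geographical location")
    # Check 2: the entry title must be one of the item's listed names.
    if entry_title not in item["names"].values():
        problems.append("entry title is not a listed name")
    return problems


print(check_place("Q90", "Paris"))
print(check_place("Q10943", "Paris"))
```

Returning a list of problems instead of raising immediately leaves it to the caller to decide whether a failed check becomes a module error or something milder.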
That second check might not work. The template is used for definitions in all languages, so it might also be used on Dutch Parijs. —CodeCat 18:01, 11 June 2017 (UTC)
Dutch Parijs is already available in the list of names for d:Q90 in all languages. This reminds me, {{Wikidata place}} should be able to know what is the current language section, so in that Dutch entry the proposed syntax should actually be {{Wikidata place|nl|Q90|capital of France}} (and "en" in English entries, of course). --Daniel Carrero (talk) 18:06, 11 June 2017 (UTC)
That works, but what if the name isn't listed? Should we require the name in language X to be present in Wikidata before we allow the use of the template for X? Adding a tracking category ("hey, this name isn't in Wikidata yet, someone go add it!") would be much more helpful than a straight error. —CodeCat 18:10, 11 June 2017 (UTC)
I support adding a tracking category as you described. Maybe it could be called Category:Dutch place names missing in Wikidata or something. --Daniel Carrero (talk) 18:14, 11 June 2017 (UTC)
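
A sketch of the tracking-category behaviour, assuming the category name suggested above. It is Python with hand-written data; the tables `LANGUAGE_NAMES` and `ITEM_NAMES` and the function `tracking_categories` are hypothetical stand-ins for the language modules and for Wikidata labels.

```python
# Hypothetical subsets of the language data and of Wikidata's per-language names.
LANGUAGE_NAMES = {"en": "English", "nl": "Dutch"}
ITEM_NAMES = {
    "Q90": {"en": "Paris", "nl": "Parijs"},
    "Q79917": {"en": "Paris"},
}


def tracking_categories(lang, item_id):
    """Return tracking categories for an entry instead of raising an error."""
    if lang in ITEM_NAMES[item_id]:
        return []  # The name exists in this language; nothing to track.
    language = LANGUAGE_NAMES[lang]
    return [f"Category:{language} place names missing in Wikidata"]


print(tracking_categories("nl", "Q90"))
print(tracking_categories("nl", "Q79917"))
```

With this approach the entry still renders when a name is missing, and the category gives editors a worklist.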
A top-level category for stuff to be added to Wikidata in a particular language would also be good. Since you were involved in renaming all the request categories, I'll leave the naming to you. :) —CodeCat 18:17, 11 June 2017 (UTC)
Alright! Proposed category tree (which may be changed/discussed):
If we have template tracking categories as discussed here, I believe we don't actually need those template comments. Sure, it would be wrong to set up a place name definition with the number d:Q10943 because it means "cheese", but the template should be able to recognize that automatically. --Daniel Carrero (talk) 18:36, 11 June 2017 (UTC)
Hmm, wouldn't they rather be request categories? —CodeCat 19:15, 11 June 2017 (UTC)
I and other people in the request category vote seemed to support the following notion: a request category is when you manually request something, like {{rfe}}. In the Wikidata categories above, the entries would get automatically categorized whenever something looks wrong. --Daniel Carrero (talk) 19:18, 11 June 2017 (UTC)
Ok. But to call it an error is a bit extreme. It's just a missing translation, a common symptom of a project that is always a work in progress. —CodeCat 19:20, 11 June 2017 (UTC)
OK, second proposal:
--Daniel Carrero (talk) 19:36, 11 June 2017 (UTC)
I don't like this since it gets into the entire problem of representing lexicographic data in wikidata which they are really not set up to do. Place names have synonyms, obsolete forms, dialectal forms, etc. DTLHS (talk) 18:18, 11 June 2017 (UTC)
This is purely to supplement the current {{place}} template. This template creates definitions based on the parameters you give it and then also categorises appropriately. What would change in a Wikidata implementation is that these parameters would be fetched from Wikidata (e.g. "is a city", "capital of France") rather than being provided as parameters. This isn't lexicographical data in my understanding of the word. —CodeCat 18:22, 11 June 2017 (UTC)
Knowing that Dutch Parijs is a synonym of English Paris is lexicographic data. DTLHS (talk) 18:24, 11 June 2017 (UTC)
And that follows from the fact that both uses of the {{Wikidata place}} template, one on Paris and one on Parijs, would use the same Wikidata item code. This information is therefore not stored on Wikidata at all. Even if there were no Wikidata and {{Wikidata place}} were an empty template, the mere fact that they both had Q90 as a parameter would establish them as synonyms. —CodeCat 18:29, 11 June 2017 (UTC)
It's important that the "city of France" sense in Dutch Parijs will need to access the English translation somehow. Here are two ways to accomplish that: using Wikidata or using a parameter. See table. --Daniel Carrero (talk) 19:04, 11 June 2017 (UTC)
Code:
# {{Wikidata place|nl|Q90}} <!-- using Wikidata -->
# {{Wikidata place|nl|Q90|Paris}} <!-- using a template parameter -->
Result:
  1. Paris (city in Île-de-France, France and the capital and most populous city of France)
The current iteration of {{place}} uses t1= for that purpose. I think we should keep using a parameter, again to minimise our dependence on Wikidata for lexicographical things. —CodeCat 19:07, 11 June 2017 (UTC)
That is an option, but it seems Wikidata is still an option too because it will get "morpheme" data items designed specifically for lexicographical data (Wikidata:Wiktionary).
Maybe the best course of action is just keep using the parameter as you said since it's reliable and it works. But we can change our minds later and delete that parameter from all entries if the morpheme thing works out. --Daniel Carrero (talk) 19:11, 11 June 2017 (UTC)
Yes, I'm aware of how hesitant many people here are about offloading lexicographical data to Wikidata. But if we limit ourselves to the existing data already out there, such as topography in the case of {{place}}, then I don't think it would be as much of an issue. —CodeCat 19:14, 11 June 2017 (UTC)
In my role as articulate fish at a convention of ichthyologists, here's my initial view of how Wikidata might help with Translingual taxonomic entries.

How about a dynamic map for place names? See ca:Kenya. In fact it does not currently use Wikidata but OpenStreetMap with a Wikidata identifier. With Wikidata access it could fetch coordinates to add a point in the OSM map, for example for cities. --Vriullop (talk) 12:20, 12 June 2017 (UTC)

The map could be included in an infobox with relevant links to Wiktionary. From d:Q114: capital=Nairobi, demonym=Kenyan, languages=Swahili, English, currency=shilling, ISO code=KE. --Vriullop (talk) 13:12, 12 June 2017 (UTC)
Wikidata might be a desirable way for me to speed insertion of "References" to external sites. I have seen such links on Commons, en.WP, and Wikispecies. Wikidata seems to have already accumulated such information from multiple projects, though I haven't yet found which ones. If that were better than what I could find on eg, Rosa at NCBI, then it would marginally speed things up. If I could extract the links in a single step by, say, a substed template (or something more sophisticated), that would save a more meaningful amount of time. I am skeptical about the response time of accessing such links through Wikidata each time an entry is loaded.
Something similar might apply with respect to references at vernacular names, though such references are, IMO, not so useful for such entries.
You might think that the hierarchical taxon/clade structure would be a perfect use of Wikidata, but I believe that relying on such a structure is not helpful in definitions. It seems better to me to define species, genera, tribes, and sections with reference to the family, no matter what intermediate ranked taxa or clades may be used by one or more sites. Non-expert users don't seem to have much familiarity with the various super-, sub-, infra- ranks of phyla, classes, orders, families, and species, let alone tribes, sections, and divisions. DCDuring (talk) 19:49, 11 June 2017 (UTC)
See for example w:ca:Poecilia latipinna. At the foot of the page a template fetches the identifiers from d:Q906572 and links to databases. --Vriullop (talk) 12:12, 12 June 2017 (UTC)
That's the idea, but [] . See [[Alconeura]] for a not uncommon approach to handling missing pages at external databases: providing a link to a page for a higher-ranked taxon. I suppose that could be managed by using a separate template for each higher-ranked taxon selected for inclusion, but I'd want to be able to exclude from the higher-taxon list of links those databases that had a more specific link. In principle there could be many such separate templates for a given taxon, though very few taxa would need more than three. DCDuring (talk) 14:21, 12 June 2017 (UTC)
I'm not sure if I understand. d:Q10404513 has 3 identifiers linking to 3 databases. Are these links ok? Anyway, you must always provide the Wikidata page to fetch, either this one or the higher-ranked one. --Vriullop (talk) 14:44, 12 June 2017 (UTC)

Wikidata items as senseids

Currently, the template {{senseid}} is used to disambiguate specific senses and allow us to link to them from elsewhere. Senseids are just text, they can be anything at all as long as it's unique. This means it's possible to use the codes for Wikidata items as senseids too. For example, we could use {{senseid|en|Q90}} on the first sense of Paris. This wouldn't actually do anything, other than tell editors that this sense refers to the thing whose Wikidata code is Q90. {{senseid}} would not be modified at all for this purpose, it would not access Wikidata. But it does add information to Wiktionary entries, by establishing a conceptual link between senses and Wikidata items.

Such links could be used for future things that we currently haven't thought of. A possibility that comes to mind is Wikipedia links. If {{senseid}} detects that its parameter is a Wikidata item code (Q followed by numbers), then it could be modified in the future to query that item for its en.wikipedia article name. {{wikipedia}} or some similar interproject link could then automatically be displayed next to certain senses, whenever {{senseid}} is given a Wikidata item as a parameter.

For those of us hesitant to offload data onto Wikidata, please notice that this change would change nothing at all on Wikidata's end. The data added by this would be entirely on en.wiktionary, in the form of a wikitext template call, so we have full control. No data would be added to Wikidata, we'd merely be using what's already there. —CodeCat 18:47, 11 June 2017 (UTC)
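
The detection step described above ("Q followed by numbers") can be illustrated with a minimal sketch. This is not how {{senseid}} is implemented; it is plain Python under the assumption that a Wikidata item code is the letter Q followed by one or more ASCII digits, and the function name is hypothetical.

```python
import re

# Assumption: a Wikidata item code is "Q" followed by one or more ASCII digits.
WIKIDATA_ITEM = re.compile(r"Q[0-9]+")


def is_wikidata_item(senseid: str) -> bool:
    """Return True if the whole senseid looks like a Wikidata item code."""
    return WIKIDATA_ITEM.fullmatch(senseid) is not None
```

With this check, a senseid such as Q90 would be recognised, while an ordinary free-text senseid would be left alone.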

I've added Wikidata senseids to Paris (English only) to demonstrate what I mean. Wikidata isn't actually enabled yet on Wiktionary, so these do literally nothing other than provide a senseid anchor on the page. —CodeCat 11:54, 12 June 2017 (UTC)
Looks good to me. Two comments: 1) doing this in large scale would probably require a vote, 2) I guess this idea should work 100% well for place names, but we can't use Wikidata items as senseids for all kinds of Wiktionary definitions. Wikidata probably doesn't have separate items for each sense of the verbs do, have, go, be... I guess the future Wikidata "morpheme" thing should work, but apparently the database would need to be built from scratch. --Daniel Carrero (talk) 12:03, 12 June 2017 (UTC)
Indeed, such senseids would probably be limited to nouns only, as Wikidata doesn't currently have items about verbal actions. The important thing to keep in mind is that the Wikidata items are about the referents of words, not the words themselves. The words would have their own items, if we get around to that. This does create some issues that I foresee. The colour green has Wikidata item d:Q3133. But in our entry green, we have not only a noun referring to this colour, but also an adjective. Since senseids must be unique within a single language section, we can't give both of them {{senseid|en|Q3133}}. So which one do we put it on? —CodeCat 12:10, 12 June 2017 (UTC)
Using Wikidata senseids would probably work well, not only for place names but also for a lot of proper nouns. For common nouns, adjectives, verbs and everything else, I don't have actual numbers, but at first sight I think it would fail more often than not. d:Q9465 is "ethics", so on which sense of ethics and/or ethical would we use it, if any? d:Q7242 is "beauty", and our entry beauty has multiple related senses too. --Daniel Carrero (talk) 12:23, 12 June 2017 (UTC)
We could just add disambiguators to the ids. They are still just strings, after all. So {{senseid|en|Q3133-noun}} for the noun green would work too. As long as the Wikidata id can still be parsed out, it should be fine. As for ethical, I think the issue here is that ethics is a field of study, which the adjective doesn't really have much to do with. If it did indeed refer to the same thing as ethics, I see no reason not to include the Wikidata id there too. —CodeCat 12:52, 12 June 2017 (UTC)
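
To illustrate the point that the Wikidata id can still be parsed out of a disambiguated senseid, here is a minimal sketch in Python. It assumes the id comes first and that any disambiguator is separated from it by a hyphen, as in Q3133-noun; the function name and the exact separator rule are assumptions, not an agreed convention.

```python
import re
from typing import Optional, Tuple

# Assumption: the item code comes first, optionally followed by "-" and a
# free-text disambiguator, e.g. "Q3133-noun".
SENSEID = re.compile(r"(Q[0-9]+)(?:-(.+))?")


def split_senseid(senseid: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (item code, disambiguator or None), or None if no item code is present."""
    match = SENSEID.fullmatch(senseid)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Both the noun and the adjective senses of green could then carry distinct senseids while still pointing at the same item.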
I'm fine with adding disambiguators to the ids, with hope that later in the future we can replace all that stuff by the Wikidata "morpheme" thing. --Daniel Carrero (talk) 12:56, 12 June 2017 (UTC)
I've added a tracking template, Special:WhatLinksHere/Template:tracking/senseid/Wikidata, to {{senseid}} whenever a Wikidata id is used as a senseid. This allows us to keep track of them for the time being. The process of adding all these senseids to entries will be a long one. I've thought of a possible way to speed it up, though, once Wikidata is enabled. Module:headword can be modified so that it checks if a Wikidata label exists with the same name as the current page. If so, it could track the page. This would give us a list of all pages that could probably have a Wikidata senseid added to them. —CodeCat 19:01, 13 June 2017 (UTC)
Nice work! We should use senseid / senselinks more often, but as you said they are quite tedious to add at the moment. A gadget could also be an option, with a Wikidata suggest-style dropdown. – Jberkel (talk) 16:52, 14 June 2017 (UTC)

Note that senses will have a new ID in Wikidata lexeme data model: mw:Extension:WikibaseLexeme/Data Model, with format L3746552-S4. --Vriullop (talk) 10:59, 17 June 2017 (UTC)

This is one of the uses that I feared, having wikitext littered with numerical identifiers. Ugly. --Dan Polansky (talk) 15:21, 17 June 2017 (UTC)

Deverbative or deverbal?

User:Barytonesis recently created a template {{deverbative}}. This is a good idea but I think it's wrongly named and should be {{deverbal}}. Both terms are synonyms but "deverbal" is much more common (as well as shorter and easier to type) — about 9x as many hits in Google, plus my spelling checker actually marks "deverbative" (but not "deverbal") as a mistake, plus Wikipedia has entries for Deverbal noun and Deverbal adjective but no entries for deverbative anything, not even redirects. The template puts these terms under the category "Foo deverbatives" (none of which have been created yet, and should not be created at all probably). I think instead they should go under "Foo deverbal nouns" or "Foo deverbal adjectives"; this requires an optional pos= parameter (which should default to "noun"). There are under 50 entries currently using this template and all appear to be nouns. I can easily use a bot to rename the template uses. Any objections? If not I will go ahead and make the changes. Benwing2 (talk) 23:37, 11 June 2017 (UTC)

I agree. It's also consistent with denominal (for which there should also be a {{denominal}} template). Additionally, it should point to Appendix:Glossary. --Victar (talk) 01:07, 12 June 2017 (UTC)
Fine by me. I'm actually happy that you take interest in it --Barytonesis (talk) 09:59, 12 June 2017 (UTC)
While you're running a bot, you should change the outdated |lang= to |1=. --Victar (talk) 10:13, 12 June 2017 (UTC)

Conditionally renominating User:Dan Polansky for admin per User talk:Dan Polansky#Renomination for admin?

I hereby nominate User:Dan Polansky for administrator on the English Wiktionary, with the one condition that Dan will be disallowed from using the block tool. For some context, quoting Dan himself:

"For context, my admin vote failed in August 2016.

Note that the block tool is one of power over other people, and should not be awarded to people who we cannot trust, no matter how good editors they are. The use of the block tool does not require consensus, and blocks are rarely challenged. Multiple of current admins are not qualified to use the block tool, in my view. The deletion tool can be abused to hide trails of conversation; it was used in this way by an English Wiktionary admin who meanwhile ceased editing."

I believe Dan would be able to delete pages in accordance with the rules, and he seems to have a need for that. I'm still not sure about the block tool, but I'm throwing this condition in because it puts some previous opposers at ease a bit. Honestly, a lot of users may most of all fear that Dan may abuse the block tool. I think Dan is a great editor, though he may be rude sometimes and is notorious for it, so the admin tools of page deletion at the very least should be awarded to him. IMO, the admin tools are not supposed to be awarded to people because they're "nice" (and I'm not necessarily saying Dan is not nice), but because it would be useful for them to have them in order to make even more constructive contributions, for instance, by fighting obvious vandalism, deleting pages in accordance with RFV and RFD, etc.

As I've never started a vote before, who wants to start this vote? PseudoSkull (talk) 18:22, 12 June 2017 (UTC)

I am somewhat uncomfortable with the idea that someone be given user rights that they are then prohibited from using. Either they should have them with the trust of the community to use them appropriately, or not have them. If there are, as Dan suggests, multiple users who should not have the blocking tools but should be able to delete (and protect?) pages, we could always create another user group with just the rights which are applicable. We could, for instance, give folks in the "template editor" group delete rights, or create a "deleters" group with just the rights to delete and restore. It is even possible to have delete rights without the ability to see deleted revisions, etc. I don't know if the complication is worth the extra effort, but to me it is preferable to the alternative. - [The]DaveRoss 19:04, 12 June 2017 (UTC)
I am also uncomfortable with that idea. I would probably support creating a new group "Deleters". --Daniel Carrero (talk) 02:08, 14 June 2017 (UTC)
Along with another group for "non-deleters". --Victar (talk) 02:33, 14 June 2017 (UTC)
Non-deleters would be anyone not in the "deleter" or "administrator" groups. - [The]DaveRoss 11:07, 14 June 2017 (UTC)

Local groups


Yesterday, I went to meet my local colleagues, contributors to Wikipedia, Wiktionary, Commons, Wikidata, OpenStreetMaps or OpenFoodFacts. We have met several times this year to discuss contributing and drink beer. Are there contributors here who do the same in their local groups? If so, do you also contribute to Wikipedia and chat about this project, or do you bring news from Wiktionary to Wikipedians? It's not a sociological inquiry, just curiosity. I have no idea how local groups work outside France. Noé 10:06, 13 June 2017 (UTC)

I think that the concentration of francophone Wikimedians in a relatively small, easily accessible area (France) is quite different from the anglophone community. I just recently met another Wiktionarian and would be happy to meet more, but it isn't likely to happen often. I also have plans to meet some Wikipedians later this year and maybe speak a bit about Wiktionary, although I honestly don't know what I'd tell them. —Μετάknowledgediscuss/deeds 17:34, 13 June 2017 (UTC)

Unprotect WT:NFE

I think it's silly that this is protected against non-admins. Surely non-admins also have important news to announce? —CodeCat 19:27, 13 June 2017 (UTC)

I agree. Maybe allow autoconfirmed only? --Daniel Carrero (talk) 19:29, 13 June 2017 (UTC)
Autoconfirmed sounds good to me. —JohnC5 19:31, 13 June 2017 (UTC)
Agreed and updated edit protection to autoconfirmed. - [The]DaveRoss 19:44, 13 June 2017 (UTC)
Thank you. I've been annoyed at the protection of that page for a while. — Eru·tuon 02:19, 14 June 2017 (UTC)

Foreign-language Wikisaurus entries

The very great majority of Wikisaurus entries are English. Some, such as Wikisaurus:కోతి are not. Is there some way we could add a "lang=" parameter to these? SemperBlotto (talk) 05:55, 14 June 2017 (UTC)

Where would you put it? {{ws header}}? —Μετάknowledgediscuss/deeds 18:31, 14 June 2017 (UTC)
Hmm, the language should probably be added to all the templates on Wikisaurus:కోతి that display Telugu text, so that it can be properly tagged. And the header should have a display title with the Telugu part appropriately script-tagged, as is done by Module:headword for entries in certain scripts. — Eru·tuon 18:44, 14 June 2017 (UTC)
imho, the English wiktionary isn't even the place for foreign Wikisaurus entries... --Barytonesis (talk) 19:39, 17 June 2017 (UTC)
Why not? Also, not giving language in pagetitles will inevitably lead to collisions, see for example WS:god, god.__Gamren (talk) 09:29, 21 June 2017 (UTC)

Canadian spelling: how to tag entries when both UK and US spellings are accepted

I see this has been discussed in the past but I'm not sure how to proceed. What I would like to do is tag gynecology and gynaecology as both being acceptable spellings in Canada (Royal societies tend to use the UK spelling, national and regional medical associations tend to use the US spelling, universities are mixed even in department names). I made a try at it using colour and color as examples:

{{tcx|Commonwealth spelling|Canada|lang=en}} and {{tcx|American|Canada|lang=en}} in the "Noun" headword
and {{qualifier|Commonwealth|Canada}} etc. in the "Alternative form" sections.

But with deprecated templates and such I'm not sure if I'm doing this correctly. Can anyone give me some pointers? Thanks! Facts707 (talk) 08:11, 14 June 2017 (UTC)

The label template for headwords is {{term-label}}, for definitions {{label}}. The template for Alternative forms sections is {{alter}}; see the template page for more information. — Eru·tuon 17:06, 14 June 2017 (UTC)

Why are we putting links to Wikipedia in reference sections?

Grant Parish and many others. Wikipedia is not a reference. DTLHS (talk) 21:32, 14 June 2017 (UTC)

This looks like something that could be fixed by bot in a number of entries. If a "References" section only contains instances of {{pedia}}, rename the section to "Further reading". --Daniel Carrero (talk) 21:35, 14 June 2017 (UTC)
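
The bot rule suggested above (rename a "References" section to "Further reading" when it contains nothing but {{pedia}}) could be sketched roughly as follows. This is an illustration only, in plain Python operating on a wikitext string; the function name is hypothetical, the heading and template patterns are simplifying assumptions, and it does not handle an entry that already has a "Further reading" section.

```python
import re

# Assumption: headings look like "===Title===" with matching "=" runs.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")
# Assumption: a "pedia-only" line is {{pedia}} or {{pedia|...}}, optionally bulleted.
PEDIA_ONLY = re.compile(r"^\*?\s*\{\{pedia(\|[^{}]*)?\}\}\s*$")


def rename_pedia_only_references(wikitext: str) -> str:
    """Rename each References section whose only content is {{pedia}} calls."""
    lines = wikitext.split("\n")
    i = 0
    while i < len(lines):
        match = HEADING.match(lines[i])
        if match and match.group(2) == "References":
            # The section body runs until the next heading of any level.
            j = i + 1
            while j < len(lines) and not HEADING.match(lines[j]):
                j += 1
            body = [line for line in lines[i + 1:j] if line.strip()]
            if body and all(PEDIA_ONLY.match(line) for line in body):
                marks = match.group(1)
                lines[i] = f"{marks}Further reading{marks}"
            i = j
        else:
            i += 1
    return "\n".join(lines)
```

Sections that mix {{pedia}} with any other content are left untouched, so a human would still need to review those.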
I didn't ask if it could be fixed with a bot. DTLHS (talk) 21:36, 14 June 2017 (UTC)
I know. What you said sounded like a rhetorical question, though. It seems there's already consensus not to put links to Wikipedia in reference sections. --Daniel Carrero (talk) 21:42, 14 June 2017 (UTC)
I am not the first to put Wikipedia under refs, further reading or whatever. Will up the top do, or not at all? I still regard Wikipedia as a reference though, not as further reading. DonnanZ (talk) 22:10, 14 June 2017 (UTC)
So I need to have both "References", increasingly common in taxon entries, and "Further reading" headings when neither is accurate with respect to links to external databases, Commons, and Wikispecies? I specifically said that I thought "References" was an adequate heading for including Wikipedia as well as the others in a recent discussion in which there were numerous participants.
Over time the same type of content appeared under no less than four different headings. First, "See also", which was deemed inappropriate, with "External links" being mandated by vote. Now "References" and "Further reading" are available, but one of those is also to be forbidden? This seems like pointless makework in service of some silly, content-killing uniformity. DCDuring (talk) 23:13, 14 June 2017 (UTC)
The thing with using Wikipedia as a "reference" is that an editor could reference themselves, so it's better to use published, unchanging sources. Andrew Sheedy (talk) 23:18, 14 June 2017 (UTC)
Fine, whatever, I see that this is a pointless discussion. DTLHS (talk) 23:27, 14 June 2017 (UTC)
I understand the issue about mislabelling wikis as references and wish that there were a better title. We really should have a single vague all-inclusive name for these things. IMO "Sources" would do the job. It is wonderfully ambiguous (of what? for whom? of what authoritative status?). DCDuring (talk) 00:09, 15 June 2017 (UTC)
I'm not sure I agree with using "Sources" as a single vague all-inclusive name. I see the problem you have with the current set-up is that it does not seem to work well for links to databases and images. I agree with the images thing, at least, and would accept using some heading for databases even though I don't think it is strictly necessary (just my opinion). But the current set-up seems to work well for basically everything else. --Daniel Carrero (talk) 00:38, 15 June 2017 (UTC)
We also have the problem of too many headers. We may be able to have every conceivable entry because we have "no space constraints", but we do have space constraints to the extent that we remain interested in retaining human individual users. We already suffer because we use headers to structure our entries and retain oversized fonts for them. Right now many entries should have both "References" and "Further reading", typically with just one or two lines for each. Hiding the content by default might be a solution, if one could hide all of the references or all of the further reading and databases. DCDuring (talk) 00:53, 15 June 2017 (UTC)
The source I was using (Index of U.S. counties on Wikipedia) is in my opinion a reference. There's an old saying "horses for courses", and that can apply to choosing a heading. DonnanZ (talk) 07:05, 15 June 2017 (UTC)

German vs Germans collectively

Continuation of d:Wikidata:Project chat#Germans (Q42884), only for the group or also individuals?

In the discussion linked above, I asked if the concept of Germans as a collective ethnic group is compatible with our entry German, which has one sense referring to a single individual of that group. We group singular and plural into one lemma, so in principle the same lemma can refer to one German or many, depending on which inflection you choose (and indeed, in some languages, there isn't even an inflectional difference). However, the idea of many individual Germans is still not quite the same as all Germans collectively: "the Germans" in our understanding can refer to this entire group, but grammatically it's just multiple Germans with nothing to indicate that it refers to all Germans as opposed to merely multiple Germans. Does this merit a separate sense, marked with {{lb|en|in the plural}}, referring to "all Germans, Germans as a group collectively"? If not, why would it not? It is conceivable that a language has distinct terms for multiple Germans contrasting with the collection of all Germans, but whether this occurs in practice I don't know. If there are examples of this, it would be evidence that the concepts are indeed separate. —CodeCat 19:32, 15 June 2017 (UTC)

There is a difference between the use of Germans with and without the definite article. Without the definite article you have sentences like "Germans drink a lot of beer", which is grammatically the same as "trees absorb a lot of sunlight". With the definite article you have "the Germans have elected a new chancellor", which is grammatically the same as "the trees have started changing color". In both of those cases I'm speaking about Germans as a whole and trees as a whole (which is not quite the same as "all Germans" or "all trees"). The case without the definite article can apply to just about any countable noun. The case with the definite article is more restricted (I'm not quite sure what the criteria are). So I really don't think we need a separate sense line for this. --WikiTiki89 20:11, 15 June 2017 (UTC)
As WikiTiki says, this is a general phenomenon, and probably does not merit a separate "in the plural" sense... but then, some entries where a homograph of the singular can be used as a proper noun to refer to the group collectively do indicate that, e.g. Abenaki! I don't know if that should be changed. Is there a difference (besides ethnicity) between "five Abenaki set out; later, the Abenaki reached their destination", "she settled among the Abenaki" and "the Abenaki considered issuing their own passports" vs "five Germans set out; later, the Germans reached their destination", "she settled among the Germans" and "the Germans considered issuing their own passports"? Hmm...! Compare also Chinese.
The fact that some language may have a distinction (and some constructed language, or even some computer 'language'/framework like is used on Wikidata, probably does have a distinction) does not mean that any of the English words, whether "Germans" or "trees", have separate senses. Some languages may distinguish living animals from dead ones, but English has just one sense of chum salmon, not two for "a living dog salmon" and "a dead dog salmon".
That some words in some languages do not match 1-to-1 to other words in other languages (while other words do) has been brought up before in the context of attempts to migrate Wiktionary sense information to Wikidata.
If Wikidata is ever to be comprehensive, it might well need an "instance of" entry for instances of Q42884.
- -sche (discuss) 20:56, 15 June 2017 (UTC)
I hear the "the Abenaki" in exactly the same way as the "the Germans". The tangible evidence is that "the Abenaki" still has plural agreement. --WikiTiki89 21:19, 15 June 2017 (UTC)

See also links

It seems that it is common practice to restrict links in the "See also" section to the same language as the entry. However this is not explicitly mentioned in WT:EL. Could it be added to better reflect reality/practice? – Jberkel (talk) 21:57, 15 June 2017 (UTC)

I have used it recently to link English words to Scots words, like Cumbernauld and Cummernaud, in lieu of a translations section. DonnanZ (talk) 10:07, 16 June 2017 (UTC)
Why not just create a "Translations" subsection in such cases? — SMUconlaw (talk) 13:59, 16 June 2017 (UTC)
Because nobody has opened one for other languages, and Scots should be regarded as a dialect rather than a language. To quote Oxford: "[mass noun] The form of English used in Scotland." DonnanZ (talk) 16:01, 16 June 2017 (UTC)
There's a difference between Scottish English and Scots, though. Also, they treat Middle English as English, too, I think. (Do they treat Yola as English?) - -sche (discuss) 16:06, 16 June 2017 (UTC)
I don't think links to foreign-language entries should be entirely banned; they are sometimes useful, for example if there is no English word for something, but two or a few languages have a word for it that could be interlinked. (There are those who prefer to create SOP translations targets in such cases, but that gets harder to justify the fewer and more obscure the languages with words for the thing are.) Some of the information could perhaps be shoehorned into etymology sections (even for unrelated words, one could say "compare/contrast how the X language term for the same thing, Y, is formed"), but that seems suboptimal. - -sche (discuss) 16:06, 16 June 2017 (UTC)

Categories English_N-syllable_words

This may not be the most important issue in the world, but currently open compounds seem to be listed in the categories English_N-syllable_words, see for example this link[10]. I'm not convinced that open compounds such as extravehicular activity or venire facias de novo should belong to these categories. At least sensu stricto they are not words but dictionary entries. --Hekaheka (talk) 08:48, 16 June 2017 (UTC)

We call phrases words, e.g. "word of the day". Equinox 11:07, 16 June 2017 (UTC)
We can't change. Think of the implications for our slogan. DCDuring (talk) 11:17, 16 June 2017 (UTC)
We may or may not be able to change. I'm pointing out that this category is not the only instance of the issue. Equinox 11:30, 16 June 2017 (UTC)
Personally, I'd have no objection if we changed words to entries or terms in all such categories. In fact, I've suggested this before at some other forum. — SMUconlaw (talk) 13:58, 16 June 2017 (UTC)
I see no point in a category that lists N-syllable phrases. I could understand words. But then, on the other hand, I could just ignore the part that I find useless. --Hekaheka (talk) 14:21, 16 June 2017 (UTC)

Consistent headings of numeral versus number

has numeral. has numeral. has numeral. has numeral. 사#Numeral has numeral. 오#Numeral has numeral. 육#Numeral has numeral. 칠#Numeral has numeral.

has number, not numeral. 팔#Number has number. 구#Number has number.

구#Numeral has numeral, not number

Is it possible to provide a consistent heading, so that when I'm copying the hanja for each sino-Korean number, I can just change the hangul in the URL and get directly to the section I want? Alternatively, making both #Number and #Numeral link to the same place would work.

I hope I haven't stepped into an awful WikiWar with decades of fighting. AGrimm (talk) 01:29, 17 June 2017 (UTC)

These are both valid headers. WT:ELE suggests that Numeral is a part of speech while Number is a symbol. I don't know quite what that is supposed to mean and I am not familiar with Korean. Anyone else? Equinox 04:20, 17 June 2017 (UTC)

Inflectional suffixes: -bam, -bas, etc.

Do we want that? And right now, Category:Latin suffixes is a mess, because it gathers inflectional and derivational suffixes indiscriminately. --Barytonesis (talk) 17:20, 17 June 2017 (UTC)

I see no reason not to add inflectional suffixes like that. As for the category you mention, it occurs to me we should have a category Category:Latin derivational suffixes to match Category:Latin inflectional suffixes. — Eru·tuon 17:43, 17 June 2017 (UTC)
I've created the category derivational suffixes, and moved noun-forming suffixes, verb-forming suffixes, adjective-forming suffixes, and adverb-forming suffixes to be inside that category, rather than in the main category suffixes. This can be undone if editors disagree. — Eru·tuon 17:49, 17 June 2017 (UTC)
@Erutuon: Thanks. The categories you mentioned still appear in the main one, though. --Barytonesis (talk) 19:33, 17 June 2017 (UTC)
@Barytonesis: Yeah, it'll eventually update. Once a module change has been made, the software takes a while to regenerate the pages with the new code, and to change the members of categories. — Eru·tuon 19:35, 17 June 2017 (UTC)

2/3 majority

At some point, I'd like to mention explicitly in WT:Voting policy that a vote passes if it reaches a 2/3 majority. Apparently that's the true consensus and everybody knows it (although it's challenged and discussed sometimes), but it doesn't seem to be written as policy yet. --Daniel Carrero (talk) 03:54, 18 June 2017 (UTC)

Strange that it isn't mentioned. Are there any types of votes in which a bare majority would be enough? — Eru·tuon 04:13, 18 June 2017 (UTC)
I don't think all vote types should have the same criteria. For instance, a 2/3s majority is not sufficient for CheckUser (due to overriding policy) and I don't think it is sufficient for other user rights. For policies etc. I think that it is a good measure, for user rights I think it is low. Likewise, for removal of user rights I think 2/3s is high. Here is a previous discussion on the matter (thanks Dan). I fully support documenting what we consider a pass or fail (perhaps with a third range for no-consensus). - [The]DaveRoss 12:36, 18 June 2017 (UTC)
We could add "See meta:CheckUser policy#Appointing local Checkusers for the policy about appointing new checkusers." as part of the text in WT:Voting policy. --Daniel Carrero (talk) 19:40, 18 June 2017 (UTC)
I'm fine with having 2/3 majority for both addition and removal of user rights like most kinds of votes. But I think I can see the case for doing something different. Are there any specific suggestions? Maybe 3/4 majority for addition of rights and bare majority for removal of rights? --Daniel Carrero (talk) 02:12, 19 June 2017 (UTC)
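
The thresholds discussed in this thread can be made concrete with a small worked sketch. It is an illustration, not policy: it assumes abstentions are not counted and that reaching the threshold exactly counts as passing, neither of which is settled above, and the function name is hypothetical.

```python
from fractions import Fraction


def vote_passes(support: int, oppose: int,
                threshold: Fraction = Fraction(2, 3)) -> bool:
    """Return True if support / (support + oppose) reaches the threshold.

    Assumptions: abstentions are ignored, and a share exactly equal to the
    threshold passes. Fractions avoid floating-point rounding at the boundary.
    """
    total = support + oppose
    if total == 0:
        return False
    return Fraction(support, total) >= threshold
```

For example, 10 support and 5 oppose is exactly 2/3 and would pass under these assumptions, while 9 support and 5 oppose would not; passing Fraction(3, 4) or Fraction(1, 2) as the threshold models the alternative figures suggested for user rights.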

Latin demonym capitalization

Some Latin demonyms (like Germānus, Carthāginiensis or Celta) are capitalized, while others (like romānus or graecus) are not.
What is the correct way to handle these demonyms? All capitalized? None capitalized? One of the two, with the other as an alternative spelling form?
The dictionary I use capitalizes them all, but I'm not sure how these things are done around here.
I would really appreciate any information about this. –– GianWiki (talk) 22:57, 18 June 2017 (UTC)

I'd say we should capitalize them all. We shouldn't have Latin entries for romānus and graecus unless they also have non-demonym meanings. —Aɴɢʀ (talk) 13:48, 20 June 2017 (UTC)
@Angr I wonder, what is your motivation for this? Even Italian, Spanish or Romanian demonyms are lower case. If I am not mistaken, you wanted some Sanskrit terms capitalised as well? --Anatoli T. (обсудить/вклад) 01:50, 21 June 2017 (UTC)
@Atitarev: My motivation is that that's the way I'm used to seeing it done in modern editions of Latin texts. I've always seen the first sentence of De Bello Gallico written "Gallia est omnis divisa in partes tres, quarum unam incolunt Belgae, aliam Aquitani, tertiam qui ipsorum lingua Celtae, nostra Galli appellantur." As for Sanskrit, I don't recall ever endorsing capitalized terms at all; we write Sanskrit in Devanagari anyway. —Aɴɢʀ (talk) 10:51, 21 June 2017 (UTC)
I've noticed that my dictionary capitalizes parts of speech derived from demonyms as well (e.g. Graecānicus, Graecē, Graeculus < Graecus). Also, I'll try summoning @JohnC5 and @Metaknowledge, as active users on Latin, to get more insight on the matter. – GianWiki (talk) 01:22, 21 June 2017 (UTC)
@GianWiki: There's been some disagreement on this topic in the past, particularly in reference to days of the week and months. The Romans only had capitals, so it is an editorial choice on our part (as with the alternations of u ~ v, i ~ j, and inclusion of macra and diaereses). I favor capitalization though I concede we maybe should leave soft redirects from the lowercase entries. —JohnC5 01:50, 21 June 2017 (UTC)
@JohnC5: Capitalization with soft redirects from the lowercase entries (unless they have non-demonym meanings: germānus (pertaining to brothers or sisters) vs. Germānus (Germanic)) sounds like the best option to me – GianWiki (talk) 13:43, 22 June 2017 (UTC)
@GianWiki: Take it away, maestro. —JohnC5 14:34, 22 June 2017 (UTC)
I wonder, do Latin dictionaries written in languages that do not capitalize adjectives, like German, French, and Spanish, follow the rules of the language and not capitalize Latin adjectives? — Eru·tuon 01:28, 21 June 2017 (UTC)
Capitalisation of demonyms is a borrowing from modern English. It has infected even transliterations of languages, which do not have the distinction between lower and upper case letters. Oppose. --Anatoli T. (обсудить/вклад) 01:30, 21 June 2017 (UTC)
@Erutuon: du Cange (published in France) gives entries in all caps, but capitalizes Anglicos in the example sentence. —Aɴɢʀ (talk) 13:55, 22 June 2017 (UTC)
I have not capitalised any Middle Dutch words because they weren't capitalised in the original manuscripts. I think the same practice should be applied to Latin and other old languages. If we go as far as to include Gothic in its original script, then we should also write lowercase Old English. —CodeCat 14:04, 22 June 2017 (UTC)

Proper nouns in derived terms lists

I'm wondering if proper nouns really belong in derived terms lists, for example. @Donnanz --Victar (talk) 12:56, 20 June 2017 (UTC)

I think it's fine, unless you think there are too many possible derived terms and it would make the page too big. DTLHS (talk) 13:50, 20 June 2017 (UTC)
Right, that's my concern, with words like park, wood, bury, etc. --Victar (talk) 14:03, 20 June 2017 (UTC)
If there's an overabundance of place names in derived terms, they can be placed in a separate section. I have already done that somewhere. DonnanZ (talk) 14:31, 20 June 2017 (UTC)
One day, I split disease#Derived terms into the current three collapsible boxes. (obviously it's a long list of diseases, not of proper nouns; this is just an idea about how to handle long lists of derived terms) --Daniel Carrero (talk) 14:36, 20 June 2017 (UTC)
That works, but it seems pretty crazy to be manually adding all those instead of sorting them into a category, like Category:English terms containing disease, using a bot. --Victar (talk) 15:10, 20 June 2017 (UTC)
I support creating categories like Category:English terms derived from "apple" (see apple#Derived terms), Category:English terms derived from "disease", etc. --Daniel Carrero (talk) 15:17, 20 June 2017 (UTC)
I feel like there is a difference between Category:English terms derived from "apple" and Category:English terms compounded with "apple". --Victar (talk) 15:24, 20 June 2017 (UTC)
Is "compounded" a subset of "derived"? Maybe it would be a good idea to keep whichever "apple" category is the most inclusive one. --Daniel Carrero (talk) 15:28, 20 June 2017 (UTC)
I suppose. I'm trying to put to words the distinction between parklet and Park County. --Victar (talk) 15:39, 20 June 2017 (UTC)
Why do we need a category for something that will rarely be used and is most likely to be used from the entry itself, e.g. for apple? The following special search would provide what is needed: "apple"&title=Special:Search DCDuring (talk) 15:57, 20 June 2017 (UTC)
I'm thinking more about the page subheaders. --Victar (talk) 16:58, 20 June 2017 (UTC)
If we are talking about open compounds, a template containing the above code would generate a list on demand. Compounds spelled solid would require something different. Note that we already have 244 entries that include the word apple with the headword spelled open, including hyphenated forms. DCDuring (talk) 02:31, 21 June 2017 (UTC)
@Daniel Carrero: Is there precedent for using quotation marks in category names? --Victar (talk) 21:25, 20 June 2017 (UTC)
No, I think there's no precedent for using quotation marks in category names. We don't have to use them. But I think in some cases it might be clearer that we are talking about the word itself. For example: Category:English terms derived from "food" seems clearer than Category:English terms derived from food. --Daniel Carrero (talk) 23:37, 20 June 2017 (UTC)
Actually, it looks like we already generate similar categories, Category:Terms derived from the PIE word *ǵónu. --Victar (talk) 00:35, 21 June 2017 (UTC)
@Daniel Carrero: Given that, what about moving forward with Category:English terms derived from the word apple? --Victar (talk) 01:16, 21 June 2017 (UTC)
I think this one that I had said before is better: Category:English terms derived from "food". It's shorter than the alternative, even though both names are fairly long. "the word" might not even always be true if we decide to have a few compound term categories like Category:English terms derived from "pick up" (pick up#Derivations). --Daniel Carrero (talk) 02:25, 21 June 2017 (UTC)
I agree that they should be listed in theory; if we make an exception, it is only for the sake of page size and navigation. Equinox 14:41, 20 June 2017 (UTC)
I think proper nouns should be included in Derived terms lists, but perhaps the list could be split somehow into "compounds derived from x" and something else. — Eru·tuon 00:31, 21 June 2017 (UTC)
Personally, I think I'd prefer a solution similar to this:
====Derived terms====
[[:Category:Terms compounded with the word park|Terms compounded with the word park]]
* {{l|en|parkade}}
* {{l|en|parklet}}
--Victar (talk) 02:53, 21 June 2017 (UTC)
If the desire is to exclude or to separate placenames or proper nouns, let's just spell that out in the category titles or 'headers' of the separate tables: "terms derived from 'town'" vs "placenames derived from 'town'". To distinguish "compounds" from "derived terms" is to do something else entirely: "Sandtown" (and lots of other "-town"s) is as much a compound with "town" as "townhouse", so both belong in the same category if the categories are based on compounding vs other derivation. - -sche (discuss) 04:22, 21 June 2017 (UTC)

Nahuatl vs Classical Nahuatl?

So what exactly is the/our difference? Presumably, Nahuatl comprises modern lects, but apart from that?
Or for my own narrow purposes: are words like coyote, chocolate, avocado, tomato, guacamole and their equivalents in many seemingly predominantly European languages from Nahuatl or Classical Nahuatl? Does the perceived inconsistency (see Category:Terms derived from Nahuatl, Category:Terms derived from Classical Nahuatl) reflect actual difference, or simply conceptual confusion (begrebsforvirring)?__Gamren (talk) 20:11, 20 June 2017 (UTC)

Oh, I forgot to link to [11], Wiktionary:Beer_parlour/2008/September for prior discussions.__Gamren (talk) 20:21, 20 June 2017 (UTC)
As the discussion you link to suggests, "nah" seems to be a relict of Wiktionary's early history when we often copied the ISO's codes for both macrolanguages and their constituent varieties, both unintentionally (as a result of importing codes en masse) and apparently sometimes intentionally as a result of a lax attitude towards such inconsistencies. But it obviously doesn't make sense to have both the macrolanguage nah and the subvarieties nci, nch, nhn, etc. There is considerable debate over how mutually intelligible the varieties are, with some sources saying there is little to no mutual intelligibility and speakers often cannot understand another variety at all, and other sources saying they are "largely mutual intelligible". - -sche (discuss) 02:42, 21 June 2017 (UTC)

Unicode 10.0.0 released

Unicode 10.0.0 was released on 20 June 2017. You can now make entries with the new characters (if you can). [12] --Octahedron80 (talk) 05:27, 21 June 2017 (UTC)

🤟 (which is supposed to be the "I love you" hand sign). —Justin (koavf)TCM 06:46, 21 June 2017 (UTC)

There is also the list of variation sequences (which is not yet supported in most fonts). This includes CJK compatibility ideograph mappings. --Octahedron80 (talk) 07:15, 21 June 2017 (UTC)

Thanks for the notice. I see that characters were added to, e.g., Zanabazar Square. Do we need to update the range of characters covered by "Zanb" as listed in Module:scripts/data? - -sche (discuss) 14:25, 21 June 2017 (UTC)
I already pre-updated the module for new scripts. --Octahedron80 (talk) 10:16, 22 June 2017 (UTC)

About attesting symbols

I'd like to discuss symbols like these, which often have entries:

I don't know if we want them here.

Let's assume for a second that these symbols are unwanted. In that case, we would probably be able to delete all or most of the entries listed in these appendices. I think we wouldn't even need RFD/RFV to do it. They are unlikely to be found in running text by default. Except the emoticons, but they are mostly found on the internet (which we mostly don't accept for attestation purposes, as you know) as opposed to books. (That said, I added two quotations from a book that uses the domino tile "🁊" in running text.)

On the other hand, let's assume for a second that these symbols are wanted. I would be fine with attesting symbols as symbols -- that is, in drawings, pictures, comics and other contexts, as opposed to only in running text.

  • Attest map symbols by finding them in durably-archived maps.
  • Attest technical symbols by finding them in durably-archived technical books or manuals, etc. (I'm thinking = pause, = power symbol, 🔗 = hyperlink, 🔇 = mute, = refueling needed...)
  • Attest the heart (♥) as a symbol of love by finding it in this context in a durably-archived drawing or something.
  • I got a quotation for 💡 meaning "idea" from a textbook, but it could easily be found in comics.
  • Attest computer symbols (like 💾 for "save") by finding them in durably-archived computer screenshots, or even consider their actual use in durably-archived software. (That said, I got a quotation for ⇆ meaning "Tab symbol" in running text in a 1989 video game -- "⇆Tab to Move Down to Notes Button". In 📋, I mentioned that Windows 3.1 uses it in lieu of a quotation, but it could equally be from a Windows 3.1 screenshot in a book.)
  • Accept uses of symbols in durably-archived movies, comics and video games. (I got a few citations for 🛇 meaning "prohibition symbol" from cartoons.)

Let me know if there are any ideas/comments. --Daniel Carrero (talk) 07:18, 21 June 2017 (UTC)

"Attesting symbols as symbols" e.g. in comics would make us a picture book, not a dictionary. Equinox 14:32, 21 June 2017 (UTC)
The last part is true, this is not the job of a dictionary [sense 1: "A reference work with a list of words ..."].
Maybe sense 1 is the only one that matters to us -- we would be a normal dictionary and list words only, not symbols. This is one idea that makes sense, and is the first one I started to discuss in my message above.
Maybe this could be the job of a dictionary of another sense [sense 2: "By extension, any work that has a list of material organized alphabetically..." although the "alphabetically" part probably can't apply to all languages and contexts.]
Google Books has some books called "Dictionary of Symbols" which could be called "not dictionaries" too.
Consider the entry ♥. Do you think we should delete the senses "hearts (on playing cards)", "love", "(video games) a hit point", "(video games) healing" and (Japanese) "An emoticon indicating a smooth and pleasant voice."?
If we kept them, that would not make us a "picture book" as defined in the linked entry. It would simply make us a dictionary that lists meanings for "♥". Which is unusual, I know. --Daniel Carrero (talk) 18:26, 21 June 2017 (UTC)
The "love" sense is important, because it appears as a verb in English syntax ("I ♥ cookies"). The others I am not sure about. The playing-card sense only really says "pictures of hearts are used on playing cards"; so what? A picture of the Grim Reaper/Death appears on a tarot card but that's also a picture. Likewise, the video-game sense isn't a sense of a word: it's just symbology: the heart as an image is generally used to represent life or health. Equinox 19:10, 21 June 2017 (UTC)
As I'm sure you know, the entry ♥ has two separate senses for "love" in different languages: the English verb as in "I ♥ cookies", and a general Translingual "love" that is not necessarily a verb. Do you think we should delete the Translingual one? Likewise, ☠ has four Translingual senses: "death", "poison", "pirates" and "toxic". Do you think we should delete them all?
I wish to defend one specific sense we mentioned: I think we should keep ♥ "hearts (on playing cards)", because poker books regularly have sentences like this, using the playing card symbols in running text: "In that situation, you should raise with AA♠." --Daniel Carrero (talk) 19:29, 21 June 2017 (UTC)
Have you forgotten about the idiomaticity policy? --Octahedron80 (talk) 03:17, 22 June 2017 (UTC)
I also think it's a very, very bad idea for us to take the approach that "anything in Unicode is automatically a dictionary-worthy symbol". Unicode is full of all kinds of rubbish these days. Equinox 19:11, 21 June 2017 (UTC)
Could you mention maybe one or two examples of Unicode rubbish? --Daniel Carrero (talk) 19:29, 21 June 2017 (UTC)
🗑 - [The]DaveRoss 20:00, 21 June 2017 (UTC)
Compatibility relics. Skin colour modifiers for smileys. I know there's lots more crap but I don't want to spend my time hunting it down. Equinox 20:04, 21 June 2017 (UTC)
Some votes (listed at Wiktionary:Character variations) supported redirecting a few compatibility characters to "actual" entries when possible. For example, redirects to km (technically this entry was not part of the votes, FWIW).
Skin color modifiers are control characters. They don't have any shape by themselves. I support not having any entries for skin color modifiers. --Daniel Carrero (talk) 20:17, 21 June 2017 (UTC)

Sitelinks are enabled on Wikidata for Wiktionary pages (outside main namespace)


Short version: Since yesterday, we are able to store the interwiki links of all Wiktionary namespaces (except main, citations, user and talk) in Wikidata. This will not break your Wiktionary, but if you want to use all the features, you will have to remove your sitelinks from wikitext and connect your pages to Wikidata.

Important: even if it is technically possible, you should not link Wiktionary main namespace pages from Wikidata. The interwiki links for them are already provided by Cognate.

Long version available and translatable here.

If you encounter any problem or find a bug, feel free to ping me.

Thanks, Lea Lacroix (WMDE) (talk) 08:24, 21 June 2017 (UTC)

I tried it on Category:English language and it works well. d:Q7923975. Linked pages must correspond to the same subject, as on Wikipedia etc. Otherwise, a new Q item must be created, for example for parts of speech. However, some Wiktionaries do not allow interwiki links to be cleaned up (removed). This must be discussed locally. --Octahedron80 (talk) 09:38, 21 June 2017 (UTC)
This can also include interwiki links from closed wikis, but they must be entered manually. --Octahedron80 (talk) 01:48, 22 June 2017 (UTC)
@Lea Lacroix (WMDE) May I ask why d:Q1860 was not used for this instead? —CodeCat 12:33, 21 June 2017 (UTC)
Hello, d:Q1860 is used to describe the concept of English language in general. It has, for example, statements like the number of speakers. This is not exactly what we want to describe with d:Q7923975, dedicated to the categories on Wikimedia projects, that's why we use a different item. Lea Lacroix (WMDE) (talk) 13:01, 21 June 2017 (UTC)
My short answer: it's not the same type. :) --Octahedron80 (talk) 15:02, 21 June 2017 (UTC)

@Lea Lacroix (WMDE) There is some problem with d:Q4167836. Wiktionary does not appear and I can not add any page like Wiktionary:Categorization. --Vriullop (talk) 15:12, 21 June 2017 (UTC)

Another problem: Category:English language links to the Dutch Wiktionary category nl:Categorie:Woorden in het Engels ("Words in English"). This category corresponds to our category Category:English lemmas instead, which contains all words in English like the Dutch category. The Dutch Wiktionary apparently has no equivalent to our Category:English language. The Afrikaans Wiktionary has the same problem. —CodeCat 16:38, 21 June 2017 (UTC)

Actually, nl:Categorie:Woorden in het Engels doesn't correspond to any of our categories. It seems to be a combination of Category:English lemmas and CAT:English non-lemma forms, as it includes things like abashes and abashing that we consider non-lemma forms. — Eru·tuon 16:55, 21 June 2017 (UTC)
I've also noticed this, other non-English wikis often have similar systems. Trying to interlanguage-link between Wiktionaries is complicated and at this point probably pointless due to the lack of a uniform system for the categorization of words across all Wiktionaries. — Kleio (t · c) 16:59, 21 June 2017 (UTC)
So how do we fix this exactly? Can we just remove the Dutch category from Wikidata manually? What about all the other languages? —CodeCat 17:08, 21 June 2017 (UTC)
Oddly, it:Categoria:Parole in inglese contains no words, so it's a better match for our category even though it's named "words in English". ie:Categorie:Parol in anglés by contrast contains nothing but words, no categories. co:Categoria:Parolle in lingua inglese meanwhile contains words and categories, like the Dutch one, but appears restricted to lemmas. —CodeCat 17:13, 21 June 2017 (UTC)
Yes, nl:Categorie:Woorden in het Engels and Category:English lemmas are different in nature (in what they cover) and so they should probably not be linked; there should apparently be separate Wikidata items for "category of all words in English" and "category of all lemmas in English". In general, interwiki links between categories can be handled (not on the technical level of Cognate vs Wikidata, but on the level of "do interwiki links exist?") just like links between main-namespace pages. If nl.Wikt has an entry foobar and we don't, they can't interwiki-link that entry to our wiki, and if we have an entry foo and they don't, we can't link our entry to them, but when both en.Wikt and nl.Wikt have entries for bar, they can be linked. Likewise, there's no page on our wiki for nl:Categorie:Woorden in het Engels to be linked to (unless we regard it as a fit with "Category:English language", which, actually, it seems to be, if it's their highest-level category for English), but other categories can still be linked. - -sche (discuss) 17:19, 21 June 2017 (UTC)
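The linking rule in the comment above amounts to a set intersection: a title can be linked only if both wikis have a page for it. A minimal sketch in Python, using the placeholder titles from the comment; the entry sets and the function name `linkable_titles` are assumptions for illustration, not part of Cognate or Wikidata:

```python
def linkable_titles(wiki_a: set[str], wiki_b: set[str]) -> set[str]:
    """Return the titles that exist on both wikis and can therefore be linked."""
    return wiki_a & wiki_b


# Hypothetical entry lists, taken from the foo/foobar/bar example above.
en_wikt = {"foo", "bar"}
nl_wikt = {"foobar", "bar"}

shared = linkable_titles(en_wikt, nl_wikt)
print(sorted(shared))  # ['bar']
```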
Part of the confusion seems to be the conflation of several different roles for the categories:
  1. Top-level category for a language.
  2. Contains lemmas.
  3. Contains non-lemmas.
  4. Contains categories for parts of speech.
The Dutch category fulfills all four roles, while we have three separate categories for roles 1-3 while role 4 is combined with role 2. The Italian category fulfills role 1, so it matches our role 1 category, but it also fulfills role 4 which ours does not. The Interlingue category fulfills only role 2, but not 4 like ours does. The Corsican category fulfills roles 1, 2 and 4. —CodeCat 17:41, 21 June 2017 (UTC)
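The role breakdown above can be written out as sets, which makes the mismatches easy to see. This is only a sketch: the role numbers follow the list above and the category names are the ones mentioned in this thread, but the dictionary layout and the helpers `exact_matches` and `overlapping` are assumptions made for illustration.

```python
# Roles from the list above:
# 1 = top-level category, 2 = lemmas, 3 = non-lemmas, 4 = part-of-speech categories.
EN_CATEGORIES = {
    "Category:English language": {1},
    "Category:English lemmas": {2, 4},
    "Category:English non-lemma forms": {3},
}

OTHER_CATEGORIES = {
    "nl:Categorie:Woorden in het Engels": {1, 2, 3, 4},
    "it:Categoria:Parole in inglese": {1, 4},
    "ie:Categorie:Parol in anglés": {2},
    "co:Categoria:Parolle in lingua inglese": {1, 2, 4},
}


def exact_matches(roles):
    """English categories that fulfil exactly the same set of roles."""
    return [name for name, en_roles in EN_CATEGORIES.items() if en_roles == roles]


def overlapping(roles):
    """English categories that share at least one role."""
    return [name for name, en_roles in EN_CATEGORIES.items() if en_roles & roles]


for name, roles in OTHER_CATEGORIES.items():
    print(name, "| exact:", exact_matches(roles), "| overlaps:", overlapping(roles))
```

Under these assumptions none of the four categories has an exact English counterpart, and most overlap with more than one English category, which is the mismatch being discussed.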
I can confirm that nl:Categorie:Woorden in het Engels is the top-level category for English, that's why it's linked with Category:English language. The highest-level category for a language is always linked with the highest-level language category in other Wiktionaries, even if the subcategories/pages in them differ. The problem of not exactly matching category interwikis was always there; now, the only difference is that the interwikis will be hosted at Wikidata, instead of locally. Pinging User:Malafaya, who has been maintaining category interwikis with his bot (MalafayaBot) and solving category interwiki conflicts for years, in case he wants to share his experience / advice. -- Curious (talk) 22:07, 21 June 2017 (UTC)
(IMO the fundamental problem is that each Wiktionary has its own structure... —suzukaze (tc) 08:02, 22 June 2017 (UTC))

Can Wikidata store interwikis for Unsupported titles/Colon? --Daniel Carrero (talk) 02:37, 22 June 2017 (UTC)

@Vriullop Some items related to Wikimedia projects are protected, because they are used a lot and attract vandalism. For these, you can add a message on the talk page, with the links you want to add, and an admin will take care of it.
@Daniel Carrero Thanks for noticing; indeed, this is not currently supported by either Cognate or Wikidata links. We're going to investigate the best way to provide automatic links here.
Thanks for your feedback, Lea Lacroix (WMDE) (talk) 07:46, 22 June 2017 (UTC)

Semantic loans versus calques

@CodeCat proposed on my talk page that semantic loan categories, such as Category:Russian semantic loans from English, be merged with calques, such as Category:Russian terms calqued from English.

Theoretically the difference is that semantic loans are single morphemes, like Russian мышь (myšʹ, mouse), while calques, such as Latin accusativus, have more than one morpheme, but this is probably not consistently enforced. Recently I was recategorizing Greek γύρισμα (gýrisma) as a calque, as it consists of two morphemes.

I think it would be much simpler to merge the two categories. I don't see any practical value to the distinction.

I'm going to ping @Dixtosa and his alter ego @Giorgi Eufshi, because he created the template {{semantic loan}}. — Eru·tuon 03:54, 22 June 2017 (UTC)

No, the difference is that in the case of a semantic loan the term already exists in the borrowing language and already has some meaning (hence the name 'semantic'). And I think this difference is huge. Yes, I agree it is very similar to calques and might as well be a special case, but having calques and semantic loans in different places lets us see which languages enriched which language in terms of new words. --Dixtosa (talk) 04:40, 22 June 2017 (UTC)
Ahh, thanks for the explanation. I guess I had misunderstood. That is a more fundamental difference: whether a new word was created or not. — Eru·tuon 04:46, 22 June 2017 (UTC)

Category names in which "words" should be replaced with "terms"

While editing -oecious and adding the suffix -ous to the etymology, I found that the category name Category:English words suffixed with -ous is incorrect, as a suffix is not a word. It should be Category:English terms suffixed with -ous. Similarly, Category:English words by suffix should be Category:English terms by suffix.

(An alternative would be to add a |pos=suffix parameter, which would put -oecious in Category:Suffixes suffixed with -ous instead. That may or may not be simpler. The category name does sound a little funny.)

Moving the entries would simply require changing some code in Module:compound. There would be a lot of deletion, moving, and creation of categories. Moving, because some of the categories (see, for instance, Category:English words suffixed with -ious) have text besides the category boilerplate template. Some of that could be done with bots.

Another category in the entry -oecious suffering from the same problem is the syllable count category Category:English 2-syllable words. That should be English 2-syllable terms, as has been discussed before. — Eru·tuon 06:29, 22 June 2017 (UTC)

I support term everywhere because it has wider meaning than word; it could be suitable for long proper nouns (like countries), proverbs and phrasebook sentences. At the moment, I just use current namings for modules. --Octahedron80 (talk) 10:22, 22 June 2017 (UTC)
I don't think terms made of multiple words should be categorised by number of syllables. —CodeCat 13:09, 22 June 2017 (UTC)
Well, two things: there should probably be a subcategory for words by syllable count. And what about bound morphemes (prefixes, suffixes, whatever else)? Should they also not be categorized? — Eru·tuon 19:50, 22 June 2017 (UTC)
It seems to me the best solution is to have a category for "terms by syllable count" and within it subcategories for "words by syllable count" and "bound morphemes by syllable count". That way everyone's preferences are accommodated. — Eru·tuon

Category:Anatomy vs. Category:Body

I think these categories are currently not well delineated. I suppose the first one is meant for terms which are only used in a medical context, and not by laymen? --Barytonesis (talk) 14:08, 22 June 2017 (UTC)

Also, would it be possible to make Category:Teeth a subcat of Category:Face? --Barytonesis (talk) 14:28, 22 June 2017 (UTC)
Yes; in theory at least, CAT:Anatomy is for technical terms and CAT:Body is for everyday words pertaining to the body, though in practice the distinction is hardly observed. —Aɴɢʀ (talk) 19:50, 22 June 2017 (UTC)

Search results from Wiktionary now part of Wikipedia's search system

Just to let you know, as announced via mailing list service, English Wikipedia is now receiving search results of this project, Wiktionary, intended to direct Wikipedia users to this project. Currently, an option to suppress the search results of this project from the English Wikipedia search system is proposed at Village pump's "proposal" subpage, where I invite you to comment. --George Ho (talk) 19:16, 22 June 2017 (UTC)

What about in the opposite direction? DTLHS (talk) 19:23, 22 June 2017 (UTC)
I'd love such a feature here, especially if it enabled me to select which projects to include. DCDuring (talk) 19:37, 22 June 2017 (UTC)
It would facilitate article creation, whenever it's necessary to research the meanings of terms. Andrew Sheedy (talk) 20:30, 22 June 2017 (UTC)

Remaining pages with interwikis

User:DTLHS/cleanup/pages with interwikis. Pages such as uexo should be looked at carefully. DTLHS (talk) 21:30, 23 June 2017 (UTC)