Wiktionary:Beer parlour/2016/June

Colored box around closed votes?

I think it would be useful if we put a colored box around votes after they're closed, the way we do with RfDs when they're archived. Purplebackpack89 13:47, 1 June 2016 (UTC)

It might be nice, but you could overlook that, too, like you overlooked the "Status/Votes" column. An absent-minded mistake by you doesn't mean we have to rearrange everything to make it impossible for you to make the same absent-minded mistake again. Most people would just say "oops- my bad" and let it go. Chuck Entz (talk) 01:29, 2 June 2016 (UTC)[reply]
Purplebackpack89 also overlooked the decision at the end of the vote page. --Daniel Carrero (talk) 01:59, 2 June 2016 (UTC)[reply]
@Chuck Entz I'm not the dullest tool in the shed. If I make that absent-minded mistake, it's likely others would too. It's unlikely I'd notice something like the whole vote being shaded red, blue or green. @Daniel Carrero I don't think you're seeing the problem. Because the decision is at the bottom instead of the top, it is BEYOND the voting section. Purplebackpack89 16:47, 2 June 2016 (UTC)[reply]
Yeah, but it also has the close date near the top of the vote. Maybe you should give the "I'm not the dullest tool in the shed" thing a rest. —Μετάknowledgediscuss/deeds 17:49, 2 June 2016 (UTC)[reply]

Closed votes still in "current votes" section

Should votes that are closed be removed from the "current votes" section and put in a "recently closed" section? It seems bad form to have open votes and closed votes both as "current" Purplebackpack89 13:49, 1 June 2016 (UTC)[reply]

No. Not worth the extra visual clutter and the extra work. It's hard enough to make things idiot-proof- do we have to make things Purplebackpack89-proof, too?
Please note that I'm not calling you or likening you to an idiot (an idiot would be easier to anticipate). Chuck Entz (talk) 01:43, 2 June 2016 (UTC)[reply]
@Chuck Entz They have these fail-safes on Wikipedia. It's not like I'm suggesting anything revolutionary. And you ARE kinda suggesting that mistakes I make are mistakes nobody else could ever conceivably make, while defending a very confusing process. Purplebackpack89 16:48, 2 June 2016 (UTC)[reply]
Absent-minded mistakes aren't related to level of intelligence- if anything, people with more going on in their minds are more likely to make them. I also am not saying that you make mistakes that nobody else makes (except with regards to misinterpreting others' intent- but that's different). No, my point was that your response to your mistakes is different: there's no need to find someone or something to blame for an absent-minded mistake- we all make them, and no one would bat an eyelash at your admitting to one. You're not going to be singled out from the midst of the flock and eaten by wolves if you show signs of weakness. Chuck Entz (talk) 02:40, 3 June 2016 (UTC)[reply]
This isn't about blame, though, it's about improvement. IMO, there are a lot of ways in which Wiktionary is organized that could be better. This is one of them. Purplebackpack89 04:11, 3 June 2016 (UTC)[reply]

Sending thanks

This may seem bloody obvious to some, but…! On the history page you are given the option of thanking an editor. When chosen you are asked "Do you want to send public thanks? Yes or No" - the question could be taken as ambiguous. If I choose "No", am I (1) cancelling the thanks, or (2) sending thanks privately? What does "public thanks" mean? Where are they published? — Saltmarshσυζήτηση-talk 06:38, 2 June 2016 (UTC)

Special:Log/thanks has a list of them. Wyang (talk) 06:46, 2 June 2016 (UTC)[reply]
Yes, I wasn't too sure what it meant when I first thanked someone for an edit. I went with "Yes" just in case, but I imagine many newer users are also confused by the ambiguity. Andrew Sheedy (talk) 07:41, 2 June 2016 (UTC)[reply]
@Saltmarsh "Send public thanks for this edit?" If you click no, you are cancelling the thanks.
The question probably should be: "Do you still want to send thanks, with the full knowledge that it will be public? (Yes/No)"
Just to make sure, I clicked "Thanks" in your Beer Parlour edit and then clicked No. If you received my thanks, then I was wrong. --Daniel Carrero (talk) 12:40, 2 June 2016 (UTC)[reply]
That's a bit long, though. "Really send public thanks?" would suffice. Equinox 13:38, 2 June 2016 (UTC)[reply]
Am I missing something, or does the thanks log not in fact indicate the specific edit that was "thanked"? Equinox 13:40, 2 June 2016 (UTC)[reply]
If we want to configure this, it seems the message to edit is mediawiki:Thanks-confirmation2. —msh210 (talk) 14:22, 2 June 2016 (UTC)
It was "Send public thanks for this edit?" and I've changed it to "Send thanks for this edit? It will be public.". How's that? —msh210 (talk) 15:52, 2 June 2016 (UTC)
I like Equinox's version ("Really send public thanks")--Dixtosa (talk) 15:54, 2 June 2016 (UTC)[reply]
Personally, I don't like when software uses the word "really". It sounds too colloquial. --WikiTiki89 16:01, 2 June 2016 (UTC)[reply]
(You should see the appalling slang in Office 2013!) Alternatively, we could just reduce it to "Send thanks for this edit?", and document the fact that thanks are public elsewhere. We don't warn about public-ness for other common wiki operations. Equinox 16:57, 2 June 2016 (UTC)[reply]
I've seen it, and I don't like it. --WikiTiki89 17:40, 2 June 2016 (UTC)[reply]
Equinox's "Send thanks for this edit?" seems to solve the problem succinctly. — Saltmarshσυζήτηση-talk 04:59, 3 June 2016 (UTC)
Yeah, but it doesn't solve the problem of notifying the user that the thanks will be public. --WikiTiki89 14:27, 3 June 2016 (UTC)[reply]
Why is it necessary to double-check that a user really wants to do what he just said to do? I can understand having that sort of failsafe in place for something potentially damaging, but not for something as innocuous as sending thanks. Can't we just eliminate the message altogether and allow clicking on "thank" to immediately do what it says it does? —Aɴɢʀ (talk) 14:23, 3 June 2016 (UTC)[reply]
It's right next to "undo". Really not the kind of situation where you want a slip up. Korn [kʰũːɘ̃n] (talk) 17:16, 3 June 2016 (UTC)[reply]

Potential Bot for Adding LSJ and L&S Links to Ancient Greek and Latin Entries

Hello. In the last few days I have edited the L&S and LSJ templates and modules so that links to the dictionaries resolve correctly from the page names, without use of arguments, in a very large proportion of cases. The exceptions mainly involve proper nouns, affixes, non-lemma forms, and alternative spellings which are not precisely bugs. I have tested a robot called OrphicBot to add LSJ external links to the subset of 4,062 of Wiktionary's approximately 7,000 Ancient Greek entries which are not already linked, which are lemmas, and for which the bare template is tested to produce a valid result. Since, for example, almost all German entries link to the Duden dictionary, it seems consistent to include a link to a freely available dictionary for Greek. I also think it could be quite helpful, since too much inconvenience, perhaps, in Hellenistic pursuits is merely typographical in nature. Equivalently, the Wiktionary Latin section is much more developed, with nearly 30,000 lemma entries, as I recall. If it seems reasonable to others, I would like also to add links to the L&S dictionary via template where these are not already present. The source code (albeit grossly formatted, and in perhaps a still rough iteration) is linked in the user page of OrphicBot, and a small test run can also be seen in the catalogue of that user's contributions. If these edits seem reasonable to make to others here, I will put the bot user status question to vote in the voting area. Thank you. Isomorphyc (talk) 03:04, 3 June 2016 (UTC)[reply]

I don't know about L[ewis] & S[hort], but we already have a template {{R:LSJ}} that makes links to Liddell and Scott. The problem is the large number of Ancient Greek entries that don't use it, but instead have merely a link to Wikipedia's article on the Liddell and Scott dictionary. What I'd like a bot to do is go through and change all instances of *[[w:LSJ|LSJ]] to *{{R:LSJ}}, adding any necessary arguments as well. —Aɴɢʀ (talk) 14:28, 3 June 2016 (UTC)[reply]
Edited for clarity: this is a problem I have felt as well. Here are options by increasing aggressiveness:
1) just add {{R:LSJ}} to External Links where valid, even if an LSJ-mention exists.
2) replace all LSJ-mentions with LSJ-templates where valid; potentially this effaces bibliographical information (negligibly, I think: if an LSJ mention happens to imply the paper dictionary where it differs from the Perseus version, or where it implies the preface rather than the headword entry, for example.) This is close to my preference.
3) move all existing LSJ links to External links for consistency. This is consistent and easy to use, but it destroys far too much bibliographical information.
Additional options/issues:
- Add an additional template to categorise lemmas with no valid entry in LSJ for manual linking. Mostly these are a few hundred non-Attic dialectical spellings and some number of Byzantine words. The former will usually be in LSJ with Attic spellings and the latter will not. A few other examples are prefixes and suffixes. The number is not large and I think this is worth doing.
- I would want to skip over inflected forms; given there are literally millions potentially, to destem and link seems like clutter.
Are there other Greek desiderata that can be addressed?
Isomorphyc (talk) 01:55, 5 June 2016 (UTC)[reply]
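The replacement suggested above, turning bare *[[w:LSJ|LSJ]] bullets into *{{R:LSJ}}, could be sketched roughly as follows. This is only a minimal illustration in Python and not OrphicBot's actual code: the function name replace_lsj_mentions is hypothetical, the pattern only touches bullets that contain nothing but the Wikipedia link (so mentions carrying extra bibliographical detail are left alone), and it does not try to add template arguments or check that the bare template resolves.

```python
import re

# Matches a bullet whose only content is the Wikipedia link [[w:LSJ|LSJ]],
# allowing optional spaces after the asterisk and at the end of the line.
LSJ_MENTION = re.compile(r"^\*[ \t]*\[\[w:LSJ\|LSJ\]\][ \t]*$", re.MULTILINE)


def replace_lsj_mentions(wikitext: str) -> str:
    """Replace bare LSJ mentions with the {{R:LSJ}} template (hypothetical helper)."""
    return LSJ_MENTION.sub("*{{R:LSJ}}", wikitext)
```

Bullets such as "* [[w:LSJ|LSJ]], page 12" are deliberately not matched, which corresponds to the concern in option 2 about effacing bibliographical information.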
Hello @JohnC5, @Wikitiki89, @Angr, @Metaknowledge, @Chuck Entz -- thank you all for participating in my small discussion about my robot. I have opened a vote on this topic in the voting [area]. I would respect any of you if you chose not to support me in this, or to abstain, especially since I am so new here; but I would also be gratified should any of you choose to vote. Naturally I would be exceedingly gratified for any of your support. I hope that my recent activity has given some sense of the types of contributions I like to make to Wiktionary. I would still be very grateful for any further concrete References desiderata; I have posted a few blocks of samples on the user page of User:OrphicBot should it bring anything to mind that anyone might like-- or indeed might not like about the presentation. Thanks. Isomorphyc (talk) 07:27, 16 June 2016 (UTC)[reply]
Hi @JohnC5, I've been working on the pronunciations a bit. Does the new robot edit on diff:χρηστότης look worth proceeding with? I'm posting this here mainly so anyone interested knows I am working on this and can object if desired. It seems 1/8th of the grc-ipa-rows usages have all unambiguous vowels (approx. 235), and can be replaced with no arguments. If this looks reasonable I'll proceed with the following steps: 1) test for a,i,u in diphthongs and call grc-IPA with no arguments 2) test for breves or macrons in head=... and generate arguments 3) look into finding head=... arguments from LSJ or possibly flagging ambiguous vowels missing head=... arguments, either from the grc-noun (and similar) or with a robot. Also: I noticed grc-ipa-rows produces unexpected output pretty regularly, so I am not using it to test the correctness of or generate any grc-IPA arguments. Isomorphyc (talk) 18:06, 20 June 2016 (UTC)
@Isomorphyc: The diff for χρηστότης looks good to me. I agree that all unambiguous entries should be changed, and your plan for proceeding seems logical. If we can cut down the number of ambiguous ones to a few hundred, we can fix the rest by hand. —JohnC5 00:07, 21 June 2016 (UTC)[reply]
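The "all unambiguous vowels" test discussed above might look something like the following Python sketch. It is not the bot's code. It assumes that α, ι and υ are the only vowels whose length is not evident from the spelling, that they count as unambiguous inside a diphthong or when they carry a macron, breve, circumflex or iota subscript, and that the diphthong list below is sufficient; all names are hypothetical and edge cases are handled conservatively.

```python
import unicodedata

AMBIGUOUS = set("αιυ")  # vowels whose length the plain spelling does not show
DIPHTHONGS = {"αι", "ει", "οι", "υι", "αυ", "ευ", "ου", "ηυ"}  # assumed list
DIAERESIS = "\u0308"
# Marks that settle the length: macron, breve, circumflex, iota subscript.
LENGTH_MARKS = {"\u0304", "\u0306", "\u0342", "\u0345"}


def split_letters(word):
    """Return a list of (base letter, set of combining marks) pairs."""
    letters = []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch):
            if letters:
                letters[-1][1].add(ch)
        else:
            letters.append((ch, set()))
    return letters


def has_only_unambiguous_vowels(word):
    """True if no α, ι or υ is left with undetermined length (sketch only)."""
    letters = split_letters(word)
    i = 0
    while i < len(letters):
        base, marks = letters[i]
        if i + 1 < len(letters):
            nxt, nxt_marks = letters[i + 1]
            # Treat the pair as a diphthong only if the first vowel is bare
            # and the second does not carry a diaeresis.
            if base + nxt in DIPHTHONGS and not marks and DIAERESIS not in nxt_marks:
                i += 2
                continue
        if base in AMBIGUOUS and not (marks & LENGTH_MARKS):
            return False
        i += 1
    return True
```

Under these assumptions χρηστότης passes (its vowels are η, ο, η), while a word such as φίλος would be flagged for a head=... argument or manual review.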
Hi @JohnC5, per our discussion about the memory utilisation of the data modules, I have sharded the six largest across four shards. Here is a table of the changes:
new location | new name | old name | old size | desc
Module:data tables/data[0-2] | grc_RLBG_lemma_to_index | Module:R:LBG/data | 2.2 M | lemma to index
Module:data tables/data[0-2] | grc_RWoodhouse_lemma_to_headwords | Module:R:Woodhouse/reverse index | 1.4 M | lemma to headwords
Module:data tables/data[0-2] | grc_RWoodhouse_lemma_to_infinitives | Module:R:Woodhouse/psia1 to infs | 737 K | lemma to infinitives
Module:data tables/data[0-2] | la_RMA_index_to_phrases | Module:R:M&A/ix to phrase | 400 K | index to phrases
Module:data tables/dataUnsharded | grc_R:Cunliffe_lemma_to_index | Module:R:Cunliffe/data | 277 K | lemma to index
Module:data tables/dataUnsharded | la_R:M&A_lemmas_no_collision_to_ix_phrase | Module:R:M&A/lemmas no collision to ix phrase | 143 K | lemma to indices
Data access can now take this form: require("Module:data tables").index_table("grc_RCunliffe_lemma_to_index", title) instead of this form: mw.loadData("Module:R:Cunliffe/data")[title]. Hence, now only the shard which contains data for a given lemma will have to be loaded. Better still, since sharding takes place by title-key, all modules retrieving data for a given key on the same page will require only one module load. It turns out that it takes only about ten lines of Python to reshard the data into an arbitrary number of files. I would suggest thirty to one hundred, giving a file size of 50 K - 150 K per shard. An added benefit is that new tables, should they be needed for other modules can be added to dataUnsharded by hand in whole blocks. If this scheme seems to work, resharding can take place mechanically on an as-needed basis or through watching the file sizes. I view this solution as about six flavours of horrible (data masquerading as code), but it seems to solve our memory ceiling problem in a scalable and extensible way. I haven't altered the production modules yet, or tested very extensively, but preliminarily it seems to work; I wanted to mention it here in case there are obvious objections which have not come to my mind. Isomorphyc (talk) 03:40, 29 June 2016 (UTC)[reply]
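For readers who want to picture the resharding step, here is a minimal Python sketch of sharding by title-key. It is not the script referred to above: the function names, the in-memory dictionary layout and the choice of CRC32 as the hash are all assumptions made for illustration, and the Lua side would need to compute the same shard number for a given title.

```python
import zlib
from typing import Any, Dict, List


def shard_index(title: str, num_shards: int) -> int:
    """Map a page title to a shard number with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(title.encode("utf-8")) % num_shards


def reshard(tables: Dict[str, Dict[str, Any]], num_shards: int) -> List[Dict[str, Dict[str, Any]]]:
    """Split every table across num_shards shards, keyed by title.

    All tables' rows for one title end up in the same shard, so a page
    that needs several tables only has to load a single shard.
    """
    shards: List[Dict[str, Dict[str, Any]]] = [{} for _ in range(num_shards)]
    for table_name, table in tables.items():
        for title, value in table.items():
            shard = shards[shard_index(title, num_shards)]
            shard.setdefault(table_name, {})[title] = value
    return shards


def index_table(shards: List[Dict[str, Dict[str, Any]]], table_name: str, title: str) -> Any:
    """Look up one title in one table, touching only the shard that can hold it."""
    shard = shards[shard_index(title, len(shards))]
    return shard.get(table_name, {}).get(title)
```

Changing the number of shards only means calling reshard again with a different num_shards and writing the resulting dictionaries out as data modules.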

bot status vote

Planned, running, and recent votes
(see also: timeline, policy)
Ends | Title | Status/Votes
Apr 29 | User:TTObot for bot status | 9  0  0
May 26 | Allowing etymology trees on entries | starts: Apr 27
(=2) [Wiktionary:Table of votes] (=9)

Is no one watching the floating {{votes}} box? Some people are wondering why the vote on User:UT-interwiki-Bot has not been closed out. It appears to have passed a week ago. —Stephen (Talk) 14:43, 3 June 2016 (UTC)[reply]

Since you noticed it, could you not have closed it out? Anyway, I just closed it out. --WikiTiki89 14:53, 3 June 2016 (UTC)[reply]
That's the issue with having the box require manual updating, which I seem to remember opposing back when @Daniel Carrero instituted it (but I thought he would deal with it). —Μετάknowledgediscuss/deeds 18:11, 3 June 2016 (UTC)[reply]
You are referring to Wiktionary:Beer parlour/2016/January#Vote counter. Adding the result in the box was @Benwing2's idea, I just implemented it. In that discussion, I did not even formally support the idea, I "voted" abstain. That is, I don't really care if we have the result in the box or not. --Daniel Carrero (talk) 18:21, 3 June 2016 (UTC)[reply]
@Metaknowledge: The problem here is not that the box wasn't updated, but that the vote wasn't closed at all. --WikiTiki89 18:42, 3 June 2016 (UTC)[reply]
My mistake. —Μετάknowledgediscuss/deeds 18:49, 3 June 2016 (UTC)[reply]
I used to close out most of the votes, but the last time I did it, User:DCDuring began crying corruption! corruption! (or words to that effect), and, try as I might, I was never able to get an explanation of his accusation, so I stopped handling votes. I would not even have mentioned this unnoticed vote here, except that someone asked me to close it out, which I will no longer do. —Stephen (Talk) 18:22, 3 June 2016 (UTC)[reply]
I think you are talking about Wiktionary:Votes/pl-2014-07/Allowing well-attested romanizations of Sanskrit and Wiktionary:Beer parlour/2015/July#Persistent extensions of votes.
If you change your mind and decide to close votes again in the future, it would be fine by me, for what it's worth. --Daniel Carrero (talk) 18:37, 3 June 2016 (UTC)[reply]
I have four main objections to past practice on voting:
  1. Votes should rarely be extended and never by initiators of proposals or those who have strong opinions. Exceptions might be made by following some reasonable procedure.
  2. Substantive votes should not be too abundant or complex. Some special procedure might be needed to allow a large number of votes or a complex voting structure.
  3. There should be a minimum number of participants, including abstainers, for a vote to be closed out with an outcome that changes the status quo ante.
  4. "Technical" changes that have broad implications should not occur without a vote.
These are essentially due process objections. Does anyone have any reason why I should not have these objections? DCDuring TALK 18:47, 3 June 2016 (UTC)[reply]
You should have listed your objections instead of crying corruption! over and over. At first I thought you were accusing Dan of being corrupt because he likes to extend votes. Eventually it dawned on me that you were accusing me of corruption, but could not imagine what you were referring to. In any case, let’s not rehash this here. As far as I’m concerned, the matter ended a year ago and I’m not interested in reliving it. I’m only explaining to Wikitiki why I did not close out the vote. —Stephen (Talk) 18:57, 3 June 2016 (UTC)[reply]

Constructed languages and Foreign words of the day

Per the opinion poll taken alongside the original vote that established the Foreign word of the day project, we have only featured words in attested natural languages. However, part of the purpose of FWOTD is to exhibit the content that we have to offer, which includes excellent coverage of various constructed languages and reconstructed languages. Personally, I would like to see mainspace constructed languages like Esperanto be featured, and also well-referenced reconstructed languages like Proto-Germanic (so nothing in the Appendix namespace would be featured). Of course, they would still need to meet all the requirements that terms normally need to meet. What do you think? —Μετάknowledgediscuss/deeds 21:17, 3 June 2016 (UTC)[reply]

I don't mind including words in constructed languages already approved for mainspace (e.g. Esperanto), but I'd be opposed to including words in protolanguages. There's a reason protolanguages aren't in mainspace, and that reason should apply to FWOTD as well. —Aɴɢʀ (talk) 23:07, 3 June 2016 (UTC)[reply]
I support the featurability of both. — Ungoliant (falai) 02:23, 4 June 2016 (UTC)[reply]
Ditto   — Saltmarshσυζήτηση-talk 04:34, 4 June 2016 (UTC)[reply]
I also have no real objection to either being featured. It could be interesting to include appendix-only constructed languages as well, but that might limit our credibility as a serious dictionary in the eyes of some. Andrew Sheedy (talk) 05:38, 4 June 2016 (UTC)[reply]
It feels as though these would be mainly of interest to linguists and editors, and less so to mainstream language users and learners. Equinox 05:45, 4 June 2016 (UTC)[reply]
I was thinking no more than one a month of each, for that reason. But then again, more than 400,000 people are learning Esperanto on Duolingo, so these clearly aren't so unpopular as you might think. —Μετάknowledgediscuss/deeds 05:54, 4 June 2016 (UTC)[reply]

Deprecation tags for language codes

@-sche I've added a bit of code to Module:languages that checks for a deprecation tag on the language data, and includes a tracking template if found. This can be used to easily track down uses of a code that is being phased out, without generating errors everywhere when the code is removed. To use it, just place deprecated = true in the entry in the relevant language data module. Pages that use the code will then appear in Special:WhatLinksHere/Template:tracking/languages/deprecated as well as "Special:WhatLinksHere/Template:tracking/languages/deprecated/(language code)". See Special:WhatLinksHere/Template:tracking/languages/deprecated/dlc for an example where this has been used. I hope it's helpful! —CodeCat 12:35, 4 June 2016 (UTC)[reply]
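The tracking logic described above can be pictured with a short sketch. The real code lives in Module:languages and is written in Lua; the Python below only mirrors the idea, and the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List

TRACKING_BASE = "Template:tracking/languages/deprecated"


def tracking_pages(code: str, language_data: Dict[str, dict]) -> List[str]:
    """Return the tracking templates to transclude for a language code.

    A code whose data entry has deprecated = True is tracked twice:
    once on the general page and once on a per-code subpage.
    """
    entry = language_data.get(code, {})
    if not entry.get("deprecated"):
        return []
    return [TRACKING_BASE, f"{TRACKING_BASE}/{code}"]
```

Pages using a tagged code then show up under Special:WhatLinksHere for both tracking templates, while untagged codes produce no tracking at all.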

Automatic transliteration for Thai has been disabled for now

Previous discussion: User talk:Wyang#Module:links

I disabled the automatic transliteration for Thai, because Module:th-translit isn't generating the right transliterations. Apparently, the code to generate the correct transliteration is located in Module:th in the getTranslit function, so this needs to be added to the transliteration module so that it generates the correct transliterations. User:Wyang had added workaround code to Module:links instead, but this is inappropriate, especially considering the code to generate a proper transliteration already exists, so I removed it again. Module:th-translit should be modified so that such workarounds are no longer necessary; then the automatic transliteration can be reinstated. —CodeCat 12:59, 4 June 2016 (UTC)[reply]

What you are doing is a perfect manifestation of your arrogance, ignorance and mindlessness. "So this needs to be added to the transliteration module so that it generates the correct transliterations." – while Module:th-translit is working perfectly fine with phonetically respelled words. You are suggesting that I should turn a transliteration module into a module that actually parses the entire entry's Wikitext and extract certain parts of the text, because "this is what a transliteration module is supposed to do". Sigh! So much for Eurocentric hubris on Wiktionary. "I shall break it, and ask you plebs to explain to me why things broke after this." Wyang (talk) 13:23, 4 June 2016 (UTC)[reply]
It's supposed to work with as many words as possible, not just phonetically respelled ones. The getTranslit function is capable of generating better transliterations, so this needs to be integrated into Module:th-translit. Right now, Module:th-translit only correctly transliterates a subset of the words that it could, in theory, but adding custom code to Module:links is not the way to fix that. Modifying Module:th-translit is the right way. User:Wikitiki89 even did so yesterday, and you just reverted it. Why? —CodeCat 13:36, 4 June 2016 (UTC)[reply]
Because that code does not belong on a transliteration module page. How many times do I need to reiterate that? Wyang (talk) 13:43, 4 June 2016 (UTC)
Yes they do. And they certainly do not belong on Module:links instead. —CodeCat 13:47, 4 June 2016 (UTC)[reply]
Which definition of "transliteration" is for this? Wyang (talk) 13:58, 4 June 2016 (UTC)[reply]
The same definition we apply across Wiktionary: generating a Latin-script version of a word, that can be understood by people who don't know the script. The accuracy of the transliteration, or its nature (pronunciation or spelling based) is up to the editors of the language and of the transliteration module. However, under no circumstances should a generic language-agnostic module be used to work around a deficiency of the transliteration module. —CodeCat 14:05, 4 June 2016 (UTC)[reply]
In that sense Module:th-translit is working perfectly well. It's just that your Module:links failed to take into account the fact that some languages require another level of phonetic respelling extraction, and it is that phonetic respelling, rather than the entry title itself, that needs to be fed to the transliteration modules. Wyang (talk) 14:17, 4 June 2016 (UTC)[reply]
Yes, and in those cases, we use the tr= parameter that is available on countless templates. But let's stick with the situation here. You have a function getTranslit that is clearly capable of generating the correct transliteration, albeit that it has to parse the page's content in order to extract it. The method used is completely irrelevant. It is clear that there exists a function that is capable of doing the transliteration better than Module:th-translit is currently doing. Therefore, it seems obvious that this function should be added to Module:th-translit so that its transliterations become more accurate. This is what Wikitiki89 tried to do, so what is your objection against having better transliterations? And why do you insist on putting inappropriate workarounds in Module:links instead? —CodeCat 14:29, 4 June 2016 (UTC)[reply]
Regarding your latest attempt at editing Module:links, the edits are completely unnecessary. This module doesn't have to account for this "phonetic extraction". The transliteration module can perform "phonetic extraction" instead. So please, for the nth time, add it to Module:th-translit and stop edit warring in Module:links. —CodeCat 14:32, 4 June 2016 (UTC)[reply]
I just fixed your Module:links, which you again reverted. Module:th-translit is functioning perfectly, given the right inputs. Stop insisting that this belongs at Module:th-translit; it does not. This is not transliteration.
 

Transliteration is not concerned with representing the sounds of the original, only the characters, ideally accurately and unambiguously. (Wikipedia)

 
It belongs at Module:links, which is lacking this new functionality of extracting the phonetic respelling to feed into the transliteration module. So for the nth time, please mend your Module:links so that it is fully language-agnostic, not just European language-agnostic. Wyang (talk) 14:49, 4 June 2016 (UTC)[reply]
The transliteration module itself should extract this information if it needs it. —CodeCat 14:55, 4 June 2016 (UTC)[reply]
Then it is not a module that does transliteration any more. This is exactly why the transliteration module should not be responsible for extracting this. Transliteration module is for transliteration, which is faithfully and systematically converting one writing system to another. Module:th-translit is fully functional at what it does, which is transliteration. A module that tries to extract phonetic respellings is a pronunciation module, which would have to be defined in Module:languages/data2 and have the infrastructure built around it, i.e. mending Module:links. Either way Module:links has to incorporate additional functionalities for non-phonetic languages. Wyang (talk) 15:03, 4 June 2016 (UTC)[reply]
I don't care if it doesn't do transliteration according to your narrow idea of what a transliteration is. Nobody else on Wiktionary cares either, I'd bet. What we all care about is that it generates transliterations according to what Wiktionary's idea of transliteration is, and has been for years, not what your idea of it is. —CodeCat 15:07, 4 June 2016 (UTC)[reply]
You are arguing whatever you believe in is what Wiktionary believes in, allegedly in opposition to what I believe in. A bit tongue-tied, probably? Wyang (talk) 15:17, 4 June 2016 (UTC)[reply]
I have restored automatic Thai transliteration. Remember that what you are doing is against the goal of this project - rather than improving the pages, removing information from numerous entries. Wyang (talk) 13:36, 4 June 2016 (UTC)[reply]
I have removed it. It's still not fixed. Stop edit warring and reach a consensus first. —CodeCat 13:37, 4 June 2016 (UTC)[reply]
Edit warring? Or undoing highly destructive edits to the project? Wyang (talk) 13:39, 4 June 2016 (UTC)[reply]
You added unnecessary custom code to Module:links, and when reverted, you keep reinstating it over and over despite a clear lack of agreement. That is edit warring against consensus. Reach a consensus for your edit first, then it can be reinstated. —CodeCat 13:40, 4 June 2016 (UTC)[reply]
It has been there for months. You abruptly removed it, causing all the Thai links to malfunction, prompting Thai editors to ask me to look into the problem and restore the original functionality. Can you be even further from the truth? Wyang (talk) 13:43, 4 June 2016 (UTC)[reply]
It never should have been added in the first place. Not in a highly visible and widely used language-generic module like Module:links. Language-specific code belongs in language-specific modules. —CodeCat 13:45, 4 June 2016 (UTC)[reply]
User:Wyang, again, please reach a consensus for your edit to Module:links rather than forcing the issue. Do not edit war to push your opinion through. Wait until there is a general agreement that your code belongs in the module. —CodeCat 13:53, 4 June 2016 (UTC)[reply]
Stop vandalising the page! Your removal simply wiped out thousands of correct Thai transliterations from Wiktionary pages. Where is your protest when I added it back in February? And where is your explanation when you suddenly removed the code 6 days ago? If you would like to maintain the status quo, at least get the version right. Wyang (talk) 13:58, 4 June 2016 (UTC)[reply]
Is there a time limit for contesting something? How long ago should an edit be before it's considered an automatic consensual status quo? Do we have a policy for this? I am contesting your edit now, as have two others so far, but you continue to ignore them and push your edit through. That is edit warring against consensus and I wouldn't be surprised if it got you blocked, though I won't be the one to do it because I'm involved in the dispute and people won't like that. —CodeCat 14:01, 4 June 2016 (UTC)[reply]
Did you forget that your edit had been reverted twice [38649499][38650974] by someone other than me? Taking out the block card now? A step-up from your threat to disable on my talk page? Four months seem like a much longer time than 6 days. Wyang (talk) 14:07, 4 June 2016 (UTC)[reply]
Reverts aren't the only way to contest an edit. But in any case, your edit was reverted first by me, then by Wikitiki, then by me again, then you started edit warring, and Dixtosa has also contested your edit. In comparison, only you and Metaknowledge have supported it. According to our common practice, consensus requires a 67% majority in favour, which is clearly not the case. So your edit has no consensus. —CodeCat 14:09, 4 June 2016 (UTC)[reply]
So stop your vandalism. The reason you dare to tackle Thai specifically is you simply don't care. You just don't care about what Thai editors think at all, hence destroying thousands of Thai entries is perfectly justified in your opinion. Wyang (talk) 14:17, 4 June 2016 (UTC)[reply]
Please stop using personal attacks. Reverting an edit that has no consensus is not vandalism. Reinstating that edit over ten times despite being notified that your edit has no consensus is vandalism. —CodeCat 14:29, 4 June 2016 (UTC)[reply]
Are you denying that your edit effectively eliminates valid Thai transliterations from thousands of entries? Repeatedly removing any one of those thousands of transliterations would lead to someone being blocked. So not vandalism you say? Wyang (talk) 14:34, 4 June 2016 (UTC)[reply]
Only for as long as the transliteration module hasn't been fixed to compensate. The fact that you refuse to do so does not suddenly make my reversions vandalism. In fact, you also reverted Wikitiki89's edit to Module:th-translit, which did fix (or attempt to fix) the module. So it appears you are not actually interested in fixing the transliterations. —CodeCat 14:37, 4 June 2016 (UTC)
I have now reinstated User:Wikitiki89's edit to Module:th-translit. Reverting this again would re-break the transliterations, thus doing the exact same thing that you accuse me of doing. So if you revert this too, then I can only assume you are not interested in finding a solution for this problem. —CodeCat 14:41, 4 June 2016 (UTC)[reply]
It looks like พลเรือน (pon-lá-rʉʉan) once again has the correct transliteration. Why you reverted the edits by Wikitiki89 that restored this is beyond me. But please do not break it again. —CodeCat 14:45, 4 June 2016 (UTC)[reply]
As I said numerous times before, this is not transliteration. It does not belong in a transliteration module. Transliteration is the faithful letter-to-letter correspondence performed between writing systems, which is obviously not the process you and Wikitiki89 would like to see implemented in Module:th-translit. Which is hence something that more properly belongs elsewhere, i.e. at your Module:links. Wyang (talk) 14:49, 4 June 2016 (UTC)[reply]
Transliteration on Wiktionary is not the faithful letter-to-letter correspondence, and it never has been. Many languages have non-orthographic transliterations. Hindi, Chinese, Russian, just to name some. You cannot just unilaterally redefine what "transliteration" means on Wiktionary to suit your purposes, and then demand that everyone else accepts your edits to a generic module to work around it. It seems that this isn't a workaround for code, but a workaround for your own mental idea. —CodeCat 14:58, 4 June 2016 (UTC)[reply]
Well, there has never been a Module:zh-translit! Because a Chinese-English transliteration system is never possible. Hindi and Russian have two sets of transliteration and pronunciation modules: Module:hi-translit vs Module:hi-IPA, and Module:ru-translit vs Module:ru-pron, with the former doing fairly strict transliteration and the latter IPA interpretation based on transcription. Thai also has two: Module:th-translit vs Module:th-pron. And yet you are suggesting that th-translit should take on the role of the latter. It is never my "own mental idea" - it is what the definition of transliteration is, and it forms the basis for its distinction from "transcription", whether you are willing to accept it or not. Wyang (talk) 15:17, 4 June 2016 (UTC)[reply]
  • I do not think any module that is to be invoked in mainspace should EVER take content from the entry and parse it, because the entry can get arbitrarily large and this introduces a very difficult dependency. It is abusing Lua. As for code placement, it is about how you look at *-translit modules. CodeCat views them (a view I share) as general transliteration modules which should work independently (i.e. not necessarily through Module:links). But, again, I disapprove of the parsing part. @Wyang, why don't you just pass them as arguments? --Dixtosa (talk) 13:41, 4 June 2016 (UTC)[reply]

Recap for us outsiders: Did I understand correctly that the way Thai editors handled the situation worked but was incompatible with some stuff CodeCat's robots do, so CodeCat changed it to make it comply with his/her robots, which in turn broke it for Thai editors? And now you can't decide which way to go because you do not agree on whether or not the module should scan the entire entry? Korn [kʰũːɘ̃n] (talk) 15:00, 4 June 2016 (UTC)[reply]

No this has nothing to do with bots. What happened, it seems, was that Wyang insisted that transliteration modules should only give letter-for-letter transliterations. But doing that would generate incorrect transliterations in many entries because Thai script is rather haphazard. So rather than adjusting their dogma - and the transliteration module - they instead made an edit to Module:links, a generic language-agnostic module, to entirely bypass the defective transliteration module. This code was noticed a few months later by me, and removed, then removed by Wikitiki89 again, then removed a whole lot more by me again. Wikitiki made edits to Module:th-translit and Module:th which fixed the transliterations after removing the Thai-specific code from Module:links had broken them. However, this seemed to go against Wyang's dogma that transliteration modules must transliterate letter-for-letter (even though they don't, and never have, on Wiktionary), so he reverted the edits and again reverted me when I tried to reinstate the fixes Wikitiki made. —CodeCat 15:05, 4 June 2016 (UTC)[reply]
The transliteration system for Thai was fully and well functional since its implementation in February, until it was abruptly removed by User:CodeCat six days ago. A bit of investigation led to User:CodeCat's edits, which basically rendered all Thai transliterations on Wiktionary non-functional. Wyang (talk) 15:07, 4 June 2016 (UTC)[reply]
Did you miss the fact that Wikitiki fixed the problem, and you undid his edits? Your undoing broke the transliterations again, but instead of putting Wikitiki's edits back in, instead you insisted that Module:links be edited to fit your dogma instead. —CodeCat 15:10, 4 June 2016 (UTC)[reply]
  • You both claim that your way produces correct results and the system of the other breaks it. Can you each provide a specific example which works with your system and say how it gets broken by your opponent's method? Korn [kʰũːɘ̃n] (talk) 15:13, 4 June 2016 (UTC)[reply]
    Both methods work. However, I object to having extra code in Module:links that handles deficiencies in Module:th-translit, deficiencies that were readily remedied by Wikitiki. The issue seems to be that Wyang dislikes Wikitiki's remedies, but to undo them he has to reinstate the extra code that I object to. I think that problems with Module:th-translit ought to be fixed in that same module, as Wikitiki did, rather than introducing workarounds in another module that has nothing to do with Thai. —CodeCat 15:16, 4 June 2016 (UTC)[reply]

For many languages transliteration and transcription/pronunciation are very different concepts, and Thai is one of these languages. One can generate a transliterated outcome for a Thai word (Module:th-translit), but oftentimes this is different from the pronunciation. The core issue here is that Module:links provides no support for these non-phonetic languages, which is why I added the new functionality in the module. Such information does not belong to individual transliteration modules, as this is a widespread linguistic phenomenon and the addition would greatly benefit many non-European languages (for example, Chinese and Japanese). The lack of transcription support in the central linking templates/modules is exactly the reason these languages have been moving away from the standard linking templates, resulting in much confusion and repetition during editing. Wyang (talk) 15:43, 4 June 2016 (UTC)[reply]

That's irrelevant. Module:links needs no additional support, transliteration modules (for Wiktionary's use of the word) are sufficient. If they are not, then you have to show why. So far you have failed to do so, since Wikitiki's edits (which you reverted) proved you wrong, it's perfectly possible for the existing infrastructure to handle Thai. Perhaps you don't want to be proven wrong? —CodeCat 16:31, 4 June 2016 (UTC)[reply]

I don't know what's going on. I am a native speaker, and I can only say that direct automatic transliteration from a "Thai word" could never be done, due to the complexity and uncertainty of spelling. That's why we do it on basic syllables (which are more certain); that is also how it is taught in school. --Octahedron80 (talk) 15:23, 4 June 2016 (UTC)[reply]

Rather than this constant revert war that's going on: is it not possible to apply the code fixes in one single operation that will make the transliterations continue to work as they did before? Fixing only part of it, while leaving Thai users without useful content, seems like a problem. Equinox 16:39, 4 June 2016 (UTC)[reply]
As of right now, things work just fine. Wyang keeps reverting it. —CodeCat 16:41, 4 June 2016 (UTC)[reply]
Here's how it started (from User talk:Wyang):

I don't understand it either. Why do those edits change the transliterations, even though none is given in the entry? —CodeCat 20:49, 2 June 2016 (UTC)[reply]
Even after looking at the section above, I don't see what this edit does. In fact, it seems like it would break cases that have alt forms. --WikiTiki89 14:29, 3 June 2016 (UTC)[reply]
I've undone it until we can establish how the special treatment actually changes anything. —CodeCat 14:35, 3 June 2016 (UTC)[reply]
It looks like it, somehow, for some reason, changes the transliteration of พล (pon) between "pol" and "pon". But I have no idea why. I think the problem is with the Thai transliteration module here, not Module:links. —CodeCat 14:37, 3 June 2016 (UTC)[reply]
I think I fixed the problem with these edits. --WikiTiki89 18:57, 3 June 2016 (UTC)[reply]

This was part of a topic where it had been asked what the code was for, and everyone was waiting for Wyang's response. CodeCat acted without knowing what the consequences would be, without waiting to find out what the code was for. That was clearly wrong, and Wyang was understandably upset. Wikitiki89 helpfully came up with an alternative that seems to work.
This whole episode is painful to watch: we have two strong-minded people who have both done great things for the project, but are now butting heads instead of discussing rationally.
Wyang has a history of coming up with ingenious ways to make our system do things that no one would have thought possible. Our Chinese entries are infinitely better than they were, and they're getting better all the time. There are, however, times when the system gets brought to its knees, as at .
CodeCat is as responsible as anyone for the current template, module and category infrastructure that runs this site. This prodigious work ethic and expertise is, however, marred by a willingness to break things in order to force people to fix things she sees as wrong (case in point: Module:parameters). She also has a tendency to ramrod things through, which has created deep resentment in some quarters that has poisoned a number of discussions on unrelated issues.
On the one hand, we have Wyang, still furious about CodeCat's behavior and unwilling to allow anything that would let her get away with it. On the other hand, we have CodeCat, who has gone into Orwellian DoubleSpeak mode to shift attention from her initial, destructive action, portray Wyang as a dangerous loose cannon and portray herself as an innocent victim.
We need to get past all of that and look at the merits of how we want to structure this. Our architecture isn't set up to handle the use of respellings in transliteration, so Wyang came up with a kludge to work around this. At the moment, the debate seems to be over where to put the kludge, not on whether there's a better way to do this. My question is: can we come up with a way to get the respellings to the modules without having the modules swallow an entry whole and rummage through it to find them (please forgive the mixed metaphor)? Chuck Entz (talk) 20:05, 4 June 2016 (UTC)[reply]
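One possible shape of an answer to that question, as a minimal sketch in Python rather than the site's Lua: the generic linking code only dispatches to a per-language romanizer, and a respelling reaches it as an explicit argument instead of being scraped from the target entry. Everything here is an assumption for illustration (the `link` and `register` functions, the `respelling` parameter, the toy letter tables, and the respelling value used for Thai); it is not a description of how Module:links or Module:th-translit work.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry: language code -> romanization function.
ROMANIZERS: Dict[str, Callable[[str], str]] = {}


def register(lang: str):
    """Decorator that adds a romanizer for one language to the registry."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        ROMANIZERS[lang] = fn
        return fn
    return wrap


@register("ru")
def romanize_ru(text: str) -> str:
    # Toy table covering only the letters used in the example below.
    table = {"д": "d", "а": "a"}
    return "".join(table.get(ch, ch) for ch in text)


@register("th")
def romanize_th(text: str) -> str:
    # Toy table; a two-consonant syllable gets an unwritten "o" in between.
    table = {"พ": "p", "ล": "l", "น": "n"}
    letters = [table.get(ch, ch) for ch in text]
    if len(letters) == 2:
        return letters[0] + "o" + letters[1]
    return "".join(letters)


def link(lang: str, term: str, tr: Optional[str] = None,
         respelling: Optional[str] = None) -> str:
    """Generic link formatter with no language-specific branches.

    A manual `tr` wins; otherwise the language's romanizer is given the
    respelling if the caller passed one, else the term itself.
    """
    if tr is None:
        romanizer = ROMANIZERS.get(lang)
        if romanizer is not None:
            tr = romanizer(respelling if respelling is not None else term)
    return f"{term} ({tr})" if tr else term


print(link("ru", "да"))                    # да (da)
print(link("th", "พล"))                    # พล (pol)
print(link("th", "พล", respelling="พน"))   # พล (pon)
print(link("zh", "X"))                     # X  (no romanizer registered)
```

The trade-off in this sketch is that the respelling has to be supplied at every call site, which is exactly the repetition the entry-parsing approach was trying to avoid; the sketch only shows that the dispatch itself can stay language-agnostic.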

Wiktionary does not have a JSON-style dictionary system, which is why there is so much formatting nuisance with the use of different headers, headword templates, reduplicating etymologies, ectopic related terms and unsystematic pronunciation notations. Each word in a language should be defined by a JSON set, containing a series of qualities indicating the nature and relationships of subordination of various parts of the text. All the Wikitexts on a Wiktionary entry should be generated from scratch, from that JSON set using pre-defined formatting codes which tells the entry how the original core information should be displayed. All the JSON information from entries should be made rapidly indexable to other entries, so that there is no need to repeatedly define what the pronunciation of another word in the etymology is, or what the meanings of that word are.

What Wiktionary has is a very different system. A system that tends to make people think about "what are we eating for tonight" rather than "how can we most efficiently make dinner for the next 20 years". You can create a magnificent, all-encompassing entry for a word in a language if you put into the entry everything that is known on Earth about the word, when in actual fact you should not have to do most of what you did because they can already be found elsewhere in the dictionary and should have been "extracted" rather than "generated or provided de novo". Say you want to link to another word in your perfect entry. Then in the perfect entry on Wiktionary you would have to put in: (1) the word you wish to link to, (2) the transliteration/transcription of that word, (3) the definition of that word, and (4) qualifiers for the definitions (e.g. derogatory, obsolete), although points (2-4) have already been stored in your destination entry. Previously all of (2-4) would have to be provided in the internal link. Things have improved in that point (2) is sometimes no longer necessary, as Module:links will attempt to generate the transliteration from a series of transliteration modules. This is a great leap forward, as we start to realise some of what we previously wrote were not necessary at all. However, the source of that omitted information (i.e. the regenerable information) is misunderstood. It is not the transliteration modules that are ultimately the source of the regenerable information; rather it is the destination entry where the regenerable information is stored. For languages where transliteration approximates fairly well the transcription/transliteration system we use for that language, this is an acceptable and quite efficient way of regenerating the information, despite the non-zero failure rate (e.g. link in коэффицие́нту (koefficijéntu) to коэффицие́нт (koefficijént)). 
But for languages where transliteration approximates the transcription system we use very poorly (Thai etc.), or where transliteration is intrinsically impossible (Chinese, Japanese etc.), our hub of Module:links simply gives up, telling editors of these languages "sorry, there is nothing I can help you with here", when in fact it should have been set up to facilitate the extraction of the phonetic pronunciation in the destination entry. Languages lie on various parts of this transliteration–transcription continuum and it is outright inappropriate to call this process of phonetic extraction "transliteration" for languages that fall towards the transcription half of the continuum (Thai, Chinese, Japanese etc.), as that is an obvious oxymoron, and/or transcription vs transliteration are contrastive concepts for these languages. Mixing these two very different concepts or intentionally confusing them to achieve minimal effort could be very dangerous. Wyang (talk) 01:46, 5 June 2016 (UTC)[reply]

Word | Transliteration outcome | Transcription outcome | What should be returned if transliteration is desired | What should be returned if transcription is desired | What should be returned if IPA is desired
พล (Thai) | pol | pon | pol | pon | /pʰon˧/
십육 (Korean) | sib.yug | simnyuk | sib.yug | simnyuk | /ɕʰimɲjuk̚/
བརྒྱད (Tibetan) | brgyad | gyaew | brgyad | gyaew | /cɛʔ˩˧˨/
(Chinese) | none | dòu | nil | dòu | /toʊ̯˥˩/
(Japanese) | none | mizu | nil | mizu | /mizɯᵝ/
ရှည် (Burmese) | hrany | she | hrany | she | /ʃè/

vs

Word | Transliteration outcome | Transcription outcome | What should be returned if transliteration is desired | What should be returned if transcription is desired | What should be returned if IPA is desired
дли́нный (Russian) | dlínnyj | dlínnyj | dlínnyj | dlínnyj | /ˈdlʲinːɨj/
ტორტი (Georgian) | ṭorṭi | ṭorṭi | ṭorṭi | ṭorṭi | /tʼɔrtʼɪ/
κέντρον (Ancient Greek) | kéntron | kéntron | kéntron | kéntron | /kéntron/
^According to the table, transliteration of Thai would be useless and would result in problems with difficult words, such as เศรษฐศาสตร์, รัฐธรรมนูญ. You could try to replace letter-by-letter but no one will understand it. I prefer transcription. --Octahedron80 (talk) 09:46, 5 June 2016 (UTC)[reply]
On Wiktionary, the term "transliteration" encompasses transliteration, transcription and general romanization. It's just a historical accident that we call it "transliteration", but it's not transliteration in the strict sense. See Wiktionary:Transliteration and romanization. So it is not an argument that the module can only supply transliterations in the strict sense just because it's called a transliteration module. It's a romanization module, but it's called a transliteration module for historical reasons. —CodeCat 12:35, 5 June 2016 (UTC)[reply]
It's not just a historical accident. It is Eurocentrism in Wiktionary at its best. As a consequence of this historical confusion, the central system just assumes that all languages use transliteration as their romanisation method, and Module:links sends words of all languages indiscriminately to their transliteration modules to generate their romanisations. This leaves languages with both transliteration and transcription outcomes unsupported. Thai already has a functioning transliteration module (Module:th-translit), and in addition it also has a transcription module (Module:th). Module:links should relay the 'tr' parameter to the correct place so that it is truly language-agnostic, and this includes distinguishing between the transliterative and transcriptive modules used for a particular language and rendering support to languages that use a transcriptive method of romanisation. Wyang (talk) 12:54, 5 June 2016 (UTC)[reply]
The correct place for transcriptions is Module:th-translit, so there is no need for additional code. Are you suggesting that we setup an entirely separate system to deal with transcriptions as opposed to transliterations, and have separate transcription and transliteration modules for languages? What's the benefit? And if you are so passionate about it, why don't you start a vote to change the current practice of including transcription in transliteration, rather than edit warring over it for days? Right now you have yet to display any kind of consensus for your views. —CodeCat 13:00, 5 June 2016 (UTC)[reply]
Never mind, I've done it for you. —CodeCat 13:08, 5 June 2016 (UTC)[reply]
For whom? You ignored the points I raised above and therefore completely misunderstood what I was saying. Again I feel like I am talking to someone who did not care to read my comments at all. The answers to your questions are: No and no, and you would not have asked these questions if you had read my replies above. I'm not suggesting that we set up an entirely separate system to deal with transcriptions as opposed to transliterations, nor am I interested in having separate transcription and transliteration modules for any other languages that do not differentiate between the two concepts on a romanisation level. Likewise I am absolutely uninterested in changing the current practice of including transcription in transliteration. Wyang (talk) 13:19, 5 June 2016 (UTC)[reply]
Then why do you keep edit warring? All Wikitiki's edits did was change Module:th-translit to supply a transcription. If you are fine with this practice, your edits say otherwise. —CodeCat 13:22, 5 June 2016 (UTC)[reply]

Poll: Should there be separate systems for transcription and transliteration?

Currently, both orthographic transcription and phonetic transliteration are subsumed under the term "transliteration" on Wiktionary. Wyang seems to be arguing that we should use "transliteration" only for the strict definition, and have an entirely separate system for transcriptions, allowing them both to exist side by side. Presumably, links and headwords would also show both, if both are available. Do you agree with this change? —CodeCat 13:05, 5 June 2016 (UTC)[reply]

Support

Oppose

  1.   Oppose There is no need for separate systems, this overly complicates things without any obvious benefit. The point of transliteration modules currently is to supply a version of a word in the Latin alphabet, without regard to how closely it maps to the orthographic form of the original language. In other words, they are romanization modules, that are called transliteration modules by historical accident. I see no value in being pedantic about the meaning. If we want to display transliteration and transcription side by side whenever applicable, we should be able to demonstrate that users benefit from this information overload. —CodeCat 13:05, 5 June 2016 (UTC)[reply]
    What's the use of phonetic transliteration when we already have dedicated Pronunciation section?--Dixtosa (talk) 13:13, 5 June 2016 (UTC)[reply]
    The point here is that this involves a change in the status quo. If we want transliterations to be strict transliterations, then we have to change the practices of all languages whose transliteration is not a strict transliteration currently, and make changes to Wiktionary:Transliteration to reflect the new practice. Russian editors have strongly opposed this in the past. @Benwing, Atitarev. —CodeCat 13:19, 5 June 2016 (UTC)[reply]

This is the most stupid response I have ever seen. You would not have acted so bizarrely if you were more attentive and respectful, and this includes completely misconstruing my reasoning and thus creating this poll, and abusing your admin rights to block me. I would like to request to have your admin rights reviewed. Wyang (talk) 13:19, 5 June 2016 (UTC)[reply]

I blocked you because you keep making edits that have no consensus. This poll is an attempt to establish a consensus, but you continue to revert without awaiting the results of the poll. —CodeCat 13:23, 5 June 2016 (UTC)[reply]
Pot, meet kettle; kettle, pot. DCDuring TALK 13:42, 5 June 2016 (UTC)[reply]
Aha. Whatever. I'd rather not be compared with this crazy user. Next time I'll just redirect all the Thai complaints to her. Wyang (talk) 13:59, 5 June 2016 (UTC)[reply]
  • It seems the status quo ante of Module:links is what CodeCat reverted to, and that therefore CodeCat's edit should be reinstated but probably not by CodeCat. Wyang should be prevented from reinstating his edits. Wikitiki's edits to Module:th and Module:th-translit should be reinstated and then we should see which Thai entries, if any, display a problem with transliteration or transcription. --Dan Polansky (talk) 20:47, 5 June 2016 (UTC)[reply]
Not really. Wyang added the code in February, and CodeCat removed it in May as part of an extensive rework of the module. Wyang was just restoring it under the assumption that it had been removed accidentally. Thai editors have been basing their edits for three months on the presence of that feature. Wyang made his June edit only because Thai editors were complaining about it not working any more. Chuck Entz (talk) 21:21, 5 June 2016 (UTC)[reply]
Wyang's February edit cannot be traced to a discussion showing consensus, AFAIK. The edit is now challenged. The status quo ante is the status before the challenged edit. Three months have elapsed between the edit and its challenge, probably because the challenging editor did not notice the edit earlier. Now as before, I propose that CodeCat and Wikitiki edits are reinstated, and that specific problems in Thai entries that are a result of that are clearly stated, including stating at least one Thai entry that has the problem. --Dan Polansky (talk) 21:32, 5 June 2016 (UTC)[reply]
The word "consensus" is thrown around here too much. —Aryamanarora (मुझसे बात करो) 23:55, 9 June 2016 (UTC)[reply]

Languages of Sweden

The fact that Elfdalian now has an official ISO-639 code reminds me that we have several pages, at least in the Reconstruction: namespace, on which the language names Westrobothnian, Jamtish, and Scanian are used. These languages have neither ISO-639 codes nor Wiktionary-specific ad-hoc codes. What do we want to do with them? Should we make ad-hoc codes (e.g. gmq-vas, gmq-jmk, and gmq-scy) for them? Shall we consider them Regional Swedish dialects? —Aɴɢʀ (talk) 13:13, 4 June 2016 (UTC)[reply]

I think Scanian is a dialect rather than a language, I'm not sure about the others. DonnanZ (talk) 13:17, 4 June 2016 (UTC)[reply]
I don't agree, at least not the historical Scanian language. Indeed, Scanian has recently been under heavy influence from Standard Swedish and most Scanians today speak the Scanian variety of Standard Swedish due to recent language standardization in Sweden. But Genuine Scanian had its own grammar, sound developments, own vocabulary etc., differing well from Standard Swedish, see for yourself at [1]. Same situation with Jamtish and Westrobothnian.--87.63.114.210 13:27, 4 June 2016 (UTC)[reply]
In addition, we list Gutnish (in a similar situation) as a separate language, even though it has come under heavy influence from Standard Swedish and is slowly dying out, in the meanwhile there are projects to revive it ([2]). Furthermore, Swedish Wiktionary uses gmq-bot for Westrobothnian. --87.63.114.210 13:37, 4 June 2016 (UTC)[reply]
gmq-bot is fine with me. I only suggested gmq-vas because Linguist List's ad hoc code is swe-vas. Are these three lects as different from Standard Swedish as Elfdalian and Gutnish are? If so, then I'm for giving them their own codes. —Aɴɢʀ (talk) 19:29, 4 June 2016 (UTC)[reply]
We've discussed this before. Can anyone come up with links to the previous discussions so we don't have to start from scratch? Chuck Entz (talk) 20:11, 4 June 2016 (UTC)[reply]
The discussion was at [3], but was left unresolved. --87.63.114.210 20:27, 4 June 2016 (UTC)[reply]
It looks like no one objected to giving all these languages their own codes; the discussion stalled over the truly trivial issue of whether or not to prefix the codes with gmq-. I don't care if we leave the prefix off, but I thought it would confuse the HTML if we did. —Aɴɢʀ (talk) 20:51, 4 June 2016 (UTC)[reply]
I've created gmq-bot, gmq-jmk, and gmq-scy, so entries can now be made for those languages, and links to them in the Reconstruction namespace can now use {{l}} instead of bare links. —Aɴɢʀ (talk) 12:35, 6 June 2016 (UTC)[reply]
P.S. I'm not touching Category:Scanian Swedish, because I'm not capable of saying what's Scanian language and what's Scanian dialect of Standard Swedish. I leave that for someone who knows these languages. —Aɴɢʀ (talk) 12:58, 6 June 2016 (UTC)[reply]
Great, thank you. I'll clean up the links. --87.63.114.210 18:09, 6 June 2016 (UTC)[reply]

Parameter in quotation templates for earliest attestation

Should we have a parameter in quotation templates for the earliest attestation that can be found? This is not the same as the earliest quotation that might be in the entry- this would specifically indicate that someone had searched for earlier quotations and found none. It would hopefully be a replacement for {{defdate}}, which I've always disliked since it gives no reference for its claim. It could also categorize by century or by a more granular period of time. DTLHS (talk) 21:36, 4 June 2016 (UTC)[reply]

what does eminant boot agreement mean

what does eminant boot agreement mean

For BOOT, see [4]. The "eminent" might have something to do with eminent domain...? Equinox 08:53, 5 June 2016 (UTC)[reply]

Proposal: Desysopping of User:CodeCat

Reason: Abuse of admin rights – misusing her admin power to block the other party of a personal dispute. Block log: [5]. Wyang (talk) 13:28, 5 June 2016 (UTC)[reply]

I blocked you to put an end to the continuous edits which forced Wyang's point of view without a consensus for that view. We block other editors for such behaviour, so why not Wyang? —CodeCat 13:29, 5 June 2016 (UTC)[reply]
Well your edit simply removed thousands of correct Thai transliterations on Wiktionary and caused uproar among our Thai editors, which is why it was reverted. Repeated removal of any one of those thousands of transliterations is sufficient to warrant a block. Wyang (talk) 13:31, 5 June 2016 (UTC)[reply]
No it didn't. The edits you've been edit warring on for the past day did not break any entry. Please demonstrate that Wikitiki's edits, which you continued to revert, broke or removed thousands of transliterations. —CodeCat 13:34, 5 June 2016 (UTC)[reply]
I have once again reapplied Wikitiki's edits. Please show an entry that is currently broken. —CodeCat 13:35, 5 June 2016 (UTC)[reply]
Why have you undone Wikitiki's edits yet again? There is no consensus for having transliteration and transcription separate. You should wait for the poll to finish. —CodeCat 13:37, 5 June 2016 (UTC)[reply]
I ask that Wikitiki's edits be restored until 1. it is established that a consensus exists for separating transliteration from transcription, or 2. it is established that Wikitiki's edits break anything. —CodeCat 13:39, 5 June 2016 (UTC)[reply]
Nor is it appropriate or does it have consensus. You seem to be in denial of your repeated vandalism – let me refresh your memory: diff, diff, diff, diff. These are the first four of your edits - did they remove useful content en masse? Wyang (talk) 13:40, 5 June 2016 (UTC)[reply]
Wikitiki also made that same edit diff, so should he also be blocked? Wikitiki in fact made additional edits to fix the problems caused by this edit, and you then reverted his edits too. —CodeCat 13:43, 5 June 2016 (UTC)[reply]
Circumventing the question huh? Did your edits repeatedly remove useful content en masse? Wyang (talk) 13:46, 5 June 2016 (UTC)[reply]
No, they did not, once Wikitiki had provided an appropriate fix. Which you then reverted. So again, please demonstrate that Wikitiki's trio of edits to Module:links, Module:th and Module:th-translit broke something, and that it is therefore warranted to desysop me for restoring those edits. You have yet to show even a single entry that was broken by it, yet you continue to revert these edits. —CodeCat 13:47, 5 June 2016 (UTC)[reply]
Go to the time points (1) 12:56, 4 June 2016; (2) 13:34, 4 June 2016; (3) 02:22, 4 June 2016 and (4) 01:01, 4 June 2016. Preview the page พลเรือน. Were the Thai romanisations there? Wyang (talk) 13:51, 5 June 2016 (UTC)[reply]
Please stop dodging the question. Did Wikitiki's trio of edits break any entries? Please restore his edits and then show us a broken entry. If you can't demonstrate that his edits broke an entry, how can you ask me to be desysopped for restoring them? —CodeCat 13:53, 5 June 2016 (UTC)[reply]
Looks like you are unable to answer my question. You did not restore his edits. You restored your edit, which wiped out thousands of Thai transliterations. Wyang (talk) 13:57, 5 June 2016 (UTC)[reply]
For the past day, you have been reverting those three edits Wikitiki made, one of which included the edit I also made. I have been trying to restore those edits because there is no consensus for your views and no evidence that those three edits break anything. —CodeCat 13:59, 5 June 2016 (UTC)[reply]

Your continued edit warring shows a severe lack of professionalism and responsibility. You both are perfectly aware that edit warring warrants an admin stepping in if the users can't get a hold of themselves. You both seem to be admins and abuse your positions to keep ranting where other users would long have been shut up. (Read: Prevented from editing the entry in question.)
You both continuously accuse the other of having no consensus, but your endless bickering makes it harder and harder for people to get an overview over the situation, and thus makes it more and more difficult for the community to actually reach a consensus. Please keep your hands still for a while so that the rest of the community, or at least those parts who understand the techno babble, can actually debate this matter. Korn [kʰũːɘ̃n] (talk) 15:28, 5 June 2016 (UTC)[reply]

  • +1. I can't even figure out what the primary point of contention is. (I agree very strongly with Dixtosa's point above that no module invoked in the mainspace should ever take content from the entry and parse it, though. Seriously, the devs are going to regret ever giving us Lua if we go in that direction.) Can someone please explain the difference between transliteration and transcription, and where they're each used in entries? --Yair rand (talk) 20:38, 5 June 2016 (UTC)[reply]
Whether we want to allow modules invoked from the main namespace to parse other entries should be a separate discussion, if anyone wants to start it. I believe the Chinese modules extensively use this paradigm. DTLHS (talk) 21:45, 5 June 2016 (UTC)[reply]
The distinction which seems to be being made by those who are making a distinction is: transliteration takes a set of characters and renders them letter-for-letter into another script (in this case, the Latin script), whereas transcription renders the word itself into another script; the difference being that e.g. cannot be 'transliterated' per se, but it can be transcribed (as dòu, IPA: /toʊ̯˥˩/), and that if e.g. พล is transliterated, it is pol, but if it is transcribed, it is pon (in IPA it is /pʰon˧/). In practice, the argument here seems to be (1) not over which of these systems should be used (since I haven't actually seen someone suggest that พล should be rendered pol), but over which word should be used, and (2) not over whether or not a module should parse a page, but over which module should host the code. - -sche (discuss) 21:01, 5 June 2016 (UTC)[reply]
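For anyone who finds the distinction easier to follow as code, here is a toy sketch in Python of the พล example. It is only an illustration under stated assumptions: the tables, the inherent-vowel rule and the function names are made up for this one syllable and are not the contents of Module:th-translit or Module:th. As noted earlier in the thread, real Thai spelling is too irregular for rules this simple.

```python
# Toy illustration only: hypothetical tables and names, valid for the
# single example syllable and nothing more.

# Fixed Latin values for the two written consonants of the example word.
CONSONANTS = {"พ": "p", "ล": "l"}

# Syllable-final consonants whose sound differs from their written value.
FINAL_SOUND = {"l": "n"}

# Unwritten vowel supplied between the two consonants of a closed syllable.
INHERENT_VOWEL = "o"


def transliterate(syllable: str) -> str:
    """Strict sense: every written letter gets its fixed Latin value."""
    letters = [CONSONANTS[ch] for ch in syllable]
    if len(letters) == 2:
        return letters[0] + INHERENT_VOWEL + letters[1]
    return "".join(letters)


def transcribe(syllable: str) -> str:
    """Pronunciation-based: the final consonant is rendered as it sounds."""
    strict = transliterate(syllable)
    final = strict[-1]
    return strict[:-1] + FINAL_SOUND.get(final, final)


print(transliterate("พล"))  # pol
print(transcribe("พล"))     # pon
```

In this sketch the two functions differ only in the last step, which mirrors the point that the disagreement is less about the desired output (pon) than about what the step is called and where it lives.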

Module:links is protected so that only administrators can edit it; this prevents non-admins from editing or edit-warring over it, and it means the edit-war between admins User:CodeCat and User:Wyang is a wheel war. If the two of you continue to wheel-war, I will ask a bureaucrat such as User:Chuck Entz or a global 'crat to make emergency and hopefully temporary desysoppings to stop the war. - -sche (discuss) 21:14, 5 June 2016 (UTC)[reply]

I was already considering doing so, but I've been hoping they would start acting like adults without being forced to. Unfortunately, the action has been taking place while I've been offline (I do sleep, occasionally), so I'm left to wonder whether it's over or it's just waiting to flare up again when both are back online. Chuck Entz (talk) 21:36, 5 June 2016 (UTC)[reply]
Yes, but no one seems to be backing the proposal, so it's not the brightest of ideas, just a desperate measure. DonnanZ (talk) 22:37, 5 June 2016 (UTC)[reply]
  • Each party has suggested the other's desysopping (above at [6]) — and given that both parties are wheel-warring using admin tools/privileges, and that one blocked the other while edit-warring with him (as noted above), following both proposals and emergency-desysopping both may be in order if the warring continues. - -sche (discuss) 22:40, 5 June 2016 (UTC)[reply]
  • So blocking the other side of the argument is completely justified and one should not lodge a complaint after such abuse of rights? Ridiculous. Very disappointed in the Wiktionary community; seems to be a place for admin bullies who wilfully block others and maintain their modules without the slightest consideration of the consequences. Will greatly reduce the amount of time spent here. Considering quitting. Wyang (talk) 23:14, 5 June 2016 (UTC)[reply]
No one is excusing CodeCat's behavior, but de-sysopping is a very serious step, and one best not considered in the midst of a dispute, unless circumstances demand it. Chuck Entz (talk) 03:37, 6 June 2016 (UTC)[reply]

Thai Transliteration Debate Explained (I think)

This all revolves around what Latin text should be used to represent the letters of the Thai script when templates link to a Thai entry. The Thai script is mostly phonemic, but there are exceptions where the same letters can be read as different sounds, depending on the term. A true transliteration always represents the same letter or sequence of letters with the same Latin letter or sequence of letters, no matter how it's pronounced. A transcription represents the sounds of the text.

The transliteration can also be forced to be more like a transcription by using a respelling: a sequence of letters that can only be interpreted as the actual sounds of the term. That would be like spelling cathouse as "cat-houss" so the "th" doesn't get read as a digraph like it is in cathode and the "se" doesn't get read as a "z" like it is in "rouse". The template {{th-pron}} is used in Thai entries to display pronunciations, and the input often has to be respelled to get the right results.

The module that does the linking (Module:links) will show a transliteration for a term in a non-Latin script if we pass it as text using the |tr= parameter. If there's no |tr= parameter, it next checks whether there's a transliteration module listed for the language in our language data modules. If there is, it gets the transliteration from that module. Perhaps I should use quotes here, because we sometimes stray from transliteration to transcription when the sounds depart from the actual letters in odd or unexpected ways.
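The lookup order described above can be sketched roughly as follows. This is only an illustration in plain Lua, not the actual code of Module:links: the tables `languages` and `translit_modules` and the function `get_link_translit` are hypothetical stand-ins, and the dummy romanisation function just wraps its input.

```lua
-- Hypothetical stand-in for the per-language data (not the real language data modules).
local languages = {
	th = { translit_module = "th-translit" },
	en = {}, -- Latin script, no transliteration module listed
}

-- Hypothetical registry of transliteration functions, keyed by module name.
local translit_modules = {
	["th-translit"] = function(text)
		-- Dummy output; a real module would return Latin text.
		return "<romanisation of " .. text .. ">"
	end,
}

-- Return the romanisation to show next to a link, or nil if there is none.
local function get_link_translit(lang_code, term, tr)
	-- 1. An explicit tr= value always wins.
	if tr ~= nil and tr ~= "" then
		return tr
	end
	-- 2. Otherwise fall back to the module listed for the language, if any.
	local data = languages[lang_code]
	local name = data and data.translit_module
	local fn = name and translit_modules[name]
	if fn then
		return fn(term)
	end
	-- 3. Nothing to show.
	return nil
end
```

The point of the sketch is only the precedence: a manually supplied value short-circuits everything else, which is why the dispute centres on what should happen in step 2.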

Thai has a transliteration module listed (Module:th-translit), but this just calls the same module that {{th-pron}} uses (Module:th-pron) - the one that requires respelling to work right.

What happened

Back in February Wyang put code into Module:links that checked for Thai, then called a function in a different module than that used for the transliteration. This function basically checked if there was an entry for the term, and if there was, looked in the source of the entry for the {{th-pron}} wikicode. If it found the template, it took the template's (respelled) parameters and substituted them for the actual spelling of the entry name, then called the same module that the transliteration module did. Whatever the module returned was returned in turn to Module:links (sorry), which used that instead of calling the regular transliteration module.
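As a rough illustration of the mechanism just described (not Wyang's actual code), the page-parsing step might look something like the sketch below. The names `find_respelling` and `romanise_link` are hypothetical, only the first template parameter is used for simplicity, and the page source is passed in through a `get_source` callback so the sketch runs outside MediaWiki; in Scribunto that callback would presumably wrap something like `mw.title.new(pagename):getContent()`.

```lua
-- Return the first positional parameter of {{th-pron|...}} found in
-- `wikitext`, or nil if the template is absent or has no parameters.
local function find_respelling(wikitext)
	local args = wikitext:match("{{th%-pron|([^}]*)}}")
	if not args then
		return nil
	end
	local first = args:match("^([^|]*)")
	if first == "" then
		return nil
	end
	return first
end

-- Romanise a link target: prefer the respelling stored in the entry,
-- and fall back to the entry name itself if none is found.
-- `get_source(pagename)` returns the entry's wikitext or nil;
-- `romanise(text)` stands in for the shared Thai romanisation module.
local function romanise_link(pagename, get_source, romanise)
	local source = get_source(pagename)
	local respelling = source and find_respelling(source)
	return romanise(respelling or pagename)
end
```

Note that every link to a Thai term costs one extra page read under this approach, which is the part the later comments about parsing other entries are concerned with.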

Nobody but the Thai editors noticed this for 3 months, until, at the end of May, CodeCat reworked that part of the module and, in the process, removed Wyang's code- perhaps without realizing it had been there. Thai editors asked Wyang why the link transliterations weren't working right anymore, so he put his code back in to fix the problem.

This time, CodeCat noticed the code and couldn't immediately figure out what it did, so she left a message on Wyang's talk page. In the meanwhile she reverted Wyang's edit. Soon after that, Wikitiki89 came up with a compromise that incorporated Wyang's code from Module:links into the Thai transliteration module.

When Wyang responded to the comments on his talk page 11 hours later, he explained his code and the rationale for it in detail, and expressed his annoyance at CodeCat's reverting his edit before finding out what it did.

Having explained himself, he went back and reverted CodeCat's revert to reinstate his edit.

CodeCat then responded by explaining on Wyang's talk page why she thought it was a bad idea to put custom code in Module:links, but then went on to say that the problem was all due to deficiencies in the transliteration module and tell him that his code wouldn't be allowed back until she was convinced it was necessary. She then reverted his revert of her revert of his edit.

If you don't already have a headache from this- it gets worse. They then proceeded to revert-war back and forth, stopping every once in a while to argue and denounce each other angrily (see above). Then CodeCat blocked him for edit-warring- which accomplished nothing, since he immediately unblocked himself. Then Wyang called for CodeCat to be de-sysopped, and CodeCat called for Wyang to be de-sysopped.

The issues

Filtering out the misunderstandings and trash talk, here's what I see the basic core arguments are (my formulation, not theirs):

CodeCat
  1. A general-purpose, high-traffic module like Module:links shouldn't have special cases hardwired into it- language-specific code should go in the language-specific modules.
  2. The transliteration modules aren't just for transliteration- they can provide transcriptions, if that's what's right for the language.
Wyang
  1. Thai and other languages like it need special treatment, because they need transcriptions rather than transliterations
  2. The version of the modules that CodeCat keeps reverting to isn't the same as his version.
Concerns from others
  1. Modules getting data from entries is a very bad idea.

My 2 cents

I agree more with Wyang's view of the events, but agree more with CodeCat on the substance.

CodeCat was wrong to revert Wyang's edit without knowing what it did. Her response to Wyang was too confrontational and demanding. Her poll wasn't really an accurate reflection of what Wyang was asking for, and the block did nothing but make things worse- much worse. On top of that, her characterization of the dispute is rife with spin and trash talk.

Of course, once the revert-war started, Wyang was a full partner in the mudfight, so I'm not giving him a pass, either.

I think the place to deal with Thai's peculiarities is in the Thai transliteration module, not in Module:links. Is there any module other than Module:links that gets the name of the transliteration module from our language data modules (in this case Module:languages/data2)? If not, we should take the function called by Wyang's code (Module:th.getTranslit) and use it as the basis for the transliteration module that Module:links calls (basically what Wikitiki89 did).

Except... I'm not qualified to say much about the concerns expressed over going to other entries to get data. After thinking about it, I can see why Wyang felt he needed to do it: most people linking to Thai entries know nothing about respelling, so it's unrealistic to require passing it as a parameter, and creating a data module with all the terms needing respelling would be a monumental and possibly fruitless task. Still, I think the module should eliminate as much as possible of the straightforward stuff before resorting to such tactics, in order to keep them to an absolute minimum.

Sorry for the encyclopedic length of this, but I wanted to make sure I didn't miss anything. Chuck Entz (talk) 04:17, 6 June 2016 (UTC)[reply]

This is a fairly good summary of the past events. By looking at the Thai frequency list, I think it is safe to say that more than half of the 4000 most commonly used Thai words require some phonetic respelling. This number will only go up if we consider the entire set of Thai words, meaning that only relying on the Thai title linked to is quite hopeless at generating the correct transcription. So it boils down to the problem of whether to analyse the link destination to extract the correct pronunciation, or make it compulsory to supply the romanisation every time. I'm highly biased towards the former as I think page parsing is the best functionality on Wiktionary, and I would imagine the native Thai editors to be not very welcoming to the idea of the latter either.
Regarding transliteration vs transcription, this is an issue that extends to many languages beyond Thai. Tibetan and Burmese are good examples that come to mind. I wrote Module:bo-translit (Tibetan) and Module:my-translit (Burmese) a while back, which form the backend for the Wiktionary transliterations of these two languages. The schemes used are the Wylie transliteration and MLCTS schemes respectively, both of which are transliteration schemes, and transliterated outputs of Tibetan and Burmese texts from these schemes have been used wherever the native script appears, whether it be in a Tibetan or Burmese language entry, in the etymologies of other languages or in translation sections.
The universal use of these transliteration schemes is confusing to many unfamiliar with the languages, especially casual visitors to the site. Consequently, there should be additional transcription modules developed for the two languages, used to generate the appropriate romanisation in some circumstances on Wiktionary. The most important circumstance under which transcriptions are desired is probably in translation sections. At the moment someone looking to say "eight" in Tibetan would be absolutely clueless when the person saw the following result on the page eight:
བརྒྱད (brgyad)
Same with someone trying to say "long" in Burmese:
ရှည် (hrany)
The pronunciations of these two words are /cɛʔ˩˧˨/ (Transcription: gyaew) and /ʃè/ (Transcription: she), which the person reading the pages eight and long would not have guessed if (s)he only stayed on those pages. For other circumstances, such as ordinary inter-entry linking, the use of a transliteration method of romanisation is probably better (especially in etymologies), although the decision is to be made by all active editors. The realisation that romanisations used in translation sections should resemble the pronunciation as much as possible has been present on Wiktionary. Compare the Wikitext in the Russian translation of catheter:
{{t+|ru|кате́тер|m|tr=katɛ́tɛr}}
This is despite the fact that there is a Russian transliteration module on Wiktionary, which in this case would generate a correct transliteration but an incorrect transcription outcome. On the whole, the distinction between transliteration/transcription in Western languages is very minor compared to languages of the East, for which no infrastructure for this distinction is provided on Wiktionary at the moment. This is how Module:languages/data2 appears currently:
m["tt"] = {
	canonicalName = "Tatar",
	scripts = {"Cyrl", "Latn", "Arab", "tt-Arab"},
	family = "trk-kip",
	translit_module = "tt-translit",
}
This works well with alphabetic languages. For many languages of the East, the section should be more detailed:
m["bo"] = {
	canonicalName = "Tibetan",
	scripts = {"Tibt"},
	family = "tbq",
	ancestors = {"xct"},
	translit_module = "bo-translit",
	transcript_module = "bo-...",
	transcript_in_links = false, --optional
	transcript_in_translations = true,
}
This is the reason I regarded this problem as a lack of support from the central modules, and did not consider changing Module:th-translit into a transcription module as an appropriate way to tackle this. Wyang (talk) 08:36, 6 June 2016 (UTC)[reply]
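To make the proposal above concrete, here is a minimal sketch of how a central module could choose between the two kinds of module using the proposed fields. It is an assumption-laden illustration, not existing Wiktionary code: the language code "xx", the module names "xx-translit" and "xx-transcript", the context strings "link" and "translation", and the function `pick_romanisation_module` are all made up for the example.

```lua
-- Hypothetical language data using the fields proposed above.
local m = {}

m["xx"] = {
	translit_module = "xx-translit",
	transcript_module = "xx-transcript",
	transcript_in_links = false,
	transcript_in_translations = true,
}

-- A language with only a transliteration module, as in the current data.
m["tt"] = {
	translit_module = "tt-translit",
}

-- Return the name of the module that should romanise a term of
-- `lang_code` in the given context ("link" or "translation"),
-- or nil if the language is unknown.
local function pick_romanisation_module(lang_code, context)
	local data = m[lang_code]
	if not data then
		return nil
	end
	local use_transcript = false
	if context == "translation" then
		use_transcript = data.transcript_in_translations == true
	elseif context == "link" then
		use_transcript = data.transcript_in_links == true
	end
	-- Fall back to transliteration when no transcription module exists.
	if use_transcript and data.transcript_module then
		return data.transcript_module
	end
	return data.translit_module
end
```

Under these assumptions, languages that do not set the new fields behave exactly as they do now, since the function falls through to `translit_module`.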
@Wyang: One thing I'm confused about, is if you are planning to use the transcription instead of the transliteration, why do you need a transliteration module? --WikiTiki89 18:21, 6 June 2016 (UTC)[reply]
Different languages have different uses of transliteration modules. For Thai, editors have agreed on the use of transcriptions in translation sections and in normal links, although transliteration may be the better option of romanisation of Thai terms in etymologies of other languages, when the module calling Module:links is Module:etymology. For Tibetan and Burmese, transcription should be used in translations, whereas transliteration is the better mode of romanisation in generic links, as there is good one-to-one script correspondence and makes etymologies much more apparent. The modules should be kept and named accordingly for languages where the distinction is important on a romanisation level. Wyang (talk) 00:47, 7 June 2016 (UTC)[reply]
@Wyang: Ok, now I understand better what your intentions are. However, I don't think it's a good idea to use different transliteration/transcription systems in different places. This is something the Wiktionary community should agree on as a whole, and not just the Thai editors (and the Tibetan and Burmese editors). The other issue is that parsing a linked-to entry to determine the word's phonetic transcription is a really bad idea for a number of reasons that have already been pointed out in the above discussions. What would be wrong with manually supplying these transcriptions? You can even add the manual transcriptions with a bot, which is similar to what User:Benwing2 did for Russian accent marks. Changing the logic of Module:links is not the right solution to either of these problems. --WikiTiki89 14:21, 7 June 2016 (UTC)[reply]
From the experience with parsing in the past one and a half years, I would say that the associated harm is very minimal and benefits are extensive. This is somewhat similar to the case of the deletion of Template talk:str index (used in py-to-ipa then) that I contested about five years ago, well before the advent of this Lua system, and the difference is that the benefit-to-harm ratio in this case is even higher. People were not even that warm to the idea of automatic transliteration back then. The earliest and most important use of parsing is in {{zh-forms}}, and it has resulted in dramatic changes in the way that Chinese entries are formatted. Code is much more succinct, and as a consequence efficiency and productivity have exponentially increased (examples of use: 安眠藥/安眠药 (ānmiányào), 暗物質/暗物质 (ànwùzhì), 報酬遞減定律/报酬递减定律 (bàochóu dìjiǎn dìnglǜ)).
Tools should only be used in situations where they must be. In the case of parsing for transcriptions, it is irrelevant to most of the languages hosted on Wiktionary and therefore most editors on the site. Most people have no experience and will have no experience with this. People tend to show aversion to the unfamiliar, and when the aversive mentality is voiced collectively by similar-minded peers, the disinclination is irrationally amplified and may as well convincingly mask the reality, which may only be visible to those centrally involved. (This may well underlie some political phenomena and explain the difficulty experienced with the Chinese entry format change here.) I would be arguing that new technology should be actively embraced and not feared (Wikipedia:Don't worry about performance). Likewise, transcription should be achieved automatically and people/bots should not have been manually supplying the transcriptions since the infrastructure is fully functional with no demonstrated risks. Even if there are, the focus should be on how to solve it, not on how to disable it.
With regard to the partial change to transcriptive romanisation, I argued for what I consider as appropriate for Tibetan and Burmese and would be happy to hear about other ideas. On a historical note, before the creation of Module:my-translit, most formatted Burmese entries were using the BGN/PCGN system for romanisation, which is a transcription system, and the change to a transliteration system (MLCTS) occurred due to the higher success rate of automation of the latter, which allowed a much wider coverage of romanisation for the Burmese content. It is a decision to be made by Burmese-language editors collectively, and people should have the freedom to choose a practice of romanisation that is most appropriate for the language, with modules using the two modes (transliteration and transcription) of romanisation for this language already recorded in the backend database, and infrastructure in place for determining which system should be used where. For instance, if Burmese uses transcription in links I would still suggest that any calls to Module:links by Module:etymology use the Burmese transliteration module to generate romanisations, as Burmese transcriptions are much less informative for this purpose. Wyang (talk) 08:53, 8 June 2016 (UTC)[reply]
You make some good points. I'll need to think about this for a bit. But also note that {{Wikipedia:Don't worry about performance}} does not apply here. The page states "You, as a user, should not worry about site performance. In most cases, there is little you can do to appreciably speed up or slow down the site's servers. The software is, on the whole, designed to prohibit users' actions from slowing it down much." But the concern is not slowing down site performance, but that since the site's performance is protected by time and memory limits, we have frequently seen on Wiktionary these limits being reached and producing errors. Thus, performance is still an issue, even though its consequences do not affect the site's performance overall. --WikiTiki89 14:40, 8 June 2016 (UTC)[reply]
So, what happens now? Can we please get rid of the Thai code from Module:links now, or do we need some more edit warring? —CodeCat 12:06, 11 June 2016 (UTC)[reply]
Do you have any constructive suggestions? DCDuring TALK 14:25, 11 June 2016 (UTC)[reply]
Reinstate Wikitiki's original 3 edits and be done with it. —CodeCat 15:30, 11 June 2016 (UTC)[reply]
I note that Wikitiki's comment of three days ago made it seem that he hadn't come to that final conclusion. DCDuring TALK 00:21, 12 June 2016 (UTC)[reply]
  • User:Chuck Entz has described the situation very well. User:Wyang has created a working code for Thai transliterations/transcriptions and character sequencing. It is another commendable achievement of his. Few people attempted to work with scripts of such complexity as Thai. The majority of developers think that Thai is simply not transliteratable, even the phonetic respelling. User:CodeCat has broken the code for the reasons she mentioned. So, Thai transliteration modules stopped working and no alternative was offered. Thai editors were left wondering what was going on. User:Wikitiki89 has provided a workaround (later). I don't really know if it's a good fix. It should, of course, be considered but Wikitiki89 is not sure himself. There could be other solutions for many solutions but breaking an existing code without really offering a working solution is wrong. It seems CodeCat simply doesn't care about thousands of Thai entries, translations, editors and tremendous work put into this. I fully understand Wyang's frustration. I hope this conflict will be resolved peacefully. I don't want anyone desysopped but I encourage more consideration of other people's work. I'll leave the final technical solution to the people who understand it better. I don't see a huge reason for Module:links not to take some of the work (language-specific customisations) and/or accommodate handling of complex scripts with various levels of possible transliteration/transcription. For example, we capitalise transliterations of Korean proper nouns with a symbol "^" using the module.
  • As for the transliteration/transcription for Thai - a graphical (literal) transliteration for the Thai script is not used anywhere, no Thai dictionary uses non-phonetic transliteration, it would produce nonsensical garbage, even for many words with regular or predictable spellings, just like many English words would if they were transliterated graphically into another script, e.g. "light" (l-i-g-h-t) - Cyrillic лигхт (ligxt). A phonetic Thai transliteration is not only popular but it's also standard. There are various Thai transliteration standards but none of them is graphical (showing sequence of symbols). A graphical spelling can also be provided, please see กรรเชียง (gan-chiiang), which shows the actual orthography (including the phonetic respelling of the term - "กัน-เชียง"). The one adopted here is based on Paiboon publisher of dictionaries, phrasebooks and textbooks. Royal Thai General System of Transcription is also phonetic but not very useful for learners - no tones, no long vowels, etc. --Anatoli T. (обсудить/вклад) 04:27, 14 June 2016 (UTC)[reply]

Google Scholar

Can we use Google Scholar for attestation? --Daniel Carrero (talk) 05:16, 7 June 2016 (UTC)[reply]

We can use Google Scholar to locate permanently archived journal articles, so I'd say yes. —Aɴɢʀ (talk) 07:28, 7 June 2016 (UTC)[reply]
We have traditionally counted it at RFV. —Μετάknowledgediscuss/deeds 08:13, 7 June 2016 (UTC)[reply]

Case order in German declension tables (others too probably)

German declension tables are vertically split by case. The cases are ordered nominative, genitive, dative, accusative. This makes no sense to me! It would be better if it was nom, acc, dat, gen:

  1. Conceptually, nominative and accusative are the most fundamental, and then dative is a variation on accusative. Genitive is then its own thing.
  2. The forms of practically everything (articles, adjective declension etc) tend to match in either nom+acc or acc+dat, and sometimes dat+gen. This ordering would place them next to each other.

A similar but more minor thing occurs with gender: it's ordered MFN, when usually the masculine and neuter forms are more similar, or sometimes F+N, but rarely M+F.

Why is it in this order? Would people support it being changed? Issues with this I'm imagining:

  1. There's some (stupid!) tradition that it's written in this order.
  2. It'd have to be changed across all languages or none.

This is how it would look the way I'm suggesting.

Fedjmike (talk) 07:44, 7 June 2016 (UTC)[reply]

You seem to think traditions are stupid. We have to cleave closely to traditions to be taken seriously as a scholarly work. Admittedly, some German grammarians do have a different order, but I would say that the one we use is probably the most traditional. Changing things up because you like them better is not a convincing argument. —Μετάknowledgediscuss/deeds 08:12, 7 June 2016 (UTC)[reply]
Yeah, guilty as charged wrt tradition. But I'm not saying change it because I don't like it, I gave what I think are good reasons for that order. Which sources use the current order, and why? I'd like to at least read about it and understand why they use it. I'm not sure I understand your argument about needing to match tradition; whose approval is Wiktionary trying to get, and why would it matter to them if it were to use a less conventional ordering of cases in tables? Fedjmike (talk) 08:43, 7 June 2016 (UTC)[reply]
Switching to nom-acc-dat-gen order has been proposed before a number of times. I am in favor of it. - -sche (discuss) 08:29, 7 June 2016 (UTC)[reply]
As am I. Leasnam (talk) 00:02, 10 June 2016 (UTC)[reply]
I don't really care which order the cases are in as long as nominative is first, but the advantage to sticking with tradition is that it's what readers will expect. I would be thrown off by an adjective declension table that put the gender columns in the order masculine-neuter-feminine, because over the years I have come to always expect masculine-feminine-neuter, and not just for German but for all languages with those three genders. I have no doubt we would get a lot more complaints about a declension table that put neuter between masculine and feminine than we get about the current order. —Aɴɢʀ (talk) 09:10, 7 June 2016 (UTC)[reply]
  • As someone who favours monolithic integrated tables over clear but repetitive tables, I'm also in favour of ordering the tables so that the number of cells is as small as possible. As such I'm giving strong support for NADG and having n/m and f/p next to each other. Korn [kʰũːɘ̃n] (talk) 09:18, 7 June 2016 (UTC)[reply]
  • Wikipedia uses the order NAGD (en and de, as well as fr.wikt). However de.wikt uses NGDA, and fr.wikipedia NADG. I am personally more familiar with NADG (I learned German in a French school). All that to say that the order of German declension seems to be far from being cast in stone, so we may as well choose the one that makes the most sense to learn the language. — Dakdada 11:11, 7 June 2016 (UTC)[reply]
  • FWIW, my German learning books mostly use NADG (presumably since that's the order that learners come across them). It depends whether we want to go for the scholarly one or the German-as-a-second-language one. Smurrayinchester (talk) 14:04, 7 June 2016 (UTC)[reply]
A while ago I proposed using NADG. This is what I find in my German books and it definitely makes the most sense to me. The NGDA order is only done in imitation of Latin. Perhaps this should be voted on. Benwing2 (talk) 01:28, 8 June 2016 (UTC)[reply]
For Slovene, the common order is also NGDA but we use NAGD here. For old Germanic languages we seem to use NAGD order, while for modern Icelandic and Faroese we use NADG. I personally find NGDA order to be really annoying and counterintuitive (given that nominative and accusative are the most common cases and often identical) and would favour abandoning it for all IE languages, Latin included. —CodeCat 12:27, 8 June 2016 (UTC)[reply]

What's the difference between a journal and a magazine?

We have both {{quote-journal}} and {{quote-magazine}}, with identical parameters. Could we combine these into {{quote-periodical}}? Is there a reason to distinguish journals and magazines, and if so what criteria could be used? DTLHS (talk) 23:59, 8 June 2016 (UTC)[reply]

Hmm. To me, a magazine is usually a mainstream popular publication you can find in shops, while a journal (unless we're talking about a personal diary) is usually an academic thing that gets published in volumes and issues. If you look at the APA academic style for citing the two things, there isn't much difference apart from the fact that journals come out in volumes and issues. They don't even require the publisher and city for either of them, despite requiring it for books. Equinox 00:07, 9 June 2016 (UTC)[reply]
Periodicals: Agreed that the difference is mostly popular perception and occasionally a title will cross over, such as National Geographic which is certainly scholarly but also available in popular locations such as bookstores and dentists' offices. There's no particular reason to have separate templates and certainly many popular magazines have "volumes" and "issues" amongst those volumes. I agree with rolling them into one and having the other two templates redirect to it. —Justin (koavf)TCM 00:12, 9 June 2016 (UTC)[reply]
Okay. Magazines are more a subset of journal than vice versa (I think?), so shall we propose that we keep quote-journal (with volume/issue optional, since some magazines only have a month&year) and drop quote-magazine as redundant? Equinox 00:21, 9 June 2016 (UTC)[reply]
An even better idea: call it quote-periodical because "magazines are journals" is open to some debate but "magazines and journals are both periodicals" is not. Equinox 00:22, 9 June 2016 (UTC)[reply]
@Smuconlaw Do you have any input here? DTLHS (talk) 02:30, 10 June 2016 (UTC)[reply]
Actually, the primary template is {{quote-journal}}; {{quote-magazine}} and {{quote-news}} are just redirects to it. I suppose we used "quote-journal" by analogy to "cite journal" at the English Wikipedia. (According to the OED, a journal is "[a] daily newspaper or other publication; hence, by extension, Any periodical publication containing news or dealing with matters of current interest in any particular sphere", while a magazine is "[a] periodical publication containing articles by various writers; esp. one with stories, articles on general subjects, etc., and illustrated with pictures, or a similar publication prepared for a special-interest readership". A usage note adds: "The use of the word (rather than periodical) typically indicates that the intended audience is not specifically academic.") — SMUconlaw (talk) 02:42, 10 June 2016 (UTC)[reply]
Periodical seems the most generic of the candidates and therefore seems the least confusing for new users. But the redirects solve most practical problems. It is only when reading documentation that a user is likely to notice what the "real" template is. DCDuring TALK 10:42, 10 June 2016 (UTC)[reply]
I should also add that the template accepts the parameters |journal=, |magazine=, |newspaper=, |periodical= and |work=. — SMUconlaw (talk) 17:39, 10 June 2016 (UTC)[reply]
That's handy. But users might expect there to be a parallel in name between the template they want and {{quote-books}}. It wouldn't much inconvenience us to have a few redirects to {{periodical}}, would it? DCDuring TALK 17:59, 10 June 2016 (UTC)[reply]
We could create {{quote-periodical}} as a redirect to {{quote-journal}}. It may be a good idea to retain {{quote-journal}} as the primary template for consistency with other Wikimedia projects, as I suspect that many editors work on multiple projects. — SMUconlaw (talk) 00:04, 11 June 2016 (UTC)[reply]

{{hu-verb}} - no links in multi-word entries

Even though {{hu-verb}} is connected to {{head}}, it does not create links for each member of a multi-word entry. I can't figure out why. Can someone please help? It contains only a single line. Thanks. --Panda10 (talk) 12:47, 9 June 2016 (UTC)[reply]

Pagename is automatically treated as the argument in |head=. It should be fixed now. Wyang (talk) 12:58, 9 June 2016 (UTC)[reply]
Thanks so much! :) --Panda10 (talk) 13:04, 9 June 2016 (UTC)[reply]

Phrasebook vs. phrases categories

Is there a way to place phrasebook expressions/sentences only to the phrasebook category and remove them from the phrases category? In the past, I tried to solve this by using {{head|hu|phrasebook}}, but that was changed by a bot to {{head|hu|phrase}}, so it seems this is not accepted. Is there another way? The phrases category is cluttered up with sentences that really belong to the phrasebook only. Thanks. --Panda10 (talk) 15:57, 9 June 2016 (UTC)[reply]

Actually, Category:English phrases has 1,776 entries and Category:English phrasebook has 358 entries. Removing all phrasebook entries from the phrases category would mean a change of 20%. Just my opinion: I don't think the phrases category is too cluttered by phrasebook entries, and I don't think it would be much more improved by that change of 20% to justify the work to do it.
If we had some sort of distinction between "phrases" and "phrasebook", a few examples like how are you and good night would still fit both categories; and hello is both an interjection and part of the phrasebook. (currently, the POS header of good morning is Interjection and that of good night and good afternoon is Phrase, and that of good evening is Noun). --Daniel Carrero (talk) 16:20, 9 June 2016 (UTC)[reply]
I see your point. However, the percentage will be different for every language. Also, the 20% for English is true today, but may change in the future. I would still be interested to find out if there is a way to do this within the policies of this wiki. --Panda10 (talk) 16:45, 9 June 2016 (UTC)[reply]
More to the point, {{head}} and other headword templates categorise entries by part of speech. "Phrasebook" is certainly not a part of speech. —CodeCat 16:46, 9 June 2016 (UTC)[reply]
@CodeCat: Are you saying that phrase is a part of speech? --Panda10 (talk) 11:54, 10 June 2016 (UTC)[reply]
Many of our multi-word expressions are not phrases and not constituents. They are sometimes designed to simply be the target of a long list of redirects or to appear at the top of a no-entry search list. Because we never have an explicit "not elsewhere classified/categorized" category, inevitably some category or categories becomes the junk-catching category. In English grammar, "adverb" has long been one such. For us, "interjection" and "phrase" serve similar functions.
"Interjection" is a misnomer as we apply it. How does hello fit our definition of interjection in most of its normal uses? Collins uses "sentence substitute" (read "prosentence" if you'd prefer a technical word) for hello for example.
"Phrase" would benefit from a similar kind of split into one or more categories, though "phrasebook" is not any kind of grammatical category and would probably not be part of a long-term solution. DCDuring TALK 22:44, 9 June 2016 (UTC)[reply]

bot status vote 2

Planned, running, and recent votes (see also: timeline, policy)
Ends | Title | Status/Votes
Apr 29 | User:TTObot for bot status | 9  0  0
May 26 | Allowing etymology trees on entries | starts: Apr 27
(=2) [Wiktionary:Table of votes] (=9)

Some are asking that the vote on User:RobokoBot be closed out. —Stephen (Talk) 12:47, 10 June 2016 (UTC)[reply]

  Done --Daniel Carrero (talk) 15:56, 11 June 2016 (UTC)[reply]

It says that the templates are used in the definition line, but some of the templates can be (and, sometimes, can ONLY be) used in the etymology section. This may happen when the etymological word does not mean the same as the derived word, rendering the link to the etymological word useless. Am I right? So, I guess we need to add a new parameter to all of the templates to inform them if they are used in the etymology section. Also, if this change is implemented, the templates should be aware that the word it comes from may be in a different language (like {{compound}}). For instance, English lb is an abbreviation of Latin libra. --Dixtosa (talk) 13:43, 11 June 2016 (UTC)[reply]

Just found that there are two versions of the templates already for Russian: {{ru-etym abbrev of}}, {{ru-etym acronym of}}, {{ru-etym initialism of}}, and {{ru-abbrev of}}, {{ru-acronym of}}, {{ru-clipping of}}, {{ru-initialism of}}. --Giorgi Eufshi (talk) 06:30, 22 June 2016 (UTC)[reply]

Conversation about origin of words

In conversation, when someone says "You're using that word wrong. When the word was first used, the meaning was different." Or: "That word came from (Ancient) Greek and the Greeks used it to mean something else. Therefore we should all use the original (?) meaning invented by the Greeks." What would you say to that person? --Daniel Carrero (talk) 22:24, 11 June 2016 (UTC)[reply]

Language changes. I would find some examples where the person, not being a linguist at all, didn't know about the change ("sad" is a good example), and ask them why they are not using the word in its original sense; or why they speak English at all, when it isn't the oldest language ever invented. Equinox 22:26, 11 June 2016 (UTC)[reply]
That they are falling for the w:etymological fallacy. Enosh (talk) 15:09, 12 June 2016 (UTC)[reply]
That sounds like a perfect Quora question. ("perfect" in the sense that it is perfectly characteristic of Quora). --Dixtosa (talk) 15:19, 12 June 2016 (UTC)[reply]
Folk etymology: For that matter, a lot of folk etymologies are just wrong. You could ask the person, "If it turns out that the actual original meaning is [X] instead of [Y], then would you change your behavior...?" and the answer is no. —Justin (koavf)TCM 22:55, 12 June 2016 (UTC)[reply]

t:cite-meta author format

While discussing the new template {{R:M&A}}, I noticed the strange fact that we use semicolons to delimit the authors in {{cite-meta}} as opposed to the traditional “A, B, & C”. To that end I created a template {{format list}} which will take parameters and write them out in the normal list format (yes, it could be done more elegantly in Lua, but I wanted to conserve Lua runtime and memory for more important stuff). User:Smuconlaw and User:Isomorphyc then pointed out that there might be some concerns about changing the citation format, so I thought I'd ask here. —JohnC5 02:42, 14 June 2016 (UTC)[reply]

I cannot think of a conventional reason to use semicolons rather than commas; I do favour the change suggested by User:JohnC5, though since the template is very widely used, I thought it would be reasonable to ask around a bit first. Isomorphyc (talk) 02:56, 14 June 2016 (UTC)[reply]
@Smuconlaw, Isomorphyc Any further thoughts on this? I'm still in favor of making this change. —JohnC5 20:33, 1 July 2016 (UTC)[reply]
@JohnC5 I am too, and I think nobody has objected because there is no sensible objection. Isomorphyc (talk) 20:45, 1 July 2016 (UTC)[reply]
Perhaps we can consider the following scenarios, and decide how they are best set out:
  • A list of co-authors: "John Doe; Mary Doe; Richard Roe" (example 1A) or "John Doe, Mary Doe, Richard Roe" (1B).
  • An author and a translator: "John Doe; Mary Doe, transl." (2A) or "John Doe, Mary Doe, transl." (2B) (perhaps the latter would need to be changed to "John Doe, Mary Doe (transl.)").
(Please add additional examples if you can think of any.) I don't think there is any issue with either example 1A or 1B. Because we currently use semicolons, example 2A works well, but we are likely to encounter a problem with example 2B if we switch to commas. If we do so, I wonder if we can make corrections by bot? — SMUconlaw (talk) 20:52, 1 July 2016 (UTC)[reply]
Either option works quite well. The main issue is that they don't represent the citation convention. While it may be odd and sometimes ambiguous, that's how the system works. That's why I'm bringing this up. —JohnC5 21:07, 1 July 2016 (UTC)[reply]
I'm not sure both 2A and 2B work equally well. Example 2B is ambiguous – let's say we have a list of three names, like this: "John Doe, Mary Doe, Richard Roe, transl.". Is Richard Roe the translator, or are both Mary Doe and Richard Roe translators, or (probably unlikely) are all three of them translators? I suppose I am saying that I do not mind a switch to commas, but if we do so I think we will also need to advise editors to start using parentheses for descriptors of that sort (i.e., "(transl.)") as well as to arrange for a bot to update current uses. To maintain consistency in citations, ", editor" will also need to be changed to "(editor)". — SMUconlaw (talk) 21:35, 1 July 2016 (UTC)[reply]
I would also add |translator= to enforce such style conventions. —JohnC5 21:46, 1 July 2016 (UTC)[reply]
I suppose I could ... just wondering whether it's a good idea to add so many different parameters. Anyway, do we need more views before switching to commas? — SMUconlaw (talk) 12:55, 2 July 2016 (UTC)[reply]

Another issue has occurred to me: what if editors use the "last name, first name" format? For example: "Doe, John; Doe, Jane; Roe, Richard" (3A) vs. "Doe, John, Doe, Jane, Roe, Richard" (3B). I think you will agree that example 3B is unacceptable, which means we will have to retain semicolons for that format. Switching to commas will thus require this part of the template to be substantially rewritten. — SMUconlaw (talk) 11:43, 3 July 2016 (UTC)[reply]

@Smuconlaw: To be honest, I've been curious why you hadn't brought up 3B this entire time (I thought it would be the main point of contention). The fact is that's precisely how it normally works (except with an ampersand). This, I think, is often mitigated through abbreviation or omission of the first names (as seen in this APA documentation). I could whip up a template that would convert names into abbreviations. The main issue remains that no system uses the semicolon. I like the semicolon personally and think it disambiguates the authors nicely; it just doesn't represent any format I've ever seen. Our decision at the moment seems to be between making an unambiguous format that only we use or following a standard that could be ambiguous. I tend to lean towards the latter, even if it means adding some extra logic to lessen ambiguity where possible. —JohnC5 15:16, 3 July 2016 (UTC)[reply]
It only just occurred to me, as I don't use the "last name, first name" format. OK, so what do we do now?
  • Wait to see if other editors have comments?
  • Go ahead and switch to commas for all cases?
  • Switch to commas for all cases except the "last name, first name" format?
SMUconlaw (talk) 09:36, 4 July 2016 (UTC)[reply]
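
The third option in the list above (commas everywhere except the "last name, first name" format) can be sketched in a few lines of Python. This is only an illustration of the logic being debated, not the code of {{cite-meta}} or {{format list}}; the function name `format_authors` and the two-author form "A & B" are assumptions made for the example.

```python
def format_authors(authors: list[str]) -> str:
    """Join author names as "A, B, & C", keeping semicolons if a name has a comma.

    Hypothetical helper: names are taken as given, with no handling of
    translators, editors, or abbreviated first names.
    """
    names = [name.strip() for name in authors if name.strip()]
    if not names:
        return ""
    # "Last, First" names already contain commas, so a comma-separated list
    # would be ambiguous (example 3B); fall back to semicolons (example 3A).
    if any("," in name for name in names):
        return "; ".join(names)
    if len(names) == 1:
        return names[0]
    if len(names) == 2:
        # Assumed form for two authors; the thread does not specify it.
        return f"{names[0]} & {names[1]}"
    return ", ".join(names[:-1]) + ", & " + names[-1]


print(format_authors(["John Doe", "Mary Doe", "Richard Roe"]))
# John Doe, Mary Doe, & Richard Roe
print(format_authors(["Doe, John", "Doe, Jane", "Roe, Richard"]))
# Doe, John; Doe, Jane; Roe, Richard
```

The translator case (examples 2A and 2B) is deliberately left out: as noted in the thread, it would probably need a separate parameter or a parenthesised descriptor rather than a list separator.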

What is the difference between these two categories? Do we need both? Is the definition of merism "a pair of contrasting words" in linguistics? --Panda10 (talk) 13:05, 14 June 2016 (UTC)[reply]

Merisms seem, on the face of it, to be a subset of lexical doublets, just as reduplicated doublets are. DCDuring TALK 15:15, 14 June 2016 (UTC)[reply]

Wikimania 2016

 
Slides of the talk about Wiktionary at Wikimania.

Hi, English-speaking Wiktionarians!

Wikimania is an annual meeting to discuss global issues in the Wikiverse. This year Wikimania takes place in Italy, June 22 to 26, and the programme is here. Three nice French contributors plan to be there to talk about Wiktionary! Yes, our little-known project by non-English speakers. Is it not intriguing?

We already mentioned here our proposal in January 2016 and we are now in the process of organizing our slides. We are not ready yet but we want to make the building of the talk as collaborative as our projects. So, feel free to have a look at it and point out every mistake in the language or part you want more details on.

+ we want to meet you guys! So, if some of you come to Wikimania, please come to our talk or to a meetup later in the evening! Come to have a glass of Italian wine with us and discuss our amazing projects! If you plan to travel to France, for instance to see a football game, tell us, we'll be glad to host you! Noé (talk) 13:55, 14 June 2016 (UTC)[reply]

Hi! I updated the slides with the version we broadcast during Wikimania today. If you have questions, feel free to ping us or to come and visit our Wiktionary. Noé (talk) 15:37, 25 June 2016 (UTC)[reply]

When the quotations are in phonetic transcription

Do we have any established customs regarding what to do with quotations that are written in a phonetic transcription rather than the usual orthography of the language in question? I have a book of Burmese proverbs that writes all the Burmese in transcription, not in Burmese orthography; likewise Die araner mundart has lots of usage examples for Irish written in phonetic transcription rather than conventional orthography. So far, I've just been putting these things in conventional orthography, but that goes against our usual custom of transcribing quotations exactly as they're written in the source. —Aɴɢʀ (talk) 17:35, 14 June 2016 (UTC)[reply]

The ideal would be to find a native Burmese/Irish source written in the native orthography and quote from it. --WikiTiki89 17:53, 14 June 2016 (UTC)[reply]
I do do that when possible for Irish, but when I'm working through Die araner mundart to make sure we have entries for all the words it lists, it's easier to give the same examples. Also, it's a good source for unalloyed dialectal Irish rather than standard "school" Irish. —Aɴɢʀ (talk) 18:12, 14 June 2016 (UTC)[reply]
  • I might leave a note that the orthography of the text doesn't match the standard one, and maybe give both for the Irish example. I found one such time I did that, a while back, at mo'ai. —Μετάknowledgediscuss/deeds 18:07, 14 June 2016 (UTC)[reply]
    The difference there is much smaller than what I'm talking about. At aithrí, for example, I just added the usex "Mara ndéanfaidh muid aithrí inár bpeacaí, tá muid ar fad caillte", but what the source actually says is "mar ə ńīnə myȷ æŕī ə n-r̥ bȧkī, tā myȷ əŕ fad kāĺcə". —Aɴɢʀ (talk) 18:12, 14 June 2016 (UTC)[reply]
Is it a quotation or a usage example? Shouldn't you cite the source if there is one? DTLHS (talk) 18:16, 14 June 2016 (UTC)[reply]
It's a quotation of a usage example. The book I'm using is a reference work about this dialect; volume II is the dictionary, which provides usage examples taken from the author's fieldwork among native speakers. They're sentences that he heard spoken while he was living among Irish speakers, so this book is the only form in which these sentences have been published. And rather than writing them in conventional orthography, he writes them in his own ad-hoc phonetic transcription. —Aɴɢʀ (talk) 18:23, 14 June 2016 (UTC)[reply]
I suppose I could format it as a quotation along the following lines:
  • 1899, Franz Nikolaus Finck, Die araner mundart, Marburg: Elwert’sche Verlagsbuchhandlung, vol. II, 28:
    mar ə ńīnə myȷ æŕī ə n-r̥ bȧkī, tā myȷ əŕ fad kāĺcə.
    conventional orthography: Mara ndéanfaidh muid aithrí inár bpeacaí, tá muid ar fad caillte.
    Unless we do penance for our sins, we are all lost.
That would make it clear, wouldn't it? —Aɴɢʀ (talk) 18:30, 14 June 2016 (UTC)[reply]
That looks good to me. Maybe you want to make a special quotation template for this if you're going to be citing it a lot. DTLHS (talk) 18:36, 14 June 2016 (UTC)[reply]
Yeah, I was thinking about doing that. —Aɴɢʀ (talk) 18:47, 14 June 2016 (UTC)[reply]

Abbreviations in etymologies

Sometimes I see etymologies with abbreviations. Example: ferruminate contains "ferruminatus, p.p. of ferruminare".

I don't remember if it was discussed before, but based on Wiktionary:Todo/unhelpful abbreviations, I suppose abbreviations like p.p., q.v., Gr., and so on are disallowed in etymologies. Am I right? --Daniel Carrero (talk) 21:50, 14 June 2016 (UTC)[reply]

Since we are not a print dictionary, we don't need to save space. We decided (although I don't know when or where) that it's better not to use abbreviations in etymologies because not everyone will know or be able to guess their meanings. --WikiTiki89 21:57, 14 June 2016 (UTC)[reply]

"vernacular" as a label for Russian прост.?

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter Russian-language dictionaries of Russian commonly use the label "прост." for a certain type of colloquial words; "прост." stands for просторе́чный (prostoréčnyj) which literally means "vernacular" or "common speech". This is different from "разг." = разгово́рный (razgovórnyj) = "colloquial". I gather that words labeled прост. are considered lower-register than those labeled разг. For a while I labeled these words as "nonstandard" but this doesn't quite seem right, as "nonstandard" would suggest these words are somewhat proscribed, which I don't think is quite correct. My Russian-English dictionary labels both types of words as simply "colloquial". I've started to label the прост. words using {{lb|ru|vernacular}}, but this label doesn't currently exist so it doesn't usefully categorize them. Should we create a "vernacular" label that categorizes into e.g. CAT:Russian colloquialisms (as {{lb|ru|colloquial}} does) and also maybe into CAT:Russian vernacular speech or similar? If not, how should these be handled? Benwing2 (talk) 03:06, 15 June 2016 (UTC)[reply]

I don't think "vernacular" is the best word for it. I'm tempted to say that we should choose in each case between "regional", "dialectal", and "colloquial", whichever fits better. --WikiTiki89 14:51, 15 June 2016 (UTC)[reply]
There's no good equivalent of просторе́чие (prostoréčije) in English. If you have to choose between "colloquial" and "non-standard" labels, "colloquial" is probably better, IMO but not in all cases and some would argue. --Anatoli T. (обсудить/вклад) 21:54, 15 June 2016 (UTC)[reply]
English also has informal as a label that covers a broad range of registers, but excludes academic papers, legal and government documents, and similar. DCDuring TALK 23:17, 15 June 2016 (UTC)[reply]
It may have to do with perceptions of non-standard language in Russian and English. Non-standard language has always been discouraged and people who use it looked down on. Perhaps a good example is ложи́ть (ložítʹ). If someone uses it without prefixes or reflexive suffixes, they may be immediately identified as "uneducated", unlike when someone in English says "gonna", "gotta" or "he/she have" (I'm sure there are better examples). There's not always a clear boundary between colloquial and non-standard (like in other languages), e.g. туды́ (tudý) (another example of "просторечие") is often used in jokes. --Anatoli T. (обсудить/вклад) 23:51, 15 June 2016 (UTC)[reply]
I see.
Because we have the labels informal ("not suitable for formal speech or writing"), colloquial ("conversational"), non-standard ("scorned by many"), and dialectal ("acceptable only in a region or population") (glosses per my Wiktionary idiolect) available, we should probably avoid using senses of these that overlap with senses of the others. In particular, colloquial has a range of meanings, often overlapping with non-standard, but the "conversational" meaning is distinctive, IMO. DCDuring TALK 00:42, 16 June 2016 (UTC)[reply]
@Atitarev OK. I think your point about sounding uneducated is important. In English, educated people will use colloquial or slang speech in a sufficiently informal context but will generally avoid nonstandard speech except when used self-consciously for effect. For this reason we say words like "ain't" and "alls" ("alls you need to do ...") and "drug" (instead of "dragged") and "have dranken" (instead of "have drunk" or "have drank") are nonstandard. ложить sounds like an example of this. But I get the feeling most things that are просторечный are more like colloquialisms or slang. For this reason I'll use "colloquial" from now on for lack of anything better; I'd rather have some way of distinguishing разговорный from просторечный but there may be no help for this. Benwing2 (talk) 04:24, 16 June 2016 (UTC)[reply]

{{defdate}} and the Shorter Oxford English Dictionary

We must have thousands of entries that "reference" this dictionary (example abstain), just copying their dates of earliest attestation. This seems like a copyright violation, given that we are just copying their research across a large number of entries. Am I wrong? DTLHS (talk) 21:10, 15 June 2016 (UTC)[reply]

Copyright protects the expression of information, not the information itself. It would likely be a breach of copyright to copy a definition from the SOED word for word, but probably not if what is copied is a piece of information such as the date of earliest attestation. — SMUconlaw (talk) 22:16, 15 June 2016 (UTC)[reply]

putting an internal link in a translation of an example sentence

Are we allowed to put an internal link in a translation of an example sentence? See 行李 (baggage carousel) and 吹噓 (bragging rights). I think it's useful since it makes it clear that the translation is also a set term in English, but I thought I read somewhere that we shouldn't format it as such. ---> Tooironic (talk) 15:13, 17 June 2016 (UTC)[reply]

If you really want to, go ahead. I would only do this in very limited circumstances. --WikiTiki89 15:40, 17 June 2016 (UTC)[reply]

Sources for pronunciations of English words

I notice that lots and lots of English words are missing their pronunciations. I'm thinking of trying to write a bot to add them but I need a free source of pronunciations that contains enough detail to map to IPA. Anyone know of such a source? It's not obvious that websters1913.com will work; e.g. for Nation they show Na"tion and for National they show Na"tion*al; whatever those symbols mean they don't seem to indicate the vowel-quality difference in the a in the two words. Benwing2 (talk) 11:28, 18 June 2016 (UTC)[reply]

When to use references?

Should we include references only when a term is rare or disputed? Or should we use them whenever possible? The policy pages don't really say. Is "credibility" such an important factor for our goals? If so there are thousands and thousands of entries we could easily reference in English, Spanish, French, German, etc. Ultimateria (talk) 15:25, 18 June 2016 (UTC)[reply]

@Ultimateria: I think each page should have at least one external link to a good monolingual dictionary online as far as possible, or in case of English, OneLook would do. But that is not really a reference to boost credibility but rather an external link to provide further reading. I am sure readers are going to love having great sources one click away. That said, the reader may use the links for verification as well. We should not pester entry creators for failure to add external links; adding is up to the people who want to add them, or actually probably up to bots and similar scripted operations. One risk that I see for us adding external links is that it may confuse readers into thinking that we actually use these external links for verification when we in fact use attesting quotations. That is one of the reasons why I much prefer the "External links" header to "References" for the purpose.
Wiktionary:Votes/bt-2016-06/User:OrphicBot for bot status is currently running and proposes to bot-add certain external links to as many Latin and Ancient Greek entries as possible, which I welcome, despite opposing.
My user User:DPMaid was volume adding external links to multiple languages as documented on its talk page and no one objected so far. (I do not see why anyone would.) --Dan Polansky (talk) 07:56, 19 June 2016 (UTC)[reply]
Good point on the header, I'll start using "external links" and try to add them more consistently. Ultimateria (talk) 10:25, 19 June 2016 (UTC)[reply]
@Dan Polansky While the choice for the Korean dictionary is good, the Russian {{R:ru:BTS}} is incorrect. It's not linked to Большой толковый словарь but to gramota.ru. Not useful at all. Please undo your edits in Russian entries. I didn't check your other templates. A good bilingual Russian dictionary is [7], if the term can be added to the URL. --Anatoli T. (обсудить/вклад) 12:50, 19 June 2016 (UTC)[reply]
@User:Atitarev: The {{R:ru:BTS}} template uses "?bts=x" in the URL to specifically select Большой толковый словарь from all the dicts available at gramota.ru. If you follow the link in one of the pages, e.g. from словарь, and then look at the right, you will see a list of checkboxes, and only the one for Большой толковый словарь will get checked. I really don't see how that is "not useful at all"; it cannot get much more useful than it is. Would it help if I add "at gramota.ru" to the text shown by the template, to warn the reader that the dictionary is hosted by the site in case the reader does not like the site or something? --Dan Polansky (talk) 13:02, 19 June 2016 (UTC)[reply]
As an alternative, when I was designing the template, I pondered creating "R:DGR" with the text "word in Russian dictionaries at gramota.ru". That would also show the reader some other dicts, like one featuring synonyms and one featuring antonyms. --Dan Polansky (talk) 13:04, 19 June 2016 (UTC)[reply]
I see now what you were trying to do. It didn't open correctly on a mobile version. While there's some value of gramota.ru for advanced learners or editors, it's better to use a bilingual or multilingual dictionary for a broader audience, IMO. --Anatoli T. (обсудить/вклад) 13:47, 19 June 2016 (UTC)[reply]
Monolingual dictionaries are the best ones as far as coverage, depth and unambiguity. They often contain example sentences, which bilingual ones usually do not do. Some argue that bilingual dictionaries are a really bad thing; while that seems to be overstated, many bilingual dictionaries indeed are severely limited, and lead to a lot of unnecessary misunderstanding on the part of their readers. Nonetheless, I do not object to adding some good bilingual dictionaries as external links. However, removing links to good monolingual dictionaries only because they are monolingual would be a real loss for the reader. Anyone seriously interested in a language should read its monolingual dictionaries. --Dan Polansky (talk) 14:09, 19 June 2016 (UTC)[reply]
@Anatoli T.: Mobile: I started my mobile device, went to словарь and followed the link. Indeed, the specific page for the word did not show, and instead, I landed at a page not specific to a word, offering "proverka slova", in the proper script, of course. I was trying to play with the URL on my desktop by placing "m." there, to emulate the mobile view, but the servers seem to redirect me to the desktop view. This will be a rather poor experience for the mobile users, and I do not know how to fix it. One thing the mobile user can do is enter the word again into the search field on gramota.ru and get to the sought dictionary. From my experience, sites that offer both mobile and desktop view usually provide some links at the bottom or the top to make it possible for me to switch between the mobile and desktop views regardless of the type of the actual device; I see no such link at gramota.ru. We may hope they will improve this at some point. --Dan Polansky (talk) 10:23, 20 June 2016 (UTC)[reply]
@Dan Polansky. It's okay, I guess, nothing can be done. Mobile users could click on the "полная версия" hyperlink ("full version", i.e. desktop version) at the bottom right corner in gramota.ru and get the expected link. --Anatoli T. (обсудить/вклад) 12:40, 20 June 2016 (UTC)[reply]
  • Personally, if I was more strict with quality control, I would add an external link to the DRAE on every valid page in Spanish. In reality though, that's never gonna happen. Obviously though, I'd love other users to add references and external links all over the place. --Turnedlessef (talk) 10:38, 20 June 2016 (UTC)[reply]

Inconsistency in the treatment of comparatives

Hello,

The Latin adjective melior is considered a lemma, and so is the French adjective meilleur, whereas the English adjective best is considered a non-lemma form. Why this difference, and especially, why is best considered a non-lemma form? It's not an inflected form of good, is it? — Automatik (talk) 02:43, 19 June 2016 (UTC)[reply]

It probably has something to do with the lack of inflected forms for the English term. If you treat meilleur as a form, that makes meilleures a form of a form, which gets confusing. With best, on the other hand, it's always "best"- no matter what it modifies. Chuck Entz (talk) 03:27, 19 June 2016 (UTC)[reply]
Also, best is, in fact, an inflected form of good, through the process of w:suppletion, in the same way as лучше (lučše) is the comparative form of хорошо́ (xorošó). Chuck Entz (talk) 03:35, 19 June 2016 (UTC)[reply]
meilleur is also a suppletion according to w:suppletion, so it is considered both as lemma and non-lemma form?… — Automatik (talk) 12:29, 19 June 2016 (UTC)[reply]
This argumentation doesn't work. We commonly treat participles as forms of verbs, but they have inflections too, including in French. Some languages also have possessive forms for nouns, like Hungarian or Turkish, but we don't treat them as lemmas of their own despite having inflections. I think we should use the same treatment regardless of language, when possible. The consideration I go by is whether you'd expect to find a form in a paper dictionary as a lemma. Participles and comparatives would not normally appear in a paper dictionary, being subsumed under the lemma of the main verb/adjective. So by that reasoning I would not treat them as lemmas on Wiktionary. I would consider nonlemmas that have inflections of their own a "sublemma", a lemma that is part of the paradigm of another lemma. —CodeCat 13:31, 19 June 2016 (UTC)[reply]
For sure, the comparative meilleur has a specific entry in French paper dictionaries (under M). Is it the case for best? I don't have any English paper dictionary at home. — Automatik (talk) 13:49, 19 June 2016 (UTC)[reply]
That's probably because it's irregular. I wouldn't be surprised if was and were appeared in an English dictionary either. But comparatives generally would not be found there. —CodeCat 15:06, 19 June 2016 (UTC)[reply]
Regular comparatives and participles are usually listed within the main lemma entry but without definition in English-language paper dictionaries. The following would be typical:
red adj. Of the color of blood. red·der, red·dest.
walk v. To proceed by placing one foot in front of the other. walks, walked, walk·ing.
Irregular forms, at least those that are alphabetically far removed, would have their own minimal entries, e.g.:
bet·ter comparative of good.
went past tense of go.
Obviously each dictionary is different, but that's sort of typical for paper dictionaries. —Aɴɢʀ (talk) 17:40, 19 June 2016 (UTC)[reply]

Redirects to matched pairs

Suggestion:

At least ) has a separate sense: used in lists, like "A) milk, B) eggs, C) flour" so it should be kept as a separate entry and also link to ( ).

It seems that (, ), [, ] can be used alone in set builder notation, so I take it all the 4 entries should be kept as well.

I'd like to do this:

  • Delete all senses of ( and ) that are redundant to ( ).
  • Delete all senses of [ and ] that are redundant to [ ].

And finally:

  • Redirect { and } to { }. (unless { or } can be used alone in some sense)

Rules:

  • If a symbol is only used as part of a matched pair, redirect the symbol to the matched pair.
  • If a symbol is used as part of multiple matched pairs, create the entry for the symbol and link to all matched pairs.
  • If a symbol is used by itself as well as part of a matched pair, create the entry for the symbol and list the individual uses normally, plus link to the matched pair entry.

--Daniel Carrero (talk) 19:08, 20 June 2016 (UTC)[reply]
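
The three rules proposed above amount to a small decision procedure, which can be sketched in Python as follows. This is only an illustration of the proposal as worded, under the assumption that the list of matched pairs and the list of standalone senses for a symbol are already known; the function name `matched_pair_action` and the wording of the returned strings are hypothetical.

```python
def matched_pair_action(symbol: str, pairs: list[str], standalone_senses: list[str]) -> str:
    """Describe what to do with a symbol under the three proposed rules.

    pairs: matched-pair entries the symbol belongs to, e.g. ["( )"].
    standalone_senses: senses the symbol has when used by itself.
    """
    if not pairs:
        # Not part of any matched pair: the rules do not apply.
        return f"keep {symbol} as an ordinary entry"
    targets = ", ".join(pairs)
    if standalone_senses:
        # Rule 3: used by itself as well as in a matched pair.
        return f"keep {symbol}, list its own senses, and link to {targets}"
    if len(pairs) == 1:
        # Rule 1: only used as part of one matched pair.
        return f"redirect {symbol} to {pairs[0]}"
    # Rule 2: part of several matched pairs, no standalone use.
    return f"keep {symbol} and link to {targets}"


print(matched_pair_action("{", ["{ }"], []))
# redirect { to { }
print(matched_pair_action(")", ["( )"], ["used in lists"]))
# keep ), list its own senses, and link to ( )
```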

I agree with you on rules 1 and 2. I am also in favor of rule 3; I think we should include a definition-line pointer to the matched-pair entry (for instance using {{only in}}), rather than e.g. just a "See also" link.
- -sche (discuss) 19:49, 20 June 2016 (UTC)[reply]
Maybe we ought to extend rule 3 to apply generally to words that appear as part of a larger idiomatic term? For example, include a link among the definitions of give that leads to give up. —CodeCat 21:24, 20 June 2016 (UTC)[reply]
When the number of collocations to be linked to is small (especially if it's just one or two), I support that. For punctuation marks and symbols I could see allowing separate {{only in}} lines for each "collocation" even if there are many of them. But for words with a very large number of collocations, like take (take in, take over, take cover, take back, take up, take up for, etc, etc), I can see how some people might think it was better to list them in a collapsible table as is done at present. An alternative might be a template similar to {{only in}} but which allowed an arbitrarily long list of collocations to be linked to all on one line (rather than separate lines), a bit like how {{&lit}} can link to as many constituent parts as necessary. - -sche (discuss) 22:40, 20 June 2016 (UTC)[reply]
I favour putting them among the definitions, though. When someone says "give up", the word "give" in that collocation still has a meaning, but that meaning is only apparent in the combination with "up". It's still the word "give", and per our mission statement, if someone wants to know what a word means, they should be able to look it up. It doesn't matter that it's a collocation or idiom, because the person looking it up might not know that. —CodeCat 23:18, 20 June 2016 (UTC)[reply]
@CodeCat, -sche: I agree that ) should link to ( ) in a sense line as opposed to a "see also" section or something.
Would we only have collocations of verb + preposition and adverbs? For example, would the full sense line of give look like this one below?
  1. Used in: give away, give back, give in, give off, give out, give over, give up
--Daniel Carrero (talk) 01:40, 22 June 2016 (UTC)[reply]
Something like that, yes. If the list gets too long, we could have a separate section to list them instead, but then that definition should be replaced with "used in: see #section" or similar. —CodeCat 16:47, 22 June 2016 (UTC)[reply]

Note: The proposal concerning matched pairs affects few entries, and people have supported it so far. I'll wait a bit more, and if no one objects I'll make all the moves and redirects.

I'd like to write the 3 rules into WT:EL eventually, but it would involve creating a vote and I suppose it can be done later. --Daniel Carrero (talk) 05:23, 25 June 2016 (UTC)[reply]

@CodeCat, -sche: Done. I created all the redirects and edited all the entries of matched pairs. See Category:Translingual matched pairs. --Daniel Carrero (talk) 01:36, 28 June 2016 (UTC)[reply]

I created Wiktionary:Votes/pl-2016-09/Matched-pairs — policy page. --Daniel Carrero (talk) 14:05, 11 September 2016 (UTC)[reply]

level of detail in English pronunciations

Under prodigal, the pronunciation looks like this:

/ˈpɹɑdɪɡəl/, [ˈpʰɹ̥ɑɾɨɡɫ̩]

Besides the fact that this is a specifically American pronunciation without being labeled as such, do we really need the level of detail expressed in [ˈpʰɹ̥ɑɾɨɡɫ̩]? IMO this is hardly going to help most people and will likely scare a lot of them off. Benwing2 (talk) 21:57, 20 June 2016 (UTC)[reply]

AFAIK we are supposed to show phonemic and not phonetic pronunciation. Equinox 23:20, 20 June 2016 (UTC)[reply]
We can show both, as long as the phonetic pronunciation is clearly labelled as where it's used, register etc. —CodeCat 23:23, 20 June 2016 (UTC)[reply]
Like in every conversation on this topic I restate my conviction that the question should never be 'do we need it' but 'does it harm us'. Korn [kʰũːɘ̃n] (talk) 23:44, 20 June 2016 (UTC)[reply]
We're supposed to show phonemic, yes, but there's no ban on also showing phonetic. If the information is correct and correctly-labelled and (ideally) verifiable, include it. Average readers have the broad transcription to look at and advanced language learners and others might be interested in the narrow transcriptions. If the phonetic pronunciations become so numerous that they clutter the entry, collapse them. - -sche (discuss) 23:45, 20 June 2016 (UTC)[reply]
At the very least they should not be put on the same line. — Dakdada 08:52, 21 June 2016 (UTC)[reply]
  • I consider this sort of thing a case of false precision that should be removed. It's a bit like measuring the distance between two cities down to the nearest nanometer. —Aɴɢʀ (talk) 14:25, 21 June 2016 (UTC)[reply]
    I agree with Angr. To use language that would satisfy Korn, false precision is harmful. --WikiTiki89 15:25, 21 June 2016 (UTC)[reply]
    Which particular features here would be false precision in your opinion? Aspiration, velarization of coda /l/, and, in American English, medial flapping are quite common features of English pronunciation. Voicelessness of glides after voiceless stops does not seem too bad either. [ɨ] for /ɪ/ and syllabic [ɫ̩] for /əl/ seem more dubious, I suppose. --Tropylium (talk) 21:15, 21 June 2016 (UTC)[reply]
    Showing both aspiration of the [p] and devoicing of the [ɹ] is redundant. Flapping is common, but optional, in AmEng, so [ɾ] is not the only possibility. The unstressed vowel is not as far back as [ɨ]. And above all, all this information is predictable, so it doesn't need to be shown. There's a reason why paper dictionaries only give phonemic transcription, not phonetic, and saving space isn't it. —Aɴɢʀ (talk) 21:58, 21 June 2016 (UTC)[reply]
    Redundancy is not false precision, optionality is not false precision, predictability is not false precision; claiming that the unstressed vowel is never backed to [ɨ] might be false precision. Declension tables are predictable information too. For languages with the right spelling, the pronunciation section itself is redundant, since it is predictable. If you know the rules, you can predict large parts of most languages down from their proto form. Where do we draw the line? That said, I agree that false precision is harmful. But I disagree that there is such a thing as too much precision, if the phænomena are well enough recorded. Korn [kʰũːɘ̃n] (talk) 23:30, 21 June 2016 (UTC)[reply]
    I do think it's possible to be too precise -- too much obvious detail will swamp the important things and make it harder to read. Benwing2 (talk) 23:43, 21 June 2016 (UTC)[reply]
    BTW we ran into this same issue when giving Russian pronunciation. We don't, for example, indicate that non-palatal [l] is heavily velarized, or the exact quality of [ɨ] (which, for example, has a noticeable on-glide preceding it when following labial consonants), but we do indicate the pronunciation of unstressed /a/ as either [ɐ] or [ə] (the rules for this are somewhat complex and easy to forget). The idea here is to include detail that is likely to help language learners and omit detail that is less helpful (either because it's too precise or because it will already be known). Especially unhelpful IMO is including lots of the more obscure IPA diacritics and other symbols, which few people will be familiar with and fewer still will have any idea how to pronounce correctly. Even using [ɾ] for flapped /d/ and /t/ bothers me a bit -- I would be at least as comfortable using [d], even if it's a slight lie. Benwing2 (talk) 23:53, 21 June 2016 (UTC)[reply]
    So false imprecision is better than false precision? I'm shocked. The moment we start entering even one smidgen of false information knowingly is the moment when we can scratch the entire project, because we no longer have the goodwill on which this project runs. And as long as we can collapse, I don't see how we can ever get swamped. We can easily make three labeled levels: Archiphonemic (English), phonemic (USA), phonetic (Working Class Michigan) and hide the phonetic levels if they become too many. In all languages. Korn [kʰũːɘ̃n] (talk) 05:27, 22 June 2016 (UTC)[reply]
    Giving the phonemic transcription only is not giving false information, but giving the phonetic transcription falsely implies that all other phonetic renderings are wrong, which is harmful. Pronouncing this word without aspiration/devoicing is unusual (except in certain accents like Indian English) but not incorrect. Pronouncing this word without flapping the /d/ is unusual in North America but not incorrect. Pronouncing this word with [ɪ] rather than [ɨ] is normal and not incorrect. Pronouncing this word with a nonvelarized [l] is unusual in North America but not incorrect. That's why this is false precision: it implies that any deviation is wrong, and it isn't. It's like saying the distance from New York to Boston is 13,495,680 inches: it implies that it's more than 13,495,679 inches but less than 13,495,681 inches, which is absurd. —Aɴɢʀ (talk) 11:08, 22 June 2016 (UTC)[reply]
    I would not read an implication that everything else is wrong, and even if that was the case, that issue would be fixable by adding labels, even more pronunciations, and not by removing stuff. Korn [kʰũːɘ̃n] (talk) 11:41, 22 June 2016 (UTC)[reply]
    No one else has mentioned that this is how they pronounce it so here I am. It's my exact pronunciation (although added by User:msh210). I don't see the harm. If someone wanted to find the phonetic transcription, where else could they find it besides here? Ultimateria (talk) 21:23, 21 June 2016 (UTC)[reply]
    OK, I took the liberty of deleting the excessively detailed pronunciation (and adding UK pronunciation in, hopefully I got it right). If we want to put it back we should have a general policy of how to represent phonemic and phonetic detail. I think something like [ˈpʰɹɑɾɪɡəl] is plenty enough detail. Rules for aspiration and flapping are a bit complicated so it may be useful to show them, but devoicing of [ɹ] is obvious and surely excessive, and the quality of [ɪ] and [ə] (and whether the last syllable has a syllabic l) are too variable to quantify, and all /l/'s are velarized in American English so it's probably not necessary to bother with that -- anyone who cares enough about the exact quality of /l/ will almost surely already know that /l/ is velarized. Benwing2 (talk) 23:41, 21 June 2016 (UTC)[reply]
    I don't think the devoicing of [ɹ] is obvious to all non-native speakers, nor do I fully agree with your final comment. When learning other languages, I find very detailed phonetic pronunciations extremely helpful, as I am not always able to pick up the finer subtleties of pronunciation just by listening (and by finer, I mean at least as fine as [ˈpʰɹɑɾɪɡəl], and often more specific). In French, for instance, I'm finding that I've become limited in my ability to improve my accent, because nowhere can I find exact enough phonetic transcriptions of words, and I'm often not able to successfully imitate some of the minutiae of pronunciation that I hear. I'm opposed to removing phonetic pronunciations unless their precision is actually false (as opposed to "unnecessary"), but I do think they should be clearly labelled as such. Andrew Sheedy (talk) 04:23, 22 June 2016 (UTC)[reply]
    I've restored the pronunciation, labelled as American, as [ˈpʰɹɑɾɪɡɫ̩]. Seeing as many varieties (including US varieties) of English use both [ɫ] and [l], and some languages contrast them, a narrow transcription should distinguish them. I went with [ɫ̩] rather than [əɫ] because the former is what I've seen more of in other entries, e.g. battle, bottle, petal, fiddle (in the broad transcription of that last one — there it should probably be changed to /əl/). - -sche (discuss) 04:47, 22 June 2016 (UTC)[reply]
    I think (a select few) non-native speakers might find the transcription of [ɹ] as [ɹ̥] helpful, but I suppose they would be able to find that information elsewhere. Andrew Sheedy (talk) 05:53, 22 June 2016 (UTC)[reply]
    No one seems to have mentioned this yet, but the reason I would label this as false precision is that it is selectively precise. It is precise about some aspects of the pronunciation and imprecise about others. The problem with that is that our readers will think it is a precise transcription and assume that all aspects of it are precise. What aspects are we being imprecise about? First of all, the [ɫ] symbol is an intentionally imprecise symbol and should never be used in precise transcriptions; this symbol is intentionally vague about whether the [l] is velarized or pharyngealized. Secondly, we are missing the actual place of articulation of the /l/, which for most Americans is dental. Thus the last syllable can be given as [l̪̩ˠ]. Next, the articulation of the /ɹ/ is most certainly not simply alveolar. In fact I'm not entirely sure what it is. But after saying this word over and over, I have come to suspect that in my pronunciation of this particular word, it is [ʟ̹ʷ] (a rounded labialized velar lateral approximant) or perhaps [ɣ̞̹ʷ] (a rounded labialized velar approximant). This also seems to velarize the /p/, giving [pˠʰʟ̹̊ʷ] or [pˠʰɣ̞̹̊ʷ] for the initial consonant cluster. Now we encounter another problem, which is that I have no idea whether all GenAm speakers pronounce it that exact way or not, and if not then by using this transcription we would be making the inaccurate claim that they do. I'm not even gonna bother analyzing the vowel qualities and lengths, but just note that those are another missing piece of precision. My guideline would be that if the phonetic transcription is not illustrating some important peculiarity of a word, then it is superfluous and falsely precise. --WikiTiki89 14:43, 22 June 2016 (UTC)[reply]
    The phonetic pronunciation is showing the peculiarity of a specific accent. Something I am absolutely looking for in Wiktionary, it's highly interesting information to me and seems to me to be well apt for our pronunciation section. As long as the data given is correct, I don't see the relevance of other pronunciations which diverge more or less from it. They can be added. Assuming that all people in area X pronounce a word exactly the same way is a lack of understanding that's to be fixed by a lecture on linguistics/phonetics, not a dictionary. Korn [kʰũːɘ̃n] (talk) 11:37, 23 June 2016 (UTC)[reply]
    ps.: While I'm usually for assuming that the user is not too well acquainted with linguistics and has a short attention span, clearly anyone knowing how to read IPA in the first place has a basic interest in the topic and can be expected to have a basic understanding. If not, add a disclaimer, don't remove information. Korn [kʰũːɘ̃n] (talk) 11:39, 23 June 2016 (UTC)[reply]
    No, the only peculiarity of a specific accent that it is showing is the realization of /d/ as [ɾ] (well and the vowel quality of the first syllable's vowel, but that's already given in the phonemic representation and is actually variable within GenAm). The aspiration of /p/ and the darkening of /l/ are universal in English (perhaps with the exception of small dialects that I don't know about?). The devoicing of the /ɹ/ is not something I've ever noticed or paid attention to before, but I suspect that it is not peculiar to GenAm either. The quality of the second syllable's vowel is disputable (I'm not sure what it actually is), and I don't think it is peculiar to GenAm either. The features I mentioned in my previous post, however, such as the dental nature of the /l/ and the precise articulation of the /ɹ/, are peculiar to GenAm (RP has an alveolar /l/ and in this word I suspect the /ɹ/ is simply [ɹ] or [ɻ], and not velar). The vowel length features of GenAm are also completely ignored (the first syllable has a longer vowel than the other two). The features given are not any more important or interesting than the features not given. --WikiTiki89 15:09, 23 June 2016 (UTC)[reply]
    Non-velarised L occurs in Northumbria and Ireland. I'm talking about whether this level of pronunciation should be had in general; I have no merit whatsoever to talk about this pronunciation specifically. I'm just saying that, if e.g. GenAm is /bɜrd/, then having New York: [bɜjd] and Some city: [pɚt] seems to me to be within our scope, and desirable. Every phonetic feature which is distinguishing either for or within the dialect should be visible. So l-velarisation should be featured, for, while it is not phonemic anywhere, it is part of what makes Geordie sound like Geordie. When dealing with most German, just having /a/ and /a:/ might be sufficient, but an extra line for northern accents, where /aː/ and /ɑː/ are contrasting phonemes, actually making that difference, and having that line in the first place, is neither superfluous nor false precision, but simply extra service. Can we be on the same page on that? Korn [kʰũːɘ̃n] (talk) 15:56, 23 June 2016 (UTC)[reply]
    Yes, I can agree that "New York: [bɜjd]" is useful, because it does not attempt to be overprecise, it is just highlighting a particular feature. --WikiTiki89 17:18, 23 June 2016 (UTC)[reply]
    I definitely agree that we should avoid being overprecise. @Korn, keep in mind that many people know or can learn the basics of IPA but will not know how to interpret all the strange diacritics and such. Figuring out that [ʃ] represents the sound of sh is on an entirely different level from figuring out what all the symbols in [l̪̩ˠ] (much less [pˠʰɣ̞̹̊ʷ]) mean. Most people, including IMO many or most people familiar with IPA, have no idea what sounds are denoted by vowel symbols like [ɤ] and [ɞ] and [ɜ], or what the difference between dental vs. alveolar articulations or velarized vs. pharyngealized articulations are, or even what "pharyngealized" means, and have no easy way of figuring this out, either. This is a general dictionary and needs to be aimed towards the intelligent layman, not a specialist in linguistics. We always have to strike a balance between precision and ease of use. Benwing2 (talk) 17:56, 25 June 2016 (UTC)[reply]
    Finding out the value of [ʃ] and [ɤ] is equally difficult. Both are a mere two clicks away on Wikipedia, tops. And since our users browse this Wikiproject, we can assume they have access to another. If they cannot be arsed to look up the information, that is their choice. Wiktionary should be aimed at everyone, which is my understanding of the Wiki spirit. We should not suddenly stop being informative at a certain level of education. If we cannot approach laymen and expert alike, we're simply not good at what we do. I'm tired of hearing that the user is unknowing and thus may never be given incentive or chance to improve. We can provide both, the simple and the precise, the overview and the in-depth-treatment. Korn [kʰũːɘ̃n] (talk) 18:13, 25 June 2016 (UTC)[reply]
    I've been away from the discussion for a bit, but I strongly agree with pretty much everything Korn has been saying. If we're worried about confusing the average user with extremely precise (and potentially extremely helpful) phonetic information, then let's put it in collapsable boxes, not remove it. Andrew Sheedy (talk) 03:59, 7 July 2016 (UTC)[reply]
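
As a worked check of the inch figure in Angr's New York to Boston analogy above, here is a minimal Python sketch. The constant and function names are illustrative and do not come from the discussion; the sketch only converts the quoted figure to miles and shows how narrow the window implied by inch-level precision is.

```python
# Illustrative names only; nothing here comes from the thread except the inch figure.
INCHES_PER_MILE = 63_360  # 5,280 feet per mile x 12 inches per foot


def inches_to_miles(inches: int) -> float:
    """Convert a distance in inches to statute miles."""
    return inches / INCHES_PER_MILE


quoted_inches = 13_495_680          # figure quoted in the analogy
miles = inches_to_miles(quoted_inches)

# Quoting to the nearest inch implies the true value lies within a
# two-inch window (one inch either side of the stated figure).
implied_window_miles = inches_to_miles(2)

print(f"{quoted_inches:,} inches = {miles} miles")
print(f"implied window: {implied_window_miles:.7f} miles")
```

The figure is a round 213 miles restated in inches, which is the point of the analogy: the extra digits add no real information.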

"Category:en:Currencies" and "Category:en:Currency"

What's the difference between "Category:en:Currencies" and "Category:en:Currency"? Do we need both? — SMUconlaw (talk) 22:14, 21 June 2016 (UTC)[reply]

"Currencies" contains the names of particular currencies, while "Currency" contains terms related to currency that are not necessarily currencies. So there is a difference. —CodeCat 01:01, 22 June 2016 (UTC)[reply]
Maybe in theory, but that's not the case with those categories at present. Equinox 02:25, 22 June 2016 (UTC)[reply]
In that case, the categories need usage notes, and some reclassification is in order. — SMUconlaw (talk) 07:17, 22 June 2016 (UTC)[reply]
We really need some clear contrast made between "set" categories and "topic" categories. Other than the fact that "set" categories tend to have plural names and "topic" categories singular names, I don't know how we're supposed to predict which category is of which type. Category:en:Horses, for example, has a plural name and says "English terms for horses", but in fact its content includes lots of terms that relate to horses in some way but are not terms for horses (behind the bit, equine, gait, etc.). Some of them could be moved to Category:en:Equestrianism, but not all of them. —Aɴɢʀ (talk) 10:50, 22 June 2016 (UTC)[reply]
Perhaps rename to a clearer word? These little differences are annoying to someone who uses a language without plurals, like me. --Octahedron80 (talk) 10:57, 22 June 2016 (UTC)[reply]
Wikipedia distinguishes them using the plural in some cases. There's w:Category:Color next to w:Category:Colors. But I do agree that it may make sense to distinguish them more clearly. I just don't know how. Perhaps the simplest solution would be to have Category:Topic:Horses for the topic, or disambiguate the set as Category:Kinds of horses. But then we'd have to do the same for all other categories too, so we might end up with Category:Species of mammals and similar "long" names for all life forms. And the system may not be watertight in any case; someone may still decide to place Stadtkreis in Category:de:Districts of Germany, even if the category may be intended only for the names of actual districts, not terms for specific kinds of districts. Both are sets, but the category would be only intended as one of them. —CodeCat 15:46, 22 June 2016 (UTC)[reply]
cat:en:List of colors? --Giorgi Eufshi (talk) 15:54, 22 June 2016 (UTC)[reply]
It kind of works, but it also has a connotation to me that implies it's a complete list. Maybe all categories are that way, I don't know, but it feels stronger with "list of". —CodeCat 16:36, 22 June 2016 (UTC)[reply]
I don't know if new names are really necessary; it might be sufficient to have more explicit text in the categories themselves. The text currently in CAT:en:Currency and CAT:en:Currencies is pretty good, but maybe they could even say "This is a topic category..." and "This is a set category...". Take CAT:en:Body, which says "English terms for and related to the body and its parts." It seems to be both, as it's both terms for and related to. It should probably be a topic category, with a separate CAT:en:Body parts as the set category. Then, to add to the confusion, there's CAT:en:Anatomy, which I guess is supposed to be just for anatomical technical terms (such as one might learn in anatomy class at university) and not for everyday words, but in practice it's full of everyday words for parts of the body. Maybe we should have a third kind of category, the "technical-term category" and label them as such. I was recently at a loss where to put some language's word for "feather". Not in CAT:Birds, because a feather isn't a kind of bird, and not in CAT:Ornithology because it isn't a technical term. CAT:Body is OK I guess, though it seems odd since the average reader probably expects that to refer to the human body. I notice that feather isn't in any category specific to its primary birdy meaning. —Aɴɢʀ (talk) 17:31, 22 June 2016 (UTC)[reply]

I am in favour of fuller usage notes that clearly explain the intended use of a particular category and suggest alternative categories for related words. I just wanted to point out that "This is a topic category ..." and "This is a set category ..." are not clear enough. — SMUconlaw (talk) 21:09, 22 June 2016 (UTC)[reply]

People that add categories to entries won't always look at the description on the category page. If they see one language use it, they might add it to their own language entry without checking. I certainly don't look at the descriptions much myself. —CodeCat 21:34, 22 June 2016 (UTC)[reply]
I don't look at the descriptions all that much either, but I do look at them when I'm uncertain whether a particular word belongs in a particular category. —Aɴɢʀ (talk) 21:56, 22 June 2016 (UTC)[reply]
Nonetheless, it would be useful to define possibly confusing categories on the category pages so that incorrectly categorized words can be spotted and moved. — SMUconlaw (talk) 23:51, 22 June 2016 (UTC)[reply]
Perhaps two separate namespaces, like Category (or Topic) and Set (or List)? Functionally they would both operate like categories but the distinction would then be clear(er) even to those who don't look at the cat page. Equinox 22:26, 22 June 2016 (UTC)[reply]
Can we actually create new category namespaces? Do we really want to? We can also have Category:Topic:en:Anatomy or similar. But also consider Angr's point that there is a distinction among topical categories between terms related to a topic, and technical terms and jargons used within a field. —CodeCat 00:22, 23 June 2016 (UTC)[reply]
Well, one could potentially have a three-way split: "Category:List:en:Religions" (list of religions: Judaism, Islam, etc), "Category:Topic:en:Religion" (words pertaining to religion: god, church, etc), and "Category:Jargon:en:Religion" (or some other word besides "jargon") (for words used chiefly by scholars of religion, like perhaps actual sin). But the last one might be better named "Category:Jargon:en:Theology". In other cases, the Topic and Jargon categories might share a keyword ("...:en:Aviation") while the List category had a different name ("...:en:Aircraft"); that's not a problem, I'm just mentioning it. A related issue is the tendency of people to use labels for all three purposes, making it hard to tell when a sense simply pertains to religion and when only scholars of religion use the sense. - -sche (discuss) 04:11, 23 June 2016 (UTC)[reply]
The whole idea of separating topics and sets is likely lost on the vast majority of potential readers, which leads me to believe there's not a lot of benefit in a dual system. Those (such as CodeCat) who want a dual system or even a triple system are quite focused on the minutiae of categorization. They aren't really creating a user-friendly system. Purplebackpack89 15:33, 23 June 2016 (UTC)[reply]
I threw the idea out there to see what people would think. I'm only tangentially interested in categories (and would like to see how much use they get, if we could measure such things). But if nobody cared at all, this discussion wouldn't have come up, I suppose. I do think it's worth stating on each category page what it's supposed to achieve, but realise somebody will always add further entries without reading that text. Equinox 16:46, 23 June 2016 (UTC)[reply]
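
To make the naming schemes discussed in this thread concrete, here is a minimal Python sketch that splits a category title into an optional kind prefix, a language code, and a label. It assumes the hypothetical three-way "List"/"Topic"/"Jargon" prefixes floated by -sche above, which were only a suggestion in this discussion and are not an adopted convention; the type and function names are likewise illustrative. It handles only language-prefixed titles.

```python
from typing import NamedTuple, Optional

# Hypothetical kind prefixes taken from the suggestion in the thread;
# they are an assumption, not an existing Wiktionary convention.
KINDS = {"List", "Topic", "Jargon"}


class CategoryName(NamedTuple):
    kind: Optional[str]  # None for current-style names without a kind prefix
    lang: str            # language code, e.g. "en"
    label: str           # e.g. "Religions"


def parse_category(title: str) -> CategoryName:
    """Split 'Category:[Kind:]lang:Label' into its parts."""
    parts = title.split(":")
    if parts[0] != "Category":
        raise ValueError(f"not a category title: {title!r}")
    rest = parts[1:]
    kind = None
    if rest and rest[0] in KINDS:
        kind, rest = rest[0], rest[1:]
    if len(rest) != 2:
        raise ValueError(f"expected 'lang:Label' after the prefix: {title!r}")
    lang, label = rest
    return CategoryName(kind, lang, label)


print(parse_category("Category:List:en:Religions"))
print(parse_category("Category:en:Currencies"))
```

Under this sketch a current-style name such as "Category:en:Currencies" parses with no kind, which mirrors the problem raised above: the title alone does not say whether the category is a set or a topic.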

Proposal: Editing WT:EL#Flexibility to remove everything except the first sentence. (diff)

Current text:
While the information below may represent some kind of “standard” form, it is not a set of rigid rules. You may experiment with deviations, but other editors may find those deviations unacceptable, and revert those changes. They have just as much right to do that as you have to make them. Be ready to discuss those changes. If you want your way accepted, you have to make the case for that. Unless there is a good reason for deviating, the standard should be presumed correct. Refusing to discuss, or engaging in edit wars may also affect your credibility in other unrelated areas.

Rationale:
I created Wiktionary:Votes/pl-2016-02/Removing "Flexibility" this year with the proposal of removing the paragraph entirely. It ended as no consensus 6–4–1 (60%), with some voters stating that they support the notion of flexibility. That said, as an additional suggestion, I maintain that maybe adding the rest of the text to Help:Interacting with other users would be a good idea.

Note about votes:
I'm trying not to create too many votes at once because people complained when I did it last time, around February 2016. I believe this change in the flexibility text would need a vote, but it can be created later. I still have the intention of editing more of WT:EL but I'll try focusing on this section now because it's the first unvoted section of the policy. --Daniel Carrero (talk) 09:41, 25 June 2016 (UTC)[reply]

Should the ‘Vulgar Latin’ distinction be deleted?

The way we treat Latin is rather unusual. The Latin spoken by the illiterate or barely literate is classified as Vulgar, whereas that which was spoken by the educated is called Classical. There was almost certainly a lot of variation within Vulgar Latin, and between regions and ages. Likewise, Classical Latin had its own set of variations. Presumably Vulgar Latin is unique in that it was usually spoken by the less educated, but I’ve never seen people apply this distinction to other languages.

The phrase “Vulgar Latin”, coined in the nineteenth century by Hugo Schuchardt, is unfortunate, but has come into common usage among scholars. József Herman defined Vulgar Latin as a collective label for those features of Latin which we can be sure did exist, but which were not recommended by the grammarians. It should not be, although it often has been, envisaged as being a separate language (or “system”) co-existing with “Classical” Latin. Both terms are ambiguous, and probably best avoided: all the styles and periods might as well be included under the umbrella of “Latin”, tout court. “Romance” is no clearer a word when applied to these centuries. The word is sometimes used as an alternative label for what others call Vulgar Latin, implying that Latin and Early Romance were then, and are now in retrospect, clearly distinct simultaneous entities. This perspective could only command adherence before the development of sociolinguistics which has made clear that variation is normal within a single language, and we need not assume that the existence of synonymous variants (such as DE and genitives, or AMABO and AMARE plus an auxiliary) implies that they each belong to a different language (or “system”). It is true, of course, that by the second millennium CE it had become normal to distinguish, at least in some contexts, Latin and Romance as separate entities; but it now seems anachronistic to postulate such a mental distinction as existing before the seventh century, and probably (for reasons there is no room to go into now) before the Carolingian “Renaissance” of c. 800 CE. Attested features in writing of the period preceding this separation of Latin and Romance may well therefore be direct evidence of the spoken usage of that time (inasmuch as written evidence can be taken to attest speech anywhere): e.g. 
when St Benedict of Nursia in the sixth century CE used both manducare and comedere to mean “eat” in his Monastic Rule, it is reasonable to take that as evidence for both words being used at that time and in that area; but it is a pointless argument to discuss whether these uses attest sixth-century southern Italian “Latin” or “Romance”, as if that had been a real distinction in St Benedict’s context. It makes it awkward for us, but we need to realize that calling the spoken language of pre-Carolingian post-imperial times (c. 400–800 CE) “Latin” (or “Late Latin” or “Vulgar Latin”) or “Romance” (or “Early Romance”) is only a terminological distinction. There was just one language there, however variable it was in increasingly elastic and complex ways. (Source; more information.)

Are we better off without this? --Romanophile (contributions) 12:31, 25 June 2016 (UTC)[reply]

No, I don't think we are. We treat Vulgar Latin as an etymology-only variant of Latin anyway, it's not as if we treat it as a separate language. And in etymologies it is important to recognize that some reconstructed words that must have existed as early as the 1st century simply aren't attested, or are attested only in nonliterary contexts (e.g. graffiti in Pompeii), or that some words must have had colloquial senses that aren't attested in literature (e.g. focus (fire) as opposed to 'fireplace, hearth'). Of course in real life the distinction between Vulgar Latin and Classical Latin wasn't binary, it was a continuum from basilect to acrolect, but for dictionary-writing purposes it's still more helpful to retain the labels than to ditch them. —Aɴɢʀ (talk) 12:45, 25 June 2016 (UTC)[reply]
Welsh has a similar distinction between colloquial and literary language, and they differ in grammar too. But we don't make that distinction as strongly as we do for Vulgar Latin. —CodeCat 13:50, 25 June 2016 (UTC)[reply]
I thought that we were just following our sources. Most of the Vulgar Latin etymology labels are from MW 1913, possibly Century 1911 too. The passage cited may lead to a vast amount of scholarship which would enable better labeling in etymologies and in Latin entries. When the fruits of that research are available we could incorporate it into our labeling. Unless the current label leads to loss of users or contributors, it seems silly to eliminate what information value the label has, however modest. DCDuring TALK 14:35, 25 June 2016 (UTC)[reply]
I agree with the above and I'm for keeping it the way it is for now, as the label does serve a purpose. But I'll admit the way we handle it can be unusual. On a related note, I think it's also somewhat inconsistent. Technically, most terms that were inherited probably should've passed through a "Vulgar Latin" phase before making their way into what we call "Romance". But on here we usually only make a point of listing the Vulgar Latin intermediate if it is sufficiently different from the attested Classical Latin term and thus warranted. There are numerous hypothetical intermediates between the Classical term and the Romance language term that we could ascribe to Vulgar Latin, such as forms dropping the final -s or -m, or ones accounting for sound shifts, contractions/syncope, metathesis, etc. So the way we handle it is admittedly somewhat arbitrary. However, adding a Vulgar Latin intermediate form for every inherited entry is unnecessary, of course, as the transformation from the Classical can already be clearly inferred in many cases. Word dewd544 (talk) 17:43, 27 June 2016 (UTC)[reply]

Should I just add basic definitions?

I was looking at this wordlist: https://en.wiktionary.org/wiki/Wiktionary:Frequency_lists/Czech_wordlist

I noticed that a lot of the words have non-working links. I was thinking about adding English definitions for these words. But I don't know all the rules for participating in Wiktionary, and I don't really want to take all the time to learn. So would it be useful for me to just add basic definitions so that someone else can come along and put them into the official Wiktionary format, or would my sloppy additions just be deleted anyway?

I've been emailing info-en@wiktionary.org about this for four days, but nobody is responding. Please let me know if there is a better place to get an answer to my question. Thanks!


The logo vote passed. According to phab:T138801, the change is scheduled for 30 June 15:00-16:00 UTC. --Daniel Carrero (talk) 07:56, 29 June 2016 (UTC)

I look forward to the next new logo vote! - TheDaveRoss 12:44, 29 June 2016 (UTC)
We are going to need a matching favicon as discussed here. --Daniel Carrero (talk) 12:50, 29 June 2016 (UTC)
Finally, we got it! I'll celebrate by editing twice as hard today --Turnedlessef (talk) 10:33, 1 July 2016 (UTC)
I realize now the lighter gray color of the text at the bottom would have been better off black, and there probably should have been a little more clearance at the top, but it's not too bad. Also, they haven't changed our logo on the www homepage. Are they going to? Do we need to enter a separate ticket for that? --WikiTiki89 16:52, 1 July 2016 (UTC)
The old logo has disappeared for me, leaving an empty space ... — SMUconlaw (talk) 17:55, 1 July 2016 (UTC)
You edit meta Www.wiktionary.org_template to affect that page. It is protected, so someone who is an admin there will have to do it I guess. - TheDaveRoss 18:47, 1 July 2016 (UTC)
Something is wrong. As with Smuconlaw, for me the logo has disappeared. Benwing2 (talk) 02:21, 2 July 2016 (UTC)
Same here- both on Firefox and Safari (Mac). I looked at the part of the page source for the logo:
<div id="p-logo" role="banner"><a class="mw-wiki-logo" href="/wiki/Wiktionary:Main_Page" title="Visit the main page"></a></div>
There's no code for an image. Odd. Chuck Entz (talk) 03:22, 2 July 2016 (UTC)
The image is provided via CSS applied to .mw-wiki-logo. (The image is [8].) —suzukaze (tc) 03:45, 2 July 2016 (UTC)
I'm using Chrome on Mac, and still have a missing logo. Benwing2 (talk) 06:09, 2 July 2016 (UTC)
The logo looks fine here; it didn't disappear at all to me. I use Firefox 47.0 on Windows 8.1. --Daniel Carrero (talk) 06:17, 2 July 2016 (UTC)
I'm on the new Firefox 47.0.1 and there is still a gaping grey hole. — SMUconlaw (talk) 12:57, 2 July 2016 (UTC)
I installed the new Firefox 47.0.1, too. The logo is still perfect to me. I even cleared the cache a few times, all is good here. --Daniel Carrero (talk) 13:04, 2 July 2016 (UTC)
Clearing the cache makes no difference for me. :( — SMUconlaw (talk) 13:08, 2 July 2016 (UTC)
Still missing for me, under both Chrome and Safari on the Mac. Can we file a phab bug? Benwing2 (talk) 21:48, 2 July 2016 (UTC)
I created phab:T139255. --Daniel Carrero (talk) 02:42, 3 July 2016 (UTC)
Thanks! Benwing2 (talk) 02:56, 3 July 2016 (UTC)
Hi, there was a mistake with the original configuration change so that people with high-resolution (retina) displays wouldn't see any logo at all. I deployed the fix just now and all should be good. Sorry about the trouble! Legoktm (talk) 07:47, 3 July 2016 (UTC)
Yup, it looks fine now! Thanks. — SMUconlaw (talk) 11:43, 3 July 2016 (UTC)

@Legoktm: Actually, it looks fine on my laptop but I am still not seeing any logo on mobile devices (am using an iPhone and iPad). — SMUconlaw (talk) 09:37, 4 July 2016 (UTC)

I don't think there ever was one in the mobile view. I may be wrong though. --WikiTiki89 20:24, 5 July 2016 (UTC)
I wasn't viewing the website in the special mobile mode, but as an ordinary website in the Safari browser. Unless I've been hallucinating, my impression is that there used to be a logo visible, in the same way it appears when the website is viewed on a laptop or desktop. — SMUconlaw (talk) 20:36, 5 July 2016 (UTC)
Hmm... On my iPhone the logo appears as expected on the Desktop version of the site in both Chrome and Safari. --WikiTiki89 20:40, 5 July 2016 (UTC)
Tried closing and reopening Safari. Nope, the logo is still missing. I'm on iOS 9.3.2 (the current version). — SMUconlaw (talk) 21:08, 5 July 2016 (UTC)

I can now see the logo on my older-model iPad, but not on my iPhone 6. — SMUconlaw (talk) 12:01, 16 July 2016 (UTC)

Compact Links coming soon to this wiki

Screenshot of Compact Language Links interlanguage list

Hello, I wanted to give a heads up about an upcoming feature for this wiki which you may have seen already in Tech News. Compact Language Links has been available as a beta-feature on all Wikimedia wikis since 2014. With compact language links enabled, users are shown a much shorter list of languages on the interlanguage link section of an article (see image). This will be enabled as a feature in the coming week for all users, which can be turned on or off using a preference setting. We look forward to your feedback and please do let us know if you have any questions. Details about Compact Language Links can be read in the project documentation. Thank you. On behalf of the Wikimedia Language team:--Runa Bhattacharjee (WMF) (talk) 13:09, 29 June 2016 (UTC)

I hate this feature. Can there be a way to disable it in user preferences? --WikiTiki89 20:12, 29 June 2016 (UTC)
Seems to already exist, Wikitiki.
Looks like it is an optional feature. - TheDaveRoss 20:22, 29 June 2016 (UTC)
Thanks! You just made my day! --WikiTiki89 20:25, 29 June 2016 (UTC)
If you don't want it, untick the box that will be available at Preferences > Appearance > Languages. —Stephen (Talk) 03:57, 30 June 2016 (UTC)