Wiktionary:Beer parlour/2016/January



Happy New Year

Happy wiki-ing! I hope that 2016 is a banner year for etymologies, appendices of personal names, and pushing forward creating the most comprehensive and accurate dictionary yet. —Justin (koavf)TCM 15:35, 31 December 2015 (UTC)

Vote counter

Happy new year! I added a vote counter in Template:votes. Let me know if you would change anything or if there's any bug. --Daniel Carrero (talk) 09:15, 1 January 2016 (UTC)

(I'll update the documentation later, I need some sleep.) --Daniel Carrero (talk) 09:25, 1 January 2016 (UTC)
That's cool, but for past votes I think the result is more important. --Dixtosa (talk) 10:06, 1 January 2016 (UTC)
There's no way to get the decision automatically as a simple text (Passes, Fails, No consensus) just by parsing the page. For example, technically I could make the module look for uses of "Passes", "Fails", "No consensus" in the decision text but it would fail and get wrong results sometimes, especially in cases of votes with multiple options and complex results like "passes except for English".
Template:votes can be changed to let people add the vote result manually, but I oppose that. Showing the past votes with their results is already the job of WT:V#Recently ended votes, which shows more votes. If someone has the time to edit Template:votes to add a "Fails", they could as well be moving the vote to the actual list of recently ended votes. --Daniel Carrero (talk) 19:27, 1 January 2016 (UTC)
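To illustrate the point about keyword matching, here is a minimal Python sketch. It is not the code behind Template:votes (a real module would be written in Lua), and the names `KEYWORDS` and `guess_decision` are hypothetical. It simply looks for one of the three fixed words in the decision text, which is the approach described above as unreliable.

```python
# Hypothetical keyword list for illustration; not taken from any real module.
KEYWORDS = ("No consensus", "Passes", "Fails")


def guess_decision(decision_text):
    """Return the first keyword found in the decision text, or None."""
    lowered = decision_text.lower()
    for keyword in KEYWORDS:
        if keyword.lower() in lowered:
            return keyword
    return None


# A simple result is classified as expected.
print(guess_decision("Passes 12-2-1."))
# A complex result is flattened to the same answer, so the exception is lost.
print(guess_decision("Passes except for English."))
```

Both calls return "Passes", so the qualification "except for English" disappears. That is the kind of wrong or incomplete result a purely automatic approach would produce for votes with several options.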

Note: Hover the mouse over the number of votes to see who voted. --Daniel Carrero (talk) 21:48, 1 January 2016 (UTC)

For me, "{{#ifeq|normal|past|}}" and whatnot is showing up dozens of times over at the top of the table. Andrew Sheedy (talk) 22:09, 1 January 2016 (UTC)
@Andrew Sheedy: Sorry, that appeared for a moment because I made a mistake. I already fixed that. --Daniel Carrero (talk) 22:11, 1 January 2016 (UTC)
Alright, thanks. I feel like it would be better if the table showed fewer votes; say, 10-15, rather than 22 like it does now. Also, why does each vote title begin with something like "pl-2015-12/"? I don't recall it doing so when the table was first implemented, and it looks kind of messy. That being said, I find it helpful to keep me up-to-date on recent votes, as I never remember to check them out otherwise. Andrew Sheedy (talk) 22:32, 1 January 2016 (UTC)
@Andrew Sheedy: The long list of votes is my fault, I created 18 of those. (82%) I feel it's better if the table shows all current votes rather than hiding some. If the table showed only 10-15 votes right now, some unstarted votes would be hidden from view but some active votes would be hidden too.
I removed the "pl-2015-12/" part now. Thanks for the last comment! --Daniel Carrero (talk) 22:49, 1 January 2016 (UTC)
Thanks, that makes it easier to read. Andrew Sheedy (talk) 23:02, 1 January 2016 (UTC)

I would like to install Extension:GetUserName to show if the current user already voted.

  •   (You already voted!)
  •   (You did not vote yet!)

--Daniel Carrero (talk) 22:24, 1 January 2016 (UTC)

Daniel, is it possible to allow people to manually insert the result of a vote? There's already an "edit this list" link. There are few enough votes that it wouldn't be too hard to insert the results, or at least a "passed/failed/other" or "passed/failed/partly passed" or whatever. Benwing2 (talk) 03:03, 2 January 2016 (UTC)
I oppose anything that requires further manual upkeep. —Μετάknowledgediscuss/deeds 03:21, 2 January 2016 (UTC)
It's not required. Benwing2 (talk) 03:30, 2 January 2016 (UTC)
@Benwing2, Dixtosa, Metaknowledge: I can change the template to add a "decision" parameter to all votes but it would require manual upkeep forever: Every time a vote ends, someone would have to put the "Passes", "Fails", whatever in the vote box. I mean, if nobody types a decision, then it would just show nothing.
If enough people want that and are willing to update the box whenever needed... Why not, I suppose. --Daniel Carrero (talk) 03:36, 2 January 2016 (UTC)
I still oppose that. —Μετάknowledgediscuss/deeds 03:45, 2 January 2016 (UTC)
@Benwing2, Dixtosa, Metaknowledge: I added the decision1=, decision2= parameters per Benwing2 and Dixtosa, despite Metaknowledge's opposition. Does it look good? Myself, I said above that I opposed it, but now I abstain as to whether that parameter should be kept.
Currently, the vote box shows 2 "passed" results. --Daniel Carrero (talk) 03:52, 2 January 2016 (UTC)
Thanks, looks good. Benwing2 (talk) 03:58, 2 January 2016 (UTC)
Good. FYI: after the end date of a vote, if nobody has entered a decision yet, the shown decision is "decision?". --Daniel Carrero (talk) 04:05, 2 January 2016 (UTC)

Format of entries in the Reconstruction namespace

Now that we have the Reconstruction namespace, it would be really good to implement it. However, this is held up by a basic formatting decision that needs to be made: do we want "Reconstruction:Proto-Indo-European/albʰós" or "Reconstruction:albʰós#Proto-Indo-European"? I'd like it if we could make this decision in a quick poll rather than have a protracted formal vote. —Μετάknowledgediscuss/deeds 06:48, 2 January 2016 (UTC)
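As a rough illustration of the two formats, the Python sketch below builds both kinds of title. The function names are hypothetical and nothing here is actual Wiktionary code. The *kala example (Proto-Uralic and Proto-Finnic) is borrowed from a comment further down in this discussion.

```python
def subpage_title(language, form):
    """One page per language and form: Reconstruction:<language>/<form>."""
    return f"Reconstruction:{language}/{form}"


def section_title(language, form):
    """One page per form, one section per language: Reconstruction:<form>#<language>."""
    return f"Reconstruction:{form}#{language}"


def page_of(title):
    """Drop the #section part to get the page a title lives on."""
    return title.split("#", 1)[0]


for language in ("Proto-Uralic", "Proto-Finnic"):
    print(page_of(subpage_title(language, "kala")))
    print(page_of(section_title(language, "kala")))
```

Under the first format the two reconstructions sit on separate pages. Under the second they share the single page Reconstruction:kala, which is why that format would require merging entries and revisiting hard redirects.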

Support "Reconstruction:Proto-Indo-European/albʰós"

  1.   Support because this most closely matches our current format and avoids having reconstructed forms in multiple languages on a single page. —Μετάknowledgediscuss/deeds 06:48, 2 January 2016 (UTC)
  2.   Support for the same reasons —suzukaze (tc) 06:59, 2 January 2016 (UTC)
  3.   Support since reconstructions do not have a well-defined spelling — they vary by author and source, and ours may also be subject to revision. Two reconstructions in unrelated proto-languages having "the same spelling" is usually not a meaningful relationship. --Tropylium (talk) 20:41, 2 January 2016 (UTC)
  4.   Support per above and because the other alternative would require restrictions in use of redirects. Currently we can have hard redirects for exact equivalents in other notational systems, such as *-an to *-ą for Proto-Germanic, or PIE *a and schwa to various clusters including laryngeals in our notation. Most of these no-brainer redirects don't apply to any other proto-language, and a lot would have to be made into soft redirects. Chuck Entz (talk) 23:02, 2 January 2016 (UTC)
  5.   Support because, as noted below, merging creates new problems. I also think this is a good first step towards a one-language-per-page format. —CodeCat 00:35, 3 January 2016 (UTC)
    What problems? Renard Migrant (talk) 12:31, 3 January 2016 (UTC)
    @Renard Migrant: Problems with existing and future redirects — see Angr's comment below. —Μετάknowledgediscuss/deeds 16:03, 3 January 2016 (UTC)
  6.   Support --Vahag (talk) 08:47, 3 January 2016 (UTC)
  7.   Support Wyang (talk) 01:24, 4 January 2016 (UTC)
  8.   Support Hillcrest98 (talk) 01:29, 4 January 2016 (UTC)
  9.   Support this, and oppose the alternative, per Trop and Chuck. - -sche (discuss) 05:04, 4 January 2016 (UTC)
  10.   Support for many reasons I have given before. --WikiTiki89 15:59, 4 January 2016 (UTC)
  11.   Support per CodeCat, we should be looking at subpages in a big way. - TheDaveRoss 22:43, 5 January 2016 (UTC)
  12.   Support to make consensus clearer, but this does not mean I oppose the other option. —Aɴɢʀ (talk) 16:52, 8 January 2016 (UTC)
  13.   Support per CodeCat. —Pengo (talk) 22:27, 9 January 2016 (UTC) [I'd like to see us try this approach, but the other way is fine too. Pengo (talk) 00:30, 2 February 2016 (UTC)]
  14.   Support I changed my mind. —JohnC5 18:02, 31 January 2016 (UTC)

Support "Reconstruction:albʰós#Proto-Indo-European"

  1.   Support because it matches the entry format. --Daniel Carrero (talk) 06:52, 2 January 2016 (UTC)
    1.   Support —JohnC5 09:48, 2 January 2016 (UTC)
  2.   Support -- don't have a super-strong opinion here but this feels more natural, and I remember early on finding it difficult to figure out where proto-language entries were in Wiktionary with the old (i.e. current) format, which is similar to the Reconstruction:Proto-Indo-European/albʰós format. — Benwing2 (talk) 17:38, 2 January 2016 (UTC)
  3.   Support. I like this one. Would anything actually need to be merged? Renard Migrant (talk) 18:12, 2 January 2016 (UTC)
    Yes, for example *-kʷe and *-kʷe, or *pénkʷe and *pénkʷe. Even more when redirects are taken into consideration (*kapros and *kapros*kápros), and even more when plausible future redirects are taken into consideration (*oynos and plausible *oynos*óynos). I would be very surprised if no proto-language other than PIE had a form spelled *ne. —Aɴɢʀ (talk) 22:19, 2 January 2016 (UTC)
    Proto-Algonquian has *ne- (Proto-Algic has *n-, contrast PIE *n̥-). - -sche (discuss) 05:04, 4 January 2016 (UTC)
    It's most likely to come up in parent/daughter cases, as in Proto-Uralic *kala > Proto-Finnic *kala. Though it's debatable how much benefit there is to treating these kind of cases as different entries in the first place. (My stance remains that if Protolang A looks usually identical to its parent Protolang B, it should be treated as a dialect of the latter, rather than as a distinct reconstructed language entirely.)
  4.   Support per Daniel Carrero and Benwing2. Easier to find IMO. —Aryamanarora (मुझसे बात करो) 01:17, 4 January 2016 (UTC)
  5.   Support --profesjonalizmreply 13:17, 4 January 2016 (UTC)
  6.   Support Feels more natural to me. This matches our current mainspace format. A similar alternative would be to have the reconstructions directly in the mainspace and start with *, for which AFAIK the objections were rather weak. We absolutely should not be moving the normal mainspace to one entry per language, which I see CodeCat above say. Also, this format makes entries easier to find: there is going to be a shortcut for the namespace so the reader only types rec:albʰós, and there comes the entry; in fact, the reader only types rec:alb, and there appears a list of items that are completions of that; try typing ws:pers into the search box to see how this works for Wikisaurus. I acknowledge that there will probably be fewer redirects than for the alternative but I do not see anyone showing us how large the problem is; it is possibly relatively small. On a procedural note, if a plain majority prefers the Reconstruction:Proto-Indo-European/albʰós format, let's use it; it is the status quo anyway. --Dan Polansky (talk) 09:34, 9 January 2016 (UTC)

Support a different option (specify)

Comments etc.

  • Either one is fine with me. —Aɴɢʀ (talk) 09:34, 2 January 2016 (UTC)
    Statement of support added above. —Aɴɢʀ (talk) 16:52, 8 January 2016 (UTC)
  • It looks like we're going to have a lot of supporters of each form. Perhaps each supporter should note whether the other form is also acceptable to him/her or whether, on the contrary, he/she supports only the form indicated. That might help decide what the general consensus is.​—msh210 (talk) 22:40, 5 January 2016 (UTC)
  • As I mentioned above, I prefer the combined form but I'd be OK with the separated form. BTW I don't see the merging issue as a big problem; I've done lots of more complicated transformations using bots. The only thing that might be tricky is converting hard redirects into soft redirects; however, I imagine most of this can be automated as well. Benwing2 (talk) 22:17, 8 January 2016 (UTC)
    @Benwing2, CodeCat: I think we have a pretty solid consensus that's emerged, and we really ought to make the transition now that we have the namespace. Would you mind doing the honours? The trick is to ensure that relevant templates and modules are updated as soon as the moves are done, so we don't leave anything broken. —Μετάknowledgediscuss/deeds 04:59, 9 January 2016 (UTC)
    I'm not completely convinced there is a consensus given the relative number of users going the other way, but if everyone else thinks there's a consensus I'm fine with it. I can help move pages although I may be a bit busy until around the 13th or 14th of this month. If you need it before then, maybe User:CodeCat can help? BTW, CodeCat or anyone, how can I get a list of all appendix-only languages that should be moved to the Reconstruction space? Benwing2 (talk) 07:30, 9 January 2016 (UTC)
    On a procedural note, to want to close such a major decision after mere 7 days of voting seems improper to me. I would see 14 days as the minimum, or even 4 weeks typical of votes. Also, the discussion is very weak; I see very little in way of argument or links to specific places where arguments and reasoning can be found. --Dan Polansky (talk) 09:39, 9 January 2016 (UTC)
  • This wasn't a !vote for a reason — I just wanted to get it over with so that we could deal with the move, but the people who are capable of doing that (and, as far as I know, most interested in doing that) haven't, so I'm not sure whether there was a point to that. @CodeCat, Benwing2 —Μετάknowledgediscuss/deeds 03:39, 26 January 2016 (UTC)
    I guess I've been treating it like a vote and waiting for it to be formally closed -- it seems important enough and contested enough to merit this. In this respect I agree with Dan. I actually think it might not be a bad idea to treat it like a vote, and create a formal vote with a retroactive start date of say Jan 2 and an end date of say Jan 31; or at least set a fixed end point a few days out to make sure anyone else who wants to say something can do so -- people who have been quiet tend to perk up when deadlines approach. Benwing2 (talk) 03:51, 26 January 2016 (UTC)
    So, what's the plan currently? —JohnC5 18:02, 31 January 2016 (UTC)


Not that this is a real vote, but with 70%, I think the first option passes. Can we get a bot to start moving these? @CodeCat? —JohnC5 22:26, 28 February 2016 (UTC)
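The 70% figure appears to come from the numbered entries in the two lists above: 14 for the first option and 6 for the second, not counting the nested entry from a participant who later switched sides. A small Python check of that arithmetic, with the tallies treated as an assumption read off the lists:

```python
def support_share(first_option, second_option):
    """Fraction of all counted supporters who backed the first option."""
    return first_option / (first_option + second_option)


# Tallies assumed from the numbered entries in the two poll lists.
share = support_share(14, 6)
print(f"{share:.0%}")  # prints "70%"
```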

I'll need to make some changes to Module:links first so that links don't break too much. —CodeCat 22:58, 28 February 2016 (UTC)
Thanks! :) —JohnC5 23:01, 28 February 2016 (UTC)
I've made a small edit to Module:links. It now links to the Reconstruction: or Appendix: based on the existence of pages with that name. If the Reconstruction page exists, or if neither page exists, it links to that. It only links to the Appendix page if it exists but the Reconstruction page does not. So it should be no problem to move entries without leaving redirects behind, as links will update as soon as the move is made. Red links will automatically take you to the Reconstruction namespace. —CodeCat 00:50, 29 February 2016 (UTC)
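The rule described above can be sketched in Python as follows. This is only an illustration of the decision logic, not the actual Module:links code (which is a Lua module). The function name and the use of a plain set of existing page titles are assumptions.

```python
def link_target(title, existing_pages):
    """Pick the namespace a reconstruction link should point to.

    The Appendix page is used only when it exists and the Reconstruction
    page does not. In every other case, including when neither exists,
    the link goes to the Reconstruction namespace.
    """
    reconstruction = f"Reconstruction:{title}"
    appendix = f"Appendix:{title}"
    if appendix in existing_pages and reconstruction not in existing_pages:
        return appendix
    return reconstruction


title = "Proto-Indo-European/albʰós"
print(link_target(title, {f"Appendix:{title}"}))        # not yet moved
print(link_target(title, {f"Reconstruction:{title}"}))  # already moved
print(link_target(title, set()))                        # red link
```

Because the choice depends only on which pages exist, a link switches to the new location as soon as a page is moved, whether or not a redirect is left behind.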
Looks good to me! —JohnC5 01:15, 29 February 2016 (UTC)
I modified {{reconstruction}} with a link that automatically takes you to the "move page" form with everything already filled in. That should make it easier to move the pages. —CodeCat 01:20, 29 February 2016 (UTC)
@CodeCat: It seems that the box for leaving a redirect behind is unchecked by default. I strongly believe that there should be redirects left behind, because otherwise we are leaving the internet with a bunch of dead links for no good reason. There will be no conflicts if we leave them, so why wouldn't we? —Μετάknowledgediscuss/deeds 02:28, 29 February 2016 (UTC)
I made it unchecked on purpose. I don't like leaving leftovers. —CodeCat 02:29, 29 February 2016 (UTC)
Okay, but it seems more important that we not produce dead links to Wiktionary, because that reduces traffic in the short term. —Μετάknowledgediscuss/deeds 02:31, 29 February 2016 (UTC)
Dead links are inevitable, they're a normal part of website evolution. I don't think we should give ourselves the duty to clean up other people's links for them. —CodeCat 02:34, 29 February 2016 (UTC)
It's not a matter of duty; it's a matter of whether we want more people using this website or fewer. —Μετάknowledgediscuss/deeds 02:36, 29 February 2016 (UTC)
We can delete the redirects in a year or so. Let's leave them for now. --WikiTiki89 16:30, 29 February 2016 (UTC)

Color-coded EL

FYI: I was curious to see how much of WT:EL was voted and how much was unvoted, so I created User:Daniel Carrero/Color-coded EL.

Turns out it's about 75% unvoted. --Daniel Carrero (talk) 08:36, 2 January 2016 (UTC)

Interesting. Could you select less saturated colors to enhance readability? I get a headache just thinking about the page as it is. DCDuring TALK 13:16, 2 January 2016 (UTC)
Absolutely. I won't do that right now because I'm on my cell phone, but if anyone wants to do a search/replace on that page, go ahead. Just look for background-color: red and background-color: green. --Daniel Carrero (talk) 14:02, 2 January 2016 (UTC)
  Done, but imperfectly. My color selections were not quite pale enough, IMO. DCDuring TALK 18:12, 2 January 2016 (UTC)
Perhaps we should also distinguish between parts that have a vote created for them, from parts that don't, instead of marking them all red. --WikiTiki89 16:02, 4 January 2016 (UTC)
This is nice, thanks Daniel. I updated one section that I recalled a vote for, I assume that is Ok? - TheDaveRoss 17:11, 4 January 2016 (UTC)
@DCDuring: That's great, thanks.
@TheDaveRoss: That's great, too, thanks.
@Wikitiki89: OK, I did as you suggested. --Daniel Carrero (talk) 00:20, 5 January 2016 (UTC)
If people prefer just the green/red format, the page can be converted back. But I wonder if we would be able to create votes for all the unvoted sections later, thus defeating the point of having the additional yellow (I mean, "khaki") color for unvoted sections that have votes created for them. Better yet if they all pass, then EL would finally be 100% voted and thus User:Daniel Carrero/Color-coded EL would be deleted, lest it become a completely green, thus useless, version of EL. --Daniel Carrero (talk) 06:18, 5 January 2016 (UTC)
Actually, all of ELE was voted in. (I'm only half kidding.)​—msh210 (talk) 16:43, 5 January 2016 (UTC)
LOL, true. That vote even has "Replacing the contents of Wiktionary:Entry layout explained by the contents of (another revision of the same policy).", sounds pretty serious. (I'm half kidding, too.) --Daniel Carrero (talk) 19:16, 5 January 2016 (UTC)
Or perhaps the replacement of the voted on contents with identical new contents negates the voting done for the original content? Maybe we don't actually have a CFI or ELE. (I never kid.) - TheDaveRoss 19:19, 5 January 2016 (UTC)
We don't? Good. I always wanted to make an entry for a8idsah09d8has9dh, but CFI was in the way! --Daniel Carrero (talk) 19:59, 5 January 2016 (UTC)

English possessives

My earlier posting on this: Wiktionary:Tea room/2011/April#English possessives.

A number of words indicate the possessive in English. These include have, of, -'s, their (my, et al.), and theirs (mine, et al.). For some of those words, we have a simple "marks the possessive" or similar sense, which is not very informative. For others of those words, we have specific senses — but those senses are not the same for two possessive words, though they may overlap.

What we need IMO is one central location for definitions of the possessive, and for all the possessive words to link thereto. I think this should be an appendix, say appendix:English possessives, and that all the possessive senses of these words (which is not always all the senses) should be referred to the appendix in lieu of the definition lines. A very rough draft toward such an appendix is at [[User:Msh210/English possessives]]; please feel free to edit it.

I would love to see what others think of this.​—msh210 (talk) 08:59, 5 January 2016 (UTC)

Seems like a good idea. Where would the links be? Under "See also" or something more likely to be clicked? DCDuring TALK 11:30, 5 January 2016 (UTC)
I'm thinking it would be a sense. For example, for my, we now have four senses ("Belonging to me", "Associated with me", "Related to me", "In my possession": those are not summaries but the entire definition lines). They'd be replaced by a single "{{n-g|Possessive of me, used before a noun phrase: see Appendix: English possessives}}" or some such. Likewise, for your we have four senses, of which the last two ("A determiner that conveys familiarity…" and "That; the specified") would remain in place but the first two ("Belonging to you; of you; related to you") would be replaced by something like above. Likewise, several of the subsenses of of could be replaced by something like "{{n-g|marking the possessive, see appendix…}}" and usexes.​—msh210 (talk) 15:13, 5 January 2016 (UTC)
Hmm. If I had to coin a slogan, it would be "case envy". Think about the last item in your list: "concatenation of noun phrases" (to which you've put ????). This is simply using a noun phrase to qualify another noun phrase, and there is essentially no limit to the nature of the relationship to which this refers, and any attempt at listing the "meanings" is doomed. (I really suggest you should remove this from the list.) The other terms (constructions) you have listed are very close to what might be "Genitive case", if we had a proper case system (or a system of unvarying particles, like Japanese の (no)), and trying to list the "meanings" of the genitive is obviously hopeless. After all, in many cases, it is a purely grammatical construct: if I say "I approach this question with an open mind", then want to nominalise this in a later reference, I say "My approach..." There is no more a "meaning" of "my" here than there is a "meaning" of "I" in the first sentence.
All that said, I think an appendix on possessives would be a Very Sensible Idea. It should show the pronoun table, rules for apostrophe-ess, including noting that this is not normal morphology, because you can say things like "The King of Persia's right elbow". It should also point out the way in which these can be seen as nominalisations of the verb "to have". (I learned about four foreign languages before I met two which don't have one, and it is a very enlightening experience.) I hope this makes up for the negative tone of my first paragraph. Imaginatorium (talk) 15:57, 5 January 2016 (UTC)
Note that although concatenation of two nouns has several meanings, one of them is the possessive (at least according to w:English possessive); the same is true for of and have and even -'s, all of which have meanings other than the possessive.​—msh210 (talk) 16:35, 5 January 2016 (UTC)
What does "the possessive meaning" mean? (Serious question: I think it is semantically empty.) I can't actually see where it makes the claim about concatenated nouns "meaning" "possessive", but I'm sure one can find an example, for some values of "possessive meaning" at least. Imaginatorium (talk) 17:35, 5 January 2016 (UTC)
@Imaginatorium Re "possessive meaning": Good question. I guess when I think of possessive meaning, I think of meanings typically held by both -'s and his. But doubtless linguists have done some work in this area already and we don't need to reinvent the wheel. There's some stuff to read in the "Semantics" section of the WP article.   Re WP: It's in the "As determiners" section: "the system failure, using system as a noun adjunct rather than a possessive" (where "possessive" means with -'s morphology, not with possessive meaning).​—msh210 (talk) 18:39, 5 January 2016 (UTC)

Definitions at town and city names


In many town-name entries there are lists similar to this; I am wondering if it would be better to replace all of the specific towns with more generic definitions a la given and family names. I am thinking of something along the lines of "A town or city name in the US, Canada and England." Are the specific definition lines of value? It seems to be treading the line of encyclopedic, and Wikipedia's disambiguation pages do a better job I think. - TheDaveRoss 14:48, 8 January 2016 (UTC)

Plenty of dictionaries include these kinds of lists, and it appears Wiktionary is following tradition in this case. {{place}} could be useful in these cases though. Personally, I'm fine with generic definitions or lists. —Aryamanarora (मुझसे बात करो) 15:50, 8 January 2016 (UTC)
I agree that this is reduplication of WP content, and not helpful in a dictionary sort of way. Traditional dictionaries do often include geographical names, either in the body, or in an appendix, but then, they didn't have the option of a WPlink. In at least some cases there could be an etymological overview -- how this name spread (e.g. from England to the former colonies, sort of thing). Imaginatorium (talk) 04:49, 9 January 2016 (UTC)
See Talk:Paris. The definitions of Woodstock could be replaced by "Any of a number of cities and towns in the US, Canada and England", or subsenses could be listed under that definition, preferably in a fold-up list. @Lo Ximiendo is currently adding definitions of specific places, many of them useful, but a comprehensive list is not realistic for the most common place names. --Makaokalani (talk) 12:16, 9 January 2016 (UTC)
To treat all of the Parises as if they were lexically equivalent is silly, as is making Woodstock, Troy, Amsterdam, Geneva, Birmingham, Rome, Utica, Syracuse, all in New York State; Moscow, ID; and London, ON equivalent to all the others. Often one, two, or a few of the places are distinguished in general or regional discourse and might merit inclusion on some lexical grounds. Our attributive-use standard for including proper nouns made lexical sense to me, but apparently not to others. We are stuck with voting on these things, one at a time, since the language of that standard was explicitly voted down without any comparably effective replacement. DCDuring TALK 13:59, 9 January 2016 (UTC)

Category:en:Rivers in Canada

@CodeCat I made Category:en:Rivers in Canada. Is there a way to enable this empty category into something else? Thank you in advance. --KoreanQuoter (talk) 04:23, 9 January 2016 (UTC)

-èd words

For most languages, we don't include the forms with macrons and stress markers where these are usually ignored in writing. So... where does this leave entries like wingèd, learnèd and cursèd? (If nothing else, we need to add pronunciations to these entries.) Smurrayinchester (talk) 17:33, 9 January 2016 (UTC)

I don't think these are 'usually ignored' just rare, obsolete forms that were used for a relatively short period. Renard Migrant (talk) 18:43, 9 January 2016 (UTC)
Well, some scholarly and pseudo-scholarly material still uses them, e.g. in some critical versions of Shakespeare's texts. I say we keep 'em. —Μετάknowledgediscuss/deeds 18:46, 9 January 2016 (UTC)
But even then they're independent stress markers of poetical criticism and not part of the actual spelling of the word, are they not? Sounds comparable to the accents used in Russian text books for me. Korn [kʰʊ̃ːæ̯̃n] (talk) 20:26, 9 January 2016 (UTC)
True, but is it not worth keeping them in the dictionary in case someone wants to look one up to see what the stress marker means? (We should have pronunciation for these so that people can see the difference between them and the unaccented word.) Andrew Sheedy (talk) 21:49, 9 January 2016 (UTC)
The entry for "winged" doesn't have a Pronunciation section, but if it did, it would surely just list the normal one. Then an entry (or at least a note) for the accented version should mention that this is not just a poetic spelling (it isn't, really!) but a spelling indicating a poetic pronunciation. Imaginatorium (talk) 06:52, 10 January 2016 (UTC)

Russian cognate

@ cognate, under the linguistics examples, it has:

"English mother is cognate with Greek μητέρα ‎(mētéra), German Mutter, Russian маmь ‎(matʹ) and Persian مادر ‎(madar)."

My focus is on the Russian term: it appears as маmь, although it is linked to мать. It doesn't have an alias in the arguments. Why is this? Does anyone else see what my browser is rendering, or is it just me? Leasnam (talk) 01:46, 10 January 2016 (UTC)

Please rephrase your question; it doesn't make sense. мать (matʹ) is linked as it should be. What alias do you expect? Inflected forms? It's a lemma. Stresses? Not required for monosyllabic Russian words.--Anatoli T. (обсудить/вклад) 04:30, 10 January 2016 (UTC)
I think he's just saying that it looks strange on his browser. I don't see that, but m is the Russian representation of т in an italic font, so possibly there's some weird font issue on your system? Benwing2 (talk) 05:14, 10 January 2016 (UTC)
Hi - Thank you, all! Yes, it appears as an "m", but mousing over clearly shows the "t". OK, it's just me (and perhaps some others) then, not an issue of concern :) Leasnam (talk) 19:50, 10 January 2016 (UTC)
The question makes perfect sense: you are seeing a script lower-case Cyrillic 'T', which looks like 'm', instead of a "regular" lower-case Cyrillic 'T', which looks like a small-cap 'T'. I see the text in a sloped sans-serif font (the browser says the font is "DejaVu Sans Oblique"), so I see a sloped version of the normal мать. It would help if we knew what browser, OS, font selection etc etc you are using.
OS is old, Windows 7. Browser is Chrome. Leasnam (talk) 19:53, 10 January 2016 (UTC)
I think this sort of problem is worth considering very carefully. The problem is that (of course) stuff like HTML and CSS were conceived by clueless monolinguals (I mean clueless about anything not monolingual), so it is not easy to avoid unpredictable odd side-effects. Do we have any HTML/CSS gurus who could understand why I see a sloped font and @Leasnam sees an "italic" font, which looks like script? (I can't see anything in the CSS rules which the Firefox Inspector shows which would make it "sloped" or "italic".) Imaginatorium (talk) 06:49, 10 January 2016 (UTC)
Cyrillic т (t)'s cursive (and thus italic, too) form resembles an m. Observe:
  • т
  • т
Copy and paste "Russian маmь ‎(matʹ)""Russian мать ‎(matʹ)" into Notepad; there is no "alias".
suzukaze (tc) 06:57, 10 January 2016 (UTC)
That exactly demonstrates the problem: you cannot discuss this stuff by pasting in the problem text and saying "Look!" -- I see something quite different from what you see. In fact the Wikipedia double-quote trick does not generate italics (at least for me), but generates a sloped sans-serif font. The text you have presumably copied above, i.e. "маmь" is not Russian at all: it's a sequence of Cyrillic ма followed by Roman m followed by a Cyrillic ь. (I don't have "Notepad", and I don't know what you mean by "alias".) Imaginatorium (talk) 07:53, 10 January 2016 (UTC)
I wasn't replying to you, I'm sorry if it seemed that way D:
But clearing things up: 1. Leasnam mentioned an "alias". 2. my "Russian маmь ‎(matʹ)" was copy-and-pasted from Leasnam's message, which contains mixed Cyrillic/Latin rather than pure Cyrillic (maybe that was a bad idea) —suzukaze (tc) 07:56, 10 January 2016 (UTC)
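One way to confirm a mixed-script string like the one discussed above, independent of fonts or rendering, is to look at the Unicode character names. The Python sketch below uses the standard `unicodedata` module. The helper name `scripts_used` is hypothetical, and taking the first word of each character name as the script is a simplification that happens to work for Latin and Cyrillic letters.

```python
import unicodedata


def scripts_used(text):
    """List the scripts of the letters in text, in order of first appearance.

    The script is approximated by the first word of the Unicode character
    name, e.g. 'CYRILLIC SMALL LETTER EM' -> 'CYRILLIC'.
    """
    scripts = []
    for ch in text:
        if not ch.isalpha():
            continue
        script = unicodedata.name(ch).split()[0]
        if script not in scripts:
            scripts.append(script)
    return scripts


# Escapes are used so the code points are unambiguous in any font.
mixed = "\u043c\u0430m\u044c"      # Cyrillic em, a; Latin m; Cyrillic soft sign
pure = "\u043c\u0430\u0442\u044c"  # Cyrillic em, a, te, soft sign

print(scripts_used(mixed))  # ['CYRILLIC', 'LATIN']
print(scripts_used(pure))   # ['CYRILLIC']
```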
I've reformatted the example sentences because if you have running text in italics and then there's something that would normally be italicized, the thing to do is put it back in roman. —Aɴɢʀ (talk) 15:22, 10 January 2016 (UTC)
By "alias" I am referring to the display text (arg 3?), e.g. {{m|en|hello|HELLO}}, where "HELLO" acts as an alias (maybe that's not the correct term, sorry) Leasnam (talk) 19:58, 10 January 2016 (UTC)
I see it as мать now Leasnam (talk) 21:22, 10 January 2016 (UTC)
Just remember that Russian uppercase М and lowercase м look alike except for height (no rounded humps). Lowercase м does not look like т, and т cannot be a case of М; т is lowercase italic Т. —Stephen (Talk) 22:07, 10 January 2016 (UTC)

crimson in Template:table:colors

An anon (Special:Contributions/ added crimson to Template:table:colors and language subtemplates.

As far as I'm concerned, I think I'd allow it, assuming it was all done correctly. (I'd just move crimson to someplace else in the order of colors) I have an inclusionist point of view concerning that template. I probably wouldn't mind having that table with 50, or maybe 300 colors, it could be made collapsible to save space.

Just my two cents. I'd understand if other people want to delete "crimson", since it's close enough to red. --Daniel Carrero (talk) 07:17, 10 January 2016 (UTC)

The problem with having 300 colours is that the swatches become quite meaningless. Everybody (almost!) can agree on a red, a yellow, and a green, but once you get into the silly territory and have to pick separate swatches for grey, "timberwolf", "silver birch", and "shiny new saucepan", they aren't helpful or meaningful any more. Equinox 14:00, 11 January 2016 (UTC)
I vote on removing the following: strongly: crimson, cream, azure; weakly: teal, indigo. — Ungoliant (falai) 14:24, 11 January 2016 (UTC)
  • There shouldn't be more than 18 colors in that thing: the 16 HTML colors (red, maroon, yellow, olive, lime, green, cyan, teal, blue, navy, magenta, purple, white, silver, gray and black), then maybe indigo and orange, but anything beyond that's too much. @Daniel Carrero, if people want the full 300, there are appendices that list all of them, so there's no point in Template:table:colors duplicating that. Purplebackpack89 15:53, 11 January 2016 (UTC)
  • Don't see why their being HTML colours matters. Surely brown is a much more basic and useful colour than "teal" or "maroon", which are basically blue and red. Equinox 19:04, 11 January 2016 (UTC)
    Setting aside the issue that teal and maroon are actually a long ways away from blue and red (at least HTML-wise), I have an alternate proposal: black, blue, brown, cyan, green, gray, magenta, orange, pink, purple, red, white and yellow. Purplebackpack89 19:53, 11 January 2016 (UTC)
    If we are going to use only basic colors, why magenta? The others seem fine. --Daniel Carrero (talk) 19:55, 11 January 2016 (UTC)
    Magenta (and cyan) give us the three secondary colors of light, while the rest are the colors you're most likely to find in a relatively small box of crayons, markers or colored pencils. Purplebackpack89 20:11, 11 January 2016 (UTC)

Adding a parameter or two to {{m}} and {{l}}

What do other people think of the idea of adding a parameter to {{m}} and {{l}} that would display the language name? At the moment, in etymology sections for example, if we want to write "compare German Mutter" we write compare {{etyl|de|-}} {{m|de|Mutter}}. Wouldn't it be nice to write compare {{m|de|Mutter|name=1}} instead? Not only is it fewer keystrokes, it ensures that the language named and the language tagged are the same, so there won't be any mistakes like compare {{etyl|fr|-}} {{m|de|Mutter}}, which do happen from time to time. For {{l}}, since it's more often used in lists, the language name could be preceded by a colon, as is the case in Descendants lists. For example muoter#Descendants could just list * {{l|de|Mutter|name=1}} instead of * German: {{l|de|Mutter}}. This would have the same two advantages: fewer keystrokes, elimination of the chance of a mismatch. Any thoughts/ideas for improvement? —Aɴɢʀ (talk) 15:49, 10 January 2016 (UTC)

  • Support. Great idea; I wish I'd thought of this myself. —Μετάknowledgediscuss/deeds 15:54, 10 January 2016 (UTC)
  • Support. This would relieve some of the pressure for misuse of {{inh}} and {{der}} in etymology sections, not to mention simplifying that {{etyl|xyz|-}} {{m|xyz|blah blah blah}} typing exercise. Chuck Entz (talk) 16:09, 10 January 2016 (UTC)
  • Oppose, redundant to {{cog}}. —CodeCat 16:19, 10 January 2016 (UTC)
    • I was unaware of {{cog}}; however, {{cog}} still isn't right for things like Descendants lists. —Aɴɢʀ (talk) 16:48, 10 January 2016 (UTC)
      • I can only support it for descendant lists if we also change the format of translation tables accordingly. —CodeCat 16:51, 10 January 2016 (UTC)
        • As in * {{t|de|Mutter|f|name=1}} instead of * German: {{t|de|Mutter|f}}? I'm down with that. —Aɴɢʀ (talk) 19:28, 10 January 2016 (UTC)
          • It's clearly a good idea as it prevents mistakes where you don't know which is the mistake, the language name or the language code. The template's already expanding the language code into a language name, so doing it twice shouldn't pose a problem. Renard Migrant (talk) 16:27, 11 January 2016 (UTC)
  • Support. Andrew Sheedy (talk) 23:12, 11 January 2016 (UTC)
  • Oppose only because I think it would look nicer to have a separate template for that, something like {{name-l|de|Mutter}} and {{name-t|de|Mutter|f}} (maybe that's not the best naming scheme, but I definitely think that whatever it is should be added to the left of the l or t). --WikiTiki89 23:21, 11 January 2016 (UTC)
    • I would support this over the other option. Andrew Sheedy (talk) 02:32, 12 January 2016 (UTC)
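
For readers unfamiliar with the templates, the behaviour proposed at the top of this section can be sketched roughly as follows. This is only an illustration in Python, not how {{m}} or {{l}} are actually implemented; the function names, the plain-text output, and the two-entry language table are assumptions made for the example.

```python
# Hypothetical lookup table; the real templates draw on Wiktionary's own language data.
LANGUAGE_NAMES = {"de": "German", "fr": "French"}


def render_mention(code, term, name=False):
    """Imitate {{m|code|term}}; with name=True, prefix the language name."""
    language = LANGUAGE_NAMES[code]  # raises KeyError for an unknown code
    return f"{language} {term}" if name else term


def render_list_link(code, term, name=False):
    """Imitate {{l|code|term}}; with name=True, prefix 'Language: ' as in Descendants lists."""
    language = LANGUAGE_NAMES[code]
    return f"{language}: {term}" if name else term


print("compare " + render_mention("de", "Mutter", name=True))
print("* " + render_list_link("de", "Mutter", name=True))
```

Because the displayed name is looked up from the same code that tags the term, a mismatch of the {{etyl|fr|-}} {{m|de|Mutter}} kind cannot occur in this scheme.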

Away until Thursday.

I will be away for a few days. Normally, I would ask that the dictionary be finished by the time I get back, but this time I am going to be far more modest, and will merely ask that you get all of the WT:RFD and WT:RFV discussions finished and cleaned up. Of course, this includes WT:RFDO. Cheers! bd2412 T 18:47, 10 January 2016 (UTC)

This joke never stops getting old. —Aɴɢʀ (talk) 21:20, 10 January 2016 (UTC)
Perhaps it never stopped to begin to stop being a joke? --WikiTiki89 02:39, 12 January 2016 (UTC)

Let's make a WT:FUN competition in which the last person to edit Wiktionary and make it complete wins. --Daniel Carrero (talk) 23:11, 14 January 2016 (UTC)

I have an idea for a project: "A visual representation of the etymology of words using trees"

Hi everyone, I am new here. I am developing a project - "etytree" (click here) - to visualize the etymological tree of a word using an interactive tool. Basically I have built a graphical interactive web page using d3.js - a JavaScript library for manipulating documents based on data - where you search a word and then you visualize the etymological tree of the word (ancestors, cognates, on the same tree). Right now I have only developed a demo for 10 words and my next step will be to build an extractor of etymological relationships from Wiktionary. It won't be easy but definitely interesting.

A screenshot of the etymological tree of the English word 'butter' as produced by 'etytree'

I have some ideas on how to do it but I would like to get some feedback from people that are experts in the field / interested in the topic + I'm writing a grant proposal (click here) because I need funds.

Do you think this is the right place to describe my project and ask for feedback? Thanks! --Epantaleo

This is definitely a fascinating idea. Thanks for describing it. Not sure if this is the right place to ask for feedback, but if not, I'm not sure where is better. If you have specific questions about extracting the etymological links (which you are right won't be easy), you might ask at the Grease Pit. Benwing2 (talk) 09:33, 12 January 2016 (UTC)
This is really cool! UI could use some polishing (the ISO codes are a little small) but a great concept. —Aryamanarora (मुझसे बात करो) 23:01, 12 January 2016 (UTC)

Hi all, in particular Benwing2 and Aryamanarora (your feedback was really helpful and encouraging!). I have submitted the grant proposal and now need your support. If you think the project is interesting and feasible, and/or if you feel you would like to volunteer, please post at the end of the new grant page here! Thanks a lot in advance. PS: I will work on the UI soon; for now I have been working on the Java extraction tool.
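
As a rough idea of what the first step of such an extraction might look like, here is a minimal sketch that pulls (language code, term) pairs out of the link templates in an etymology section. It is not the etytree extractor (which is described above as a Java tool); the regular expression, the function name, and the sample wikitext are assumptions, and real entries contain nested templates and named parameters that this pattern does not handle.

```python
import re

# Matches {{m|code|term...}}, {{l|code|term...}} and {{cog|code|term...}}.
# Deliberately simplistic: nested templates and named parameters are not handled.
LINK_TEMPLATE = re.compile(r"\{\{(m|l|cog)\|([^|{}]+)\|([^|{}]+)[^{}]*\}\}")


def extract_links(wikitext):
    """Return (language code, term) pairs for each link template found."""
    return [(match.group(2), match.group(3)) for match in LINK_TEMPLATE.finditer(wikitext)]


# Made-up sample text in the style of an etymology section.
sample = "From {{etyl|gmh|de}} {{m|gmh|muoter}}; compare {{cog|nl|moeder}}."
print(extract_links(sample))
```

Each pair could then become a node in the tree, with the relationship (ancestor or cognate) inferred from the template name and the surrounding wording, which is likely to be the hard part.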

Wiktionary:Word of the day/January 11

Could an admin please change the word of the day slightly, from "A snake" in sense 1 to "Any snake". The current formulation is confusing - there is still a snake called the "adder"; what has changed is calling other snakes "adders". Smurrayinchester (talk) 11:19, 11 January 2016 (UTC)

  •   Done I also replaced the red links with blue, one to a synonym that we have and another to a WP link (via {{vern}}). DCDuring TALK 13:49, 11 January 2016 (UTC)

Filling the CFI donut hole

At present, it's possible for an abbreviation, a derivative or a slang term of/for certain terms to be included as an entry, but not for the term that is being abbreviated, derived or corrupted to itself have an entry. This "donut hole" seems nonsensical to me, and we should expand CFI to fix it. Something along the lines of "Any word or phrase that has an abbreviation or derivative term derived from it that passes CFI will also pass CFI." IMO, no harm will be done to the project in allowing these additional entries. Will start a vote on it if others think filling this hole is a good idea. Purplebackpack89 15:33, 11 January 2016 (UTC)

I disagree; when the derived term is a word, include it, sure, but things that aren't words shouldn't get a free pass based on having a word derived from them. It's backwards thinking (as in literally, in the reverse order). Renard Migrant (talk) 16:25, 11 January 2016 (UTC)
At first blush I agree with Renard, but I might not be thinking of the same sorts of terms as you are so I would like some examples of types of entries. I am thinking of things like , which is best handled with a link to Wikipedia I think. - TheDaveRoss 16:38, 11 January 2016 (UTC)
I think we're talking about things like AFAICT, the existence of which should (in Purplebackpack's view) or should not (in Renard Migrant's view) automatically permit the existence of as far as I can tell. —Aɴɢʀ (talk) 18:57, 11 January 2016 (UTC)
I would not favor the automatic inclusion extending to proper nouns, eg, inclusion of FTC should not lead to automatic inclusion of Federal Trade Commission. I haven't really looked for more subtle faults in the proposal, though I expect there to be quite a few. DCDuring TALK 23:04, 11 January 2016 (UTC)
FBI (Federal Bureau of Investigation) is as good in terms of an example as AFAICT. Purplebackpack's of course free to clarify but I do think he means literally everything. Also I don't think the distinction is nonsensical, it has clear boundaries and a clear rationale. Renard Migrant (talk) 23:15, 11 January 2016 (UTC)
I do mean everything, in the case of acronyms or other derived terms. When I created this, I was thinking of acronyms of things that have never been created, were deleted, or (in the case of field goal percentage) were about one vote away from deletion. At the present time, CFI as worded is too restrictive towards words like these. Yes, the result may create some words of dubious value to the project, but better to have a CFI that's overly broad than one that's overly restrictive. There may be quite a few, but I don't think it's as many as DCDuring thinks...and remember, whenever one of them goes to RfV, you'd be RfVing for two, because if the derived word fails RfD or RfV (and thereby fails CFI), the root does too. Purplebackpack89 00:39, 12 January 2016 (UTC)
Strongly oppose. This would force us to include many unnecessary phrases, like as far as I can tell, rolling on the floor laughing, and fucked up beyond all recognition. Oh and rolling on the floor laughing my fucking ass off (too bad I couldn't find cites for ROFFLMFAO, or that would have been one word longer). Not to mention, this would get very messy if the etymology is unclear or disputed: Should we include all of laughing out loud, laughing online, and lots of laughs? Should we go as far as to include fornication under consent of the king and for unlawful carnal knowledge? --WikiTiki89 01:29, 12 January 2016 (UTC)
I dispute your claim that those are "unnecessary". As for "should we...", I personally think we should. Also, why do you guys insist on drawing from the longer and more absurd side of the spectrum, rather than accepting that those are the price to pay for a great many shorter, more important and less controversial words and phrases? Purplebackpack89 02:30, 12 January 2016 (UTC)
Care to give an example of a "shorter, more important and less controversial" word or phrase that would otherwise fail CFI? --WikiTiki89 02:32, 12 January 2016 (UTC)
What's so bad about having laugh out loud, or true shooting percentage? Or any number of other things that are slipping my mind at the moment. As I've said, you need to look at the big picture, not just random seven word entries that happen to pop into your head. Your argument boils down to "I don't think we should have those entries", even though there are no technical limitations preventing us from having them. And don't say, "more entries means more vandalism", because it doesn't...more editors means more vandalism. Purplebackpack89 05:38, 12 January 2016 (UTC)
Also, let me correct a misconception that Wikitiki, Equinox and DCDuring routinely mention whenever expanding CFI comes up. Expanding CFI doesn't force anything. It doesn't mean that people have to go out and create some random seven word entry. It doesn't even mean people will go out and create that entry. Purplebackpack89 05:42, 12 January 2016 (UTC)
If laugh out loud should be included (and perhaps it should, as some kind of interjection), it should be included on its own merit, and not simply due to the existence of LOL. My point is exactly that your suggestion would include everything that would not merit inclusion on its own. Anything useful that would be covered by your suggestion would be covered by other CFI criteria. And perhaps we should add more criteria to CFI, but these criteria should judge terms on their own and not based on other terms. --WikiTiki89 15:45, 12 January 2016 (UTC)
@Wikitiki89 You're ignoring the perceived problem this thread seeks to address: that some definitions we have (such as LOL) are dependent on other definitions (such as laugh out loud). Can I at least get you to concede that people who don't know what laugh out loud means won't be able to discern what LOL means either? Once you've conceded me that, any chance I can get you to concede that this problem is compounded by the fact that many acronyms, abbreviations and corruptions aren't defined except by the thing they are abbreviating? And maybe you'll acknowledge that it isn't easy to figure out what laugh out loud means using only Wiktionary? (I doubt the majority of editors would think to look for laugh + out loud, laugh + out + loud doesn't really give you the proper definition and many wouldn't have the patience to look it up anyway). Purplebackpack89 17:53, 12 January 2016 (UTC)
I'm not going to concede anything. If LOL is inadequately defined, that's a problem only with the entry LOL. If laugh out loud is idiomatic and needs explaining, then the existence of LOL is still irrelevant. --WikiTiki89 18:20, 12 January 2016 (UTC)
So you believe it's somehow possible that you can know what LOL means even if you don't know what laugh out loud means? That doesn't seem to make any sense at all! If LOL stands for laugh out loud, and you don't know what laugh out loud means, you can't use the expression LOL properly. Also, if we had a definition for laugh out loud, we could just link to it and that would solve the problem of LOL's definition being inadequate. We have the functionality of linking entries to each other, might as well use it. Purplebackpack89 18:34, 12 January 2016 (UTC)
Let me rephrase. There are two possibilities:
  1. laugh out loud is SOP, which means that defining it doesn't help anyone understand anything that they couldn't have understood from looking up its parts.
  2. laugh out loud is not SOP, which means that your proposal is not necessary for it to have an entry.
In either case, I think defining LOL as "laugh out loud" is inadequate. --WikiTiki89 18:52, 12 January 2016 (UTC)
Setting SOP (which is bullshit, I might add; people have never proven that readers CAN actually make those connections) aside, if defining LOL as "laugh out loud" is inadequate, there are two ways to fix it:
  1. Add more to the definition of LOL
  2. Create the definition of laugh out loud and link to it.
I advocate the latter. Purplebackpack89 19:30, 12 January 2016 (UTC)
What I'm saying is that even if we had a research-paper length definition of laugh out loud, defining LOL as "laugh out loud" would still be inadequate. They are not the same thing. So yes, what I'm saying is we should add more to the definition of LOL. If SOP is "bullshit", then campaign against that. You still won't get much support, but at least you'd be being honest about what you want. Stop beating around the bush with these ridiculous proposals. --WikiTiki89 19:41, 12 January 2016 (UTC)
If I believe a proposal is a good idea, I'll make it. Equinox suggested I make this proposal after an RfD (or was it an RfV?) vote I made. I believe I've tried one-shot SOP dismantling already; that didn't work, so I'm settling for dismantling it piecemeal, starting with the most egregious examples. Purplebackpack89 05:14, 13 January 2016 (UTC)
The proof that human beings can make SoP "connections" is in the fact that people who learn a language can produce original sentences as well as individual words. Jesus H. Christ. Equinox 21:14, 12 January 2016 (UTC)
Equinox, that a) assumes that everybody who uses this dictionary has enough comprehension of English to get to the construction of proper sentences, b) two-word phrases are all constructed the exact same way sentences are, and c) a person would never voluntarily look up a two-, three- or four-word phrase unless it passed our CFI. I can't in good faith make any of those leaps, sorry. Purplebackpack89 05:14, 13 January 2016 (UTC)
Chances are that if someone is using an English dictionary, they know enough English to construct proper sentences, or at least understand what is proper and what isn't. I'm not perfectly fluent in French, but whenever I stumble across a French phrase with which I am not familiar, I am nearly always able to intuitively determine whether I should look up an individual word or the entire phrase. I think you underestimate people's intelligence, and I don't think you understand that understanding English is a prerequisite for using a dictionary written entirely in that language. Andrew Sheedy (talk) 05:47, 13 January 2016 (UTC)
"laugh out loud" being SOP, Purplebackpack89's reasoning could be used to argue the inclusion of any SOP construction, couldn't it? Suppose there's a person who does not understand properly the sentence "I see dead people" (which is SOP). That being SOP, he/she should look for I + see + dead + people. Suppose we are discussing whether we should have an entry I see dead people. The argument goes this way: if there's an abbreviation ISDP for it, keep that entry, otherwise delete that entry. --Daniel Carrero (talk) 08:16, 13 January 2016 (UTC)
  • Oppose per Wikitiki. Our dictionary would be rendered a laughingstock were we to include these. This is not the right way to go about making CFI more inclusive. —Μετάknowledgediscuss/deeds 02:34, 12 January 2016 (UTC)
    We're a laughingstock as it is because people can't find the definitions of words we need, @Metaknowledge. As usual, people seem to ignore the fact that if a person can't find the definition that they are looking for, they leave Wiktionary, find it somewhere else, and probably continue using that somewhere else instead of Wiktionary. The whole "if we do this, we'll be a laughingstock" line of "argument" is completely fallacious. Purplebackpack89 05:35, 12 January 2016 (UTC)
    My experience is that people generally have an idea of what words can be expected to be contained in a dictionary, and they tend to find them in languages where we have good coverage. But why don't you amend your suggestion instead of baselessly calling criticisms of it fallacious? —Μετάknowledgediscuss/deeds 05:38, 12 January 2016 (UTC)
    @Metaknowledge The reason I'm critical of your "laughingstock" claim is you haven't said why we'd be a laughingstock. You seem to be implying that anything above the "general idea" (which, I might add, doesn't exist, at least not a single one that's anywhere near the same for everybody) is a waste of space, and if people discover we have entries above and beyond the "general idea", they will think it absurd for one reason or another. That idea has no basis in either provable fact or in common sense; the people most likely to look for/find the definitions I'm proposing to include are the people the least likely to find it absurd that we have them. Furthermore, there is no technical need to artificially constrain ourselves to the words we're expected to have. Maybe you meant something else when you said what you did, but as you've said nothing else, it's hard to believe otherwise.
    As for amending my proposal, a) I truly believe the project would be better if the proposal were adopted verbatim, and b) I don't really know in what direction I'd go to amend it to placate you, Wikitiki and others. Purplebackpack89 05:57, 12 January 2016 (UTC)
@Purplebackpack89 If the spectrum that would be included by your proposal includes terms that look like ones we wouldn't want, you need to come up with some wording that doesn't include them. A "proposal" that consists of a grand statement, some hand-waving, and hope for good outcomes obviously isn't going to satisfy those who are skeptical of the desirability of quantity increases at Wiktionary. A policy proposal needs a little bit more thought. DCDuring TALK 03:30, 12 January 2016 (UTC)
Well, we can spend the rest of this thread thinking and discussing. It doesn't have to be perfect on the first go. Purplebackpack89 05:32, 12 January 2016 (UTC)
@Purplebackpack89 Impulse control is one of the wonderful consequences of having a deliberative body to mull over questions of policy. DCDuring TALK 11:09, 12 January 2016 (UTC)
Oppose Your initial premise has a tantalising air of plausibility, but does not stand up to careful thought. Some abbreviations which require explanation refer to perfectly ordinary language which does not. Some words which require explanation have (typically variable) abbreviations used in context, which do not require explanation. Therefore the two inclusion decisions are somewhat independent. Of course, when considering any particular case, the existence of an abbreviation entry can be taken into account. And in the end I do not think you have given a single convincing example. Imaginatorium (talk) 08:55, 12 January 2016 (UTC)
I think on a usability level, this is not a user-friendly proposition. Creating an entry for laugh out loud in order to change [[laugh]] [[out loud]] to [[laugh out loud]] is not user-friendly, because a user would be better off understanding what laugh and out loud mean. This won't help anyone understand more words and abbreviations. I should note that that's not the intention of this proposal either, so I wouldn't expect it to. But I see no value in having entries allowed per WT:CFI that won't help anyone understand any words or phrases. Renard Migrant (talk) 18:32, 12 January 2016 (UTC)
I've never bought into your line of reasoning that we're somehow more user-friendly with fewer entries. The only thing that is user-friendly is having all the entries people would look for. If a person looks for laugh out loud, they should be able to find it; it's doubtful whether even looking for out loud would even cross their mind. And the entry for "laugh out loud" would help that person understand the phrase "laugh out loud". I'm sorry, but since you insisted on bringing user-friendliness into this, my belief is that perfect user-friendliness would call for a complete abolition of SOP and anything else restricting verifiable entries. That's not what I'm advocating in this proposal, but that's what would generate optimum user-friendliness. Furthermore, "helping people understand more words" isn't the same thing as user-friendliness. Purplebackpack89 05:07, 13 January 2016 (UTC)
Because users need to be able to speak English as opposed to learning phrases verbatim. If you learn "I have a cat" verbatim you won't know what "I have a dog" means because you don't know what any of the individual words mean. But if you learn the words I, have, a, cat and dog you'll know what "I have a cat" and "I have a dog" mean. Are you genuinely saying it's just a numbers game? We should be based purely on the number of entries we have, not what they are and what they contain? I mean, we could have picture of a tall man with a dog on his lap just purely because it's one more entry than we have now. Like I said, you're not claiming this is a user-friendly feature and I think you're right not to, as the aim of this proposal is not to make Wiktionary more user-friendly; it's to satisfy you personally. Renard Migrant (talk) 14:34, 13 January 2016 (UTC)
In many ways, being more user-friendly and satisfying me personally are one and the same, because making the project more user-friendly is one of my goals for the project. If you're thinking about user-friendliness, you need to think less about what people are learning and more about what people are looking for. I have a cat is not something that would be allowed by this proposal, but something like laugh out loud or anything else that's commonly acronymed is. A person who is searching for the definition of "laugh out loud" might not want to or think to break it into its component parts, and even if he/she does want to or think to, giving them only one avenue isn't user-friendly. In essence, Renard, your line of reasoning forces people to look for certain things. To do so isn't user-friendly. Am I saying that user-friendliness is a numbers game? Yeah, pretty much. And I again say that what you're advocating isn't user-friendliness per se. Purplebackpack89 14:48, 13 January 2016 (UTC)
Don't be so sure about 'I have a cat'. Also, you don't have a monopoly on what constitutes user friendliness, on who users are, or on what their needs might be. The argument that something should be included merely because it might be of use to someone at some point is irrelevant, we also have a scope. We don't try to be Wikipedia, we don't try to be a stock price index, we don't try to be IMDB. We are trying to be a dictionary. The argument is that all of the phrases which you propose to include are not material for a dictionary. - TheDaveRoss 15:05, 13 January 2016 (UTC)
But, @DaveRoss, I am entitled to my opinion of user-friendliness and inclusiveness, and I am entitled to advocate that policy reflects my point of view. Purplebackpack89 15:11, 13 January 2016 (UTC)
Oppose because of...everything written here. SOP and WT:CFI exist specifically so we don't have entries like laugh out loud. —Aryamanarora (मुझसे बात करो) 23:11, 12 January 2016 (UTC)
To continue with the metaphor, what's wrong with a donut with a hole in the middle? Renard Migrant (talk) 14:34, 13 January 2016 (UTC)
If there wasn't a hole in the donut, there'd be more donut. Purplebackpack89 14:48, 13 January 2016 (UTC)
Without a hole it wouldn't be a donut, and there would be fewer of the doughballs than there would have been donuts given the same amount of dough. DCDuring TALK 15:03, 13 January 2016 (UTC)
I believe the two are made separately. Purplebackpack89 15:11, 13 January 2016 (UTC)

Internationalisms in etymologies

In the modern world, many languages that came later than others into a particular academic field or the like borrowed many words from an international pool of terminology, rather than from any particular language. These are then often naturalized in a systematic way, which helps hide the direct source of the borrowing, if there even was one. The most prominent example of this are words with the suffix derived from Latin -tiō. Good examples are virtually all the words in the translation tables at radio, civilization, and physics. My question is, how should we handle the etymology sections of these terms? One thing we sometimes do is say "Ultimately from Latin/Greek X", but I find that insufficient. --WikiTiki89 20:03, 12 January 2016 (UTC)

I quite like "coined based on". Renard Migrant (talk) 22:37, 12 January 2016 (UTC)
Coined based on what? It's not the wording that's the problem, it's what do we link to? --WikiTiki89 23:19, 12 January 2016 (UTC)
You're going to have to rephrase it then, I don't understand. Renard Migrant (talk) 14:22, 13 January 2016 (UTC)
Ok, so give me a full example of the etymology section of, let's say, Turkish radyo, and I'll explain what I mean based on that. --WikiTiki89 17:53, 13 January 2016 (UTC)
In some cases, a little research will reveal which language the scientific word was first coined in. For example, homosexual was first coined in German, though most languages' words look as if they come from a New Latin homosexuālis. —Aɴɢʀ (talk) 16:02, 13 January 2016 (UTC)
Yes, but homosexual did not go directly from German to every other language. --WikiTiki89 17:53, 13 January 2016 (UTC)
I see what you're getting at. Many languages created their cognate terms for homosexual at a time when cognates had already established themselves in many languages (German, English, New Latin, French, Russian, etc), and so the creation probably proceeded along the lines of "well, everyone else calls it this" rather than "German calls it this". It's probably still possible to decide which specific language it was borrowed from / coined based on in a lot of cases, but in those where it isn't, I'd say something like "Coined based on English homosexual, German homosexuell, French homosexuel, etc, as if from New Latin homosexualis". - -sche (discuss) 01:37, 15 January 2016 (UTC)
Yes, that's exactly what I'm talking about. The problem is, that's a lot to add. This is a very common situation in many languages and I was hoping we could get some kind of standard format for it. Also, the problem remains of which language to categorize it under. Perhaps we shouldn't categorize it under any language and create a new category for "Internationalisms". Also, should we link to the "New Latin" term? --WikiTiki89 03:08, 15 January 2016 (UTC)
Yeah, the drawback to what I suggested is that it's a lot to add and a lot that will get duplicated (potentially) across many entries. Perhaps we could say "From English foo and cognates thereof in other languages", potentially using a template to keep the wording the same across many entries, and then let foo#English list the other cognates. Or for words derived as if from something Latin, link to the Latin entry rather than the English entry. - -sche (discuss) 02:59, 16 January 2016 (UTC)
I often encounter this problem when adding Esperanto etymologies—Zamenhof and other important Esperantists seem to have coined a lot of words based on whatever word French, Italian, Spanish, English, German, and Russian (or some combination of those) have in common. When it's a case of simply taking a root shared by most major Romance languages, I use the phrase "common Romance", as in rompi and dento, but I think this is an Esperanto-specific solution, and it doesn't work for all cases. In other cases, I list a few of the languages, as in adjektivo (something like what -sche suggests above). It would be good to have a standard way to deal with this across languages. —Mr. Granger (talkcontribs) 23:46, 15 January 2016 (UTC)
Lojban uses {{jbo-etym}} and faces a similar situation. —Aryamanarora (मुझसे बात करो) 03:09, 16 January 2016 (UTC)

long enough to qualify

I think it’s time that DerekWinters be made an admin. It was suggested almost a year ago by WF, but some were against it at the time because he had not been around long enough, or because WF had proposed it. DerekWinters has made 6600 edits on Wiktionary and has been active here since 14 October 2012 (over three years). —Stephen (Talk) 20:15, 12 January 2016 (UTC)

I'm sure he's trustworthy and experienced enough, but the real questions are: Does he want to be an admin? And what will he contribute as an admin? --WikiTiki89 20:24, 12 January 2016 (UTC)
If he never uses any admin powers, he would still be an outstanding representative of our slogan, based solely on his Babel. The practical value of having admins who can communicate in such a range of languages and scripts seems important to me. DCDuring TALK 22:35, 12 January 2016 (UTC)
I didn't ask what could he contribute, but what will he contribute. Only he himself can answer that. @DerekWinters: I'll ask you directly: Do you want to be an admin? And what would you contribute as an admin, if you were made one? --WikiTiki89 22:58, 12 January 2016 (UTC)
I wouldn't mind being an admin, especially because I'll be able to speedy delete some of the mistakes I make. But I honestly don't know what I'd be able to contribute should I become one. I definitely can speak a few languages rather well, and it is quite the hobby of mine to master other writing systems, but several of them are often unused in day to day matters. I however do believe I have been useful in making transliteration modules and some declension and headline templates and I have been increasing the lemma-count of several underrepresented languages, but I'm not entirely sure that being an admin would allow me to do this significantly better. DerekWinters (talk) 03:40, 13 January 2016 (UTC)
One thing we definitely need is better patrolling of non-European-language edits in Recent changes. There are lots of cases where I may check for defacing of entries or insertion of out-of-place text, but I have no clue whether the non-English content is correct or is deliberately-planted offensive nonsense. There are a few admins who know the languages, but they don't always have the time. Even if you patrolled only a fraction of the edits, it would be an improvement. Chuck Entz (talk) 04:05, 13 January 2016 (UTC)
I would be able to help with that. DerekWinters (talk) 04:14, 13 January 2016 (UTC)
Should this be formally voted on, I'd support. Mainly because of this cool list that he gave me. —Aryamanarora (मुझसे बात करो) 23:03, 12 January 2016 (UTC)
  • Stephen is being somewhat dishonest about why DerekWinters was not made into an admin when WF suggested it. The reason can be seen at Wiktionary:Votes/sy-2015-06/User:DerekWinters for admin, where I opposed because DerekWinters created entries that did not meet CFI more than once over a long period of time, and when he was told about this or pinged in RFVs of his protologisms, he did not once respond to the best of my knowledge. He continues to create entries in languages he doesn't speak, and he has not demonstrated that he even recognises the problem. You can see that he often ignores messages left to him about problems with his editing at User talk:DerekWinters. It doesn't matter how long someone has been active on Wiktionary: I still cannot support a candidate for sysophood whose edits cannot be trusted, and who admits himself he has little or no use for the tools. —Μετάknowledgediscuss/deeds 03:49, 13 January 2016 (UTC)
I admit that I had added some terms that did not meet the CFI simply because they were words I found to be beautiful. However, since then I have been ascertaining that any term I add to the project most definitely meets the CFI requirements. I do apologize for not having responded then for the RFVs. However I also do believe from what I've seen that many editors add terms in languages they do not speak. DerekWinters (talk) 04:14, 13 January 2016 (UTC)

The Quality of the Macedonian Entries on Wiktionary

I would like to point out to anyone who may find it of interest that many (I use the term "many" hyperbolically) users who are not fluent speakers of Macedonian are freely contributing to the Macedonian corpus on the English Wiktionary with little to no concern for making errors. So, they're basically polluting the body of Macedonian entries, not with minimal oversights such as failing to mark a literary word as such (of such oversights I am at times guilty myself), but with grave inaccuracies such as allowing blatantly wrong suffixes to be generated in the inflection tables, be it willingly or inadvertently. This saddens me greatly because I've invested so much effort into creating Macedonian entries, only for some B1-level speaker to taint them all by adding a -о vocative form to a feminine noun which actually has an -е vocative form. Indeed, it's not as though the errors of other users don't affect me whatsoever - most people consulting online dictionaries view those dictionaries as integral units, so if they detect an error in the Macedonian Wiktionary for which some non-fluent speaker of Macedonian is liable, they will deem the entire body of Macedonian entries unreliable, such that the reputation of all of my own entries will be marred - they will be reduced to collateral damage.

My entries aside, the mistakes made by unskilled and/or reckless users trying to enhance the set of Macedonian entries are naturally to the detriment of anyone trying to learn Macedonian from this project, yet if no one notices them by chance, there is no systematic way in which they can be detected and subsequently resolved. I personally try to hunt down faulty Macedonian entries and correct anything that needs correction, but I am not always able to do this. First of all, I don't devote attention to Wiktionary regularly; second of all, I don't get a notification every time a Macedonian entry is created or modified. All I can do (as far as I am aware) is check the contribution history of users whom I have already observed making Macedonian contributions. Either way, it's not as though I'm a moderator of the Macedonian part of Wiktionary - I haven't assumed any official obligations. I'm just trying to direct the attention of other concerned parties to the fact that in the absence of a moderator and a strict system of regulations, the set of Macedonian entries is left at the mercy of whoever feels the whimsical desire to tamper with it. This is not so with the corpora of many other languages, e.g. French or Japanese - there are so many active users that speak those languages here that mistakes can't simply weave themselves into the project with absolutely no one taking heed of that. On the whole, I feel that something should be done about this issue, although I don't have any concrete proposals - the fact that there aren't enough users fluent in Macedonian here makes everything so infeasible. Martin123xyz (talk) 18:23, 13 January 2016 (UTC)

To begin with we could publish regular reports documenting all changes to Macedonian entries, including the person making the contribution. This will only be of use if there are people capable of reviewing those reports, but it might help you find things to look at when you do have time. - TheDaveRoss 18:28, 13 January 2016 (UTC)
I find this suggestion agreeable - indeed, I wouldn't be able to review those reports regularly (so I don't think it's necessary for you to produce them too often, e.g. weekly - every two months would be better), but even if I manage to devote attention to them a year or two after their creation, they will not have been in vain. I will correct whatever needs correction belatedly, and that is certainly better than nothing. Either way, I hope that it will be possible for me to filter my own contributions out of those reports, so that I can focus on the ones by other users (though there will obviously be no contributions from me during the breaks I take, e.g. the one I've just started). Martin123xyz (talk) 10:32, 14 January 2016 (UTC)
@Martin123xyz I can try and work on this this weekend, if you could provide me a list of editors who you would like to whitelist that would be great. Also, would you like to "trust" any entry which is most recently edited by a trusted editor, or only edits which were made by those contributors? - TheDaveRoss 20:42, 21 January 2016 (UTC)
@Martin123xyz Check out User:TheDaveRoss/Macedonian/р and let me know what you think. It is a relatively slow process, since I don't want to download and extract the full revisions dump, but I think I could get audits of all of the Macedonian pages in this format in an hour or two. - TheDaveRoss 22:31, 25 January 2016 (UTC)
@TheDaveRoss I don't think I can really provide you with a proper whitelist of editors, because I'm hardly familiar with the Wiktionary users who have created or modified Macedonian entries so far; moreover I cannot predict the ones that will do so in the future. Indeed, I have identified three users to be blacklisted so far, but I haven't identified any trustworthy ones. I suppose that Bjankuloski06~enwiktionary could be assigned to that category - after all, he's a native speaker. Meanwhile, I don't think that he's been active on Wiktionary recently, but naturally, that doesn't necessarily mean anything. Anyhow, I don't understand your question about what edits I would like to "trust"; could you please rephrase it (I don't understand what "trusted editor" as opposed to "those contributors" implies)? As for the link you've provided, I've looked at the table you've generated and I think I like it, but I don't understand why some sections have comments whereas some don't. Furthermore, I think it's impractical to have a separate row for each edit on every entry (that makes reading the table cumbersome, i.e. long and messy). I would only be interested in being notified that an entry has been edited to begin with; then I could go take a look at it to see what exactly has been changed. Also, I'm worried about chronological order in the table - edits from many years ago are shown together with more recent edits for all of the words. Could you program it to show only recent edits and to sort words according to the date they were edited, rather than sorting them alphabetically, and then sorting the edits of each one independently after that? If not, I would have to go through all 11,000+ Macedonian pages to check if something has been edited, rather than just looking at the top. Well, at least that's the impression I'm getting.
I'm sorry if I'm making inapposite requests or arriving at absurd conclusions - I have a very poor understanding of how coding works, so I can't imagine what your table can and cannot do. Either way, I really appreciate your interest in cooperating with me. Martin123xyz (talk) 16:51, 26 January 2016 (UTC)
@Martin123xyz The rows without comments are edits which did not have an edit summary. The current contents are just edit histories of the Macedonian sections, excluding certain bots. My thinking was that you could scan down the page and, if you saw an edit which was suspicious, click on the link to see the diff. I am not totally sure how you intend to use the results.
If you would prefer to only see the most recent edit, I can do that. If you would prefer to see only entries which have been edited since some particular date, I can do that too. Just let me know what you would like to see and I can try and accommodate. I just assumed the alphabetical was the most convenient ordering, if you would like chronological by most recent edit that is also possible. - TheDaveRoss 17:04, 26 January 2016 (UTC)
@TheDaveRoss Thank you for the prompt reply. I would only like to see the most recent edit for all Macedonian entries, and if it hasn't been made by me, I'll check it. I would also prefer to see entries from the 12th of January (which is when I terminated my last contribution spree) until whenever I start contributing again (at which time the dates will be reset, presumably). Finally, it would be nice if the table were ordered chronologically based on the recentness of the edits. Martin123xyz (talk) 19:27, 26 January 2016 (UTC)
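For anyone following along, the report requested above (only the most recent edit per entry, only entries edited since a cut-off date, skipping entries last edited by oneself, newest first) could be sketched roughly as follows. This is only an illustration in Python and not the script actually used to build the audit pages; the `Revision` record, its field names, and the function name are all hypothetical, and it assumes the revision data has already been fetched by some other means.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Revision:
    """One edit to an entry (hypothetical record layout)."""
    title: str
    user: str
    timestamp: datetime
    summary: str = ""


def latest_foreign_edits(revisions, since, own_user):
    """Return the most recent edit of each entry, newest first.

    An entry is listed only if its latest edit was made on or after
    `since` and was not made by `own_user`.
    """
    # Keep just the newest revision seen for each title.
    latest = {}
    for rev in revisions:
        current = latest.get(rev.title)
        if current is None or rev.timestamp > current.timestamp:
            latest[rev.title] = rev

    # Drop entries that are too old or were last touched by the reviewer.
    report = [
        rev for rev in latest.values()
        if rev.timestamp >= since and rev.user != own_user
    ]

    # Most recently edited entries come first.
    return sorted(report, key=lambda rev: rev.timestamp, reverse=True)
```

Called with a cut-off of 12 January 2016 and the reviewer's own user name, this would yield one row per entry that someone else has edited since then, so the reviewer only needs to read from the top.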
@Martin123xyz can you point to some entries that had incorrect content added by a non-speaker? — Ungoliant (falai) 18:56, 13 January 2016 (UTC)
I will present, albeit with reserve (since I do not particularly wish to defame anyone or cause them any other form of inconvenience), three different entries (there are many more I have in mind, but three are enough to serve as illustrative examples) created or modified by three different non-speakers (I judge that they are non-speakers by the information on their profile pages) - народ (narod) (which was marked as feminine, whereas it is masculine; even if this was a coding error, it was nonetheless alarming), дојде (dojde) (which was given a more regular but either way invented past tense ("дојдол" instead of "дошол")), and ниво (nivo) (whose correct plural form, "нивоа", entered by myself, was changed to "нива", as though it were a regular neuter noun in -o, rather than a French loanword with a final stress). I have now taken care of all these entries, such that no errors are observable in them, but one can review their histories to see what their earlier condition was like. Either way, I must mention that I don't require whatsoever that all users contributing to Macedonian without speaking the language fluently be prohibited from doing so - their work can indeed prove useful at times. Indeed, there are many entries created by non-speakers of Macedonian which are of decent quality. Furthermore, a non-speaker once corrected a mistake I had made myself out of inattention, by allowing plural forms to be generated for чаре (čare), which is actually a singulare tantum. It's just that non-speakers appear to be unable to contribute in a favourable manner consistently. Martin123xyz (talk) 10:16, 14 January 2016 (UTC)
@Martin123xyz I just want to say, thanks for your work! I've noticed many of your contributions appearing on various pages (in particular, those with a Russian term that's spelled the same), and I definitely appreciate the effort. I know it can be difficult or lonely working on a language without many Wiktionary contributors; I ran into this issue when I was working on Arabic entries. Benwing2 (talk) 18:52, 14 January 2016 (UTC)

Thank you very much, Martin123xyz, for making a great contribution for the Macedonian entries. --KoreanQuoter (talk) 05:54, 15 January 2016 (UTC)

Thank you for the compliments (they're not exactly relevant to the topic I'd introduced, but it's nice to receive them :) ) - I'm glad that people consider my entries useful. I greatly enjoyed creating them (my frustration with Wiki code aside). Martin123xyz (talk) 08:58, 15 January 2016 (UTC)
What do you mean by "Macedonian Corpus"? I think you mean "the entries in Macedonian", whereas "Corpus" normally means something quite different from dictionary entries (i.e. "a corpus of text(s)"). I strongly suggest renaming this as "The Quality of the Macedonian entries", not because your title is "wrong", but because it could lead to confusion... Imaginatorium (talk) 08:05, 17 January 2016 (UTC)
Thank you for the correction - I have made appropriate modifications (as I saw fit). Martin123xyz (talk) 19:37, 18 January 2016 (UTC)

A conference about French Wiktionary at Wikimania?

Hello, English-speaking wiktionarians!

We are three French-speaking wiktionarians with a strong wish to go to the annual conference Wikimania. We want to share our experiences with others about different topics. Well, to make it short, you can go directly to this link to our draft. We have until Sunday to send it, so only a few days, but any help is welcome, especially regarding the language. As you are probably guessing from reading my prose now, English is not my mother tongue. Plus, we want to know what you want to hear from us, and to involve the community as much as possible. You can react here or there, as you prefer. Thanks a lot in advance! Noé (talk) 21:34, 13 January 2016 (UTC)

I have edited the text a little to make it sound more "nativelike". I hope I preserved all of the meaning. Good luck with your project. Currently, the various Wiktionaries have different markup schemes and different policies, so the scope for collaboration seems a bit limited... Equinox 23:31, 13 January 2016 (UTC)
Thanks Equinox and Koavf for proofreading! I think we are going in the same direction without having a proper understanding of our paths. I plan to translate our 2015 report into English to gather your comments on it. I think we need to start talking about other Wiktionaries' policies to see whether it may be a good idea to adopt them. I don't want supervision, but rather to publish thoughts about our own projects and to discuss others' votes and decisions. It takes a lot of energy and we need bilingual people to help, but I think it has to be one of our goals in the future. Noé (talk) 13:34, 14 January 2016 (UTC)

Hi all! Just to let you know that our talk has been accepted! We will be at Wikimania to talk about Wiktionary and what we are doing to develop it. We'll be writing our draft collectively soon and I hope to meet you there, in Italy in June. Noé (talk) 14:09, 14 February 2016 (UTC)

SI prefixes

About the SI prefixes:


Shouldn't they be named like normal prefix entries, that is, with a hyphen in the end?


Some of these entries already exist, defined as prefixes in various languages. I find it amusing that μ- is defined as "Abbreviation of micro-." in English. Plus, a- has a Translingual section, but it is not defined as an SI prefix. --Daniel Carrero (talk) 23:35, 14 January 2016 (UTC)

I don't see how they are grammatically prefixes. Sticking abbreviations together is not morphology. Equinox 23:49, 14 January 2016 (UTC)
Daniel, it’s an interesting point, but we categorize them as symbols, not as prefixes. — TAKASUGI Shinji (talk) 01:06, 15 January 2016 (UTC)

EL: Language vote

Of all the five votes that are going to end in the next few days, please direct your attention specifically to Wiktionary:Votes/pl-2015-12/Language.

Reason: It has few votes: 1-0-2. Please vote on it, abstention is fine too, IMO. End date: January 20. Thanks. --Daniel Carrero (talk) 10:25, 15 January 2016 (UTC)

Entries for suffix-like words

Per a suggestion at Requests for Deletion under the current discussion of -mongering, it might be a good idea to keep entries for words that are likely to be searched for as suffixes (with a leading hyphen) as redirects to the unhyphenated entries. Several editors participating in the discussion either assumed that -monger and -mongering were suffixes, or felt that they could be considered suffixes when attached (suffixed) to the end of other words. Since the words are rarely encountered except in compounds, one might expect a large percentage of people looking for definitions, etymologies, or other words formed with them to search for them with a leading hyphen. The same must be true of many other words that may not technically be considered suffixes, but which are frequently placed at the end of compound words. However, these searches usually turn up no results, frustrating the user, and in at least some cases probably leading to the creation of entries that are subsequently nominated for deletion.

Therefore, my suggestion is that we convert entries such as these into redirects to the entries that cover the intended meaning. -house would redirect to house; -wall to wall, -monger and -mongering to monger and mongering (or both to monger), etc. That would solve the problem of people looking for them as suffixes and not getting any results at all. It wouldn't involve a great deal of work; the redirects could be created as needed or converted as they appear; and if any legitimate suffixes happen to exist with the same spelling, then a sense could be added with wording such as, "house used in a compound word" (just using "house" as an example; I know it won't have a corresponding suffix entry). P Aculeius (talk) 15:34, 15 January 2016 (UTC)

I would support soft redirects, but not hard redirects. --WikiTiki89 15:37, 15 January 2016 (UTC)
I'll also add that these should only be words that people would tend to look up as suffixes. --WikiTiki89 16:25, 15 January 2016 (UTC)
I support the idea, but I think it should be limited to words that have a relatively high percentage of usage as a compound element. In other words, I’d include -monger and -mongering but not -house and -wall.
My preference is for hard redirects, but if soft redirects are used they should use the correct POS instead of suffix. — Ungoliant (falai) 16:18, 15 January 2016 (UTC)
I support hard redirects. Definitions in the target entry should have an appropriate label if use in combination is not rare, ie, (usually/often/also in combination). DCDuring TALK 16:36, 15 January 2016 (UTC)


What should I use to write [ä] (Open central unrounded vowel) in IPA for entries? For Hindi, I see many entries with [ɑ] and rarely [a], even though Hindi should be using [ä]. —Aryamanarora (मुझसे बात करो) 21:48, 15 January 2016 (UTC)

Personally, I'm for writing ⟨ä⟩ when applicable. Korn [kʰʊ̃ːæ̯̃n] (talk) 22:32, 15 January 2016 (UTC)
Same here, just wanted to know the conventions here. [ä] isn't in the official IPA guide and is commonly replaced with the other open vowels in transcription. —Aryamanarora (मुझसे बात करो) 22:59, 15 January 2016 (UTC)
It's part of IPA nonetheless. I think in the last discussion I had about that here, a few people were of the opinion that one should use ⟨a⟩ instead, because our poor users might otherwise be scared and confused by something as uncommon as ⟨ä⟩. But for me /ä/ is simply a cardinal vowel like all others. I think another practice is to use /ɑ/ when [ä] phonemically behaves like a back vowel in twofold systems like vowel harmonies or consonant palatalisations. Korn [kʰʊ̃ːæ̯̃n] (talk) 12:56, 17 January 2016 (UTC)
I consider [a] sufficient for cases where a language does not have both [a] and [ä] as distinguishable allophones, but I do not oppose the more exact practice of using [ä] either.
On the other hand, sometimes I've seen people using [ɐ] for this purpose, which I find a poor idea: the symbol indicates specifically a near-open vowel, not a fully open one (and is usually only used in the transcription of languages that have both /a/ and /ɐ/).
In phonemic transcription it's recommendable practice to keep it simple, and thus e.g. use /a/ even for [ɑ] if there are no other open vowels, or /u/ even for [ɯ] or [ʊ] if there are no other close back vowels. But that might not be much of an issue around here. --Tropylium (talk) 11:48, 19 January 2016 (UTC)
For dictionary-writing purposes it's almost never necessary to use [ä]. The IPA vowel diacritics are great when you're discussing the fine details of phonetic realization, such as in a discussion of allophones in various contexts or when comparing the vowel systems of two distinct languages or dialects. But in a dictionary, what's important is the phonemes and maybe their most common, widespread allophones, and for that the IPA recommends using the typographically simplest symbol in the neighborhood. Although the cardinal vowel ɑ is defined as maximally back and maximally low, that doesn't mean that only a maximally back and maximally low vowel is correctly transcribed with ɑ. If a language has only one unrounded back low vowel, then ɑ is the correct symbol for it, even if (to judge from the vowel chart at Hindustani phonology) that vowel is closer to being central than being maximally back. Using ä for a language's only vowel in the low back unrounded range, or worse yet, for a language's only low vowel, is an example of false precision that we should avoid. —Aɴɢʀ (talk) 13:04, 19 January 2016 (UTC)
Yep, that chart is accurate. Thanks for the information everyone! I'm going to use /ɑ/ for Hindi since it's the only low vowel. —Aryamanarora (मुझसे बात करो) 18:58, 19 January 2016 (UTC)

Module errors on the vote box


Is twinkle not available on this wiki? Ipadguy (talk) 23:48, 16 January 2016 (UTC)

Categories like Category:French verbs with conjugation er

@Kc kennylau I think these should be named with a hyphen, e.g. Category:French verbs with conjugation -er. Also, when you create them you should probably create a catboiler to generate the content. Benwing2 (talk) 00:53, 17 January 2016 (UTC)

@Benwing2: First part done; second part no idea what to add yet. --kc_kennylau (talk) 09:18, 17 January 2016 (UTC)
@Kc kennylau Check out {{fr-verbconjcat}}. Benwing2 (talk) 12:16, 17 January 2016 (UTC)
@Benwing2: Thank you. --kc_kennylau (talk) 13:38, 17 January 2016 (UTC)
We already have Category:French first group verbs. Renard Migrant (talk) 15:19, 17 January 2016 (UTC)
These aren't quite the same thing, though. There are categories like Category:French verbs with conjugation -cer that don't have an equivalent. Benwing2 (talk) 20:51, 17 January 2016 (UTC)

Module:ugly hacks

(Firstly, I apologize for violating the intention of this module by mentioning it here, since this would be equivalent to advertising for that module, which is not the writer User:Kephir's intention.) Shortly after Kephir decided to discourage the use of that module, he used that module himself on Template:en-verb ([1]). My question is, what is the current (unofficial) policy towards the use (or the discouragement thereof) of this module? Should I refactor Template:en-verb (as well as all the other templates that use this module) so that it no longer uses this module? --kc_kennylau (talk) 09:17, 17 January 2016 (UTC)

Next votes

I would like these to be the next WT:EL votes to start. Please review them and see if they are OK.

Plus I created a poll. It was scheduled to start in 1 month and last for 3 months.

But I'm probably going to oppose the current proposal of this poll, even if I'm the creator! It's just that the issue has been brought up repeatedly before and IMO it's better worded as a new proposal but I'd prefer the status quo. Please edit/change it too if you'd like.

Cheers! --Daniel Carrero (talk) 10:45, 17 January 2016 (UTC)

Complete entry template

I'm playing with the idea to make a template which triggers a row of subtemplates which create an entire entry from scratch for Middle Low German. So the final entry would look like any other to the user, but for editors it would be thus:
Atm it's just a random fleeting idea, but I figured before I even playfully muse about it, I'd ask whether there would be any problems with such a template/form of entry. Korn [kʰʊ̃ːæ̯̃n] (talk) 21:35, 17 January 2016 (UTC)

It would probably be confusing to newbies. I'd suggest to use {{subst:}} when using it, like the templates {{ja-new}}, {{zh-new}}, {{ne-new}}, and some others do. —Aryamanarora (मुझसे बात करो) 02:44, 18 January 2016 (UTC)

2016 WMF Strategy consultation

Hello, all.

The Wikimedia Foundation (WMF) has launched a consultation to help create and prioritize WMF strategy beginning July 2016 and for the 12 to 24 months thereafter. This consultation will be open, on Meta, from 18 January to 26 February, after which the Foundation will also use these ideas to help inform its Annual Plan. (More on our timeline can be found on that Meta page.)

Your input is welcome (and greatly desired) at the Meta discussion, 2016 Strategy/Community consultation.

Apologies for English, where this is posted on a non-English project. We thought it was more important to get the consultation translated as much as possible, and good headway has been made there in some languages. There is still much to do, however! We created m:2016 Strategy/Translations to try to help coordinate what needs translation and what progress is being made. :)

If you have questions, please reach out to me on my talk page or on the strategy consultation's talk page or by email to

I hope you'll join us! Maggie Dennis via MediaWiki message delivery (talk) 19:06, 18 January 2016 (UTC)

Poll: Restore deleted high-use templates

Usually, when a high-use template is nominated for deletion ({{term}}, {{l/en}}), @Dan Polansky argues that they should be kept to preserve page histories. (Wiktionary:Beer parlour/2015/November#About deleting l/en, l/la, l/de and others, Wiktionary:Requests for deletion/Others#Template:l/de, Wiktionary:Votes/2015-11/term → m; context → label; usex → ux#usex → ux, etc.)

I would like to know what people generally think about this.

This is a poll with no policy value. The full proposal of this poll:

  • There should be some effort to restore templates that were once highly-used in the main namespace and were orphaned and/or deleted, to keep them usable for past revisions of the main namespace to be readable. This arguably includes: {{proto}}, some context templates ({{obsolete}}, {{colloquial}}, {{UK}}, {{transitive}}), {{Wikisaurus-link}}, {{SAMPA}}, {{l/en}} and others.

--Daniel Carrero (talk) 18:11, 19 January 2016 (UTC)


  1.   Support I support this both for enhanced usability of entry histories and to make Special:WantedTemplates and Special:WantedPages more useful by eliminating some of the detritus there, though there are other, greater contributors to the problem. DCDuring TALK 18:33, 19 January 2016 (UTC)
  2.   Support Make revision histories legible. As for making sure deprecated templates are no longer used: we could use the AbuseFilter or some such tool to enforce deprecation on the technical level: it would be impossible to save a page that uses a deprecated template. In the oppose section, I see no reasoning that explains why this or a similar technical solution is not a good idea; all I see there is very vague and non-specific. --Dan Polansky (talk) 21:09, 22 January 2016 (UTC)
    If you want to keep old revisions legible, why not just lock all templates right now and prevent changes for all eternity? The preservation of templates alone cannot preserve legibility, as parameters or internal code may be changed, as Benwing described below. —suzukaze (tc) 07:25, 23 January 2016 (UTC)
    @suzukaze: Your question does not seem to be serious. I want to keep revisions legible at an acceptable cost. Using a deprecation mechanism instead of deleting templates is not only acceptable but also reasonable. By contrast, locking all templates for eternity is not acceptable. In some cases described by Benwing below, keeping templates deleted may be in order. Alternatively, the body of a deprecated template could be updated to be less dependent on other templates. Either way, an argument of the form "we cannot make page histories perfectly legible => let's give up on the legibility altogether" is a crass fallacy. In its form, it is identical to "we cannot prevent all environmental pollution => let's give up on limiting environmental pollution". --Dan Polansky (talk) 07:57, 23 January 2016 (UTC)
    Let me be specific: if we delete template:l/en, we will make all the links that used it illegible. If, by contrast, we enter a plain wikilink to the template, which does not depend on any other templates, we will preserve the legibility of all the uses of the template. To prevent further use of the template, we may create a filter using AbuseFilter. Now, what are the specific disadvantages of keeping the template and deprecating it using AbuseFilter? I see none. The Benwing objections do not apply to this template and the deprecation change to it just presented. --Dan Polansky (talk) 08:03, 23 January 2016 (UTC)
  3.   Support High-use templates, at a minimum, should be either redirected or left as historical. Not doing so confuses the bajeebers out of editors. Purplebackpack89 21:40, 22 January 2016 (UTC)
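To make the deprecation idea raised in the support comments above a little more concrete: the check such a filter would perform amounts to "does the text being saved transclude any template on a deprecated list?". The sketch below shows that logic in Python purely as an illustration. It is not AbuseFilter rule syntax, the function and constant names are hypothetical, the list of deprecated templates is only an example drawn from templates mentioned in this discussion, and the pattern is deliberately simple (it does not try to handle every wikitext edge case).

```python
import re

# Example list only; which templates count as deprecated is a policy question.
DEPRECATED = {"l/en", "term", "context", "usex"}

# Captures the name part of "{{name|...}}" or "{{name}}".
# Names starting with "#" (parser functions such as {{#ifeq:...}}) are skipped.
_TEMPLATE_NAME = re.compile(r"\{\{\s*([^{}|#]+?)\s*(?=\||\}\})")


def _normalise(name):
    """Treat underscores as spaces and ignore the case of the first letter."""
    name = name.strip().replace("_", " ")
    return name[:1].upper() + name[1:]


def deprecated_templates_used(wikitext, deprecated=DEPRECATED):
    """Return the sorted list of deprecated templates found in `wikitext`."""
    banned = {_normalise(name): name for name in deprecated}
    found = {_normalise(m.group(1)) for m in _TEMPLATE_NAME.finditer(wikitext)}
    return sorted(banned[key] for key in found if key in banned)
```

A save would then be refused whenever the returned list is non-empty, while the template pages themselves stay in place so that old revisions remain legible.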


  1.   Oppose I'm not sure that restoring these templates is worth the trouble, especially all the (hundreds of?) context templates. Some people might potentially use the old templates if they are available, and it would require the additional work of converting them to the new templates. Example: Even after {{l/en}} was orphaned, it was not deleted, and it was added to tapaculo and pupunha. --Daniel Carrero (talk) 18:11, 19 January 2016 (UTC)
    @Daniel Carrero Is there a way to allow the templates to be used everywhere but current principal namespace, Appendix space, etc, or only in histories? DCDuring TALK 00:56, 20 January 2016 (UTC)
    As far as I can tell, no, templates have no way to check if they are being used in the current or a previous revision. I also did not find any extensions on mw: to that effect. --Daniel Carrero (talk) 01:51, 20 January 2016 (UTC)
  2.   Oppose. It's something Dan Polansky talks about, having older revisions readable because you end up with things like {{infl}}. The cost for keeping them is however too high in my opinion; they need to be usable in the main namespace for it to work, and then they get used. And how many people actually look at old revisions? I look at the odd one usually in RFV debates trying to find who added a disputed definition, but I can live with things like {{infl}} and Template:idiom being red links because I know what they refer to and they're not what I'm looking for. And I can bypass those problems by clicking on edit. In short, the disadvantages far outweigh the advantages, not least because very few people read old revisions of pages, and experienced editors won't be bothered by old red links. Renard Migrant (talk) 18:27, 19 January 2016 (UTC)
    You and other veterans may know, but this constitutes yet another barrier to strong participation by newcomers. DCDuring TALK 00:56, 20 January 2016 (UTC)
    Maybe I'm just smarter than most editors, but it didn't take me very long to figure out what was up with those red links, once I'd got to the stage where I was looking at enough previous revisions to notice them. I've only been editing half a year, and I think that's the least of our worries. Ensuring that the help pages are up to date, as Daniel Carrero has been doing, is a far more useful task, as that is where I got confused when I first started. Andrew Sheedy (talk) 04:47, 23 January 2016 (UTC)
    @Renard: "the disadvantages": That's a plural. You've stated only one disadvantage: when the templates are usable, they get inadvertently used. We have AbuseFilter that we could "abuse" to block saving pages that contain a deprecated template. --Dan Polansky (talk) 21:14, 22 January 2016 (UTC)
  3.   Oppose, for the reasons listed above. I've never been bothered by red links in previous revisions of pages. Andrew Sheedy (talk) 18:43, 19 January 2016 (UTC)
  4.   Oppose, per above. —Aryamanarora (मुझसे बात करो) 18:47, 19 January 2016 (UTC)
  5.   Oppose. I wouldn’t mind having a grace period for an RFD-failed template before it’s deleted, so contributors can get used to the new template (especially occasional or antisocial contributors who don’t follow discussions). But I think that keeping them forever would do more harm than good. — Ungoliant (falai) 00:46, 20 January 2016 (UTC)
  6.   Oppose Progress is progress. Does anyone still make computer towers that accept 5¼ floppy disks? —suzukaze (tc) 00:57, 20 January 2016 (UTC)
  7.   Oppose per Daniel, who has very gracefully pointed out that idiots like me adding {{l/en}} long after deprecation when I'm multitasking on Wiktionary is something that is bound to occur. —Μετάknowledgediscuss/deeds 00:58, 20 January 2016 (UTC)
  8.   Oppose. Way too complicated to do something like this. But I wonder if anyone has proposed a mediawiki feature where page revisions use the revision of any templates at the time the edit was made? DTLHS (talk) 01:30, 20 January 2016 (UTC)
  9.   Oppose per Daniel. An additional issue is that technically it can be complicated and messy to require such compatibility. This is especially the case if an existing template is changed to eliminate a particular parameter or change the parameters, and all uses fixed accordingly. For example, in {{ru-noun-table}}, which declines Russian nouns, it used to have a 4th parameter that specified a special "bare-stem" form. I fixed up the Lua code so this wasn't required, and corrected all the template uses to eliminate their use of this parameter, and then deleted the code that supported this old parameter and eventually reused the 4th parameter for a different use. If maintaining compatibility were mandated, I couldn't do this, and instead would have to keep the old useless code around forever to support the old use of the 4th parameter, and would have to specify the new use as a 5th parameter with an always-empty 4th parameter before it. Benwing2 (talk) 05:58, 20 January 2016 (UTC)
  10.   What Ungoliant said.​—msh210 (talk) 18:06, 21 January 2016 (UTC)
  11.   Oppose because we want to think about the people who look things up in a dictionary more than those who make it (more and more these days, I'm using it as a look-up dictionary rather than a make-it-up dictionary...that is progress BTW). Those who make the dictionary know their way around. If they are able to click on [History] and can see a red link there, they're probably at a pretty good stage in the development of Wiktionarying already and will probably be able to find the new-and-improved versions of these templates. Ce mot-ci (talk) 03:52, 23 January 2016 (UTC)
  12.   Oppose: too much work and complications (and basically bad resource allocation) for something that isn't very important. If anything it should be done like DTLHS describes. Enosh (talk) 07:09, 23 January 2016 (UTC)
    @User:Enoshd: Please clarify how creating an abuse filter to ensure deprecation is "too much work and complication". I have no idea what you are talking about. --Dan Polansky (talk) 08:10, 23 January 2016 (UTC)
    Similar to what Benwing says above, you have to continue updating them when other things change while keeping them backwards compatible. This mostly applies for templates transcluding other templates, otherwise not so much. Enosh (talk) 09:03, 23 January 2016 (UTC)
    @User:Enoshd: What prevents us from changing deprecated templates in such a way that they largely do not depend on other templates and yet render something legible? For {{l/en}}, we can replace the template body with a plain wikilink to preserve legibility and be done. --Dan Polansky (talk) 09:23, 23 January 2016 (UTC)
    I agree with you that {{l/en}} is a simple one, but it's also the last on the list and the only one not deleted. The first {{proto}} would have needed updating multiple times, most recently when we changed namespace. (I list the changes since deletion because I cannot predict the future ones. We won't make the same changes again but there'll probably be changes.) The context labels would have needed updating (excluding what a bot will do) or at least consideration in aliases and such. {{Wikisaurus-link}} and {{SAMPA}} I'm not sure, perhaps not problematic. Enosh (talk) 06:31, 30 January 2016 (UTC)


  1.   Abstain kc_kennylau (talk) 13:14, 28 January 2016 (UTC)

Oodles of numbers

Anon user (talk) has been adding lots of entries for numbers, like 1338, 912, and 43. Is this appropriate for Wiktionary? ‑‑ Eiríkr Útlendi │Tala við mig 18:12, 19 January 2016 (UTC)

No, these aren't words or idioms in any language. Nor are they symbols (nothing wrong with 0, 1, 2, etc.) Renard Migrant (talk) 18:22, 19 January 2016 (UTC)
  • I've just nuked this anon's contribs in their entirety -- with the exception of XCII and XCIII, all of their contributions are of Arabic-numeral numbers as linked above, and even for those two Roman-numeral entries, this anon is the only user to touch them.
(If this was in error, please let me know and feel free to undo.)
What about all the other number entries not created by this specific anon, like 313 or 51 or 32? ‑‑ Eiríkr Útlendi │Tala við mig 00:17, 20 January 2016 (UTC)
I would probably delete all of them above "9". Their categorization is an inconsistent muddle and makes little sense to me. An alternative would be to agree on a consistent format and create them all (up to 2016?) by a bot. SemperBlotto (talk) 08:57, 20 January 2016 (UTC)
I don't mind the ones that have other meanings, like 69, but everything else should go. If you look at the original revision of 313 it should have been shot on sight, and 32 is just Wonderfool pissing around. Renard Migrant (talk) 18:21, 21 January 2016 (UTC)
  • For entries like 33 that have a valid idiomatic sense, should we rip out the otherwise-useless and not-dictionary-material ==Translingual== sections? ‑‑ Eiríkr Útlendi │Tala við mig 19:58, 21 January 2016 (UTC)
And what about confusingly formatted entries like the ==Translingual== section at xxx? My instinct is to rip that out too. ‑‑ Eiríkr Útlendi │Tala við mig 20:00, 21 January 2016 (UTC)
I suppose I'd keep them a bit like we keep unidiomatic sense for idioms when the idioms meet CFI (see {{&lit}}). Renard Migrant (talk) 19:27, 22 January 2016 (UTC)
For the curious, here are the entries which had only digits and decimal points as of the last dump
.500 0 0. 000 007 0157 06 1 1. 1.0 10 10. 100 1000 101 102 103 104 1040 1080 109 1099 11 11. 112 12 12. 121 125 13 13. 1337 14 147 1471 15 16 17 18 180 187 19 1984 1992 2 2. 2.0 20 20. 200 21 22 224 23 233 24 25 26 27 28 29 3 3. 30 300 303 31 313 32 33 360 39 4 4. 40 400 404 411 419 42 420 45 4649 470 5 5. 50 500 51 520 527 540 555 5555 6 6. 60 600 606 666 69 7 7. 70 71 720 73 737 747 757 777 78 79 8 8. 8.3 80 81 82 83 84 85 86 87 88 89 9 9. 90 900 9000 91 911 92 93 94 95 96 97 98 99 999 - TheDaveRoss 14:42, 23 January 2016 (UTC)
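A list like the one above could be produced by filtering page titles taken from a dump. The following is only a minimal sketch, assuming the titles are already available as a list of strings; `numeric_titles` is a hypothetical helper and is not how the list above was actually generated.

```python
import re
from typing import Iterable, List

# Titles made up only of ASCII digits and decimal points,
# with at least one digit (so a bare "." is excluded).
NUMERIC_TITLE = re.compile(r"(?=[0-9.]*[0-9])[0-9.]+")


def numeric_titles(titles: Iterable[str]) -> List[str]:
    """Return the titles that consist solely of digits and decimal points."""
    return [t for t in titles if NUMERIC_TITLE.fullmatch(t)]
```

Spelled-out number words such as "forty-two" would need a separate word list, since no simple pattern distinguishes them from other entries.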
Also these:
-one -ten billion decillion duodecillion eight eight hundred eight thousand eighty eighty six eighty-eight eighty-five eighty-four eighty-nine eighty-one eighty-seven eighty-six eighty-three eighty-two fifty fifty six fifty thousand fifty-eight fifty-fifty fifty-five fifty-four fifty-nine fifty-one fifty-seven fifty-six fifty-three fifty-two five five hundred five thousand five-nine forty forty two forty-eight forty-five forty-four forty-nine forty-one forty-seven forty-six forty-three forty-two four four hundred four one one four thousand hundred hundred thousand million nine nine hundred nine one one nine thousand nine-one-one ninety ninety-eight ninety-five ninety-four ninety-nine ninety-one ninety-seven ninety-six ninety-three ninety-two nonillion octillion one one billion one hundred one hundred million one hundred one one hundred six one hundred thousand one million one thousand one-hundred one-two one-two-three quadrillion quintillion septillion seven seven hundred seven hundred fifty seven thousand seventy seventy-eight seventy-five seventy-four seventy-nine seventy-one seventy-seven seventy-six seventy-three seventy-two sextillion six six hundred six thousand sixty sixty nine sixty-eight sixty-five sixty-four sixty-nine sixty-one sixty-seven sixty-six sixty-three sixty-two ten ten million ten thousand ten-four tenone tenten thirty thirty one thirty-eight thirty-five thirty-four thirty-nine thirty-one thirty-seven thirty-six thirty-three thirty-two thousand thousand one three three hundred three thousand trillion twenty twenty four twenty four seven twenty hundred twenty one twenty two twenty-eight twenty-five twenty-five-eight twenty-four twenty-four seven twenty-nine twenty-one twenty-one hundred twenty-seven twenty-six twenty-three twenty-three hundred twenty-twenty twenty-two twenty-two hundred twentyone two two hundred two thousand two- two-four twoten undecillion - 15:28, 23 January 2016 (UTC)
  • I would definitely want to keep the spelled-out English words in the paragraph immediately above. —Aɴɢʀ (talk) 17:07, 23 January 2016 (UTC)

Proposal for Sorting Definitions

As anyone who regularly frequents Wiktionary knows, one of its most widespread problems is that of inconsistency in the ordering of definitions (and etymologies, pronunciations, parts of speech, etc.). I propose implementing something that could somewhat improve the situation and at the same time, satisfy people with conflicting opinions.

This would be a simple template that allowed the ranking of definitions in two different ways. It would look something like {{2|3}}, with the first parameter ranking it according to how common it is, and the second parameter ranking it according to when it entered the language. It would have no effect on the appearance of the page unless a user selected a setting to have definitions ranked either by age or by frequency. With this option available, I would suggest making it policy to order definitions according to their relationship with each other. (By default, entries would be displayed in the order displayed in the wikicode.)

Obviously, some definitions would be almost equally frequent (or infrequent), or might have roughly the same, not necessarily known, time of origin. The template would thus have to allow for multiple definitions to have the same ranking. The other parameter could be used as a secondary ranker, and perhaps the unmodified order of definitions could be used as a tertiary ranker (i.e. equally frequent definitions would be ranked by their age, and if some of them had an equal age, then they would be ranked according to how they were entered in wikicode).
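As a rough illustration of the ranking scheme described above (not an existing Wiktionary feature), the tie-breaking could be sketched in Python as follows. The `Sense` class, its field names, and `sort_senses` are hypothetical. The sketch assumes that rank 1 means "most common" or "oldest", and it relies on Python's sort being stable, so the order in the wikicode acts as the tertiary ranker.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sense:
    text: str
    freq_rank: int  # hypothetical first template parameter: 1 = most common
    age_rank: int   # hypothetical second template parameter: 1 = oldest


def sort_senses(senses: List[Sense], mode: str = "default") -> List[Sense]:
    """Order senses by the reader's preference.

    "frequency": by commonness, ties broken by age.
    "age": by age, ties broken by commonness.
    anything else: keep the wikicode order unchanged.
    Remaining ties keep wikicode order because sorted() is stable.
    """
    if mode == "frequency":
        return sorted(senses, key=lambda s: (s.freq_rank, s.age_rank))
    if mode == "age":
        return sorted(senses, key=lambda s: (s.age_rank, s.freq_rank))
    return list(senses)
```

Several senses may share a rank; nothing in the sketch requires the ranks to be unique or consecutive.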

The same thing could perhaps be applied to etymologies, pronunciations, etc. which also suffer from the same issues. I think it would be less important for these than for definitions, however.

This would introduce some problems, especially ones that label definitions as "by extension" from preceding definitions. I don't see this as a huge issue, as the templates would be entered manually, so editors would hopefully ensure that there were no such problems, and other editors would hopefully catch them over time, as they do any other mistakes and inconsistencies. I also don't think that label is particularly useful anyway.

This would be an ambitious change to make, and its implementation would no doubt be tedious, but I feel that it would pay off in a few years, if it is even possible to do. What think you all? Is this possible (and would anyone besides myself care enough to help implement it)? Hopefully the wall of text didn't scare anyone away. Andrew Sheedy (talk) 05:00, 20 January 2016 (UTC)

I like the idea. I wonder if the template thing is needed at all; maybe we can get consensus for commonness-based order rather than etymological order. — Ungoliant (falai) 13:51, 20 January 2016 (UTC)
Consensus has been sought many times, and there are people who are very dedicated to one approach or the other. As for implementation, this would require code, presumably in javascript, to rearrange the page. Page loads take far too long as it is, and this could make pages with lots of definitions really, really slow. I know that it would only be a problem for those that opt in to rearranging, but I'm skeptical that there would be enough use to justify the monumental investment of time and resources to make it happen. It could easily end up as a failed experiment like the alphagrams. Chuck Entz (talk) 14:33, 20 January 2016 (UTC)
How would this be made to work with subsenses (and subsubsenses)? DCDuring TALK 15:09, 20 January 2016 (UTC)
Good question. Maybe a modified template could be used? I'm afraid I'm not very skilled in the programming department, so I'm not too sure how one would make it work in those cases. Andrew Sheedy (talk) 18:39, 20 January 2016 (UTC)
As far as subsenses, this is not a technical issue but a logical one. If you can figure out how we want them to be sorted, dealing with them on the technical side shouldn't be too hard. As far as pageload times, most of the time is spent loading the JavaScript, not running it. It runs pretty fast. --WikiTiki89 19:21, 20 January 2016 (UTC)
In that case, I would think subsenses would be sorted in the same way as other definitions. If a user wanted to see definitions in order of age, the subsenses would still be displayed as such, and would be ordered related to other subsenses under the same main sense. They wouldn't be ordered in relation to other senses, at any rate. Andrew Sheedy (talk) 18:11, 21 January 2016 (UTC)
I must have faster Internet than you, Chuck Entz, because I find that even the longest pages load fairly quickly on Wiktionary. In fact, one of the main reasons I prefer Wiktionary to any other online dictionary is the speed at which pages load (Larousse and have way too many ads), so if rearranging the definitions would significantly slow down page loads, then maybe it wouldn't be such a good idea, particularly since my proposal is aimed especially at definition-heavy pages. Andrew Sheedy (talk) 18:39, 20 January 2016 (UTC)
I think this is a terrific idea, and restating what I think Dixtosa was saying below: we ought to hold off on this and actually take the plunge re structuring data in such a way that we could migrate it into a relational database. The current suggestions involving templates and reordering pages via js and lua may well work, but only insofar as data is presented locally. The real conversation is about WikiData or some other model for backend structure. - TheDaveRoss 19:25, 20 January 2016 (UTC)
As far as I know, there's never been any consensus over what order to have definitions in. Most entries put common definitions first but a few put things in date of first appearance order, meaning that very rare meanings can come before very common ones, which I oppose. Renard Migrant (talk) 16:48, 21 January 2016 (UTC)
I think that the vast majority of entries actually have the definition lines in a more-or-less random order. The upshot of the suggestion here, and those like it, is that you as a reader could choose how you like your definitions provided and then all entries which had been tagged could be displayed to you in that manner. - TheDaveRoss 16:52, 21 January 2016 (UTC)
What I am proposing would allow users to view definitions sorted in three different ways (the default, according to age, and according to commonness), according to preference. Andrew Sheedy (talk) 18:11, 21 January 2016 (UTC)
We have fewer than 2,600 pages that use {{defdate}}, not all of which have any English content, some of which use defdate not for definitions but for alternative forms etc. A large number of the pages with {{defdate}} have only one definition. Almost all of the PoS sections that have {{defdate}} do not have it for every definition.
I don't think we have any other usable data for time ordering. The basic source for such data is the OED. I don't know whether large-scale use is a copyright problem.
We have no source whatsoever to determine which definition represents the most frequent semantic content of current usage of the word. That is, we would have to do our own interpretation of corpus data to actually offer something that could claim to be reliable.
AFAICT, we have never mastered the relatively simple problem of technically determining whether a given spelling of a word had as lemma one or both of the capitalized or uncapitalized forms. Nor have we mastered folding all inflected forms into the count for the lemma form or separated participles from homonymic adjectives. And, of course we have never mustered the effort to do these things manually, except in principal namespace where it took much of the lifetime of the project to reach our current state of incompletion.
If we cannot get reliable information on frequency of usage of word-definitions, how could we execute the project proposed? What would motivate the manual effort required? How could the problem be solved technically, even over the course of the next decade? DCDuring TALK 19:23, 21 January 2016 (UTC)
Fair points. It's a pity we don't have the manpower to do the job. I think it could be done if people were motivated, but adding new definitions and words is probably higher priority. It would be nice if we could achieve consensus on the definition order, though, even if it required a compromise. We have some entries that are overwhelmingly confusing, and I think some consistency could help solve that.
I might just use {{defdate}} for French entries now that I know about it. It's not essential information, but the more it's used, the easier it'll be to rearrange definitions later. Andrew Sheedy (talk) 23:21, 22 January 2016 (UTC)
It is worthwhile to include the {{defdate}} information where available. I wish that it weren't so hard to get it. Especially difficult, even conceptually, are the more gradual and subtle transitions from creative innovation, through usage limited to special context, to full membership in the lexicon. It seems a bit like speciation in organisms, not the most encouraging of possible analogs. DCDuring TALK 00:21, 23 January 2016 (UTC)
Data about the relative frequency of different definitions of a word is even harder. The conceptual problems are severe: the relative frequency depends on the set of definitions a given dictionary uses and the wording of the definitions. More practical would be recording the frequency of collocations of a word, though that too has problems. At least it would make it more possible to better separate the frequency by PoS. DCDuring TALK 00:34, 23 January 2016 (UTC)
I think any ranking by frequency would have to be largely subjective, and based on personal experience. For more technical terms, this would be far more difficult to determine. Andrew Sheedy (talk) 03:20, 23 January 2016 (UTC)
Try comparing subjective experience with corpus evidence for a few polysemic words. Then assume that crowd-sourcing subjective experience improves things. Do you think the result is satisfactory? DCDuring TALK 13:13, 23 January 2016 (UTC)
Is that sort of comparison even possible without spending hours on a given word? I do think that multiple editors contributing their subjective observations would help balance things out, though. Andrew Sheedy (talk) 06:17, 24 January 2016 (UTC)

  • Tagging this idea as part of {{hashtag|MediawikiHoldsWiktionaryBack}}. --Dixtosa (talk) 19:11, 20 January 2016 (UTC)
    Detagged. This places the whole Beer parlour discussion page into a pointless category. Furthermore, if you don't like Mediawiki and the semi-formal approach, you can go to OmegaWiki and contribute there; good luck with that. --Dan Polansky (talk) 13:21, 23 January 2016 (UTC)
    That's like saying if you don't like me you can fuck off. Thank you for letting me know that I can leave. Very informative. Try directing your energy coming up with a solution next time.
    How does a category that contains specific, clearly defined talks sound pointless to you? I am trying to aggregate all the reasons why MW isn't perfect for WT in one place.
    Also, the whole Beer parlour discussion page is also in the category "term cleanup/wiktionary namespace". Does this not bother you? --Dixtosa (talk) 14:14, 23 January 2016 (UTC)
    What I was trying to say is that it is pointless to try to convert the English Wiktionary to an analogue of OmegaWiki since we already have OmegaWiki. Those who want to contribute lexicographical content to a relational database with stringent entity-relationship models already have the option. --Dan Polansky (talk) 14:28, 23 January 2016 (UTC)
  • I am not sure what the intention was, but I took it to mean that Mediawiki as it is is not perfectly suited for dictionary content. That is a sentiment that I am sure everyone who has done any work here can agree with to some degree. It would be nice, for instance, if there was some method of tying a sense to a translation which was tighter than the current use of glosses. It would be nice if there were less hacky methods of only displaying the content that a user wanted to see, etc. - TheDaveRoss 14:28, 23 January 2016 (UTC)

My two cents For what it's worth, I definitely think that we need some standard for sorting these definitions: cf. truck. By far the more common usage of the English word is the automobile but the first definition is the verb. —Justin (koavf)TCM 05:04, 24 January 2016 (UTC)

And worse, it's a different etymology, with only dialectal senses! We definitely need some better defined standards. Andrew Sheedy (talk) 06:17, 24 January 2016 (UTC)
I find putting obsolete senses first and obsolete words (etymologies, parts of speech) first massively inferior from a usability standpoint. Unfortunately, multiple editors don't think so and we have no consensus as per Wiktionary:Beer_parlour/2012/December#Positions of obsolete senses. The link contains a poll in which the editors are divided on the issue approximately 50:50. --Dan Polansky (talk) 07:28, 24 January 2016 (UTC)
@Dan Polansky: Agreed. But I also see some value in the chronological approach. The nice thing about this, if we can template-ize it or make it into a table of some sort, is that it would then be sortable--users can choose the order via scripts, user settings, etc. —Justin (koavf)TCM 07:35, 24 January 2016 (UTC)

{{inh}} vs. {{der}} again

The criteria for when to use one vs. the other aren't always clear. For example, I just fixed up Latin pluit to say it was "inherited" from PIE *plew-. This is true, except that pluit is thematic and the verb corresponding to *plew- might have been athematic *plewti instead of thematic *pleweti; if so, then technically it was "morphologically reformed" at a later point so it should maybe say it was "derived". But this seems a needless distinction to make in cases like this, and in many cases it isn't even known for sure if a particular verb was thematic or athematic in PIE since the athematic->thematic change was so common and repeated so often in so many languages. Or is the statement that it's inherited from a root rather than a particular verbal form enough to work around this issue? Benwing2 (talk) 05:45, 20 January 2016 (UTC)

*plew- is a root, and roots have no descendants, only derived terms. So it can't be inherited from it. —CodeCat 15:29, 20 January 2016 (UTC)
It is, however, inherited from the root present *pléw-, which is derived from *plew-. —JohnC5 15:38, 20 January 2016 (UTC)
We had separate entries for verb stems for a while, but I removed them again because they caused some problems and a few people had requested this long ago. One issue is that etymologies (ours or others') generally don't distinguish between the root and its stems, so knowing what goes where is hard. Another difficulty is that the PIE verb system was structured fundamentally differently from what the descendants have. PIE had different aspect stems, and these could be considered separate verbs in their own right. In the descendants, however, these were generally unified into one paradigm, and missing members were created anew while duplicates were trimmed. For example, every Germanic strong verb has a form that is descended from the PIE stative, but that doesn't mean that a stative form of that verb existed in PIE. Not all PIE verbs might have had an imperfective aspect stem either, but pretty much all descendants formed one at some point. Then there is the question of the athematic-thematic distinction, which was pretty much completely eliminated in many of the descendants (Slavic, Italic, Germanic). All in all, it's very difficult to say "PIE verb X has descendant verb Y as a descendant" when Y can actually be an amalgamation of several PIE verbs, including any that were only formed post-PIE. —CodeCat 16:02, 20 January 2016 (UTC)
I vaguely remember a discussion from which it followed that upholding this distinction was not even supported by consensus. But I am not sure. From what I can see, the distinction between {{inh}} and {{der}} was installed without discussion and consensus. I fear it is going to create a lot of pain down the road. --Dan Polansky (talk) 13:17, 23 January 2016 (UTC)

Fixing cite- and quote- templates

Recently, @Smuconlaw has changed the behavior of {{cite-book}} and {{cite-web}} (which previously behaved like/redirected to {{quote-book}} and {{quote-web}}) to actually be proper citation templates. While this is a good change, it gives us a big problem - the cite- templates were widely used to include quotations in entries, and suddenly the formatting here has become completely messed up (look at Citations:rest on one's laurels, Citations:macaroni and gravy or Citations:twelve-ounce curls). Would it be possible for someone with a bot or AWB to go through entries and convert "{{cite" to "{{quote" in the following specific circumstances:

  1. Any usage in the Citations: space.
  2. Any usage in the mainspace immediately preceded by "#*" (or "#* " - in fact, any usage of a cite-template preceded by an asterisk is probably an error)
  3. Any usage under a ====Quotations==== header

That should fix the formatting issues messing up the references (although if anyone can see a potential for false positives, please point it out). Smurrayinchester (talk) 08:53, 20 January 2016 (UTC)
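The three circumstances listed above could be sketched roughly as follows. This is only an illustration in Python, not an actual bot or AWB configuration; `convert_cite_to_quote` and its arguments are hypothetical. It assumes the page title and raw wikitext are already in hand, touches only templates whose names start with `cite-` (so calls made through a wrapper template are left alone), and does not check whether a date or year parameter is present.

```python
import re

# Matches the opening of any template whose name begins with "cite-".
CITE = re.compile(r"\{\{\s*cite-")
# Matches a wikitext header line such as ====Quotations====.
HEADER = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def convert_cite_to_quote(title: str, wikitext: str) -> str:
    """Rewrite {{cite-...}} to {{quote-...}} where it is used as a quotation."""
    # Case 1: every usage in the Citations: namespace.
    if title.startswith("Citations:"):
        return CITE.sub("{{quote-", wikitext)

    out = []
    in_quotations = False
    for line in wikitext.split("\n"):
        header = HEADER.match(line)
        if header:
            # Case 3: track whether we are under a Quotations header.
            in_quotations = header.group(2) == "Quotations"
        # Case 2: quotation lines under a definition start with "#*".
        if in_quotations or line.lstrip().startswith("#*"):
            line = CITE.sub("{{quote-", line)
        out.append(line)
    return "\n".join(out)
```

Uses inside reference tags on an ordinary definition line are left unchanged by this sketch, since they match none of the three cases.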

(And vice versa, any quote- templates that appear within <ref></ref> tags should be cite-, presumably) Smurrayinchester (talk) 08:53, 20 January 2016 (UTC)
I've spotted two problems with this idea. a) Some people call "cite-book" through the {{cite}} template. This wouldn't necessarily be unfixable, but we'd need to create an equivalent {{quote}} (currently a redirect to {{blockquote}}). b) There are a few citations pages where citation templates are used without a date= or year= field (eg Citations:Brown bounce), which causes problems if you try to convert citation quotes directly into quotation ones. Smurrayinchester (talk) 09:25, 20 January 2016 (UTC)
Yes, I realized there were going to be some transitional issues but figured the short-term pain was worth the long-term gain. If these could be fixed by bot that would be great. By the way, {{cite}} works (I updated it) but links to the citation, not the quotation, templates. Smuconlaw (talk) 10:15, 20 January 2016 (UTC)

How to mark long vowels in Germanic variants

I think this would apply to Middle Low German/Dutch/Frisian, and some forms of German. I'm making Middle Low German templates and Low German research traditionally has two different systems of marking long vowels. I'd like to have consistency amongst the languages, so I thought I'd informally ask which one to use.

  • System one: Lengthened short vowels are marked by a macron: hö̂gede (heighth) vs. hȫgede (joy)
  • System two: Lengthened short vowels are unmarked: hö̂gede (heighth) vs. högede (joy)

Lengthened vowels can only ever occur in open syllables, so system 1 is superfluous but explicit. Asking @CodeCat as DUM person especially. Korn [kʰʊ̃ːæ̯̃n] (talk) 13:50, 20 January 2016 (UTC)

Vowel length is not marked for Middle Dutch, as it can be (generally) deduced from the spelling. However, we do use macrons and circumflexes to distinguish between different long vowels that were spelled identically. See WT:ADUM for more. —CodeCat 15:31, 20 January 2016 (UTC)
FWIW, I find it hard to distinguish ö̂ from ȫ at the font size most text on this site is displayed at. - -sche (discuss) 19:06, 24 January 2016 (UTC)

Translations vote

Please vote on Wiktionary:Votes/pl-2015-12/Translations.

Reason: Only 5 people voted so far, and it's going to end in a few days.

  • Current results: 3-2-0
  • End date: January 27

The 2 opposers raised a good point concerning translation tables in Translingual sections, so I proposed to amend the vote based on that point. I believe this vote could pass, with that proposed modification. Just see the support vote by @Andrew Sheedy. Thanks. --Daniel Carrero (talk) 18:11, 22 January 2016 (UTC)

Wikidata & GLAM 'down under'

In February, I'm undertaking a three-week tour of Australia, giving talks about Wikidata, and Wikimedia's GLAM collaborations. Do join us if you can, and please invite your Wikimedia, OpenData, GLAM or OpenStreetMap contacts in Australia to come along. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:42, 22 January 2016 (UTC)

Deprecating term template

A vote to let bots replace all uses of {{term}} with {{m}} has passed.

I propose to deprecate {{term}} by adding an abuse filter to the AbuseFilter extension that prevents saving of pages that contain the template. Thus, we can standardize on {{m}} while keeping the huge volume of page revisions that use {{term}} legible.

--Dan Polansky (talk) 08:24, 23 January 2016 (UTC)
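The check such a filter would perform can be illustrated with a small sketch. This is plain Python, not AbuseFilter's own rule language, and `uses_template` and `blocked_templates` are hypothetical names. It assumes a deprecated template counts as used whenever its name appears directly after the opening braces and is followed by a pipe or closing brace.

```python
import re
from typing import Iterable, List


def uses_template(wikitext: str, name: str) -> bool:
    """Return True if the wikitext calls the template with exactly this name."""
    pattern = r"\{\{\s*" + re.escape(name) + r"\s*[|}]"
    return re.search(pattern, wikitext) is not None


def blocked_templates(wikitext: str, deprecated: Iterable[str]) -> List[str]:
    """List the deprecated templates found in the text about to be saved."""
    return [name for name in deprecated if uses_template(wikitext, name)]
```

A save would be refused when `blocked_templates` returns a non-empty list. Old revisions are unaffected, because the check only runs on text being saved.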

I'm not opposed to this, but concerned this may establish a precedent preventing deletion of obsolete templates in the future. Let's emphasize that doing this doesn't establish such a precedent. Benwing2 (talk) 12:08, 23 January 2016 (UTC)
If we can agree to give it a try with {{term}} and {{m}}, voting on deletion of templates in RFDO should still proceed as before, and those who want to delete other templates should feel free to vote in RFDO according to their best conscience, which may be "Delete". If the template code of {{term}} proves to block improvement of templates or modules on which {{term}} depends, then {{term}} code should be edited to no longer use other templates or modules and provide some minimal legibility, without necessarily having the full original function. --Dan Polansky (talk) 21:26, 23 January 2016 (UTC)
Would this be a problem for more inexperienced users who edit pages with the template and then can't figure out why the page won't save? That template is used on a lot of pages. Once the bots get most of them, it should be fine, though. Andrew Sheedy (talk) 03:07, 24 January 2016 (UTC)
Shouldn't this be at WT:RFDO? There's been a vote to replace {{term}}, but not to delete it. I don't mind having two templates do the same thing; with Lua it should be possible to have them both work in exactly the same way apart from the language statement. They could just both call the same module so any edit to one would be an edit to both. Renard Migrant (talk) 18:05, 25 January 2016 (UTC)
In fact they already use the same Lua code internally, except for a "compat" flag that specifies how the arguments work. Benwing2 (talk) 18:29, 25 January 2016 (UTC)

Definitions vote -- Rationale and changes


Rationale and changes:

  • Removing "The definitions are the most fundamental piece of dictionary", it's a comment rather than a rule.
  • Removing "[definitions] do not have their own header", no need to say what they don't have. Arguably, the POS header is their header.
  • Expanding upon the idea that "Each definition may be treated as a sentence: beginning with a capital letter and ending with a full stop.", mentioning other types of definitions: "In language sections other than English, the definition generally consists of a simple translation into English, rather than a full definition."
  • Mentioning: "Sometimes, they are grouped into subsenses."
  • Writing out the actual formatting rules of Wiktionary:Votes/2006-12/form-of style and Wiktionary:Votes/2010-08/Italicizing use-with-mention, rather than just linking to them.
  • Removing "The “definitions” of entries that are abbreviations should be the expanded forms of the abbreviations." Sometimes, the expanded abbreviation is in the etymology section, not in the definition.
  • Removing "Where there is more than one expansion of the abbreviation, ideally these should be listed alphabetically to prevent the expanded forms being duplicated.", does not seem common practice.
  • Compressing the explanation of where to link the expanded forms in a single paragraph; arguably, that information does not need its own subsection.
  • In particular, replacing "Expanded forms that are encyclopedic entries should also be wikified and linked to the appropriate Wikipedia entry." by "Otherwise, if appropriate, link it to the appropriate Wikipedia article, if it exists." Arguably, existence in Wikipedia is a more objective criterion than whether an entry is "encyclopedic".
  • Mentioning three abbreviation examples (PC, USA, SNAFU), together in the same line. The original text had two examples (PC and SNAFU) in separate lines.
  • Removing bold formatting from "a definition which only applies in a restricted context"; arguably, it's unnecessary.
  • Compressing the explanation of context labels in a single paragraph; arguably, that information does not need its own subsection. In particular, "Details in Wiktionary:Context labels." does not need to be in a separate line.
  • Adding an example of non-lemma definition, properly formatted: "plural of word".
  • Removing three separate references to the same vote (Wiktionary:Votes/pl-2009-03/Context labels in ELE v2) in consecutive paragraphs.
  • Reordering some of the ideas. Original order: introduction, form-of definitions, abbreviations, context labels. Proposed order: introduction, form-of definitions, context labels and abbreviations.
  • Using {{lb}} rather than {{context}}, as approved at Wiktionary:Votes/2015-11/term → m; context → label; usex → ux.
  • Replacing "wikify" by "link"; "wikified" by "linked".
  • Mentioning the fact that some entries are romanizations linking back to the main entries. The requirement that each romanization entry have at least one definition line was voted at Wiktionary:Votes/pl-2013-03/Romanization and definition line.
  • Making sure another WT:EL section is voted, a step in the direction of having WT:EL completely voted.

--Daniel Carrero (talk) 11:00, 23 January 2016 (UTC)

References vote -- Rationale and changes


Rationale and changes:

  • Adding the rule: "References are listed using bullet points".
  • Adding 1 more usage example + the result of the usage examples.
  • Formatting the usage examples with bullet points, showing actual usage.
  • Removing "There is a need to balance respect for copyrights with definitions so inventive as to be inaccurate." For semantics, we go by attestation.
  • Removing "The validity of the dictionary has a profound effect on its usefulness." It's a comment rather than a rule.
  • Minor change of punctuation and word order.
  • Making sure another WT:EL section is voted, a step in the direction of having the WT:EL completely voted.
  • Disclaimer: The References section probably could be expanded with more information. This is proposed as an improvement to the current text, not as the "final" version of it.

--Daniel Carrero (talk) 11:06, 23 January 2016 (UTC)

EL introduction vote -- Rationale and changes


Proposed introduction:
"This is a list of aspects that govern how an entry should be formatted. This includes what are allowed sections and what are the contents expected to be found in them. These rules reflect what editors think as best concerning the standard format of an entry."

Rationale and changes:

  • Quickly states what WT:EL is and what it does, for those unacquainted with the policy.
  • The first sentence was based on WT:NORM's "This is a list of aspects that govern how the wiki code behind an entry should be formatted."
  • The second sentence is a generic, all-encompassing statement but it also suggests that we have some standards concerning specific allowable headers and contents.
  • The third sentence was based on WT:NORM's "[...] they do make the pages conform more to a standard format reflecting what we think of as best for the wiki code."

--Daniel Carrero (talk) 11:21, 23 January 2016 (UTC)

German imperatives

I brought this up in About German, but didn't get much of a reply. German has developed a First Person Plural and a Third Person Plural imperative, both of which function identically to the old Second Person Singular and Second Person Plural imperatives but must be used with the personal pronouns. I would like to incorporate these into the templates. Korn [kʰʊ̃ːæ̯̃n] (talk) 10:20, 24 January 2016 (UTC)
ps.: The Third Person Plural imperative comes from the usage of 3rd rather than the 2nd Person as the polite form in German. Korn [kʰʊ̃ːæ̯̃n] (talk) 10:22, 24 January 2016 (UTC)

@Korn: Care to provide any examples? --kc_kennylau (talk) 17:02, 24 January 2016 (UTC)

All four examples mean "stop it!" Wir and Sie are the personal pronouns.

  • Lass das! – 2nd sg.
  • Lassen wir das! – 1st pl.
  • Lasst das! – 2nd pl.
  • Lassen Sie das! – 3rd pl. Korn [kʰʊ̃ːæ̯̃n] (talk) 17:45, 24 January 2016 (UTC)
Isn't Lassen Sie das second person plural? It's definitely used to make requests, but I don't know whether it rises to the level of an imperative. Smurrayinchester (talk) 19:22, 24 January 2016 (UTC)
It's second person (singular and plural) imperative in function, but third person plural present subjunctive in form. —Aɴɢʀ (talk) 21:28, 24 January 2016 (UTC)
What I would ask is whether we should describe forms by function or by form. In most cases, we describe things by form, and don't list the nuances in function that a form may have. This is something for a grammar, not a dictionary. For example, in Finnish verbs, the passive/impersonal form is used colloquially as a first-person plural. In French, the impersonal 3rd person singular is used in a similar way. But these aspects are not found in our inflection tables. —CodeCat 19:31, 24 January 2016 (UTC)
In Spanish we do. Should the Spanish inflection templates be simplified? DTLHS (talk) 21:30, 24 January 2016 (UTC)
I can't really see what you mean, CodeCat. All plural imperatives - including the one inherited from Proto-Germanic - are identical with their indicative present forms. That doesn't mean they're not imperatives. They're clearly not indicative statements and they're not an optative subjunctive present, as subjunctive present is a concept which doesn't exist in most registers of Modern German, other than maybe some consciously archaic speech or legal texts. And given the very comprehensive declension tables we provide for words, I don't see how "that's grammar" would be a reason to only list half of a set of forms. Especially since we already usually provide as much of grammatical information pertaining to a word as we can in both inflection tables and usage notes, and I very much want Wiktionary to be as much grammar as can be. I was about to say that this should be read with the restriction that the grammar has to pertain exclusively to the given entry and not be general rules, but every inflection table for a verb of a regular verb class is already 'general grammar' and nothing that needs to be listed for that word. Yet we do it, because information is good. After all, the Wiki credo is "text is cheap" and I don't see a reason to not give any information available on a word in an online wordbook. Korn [kʰʊ̃ːæ̯̃n] (talk) 11:45, 25 January 2016 (UTC)

Are the names of languages proper nouns?

previous discussion: Wiktionary:Beer_parlour/2015/February#Languages_-_are_they_proper_nouns_or_not.3F

We seem to define the English names for languages as proper nouns, but I'm pretty sure they are all just uncountable common nouns. They certainly are in Italian and French. How do other dictionaries define them? SemperBlotto (talk) 07:11, 25 January 2016 (UTC)

Names of languages and ethnicities are common nouns and written in lower case in the Russian Wiktionary. I would check others but I doubt there will be any change in policies here. --Anatoli T. (обсудить/вклад) 07:25, 25 January 2016 (UTC)
I imagine this may have to vary from language to language in enwikt. In some languages (e.g. Turkish, and Arabic also to a fair extent), there are grammatical or phonological tests that help determine whether something is a proper noun or not. In other languages (like English), it's more conventional, and tends to be based on whether a noun is capitalized. E.g. English December is capitalized and considered a proper noun, Russian февраль is lowercase and not considered a proper noun, Arabic شُبَاط has no case but is considered a proper noun because it satisfies certain morphological and syntactic tests that distinguish it as proper. Many (most?) other dictionaries don't distinguish proper nouns from common nouns but just classify both as nouns. Many editors want to eliminate proper nouns as a separate class but there's no consensus on this, I don't think. Benwing2 (talk) 18:20, 25 January 2016 (UTC)
This has been discussed a couple of times, once briefly at Wiktionary:Information desk/2015/August#Modern_Greek_.26_PoS and before that at greater length at Wiktionary:Beer parlour/2015/February#Languages_-_are_they_proper_nouns_or_not.3F. As I noted in the latter discussion, few (or no?) other major dictionaries distinguish proper from common nouns, and few authorities give useful guidance on the matter: most authorities (both old and new) just say, sometimes in these exact words, "Capitalize proper nouns and words derived from them; do not capitalize common nouns", which is obviously inaccurate — tell it to the Marines, the Americans, the Englishmen and other capitalized common nouns (and to bell hooks and other uncapitalized proper nouns). (I am sympathetic to the idea that we should reduce the prominence of the distinction between proper and common nouns, e.g. by using a label on the headword line rather than different POS headers. However, distinguishing proper from common nouns does seem to be a useful distinction, and we potentially gain readers by being among the few dictionaries to make it.) I think languages are proper nouns in English, but may be common nouns in other languages. - -sche (discuss) 20:14, 25 January 2016 (UTC)
Well, there's only one Italian language, one French language (and so on) unlike grain or rye which are uncountable (yes they're both countable in some senses, but you get the point). Renard Migrant (talk) 21:46, 25 January 2016 (UTC)
  • Languages are common nouns, even if they're uncountable and spelt with a capital letter in English. In Danish, Norwegian and Swedish no capital letter is used for languages. I think Wiktionary may be out on a limb with its treatment. Donnanz (talk) 21:56, 25 January 2016 (UTC)
As with many proper nouns, they can be forced to be plural (eg, "Indian English is just one of the Englishes that need to be included."). They can be used uncountably more readily than most proper names and accept a wider range of adjective modifiers. But almost all proper nouns can be forced to be used uncountably ("There was too much Boston in his speech"), usually metonymically (eg, "Boston accent") and accept some adjectives ("historic old Boston").
Orthography (initial capital), semantics (the individuality of the named referent), and syntax (no indefinite article, limited acceptance of modifying adjectives) together seem to provide sufficient evidence that, in English, the names of languages are proper names, even when MWEs (eg, Old English). DCDuring TALK 01:17, 26 January 2016 (UTC)
In Hindi, language names take on gender and can (at least theoretically) be inflected in the plural. However, they still use the proper noun header on Wiktionary... I can't say much about English - my understanding of the nuances of English linguistics is lacking. —Aryamanarora (मुझसे बात करो) 02:12, 26 January 2016 (UTC)
In English, the names of languages are always proper nouns when referring to particular languages, and adjectives derived from those names are proper adjectives. Every grammar and dictionary I can find is in agreement that a noun designating a particular person, place, or thing, usually without requiring an article or other limiting modifier, is a proper noun. This may not be the rule in other languages, but it certainly is in English. Note, however, that certain words and phrases derived from proper nouns are treated as common nouns, and not usually capitalized. So, English, French, and Arabic are proper nouns when referring to languages; but one may have french fries, gum arabic, anglicized words, japanned leather. Capitalization varies with common nouns derived from proper nouns; one occasionally sees French fries or Brussels sprouts, but usually English muffins and Belgian waffles; this is a matter of style. P Aculeius (talk) 03:53, 26 January 2016 (UTC)
  • The question keeps popping up, so maybe it should be put to the vote to decide once and for all. Donnanz (talk) 16:32, 26 January 2016 (UTC)
What are you proposing to vote on? Whether the names of languages are always proper nouns, wherever they occur, in all languages? Or just whether they're proper nouns in English, which can be either capitalized or not, at the writer's discretion, when used to form a common noun, such as English muffin, french fries, or a danish? Can you point to some widely-used English language dictionaries, grammars, or style books that define "proper noun" in such a way as to exclude the names of languages? Every source I can find says that a noun referring to a particular person, place, or thing (usually without an article or limiting modifier), is a proper noun. So Polyhymnia, Ithaca, and Greek are all proper nouns, while muse, island, and tongue are common nouns. When referring to a common noun formed from a proper noun, capitalization is up to the writer; so Greek or greek salad; and of course one may use a common noun as a name: O Muse...; where is the unguent, Mother? in which case it becomes a proper noun in that instance. Before we put something that seems settled up for a vote, it needs to be clear just what's being voted on, and there ought to be at least some authority for disputing the issue. P Aculeius (talk) 17:31, 26 January 2016 (UTC)
I was thinking primarily of voting on our treatment in English, but other languages should be taken into consideration. In some languages gender is used, so they can automatically be regarded as common nouns, in many others no capital letter is used, so they can also be regarded as common nouns. Actually with the present treatment as both a proper noun and common noun the entry (for English) is, put bluntly, a mess and quite confusing. And English, French and German can also be surnames (which are undeniably proper nouns, recognised in the entry for English at least). Having said that, I disagree with much of your philosophy, and am siding with SemperBlotto. Donnanz (talk) 19:29, 26 January 2016 (UTC)
  • The fact that most languages have an adjective with the same spelling, in English at least, appears to have not been mentioned (unless I missed it), and of course adjectives are never classed as "proper", e.g. Filipino cuisine. Of course, there are capitalised adjectives which aren't a language also. So this makes the treatment of languages as proper nouns even more illogical.
Going off-topic slightly I notice that all months have been treated as proper nouns, while days are not. I'm not sure what the reasoning behind that is. Incidentally, days have also been classed as adverbs, which is an American trait, not normally done in British English. Donnanz (talk) 22:02, 8 February 2016 (UTC)

Making Byzantine Greek an etymology-only language

I propose we make Byzantine Greek an etymology-only variant of Ancient Greek (much as Medieval Latin is an etymology-only variant of Latin) for the following reasons:

  1. In effect, it already is one (Category:Byzantine Greek lemmas and Category:Byzantine Greek non-lemma forms are both empty, but Category:Terms derived from Byzantine Greek has 53 subcategories containing over 100 entries)
  2. It has no separate ISO 639-3 code (our code gkm was proposed in 2006 but has never been accepted)
  3. We can still have it as a lect of Ancient Greek, i.e. we can tag any specifically Byzantine words or senses with {{lb|grc|Byzantine}} and put them in Category:Byzantine Greek (or would it automatically be called Category:Byzantine Ancient Greek, and if so, can we override that?).

What do others think? —Aɴɢʀ (talk) 11:56, 25 January 2016 (UTC)

I agree. --Vahag (talk) 13:54, 25 January 2016 (UTC)
I note that the automatic Ancient Greek IPAs have included the Byzantine pronunciation for as long as I remember. — Ungoliant (falai) 14:36, 25 January 2016 (UTC)
If I remember correctly, we used to just treat it as grc, then we split off gkm. This would be going back to the previous status quo, but with an etymology-only code. Chuck Entz (talk) 14:49, 25 January 2016 (UTC)
Yeah. As far as I can tell, it's always been treated as an etymology-only language, but because it had a (quasi-)ISO code, it was grouped in with "full" languages early on in Wiktionary's history (as was Cajun French, frc). I proposed to update things to reflect its etymology-only-ness in 2013, but the discussion didn't result in any change. I have no strong feeling about whether Byzantine merits separate L2s etc, but since it's always been de facto subsumed under Ancient Greek here, it makes sense to make it so de jure. - -sche (discuss) 19:51, 25 January 2016 (UTC)
  • Support. It definitely belongs under the grc L2. —Μετάknowledgediscuss/deeds 01:43, 26 January 2016 (UTC)
  • There is already Medieval Greek as an etymology-only language. There's no difference between Medieval Greek and Byzantine Greek, is there? Shouldn't the two be merged into a single term? And Module:languages/data3/g even calls Medieval Greek another name for Byzantine Greek. —Aɴɢʀ (talk) 14:55, 27 January 2016 (UTC)
  •   Done. Last time I refreshed CAT:Pages with module errors there weren't any more caused by this change, but maybe some more will pop up over the next few days. —Aɴɢʀ (talk) 19:53, 31 January 2016 (UTC)
    • Found a couple more after hard purging. —JohnC5 20:12, 31 January 2016 (UTC)
    • @Angr Actually, there are a bunch more now. —JohnC5 20:38, 31 January 2016 (UTC)
    • Why wasn't the code turned into an etymology-only code instead? —CodeCat 00:52, 1 February 2016 (UTC)
      • It was, but it still creates a module error when an entry says "{{etyl|gkm|en}} {{m|gkm|φόοβαρ}}" instead of "{{der|en|gkm|φόοβαρ}}". —Aɴɢʀ (talk) 06:53, 1 February 2016 (UTC)
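For anyone cleaning up the remaining module errors, the rewrite described above (an {{etyl}} call followed by an {{m}} call with the same source code, collapsed into a single {{der}}) could be sketched roughly as follows. This is only an illustration in Python, not the code any bot here actually runs: the function name `etyl_to_der` and the regular expression are assumptions, and the sketch deliberately leaves alone any {{m}} call that carries extra parameters or a different language code.

```python
import re

# Matches "{{etyl|SRC|DST}} {{m|LANG|TERM}}" where {{m}} has exactly two parameters.
# Hypothetical pattern for illustration; real wikitext has many more variants.
ETYL_M = re.compile(
    r"\{\{etyl\|(?P<src>[^|{}]+)\|(?P<dst>[^|{}]+)\}\}\s*"
    r"\{\{m\|(?P<lang>[^|{}]+)\|(?P<term>[^|{}]+)\}\}"
)


def etyl_to_der(wikitext: str) -> str:
    """Collapse simple {{etyl}} + {{m}} pairs into one {{der}} call."""

    def repl(match: "re.Match[str]") -> str:
        # Only merge when both templates name the same source language.
        if match.group("src") != match.group("lang"):
            return match.group(0)
        return "{{der|%s|%s|%s}}" % (
            match.group("dst"),
            match.group("src"),
            match.group("term"),
        )

    return ETYL_M.sub(repl, wikitext)
```

Calling `etyl_to_der` on the example quoted above turns "{{etyl|gkm|en}} {{m|gkm|φόοβαρ}}" into "{{der|en|gkm|φόοβαρ}}"; anything more complicated is returned unchanged and would still need a manual look.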

Rename instances of Template:term lacking a language to Template:termwithoutlang

Is it ok for me to run a bot to rename all instances where {{term}} lacks a language to {{termwithoutlang}}? This template would exist only temporarily, and is of course not supposed to be added to entries by editors. But it would help with the current effort to orphan {{term}}, because it separates instances that a bot can fix from those that can't be fixed with a bot. —CodeCat 15:26, 25 January 2016 (UTC)

How about we create a fake language code "?" which corresponds to the language "Unknown", so that in all templates which take a language parameter and categorize there would be a category for "blah in Unknown" and we could easily find and work on such problems. - TheDaveRoss 15:31, 25 January 2016 (UTC)
This wouldn't achieve the effect I'm hoping for. —CodeCat 15:47, 25 January 2016 (UTC)
Well, assuming that {{m}} could accept the language code, you will still be able to orphan {{term}}, what other things are you trying to achieve? - TheDaveRoss 17:41, 25 January 2016 (UTC)
{{termwithoutlang}} seems ugly to me, maybe it should be {{term-nolang}}? Benwing2 (talk) 01:40, 26 January 2016 (UTC)
Making it one word would make it easy for editors to doubleclick it and delete the whole thing in one go. Adding a hyphen makes it much more troublesome. —CodeCat 01:50, 26 January 2016 (UTC)
Well then {{termnolang}}. Still looks ugly though. What is the issue with Dave's suggestion? It would be even easier to manually correct that by just changing the ? to the right code. Benwing2 (talk) 01:55, 26 January 2016 (UTC)
If MewBot succeeds in converting all instances of {{term}} with lang into {{m}}, then logically all instances of {{term}} would be without lang and thus would be equal to {{termwithoutlang}}. Re: "because it separates instances that a bot can fix from those that can't be fixed with a bot", all instances of {{term}} would be the latter. (except if/when people keep adding {{term}} to entries, of course) --Daniel Carrero (talk) 02:06, 26 January 2016 (UTC)
It's not that easy. Some instances of {{term}} occur inside other templates, and may not appear on the template page itself (so that they don't show up as transclusions). My reason for this request was to make it easier to separate pages that need a language from pages that have a language but are still transcluding {{term}} for some reason. However, I've now made a change to Module:term cleanup instead that adds a tracking template if there is a language. So the proposal isn't needed in the short term. It may still be valuable, though, to make it visually obvious to editors viewing the wikitext that the template needs to be replaced, or at least not added to new entries. Perhaps something like "termwithoutlangpleasereplace" would definitely stand out, and draw the attention of editors who happen to edit the page for other reasons. —CodeCat 03:16, 26 January 2016 (UTC)
I would suggest something like "termtemp". We don't really need to explain why it's being used (except in its documentation), but we do want to make clear that it's not permanent so that people don't copy its use in other entries. Chuck Entz (talk) 03:04, 26 January 2016 (UTC)
@CodeCat, what is the effect you're looking for? Renard Migrant (talk) 11:44, 26 January 2016 (UTC)
  • {{term/t}} isn't doing anything these days; why not temporarily resurrect it for current purposes? —Aɴɢʀ (talk) 14:57, 27 January 2016 (UTC)
  • Oppose. The purpose of this would be to delete {{term}}, which I oppose. If the objective is to minimize the number of instances of {{term}} in the mainspace, renaming {{term}} to {{m}} while passing {{m}} some artificial lang code standing for "language missing" would be an okay solution, IMHO. --Dan Polansky (talk) 15:12, 31 January 2016 (UTC)
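To make the distinction in this thread concrete (uses of {{term}} that a bot can convert because they state a language, versus uses that need a human because they don't), here is a small Python sketch. It is an assumption-laden illustration rather than MewBot's actual logic: it assumes the language is given as a named lang= parameter, it only looks at {{term}} calls with no nested templates, and the names `has_language` and `split_term_uses` are made up for the example.

```python
import re
from typing import List, Tuple

# Matches a {{term|...}} call that contains no nested braces (simplifying assumption).
TERM = re.compile(r"\{\{term\|([^{}]*)\}\}")


def has_language(params: List[str]) -> bool:
    """Return True if some parameter is a non-empty lang=... (assumed parameter name)."""
    for param in params:
        name, sep, value = param.partition("=")
        if sep and name.strip() == "lang" and value.strip():
            return True
    return False


def split_term_uses(wikitext: str) -> Tuple[List[str], List[str]]:
    """Split {{term}} calls into (bot-fixable with a language, needing manual attention)."""
    with_lang: List[str] = []
    without_lang: List[str] = []
    for match in TERM.finditer(wikitext):
        params = match.group(1).split("|")
        target = with_lang if has_language(params) else without_lang
        target.append(match.group(0))
    return with_lang, without_lang
```

As noted above, uses of {{term}} hidden inside other templates would not be visible to a scan of the page wikitext like this one, which is part of why a tracking approach in the module is more reliable.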

Amateur Altaicists

I have seen a slew of people add several ridiculous etymological theories. One person was trying to link Etruscans to Turkic, another one was linking random stuff like cannabis and other IE terms to more Turkic stuff.

How do we deal with them? Keep shooting ridiculous claims on sight? Hillcrest98 (talk) 23:45, 25 January 2016 (UTC)

As long as we're sure they're ridiculous. It's fairly widely assumed that cannabis is a loanword in Proto-Indo-European, but there's no consensus on where it came from. The main problem I had with Horsesongrassland's cannabis-related edits was that the etymology already said that it probably came from somewhere else and the references duplicated that, but were incompatible with the rest of the etymology and cited really poor, strongly POV sources. Some of the details were clearly nonsense, but the general idea of some kind of Altaic origin can't be absolutely proven wrong, because there doesn't seem to be enough evidence to prove anything with regard to Altaic- including whether it exists or not. Chuck Entz (talk) 03:27, 26 January 2016 (UTC)

Officializing automated romanizations

Wiktionary:Votes/pl-2015-12/Translations is probably going to fail. I have the intention of creating a new vote with the same proposal, but improved/fixed based on the multiple points raised by the opposers.

One of the points is: "switching Russian: {{t|ru|апельсин|m|tr=apelʹsín}} to Russian: {{t|ru|апельси́н|m}} is a topic for a separate vote, deserving its own discussion". WT:EL#Translations currently uses some Russian examples with manual romanizations (tr= parameter). But Russian can do it automatically, so as requested in Wiktionary talk:Votes/pl-2015-12/Translations#Transliteration, in my proposed change, I want to use Russian examples without the tr= parameter, which implies that automatic romanizations are official policy. Is there any problem or controversy here?

I have no problem creating a separate vote officially "Allowing automatic romanizations", if that's what people want. --Daniel Carrero (talk) 12:25, 26 January 2016 (UTC)

By the current convention, not ALL Russian words are automatically transliterated but over 95%. Only languages under "override_translit" in Module:links have automatic transliteration overriding manual. So, Russian is a bad example. --Anatoli T. (обсудить/вклад) 12:31, 26 January 2016 (UTC)
Current rule in WT:EL#Translations:
  • Do add a transliteration or romanization of a translation into a language that does not use the Roman alphabet. Note however that only widespread romanization systems may be used. See Wiktionary:Transliteration.
My proposed change, as per Wiktionary:Votes/pl-2015-12/Translations:
If there's some controversy, I could further edit the sentence this way:
  • You can add a transliteration or romanization of a translation into a language that does not use the Latin script. In some languages, the romanization can be supplied automatically by the software, but there's no consensus as of yet concerning the acceptability of automatic romanizations and exactly what languages should use them. See Wiktionary:Transliteration and romanization.
Looks good? Of course, if there's consensus in favour of generally having automatic romanizations (I'd vote support on that), then that last change would be unnecessary. --Daniel Carrero (talk) 16:19, 26 January 2016 (UTC)
I'm not clear as to what you're actually proposing. Or are you not proposing anything yet? Renard Migrant (talk) 18:37, 26 January 2016 (UTC)
Proposal 1: For some languages, allowing automatic romanizations.
Proposal 2: In some WT:EL examples of wiki markup of Russian translations in the translation tables, using automatic romanizations.
Reason: I assumed that was a given (i.e., that people generally are supportive of automatic romanizations and that it would be okay mentioning one or two examples in WT:ELE using them), but in Wiktionary:Votes/pl-2015-12/Translations#Oppose, @Dan Polansky complained about it. --Daniel Carrero (talk) 19:22, 26 January 2016 (UTC)
OK, I'm a bit confused about what is specified as current policy and what is being proposed, but I actually wrote and ran a bot to automatically convert {{t|ru|апельсин|m|tr=apelʹsín}} to {{t|ru|апельси́н|m}}, and no one has complained about it; in fact, the main Russian editors here were happy with the results. Actual current policy doesn't agree much at all with the rule that Daniel quoted above. In particular:
  1. "do add a transliteration or romanization" isn't really right. It should ideally only be added when automatic transliteration either doesn't exist for a language or would be wrong. In particular, writing апельсин without a stress mark and then including manual translit apelʹsín with a stress mark is wrong; instead, апельси́н should be written and the auto-translit allowed to work. As Anatoli mentioned, most of the time (I would say over 99%) the automatic transliteration for Russian is correct (provided of course that stress marks are added to the Russian). Pretty much the only time when manual translit is needed for Russian is in cases like тест (tɛst), where the auto-translit would be test. For other languages, it may be needed more often; e.g. for Arabic, it's often needed to specify how a tāʾ marbūṭa should be transliterated (as t or as nothing).
  2. "only widespread romanization systems may be used" gives far too much latitude. This kind of attitude created a huge mess in the Arabic transliterations in translation entries, which took a lot of work on my part to fix (and may have gotten messed up again in more recent entries). Properly, the transliterations must follow the particular translit system used by Wiktionary for that language.
I would propose something like:
  • Add a transliteration or romanization of a translation into a language that does not use the Latin script, except for those languages where the romanization is supplied automatically by the software (but do add a transliteration/romanization if the automatically-provided one is wrong). The transliteration should follow the appropriate Wiktionary-established conventions for the language in question (see Category:Transliteration appendices); do not use any other romanization system.
Benwing2 (talk) 05:37, 27 January 2016 (UTC)
@Benwing2: That sounds great to me. I think these are actually our current "unspoken rules", which you could articulate well. Since @Dan Polansky asked for that specific issue to be voted separately, I don't mind creating a separate vote for it. --Daniel Carrero (talk) 13:33, 27 January 2016 (UTC)
Oh you're trying to get common practice codified somewhere. Excellent. Go for it. Renard Migrant (talk) 15:43, 27 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-01/Automated transliterations. --Daniel Carrero (talk) 03:58, 28 January 2016 (UTC)

Proposed wording:
  • Translations not written in the Latin script should have romanizations. In some cases, the romanization is supplied automatically by the software. Supply the romanization manually if it is not supplied by the software or if the romanization supplied by the software is wrong. The transliteration should follow the appropriate Wiktionary-established conventions for the language in question (see Category:Transliteration policies); do not use any other romanization system.
--Daniel Carrero (talk) 00:45, 30 January 2016 (UTC)
Perhaps the use of manual transliteration for automatically transliterated languages should be mentioned (exceptions as in Korean, Russian, etc.), along with the need to provide word stresses for Cyrillic-based Slavic languages (Serbo-Croatian accents?) and diacritics for Arabic (Hebrew?).
Some languages are in transition and a unified transliteration hasn't been established yet, due to complexities - such as Khmer and Thai. Some transliteration modules are in the process of development or fixing - Lao, maybe Burmese. The Thai module may never work the way other transliteration modules do; it will need phonemic spelling, split by syllables, just like Japanese requires kana readings and PoS info (plus morpheme boundaries in some cases) to determine the correct transliteration. Just commenting. --Anatoli T. (обсудить/вклад) 01:18, 30 January 2016 (UTC)
@Atitarev: IMO, ideally I would want a comprehensive list of transliteration circumstances as you described, but for the moment I'll probably just try to update WT:EL to officially allow transliterations in the first place. --Daniel Carrero (talk) 10:06, 30 January 2016 (UTC)
@Daniel Carrero Sorry, I have been pre-occupied with testing new Thai transliterations and fixes with Russian. Quite busy at work too. There ARE changes currently happening with Thai transliteration methods and headwords and situations with transliterations and requirements with languages are indeed different. Make a list of questions, if you need for transliteration policies/issues and I'll try to answer them. --Anatoli T. (обсудить/вклад) 00:56, 1 February 2016 (UTC)
@Atitarev Thank you very much. :) There's absolutely no need to apologize, language-specific transliteration issues and changes are valuable information to be documented, it's just that WT:EL technically does not even allow automated transliterations. It says: "Add a transliteration or romanization of a translation into a language that does not use the Roman alphabet." If we were to obey that pre-Lua rule, we would have to throw away all transliteration modules, so I'll try to update that rule first, before working on the language-specific issues. (at least that's my plan at the moment) --Daniel Carrero (talk) 01:07, 1 February 2016 (UTC)
I'll just describe briefly how I understand the situation with automated transliterations, not manual transliterations.
Slavic, Cyrillic-based languages should normally use accent marks, especially, Russian, Ukrainian and Belarusian.
Russian: User:Benwing2 kindly converted all Russian translations to have accents when they were present in the manual transliteration. For cases when manual transliterations are required, word stresses are still required. Exceptions requiring manual translit are described, many are now partially automated.
Ukrainian and Belarusian: These don't require manual transliterations, if fully accented Cyrillic forms are provided. There is an unresolved issue with monosyllabic Belarusian words with "ё", fixed in Russian. Manual transliterations shouldn't be simply removed before Cyrillic words get accents.
The above three - currently deciding if we need to use grave accents for the secondary stress. Its usage is inconsistent.
Bulgarian: No manual transliteration is required, if accents are provided. The use of the grave accent is not very clear but normally used with accented vowel "ъ". More info is required on the rules.
Macedonian: No manual transliteration is required. The stress position is predictable but accents should be given for words when they differ from expected.
Serbo-Croatian: (Cyrillic) No transliteration is required. Another nested Roman form should be given to match the Cyrillic. The headwords use accents. (I personally find them problematic but they can be copied from entries if they exist).
Arabic: Automatic transliteration works only with fully vocalised Arabic forms. Loanwords that are pronounced irregularly still need to have accents; manual transliteration is required (or can be provided) for some loanwords, for words with "ة" between words, and for some words with silent letters.
Korean: Automated but words of certain etymologies need manual transliterations.
Manual translit overrides automatic for all the above.
Greek, Armenian, Georgian, Kazakh, Kyrgyz, Tajik, etc. - fully automated. The list can be found in Module:links
Lao, Burmese: - fully automated. The transliteration is complex, sometimes doesn't work.
Khmer: - needs more work, can't officialise yet.
Hindi, Sanskrit, Nepali - almost there, the modules look good but occasional manual translit is required.
Japanese and Thai are special cases.
Japanese: transliteration works with headwords on kana, there are some exceptions and additional parameters are sometimes needed to get a correct transliteration. Can't be used to automatically transliterate Japanese words.
Thai: (new) only works in pronunciation sections. It needs phonetic respellings by syllables. Can't be used to automatically transliterate Thai words.
Feel free to add on transliteration policies. --Anatoli T. (обсудить/вклад) 02:02, 1 February 2016 (UTC)
Special thanks from me to User:Wyang and User:Benwing2 (aka Benwing) for making some complex transliterations happen! --Anatoli T. (обсудить/вклад) 02:14, 1 February 2016 (UTC)
That is an amazing list! Thank you! :) --Daniel Carrero (talk) 02:20, 1 February 2016 (UTC)
OK. With Hindi, there seems to be an agreement to provide nuqta and chandra when they affect pronunciation, even if Hindi speakers normally omit them in writing. (Very similar to Russian writing "е" instead of "ё" but dictionaries use "ё", so does Wiktionary). Some Sanskrit lovers prefer to provide word stresses, even if there is no native method for that (also Hebrew), and add hyphens to show morpheme boundaries. (I personally oppose that but I need to mention it).
Mongolian (Cyrillic): Fully automated, overrides manual translit but it's known that Mongolian Cyrillic is not fully phonetic. Some textbooks and phrasebooks provide a more phonetic transliteration but we don't - no data or editors.--Anatoli T. (обсудить/вклад) 02:30, 1 February 2016 (UTC)
Overall, I would find it hard to set the rules to vote for and I am not sure how I am going to vote. I'd like to officialise the use of "^" to capitalise Korean romanisations (romaja officially capitalises proper nouns). I haven't described all situations, of course, e.g. Tamil, Malayalam, Telugu and Sinhalese don't require manual overrides but Amharic and Tigrinya do (rules for schwa-dropping are not defined and consonant geminations may need to be provided manually). Yiddish words of Hebrew origin are often transliterated and pronounced irregularly. --Anatoli T. (обсудить/вклад) 03:07, 1 February 2016 (UTC)
We don't need to set the rules for each language, we just need to mention that transliterations should appear for non-Latin script languages (with the exception of languages such as Serbo-Croatian, which are exempt as long as an equivalent Latin-script form is supplied), regardless of whether they are manually entered or automatically generated. Then each language can worry about how this happens on its own without policy getting in the way. And even this much should not be part of the Translation Table policy, but of the general transliteration-in-links policy. --WikiTiki89 19:14, 3 February 2016 (UTC)
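As a rough illustration of the rule discussed in this thread (a manual transliteration overrides the automatic one, and a manual one is only needed when the software supplies nothing or supplies something wrong), here is a minimal Python sketch. The function names are hypothetical and are not part of any Wiktionary module, and the sketch does not model per-language exceptions such as Mongolian, where the automatic form is said to override the manual one.

```python
from typing import Optional


def displayed_translit(manual: Optional[str], automatic: Optional[str]) -> Optional[str]:
    """Return the romanization to display: a manual one wins over an automatic one."""
    if manual:
        return manual
    return automatic


def needs_manual(automatic: Optional[str], expected: str) -> bool:
    """A manual romanization is needed if none is generated or the generated one is wrong."""
    return automatic is None or automatic != expected
```

For example, `displayed_translit("kōriyā", "kūriyā")` gives back the manual form, while `displayed_translit(None, "kūriyā")` falls back to the automatic one.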

Literal translations from FL to English in the translation table

WT:EL currently says:

  • Do not give translations back into English of idiomatic translations [in the translation table of the English term]. For example, when translating “bell bottoms” into French as “pattes d’éléphant”, do not follow this with the literal translation back into English of “elephant’s feet”. While this sort of information is undoubtedly interesting, it belongs in the entry for the translation itself.

But I propose changing that: some entries give the literal translation in the translation tables. In kill two birds with one stone, we see that the translations of the idiom into other languages come with literal translations: "to hit two flies with one slap", "to cook two roasts on one fire", "with one shot, two pigeons", etc. I like that, for purposes of language comparison, and it also makes it clearer that the translated idiom has a different literal meaning than the English idiom.

I propose officially allowing the literal translations and adding a parameter such as lit= to {{t}} and {{t+}} if there's no parameter like that available yet. The full syntax might be:

  • Portuguese: {{t|pt|matar dois coelhos com uma cajadada só|lit=to kill two rabbits with only one hit of a staff}}

Some other entries currently giving one or more literal translations back to English:

Thoughts? --Daniel Carrero (talk) 16:50, 26 January 2016 (UTC)

Oppose. The translation table is there just to point readers to an entry. The entry should contain all the information, including the literal translation. --WikiTiki89 16:54, 26 January 2016 (UTC)
But we include genders in translation tables for some reason. Why this exception? —CodeCat 17:02, 26 January 2016 (UTC)
That is a consequence of the history of this project. I don't really think we should include genders, but I'm not going to actively propose such a radical change. --WikiTiki89 18:27, 26 January 2016 (UTC)
Genders are such a small amount of information, like literally one character, I don't mind it. But in general yeah we include too much in translation tables. I've seen people cite bilingual dictionaries using <ref></ref> and it's just so much information for a translation table. Stuff like that goes under ===References=== in the entry itself. Renard Migrant (talk) 18:35, 26 January 2016 (UTC)
I tend to oppose this as well, as it is reduplication of effort and is likely to get out of sync. It is also not the most attractive, and might be confusing for some users. - TheDaveRoss 17:07, 26 January 2016 (UTC)
Is there some danger that users might think that all translations given in a translation table are literal translations? Would a user see that all the tea in China in French is tout l'or du monde and wonder which French word means "tea" and which one means "China"? I don't have actual stats to back it up, but if the answer to these questions is "yes", I'd consider it a point in favor of adding FL-to-English literal translations there. If the answer is "no, the users won't ever make that confusion even without the literal translations", then I suppose it would be fine removing the literal translations from the entries that have them. --Daniel Carrero (talk) 17:29, 26 January 2016 (UTC)
I think adding literal translations is a good idea, especially for multiword idiomatic entries. In fact most sayings already have literal translations. There's no need for an extra template or {{t}}-template parameter, the {{gloss}}-template can readily be used for that purpose. Matthias Buchmeier (talk) 18:08, 26 January 2016 (UTC)
Maybe we could use {{gloss}}, but the literal translations are not standardized in the examples above, so IMO it would be better using lit= to standardize them. --Daniel Carrero (talk) 19:23, 26 January 2016 (UTC)
Support, at least to the extent that the literal meaning of the translation is significantly different from that of the English idiom. It's not just interesting (or hilarious), but it can directly influence the choice of idiom and affect its appropriateness. Not sure how essential the template is, or whether users unfamiliar with it would stumble over it, but as a policy allowing this makes good sense. After all, people should know if "come alive" translates as "rise from the grave" in parts of Asia, to cite a famous, if apocryphal, example! P Aculeius (talk) 22:11, 26 January 2016 (UTC)
@P Aculeius: But they can click on the translation and find out in the entry itself. Why does this information have to be duplicated in the translation table? --WikiTiki89 22:44, 26 January 2016 (UTC)
It's natural to suppose that the translation of a word, phrase, or idiom will have the same meaning; and often the meaning is literally represented, but sometimes it's not. Someone unfamiliar with another language may assume that the translation is not merely the closest equivalent, but the literal one. Experience suggests that users won't always look up the translation to see if it means the same thing as the phrase being translated, simply because they assume it means the same thing. If it means something significantly different, it would be a good idea to explain that at the point the translation is given, rather than on a different page. The current policy discourages useful information like this. Why? Does it make Wiktionary operate more smoothly? Does it make it easier to find out what translations of idioms really mean? No. The pages for foreign idioms can contain all kinds of information about them that wouldn't make sense in notes such as those proposed here. But relevant and important information about translations ought to be provided where it would do the most good. P Aculeius (talk) 01:18, 27 January 2016 (UTC)
@P Aculeius: But that applies to pretty much all translations of all terms, even if they are not "idioms". Should we just start including the entire definition section of every term in the translation tables? --WikiTiki89 15:57, 27 January 2016 (UTC)
@Wikitiki89: I disagree that it applies to pretty much all translations of all terms, even if they are not "idioms". olho (Portuguese) means eye and is not an idiom, it does not have a "literal" translation back to English; custar o olho da cara in Portuguese means "to cost very much, to cost an arm and a leg" and literally means "to cost the eye from the face", for this one I would argue that a "lit=" translation would be an improvement. --Daniel Carrero (talk) 19:53, 27 January 2016 (UTC)
olho is just one counterexample and the reason that I said "pretty much all" rather than "all", although in retrospect, I probably should have said "many" rather than "pretty much all". For example, look at the translations in the first translation table for the verb watch. A naive reader looking at that might think that in Portuguese ver and assistir mean the same thing, or that Russian смотреть, наблюдать, and глядеть mean the same thing. But that doesn't mean we should take up space in the translation table to clarify the differences. That is what the entries are for. The entries themselves should be able to link to synonyms or similar words and phrases and explain the differences between them, but that is not what translation tables are for. --WikiTiki89 21:30, 27 January 2016 (UTC)
If you are talking about the first verb sense ("To look at, see, or view for a period of time."), then, yes, ver and assistir can be used interchangeably. (In my experience, as a speaker from São Paulo/Brazil). So, in my view, neither of those need a lit= parameter or an explanation. --Daniel Carrero (talk) 23:17, 27 January 2016 (UTC)
Sorry for my limited knowledge of Portuguese, the Russian example still stands. --WikiTiki89 23:23, 27 January 2016 (UTC)
Idioms are different because their meanings aren't literal or intuitive in either language. Because they're basically oddities of speech, they often have very inexact equivalents from one language to another. Because the meaning in each language tends to be complicated and both are often very different from the literal meaning of the words, these phrases pose especial dangers in translation, which is why it makes sense to provide information about very different meanings. People should know if the Vespugian equivalent of "passing the buck" literally means "your pig can't fish by itself." P Aculeius (talk) 00:02, 28 January 2016 (UTC)
About "it can directly influence the choice of idiom and affect its appropriateness": that's one reason why there should be literal FL-to-English translations in the translation table, IMO. If there aren't any, and there are multiple idioms to choose, then the user would have to open all of them in separate windows to make their choice. It might be particularly annoying if said user is using a cellphone, for example.
Re "at least to the extent that the literal meaning of the translation is significantly different from that of the English idiom": if our coverage of FL-to-English translations in the translation table is good, then I think it would be reasonable to assume that all translations without them have the same meaning as the English term. I don't know exactly about borderline cases: o silêncio é ouro = silence is golden (literal ≈ idiom, the Portuguese one means "silence is gold", so maybe it would be OK not having the literal translation); 言わぬが花 = silence is golden (the Japanese one means "not saying is a flower", so I think it requires the literal translation, which the entry does give) --Daniel Carrero (talk) 23:10, 26 January 2016 (UTC)
  • Support. This is de facto allowed, and should be in EL as well. If it is too contested here, I think it ought to go to a vote. —Μετάknowledgediscuss/deeds 21:34, 27 January 2016 (UTC)
  • Support. Based on my experience, I doubt most people will go to the entry to check that they understand the translation. We should provide enough information in the translations tables that someone can correctly use the word without going to the entry (thus, gender and clarification of distinctions made in other languages that don't exist in the English sense should be included as well). The entry itself serves the purpose of including the finer shades of meaning, the pronunciation of the word, and all its other definitions. Andrew Sheedy (talk) 23:09, 27 January 2016 (UTC)
  • Support. I think this is a good idea and I agree with Andrew that people aren't going to go to the entry to check for a literal translation, but will expect it to be alongside the translation itself. Benwing2 (talk) 03:35, 28 January 2016 (UTC)
  • Support. As Meta notes, this has long been common practice for idioms. (I don't think literal translations are appropriate on acorn, especially since "oak-berry" is etymologically where acorn itself comes from. I do think they're appropriate in idioms' entries.) The argument that readers should get the literal translation from the entry has, in addition to the various problems discussed above, the problem that the translation is not required to be a bluelink. If it is a redlink, then a reader cannot get any information from the (non-existent) entry. - -sche (discuss) 04:32, 29 January 2016 (UTC)

I'd like to create a vote. Proposals:

Policy stuff:

  1. Editing WT:EL to officially allow FL-to-English literal translations in translation tables.

Technical stuff:

  1. Adding a lit= parameter to {{t}} and {{t+}}.
  2. Adding lit= support to the translation table gadget.

Note: Some templates already have the lit= parameter, {{t}} and {{t+}} don't:

  • {{m|pt|foo|lit=bar}} returns foo (literally bar)
  • {{l|pt|foo|lit=bar}} returns foo (literally bar)
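To make the proposed parameter concrete, here is a small Python sketch of how a template call with lit= could be split into its arguments and rendered. It only imitates the "foo (literally bar)" display quoted above for {{m}} and {{l}}; applying the same display to {{t}} and {{t+}} is an assumption, since those templates do not have the parameter yet, and the function names are hypothetical.

```python
def parse_template(call):
    """Split a simple, non-nested template call into name, positional and named arguments."""
    inner = call.strip()[2:-2]  # drop the surrounding {{ and }}
    name, *args = inner.split("|")
    positional, named = [], {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if sep:
            named[key] = value
        else:
            positional.append(arg)
    return name, positional, named


def render(call):
    """Render the term, followed by '(literally ...)' when lit= is present."""
    _name, positional, named = parse_template(call)
    term = positional[1]  # positional[0] is the language code
    lit = named.get("lit")
    return f"{term} (literally {lit})" if lit else term
```

With this sketch, `render("{{m|pt|foo|lit=bar}}")` produces the same text as the {{m}} example above, and a call without lit= renders as the bare term.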

Possible policy text:

  • If the translated word is an idiom, you can give the literal translation back to English using the parameter |lit=. For example, the idiom “none of your beeswax” cannot be translated into German literally as “nicht dein Bienenwachs”, as this does not have the same meaning in German; an idiomatic translation is “nicht dein Bier” (which means, literally, “not your beer” in English).

--Daniel Carrero (talk) 18:57, 29 January 2016 (UTC)

I created the vote: Wiktionary:Votes/pl-2016-01/Literal translations in translation tables. --Daniel Carrero (talk) 10:36, 30 January 2016 (UTC)

Arabic loanwords and vocalisations

User:Mahmudmasri has been removing vocalisations (diacritics) from Arabic terms borrowed from other languages, especially words with irregular pronunciations (where automatic transliteration differs from manual). I disagree. We need an agreement on this. @Benwing2, Wikitiki89.

Example: كُوت دِيفْوَار(kōt divwār) should still have diacritics - "كُوت دِيفْوَار", not "كوت ديفوار" even if the transliteration is not the expected "kūt dīfwār". --Anatoli T. (обсудить/вклад) 22:19, 26 January 2016 (UTC)

I think the vocalization should be present. Benwing2 (talk) 22:23, 26 January 2016 (UTC)
I have mixed feelings, but I'm leaning toward keeping them. --WikiTiki89 22:46, 26 January 2016 (UTC)
Transliterations aren't for pronunciation respellings. Otherwise, we might as well start transliterating Latin script words to match their pronunciation too. Anyone interested in Dutch chauffeur (sjofeur)? —CodeCat 23:01, 26 January 2016 (UTC)
@CodeCat Please note that many (not all) Arabic transliterations are considered standard and are attested. Hans Wehr dictionary is one of the sources or perhaps the most reliable (but not comprehensive) for transliterations. As in previous arguments, this applies to irregular Thai, Korean and other transliterations. E.g. كُورِيَا(kōriyā, Korea) is transliterated "kōriyā" in the dictionary, not (as automatic) "kūriyā". Similarly Thai ราช (râat) can be either "râat" or "raa-chaa" for etymological reasons.
As for your Dutch example, nobody seems to care about transliterations of Roman-based languages but "chauffeur" (sjofeur) would be transliterated as "шофёр" into Russian Cyrillic. --Anatoli T. (обсудить/вклад) 23:13, 26 January 2016 (UTC)
Transliterations aren't for pronunciation respellings but often do reflect pronunciation. In Arabic this is inevitable; a true transliteration would include only the consonants (Buckwalter transliteration is an example of this), but this would be far from useful for most users, since the vowels are critical for learners. Hans Wehr's transliterations, which we follow subject to a few modifications, are really transcriptions of the pronunciation; this includes cases (mostly loanwords) where written long vowels are pronounced as short vowels, and where written i and u are pronounced as e and o. Similarly, the Russian transliteration system we use partly reflects pronunciation, and transliteration of East Asian languages is definitely pronunciation-based. Benwing2 (talk) 05:00, 27 January 2016 (UTC)
@Benwing2 Did you mean South East Asian, East Asian or both? All of those are surprisingly phonetic. Japanese kana really has only a couple of true exceptions (well, it is designed for pronunciation but something has changed). There are also semantic and morphemic differences where, e.g., one needs to know if it's ou or ō. Thai has a number of awkward loanwords from Sanskrit and Pali and lots of traditional spellings, like English, but it is a phonetic script, more meaningful than English, actually. Korean is slightly less phonetic than Japanese; it's often to do with Sino-Korean words with South Korean spellings. Chinese is not phonetic but can be considered much more consistent in pronunciations of characters than Japanese, though there are multiple readings as well. --Anatoli T. (обсудить/вклад) 05:18, 27 January 2016 (UTC)
@Atitarev I was thinking of languages like Chinese and Japanese, where our "transliterations" are necessarily pronunciation-based, but it applies e.g. to Thai as well. Keep in mind the proper distinction between a true "transliteration", which is a direct mapping of the written form, and a true "transcription", which is a direct mapping of the spoken form. A true transliteration would follow Thai spelling exactly, and have distinct Latin representations for every distinct Thai letter, so that the reverse conversion from translit back to Thai would be possible. In reality a transcription is used, which doesn't distinguish letters pronounced the same, and romanizes Pali and other loanwords (the "awkward loanwords" and "traditional spellings" you mention) according to pronunciation rather than spelling. All of this is generally the right thing to do, IMO. Benwing2 (talk) 05:34, 27 January 2016 (UTC)
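The transliteration/transcription distinction described above can be sketched in a few lines of Python. The three Thai letters below are all pronounced /s/ as initials; the Latin labels "s1", "s2", "s3" are hypothetical placeholders invented for this sketch, not the values of any real romanization standard.

```python
# Transcription: maps by sound, so letters pronounced alike collapse together.
TRANSCRIPTION = {"ศ": "s", "ษ": "s", "ส": "s"}

# Transliteration: maps by letter, one distinct (hypothetical) symbol per letter.
TRANSLITERATION = {"ศ": "s1", "ษ": "s2", "ส": "s3"}


def is_reversible(mapping):
    """A mapping can be converted back only if no two letters share an output."""
    return len(set(mapping.values())) == len(mapping)


def reverse(mapping):
    """Build the Latin-to-Thai table; refuse if information was lost."""
    if not is_reversible(mapping):
        raise ValueError("mapping is not one-to-one, cannot be reversed")
    return {latin: thai for thai, latin in mapping.items()}
```

Here `is_reversible(TRANSLITERATION)` holds and `is_reversible(TRANSCRIPTION)` does not, which is the sense in which only a true transliteration allows the reverse conversion back to Thai.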

This discussion is longer than necessary.

  • Keep in mind that the vocalization should be familiar to Arabic speakers, not fake, as it is now.
  • Damma+و are /uː/. The use for /oː/ is not expected. You are wrongly indicating /uː/ rather than /oː/ which has no way other than writing plain و without preceding diacritics.
  • Kasra+ي are /iː/. The use for /eː/ is not expected. You are wrongly indicating /iː/ rather than /eː/ which has no way other than writing plain ي without preceding diacritics.
  • The example of كوريا is wrong, since it is generally Arabized as /ku(ː)rja/; كورى /kuːri/.
  • Diacritics are made for Arabic words, not loanwords. If they need to be used in loanwords, they must be used minimally. Some letters don't need vocalization to indicate the expected pronunciation, like the ending يا, which can't be anything but /ja/.

--Mahmudmasri (talk) 20:35, 6 February 2016 (UTC)

    • @Mahmudmasri The evidence shows that vocalisation is also used for loanwords. I deliberately used كُورِيَا(kōriyā, Korea) because it's referenced (it's Hans Wehr's transliteration). However, your argument (that it should be "kūrya") just shows that there is more than one way to pronounce loanwords, depending on the speaker, region, education and preferences. The only way to indicate the pronunciation for native speakers is vocalisations for all combinations, including vowels o, ō, u, ū / e, ē, i, ī or consonants missing in the Classical Arabic - g, p, v, etc. I can't be that wrong. --Anatoli T. (обсудить/вклад) 00:48, 7 February 2016 (UTC)
I basically agree with Anatoli here. Benwing2 (talk) 02:56, 7 February 2016 (UTC)
I'll just say that we focus too much on Hans-Wehr. Hans-Wehr can be used as a guide, but not as a definitive reference. We need primary sources (i.e. quotations from the "wild") as definitive references for both pronunciation and vocalization of loanwords. What I mean by this is, we can initially give whatever Hans-Wehr gives, but when an investigation is being made in a particular case into the accuracy of a pronunciation or vocalization, Hans-Wehr can no longer be used as a source. Also, by pronunciation, I really mean the one represented by the transliteration. --WikiTiki89 03:02, 7 February 2016 (UTC)
There seem to be some videos on Youtube which use the word; someone who speaks Arabic could take a listen and see how it was pronounced. - -sche (discuss) 03:51, 7 February 2016 (UTC)
There is a fundamental difference between regular Arabic words and Arabic transliterations of loanwords. Regular Arabic words are written in the Arabic abjad, where short vowels usually are not indicated in writing, and long vowels are indicated with letters of prolongation (since there are no letters for vowels other than the diacritics fatha, kasra, damma). Loanwords are written in the Arabic alphabet (a true alphabet), and most vowels are indicated with ا‎ and ي(y) and و(w) (as true vowels), and these vowels do not indicate vowel length. Some very short vowels (the schwa) are not indicated in alphabetic Arabic, because they feel that the lack of a vowel best represents the shortness of the schwa. It means that the automatic Lua transliteration does not work correctly for alphabetic Arabic. —Stephen (Talk) 07:02, 7 February 2016 (UTC)
Indeed; this is why we require manual translit for these words. Benwing2 (talk) 07:12, 7 February 2016 (UTC)
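To show why the automatic transliteration goes wrong for such loanwords, here is a toy Python sketch that applies only the regular reading rules mentioned in this thread (damma + و as ū, kasra + ي as ī, fatha + ا as ā) to the vocalised example from the top of this section. It is an illustration under those assumptions, not the actual Wiktionary Lua module, and it covers only the letters of this one example.

```python
# Vowel sign + letter of prolongation, read as a long vowel (regular Arabic reading).
LONG = {
    "\u064F\u0648": "\u016b",  # damma + waw  -> ū
    "\u0650\u064A": "\u012b",  # kasra + ya   -> ī
    "\u064E\u0627": "\u0101",  # fatha + alif -> ā
}
# Short vowel signs and sukun (no vowel).
SHORT = {"\u064E": "a", "\u064F": "u", "\u0650": "i", "\u0652": ""}
# Only the consonants needed for the example.
CONS = {"\u0643": "k", "\u062A": "t", "\u062F": "d", "\u0641": "f",
        "\u0648": "w", "\u0631": "r", "\u064A": "y", " ": " "}

# The vocalised example from this section, written with escapes for clarity.
WORD = ("\u0643\u064F\u0648\u062A "
        "\u062F\u0650\u064A\u0641\u0652\u0648\u064E\u0627\u0631")
MANUAL = "k\u014dt divw\u0101r"  # kōt divwār, the manual transliteration


def auto_translit(text):
    """Apply the regular rules left to right; return None on an unknown character."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in LONG:
            out.append(LONG[pair])
            i += 2
        elif text[i] in SHORT:
            out.append(SHORT[text[i]])
            i += 1
        elif text[i] in CONS:
            out.append(CONS[text[i]])
            i += 1
        else:
            return None
    return "".join(out)
```

Run on `WORD`, the regular rules yield the expected but wrong "kūt dīfwār", which differs from `MANUAL`, so a manual transliteration has to be supplied for this entry.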
──────────────────────────────────────────────────────────────────────────────────────────────────── We've got several aspects here, some are related, some are not.
Do Arabic words borrowed from other languages get diacritics? Yes, they do. It's our current policy to provide vocalisations for Arabic words. (Objections should have been made earlier, IMO.) The differences from native words:
  1. Loanwords don't get ʾiʿrāb endings like native words do. It's not an issue here, as we normally don't include ʾiʿrāb in headwords or translations, only in inflection tables (or some known exceptions where definite and indefinite forms differ greatly).
  2. It is perceived that only religious texts, such as Qur'an use diacritics. Qur'an doesn't have foreign words. However, foreign words appear in books for children and foreigners, in textbooks and dictionaries and they get vocalisations.
  3. Pronunciations of loanwords may differ substantially from native Arabic words and long vowels may be used to render short vowels or vowels not present in standard Arabic - ي(y) to render e, ē, short i, و(w) to render o, ō, ū and short u, ا‎ can be used for a short a. Consonants g, p, v, č, etc. (absent in the Arabic alphabet) can be rendered with other letters. They still get vocalisations. Incorrect transliterations (sometimes caused by automatic transliterations) should be entered manually in such cases (we DO have words with incorrect transliterations, no-one has denied it). ا‎ and ي(y) and و(w) may represent either long or short vowels in loanwords but they still have to use fatḥa, kasra and ḍamma.
Of course, Hans Wehr is not the only source of information, no-one said it was. Besides, Hans Wehr is not a source for vocalisation but it definitely is a good source for transliterations. If there is evidence that a specific word is pronounced differently, we can change it.
Checking for vocalisations (diacritics) in Google Books would be useless because Arabic books don't use vocalisations, but selected dictionaries do, and there's also evidence that vocalisation is used with irregularly pronounced words - loanwords and dialectal words. We have adopted the use of diacritics and should stick to it. --Anatoli T. (обсудить/вклад) 12:02, 7 February 2016 (UTC)

Random search on the internet: كوريا /ku(ː)rja/ [2]; كورى /kuːri/ [3]. --Mahmudmasri (talk) 14:43, 7 February 2016 (UTC)

@Mahmudmasri If you're still going on about كوريا, I have added "kūryā" as an alternative transliteration. BTW, I think we agreed to transliterate the final alif with "ā", even if it's shortened in pronunciation. Anyway, I think we need to keep "kōriyā" as well, which is referenced, unless it's proven that it is incorrect and only "kūryā" (kūrya) is correct. --Anatoli T. (обсудить/вклад) 00:49, 8 February 2016 (UTC)
That's exactly my point. It's not referenced because other dictionaries are not valid references for Wiktionary. --WikiTiki89 16:10, 8 February 2016 (UTC)
What is not referenced? --Anatoli T. (обсудить/вклад) 21:51, 8 February 2016 (UTC)
You said "Anyway, I think we need to keep 'kōriyā' as well, which is referenced, unless it's proven that it is incorrect". I disagree that Hans-Wehr counts as a reference for us to keep it. --WikiTiki89 22:09, 8 February 2016 (UTC)
If we don't use Hans Wehr's transliteration, what do we use? There are no books written in Arabic transliterations, they are only used in dictionaries and textbooks. Vocalised Arabic, if available, will result in many mistransliterations for loanwords. As I said, transliteration of loanwords may need some tweaking but Hans Wehr is one of the few limited resources on comprehensive transliterations available. I trust native speakers' judgement but I don't think we should discard Hans Wehr as a resource. Hans Wehr dictionary IS the reference in many cases. You were there when we decided on largely adopting Hans Wehr's methods. Why this negative attitude now? Did you find any mistakes in the dictionary? --Anatoli T. (обсудить/вклад) 11:18, 9 February 2016 (UTC)
I think the basic issue is whether the vocalizations are usage-based, like the definitions and the basic part of the spelling, or reference-based, like the etymologies. Arabic may be a well-documented language, but the vocalizations just aren't used that much to convey meaning, so vocalized Arabic is more like a less-documented language. How much of the vocalization on words not used in children's books can actually be verified without resorting to mentions in dictionaries? Chuck Entz (talk) 14:39, 9 February 2016 (UTC)
I didn't say we need to attest the transliterations in the same way we would attest words and spellings, but that we need to verify them somehow in the wild, such as with YouTube videos. If you didn't understand what I originally said, I said that we can use a Hans-Wehr transliteration initially as long as that particular transliteration is not contested, but as soon as that transliteration is contested we have to find another source. --WikiTiki89 16:28, 9 February 2016 (UTC)
The original term in the discussion wasn't from HW, it's not contested. As for كوريا, I'm not sure Mahmud insists that kōryā is absolutely wrong and I don't think it is. It's like saying موسكو (Moscow) should be mūskū, not mosko. There's more than one reading. Recently I found the vocalised جرسون (garsōn, "waiter", colloquial), disputed on my talk page by another native speaker, in an Oxford dictionary, though without a transliteration. We need to vocalise all words, even if not all words can be found in this form in the literature for obvious reasons, just a policy thing. As for contested transliterations, I have already said that we do have incorrect cases and we need to fix them but we won't find transliterations for each term in books. —This unsigned comment was added by Atitarev (talk • contribs).
I'm sorry if I misunderstood, but I assumed that you were saying that you got "kōriyā" from Hans-Wehr, and that Mahmudmasri was saying that it was wrong. Anyway, I'll repeat that I never said that we have to find transliterations in books in order to use them. --WikiTiki89 20:44, 9 February 2016 (UTC)
Well, if Mahmud proves that "kōriyā" is incorrect, we can take it out, even if it's in Hans Wehr. Do we need to search for "kōriyā" pronunciations, or shall we trust the HW dictionary?
Let's take جَرْسُون (garsōn) as a good example for this discussion. I trust the native judgement that it should be "garsōn" (from French garçon), not "jarsūn" as the spelling would suggest, but the Oxford English-Arabic dictionary only has the spelling "جَرْسُون". In this case we have an attested vocalised spelling of a loanword but an unattested transliteration, but I am sure we can find pronunciation examples. BTW, I won't be surprised if "jarsōn" is also used outside Egypt.
Unlike Russian, which is much more homogeneous, an opinion of one Arabic speaker may not be even enough to contest a referenced spelling or pronunciation. --Anatoli T. (обсудить/вклад) 22:25, 9 February 2016 (UTC)
He doesn't have to prove that it is incorrect. Once he is contesting it, we have to prove that it is correct. Again, it's not the spelling we have to verify, but the pronunciation. For example, if you found a YouTube video in which the pronunciation "kōriyā" is used, that would be enough evidence, but Hans-Wehr is not evidence. It's like RFV but more informal. --WikiTiki89 22:29, 9 February 2016 (UTC)
Of course Hans Wehr is evidence. His transcriptions may reflect a particularly formal style of speaking but they're not put there just for someone's random amusement. We can argue whether it's sufficient by itself but it's certainly evidence. Benwing2 (talk) 22:49, 9 February 2016 (UTC)
Sorry, I meant "valid evidence for our purposes". Of course it is evidence, everything is evidence. Hans-Wehr could have assumed or even invented a transliteration when its own editors lacked evidence. --WikiTiki89 23:09, 9 February 2016 (UTC)
Why isn't it valid? Benwing2 (talk) 23:11, 9 February 2016 (UTC)
Other than what I literally just said, Wiktionary on principle does not accept other dictionaries as evidence. Because of Arabic's unusual situation, I'm assuming a relaxed version of CFI and the RFV procedure for the purposes of verifying transliterations and vocalizations. Essentially, once someone comes along and disputes a transliteration (i.e. the pronunciation), we have to go and make sure that our source itself was not wrong, and our source in this case is Hans-Wehr. Do you disagree? --WikiTiki89 23:30, 9 February 2016 (UTC)
I don't know the rules for accepting evidence but it surprises me that we can't use other dictionaries. I've seen plenty of cases where other dictionaries appear in references for definitions. I don't disagree that we should verify pronunciations if they're disputed but that doesn't mean they should be removed pending resolution. Also, I agree with Anatoli that a native Arabic speaker's own intuition isn't sufficient evidence given that there's so much variety and that MSA isn't even anyone's native language (and the line between MSA and dialect isn't very well defined). Benwing2 (talk) 00:32, 10 February 2016 (UTC)
Yes. A reference from a universally accepted dictionary, such as Hans Wehr should be taken seriously and editors do use dictionaries as references. Even more often than references from books.
I have just checked with a native Arabic speaker, a lady from Iraq. She says both pronunciations are acceptable. There you go. --Anatoli T. (обсудить/вклад) 05:46, 10 February 2016 (UTC)
I don't mean that you can't list it in the reference section. It's just that the dictionary does not count as verifying the pronunciation. And I'm also not saying that we should immediately delete a pronunciation that is disputed. We should do the research first, but if we then fail to find any evidence "in the wild" of a given pronunciation, its existence in the dictionary should not save it from being deleted at that point. @Atitarev: I was commenting on the procedure, not on the specific word كوريا. But it's great that you verified it. --WikiTiki89 06:20, 10 February 2016 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── My concern is the procedure as well. I had no doubts that a pronunciation from Hans Wehr is verifiable and would be confirmed sooner or later. If we do research for every word that is also present in a notable (trustworthy) dictionary, very few entries will get created. --Anatoli T. (обсудить/вклад) 22:58, 10 February 2016 (UTC)

That's why I'm saying that we don't need to research every word. Only the ones that someone explicitly contests. --WikiTiki89 23:37, 10 February 2016 (UTC)
google:"كوريا" "koriya" (not many sites seem to use macrons) gets some hits that seem to include transliterated Arabic (as well as hits that seem to transliterate other languages). Mahmud seems to cite some YouTube videos for other pronunciations above. google:"Kūriyā" also gets some hits and seems to be worth investigating. Make of that what you will. - -sche (discuss) 22:50, 9 February 2016 (UTC)

Pali in non-Latin scripts

IMO, Pali in any non-Latin script should redirect to the Latin form, otherwise we end up with five entries with the same definitions, and mistakes can propagate (like rāja, which was a misspelling of rājā, and all the non-Latin stuff had to be removed). —Aryamanarora (मुझसे बात करो) 23:30, 26 January 2016 (UTC)

Support. Wyang (talk) 00:39, 27 January 2016 (UTC)
Soft redirects (with no definitions, just links, please), not hard redirects. Formats of soft redirects to be discussed. --Anatoli T. (обсудить/вклад) 00:52, 27 January 2016 (UTC)
Of course, soft redirects only - something like {{zh-see}}. —Aryamanarora (मुझसे बात करो) 01:07, 27 January 2016 (UTC)

Made {{pi-sc}} with documentation. Aryamanarora (मुझसे बात करो) 21:14, 27 January 2016 (UTC)

I put it into use at ဗျဂ္ဃ and ब्यग्घ for illustration purposes. Do we like this and want to do things this way? (I do!) —Aɴɢʀ (talk) 21:29, 27 January 2016 (UTC)

Future IdeaLab Campaigns results

Last December, I invited you to help determine future ideaLab campaigns by submitting and voting on different possible topics. I'm happy to announce the results of your participation, and encourage you to review them and our next steps for implementing those campaigns this year. Thank you to everyone who volunteered time to participate and submit ideas.

With great thanks,

I JethroBT (WMF), Community Resources, Wikimedia Foundation. 23:49, 26 January 2016 (UTC)

Translations of taxonomic names

WT:EL#Translations currently says:

  • Translations are to be given for English words only. []

But, some Translingual entries for taxonomic names have translation tables. WT:EL#Translations does not allow Translingual translation tables. Non-exhaustive list:

I created Wiktionary:Votes/pl-2015-12/Translations last month, which proposed rewriting WT:EL#Translations completely. It is most likely going to fail today, 23:59. One of the points raised by the opposers is that my rewrite (also) does not allow Translingual translation tables.

So, I have a proposal:

  • Officially allowing translation tables for taxonomic names, and rewriting the part(s) of WT:EL required to that effect.

But I have some questions: Wouldn't the (Translingual) translation table of Canidae be a duplication of the (English) translation table of canid? If so, shouldn't Canidae use {{trans-see|canid}}? Anyway, there are probably some taxonomic names without easily found English counterparts; I think having a translation table would be most useful for them.

What about actual language-specific taxonomic names like ホモサピエンス, 호모 사피엔스, होमो सैपियन्स and гомо сапиенс, all of which apparently read close enough to "Homo sapiens" rather than "human"? One might argue that they should be in the translation table of Homo sapiens, but AFAIK, Translingual translation tables most often have the colloquial name ("dog", rather than "Canis familiaris") in different languages. For the moment, I placed the "Homo sapiens" variants in the See also section in Homo sapiens, rather than in a translation table.

Some older discussions:

--Daniel Carrero (talk) 14:10, 27 January 2016 (UTC)

Pinging everyone who voted in Wiktionary:Votes/pl-2015-12/Translations: @Metaknowledge, Andrew Sheedy, I'm so meta even this acronym, DCDuring, Dan Polansky, Xbony2. --Daniel Carrero (talk) 14:25, 27 January 2016 (UTC)
The use of {{trans-see}} would be highly desirable. It is an open question whether some uncommon English vernacular names should be the location of the translation table, rather than the corresponding taxonomic name.
Some English vernacular names (eg, hippoboscid) are almost certainly uncommonly used outside of a scientific context. I suspect that some corresponding FL names are similar.
Many English vernacular names are unclear as to their scope. In use, especially outside a strictly scientific context, does canid include genus Canis, tribe Canini (includes genera that are somewhat fox-like IMO), subfamily Caninae (includes genera that are much more fox-like than dog-like in appearance IMO, as well as some more like other carnivores), family Canidae (including foxes), suborder Caniformia (including weasels and seals)? Is dog meant to include only domesticated varieties of Canis lupus familiaris or does it include some other species of wild Canis, including extinct ones? Doesn't it have definitions without any specific relationship to any taxon, eg, "an animal resembling a typical dog in size and in facial features"?
As taxonomy increasingly departs from exclusive reliance on gross and other morphological features of organisms, the relationship between taxonomic names and common vernacular names is likely to become ever more tenuous. DCDuring TALK 14:58, 27 January 2016 (UTC)
Yeah codify it, I'm all for it. Renard Migrant (talk) 15:45, 27 January 2016 (UTC)
I would tend to keep the translations in the Translingual entry, with an additional table in the equivalent English entry. The reason for having both tables is this: the Translingual entry has a greater degree of precision, while the more colloquial/language-specific terms can vary in meaning, and don't always correspond exactly to the taxonomic name. Also, certain terms could be used to translate Translingual ones, but might not be appropriate translations of a word that was chosen to host the translations instead of on the Translingual page.
To further clarify (or maybe obfuscate), the entry at Hominidae could host great ape, pongid, or hominid as translations (note that neither of the first two terms typically includes humans) for English, and hominidé and grand singe for French. Only one of those could host the translations, namely hominid in this case, but that would (a) eliminate the other English translations, and (b) split up hominidé and grand singe between hominid and great ape.
I realize it may sound like I am arguing for keeping translation tables at Translingual entries to the exclusion of the English ones, but there are of course reasons to have them in the English entry as well.
Hopefully I got my meaning across, as I'm somewhat tired and tend to produce messes of confusion when that is the case. Andrew Sheedy (talk) 23:45, 27 January 2016 (UTC)
For one, I understand what you mean, @Andrew Sheedy. Well, I'll create a vote later for this. The proposal is as I said in the first message: "Officially allowing translation tables for taxonomic names, and rewriting the part(s) of WT:EL required to that effect." The choice of using an actual translation table or {{trans-see}} for an entry in particular would be done on a case-by-case basis, I think. Maybe we should have a paragraph about {{trans-see}} saying that it can be used "cross-language" between English and Translingual, without introducing any hard "requirements" like "always keep translation tables in X language to the exclusion of Y". --Daniel Carrero (talk) 20:00, 29 January 2016 (UTC)
It might be helpful to have a version of the translation table template like this: {{trans-top|Translingual gloss|see also|English entry}}, keeping a translation table in both entries, but letting users know that they might find more translations in the English entry. The same thing could be in the English entry, linking to the Translingual one. Andrew Sheedy (talk) 20:58, 29 January 2016 (UTC)
The only time {{trans-see}} should be used is if the target is very nearly an exact synonym for the source. Sometimes that occurs between vernacular and taxonomic names, especially where there is an official or semi-official body that prescribes the names. Mammal Species of the World and especially Birds of the World: Recommended English Names include such names. For vernacular names not so prescribed, the correspondence is often imperfect and sometimes nearly hopeless, eg, fish, which does not correspond to any taxon, and the more recent DNA-based clade names, which defy economical definition, let alone brief English or other language vernacular names. DCDuring TALK 21:26, 29 January 2016 (UTC)
I agree with Daniel Carrero; it's too early for general principles on where and when to use full translation tables, {{trans-see}}, or whatever. Let's just officially allow their use, and then wait till we have the precedents on which to base our casuistry. — I.S.M.E.T.A. 02:31, 30 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-01/Translations of taxonomic names.

Current text in WT:EL#Translations:

Translations are to be given for English words only. In entries for foreign words, only the English translation is given, instead of a definition. Any translation between two foreign languages is best handled on the Wiktionaries in those languages.

Proposed text:

Translations should be given in English entries, and also in Translingual entries for taxonomic names. Entries for languages other than English and Translingual should not have Translations sections; usually, the English translation is given, instead of a definition. Any translation between two foreign languages is best handled on the Wiktionaries in those languages.

Looks good? --Daniel Carrero (talk) 23:42, 30 January 2016 (UTC)

OK with me. It does the important things. The rest of what I mentioned was just intended to present some aspects of implementation. It would be, at least, premature and, possibly, completely unnecessary to go further than the text above goes. DCDuring TALK 01:22, 31 January 2016 (UTC)
Looks good to me. Andrew Sheedy (talk) 05:11, 31 January 2016 (UTC)

Transliteration policies

I organized the transliteration pages in Category:Transliteration policies.

Before, they were a mess randomly distributed in Category:Wiktionary:Transliteration and Category:Transliteration appendices. A few (like Wiktionary:Classical Syriac transliteration) were not in a language category (like Category:Classical Syriac language). I made all the pages use the same naming system and categories, through the use of {{transliteration policy}}.

Compare Category:Script appendices, which is a category explaining scripts/characters but for readers, not editors. There was some overlap between this and that category, but I believe I was able to separate them by renaming a few pages and choosing the right category. I don't think the system is perfect but it's more consistent than before. --Daniel Carrero (talk) 17:34, 27 January 2016 (UTC)

Thank you! Benwing2 (talk) 18:34, 27 January 2016 (UTC)

Placement of "Usage notes"

WT:EL#Order of headings currently has a note between parentheses concerning the "Usage notes" section:

  • Usage notes (can be placed anywhere appropriate)

This was actually voted on in 2007, see Wiktionary:Votes/pl-2007-06/ELE level 4 header sequence. But, I don't believe we can actually place it anywhere nowadays, can we? This has been discussed recently at Wiktionary talk:Votes/pl-2015-12/Usage notes#Location of the usage notes header, Wiktionary:Votes/pl-2015-12/Usage notes/old#Oppose and Wiktionary:Votes/pl-2015-12/Usage notes#Support.

Note: until today, WT:EL had the following sentence, repeating that rule: "[Usage notes], whether identified by a heading or indent level may come anywhere." But I removed it, per the vote Wiktionary:Votes/pl-2015-12/Usage notes.

I have a proposal:

  • Officially deciding what is the proper place for the "Usage notes" header and disallowing the rule that "Usage notes can be placed anywhere".

According to @Wikitiki89 in this link (which is the first discussion I linked above), the placement would be: "immediately after the definitions if there is no inflected forms section or after the inflected forms section (===Conjugation===, ===Declension===, ===Inflection===, etc.), which is itself immediately after the definitions". Sounds good? --Daniel Carrero (talk) 02:52, 28 January 2016 (UTC)

Sounds good to me. Sometimes I've seen it before the inflections section but I think it should go after. Benwing2 (talk) 03:26, 28 January 2016 (UTC)
That has my support. I think there's only one time I've seen it elsewhere, anyway. Andrew Sheedy (talk) 03:28, 28 January 2016 (UTC)
I hope that would de-legitimize the Usage notes header (not content) appearing in Pronunciation sections (where I have seen it) or other pre-definition locations (where I have not). DCDuring TALK 12:38, 28 January 2016 (UTC)
Just after the definitions is the most logical place, and is what I do. Donnanz (talk) 12:44, 28 January 2016 (UTC)
  • At [[Ogham]] the Usage notes header has content about pronunciation. Is that where we want the content to be or should it appear in the pronunciation section without a header? DCDuring TALK 13:26, 28 January 2016 (UTC)
    Support this: "appear in the pronunciation section without a header".
    These aren't actual "Usage notes". At best, they are "Pronunciation notes", which don't require a separate header. --Daniel Carrero (talk) 13:34, 28 January 2016 (UTC)
  • Support placing it after definitions, before inflection. —CodeCat 16:33, 28 January 2016 (UTC)
    How does it make sense to place it before the inflection? After all, the inflection section is just an extension of the headword line. Not only that, but usage notes very frequently reference the inflections. I think it makes more sense as thinking of the headword line, definitions, and inflection section as one logical unit that cannot be separated. --WikiTiki89 16:51, 28 January 2016 (UTC)
    I've always looked at it as Wikitiki does. That order does put additional pressure to make sure that the inflection is vertically compact. DCDuring TALK 20:35, 28 January 2016 (UTC)
My understanding was that usage notes belonged after the definitions; I am surprised that WT:EL says otherwise; this is a reminder that we must check periodically that our policies and our practices match. Like CodeCat, I think the usage notes should go immediately after the definitions, before any inflection section, because the usage notes often contain vital information about the definitions, and pushing them below inflection information increases the chance that readers who only came to look at the definitions will not notice that there are usage notes which provide additional information on the definitions. Even when the usage notes are about inflection info, I think it's appropriate to put them right after the definitions (as on regnen). As a compromise, perhaps we could allow the usage notes to go either before or after inflection info, depending on whether or not the usage notes were about the definitions or about the inflection (this would mean we'd flip the order of the usage notes and inflection info in regnen and neger). - -sche (discuss) 21:05, 28 January 2016 (UTC)
As it doesn't affect either English or Translingual L2 sections, I would abstain in a vote on the narrow question of whether Usage notes were immediately before or after inflection tables. DCDuring TALK 22:22, 28 January 2016 (UTC)
I'd usually expect notes about the inflection to be in the inflection header itself. Random example: French pâtir uses a certain conjugation template that returns the text "This is a regular verb of the second conjugation, like finir, choisir, and most other verbs with infinitives ending in -ir. One salient feature of this conjugation is the repeated appearance of the infix -iss-." before the conjugation table. --Daniel Carrero (talk) 22:45, 28 January 2016 (UTC)
Not those kinds of notes. Those are really just curiosities and not necessary to include anyway. I'm talking more about things like an explanation of when different variants are used. For example, the note about puis in the entry for pouvoir (which probably needs a more detailed explanation with examples). Or the note about человек vs. людей as the genitive plural of человек. --WikiTiki89 02:09, 29 January 2016 (UTC)
Data: I used AWB to page through the first 100 entries starting with 'f' (a letter I picked at random) which had usage notes sections as of the 2015-07-02 database dump (yes, I should download a newer dump, but it doesn't make a difference here unless you think the overall proportion of definition-centric vs inflection-centric usage notes has changed in the last 6 months). Of these hopefully representative entries' usage notes,
  1. 32% (examples: facient, fag, fait accompli, falseness) expand upon or contain information about specific definitions and/or registers or contexts (offensiveness, transitivity, etc) of specific definitions or of the term as a whole. Whether you buy the claim or not, this information has a stronger claim than other kinds of information to belonging directly after the definitions, before inflection information.
  2. 17% (examples: f., f's, faca, facet, faire, faire caoud, falten) contain information about inflected forms / the inflection of the word, or about lenition, which we seem to treat like inflection. Whether you (or I) buy the claim or not, this information has a stronger claim than other kinds of information to belonging after inflection information.
  3. 22% (examples: fa chomhair, fa-near, fail, fallacious, faoin, fara) contain information about which words or case-forms are used with the word in question (e.g. saying a verb takes the dative case).
  4. 29% had some other function, e.g. functioning as glorified synonyms sections.
- -sche (discuss) 22:54, 28 January 2016 (UTC)
For comparison, fr.Wikt places notes directly after definitions, while full inflection information (of the sort we're discussing interpolating between definitions and usage notes) is on a separate page; see fr:baiser (the note section is called "Note"). de.Wikt places notes before definitions, and full inflection information on a separate page; see de:rheinisch and de:US-amerikanisch (the notes sections are called "Anmerkung" with or without additional words). This speaks in favour of the idea that usage notes are more closely bound to definitions than inflection information is. Even when the usage notes are about inflection, they're as near to the inflection if they're above it as they are if they're below it. For this reason and the reasons I gave above, I favour the order definitions, then usage notes, then inflection. - -sche (discuss) 23:31, 28 January 2016 (UTC)
@-sche These are good points. I'm starting to think it would be a good idea to support the order: Definitions, Usage notes, Inflection (rather than Definitions, Inflection, Usage notes). I admit I'll have to think about it with more clarity after sleeping. If most people support that order, then I could create a vote about that order specifically. But if people remain divided about the exact order, maybe I should create a vote with both options. --Daniel Carrero (talk) 02:52, 29 January 2016 (UTC)
Regardless of how much support you think there is, the vote should have both options. I think it should be a two-section vote. Section one would be whether we want to fix the position of the Usage notes section in either of these locations. Section two would be whether it should be before the inflection line, after the inflection section, leave up to personal preference, or (perhaps) have it depend on whether the particular usage note is talking about the definitions or the inflections. --WikiTiki89 03:00, 29 January 2016 (UTC)
@Wikitiki89 I feel that section 1 would be redundant. If a person supports any option in section 2, it would be equivalent to supporting section 1, wouldn't it? I propose setting up the vote this way:
Voting on: What is the placement of the "Usage notes" section in all languages. (Currently it is the only section where WT:EL states: "can be placed anywhere appropriate".)
Note: In the proposals below, "Inflection" can be replaced by Declension, Conjugation, etc.
Proposal 1: The sections should be ordered this way in all entries:
  • Part of speech, Inflection (if available), Usage notes.
Proposal 2: The sections should be ordered this way in all entries:
  • Part of speech, Usage notes, Inflection (if available).
Proposal 3: The sections should be ordered in either of these ways, up to personal preference, in all entries:
  • Part of speech, Inflection (if available), Usage notes.
  • Part of speech, Usage notes, Inflection (if available).
Proposal 4: The sections should be ordered in either of these ways, depending specifically on the contents of the usage notes, in all entries:
  • Part of speech, Usage notes, Inflection (if available). (if the usage notes are about the definition)
  • Part of speech, Inflection (if available), Usage notes. (if the usage notes are about the inflection)
Proposal 5: Allow "Usage notes" to be used freely anywhere in all entries.
--Daniel Carrero (talk) 14:59, 31 January 2016 (UTC)
@Daniel Carrero: The problem with your suggestion is that it is unclear how to close it if the results are close. The point of my proposal was that section 1 would determine whether the vote passes, and section 2 would determine how it passes. If section 1 passes, but section 2 is inconclusive, the default is to leave it unspecified whether the Usage notes should come before or after the inflection section. While with your proposal, in the equivalent situation, the whole vote fails. --WikiTiki89 19:23, 3 February 2016 (UTC)
@Wikitiki89 I edited Wiktionary:Votes/2016-02/Placement of "Usage notes" per your proposal. --Daniel Carrero (talk) 09:16, 4 February 2016 (UTC)
I agree. I think the option to place the usage notes either before or after the inflection should definitely be in the vote. Benwing2 (talk) 03:13, 29 January 2016 (UTC)
Sure, I'll make it available. --Daniel Carrero (talk) 01:45, 31 January 2016 (UTC)
Excellent data collection and analysis. However the sample should have discarded data from all English (and taxonomic) sections for reasons analogous to why I would abstain from voting on this narrow matter. I don't know whether data from inflected forms entries should also be discarded because no inflection table can appear. Only entries which at least could have inflection tables should be sampled or retained after sampling. DCDuring TALK 14:23, 29 January 2016 (UTC)

I have a separate proposal, to tackle a problem noted above. Adding this rule somewhere on WT:EL (WT:EL#Usage notes and/or WT:EL#Pronunciation):

  • Notes about pronunciations should be placed in the "Pronunciation" section. Entries should not have a "Usage notes" section whose only purpose is to give pronunciation notes.

This could be done in a separate vote, maybe. --Daniel Carrero (talk) 01:45, 31 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-02/Notes about pronunciations, to address the "pronunciation"/"usage notes" issue:
  • Entries should not have a "Usage notes" section whose only purpose is having notes about the pronunciation. Pronunciation notes can be added directly in the "Pronunciation" section.
--Daniel Carrero (talk) 16:29, 1 February 2016 (UTC)
Data, part 2: Of the first 100 pages starting with f which used both one of the headers (Inflection|Mutation|Declension|Conjugation) and the header Usage notes, 24% used the headers in different language or POS sections and are irrelevant. Of the relevant pages, 30% used the order "notes, then inflection", 26% used the order "notes, then mutation", and 43% used the order "inflection, then notes".
- -sche (discuss) 08:14, 18 February 2016 (UTC)
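For anyone who wants to repeat this kind of count on a newer dump, here is a minimal sketch of the classification step. It is not the AWB procedure used above. It assumes each page is available as a plain wikitext string, it treats two headers as being in the same section whenever they fall under the same language (L2) header, and it does not separate mutation from inflection. The names `classify_order` and `tally` are made up for illustration.

```python
import re
from collections import Counter

# Header names counted as "inflection-like", taken from the list in the comment above.
INFLECTION_HEADERS = {"Inflection", "Mutation", "Declension", "Conjugation"}

# Matches wikitext headers such as "==German==" or "====Usage notes====".
HEADER_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def classify_order(wikitext):
    """Return 'notes first', 'inflection first', or 'irrelevant' for one page."""
    current = []            # header titles seen in the current language section
    sections = [current]
    for match in HEADER_RE.finditer(wikitext):
        level, title = len(match.group(1)), match.group(2)
        if level == 2:      # a level-2 header starts a new language section
            current = []
            sections.append(current)
        else:
            current.append(title)

    for titles in sections:
        inflection_at = next(
            (i for i, t in enumerate(titles) if t in INFLECTION_HEADERS), None
        )
        # Only a language section containing both kinds of header is relevant.
        if "Usage notes" in titles and inflection_at is not None:
            if titles.index("Usage notes") < inflection_at:
                return "notes first"
            return "inflection first"
    return "irrelevant"


def tally(pages):
    """Count how many pages fall into each category."""
    return Counter(classify_order(text) for text in pages)
```

Splitting only on language headers is a simplification: a page whose two headers sit in different part-of-speech sections of the same language would still be counted as relevant here, so the numbers from this sketch would not necessarily match the ones above.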
F must be a particularly common initial letter in Irish. If we consider that the percentages represent an effort-weighted expressed preference for a given order, it is the Irish vote that pushes the "notes first" side to victory (the situation being analogous to the electoral situation for mayor of Boston.) If we think the Irish "vote" is overweighted by the selection of f for the analysis, preferences are clearly rather evenly divided, far from a consensus. Using votes to find a consensus gives excessive weight to current voters, rather than to those who actually put in the effort to place the usage notes. It is also interesting that Irish preference was uniform. If preferences are also relatively uniform within other languages, there would seem to be a case for diversity of practice within each language, even where there are few contributors in a language in question. DCDuring TALK 12:17, 18 February 2016 (UTC)
I don't think there are many Irish entries because f is more common in Irish, but rather because the percentage of Irish entries that give mutation information (regardless of whether or not they give usage notes) is apparently closer to 100% than the percentage of other languages' entries that give inflection information is to 100%. The proportion of entries using "inflection, then notes" increased somewhat as time went on; in the first 50 pages, "notes, then inflection" outnumbered "inflection, then notes" (disregarding "mutation" entries); in the first 100 pages, the data is what I wrote above; immediately after that cutoff was a spate of Latin entries which gave inflection at L4 followed by usage notes at L3 that said "used only in taxonomic names" (something many entries in the sample also did).
I included "mutation" in the count because I thought it was treated like inflection and put at L4, but all these entries happen to have mutations at L3 at the bottom of the entry. There does seem to be variation between languages: many Latin entries had "inflection, then notes" (at different levels, as just described) and only a sizeable minority had "notes, then inflection", while most (but not quite all) German entries had "notes, then inflection". - -sche (discuss) 18:13, 18 February 2016 (UTC)

Bold "Do" and "Do not" in WT:EL

Currently, WT:EL#Translation dos and don’ts has 8 rules starting with bold Do or Do not:

This was a change introduced by Paul G in diff (23 July 2007). I find that distracting. Is that really necessary? No other part of WT:EL has the same formatting.

I have a proposal, which I would like to do without a vote (an unsubstantial change):

  • Remove "Do" from all positive statements. ("Do provide" = "Provide"; "Do ensure" = "Ensure").
  • Keep "Do not" in all cases, but without the bold formatting (bold "Do not add the pronunciation" = plain "Do not add the pronunciation").

Can I do that? --Daniel Carrero (talk) 04:20, 28 January 2016 (UTC)

Disclaimer: I'm not saying that I agree (or disagree) with any of these rules. I just don't like the current presentation. --Daniel Carrero (talk) 04:41, 28 January 2016 (UTC)
I agree, they come across as intimidating with the bold Do/Do not. Redoing them as you specify is fine with me. Benwing2 (talk) 05:12, 28 January 2016 (UTC)
I thought the same thing. I say go for it. Andrew Sheedy (talk) 05:15, 28 January 2016 (UTC)
Support. —Μετάknowledgediscuss/deeds 06:23, 28 January 2016 (UTC)
Support. - -sche (discuss) 04:06, 29 January 2016 (UTC)
  Done. --Daniel Carrero (talk) 00:36, 30 January 2016 (UTC)

Luacize Template:grc-ipa-rows

@Gilgamesh~enwiktionary, Angr: I can't believe such a template is not luacized. I will luacize it if nobody disagrees. --kc_kennylau (talk) 15:12, 28 January 2016 (UTC)

If you feel like it, even better IMO would be to make a new template that takes actual Greek as input and generates the appropriate pronunciations, rather than requiring the awkward code arguments. Benwing2 (talk) 15:24, 28 January 2016 (UTC)
@Benwing2: That is also in my mind. --kc_kennylau (talk) 16:30, 28 January 2016 (UTC)
The luacized template is {{grc-IPA}}. As far as I know, {{grc-ipa-rows}} is almost deprecated (though there seem to be some cases where {{grc-IPA}} doesn't work, so {{grc-ipa-rows}} is required instead). I'd rather have a bot go through and replace all instances of {{grc-ipa-rows}} with {{grc-IPA}} (changing the parameters as required) instead of having two competing luacized templates. —Aɴɢʀ (talk) 16:35, 28 January 2016 (UTC)
Angr is quite correct that we need to convert over to {{grc-IPA}}. We're still trying to fix a PHP error (according to Newt) that occurs for words like ᾍδης (Hā́idēs). Otherwise, to my knowledge, we are effectively done. —JohnC5 17:05, 28 January 2016 (UTC)
It should have been marked as deprecated. In fact, it was, but User:LlywelynII removed the tag, because there had been no formal discussion and he didn't like the 'terseness' of the template. Which is fair to some degree, but it really would have been better to at least make a note on the talk page instead of just silently removing the tag. Which is not to say that he's not right, though: upon looking at the template, I don't like the way it's presented; just having /x/ > /y/ > /z/ is ambiguous because it doesn't actually specify how long of a period this change took place over, or which one is even Correct, and, at that point, why bother collapsing it?
In my opinion, we should remove the "Constantinopolitan" row (which, at 1500, is Modern Greek anyway), and then consider cutting another row or two. The Latin template has two rows, "Classical" and "Ecclesiastical", which is really all anyone should need. On this model, I'd cut us down to "Classical" and "Byzantine", and maybe add "Koinê" (and of course eliminate the show/hide function.) Any more than that just seems unnecessary. —ObsequiousNewt (εἴρηκα|πεποίηκα) 19:01, 28 January 2016 (UTC)
Didn't realize this work had already been done. John, what exactly is the supposed PHP error? Something where regexes don't behave the way they should? So far I haven't seen any such errors with Russian or Arabic. Benwing2 (talk) 18:18, 28 January 2016 (UTC)
I had pinged Newt because I am unfamiliar with the actual error. It is mentioned here, and Newt made a bug report here. —JohnC5 18:47, 28 January 2016 (UTC)
Thanks! @ObsequiousNewt Can you work around the problem? I had to work around an issue with the ordering of the shadda diacritic in Arabic with respect to other diacritics; this is basically a bug in Unicode itself, and causes problems because MediaWiki normalizes the ordering of diacritics according to Unicode. In this case, I had to manually reorder the diacritics in various places. If the problem is with capitalization/lowercasing, can you write a function that wraps the relevant MediaWiki calls and manually converts the chars in question to lowercase? Benwing2 (talk) 19:46, 28 January 2016 (UTC)
The Mediawiki lc: magic word unfortunately doesn't recognize capital letters with iota subscript as having lower-case equivalents, so it doesn't do anything to them. So any workaround will have to employ some other method. —Aɴɢʀ (talk) 20:27, 28 January 2016 (UTC)
Where is lc: used exactly? If it's in template code, presumably it can be replaced with a #invoke to a function that wraps lang:lc() or mw.ustring.lower(). Benwing2 (talk) 22:02, 28 January 2016 (UTC)
I don't know if it is used in this context at all yet; I'm just saying if someone were to try using it to force the template to reinterpret capital letters as lowercase ones, it wouldn't work on iota-subscripted capital letters. Of course, there probably aren't very many words beginning with an iota-subscripted capital letter; we can always write {{grc-IPA|w=ᾅδης}} manually. —Aɴɢʀ (talk) 16:49, 29 January 2016 (UTC)
Yeah, I can try and work something up. Unfortunately, nobody seems to have looked at/confirmed the bug on PHP yet (perhaps I should look into fixing it myself...) —ObsequiousNewt (εἴρηκα|πεποίηκα) 14:18, 29 January 2016 (UTC)
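A rough illustration of the wrapper idea discussed above, written in Python rather than Lua purely for the sake of a self-contained sketch. It is not the code of {{grc-IPA}} or of any existing module. The function name lower_greek and the table name are hypothetical. The only facts relied on are the Unicode code points of the capital letters with iota subscript (prosgegrammeni) and their lowercase counterparts; the characters are remapped by hand first, so the result does not depend on how the platform's own lowercasing treats them.

```python
# Sketch only: lowercase polytonic Greek, handling the iota-subscript capitals
# explicitly instead of relying on the platform's case conversion.

def _build_iota_subscript_map():
    """Map capital letters with prosgegrammeni to their lowercase forms."""
    table = {}
    # Three blocks of eight capitals (alpha, eta, omega with breathings and
    # accents) sit exactly eight code points after their lowercase forms.
    for first_upper in (0x1F88, 0x1F98, 0x1FA8):
        for offset in range(8):
            table[first_upper + offset] = first_upper + offset - 8
    # The three capitals with prosgegrammeni and no other diacritic.
    table[0x1FBC] = 0x1FB3  # capital alpha with prosgegrammeni
    table[0x1FCC] = 0x1FC3  # capital eta with prosgegrammeni
    table[0x1FFC] = 0x1FF3  # capital omega with prosgegrammeni
    return table


IOTA_SUBSCRIPT_LOWER = _build_iota_subscript_map()


def lower_greek(text):
    """Hypothetical wrapper: remap the problem characters, then lowercase the rest."""
    return text.translate(IOTA_SUBSCRIPT_LOWER).lower()
```

In a Scribunto module the same shape would presumably be a small substitution table applied before the call to mw.ustring.lower(), but that is an assumption, not something tested here.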
(straying off the original topic) Re "I don't like the way it's presented; just having /x/ > /y/ > /z/ is ambiguous because it doesn't actually specify how long of a period this change took place over, or which one is even Correct, and, at that point, why bother collapsing it?": I agree. - -sche (discuss) 04:05, 29 January 2016 (UTC)

EL: Language section (revision 2)

Wiktionary:Votes/pl-2015-12/Language passed 10 days ago. Some people complained that the text needs to be written more clearly.

I created Help:Language sections with a longer, more detailed explanation directed at newbies.

I have a proposal:

  • Removing all the how-to parts and explanations from WT:EL#Language and leaving only the rules plus a link to the help page.

Current text:


Each entry has one or more L2 (level-two) language sections. For example, the entry sea has different meanings in English and Spanish, both on the same page. Priority is given to Translingual: this heading includes terms that remain the same in all languages. This includes taxonomic names, symbols for the chemical elements, and abbreviations for international units of measurement; for example Homo sapiens, He (helium), and km (kilometre). English comes next, because this is the English Wiktionary. After that come other languages in alphabetical order. Language sections should be separated from each other by a horizontal line, generated with four dashes (----).[1]

For languages that have multiple names, a single name is chosen that should be used throughout Wiktionary. Typically, this is an English name for the language. See Wiktionary:Languages for more information.


Proposed text:

  • Every entry should have one or more language sections.
  • All language sections should be level-two.
  • The order of language sections is: Translingual, English, then other languages in alphabetical order
  • Language sections should be separated from each other by a horizontal line.
  • For languages that have multiple names, a single name is chosen that should be used throughout Wiktionary. See Wiktionary:Languages for more information.

See Help:Language sections for more information.
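The ordering rule in the proposed text (Translingual, then English, then everything else alphabetically) can be sketched as a sort key. This is only an illustration in Python, with hypothetical function names; it assumes plain case-insensitive code-point order is an acceptable stand-in for "alphabetical", which may not hold for language names with diacritics or other special characters.

```python
def language_sort_key(name):
    """Sort key: Translingual first, English second, the rest alphabetically."""
    if name == "Translingual":
        return (0, "")
    if name == "English":
        return (1, "")
    # Assumption: case-insensitive code-point order approximates alphabetical order.
    return (2, name.casefold())


def order_language_sections(names):
    """Return the language section names in the order the rule describes."""
    return sorted(names, key=language_sort_key)
```

For example, order_language_sections(["Spanish", "English", "French", "Translingual"]) puts Translingual and English ahead of French and Spanish.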

Looks good? --Daniel Carrero (talk) 02:22, 30 January 2016 (UTC)

That looks really nice and clear; I wish more guidelines had been written this way when I first came here. profesjonalizmreply 06:38, 30 January 2016 (UTC)
 :) --Daniel Carrero (talk) 23:31, 30 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-02/Language 2. --Daniel Carrero (talk) 05:57, 4 February 2016 (UTC)

Proposal: Transclude only 2 months in WT:BP and other discussion rooms

In Wiktionary talk:Beer parlour#Transclude last three months makes the page very slow to load, @Automatik complained that it is extremely slow to load WT:BP on their computer and suggested transcluding only the last 2 months in WT:BP instead of the current 3 months.

I support that. As I pointed out in that discussion, a BP discussion older than 2 months is most often an inactive discussion. This change would require editing {{discussion recent months}} and I think it would make sense changing the behavior of all pages that use that template, not just WT:BP but WT:GP, WT:ID, WT:TR and WT:ES as well.

Can I make these pages transclude only the last 2 months? --Daniel Carrero (talk) 00:13, 31 January 2016 (UTC)

I don't have a problem with it, although I don't feel strongly; I usually access the individual month-specific pages through the watchlist and haven't noticed the slowdown that much (although it did take maybe 7-8 secs to load the front page when I just checked). Benwing2 (talk) 03:51, 31 January 2016 (UTC)
  Done --Daniel Carrero (talk) 16:59, 2 February 2016 (UTC)
Does this mean that we sometimes only have one month and one day of BP discussion loading? Or is it as little as two months and one day? DCDuring TALK 20:43, 2 February 2016 (UTC)
Now, sometimes we only have one month and one day of BP discussion loading. I'm happy with the current state, but that could probably be changed with more complex rules if people want, like "keep showing 3 months for the first two weeks of the month, then change to showing two months".
--Daniel Carrero (talk) 21:00, 2 February 2016 (UTC) (edited: --Daniel Carrero (talk) 13:33, 3 February 2016 (UTC))
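For what it's worth, the "more complex rule" mentioned above (three months during the first two weeks of a month, two months afterwards) is easy to state precisely. The sketch below is Python, not the wikitext of {{discussion recent months}}; the function name and the grace_days parameter are hypothetical, and "first two weeks" is read as days 1 to 14.

```python
from datetime import date


def months_to_transclude(today, grace_days=14):
    """Return (year, month) pairs to show, newest first.

    Shows three months while the current month is at most `grace_days`
    days old, and two months after that.
    """
    count = 3 if today.day <= grace_days else 2
    months = []
    year, month = today.year, today.month
    for _ in range(count):
        months.append((year, month))
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
    return months
```

With this rule the page would never show less than one month and two weeks of discussion, at the cost of being heavier to load at the start of each month.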

CAT:, T:, MOD:, AP:, RC: are working

Testing all shortcuts, they are all working:

--Daniel Carrero (talk) 02:02, 31 January 2016 (UTC)

When did we have a new namespace that I am not aware of? --kc_kennylau (talk) 02:07, 31 January 2016 (UTC)
@Kc kennylau: These are all preëxisting namespaces, which we now have shortcuts for as a result of a vote. —Μετάknowledgediscuss/deeds 03:43, 31 January 2016 (UTC)
@Metaknowledge: I am talking about the RC namespace. --kc_kennylau (talk) 03:44, 31 January 2016 (UTC)
@Kc kennylau: Also the result of a vote. —Μετάknowledgediscuss/deeds 03:48, 31 January 2016 (UTC)
What about U: for User:? --kc_kennylau (talk) 05:31, 31 January 2016 (UTC)
It was not part of the vote. It could be voted on in the future, but personally I don't care much for U: for User: because User: is short enough. H: for Help: (equally short) was voted on but failed. --Daniel Carrero (talk) 15:02, 31 January 2016 (UTC)

Namespace "Reconstruction"

Is Reconstruction: better than Reconstructed:? --kc_kennylau (talk) 04:12, 31 January 2016 (UTC)

The discussion which resulted in this name being chosen is Wiktionary_talk:Votes/2015-09/Creating_a_namespace_for_reconstructed_terms#What_should_we_name_the_namespace.3F. - -sche (discuss) 04:41, 31 January 2016 (UTC)
@-sche: Alright. Then I can start bot-moving everything. --kc_kennylau (talk) 05:18, 31 January 2016 (UTC)
Oops. Looks like it is not possible. --kc_kennylau (talk) 05:28, 31 January 2016 (UTC)
Should we continue to create new entries under Appendix: until currently existing pages are mostly moved and templates like {{inh}} or {{m}} point to the correct namespace? --Tropylium (talk) 21:08, 31 January 2016 (UTC)

About: Pronunciation 1, Pronunciation 2, Pronunciation 3

Some stats:

  • There are 31,175 entries with "Etymology 1" written somewhere. (search link: here)
  • There are 5,985 entries with "Pronunciation 1" written somewhere. (search link: here)

WT:EL does not currently allow entries with numbered pronunciation sections such as "Pronunciation 1", "Pronunciation 2" and "Pronunciation 3". Are these sections something we want? If the answer is yes, I have a proposal:

  • Officially allowing numbered pronunciation sections in entries by editing WT:EL. (most likely WT:EL#Pronunciation, and also somewhere in WT:EL explaining what are the allowed sections and section order)

See also: Category:Entries with Pronunciation n headers for a bot-populated category with the entries affected by this proposal.

Older discussions:

--Daniel Carrero (talk) 15:14, 31 January 2016 (UTC)

Whenever I find these (in Russian or Arabic), I rewrite them to have "Etymology N" in them. Benwing2 (talk) 01:30, 1 February 2016 (UTC)
Other such sections that I've seen are "Noun 1", "Noun 2", etc. These I also rewrite, either to "Etymology 1", "Etymology 2" or just "Noun", "Noun". Benwing2 (talk) 01:32, 1 February 2016 (UTC)
My objective is to try again to make an official list of allowed headings, as comprehensive as possible, after Wiktionary:Votes/pl-2015-12/Headings failed.
"Pronunciation 1", "Pronunciation 2", etc. are part of that project, because I'd like to say either that they're allowed, or disallowed, or that there's no consensus for them yet (whatever the case may be).
If people don't want to use "Pronunciation 1", I'd be glad to mention that fact in a new proposed Headings list.
Just to be sure, I would probably create a separate vote first, specifically about numbered pronunciation sections, with both options: 1) Allow numbered pronunciation sections, 2) Disallow numbered pronunciation sections.
(Noun 1, Noun 2 are already officially disallowed per WT:EL#Part of speech after Wiktionary:Votes/pl-2015-12/Part of speech passed recently.) --Daniel Carrero (talk) 02:10, 1 February 2016 (UTC)
One of the things I don't like about "Pronunciation 1", "Pronunciation 2", etc. is that there's no clear way they interact with "Etymology 1", "Etymology 2", etc. If we nest Pronunciation under Etymology, we get level-6 inflection tables and such, which seems awkward. For Russian, we have a single Pronunciation subsection per etymology section, and if there are multiple pronunciations in that section, they're all listed under the Pronunciation section with an annotation indicating which headword they go with. See сопли for an example. (They are grouped in the same etymology section because сопли as a lemma is a plurale tantum formed etymologically as the plural of сопля, and the other entries are for non-lemma forms of the same сопля. Forms for different lemmas should generally go in different etymology sections.) Benwing2 (talk) 04:29, 1 February 2016 (UTC)
Most of the entries with Pronunciation n headers are Latin inflected forms with pronunciations that cut across PoS headers and Etymologies. Anyone recommending elimination of Pronunciation n headers should have a proposed format for, say, auraria that at least the regular workers on Latin entries find acceptable. A userpage or sandbox mockup would be nice. I have worked to eliminate such headers in English entries, but could not see how to do it for Latin entries without making the entries almost useless. Maybe fresh eyes can do better. DCDuring TALK 06:03, 1 February 2016 (UTC)
See переда. This is how we tackle this issue in Russian. Benwing2 (talk) 06:26, 1 February 2016 (UTC)
Note: The use of annotations like in переда is necessary in other situations, too, that's why we adapted it for this situation. See Немуро for such an example. (BTW when I say "annotations" I mean the boldfaced words appearing to the left of the IPA, rather than the phonetic respelling appearing to the right in Немуро, which is used only because the pronunciation is irregular.) Benwing2 (talk) 06:29, 1 February 2016 (UTC)
@DCDuring, Benwing2: I don't speak Latin or Russian, but I created User:Daniel Carrero/auraria which should be auraria using the format of переда (without annotations like the ones seen in Немуро). See if I made any mistakes, feel free to edit that page, add annotations if necessary, change whatever you want. --Daniel Carrero (talk) 06:41, 1 February 2016 (UTC)
I'm not the one to be editing the proposed entry. Perhaps some of the following contributors to WT:ALA would be interested: @Wikitiki89, I'm so meta even this acronym, SebastianHelm, Pengo, Angr, Metaknowledge, CodeCat, Robert.Baruch, Jerome Charles Potts (apologies to any I've missed). DCDuring TALK 10:30, 1 February 2016 (UTC)
@DCDuring: Thanks for the ping. I opine below. — I.S.M.E.T.A. 03:21, 3 February 2016 (UTC)
I like the format of User:Daniel Carrero/auraria, though it should be flexible enough to allow cases where the headword forms are identical as well, e.g. the present tense vs. past tense of read. —Aɴɢʀ (talk) 11:19, 1 February 2016 (UTC)
Agreed. I imagine that could be done with present and past annotations or something of that sort. Benwing2 (talk) 12:00, 1 February 2016 (UTC)
I do not see any problem with having Pronunciation 1 and 2 parallelly to Etymology 1 and 2. That's how I handled it for wa and having looked at the alternative offers, I still prefer it over summarising multiple semantically relevant pronunciations under a single header. Korn [kʰʊ̃ːæ̯̃n] (talk) 12:41, 1 February 2016 (UTC)
I'd be OK with having it either way, but I prefer the format used for User:Daniel Carrero/auraria. Andrew Sheedy (talk) 00:08, 2 February 2016 (UTC)
I am opposed both to blank etymology sections and to multiple instances of the same POS header occurring in the same nest (which occurs, for example, when a level-3 Noun header is immediately followed by another level-3 Noun header). Accordingly, I am particularly opposed to the presentations in переда (pereda) and User:Daniel Carrero/auraria. (As an aside, I think it's bizarre that both those entries have two pronunciation sections with identical contents — why not simply move the pronunciation information to the top of the entry so it applies to both etymology sections, and save the space and redundancy?!) Because I oppose multiple instances of the same POS header occurring in the same nest, I consider numbered pronunciation sections to be indispensable for Latin (chiefly for the ablative singular feminine form vs. the other graphically isomorphic forms of first–second-declension adjectives, but also for some other cases). However, even in the case of variously pronounced homonyms, I sometimes see value in using numbered pronunciation sections rather than numbered etymology sections. Take, for example, the Latin entry I recently created for Dion. Now, I could have written it like this, but what would've been the point? Adding etymology headers like that adds no useful information and is just a waste of space; numbered pronunciation sections are better for that entry.
I don't really understand this drive to get rid of numbered pronunciation sections. They're useful and intuitive and nothing is gained (as far as I can see) by banning them. Let's allow them, so I can stop adding {{rfc-pron-n|Pronunciation 1|lang=la}} to Latin entries I create with numbered pronunciation sections once and for all. — I.S.M.E.T.A. 03:21, 3 February 2016 (UTC)
@I'm so meta even this acronym You're right, normally in Russian when two etymology sections share the same pronunciation we put it once at the top; I messed up переда in this respect. I still find it awkward, though, to nest pronunciation sections under etymology sections. Benwing2 (talk) 04:14, 3 February 2016 (UTC)
@Benwing2: Why? — I.S.M.E.T.A. 22:13, 8 February 2016 (UTC)
Because that's the standard practice. --WikiTiki89 22:19, 8 February 2016 (UTC)
I, on the other hand, am opposed to nesting POS sections under anything at all. It's inconsistent that sometimes they are L3 below an etymology or pronunciation section, sometimes L4 nested within either of these sections. The term should be primary, and any information that is associated with the term should be nested under it. —CodeCat 20:34, 3 February 2016 (UTC)
I would even go as far as to say that it makes no sense that the part of speech of a word is part of the hierarchy of our entry structure. The part of speech logically should be attached to individual definitions, or at least small groups of related definitions. But there's something about maintaining the status quo in order to focus our efforts on the quality of entries rather than on reformatting the whole dictionary. --WikiTiki89 20:45, 3 February 2016 (UTC)
The status quo already attaches the part of speech to individual definitions or small groups of related definitions. The POS header is what does that. —CodeCat 20:50, 3 February 2016 (UTC)
No, the POS header attaches definitions to a POS. That's different. Also, our groups of related definitions are often not as small as what I meant in my previous post. --WikiTiki89 20:56, 3 February 2016 (UTC)
  • Some data points from Japanese.
Japanese lemmata here at EN WIKT are typically the kanji spellings. For these entries, like , pronunciation is subordinate to etymology.
We also have Japanese entries in kana, generally hiragana. These spellings are phonetic, except that kana spellings do not denote pitch accent (vaguely similar to stress in other languages). Sometimes a single kana spelling may have multiple possible pitch accents, and the etymology (or more often, which lemma) depends on the pitch accent. For these entries, like にじ (niji) or まく (maku) or しゃべる (shaberu), etymology is subordinate to pronunciation.
Allowing Pronunciation N headers is really the only way of cleanly organizing Japanese kana entries where there are multiple pitch accents. Japanese kana entries would be negatively affected if the Wiktionary community were to ban these headers. ‑‑ Eiríkr Útlendi │Tala við mig 01:49, 2 February 2016 (UTC)
These are good points. I created User:Daniel Carrero/にじ to try and compare with the entry にじ you linked, using the same format as User:Daniel Carrero/auraria.
What I like about the page I created is that User:Daniel Carrero/にじ takes a bit less space than にじ.
  • Counting from the "Japanese" L2 header, 二時 has 16 lines and User:Daniel Carrero/にじ has 12 lines.
  • I removed the repeated "Noun" and the headword line "にじ ‎(romaji niji)".
  • I also removed the repeated IPA transcription ([nid͡ʑi])
  • The shortness is helped by the fact that にじ has a TOC and User:Daniel Carrero/にじ doesn't, because it has only 3 sections.
In the creation of User:Daniel Carrero/にじ, I used {{head}} because {{ja-noun}} was giving unexpected results in a user page, which I couldn't fix. Also, I edited the pronunciation markup manually to use a different style than {{ja-pron}} uses, which I can put in a different template for use in other entries if people want.
Overall, I find it a little weird that, when the lemma of a Japanese word is actually the entry with kanji, kana entries have pronunciation sections and romaji entries do not (romanization entries are just the modicum of information to find the right entries). But I won't propose the pronunciations to be removed from the kana entries; despite being somewhat odd IMHO, they are certainly helpful in comparing quickly where to locate the accents in different words spelled with the same kana. --Daniel Carrero (talk) 03:31, 2 February 2016 (UTC)
@Eirikr IMO, all three of にじ (niji), まく (maku), しゃべる (shaberu) should be using "Etymology N" headers, not "Pronunciation N" headers. The different pronunciations are clearly etymologically unrelated; that's exactly what separate etymology sections are for. E.g. しゃべる (shaberu) meaning "shovel" looks like it's borrowed from English, whereas the meaning "to chat" isn't. Benwing2 (talk) 04:27, 2 February 2016 (UTC)
As a user, I find the current Japanese sorting system easier to grasp, i.e. to have a better overview, than the alternative where a single-line list of words is put before the pronunciation. I also find it considerably more sightly than having a block of text in front of the pronunciation. I'd like to put out the idea of making Pronunciation the default (but not mandatory) topmost header, since it is basically the spoken æquivalent of spelling, which is the first thing we sort entries by now. For entries with one etymology but multiple semantically distinguishing pronunciations this would create different entries according to meaning and differentiability in spoken language and for entries with one pronunciation but multiple etymologies, this would cut a bit of superfluous text compared to now. Korn [kʰʊ̃ːæ̯̃n] (talk) 09:36, 2 February 2016 (UTC)
  • (after edit conflict) ... @Benwing2, one key factor with the kana entries is that they are phonetically oriented: if one were to reorganize by etymology instead, まく (maku) would need possibly ten different ===Etymology N=== headers, certainly no fewer than eight. Given that etymological information belongs on the lemma entries anyway, such a reorganization doesn't seem optimal. Some of these phonetic kana entries map more easily between pitch accents and etymologies, like しゃべる (shaberu), but again, these kana entries are phonetically oriented, and are intended as soft redirects to the kanji-spelled lemmata, with etymologies provided as part of the lemma entries. ‑‑ Eiríkr Útlendi │Tala við mig 09:44, 2 February 2016 (UTC)
  • Daniel, the kana entries are organized by the reading, rather than the spelling (i.e. kanji). Any user who understands at least romaji and kana can look up something they've heard and find the kana entry. For a given reading, however, there are often different pitch accents, with some words always pronounced with one pitch-accent pattern and never another. If we don't provide any pronunciation information on kana entries, a user who has heard [máꜜkù] cannot identify which set of lemma entries might be relevant, unless they click through to each of the linked lemma entries to check the pronunciation there. This is an onerous burden and is very poor usability. Including pronunciation information on the kana entry page enhances the page's intended function as a kind of disambiguation page, leading the user to the entry they are looking for.
I appreciate you taking the time to create your mock-up. However, I confess I find your proposed entry structure hard to read. It may be more compact, but compactness is not necessarily a virtue: your structure makes it less clear to me how the pitch accents and entries correlate. I also suspect it doesn't scale adequately: applying a similar redesign to more complicated pages like まく (maku) would produce a result that would be even harder to visually parse. ‑‑ Eiríkr Útlendi │Tala við mig 09:44, 2 February 2016 (UTC)
  • @Korn, I'm certainly open to that approach. I've struggled with the distance between the start of a Japanese entry section and the pronunciation / reading information. This is perhaps more awkward for Japanese, as kanji spellings often have only a tenuous connection to the reading. Organizing Japanese entries by pronunciation first is an attractive idea. ‑‑ Eiríkr Útlendi │Tala við mig 09:44, 2 February 2016 (UTC)
Should we disallow "Pronunciation 1" in most languages but keep them for Japanese as an exception because of the pitch accent in the kana entries?
I wanted to create a layout without "Pronunciation N" that works but it's okay if I was unable to. In any event, I changed a little the design of User:Daniel Carrero/にじ by getting rid of the table style and adding a couple of newlines in the pronunciation section.
I added the etymology of しゃべる and used Etymology N headers rather than Pronunciation N headers. --Daniel Carrero (talk) 17:12, 2 February 2016 (UTC)
I'm with Daniel here that it might make sense to allow "Pronunciation N" for Japanese kana entries and maybe other exceptions to be determined, but not generally. Benwing2 (talk) 01:06, 3 February 2016 (UTC)
Have we actually yet heard a single reason for not having numbered pronunciations? I'm strongly opposed to disallowing them, for if we admit that they're preferable for one language, we already know that they might be in other languages. And we should leave it to the users to decide when and where this need arises, without bureaucracy. Korn [kʰʊ̃ːæ̯̃n] (talk) 19:26, 3 February 2016 (UTC)

Guys, my original proposal in the first message was "Officially allowing numbered pronunciation sections". If people want them, fine. If people don't want them, fine too. I just want to try and make sure what is the consensus to update WT:EL accurately.

Should I create a vote with the proposal "Officially allowing numbered pronunciation sections"?

I don't want to create a vote with the opposite proposal "Officially disallowing numbered pronunciation sections" because @Eirikr made a good point that they are useful in Japanese. That's why I tried the "compromise" of disallowing numbered pronunciation sections everywhere but in Japanese. Re "for if we admit that they're preferable for one language, we already know that they might be in other languages.": Japanese seems to be a very special case, because of the kana stuff. (many entries with the same kana readings but different pitch accents)

Correct me if I'm wrong, but I believe "Pronunciation N" sections are unwanted for English. A search for "pronunciation 1" latin (link) returns 1,107 results, so the number of Latin entries with multiple pronunciations seems to be at most 1,107, which seems like a low number to me. I don't suppose it would be completely unreasonable to remove "Pronunciation N" from all Latin entries, would it? --Daniel Carrero (talk) 04:43, 4 February 2016 (UTC)

Since nobody seems to have brought an argument against multiple pronunciation sections, I would propose you start a vote on just generally allowing or disallowing them. If that fails, a new and more restricted proposal can still be generated from the discussion of the first one. Korn [kʰʊ̃ːæ̯̃n] (talk) 13:43, 4 February 2016 (UTC)
re: ""Pronunciation N" sections are unwanted for English."
Let's not get ahead of ourselves. The facts are that I didn't like them, didn't think they were necessary, and worked to eliminate them, there not being very many. AFAICR no one objected. I don't know whether any have been added since my efforts to eliminate them, nor whether there would be objections to having them or not having them. I don't even know that I would agree now with the way I eliminated them then. DCDuring TALK 13:57, 4 February 2016 (UTC)

@Korn, Daniel Carrero: Let me break down this issue so that we can create a better vote and also to explain some of the arguments:

A. Should we have multiple pronunciation sections or should we put multiple pronunciations in one pronunciation section?
B. If multiple pronunciation sections are used:
1. Should they be numbered or simply be repeated unnumbered?
Note: We used to have numbered POS sections, until we decided that we shouldn't have to keep the numbers updated and stopped numbering them. Now we simply have two ===Noun=== sections or two ===Verb=== sections following each other with no problem. It was agreed that only Etymology sections should remain numbered, although the reason they need to remain numbered is not clear to me. Perhaps it is because we used to link to them as [[foo#Etymology 2]], but we have stopped doing that in favor of sense-ids.
2. Should the content be nested under them, or simply come after them?
Note: I guess the argument here is that it does not make sense to nest things under the pronunciation header. After all, part of speech headers do not logically belong "under" a pronunciation. Most of our nesting is pretty straightforward, with only etymology sections allowing optional nesting based on whether there is one of them or more than one. To me that only makes a little bit more sense than nesting under pronunciations.
3. What happens if there are multiple pronunciation sections and multiple etymology sections?

--WikiTiki89 14:42, 4 February 2016 (UTC)

Do Header-levels make a technical difference for anything or are they merely a visual sorting tool? Korn [kʰʊ̃ːæ̯̃n] (talk) 10:25, 5 February 2016 (UTC)
They would make it harder for an amateur like me to process the dump definitively, if I were ever working with a group of languages that used this header. It requires a bit more care in constructing regexes. DCDuring TALK 13:06, 5 February 2016 (UTC)
Header-levels make it clear which POS sections the Pronunciation section applies to. If it is simply at the top, it is pretty clear that it applies to all of the POS sections, but if it occurs in the middle, it's not clear whether it should apply to just the next POS section or all the rest of the POS sections. Nesting solves this problem by putting all POS sections that the Pronunciation section applies to "under" the Pronunciation section. That way the Pronunciation section clearly does not apply to anything that follows but is not "under" it. The structure is often clearer in the TOC than in the actual entry. But still, all the things I just mentioned are only visual and make no technical difference, and same with all of our entry layout decisions really. --WikiTiki89 14:37, 5 February 2016 (UTC)
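The scoping described above can be sketched in code. This is only a minimal illustration, assuming headings are written in plain wikitext as lines of matching equals signs; the function names (parse_headings, sections_under, pronunciation_scope) and the sample entries are hypothetical and are not taken from any existing Wiktionary tool or dump parser.

```python
import re

# A wikitext heading: the same run of 2-6 "=" signs on both sides of the title.
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def parse_headings(wikitext):
    """Return (level, title) tuples for every heading, in document order."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING_RE.finditer(wikitext)]


def sections_under(headings, index):
    """Titles nested under headings[index]: every following heading with a
    deeper level, up to the next heading at the same or a shallower level."""
    level = headings[index][0]
    nested = []
    for lvl, title in headings[index + 1:]:
        if lvl <= level:
            break
        nested.append(title)
    return nested


def pronunciation_scope(wikitext):
    """Pair each Pronunciation heading with the sections nested under it."""
    headings = parse_headings(wikitext)
    return [
        (title, sections_under(headings, i))
        for i, (_, title) in enumerate(headings)
        if title.startswith("Pronunciation")
    ]


# Hypothetical entry using nested, numbered pronunciation sections.
NESTED = """==English==
===Pronunciation 1===
====Noun====
====Verb====
===Pronunciation 2===
====Adjective====
"""

# Hypothetical entry using repeated, unnested pronunciation sections.
FLAT = """==English==
===Pronunciation===
===Noun===
===Pronunciation===
===Verb===
"""

if __name__ == "__main__":
    print(pronunciation_scope(NESTED))
    print(pronunciation_scope(FLAT))
```

In the nested layout the heading levels alone say which POS sections each pronunciation covers. In the flat layout nothing is nested under either Pronunciation heading, so a script would need an extra convention (for example, "applies until the next Pronunciation heading") to recover the same information.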
As I said before, my objective is to make a list of allowed headings, and "Pronunciation N" is in the way, so I'd like to know whether it's allowed or disallowed (or whether there is no consensus). I'm thinking of creating a vote based on @Wikitiki89's proposal above, but I'm not sure what the options for part 3 would be ("3. What happens if there are multiple pronunciation sections and multiple etymology sections?"). Probably the options would be:
  • Allow/disallow nested numbered pronunciation with numbered etymology sections.
--Daniel Carrero (talk) 03:42, 6 February 2016 (UTC)
Should these questions be asked as a vote or as a poll? I'm thinking a poll is better because this issue has not been discussed much yet; we have yet to show a clear consensus about it.
The poll could also ask: Allow/disallow for Latin, Allow/disallow for Japanese, Allow/disallow for English, etc. (just a few select languages that have been brought up for discussion here) --Daniel Carrero (talk) 03:53, 6 February 2016 (UTC)
I created Wiktionary:Votes/2016-02/Multiple pronunciation sections, mostly based on @Wikitiki89's idea above, but I had to fill in the blanks in the question about "multiple pronunciation sections and multiple etymology sections". Feel free to edit that vote. --Daniel Carrero (talk) 02:29, 7 February 2016 (UTC)

I'll start this vote: "Entry name: sign languages"

FYI: Wiktionary:Votes/pl-2015-12/Entry name: sign languages was created by me 1½ months ago (on December 16). I was delaying the start, but I think it is good to go now. (Basically, I was waiting for Wiktionary:Votes/pl-2015-12/Entry name section 2 to end first.)

I edited the vote a bit more and added diffs. Feel free to change it further if you want; if anyone does, I'll update the diffs before starting the vote. Related discussions: Wiktionary talk:Votes/pl-2015-12/Entry name: sign languages.

--Daniel Carrero (talk) 16:30, 31 January 2016 (UTC)


I created Wiktionary:Votes/pl-2016-01/Pronunciation.

Rationale and changes:

  • More compact version of the same policy. No rules are intended to change; they are just described in a way that takes up less space.
  • Using a bulleted list to organize the ideas. The order of ideas changes in a few places.
  • The subsections (Homophones and Rhymes) were removed. The same information was kept, albeit in the bulleted list.
  • The current text uses 4 entries as examples of multiple types of pronunciation information: portmanteau, beta, right and hat. The proposed text uses only 1 entry as an example of all the types of pronunciation information previously mentioned: right. Bonus: The right example shows the transcription, the audio, the homophones and the rhymes, in that order.
  • Another step in the direction of having WT:EL completely voted on.

--Daniel Carrero (talk) 21:21, 31 January 2016 (UTC)