Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

September 2015

Category:Sanskrit language appears in Category:All extinct languages

To my surprise, w:Sanskrit shows that Sanskrit is an official language in one region of India and that it has native speakers. Yet we class it as an extinct language, so one of us is wrong. Renard Migrant (talk) 18:07, 1 September 2015 (UTC)

The Wikipedia page says "The Mattur village in central Karnataka claims to have native speakers of Sanskrit among its population. Inhabitants of all castes learn Sanskrit starting in childhood and converse in the language." This really seems no different from the status Latin would have had maybe a hundred years ago in some places. --WikiTiki89 18:11, 1 September 2015 (UTC)
@Wikitiki89: For that matter, Latin is in this category. To be sure, there are Latin speakers and even those who are exposed to it from birth (through Catholic Mass) but it is effectively a dead language. —Justin (koavf)TCM 03:25, 2 September 2015 (UTC)

Wiktionary:Votes/2013-10/Reconstructions need references

The vote is now open. (I presume pinging users I have seen working with etyls would be politeering? (Why is this not a word?)) Anyway, if you have an opinion, please participate in the vote, thank you! Neitrāls vārds (talk) 21:55, 1 September 2015 (UTC)

Manual inflection tables: positional or named parameters?

For manual inflection tables, each form is specified separately rather than generated through stems, rules and other logic. I am wondering whether it's preferred to have such templates with numbered/positional parameters to specify each form, or named ones? Named has the advantage that you don't have to remember which parameter is for which form, but it's also a lot more to type. —CodeCat 14:08, 2 September 2015 (UTC)

I think it's better to have named parameters, along with a copy-and-paste template in the template's documentation to save typing time. See Template:uk-adj-table. --WikiTiki89 14:35, 2 September 2015 (UTC)
Or for a more extreme example {{sga-conj-complex}}. And I agree with WikiTiki89 that it's better to use named parameters. —Aɴɢʀ (talk) 14:56, 2 September 2015 (UTC)
I think such a template should have named parameters if there are many parameters or if there are likely to be parameters skipped in some cases, because (in both those cases) people using the template will have a hard time keeping track of positional parameters. I think such a template should have positional parameters if some of its uses will be of only the first not-many parameters, because people using the template will not want to write named parameters' names. Thus, where both those criteria apply, the template should have both named and positional parameters, like {{{so|{{{1}}}}}}. (I don't see the harm in doing things that way even if not both the criteria apply.)​—msh210 (talk) 21:53, 8 September 2015 (UTC)
Well, for these kinds of templates, all the parameters are given the vast majority of the time. —CodeCat 22:04, 8 September 2015 (UTC)
For most of them, yes, I think so. But since this question (whether to use positional or named parameters) is decided when writing each template, and there may be some — I suspect there are — which are often used without all their parameters, it is appropriate to note (and, if writing guidelines, to build into the guidelines) the criteria I mentioned.​—msh210 (talk) 21:56, 9 September 2015 (UTC)
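As a rough illustration of the combined approach msh210 describes with {{{so|{{{1}}}}}}, here is a minimal Python sketch of how such a fallback resolves. It only models the lookup order; the function name `resolve_param` and the sample form values are hypothetical, and it is not how any existing Wiktionary template or module is actually implemented.

```python
def resolve_param(args, name, position):
    """Model the wikitext fallback {{{name|{{{position}}}}}}.

    args maps parameter names to values as an editor would supply them;
    positional parameters use string keys such as "1" or "2".
    """
    # A supplied named parameter always wins.
    if name in args:
        return args[name]
    # Otherwise fall back to the positional parameter. MediaWiki renders an
    # undefined parameter that has no default as its literal {{{n}}} text.
    return args.get(str(position), "{{{%d}}}" % position)


# Named call, positional call, and a call that omits the form entirely.
print(resolve_param({"so": "named form"}, "so", 1))
print(resolve_param({"1": "positional form"}, "so", 1))
print(resolve_param({}, "so", 1))
```

Accepting both styles costs one extra lookup per form, which is why it does little harm even for templates where nearly every parameter is always given.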

Introducing the Wikimedia public policy site

Hi all,

We are excited to introduce a new Wikimedia Public Policy site. The site includes resources and position statements on access, copyright, censorship, intermediary liability, and privacy. The site explains how good public policy supports the Wikimedia projects, editors, and mission.

Visit the public policy portal: https://policy.wikimedia.org/

Please help translate the statements on Meta Wiki. You can read more on the Wikimedia blog.


Yana and Stephen (Talk) 18:12, 2 September 2015 (UTC)

(Sent with the Global message delivery system)

Is this really something Wikimedia should be involved in? There goes NPOV... --WikiTiki89 18:21, 2 September 2015 (UTC)
It seems like policy that favors the survival of WMF and the projects. And they can't go it alone so they have to ally and coordinate with other parties. DCDuring TALK 19:28, 2 September 2015 (UTC)
@DCDuring: I don't think they're worried about not surviving. It probably has to do with ensuring that people in every country are allowed to have access to WMF resources. Even though this is highly desirable for the WMF and its projects, it isn't a task that the WMF itself should be taking on (see my other responses below). --WikiTiki89 14:02, 3 September 2015 (UTC)
@Wikitiki89 The survival question in the short run is just the intermediary liability issue, which could easily bankrupt WMF as well as create a chilling effect. All four of the other thrusts are longer-term survival matters: maintaining the economic model (eg copyright) and a political model to enable WMF projects to serve superordinate goals that make contributors feel they are working for a higher cause in lieu of monetary compensation. The superordinate goals also help garner support from other elements in society. DCDuring TALK 17:58, 3 September 2015 (UTC)
@DCDuring: Ok, maybe I'm wrong about the survival question, but the point I was trying to make still stands: this isn't a task that the WMF itself should be taking on. --WikiTiki89 18:04, 3 September 2015 (UTC)
Very few organizations of any size and weight in the world get to ignore public policy matters. WMF has selected issues that are close to its core mission. I would be unhappy if they got involved in other causes, no matter how much I agreed with them, eg, certain environmental issues, official corruption, nuclear proliferation. DCDuring TALK 21:34, 3 September 2015 (UTC)
The second you make a political statement, you alienate everyone that disagrees. The WMF's core mission is to make information available to everyone, and alienating people is not the right way to achieve that. WMF projects need to be able to thrive even in places where free speech is a foreign concept and governments looking over people's shoulders is taken for granted. --WikiTiki89 21:57, 3 September 2015 (UTC)
I think WMF would lose a lot of very committed people if it were to fail to push its values as best it can and many others would lose some of the feel-good that keeps them contributing, weakening their commitment to the projects. DCDuring TALK 00:47, 4 September 2015 (UTC)
You're saying people would quit supporting the WMF if it weren't vocal enough about politics? --WikiTiki89 14:01, 4 September 2015 (UTC)
And people really get excited about such things? SemperBlotto (talk) 19:32, 2 September 2015 (UTC)
  • @Wikitiki89: WMF doesn't have to abide by NPOV, just the projects themselves. —Μετάknowledgediscuss/deeds 21:20, 2 September 2015 (UTC)
    • @Wikitiki89: I'm confused... Are you saying the WMF and broader Wikimedia community of editors and supporters shouldn't be in favor of copyright reform and protecting readers' privacy? I don't see what is inappropriate here. —Justin (koavf)TCM 21:22, 2 September 2015 (UTC)
      • I agree, this is a great thing to have. bd2412 T 22:55, 2 September 2015 (UTC)
      • @Metaknowledge: They don't have to, but they should. The NPOV philosophy of WMF projects loses a lot of credibility if the organization behind these projects is making political statements, regardless of what these political statements are. @Koavf: The editors can be in favor of whatever they want, but the Wikimedia organization itself should remain publicly neutral, even if every single one of the editors shared the same opinions. --WikiTiki89 14:02, 3 September 2015 (UTC)
Hmm, as long as they don't start bringing politics into it like the current GitHub code-of-conduct controversy. Equinox 23:07, 2 September 2015 (UTC)
Thanks for pointing out the GitHub thing. We don't have a hardcore adolescent geek culture here, despite occasional outbreaks of the kind of humor that can be offensive. I also haven't seen much of it in other WMF projects, so there shouldn't be as much occasion for a Code of Conduct. We did manage to deal with that kind of thing without resorting to a formal code of conduct and banning. I do expect that there will be pressure to adopt such a thing, however, though the policy thing seems to deal solely with public policy, not internal policy. DCDuring TALK 23:45, 2 September 2015 (UTC)

Special categories for Euro & Brazilian Portuguese forms

We have category:American English forms and category:British English forms. Why not category:Brazilian Portuguese forms & category:European Portuguese forms? Combining the orthographies and semantics just makes navigation harder. --Romanophile (talk) 00:48, 3 September 2015 (UTC)

  • @Romanophile: Agreed. It's completely legitimate to separate varieties for navigational purposes and this is particularly common with pt/pt-br. —Justin (koavf)TCM 03:10, 3 September 2015 (UTC)
  • Support. {{tcx}} could be made to categorise {{tcx|Portugal}} as Category:European Portuguese forms, but entries created prior to {{tcx}} will have to be updated. — Ungoliant (falai) 15:08, 3 September 2015 (UTC)
    • But then what if the label "Portugal" is used for another language? —CodeCat 15:21, 3 September 2015 (UTC)
      • {{tcx}} takes a language code. — Ungoliant (falai) 15:48, 3 September 2015 (UTC)
        • That's not the point. What would {{lb|en|Portugal}} result in? —CodeCat 16:04, 3 September 2015 (UTC)
          • Whatever it results in currently. — Ungoliant (falai) 16:09, 3 September 2015 (UTC)
  • Yes, and we should do the same for transpondian Spanish (if we don't already). SemperBlotto (talk) 15:12, 3 September 2015 (UTC)
    Are there that many orthographic differences between European and Latin-American Spanish? --WikiTiki89 15:27, 3 September 2015 (UTC)
    I'm not a Spanish expert - but I know that tortilla has different meanings in Europe and the Americas. SemperBlotto (talk) 15:32, 3 September 2015 (UTC)
    But this isn't about meanings, this is about orthography. We already have Category:Spanish Spanish and Category:Latin American Spanish for things like that. --WikiTiki89 15:50, 3 September 2015 (UTC)
    There are no (longer any) orthographic differences in the standard of Spanish of Latin America versus that of Spain. —Μετάknowledgediscuss/deeds 16:03, 3 September 2015 (UTC)
    @Wikitiki89: Spanish is almost entirely phonetic (with some caveats about c/k/s/th/z sounds running together) so there are very few--if any--orthographic differences. I could imagine some eye spellings becoming somewhat popular in regions but I don't know of any. In fact, I don't know of any Spanish spelling differences like the American/British differences between colo(u)r/hono(u)r/etc. —Justin (koavf)TCM 04:04, 4 September 2015 (UTC)
    I cringe every time someone says any language's orthography is "almost entirely phonetic". This seems true on the surface, but breaks down when you take a deeper look, especially at not-completely-standard varieties. --WikiTiki89 14:04, 4 September 2015 (UTC)
    @Wikitiki89: E.g.? —Justin (koavf)TCM 02:03, 5 September 2015 (UTC)
    @Koavf: The first two things that come to mind for Spanish are:
    • Voicing assimilation of "s": mismo is pronounced [ˈmizmo] rather than [ˈmismo], etc.
    • In some dialects the dropping of syllable-final "s" affects the quality of the preceding vowel and creates a phonemic distinction between, for example, todo [ˈtoð̞o] and todos [ˈtɔð̞ɔ].
    There are many more examples. --WikiTiki89 15:04, 8 September 2015 (UTC)
    @Wikitiki89: Sure, or the peninsular distinction--pronouncing "s" as "sh". But the spelling is not a virtual free-for-all as in English. I don't know of any language anywhere near the size of Spanish where spelling can be inferred from sound and vice versa as well. —Justin (koavf)TCM 15:40, 8 September 2015 (UTC)
    @Koavf: Isn't that exactly what I was saying, that virtually no language is "almost entirely phonetic"? --WikiTiki89 15:45, 8 September 2015 (UTC)
    @Wikitiki89: Well, we may have been saying the same thing the entire time but my point was that although Spanish--like any language--is not perfectly phonetic, it is far more regular than the language that we are using right now. Especially when one considers that Spanish has about 600 million speakers. As a general rule of thumb, the language is phonetic but with some necessary explanation. I don't think it's really cringe-worthy when someone says, "Turkish is phonetic" because that is a meaningful statement. Maybe not a perfect one but still a useful one for understanding that spelling is very standardized and maps to pronunciation in a predictable way. (In point of fact, this reminds me of when I worked in a bookstore and a Turk asked me in which "ay-zul" a book was--she meant "aisle".) —Justin (koavf)TCM 15:56, 8 September 2015 (UTC)
    First of all, English is much more phonetic than people give it credit for; there are just a lot more rules to learn. Second of all, the fact that you just said that Turkish spelling is standardized proves that it is not a phonetic language for speakers of less standard dialects (a truly phonetic language would not have a standard and everyone would write in their own dialect, which is almost the case for Serbo-Croatian). Third of all, there is a big difference between "far more regular than [English]" and "almost entirely phonetic". --WikiTiki89 16:49, 8 September 2015 (UTC)
    @Wikitiki89: What I meant is that if you hear Turkish, you will know how it's spelled and if you see something spelled in Turkish, you will know how to pronounce it. Not anything about a standard register. This is not even close to true for English and if you have a huge panoply of rules to remember, then you don't have very phonetic spelling--that's what makes spelling phonetic. If you hear English and try to transcribe the sounds using the ISO-standard Latin alphabet, you can get all kinds of spellings and many of them not close to proper English. This is not true for Spanish. In fact, this is empirical: we could take native and non-native speakers and have them transcribe sounds or guess how words are spelled and we would find that they would be much more accurate for Spanish or Turkish than for English. That's all I'm claiming and I don't think anyone would make a more over-reaching claim about any language being perfectly phonetic or without variation over time and space. —Justin (koavf)TCM 17:28, 8 September 2015 (UTC)
    You're right that if you see something spelled in standard Turkish, you will know how to pronounce it in the standard language (and not counting the exceptions that I presume exist but do not know about, having never studied Turkish). I don't know what you mean by transcribing English with the "ISO-standard Latin alphabet", but the average English speaker would be able to accurately transcribe spoken English into written English, even when this speech contains words not previously known to the listener. A non-native English speaker who is not very proficient would not be able to. But the same goes for Spanish and (I presume) Turkish. If you heard a Spanish speaker say [ˈtɔð̞ɔ], you might mistakenly transcribe it as todo, depending on your familiarity with this class of dialects. You might also hear [ˈla.o] and not know whether to transcribe it as lado, lago, or lavo. --WikiTiki89 17:56, 8 September 2015 (UTC)
    There are a few eye spellings like rajuñar vs. rasguñar but they are rather rarely used and officially considered incorrect. Matthias Buchmeier (talk) 05:05, 4 September 2015 (UTC)
    There are regional morphological differences in the second person, though they don't line up in a completely tidy Europe/America split. Chuck Entz (talk) 06:01, 4 September 2015 (UTC)
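On the earlier {{tcx}} question in this thread (what happens when the label "Portugal" is used with another language), one way to picture Ungoliant's answer is a category lookup keyed on both the language code and the label. The Python sketch below is only a hypothetical model: the table `FORM_CATEGORIES`, the function `form_category` and the "Brazil" label are assumptions for illustration, not the actual behaviour of {{tcx}} or {{lb}}.

```python
# Hypothetical lookup table: (language code, label) -> category name.
# Only Portuguese labels are listed, so other languages are unaffected.
FORM_CATEGORIES = {
    ("pt", "Portugal"): "Category:European Portuguese forms",
    ("pt", "Brazil"): "Category:Brazilian Portuguese forms",
}


def form_category(lang, label):
    """Return the forms category for this language and label, or None."""
    return FORM_CATEGORIES.get((lang, label))


print(form_category("pt", "Portugal"))  # the Portuguese label is categorised
print(form_category("en", "Portugal"))  # the same label under "en" is not
```

Because the key includes the language code, adding the Portuguese categories would leave a call such as {{lb|en|Portugal}} doing whatever it does currently.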

Languages (possibly) needing additional scripts

DTLHS (talk) 19:42, 4 September 2015 (UTC)

Acehnese was formerly written in Arabic script, per WP and Philip A. Luelsdorff, Orthography and Phonology (1987, ISBN 9027274436), page 136. Banjarese either formerly was or still is written in Arabic. Old Javanese was written in Javanese script. I'll update the modules accordingly.
The Algonquian translation I have simply removed (wrong script and part of speech).
The Chinese-script-Chamorro was a typo (of the language code cmn as ch, clearly mnemonic for "Chinese").
- -sche (discuss) 16:50, 5 September 2015 (UTC)

Two Russia German (Russlanddeutsch) languages

I have some words from two languages of the Russlanddeutsche which I would like to add, but we need to decide how to encode them.

  1. The Volga Germans speak primarily Rhine Franconian dialects[1] (with some Russian loans like Erbus (watermelon)),[2] similar to the Pennsylvania Germans.
    We could treat Volga German under the code gmw-rfr which we recently created for the Rhine Franconian varieties of Germany proper (at which point it would probably make sense to also merge Pennsylvania German into that header). In favour of this are the arguments that Volga German, Penn. German and Palatine German have remained very similar despite their geographically separate development, and having separate headers will result in many pages looking like Wasser does now. Some references, like the Pfälzisches Wörterbuch, do treat Volga German, Penn. German and Palatine proper as one language with a huge variety of dialects (since both Volga German and Palatine proper have quite a few dialects).
    Alternatively, we could treat Volga German as its own language (say, gmw-vog). In the past, we have tended to give lects their own codes if they developed independently due to geographic isolation, even if they didn't develop to be very different: hence not only Pennsylvania German but also Transylvanian Saxon, Hunsrik, Alemán Coloniero and other lects have their own codes separate from their parent varieties. And there is another argument: Volga German is not entirely Rhine Franconian; it developed in communities made up of people from all over (Alsace, Baden, East Central German areas, Sweden, etc), and hence it is in practice a mish-mash in which Rhine Franconian is merely the most dominant element (this is also true of Penn. German).[3][4] Several references treat Volga German as its own lect, though most of them comment on its similarity to Penn. and/or Palatine German.
    See User:-sche/Volga for a sample and comparison.
  2. The Russian Mennonites spoke Mennonite Low German, a.k.a. Plautdietsch. At the moment, our only ==Plautdietsch== entries are from American communities; I'd like to include historical texts from those communities while they were still in Russia, and texts from the communities which remain in Russia, under the same L2. Plautdietsch is distinguished from German- and Dutch- Low German by some phonological changes (especially to k) which hamper mutual intelligibility, but within Plautdietsch the distinction between American and European is only as notable as the distinction between Chortitzaer and Molotschnaer, and the references I can find treat the American and Russian (and Chortitzaer and Molotschnaer) varieties as the same language. See User:-sche/RMLG for a comparison: the principal distinction is that American MLG has [c], written 'kj', where Russian MLG has [tʲ], written 'tj'. To re-iterate, I'd like to add Russian Mennonite Low German under our existing Plautdietsch header with {{label}}s and {{a}}s, rather than giving it its own header.

Note that there are and historically were other Germans in Russia (e.g. Swabian speakers in some places), but I'm content to leave them undiscussed for now because I don't have words from them yet. (I intend to start a thread about the Danube Swabians later.) - -sche (discuss) 21:48, 4 September 2015 (UTC)

I'd lean in favor of a separate language code for Volga German, for the reasons you mention. (Is it never written in Cyrillic?) And I'm in favor of treating pdt as one language with a Russian dialect and an American dialect—or rather, a North American dialect, since isn't Plautdietsch also spoken in Canada and Belize? —Aɴɢʀ (talk) 08:04, 5 September 2015 (UTC)
On the other hand, w:Plautdietsch language#Varieties says the two major dialects are Chortitza and Molotschna; does that division correspond to what you're calling the Russian/American division? —Aɴɢʀ (talk) 08:12, 5 September 2015 (UTC)
No; modern Plautdietsch in both the Americas and Europe blends elements of the Chortitza and Molotschna varieties. For instance, early references note that /tʲ ~ c/ (from original */k/) originated as a Chortitza feature, but it is now found everywhere. Also, in the early period Molotschna had [uː] in words like 'Fru' and 'Hus' while Chortitza had [yː], but the references I looked at noticed speakers of both varieties using the other's form. One says that before WWII, "the Chortica rounded front vowel [yː] was replaced by the Molotchna long back vowel [uː], as in [fryː] - [fruː]", while another says (in German) that after the war, the "dominant [Molotschna] variety nevertheless did not prevail in all primary features, but instead adopted from the recessive [Chortitza] variety, for example, the umlaut long yː in place of long uː (fryː instead of fruː 'woman', hyːs instead of huːs 'house')". (It has been suggested that the switch to the otherwise less prestigious but older and more original Chortitza form was a way of defiantly resisting Russian pressure to become more Russian.) - -sche (discuss) 18:03, 5 September 2015 (UTC)
Maybe we should recognize four dialects of it then: the two older ones and the two modern ones. —Aɴɢʀ (talk) 19:04, 5 September 2015 (UTC)
re Cyrillic: I would expect Russian-language texts to mention individual Volga German words in Cyrillic the same way English texts mention Russian in transliterated form, but I can't even find examples of that, searching the web for Cyrillizations of common words, like "бам|баам|ман|манн" поволжские немцы. The printed as well as the handwritten texts in the language that I've seen are in Latin script, and the one Russian-language reference I have prints all the Volga German words in Latin script. - -sche (discuss) 21:39, 8 September 2015 (UTC)
I have added "Mennonite Low German", "Chortitza", "Molotschna", and "Russian Mennonite Low German" as alt names of pdt and will add content from Russia/Ukraine-based communities under that code soon.
I have not yet added a code for Volga German. I recognize that the weight of precedent regarding European languages is behind giving it its own code, and I have a weak preference for that myself. I worry that it does show "our" bias ("our" meaning not just Wiktionary, but the various generally European- or American-authored reference works on the languages themselves which we consult), however: that various mutually-intelligible shades of European languages are often treated as distinct while mutually-intelligible shades of African languages are often handled as single languages.
- -sche (discuss) 21:39, 8 September 2015 (UTC)
I have added gmw-vog. - -sche (discuss) 05:26, 28 September 2015 (UTC)

Open call for Individual Engagement Grants

Greetings! The Individual Engagement Grants program is accepting proposals from August 31st to September 29th to fund new tools, community-building processes, and other experimental ideas that enhance the work of Wikimedia volunteers. Whether you need a small or large amount of funds (up to $30,000 USD), Individual Engagement Grants can support you and your team’s project development time in addition to project expenses such as materials, travel, and rental space.

I JethroBT (WMF), 09:34, 5 September 2015 (UTC)

There is less than one week left to submit Individual Engagement Grant (IEG) proposals before the September 29th deadline. If you have ideas for new tools, community-building processes, and other experimental projects that enhance the work of Wikimedia volunteers, start your proposal today! Please encourage others who have great ideas to apply as well. Support is available if you want help turning your idea into a grant request. I JethroBT (WMF) (talk) 15:31, 24 September 2015 (UTC)

People who have productive names

I've thought about this for years, and haven't floated the idea before for fear of starting an ugly shitstorm of useless argument, but it's continued to bug me.

Wiktionary is not for biographies, but we have a lot of words that derive from proper nouns. It would make sense to me to include definitions for those base terms that have gone on to form other words. I'm not proposing lengthy biographies, just who a person is with a link to their Wikipedia article, and only for the form of their name which has been productive.

We have entries for Hemingwayesque and Hemingwayan, so we should include a biographical definition at Hemingway (e.g. "Ernest Hemingway (1899–1961), American writer and journalist") and list the derived terms. Keeping with the current rules, we don't need an entry at Ernest Hemingway nor Ernest Miller Hemingway, as they are not productive terms and don't form other words.

Similarly, we have the word Obamacare, so we should include a short biographical definition at Obama (but not at Barack Obama, nor Barack H. Obama, etc). Notably, Obama already has a biographical entry in defiance of the "no biographies" rule.

We have the term Darth Vader, as in "the Darth Vader of", but because of strict adherence to the "no biographies" rule, there is no definition for the fictional character w:Darth Vader himself. As such, the entry is ridiculous, redeemed only slightly by squeezing the fictional character into the etymology. You don't need to go to the discussion page to know there's been some weird argument that has led to the elephant-in-the-room definition list. The attributive use derives from the fictional character, so he should also get an entry at Darth Vader (but not Vader, nor Anakin Skywalker unless they also have derived terms).

Darwin has a lengthy list of derived terms but the entry awkwardly squeezes mention of Charles Darwin into the "A surname" definition.

You can have "a Picasso" (a work of art by Pablo Picasso), so Picasso himself should get an entry at Picasso. And under the Spanish heading I see he's snuck in.

Sorry to kick up this argument again, but it seems like a no-brainer to me, and in many cases it's what already seems to be happening. If a name is productive, that form of the name should be allowed a definition. If nothing else, it will allow some consistency for what is already being added. If there are too many odd cases, we could restrict it to only those people who are "notable enough" to have Wikipedia entries. And again, only the productive/attributive form of their name, not the Wikipedia article name. Thoughts? —Pengo (talk) 00:10, 8 September 2015 (UTC)

@Pengo: This issue is completely legitimate and probably thorny. We have commonly-used derivations like "Orwellian", "Kafkaesque", and "Dickensian" but I also added this change to Volapük and had it removed. Maybe constructions like this need to have a kind of secondary citation threshold: not only must someone coin the term "koavfesque" but someone also needs to comment on that usage ("It seems that the sorry state of public infrastructure is being described by both conservatives and liberals as 'koavfesque', meaning truly pathetic and dilapidated...") Does that make sense? —Justin (koavf)TCM 02:49, 8 September 2015 (UTC)
I see nothing ridiculous about providing an etymology for Darth Vader noting the origin of the name as a fanciful coinage for a fictional work. bd2412 T 03:20, 8 September 2015 (UTC)
We have reasonable dictionary definitions for several people who are known just by their surname. Hitler is a reasonable example. This sort of thing is good to have in a dictionary. We should have more of them. SemperBlotto (talk) 07:42, 8 September 2015 (UTC)
The "Picasso" noun entry seems a bit silly. You can do that with any painter's name, e.g. "a Manet". Equinox 17:56, 8 September 2015 (UTC)
Sure, but that's kind of my point, that we have an entry for "a Picasso" but, by our current guidelines, Picasso himself shouldn't have a sense.
I had another read of the CFI guidelines. I thought there was some kind of ban on biographical entries (perhaps because of the Darth Vader kerfuffle, or perhaps because it basically doesn't mention them at all), but the guidelines only exclude names of people that are made up of 2+ words:
No individual person should be listed as a sense in any entry whose page title includes both a given name or diminutive and a family name or patronymic. For instance, Walter Elias Disney, the film producer and voice of Mickey Mouse, is not allowed a definition line at Walt Disney.
So really this only excludes Darth Vader, which is kind of a bit silly and arbitrary, but otherwise biographical entries are basically ignored by CFI. Perhaps they could be made more explicitly allowed, with some notability guidelines.
As for "a Picasso", we kind of need that sense because you can also have several "Picassos", which is certainly attestable and that needs an entry too (just like Monets and Rembrandts). —Pengo (talk) 01:18, 9 September 2015 (UTC)
I think I'm in the minority, but my feeling has always been that we should WP-link to famous names like Einstein from our entries (under "See also", or conceivably "Derived terms") without attempting to "define" them ourselves. People are individuals who bear a name, not a sense or meaning of the name. People are not semantic. Having said that, I have come across hybrid "encyclopaedic dictionaries" (Oxford has or had one) that do include such entries, but I've found that they tend to be poor dictionaries and even worse encyclopaedias. Okay, we don't have the lack-of-paper problem, but WP is always going to be superior on encyclo coverage, and we should exploit that rather than doing some dubious weak copying of a fraction of its entries. Equinox 01:25, 9 September 2015 (UTC)
Of course people are semantic. It's just that names are often not unique globally, so their meaning is context dependent. But that's nothing new; every noun preceded by the is also context dependent. Names can be considered to have an inherent definite article in them. Some languages actually do use a definite article with names, too. —CodeCat 01:51, 9 September 2015 (UTC)
Bearing in mind my suggestion that "people are individuals who bear a name, not a sense or meaning of the name", would you then support a sense at Smith for every individual mentioned attestably (e.g. 3 mentions! CFI) as "Smith"? That's gonna be a long entry. Equinox 02:10, 9 September 2015 (UTC)
BTW, the "several Picassos" thing is almost a red herring to me, since you can pluralise surnames qua surnames (e.g. "I'm going to see the Smiths"). This then just becomes the combination of two rules: (i) you can pluralise a surname, (ii) a surname can stand in for a work by a person who bears that name. Equinox 01:26, 9 September 2015 (UTC)
@CodeCat For me, the evidence that the person's name has become "semantic" in language is when his or her name is used as a base of other words or meanings. E.g. "Darwinian" or "Hitlerite". I don't think you can argue very strongly that "Darwin" or "Hitler" are just names with no particular meaning any more, evidenced by the fact their names have branched off into new words. So we should try to capture what that base word (name) lends to its derived terms. We're certainly not trying to replicate Wikipedia. All cases I've seen link there with about as much text as a Wikipedia disambiguation page listing. See any of the above examples. But having the fictional character "Darth Vader" under "See also" for "the Darth Vader of..." is just silliness to me. Having it under "Derived terms" would be the wrong way around, and an attempt to downplay a main meaning of the term. Pengo (talk) 16:09, 9 September 2015 (UTC)
@Equinox I feel that someone could list "a Picasso" or "a Rembrandt" in any list of random items and it would be understood to specifically mean a painting by that painter by most English speakers, and that someone might legitimately look up "Rembrandt" or "Rembrandts" trying to understand its meaning, e.g. having seen it without enough context to guess what it meant, perhaps confusing it for "a Remington". So perhaps the pluralization part is a red herring, but it still seems dictionary-worthy from a "what does this usually mean in English" viewpoint, and doesn't apply equally to all surnames. Pengo (talk) 16:09, 9 September 2015 (UTC)
Shouldn't the proper noun definition of Darth Vader go? We have the etymology and the noun. Renard Migrant (talk) 16:40, 9 September 2015 (UTC)
No. —Pengo (talk) 04:22, 10 September 2015 (UTC)
Agreed: Obama (Obamacare), Dickens (Dickensian), Picasso (Picassian), Darwin (Darwinian), Popper (Popperian), Kuhn (Kuhnian) should have succinct biographical sense lines, or they should have ", especially ..." parts of the surname sense line. Here's what dicts do:
  • AHD: Obama[1], Dickens[2], and Picasso[3].
  • Collins: Obama[4], Dickens[5], and Picasso[6].
  • Merriam-Webster: Obama[7], Dickens[8] and Picasso[9].
  • oxforddictionaries.com: Obama[10], Dickens[11], and Picasso[12].
  • Macmillan and dictionary.cambridge.org have none of this.
--Dan Polansky (talk) 09:33, 13 September 2015 (UTC)
They should be see-alsos. Vote? Equinox 09:36, 13 September 2015 (UTC)
@Equinox: What is the rationale for excluding these? (They are not excluded by the current CFI.) Is it the redundancy to Wikipedia? If so, should Nile be reduced to "a specific river" to minimize redundancy? Is there at least one dictionary that does such a minimization of "Nile"? --Dan Polansky (talk) 09:49, 13 September 2015 (UTC)
If I take your "People are not semantic" above as the rationale or part of the rationale, do we really mean that referents of proper names are not semantic and should therefore be excluded or reduced? I saw that position before. Applied to extreme, each geographic name would have a single sense line saying just "a geographic name"; even "a specific river" would be too specific; and each astronomical name would say "astronomical name" instead of "an autumn constellation of the northern sky" (Perseus). If this position that referents must be excluded or obscured as far as possible is accepted, I don't see why it should be only accepted for biographical names and not for geographic names or astronomical names. --Dan Polansky (talk) 10:55, 13 September 2015 (UTC)
Re: "would you then support a sense at Smith for every individual mentioned attestably": That's a good point. As a practical matter, we do not want to include a sense line for every attested human individual. That's why the name as referring to the individual should overcome additional hurdles, such as that it gave rise to an adjective or that it is broadly understood to refer to that individual when used out of context. Other hurdles can be devised. The hurdles are not specified in CFI, but CFI allows editors discretion in deleting proper names and their senses, via RFD of course. --Dan Polansky (talk) 11:12, 13 September 2015 (UTC)

Sathmar Swabian

I have some words from Sathmar Swabian which I'd like to add, but as with the Russia German languages I mentioned above, we need to decide how to encode them. The Sathmar Swabians inhabit a region on the border of Hungary and Romania, having migrated there from Swabia, and they speak a dialect of Upper Swabian which has remained very similar to the varieties of Schussenried and Otterswang in Germany.
As I noted above re Volga German, we have on the one hand tended to give lects their own codes if they developed independently due to geographic isolation, even if they didn't develop to be very different (see e.g. Pennsylvania German, Transylvanian Saxon, etc). On the other hand, the variation which does exist between Sathmar Swabian and Upper Swabian proper is comparable to the variation between Upper Swabian and other dialects of Swabian (which are incidentally usually the sources of the differences), which we don't split, and of the four people who are said to be the main scholars of the language, Moser and Stephani speak of it as 'the Sathmar dialect (Upper Swabian)'. (I can't determine the stance of the other two scholars, Fischer and Wonhas. De.WP implies that Fischer considered it a dialect, but I don't see where it's covered in his comprehensive Schwäbisches Wörterbuch, perhaps because I'm missing some obvious terminological difference or perhaps because he doesn't actually include it.)
I have a weak preference for treating Sathmar Swabian as its own language, but compare my comments in the section above, re Volga German, about "bias". You can gauge the similarity of the lects yourself at User:-sche/Sathmar. - -sche (discuss) 01:24, 9 September 2015 (UTC)

I've created the code gmw-stm for this. - -sche (discuss) 05:27, 28 September 2015 (UTC)

How to deal with formations that have no overt phonetic or orthographic form?

There are many cases where a suffix has been reduced to zero, but its effects can still be seen through other grammatical processes. For example, in Northern Sami, the present participle has no actual suffix, and is characterised purely by strengthening the consonant grade. The Estonian genitive case has no actual suffix, but is apparent through consonant gradation, and because the stem-final vowel is not deleted as in the nominative. I am wondering how we could create entries for these. There's no actual suffix to use as the page name, but detailing the function and (especially) etymology would be useful nonetheless. So how should I do this? —CodeCat 23:39, 8 September 2015 (UTC)

I would suggest an appendix, such as Appendix:Arabic verbs, which details Arabic verb classes, some of which are characterized by the doubling of the middle root letter, which is very similar to the issue you are facing with Northern Sami. --WikiTiki89 14:33, 9 September 2015 (UTC)
Uralic morphology is almost entirely suffixing though, so such an appendix wouldn't give much extra value. The difficulty here is only that the actual suffix has eroded due to sound changes, leaving some other effect as a residue. But the suffix is still "real" and its identity can be etymologically established, unlike the vowel changes of Arabic. The Northern Sami present participle, for example, can be traced to a suffix *-jē, which has disappeared after a sound change that deleted intervocalic -j- with compensatory lengthening of the preceding consonant, and changes to the vowels. While there is no actual suffix, there is still conceptually a suffix that causes this effect when attached.
My own thought is to use the entry - for cases like this. It could also be used, for example, to explain the zero plural suffix in sheep (which goes back to Proto-Germanic *-ō). —CodeCat 19:11, 9 September 2015 (UTC)
But you can't really say it's a suffix anymore at that point, rather a morphological feature. There is no longer any suffix in the plural of English sheep, even if the singular and plural are derived from previously distinct suffixes. You could say that in Russian, there is a null suffix in the nominative of masculine nouns, which also causes epenthetic vowels when it follows a consonant cluster. So I guess you could treat this Northern Sami thing as a null suffix as well, but you would have to think about whether that actually makes sense. Even if Uralic morphology is almost entirely suffixing, an appendix would still be useful. It does not have to be exactly like the Arabic one, since this is a completely different situation; the reason I compared it to Arabic is that some verb forms (such as form X) can be simply analyzed as prefixes (اِسْتَـ ‎(ista-)), while others (such as form II) are characterized solely by the doubling of the middle root letter, which cannot be represented as an affix or infix in any way that makes sense. For Northern Sami, you could create an appendix detailing verb conjugation and participle formation (if it is all suffixes as you say, this page would be short and sweet), and for Estonian you could create an appendix detailing noun morphology (which would also be short and sweet). --WikiTiki89 19:59, 9 September 2015 (UTC)
Zero suffixes are sometimes theoretically justified, but I'd think they pretty clearly should not be dictionary entries. A separate appendix page could be useful in some cases though, I suppose. --Tropylium (talk) 20:19, 9 September 2015 (UTC)
Note the treatment of the "zero suffix" by User:Cinemantique in снос. What do people think about it? --Vahag (talk) 21:54, 9 September 2015 (UTC)
I personally don't very much like it, but I could be convinced of its usefulness. In the case of снос ‎(snos) and similar, I think it would be better to just say "From сноси́ть ‎(snosítʹ)". --WikiTiki89 22:03, 9 September 2015 (UTC)
I agree with you. But it can be argued that some kind of categorization of "suffixless" derivations is useful. --Vahag (talk) 22:11, 9 September 2015 (UTC)
This is a very good reason to have entries for such derivations. We certainly would want derivation lists and categories for them. I think Cinemantique's solution is pretty good, and I think I'll use it as well unless there are bad objections (as well as good alternatives). —CodeCat 23:07, 9 September 2015 (UTC)
Categorizing zero derivatives sounds like a good idea (including things like bug, drink), but surely this can be done without treating zero suffixes as entries. That's kind of the point after all; these are forms that are morphosyntactically treated as derived while being lexically underived.
Note that this is a distinct issue from stem alternations or the like as allomorphs of productive inflectional suffixes — in those cases there's full reason to have a suffix entry in the first place, and given our habit of linking allomorphs to the same entry, there should be no obstacle to doing something like {{suffix|stem|suff|alt2=∅}}. But then again, since when do we link inflected forms to their suffixes anyway? --Tropylium (talk) 07:09, 10 September 2015 (UTC)
If we're going to do this, -∅ does seem like the logical notation to use (or at least, to display; perhaps we could rig it up so that the link went to the appendix: -∅). But we do have to be careful when deciding where to do this; I agree with WikiTiki that it would be inaccurate to describe English "sheep" as having a suffix. - -sche (discuss) 01:11, 10 September 2015 (UTC)
I like the idea of explaining the suffixlessness, but using a -∅ seems overly jargony to me. Why not write a plain English explanation as has been done above, and add it to the entry (either as a line in the etymology or in a "Grammar notes" section), saying something like what has been said above, "The present participle has no suffix, and is characterised purely by strengthening the consonant grade", or something more easily understood? If this text needs to be repeated on many entries then make a {{suffixless}} template and use that. I don't know how common -∅ notation is, but I suspect it's not common enough to help casual readers of Wiktionary. Pengo (talk) 13:54, 13 September 2015 (UTC)
  • Empty suffix looks like an inferior idea. To derive Russian снос as сносить +‎ -∅ looks most curious to me. Suffixing is a process of adding something; if nothing is added, no suffixing takes place. google books:"empty suffix" does not seem to find many hits from linguistics; most seem to be computer science. --Dan Polansky (talk) 19:12, 13 September 2015 (UTC)
    That’s because it is usually termed a zero morpheme or null morpheme, not an empty suffix, in linguistics. Deriving as x + -∅ is fairly common linguistic practice. Vorziblix (talk) 08:46, 17 September 2015 (UTC)

"New Ancient Greek"?

We already have "New Latin" for Latin words coined in modern times. But Ancient Greek words are also made anew, often for scientific purposes, by combining ancient elements. Should there be a separate "New Ancient Greek" etymology language for these, to match how we treat Latin? —CodeCat 22:48, 12 September 2015 (UTC)

In taxonomy, at least, Greek is Latinized before being incorporated, so that even terms composed of Ancient Greek parts are really Latin (although there are a number of taxonomic names that show Ancient Greek nominative endings, I've never seen one that used Ancient Greek genitive endings instead of Latin). I'm sure this is true throughout the sciences. Chuck Entz (talk) 01:16, 13 September 2015 (UTC)
Ancient Greek was not used in the scientific community and therefore the statement that "Ancient Greek words are also made anew, often for scientific purposes" is false; words may be made from Ancient Greek roots but not actually reflect words in Ancient Greek. The same is true of many scientific coinages from Latin nowadays, but until the 19th century most of them originated in New Latin texts. Therefore there would really be no use for this. —Μετάknowledgediscuss/deeds 01:38, 13 September 2015 (UTC)
It's not just the roots that are Ancient Greek though, the derivational rules used to create the combination are also Ancient Greek. Everything about the words is Ancient Greek, except that they're not used in Ancient Greek. They are Ancient Greek words coined for the sole purpose of deriving loanwords from them. Does that make them words or not? Not in the usual sense, but to say that they are "not words" doesn't seem right either. They're like limbo words. —CodeCat 02:55, 13 September 2015 (UTC)
Well, no, not everything. As Chuck already pointed out, they are almost always Latinised. In any case, we document words that are used, and those in limbo can never have entries, and thus never be an etymon. It's like reconstructing a word in a protolanguage for a modern concept just because all the descendant languages use cognate terms for it; it could be done, but has no lexicographical validity. —Μετάknowledgediscuss/deeds 03:57, 13 September 2015 (UTC)
Examples? I think the idea is worth entertaining (though I'm unlikely to be expert enough to weigh in). But I feel we really need some examples of potential modern Ancient Greek words for any meaningful discussion. —Pengo (talk) 13:34, 13 September 2015 (UTC)
@CodeCat, Chuck Entz, Metaknowledge, Pengo: The only "New Ancient Greek" word I can think of is μιξόγλωττος ‎(mixóglōttos), which occurs in Johann Jacob Hofmann's Lexicon Universale (1698) as an adjective qualifying the Latin noun nōmenclātor here. CodeCat's on to something here, but I'm not sure how widespread this phenomenon is. — I.S.M.E.T.A. 15:08, 13 September 2015 (UTC)
I think we're talking about two different things here. I'm talking about Ancient Greek words that are coined to serve as a base for derivations in various languages. Things like hypothermia. I am aware that these words have never been actually used in Ancient Greek, but to ignore their existence altogether seems wrong too. —CodeCat 15:15, 13 September 2015 (UTC)
@CodeCat: So, you want (appendical) entries for hypothetical etyma like *ὑποθερμία ‎(*hupothermía), yes? — I.S.M.E.T.A. 15:49, 13 September 2015 (UTC)
Something like that, yes. They are not reconstructed terms though, since we know they weren't used. Reconstructions are assumed to have been used, just unattested. —CodeCat 15:53, 13 September 2015 (UTC)
@CodeCat: Well, yes; that's why I called them hypothetical. — I.S.M.E.T.A. 16:08, 13 September 2015 (UTC)
There is no reason to have that. The parts can be linked to separately in the etymology. --WikiTiki89 15:55, 13 September 2015 (UTC)
Yes, but what about linking the derivatives together? Are they not cognates? —CodeCat 16:09, 13 September 2015 (UTC)
@CodeCat: Normally we would pick the language which coined "hypothermia" first, and that one gets a list of descendants. Otherwise what would we do with television? Create a "New Ancient Greek-Latin hybrid" language to list its cognates? I think the real issue here is that there's no consistent way to list and/or tag cognates, especially when the language in which it was originally coined is not clear, i.e. when there's no clear descendant relationship. Other than listing descendants, what other benefits are there to New Ancient Greek? Are Ancient Greek grammatical rules applied to hypothermia (or other such terms) which are reflected in multiple descendant languages [and wouldn't those rules be the same just for 'thermia', θέρμη ‎(thérmē), anyway]? —Pengo (talk) 01:19, 14 September 2015 (UTC)
I don't see anything getting in the way of something like Appendix:Wanderwort/Hypothermia or Appendix:Wanderwort/Television for collecting the cognates in cases like these. They would obviously have to be formatted differently from usual entries though. --Tropylium (talk) 18:24, 3 October 2015 (UTC)



A userpage appearing without ever being created is quite surprising at first. Since it is clearly self-advertising, it has no place here, but it also seems to be undeletable. Thank you 08:10, 13 September 2015 (UTC)

(edit conflict) The actual page is on Mediawiki, but it shows up on every Wikimedia wiki that doesn't have a page by that name. The only way to get rid of a global user page here would be to create a page (even a blank one) to replace the global page locally. That said, I'm not sure that global page is actually advertising, though it probably doesn't meet the standards in WT:USER. We haven't really worked out how to respond to global user pages, yet. Chuck Entz (talk) 08:16, 13 September 2015 (UTC)
In fact it's not really advertising (at least not "classical" advertising), but if the page had been created directly here, it would certainly have been deleted for being "promotional material" (and perhaps the user blocked indefinitely), as far as I can see. Am I wrong? Bu193 (talk) 08:28, 13 September 2015 (UTC)

Intransparent headwords for Ancient Greek entries

I noticed a bot introduced intransparent headwords into headword lines of Ancient Greek entries. For Εὐριπίδης, the transparent headword was Εὐριπίδης while the intransparent is "Εὐρῑπίδης". Can User:Benwing, the owner of the bot, point me to the discussion that led to that change? Thank you. --Dan Polansky (talk) 13:24, 13 September 2015 (UTC)

What makes you think there was a discussion? Kind of presumptuous. —CodeCat 13:31, 13 September 2015 (UTC)
Can anyone point me to a discussion, if any? --Dan Polansky (talk) 13:39, 13 September 2015 (UTC)
From the revert war at Euripides, I see that the unfair methods of CodeCat are taking hold. Oh well. I have created Wiktionary:Votes/pl-2015-09/Using macrons in headword lines of Ancient Greek entries. The Euripides entry is now locked for "Disruptive edits by Dan Polansky". --Dan Polansky (talk) 14:00, 13 September 2015 (UTC)
@Dan Polansky: See Wiktionary talk:About Ancient Greek/Archive 1#Breves in Templates and User talk:Benwing#Ͷ, ͷ for two discussions. — I.S.M.E.T.A. 14:56, 13 September 2015 (UTC)
Wiktionary talk:About Ancient Greek/Archive 1#Breves in Templates is a 2007 discussion showing no consensus. User talk:Benwing#Ͷ, ͷ does not seem relevant, and is not a Beer parlour discussion. Is this a joke? --Dan Polansky (talk) 17:20, 13 September 2015 (UTC)
@Dan Polansky: Go to hell, you troll. — I.S.M.E.T.A. 17:50, 13 September 2015 (UTC)
Really? You revert-war against status quo ante, provide irrelevant discussions, and then call me a troll? You have shown true colors, indeed. --Dan Polansky (talk) 17:52, 13 September 2015 (UTC)
No, Dan isn't a troll- he takes himself way too seriously for that. As far as he's concerned, he's the last barrier standing between Civilization As We Know It and Tyranny And Chaos. On occasion, that's not that far from the truth. The problem is that he's so used to seeing himself as this principled Defender Of Truth, that he often can't see it when his own personal grudges take the place of his principles. Chuck Entz (talk) 21:13, 13 September 2015 (UTC)
@Chuck Entz: I apologise for my outburst. I should've written something like what was written by Μετάknowledge in Wiktionary talk:Votes/pl-2015-09/Using macrons in headword lines of Ancient Greek entries#Status quo ante, viz. “your statement of status quo ante is inaccurate. This [discussion] is pointless, and the issue belongs only among Ancient Greek editors, who already have a clear consensus (as you can see from the response [herein]). You are clueless about what has been going on”. @Dan Polansky: Perhaps you don't realise how profoundly irritating it is to have an editor completely unfamiliar with a language-editing community's practices (I have never seen you make a non-trivial edit to Ancient Greek content) come in, revert a completely routine and uncontroversial edit, demand an essay in justification of that edit, begin litigation when that demand is refused, and then dismiss the response of an editor from that community (when he finally concedes) as “a joke”. Your meddlesome, holier-than-thou manner endears you to no one. — I.S.M.E.T.A. 20:34, 15 September 2015 (UTC)

@Dan Polansky: I'm a little confused. Is the claim that using macra lacks transparency? We put diacritics outside the normal written orthography in the headword all the time (like Latin macra). Why is this a problem? —JohnC5 15:06, 13 September 2015 (UTC)

I don't remember Latin using macra all the time. Has this been changed recently? --Dan Polansky (talk) 17:20, 13 September 2015 (UTC)
It has been the case in Latin that macra should be used within Latin entries for a long time. The use of macra has been specified in WT:ALA and WT:AGRC for quite a while. Any entries that do not contain macra are ill-formatted and should be updated. The status quo has always been for their use. —JohnC5 19:17, 13 September 2015 (UTC)
As for Latin, my memory must have failed me; I now recall that macrons were used for as long as I can remember. My mistake. As for Ancient Greek, let us have a look at WT:AGRC. This revision from 6 May 2012 tells me, on a slightly unrelated subject, that "Vowel length marks (i.e. the macron and breve) should not be used outside of Ancient Greek entries", which is relevant for the appearance of Ancient Greek in Euripides, which "I'm so meta even this acronym" protected to have his way contrary to WT:AGRC from 6 May 2012. Meanwhile, someone changed WT:AGRC to no longer state that. I admit that the same revision states that 'Secondly, headline templates, such as {{grc-noun}} have a "head" parameter, which can take vowel marks.' I admit my mistake since WT:AGRC suggests allowed use of these on the headword lines. Nonetheless, I point out that this was not an actual widespread practice before the bot run, as far as I remember (but do I remember this correctly?). Be that as it may, it may well be that there never was any controversy about the use of macrons in headword lines of Ancient Greek entries, and that I am quite mistaken here. Whatever the case, the vote should clarify that. --Dan Polansky (talk) 19:39, 13 September 2015 (UTC)
FWIW, I (as well as ISMETA and ObsequiousNewt) have been using macra in AG for a full year now and the new {{grc-decl}} requires them for correct functionality. —JohnC5 19:53, 13 September 2015 (UTC)
Oh noes, that makes you a criminal too! —CodeCat 20:30, 13 September 2015 (UTC)
I've been using macrons in grc exactly the same way as in Latin (i.e. everywhere except page names) ever since Module:languages/data3/g was edited here in January 2014 to automatically strip them in links. Breves have been stripped since October. I see no reason not to take advantage of this functionality, nor do I see any way in which doing so is "intransparent". —Aɴɢʀ (talk) 13:52, 14 September 2015 (UTC)
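As an illustration of the stripping described above, here is a minimal Python sketch. It is not the code in Module:languages/data3/g (that module is Lua and its contents are not shown in this discussion); the function name `strip_length_marks` is made up for this example, and the approach assumes that removing the combining macron and breve after Unicode decomposition is enough to recover the page name from the headword.

```python
import unicodedata

MACRON = "\u0304"  # COMBINING MACRON (marks a long vowel)
BREVE = "\u0306"   # COMBINING BREVE (marks a short vowel)


def strip_length_marks(text):
    """Return text without macrons and breves, keeping accents and breathings.

    Hypothetical helper: decompose to NFD so that length marks become
    separate combining characters, drop them, then recompose to NFC.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in (MACRON, BREVE))
    return unicodedata.normalize("NFC", kept)


# The headword keeps the macron; the link target (page name) does not.
headword = "Εὐρῑπίδης"
page_name = strip_length_marks(headword)
print(headword, "->", page_name)
```

Under this sketch the displayed form can carry vowel length while links still resolve to the unmarked page title, which is the behaviour described for Latin and Ancient Greek links.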
What's an intransparent headword? Renard Migrant (talk) 20:43, 14 September 2015 (UTC)
Certainly, all the Greek dictionaries I've seen include breves and macrons in their headwords. Benwing2 (talk) 10:42, 15 September 2015 (UTC)
@Benwing2: Can you please state these specific dictionaries at Wiktionary talk:Votes/pl-2015-09/Using macrons in headword lines of Ancient Greek entries#Dictionary practice? There, can you name specific entries in which it can be verified that they use macrons? --Dan Polansky (talk) 10:00, 20 September 2015 (UTC)

Macrons in Ancient Greek entries

This topic is currently discussed above, in #Intransparent headwords for Ancient Greek entries. --Dan Polansky (talk) 13:50, 13 September 2015 (UTC)

wrz, war translations

There are a bunch of translations labeled as "Waray" (code wrz) with the code for Waray-Waray (war). These are two distinct languages. Can they all be switched to one or the other, or is there a mix of both languages that need careful separation? DTLHS (talk) 20:32, 14 September 2015 (UTC)

@DTLHS: I looked through them, and recognised a lot of words that are definitely war, so I think it would be safe to switch them all. —Μετάknowledgediscuss/deeds 20:36, 14 September 2015 (UTC)

Vote timeline clean up

Wiktionary:Votes/Timeline (current revision: 29123778) has not been updated with new finished votes since 2013. The page is a bit messy, so I am thinking of maybe cleaning it up as a whole when I have the time. I am creating this BP discussion, as opposed to an RFC, because that's a personal project that I intend to do myself, not a request for others (but if someone else beats me to it, that would be great, too).

Here's a list of what I intend to do. Please say whether you support doing it that way or if you'd rather it done differently.

  • Using the table format (which is being used in votes from 2004 to 2008) in all of the votes.
  • Merging the sections "Archived votes" and "Policy votes", since the division seems pointless in the first place and the former has plenty of "policy votes" to boot. (not to mention that "Archived votes" and "Policy votes" are ordered in opposite directions)
  • Editing the date: leaving just year-month (as in, "2008-12") as has been done since 2009, and not year-month-day as was done between 2004 and 2008 (using multiple time formats). Maybe I'd use a single date template to format it consistently across the whole page.
  • Probably more things that I haven't brought up before but would post here before proceeding. Anything that's inconsistent and can be fixed easily. One minor thing:

--Daniel Carrero (talk) 07:25, 15 September 2015 (UTC)

I archived it --Zo3rWer (talk) 13:54, 15 September 2015 (UTC)
I don't see any bad consequence of this. Someone would have to point out some bad effect for me to consider opposing it. DCDuring TALK 22:16, 15 September 2015 (UTC)
  • I oppose using a table format in Wiktionary:Votes/Timeline. The current list format is nice enough, and easier to create for any archiver. People started to use the format on WT:VOTE itself (at the bottom), which makes archiving WT:VOTE easier. Keep things simple. --Dan Polansky (talk) 08:38, 20 September 2015 (UTC)
  • As for "Policy votes", that is a selection that I created in 2010. It is admittedly out of date, but there is nothing to be "merged"; it would have to be removed. As it is now, most people will see it is out of date, I think, and it still adds some value. I think it would be better to update it; a guide for it is at Wiktionary_talk:Votes/Timeline. --Dan Polansky (talk) 08:38, 20 September 2015 (UTC)

Multilingual tables

Based on the multi-language system of Template:list:chess pieces/pt (which I had created pre-Lua), I created a system of tables that can be used in multiple languages.

First tables:

Thoughts? --Daniel Carrero (talk) 17:09, 15 September 2015 (UTC)

I like it. Most lists probably shouldn't be converted into tables, but I think this is a good example of a type of list that would benefit from it. —Μετάknowledgediscuss/deeds 17:37, 15 September 2015 (UTC)
That's ok because it's not too intrusive. As Metaknowledge says it wouldn't work with most lists. Renard Migrant (talk) 14:19, 16 September 2015 (UTC)
That's true. Category:English list templates has 91 members as of now. Many of those are lists of geography/places (countries, continents, oceans, states) which are probably better off the way they are, as lists rather than tables. The table of chess pieces has the advantage of being a simple eight-member group; lists with a varying/unpredictable number of members (canids, religions, blues, reds) probably should not be converted to tables either. --Daniel Carrero (talk) 14:35, 16 September 2015 (UTC)

Assuming we are going to start using this system for other tables in the future, (Disclaimer: I'm not proposing converting every list into a table, it's just that maybe there are other tables that it would be a good idea to have, after the chess thing.) I've been putting those in categories named like this:

Is this a good name? Obviously this is the best I could think of, but "auto-table" really does not explain that much, so I'm very open to other ideas. Or maybe it's a pretty good name anyway. --Daniel Carrero (talk) 07:51, 17 September 2015 (UTC)

Useless statistics

I was curious about belpuga, because it is an entirely ordinary seven letter word apparently found (but marginally) in but one language, as far as Google Books can tell. So I downloaded the page titles, counting small all-lowercase words, to see what density we have of the possible space.

ASCII [a-z], one letter: 26 (possible 26)
ASCII two letters: 521 (possible 676)
ASCII three letters: 4656 (possible 17576)
ASCII four letters: 22800 (possible 456976)
ASCII five letters: 71184 (possible 11881376)
ASCII [a-z][aeiou] or [aeiou][a-z]: 228 (possible 235) (missing qo, iq, iy, ub, uc, uj, uq)
ASCII [a-z][aeiouy] or vice versa: 259 (possible 276) (additionally missing qy, yc, yh, yj, yk, yp, yq, yv, yx, yz)
ASCII three letters, including [aeiou]: 3956 (possible 8315)
ASCII three letters, including [aeiouy]: 4170

[[:lower:]]: 630 (possible 1984 Unicode 8.0 characters, or 1492 excluding MATHEMATICAL Unicode characters)
lower * 2: 2239
lower * 3: 13089

(I don't know if :lower: would count the latest Unicode 8.0 characters, so maybe someone has added them.) I don't know how much this reflects us and how much anything in the world of languages. If anyone cares about noting alphabetic characters, there may be 800 characters that need some sort of "xxth character of the XXX alphabet".--Prosfilaes (talk) 23:28, 15 September 2015 (UTC)
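For anyone wanting to check the "possible" figures or repeat the count on a fresh dump, here is a rough Python sketch. It is not the script used for the numbers above; the function names and the `titles` input (any iterable of page-title strings) are assumptions made for illustration.

```python
import re
from string import ascii_lowercase

ALPHABET_SIZE = len(ascii_lowercase)  # 26


def possible(n):
    """Number of all-lowercase ASCII strings of length n."""
    return ALPHABET_SIZE ** n


def possible_with_vowel(n, vowels="aeiou"):
    """Strings of length n that contain at least one of the given vowels.

    Counted as all strings minus the strings built from non-vowels only.
    """
    non_vowels = ALPHABET_SIZE - len(set(vowels))
    return possible(n) - non_vowels ** n


def count_titles(titles, n):
    """Count titles that are exactly n lowercase ASCII letters."""
    pattern = re.compile(r"[a-z]{%d}" % n)
    return sum(1 for title in titles if pattern.fullmatch(title))


# Counts reported in the post above, keyed by word length.
reported = {1: 26, 2: 521, 3: 4656, 4: 22800, 5: 71184}
for length, found in reported.items():
    space = possible(length)
    print(f"{length} letters: {found} of {space} ({found / space:.2%})")

# Two-letter strings with at least one vowel, with and without "y".
print(possible_with_vowel(2), possible_with_vowel(2, "aeiouy"))
```

The two-letter "[a-z][aeiou] or [aeiou][a-z]" case is the same as "two letters with at least one vowel", so `possible_with_vowel(2)` covers it; `count_titles` would be run over the downloaded title list to get the observed side.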

That is very fascinating stuff, and almost certainly useless. Congratulations! --Zo3rWer (talk) 15:43, 29 September 2015 (UTC)

Old Italian

The fate of Category:Old Italian language is being discussed at Wiktionary:Requests for moves, mergers and splits#Category:Old Italian language. I would like it if more than 3 people commented (most notably, GianWiki, who is the only person making Old Italian entries, has not commented). Renard Migrant (talk) 15:52, 16 September 2015 (UTC)

@GianWiki. — I.S.M.E.T.A. 16:01, 16 September 2015 (UTC)

Change the appearance of the "favourite languages" in translations

I don't know what this feature is called, but it's the one that shows translations of languages you select, in the top bar of the translation box, even when collapsed. I think this format isn't so useful, because there's no room for more than one or two languages. I think it would be preferable if, instead of collapsing the box altogether, the box just collapsed smaller, and showed only the translations for the languages you selected. So in collapsed state, it shows favourite translations, and expanding it shows all of them. I'm thinking of something similar to how the inflection box works on muitalit. —CodeCat 01:00, 18 September 2015 (UTC)

I support some sort of improvement in this aspect. My average translation table has 4 featured languages and it does get too cluttery. — Ungoliant (falai) 01:31, 18 September 2015 (UTC)
I would implement this if I had any idea how, or where this feature is currently located. —CodeCat 14:42, 21 September 2015 (UTC)

Min Nan POJ entries

Should Min Nan entries in Pe̍h-ōe-jī have definitions (e.g. in pe̍h-ōe-jī), or should they be like the pinyin entries for Mandarin words (e.g. in pīnyīn), where the character form(s) of the romanization are linked? (This is probably a question about whether POJ is a main script used for Min Nan.) Justinrleung (talk) 01:18, 18 September 2015 (UTC)

The status quo seems to be the former. —suzukaze (tc) 01:40, 18 September 2015 (UTC)
(e/c) I think the main question is: do people communicate using POJ, or do they just use it to transliterate? As I understand it, one wouldn't normally write a letter or a book in Pinyin, Romaji, etc. when the audience was native speakers, except perhaps in dictionaries or language-education materials. In other words it's more about the characters/writing system than about the subject matter. It would also seem to me that texts intended to demonstrate how familiar subject matter looks when written in the script would also be about the writing rather than about the subject matter. There's a book, for instance, that has the Lord's Prayer in hundreds of different languages and scripts- I would consider it strictly on the mention side of the use/mention distinction. Chuck Entz (talk) 01:42, 18 September 2015 (UTC)
The Bible has been published in POJ. —suzukaze (tc) 02:07, 18 September 2015 (UTC)
However, according to the "Current Status" section of the Wikipedia article on POJ, most Taiwanese are unfamiliar with POJ. Justinrleung (talk) 02:16, 18 September 2015 (UTC)
Since we include usage from throughout recorded history, that might not be a problem in itself, if it was in sufficient use at one time, though we have to be careful to avoid giving brief, failed experiments undue weight, and to be clear about the difference between historical and current usage. Chuck Entz (talk) 02:25, 18 September 2015 (UTC)
I just came across Talk:a-bú which may be relevant. —suzukaze (tc) 23:53, 19 September 2015 (UTC)
@Suzukaze-c Thanks for the link. However, a concern I have is that the current format for Chinese entries might make it redundant to have definitions in both the Chinese character entry and the POJ entry. Also, does that mean POJ entries should be linked in translation boxes? Justinrleung (talk) 00:22, 20 September 2015 (UTC)
  • Personally, I'd support making them like pīnyīn entries, but I'm not an especially relevant editor. —Μετάknowledgediscuss/deeds 05:10, 20 September 2015 (UTC)

Multiple translations

I was just merging one section of the Translation section for "chaperone"/"chaperon". The German section has 11 terms - I think this is unhelpful - the user will eventually have to look at 11 pages to find out which might be the best. An editor should really do this for them by restricting translations to normally one (and occasionally two). In this case, IMHO, the reader would find Anstandsdame and then go to that entry to find those other terms, where their nature (dialectal, idiomatic, slang, insulting, … ) could be explained.   — Saltmarshσυζήτηση-talk 09:23, 18 September 2015 (UTC)

Slightly expanded usage of Template:also

Earlier today, I attempted to add {{also}} to Northwest Territory and Northwest Territories, each linking them to the other. Seems to make perfect sense: people would confuse the defunct American jurisdiction with the still-existent Canadian one. The wording at Template:also seems to leave the door open for something like this, yet I was reverted (rather rudely, I might add) by User:Ungoliant MMDCCLXIV, who informed me that this template is only for alternate capitalizations and diacritics, without providing any actual reasoning why it should be limited to those things. Is there any good reason for constraining, and if not, can we allow use of Template:also in this relatively small case of proper nouns that are unrelated except for the fact that one is the plural of the other? Purplebackpack89

Isn't this what the header "See also" is for? —suzukaze (tc) 05:07, 20 September 2015 (UTC)
I support using {{also}} to link between Northwest Territory and Northwest Territories. --Daniel Carrero (talk) 08:13, 20 September 2015 (UTC)
This is what the header See also is for; however, {{also}} is for similar titles and I think it makes sense sometimes to have the disambiguation right at the top and not almost at the bottom in a see also section. How would you handle aliterate and alliterate? Renard Migrant (talk) 09:44, 20 September 2015 (UTC)
Would Usage notes work? —suzukaze (tc) 04:55, 21 September 2015 (UTC)
@Renard Migrant: See alliterate and aliterate DCDuring TALK 07:13, 21 September 2015 (UTC)
  • The good reason for placing a limit on what is included in {{also}} is to attempt to maintain its utility in its original intended purpose: helping users find what they wanted despite limited keyboarding skills or understanding of scripts with different diacritics. Homonyms in Pronunciation perform a similar function, though limited to same-language items. That {{also}} is placed above any L2 section suggests that its primary use is for resolving cross-language/cross-script confusion. The principal exception is that we use it for items in the same language that have different initial capitalization. Under the logic of the headings structure of our entries that would seem to be a mistake.
If someone were to want to add to this confusion by including other kinds of confusions in those allowed in {{also}} above the first L2 or by using {{also}} within L2 sections, I'd like to see a proposal and a vote. DCDuring TALK 11:12, 20 September 2015 (UTC)
I don't really buy into your argument that if additional uses were added, it would be less useful in its original use. Purplebackpack89 12:56, 20 September 2015 (UTC)
@Purplebackpack89: Why is that? Because you can't afford it? Because you don't like the salesperson? Because you are unfamiliar with the research that shows that folks, when faced with a choice of ten items are less likely to make a selection or purchase of an item than when there are three? Or is it because you are aware of other research that contradicts this. DCDuring TALK 15:33, 20 September 2015 (UTC)
I don't buy into it because I believe it to be not entirely true. I don't believe that {{also}} reaches its maximum utility by being limited in use. Correct me if I'm wrong, but your argument seems to be somewhat that if it was added to more entries, it wouldn't be as useful to the entries it was on originally (or at least before September 19, 2015). I disagree. I think if you add {{also}} to more entries, there is added utility for the entries it is added to, and consistent utility for the entries it was on before. Purplebackpack89 16:19, 20 September 2015 (UTC)
  • I prefer using a ===See also=== section for this sort of thing. It's not a keyboard limitation issue. —Aɴɢʀ (talk) 13:27, 20 September 2015 (UTC)
  • Oppose. {{also}} is meant for language-independent links. Northwest Territory and Northwest Territories can only be confused in English and thus the link should be within the English section. --WikiTiki89 14:01, 20 September 2015 (UTC)
  • I oppose this as well. But I should note that I have on occasion included {{also}} inside language sections. In these cases, there was possible confusion between different words in the same language, for example between c and č or e and é. —CodeCat 14:51, 20 September 2015 (UTC)
  • Support. This is already how the template is used in practice. I've used it with malternative and mallternative, two words that have completely different meanings, but a variance in spelling of a single letter, and thus might reasonably be confused. It makes sense to disambiguate words that might reasonably be confused due to having very similar spellings. Placing an {{also}} template at the top of the entry is a simple, unobtrusive way to point readers in the right direction. Burying a link in a "see also" section probably isn't as helpful, since a reader who takes a wrong turn is likely to be unfamiliar with the structure of Wiktionary entries, and thus won't know to look in the "see also" section. This really shouldn't be controversial. Ultimately, it's about increasing the ease of use of this site for readers -Cloudcuckoolander (talk) 20:07, 20 September 2015 (UTC)

For the record, nowhere does WT:ELE explicitly say what the exact use of {{also}} is. It has not been voted on anywhere. Perhaps it should be. --Daniel Carrero (talk) 23:05, 20 September 2015 (UTC)

IMO, we are not yet to a point of near-consensus that would allow a good vote.
AFAICT, we have a few locations we can use to direct users to an entry with slightly different spelling:
  1. {{also}} above the first L2 header (first and only choice for items with letter-by-letter correspondence (excepting diacritical marks and type case) to headword, but on different pages, in different languages).
  2. {{also}} at the beginning of any L2 header (for items in the same language subject to confusion).
  3. {{homophone}} under Pronunciation header (first and only choice for words in the same language, with same pronunciation, but different spelling and derivation).
  4. Under Alternative forms header (first and only choice for variations of the headword that fit under the same Etymology heading).
  5. Under See also header (first choice for items that are not search targets for users coming to the page and L2, but which provide additional information of possible use to some users).
It would be nice if the solution for first L2 headers also worked for other L2 headers and worked under tabbed languages.
It seems to me that the use of {{also}} above first L2 would not work for languages that appeared below the English L2 section. It would also be conceptually different from the class of items under 1 above. Position 5 doesn't make much sense because it occurs too far down on large pages. The alternative forms and pronunciation headers don't fit the facts of the case. That makes me conclude that option would be the right choice in general. In other cases aesthetics might lead us to combine 1&2 and place it above first L2. DCDuring TALK 00:45, 21 September 2015 (UTC)
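The criterion in item 1 (letter-by-letter correspondence, ignoring diacritical marks and type case) can be expressed mechanically. The following is only a sketch under stated assumptions: the function names are hypothetical, this is not how {{also}} is actually populated, and it treats "diacritical marks" as Unicode combining marks after canonical decomposition.

```python
import unicodedata
from collections import defaultdict


def also_key(title: str) -> str:
    """Reduce a title to a comparison key: no combining marks, case folded."""
    decomposed = unicodedata.normalize("NFD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def also_groups(titles):
    """Group titles that differ only by diacritics or letter case."""
    groups = defaultdict(list)
    for title in titles:
        groups[also_key(title)].append(title)
    # Only groups with more than one member are candidates for cross-linking.
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Under this key, c and č or e and é fall into the same group, while aliterate/alliterate and Northwest Territory/Northwest Territories do not, which is exactly the class of cases the discussion is about. Letters such as ø that have no canonical decomposition would also be kept apart by this sketch.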
I agree with DCDuring's specifications, except that I would remove the restriction "in different languages" from #1. --WikiTiki89 14:37, 21 September 2015 (UTC)
Move {{also}} to English L2 Also remove See alsos. DCDuring TALK 00:45, 21 September 2015 (UTC)
Makes no sense. "See also" L3-section or L4-section is not for syntactic or morphological relations. {{also}} (previously {{see}}) is used to connect syntactic forms regardless of language. Regardless of language does not mean cross-language; it means that it does not matter whether it is within a single language or not. --Dan Polansky (talk) 08:37, 27 September 2015 (UTC)
  • I support using {{also}} to connect "alliterate" and "aliterate". "Northwest Territory" and "Northwest Territories" are syntactically near enough to be connected with "also" as well, I think. --Dan Polansky (talk) 08:37, 27 September 2015 (UTC)

Terms derived from Latin

For example, beef is derived from the Latin bōs, however not in the nominative, but the accusative bovem. The French bœuf actually mentions this.

The question is, should all terms derived from Latin show the accusative instead of the nominative? --kc_kennylau (talk) 05:19, 20 September 2015 (UTC)

If they click on the accusative, they'll go to a form-of entry. If they click on the nominative, they'll go to the lemma, with all the important information, and the chances of there being a nominative entry are much higher than for an accusative entry. This is similar to the issue of higher-level taxonomic names, which are normally derived from the genitive form of a generic name. I've been known to say "from X, the Y form of Z", but that can be a bit unwieldy. Chuck Entz (talk) 06:17, 20 September 2015 (UTC)
Some mention accusatives and some don't. The thing is, even if you had a policy for this (and some might say that's a step too far) you'd have to implement it by hand and that could take years. A bit like orphaning the abbreviation headers, there are thousands of them and they all need to be fixed manually. Renard Migrant (talk) 09:48, 20 September 2015 (UTC)
I think we should cite etyma in their lemma forms; this means saying that bœuf comes from bōs (not from bovem) and that chanter comes from cantō (not from cantāre). It just makes it easier for people to find the informative entry at first click. One possible compromise I've seen in some entries is to link to the lemma form but display both the lemma and the relevant inflected form, e.g. {{m|la|bos|bōs, bovis|ox}} or {{m|la|canto|cantō, cantāre|to sing}}. I'm not thrilled with that, as it strikes me as pedantic, but I can live with it. —Aɴɢʀ (talk) 13:48, 20 September 2015 (UTC)
I think that when the process is regular, we do not need to specify this on every page and we can just link to the nominative. In this case, French bœuf is regularly derived from the accusative of Latin bōs, because (almost) all French nouns derived from Latin are derived from the accusative. The same would especially be the case for French verbs (we should say that French prendre is "from Latin prehendō", not that it is "from Latin prehendere") because the French infinitive represents the whole paradigm just like the Latin first-person singular present represents the whole paradigm. In the case of irregular derivations from the nominative, we can specify that, for example, French fils derives "from Latin nominative fīlius". As for the English word beef, since it derives from (Old) French, and (Old) French regularly derives from Latin accusatives, we can simply say "from Old French buef, from Latin bōs". If English had taken the word directly from a Latin accusative, then we would have to specify. --WikiTiki89 14:12, 20 September 2015 (UTC)
My practice is to link to the lemma but display the true antecedent. So I’ll have bovem and cantāre. Less clicking. Users should be able to spot the accusative forms and infinitive forms in the declension tables. --Romanophile (talk) 14:16, 20 September 2015 (UTC)
What about the other direction? I imagine it might be useful for ====Descendants==== sections to mention that they are generally derived from the accusative. --Tropylium (talk) 14:23, 22 September 2015 (UTC)
In every one of the thousands? —CodeCat 14:32, 22 September 2015 (UTC)
Why would that be useful? --WikiTiki89 14:45, 22 September 2015 (UTC)

Difference between "regional" and "dialectal"

When a term clearly exists, but isn't part of Standard English and its origins are murky, we tend to mark it as either "dialectal" or "regional". I can't see any meaningful difference between the two in Category:English regional terms and Category:English dialectal terms – is there a good argument for not merging these? Anything that is "regional" is surely also "dialectal" by definition. Smurrayinchester (talk) 13:02, 22 September 2015 (UTC)

Not necessarily: regional sounds broader, as in maybe northern England as opposed to, say, Scouse. In other words, regional would encompass a number of dialects in a larger area, while dialectal would be in isolated dialects here or there. Of course, there's also the pejorative sense of dialect at play here: my usage is regional, yours is dialectal because it's in some obscure, out-of-the-way backwater that I don't care about... ;) Chuck Entz (talk) 14:16, 22 September 2015 (UTC)
To me, regional is a subset of dialectal, as dialects are not necessarily tied to a region. They can be social as well. —CodeCat 14:31, 22 September 2015 (UTC)
That's how I understand it too. Polari is a dialect, but not regional. I can't think of anything that would be regional but not dialectal – even British English, Indian English, US English etc are dialects of a kind. (And that's another problem with regional – are we talking about a town or a continent?) Smurrayinchester (talk) 14:58, 22 September 2015 (UTC)
Really everyone here is mostly in agreement. Even by CodeCat and Smurrayinchester's definition, Chuck Entz's example makes sense. --WikiTiki89 15:08, 22 September 2015 (UTC)
DARE documents a lot of differences in frequency of usage and pronunciation of words by region in the US, some at the level of city (NY, Chicago), parts of states, states, some over much larger areas (Southern US). I don't think that these vocabulary differences are sufficient to make a dialect. AFAICT linguists typically recognize only Appalachian English, and, sometimes, southern English as dialects. DCDuring TALK 15:54, 22 September 2015 (UTC)
Labov et al's Atlas of North American English (ISBN 978-3110167467) describes layers of dialects in America, from 'South', 'Mid-Atlantic', 'Inland North dialect' and other broad dialects to 'New York City dialect' and other small dialects which are in some cases subsets of the broad dialects. I can imagine some people making a distinction between 'dialectal' and 'regional'; but the categories reveal that insufficiently many people imagine that there is a distinction, and they are in practice used interchangeably, with dialects described as regional (see in particular Category:Classical Hebrew and Category:Biblical Hebrew!) and the distinctive speech of regions (accurately IMO and per Labov) called dialectal. I favour making "regional" an alias of "dialectal". I wouldn't object to having "regional" display as such but categorize as "dialectal". I find the label to be uselessly vague, though: specify which regions, and then you needn't add a vague "regional" label, and if it's desirable that all regional/dialectal entries go into a single category, then make each label double-categorize into the specific category and the regional/dialectal category, rather than appending "regional" to only a small handful of the many entries that have regional (Southern US, Northern England, etc) tags. - -sche (discuss) 15:35, 23 September 2015 (UTC)

The constituent part parameters in Template:calque

The template {{calque}} has parameters to provide the source language term as well as the terms in the calquing language that the new term was made from. This is problematic, because it makes it impossible to provide more detailed etymologies using more fine-grained templates like {{compound}} or {{affix}}. In fact, many calques are also compounds, but are not categorized as such. I think it would make more sense and work better if, instead of putting this in antibody like now

{{calque|en|anti-|body|etyl lang=de|etyl term=Antikörper}}

this were changed into

{{affix|en|anti-|body}}, {{calque|en|de|Antikörper}}

This way, the entry will be correctly categorized as being prefixed with anti-. And it also makes {{calque}} work more like {{borrowing}}. —CodeCat 16:02, 22 September 2015 (UTC)
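The rewrite proposed above is mechanical in the simplest case, and can be sketched as a text transformation. This is only an illustration: the function name is hypothetical, it handles nothing but the exact parameter layout shown above (positional parts plus "etyl lang" and "etyl term"), and it always emits {{affix}}, which is an assumption. Whether {{affix}}, {{compound}}, or neither is appropriate is an editorial judgment, so any output would need human review.

```python
import re

# Matches a {{calque|...}} call that contains no nested templates.
CALQUE = re.compile(r"\{\{calque\|([^{}]*)\}\}")


def split_calque(wikitext: str) -> str:
    """Rewrite old-style {{calque}} calls into {{affix}} plus new-style {{calque}}."""

    def convert(match: re.Match) -> str:
        positional, named = [], {}
        for param in match.group(1).split("|"):
            if "=" in param:
                key, value = param.split("=", 1)
                named[key.strip()] = value.strip()
            else:
                positional.append(param.strip())
        parts = positional[1:]  # everything after the language code
        # Leave the call untouched unless it is clearly in the old format.
        if not parts or "etyl lang" not in named or "etyl term" not in named:
            return match.group(0)
        lang = positional[0]
        affix = "{{affix|" + "|".join([lang] + parts) + "}}"
        calque = "{{calque|" + "|".join([lang, named["etyl lang"], named["etyl term"]]) + "}}"
        return affix + ", " + calque

    return CALQUE.sub(convert, wikitext)
```

Applied to the antibody example, this turns the first form shown above into the second; a call that is already in the new form is left unchanged.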

I agree. This annoyed me recently when I was editing съвѣсть ‎(sŭvěstĭ). --WikiTiki89 17:30, 22 September 2015 (UTC)
I've now modified {{calque}} so that it works fine when no positional parameters are given. The entries that still do have them are at Category:calque with terms. There's quite a lot to do, and they can't be done by bot because it's not a given that {{compound}} should always be used. grammatical alternation for example is definitely not a compound. But then, we have never been very strict on what a compound is, some people treat anything made up of different parts as a compound. —CodeCat 19:58, 27 September 2015 (UTC)

Modules, schmodules

When did it become policy that you can't add a category to another category? This is unfair to users who can't code (i.e. most of us). Anyone should be allowed to put a category into another category, and we should not be forced to use User:CodeCat's confusing, elitist and generally unnecessary module system. Purplebackpack89 21:29, 22 September 2015 (UTC)

It's unfair that people with no medical knowledge can't perform surgery. BAWWWW. Equinox 21:30, 22 September 2015 (UTC)
Editing Wiktionary should be easy enough that we shouldn't compare it to surgery. You should be able to edit nearly all aspects of Wiktionary, including putting categories into other categories, without knowing how to code a module. Coding modules is really advanced stuff. Purplebackpack89 21:34, 22 September 2015 (UTC)
If you are talking about your recent edit war with CodeCat about Category:Penutian languages, maybe you should wait for WT:RFDO#Category:Penutian languages to end. Perhaps the problem is not that using modules is the "hard" way to use categories, I'd argue that templates/modules actually make the category system easier to manage as a whole, it is just that the category in question is not part of the system because nobody created a code for "Penutian". Assuming that the category fails RFDO it will get deleted; if it passes you would just have to insert it normally into the category system. The proper code would not be {{langcatboiler}}, but {{famcatboiler|qfa-pen}}, provided that the category passes RFDO, then a code qfa-pen would be created for it. --Daniel Carrero (talk) 21:54, 22 September 2015 (UTC)
I'm arguing about modules in general, and I totally disagree that it's easier. It's so much easier to use HotCat to add categories to other categories than it is to edit modules or add templates. FWIW, I am also concerned that User:CodeCat has too much OWNership of the categorization system of this project; particularly when CodeCat has forced us to use modules instead of just HotCatting everything like most other projects do. Purplebackpack89 22:03, 22 September 2015 (UTC)
@Purplebackpack: Most languages/families already have a code so it works for them; Penutian is the odd one out, still under discussion. I suggested above: That is clearly a disputed/proposed language family, wait for the RFDO discussion to end before using that category further. But you'd rather use that category (and "Terms derived from Penutian languages", etc.) now, before discussion? I wouldn't advise doing that manually. That's a lot of work for a category under discussion: You would have to edit Category:Terms derived from Wintuan languages, Category:Terms derived from Chinookan languages and others as subcategories of Category:Terms derived from Penutian languages individually or leave the work unfinished. The purpose of the module system is doing or undoing all that at once with few edits to the modules.
For better or for worse, when you complain about @CodeCat, I demand credit, too, since pre-Lua I created {{poscatboiler}}, {{langcatboiler}}, {{famcatboiler}} and a good chunk of the category structure in use now. Though lots of other people contributed to the system as well, with edits and also votes and discussions. (I admit my coding was a mess, often in the form of large templates, and it made people's lives difficult editing them. I apologize to everyone who tried to edit the templates and remember this.) One thing I think is an improvement is that you know what to expect in a category system that is structuralized through templates: before the creation of a number of standardized subcategories, language categories like Category:English language had entries, appendices, templates and indexes randomly placed directly in it instead of in the subcategories, for example. Also many categories "Category:XYZ nouns" were NOT in "Category:Nouns by language", a number of categories were using different names for the same language (Category:Nynorsk language vs. Category:Norwegian Nynorsk language) and so on. --Daniel Carrero (talk) 22:52, 22 September 2015 (UTC)
"The purpose of the module system is doing or undoing all that at once with few edits to the modules." If you can't code and/or can't find modules, the number of edits it takes isn't particularly relevant. With HotCat (and even without it), I can add a category to a large number of pages relatively quickly, as HotCat is relatively easy to use and doesn't require coding. Also, Dan, while you may have had a role in our current confusing system, you revert me (and other users) far less in this area than CodeCat does. Purplebackpack89 23:54, 22 September 2015 (UTC)
I should make myself clearer. "The purpose of the module system is doing or undoing all that at once with few edits to the modules." That is in cases where you want to change how the system works, not just simple tasks like adding or deleting categories, which often don't require any change to the modules themselves.
Even if you did your edits with HotCat in a perfect and consistent way, we can't guarantee everyone will do it. Modules make the category system consistent. Consistency is important: Category:English nouns is a subcat of Category:English lemmas, not a subcat of Category:English language, and that is true for all languages. But what if people want to change this? What if User:Example wants Category:German nouns to be a subcat of Category:German language? My point is: There are dozens of thousands of categories, that is an increasingly complex system by itself. In my opinion, User:Example should not be able to place Category:German nouns under Category:German language while leaving all other "X nouns" categories unattended. If that is to change, then all "X nouns" would have to change together. That is why we are using modules now. What's your opinion on this? --Daniel Carrero (talk) 10:15, 23 September 2015 (UTC)
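The consistency argument above can be illustrated with a toy model. This is only a sketch of the idea, not the actual category modules: the table and function names are hypothetical, and the only parent categories used are the ones named in this discussion ("X lemmas" and "Nouns by language").

```python
# Hypothetical rule table: one entry per part of speech, shared by every language.
POS_PARENTS = {
    "nouns": ["{lang} lemmas", "Nouns by language"],
}


def parent_categories(language: str, pos: str, rules=POS_PARENTS):
    """Return the parent categories of 'Category:<language> <pos>' under the shared rules."""
    return ["Category:" + parent.format(lang=language) for parent in rules[pos]]


def category_name(language: str, pos: str) -> str:
    """Build the category title itself, e.g. 'Category:English nouns'."""
    return f"Category:{language} {pos}"
```

Because every language goes through the same table, Category:German nouns cannot end up under a different parent than Category:English nouns unless the rule is changed, and changing the rule changes all languages together. That is the trade-off being debated: a single edit with global effect, at the cost of editors needing to find and understand the table.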
My opinion, Dan, is that you're ignoring the main drawback of module use: that most people either don't know what a module is and/or are unable to edit them even if they do. Yes, using HotCat or other manual categorization methods may have a lack of standardization, but it's far less complicated. I could probably add or remove a category from every single category we've got in less time than it'd take me to learn to code. Purplebackpack89 14:32, 23 September 2015 (UTC)
I am not ignoring that modules are complicated to people who don't know how to edit them. (I am one of those people, I can make templates using MediaWiki code, but I know virtually nothing of Lua.) I am explicitly supporting the consistency of modules in favor of the freedom of editing any category individually, on the grounds that categories are far more useful (at least IMO) in this multilingual system if all languages follow exactly the same categorization system. You say: "manual categorization methods may have a lack of standardization", but that downplays the fact that the categorization system without templates (as it was without poscatboiler and some votes and discussions) was a huge mess, with some examples that I already mentioned in this discussion. I also find it hard to believe that everything could be solved manually, like adding "Category:Nouns by language" and "Category:X lemmas" manually to all the 1,567 "Category:X nouns" categories, without having any categories using different language names. Surely there would be some problems solved with higher priority and others left unattended? Do you have interest in the whole categorization system or just a few of the existing categories? You said: "most people either don't know what a module is and/or are unable to edit them even if they do." What do you want to edit in modules? Or: What problem are you trying to solve where you would use HotCat rather than modules? --Daniel Carrero (talk) 15:52, 23 September 2015 (UTC)
If by "problem", you mean "everything I've ever wanted to recategorize or create since we created modules"...Purplebackpack89 16:30, 23 September 2015 (UTC)
That does not answer my questions. Be more specific. You only have 42 edits to categories but apparently since the beginning you had known how to use {{poscatboiler|gul|verbs}}, for example, so fitting new categories into the existing system does not seem to be an issue. What did you want to recategorize or create ever since we created modules? --Daniel Carrero (talk) 16:54, 23 September 2015 (UTC)
I have some questions:
  1. Can we direct User:Purplebackpack89 to the documentation for the relevant part of our infrastructure?
  2. Can we direct him to the documentation for how to accomplish specific tasks?
  3. Can we tell him where the system for accumulating user requests, issues or questions and the response thereto is?
No truly professional system with a user/contributor interface would have so little documentation accessible from the user's PoV. If we can't have a truly professional system, then why do we place so much of the project in the hands of coders? Why do we accept so much regression in the features of the software? It seems as if many of the software projects simplify user-interface and other matters to match the skill level and "tidiness" preferences of the coders. DCDuring TALK 22:20, 22 September 2015 (UTC)
Question 1: {{famcatboiler}} / Module:families/data.
Question 3: WT:GP. --Daniel Carrero (talk) 22:52, 22 September 2015 (UTC)
Re: Question 1: How would one find that?
Re: Question 3: What gives some assurance that the problem is actually solved and how it is solved (any regressions?) rather than oversimplified, ignored, or deleted?
DCDuring TALK 23:29, 22 September 2015 (UTC)
Question 1: By looking at the code of other family categories. Random suggestion: Category:Italic languages. (or by asking other people)
Question 3: No assurance; this is a volunteer project. Though I'm curious: Did you have a request, issue or question that was posted in GP and later oversimplified, ignored or deleted? --Daniel Carrero (talk) 23:38, 22 September 2015 (UTC)
re: "this is a volunteer project" That seems to be interpreted as coders doing whatever they want, whether asked or not, conforming only to standards of their own invention. Other users are simply to conform to the changes. Those users are also volunteers and they have stayed away from or left this project in droves. DCDuring TALK 01:22, 23 September 2015 (UTC)
What do you want to change in categories? I don't see people complaining that way about complex templates other than categorization ones, such as {{en-noun}} and {{context}} (and previous incarnations, such as {{obsolete}}). What would people do if they wanted to change {{en-noun}} some way but didn't have the skills to do it? {{poscatboiler}} and related templates are great; but I'm just saying my opinion. If a number of people really hate them so much, why don't you nominate them for deletion? --Daniel Carrero (talk) 10:15, 23 September 2015 (UTC)
And Question 2, which is the "how to code" question? Also, having coding issues actually makes things harder for people like you and CodeCat who can code, because not only do you have to do your coding, you have to do the coding of everybody who needs coding done. Purplebackpack89 23:54, 22 September 2015 (UTC)
You started your complaint with "having coding issues"; what coding issues? --Daniel Carrero (talk) 10:15, 23 September 2015 (UTC)
And let me remind you: this is a general beef I have with categorization requiring coding. Penutian wasn't the first time I ran into this wall, and I'm definitely not the only person to run into this wall. Purplebackpack89 23:59, 22 September 2015 (UTC)
Nominate {{poscatboiler}} for deletion. I'd vote oppose, but you have the right to pursue what you think is best. Or hire someone to create something better if you can't code. Or make a list of everything you think is wrong with the template so that can be discussed/fixed. --Daniel Carrero (talk) 10:15, 23 September 2015 (UTC)
How 'bout I just nominate all modules for deletion, and we go back to manual categorization? Your "solutions" essentially require that I acquiesce to coding being necessary for categorization on this project. I'm not willing to do that. I shouldn't be asked to. Purplebackpack89 14:32, 23 September 2015 (UTC)
"...require that I acquiesce to coding being necessary for categorization on this project" No. What is the difference between presenting an opposing idea in a discussion and "requiring that the other person acquiesce" to something? On that logic, I could accuse you of making me acquiesce to your ideas right now, too. But the fact is that I am not trying to "manipulate" you into changing your mind. I am presenting my points and I am asking you to discuss yours, especially after you have explicitly complained of your edits being reverted or what you have to say being ignored. What else do you want? Ultimately, nominating {{poscatboiler}} for deletion as I said may have sounded sarcastic, but that was not the case. If a template is problematic, creating a deletion discussion would be the natural thing to do. Even though I think it's a great and helpful template, you can do whatever you want. --Daniel Carrero (talk) 15:52, 23 September 2015 (UTC)

Formatting proposal: always put cognates in a separate paragraph

Previous discussion: Wiktionary:Beer parlour/2015/June#Proposal: Always collapse cognate lists in entries

Back in June I complained that lists of cognates make etymologies messy and hard to read. Several other editors seemed to agree, but there wasn't a consensus for any change that I can see. So I'd like to propose something much milder and less intrusive: cognates must always be in their own paragraph, separate from the explanation of the term's origin. See landschap for an example. —CodeCat 00:16, 24 September 2015 (UTC)

Support. Some etymologies have the evolution chain and the cognates intertwined (i.e. “from Fooese foo (cognate to Barese bar, Voynichean cthar), from Klingon kplar (cognate to ...)”). How can these be dealt with? — Ungoliant (falai) 01:11, 24 September 2015 (UTC)
They should probably be detwined and made into a paragraph. Of course, there's nothing that says there can't be more than one paragraph of cognates, so one could have one for close cognates and another for more distant ones. —CodeCat 01:27, 24 September 2015 (UTC)
I also support, though I would like to know more about this issue. Would we format multiple cognate paragraphs to begin with a word to indicate from what level the cognates come? If so, is the cognate paragraph labeled by the parallel cognate or the shared etymon? Also, do we represent collateral forms under the list of cognates? —JohnC5 16:15, 24 September 2015 (UTC)
Would it be possible to make a template for this, something like {{cognates|fi=...|eu=...}}, or do you think there would be too much variation? DTLHS (talk) 01:42, 24 September 2015 (UTC)
I Support. And I am a huge fan of consistency, so in cases like neap etym_2, will the one cognate be in a separate paragraph? What should be the proper way in this instance? Leasnam (talk) 11:44, 24 September 2015 (UTC)
And we also treat comparisons in a similar way to true cognates, so would we need a separate template (if any) for these? Leasnam (talk) 11:47, 24 September 2015 (UTC)
That template would be possible and is an interesting new way to use template parameters, but I don't think we should use it for this because cognates need to be arranged in a logical and structured order which makes sense in terms of the etymology and the template would have no way of knowing how to arrange them. --WikiTiki89 14:40, 24 September 2015 (UTC)
For descendants we usually have a fixed ordering, so could we not use that for cognates too? —CodeCat 14:47, 24 September 2015 (UTC)
The ordering is fixed only when the word descended through its "natural path", which is not always the case. --WikiTiki89 14:51, 24 September 2015 (UTC)
True, but that might not matter for listing cognates. —CodeCat 15:29, 24 September 2015 (UTC)
Well it does still matter, but you may be right that it would not apply in many cases because the words did descend in their natural paths to all the listed cognates. --WikiTiki89 15:38, 24 September 2015 (UTC)
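A rough sketch of how the fixed ordering discussed above might work for a hypothetical cognates template. This is only an illustration in Python, not an existing Wiktionary module: the function name, the ordering list and the language-name table are all assumptions made up for the example.

```python
# Hypothetical fixed ordering and display names; a real list would be far longer
# and would need community agreement.
LANGUAGE_ORDER = ["nl", "de", "en", "sv"]
LANGUAGE_NAMES = {"nl": "Dutch", "de": "German", "en": "English", "sv": "Swedish"}


def format_cognates(cognates):
    """Render a cognate paragraph from a {language code: term} mapping.

    Codes found in LANGUAGE_ORDER come first, in that order; any other
    codes follow alphabetically and are shown by their bare code.
    """
    known = [code for code in LANGUAGE_ORDER if code in cognates]
    unknown = sorted(code for code in cognates if code not in LANGUAGE_ORDER)
    parts = [
        "%s {{m|%s|%s}}" % (LANGUAGE_NAMES.get(code, code), code, cognates[code])
        for code in known + unknown
    ]
    if not parts:
        return ""
    return "Cognate with " + ", ".join(parts) + "."
```

Note that this only reorders by language. As pointed out above, it has no way of knowing which cognates belong together etymologically, so it would suit only the cases where every listed word descended along its natural path.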
Support: saying they "must always" be in their own paragraph seems needlessly firm to me. There's always room for exceptions to be made. But in general it seems like a good idea. WurdSnatcher (talk) 16:18, 24 September 2015 (UTC)
Support, seems reasonable. Also WT:ELE is currently lacking any mention of cognates. I realise they don't have their own header, but this is the place users ought to be able to look for guidance on how to add them to an entry. —Pengo (talk) 01:49, 26 September 2015 (UTC)
When an etymology is very long, this would be helpful, but when an etymology is very short ("From Proto-Algonquian *foo, whence also Penobscot foo."), a paragraph break seems unnecessary and excessive. It would also make entries with different levels of cognates (as Ungoliant describes) messy. - -sche (discuss) 05:21, 26 September 2015 (UTC)
I agree with this point. "Must always" seems a bit overly firm here. We could consider e.g. separating cognates if the etymology runs to more than one line long? --Tropylium (talk) 22:42, 28 September 2015 (UTC) 
@Tropylium How long is a line? —CodeCat 18:05, 1 October 2015 (UTC)
Depends on the window size (which further depends on one's screen resolution), of course. But we should probably aim for downward compatibility. Is it possible to find out what is the median non-mobile user's screen width? Off the cuff I'd guess 1024 px? and allowing for other screen uses, about 600-800 px might then be a reasonable range for "a line". --Tropylium (talk) 20:20, 1 October 2015 (UTC)
  • I oppose that "cognates must always be in their own paragraph" since it makes etymology sections take more vertical space, especially the short ones. I am quoting, including "must" and "always". --Dan Polansky (talk) 11:45, 26 September 2015 (UTC)
Define list, is a list just more than one? Renard Migrant (talk) 14:32, 26 September 2015 (UTC)
  • I support the option to use separate paragraphs, but would not be comfortable enforcing it if, for instance, there are only one or two examples. Ƿidsiþ 14:31, 30 September 2015 (UTC)

Wikiproject Siriono

Hi all,

I would like to tell you about a project I am trying to set up for fr.wiktionary and es.wiktionary. It is called Wikiproject Siriono and it concerns Siriono, a language spoken in Bolivia. I am a French linguist who has been studying this language for five years for my PhD in linguistics, and I want to export my database into Wiktionary. Then, as the proposal describes, I plan to go to Bolivia to train the speakers to access and manage the entries in the Spanish edition of Wiktionary. My database is built with FieldWorks Language Explorer and includes translations from Siriono into local Bolivian Spanish, French and English. I know nothing about the structure of the English edition of Wiktionary, so I will not try to create a bridge for this edition myself, but I am willing to collaborate if someone here wants to. In my opinion, it only requires basic formatting of the XML data and a language check, as English is not my mother tongue. Also, if someone here speaks Spanish and wants to be part of the team that will go to Bolivia next year, there is still room for one more name. I want to work mainly, if not only, on the Spanish Wiktionary, so it would be better if it is someone who knows that project well. I hope you will find this interesting; you are welcome to comment, criticize and fix mistakes in the proposal if my English is too vague or incorrect. Yours, Eölen (talk) 17:51, 24 September 2015 (UTC)

@Eölen: Very exciting project! I've copy edited the proposal as requested (hopefully I haven't added any errors). The only part I wasn't sure about was the line "...and be host in family during one month." I wasn't sure if you meant "and his family will host for one month" or maybe "a local family will host the team for one month" so I left it (I'll let you fix it). Looks like a very worthwhile project and wish you the best with it. —Pengo (talk) 03:04, 26 September 2015 (UTC)
@Pengo: Thank you very much! You did a great job! I will clarify this sentence. A local family will host the team, in a separate house I built in the village, next to their house. I don't have relatives there! I am very grateful that you helped to fix the language and improve the writing. I hope this project will go well. I am still trying to define precisely how to schedule it and I am still looking for a volunteer for the team! Thanks again! Eölen (talk) 04:25, 26 September 2015 (UTC)

"Compare" in etymologies

I've always been annoyed when etymologies say "compare", with a bunch of words in other languages, because it's so meaningless. Why am I supposed to compare them? Am I supposed to note their superficial similarity? Is their meaning interesting? The number of syllables? Do I win a meerkat if I compare them? Obviously the insinuation is that there is some connection, but why does the entry not just say what the connection is? Are they cognate? Or otherwise related? Then say so.

Sorry, a bit of a rant, but I hope people agree with me. —CodeCat 00:15, 26 September 2015 (UTC)

Then say so.—how about you go speak to your cat like that?
I guess I'll have to be the first to disagree, since those were my two entries that you just defaced. Like I said when I reverted your edits, at least mine were helpful, which is more than I can say for yours. If you don't find this information useful—don't use it. If you don't find it interesting—ignore these sentences. And let other people make whatever sense they make of those things. They don't break any rules, the connections pointed out are factual and real.
P.S. Does anybody honestly care what someone has "always been annoyed" by? Pfftallofthemaretaken (talk) 00:21, 26 September 2015 (UTC)
But you didn't point out any connection. You said "compare", which means nothing. Things can be compared to show the absence of connections, too. And Wiktionary isn't a free-for-all where you can just put anything you think is interesting. Things have to serve a purpose and fit into our formatting rules. A "compare with" heading does neither. And for the record, I didn't "deface" "your" entries. I edited the entry, and the entry was Wiktionary's from the moment you made your edit. —CodeCat 00:27, 26 September 2015 (UTC)
There is a place for "compare" in etymologies, because establishing the exact relationship between words isn't always possible, but there are plenty of cases where there seems to be some kind of connection. I would rather have "compare" in many cases than to either have a long explanation of why the term is of interest, but maybe not a cognate, or to eliminate anything other than that which is rigorously proven to be a cognate. As with anything open-ended, this can certainly be abused: I don't really want to see a bunch of Mongolian terms in English entries because they pass some kind of Pan-Turkist's Rorschach test. But then, I don't really want to see Norwegian, Swedish, Danish, Icelandic and Faroese cognates because one of the languages was included and partisans for the others felt they deserved equal treatment.
As for the two cases at contention: I agree that having a number of terms in other languages under a separate "Compare with" header is overkill, though they all probably trace back to the same Pali term through some combination of borrowing and inheritance. On the other hand, removing a Chinese term from a Sino-Vietnamese etymology that's made up of the same characters as those cited in the main part of the etymology seems like overkill, too. It's a matter of proportion and degree of relevance, which can't be tidily summed up in some kind of rule. Chuck Entz (talk) 01:20, 26 September 2015 (UTC)
It's more the wording that annoys me than anything else. I have no problem with listing cognates or related terms, but then say what they are. Don't use "compare" and leave people guessing at what the idea is. Related terms should be called related terms, because then you know what they are. —CodeCat 01:38, 26 September 2015 (UTC)
I agree. I assume cf. is shorthand for "see also this other word which sounds a bit similar so maybe its etymology is correct for this word too or at least had an influence on its formation, who knows?", or more briefly: possibly cognate with or possibly influenced by or formed in a similar way with a different meaning etc, which are the kinds of phrases I'd prefer.
Similarly Related terms is also fairly meaningless, and it annoys me that we have no standard way (afaik) to, for example, mark a related term as "adjective form of this noun", etc. —Pengo (talk) 02:11, 26 September 2015 (UTC)
For related terms, I was talking about putting them in etymologies. Just as we have "cognate with" lists, there's also "related to" lists, which are much preferable to the nondescript "compare". —CodeCat 02:18, 26 September 2015 (UTC)
Compare makes perfect sense. It means, look for similarities or analogies. "Compare" does not state what the connection is since that would be too wordy, and since that is often obvious once you actually start doing the comparing. It looks like I largely agree with Chuck Entz above. --Dan Polansky (talk) 11:51, 26 September 2015 (UTC)
Compare is used for cognates in other languages and also cognates in the same language, or words that aren't cognates but follow a similar pattern in their formation. You're free to be pissed off about it, it's your life, but I'd like a Wiktionary where not only what CodeCat thinks is important, and other people can make some decisions too. Renard Migrant (talk) 14:24, 26 September 2015 (UTC)
Of course, because whenever I think something can be improved, that must automatically be interpreted as my attempts to force my will through. Because we all know that leaving problems unsaid works so well. At least now that I've said this, it's clear that some people agree with me. So how can you even think it's about what I alone think is important? Do those other people not count? —CodeCat 14:40, 26 September 2015 (UTC)
I agree with CodeCat, "compare" only means something if you already know what the relationship is or know enough about linguistics to deduce what's meant. It makes sense in paper dictionaries where you have to save space, but here we can be clear. Why not say "possibly cognate with" or "may be related to" or whatever? It seems like Wiktionary tries to be as confusing and off-putting as possible sometimes. Why use such stilted, unnatural language? Are we trying to communicate info to each other and other linguistics enthusiasts? Or are we trying to educate ordinary people? If we're trying to educate ordinary people, "compare" is pointless obfuscation about as useful as a "Meronyms" section. WurdSnatcher (talk) 14:30, 26 September 2015 (UTC)
“I would rather have "compare" in many cases than to either have a long explanation of why the term is of interest, but maybe not a cognate, or to eliminate anything other than that which is rigorously proven to be a cognate”—this.
@WurdSnatcher Educate ordinary people? I didn’t know that was the aim of the project. I would be curious to know how many people here would say they contribute because they’re trying to educate ordinary people. I’d say ordinary people don’t care about dictionaries at all—they use them once or twice a year to look up a word or two. And when they do that, they’re certainly not interested in things like, say, the etymology of the word at all. Should we start removing all etymologies then?
“Why use such stilted, unnatural language?”—oh wow, that’s rich, coming from someone championing the interests of ordinary people and proposing to use “possibly cognate with” instead of “compare with”. I didn’t even know what a ‘cognate’ was until a couple of months ago. How are ‘ordinary’ (whatever that means anyway) people supposed to know that? “Compare with” is certainly more natural than “possibly cognate with” or, god forbid, ‘cf.’
“It makes sense in paper dictionaries where you have to save space, but here we can be clear.”—there’s another thing we need to try and save, whether we’re talking about paper or online dictionaries. That thing is people’s time. And that is the reason why “compare with” is preferable to “possibly [insert an obscure linguistic term that only linguists understand]”, or etymology sections spanning half a dozen sentences. This is especially true if we’re keeping in mind the interests of those ‘ordinary’ people, who just want to quickly figure out what that obscure word they’ve just heard means.
@CodeCat Are you going to nitpick my use of the word ‘my’ in front of the word ‘entry’ now? The entries might not be mine, but can I at least keep the messages to my unworthy person? With m’lady permission, of course. Thank you. And while my messages are still mine, the way that I see it is that I get to call things whatever I want (as long as I’m not employing insults, and I’m not), so ‘deface’ you did. I like the word, and it describes accurately the nature of your edits.
By the way, here’s an idea for you. This is how books used to be censored in tsarist Russia: [[13]]. Maybe next time you don’t like something in an entry you could just replace that part with dots, like in the book pictured. You still get to censor things (which you seem to enjoy doing), but don’t do as much damage in the process. Pfftallofthemaretaken (talk) 19:34, 26 September 2015 (UTC)
Wow, that was quite a bit of bile there. I'm not really sure that was merited in response to anything. I do not mean to lecture you, but no user really has chief control over any entry outside of his or her user namespace. Code may often be rather brusque when she makes changes to things, but the reason we are having this discussion is to decide how we, not she and not you, will handle this issue in the future. I really don't want us to be so angry about such a small matter, so I would ask all parties involved please calm down just a touch.
Compare has always seemed like shoddy explanation to me personally, but not enough that I think it should be removed wholesale. I normally understand compare to act as a stub for bigger and better things!
P.S. I can whip up a template like {{redacted|US}} or {{redacted|tsarist}} that will place black bars or dots over text respectively. It wouldn't be too hard, really. (N.B. the preceding joke is meant solely for the purpose of humor and not to antagonize, mock, or promote one side's argument in any way. Please do not get bent out of shape about a silly suggestion.) —JohnC5 22:32, 26 September 2015 (UTC)
I agree that removing "compare" along with all the terms is probably bad. But at the same time, it's often not possible for a random person to improve things, because this means knowing just what the connection being alluded to is - exactly the vagueness that is making me complain about "compare" in the first place. If I see "compare", I have no way of knowing whether the words are cognates, unless I have knowledge of those particular languages (e.g. Germanic). For Chinese, I would have no clue what to replace "compare" with; only the person who added it there presumably knows. Preventing that situation is better than having to tediously fix it afterwards, hence my revert to Pfftallofthemaretaken's edits. —CodeCat 23:10, 26 September 2015 (UTC)
That seems fair, especially considering the illegal headers. Also, sorry for calling you brusque. —JohnC5 23:53, 26 September 2015 (UTC)
@JohnC5 After reading your message I realized I was wrong and amended mine accordingly.
"I do not mean to lecture you, but"—"but that's exactly what I'll proceed to do."
"I normally understand compare to act as a stub for bigger..."—I sincerely hope that this isn't what this project is all about.
"...but the reason we are having this discussion is to decide how we, not she and not you, will handle this issue in the future."—really? I thought we were having this discussion because one particular editor wished to communicate to the rest of us what she's "always been annoyed" by.
But anyways, I agree that the situation could use some defusing, so let's all play a game. The rules are as follows. I will now revert CodeCat's edit of the entry on bảo đảm and put back "compare with Chinese 擔保" that I originally put there. These are the same two characters that are used in the hán tự spelling of the word, but in reverse order. I'm sure there's a name for it in linguistics, but haven't the faintest idea what that term might be. So I'll use "compare with". Now, whoever reverts that edit will have to explain why they didn't also remove all the instances of compare in the Etymology section of the entry on father—as the section invites us to compare that word with 15 other words, in 15 different languages! I don't know about everyone else, but I'm curious what's gonna happen. Because, after all, maybe those reverts that I mentioned in my original reply here were just personal... Pfftallofthemaretaken (talk) 08:21, 27 September 2015 (UTC)
This game seems more antagonistic than conciliatory…. —JohnC5 15:35, 27 September 2015 (UTC)
The way I see it, "compare" is not the ideal, but sometimes details are simply not available or the editor writing the etymology is not aware of the details and does not want to spend too much time researching them. You can look at it as an invitation for anyone who cares enough to do some research and replace "compare" with something more informative. --WikiTiki89 16:29, 28 September 2015 (UTC)
I agree. And I didn't see the abovementioned entry edits (so don't know whether they were removal of "compare" lists or what), but don't think that "compare" lists should be removed. Clarified, certainly. Removed, no.​—msh210 (talk) 18:03, 1 October 2015 (UTC)
One point, tho not directly related to the kerfuffle at hand, is that a lot of our "compare" stretches seem to date from before we had much in the way of protolang appendices, and seem to fulfill the role of listing cognates. This at least I think can be done much more efficiently with the appendices (which would also help with the occasionally seen complaint that our etymology sections take too much space).
I do not oppose "compare" if given context, e.g. if we say that "Barese X is borrowed from Proto-Fooian Y", then it seems pretty clear what adding a "(compare Classical Foo Z)" is doing. But yeah, a completely hanging "compare Zoinks Ø" seems unhelpful. --Tropylium (talk) 22:38, 28 September 2015 (UTC)

Automation of French conjugation

This user is requesting that all French conjugations use the Template {{fr-conj-auto}}. If the consensus agrees, this user will start converting all French verbs to the new usage. --kc_kennylau (talk) 08:32, 27 September 2015 (UTC)

  • Any chance of some documentation? SemperBlotto (talk) 08:37, 27 September 2015 (UTC)
    Obviously it's a good idea, but does it ever need parameters, or is all irregular verb information stored in the module itself? Renard Migrant (talk) 15:18, 27 September 2015 (UTC)
    @SemperBlotto: Okay, documentation done. --kc_kennylau (talk) 01:40, 28 September 2015 (UTC)
    @Renard Migrant: No parameter would be needed (usually). --kc_kennylau (talk) 01:40, 28 September 2015 (UTC)

Template inh in etymologies

I see a bot replacing {{etyl}} with {{inh}}, e.g. in diff. The resulting markup combines what was etyl and term into a single template, like {{inh|cs|sla-pro|*čьlověkъ}}. I don't really like this. Everyone happy? --Dan Polansky (talk) 11:24, 27 September 2015 (UTC)

Yes. I've been waiting a long time for someone to finally do this. I find it really annoying to have to separate the {{etyl}} and {{m}} templates, and half the time I find myself putting the etymon directly in {{etyl}}, and then I have to go back and fix it. Using {{inh}} makes life much easier. —Aɴɢʀ (talk) 12:12, 27 September 2015 (UTC)
Well, {{inh}} isn't meant as a drop-in replacement for {{etyl}} and {{m}}. It's meant specifically for inherited terms, as its documentation notes. For borrowed terms, you'd use {{bor}}. There are also cases where terms are neither inherited nor borrowed. For example, foxhound does not inherit from Proto-Germanic *fuhsaz or *hundaz but it can be said to be derived from Proto-Germanic nonetheless. I intended "inheritance" to mean specifically cases where the actual morphological formation was inherited. In other words, inheritance can be traced until the point that the actual word "foxhound" came into existence. —CodeCat 13:16, 27 September 2015 (UTC)
Good idea. Funnily enough the French template étyl has done this almost since its inception many years ago. The only issue with bot replacements is sometimes you get "Borrowed from {{etyl|la|fr}} {{m|la|<word here>}}" which, for obvious reasons, shouldn't use {{inh}}. Renard Migrant (talk) 14:52, 27 September 2015 (UTC)
I haven't done wholesale bot replacements. The ones I did were always pairwise: for only one given source language and current language. That way I could make sure that the current language was in fact a descendant of the source. I've also only replaced instances where the source is the first to appear in the etymology, and only when the preceding word is "From" or empty, never "Borrowed" or anything else. On top of that, I've so far avoided language pairs where the current language might have reasonably borrowed from the source as well as inherited from it, i.e. Romance borrowings from Latin. The Germanic languages have generally not borrowed words from older stages, so they are safe. I've also skipped cases where reconstructed terms end in -, because those are likely to be stems or roots rather than fully-formed words. Roots have no descendants, only derived terms which have descendants. —CodeCat 15:13, 27 September 2015 (UTC)
Side note: it's often hard to tell if a word's history was derivative formed in proto-X → develops into word in X or root inherited in early X → derivative formed within the history of X. I would suggest that at least in cases where we can only clearly establish a proto-root and a language-specific derivative, it's safest to analyze them as a derivative within the language, and only link the root of the derivative to the proto-root. --Tropylium (talk) 22:28, 28 September 2015 (UTC)
I seem to remember cleaning up a couple of cases (maybe not with this specific template) where the language code was an etymology-only subset along the lines of NL. and the module tried to use it for the mention. You would need to allow for this before converting such cases. Chuck Entz (talk) 15:54, 27 September 2015 (UTC)
Are you sure? It explicitly handles etymology-only languages already. —CodeCat 16:12, 27 September 2015 (UTC)
Is the bot checking to make sure there's no borrowing along the way (e.g., if an English word from Latin via a French/Norman borrowing says only "from Latin foo", or if an English word from Hebrew via Yiddish says only "from Hebrew foo"), CodeCat?​—msh210 (talk) 17:57, 1 October 2015 (UTC)
Well, so far I've only done Dutch and the Finnic and Samic languages. I don't have any immediate plans to do more. That said, this script and template are only for inherited terms, so I wouldn't use it for English from Latin because English didn't descend from Latin. —CodeCat 18:03, 1 October 2015 (UTC)
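For readers trying to picture the safeguards CodeCat lists above, here is a minimal sketch in Python. It is not the actual bot script, which is not shown in this discussion; the function name and the regular expression are assumptions, and it only handles the simplest markup shape, {{etyl|source|current}} followed by {{m|source|term}} with no extra parameters.

```python
import re

# Assumed markup shape: "{{etyl|SRC|CUR}} {{m|SRC|TERM}}" with no extra parameters.
PAIR = re.compile(
    r"\{\{etyl\|(?P<src>[^|{}]+)\|(?P<cur>[^|{}]+)\}\}\s+"
    r"\{\{m\|(?P<src2>[^|{}]+)\|(?P<term>[^|{}]+)\}\}"
)


def convert_inherited(etymology, cur, src):
    """Replace the first etyl/m pair with {{inh}} for one language pair only.

    Returns the text unchanged whenever any safeguard fails.
    """
    match = PAIR.search(etymology)
    if match is None:
        return etymology
    # Pairwise: one given source language and one given current language.
    if (match.group("src"), match.group("cur"), match.group("src2")) != (src, cur, src):
        return etymology
    # The pair must come first, preceded only by "From" or by nothing.
    if etymology[:match.start()].strip() not in ("", "From"):
        return etymology
    # Terms ending in "-" are likely stems or roots, so skip them.
    term = match.group("term")
    if term.endswith("-"):
        return etymology
    replacement = "{{inh|%s|%s|%s}}" % (cur, src, term)
    return etymology[:match.start()] + replacement + etymology[match.end():]
```

The sketch errs on the side of leaving text alone: anything it does not recognise, including "Borrowed from" wording or a template with extra parameters, is returned untouched. Deciding which language pairs are safe at all (for example avoiding Romance languages with Latin) still has to be done by the person running it.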

Proposal: Drop "explained" and make it "Wiktionary:Entry layout"

I've been thinking about this for a while. I propose renaming Wiktionary:Entry layout explained (WT:ELE) to just Wiktionary:Entry layout (WT:EL), while retaining the old name and shortcut as usable redirects.

1. "explained" does not add anything new or make the title any better. We could just as well have Wiktionary:Criteria for inclusion explained, Wiktionary:Blocking policy explained, Wiktionary:Page deletion guidelines explained and maybe Wiktionary:Bots explained without the new name making the policies any more accurate. Maybe one reason "explained" is in the title is because a three-letter shortcut ("ELE") is catchy?

Bonus, less important, reason:
2. Speaking for myself, if the title is shorter, it would invite me to type the full name of the policy in discussions or in my own browser more often if I want to ("WT:Entry layout" or "Wiktionary:Entry layout"). Conversely, if the title is longer, it makes me more likely to use the shortcut only ("WT:ELE"). I am willing to bet this would be true for other people, too. Not that big of a reason (it boils down to "'explained' makes the title longer!"), but it's good to mention it, as secondary to the point above.

Previous discussion:

I did not use RFM this time because that's a major policy. In any event, if other people agree with the name change in this discussion, I plan to follow up with a vote to close the deal.


  • WT:EL was the shortcut to Wiktionary:External links and was used in three places: a 2006 discussion, a 2014 discussion and as a proposed shortcut to ELE itself. IMO that's unused enough that the shortcut could be changed; I renamed it to WT:EXT to free it for this proposal

--Daniel Carrero (talk) 14:50, 27 September 2015 (UTC)

Good idea, obviously. Renard Migrant (talk) 14:53, 27 September 2015 (UTC)
Support. —CodeCat 15:10, 27 September 2015 (UTC)
Seems reasonable enough - support SemperBlotto (talk) 15:13, 27 September 2015 (UTC)
Support. - -sche (discuss) 06:36, 28 September 2015 (UTC)
Support. Pengo (talk) 03:19, 29 September 2015 (UTC)
Why not!   — Saltmarshσυζήτηση-talk 04:40, 29 September 2015 (UTC)
Support - --Zo3rWer (talk) 15:38, 29 September 2015 (UTC)
Support. --WikiTiki89 15:43, 29 September 2015 (UTC)
Support.​—msh210 (talk) 17:49, 1 October 2015 (UTC)
The downside of it is that it will no longer be a three-letter acronym (TLA). --Dan Polansky (talk) 15:55, 3 October 2015 (UTC)
We can still continue to call it ELE. --WikiTiki89 15:21, 5 October 2015 (UTC)
WT:ELE is definitely going to be kept, since it has been a widely used shortcut. We can use WT:ELA (Entry LAyout) as an alternate three-letter acronym, though personally I prefer just WT:EL. (Also, in Portuguese, ele = he, ela = she.) --Daniel Carrero (talk) 16:37, 5 October 2015 (UTC)

I count 10 supports (including myself, not counting @Dan Polansky). I take it we can move WT:Entry layout explained to WT:Entry layout now without the need for a vote? --Daniel Carrero (talk) 16:37, 5 October 2015 (UTC)

Yeah; I've moved the page and its talk page and subpages, leaving redirects in all cases. Any double redirects will soon show up at Special:DoubleRedirects and I'll fix them. - -sche (discuss) 21:43, 6 October 2015 (UTC)

Reimagining WMF grants report

Last month, we asked for community feedback on a proposal to change the structure of WMF grant programs. Thanks to the 200+ people who participated! A report on what we learned and changed based on this consultation is now available.

Come read about the findings and next steps as WMF’s Community Resources team begins to implement changes based on your feedback. Your questions and comments are welcome on the outcomes discussion page.

Take care, I JethroBT (WMF) 17:02, 28 September 2015 (UTC)

On three-part compounds

Ancient Greek words have a tendency to form compounds that are not directly derivable from their component words—not in the same way English compounds are.

There is also an analog in English: words of the form X-Yed, like quick-witted. However, the treatment of these words with respect to etymology is not entirely consistent, and the most common approach is apparently to create a separate lemma, e.g. witted, called an adjective but described as a suffix. That may be the most accurate way to describe such a morpheme (I'm not sure), although I do think it should at least include a hyphen. With respect to Ancient Greek: after making this list and pondering my findings, I have come to the conclusion that the best practice is to mark the etymology with {{compound|stem|stem|suffix}}, or {{compound|stem|stem}} when the "suffix" is really just the thematic ending (i.e. type 1 above) or no ending (type 1b). Verbs that change grade should probably get separate pages, e.g. -λογος (although the POS of this is uncertain; perhaps "adjective" is best?). Harder is the zero-grade; perhaps entries such as -τραφής?

Comments? —ObsequiousNewt (εἴρηκα|πεποίηκα) 19:30, 28 September 2015 (UTC)

Why not use {{affix|en|quick|wit|-ed}}? —CodeCat 20:27, 28 September 2015 (UTC)
Oh, hey, that is better than {{compound}}, isn't it. Great. Do you have any other thoughts regarding this? —ObsequiousNewt (εἴρηκα|πεποίηκα) 02:56, 29 September 2015 (UTC)
@User:ObsequiousNewt: It depends on whether you want to emphasize the compounding process or the suffixing process; "quick-witted" contains both processes, and compounding seems more marked to me in this term. And then, you need to know whether you want to have the terms categorized as compounds. I don't see anything wrong about "{{compound|stem|stem|suffix}}", really. --Dan Polansky (talk) 16:07, 3 October 2015 (UTC)
{{compound}} won't categorise in Category:English words suffixed with -ed, while the entry should be located there. —CodeCat 17:34, 3 October 2015 (UTC)
{{compound|lang=en|quick|wit}}{{suffix||-ed|lang=en}} will, though. —Aɴɢʀ (talk) 17:56, 3 October 2015 (UTC)
Which does the same as {{affix|en|quick|wit|-ed}}, but is less logical. —CodeCat 18:00, 3 October 2015 (UTC)
It might make sense to modify {{compound}} to categorize to Category:English words suffixed with -ed when it sees {{compound|quick|wit|-ed|lang=en}}. Then the categorization effect would be the same, and it would only be a matter of deciding whether the markup should emphasize compounding or suffixing. --Dan Polansky (talk) 19:22, 3 October 2015 (UTC)
I'm not sure what you mean by emphasizing. The end result is probably indistinguishable on the page. What you seem to be suggesting is to make {{compound}} work like {{affix}}, minus differences in the parameters. But I already checked in the past whether this would work, and it won't. Not everything beginning with - is a suffix, and not everything ending with - is a prefix. Those cases don't work with {{affix}} either, no, but that template was created anew so there were no issues with backwards compatibility. With {{compound}} on the other hand... —CodeCat 22:46, 3 October 2015 (UTC)
Will {{affix|en|quick|wit|-ed}} put it in Category:English compound words? —Aɴɢʀ (talk) 19:32, 3 October 2015 (UTC)
It will; try it in a dummy entry in preview, without saving. --Dan Polansky (talk) 19:41, 3 October 2015 (UTC)
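
To make the categorization question in this thread concrete, here is a minimal Python sketch of how an affix-style template might assign categories from the hyphens in its parameters. It is not the actual implementation of {{affix}} or {{compound}}: the function names and the prefix category name are assumptions made for illustration, and only the two category names mentioned above come from the discussion. The hyphen test is deliberately naive, which is the limitation CodeCat points out (not everything beginning with - is a suffix).

```python
def classify_part(part):
    """Guess the role of one term from its hyphens alone (a naive heuristic)."""
    if len(part) > 1 and part.startswith("-"):
        return "suffix"
    if len(part) > 1 and part.endswith("-"):
        return "prefix"
    return "base"


def categories(lang_name, parts):
    """Return the categories a hypothetical affix-style template would add."""
    roles = [classify_part(p) for p in parts]
    cats = []
    # Two or more base terms mean the word is (also) a compound.
    if roles.count("base") >= 2:
        cats.append(f"{lang_name} compound words")
    for part, role in zip(parts, roles):
        if role == "suffix":
            cats.append(f"{lang_name} words suffixed with {part}")
        elif role == "prefix":
            # Assumed category name, by analogy with the suffix category.
            cats.append(f"{lang_name} words prefixed with {part}")
    return cats


print(categories("English", ["quick", "wit", "-ed"]))
```

With the parts from the example, the sketch yields both the compound category and the suffix category, which is the combined behaviour the thread attributes to {{affix|en|quick|wit|-ed}}.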

Proper nouns and capitalization across all languages

We have established through many discussions here in the BP that proper nouns are defined by their usage and semantics and not by their capitalization. Most of these discussions, however, focused mainly on English. I am wondering whether we have a consensus that this also applies to all other languages that distinguish capitalized and uncapitalized nouns. I know that many of us believe that we should not distinguish between proper and common nouns at all, but as long as we do, we should do it consistently. Let's do a poll on whether we agree with the following principle: For all languages, usage patterns and semantics should take priority over capitalization and punctuation in determining whether a noun is common or proper. --WikiTiki89 15:32, 29 September 2015 (UTC)


“For all languages, usage patterns and semantics should take priority over capitalization and punctuation in determining whether a noun is common or proper.”

Please state whether you agree or disagree. If you disagree, please explain why and, if possible, include an example of when you believe this should not apply.


  1. Agree. --WikiTiki89 15:32, 29 September 2015 (UTC)
  2. Agree, but my preference is to treat them all as nouns and use categorisation alone to distinguish them. —CodeCat 15:33, 29 September 2015 (UTC)
  3. Agree. The usefulness of distinguishing is lost if capitalization is used as the distinguisher, as there would be no need to look up the part of speech. --Andrew Sheedy (talk) 23:22, 30 September 2015 (UTC)
  4. Agree. Capitalization does not necessarily tell whether something is a proper noun; rather, capitalization habits are to an extent adjusted based on the perception of grammarians of whether something is a proper noun or not. In particular, "Frenchman" is not a proper noun. However, I am not clear what role punctuation might have in something being a proper noun; what language would that pertain to? On another note, I don't think we should necessarily be consistent across languages: If English grammarians deem names of languages to be proper nouns, I am okay with marking them so in English, while marking them as common nouns in Czech as per Czech grammatical tradition. --Dan Polansky (talk) 10:01, 3 October 2015 (UTC)
    Punctuation was just hypothetical; what I had in mind was something like Ancient Egyptian cartouches. --WikiTiki89 13:14, 3 October 2015 (UTC)


This becomes problematic when you get into dead languages. Academic consensus for Latin, Ancient Greek, etc. is to capitalize proper nouns, even though actual writing only had one (upper) case. —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:12, 29 September 2015 (UTC)

I think you misunderstood. What I am saying is not about how we should capitalize nouns, but about how we should classify them. In other words, that we should ignore the capitalization when deciding whether to put ===Noun=== or ===Proper noun=== as the POS, but still follow the established capitalization practices for deciding where to put the entry. --WikiTiki89 17:17, 29 September 2015 (UTC)
Yeah, I see what you mean now. My bad. —ObsequiousNewt (εἴρηκα|πεποίηκα) 17:10, 30 September 2015 (UTC)
  1. Disagree. Semantics are not always easy to determine. Wikitiki89 recently turned the famous sense of перестро́йка ‎(perestrójka, perestroyka) into a proper noun, so he is making a case now.

    There is no convention or precedent of treating lower case words as proper nouns in Russian. Lower case words can never be proper nouns in Russian, regardless of semantics, and I see no need to introduce this. Specific rules and conventions with different languages should not be ignored. I oppose treating differentiation of proper/common nouns the same way we do for English. E.g. language names, ethnicities, month and weekday names are common nouns and are spelled in lower case in Russian but some political terms are proper nouns, made so by the Soviet Communist government.

    I am not too interested in making points on the topic. Suffice that I expressed my opinion on the matter but I may join the discussion later on. --Anatoli T. (обсудить/вклад) 01:28, 1 October 2015 (UTC) (I've reformatted your post slightly, Anatoli.​—msh210 (talk) 17:46, 1 October 2015 (UTC))

  2. Disagree. I have no reason to think that semantics and/or usage is the usual standard by which proper vs. common noun is determined in every language (that has proper and common nouns). Maybe for some languages it is indeed orthography (e.g. capitalization) or something else. I think this should be decided by each language's editors.​—msh210 (talk) 17:44, 1 October 2015 (UTC)


And what are the cross-linguistic usage patterns and semantics that determine whether a noun is proper or common? —Aɴɢʀ (talk) 17:20, 29 September 2015 (UTC)

These might vary by language (I didn't mean to imply that they are universal). The general idea is that proper nouns usually cannot take determiners or adjectives, cannot change their number, and cannot be possessed (unless these features are already part of the lemma, or unless the proper noun is being commonized). As far as semantics, they generally refer to one specific thing, rather than a class of things. --WikiTiki89 17:35, 29 September 2015 (UTC)
We should be consistent across languages, though. If day or month names are common nouns in some languages, we should treat them that way in all languages. This, by the way, is one huge reason for treating proper nouns and common nouns the same on Wiktionary: the distinction is semantic, and can be determined by analysing the referent, so the result is the same for all words with the same referent in all languages. They are a subset of nouns, not separate from nouns. —CodeCat 23:39, 30 September 2015 (UTC)
That's not necessarily true though. Some languages might handle certain concepts with common nouns that other languages handle with proper nouns. In most cases, you would be right, but there will be exceptions. --WikiTiki89 01:11, 1 October 2015 (UTC)
That seems kind of unlikely to me. Can you give an example? —CodeCat 17:25, 1 October 2015 (UTC)
For example, Navajo bilagáana tʼáá biʼałkʼiijééʼ (American Civil War). The Navajo term is not a proper noun, as it translates literally to "when the Americans fought each other". —Stephen (Talk) 01:55, 2 October 2015 (UTC)
  • Also, what English calls "Boxing Day" (proper noun) is in many languages just "second day of Christmas" (a specific common noun - one of the days of Christmas). Smurrayinchester (talk) 09:47, 2 October 2015 (UTC)
You aren't seriously suggesting applying the same POS rules across languages? In some languages, adjectives are verbs. In others, they're nouns. In still others, they're neither. I'm sure there are similar discontinuities in assignment of terms to the proper-noun POS. Chuck Entz (talk) 02:31, 1 October 2015 (UTC)
  • Many Australian languages differentiate proper nouns from common nouns by suffixes, which gives us a linguistic way of telling common and proper nouns apart. The Western Desert languages treat ŋana (determiner "who") as a proper noun (ref), so do we make it a proper noun across all languages? Smurrayinchester (talk) 14:58, 1 October 2015 (UTC)
    • "Who" is proper in English too. It refers to the specific person whose identity you are asking about. It also fits all of the criteria WikiTiki noted above: it can't take determiners or adjectives, can't change number, and can't be possessed. —CodeCat 17:27, 1 October 2015 (UTC)
      • But it's a pronoun, not a noun. Are there "proper pronouns" to be distinguished by a new header? Equinox 01:32, 2 October 2015 (UTC)
        • All pronouns are proper; I don't think there are common-noun-like pronouns. --WikiTiki89 15:32, 6 October 2015 (UTC)
          • What about indefinite pronouns like someone, anyone, whoever, etc.? They don't refer to a specific person. In many languages (e.g. Italian and Portuguese) possessive pronouns can take determiners (il mio padre / o meu pai), and as for pronouns taking adjectives, what about "poor me" and "lucky you"? —Aɴɢʀ (talk) 19:54, 6 October 2015 (UTC)
            • Good point. I realized that right after I posted that. I think indefinite pronouns would be a separate category altogether, still unlike common nouns (and even they can sometimes be "commonized" into a common noun: e.g. "this particular someone"). "Possessive pronouns" are determiners and are not really pronouns, much like English possessives such as "John's" are determiners and are really no longer proper nouns. And in case anyone was going to mention "o João", this is one of the reasons I said "usually" and not "always". --WikiTiki89 20:43, 6 October 2015 (UTC)
  • Can you give an example of a proper noun that meets all these criteria? I'm struggling to think of one. Even terms that are universally agreed upon as proper nouns violate these rules in at least some edge cases. Germany: "the Germany of my youth", "beautiful Germany", "the two Germanies", "Merkel's Germany". John: "the John I knew and loved", "good old John", "Which one of the Johns do you mean?", "my darling John". White House: "the White House", "the rebuilt White House", "they wanted their own White Houses", "Hoban's White House". Allah: "a vengeful Allah", "the groups worship totally different Allahs", "thy Allah and the Allah of thy fathers". Smurrayinchester (talk) 13:30, 6 October 2015 (UTC)
    • You seem to have missed my parenthetical remark "unless the proper noun is being commonized". Practically any proper noun (in English at least) can be turned into a common noun with a slightly different meaning. --WikiTiki89 15:32, 6 October 2015 (UTC)
I'm often not sure whether to make something a common or proper noun, e.g. medical syndromes. Can't think of examples right now, but suppose there's a "small face syndrome": it refers to one specific thing and has no plural; it also isn't uncountable (because it's "the syndrome", not "some syndrome"); I would probably make it a proper noun but it feels odd since it's somehow nothing like Paris. Equinox 14:21, 1 October 2015 (UTC)
I feel that this is better decided on a language basis, by each of their contributor communities individually. That said, I support this as a default guideline. — Ungoliant (falai) 14:36, 1 October 2015 (UTC)
If this is such a hazy question, then why do we need to determine the proper/common status of all nouns anyway? For a user interested in onomastics specifically, the unambiguous cases like placenames or personal names will be clearly identifiable in any case. --Tropylium (talk) 20:10, 1 October 2015 (UTC)

Should we display the active votes in the watchlist?

I moved the vote list from Wiktionary:Votes to a separate template that can be used anywhere as a reminder of the votes to participate.

Do you think it would be a good idea to display this box in the watchlist of all users to increase awareness of votes?

I believe this could be accomplished by editing MediaWiki:Watchlist-details. Maybe we should have some way for each user to opt out of displaying the vote box in the watchlist, but I'm not sure how to do that.

Previous discussion:

--Daniel Carrero (talk) 17:32, 29 September 2015 (UTC)

Support. I always miss votes that are not explicitly advertised in the BP. --WikiTiki89 17:38, 29 September 2015 (UTC)
Before, the way not to miss votes was to watchlist Wiktionary:Votes. Now, you have to watchlist Template:votes. —Aɴɢʀ (talk) 18:24, 29 September 2015 (UTC)
I do watch WT:Votes and I still manage to miss them all. --WikiTiki89 18:28, 29 September 2015 (UTC)
@Angr @Wikitiki89 My idea was placing that box in the watchlist of all users, so the list of votes itself should appear whether you watchlist Template:votes or not. But, anyway, if that idea proves unpopular, I'll delete the template and restore WT:Votes to the previous version. --Daniel Carrero (talk) 00:05, 1 October 2015 (UTC)
Also, I think the actual list of votes should not be located in the template namespace. We should move it to something like Wiktionary:Votes/Active and have {{votes}} transclude it. --WikiTiki89 19:57, 29 September 2015 (UTC)
Since no one responded to this suggestion, I just went ahead with it. --WikiTiki89 19:04, 1 October 2015 (UTC)
I don't mind making the votes more visible, but I don't like the idea of pushing the actual watchlist even further down the screen; already the "preamble" takes up so much space that only a few lines of the actual watchlist make it onto the first screen on my computer. Is there a way of making the version that gets displayed in the watchlist smaller, maybe a horizontal list separated by mid dots? Alternatively, what if we adopted the same idea as the Beer Parlor month-subpages, and had a single page that new votes were moved through (I mean using the "move" function), so that each vote page itself was added to the watchlist of everyone who watched that central page, and each vote would thus show up in everyone's watchlist every time it was edited, the same way that new BP month subpages show up in your watchlist without you ever doing anything to watchlist them? (Contrast how, currently, if you watchlist WT:VOTE, you only see the one edit when a new vote is listed on that page; you aren't updated when the vote starts, and when people vote, unless you watch each individual vote page.) - -sche (discuss) 21:01, 29 September 2015 (UTC)
  1. "I don't mind making the votes more visible, but I don't like the idea of pushing the actual watchlist even further down the screen"
    • Since that box floats to the right, maybe the box could stay side-by-side with the watchlist instead of pushing it down further?
  2. "Is there a way of making the version that gets displayed in the watchlist smaller, maybe a horizontal list separated by mid dots?"
    • Yes, that could be done and it's not very difficult to do.
  3. "Alternatively, what if we adopted the same idea as the Beer Parlor month-subpages, and had a single page that new votes were moved through (I mean using the "move" function), so that each vote page itself was added to the watchlist of everyone who watched that central page, and each vote would thus show up in everyone's watchlist every time it was edited, the same way that new BP month subpages show up in your watchlist without you ever doing anything to watchlist them? (Contrast how, currently, if you watchlist WT:VOTE, you only see the one edit when a new vote is listed on that page; you aren't updated when the vote starts, and when people vote, unless you watch each individual vote page.)"
    • That sounds great. But I'm not sure how I would do that; can it be done, in the first place?
--Daniel Carrero (talk) 00:05, 1 October 2015 (UTC)

Update: I edited MediaWiki:Watchlist-details to make the vote box currently appear in the watchlist of all users, as proposed by me above. I request feedback on this. Does it look good? Edit Template:votes/layout to change the appearance if needed. --Daniel Carrero (talk) 00:05, 1 October 2015 (UTC)

I think the list should be sorted by end date, and the end date should be shown as well. —CodeCat 00:20, 1 October 2015 (UTC)
Support. --Daniel Carrero (talk) 04:38, 1 October 2015 (UTC)
Good idea! I recently proposed instituting a system for notifying users of votes, and I think this would do the job well. -Cloudcuckoolander (talk) 00:37, 1 October 2015 (UTC)
  • The bad consequence of this is that, where before I only had to edit one page to move votes to the bottom, now I need to edit two. I don't really like the change. OTOH, if the change makes it easier for people to watch active votes, that's fine. I still think that regularly glancing at WT:VOTE with its less than 10 items at each point of time is an easy way to not miss any votes. --Dan Polansky (talk) 10:07, 3 October 2015 (UTC)

I initially disliked the box, considering it clutter. Knowing the human tendency for habituation, I left it a couple of days before commenting. I'm fine with it now, and I think it's a good idea. I support this addition. — I.S.M.E.T.A. 13:30, 3 October 2015 (UTC)

I don't mind it being there, but conceptually the watchlist seems a slightly irrelevant place for it. Equinox 14:41, 3 October 2015 (UTC)
I agree with Equinox. There is no good conceptual logic to placing it on the watchlist page, but it does place one of our wiki-citizenship duties squarely in front of us better than any page I can imagine. DCDuring TALK 18:37, 3 October 2015 (UTC)
It makes about as much sense as Wanted entries being on the Watchlist. --WikiTiki89 15:23, 5 October 2015 (UTC)

Update: Per CodeCat's suggestion, I edited the list to sort votes by end date and show the end date as well. --Daniel Carrero (talk) 00:07, 7 October 2015 (UTC)
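
For illustration only, the two layout ideas raised in this thread (sorting the votes by end date and showing them as a compact horizontal list separated by mid dots) can be sketched in a few lines of Python. This is not how Template:votes is actually written; the function name, the vote titles and the dates below are placeholders invented for the example.

```python
from datetime import date


def render_votes(votes):
    """Render (title, end_date) pairs as one line, earliest end date first."""
    ordered = sorted(votes, key=lambda vote: vote[1])
    # Join with a middle dot so the box takes one line instead of a tall list.
    return " · ".join(
        f"{title} (ends {end.isoformat()})" for title, end in ordered
    )


# Placeholder data: titles and dates are invented for the example.
example_votes = [
    ("Example vote B", date(2015, 10, 20)),
    ("Example vote A", date(2015, 10, 5)),
]
print(render_votes(example_votes))
```

A single-line rendering like this would address the concern about pushing the actual watchlist further down the screen, at the cost of showing less detail per vote.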

FYI: Vote on namespace for reconstructed terms

Wiktionary:Votes/2015-09/Creating a namespace for reconstructed terms --WikiTiki89 20:46, 29 September 2015 (UTC)

Vote: Installing DynamicPageListEngine

See: Wiktionary:Votes/2015-09/Installing DynamicPageListEngine. --Daniel Carrero (talk) 21:48, 29 September 2015 (UTC)

October 2015

Templates for place names

I created a few templates for place names; they should be able to generate standardized definitions for them in all languages. I've been using these templates for entries of places in Brazil only, but this system should be usable for other countries by copying and adapting the existing templates.

I chose the format of "municipalities of São Paulo, Brazil" to copy Category:en:Municipalities of China and others. (see Place names and Earth modules for a complete list of place name categories; I should mention I've found some different, inconsistent naming formats that could hopefully be fixed eventually)

The templates I created:

Main template:

Known issue:

  • These templates generate simple standardized definitions like "A municipality in São Paulo, Brazil." They still lack the functionality of linking from states to state capitals, and vice-versa. I plan on implementing that feature soon.

Thoughts? --Daniel Carrero (talk) 13:14, 1 October 2015 (UTC)

I was expecting something more like {{surname}} and {{given name}}. —CodeCat 13:22, 1 October 2015 (UTC)
How so? --Daniel Carrero (talk) 13:26, 1 October 2015 (UTC)
Like {{municipality|São Paulo|Brazil|lang=pt}}. —CodeCat 20:05, 1 October 2015 (UTC)
I did start to develop something like this at User:Daniel Carrero/place (while it is just a stub, it did work perfectly in this revision, see the code). But, in my opinion, I'd rather use hard-wired full definitions (no matter if they are stored in MW code or Lua) like "municipality of São Paulo, Brazil" for this reason: if we use parts of definitions and allow people to use these parts in any way they see fit, then it would be potentially impossible to make the whole system consistent (judging by all the current un-templatized entries for place names, which are inconsistent in various levels, from the act of categorizing or not some entries, to the internal logic of the category naming system itself!):
  1. There would be definitions like {{city|Florianópolis|Brazil}} (basically all second-level subdivisions in Brazil are "municipalities", presumably that's why multiple Wikipedias and Wiktionaries use "municipality" categories; yet I found many of those to be randomly defined as "cities" or "towns", which could mess up the categorization and definitions).
  2. There would be too much freedom to change levels, like {{municipality|São Paulo|Southeastern Region|Brazil}} with a "Southeastern Region" in the middle.
  3. And also just {{municipality|Florianópolis|Brazil}} does not account for the fact that Florianópolis is a state capital unless you add another parameter.
I don't mind changing the system, but since the current system restricts each of the full definitions and associates them with categories individually, it does not have any of the aforementioned limitations, so I would like any other proposed system to be safe from all these problems as well. --Daniel Carrero (talk) 20:32, 1 October 2015 (UTC)
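
The "hard-wired full definitions" approach described above can be sketched roughly as follows, in Python rather than template code. This is only an illustration of the design idea, not the actual templates: the lookup key, the dictionary layout and the function name are assumptions, while the definition wording and the category pattern follow the examples quoted in this discussion.

```python
# Each allowed definition is stored whole, together with its category,
# so editors cannot combine parts (city/town, extra region levels) freely.
PLACE_DEFINITIONS = {
    # Hypothetical lookup key for this sketch.
    "Brazil/municipality": {
        "definition": "A municipality in {state}, Brazil.",
        "category": "Municipalities of {state}, Brazil",
    },
}


def define_place(key, lang, state):
    """Return (definition, category) for a hard-wired place type."""
    if key not in PLACE_DEFINITIONS:
        # Unknown combinations are rejected instead of improvised.
        raise KeyError(f"no hard-wired definition for {key!r}")
    entry = PLACE_DEFINITIONS[key]
    definition = entry["definition"].format(state=state)
    category = f"{lang}:" + entry["category"].format(state=state)
    return definition, category


print(define_place("Brazil/municipality", "pt", "Minas Gerais"))
```

Because the definition and its category live in the same table entry, they cannot drift apart, which is the consistency property being argued for here. The trade-off is that every new place type needs a new entry.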

Update: Rather than adding new parameters to the previous templates, I created different templates for capitals because I needed more specific definitions and categories (they use 2 categories each: state capitals of Brazil; and municipalities of each state). I've thought of something along the lines of {{place:capital of São Paulo, Brazil}} or {{place:São Paulo (capital)}}, but that could change. They work well. The only problem I fear is having too many different templates, though that seems manageable. Naturally, I am open to different suggestions, such as using Lua or fewer templates with more parameters, but first I have one thing to say: The current system is simple and intuitive enough (at least, that's my opinion) and very customizable in case other countries have different needs (like provinces instead of states, or more or fewer comma-separated levels). For this reason, I'd suggest waiting some time before attempting to merge the current templates into any more condensed model, because that might not work for all countries. In the meantime, I plan to continue using these templates for new entries. As usual, I welcome feedback from other people, too. --Daniel Carrero (talk) 19:14, 1 October 2015 (UTC)

Update: Using fewer templates for Brazil: I'm deleting state-specific and (oh God.) city-specific templates in favor of {{place:Brazil/municipality}} and {{place:Brazil/state capital}} (the last one I'm going to create in a moment.) --Daniel Carrero (talk) 08:19, 2 October 2015 (UTC)

Should all foreign-language place names have counterparts in English?

I've created most of the 853 entries for municipalities (a.k.a., cities/towns) of Minas Gerais (a state of Brazil) in Portuguese. See Category:pt:Municipalities of Minas Gerais, Brazil. I am thinking of doing the same to fill the English category completely as well. See Category:en:Municipalities of Minas Gerais, Brazil.

Should all place names in foreign languages have counterparts in English? Surely many of those (or all of those?) are citable anyway. I've done a cursory search for small Brazilian towns on Google Books and so far found all of them to be citable in English. Random example: Comercinho is citable in this book. What do you think? --Daniel Carrero (talk) 08:33, 2 October 2015 (UTC)

Might as well. I doubt anyone would stop you. --Zo3rWer (talk) 12:58, 2 October 2015 (UTC)
How can you be so sure that every single placename in every language has an English translation? Making all FL placenames link to English by default was not a good idea. — Ungoliant (falai) 15:42, 3 October 2015 (UTC)
Of course they have an English name. How else would English speakers refer to it? —CodeCat 15:47, 3 October 2015 (UTC)
I doubt that some little village up in a Chinese mountain has an English name. It will certainly have an English transliteration - but we don't accept those. SemperBlotto (talk) 16:04, 3 October 2015 (UTC)
Attestation in running English text is a prerequisite for an English entry, as per CFI. --Dan Polansky (talk) 16:13, 3 October 2015 (UTC)
Place names are not words, they're designations. If a person comes along and introduces themselves as Katherine, then the other party must use that name to refer to her. And that's true cross-linguistically; speakers of all languages must use that name, because that's what she said her name is. The same would apply to a random village in China. It can be assumed that foreign speakers will adopt the name that locals use, because that is the name of the village. Of course, there's exonyms, but that's a different story: with exonyms, there is a name, but certain speakers decide to use another. When there is no known name, it must be assumed that there is still at least one name. —CodeCat 17:33, 3 October 2015 (UTC)
That sounds almost like an argument to make such entries Translingual rather than pin them down to a specific language. —Aɴɢʀ (talk) 17:54, 3 October 2015 (UTC)
Maybe that's a good idea. —CodeCat 17:59, 3 October 2015 (UTC)
Names as "translingual" would seem to make it pretty difficult to deal with even things like Москва ‎(Moskva)/Moscow, let alone more divergent cases like Köln/Cologne. And in case you're only suggesting this approach for placenames that only exist in one language: suppose that we discover that a tiny Chinese village does have a distinct name in a minority language spoken nearby; would this turn it from translingual back into (Mandarin) Chinese? I'd say treat placenames as particular to the local language, and attestations in other languages as citation loans, unless there's some kind of evidence to the contrary (e.g. for a pronunciation or spelling particular to English). --Tropylium (talk) 18:17, 3 October 2015 (UTC)
Single-word place names (London, not New York) either are words per me and John Stuart Mill or in any case they behave like words: they get written down using alphabet, get pronounced, inflected and have an etymology. Having them as translingual very often does not work since they get language-specific inflection. But even if place names somehow were not "words", they still get included and regulated by CFI, and the criterion of attestation applies to them. We even had this vote Wiktionary:Votes/pl-2010-05/Placenames with linguistic information 2, so I do not think I am in a minority to think they need to be attested. --Dan Polansky (talk) 19:08, 3 October 2015 (UTC)
Then does this mean that many of the articles for places that exist in English Wikipedia actually have titles and describe places in running text, when their names are not words usable in English according to our own standards? I find that a bit bizarre. Let's take w:Tegal Buleud as a random obscure place. The article uses the name of the place twice in running text. Is "Tegal Buleud" not English if it's used in this way? If not, then what would make it English? —CodeCat 22:55, 3 October 2015 (UTC)
It's not that "their names are not words usable in English", it's that their names are not attestable in English by our standards. --WikiTiki89 15:54, 5 October 2015 (UTC)
I understand that, but then this does mean that our mission statement is false. We don't include all words, just the attestable ones. —CodeCat 16:18, 5 October 2015 (UTC)
And that's always been the case. --WikiTiki89 16:23, 5 October 2015 (UTC)
What I do (with Italian placenames) is generate an Italian entry then, if I can find an English translation, I add an English entry for the translation. If I can't find a translation, I have been known to add an English section if it is a well-known place. SemperBlotto (talk) 18:23, 3 October 2015 (UTC)
For Spanish places, I usually add an English and a Spanish, but sometimes just a Spanish or just an English. This is pure laziness on my part, but I suppose in theory we could have the same entry in loads of languages. I know that the French Wiktionnaire often does this - see [14] as an example of (excessive?) repetition. --Zo3rWer (talk) 10:31, 4 October 2015 (UTC)
Place names are designations, but these designations are words. There should be a section for a language only if the word is used in the language: attestations are required. These sections may be very useful, especially for pronunciation (have you ever heard of another dictionary with a pronunciation given for place names from all over the world?), homophones, examples/citations, usage notes, derived words, gentilics, anagrams, etc. In the example given above, most sections are repetition, sure, but these sections will be completed with time. Lmaltier (talk) 20:20, 5 October 2015 (UTC)
  • I don't see any reason to forbid creating English entries for foreign-language place names. I also don't see a need to rush out and create them all immediately. People can create them from time to time, though. Purplebackpack89 22:12, 5 October 2015 (UTC)
    Yes, especially when they see them used in the language. It's the same for Italian place names used in Spanish, etc. Lmaltier (talk) 17:58, 6 October 2015 (UTC)

Place name format: English (non-gloss) vs. foreign language (translation + gloss)

As I said in the discussions above, I created a few placename templates. I'd like to discuss the results as they appear in the entries. See the entry Ouro Preto (a municipality in Brazil). It is currently defined in 3 languages using the same template ({{place:Brazil/municipality}}). I checked on Google Books; it's attestable in all three.

If the language is English, the definition is formatted as a main (non-gloss) definition, and if it's a foreign language, it's formatted as an English translation (linking back to the English entry or section) + {{gloss}}. It looks good IMO in the entry I linked, and it's a consistent system overall, but it also causes a problem: the translation still points back to the English section even if there's no English section to begin with, as in the entry Comercinho. (Which, in my opinion, is a bad thing that should be fixed some way or other, but it's not extremely harmful. Ultimately, it's just a pointless link back to the same page.) I was just kind of expecting an English section to be present most of the time, as happens in entries like Laredo and Colorado (each of these entries has definitions for places in multiple countries, in the English section).

Note that the template uses the same syntax for all languages (only the language code changes), so it's supposed to make copypasting between languages easy. If you wanted to add a French section, you would just use the same code with "fr".

# {{place:Brazil/municipality|en|state=Minas Gerais}}
# {{place:Brazil/municipality|pt|state=Minas Gerais}}
# {{place:Brazil/municipality|es|state=Minas Gerais}}

This is one reason why I asked directly above "Should all foreign-language place names have counterparts in English?". If we could simply add English entries/sections for all placenames, the problem would be solved. But for cases where there's no language section in English, what should the template do? Should the template allow for non-gloss formatting (first letter capitalized and the period at the end) in foreign language entries? Can we use the Translingual section some way for place names? Most likely, any new functionality would be controlled by new parameters to the templates, so the system would become a bit more complex than it is now.

My favorite proposal is this:

  • For foreign language entries without an English translation, make the template keep the gloss format but without linking the main word [like this: Ouro Preto (municipality in the state of Minas Gerais, Brazil)]. Rationale: Consistent formatting in all entries, and when you translate Ouro Preto into English, you would use the original language name ("Ouro Preto"), even in cases where it's not attestable in permanently recorded media in English.

I plan to continue using the current system for the new entries I am creating. Edit {{meta-place}} if needed, to change how it works. --Daniel Carrero (talk) 08:21, 5 October 2015 (UTC)
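For illustration only, the formatting rule described above (non-gloss for English, translation + gloss for other languages, and the proposed unlinked fallback when no English section exists) could be sketched like this. This is not the code of {{meta-place}}; the function name, its parameters, and the exact output strings are assumptions made up for the example.

```python
def format_place_definition(lang, name, place_type, state, country, has_english_section):
    """Return a definition line for a place-name entry (illustrative wikitext only)."""
    description = f"{place_type} in the state of {state}, {country}"
    if lang == "en":
        # English: main (non-gloss) definition, first letter capitalized, final period.
        return f"{description[0].upper()}{description[1:]}."
    # Foreign language: English translation followed by a gloss.
    if has_english_section:
        translation = f"[[{name}#English|{name}]]"
    else:
        # Proposed behaviour: keep the gloss format but leave the name unlinked,
        # so the entry does not link back to a section that does not exist.
        translation = name
    return f"{translation} ({description})"


print(format_place_definition("pt", "Comercinho", "municipality", "Minas Gerais", "Brazil", False))
```

Under these assumptions, an entry such as Comercinho would keep the same look as Ouro Preto, only without the self-link.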

Proposal: Extinct unwritten languages should not qualify for inclusion

WT:CFI includes some interesting clauses for the inclusion of terms from languages that are not "well-documented on the internet". Perhaps the most interesting is that entries may be created even on the basis of a single mention, without any attestation or evidence of attestability required.

Let's think a little bit about what we are even doing here. I would not say this criterion is always unreasonable; but obviously it does not exist just as a backdoor to document words even when they are not attestable per the usual standards, since we do not allow this method for creating entries for rare words in well-documented languages.

The impression I get — though this does not appear to be written out anywhere! — is that there's an underlying idea that some languages mainly exist elsewhere than on the Internet: e.g. as old literary languages that are not spoken anymore, or mainly as spoken languages which so far do not have much written material available. And so we assume that if a word has been documented in a scholarly source or the like, it could in principle be also easily attested once the speakers of Pohnpeian get around to hanging online in greater amounts, or once people have uploaded enough Karakhanid Turkic materials on Wikisource, and so forth. But as long as this is not the case, asking editors to provide attestations would be simply adding to a backlog.

"Mention implies potential attestation" however fails to hold for some languages. I propose that to qualify for inclusion in the main namespace, a language must have at least one of the following:

  1. A surviving written tradition.
  2. Continuing existence (≈ the potential for a written tradition to be established in the future).

This is relatively lenient still. If we consider epigraphic attestation a written tradition (but see below), languages like Oscan would continue to qualify under criterion #1 — alongside any more abundantly attested extinct languages, say Hittite, Old Tupi or Ubykh. In the absence of other updates to CFI, any individual words in these languages would also continue to qualify on the basis of mentions alone.

What this serves to exclude are languages like Crimean Gothic, Pumpokol or Tasmanian. Any material that is known of languages of this sort generally only exists in linguistic sources, in all but exceptional cases comes with glosses attached, and thus seems to fall clearly short of Wiktionary's general rule of inclusion, as stated at WT:CFI:

A term should be included if it's likely that someone would run across it and want to know what it means.

That is: it seems to me like entries in dead-and-buried languages do not exist to fulfill this need. They exist solely for the sake of linguistic curiosity. No one randomly runs across text in Crimean Gothic, and is left wondering what it might mean.

Linguistic curiosity is of course still a need, and Wiktionary is doing a good job at answering it, I believe. Some people might indeed wonder "so how does one count to ten in Crimean Gothic", or "has this Ket word of alleged Yeniseian ancestry been even recorded from the related languages" and want to look it up. Hence I am not proposing flat-out deleting what we currently have in languages that fail the current-or-potential-written-tradition test. A better solution probably would be the inclusion of recorded data from any natural language variety in the Appendix namespace. (Or, perhaps, the creation of a new Extinct namespace?)

You might ask what difference does it make to switch from regular entries to an appendix, other than make the terms not come up by default in search. One thing is that this would also seem to be grounds to diverge from the usual layout requirements. For example:

  • If a language's known corpus is something like twenty or fifty words, we can put them in a single appendix rather than sprinkle them across several stub entries.
    • Individual quotations could in such cases be replaced with a single references section.
  • If a language has only been recorded in phonetic transcription, we could accordingly provide only the pronunciation/transcription, and avoid implying that an orthography exists.
    • If competing transcription schemes exist, we could standardize one of them and cover the others by means of an equivalence table, rather than creating duplicate entries.
  • Missing information such as parts of speech could be left unknown.
  • If glosses are only available in a language other than English, and the precise meaning cannot be verified, we might leave the glosses in the original language and not risk mistranslating things.

--Tropylium (talk) 01:08, 6 October 2015 (UTC)

I support this proposal, if I understand that it essentially means "Mentions may only be used for attestation, if the language's corpus as a whole has actual attestations of other words. Languages whose entire corpus is mentions do not qualify for inclusion." —CodeCat 09:27, 6 October 2015 (UTC)
It's still more lenient than that: "Entries may be based on mentions only if the language's corpus includes actual attestations, or, if the language is not extinct." --Tropylium (talk) 16:35, 6 October 2015 (UTC)


The above argumentation could additionally be extended to exclude from mainspace also two further types of languages:

  • Extinct languages whose known corpus is highly limited and does not allow directly establishing the language's grammar, pronunciation, the meaning of its words, etc. Often information of this type can be determined via the comparative method (a particularly good example might be Proto-Norse) — but it would seem to me that this is not too much different from entirely unattested reconstructed languages.
  • Moribund languages, for which all available material is linguistic documentation and no revitalization efforts exist. For such languages, we have effectively no foreseeable hope of ever gaining anything resembling a written standard that could be documented according to regular attestation criteria. An example might be Ter Sami.

I would however like to go on record as strongly in favor of continuing to include endangered languages for which even marginal natural transmission can be suspected to remain (e.g. Ishkashimi), or even elementary attempts at formulating a written standard are underway (e.g. Votic). Hence any exclusion of languages by these criteria should be probably "opt-in", i.e. with the burden of proof on the side claiming that a language is indeed poorly-documented enough to not merit inclusion. --Tropylium (talk) 01:08, 6 October 2015 (UTC)

I strongly oppose this. Your "impression" (i.e. assumption) recorded above is inaccurate; we have more lenient criteria for such languages because we would otherwise have no way of documenting them (the use-mention distinction is merely meant as a tool for us to determine whether a word is really used). There is no point to separating off certain natural languages thus, and it would be counterintuitive to users. —Μετάknowledgediscuss/deeds 01:25, 6 October 2015 (UTC)
An interesting interpretation. (And in the absence of explicit policy support, I could ask whether it is also merely an assumption.) It seems to carry fairly strong implications, though.
  • If Wiktionary's purpose is to document any use of language whatsoever, on an equal level;
  • if this includes extinct languages just as well as living ones;
  • and if we're not tied to direct attestation, but are to also allow documentation thru indirect inference such as mentions
— then this would actually seem to require that we must also document proto-languages in mainspace, to an extent! The most reliably "entirely reconstructed" words are at least on an equally probable footing as are words in otherwise attested languages that are presumed to have existed based on hapax attestations by non-native speakers, reconstructed semantics, etc.
You might also want to note that the proposals above are not in opposition to the documentation of anything at all, only on what should be presented as regular dictionary entries. Contrary to its name, WT:CFI is not actually the criteria for inclusion on the Wiktionary servers period, but merely the criteria for inclusion in mainspace.
(Also, should I assume that this reply is meant also in opposition to the main proposal, not just to the two subproposals?)
--Tropylium (talk) 16:31, 6 October 2015 (UTC)
I oppose this as well. Attested terms should be included, period. —CodeCat 09:27, 6 October 2015 (UTC)
Fair enough, though this counterargument appears to cover only languages of the Proto-Norse type. With languages of the Ter Sami type, no attestations exist whatsoever. --Tropylium (talk) 16:31, 6 October 2015 (UTC)
What exists for Ter Sami then? —CodeCat 16:58, 6 October 2015 (UTC)
The only major source is a comparative dialect dictionary of Kola Sami (the majority of whose materials are Kildin Sami) rendered in phonetic transcription, i.e. a huge bunch of mentions. Accordingly, what entries we have essentially have been created only as a pronunciation. Consider e.g. лa̭i̭ja ≈ IPA [ɫʌi̝jɑ]. (I doubt if anyone with native knowledge of Ter Sami would recognize either of these written forms, if presented with it.)
It's not only moribund languages that suffer from this problem, though. Tons of endangered languages only have field research materials available so far. You may recall a BP discussion about a new "Languages without a Written Tradition" in February. Hence I draw a difference here not between written and unwritten languages, but on whether Wiktionary inclusion could be expected to actually benefit the speaker community, and whether we could expect native speakers to contribute at some point. --Tropylium (talk) 19:39, 6 October 2015 (UTC)
  • Query: How would this proposal affect proto-languages? Do you suggest that we remove all of our proto-language appendix entries? ‑‑ Eiríkr Útlendi │Tala við mig 08:00, 6 October 2015 (UTC)
    None of this has any effect on reconstructed proto-languages, which are already excluded from mainspace inclusion. --Tropylium (talk) 16:31, 6 October 2015 (UTC)

Away until mid-October.

I will be away until mid-October. Please try to have this project completed by the time I return. Cheers! bd2412 T 15:28, 7 October 2015 (UTC)

You're not my supervisor! WurdSnatcher (talk)
The "get it completed" is a traditional Wiktionary joke that gets funnier every time it is reused. HTH. Equinox 01:29, 8 October 2015 (UTC)
Surely there's not much to do, right? How complete is Wiktionary right now? 90%? 95%? --Daniel Carrero (talk) 01:39, 8 October 2015 (UTC)
Not even 1%. — Ungoliant (falai) 01:42, 8 October 2015 (UTC)
Blasphemy! Everyone knows that Wiktionary is pretty much complete already. Any word we don't have is probably not worth the trouble anyway. --Daniel Carrero (talk) 01:51, 8 October 2015 (UTC)
You're not my supervisor! HTH --Catsidhe (verba, facta) 01:41, 8 October 2015 (UTC)
What's the etymology of the joke? --Dixtosa (talk) 14:34, 11 October 2015 (UTC)

I am stunned to find that in the entire week that I was absent, the collection of all words in all languages was not completed. bd2412 T 01:06, 14 October 2015 (UTC)

I ate them all. --Romanophile (contributions) 01:08, 14 October 2015 (UTC)
That must have given you a sour stomach. bd2412 T 13:21, 14 October 2015 (UTC)

The vote on adding a collocations or phrases namespace or section

Wiktionary:Votes/2015-09/Adding a collocations or phrases namespace or section has opened. The vote was prompted by Wiktionary:Beer parlour/2015/August#Adding_a_collocations_tab_or_section. - -sche (discuss) 17:10, 8 October 2015 (UTC)

WT:NORM and multiple spaces, tabs and indentation

Currently, the first rule says no leading or trailing spaces, while rule 4 says no leading or trailing space in templates. I came across many entries with leading spaces, such as apple. Many of them occurred in uses of the {{quote-book}} template, though presumably it may occur in any template call. apple breaks both rule 1 and rule 4, since line breaks are also leading/trailing space in a template, and furthermore the parameter names have whitespace around them. This whitespace could simply be removed, but is this desirable?

I have also come across many pages that include tabs; some of them at the start or end of a line, others in the middle of a line. The question is the same, what should be done to get rid of them? They could be replaced with a set amount of spaces, but multiple spaces are equivalent to a single space in Wikitext, so this seems a bit pointless. This makes me wonder if there needs to be a rule that says no multiple spaces in a row. Then tabs could just be replaced by a space. What do you think? —CodeCat 19:48, 8 October 2015 (UTC)
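To make the question concrete, here is a minimal sketch of what such a clean-up could look like if tabs were turned into spaces and runs of spaces collapsed. It is not an existing bot or part of WT:NORM; the function names are made up, and the sketch deliberately ignores places where whitespace is meaningful (for example text inside `<pre>` or `<nowiki>`, or a leading space used on purpose to get preformatted text).

```python
import re


def normalize_line_whitespace(line):
    """Replace tabs with spaces, collapse runs of spaces, and trim both ends."""
    line = line.replace("\t", " ")
    line = re.sub(r" {2,}", " ", line)
    return line.strip(" ")


def normalize_page(text):
    """Apply the line-level clean-up to every line of a page's wikitext."""
    return "\n".join(normalize_line_whitespace(line) for line in text.split("\n"))


print(normalize_page("\t# A sweet   fruit.\t\n#: {{ux|en|an  apple}} "))
```

With a "no multiple spaces in a row" rule, the tab case needs no special handling: a tab becomes a space, and the collapse step takes care of the rest.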

I think rule 4 should be amended to allow what apple is doing. I think the intention of rule 4 is to disallow things like {{ en-noun | - }} and [[ apple ]], but for {{quote-book}} and other templates that use a large number of parameters with long values, it is more readable to put in line breaks. I don't particularly like the extra spaces on each line of the {{quote-book}} template, but I can't see a reason to ban them. --WikiTiki89 19:57, 8 October 2015 (UTC)
But we already do ban them. | page = 537 goes counter to rule 4 whether you put it on the same line as the template name, or on a line of its own. The rule specifies that it should be compressed to |page=537. I don't think it would be any good to make an exception to this whitespace rule for parameters on their own line.
I have fewer objections to the practice of breaking long template calls up into multiple lines in general, as long as a set format is decided on for those as well. Right now, there are differences in the amount of leading whitespace (apple vs accost). Should the leading whitespace be removed, so that the line starts with |? Then rule 1 would be satisfied. An exception could be added to rule 4 that the | preceding a template parameter can be optionally preceded by a line break.
Also, the practice should be nuanced for positional parameters, because leading and trailing whitespace does not get stripped from them, unless the template does it itself explicitly. Stripping whitespace in a template requires a module, as template code can't modify strings. At the same time, templates that take an indefinite number of positional parameters, like {{compound}}, {{head}} or {{der3}}, should be written as modules anyway, for other reasons. —CodeCat 20:32, 8 October 2015 (UTC)
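A rough sketch of the rule as described in this comment: whitespace around named parameters is compressed (| page = 537 becomes |page=537), a line break before a | may be kept, and positional parameters are left untouched because their whitespace is significant. The function name is hypothetical, and the sketch assumes a single flat template call with no nested templates or piped links and no "=" inside positional values.

```python
def compress_named_params(call, keep_linebreaks=True):
    """Compress whitespace around named parameters of one flat template call."""
    if not (call.startswith("{{") and call.endswith("}}")):
        raise ValueError("expected a single template call")
    name, *params = call[2:-2].split("|")
    pieces = [name.strip()]
    # A line break at the end of a segment sits just before the next pipe.
    break_before_next = keep_linebreaks and "\n" in name[len(name.rstrip()):]
    for param in params:
        key, sep, value = param.partition("=")
        if sep:
            # Named parameter: surrounding whitespace can be dropped.
            trailing = value[len(value.rstrip()):]
            text = f"{key.strip()}={value.strip()}"
        else:
            # Positional parameter: whitespace is significant, so keep it as is.
            trailing = ""
            text = param
        pieces.append(("\n" if break_before_next else "") + "|" + text)
        break_before_next = keep_linebreaks and "\n" in trailing
    return "{{" + "".join(pieces) + ("\n" if break_before_next else "") + "}}"


print(compress_named_params("{{quote-book\n | page = 537\n | year = 1900\n}}"))
```

Passing keep_linebreaks=False gives the fully compressed one-line form, which is what rule 4 appears to require as currently worded.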
But there is a workaround for stripping whitespace in templates: pass the text as a named parameter to another template that returns the text back (like {{strip|={{{1|}}}}}). Anyway, I guess I would agree that the leading space in  | page = 537 is bad practice, but I think the spaces remaining in | page = 537 are harmless as long as it is consistent within the template. --WikiTiki89 21:09, 8 October 2015 (UTC)
I've now edited apple to conform with WT:NORM at least as far as the whitespace rules are concerned. —CodeCat 21:34, 8 October 2015 (UTC)
Maybe you should have waited for more opinions before doing that. --WikiTiki89 19:25, 9 October 2015 (UTC)
I didn't think following existing rules would be controversial. In fact, I did it to show the outcome of those rules. —CodeCat 20:15, 9 October 2015 (UTC)
Re: "it is more readable to put in line breaks" I strongly disagree and remove them regularly in favor of spaces. We have many instances of the entire edit frame being taken up with a single citation template and many pagedowns being required to proceed from one definition to another. The result is that substantive editing of definitions of highly polysemous, cited, i.e., English, terms needs to be done offline and can only be done offline with difficulty. Perhaps it is time to use transclusion of citations from citation space to make such entries intelligible in the edit frame. DCDuring TALK 23:56, 8 October 2015 (UTC)
A few extra page-downs is not so bad compared to trying to read such a long template crammed into one line. I would agree that this should not be done when the parameters to the template are short enough. --WikiTiki89 19:25, 9 October 2015 (UTC)
FWIW, the Russian manual declension template {{ru-decl-noun}} is specifically intended to be used with linebreaks, e.g.:
|о пауке́-во́лке|о паука́х-волка́х}}
This way of formatting puts the singulars in the first column and the plurals in the second column, and reading down the rows you get nom, gen, dat, acc, ins, prep. Putting it all on one line sometimes happens but makes it much less readable. So we might want to amend things to allow linebreaks with the preceding vertical bar on the same line, while still excluding leading/trailing/embedded spaces. Benwing2 (talk) 09:29, 9 October 2015 (UTC)

Inflections identical to lemma form

I've noticed that in general, when an inflectional form of a word is identical to the lemma, there is no separate definition for it. For instance, pecūnia and pecūniā, the vocative and ablative singular forms of pecūnia, have no definitions, yet each inflected form would have its own definition if they weren't on the lemma page.

I've come across a fair number of entries (mostly for Latin) that have a definition for each inflection in the lemma entry. I can't find any at the moment, but they're like this:


1. A woolly ruminant of the genus Ovis.
2. plural of sheep

Is there some sort of policy on this? If not, I think there should be. I would vote for the second option, since it is most consistent with us having separate definitions for each inflected form under non-lemma entries, but I'd like to hear other people's thoughts on the matter. Andrew Sheedy (talk) 03:51, 9 October 2015 (UTC)

I'm opposed to listing inflected forms that are identical to the lemma form, unless (as with pecūniā) the diacriticized headword form is different from the lemma's own diacriticized headword form. Since [[sheep]] already says "(plural sheep)", there's no need to list it separately. —Aɴɢʀ (talk) 09:29, 9 October 2015 (UTC)
When I created Arabic non-lemma inflections I specifically added code to exclude adding non-lemma inflections to the same page as the lemma they're derived from. Because I was only creating verbal inflections, this principally applies to the 3sg masc past active (which is the same as the dictionary form), but also to 3sg masc past passives, which almost always have the same written form as the active, although the vowels are different. I did this because I felt it would create a lot of noise to include all those passives on the same page; among other things, almost every verb page would be formatted using "==Etymology 1==" and "==Etymology 2==" (I did it this way because the pronunciations are different, although maybe there's a better way). With nouns it would be a lot worse, since for every noun lemma there would be four or five non-lemma entries on the same page, each with different pronunciation but the same nonvocalized spelling. The assumption here is that if the user looks up an Arabic word by spelling, it's enough to get them to the right page for the lemma, and they can hopefully figure out by looking at the conjugation table for the verb that it has 3sg masc past active and passive that are spelled the same as the lemma form. Benwing2 (talk) 09:42, 9 October 2015 (UTC)
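The exclusion described here can be sketched roughly as follows. This is not the actual bot code; the function names are invented, and the set of characters stripped to get a page name (the short-vowel and related marks U+064B to U+0652, plus the superscript alef U+0670) is an assumption for the sake of the example.

```python
# Assumed set of Arabic vocalization marks that are dropped from page names.
ARABIC_VOWEL_MARKS = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def page_name(vocalized):
    """Return the unvocalized spelling used as the page title."""
    return "".join(ch for ch in vocalized if ch not in ARABIC_VOWEL_MARKS)


def forms_needing_entries(lemma, forms):
    """Keep only the inflected forms that would land on a different page than the lemma."""
    lemma_page = page_name(lemma)
    return [form for form in forms if page_name(form) != lemma_page]


kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"         # 3sg masc past active (lemma)
kutiba = "\u0643\u064F\u062A\u0650\u0628\u064E"         # 3sg masc past passive
yaktubu = "\u064A\u064E\u0643\u0652\u062A\u064F\u0628\u064F"  # 3sg masc non-past active

print(forms_needing_entries(kataba, [kataba, kutiba, yaktubu]))
```

In this example only the non-past form gets a non-lemma entry, since the active and passive past forms share the lemma's unvocalized spelling.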
I do think it would be good to make this information more explicit, but I think it would be confusing like that. Maybe a usage note? (This is the base form of this term, and it is also identical to the genitive plural and the ablative singular) WurdSnatcher (talk) 16:04, 10 October 2015 (UTC)
I think it's made explicit enough in the inflection table itself. —CodeCat 17:24, 10 October 2015 (UTC)
"Genitive plural" and "ablative singular" are probably a lot more confusing to a user than what we have now. Equinox 21:38, 11 October 2015 (UTC)
So should definitions listing the other forms that are identical to the lemma be deleted, or left as is for now? The general consensus seems to be that they shouldn't be included. Does anyone else think that this should be made "official"? Andrew Sheedy (talk) 21:45, 11 October 2015 (UTC)
I think it would need at least a poll on this page showing near-consensus (80%+ support?) and possibly a vote if there were not near-consensus. DCDuring TALK 22:26, 11 October 2015 (UTC)
For Hungarian entries, I list identical lemma and non-lemma entries separated by their own Etymology header. See terem, a lemma noun, verb and non-lemma noun form. Because the non-lemma form has its own declension, pronunciation and hyphenation, I would like to continue to include them. --Panda10 (talk) 12:40, 12 October 2015 (UTC)
The possessive form is what's called a "sublemma": a form of another term that has some lemma-like properties, like having its own inflection. Participles also fit into this category. For fully non-lemma forms, that are not sublemmas either, there shouldn't be a separate entry if it's identical to the lemma form in all respects. If there is a difference (one that's not apparent from spelling), then it should have its own entry, like for Latin ablative singular forms. Non-lemmas, whether sublemmas or not, should not have etymologies unless the formation is irregular, and even then it's probably better to put the etymology on the lemma page. —CodeCat 13:21, 12 October 2015 (UTC)
Re "Non-lemmas, whether sublemmas or not, should not have etymologies": Etymologies may not be useful for English non-lemma entries, but they are for agglutinative languages. For Hungarian, the non-lemma etymology allows the user to click on the suffix for more information as in for example ésszel. --Panda10 (talk) 13:39, 12 October 2015 (UTC)

Please vote - Allowing matched-pair entries

Please vote on Wiktionary:Votes/2015-08/Allowing matched-pair entries — it ends in 3 days: 23:59, 12 October 2015 (UTC).

Current results:

  • Support: 6 (66.7%)
  • Oppose: 3 (33.3%)
  • Abstain: 0

--Daniel Carrero (talk) 20:05, 9 October 2015 (UTC)

Poll and discussion: table for seasons of the year

I created Template:table:seasons about 16 hours ago. It is a table for the 4 seasons of the year that can be used in any language. Disclaimer: This template does not have to be used for any languages which have seasons other than the 4-season system — see this article for some 6-season calendars. (Only if the template can be changed somehow to agree with the needs of such languages.)

I chose some icons to illustrate the table and started adding it in on entries of multiple languages. In diff, @Catsidhe reverted one of the entries with the edit summary: "Undo revision 34541571 by Daniel Carrero (talk) That is twee, insulting, and i would like you to stop that now. It's starting to feel like vandalism." I started the discussion Thread:User talk:Catsidhe/Table for seasons, where Catsidhe says more about what they don't like about the template. (After that discussion, I shrunk the images and changed the style to the current 1-row template, turning larger "cartoon"-style images into less conspicuous icons.)

First of all, I realized I did not discuss that season template beforehand, so I apologize if creating and using that table was not a good idea and I'm ready to undo all the changes if that's what people want. I opened this discussion to make sure what format the community prefers for this. I acknowledge Catsidhe's opposition, but I'd feel stupid if I re-edited the entries right now to return to the list format without any further discussions and other people wanted to discuss the issue or wanted the tables back.

(Other tables I created recently: Template:table:playing cards, Template:table:suits and also Template:table:poker hands. The first of all was Template:table:chess pieces, which I discussed in the BP in September. Please let me know if there's any problem with them. Template:table:colors was created by User:DTLHS and has been discussed a lot in the talk page and also revised multiple times.)

The previous state of the entries for seasons was:

  • Many season entries have been using the "list" system, as in Category:English list templates - "Template:list:blahblahblah" with text as opposed to tables and cross-linking many templates between languages. The list system was a previous project of mine, I created the initial design of lists a few years ago, though it has changed a lot since then. I started converting a few season lists into tables after creating the initial Template:table:seasons template. I also created a few new season tables that didn't exist before as lists, such as Template:table:seasons/ast.

My rationale for using tables in some cases, as opposed to lists:

  • Consistency in word order and illustrations if they are needed; in short, tables are supposed to help you know instantly the meaning of all the words even if you don't speak the language in question, without the need to check each entry.

Also I chose the icons because IMHO they represent well the ideas of each season in the table even in low resolutions. As alternative proposals, I suggest: 1) keeping the table with different images or 2) keeping the table with English -> Foreign-Language text translations without images. But, as I said, if people want the list format back, (or agree on some other idea) I'll undo my changes (I'd use AWB for that). So please let me know what the community wants. --Daniel Carrero (talk) 01:58, 11 October 2015 (UTC)

Proposal: Using the table for seasons

Proposal: Having a table for the 4 seasons of the year that can be used in all languages, (Template:table:seasons - check the template to see a list of which languages have the season table already, which includes Template:table:seasons/en for English, Template:table:seasons/fr for French, etc.) except those languages that don't agree with the 4-season system. This proposal is just about having the table, not about the exact images that are used or the exact format of the table.

See 3 examples of implementation:


  1. Support --Daniel Carrero (talk) 01:58, 11 October 2015 (UTC)
    Also I like the pictures, but it's okay if I'm in the minority. --Daniel Carrero (talk) 21:18, 11 October 2015 (UTC)
    Note: primavera is an interesting entry to look at. It currently has the seasons table in the Asturian, Catalan, Galician, Interlingua, Italian, Portuguese and Spanish sections. --Daniel Carrero (talk) 21:43, 11 October 2015 (UTC)
  2. Support -- I like the pictures, though I wouldn't be too upset to lose them. I do generally like the idea though. WurdSnatcher (talk) 17:02, 11 October 2015 (UTC)
  3. Support -- The pictures are unnecessary, but they can be removed if the majority agrees. I think it's a good idea. Aryamanarora (talk) 21:21, 11 October 2015 (UTC)
  4. Support The tables look nicer than plain lists of words, and can have clarifying translations and/or pictures too. Much more effective than a list of foreign words alone, where you have to click to know what they mean. —CodeCat 21:36, 11 October 2015 (UTC)


  1. Oppose Simply hideous. Wrong for English wiktionary. Maybe OK for Simple Wiktionary? DCDuring TALK 02:15, 11 October 2015 (UTC)
  2. Oppose addition of Template:table:seasons (which has pictures intended to represent seasons) to entries for now. I'll wait to see what supporters are going to say to see whether I should change my mind. --Dan Polansky (talk) 11:07, 11 October 2015 (UTC)
  3. Oppose the pictures. Table or list, I don't mind, just not the daft cartoons. Catsidhe (verba, facta) 11:21, 11 October 2015 (UTC)
  4. Oppose these, although I like the ones that can be symbolised unambiguously. —Μετάknowledgediscuss/deeds 17:00, 11 October 2015 (UTC)
  5. Oppose per Equinox (below) - this wastes a lot of space to provide four words. - -sche (discuss) 19:38, 12 October 2015 (UTC)



  • I'd be fine with it if we got rid of the little pictures. —Aɴɢʀ (talk) 10:34, 11 October 2015 (UTC)
  • I don't really like the images and am not sure they are necessary. Equinox 18:02, 11 October 2015 (UTC)

When I created this poll, I said "This proposal is just about having the table, not about the exact images that are used or the exact format of the table.", but really the images seem to be the most controversial aspect of the table, so I've tried a different design: Template:table:seasons (without images).

I used it for:

What do you think? --Daniel Carrero (talk) 21:15, 11 October 2015 (UTC)

Honestly all the table/border stuff seems like a waste of space for only four words. Equinox 21:18, 11 October 2015 (UTC)
I oppose use of the table, per Equinox. (And Equinox's view should carry more weight than others' in a discussion about seasons. :-) )​—msh210 (talk) 17:49, 12 October 2015 (UTC)
A simple list of the three seasons (with gloss for FLs) that are coordinate terms of the headword is what seems appropriate to me. The Coordinate terms header is intended for just this kind of thing. I would think there would be some technical challenge in efficiently suppressing the season that was the headword. DCDuring TALK 23:56, 12 October 2015 (UTC)

Update: I removed the pictures from the table. Past revision with pictures: this revision.

The official support/oppose count is 4-5, but it does not do justice to the table/picture relation. Maybe I designed this poll poorly.

Some people opposed the table in its entirety, but other people explicitly opposed the pictures while saying either that they would be fine with the table or that they don't care whether it's a table. This holds regardless of whether those people voted "Support/Oppose/Abstain/Comments". Even among people who supported the table, a number said that, at best, they don't mind the pictures. That confused the hell out of me, but it's clear that at the very least the pictures are unwanted by the current majority, and probably the whole table too. Actually, it looks more like a "no consensus" right now. --Daniel Carrero (talk) 02:41, 13 October 2015 (UTC)

No consensus, I guess. (as of now) --Daniel Carrero (talk) 20:03, 19 October 2015 (UTC)
No consensus needed to implement under existing WT:ELE using coordinate terms header, omitting from the display the headword. Can't that be done technically? DCDuring TALK 22:39, 19 October 2015 (UTC)
IMHO, I'm against using direct links in each wiki page because that would involve undoing the work using list or table templates. We can (edit: I mean "I can", since I promised that, but I was expecting some consensus and this discussion has become too confusing for the reasons that I said. Probably I'll let the status quo linger for a while, then I'll create some new discussion later.), however, convert the current tables back to the list format. Interestingly, Template:list:seasons/lv seems to be the only "list:"-prefixed template that uses 2 lines for some reason.
The distinction between "Coordinate terms" and "See also" seems to be a moot point. It's true that "Coordinate terms" exists in ELE, but so does "See also". --Daniel Carrero (talk) 23:00, 19 October 2015 (UTC)
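
On the technical question raised above (omitting the headword from the displayed list of coordinate terms), here is a minimal sketch in Python. It assumes a hypothetical per-language list of seasons; the names `SEASON_LISTS` and `coordinate_terms` are invented for illustration and do not correspond to any existing Wiktionary template or module.

```python
# Hypothetical data for illustration only; not the contents of any real
# Wiktionary list template.
SEASON_LISTS = {
    "en": ["spring", "summer", "autumn", "winter"],
}


def coordinate_terms(headword, lang, lists=SEASON_LISTS):
    """Return the terms listed for `lang`, leaving out the headword itself.

    An unknown language code yields an empty list.
    """
    return [term for term in lists.get(lang, []) if term != headword]


print(coordinate_terms("summer", "en"))  # ['spring', 'autumn', 'winter']
```

The filtering itself is a single pass over a four-item list, so the cost of suppressing the headword should be negligible; the open question in this thread is layout, not feasibility.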

Sub-national countries in Wiktionary:Wiktionarians

Wiktionary:Wiktionarians currently has sections for the Basque Country, Catalonia, and Spain; the former two are divisions of the latter. This is inconsistent with the principle of according sections only to countries qua sovereign states. Nevertheless, this is a dictionary, and as such political considerations are not as important as linguistic ones. The Basque Country and Catalonia have their own languages (Basque [eu] and Catalan [ca], respectively), so it is relevant to us whether a person is from one of those sub-national countries. On this principle, I see validity in including sections for other sub-national countries, for example Flanders and Wallonia (Belgium), Quebec (Canada), Guangzhou, Inner Mongolia, Manchuria, Tibet, and Xinjiang (China), Lapland (Finland / Norway / Russia / Sweden), Brittany (France), Gaeltacht (Ireland), Hokkaido and Ryukyu Islands (Japan), Eastern Cape, Free State, Gauteng, Kwazulu Natal, Limpopo, Mpumalanga, Northern Cape, North West, and Western Cape (South Africa), and Cornwall and Wales (United Kingdom). What is the opinion of the community? — I.S.M.E.T.A. 10:25, 11 October 2015 (UTC)

Speaking only to the case I'm somewhat familiar with, I would not be in favor of a section for the Gaeltacht, as being from the Gaeltacht is neither a necessary nor a sufficient condition for being a native speaker of Irish (though there is a greater than chance correlation). Also, unlike the Basque Country and Catalonia, the Gaeltacht does not correspond to any political entity and could not be considered a "sub-national country" by any stretch of the imagination. —Aɴɢʀ (talk) 10:33, 11 October 2015 (UTC)
@Aɴɢʀ: I can understand where you're coming from, and I therefore agree that the Gaeltacht was a bad example. However, surely being from anywhere is neither a necessary nor a sufficient condition for being or for not being a speaker (native or otherwise) of any language; in every case, the country listings are only suggestive of language ability. — I.S.M.E.T.A. 20:54, 11 October 2015 (UTC)
I agree with you - also, India and Pakistan could definitely use that category system:
  • Kashmir (Kashmiri)
  • Bengal (Bengali)
  • Sindh (Sindhi)
  • South India (Tamil, Telugu, etc.)
Aryamanarora (talk) 21:24, 11 October 2015 (UTC)
@Aryamanarora: Yes, that was the sort of thing I was thinking. Should the sub-national states be listed separately, as the Basque Country and Catalonia currently are, or should they be listed as subsections of the sovereign states' sections? — I.S.M.E.T.A. 23:10, 11 October 2015 (UTC)
@I'm so meta even this acronym: Subsections will be best for organization. Aryamanarora (talk) 01:44, 12 October 2015 (UTC)
@Aryamanarora: Subsections it is! — I.S.M.E.T.A. 12:12, 12 October 2015 (UTC)
Eh, let people categorize themselves however they want. If they identify as Basque and not Spanish, or as Californian and not American, let 'em do it. There's no need for policy to be that restrictive. Purplebackpack89 05:25, 12 October 2015 (UTC)
@Purplebackpack89: I'm in favour of listing sub-national countries in Wiktionary:Wiktionarians. I just wanted to make sure that doing so has community support or, at least, lacks community opposition. — I.S.M.E.T.A. 12:12, 12 October 2015 (UTC)

Any objections to having subsections for sub-national countries in Wiktionary:Wiktionarians? — I.S.M.E.T.A. 12:12, 12 October 2015 (UTC)

I just listed myself as being from Scotland. --Zo3rWer (talk) 09:32, 14 October 2015 (UTC)

I've just reorganised the page and posted notice of this on Wiktionary:News for editors. — I.S.M.E.T.A. 21:54, 16 October 2015 (UTC)

Appendix:Capital letter

I created Appendix:Capital letter as an (incomplete) list of uses of a capital letter. Wikipedia already has Capitalization, but I created this page using the entry layout, which I find easier to navigate. A number of those uses are basically senses that the entries A, B, C, etc. could have if we wanted (like "found in the beginning of proper nouns" and "found in the beginning of sentences"), but I think they look better as a single page. Feel free to expand the list with more languages or senses. --Daniel Carrero (talk) 09:31, 12 October 2015 (UTC)

Nice, it is certainly thorough. Aryamanarora (talk) 22:29, 12 October 2015 (UTC)
Thank you. --Daniel Carrero (talk) 08:55, 13 October 2015 (UTC)

I have a proposal:


  • Not only is the whole page formatted like an entry; if we assume that entries like A, B, C, etc. should have senses like "found in the beginning of proper nouns", "found in the beginning of sentences", "found in the beginning of taxonomic names", etc., then the page Appendix:Capital letter removes the need to create those definitions for every single letter. Think of it as a merger of all the entries for capital letters, because they would otherwise repeat the same information. The idea of "capital letter" is something of lexical significance, and can be checked for attestations just like a normal entry. Also IMHO it is more important than the entry ] [.

Then again, I know it's an unprecedented idea, so I don't mind if someone disagrees. (not that I would usually mind otherwise) I used the appendix namespace because it would seem uncontroversial, but I was really aiming for the main namespace. Thoughts? --Daniel Carrero (talk) 08:55, 13 October 2015 (UTC)

I forgot to mention: moving Appendix:Capital letter into the main namespace would also serve the purpose of making it searchable. --Daniel Carrero (talk) 10:53, 24 October 2015 (UTC)
Nobody seems to have weighed in, but this has my support, whatever that's worth. Andrew Sheedy (talk) 01:08, 18 October 2015 (UTC)
True, thanks for your opinion. I suppose that makes us 2 Support; 0 Oppose; 0 Abstain!! :) In any event, I'll ask more people to join in the conversation. --Daniel Carrero (talk) 10:57, 24 October 2015 (UTC)

Tolkien's languages' copyright

I've recently started expanding Quenya entries and User:Chuck Entz suggested coming over here and making sure that the Wikimedia Foundation won't be sued by the Tolkien Estate for any copyright infringement. The main reference I'm using is Eldamo.org, which licenses all of Tolkien's languages' definitions and meanings under Creative Commons 4.0. What does the community think? Aryamanarora (talk) 22:49, 12 October 2015 (UTC)

[IFYPY] IANAL, but @BD2412 is. —Μετάknowledgediscuss/deeds 03:01, 13 October 2015 (UTC)
This is an area where I would recommend treading very carefully. The fact that another website maintains a compendium of words from a fictional language under a lenient license has no bearing on the copyright status of the creative work with respect to the estate of the original author. bd2412 T 01:15, 14 October 2015 (UTC)
To clarify: The other site can't license something it has no right to in the first place.
In the US, looking at 17 U.S. Code § 102, it does not seem that languages are included under the list of things that are copyrightable; they are closer to a system (of communicating), which is explicitly excluded, than a literary work, which seems to be the closest thing that is included.--Prosfilaes (talk) 01:14, 16 October 2015 (UTC)
What makes this rather murky is that these languages are an integral part of a series of literary works, and play an important part in the effect of several passages on the reader. For instance, the words of the Dark Lord on the Ring might not sound that unusual to someone who speaks any of a number of languages (somewhere in Central Asia, perhaps?) with a similar phoneme inventory, but to English speakers their sound symbolism really gives an impression of something alien, barbaric and harsh.
Even though some of the Elvish languages were the source of the literary works, rather than the other way around, I doubt they would have come out the same in the end if the literary works had never been created. Besides, Tolkien was such a master of linguistic details that it's hard to escape the impression that the languages are as much original creations as a poem, a sculpture, or a painting.
As for the issues involved, it's not merely a matter of whether the WMF is at risk of being sued: as an enterprise that depends on the willing contributions of a great many people, we need to be very respectful of intellectual property rights. We should avoid violating anyone's legal rights, whether there's a likelihood of litigation or not. Chuck Entz (talk) 23:11, 16 October 2015 (UTC)
I would suggest some limiting principle, such as including only words that are at least discussed in some other work. bd2412 T 00:24, 17 October 2015 (UTC)

Matched-pair entries - follow-up proposal

Now that Wiktionary:Votes/2015-08/Allowing matched-pair entries passed, (thanks for the votes!) I have a follow-up proposal:

  • All unpaired entries can exist as separate entries -- ), (, [, {, etc. -- but they should not have any definitions along the lines of "begins X" and "ends X" and they should not repeat the information in the matched-pair entries. For example, if ( ) is defined as "encloses supplemental information", then ) should not be defined as "ends supplemental information". The entry ) should only have the minimum of information needed to point the reader to ( ), and should be devoid of examples, multiple senses (math sense, chemistry [?] sense, typography senses, etc.), regional variations, synonyms, etc. In other words, ( ) should be lemmatized, with (/) pointing to it. (But the individual character entry could have some information specific to it, such as the Unicode box, the name "this is called 'left parenthesis' in English", perhaps even a picture.)

More considerations:

Sometimes, a component of a matched-pair has also standalone definitions. The entry ) could have both:

  1. Used in ( ).
  2. Separates a number or letter from an item in a list.
    1) New York, 2) London, 3) Paris.

Sometimes, a single character is the component of different matched-pairs, so it should point to all of them. The entry » could have, maybe:

  1. Used in « », » » and » «.
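
The pointer-style definitions in the two examples above could be generated mechanically. The following is a minimal sketch in Python, assuming a hypothetical list of matched-pair entry titles in the "left, space, right" format; `MATCHED_PAIRS`, `pairs_using` and `pointer_definition` are illustrative names, not existing templates or modules.

```python
# Hypothetical list of matched-pair entry titles, written as
# "left, space, right". Illustrative only.
MATCHED_PAIRS = ["( )", "[ ]", "{ }", "« »", "» »", "» «"]


def pairs_using(char, pairs=MATCHED_PAIRS):
    """Return every matched-pair title that has `char` as one of its halves."""
    return [pair for pair in pairs if char in pair.split(" ")]


def pointer_definition(char, pairs=MATCHED_PAIRS):
    """Build the 'Used in ...' line for a single-character entry.

    Returns None when the character belongs to no matched pair.
    """
    found = pairs_using(char, pairs)
    if not found:
        return None
    if len(found) == 1:
        listing = found[0]
    else:
        listing = ", ".join(found[:-1]) + " and " + found[-1]
    return f"Used in {listing}."


print(pointer_definition(")"))  # Used in ( ).
print(pointer_definition("»"))  # Used in « », » » and » «.
```

Standalone senses, such as the list-item use of ), would still have to be written by hand; the sketch covers only the pointer line.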


Also, feel free to change/adapt the proposal or propose something else if you'd like. --Daniel Carrero (talk) 01:17, 13 October 2015 (UTC)

Proposal: Only use lemma forms in etymologies

For Latin, some etymologies show multiple forms of a word. For verbs, both the infinitive and first-person singular present active indicative are sometimes shown. For nouns, some people seem to include the accusative singular or the genitive singular form.

This practice is pretty much unique to Latin etymologies; I haven't seen it for any other languages. It can be argued that mentioning this form makes the etymology more correct since Romance lemma forms derive from the infinitive and accusative singular. However, we don't seem to do this for any other languages where this might apply:

  • Bulgarian and Macedonian etymologies show verbs being derived from the Proto-Slavic infinitive, even though the modern languages have no infinitive and another form is used as lemma.
  • Irish and Scottish Gaelic verbs use the Old Irish third-person singular form, even though the lemma is another form in the modern languages.

Therefore, I want to propose that we use only the lemma forms in etymologies, regardless of whether the modern lemma form descends from the ancient lemma form. What is inherited is the entire paradigm, not the lemma form alone. The lemma form is merely a representative of that paradigm. So Latin cantō merely stands in for the entire paradigm, which includes cantāre. The choice of lemma form in any given language is completely irrelevant for the actual etymology. —CodeCat 12:38, 13 October 2015 (UTC)

  • Definitely support. This has long been a desideratum of mine. —Aɴɢʀ (talk) 12:52, 13 October 2015 (UTC)
It’s not always the entire paradigm that is inherited. Most Romance words are loaned or inherited from the accusative, but some are from the nominative, and some are from the accusative plural.
For most Portuguese verbs, it’s specifically the infinitive that was loaned or inherited, and other forms were formed from its stem. Look at ajo (not from agō), jogue (not from iacit).
I don’t support this as a general rule, as we risk losing important etymological information if it were followed to the letter. I think it would be better if each language had its own policy about how to link to etymons. — Ungoliant (falai) 14:40, 13 October 2015 (UTC)
That kind of analogical restructuring and reforming is normal in any language and doesn't need mentioning unless something unusual is going on. Paradigms may be inherited but that doesn't mean all forms must be inherited individually. As for borrowing, for example, English placate does come from plācō, even though the specific form was taken from the past participle plācātus. Many Germanic languages, meanwhile, borrow Latin and Romance verbs by using the original infinitive as the stem. —CodeCat 15:33, 13 October 2015 (UTC)
It is the entire paradigm that is inherited. It's just that all of the case forms other than the accusative are dropped, but the paradigm also contains all the definitions and connotations (even if these are also changed). --WikiTiki89 15:43, 13 October 2015 (UTC)
  • Why is this being discussed without any reference to ordinary users?
In English etymologies the practice of including, for example, the stem of or a form other than nominative singular of a Latin or Greek noun can help users see and accept the etymology we offer. Similarly for words derived from present or past participles.
Is this intended to make it simpler to impose some pan-lingual uniformity on entries? It seems to me to have little prima facie justification, so one naturally looks for other, unstated motives. DCDuring TALK 23:31, 13 October 2015 (UTC)
I oppose this. I would, however, support linking to the lemma, as is done in some cases already (what seems most common, though, is including both in the etymology, as in "from linking, gerund of link"). Andrew Sheedy (talk) 00:08, 14 October 2015 (UTC)
Does that mean you think the same should be done for Old Irish, Proto-Slavic and any other language where this applies? What to do when the lemma form didn't even exist in the ancestor language, like for PIE vs its descendants? —CodeCat 00:20, 14 October 2015 (UTC)
Assuming I am understanding correctly, then I would answer "yes" for the first question. One of the two practices that exist for Latin words in etymologies would be ideal, in my opinion. Part of it is, as DCDuring mentioned above, that an average user may not realize that they are not being given the actual form of the word from which the word they are looking at was derived. I know that was the case for me before I realized what was going on, and as a result, I thought that some of the etymologies were pretty far-fetched. I wouldn't throw a fit, however, if that standard wasn't adopted for languages like Old Irish, Proto-Slavic, etc.
I'm afraid I'm having difficulty parsing your second question in my sleepy state. Rather than me answering a question you didn't ask, could you please clarify what you mean first? Andrew Sheedy (talk) 01:02, 14 October 2015 (UTC)
Proto-Indo-European didn't have an infinitive, so for any language that uses the infinitive as lemma, there's a problem. There's no form to show it to be derived from. Latin ferō is inherited straight from PIE *bʰéroh₂, but Proto-Germanic *beraną didn't inherit from anything in PIE, it was formed after Germanic split off. The exact Germanic descendant of the PIE form is *berō, but that's not the lemma form. —CodeCat 01:09, 14 October 2015 (UTC)
I see. In that case, I would note the extra step, i.e. the formation of *beraną from whatever form, which in turn came from PIE *bʰéroh₂ (or whatever the exact inflection is). I realize that that information may not always be available, but the bottom line is that the intermediary step should be noted, or it should be made clear that the derivation was indirect (e.g. saying "ultimately derived from" rather than simply "from"). (Also, if PIE had no infinitive, why is *bʰer- defined as "to bear, carry"?) Andrew Sheedy (talk) 02:00, 14 October 2015 (UTC)
It's actually the root with all the inflectional/derivational bits removed, so it's impossible to translate directly into English. Using the infinitive is better than a long explanation to the effect that it depends on what inflectional/derivational state it's in as to how to translate it. Chuck Entz (talk) 02:25, 14 October 2015 (UTC)
Ah, OK. Proto-Indo-European is unfamiliar territory for me. Andrew Sheedy (talk) 02:33, 14 October 2015 (UTC)
  • Much as Andrew Sheedy, I oppose this, but I support linking to the lemma forms. I deal primarily with Japanese, and sometimes a term derives not from a lemma form, but from some inflection thereof. Listing only the lemma form in the etymology would be incomplete, and it invites confusion. ‑‑ Eiríkr Útlendi │Tala við mig 02:47, 14 October 2015 (UTC)
  • Oppose. Trying to standardize all cited forms as lemma forms would be overblown. Lemmatization is decided on other grounds entirely than etymological transparency. In particular, when lemmas consistently contain a particular suffix (e.g. an infinitive or nominative marker), I am in favor of quoting only the word stem, if this can be well-defined. (But I am, per Andrew S, in support of including at minimum a link to the lemma.) --Tropylium (talk) 09:19, 14 October 2015 (UTC)

Standardizing suffix entries

Looking at the suffix entries of different languages, I notice a great variety in how the definitions and examples are formed/formatted, and what terminology is used. This is especially visible on long multi-language pages such as -a, -t, -k. I think it would increase the quality of our dictionary if we could standardize certain aspects to make them appear more unified. The standards would be mainly recommendations and guidelines for formatting and terminology. Here are some simple examples:

  • Is there a preferred terminology in the definitions? E.g.:
    "verb suffix", "verbal suffix", "verb-forming suffix", or "verb-building suffix"?
    "plural suffix", "plural ending", or "plural marker"?
  • "Forms the..." or "Used to form the..."? Or {{non-gloss definition|Used to form the…}}
  • How to format the FL definition when the FL entry has an English equivalent?
  • How to format it when there is no English equivalent?
  • The order of terms within a definition, e.g. "third-person singular indicative past indefinite" or "third-person singular past indicative indefinite"?
  • For the examples listed below the definition, should we use the format recommended by {{suffixusex}}?

These are just initial thoughts and observations, I'm sure there will be others, I just wanted to find out whether other editors see a need for such recommendations. --Panda10 (talk) 14:24, 13 October 2015 (UTC)

@Panda10: I also struggle to find good ways to define suffixes. I’ll try to explain what I usually do:
  • If the suffix has an exact English equivalent, I use the typical FL format of translation (gloss). Scientific suffixes such as -metro and -algia are in this category.
  • Otherwise, I use the format {{n-g|explanation}}; possible_translations.
  • The explanation is “forms parts of speech, from parts of speech [qualifier], indicating/denoting/meaning [...]” (see -eiro for an example)
  • Sometimes the wording makes it clear what is the part of speech, e.g. “forms the names of lakes” doesn’t need to say it forms proper nouns
  • I think I’m the only one who has ever used {{suffixusex}}.
Ungoliant (falai) 13:54, 16 October 2015 (UTC)
@Ungoliant MMDCCLXIV: Thanks. I often read FL suffix entries for ideas, to see how others organized them. It is especially educational when I know nothing about that particular language. If I understand it easily and I like the layout, I will use the same or similar in Hungarian suffix entries.
  • The -eiro entry looks very thorough and well organized. I'm not sure about using {{n-g}}. Personally, it's a little hard for me to read italics. Then there is the problem with the similarity of "form" and "from", not to mention that "forms" is also a plural noun. I checked other online and paper dictionaries, some use "forming nouns from verbs" to make it clearer. Others use a label such as {in nouns} before the definition. I also use a label (saw it first in Finnish entries). See -ás.
  • I considered using {{suffixusex}} before because its output is very close to what I've been using, but in the end I didn't. Mainly because I didn't want the suffix to be repeated in the example, instead I bold the suffix in the derived word. This way the example appears more compact to me and the suffix is still clear. See -ás.
I understand that there will always be differences due to the nature of a specific FL. For example, Portuguese will not require noun suffix inflection tables, only the headline forms for feminine and plural. But I'm convinced that certain things could be standardized. --Panda10 (talk) 19:59, 16 October 2015 (UTC)
There's also {{usex-suffix}}, for some reason (one of them should probably be deleted) DTLHS (talk) 20:05, 16 October 2015 (UTC)
I've converted existing uses to {{suffixusex}}, and turned it into a redirect. —CodeCat 20:43, 16 October 2015 (UTC)

Last week for comments about IEG project

Hi beer chatters,

The call for Individual Engagement Grants is closed and there is only one week left for comments from the communities. I haven't seen any notice about this, and that's quite sad given the nice list of projects and the number of people involved. So, I invite you to spend some time looking at the projects and to leave endorsement notes to encourage the participants. I particularly invite you to look at Wikiproject Siriono, a project I built for Wiktionaries. I'll be glad to receive advice and comments about it. Of course, please read the other projects as well; there is some really exciting stuff! Eölen/Noé (talk) 21:24, 13 October 2015 (UTC)

Cities of Norway vs Cities in Norway

I noticed not too long ago that there are two very similar categories: Category:en:Cities in Norway and Category:en:Cities of Norway.

So which one is better? "Cities of Norway" or "Cities in Norway"?

--KoreanQuoter (talk) 02:27, 14 October 2015 (UTC)

I'd choose "IN", that's what Wikipedia does for cities.
Wikipedia has: w:Category:Cities and towns in Norway. --Daniel Carrero (talk) 02:34, 14 October 2015 (UTC)
I like "of" better, but our existing entries (and a bunch I just added to the module) are for "in". Of course, we're not talking about a lot of categories, or even of entries- but it's better not to rearrange everything if we don't have to. Chuck Entz (talk) 03:24, 14 October 2015 (UTC)


Civilocity is a neologism which describes a form of government where the people can watch and listen to the leader of their country for the entire time that person is leading their country. In 2007 Nathaniel Wenger took it upon himself to coin, classify, and copyright this pragmatic philosophy. Nathaniel began talking about civilocity, which he often calls wengerocracy as it remains in its neologism phase, to emphasize the importance for countries to watch the leader of their country no matter where they live. Civilocity can be defined as a form of government where the people can watch the leader of their country 24/7, 365 days a year, including the extra day once every leap year broadcasted live on public television to the entire world. Civilocity allows you to know every single thing the leader of your country did and having it all online.

The exact definition of civilocity is literally, behaving in the dwelling. Civilocity is derived from the Latin term civilis and the Medieval Latin term civitat in the early of the 21st century AD to improve the political systems existing in some American city-states, notably Washington, DC.

Add to WT:LOP if you like. Equinox 15:28, 15 October 2015 (UTC)

Lingwa de Planeta

Lingwa de Planeta (Lidepla) is a constructed language made in 2010. They have a sizable lexicon and I think we should include the language in Wiktionary. My question is - should it go in the appendix or main namespace? It only has 15 fluent speakers, but so does Volapük. Edit: 25 fluent speakers as per Wikipedia. —This unsigned comment was added by Aryamanarora (talkcontribs) at 19:58, 15 October 2015 (UTC).

Does this language have a sufficient amount of published textual material from which we can cite words to meet our attestation criteria given at WT:CFI? --WikiTiki89 20:03, 15 October 2015 (UTC)
Since Wiktionary:Criteria for inclusion#Constructed languages shows that languages without ISO codes have no consensus, I'd think this would need to be decided by someone familiar with this. w:Lingwa de Planeta shows many literary works translated into Lidepla - [15] has a list, notably Alice in Wonderland is translated. There are around 3,000 words in the lexicon. There is a Swadesh list in the aforementioned Wikipedia article. There's also this [16], a translator from Esperanto to Lidepla. Aryamanarora (talk) 20:36, 15 October 2015 (UTC)
Is there any "durably archived" (i.e. used in "permanently recorded media"; see WT:CFI for clarification) written material produced originally in this language (i.e. not translations)? Note that even though Volapük currently has about 20 speakers (according to Wikipedia), it had more in the past and has existed for much longer. I don't know where we find Volapük citations, but I'm sure someone here knows. --WikiTiki89 20:54, 15 October 2015 (UTC)
As far as I know, it does not. I just learned about it a few days ago, however, so my knowledge may be limited. I think that, should we decide to add it, it should remain in the appendix namespace. Aryamanarora (talk) 21:06, 15 October 2015 (UTC)
Is there anything at WT:CFI saying that durably archived cites have to be produced originally in the language and not translated? I can't find anything to that effect, but maybe I missed it. The Alice translation certainly exists in a dead-tree edition, which is available from Amazon. —Aɴɢʀ (talk) 21:11, 15 October 2015 (UTC)
I'm just trying to find out more information about the language. Since there would need to be consensus to include this language (a requirement that CFI does mention), editors need information to base their votes on. --WikiTiki89 21:17, 15 October 2015 (UTC)
I do not believe languages should be added without an ISO code.
More directly, Volapük had close to a million speakers and some non-trivial publications. This language has 25 speakers, and a translation of Alice. (The list of translations seems to either consist of short works or translations of a few chapters.) Evertype has published lots of translations and versions of Alice, including several into tiny conlangs and rare scripts, including one in a script with one user. He currently offers three books in Volapük. His books, at least his Alice's, are print on demand.--Prosfilaes (talk) 01:37, 16 October 2015 (UTC)
  • It's a fairly obscure IAL, even within the obscure world of those who know and create conlangs. I certainly see no reason why it should be in mainspace, but I don't think we have any limitations on conlangs in appendix-space (although perhaps we should). —Μετάknowledgediscuss/deeds 02:44, 16 October 2015 (UTC)
  • My opinion is that Lingwa de Planeta has not matured enough to be included in mainspace. The translation of Alice really exists as an example of the language rather than a usage of the language, since I doubt that there has ever existed even one human being who would have felt more comfortable reading Alice in Lingwa de Planeta than in English. As time goes on, this language might take hold and its community may start producing legitimate uses of the language and then the language would be on track for inclusion in mainspace here. If this language takes off, it will take decades for it to be ready to be included here. --WikiTiki89 15:24, 16 October 2015 (UTC)
I agree with Wikitiki that this is not suitable for inclusion in the main namespace. As WT:CFI says, most constructed languages "do not meet the basic requirement that one might run across them and want to know the meaning of their words, since they are only used in a narrow context in which further material on the language is readily available." However, we do seem to let most conlangs have a minimal, not copyright-infringing appendix. (Is the language copyrighted? The website has a copyright notice.) Inclusion of an appendix doesn't seem to require that the language be given a code, e.g. the Sindarin appendix doesn't seem to use its code (its code seems to be included in the module only so that Category:Terms derived from Sindarin works). - -sche (discuss) 18:39, 25 October 2015 (UTC)


Just a note: Template:place was created, together with Module:place, for use in placenames in all languages. Thanks to Ungoliant MMDCCLXIV (talkcontribs), who developed the module completely. (Also I credit myself as a beta-tester.) See Module talk:User:Ungoliant MMDCCLXIV/archive1 for conversations during the development of the module. --Daniel Carrero (talk) 03:13, 16 October 2015 (UTC)

Appendix:okay sign

Okay, so, I’m hardly proficient at any sign language, so I simply put this in the appendix. I don’t think that we have any other entries that are extremely similar to this one, so I more‐or‐less made up my own format. If you people have any suggestions or comments, I’d like to read them. I think that this sign, if nothing else, merits inclusion somewhere. --Romanophile (contributions) 07:29, 17 October 2015 (UTC)

I made another one: appendix:V-sign. --Romanophile (contributions) 15:38, 17 October 2015 (UTC)

I created Appendix:finger gun, Appendix:thumbs up, Appendix:thumbs down, Appendix:shushing and Appendix:air quotes. --Daniel Carrero (talk) 16:10, 17 October 2015 (UTC)

No comments? Well, qui tacet consentire videtur, as the Romans would say. --Romanophile (contributions) 23:59, 17 October 2015 (UTC)

To avoid having random bits and pieces all over the place, I think they should be made subpages of a common parent, like Appendix:Gestures/thumbs up etc. Equinox 00:02, 18 October 2015 (UTC)
(edit conflict) I think all of those should be in the mainspace, (to be searchable as normal entries) but I have yet to learn that long notation that we use for them.
Note: Appendix:okay sign = Sign gloss:OK and O@Side-PalmForward K@Side-PalmForward.
But, as long as they are in the appendix namespace, I agree with Equinox's idea of Appendix:Gestures/thumbs up. --Daniel Carrero (talk)
Yeah, there might be some sort of technical code that describes these, but I’m not sure how to write it. Had I known it, I would have just put them in the main space. --Romanophile (contributions) 00:14, 18 October 2015 (UTC)
FWIW, I think descriptive names such as "V-sign" are better than the long technical names. --Daniel Carrero (talk) 09:54, 18 October 2015 (UTC)
This is a great idea, love it! As others have already pointed out, finding gestures might be tricky (visual index?) Jberkel (talk) 00:28, 20 October 2015 (UTC)

Wiktionary:Votes/2015-10/Matched-pair naming format: left, space, right

A note: I created Wiktionary:Votes/2015-10/Matched-pair naming format: left, space, right as a follow-up to Wiktionary:Votes/2015-08/Allowing matched-pair entries. --Daniel Carrero (talk) 08:38, 17 October 2015 (UTC)

Categories for places that are not cities?

People have been creating a variety of "cities in..." categories, which is nice. But the category is a bit misnamed, because there are also attestable place names that aren't cities. Furthermore, in many countries/languages, "city" is merely an unofficial name for any large place, and is not strictly defined, whereas in others, even small places can be cities if they have city rights. So these should really be renamed to something more neutral and less subject to uncertainty. —CodeCat 20:26, 17 October 2015 (UTC)

I'm not sure I understand why this is a problem. Every country has its own hierarchy of polities, so we shouldn't be trying to make everything conform to some one-size-fits-all scheme. Trying to coordinate placename types between countries can only lead to madness. Besides, what are you going to replace it with? A municipality can range from a part of a city to a regional jurisdiction containing multiple cities. A metropolitan area can stretch across large areas and include numerous cities. Even if you have very specifically-defined entities, deciding which to use entails a rather research-intensive, subjective process, since cities can vary so much. The only halfway-reliable fact is what something is called, so we should categorize using that and suppress the impulse to make sense of it all. Chuck Entz (talk) 22:15, 17 October 2015 (UTC)
I think we should use a single term that can apply to all of them regardless of size. We don't need to subdivide it further into whatever definitions apply. My gripe is that "city" doesn't cover all we might want to put in the category, so we will need Category:nl:Villages in the Netherlands, Category:nl:Villages in Belgium and so on. I'm trying to avoid that situation by suggesting we use a neutral term. —CodeCat 22:25, 17 October 2015 (UTC)
In my opinion, we really should not use "city" for everything if that's inaccurate. Using inaccurate qualifiers just makes us an inaccurate dictionary, thus a less trustworthy one. One idea that Wikipedia uses is "First-level subdivision" (state, province, county), "Second-level subdivision", etc. In fact, Wikipedia takes this one step further, since each of those contains specific categories by country such as "Provinces of Algeria", "Districts of Azerbaijan", etc.
I created WT:Place names, which I propose to be a list of the types of place names that each country uses, to help our current categorization system, though most countries have yet to be filled in. That page also has links to the "x-level" Wikipedia categories I mentioned. --Daniel Carrero (talk) 22:48, 17 October 2015 (UTC)
Countries may not treat cities, towns, villages, hamlets etc. as legal entities. In the Netherlands for example there are only municipalities, but they can have many different villages in them. What Dutch people usually go by is whether the place has its own road sign that gives the name of the place when you enter it. The sign has legal significance, but only for road users (it means a 50 km/h speed limit). Addresses also use places rather than municipalities. The Dutch term for a generic group of houses in one place, regardless of size, is woonplaats ‎(literally living-place). An English equivalent of that would be ideal for these category names. —CodeCat 22:52, 17 October 2015 (UTC)
Wikipedia calls it a "settlement": w:Human settlement. —CodeCat 23:05, 17 October 2015 (UTC)
That will run into the problem of different countries having different levels of complexity, organized in different hierarchies. To connect a term in use with one of your abstractions requires knowledge of how the country is organized. Your description of Brazil took up most of a screen; multiply that by dozens. In the US, you have states, except for the District of Columbia and various territories. States are divided into counties, except for the ones that have independent cities that are their own counties, or the ones that are instead divided into parishes, or into boroughs. In Alaska, the borough is the equivalent of a county, and can contain multiple cities. In New York, the city of New York is divided into boroughs. As you subdivide further, there's virtually no correlation between size/importance of a given polity in a metropolitan area and anything at the same hierarchical level in a rural area: w:Los Angeles County, where I live, is larger than some states (not to mention countries), while some rural counties are smaller than the smallest subdivision of the city of w:Los Angeles, where I live (which isn't in any of your "x-level" Wikipedia categories). No matter what criteria you use, consistency is a pipe dream without making things too complicated and too impractical for mere mortals. Chuck Entz (talk) 00:20, 18 October 2015 (UTC)
Hey, I didn't propose any "consistent" system or any specific change to the categories, I don't know what exactly would be the categories for most countries. But I'd like to know more, that's why I created that WT page. You don't have to help if you don't want to. But your comments about US are something that would fit well there. Heck, I'm not even saying that we are going to have a perfect, ideal system eventually, but I bet that information could help even the current system some way nonetheless. --Daniel Carrero (talk) 00:33, 18 October 2015 (UTC)
Wikipedia has another name for certain categories that I don't suggest using here, but I'm going to mention anyway: "Category:Populated places in (place)". --Daniel Carrero (talk) 08:49, 19 October 2015 (UTC)

Nominalized Adjectives

Regarding nominalized adjectives (adjectives which are used as nouns, such as rich) are we to add a section "Noun"? For example, rich has only an "Adjective" section.SoSivr (talk) 09:07, 18 October 2015 (UTC)

This is a productive process—I'm having trouble thinking of any English adjectives referring to a class of people that cannot be used this way, aside from words like "Catholic" that are already used as countable nouns. "The deaf", "the living", "the hidden", "the tall", "the ill", "the healthy", "the infirm", etc. can all be used in the appropriate context. Given that seemingly all adjectives can be used this way as long as the meaning makes sense, I don't think we should add noun senses to their entries.
On the other hand, deaf, poor, wealthy, and ill do have a noun sense to cover this usage, though I think they probably shouldn't. —Mr. Granger (talkcontribs) 17:03, 18 October 2015 (UTC)
Some dictionaries have some "nominalized adjectives" as nouns, but most do not. Some of those that include it as a noun assert the rich to be an idiom or use the entry to say that rich takes a plural verb. To add and verify the corresponding information for every adjective for which such information would apply seems like a long run for a short slide. DCDuring TALK 17:59, 18 October 2015 (UTC)
Yes, this is just a feature of English grammar. We should only have noun entries if there exist translations that are different from those of the adjective. SemperBlotto (talk) 05:28, 19 October 2015 (UTC)
Some previous discussions: Talk:Irish, Talk:deaf, Talk:wicked. We do have "Used before an adjective, indicating all things (especially persons) described by that adjective." as a sense of the. - -sche (discuss) 16:41, 21 October 2015 (UTC)
So example sentences of these adjectives where they are used as nouns will probably be put inside the relevant Adjective section, together with a usage note perhaps.SoSivr (talk) 09:06, 23 October 2015 (UTC)

Wiktionary:Votes/2015-10/Internet ≠ Internet slang

Note: Created Wiktionary:Votes/2015-10/Internet ≠ Internet slang, based on the 2012 discussion Wiktionary:Beer parlour/2012/January#Internet =/= Internet slang. --Daniel Carrero (talk) 11:04, 18 October 2015 (UTC)

Does this need a vote? I feel it’s already the modern consensus and common practice (among people who know what they are doing). — Ungoliant (falai) 13:20, 18 October 2015 (UTC)
I doubt this needs a vote. It needs definition-by-definition review to correct existing entries. Possibly it could use a bit of discussion to clarify the distinction, especially in borderline cases and in cases where both might seem applicable. I'd assume that internet referred to the mostly technical jargon concerning the internet and internet slang referred to slang used on the internet. Slang used on the internet about the internet probably belongs in internet. DCDuring TALK 18:05, 18 October 2015 (UTC)
@Daniel Carrero: I agree with DCD, and I think this kind of response shows that a vote is unnecessary. —Μετάknowledgediscuss/deeds 18:44, 18 October 2015 (UTC)
OK, I retract the vote. --Daniel Carrero (talk) 18:52, 18 October 2015 (UTC)

Context label in the form "often medicine"

@CodeCat and I had a discussion about whether a context label at cacoethic in the following form was correct: "(obsolete, often medicine)" ({{context|obsolete|often|_|medicine|lang=en}}). I had used this label to indicate that cacoethic was often, but not always, used in a medical context. CodeCat said "often medicine" made no sense. I pointed out that this sort of label seemed fine for dictionary entries: see, for example, [17]. The matter was resolved by splitting up the medical and non-medical senses, but for future reference I'd appreciate some guidance on whether labels like "often medicine" are appropriate. Smuconlaw (talk) 14:40, 19 October 2015 (UTC)

I'd say in cases like this it would make more sense to say {{lb|en|chiefly|medicine}} to indicate a term is used chiefly but not exclusively in medicine. —Aɴɢʀ (talk) 15:12, 19 October 2015 (UTC)
So constructions like "chiefly medicine" and "often medicine" are acceptable as context labels, even though they are not, of course (as CodeCat pointed out), strictly grammatical? Smuconlaw (talk) 15:52, 19 October 2015 (UTC)
They are perfectly grammatical in the subgrammar of labels. --WikiTiki89 15:55, 19 October 2015 (UTC)
OK, great. Thanks. Smuconlaw (talk) 13:55, 21 October 2015 (UTC)

Boring cleanup work for money

I lost my job in July, that's how I've been able to be more active on Wiktionary in the last few months. Though I'm still looking for other jobs.

My money is almost gone. Can I do boring cleanup work on Wiktionary for money?

I got the idea from Wiktionary:Beer parlour/2012/July#Reward or bounty board, in which someone said "I see no reason why a person doing boring cleanup work should not be paid with money if someone offers that money." and "appropriately clean up Category:Translation table header lacks gloss" is mentioned as one possibility.

My plan, if no one objects:

  1. I set up a Patreon account. (I never did that before, I did my research but I apologize if I understood wrong how it works)
  2. Let's say the goal is specifically: appropriately clean up Category:Translation table header lacks gloss.
  3. I set up a goal of "receiving tips every 100 entries", with some minimum amount ($1?). If other people are willing to help, they use my Patreon page and every 100 entries I receive that amount of money, up to the maximum amount that people choose.

There are probably other types of boring cleanup work to do, I'm open to suggestions. --Daniel Carrero (talk) 12:09, 20 October 2015 (UTC)

I'll take you out for a meal and let you crash on my couch if you run a Spanish verb bot and empty all these categories. --Zo3rWer (talk) 13:49, 20 October 2015 (UTC)
Thanks! :p Running a bot? I could do it, but that seems a bit different from my original idea, I was thinking of working for money by doing some manual labor that no one wants to do, and this is one of those inherently repetitive tasks that a bot could do better, as you mentioned. (other tasks require individual consideration of each entry and thus would be better done by humans) I wonder who was the last bot used to create forms for Spanish verbs. User:TheDaveBot (2006-2013)? --Daniel Carrero (talk) 11:54, 21 October 2015 (UTC)
I could be a gazillionaire if I'd got paid for every edit I ever made. Even if you included the fines I'd have gotten for being blocked. --Zo3rWer (talk) 13:54, 20 October 2015 (UTC)
I'd pay you $1 for every 100 entries cleared out of Category:term cleanup and all its subcategories. —Aɴɢʀ (talk) 10:52, 21 October 2015 (UTC)
Thanks for the idea, sounds good! If no one minds, I think I'll create a Patreon account saying more broadly "$1 every 100 entries for boring cleanup work, to be decided by consensus." so that I could start working on Category:term cleanup now per your idea and the job could be changed if people voted/decided/discussed on something else (since there's Category:Translation table header lacks gloss mentioned above and probably other jobs). --Daniel Carrero (talk) 11:54, 21 October 2015 (UTC)
Does it violate any of Patreon's terms and conditions that what you're doing isn't really art? —Aɴɢʀ (talk) 12:38, 21 October 2015 (UTC)
No, I believe. I've read the Community Guidelines, the legalese Terms of Use and the Help Center. They talk repeatedly about "artists and creators."
Just to be sure, I've sent them an e-mail today.
My name is Daniel Carrero, I'm an editor/administrator at Wiktionary, a dictionary wiki which is a sister project to Wikipedia.
We are all content creators, but often there is content on the wikis that need maintenance and cleanup for quality and standards.
Can I use Patreon to crowdsource specific cleanup and quality work and, like "pledge $1 for every 100 pages cleaned up according to criteria X and Y"? I predict that only other Wiktionary members would be interested in paying for that project.
Thank you,
Daniel Carrero
Also, I've found some specific Patreon projects of creating a wiki:
--Daniel Carrero (talk) 15:36, 21 October 2015 (UTC)
They replied to my e-mail. They've suggested doing a monthly campaign (as opposed to "per creation"). What do other people think? I think "per creation" is still better as measurable progress, I don't mind listing all the entries when I finish the work.
Also: +1 point for Patreon because they did their homework, apparently. They said "Wiktionary entries" and I never said "entries" in my message, so they must know at least a bit about our work and correct terminology.
Hey Daniel,
Thanks for writing in and stoked to hear you're thinking about starting a Patreon campaign for Wiktionary!
I think this is a great idea and totally something that would work well on Patreon. To make it easier on you, I might even recommend doing a monthly campaign (as opposed to "per creation,") so that you're not having to keep a constant count of the pages cleaned up. I think that a lot of people would love to support such a great cause and it would be really cool to share fun new Wiktionary entries with your supporters on your Patreon page.
Happy to answer any other questions that come up as you familiarize yourself more with Patreon, so feel free to shoot me a note as they come up!
All the best,
--Daniel Carrero (talk) 09:30, 22 October 2015 (UTC)
Wikipedia says in Patreon: "In October 2015, the site was the target of a massive hacking attack with almost 15 gigabytes' worth of password data, donation records, and source code taken and published. The breach exposed more than 2.3M unique e-mail addresses and millions of private messages." Is there a safer method? --Panda10 (talk) 13:06, 21 October 2015 (UTC)
I don't know, should I use any other platform listed at Category:Crowdfunding platforms? I believe Kickstarter would work only for huge, expensive unstarted projects, while Patreon should be usable for small tips according to measurable progress done (or per month if the worker chooses that option), like I proposed above.
Here's the links to the 2 websites that Wikipedia uses as sources to that information: [18] and [19].
This [20] is the official statement from Patreon, which I also found quoted in a number of other sites. It says: "There was unauthorized access to registered names, email addresses, posts, and some shipping addresses. Additionally, some billing addresses that were added prior to 2014 were also accessed. We do not store full credit card numbers on our servers and no credit card numbers were compromised. Although accessed, all passwords, social security numbers and tax form information remain safely encrypted with a 2048-bit RSA key. No specific action is required of our users, but as a precaution I recommend that all users update their passwords on Patreon." --Daniel Carrero (talk) 15:36, 21 October 2015 (UTC)
Also: "The unauthorized access was confirmed to have taken place on September 28th via a debug version of our website that was visible to the public. Once we identified this, we shut down the server and moved all of our non-production servers behind our firewall." --Daniel Carrero (talk) 15:41, 21 October 2015 (UTC)
  • I've finished 200 entries. Please see User:Daniel Carrero/term cleanup. --Daniel Carrero (talk) 03:09, 22 October 2015 (UTC)
    Another big problem are plain wiki links to non-English entries, e.g. [[non-english lemma]] (or is this included in your proposal?), but I think this type of cleanup could be semi-automated. Jberkel (talk) 12:56, 3 November 2015 (UTC)
    @Jberkel: That sounds like a good idea! It's not included in my current work (User:Daniel Carrero/term cleanup), but it's something that I could do as a separate project if people want, after I finish the current one.
    Just to be clear -- Surely we want every plain link to be converted to either {{m}} (mentions of words), {{l}} (in synonyms lists, etc.) or other templates, right, no matter if the link is to an English or non-English section? I'd guess that probably most of the 2.1 million "gloss definition" (according to WT:STATS) entries have plain links in one way or another.
    If one or more people are willing to pay for that as a separate cleanup project later, I don't have any problem with editing as many entries as possible for that purpose. But bear in mind that it would be basically revising all the existing entries, (a process that I would try to speed up using CSS to spot plain links quickly or something) so of all the possible options for a future cleanup project, this might be one of the longest ones. --Daniel Carrero (talk) 13:31, 3 November 2015 (UTC)
    WingerBot has gone through all Russian entries and wrapped plain links within {{l}} under all sections except parts of speech and etymology.--DixtosaBOT (talk) 13:46, 3 November 2015 (UTC)
    A possible plan could be like that:
    1. Let bots wrap automatically all links in all entries where it can be done faithfully (synonyms, coordinate terms, derived terms, etc.) Supposedly, bots would be unable to fix etymology, POS sections, usage notes and other sections.
    2. Create some sort of dump listing all the pages that have plain links that bots are unable to fix, so that they could be done manually if people want.
    --Daniel Carrero (talk) 13:52, 3 November 2015 (UTC)
    Yes, I think every plain link should be converted to a template with an explicit link target, if we want the Wiktionary data to be useful and non-ambiguous. Whenever I edit entries I always try to get rid of any [[links]] I come across. Another problem I discovered are relative links ([[#English|foo]], if already on page foo). These are not so much usability problems right now but important if we look at Wiktionary outside of a website / wiki context.
    If it worked well for Russian is there a reason not to use the same bot for other languages? A combination of bot + manual work sounds like a good plan of attack. Jberkel (talk) 16:06, 3 November 2015 (UTC)
    I tend to do the opposite (convert to simple [[links]]) so we may be working against each other here. Equinox 16:44, 4 November 2015 (UTC)
    Why would you do that? Please stop. —CodeCat 16:58, 4 November 2015 (UTC)
    Because it makes it much harder to read and edit the source code. Equinox 15:43, 5 November 2015 (UTC)
    I oppose templated links in definitions. --WikiTiki89 18:57, 4 November 2015 (UTC)
    It's good to know that we can still tell people who use screen readers to fuck off. DTLHS (talk) 19:14, 4 November 2015 (UTC)
    I'm sure the random English word in the middle of an English sentence would be very confusing to a screen reader if it is not marked as English. --WikiTiki89 21:14, 4 November 2015 (UTC)
    Would that mean English links can remain in the normal wikilink style ([[links]]), even so, should all non-English links be templated? Maybe we should create a poll for these questions? --Daniel Carrero (talk) 21:27, 4 November 2015 (UTC)
    In definitions, etymology sections, usage notes, and various other places, we have running English text. If we want to link a word that we happen to use in running English text, then I think plain links are the best choice in order for the wikitext to remain easy to read. But if we were to talk about a word or present an example of text, then we should use a template even if it is in English. The former situation only occurs in English (since this is the English Wiktionary), but the latter situation can occur with any language; therefore, non-English links should always be templatized, since they are always either mentions or examples. --WikiTiki89 23:00, 4 November 2015 (UTC)
    I don't agree – why should English get treated differently (and only sometimes)? As I've already said, it leaves room for ambiguity, especially when there are non-English entries using the same headword. Having two different ways of linking (plain/template) is also confusing editors, which means we have a lot of entries which link to non-English words using plain links (which is definitely the bigger problem and should get fixed first).
    However, by using templates for all links, regardless of language, we minimize the possibility of mistakes and add valuable semantic information. It can be very useful for tools (like the screen readers mentioned above) to know a) that you're linking to a headword (and not a user page, etc.) and b) what language the word you're linking to is in. If this information is not given, these tools need to "guess" it (maybe based on the context or some implied knowledge) and apply arbitrary defaults, which could be wrong; these assumptions could become invalid any time. Sure, it could be an English word, but it could very well be a wrongly tagged Spanish word. If the link has already been marked as English (or any other language) there's no extra guesswork required. And as a nice side-effect all links will be generated by the same code/template which means formatting is automatically consistent and can be tweaked "after the fact". A plain link is just a 'dumb' plain link, nothing can be changed about it.
    About readability: no doubt, wikitext is a mess and will never win a prize for readability, but I think we should worry more about user readability, not editor readability. The advantages outweigh the few extra characters to type.
    Jberkel (talk) 04:32, 5 November 2015 (UTC)
    It's not English that should be treated differently, it's running text that should be treated differently, and all of our running text happens to be English because this is the English Wiktionary. --WikiTiki89 15:42, 5 November 2015 (UTC)
    What exactly is the advantage of using templated links to people using screen readers? I never used a screen reader before, would it change the voice/accent according to each language when encountering sentences like "The word bread in Japanese is パン, from Portuguese pão."?
    Apart from that, some advantages I know of using templated links are: proper formatting/scripting, both the standard formatting that we all see (as in MediaWiki:Common.css) and the user-side formatting (as in Special:MyPage/common.css).
    Also, when you use plain links without language ([[example]]), it doesn't point to the correct section, plus the "orange links" gadget can't work for that reason. I remember sometimes seeing horrible plain links with languages back in the day ([[example#English|example]]), but I guess nobody does that anymore, it would be too much work to type. --Daniel Carrero (talk) 19:36, 4 November 2015 (UTC)
    Most Anagrams sections, which are bot-generated, still use the [[example#English|example]] format. I always convert those to {{l|en|example}} when I see them, but normal links like [[example]] I leave alone for English words as I don't see what's wrong with them. (I do change them for other languages, though.) —Aɴɢʀ (talk) 19:57, 4 November 2015 (UTC)
    Apart from Anagrams sections, I was thinking of older examples like this: here's a 2010 version of the entry pizza with some links in the format of # [[#English|pizza]]. --Daniel Carrero (talk) 05:20, 5 November 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── What I did with WingerBot for Russian is:

  • Only convert links in Russian sections.
  • Only convert links in lines beginning with *.
  • Only convert links containing no Latin text (I should probably fix this further to check specifically for Cyrillic text or non-alphabetic text).
  • I skip Etymology and Pronunciation sections. In Usage Notes sections, the links become {{m|...}}, otherwise {{l|...}}. This is a sort of compromise; probably I should either skip Usage Notes or go ahead and process Etymology and Pronunciation.
  • I skip links nested inside of templates and tables.
  • When converting links, if the link is of the form [[A|B]], then it becomes {{l|ru|B}} if A is the non-accented equivalent of B, otherwise it becomes {{l|ru|A|B}}. Links of the form [[A#Russian|B]] have the #Russian part removed in the process.
  • Currently the set of pages processed is those in the categories Category:Russian lemmas and Category:Russian non-lemma forms.

This could probably be automated for other languages. It's not clear to me that we need to convert plain English-language links, but definitely it should be done for non-English links esp. those in foreign scripts. Benwing2 (talk) 00:33, 5 November 2015 (UTC)
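For readers who want to see the shape of the rules listed above, here is a minimal Python sketch. It is not WingerBot's actual code: the function names are hypothetical, the stress handling assumes that the accented form differs from the page name only by a combining acute or grave accent, and the sketch does not try to detect sections, templates or tables, so a real bot would need those checks on top.

```python
import re
import unicodedata

# Matches [[target]] and [[target|display]]; nested brackets are not handled.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]|]+))?\]\]")
LATIN_RE = re.compile(r"[A-Za-z]")


def remove_stress(text):
    """Drop combining acute/grave accents (assumed to be the stress marks)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = decomposed.replace("\u0301", "").replace("\u0300", "")
    return unicodedata.normalize("NFC", stripped)


def convert_link(match, lang="ru"):
    """Turn one plain link into an {{l}} template, or leave it unchanged."""
    target, display = match.group(1), match.group(2)
    # [[A#Russian|B]] loses its section anchor.
    if target.endswith("#Russian"):
        target = target[: -len("#Russian")]
    # Leave empty targets and anything containing Latin letters alone.
    if not target or LATIN_RE.search(target + (display or "")):
        return match.group(0)
    if display is None or display == target:
        return "{{l|" + lang + "|" + target + "}}"
    if remove_stress(display) == target:
        # The display form is just the accented page name.
        return "{{l|" + lang + "|" + display + "}}"
    return "{{l|" + lang + "|" + target + "|" + display + "}}"


def convert_line(line):
    """Only lines beginning with * (list items) are converted."""
    if not line.startswith("*"):
        return line
    return LINK_RE.sub(convert_link, line)
```

Definition lines (beginning with #) and links with Latin text pass through untouched, which mirrors the conservative choices described in the list.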

I believe a bot would definitely work for lists of links in synonyms/related terms, etc!
But I'd like to talk about: "probably I should either skip Usage Notes or go ahead and process Etymology and Pronunciation"
Correct me if I'm wrong, but I believe the bot would be unable to accurately edit all the links in etymology sections. What about sentences like this?
  • [[A]] is the [[gerund]] of [[B]]
If A and B are terms in, say, Old Spanish, then the final result would ideally be one of the 2 options below, with the correct language codes; we'd use Template:l for simple links and Template:m for mentions:
  • {{m|osp|A}} is the [[gerund]] of {{m|osp|B}} (plain link English word)
  • {{m|osp|A}} is the {{l|en|gerund}} of {{m|osp|B}} (templated English word) -- personally, I prefer this one!
If a bot can or could do the whole work, then it would defeat the point of me manually converting all the uses of {{term|ABC}} into {{m|xx|ABC}}, to be honest! :-) At Wiktionary:Beer parlour/2013/April#Template term and lang parameter the possibility of adding language codes to {{term}} through a bot run has been discussed. According to that discussion, User:CodeCat already did part of the job, which could be done reliably by bot: she used the bot to replace {{etyl|xx|yy}} {{term|word}} with {{etyl|xx|yy}} {{term|word|lang=xx}}.
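As an illustration of why that particular replacement is safe for a bot, here is a rough Python sketch of the substitution described above. It is an assumption about how such a run could look, not CodeCat's actual bot code; the function name is hypothetical, and it only handles the case where {{term}} directly follows {{etyl}} after whitespace and has no nested templates.

```python
import re

# {{etyl|SOURCE|DEST}} followed by whitespace and a {{term|...}} call.
ETYL_TERM_RE = re.compile(
    r"(\{\{etyl\|(?P<src>[^|{}]+)\|[^|{}]+\}\}\s+)\{\{term\|(?P<args>[^{}]*)\}\}"
)


def add_lang_to_term(wikitext):
    """Copy the source language code of {{etyl}} into the following {{term}}."""

    def repl(match):
        args = match.group("args")
        if "lang=" in args:
            # Already tagged; leave the text exactly as it was.
            return match.group(0)
        return (
            match.group(1)
            + "{{term|" + args + "|lang=" + match.group("src") + "}}"
        )

    return ETYL_TERM_RE.sub(repl, wikitext)
```

The language code comes straight from the neighbouring template, so no guessing is involved; a bare {{term|ABC}} elsewhere in the text gives the bot nothing to copy from, which is where manual work comes in.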
Another idea:
After User:Daniel Carrero/term cleanup finishes, I can start editing Category:Entries with non-standard headers as a separate paid project -- i.e., manually converting "Initialism", "Abbreviation" sections into "Noun", "Proper noun", etc. in all entries. Would people want that?
I'd just ask to keep the rate I suggested initially of $1 every 100 entries. 8,457 entries = US$84.57. I don't mind if it's, say, 1 person paying the whole amount or 2 people paying $42.28 each. This helps me pay my bills. Thank you. :) --Daniel Carrero (talk) 06:16, 5 November 2015 (UTC)
Well, it only works for Russian because Russian and English have different character sets. It's much harder for languages like Old Spanish, as you point out; we'd have to skip Etymology and Usage notes and Pronunciation, no way around it. For such languages I might also want to restrict further the links that get templated to be only those in a line beginning with * that look like they're part of a list (using appropriate regexes and such to determine this). Benwing2 (talk) 09:50, 5 November 2015 (UTC)
@Benwing2: I understand. Do you think that what you have done for Russian can be done for many languages with different character sets? Maybe Hebrew, Arabic, Greek, Chinese, Gothic, Armenian, Korean, Georgian, etc.? That would sound like a good plan, if that's possible. --Daniel Carrero (talk) 02:04, 6 November 2015 (UTC)
@Daniel Carrero: I can run it on other languages. The main thing I'd need is a regexp that specifies characters within the given character sets. It would look something like %AW-XY-Z where %A gets all non-letter characters and W, X, Y and Z represent the endpoints of the Unicode ranges that contain the appropriate character sets for each language. If you could help construct these ranges it would make it a lot easier to run the bot. You might be able to just snarf the character set ranges from Module:scripts/data, with a bit of checking to make sure they're reasonable. Benwing2 (talk) 00:05, 7 November 2015 (UTC)
@Benwing2: Why do you need to get all non-letter characters separately? If you find a foreign equivalent for a hyphen, a comma or an apostrophe or something else, does it change how the bot must work?
I'm going to try getting codepoints for the first few scripts now. Is this reasonable? I added the "X script languages" because I figured this would help knowing which languages your bot could edit that use each script.
--Daniel Carrero (talk) 02:38, 9 November 2015 (UTC)
Thanks! I use a regexp that gets the correct script and also includes non-letter characters so it will still catch terms that have accents, macrons, hyphens, etc., but excludes letter characters from other scripts. I'll also have it print out warnings if it finds terms that it excluded but which have non-Latin characters in them, to make sure it's not excluding too much. I'm going to start on Armenian, we'll see how it works. Benwing2 (talk) 03:17, 9 November 2015 (UTC)
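A small Python sketch of the character test described here, for anyone who wants to try ranges out before a bot run. Python's built-in re module has no class for "any non-letter", so this version checks characters one by one instead of using a single regexp. The ranges shown are the main Unicode blocks for Armenian and Greek and are only illustrative; the real ranges would come from Module:scripts/data, and all function names are hypothetical.

```python
import unicodedata

# Illustrative ranges only (main Unicode blocks); not taken from Module:scripts/data.
SCRIPT_RANGES = {
    "Armn": [(0x0530, 0x058F)],
    "Grek": [(0x0370, 0x03FF), (0x1F00, 0x1FFF)],
}


def in_script(ch, script):
    return any(lo <= ord(ch) <= hi for lo, hi in SCRIPT_RANGES[script])


def is_letter(ch):
    # Unicode general categories Lu, Ll, Lt, Lm and Lo all start with "L".
    return unicodedata.category(ch).startswith("L")


def is_latin(ch):
    return "LATIN" in unicodedata.name(ch, "")


def link_is_convertible(text, script):
    """True if every character is in the script or is not a letter at all
    (hyphens, spaces, combining accents, macrons and so on)."""
    return all(in_script(ch, script) or not is_letter(ch) for ch in text)


def needs_warning(text, script):
    """True for excluded links that contain non-Latin letters of another script."""
    if link_is_convertible(text, script):
        return False
    return any(
        is_letter(ch) and not in_script(ch, script) and not is_latin(ch)
        for ch in text
    )
```

Plain English links are skipped silently, while a link mixing Greek with, say, Cyrillic letters is skipped with a warning so that it can be checked by hand.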
I have already done that for Georgian a long time ago. --DixtosaBOT (talk) 06:13, 9 November 2015 (UTC)
@Daniel Carrero I ran this for Armenian, Greek and Ancient Greek. It required both the list of characters in each script and the entry-conversion regexps in Module:languages/data2 and such. I also had it look for transliteration in parentheses after the link and try to eliminate it (or incorporate into the link, if the language isn't a translit-overriding language). To determine whether something in parens is a transliteration, it transliterates the link and then computes the edit distance (Levenshtein distance) between the auto-generated translit and the explicit translit, and if it's small enough (depending on the length of the words in question), it's accepted, although there are additional checks. When those various checks fail, there's a warning issued. If you want to do a good deed, check the warnings listed in User:Benwing2/fix-links-grc-warnings and fix up the entries needing fixing. There are about 200 of them, and many of them can be ignored. You especially want to check the warnings that mention "Levenshtein distance ... not treating X as transliteration of Y" or "Upper/lower mismatch between explicit X and auto Y", where X is what's found in parens and Y is the automatic transliteration of the link (the Levenshtein distance warning is slightly misworded). For example, the warning "WARNING: Levenshtein distance 15 too big for length 6, not treating Arktos, “Ursa Major” as transliteration of Árktos" means that it found something like [[Ἄρκτος]] (Arktos, “Ursa Major”) and determined that the stuff in parens couldn't be a translit of the link; in this case, the translit should be removed and the gloss incorporated into the link. There are also warnings of the sort "Link contains non-Latin characters not in proper charset", which are links in various non-Greek charsets that could be converted to templated links in the proper language, although a few appear to be Greek script and must contain some non-Greek character in them, which could be fixed. 
Benwing2 (talk) 10:18, 11 November 2015 (UTC)
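As a rough illustration of the transliteration test described above, here is a Python 3 sketch. The `levenshtein` function is the standard dynamic-programming edit distance; `looks_like_translit` and its `max_ratio` threshold are hypothetical stand-ins, since the post does not state the actual cutoff or the additional checks.

```python
def levenshtein(a, b):
    # Classic edit distance: insertions, deletions and substitutions cost 1.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def looks_like_translit(explicit, auto, max_ratio=0.5):
    # Hypothetical rule: accept when the distance is small relative to the
    # length of the automatic transliteration.
    return levenshtein(explicit, auto) <= max_ratio * len(auto)

auto = "\u00c1rktos"                               # automatic translit of the link
explicit = "Arktos, \u201cUrsa Major\u201d"        # text found in parentheses
print(levenshtein(explicit, auto), len(auto))
print(looks_like_translit("Arktos", auto))
print(looks_like_translit(explicit, auto))
```

With the example from the warning, the distance comes out as 15 against a length of 6, so the parenthesized text is rejected as a transliteration, while a bare "Arktos" is accepted under the assumed threshold.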
BTW for Ancient Greek it was a bit tricky, or at least I had to use modern Greek (code el) in the Descendants section of Ancient Greek entries, since they share the same charset. Benwing2 (talk) 10:46, 11 November 2015 (UTC)
I've seen a few of the recent contributions of User:WingerBot, they look good!
OK, you've been using information from Module:languages/data2, but I assume you still need the codepoints I'm looking up for you, right?
I'll look at User:Benwing2/fix-links-grc-warnings with more attention later. For the moment, I'll leave a few more codepoints here for you. Some of the starting or ending codepoints are combining forms, is that a problem? I can get codepoints ignoring combining forms if you want.
--Daniel Carrero (talk) 10:55, 11 November 2015 (UTC)

Thanks. I did Tamil, Telugu, Oriya, Punjabi, Gujarati, and I'm currently doing Hindi and Hebrew. The combining codepoints are OK for Python but cause problems in vim syntax highlighting, so I rewrote them using \u escapes. The language-specific info comes from a combination of Module:scripts/data, Module:languages/data2 (or data3/...), and Module:links (override_translit). The actual language-specific info looks like this:


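import re

def ar_remove_accents(text):
  # Normalize alif wasla (U+0671) to plain alif (U+0627)
  text = re.sub(u"\u0671", u"\u0627", text)
  # Strip vowel marks (U+064B-U+0652), dagger alif (U+0670) and tatweel (U+0640)
  text = re.sub(u"[\u064B-\u0652\u0670\u0640]", "", text)
  return text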

# Each element is full language name, function to remove accents to normalize
# an entry, character set range(s), and whether to ignore translit (info
# from [[Module:links]], or "notranslit" if the language doesn't do
# auto-translit)
languages = {
    'ru':["Russian", ru.remove_accents, u"Ѐ-џҊ-ԧꚀ-ꚗ", False],
    'hy':["Armenian", hy_remove_accents, u"Ա-֏ﬓ-ﬗ", True],
    'el':["Greek", lambda x:x, u"Ͱ-Ͽ", True],
    'grc':["Ancient Greek", grc_remove_accents, u"ἀ-῾Ͱ-Ͽ", True],
    'hi':["Hindi", lambda x:x, u"\u0900-\u097F\uA8E0-\uA8FD", False],
    'ta':["Tamil", lambda x:x, u"\u0B82-\u0BFA", True],
    'te':["Telugu", lambda x:x, u"\u0C00-\u0C7F", True],
    'gu':["Gujarati", lambda x:x, u"\u0A81-\u0AF9", "notranslit"],
    'or':["Oriya", lambda x:x, u"\u0B01-\u0B77", "notranslit"],
    'pa':["Punjabi", lambda x:x, u"\u0A01-\u0A75", "notranslit"],
    'he':["Hebrew", he_remove_accents, u"\u0590-\u05FF\uFB1D-\uFB4F", "notranslit"],
    'ar':["Arabic", ar_remove_accents, u"؀-ۿݐ-ݿࢠ-ࣿﭐ-﷽ﹰ-ﻼ", False],
}

It doesn't take too much work to find this info. However, it takes more effort to go through the warnings, so if you have time that's definitely something that would help. Here are warnings for modern Greek (98 of them) and from five Indian languages, not including Hindi (20 of them in total):

Again, not all of these warnings need to be fixed but they could use a once-over. Benwing2 (talk) 04:00, 12 November 2015 (UTC)
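To make the normalization step concrete, here is the `ar_remove_accents` function from the excerpt above as a self-contained Python 3 snippet, applied to a vocalized word. The sample word is chosen here purely for illustration and is not taken from the bot's data.

```python
import re

def ar_remove_accents(text):
    # Replace alif wasla (U+0671) with plain alif (U+0627).
    text = re.sub("\u0671", "\u0627", text)
    # Drop vowel marks U+064B to U+0652, dagger alif U+0670, tatweel U+0640.
    text = re.sub("[\u064B-\u0652\u0670\u0640]", "", text)
    return text

# kaf + kasra + ta + fatha + alif + ba (a vocalized spelling)
vocalized = "\u0643\u0650\u062A\u064E\u0627\u0628"
# kaf + ta + alif + ba (the unvocalized form used as a page name)
plain = "\u0643\u062A\u0627\u0628"
print(ar_remove_accents(vocalized) == plain)
```

The same function leaves already unvocalized text unchanged, so it can be applied to every link without first checking whether marks are present.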

Here are the Hindi warnings (20 of them):

Benwing2 (talk) 04:07, 12 November 2015 (UTC)

Here are the Hebrew warnings (49 of them):

Benwing2 (talk) 04:19, 12 November 2015 (UTC)

Code for Westrobothnian

Discussion moved from Wiktionary talk:Beer parlour#Code for Westrobothnian.

Apparently Westrobothnian is still considered a dialect of Swedish here, even though it's linguistically impossible. As I understand, the reason is that there is no language code for Westrobothnian here yet. On Swedish Wiktionary, we use gmq-bot. Languages like Jamtish and Scanian obviously also need their respective language codes, but at this time I'm only asking for Westrobothnian in order to stop people from adding it under Swedish (I kind of hurfes when I think of people doing this; it's like watching someone paint a norseman with a horned helmet, or hearing someone argue that we only use 10% of our brain). — Knyȝt (talk) 13:01, 20 October 2015 (UTC)

I think the word you're looking for is shudder.​—msh210 (talk) 17:13, 22 October 2015 (UTC)
I have no objection to a new code for Westrobothnian, I only object to the removal of Westrobothnian forms from ==Swedish== entries as long as there's no place to move them to. We may as well discuss the other "Swedish dialects" that could be considered separate languages. In addition to Westrobothnian, they seem to be:
  1. Dalecarlian (including Elfdalian), which had the ISO-639 code dlc until 2009 or so
  2. Jamtlandic, which had the ISO-639 code jmk until 2009 or so
  3. Scanian, which had the ISO-639 code scy until 2009 or so
  4. Gutnish, which (like Westrobothnian) has never had an ISO-639 code.
The ISO-639 codes were apparently removed at the request of the government of Sweden, which didn't like the idea of their not being dialects of Swedish; the decision was thus more political than linguistic. At the moment, Scanian seems to be the only one we accommodate at all: Category:Regional Swedish includes subcategories only for Finland Swedish, Scanian Swedish, and Swedish Swedish. —Aɴɢʀ (talk) 14:10, 20 October 2015 (UTC)
I think we should use the old codes. What the government of Sweden thinks isn't relevant to linguistic interests. —CodeCat 14:27, 20 October 2015 (UTC)
Of the five under discussion, only three have old codes. We can use those, but we should probably prefix them with gmq-. The other two will need codes of their own. Westrobothnian may as well use gmq-bot, as it does at sv-wikt. If Gutnish doesn't already have a code at sv-wikt, gmq-gut will work. —Aɴɢʀ (talk) 14:50, 20 October 2015 (UTC)
Well, what do you know, we already have dlc and gmq-gut, so it's just a matter of the other three. —Aɴɢʀ (talk) 18:11, 20 October 2015 (UTC)
Would an administrator please implement the codes gmq-jmk Jamtish, gmq-scy Scanian and gmq-bot Westrobothnian in Module:languages/datax? Thank you in advance -- 15:44, 4 November 2015 (UTC)
I oppose these codes, they should be jmk and scy, since those are the ISO codes for these languages. —CodeCat 16:03, 4 November 2015 (UTC)
That is fine - there is just a need for some codes so as to avoid inserting simple links [[term]] in the Descendants sections of the Proto-Germanic entries. -- 16:08, 4 November 2015 (UTC)
I oppose those codes; those aren't the ISO codes for these languages, those were the codes. They are no longer valid to use for new works. gmq-jmk, gmq-scy and gmq-bot are more consistent with the ISO standards.--Prosfilaes (talk) 23:29, 10 November 2015 (UTC)
We use sh, don't we? —CodeCat 00:13, 11 November 2015 (UTC)
And we should probably use hbs instead. I'm more interested in us doing the right thing going forward than fighting that battle, though. "scy" and friends are not found on the official list of ISO 639-3 codes.--Prosfilaes (talk) 01:28, 11 November 2015 (UTC)

Unified multilingual Wiktionary

Wiktionary:Project – Unified Wiktionary outreach and Wiktionary:Project – OmegaWiki are no longer active. OmegaWiki aims to make a multilingual dictionary describing all words of all languages with definitions in all languages. However, without further progress at m:OmegaWiki, I have thought of a possibly simpler way to make a unified dictionary right here, possibly moving to wiktionary.org in the future:

  1. Automatically translating many headings from Wiktionary:Entry_layout#Additional headings (English, pronunciation, nouns, etc.) into other languages according to users' preferences, as on Wikimedia Commons, would encourage merging smaller Wiktionaries in other languages into this largest site.
  2. The drawback of keeping too many Wiktionaries in many languages is that much material, like pronunciation, synonyms, antonyms, derived terms, related terms, etc., is repeated. A unified Wiktionary would eliminate the duplication.
  3. Smaller Wiktionaries lack a community sizable enough to justify bureaucrats and possibly administrators. Merging them here would allow all content to be administered much more efficiently.
  4. Considering Wiktionary:Entry_layout#Variations for languages other than English, only when the Wiktionary in a given language wants to merge here should we translate its entries to English and then to third languages. For example, Chinese entries here should not yet be translated to third languages until the Chinese Wiktionary is going to merge here, which is what I dream of, because it has too few active users and too many entries of poor quality.
  5. As OmegaWiki already has definitions of any word in many languages, a unified Wiktionary would also give etymology, usage notes, references, etc. in many languages.

I propose inviting other Wiktionaries to merge optionally once we are ready here, based on my proposals above. Any useful comments are welcome.--Jusjih (talk) 00:29, 22 October 2015 (UTC)

Data unification can be very beneficial, but we must be careful when deciding what can and what can’t be merged. Definition- or translation-based mergers (the sort that the Wikidata folk want to force upon us) are an outright horrible idea because they ignore the existence of anisomorphism, and the fact that language is not mathematics.
I think the least controversial content mergers would be pronunciations and inflection tables. — Ungoliant (falai) 01:06, 22 October 2015 (UTC)
Wikimedia Incubator has Translingual Wiktionary as a stub project with a few pages and unclear format/purpose, so if there's any effort directed to making a multilingual Wiktionary, I suggest using that. I added some Portuguese verbs there myself back in 2011.
Ungoliant is right about the problems he mentioned, though. --Daniel Carrero (talk) 09:50, 22 October 2015 (UTC)
A very obvious problem is the decision of what to treat as a language. Other Wiktionaries may not accommodate reconstructed languages for example. Lithuanian Wiktionarians may object to having Proto-Balto-Slavic. And I doubt the Croatian Wiktionary will want to merge "their" language into Serbo-Croatian for our sake. Wiktionaries may also differ in the classification of languages. Even pronunciation details may differ; consider how divided our own Wiktionary is on /a/ vs /æ/ for British English. —CodeCat 14:40, 22 October 2015 (UTC)
Thanks so much for all of your valuable comments. Wikimedia Commons already uses {{int:summary}}, {{int:support}}, {{int:oppose}}, etc. to automatically translate common phrases to many languages, and this is needed to go multilingual here. We should try a pilot program to phase in auto-translation as on Commons. For example, ks:ठूल is really Kashmiri-English, so once we auto-translate important layouts, we may be able to bring the contents of many smaller Wiktionaries here. As many minor languages lack their own Wiktionaries and some have been closed out, their users may want to come here to translate their words, compounds, and phrases to English.--Jusjih (talk) 16:54, 22 October 2015 (UTC)

Anyway, it's impossible: in a common project, common decisions must be taken, and there must be a common language for discussions. Discussions are in a different language for each wiktionary, and this principle cannot be changed: I'm not sure that you would be willing to close en.wikt in order to merge data with fr.wikt, and to change your discussion language to French. Wiktionaries are a success, Omegawiki is a failure (7 contributions a day: this is the average for the last 7 days), and there are good reasons for this situation. Trying to adopt Omegawiki principles would not be beneficial at all, it would kill projects adopting them. I am convinced that the best way to share data is through bots (bots importing data when possible, bots checking data consistency when possible, bots providing lists of words or list of translations, etc.) Lmaltier (talk) 20:49, 22 October 2015 (UTC)
Of course, anybody can contribute to any wiktionary, not only in one's native language. This is already the case. And it would be beneficial to provide translation tables in all entries (except inflected forms), not only in English word entries. But this does not change the principle: one project for each discussion language. Nobody would propose a single unified Wikipedia. It's the same for wiktionaries. Lmaltier (talk) 21:00, 22 October 2015 (UTC)

I do not mean closing out English Wiktionary to go multilingual. I just suggest trying limited auto-translation here, as on Wikimedia Commons, to test merging minor languages in, while keeping major languages separate, as many Wiktionaries in minor languages have been closed out. If there is no consensus to go very multilingual, we need more global bots to coordinate Wiktionaries in different languages, maybe in connection with Wikidata.--Jusjih (talk) 00:59, 23 October 2015 (UTC)
Well, your title above is Unified multilingual Wiktionary... If the main idea is automatic translation of Wiktionary pages, tools already exist for major languages. For other languages, anyway, human translation of definitions, etc. would be needed, and there is no reason why there would be more contributors on a site in a foreign language than on a site in one's own language... Lmaltier (talk)
As files from Wikimedia Commons may be transcluded on other wikis with local descriptions possible, I am thinking of keeping language subdomains for language-dependent things, like categories, so even should the unified multilingual Wiktionary be approved through Meta request, only main articles will likely be imported there for internationalization of entry layouts, etc. for future transclusion on language subdomains. I will open further discussion on Meta later. Thanks.--Jusjih (talk) 00:44, 24 October 2015 (UTC)

Verbs that introduce a subordinate clause

Verb senses like “I think [that] she is here”, “I saw everyone go away”. Is there a grammatical term for them? — Ungoliant (falai) 19:09, 22 October 2015 (UTC)

Your two examples are completely different. "Think" is simply a transitive verb and the subordinate clause functions as a noun phrase: What do you think? I think thoughts. What are your thoughts? My thoughts are that she is here. "Saw" is a sensory verb, and it seems that sensory verbs have a special case where their direct objects can be modified by a bare infinitive in addition to a participle: I saw everyone. Everyone was going away. I saw everyone going away. I saw everyone go away. Everyone was go away. It does seem odd. I would like to know the reason behind this. --WikiTiki89 19:42, 22 October 2015 (UTC)
I'm unaware of a special term for verbs that can take clauses as their complement, but the difference between your two examples is the kind of clause they take. "Think" is followed by a noun clause, "(that) she is here", which (when that is removed) is by itself a complete sentence. "Saw", however, is followed by a small clause, "everyone go away", which is not a complete sentence and cannot be introduced by that. (You can tell it's not a complete sentence because everyone takes a singular verb, "everyone goes away", but in this case go is in the bare stem form.) When think means "hold an opinion" rather than "believe something to be the case", it can take either a noun clause or a small clause: I think that she is pretty or I think her pretty. —Aɴɢʀ (talk) 20:21, 22 October 2015 (UTC)
Thank you both. I guess I’ll continue to label them transitive. — Ungoliant (falai) 20:29, 22 October 2015 (UTC)
There is a category of verbs called "reporting verbs": Category:English reporting verbs, [21]. - -sche (discuss) 03:32, 23 October 2015 (UTC)
Yes, though neither "think" nor "see" is a reporting verb. —Aɴɢʀ (talk) 12:20, 23 October 2015 (UTC)
But it doesn't seem that reporting verbs are all that special grammatically. It seems more of a category that writers use to avoid saying "he said". --WikiTiki89 15:31, 23 October 2015 (UTC)
"See" and "think" are labelled as reporting verbs by the site I linked to. "Think", at least, seems to be able to report things about as well as "order", which our own category gives as a reporting verb:
John shouted "leave!" — [report:] (I left because) he ordered me to leave.
John told me I should leave. — [report:] (I left because) he thought I should leave.
The category does seem to be rather amorphous, as Wikitiki notes. - -sche (discuss) 20:11, 24 October 2015 (UTC)

Vote on disallowing extending of votes - 7 days remaining

FYI, you can still vote at Wiktionary:Votes/pl-2015-07/Disallowing extending of votes.

Current results:

  • Support: 8 - 66%
  • Oppose: 4 - 33%
  • Abstain: 2 - N/A

--Dan Polansky (talk) 09:24, 24 October 2015 (UTC)

Is this vote going to be extended? :) —CodeCat 13:42, 24 October 2015 (UTC)
<sarcasm>With 66% support, this vote could use more time to build consensus. Definitely extend.</sarcasm> --Daniel Carrero (talk) 13:46, 24 October 2015 (UTC)
From the vote: "Duration note: The vote is set for three months, and is not expected to be extended, to prevent discussions about circularity or recursiveness of the vote." --Dan Polansky (talk) 14:21, 24 October 2015 (UTC)

Please see - capital letter discussion

Please give your opinion on this discussion above, in which I proposed moving an appendix (already formatted as an entry) to the main namespace, to make it searchable. Current "results":

  • 2 support (me and Andrew Sheedy)
  • 0 oppose
  • 0 abstain

--Daniel Carrero (talk) 11:00, 24 October 2015 (UTC)

The principle for which this case would be a precedent is the placement in mainspace of material that can be presented in the form of an entry even though it does not share the common characteristics of dictionary entries, appearing typically in a style manual, usage guide, or grammar.
One conceptual difficulty is that the page, proposed to bear the headword "Unsupported titles/Capital letter, with the displayed title as [capital letter]", does not have the same relationship to the content as a normal headword does to a normal entry.
One suggested advantage is that the page would be included in searches. I can't quite imagine how a normal user, i.e., one not an active participant in discussions such as this, would ever enter terms that would find the article and put it on the first search page, let alone near the top.
I propose the following alternative. As we already have an entry for capital letter (to which I have added a "See also" link to the Appendix), every L2 section of every one of our entries for capital letters should have such a link, preferably to the L2-like section of Appendix:Capital letter. DCDuring TALK 12:11, 24 October 2015 (UTC)
Just a note: I'm going to paste here my original rationale about moving the appendix to the entry namespace, assuming that the discussion should continue here:
Not only is the whole page formatted like an entry; if we assume that entries like A, B, C, etc. should have senses like "found in the beginning of proper nouns", "found in the beginning of sentences", "found in the beginning of taxonomic names", etc., then the page Appendix:Capital letter removes the need for creating those definitions in every single letter. Think of it as a merger of all the entries for capital letters because they would have repeated information otherwise. The idea of "capital letter" is something of lexical significance, and completely able to be checked for attestations just like a normal entry. Also IMHO it is more important than the entry ] [.
Re DCDuring: I take your points about the headword and also about the difficulty of this page appearing at the top of search results. I like your idea of linking from every entry of every capital letter to the appendix. --Daniel Carrero (talk) 12:23, 24 October 2015 (UTC)
To save space, I think we should link the letter entries from the headword line, as opposed to linking them from the see also section. See all the letter sections in the entry B. I edited a few templates to make the headword lines of capital letters link to the appendix, at least when they use separate templates like {{en-letter}} and not {{head|en|letter}} (changing that would require me to edit {{head}}). Feel free to discuss. --Daniel Carrero (talk) 12:37, 24 October 2015 (UTC)

Incomplete etymologies

Quite often, there are words where I can easily tell where the word eventually came from, but it's much harder to determine how it made its way into the language. For example, Northern Sami anánas originates from the same source as similar words in most other European languages, but it could have been borrowed through Norwegian, Swedish or Finnish (the three languages that most Northern Sami loanwords come from) and I can't tell which one. návli is from Germanic, but did it come from Norwegian or Swedish, from Old Norse, or straight from Proto-Germanic?

Sometimes, etymologies are doomed to be incomplete because there just isn't enough information. I generally just put in what I can figure out myself. But I think it would be useful if there was a way to tag an entry with an "incomplete etymology" tag of some sort. Currently I've used {{rfe}} for this, but I don't think that's really correct when there is some etymology, just not enough to really explain the origin of the word in the necessary detail. Any thoughts on this? —CodeCat 19:54, 24 October 2015 (UTC)

{{etystub}} is supposed to be for exactly that purpose. The case of anánas is different, in my opinion; we may never know the answer, and it would be best simply to list all three likely possibilities and give the ultimate etymon. —Μετάknowledgediscuss/deeds 20:12, 24 October 2015 (UTC)
{{etystub}} is a little bulky for the job. Something like {{rfelite}} would be better, but an expression of incompleteness like "ultimately from Proto-Germanic or a more recent North Germanic language" and placement in a category such as Category:Incomplete Northern Sami etymologies would seem to do the job. DCDuring TALK 00:37, 25 October 2015 (UTC)
I rephrased {{etystub}} a bit. —CodeCat 00:48, 25 October 2015 (UTC)
You sell yourself short. You've completely changed the visual appearance of the template. DCDuring TALK 00:54, 25 October 2015 (UTC)
I was expecting a change from a table to bare text or vice versa when I read that, but it's just plain text (which I think is good). I agree with CodeCat's edit summary, the wording is now more in line with what I'd expect from a "stub" template. - -sche (discuss) 05:37, 25 October 2015 (UTC)

Zipser German

I have words to add from Zipser German, a Central German lect which developed as such in the 1300s in Slovakia (where it is still spoken in Hopgarten as Outzäpsersch i.e. Altzipserisch) before being carried to Franzenthal, Wassert(h)al, elsewhere in northern Romania, and Bukovina, where it was over time increasingly influenced by Upper Austrian. You can see a sample at User:-sche/Zipser. I would like to give it the code gmw-zps and treat all of the dialects under the one code in accordance with the literature on the subject, which speaks of it as one language. - -sche (discuss) 06:03, 25 October 2015 (UTC)

Done --Lo Ximiendo (talk) 18:46, 25 October 2015 (UTC)

Restoring WT:ELE

I request that WT:ELE (Wiktionary:Entry layout) is restored to the state of 13 October 2015. The subsequent changes seem rather substantial to me, and require a vote, IMHO. For editing with abandon, there is Wiktionary:Entry layout/Editable. Thank you. --Dan Polansky (talk) 13:59, 25 October 2015 (UTC)

Which parts of the new version do you disagree with? —CodeCat 14:01, 25 October 2015 (UTC)
I edited the WT:EL. See the history and the specific diff from the date that Dan Polansky mentioned.
I tried editing with the current consensus in mind, i.e., I believe the new version best reflects our current practices.
That said, I am aware that the policy box says "It should not be modified without discussion and consensus. Any substantial or contested changes require a VOTE." That was substantial indeed, and now contested by Dan Polansky. --Daniel Carrero (talk) 14:16, 25 October 2015 (UTC)
The point is that the change is substantial. Per "substantial or contested" in "Any substantial or contested changes require a VOTE" (on the top of the page), it suffices that it is substantial; it does not even need to be contested. I really don't see what there is to discuss, unless I have woken up in some Orwellian world, again. --Dan Polansky (talk) 14:22, 25 October 2015 (UTC)
That's OK. I am going to restore the 2 policies to the point you requested and create a vote for them. --Daniel Carrero (talk) 14:24, 25 October 2015 (UTC)


I'd like to know if other people disagree with any of the changes/proposals. Thank you. --Daniel Carrero (talk) 15:24, 25 October 2015 (UTC)

Restoring WT:CFI

I request that WT:CFI is restored to the state from 5 September 2015. For free editing, there is Wiktionary:Criteria for inclusion/Editable. Thank you. --Dan Polansky (talk) 14:04, 25 October 2015 (UTC)

I restored CFI too per your request and I am going to create a vote for it, too. Though, FWIW, I'd like to say that while the changes to EL were going to be very substantial, I don't consider the changes to CFI to be substantial. Diffs:
  1. Wiktionary:Criteria for inclusion: 35064944 (diff)
  2. Wiktionary:Entry layout: 35064941 (diff)
--Daniel Carrero (talk) 14:40, 25 October 2015 (UTC)
The CFI change may be less substantial but is bad, and I am going to oppose it. It introduces phrasing "It has been voted" and it introduces rationales in "One reason for having separate pages ...". That is bad for a policy page, IMHO, and AFAIK some people agree with me in this regard. A policy page should state its shoulds AKA regulations and that's it. It should not state "It has been voted on"; we have references to votes for that. And rationales should be in the votes that lead to the policy, not in the policy itself, IMHO. --Dan Polansky (talk) 15:29, 25 October 2015 (UTC)
@Dan Polansky: Point taken. In my proposed revisions, there is 1 explicit mention of a vote in the CFI, and 1 in the ELE. I intend to remove those from the proposal, per your criticism. Are there many other points you would disagree with? Or: Would you support the proposal? If not, what could change in the proposal before you could consider supporting it?
Sometimes, I see you posting long, detailed arguments about a given issue. If you have time/interest to review the proposal to be voted, we could discuss any changes to be made before the vote starts. --Daniel Carrero (talk) 22:10, 26 October 2015 (UTC)

Suggestion: Edit to Template:policy

I propose editing Template:policy like this, to organize the different types of policies:

This is a Wiktionary policy, guideline or common practices page. It must not be modified without a VOTE.
Entries: CFI - EL - NORM - NPOV - QUOTE - DELETE. Languages: LT - AXX. Others: BLOCK - BOTS.

(I removed WT:REDIR recently as outdated and added WT:LT because I believe it's important, like WT:AXX.)

--Daniel Carrero (talk) 02:12, 27 October 2015 (UTC)

There are pages like WT:About English where parts of the pages are policy and parts aren't. I think this could be made clearer with an optional parameter in the template. Renard Migrant (talk) 14:27, 27 October 2015 (UTC)

Category:Regional Hebrew for diachronic varieties

Currently, the categories Category:Classical Hebrew, Category:Biblical Hebrew, Category:Mishnaic Hebrew, and Category:Israeli Hebrew are categorized under Category:Regional Hebrew. However, these are all diachronic varieties of Hebrew and all existed in essentially the same region. Is there a better way to categorize diachronic varieties of a language? What other languages have this problem? --WikiTiki89 22:24, 27 October 2015 (UTC)

Pretty much every old language that has varieties. Latin in particular. —CodeCat 22:46, 27 October 2015 (UTC)
Yes but we don't seem to have a Category:Regional Latin. --WikiTiki89 01:11, 28 October 2015 (UTC)

Documenting how to handle long s and ligatures

We exclude a number of graphical variants, such as long s (Talk:diſtinguiſh) and ligatures like f-i, s-t, f-f-l and so forth (Talk:fisherwoman, Talk:philerast), but these two practices are not explicitly documented on a Wiktionary-namespace page as far as I can tell. I'd like to know if these practices are still supported; if they are, I'll document them somewhere (perhaps in WT:CFI#Spellings in the vicinity of the line about combining characters?). By the way, can our JavaScript be made to redirect ﬁ etc. to fi etc., like it redirects ſ to s? (Go to [[ſiſter]] and you're sent to [[sister]] after a second, but go to [[ﬁsh]] and your browser sits on the blank page.) - -sche (discuss) 02:15, 28 October 2015 (UTC)
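The redirect itself is done by site JavaScript, which is not shown in this discussion. Purely to illustrate the mapping being asked about, here is a hedged Python 3 sketch; the names `VARIANTS` and `redirect_target` are hypothetical. It folds long s and the Latin ligatures at U+FB00 to U+FB06 into plain letters and reports a target only when the title actually changes.

```python
# Code point -> plain-letter replacement (hypothetical table for illustration).
VARIANTS = {
    0x017F: "s",    # long s
    0xFB00: "ff",
    0xFB01: "fi",
    0xFB02: "fl",
    0xFB03: "ffi",
    0xFB04: "ffl",
    0xFB05: "st",   # long s + t ligature
    0xFB06: "st",
}

def redirect_target(title):
    # Return the plain-letter title, or None if no redirect is needed.
    plain = title.translate(VARIANTS)
    return plain if plain != title else None

print(redirect_target("\u017Fi\u017Fter"))  # long-s spelling of "sister"
print(redirect_target("\uFB01sh"))          # "fish" typed with the fi ligature
print(redirect_target("sister"))            # already plain
```

An explicit table keeps the behaviour narrow; a broader normalization would also fold characters that nobody in this thread proposed to treat as mere graphical variants.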

Wouldn't this be very language dependent? So maybe the specific language policy pages would be a better place. DTLHS (talk) 02:32, 28 October 2015 (UTC)
Not really. Is there any language where ﬁ is a different letter from fi? —CodeCat 03:11, 28 October 2015 (UTC)
Make it policy (vote anyone? as much as I hate the v-word) and then add exceptions as we find them. If we find them, I agree with CodeCat there probably aren't any. A lot of our policies aren't documented because of the difficulty of getting stuff through a vote (votes failing with 65% support and whatnot) but it's rarely a problem because there are few enough of us we can just discuss it. Renard Migrant (talk) 13:20, 28 October 2015 (UTC)
The rules actually were very language-dependent, especially with regard to sequences of two or more S's. --WikiTiki89 14:49, 28 October 2015 (UTC)
Go on. Renard Migrant (talk) 15:03, 28 October 2015 (UTC)
I swear I thought there were differences, but I can no longer find any evidence of them. Perhaps I was thinking of v vs. u, where some languages used u even at the beginnings of words. --WikiTiki89 15:09, 29 October 2015 (UTC)
@Wikitiki89: here you go. --Romanophile (contributions) 15:30, 29 October 2015 (UTC)
That really is a good source supporting the idea that the long s is a typographical variant of s as opposed to another letter. Renard Migrant (talk) 16:50, 29 October 2015 (UTC)
I think that we should make an appendix for ſ, similar to how we have an appendix for capital letters. For the stylistic ligatures, though, there’s not much to say about them. Even as potential redirects, they wouldn’t be very utile considering that very few people know how to type them. Plus, Unicode discourages them anyway. Automatic redirections would be acceptable with me, though.
A policy prohibiting stylistic ligatures is okay with me. It would have been nicer if somebody made that years ago, though. --Romanophile (contributions) 13:21, 29 October 2015 (UTC)
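
The automatic redirection discussed in this thread could work from a small lookup table. The following is only a sketch in Python, not the site's actual JavaScript: `VARIANT_MAP` and `redirect_target` are hypothetical names, and the set of ligatures is assumed from the Unicode Alphabetic Presentation Forms block.

```python
# Hypothetical table of typographic variants and their plain spellings.
VARIANT_MAP = {
    "\u017f": "s",    # long s
    "\ufb00": "ff",   # ff ligature
    "\ufb01": "fi",   # fi ligature
    "\ufb02": "fl",   # fl ligature
    "\ufb03": "ffi",  # ffi ligature
    "\ufb04": "ffl",  # ffl ligature
    "\ufb05": "st",   # long s + t ligature
    "\ufb06": "st",   # st ligature
}


def redirect_target(title):
    """Return the plain-letter title to redirect to, or None if unchanged."""
    plain = "".join(VARIANT_MAP.get(ch, ch) for ch in title)
    return plain if plain != title else None


print(redirect_target("\u017fi\u017fter"))  # title spelled with long s
print(redirect_target("\ufb01sh"))          # title spelled with the fi ligature
print(redirect_target("sister"))            # already plain, so no redirect
```

An explicit table keeps the behaviour easy to audit; any language-specific exceptions, if some are ever found, would need a separate per-language override.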

A few proposed changes to WT:NORM

There are a few changes that are being proposed at Wiktionary talk:Normalization of entries. Since not many people have responded, I'm letting you know here. —CodeCat 20:13, 28 October 2015 (UTC)

Restoring WT:NORM

I ask that WT:NORM is restored to the state of 5 September 2015. Since then, substantive (meaning-changing) changes took place without a vote, and that cannot be per what it says at the top of WT:NORM: "Any substantial or contested changes require a VOTE." The addition of the template containing this text to the page is a consequence of Wiktionary:Votes/pl-2015-07/Normalization of entries 2.

Let me reiterate that my contesting the changes now is not necessary; the condition contains an or: "Any substantial or contested changes require a VOTE." --Dan Polansky (talk) 20:18, 28 October 2015 (UTC)

You haven't actually contested any changes. You just voiced a blanket "I don't like it"-style disagreement with no rationale. —CodeCat 20:20, 28 October 2015 (UTC)
I have made the restoration. I point the above editor and anyone else to the word "or" in the condition. I have my hopes. --Dan Polansky (talk) 20:20, 28 October 2015 (UTC)
I've restored the current version. The changes that were made didn't change the meaning, so this proposal is unconstructive and in bad faith. On that ground I reserve the right to reverse it. —CodeCat 20:23, 28 October 2015 (UTC)
I ask for restoration. I won't revert war on that page; someone else has to restore the page to a proper state. If the page will not get restored, it will ipso facto cease to be a policy. --Dan Polansky (talk) 20:27, 28 October 2015 (UTC)
What Dan Polansky is saying is that a voted-on policy shouldn't be altered without a vote. It is a bit of a pain when someone missed a comma out and nobody wanted to vote against a proposal merely on the basis of a missing comma, then you need another vote just to insert a bleeding comma where one is needed. Renard Migrant (talk) 16:43, 29 October 2015 (UTC)
Eats, shoots, and leaves. DCDuring TALK 17:52, 29 October 2015 (UTC)
That is not entirely accurate. You can correct a missing or misplaced comma without a vote as long as it does not change the meaning of the sentence. It follows from "Any substantial or contested changes require a VOTE", which was voted by the community to apply in Wiktionary:Votes/pl-2012-03/Vote requirements for policy changes. --Dan Polansky (talk) 09:32, 7 November 2015 (UTC)
As such pages are not assuredly on anyone's watchlist and certainly not on everyone's, shouldn't a contributor making such changes draw attention to them by giving notice here to expose them to being deemed "substantial or contested"? To me that seems like an efficient way of reducing acrimony. This search illustrates that commas have been involved in controversies of interpretation. DCDuring TALK 12:49, 7 November 2015 (UTC)
I think it is enough for such pages to be on the watchlist of the admins who care enough to enforce such policies. --WikiTiki89 14:51, 9 November 2015 (UTC)
What about those who would prefer that such policies were not altered and then imposed on them? What about new and occasional contributors, potential recruits to more substantial contribution?
We seem to have a more than sufficient number of folks whose principal contribution is finding rules to impose on content contributors. The rules can hardly be said to make it easier for new contributors to get involved in our efforts. DCDuring TALK 21:47, 9 November 2015 (UTC)
Which is why no one can make any changes to the rules (i.e. changes to the substance of a policy page) without a vote; and admins who care enough will watch the policy pages to make sure such changes are not made. --WikiTiki89 22:02, 9 November 2015 (UTC)

"Proper" codes for etymology-only languages part 2

In Wiktionary:Beer_parlour/2013/December#"Proper" codes for etymology-only languages, it was proposed to rename the etymology-only language codes into the "proper" format of aaa-aaa, like this: Late Latin/LL. > "la-lat". Can we do that now? I take it that would require a vote?

Rationale: Standardization. It is weird that the different codes work in different ways:


  • Late Latin: both {{etyl|LL.|en}} and {{etyl|Late Latin|en}} work (it does not matter if we use the code or name)
  • Latin: only {{etyl|la|en}} works, {{etyl|Latin|en}} does not work (we have to use the code, not the name)


  • Late Latin: only {{etyl|la-lat|en}} should work, {{etyl|Late Latin|en}} would not work anymore

It would require a bot changing the codes in all entries before full implementation.

That discussion also introduced the idea of leaving both old and new codes working together for a while (LL. and la-lat) as a transition period while people get used to them. What do you think? --Daniel Carrero (talk) 03:38, 29 October 2015 (UTC)

We've already been in the transitional period since then. The new codes already work. See Module:etymology languages/data. —CodeCat 13:35, 29 October 2015 (UTC)
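
A transition period in which old and new codes both work can be pictured as an alias table. This is a hedged sketch in Python, not the contents of Module:etymology languages/data (which is a Lua module): `OLD_TO_NEW` and `resolve_code` are hypothetical names, and "la-lat" is simply the example code given in the proposal above.

```python
# Hypothetical alias table: old etymology-only code -> new "proper" code.
OLD_TO_NEW = {"LL.": "la-lat"}
NEW_CODES = set(OLD_TO_NEW.values())


def resolve_code(code, transitional=True):
    """Return the new-style code, or None if the code is not accepted.

    While `transitional` is True, old codes are still accepted and mapped
    to their replacements; language names are never accepted.
    """
    if code in NEW_CODES:
        return code
    if transitional and code in OLD_TO_NEW:
        return OLD_TO_NEW[code]
    return None


print(resolve_code("LL."))                      # old code, still accepted
print(resolve_code("la-lat"))                   # new code
print(resolve_code("LL.", transitional=False))  # after the transition ends
print(resolve_code("Late Latin"))               # names are not codes
```

Once a bot has replaced every old code in entries, switching `transitional` off would be the final step.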


This category claims that Singlish is an English-based creole. If that is true, it needs to be given its own language code and this category needs to stop being used in English entries. — Ungoliant (falai) 13:41, 29 October 2015 (UTC)

Wouldn't that force us to have separate Singlish sections for the thousands of nouns, adjectives, etc. that are used unaltered from English in Singlish? (Actually, is the same true of e.g. Scots?) Equinox 11:11, 30 October 2015 (UTC)
Yes, but if it’s a different language it’s a different language. — Ungoliant (falai) 13:47, 30 October 2015 (UTC)
Singlish lists three sources asserting that it's a creole, and looking at the example sentences, especially those labeled basilectal, I'm inclined to agree. (I certainly wouldn't consider "Dis guy Singrish si beh zai sia" to be an utterance of a dialect of English.) According to Category:Creole or pidgin languages, the code we use for creoles and pidgins is crp, so maybe crp-sng? Incidentally, why do we have both Category:Creole or pidgin languages and Category:Pidgins and creole languages? —Aɴɢʀ (talk) 14:54, 30 October 2015 (UTC)
Line 172 of Module:category tree/langcatboiler is the culprit. — Ungoliant (falai) 15:07, 30 October 2015 (UTC)
More specifically, that module is not in agreement with Module:families/data, which specifies "creole or pidgin" as the name of the language family. Can someone more versed in editing modules than I am please fix it? —Aɴɢʀ (talk) 19:59, 30 October 2015 (UTC)
I have merged the categories at Category:Creole or pidgin languages. - -sche (discuss) 23:15, 11 November 2015 (UTC)

Rollback in LenovoTest01

Add flag rollback LenovoTest01. admin group. LenovoTest01 (talk) 09:11, 30 October 2015 (UTC)

Why? Who are you? Equinox 11:07, 30 October 2015 (UTC)
According to WP, a blocked sockpuppet of w:User:Никита-Родин-2002, who progressed from vandalism to good-faith-but-disruptive cluelessness and incompetence, all using a host of IPs and sockpuppets. Chuck Entz (talk) 21:03, 11 November 2015 (UTC)
Request denied. Unknown user with zero edits. SemperBlotto (talk) 11:17, 30 October 2015 (UTC)

Wiktionary:Votes/pl-2015-10/Entry name section

FYI: I created Wiktionary:Votes/pl-2015-10/Entry name section. It is a vote about the "entry name" section of WT:EL. --Daniel Carrero (talk) 18:20, 30 October 2015 (UTC)

November 2015

"Headword line" vote started

FYI: Wiktionary:Votes/pl-2015-10/Headword line started today. --Daniel Carrero (talk) 02:05, 1 November 2015 (UTC)

Renaming Wiktionary:Normalization of entries (WT:NORM) → Wiktionary:Entry source code (WT:ESC)

What do you think about renaming Wiktionary:Normalization of entries (WT:NORM) into Wiktionary:Entry source code (WT:ESC)? (while keeping the old name and shortcut as usable redirects, naturally)

The name would be more intuitive as to the actual purpose of the policy.

I apologize since I'm the one who had chosen the current name of the policy ("Normalization of entries") in the first place, based on that old discussion called "Normalization of articles". But I've been thinking a lot about this policy and I believe it would be an improvement for the reason above.

It would also match the name of the policy Wiktionary:Entry layout. (rather than "layout of entries")

Finally, the name "Normalization of entries" is unclear. Since Wiktionary:Entry layout, too, provides rules for "normalization" of "entries", the uninformed reader cannot tell at a glance why we have two different policies currently named as "Entry layout" and "Normalization of entries". If the names were "Entry layout" and "Entry source code", it would be easier to make that distinction. --Daniel Carrero (talk) 03:32, 1 November 2015 (UTC)

I think a better name would be Wiktionary:Wikicode normalization. —CodeCat 14:13, 1 November 2015 (UTC)
Between "Entry source code" and "Wikicode normalization", I prefer the former. As a secondary, less important reason, the acronym WT:ESC looks nice. The main reason is this:
IMO, the word "Normalization" is superfluous in the same way that the word "explained" was superfluous in WT:ELE. You could have: Wiktionary:Normalization of entry layout, Wiktionary:Normalization of criteria for inclusion, Wiktionary:Normalization of blocking policy, Wiktionary:Normalization of page deletion guidelines and Normalization of bots. --Daniel Carrero (talk) 17:15, 1 November 2015 (UTC)
@CodeCat: That said above, I'm going to add "Wikicode normalization" in the proposal of my vote per your suggestion. I am going to oppose "Wikicode normalization", but I don't know what other people will vote. I'm going to use "WCN" as the proposed abbreviation, please edit the page or propose another one if you disagree with "WCN". --Daniel Carrero (talk) 17:20, 1 November 2015 (UTC)
The presence of source code in the name may scare non-programmers from reading it. — Ungoliant (falai) 17:34, 1 November 2015 (UTC)
What about Wiktionary:Wikicode style guide? --WikiTiki89 17:36, 1 November 2015 (UTC)
Excellent. I support that. — Ungoliant (falai) 17:38, 1 November 2015 (UTC)
Ungoliant's comment about scaring non-programmers is true, maybe the name really should have "wikicode" and not "source code". A few possibilities:
--Daniel Carrero (talk) 17:47, 1 November 2015 (UTC)
the hell is wikicode?--Dixtosa (talk) 18:53, 2 November 2015 (UTC)
@Dixtosa: w:Help:Wiki markup. --WikiTiki89 19:01, 2 November 2015 (UTC)
wikicode (currently a redlink) = wikitext (currently a bluelink) --Daniel Carrero (talk) 19:05, 2 November 2015 (UTC)
The chance of wikicode being understood as related to language codes is high. Let's stick to plain, easy wt:norm, which coincidentally has a good shortcut ?:/
Also, the Wikipedia's page on wiki markup mentions wikicode once for a reason. --Dixtosa (talk) 19:30, 2 November 2015 (UTC)
If we're going to use "wiki-[something]", "wikitext" seems both more common and 'friendlier' to a non-programmer than "wikicode". "Normalization of entries" does sound like it should be about rules on e.g. using accents vs macrons vs nothing for Old English vowel length. - -sche (discuss) 19:42, 2 November 2015 (UTC)
What about using "wiki markup" or "markup" in the policy name?
--Daniel Carrero (talk) 19:56, 2 November 2015 (UTC)
I like "Entry markup". "Wiki markup" sounds a bit like a how-to guide, how to use wiki markup. - -sche (discuss) 20:20, 2 November 2015 (UTC)

Created Wiktionary:Votes/pl-2015-11/NORM: 10 proposals

I created Wiktionary:Votes/pl-2015-11/NORM: 10 proposals.

This vote is larger than average, so I've set it up to start in 14 days and last for 2 months.

Feel free to discuss. --Daniel Carrero (talk) 07:21, 1 November 2015 (UTC)

I got these 10 proposals from CodeCat's reverted edits, from the Beer parlour, the NORM talk page and the NORM votes. The "Discussions" part of the vote should be able to link all the sources. --Daniel Carrero (talk) 03:01, 2 November 2015 (UTC)

Proposal: exclude non-printing characters, encoding variants of characters, and combining characters

Combining diacritics are really just variants of their non-combining equivalents. They are not lexicographically different characters. They also pose many technical problems because of how they are handled. Non-printing characters like control codes are also difficult to work with, and also have no lexicographical value since they do not appear in text by definition. We currently have some control codes as redirects, but these redirects are themselves inaccessible and can't be edited (not even by a bot, as I found). As for "encoding variants", I'm talking about things like Ⅽ versus C. These are the same character, merely encoded with different code points in Unicode and displayed a bit differently by the font. I think Ⅽ should redirect to C. —CodeCat 14:12, 1 November 2015 (UTC)

Support Redirect Ⅽ ("Roman numeral" C), Ｃ ("Fullwidth" C) and probably others to C, then place a complete list of codepoints for varieties of "C" on the main entry. --Daniel Carrero (talk) 16:49, 1 November 2015 (UTC)
Are you talking about entries for the individual characters only? Some languages need combining characters because they use unencoded character + diacritic combinations. — Ungoliant (falai) 17:37, 1 November 2015 (UTC)
Yes, just for the single characters. Links to combining characters go wrong in all kinds of ways, just try clicking on this link: ̅. —CodeCat 17:39, 1 November 2015 (UTC)
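
The three kinds of character named in this proposal can be told apart mechanically. The sketch below uses Python's standard `unicodedata` module. The function name `classify` is hypothetical, and treating NFKC compatibility folding as the test for an "encoding variant" is an assumption: NFKC also folds characters such as superscript digits, which the community might not want to treat as mere variants.

```python
import unicodedata


def classify(ch):
    """Classify a single character for the purposes of the proposal."""
    category = unicodedata.category(ch)
    if category.startswith("M"):       # Mn, Mc, Me: combining marks
        return "combining"
    if category in ("Cc", "Cf"):       # control and format characters
        return "non-printing"
    folded = unicodedata.normalize("NFKC", ch)
    if folded != ch:                   # has a compatibility equivalent
        return "encoding variant of " + folded
    return "ordinary"


for ch in ("\u0305", "\u0000", "\u216d", "\uff23", "C"):
    print("U+%04X" % ord(ch), classify(ch))
```

Here U+0305 is the combining overline, U+216D the Roman numeral C and U+FF23 the fullwidth C mentioned above.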

Context label "North America"

Currently, {{lb|en|North America}} produces the text (Canada, US). I think this is wrong. If someone puts "North America" as a context label, it should show up as (North America); if they wanted to say "Canada, US", they would have. --WikiTiki89 16:36, 1 November 2015 (UTC)

It would be quite funny if {{lb|en|UK}} produced (England, Northern Ireland, Scotland, Wales). Seriously though, Wikitiki89 is right. Renard Migrant (talk) 23:53, 1 November 2015 (UTC)
I agree. This generates wrong information when, e.g., the label is applied to a sense that is used in Mexican and US Spanish. — Ungoliant (falai) 00:01, 2 November 2015 (UTC)
The behaviour was implemented because the label was being used as a shorthand for "Canada, US" in French and English entries, and it was agreed that it was bad to split and have only some entries in "Category:Canadian English"+"Category:American English" while others were in "Category:North American English". I oppose going back to "North American" and splitting the entries up again. Deprecating the label altogether and making bot or AWB runs periodically to clean up uses could work. - -sche (discuss) 01:15, 2 November 2015 (UTC)
I would be against a bot cleaning these things up. We could automatically put anything tagged with "North American" into all three categories. Either that or have "US" and "Canada" also put words into "North American English". If this is a categorization issue, it should be solved with categorization. --WikiTiki89 16:54, 2 November 2015 (UTC)
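
The suggestion to solve this with categorization amounts to separating what a label displays from which categories it adds. A minimal sketch in Python, assuming a hypothetical `LABELS` table and `render_label` function (the real label data lives in Lua modules and is not reproduced here); the category names are the ones quoted in this thread.

```python
# Hypothetical label data: label -> (display text, categories added).
LABELS = {
    "US": ("US", ["American English"]),
    "Canada": ("Canada", ["Canadian English"]),
    # Displayed as written, but categorized into all three categories.
    "North America": (
        "North America",
        ["North American English", "American English", "Canadian English"],
    ),
}


def render_label(label):
    """Return the text shown in the entry and the categories it adds."""
    display, categories = LABELS[label]
    return "(" + display + ")", categories


text, cats = render_label("North America")
print(text)
print(cats)
```

The alternative mentioned above, having "US" and "Canada" also add "North American English", would be a change to the other two rows of the same table.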

Changes to Template:votes

Template:votes has been recently edited to cause the enddate in past votes to be formatted yellow, and some icons have been added to past votes and votes near the enddate.

IMO it would look better without the icons. If there's no support for the icons, I request them to be removed.

Also, I don't know if the yellow text should stay. Making a distinction for past votes could be useful, and I know yellow text in this case means "past vote", but this meaning is not intuitive. Sometimes, a vote past the enddate remains open for voting because nobody closed it yet, so red text would still be appropriate and yellow text would be misleading. --Daniel Carrero (talk) 17:05, 1 November 2015 (UTC)

I agree, the icons should be removed. Perhaps the past votes should be red and the "ending soon" votes should be yellow? We would also need to use a darker yellow so that the date is actually readable on a white background. --WikiTiki89 17:14, 1 November 2015 (UTC)
I prefer to keep the icons. —CodeCat 17:40, 1 November 2015 (UTC)
The icons are too distracting. I went ahead and removed them and made the color scheme more intuitive. Now, votes ending soon will be orange, votes ending today will be red, and votes that have already ended will be gray. --WikiTiki89 18:41, 2 November 2015 (UTC)
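
The colour scheme described in the last comment can be written out as a small rule. This is an illustrative Python sketch, not the template's actual code: `end_date_color` is a hypothetical name, and the seven-day threshold for "ending soon" is an assumption, since the discussion does not say how soon is soon.

```python
from datetime import date

SOON_DAYS = 7  # assumed threshold; the discussion does not define "soon"


def end_date_color(end, today):
    """Pick the colour for a vote's end date, or None for default styling."""
    days_left = (end - today).days
    if days_left < 0:
        return "gray"    # vote has already ended
    if days_left == 0:
        return "red"     # vote ends today
    if days_left <= SOON_DAYS:
        return "orange"  # vote ends soon
    return None


today = date(2015, 11, 2)
for end in (date(2015, 11, 1), date(2015, 11, 2), date(2015, 11, 5), date(2016, 1, 15)):
    print(end.isoformat(), end_date_color(end, today))
```

Note that a vote past its end date may still be open if nobody has closed it, which is the concern raised earlier in the thread; a date-only rule like this one cannot detect that.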

Separate, simplified pages for letters

I suggest moving all the definitions of letters into separate, simplified pages to try and fix the problem of cluttered letter pages.

For instance, Letter:D or Appendix:Letters/D could have the following contents. I didn't take the trouble to link all the letters of the alphabet to other pages, but in the end they should be linked.

Capital and lowercase versions of D, in normal and italic type.

D uppercase, d lowercase

English: 4th letter. Name: dee.
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz
Esperanto: 5th letter. Name: do.
  • Aa, Bb, Cc, Ĉĉ, Dd, Ee, Ff, Gg, Ĝĝ, Hh, Ĥĥ, Ii, Jj, Ĵĵ, Kk, Ll, Mm, Nn, Oo, Pp, Rr, Ss, Ŝŝ, Tt, Uu, Ŭŭ, Vv, Zz
Finnish: 4th letter. Name: dee.
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Šš, Tt, Uu, Vv, Ww, Xx, Yy, Zz, Žž, Åå, Ää, Öö
Latvian: 6th letter. Name:
  • Aa, Āā, Bb, Cc, Čč, Dd, Ee, Ēē, Ff, Gg, Ģģ, Hh, Ii, Īī, Jj, Kk, Ķķ, Ll, Ļļ, Mm, Nn, Ņņ, Oo, Pp, Rr, Ss, Šš, Tt, Uu, Ūū, Vv, Zz, Žž
Portuguese: 4th letter. Name: .
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz
Spanish: 4th letter. Name: de.
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Ññ, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz
Turkish: 5th letter. Name: de.
  • Aa, Bb, Cc, Çç, Dd, Ee, Ff, Gg, Ğğ, Hh, Iı, İi, Jj, Kk, Ll, Mm, Nn, Oo, Öö, Pp, Rr, Ss, Şş, Tt, Uu, Üü, Vv, Yy, Zz

Thoughts? I've added an image for neatness, it would be optional. --Daniel Carrero (talk) 02:26, 2 November 2015 (UTC)

I’d rather see appendices for every letter in a given language rather than the opposite (otherwise a page like Appendix:Letters/D will eventually end up having thousands of entries), but I support anything that stops them from cluttering the entries of real words. — Ungoliant (falai) 02:31, 2 November 2015 (UTC)
PS: with the exception of translingual. — Ungoliant (falai) 02:31, 2 November 2015 (UTC)
Perhaps we could keep the Translingual letter definition in all entries, and link the Translingual definitions to the separate pages like Letter:D or Appendix:Letters/D. --Daniel Carrero (talk) 02:35, 2 November 2015 (UTC)
I think it's a great idea! It would certainly clean up the letter entries. Aryamanarora (talk) 12:42, 2 November 2015 (UTC)
I prefer Ungoliant's approach, with per-language pages rather than per-letter pages. We should include links to all of these pages, or a category containing them, in the Translingual section. —CodeCat 17:23, 2 November 2015 (UTC)
Ungoliant is right that per-letter appendices won't reduce clutter (they will themselves become crowded), so per-language appendices seem better. I would definitely keep translingual sections on all of the letters, from which to link to the per-language appendices. (I would also include a ===See also=== link to a language's appendix any time we happened to have an entry for a single-letter word in that language, e.g. a#English, n#Aromanian, e#Hungarian.) - -sche (discuss) 17:58, 2 November 2015 (UTC)
Will, for instance, the entry Ğ link only to alphabet appendices that use that letter, or will it link to the full list of alphabet appendices?
Is there going to be some way to know what are the alphabets that use Ğ? Or Ñ? --Daniel Carrero (talk) 18:06, 2 November 2015 (UTC)
If we decide to link to individual appendices, then presumably Ğ#Translingual will link only to the appendices of languages that use Ğ. But then whatever section we put the links in (say, ===See also===) could easily grow to contain a hundred lines, and have to be updated any time a new appendix was created... we're probably better off categorizing the appendices and then linking from entries to the category, an idea CodeCat mentioned above. Per-language appendices would allow us to give pronunciations / notes on orthography, letter names, etc... perhaps, especially for the many small languages where the info is just "the Foobarese alphabet is ABCD...YZ", the appendices could just be the WT:About X appendices. - -sche (discuss) 19:46, 2 November 2015 (UTC)
We could also put each language's appendix in a category for each character it uses, then on the entry for a, we could put an {{also}} link to Category:Languages that use the character "a". --WikiTiki89 20:42, 2 November 2015 (UTC)
That sounds like the most workable solution. —CodeCat 21:03, 2 November 2015 (UTC)
But the English alphabet has 26 letters, from A to Z, so the categories would look like this:
Categories: Languages that use the character "a" | Languages that use the character "b" | Languages that use the character "c" | Languages that use the character "d" | Languages that use the character "e" | Languages that use the character "f" | Languages that use the character "g" | Languages that use the character "h" | Languages that use the character "i" | Languages that use the character "j" | Languages that use the character "k" | Languages that use the character "l" | Languages that use the character "m" | Languages that use the character "n" | Languages that use the character "o" | Languages that use the character "p" | Languages that use the character "q" | Languages that use the character "r" | Languages that use the character "s" | Languages that use the character "t" | Languages that use the character "u" | Languages that use the character "v" | Languages that use the character "w" | Languages that use the character "x" | Languages that use the character "y" | Languages that use the character "z"
That is too long, and also a bit hard to find the right letter inside that wall of text. Couldn't we use just Category:A, Category:B, etc. as the category names? That way, the categories would be:
Categories: A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
--Daniel Carrero (talk) 21:39, 2 November 2015 (UTC)
Well the point is more that at each letter in the mainspace, there will be a link to just one category. They aren't intended to all be listed on each language's page. Of course this long list will appear on the bottom of the page, but that's not really a big deal. Maybe we should make them hidden categories? --WikiTiki89 03:27, 3 November 2015 (UTC)
But if we move all the letter definitions into individual language appendices like Appendix:Letters/French, then the categorization above would apply to the appendix, right? --Daniel Carrero (talk) 05:07, 3 November 2015 (UTC)
But also I don't see anything wrong with the single-letter category names either. --WikiTiki89 03:29, 3 November 2015 (UTC)
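
The per-character categories discussed here are in effect an inverted index from characters to languages. A rough Python sketch, using only the Spanish and Turkish alphabets listed earlier in this section; `ALPHABETS` and `languages_by_character` are hypothetical names, and the category title follows the wording proposed above.

```python
from collections import defaultdict

# Lowercase alphabets as listed earlier in this section.
ALPHABETS = {
    "Spanish": "abcdefghijklmnñopqrstuvwxyz",
    "Turkish": "abcçdefgğhıijklmnoöprsştuüvyz",
}


def languages_by_character(alphabets):
    """Invert language -> letters into letter -> set of languages."""
    index = defaultdict(set)
    for language, letters in alphabets.items():
        for letter in letters:
            index[letter].add(language)
    return index


index = languages_by_character(ALPHABETS)
for ch in ("ğ", "ñ", "d"):
    print('Category:Languages that use the character "%s":' % ch, sorted(index[ch]))
```

With such an index, the entry for Ğ needs a link to only one category, and the long list of categories appears only on each language's appendix.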

Suppose we create a simple appendix for all the letters in a single language.

Appendix:Letters/English could have:


Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz

Letter names

a, bee, cee, dee, e, ef, gee, aitch, i, jay, kay, el, em, en, o, pee, cue, ar, ess, tee, u, vee, double-u, ex, wye, zee/zed

A few questions and comments:

  1. Or should we use a table format?
  2. Should the appendix include IPA pronunciations? Sound files too?
  3. We already have a number of randomly-formatted alphabet appendices like Appendix:Mapudungun alphabet, Appendix:Polish alphabet and Appendix:Polish alphabet. Can all the appendices follow the same format?
  4. Appendix:Latvian alphabet has long texts with information and Appendix:English alphabet has a list of letters using certain ligatures, do we need all that in the appendices? I'd rather each alphabet appendix was basically just a simple, standardized list of letters. I mean, Wikipedia has w:English alphabet to fill with lots of historical information and we already have categories for words in English spelled with the ligatures.
  5. Suppose the appendix could be called any of those: Appendix:English letters, Appendix:English alphabet, Appendix:Letters/English, Appendix:Alphabet/English... But the words "Letters" and "Alphabet" don't apply to all languages, do they? What about Persian, Korean, Chinese, etc.? Is there a more comprehensive name for all writing systems? Appendix:Writing/English or Appendix:Characters/English, maybe?

--Daniel Carrero (talk) 05:48, 3 November 2015 (UTC)

Several issues with the categorisation of placenames

I’ve recently finished going through Portuguese-language toponyms, and in the process I came across several inconsistencies, issues and uncertainties in our category scheme:

Ungoliant (falai) 19:41, 2 November 2015 (UTC)

All of these are good points and I support fixing it up. —CodeCat 19:52, 2 November 2015 (UTC)
Yes, merge same-level subdivisions (so, Category:en:Provinces and territories of Canada; and merge Category:en:Autonomous oblasts of Russia into Category:en:Oblasts of Russia).
Yes, towns should also be categorized by country if cities are, and given that the distinction may be nebulous for many countries, or that inconsistent metrics may apply between countries, it may be best to lump them all into one category.
"State capitals of the United States", maybe?
Cities in the Crimea could go into both Ukrainian and Russian categories. (How do we handle cities in Palestine/Israel? In the Golan Heights?)
Modern usage is "in Ukraine" and we lemmatize [[Ukraine]], so let's be modern in the categories, too.
- -sche (discuss) 20:18, 2 November 2015 (UTC)
Russia also has krais, okrugs, autonomous okrugs and federal cities. I was hoping we could pick a name that encompasses all of them (Federal units of Russia?) — Ungoliant (falai) 20:21, 2 November 2015 (UTC)
In any case, I’ll have to start RFM discussions. I’m just checking if there is support for the general idea that there should be a single category for all subdivisions of a given country. — Ungoliant (falai) 20:27, 2 November 2015 (UTC)
(Two edit conflicts later) Be careful about cleaning them up: "States" and "Territories" in Australia are two different things in both law and common speech. Victoria, New South Wales and South Australia are states, Northern Territory, Australian Capital Territory are not states. (Although, to be extra confusing, there has been talk of promoting Northern Territory to statehood, but possibly keeping the name "Northern Territory".) --Catsidhe (verba, facta) 20:22, 2 November 2015 (UTC)
@Ungoliant MMDCCLXIV Just a note, you should check Special:WantedCategories, as there are a bunch of Portuguese-related place name categories that need to be created. Benwing2 (talk) 22:02, 2 November 2015 (UTC)
Perhaps we ought to make a project or appendix page of possibly difficult-to-categorize place names or types of place names. DCDuring TALK 17:00, 3 November 2015 (UTC)
  • Comment on a few of these:
  1. The oblast category you mention is in a gots-to-go situation; I'll even nominate it for deletion myself
  2. I don't really see the need to combine Canadian provinces and territories (or Australian states and territories). If we do, the resulting category, likely titled Category:en:Provinces and territories of Canada, would a) contain two different types of political entities, and b) be a longer title than the existing categories.
  3. Nothing "hideous" about Category:US State Capitals. a) There should be a category containing the 50 items currently in this category, and b) that's the shortest way to unambiguously title said category.

I would remind people that moving a category or template is not an action to be taken lightly; we've been a little too cavalier about it in the past. Purplebackpack89 21:16, 4 November 2015 (UTC)

  • "Category:States and territories of Australia" would be a better name. Although states and territories are legally distinct, they're almost always grouped together and serve the same function for addresses and whatnot. Just checked Wikipedia and they use the exact same category name. Combining them would also resolve any issues from the NT becoming a state, and the question of what you would make the parent category of these entities, or the user overlooking territories when viewing the category for states, etc. Pengo (talk) 07:43, 6 November 2015 (UTC)
    Support combining them into "States and territories of Australia". --Daniel Carrero (talk) 07:49, 6 November 2015 (UTC)

Script appendices

FYI: I populated Category:Script appendices with the help of a new template {{script appendix}}, added at the top of the appendix to generate the introduction and standardize the categories. I renamed some pages from "Appendix:X alphabet" to "Appendix:X script" for consistency with category names. Feel free to discuss. --Daniel Carrero (talk) 04:54, 3 November 2015 (UTC)

Written-out fractions

Pursuant to the discussion at Wiktionary:Requests for deletion#three quarters, I would like to propose the inclusion of a specific, limited set of eighteen written-out fractions (counting fourths and quarters as the same fraction). The reasons that I propose this particular set are as follows:

  1. These are probably the most common fractions in use (particularly given the tendency to use fifths in parliamentary procedure, and to divide inches into eighths for measurement).
  2. All of these are fractions for which the numerical form is available in Unicode, particularly ½, ⅓, ⅔, ¼, ¾, ⅕, ⅖, ⅗, ⅘, ⅙, ⅚, ⅐, ⅛, ⅜, ⅝, ⅞, ⅑, and ⅒ (see Appendix:Unicode/Number Forms). If we have the Unicode form, we should have the written-out form as an alternate spelling.
  3. I do not know if these exist as single-word forms in other languages, but I would not be at all surprised if at least some of them do, given their commonality.

The fractions I propose to include are in two groups:

I note that "one tenth" is often used in an idiomatic sense of asserting that person a is "not one tenth the man" (or other property) as person b. I also propose adding the hyphenated form of each as an adjective form (one-half, one-third, etc.). If we can agree on this specific set of eighteen fractions, that will set both the floor and a ceiling on the written-out fractions that can be included in Wiktionary. Cheers! bd2412 T 14:20, 3 November 2015 (UTC)
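
The set of fractions with a precomposed Unicode form can be checked against the Unicode character database. The sketch below uses Python's standard `unicodedata` module; the function name `written_out_fractions` is hypothetical, and "zero thirds" (U+2189) is left out on the assumption that it is not part of the proposal. Note that the Unicode character names say "quarter(s)", not "fourth(s)".

```python
import unicodedata

PREFIX = "VULGAR FRACTION "


def written_out_fractions():
    """Map each precomposed fraction character to its written-out name."""
    result = {}
    # Latin-1 Supplement through the Number Forms block.
    for cp in range(0x80, 0x2200):
        name = unicodedata.name(chr(cp), "")
        if name.startswith(PREFIX) and "ZERO" not in name:
            result[chr(cp)] = name[len(PREFIX):].lower()
    return result


fractions = written_out_fractions()
print(len(fractions))
for char, words in fractions.items():
    print(char, words)
```

The hyphenated adjective forms proposed above could be derived by replacing the space in each name with a hyphen.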

  • Go for it! By the way, is "three fourths" the US usage? It seems very strange to UK ears. SemperBlotto (talk) 14:34, 3 November 2015 (UTC)
    • Is "three quarters" the UK usage? I'm sure "fourths" is much more common in the U.S. bd2412 T 14:49, 3 November 2015 (UTC)
      • Ngrams suggests "three quarters" is more common than "three fourths" on both sides of the Herring Pond, but that "three fourths" represents a much larger minority in en-US than in en-GB. —Aɴɢʀ (talk) 15:41, 3 November 2015 (UTC)
        • So, {{context|chiefly American}} for the "fourths"? bd2412 T 16:10, 3 November 2015 (UTC)
          • I guess. Another thing to consider is that, although I personally consider it punctuation abuse, the hyphenated forms are also very common as nouns: one-sixth, three-quarters, five-eighths, etc. They should probably be listed at least as alternative spellings. —Aɴɢʀ (talk) 16:17, 3 November 2015 (UTC)
            • I would agree with calling that punctuation abuse. I'd be tempted to list hyphenated noun forms as a misspelling. bd2412 T 16:38, 3 November 2015 (UTC)
          • Based on Ngrams, the peak popularity of fourths in Britain was 1750-1850, but it took off much more in the US, where it became pretty popular throughout the 1800s, but never quite overtook quarters and has been in a steady decline ever since. Maybe it's got something to do with avoiding confusion with a certain circular piece of metal that is not found too frequently in Britain. --WikiTiki89 16:24, 3 November 2015 (UTC)
  • Done with the basic entries. Hyphenated forms and translations will follow. Cheers! bd2412 T 21:03, 3 November 2015 (UTC)
  • Re: "the inclusion of a specific, limited set of eighteen written-out fractions" There is, of course, no reason to keep all of these if they are challenged in RfD and no reason to exclude any that are attestable if we decide that such fractions are, in general, to be excluded. It would be bad precedent to make some kind of exception to our policies without a vote.
An alternative would be for each language's (or group of languages') fraction-naming system to be documented in an appendix with redirects from attestable forms that someone finds necessary to add. DCDuring TALK 13:32, 4 November 2015 (UTC)
  • As noted above, these happen to be the 18 fractions for which Unicode versions of the numerical forms exist. I think, actually, that this speaks volumes about the set. They are, at least, alternative spellings to entries that we already have due to the inclusion of the Unicode forms. Arguably, we could have a few more fractions if they were particularly significant, like twenty-two sevenths (which is approximately pi), but I think that either the addition of any more or the exclusion of any of the above would be difficult to justify. I note that "one tenth" gets twenty times as many Google Books hits as "one eleventh", and (with the exception of the unusually popular "one twelfth") would presume the trend continues in that way. Of course, I have no objection to an appendix documenting fraction naming systems. However, we have many appendices that collect information that is also reflected in individual entries, a luxury that we have the storage space to afford. bd2412 T 16:02, 6 November 2015 (UTC)

Wiktionary-l mailing list

Hi everyone, I'm unsure if this is the correct board for this (if not, please do move it). The wiktionary-l mailing list[22] has not had active list administrators, it seems, and during a migration we unfortunately had to clear out a lot of emails from the list (possibly legitimate ones). As such, we're asking for someone to take over the list so emails sent to the mailing list are moderated and handled within an appropriate time-frame. A phabricator bug[23] is open for this. Interest can be shown either here or there - both places will be monitored. If there is no interest shown by Friday, I will close the list as inactive and it can be re-opened in future if the community want it.

Thanks, John F. Lewis (talk) 16:47, 3 November 2015 (UTC)

I don't think any regular Wiktionarians still use that mailing list. We have a small community compared to en.Wikipedia, and we tend to communicate on-wiki or by direct e-mails. - -sche (discuss) 19:36, 6 November 2015 (UTC)
I didn't even know we had a mailing list. --WikiTiki89 19:54, 6 November 2015 (UTC)
Hypothetically, I believe it's supposed to encourage communication and cooperation between the various Wiktionaries, but that doesn't happen much in any case. I see no problem in letting it die. —Μετάknowledgediscuss/deeds 17:35, 7 November 2015 (UTC)

About the "Entry name section" vote

Wiktionary:Votes/pl-2015-10/Entry name section is scheduled to start in 7 days. (it was going to start in 2 days but I delayed the start a bit)

As some people know, the vote introduces an expanded "Entry name" section in the WT:EL.

Some changes to the proposed text have already been made based on suggestions in the talk page. Please review and see if you disagree with anything in the proposed text. I suggest making any changes before the vote starts, especially if it can avoid possible opposing votes in the future.

If there's any piece of text that's controversial or unwanted by some people, I'd rather cut that out and vote about the rest, so that it can be further discussed/voted later. --Daniel Carrero (talk) 23:15, 3 November 2015 (UTC)

Proposal: Add a text box to give a search term for "terms starting with" in categories

Currently, we have table-of-contents templates for a few languages, which let you click on letters to skip to that letter in the category. But this system is wholly inadequate for a dictionary. A paper dictionary lets you skip quickly to a particular term, often using indexing terms that appear at the top of the page. For example, a paper dictionary might show that a pair of pages includes terms ordered alphabetically between "lazy" and "lecture". Something like this is sorely needed for our categories, which are currently pretty much unsearchable. The only thing you can do is manually edit the URL to skip to a particular term; there's nothing on the page itself allowing for this, other than the aforementioned TOC templates. So I think that there should be a text box which lets you say "show me terms starting alphabetically from here".

A difficulty is that I have no idea how this might be implemented. The InputBox extension only has a fixed set of things it can do, and setting the start of a category listing isn't one of them. —CodeCat 15:45, 4 November 2015 (UTC)
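For illustration only, the manual URL edit described above can be sketched in a few lines of Python. This assumes the category listing accepts a `from` query parameter on `index.php`; the base URL, the function name `category_url_from` and the example category are assumptions made for the sketch, not anything proposed in this thread.

```python
from urllib.parse import urlencode

# Assumed base URL of the wiki's index.php entry point.
BASE = "https://en.wiktionary.org/w/index.php"


def category_url_from(category: str, start: str) -> str:
    """Build a URL for a category listing that starts at the term `start`.

    Assumes the listing honours a `from` query parameter (hypothetical
    helper; the exact sort-key behaviour depends on the wiki's collation).
    """
    title = f"Category:{category}".replace(" ", "_")
    query = urlencode({"title": title, "from": start})
    return f"{BASE}?{query}"


if __name__ == "__main__":
    # A text box on the category page would effectively do this on submit.
    print(category_url_from("English nouns", "lazy"))
```

A text box would only need to perform this substitution when submitted, which is why a small script or a server-side form could both plausibly cover it.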

Support It seems like we could do it fairly easily with JavaScript, but I really hope we can do it without JavaScript. Maybe we should ask the developers? --WikiTiki89 15:58, 4 November 2015 (UTC)
I wouldn't count on them too much. We've been waiting for years to have custom category collation orders implemented, but it's still not done. —CodeCat 15:59, 4 November 2015 (UTC)
German Wiktionary has been there, done that. See de:Kategorie:Aragonesisch for an example. We could take their script, examine it and take the good parts. It is also a good idea to show the range for a category... --Dixtosa (talk) 16:17, 4 November 2015 (UTC)
Perfect example of why I would prefer doing this without JavaScript. The text box only loads sometimes and I have to refresh the page. --WikiTiki89 16:28, 4 November 2015 (UTC)
Only loads sometimes? How's that? Your JavaScript engine is stochastic? You must be using IE. --Dixtosa (talk) 16:53, 4 November 2015 (UTC)
I often have issues with scripts not working. Sometimes the buttons to expand inflection tables and translation tables go missing. —CodeCat 16:57, 4 November 2015 (UTC)
Nope, Chrome. Sometimes JS fails to load due to faulty internet connections, sometimes it fails to run because of random JS errors. JS always has lots of problems on all browsers. --WikiTiki89 18:56, 4 November 2015 (UTC)
Yes, something along these lines would be good. Equinox 16:45, 4 November 2015 (UTC)
  • Support, and consider similar measures for ending with and containing. Purplebackpack89 21:10, 4 November 2015 (UTC)
    I think "starting with" is the best we can do for now, it's the only thing that the software supports in the URL. I'd love it if categories could be dynamically sorted, either ascending or descending, and from beginning to end or end to beginning. Sorting by the end of the word is very useful especially for suffixes. —CodeCat 21:40, 4 November 2015 (UTC)

Name of the namespace for reconstructed terms

As the vote on whether to create a dedicated namespace for reconstructed terms is nearly over and will in all probability pass, I would like to draw attention to the poll on the vote's discussion page regarding the name of the namespace. If you have not voted and have an opinion, please do so at Wiktionary talk:Votes/2015-09/Creating a namespace for reconstructed terms#What should we name the namespace?. --WikiTiki89 21:17, 4 November 2015 (UTC)

Deleting trademark categories?

These categories were emptied and deleted in 2014 with the text "empty category; should stay empty per WT:TM":

Some terms which pertained to these categories in the past: Durex, Air France, Skidoo, Spezi, Pampers, Pepsi and Coca-Cola.

But these categories still exist:

Should all trademark categories be emptied and deleted because of WT:TM? --Daniel Carrero (talk) 09:33, 5 November 2015 (UTC)

  • I could see us having categories for words originating as trademarks - but we should not be in the business of maintaining the current trademark status of words. If this is all these categories do, they should be dispensed with. bd2412 T 14:05, 5 November 2015 (UTC)
Consider moving the trademark info to the entries' etymology sections, rather than just wiping the category out entirely. Equinox 15:39, 5 November 2015 (UTC)
I agree with BD, delete these categories: I have been deleting them, per WT:TM and the discussions which preceded it, in which a representative of the WMF legal team participated and it was concluded that current legal/trademark status should not be noted. For entries which originated as trademarks, that lexical information should be noted in the etymology, as Equinox says. - -sche (discuss) 21:34, 6 November 2015 (UTC)

All entries ending in -sman

Would it be possible to find all English entries ending in -sman (-swoman, -speople, -sfolk...)? I think that most of these are really -s- + man, and I'd like to correct the etymologies (although of course not all of them are: talisman, chessman and hillsman are not, so an automatic change would be inappropriate). Smurrayinchester (talk) 16:17, 6 November 2015 (UTC)

Also, -shead, -sfoot and -stail. Smurrayinchester (talk) 16:20, 6 November 2015 (UTC)
I've always wished there was a Special:SuffixIndex like the existing Special:PrefixIndex. --WikiTiki89 16:24, 6 November 2015 (UTC)
I wish Special:PrefixIndex had a different name, since you can put any string there regardless of whether it's a prefix or not. And I also really wish Wiktionary had some functionality as a language-specific reverse dictionary. —Aɴɢʀ (talk) 18:34, 6 November 2015 (UTC)
They could call it Special:YouWillNeverGuessWhatThisDoes for all I care as long as we have the functionality. --WikiTiki89 20:04, 6 November 2015 (UTC)
Today I've requested permission to use WM databases on wmflabs. If they grant me a membership, that functionality is going to be a piece of cake. --Dixtosa (talk) 21:00, 6 November 2015 (UTC)
It would be preferable to add that functionality to individual categories, rather than to all pages on a Wiki. Then you could apply it to subsets of words, like English nouns alone. This was mentioned on the Grease Pit a few days ago too. —CodeCat 21:02, 6 November 2015 (UTC)
Which option is preferable depends on what you're using it for. --WikiTiki89 21:22, 6 November 2015 (UTC)
@Smurrayinchester User:DTLHS/sman DTLHS (talk) 19:52, 6 November 2015 (UTC)
Wow, thank you! Smurrayinchester (talk) 19:58, 6 November 2015 (UTC)
Cannot this be done using the DynamicPageList extension, which creates lists of articles based on their category membership? - Amgine/ t·e 22:49, 18 November 2015 (UTC)
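As a rough sketch of the suffix lookup wished for in this thread, the Python below filters a plain list of entry titles by ending and also builds a reverse-dictionary ordering. It assumes the titles have already been obtained somehow (for example from a database dump); `titles_ending_in`, `reverse_index` and the sample titles are hypothetical names and data, not an existing tool.

```python
from typing import Iterable, List


def titles_ending_in(titles: Iterable[str],
                     suffixes: Iterable[str],
                     exceptions: Iterable[str] = ()) -> List[str]:
    """Return the sorted titles that end in any of `suffixes`.

    `exceptions` lists titles to leave out by hand, since not every match
    is really "-s- + man" (talisman is the example given above).
    """
    endings = tuple(suffixes)
    skip = set(exceptions)
    return sorted(t for t in titles if t.endswith(endings) and t not in skip)


def reverse_index(titles: Iterable[str]) -> List[str]:
    """Sort titles by their reversed spelling, as a reverse dictionary would."""
    return sorted(titles, key=lambda t: t[::-1])


if __name__ == "__main__":
    sample = ["salesman", "talisman", "woman", "spokeswoman", "townsfolk", "man"]
    print(titles_ending_in(sample, ["sman", "swoman", "sfolk"],
                           exceptions=["talisman"]))
    print(reverse_index(["chessman", "birdsong", "salesman"]))
```

Because the exceptions still have to be picked by a person, the output is best treated as a worklist for manual review rather than input for an automatic change.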

Formatting of ellipses to indicate elisions in attestations

Apologies if I've asked this in the wrong place, I'm still learning my way around Wiktionary.

I noticed that in egg on, where part of the passage in the attestation from In the South Seas was omitted, the periods in the ellipsis are spaced out . . . like so. I know that the English Wikipedia's Manual of Style prefers unspaced ... ellipses (and gives several reasons why), but I couldn't find the equivalent style guide for Wiktionary. Is there a preference here, and if so, where would I find it? —GrammarFascist (talk) 19:36, 6 November 2015 (UTC)

We have the template {{...}}, which seems to be a good way of achieving uniformity. —Aɴɢʀ (talk) 19:40, 6 November 2015 (UTC)
Thanks, Aɴɢʀ! Template applied, looks great. —GrammarFascist (talk) 22:25, 8 November 2015 (UTC)

Expanding on WT:COALMINE

Right now, WT:CFI has the following text: "Unidiomatic terms made up of multiple words are included if they are significantly more common than single-word spellings that meet criteria for inclusion". The canonical example is that coal mine is allowed because coalmine is. But I think the phrasing here misses the point of what it really should convey, which I think is:

If there is a lemma, of which at least one of the alternative forms is includable under the idiomaticity criterium, then all of them should be considered includable under that criterium.

The idea here is that idiomaticity shouldn't depend on orthographic representation, so if a form has different spellings, it makes sense to treat the collection of them as idiomatic if at least one of them is. Should CFI be modified to this effect? It would have the consequence of removing the "significantly more common" part, meaning that if coalmine were the most common form, then coal mine, being an alternative form of it, should also be includable. —CodeCat 00:48, 7 November 2015 (UTC)

Wouldn't that mean that coalmine would not be includable, since "coal mine" isn't idiomatic? DTLHS (talk) 01:08, 7 November 2015 (UTC)
No, it would work in both directions. coalmine is idiomatic, therefore all alternative forms of it are also includable, coal mine too. So it only expands on what's includable, it doesn't remove anything that's includable now. —CodeCat 01:10, 7 November 2015 (UTC)
  • Support We need to expand CFI. CFI as it's written is too restrictive and is costing us readers. Purplebackpack89 02:11, 7 November 2015 (UTC)
    @Purplebackpack89 Any evidence that CFI "is costing us readers"? DCDuring TALK 03:58, 7 November 2015 (UTC)
    @DCDuring The way I figure it, people come to an online dictionary to learn the meaning of a particular word. If that particular word isn't on Wiktionary, they go someplace else for that word...and there's a strong chance they never come back to Wiktionary. If we don't have entries people are looking for, particularly if other dictionaries have them, we will lose readers. And we know that we're not the most-read online dictionary. Purplebackpack89 13:57, 7 November 2015 (UTC)
    @Purplebackpack89 I get the argument, but I'd like some evidence to support your bold, unqualified presentation of a theory as a fact. I wonder if it isn't this and other violations of Gricean maxims that makes so many of your contributions to talk pages so irritating to me. DCDuring TALK 02:06, 8 November 2015 (UTC)
    1. Did you read what I wrote below about readership?
    2. Do you have any evidence to contradict my theory? (Because I seriously doubt you do) Purplebackpack89 02:55, 8 November 2015 (UTC)
    He also doesn't have evidence to contradict any theory that the readers are Presbyterians who like to take long walks. And you don't have any evidence for the assumptions that you're using to derive your theory. It may make perfect sense to you, but it has no more to back it up than an out-and-out assertion. Chuck Entz (talk) 03:51, 8 November 2015 (UTC)
    Same with DCDuring's theory, if he has one. The bottom line, @DCDuring @Chuck Entz, is:
    1. What words we have and what words we don't should be based on how our readers think, but
    2. We don't know how our readers think

Purplebackpack89 04:08, 8 November 2015 (UTC)

  • I like the idea of clarifying the wording (as it stands, you could argue that COALMINE sanctions red car, since we have Redcar), and in general I think this is a pretty good way of explaining the logic behind the policy, and I wouldn't be opposed to loosening it up a bit (I'd be fine with bird song as an alternative form of birdsong, for instance). That said, this would allow lots of useless entries from contractions and acronyms (we don't need I am not a lawyer, even if we do have IANAL, and I don't think it's automatically implies we need it is). Smurrayinchester (talk) 09:19, 7 November 2015 (UTC)
    • If we don't want to allow those possibilities, we can change the rule. We probably want to limit it only to alternative spellings, since coal mine and coalmine are really the exact same thing, but it is and it's are further apart. —CodeCat 13:34, 7 November 2015 (UTC)
    Apparently we already have bird song. Huh. Smurrayinchester (talk) 09:20, 7 November 2015 (UTC)
  • COALMINE is clear and unambiguous whereas the proposal depends on what "alternative form" means. I see no example of what would be newly included under the proposed change. A minor point, "criterium" is a rare spelling; the usual spelling is "criterion". --Dan Polansky (talk) 09:37, 7 November 2015 (UTC)
To recap history: The COALMINE vote was based on one of the set of Pawley criteria that in his opinion supported finding an MWE idiomatic. We made it a sufficient criterion to eliminate some of the more pointless debates about inclusion, without regard to occasional instances of what might be considered errors of inclusion. It has succeeded in this regard, IMO.
I'd be a bit concerned that such very rare alternative forms like cole mine might be deemed includable should they be found attested. Making them automatically includable seems like another invitation to obsessive compulsive would-be contributors to add valueless bulk to Wiktionary. DCDuring TALK 13:06, 7 November 2015 (UTC)
If cole mine meets CFI (at least three attestations from independent sources over the space of more than a year) I see no reason not to include it. We're not paper, and our usefulness is not determined by dividing the number of valuable entries by the number of total entries. —Aɴɢʀ (talk) 13:50, 7 November 2015 (UTC)
Can you give an example of something that would be included under this rule but not under the current WT:COALMINE? I lean towards oppose for the same reasons as Dan. On WT:RFD Dixtosa suggested that free throw percentage met COALMINE because of FT%, which Wikitiki noted (and I agree) is not the case. (A lot of unidiomatic phrases have abbreviations.) Btw, if something like this is added, I suggest rephrasing to "...at least one of the alternative forms meets the idiomaticity criterion, then all of them should be considered to meet that criterion" (the term may still not be "includable", if it fails to meet other criteria). - -sche (discuss) 20:41, 7 November 2015 (UTC)
FWIW, @-sche, I think we need to explore having a CFI that allows us to include phrases, such as free throw percentage, that are commonly abbreviated. It seems odd that there are many instances where an abbreviation can be included, but the thing it's abbreviating can't. IMO both should be included. Would what you are suggesting achieve that? Purplebackpack89 20:56, 7 November 2015 (UTC)
As I said above, that would give us entries like I am not a lawyer and deformities, contusions, abrasions, punctures / penetrations, burns, lacerations, swelling, tenderness, instability, and crepitus. Would the extra effort it would take to verify these terms, clean and maintain them, and find some way to define them in a useful but non-encyclopedic way be the best way to spend our (often busy and rather tetchy) volunteers' time? Smurrayinchester (talk) 22:02, 7 November 2015 (UTC)
If people want to spend their time that way, they should be allowed to. It's not like expanding CFI automatically means a lot of work for all concerned. It doesn't mean people automatically have to go out and create entries; rather, they are allowed to do it at whatever pace they want. And if your question is "does it improve the project to have those entries", IMO it does. As I said above, we lose editors who are looking for things like "I am not a lawyer" and can't find it. Purplebackpack89 23:31, 7 November 2015 (UTC)
How do you know we lose them, or that people want to look up full sentences? DCDuring asked for evidence and you seemed to dodge the question. Equinox 23:32, 7 November 2015 (UTC)
How do you know we don't, Equinox? Follow my logic here: UrbanDictionary has more readers than we do. However, UrbanDictionary has (on average) lower-quality entries than we do. If UrbanDictionary has lower-quality entries than we do, how come they have more readers? The answer seems to me to be because UrbanDictionary has entries for words and phrases we don't, in particular whole-phrase and whole-sentence entries. And the conclusion from this seems to be that a lot of readers value quantity over quality; they would rather have a bad entry than no entry at all. Purplebackpack89 23:40, 7 November 2015 (UTC)
You have no logic. Because we value quantity over quality we aren't including more phrases? And the entire reason that urban dictionary has more users than us is because they have a CFI, never mind the host of other reasons that that could be the case? It couldn't be that urban dictionary was designed from the ground up to actually be usable, as opposed to trying to shoehorn a dictionary into software that was designed for creating an encyclopedia. That would be crazy. DTLHS (talk) 23:58, 7 November 2015 (UTC)
'Cuse me, did I say editors? I meant readers. Also, I think you have us and UD mixed up, we have a CFI and they don't. Also, I'm not sure I blame the lack of success of this project on being too much like Wikipedia; if anything, I think we're not enough like Wikipedia. After all, Wikipedia is like Wikipedia and they're viewed more than UD. I'd also reiterate that CFI is something that makes us less usable. @DTLHS, it's never made sense to me that somehow we'd have more readers with fewer entries, at least not at the point in the project we are now. Worrying only about quality at this point in our project will not get us increased readership, or increased editorship. Purplebackpack89 00:06, 8 November 2015 (UTC)
Statistics just tell you how many people visit either site, not why they go there. UD gets a lot of its readers by being outrageous, provocative, and entertaining. People make up words and definitions there for the fun of it, or for the attention, and that's more entertaining than information about plain old ordinary usage.
We may be losing readers of the same sort as the TV viewers that watch car chases and reality shows, but that's unavoidable if you want to be a reference work that focuses on accuracy and reliability- and I, for one, don't miss them. Chuck Entz (talk) 01:45, 8 November 2015 (UTC)
I fully agree with Chuck. I like us to be sensitive to serious learners at virtually any level, but not to those whom we often end up blocking as vandals because they insert UD-quality entries and definitions. DCDuring TALK 01:59, 8 November 2015 (UTC)
OK, where's the evidence that most of the people who choose to read (yes, read, not edit) UD over Wiktionary are sensationalist vandals? That seems like a quite extraordinary claim, much more so than the claims I've advanced in this thread. Do you really believe that there's no middle ground between sensationalist vandals and you yourself, i.e. are there not people who don't create or read the most questionable of UD entries, but would like to see more two-word entries at Wiktionary, because they don't know what those words mean and would like to? Also, your attitude toward UD-but-not-Wiktionary users is quite elitist...it seems you want a well-written dictionary, and you care not if anybody reads it. Isn't having a well-written dictionary pointless if nobody reads it? Purplebackpack89 04:08, 8 November 2015 (UTC)
It looks to me like he was referring to a certain group of undesirable editors as an example of the sort of user he wouldn't be concerned with losing, not making blanket assertions about everybody. At any rate, the answer is, as usual, somewhere in between: if we try to be just like Urban Dictionary, we'll lose out to them, anyway, since they're better at being Urban Dictionary than we are. Even if we are losing some readers, we're also retaining some who would be turned off by lowered standards and/or pandering to Urban Dictionary constituencies. I also question how many people would be looking up "I am not a lawyer" that would be satisfied with a wordier paraphrase as a definition- after all, the best definition of "I am not a lawyer" is "literally: 'I am not a lawyer'". A lot of our proverb entries have this problem: the proverb is usually the most concise and efficient way to express the idea, so the definition tends to be much more wordy and technical-sounding, often to the point of being unreadable sludge. Chuck Entz (talk) 04:47, 8 November 2015 (UTC)
I think you and DC overestimate the amount of people who prefer no definition to a poor definition, and underestimate those who prefer a poor definition to no definition. Again, I say, "If there are so many people who care about quality of definitions out there, then why do more people use UD than us?" I don't think that UD's success can be explained away by its different format (which, IMO, is MORE confusing than ours) or by the fact that some of its entries are jocular. A great deal of it has to be due to its lack of an arbitrary, overly-strict CFI. Purplebackpack89 05:16, 8 November 2015 (UTC)
Also, on your I am not a lawyer argument, I'm looking at CFI/RFD from a different angle than you are. You look at the things you think we don't need and say CFI is fine. I look at the things we've deleted or come way too close to deleting (television show was deleted for over a year, field goal percentage will probably get deleted next week) and am appalled. And if you think I'm pushing for lots and lots of bad definitions that have to be kept, remember that we'll still have RFV. Purplebackpack89 12:04, 8 November 2015 (UTC)
COALMINE aside, I agree that “idiomaticity shouldn't depend on orthographic representation”. Our non-following of this idea leads to some problems:
  • words written without a space are considered idiomatic by default. While this works well enough for English (with some caveats), it falls short for languages with more rigid orthographic standards, such as German (compounds written together, no matter how unidiomatic), Spanish (clitic pronouns written together), Latin (-que) as well as for ancient languages that used scriptio continua.
  • words that would otherwise be citable fail RFV. I specifically remember two cases: years ago I was trying to find citations for Copenhagenisation/-ization, I found that one spelling had one cite and the other had two; the other case was a slang two-word compound for clitoris that was being RFVed (I don’t remember the exact word), I found two citations using XY, two using X-Y and two using X Y. The word failed RFV because each spelling had fewer than three cites, even though the word itself had 6.
About Urban Dictionary: even if Wiktionary had a readership of zero, that would be a situation preferable to becoming more like UD. They can market themselves however they like, but UD is as much a dictionary as a porn website is an educational website about anatomy. — Ungoliant (falai) 18:16, 8 November 2015 (UTC)
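The RFV example above (two citations each for XY, X-Y and X Y) comes down to simple arithmetic, sketched here in Python. The threshold of three is the attestation count mentioned earlier in this discussion; pooling citations across spellings is only the idea being argued for, not current practice, and the function names are hypothetical.

```python
from typing import Dict

# "At least three attestations", as cited earlier in this thread.
MIN_CITES = 3


def passes_per_spelling(cites: Dict[str, int]) -> Dict[str, bool]:
    """Check each spelling on its own against the threshold."""
    return {spelling: n >= MIN_CITES for spelling, n in cites.items()}


def passes_pooled(cites: Dict[str, int]) -> bool:
    """Hypothetical alternative: count all spellings of one word together."""
    return sum(cites.values()) >= MIN_CITES


if __name__ == "__main__":
    cites = {"XY": 2, "X-Y": 2, "X Y": 2}
    print(passes_per_spelling(cites))  # no single spelling reaches three
    print(passes_pooled(cites))        # six citations for the word overall
```

Counted separately, every spelling falls short; counted together, the word has twice the required number of citations.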
If Wiktionary's readership is zero, we've all wasted our time. And there are definitions on UD that not only could easily pass RfD and RfV, they are better-worded than some of the definitions we have. People around here paint UD with far too broad a brush. Purplebackpack89 23:33, 8 November 2015 (UTC)
Talk:gaplapper is similar to what you're talking about. - -sche (discuss) 07:04, 9 November 2015 (UTC)
Can't support the proposal in its current form. There have been too many counterexamples given to the proposed rule, such as "free throw percentage", "cole mine", DCAPBLSTIC, and the lawyer one. Perhaps CFI could be expanded in a similar but more restrictive way to the proposal, such as by explicitly excluding acronym expansions and very rare forms (or making them case-by-case), but there clearly is far from unanimous support for their inclusion. Also, I would not be sad if no one ever engaged in a back-and-forth about what is or is not "costing us readers" ever again. Please just stop. —Pengo (talk) 07:19, 9 November 2015 (UTC)
I didn't read this proposal as including acronym forms. bd2412 T 15:00, 9 November 2015 (UTC)
Pengo, it's important to have a strong theoretical basis when voting in proposals like these. There's nothing wrong with saying you want policy changed to increase readership; there's nothing really wrong with saying you want policy changed to increase quality either. Purplebackpack89 15:34, 9 November 2015 (UTC)
Oppose since we do not have a clear definition of "alternative form" and this would then lead to the inclusion of all kinds of things that should not be included. --WikiTiki89 15:03, 9 November 2015 (UTC)
@Wikitiki89, your comments beg the questions "what things shouldn't be included", and "why shouldn't they?" Purplebackpack89 15:34, 9 November 2015 (UTC)

Terms with audio links and IPA pronunciation -- where should they go in the category tree?

There are now two category types, e.g. Category:Russian terms with audio links and Category:Russian terms with IPA pronunciation, that are currently placed under "entry maintenance", but this doesn't seem right, since that mostly concerns mistakes that ought to be corrected. Where should they go? Should we create a new category tree section for pronunciation? Benwing2 (talk) 07:36, 8 November 2015 (UTC)

Are phrases not listed in lemmas?

Most categories are listed under lemmas, but I can't find one for phrases. Are they hidden somewhere? [24] Donnanz (talk) 13:16, 8 November 2015 (UTC)

Added. — Ungoliant (falai) 17:58, 8 November 2015 (UTC)
Thanks a lot. It only appears in the "subcategories" list though, and not in the list at the top. I checked English lemmas as well. Donnanz (talk) 21:08, 8 November 2015 (UTC)
  • Obviously done without further comment. Donnanz (talk) 12:47, 10 November 2015 (UTC)

Community Wishlist Survey

Hi everyone!

The Community Tech team at the Wikimedia Foundation is focused on building improved curation and moderation tools for experienced Wikimedia contributors. We're now starting a Community Wishlist Survey to find the most useful projects that we can work on.

For phase 1 of the survey, we're inviting all active contributors to submit brief proposals, explaining the project that you'd like us to work on, and why it's important. Phase 1 will last for 2 weeks. In phase 2, we'll ask you to vote on the proposals. Afterwards, we'll analyze the top 10 proposals and create a prioritized wishlist.

While most of this process will be conducted in English, we're inviting people from any Wikimedia wiki to submit proposals. We'll also invite volunteer translators to help translate proposals into English.

Your proposal should include: the problem that you want to solve, who would benefit, and a proposed solution, if you have one. You can submit your proposal on the Community Wishlist Survey page, using the entry field and the big blue button. We will be accepting proposals for 2 weeks, ending on November 23.

We're looking forward to hearing your ideas!

MediaWiki message delivery (talk) 21:30, 9 November 2015 (UTC)

Proposal: Disallow votes to be created by someone who intends to vote oppose

Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines was created by User:Dan Polansky who immediately voted oppose. Although he claimed that "the supporters were given enough time to tweak the vote", the vote was not crafted in a way that enough editors would have supported it. Several of the voters who voted oppose or abstain expressed that they would have supported a similar but less restrictive vote. Thus, the measure was not given a fair chance at succeeding. Therefore, I would like to propose that votes must be written by an editor or editors who intend to support the vote and thus should be willing to put enough effort into its phrasing to make it passable. --WikiTiki89 22:35, 9 November 2015 (UTC)

I wonder whether there is any precedent in real-world political systems for banning this kind of voting strategy. Equinox 22:38, 9 November 2015 (UTC)
Long ago when I was in Middle School, we were voting for class representatives and one student asked if they were allowed to vote for themselves. The teacher replied that on the contrary if they do not vote for themselves then they should not be running. I'm sure someone more into politics than I am would be able to give more concrete examples from real politics. --WikiTiki89 22:51, 9 November 2015 (UTC)
  • This is votes only? It doesn't apply to RfDs and RfVs? Purplebackpack89 23:05, 9 November 2015 (UTC)
    Votes only. We already have a similar unofficial policy at RFD that you should not nominate something you intend to vote keep for (of course you're allowed to change your mind after the fact); and there is no voting involved at RFV. --WikiTiki89 23:11, 9 November 2015 (UTC)
    And we don't enforce it lol. I've seen a number of "courtesy" or "administrative" RfDs and RfVs that were filled out by editors after somebody tagged an entry for RfD or RfV but never filled out the RfD or RfV. Purplebackpack89 23:14, 9 November 2015 (UTC)
    That's different, because whoever tagged the entry is effectively the one nominating it. But this discussion is not meant to concern RFD, so let's please stay on topic. --WikiTiki89 23:16, 9 November 2015 (UTC)
    • People raise Rf[DV]s all the time because they honestly don't think a term is attest(ed|able), but are giving people a chance to prove them wrong. Indeed, that's kind of the whole point of them. This proposal, on the other hand... I can see the point being made, that people shouldn't raise a vote in order to make it fail in order to create a precedent they like, or to make a political point, or to grief their opponents. (The Australian Republican Referendum comes to mind, where exactly that happened: it was set up by people who wanted it to fail, in such a way to force it to fail, when most people actually supported the idea.) But then, making the legality of a proposal dependant on the intent of the proposer is a really, really bad idea. Because then we'll get counter-griefers claiming that "the proposer wasn't doing this in good faith: this vote is void" if it looks like going against them. Will you nullify a vote if the proposer didn't write it well enough? If they don't defend it well enough? (How do you decide "enough"?) If they change their mind? --Catsidhe (verba, facta) 23:19, 9 November 2015 (UTC)
      • This rule does not have to affect the outcome of votes. We can start with having this be an unenforceable, but official, request to anyone creating a vote. If it remains a problem we can discuss how we can penalize creators of such votes. Keep in mind though, I am not assuming any ill-will on the part of creators of such votes. Even if they have good intentions, they do not have the same incentives to make reasonable proposals or compromises. --WikiTiki89 00:11, 10 November 2015 (UTC)
I agree it is problematic, but I'm opposed to making a hard rule against it. What if the proposer changes his or her mind? What if the proposal is simply written in the negative saying we "shouldn't" do something, either because it makes sense to do so or to avoid this rule? What if the question is very simple and already discussed elsewhere? I think it might make a good guideline to suggest proponents of a change be the one to put it forward, as should have been done in this case, but I don't think it would make sense to make a hard rule about it without some very careful consideration. Pengo (talk) 01:40, 10 November 2015 (UTC)
Where did I say above that the proposer can't change his mind? --WikiTiki89 01:57, 10 November 2015 (UTC)
@Wikitiki89: If the proposer can change their mind, then they can get around this rule by proposing something, saying they're supporting it, and then changing their mind shortly afterwards. It might seem silly, but it demonstrates a problem with this rule. Pengo (talk) 09:32, 11 November 2015 (UTC)
That's assuming bad faith on the part of the proposer. I don't see it as a huge problem, since most editors involved in votes generally act in good faith. --WikiTiki89 15:24, 11 November 2015 (UTC)
I think it's worrisome, but I don't know if it's really a big deal. The current vote in question makes the criteria rather narrow, yes, but that means the opposition votes are equally narrow in scope. As it stands now, we could just change the separator in the template from ; to something else, to comply with the vote. The vote outcome only indicates that people don't want the particular combination Dan proposed, but it says nothing about any others. —CodeCat 01:53, 10 November 2015 (UTC)
It wastes people's time and it misleads people who look back at it in the future. It's better to have the supporters craft a better proposal to begin with. --WikiTiki89 01:57, 10 November 2015 (UTC)
  • This is a generally good idea, but it could be abused if made a rule. I think at present we really only have one editor who does this, and making legislation against him individually is absurdly heavy-handed. But as an informal expectation, I agree that I'd prefer if all editors were to avoid doing this. —Μετάknowledgediscuss/deeds 04:19, 10 November 2015 (UTC)
  • I naturally oppose that "votes must be written by an editor or editors who intend to support the vote". I have never intentionally made a bad wording for a vote. Actually, specific defects in the wordings that I have created have not been stated, only that, in principle, as an opposer, I have an objectively existing interest in creating a bad wording. People can still create other votes, with their preferred wording. Even now. They can make wording improvements. By creating votes for CodeCat's proposals and for CodeCat's undiscussed (not even proposed) changes visible in the mainspace, I make sure non-consensual volume changes by CodeCat are limited at least to an extent. Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines is a great example of such a vote; I did not really expect so many opposes, and we now have clear objective evidence of the scope of support for that proposal. CodeCat proceeded to make changes anyway, e.g. in diff, evidence of the need to determine the scope of support in a transparent manner enabled by the vote. By the way, I support using my continuing extension algorithm on Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines to give it a chance rather than closing it as failed. --Dan Polansky (talk) 09:11, 14 November 2015 (UTC)
    @Dan Polansky: Of course you wouldn't intentionally provide bad wording for a vote. My whole point is that if you don't support the vote, then you are not in a good position to provide the best wording. Even unintentionally, your wording will most likely be worse. --WikiTiki89 16:08, 14 November 2015 (UTC)
    @Wikitiki89: "your wording will most likely be worse": I don't think that to be true. My wording can turn out to be worse, but not "most likely". Whether my wording will be better depends not only on my potential bias but also on my drafting skills. For the discussed vote, no specific proposal of alternative wording has been made. We really have no tangible material or evidence to support the hypothesis that I create poorly drafted votes that fail not because of lack of support but rather because of poor drafting. In fact, Wiktionary:Votes/pl-2015-08/Templatizing usage examples looks like passing despite my opposition. --Dan Polansky (talk) 16:26, 14 November 2015 (UTC)
    @Dan Polansky: Good point about Wiktionary:Votes/pl-2015-08/Templatizing usage examples, it does look passable.
    That said, I am worried about the "Nested inflection lines" vote. I am inclined to agree with @Wikitiki89 on the statement: "My whole point is that if you don't support the vote, then you are not in a good position to provide the best wording." At diff, Dan Polansky argued that the supporters had enough time to improve the vote but they failed to do so. I think this further proves Wikitiki89's point: If someone is planning to create a vote with the intention of opposing it, then help from supporters of the vote is a requirement, otherwise it is likely that the vote is not going to have the best wording.
    Suggestion: @Dan Polansky, if you want to create a vote with the intention of preventing something from happening, IMO it would be okay to be straightforward and propose just that. See example below.
    "Voting on:
    • Disallowing the mass creation of nested inflected forms, until clear consensus is reached on the acceptance of nested inflected forms, and also what exactly should be the wikicode for that, and what exactly should appear on the entries as a result."
    I think this way you could express your points of view more clearly and openly. The fact that you oppose edits being done en masse without consensus would be the real rationale for the creation of your vote, which you could further elaborate in your words, as you have done to justify the votes you created with the intention of voting oppose. --Daniel Carrero (talk) 16:50, 14 November 2015 (UTC)
    • That is not an acceptable proposal. The whole point is that I do not need consensus to prevent non-consensual changes. The person making a volume deviation from status quo ante needs to gain consensus. When that person does not create the required vote, and does not provide wording input to a vote created by someone else, I end up with little option left other than creating a vote that I oppose. The proposal plays into CodeCat's non-consensual-volume-change cards in ways that really make it not workable, and that contradict the consensus principle. --Dan Polansky (talk) 16:57, 14 November 2015 (UTC)
    • Another thing is that if none of the supporters are willing to draft a vote themselves, then the vote is unlikely to have enough supporters to pass. If a vote is unlikely to pass, then it is a waste of time. I also agree with Daniel Carrero that a potential solution is to reword the vote in the negative. A negative vote that passes is entirely different from a positive vote that fails (and vice versa). --WikiTiki89 16:58, 14 November 2015 (UTC)
      • I really think this discussion is trying to solve a non-existent problem. Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines failed. Now its supporters should pull up their sleeves and draft a proposal that can pass. --Dan Polansky (talk) 17:01, 14 November 2015 (UTC)
        • That's exactly the problem, it wasted everyone's time because it failed and accomplished nothing. --WikiTiki89 17:11, 14 November 2015 (UTC)
          • It accomplished expansion of our knowledge, which was its main purpose. We now know that people do not support the proposal. People also left specific comments on why they do not support it, from which opposition from other similar proposals can be inferred. --Dan Polansky (talk) 17:26, 14 November 2015 (UTC)
    Let me register my disappointment with the lack of disciplinary action against CodeCat, for the likes of diff. Under the regular circumstances of the rule by consensus, there really would be no need for me to create such votes. The votes are a slightly unusual measure to deal with the problem of non-consensual volume changes that, for some reason that I do not know, have not been dealt with by the bureaucrats and admins. I believe CodeCat's actions exemplified by diff generally constitute a blockable offence. --Dan Polansky (talk) 09:15, 14 November 2015 (UTC)
  • Let me note that the wording is not so narrow as CodeCat above suggested. The specific wikicode format is not part of the voted proposal, since the vote text says "possibly like" before it gives the example, emphasis on "possibly". What was part of the proposal was "creating the new formatting by means of a single template invocation instead of multiple ones as before". --Dan Polansky (talk) 10:06, 14 November 2015 (UTC)
  • I don't think we need to ban this. Discourage it using dialogue, sure; outright ban, no. Renard Migrant (talk) 16:55, 14 November 2015 (UTC)

Closing RfDs...we need floors

At present, we have no floors for when an RfD is closed. I'd like to propose the following:

  • If there are three votes of unanimous opinion, or five votes of unanimous opinion with only one dissent, the RfD can be closed immediately.
  • If an RfD has gone on at least a week and has at least five votes, it may be closed if 65% or more of the votes are of the same opinion
  • Any RfD can be closed after a month, regardless of the number of votes or how the votes are distributed.

Of course, we could go longer if we wanted, but we wouldn't have to. This would have the added benefit of de-cluttering RfD of discussions that haven't been commented on in months, or have clear outcomes. Purplebackpack89 23:41, 9 November 2015 (UTC)
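The three proposed floors can be read as a single check. The following is only a sketch of one possible reading, not part of the proposal itself: the function name and parameters are hypothetical, "three votes of unanimous opinion" is assumed to mean at least three votes with no dissent, "five votes with only one dissent" is assumed to mean at least five votes on one side against exactly one on the other, and "a month" is assumed to mean 30 days.

```python
def may_close(votes_for, votes_against, days_open):
    """Return True if an RfD could be closed under the proposed floors.

    Hypothetical helper: the names and the exact reading of each rule
    are assumptions made for illustration only.
    """
    majority = max(votes_for, votes_against)
    minority = min(votes_for, votes_against)
    total = majority + minority

    # Floor 1: at least 3 unanimous votes, or at least 5 votes against a
    # single dissent, allows closing immediately.
    if (minority == 0 and majority >= 3) or (minority == 1 and majority >= 5):
        return True

    # Floor 2: open for at least a week, at least 5 votes, and 65% or more
    # of the votes on the same side.
    if days_open >= 7 and total >= 5 and majority / total >= 0.65:
        return True

    # Floor 3: after a month (assumed here to be 30 days), any RfD may be
    # closed regardless of the vote count or distribution.
    return days_open >= 30
```

Under this reading, a 3-0 discussion could be closed on the day it is opened, which is the case the replies below object to.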

Oppose. People commenting on RFDs should be encouraged to take their time to analyse the situation and form an opinion. I mean, it’s one thing to try to speed up the bureaucracy, but your suggestions (especially the first one) will mean that a discussion can be closed within minutes of being opened, or after a new point is posted. The practical result of this is that people will tend to rush to a conclusion and skip careful reading and research. — Ungoliant (falai) 00:02, 10 November 2015 (UTC)
Ungoliant, would you like me to up the number of votes on the first one, or the percentage on the second? Also, I doubt that the result of this will be lots of deletion discussions being closed very quickly...after all, half of deletion discussions don't even get five votes in the first week. If a discussion is 5-0 or 5-1 one way, is it likely to go the other way? No! For one, you'd need about 13-14 total votes to get them to go the other way, and 80-90% of the RfDs on this project don't even have ten votes. There's no point in prolonging the inevitable. Purplebackpack89 00:08, 10 November 2015 (UTC)
Furthermore, the "careful reading and research" should take place before a person votes, so if something is closed quickly, it's because the people who voted came to a decision to vote quickly. And in many RfD cases, people can and do vote quickly, because questions of "research" generally end up at RfV rather than RfD. Purplebackpack89 00:14, 10 November 2015 (UTC)
It’s no wonder you think that, since rushing to a conclusion and skipping careful reading and research is what you already do anyway. — Ungoliant (falai) 00:16, 10 November 2015 (UTC)
Oppose. I would favor a rule stating that an RFD discussion cannot be closed if there has been a new vote or a new contribution to the discussion (barring filibustering) in the past seven days. --WikiTiki89 00:25, 10 November 2015 (UTC)
But, Wikitiki, that means that if seven people vote delete on an article in the first 72 hours, you still can't close it until a week after the last vote. Why is it so necessary that a discussion be dead for a week? Isn't how long the discussion is open more relevant than how long it's dead? Purplebackpack89 00:29, 10 November 2015 (UTC)
Partly to make sure you don't get a team of editors to quickly put in seven delete votes, close the discussion, and delete the entry before anyone else notices. If it's been dead for a week, there is a good chance no one else has anything to say. --WikiTiki89 00:32, 10 November 2015 (UTC)
Wikitiki, that particular example ignores the practicalities of the situation: a) Seven votes will carry the vast, vast majority of RfD discussions, regardless of length; b) considering the make-up of RfD participants, a team of seven editors is bound to include a sysop who could delete the entry whenever anyway. Purplebackpack89 00:38, 10 November 2015 (UTC)
It's better to be on the safe side. It's not harmful to let a 7-0 lead sit there for a week. It is, however, harmful if the discussion is closed before someone who wanted to vote got a chance to. Keep in mind also that the ratio of votes matters: there is a significant difference between a word deleted 7-3 and one deleted 10-0. --WikiTiki89 00:44, 10 November 2015 (UTC)
That's why you let them run a week unless there's a huge pile-on consensus. The way to make sure people have time to vote is to let it run a week, not let it be dead a week. Purplebackpack89 01:03, 10 November 2015 (UTC)
But you also have to give people who already voted a chance to read any new information or arguments and rethink their position. --WikiTiki89 01:35, 10 November 2015 (UTC)
  • Oppose. There is no major problem that this proposal solves. DCDuring TALK 04:12, 10 November 2015 (UTC)
  • Oppose per Ungoliant and especially DCDuring. —Μετάknowledgediscuss/deeds 04:15, 10 November 2015 (UTC)
  • Oppose. This proposal assumes that editors who might care about RFDs are always online. Some of us have to step away for a while due to circumstances in real life. Closing anything immediately risks shutting out and disenfranchising any editor who cannot be on Wiktionary all the time.
Also, per DCDuring, this is a solution in search of a problem. ‑‑ Eiríkr Útlendi │Tala við mig 18:17, 10 November 2015 (UTC)
  • Oppose. There is no great harm done by letting the discussion remain open for a few extra days, even after a unanimous consensus has been reached. bd2412 T 19:04, 11 November 2015 (UTC)

WT:RFM#Continuation of #Category:en:Names into Category:English names

I'm notifying users of this discussion, since it's not so trivial and needs input from many people. —CodeCat 00:48, 10 November 2015 (UTC)

German "du contractions" (haste, fährste...)

What should we do about colloquial German contractions like hast du > haste, where the sound of the "du" gets lost in the "-st" and the whole thing is reduced to a schwa? It's a fairly universal process, but of course it can only occur with verbs that are used in a colloquial context, so not all -ste forms will be attested. I've added a sense at haste, and created a page at fährste, but I'd like to get some opinions from other German editors about how far to go and how to format the entries before rolling this out any further. Smurrayinchester (talk) 10:42, 10 November 2015 (UTC)

Yiddish has a similar problem האָסט דו ‎(host du) > האָסטו ‎(hostu). We have an entry for the suffix ־ו ‎(-u) itself, but I don't really think it is worth creating an entry for each verb with this suffix. --WikiTiki89 16:05, 10 November 2015 (UTC)
Brabantian Dutch also has this contraction, though it's etymologically different (it's the 2nd person plural in origin). I think there should be entries for them if they are attested. —CodeCat 18:42, 10 November 2015 (UTC)
Maybe also to consider: es contractions, like geht es -> geht's or gehts. - 19:13, 10 November 2015 (UTC)

Wikimania 2016 scholarships ambassadors needed

Hello! Wikimania 2016 scholarships will soon be open; by the end of the week we'll form the committee and we need your help, see Scholarship committee for details.

If you want to carefully review nearly a thousand applications in January, you might be a perfect committee member. Otherwise, you can volunteer as "ambassador": you will observe all the committee activities, ensure that people from your language or project manage to apply for a scholarship, translate scholarship applications written in your language to English and so on. Ambassadors are allowed to ask for a scholarship, unlike committee members.

Wikimania 2016 scholarships subteam 10:47, 10 November 2015 (UTC)

Wiktionary:Votes/pl-2015-10/Entry name section - started

FYI: Wiktionary:Votes/pl-2015-10/Entry name section started today. --Daniel Carrero (talk) 03:42, 11 November 2015 (UTC)

I looked through it. It looks mostly like formalizing existing practice. Is there anything controversial or anything being proposed that isn't existing practice? Benwing2 (talk) 04:58, 11 November 2015 (UTC)
You're right, that was what I had in mind when creating the vote: it's about formalizing existing practice. I believe there's nothing controversial there; also, I'm not proposing anything new. --Daniel Carrero (talk) 05:41, 11 November 2015 (UTC)

rfi categories

I edited {{rfi}} to allow categorization into language-specific categories.

But there are 6,597 entries using {{rfi}} without a language code. Those entries are currently categorized in Category:Entries needing images by language. Can a bot add the language code to all entries, please? Thank you. --Daniel Carrero (talk) 00:13, 12 November 2015 (UTC)

P.S.: This template has lots of redirects: {{reqphoto}}, {{Reqphoto}}, {{rfimage}}, {{rfdrawing}} and {{rfphoto}}. If a bot can add the language code to all these entries as I requested, I also suggest changing all these to {{rfi}} for consistency. --Daniel Carrero (talk) 00:55, 12 November 2015 (UTC)
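As a rough illustration of what such a bot pass might do to a single page, here is a minimal sketch. It makes several assumptions that are not stated in the request: that the language code is passed as a `lang=` parameter, that the language of each request can be taken from the enclosing level-2 heading, and that a small hand-written name-to-code table stands in for the full language list. The function and table names are hypothetical.

```python
import re

# Hypothetical, deliberately tiny name-to-code table; a real bot would
# need the complete list of language names and codes.
LANG_CODES = {"English": "en", "Finnish": "fi", "Portuguese": "pt",
              "Translingual": "mul"}

# {{rfi}} and the redirects listed in the request above.
RFI_NAMES = ("rfi", "reqphoto", "Reqphoto", "rfimage", "rfdrawing", "rfphoto")

# Level-2 headings only, e.g. "==Finnish==" (not "===Noun===").
L2_HEADING = re.compile(r"^==([^=].*?)==\s*$")
# A call to any of the templates, with optional parameters, no nesting.
RFI_CALL = re.compile(r"\{\{(?:%s)(\|[^{}]*)?\}\}" % "|".join(RFI_NAMES))


def _rewrite(call, code):
    """Normalize one template call to {{rfi}} and add the language code."""
    params = call.group(1) or ""
    if "lang=" in params:
        # Assumption: an existing lang= parameter is already correct.
        return "{{rfi" + params + "}}"
    return "{{rfi|lang=" + code + params + "}}"


def add_lang_codes(wikitext):
    """Add lang= to image requests, based on the enclosing language section."""
    code = None
    out = []
    for line in wikitext.split("\n"):
        heading = L2_HEADING.match(line)
        if heading:
            # Unknown language names give None, so requests are left alone.
            code = LANG_CODES.get(heading.group(1).strip())
        elif code is not None:
            line = RFI_CALL.sub(lambda call: _rewrite(call, code), line)
        out.append(line)
    return "\n".join(out)
```

Requests that sit under a language heading missing from the table are left untouched, so a partial table would only produce partial coverage rather than wrong codes.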
Why is it important that these are categorized by language? An image doesn't have any intrinsic language (unless it contains text). DTLHS (talk) 01:01, 12 November 2015 (UTC)
I'm with DTLHS. Sometimes, adding language codes to templates is more trouble than it's worth. Particularly in this case since pictures can be used in any language. Purplebackpack89 01:07, 12 November 2015 (UTC)
If we had some Commons-image-mining tools AND Commons had some data that indicated the language a diagram was worded in or the language of text appearing in a photo, then the language code might be useful. Until both conditions are met the language code seems useless. By the time those conditions are met the language code may seem like a quaint relic. DCDuring TALK 02:40, 12 November 2015 (UTC)
An image doesn't have an intrinsic language, but the sections they appear in do. These requests aren't independent of the language sections they appear in. Renard Migrant (talk) 17:59, 12 November 2015 (UTC)
One image may contain text in multiple languages, or other complicated scenarios (e.g. a sign that happens to say the same thing in two local languages, so isn't clearly either of them). This feels like categorisation for the sake of it. Equinox 18:00, 12 November 2015 (UTC)
This isn't about what languages are in the image, but whether an editor has the knowledge to judge whether an image is fitting for a given word. —CodeCat 18:10, 12 November 2015 (UTC)
  • I'm having some difficulty understanding how the language code would help in any specific current situation. Are there some specific examples of how this would help? DCDuring TALK 19:27, 12 November 2015 (UTC)
    • Would you trust yourself in adding images to random words in, say, Finnish? What about a rare language like Ainu? If you didn't speak the language, how would you know the image was appropriate? —CodeCat 22:02, 12 November 2015 (UTC)
      • Also, if the Finnish entries needing images were grouped together, they would draw the attention of Finnish editors. I could try improving our Portuguese entries by adding images to Portuguese entries needing them, but our current list of entries needing images is mostly a mess of Translingual taxonomic names. --Daniel Carrero (talk) 22:21, 12 November 2015 (UTC)
      • If the English gloss were accurate, I would. If the gloss were a polysemic English word or an obsolete, archaic, or rare word, I would not make the assumption that the gloss is accurate. Similarly if there were grammar mistakes or poor diction in the English gloss. If our English glosses are of such unreliable quality that these screens are not sufficient, then obviously I should not trust myself. Are these screens insufficient?
      • I look forward to the increased efforts to add images to FL L2 sections that will be unleashed by this categorization. I would have looked forward even more to improvement of the reliable quality of English glosses of FL entries. DCDuring TALK 23:42, 12 November 2015 (UTC)
        • If you don't speak the language, it's hard to get an idea of the "real" meaning of a word, even with an accurate definition. Therefore, someone who is familiar with the language would be able to choose a more fitting image. --WikiTiki89 23:51, 12 November 2015 (UTC)
          A bad image could even give the wrong impression about the meaning. —CodeCat 23:54, 12 November 2015 (UTC)
          Also, a native speaker of Finnish navigating Category:Finnish entries needing images would understand most if not all the contents of the category just by looking at the entries listed. She could say: "I don't believe they don't have an image for (thing) yet!"
          With all the 6,500+ requests lumped together, one has to ignore the words in languages they don't know. (The alternative of looking for an English gloss is being discussed above and I don't have anything to add.) I, for one, don't speak Finnish. --Daniel Carrero (talk) 13:24, 13 November 2015 (UTC)

Wiktionary:Votes/2015-11/Language-specific rfi categories --Daniel Carrero (talk) 14:23, 26 November 2015 (UTC)

Appendix:Article 1 of the Universal Declaration of Human Rights

I just made this - what do you all think? It would certainly be interesting to see how many translations we can get. Aryamanarora (talk) 02:24, 12 November 2015 (UTC)

Sources I have ported over versions from Wikimedia Foundation projects. Many more can be found here: http://omniglot.com/udhr/. Can these be imported? —Justin (koavf)TCM 03:04, 12 November 2015 (UTC)
Why can we not just leave these on Wikisource? Do they need to be here? —CodeCat 03:12, 12 November 2015 (UTC)
Yes, leave them where they are. I don't see that they have any function here. SemperBlotto (talk) 08:57, 12 November 2015 (UTC)
I agree with CodeCat and SemperBlotto. Wikisource already has the declaration in many languages. English: Universal Declaration of Human Rights, French: Déclaration universelle des Droits de l’Homme. I see no reason why anyone would look for that information here to begin with. --Daniel Carrero (talk) 09:03, 12 November 2015 (UTC)
RFD it is then! Renard Migrant (talk) 17:57, 12 November 2015 (UTC)
I'm also not quite sure why we need this. But I did fix some errors in it. --WikiTiki89 18:19, 12 November 2015 (UTC)
I agree. Doesn't make sense as an appendix to a dictionary. Equinox 18:26, 12 November 2015 (UTC)
Incidentally, the Even-Shoshan Dictionary of Hebrew does have the Israeli Declaration of Independence in the back. Although, since it is given in two side-by-side versions one with vowel points and the other in plene spelling, it is probably just there to illustrate the differences between vocalized and plene writing. --WikiTiki89 18:34, 12 November 2015 (UTC)
  • I also think we should delete it. Is this enough of a snowball to make it happen? —Μετάknowledgediscuss/deeds 18:30, 12 November 2015 (UTC)
Deleted. - -sche (discuss) 21:01, 12 November 2015 (UTC)
  • I support this deletion after the fact. —Aɴɢʀ (talk) 12:09, 13 November 2015 (UTC)

Unnormalized Old French forms

I'm starting to get interested in facsimiles of the original manuscripts. It's occurred to me dictionaries (that I know of) don't cover manuscript forms and it could be something we do as a USP. If you don't know the basics of French spelling you might not want to read any further. Anyway. A good starting point would be Talk:aprés where the Old French section was unilaterally removed in 2010 because Old French doesn't have an acute accent.

It does and it doesn't; scholars seem to universally use é to represent /e/ at the end of the word, or when it is the penultimate letter and the final letter is s. I'd argue that these forms are both real and Old French. I have at least four books from my university study that I could use as evidence and take pictures of if anyone wants me to. Also on Google Books

There seem to be no other accents that are consistently used. For example you can find seürté here (a third of the way down, use Ctrl + F) but Godefroy lists it as seurté as does the Anglo-Norman On-Line Hub. Other than that, you do get the odd transcription using à and the odd one using grave accents, but these are rare.

So, does anyone have an opinion on the following? I've been updating WT:About Old French but it feels odd doing it on my own. The options I see are allow, disallow and either list as alternative forms but don't create or list in alternative forms without links (just plain text, no square brackets)

  • No acute accents: seurte for seurté (which we list as seürté)
  • No capital letters: france for France
  • No cedillas: francois for françois (you can see francoise here, second word of the second line in black)
  • u/v distinction: they're visually the same but I think they're worth keeping as separate letters like we do in Latin. For example trouve appears visually as trouue (modern trouve, verb form) but I think they're really separate letters that share a glyph. This comment added 21:31, 12 November 2015 (UTC) Renard Migrant (talk)

How about alternative normalizations:

  • Allow diaereses: traïson for traison (the diaeresis here is to show it's three syllables and not two.)
  • à for a (occasionally used for the same reasons modern French has à and a. But I can't remember where I saw it. I think it was in a modern print Wace)

Renard Migrant (talk) 18:29, 12 November 2015 (UTC)

Perhaps we can use a Latin-like system for Old French. Meaning, the entry names will not have diacritics that were not actually found in manuscripts, but diacritics will be added on the headword lines and text and stripped from links. --WikiTiki89 18:41, 12 November 2015 (UTC)
I'd considered that but we'd be going against what apparently every other dictionary does. If you're copying and pasting from another website, it could matter quite a lot. Renard Migrant (talk) 21:55, 12 November 2015 (UTC)
Other dictionaries don't actually have an entry name vs. headword distinction. --WikiTiki89 22:42, 12 November 2015 (UTC)
That is the root of the issue really. The fact that all spellings have an individual dedicated page, like color and colour are separate issues. Anyway, a specific example would be like trové being merged into trove (no entry right now) and when you look at the etymology of treasure trove, it says from tresor trové, you go there and it says only Spanish. Sure you can change the Wiktionary entry, but what about other websites that use the standardized spelling trové? And how many people get their hands on actual manuscripts? I've seen a book in Old French (French in the left-hand column, Old French on the right) in a general book store in France by way of comparison. Renard Migrant (talk) 23:47, 12 November 2015 (UTC)
And if we automatically strip links, that solves all the problems. The entry will be at [[trove]], and both {{l|fro|trové}} and {{l|fro|trove}} will point to the same page. --WikiTiki89 23:54, 12 November 2015 (UTC)
Would {{l|fro|trové}} actually go to trove if there was a page with a different language at trové? We certainly have to think about the case where people are entering words from books.--Prosfilaes (talk) 08:10, 13 November 2015 (UTC)
Yes, it would. {{l|la|jūra}} takes you to [[jura]] since we strip macrons for Latin, but {{l|lv|jūra}} takes you to [[jūra]] since we don't strip macrons for Latvian. —Aɴɢʀ (talk) 11:34, 13 November 2015 (UTC)
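The per-language behaviour described here can be sketched roughly as follows. This is an illustration only, not how the linking templates are actually implemented: the function name and the table are hypothetical, and the Old French (fro) entry is an assumption that would apply only if the stripping proposal above were adopted.

```python
import unicodedata

# Hypothetical table: combining marks dropped from entry names, per language.
STRIP_MARKS = {
    "la": {"\u0304"},   # combining macron, stripped for Latin as described above
    "fro": {"\u0301"},  # combining acute: assumed, only if the proposal were adopted
}


def entry_name(lang_code, display_form):
    """Map the form written in a link to the page title it should point to."""
    marks = STRIP_MARKS.get(lang_code)
    if not marks:
        # No stripping configured (e.g. Latvian): the page title keeps its marks.
        return display_form
    # Decompose so that each diacritic becomes a separate combining character,
    # drop only the configured marks, then recompose whatever is left.
    decomposed = unicodedata.normalize("NFD", display_form)
    kept = "".join(ch for ch in decomposed if ch not in marks)
    return unicodedata.normalize("NFC", kept)
```

Because only the listed marks are removed, a cedilla in an Old French form such as françois would survive under this sketch, while trové would resolve to trove regardless of what other languages have an entry at trové.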
We can get away with the Latin practice because there's very little overlap between Latin words with macrons and entries in other languages that use macrons because the languages are unrelated and have different morphology (see, however Latin and Japanese romaji ). In this case, there are a number of closely related languages with similar morphology, so there's bound to be a lot of overlap, and whenever there's overlap, searches go to the entry with diacritics and not to the plain form. Chuck Entz (talk) 18:52, 13 November 2015 (UTC)
I believe that we should have the forms of words that are found in printed text, because that's what people are going to be looking up. Those are the form of the word most commonly found in durably archived forms of the language.--Prosfilaes (talk) 08:10, 13 November 2015 (UTC)
I agree. What Wikitiki89 says is possible, no doubt, but is anyone in favour of it? Copying and pasting trové from another website, won't get you to trove. Renard Migrant (talk) 12:03, 13 November 2015 (UTC)
And this is what an unlinked alternative form looks like. Renard Migrant (talk) 12:34, 13 November 2015 (UTC)
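The per-language stripping described above (macrons removed for Latin links, kept for Latvian) can be sketched roughly as follows. This is only an illustration in Python, not the code the link templates actually run; the table `STRIP_MARKS` and the function `page_title` are hypothetical names, and the Old French row reflects the proposal under discussion, not current practice.

```python
import unicodedata

# Hypothetical per-language table: combining marks dropped when a link
# is turned into a page title. Only the Latin/Latvian contrast comes
# from the discussion; the "fro" row is the proposed behaviour.
STRIP_MARKS = {
    "la": {"\u0304"},   # Latin: strip the combining macron
    "lv": set(),        # Latvian: macrons are part of the spelling
    "fro": {"\u0301"},  # Old French (proposal): strip the combining acute
}


def page_title(lang: str, term: str) -> str:
    """Return the page a link to `term` in language `lang` would point to."""
    marks = STRIP_MARKS.get(lang, set())
    # Decompose so that each diacritic becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", term)
    kept = "".join(ch for ch in decomposed if ch not in marks)
    # Recompose whatever diacritics were kept.
    return unicodedata.normalize("NFC", kept)
```

With a table like this, a Latin link to jūra and a Latvian link to jūra resolve to different pages, which is the overlap concern raised above: the stripped form and the unstripped form can both be valid titles for different languages.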

Copyright of images of manuscripts

Looking at the link (http://www.kb.dk/permalink/2006/manus/225/eng/4/?var=1) above, can we use this image in any way? I assume while the text is uncopyrightable, the photo of it is copyrightable. Renard Migrant (talk) 18:31, 12 November 2015 (UTC)

See commons:Commons:When_to_use_the_PD-Art_tag and commons:Commons:Reuse_of_PD-Art_photographs#Nordic_countries. So we can use it.--Prosfilaes (talk) 08:01, 13 November 2015 (UTC)
Does this come under art? Renard Migrant (talk) 15:10, 13 November 2015 (UTC)
PD-Art is for photos taken from a distance. I think these images were made on a scanner, so commons:Commons:When to use the PD-scan tag applies. At any rate, we certainly have other scans of old manuscripts at Commons, see e.g. commons:Category:9th-century manuscripts. —Aɴɢʀ (talk) 15:51, 13 November 2015 (UTC)

WT:EL, should Etymology come after Pronunciation?

Currently the most common order of headings is to have Etymology before Pronunciation. But when the Etymology headings are numbered, then we put Pronunciation above if it applies to all of the etymologies. I think this is a bit backwards, to be honest, because it means that whenever you add a second etymology, you have to swap the headings around. It's much more common for a word to have only one pronunciation irrespective of etymology, than to have etymology-specific pronunciations. Therefore I think that this should be changed so that the default order is to have Pronunciation above Etymology. It goes without saying that if the pronunciations differ, then they should be placed within their respective etymology sections, as now. —CodeCat 21:56, 12 November 2015 (UTC)

I bet more casual users want a pron than an ety, too. Equinox 21:58, 12 November 2015 (UTC)
If we considered what users want relevant, we'd put the definition first. —CodeCat 22:01, 12 November 2015 (UTC)
We should. Equinox 22:02, 12 November 2015 (UTC)
Ok, but that's a separate debate. I definitely encourage you to start a new discussion on that, and I'd support it too, but I'd rather not get this proposal muddled up by making it bigger than it is. —CodeCat 22:04, 12 November 2015 (UTC)
I support CodeCat's proposal. --Daniel Carrero (talk) 22:55, 12 November 2015 (UTC)
Oppose for aesthetic reasons. It doesn't look good when the etymology section is under the pronunciation section. If you really want to rethink things, it might make sense for the etymology to be at the bottom, but as has been said in previous discussions, this would drastically affect the hierarchy of the entry contents. As far as what users find relevant, it's actually really easy to ignore the etymology section; I never understood why people find it annoying. Also, I'm not sure I buy CodeCat's claim that "It's much more common for a word to have only one pronunciation irrespective of etymology, than to have etymology-specific pronunciations." This just doesn't seem to be the case in my own experience. --WikiTiki89 23:12, 12 November 2015 (UTC)
Don't you work on Hebrew a lot, which has a nonphonetic writing system? That's why. —CodeCat 23:14, 12 November 2015 (UTC)
Yes, but even my experience outside of Semitic languages, such as with Russian and English, leads me to doubt your claim. --WikiTiki89 23:20, 12 November 2015 (UTC)
I agree in part. We have to keep foreign languages in mind too. If a number of languages have different pronunciations depending on etymology, then it may be better to keep the pronunciation first. However, I think it makes more sense for the pronunciation to be first in English entries, except when the pronunciation is distinct for the different etymologies (as for words like wind). Andrew Sheedy (talk) 23:19, 12 November 2015 (UTC)
wind is a bad example because the current entry fails to show that the verb wind /wɪnd/ derives from the noun, and that the noun wind /waɪnd/ derives from the verb. There should really be four etymology sections, though two of them share a pronunciation with the other two. There is no way to solve this kind of overlap and nesting in the general case. —CodeCat 01:02, 13 November 2015 (UTC)
Under the proposal how would one present homographs having the same etymology but different pronunciations? DCDuring TALK 23:49, 12 November 2015 (UTC)
Then they're the same word, just with different pronunciations, aren't they? —CodeCat 00:28, 13 November 2015 (UTC)
I think the question is if the pronunciation section should be split into two sections (one for the definitions with a certain pronunciation and one for those with another pronunciation), since the etymology would be out of the way, being higher up in the hierarchy than the pronunciations. For an example of an entry this would apply to, see right. Andrew Sheedy (talk) 00:52, 13 November 2015 (UTC)
But any other number of properties could be shared or distinct, too. I've seen English entries where a single headword line contains two different inflectional patterns, where the choice depends on the meaning. And of course there are cases where synonyms and such also differ by meaning. We solve this with the {{sense}} template. Why can't that be done for pronunciations? —CodeCat 00:59, 13 November 2015 (UTC)
I'd forgotten about the {{sense}} template. I'll replace the header in the pronunciation section of right with it (though I really should have just added a gloss in the first place). Andrew Sheedy (talk) 01:09, 13 November 2015 (UTC)
I don't like that the proposal does not link to previous discussions on the subject; there are multiple ones. The discussions contain specific objections. --Dan Polansky (talk) 09:57, 14 November 2015 (UTC)

Wiktionary:Votes/pl-2015-11/NORM: 10 proposals - starting soon

FYI: Wiktionary:Votes/pl-2015-11/NORM: 10 proposals (which was created 2 weeks ago) is going to start in 2 days.

Duration of the vote: 3 months. --Daniel Carrero (talk) 04:58, 13 November 2015 (UTC)

The vote started. --Daniel Carrero (talk) 03:41, 15 November 2015 (UTC)

Suggestion: Remove userspace pages from Category:Requested entries


--Daniel Carrero (talk) 03:00, 14 November 2015 (UTC)

What if we make a subcategory? Category:Requested Entries (Userspace) or the like? Aryamanarora (talk) 03:34, 14 November 2015 (UTC)
Today, I created Wiktionary:Redlink dumps which looks better than a category IMO, since it is an organized list with a few comments here and there. Personally, I don't want to take the trouble to place every page listed at Wiktionary:Redlink dumps (let alone the hundreds of subpages) in a different category. If anyone wants to do it, I don't have anything to say about that. I only want to remove those from Category:Requested entries. --Daniel Carrero (talk) 14:39, 14 November 2015 (UTC)
Notably, I listed User:-sche#German at Wiktionary:Redlink dumps#Other languages, because @-sche has plenty of German redlinks in their user page, but I would not place a user page in a category called Category:Requested Entries (Userspace) (at least not without asking); I feel it would be rude. --Daniel Carrero (talk) 14:46, 14 November 2015 (UTC)
Keep them categorized somewhere like Aryamanarora says. Maybe Category:Requested Entries (user generated). Renard Migrant (talk) 17:21, 14 November 2015 (UTC)
Sure, but I'd understand "user-generated" to mean manual work. Most of these are the contrary of "user-generated": they are script-generated / computer-generated. Category:Redlink dumps should be a good enough name for that, like WT:Redlink dumps. --Daniel Carrero (talk) 17:26, 14 November 2015 (UTC)
Yeah, having userspace pages directly in Category:Requested entries is odd. I don't mind if someone categorizes User:-sche/wanted (the subpage of my userpage where the "wanted" terms are) into something like Category:Requested Entries (userspace). I've thought about merging it into the official list of requested entries, but I don't want to swamp that list, especially since many of the entries on my page are relatively less-important terms I just happened to notice we were missing. Using a list like Wiktionary:Redlink dumps rather than a category is also fine, IMO, and that list could be linked-to from a "See also" type section at the bottom of WT:RE:en, WT:RE:de, etc. - -sche (discuss) 23:03, 14 November 2015 (UTC)
As postscripts: I agree that "user-generated" is liable to be misunderstood and should probably be avoided, since clearer alternatives exist. Also, I don't mind splitting the many German and English words on my "wanted entries" subpage onto separate monolingual subpages, if that would be helpful. - -sche (discuss) 22:18, 15 November 2015 (UTC)
@-sche Thanks for splitting your lists into English and German pages, they do look better now.
Unfortunately, the name Category:Requested Entries (userspace), too, has the problem of not being the most accurate name possible. At WT:Redlink dumps, there are some pages in the Wiktionary: namespace:
--Daniel Carrero (talk) 13:36, 17 November 2015 (UTC)

According to my calculations:

I'm not interested in doing the work of placing those 968 entries in a separate category. We already have WT:Redlink dumps listing them, so I'd rather use my time on Wiktionary doing something else. If someone else wants to do it, you have my blessing. --Daniel Carrero (talk) 11:27, 18 November 2015 (UTC)

Done --Daniel Carrero (talk) 02:58, 24 November 2015 (UTC)
I mean, I removed all userspace pages from that category. I didn't place them in any other category, for the reasons I said above. --Daniel Carrero (talk) 14:01, 25 November 2015 (UTC)

Blocking policy clarification

In Wiktionary:Votes/pl-2010-01/New blocking policy, editors seemed to want to simplify blocking policy (WT:BLOCK). To actually achieve the simplification for hasty readers, I think we need to reduce the policy page to the actual policy and nothing else, which would be to reduce it to the following:

:''See also '''[[Help:Interacting with humans]]'''''
{{policy|draft=The portion of it which is policy may not be modified without a [[Wiktionary:Votes|VOTE]].}}

# The block tool should only be used to prevent edits that will, directly or indirectly, hinder or harm the progress of the English Wiktionary.
# The block tool should not be used unless less drastic means of stopping these edits are, by the assessment of the blocking administrator, highly unlikely to succeed.

Interwiki links can follow the above text.

In the past, I have seen multiple cases where editors cited parts of the page that are not the actual policy, so the above proposed change seems really required and useful.

Comments? --Dan Polansky (talk) 11:10, 14 November 2015 (UTC)

Maybe the non-policy portion of the page should be moved elsewhere. Possible name: Help:Blocking. --Daniel Carrero (talk) 14:57, 14 November 2015 (UTC)
Split into Policy and another header, such as Rationale or Explanation. Wouldn't even require a vote to do that as you could leave the policy bit unchanged. Renard Migrant (talk) 17:20, 14 November 2015 (UTC)
Splitting to headers while keeping non-policy on the wiki page does not really work for me. I want the wiki page to contain the policy and nothing else. I have no issue with there being Help:Blocking but it should probably be written in a way that does not suggest that it contains regulations (rules). Tables with specific blocking lengths look like rules. --Dan Polansky (talk) 17:34, 14 November 2015 (UTC)
Do you think we need a vote for moving the non-policy contents to Help:Blocking? If the answer is yes, then would the contents of Help:Blocking be locked for editing, thus requiring further votes if we want to change something? --Daniel Carrero (talk) 03:53, 15 November 2015 (UTC)
@Daniel Carrero: I think we need a vote to edit WT:BLOCK. I don't think we need a vote to create a non-policy page Help:Blocking. Admittedly, one could argue that it should be possible to edit the non-policy parts of WT:BLOCK without a vote. But I think it much better to turn the page into a policy-only page via a vote that removes all non-policy parts. That, of course, presupposes support for changing the page as proposed by me. --Dan Polansky (talk) 08:14, 22 November 2015 (UTC)
Wait, it is already split to headers: WT:BLOCK#Policy and WT:BLOCK#Explanation. And this splitting seems to do little to prevent confusion. There is even red background in the policy text. In fact, the page contains multiple kludges to make it clear that only part of it is a policy. It looks really clumsy and does not really work, from what I can see. --Dan Polansky (talk) 17:44, 14 November 2015 (UTC)
It's not up to you, though, is it. Renard Migrant (talk) 14:55, 22 November 2015 (UTC)

I created Wiktionary:Votes/pl-2015-11/Short blocking policy. --Daniel Carrero (talk) 22:24, 25 November 2015 (UTC)

What distinguishes a synonym from an alternative form?

I've come across several Finnish entries, where one word is defined as an alternative form of another, but the two words have different morphological structure and hence etymologies. An example is kolkkaa vs kolkata, but I've seen examples which differ more substantially. It would be somewhat like having two verbs in English, one with -en and another with -ify, defined as alternatives of the same term. I think these shouldn't be considered alternative forms, but we don't really have clear definitions for what is an alternative form and what is a synonym. Alternative forms, at the very least, are a subset of synonyms, but I would like to set some stricter criteria for distinguishing them. What comes to mind is having the same morphological structure and/or etymology, or forms that differ by dialect or something like that. —CodeCat 18:13, 14 November 2015 (UTC)

I also find myself in this situation every now and then. The general guideline I apply to myself is that words are alternative forms if they are synonyms and have roughly the same etymology.
I know this is not very helpful, because of the “roughly”. But consider the pair piranha and piraña: both have the same Old Tupi etymon, the only difference is that one entered English via Portuguese and the other via Spanish. — Ungoliant (falai) 22:35, 14 November 2015 (UTC)
This is one of those things that's easier to decide on a case-by-case basis than with a general guideline. --WikiTiki89 02:57, 15 November 2015 (UTC)
Kolkkaa only has the third sense listed under kolkata ("to clatter") which seems like sufficient grounds to not list this as an "alternate form". --Tropylium (talk) 19:11, 15 November 2015 (UTC)

Rhymes navigation

I would like to start a couple of rhymes pages, so I would like to ask about the preferred format of navigation at the top. I noticed at Rhymes:Czech/ofka that Lo Ximiendo changed it to {{rhymes nav}}, which Dan Polansky has reverted, unfortunately without explanation. The same has happened at about 20 more rhymes pages. I have also noticed that the template itself declares that "this template should be placed at the top of all rhymes pages". So what is the preferred way? Thank you. Jan Kameníček (talk) 21:54, 14 November 2015 (UTC)

{{rhymes nav}} was installed to rhyme pages by user CodeCat by a bot and without consensus. I oppose its use. It sets up rhyme pages for a hierarchy of categories and index pages, which I find undesirable and confusing. In my view, all Czech rhyme pages should be in Category:Czech rhymes, and subcategories like Category:Czech rhymes/a- should not exist. The subcategories are clumsy, do not add any real value and do not provide a truly useful navigation tool, unlike e.g. the tables at Rhymes:Czech. The markup at Rhymes:Czech/ofka that creates "Rhymes > Czech" is perfectly simple, straightforward, does what needs to be done, and is in no need to be replaced with a template that presupposes a page structure that the creators of the rhyme pages do not support such as Rhymes:Czech/o- linked from a templated version of Rhymes:Czech/ofka.
I did a lot of substantive work on Czech rhyme pages, and consider myself to know what I am talking about.
I also object to use of templates where they add close to no value but give power to people who lock templates to be only editable by admins and the like, or even move their content to modules (Module:rhymes) to further raise the barrier to editing. That said, for some purposes, modules are extremely useful. --Dan Polansky (talk) 22:14, 14 November 2015 (UTC)
Is the template generating incorrect content? If not, it should be used. — Ungoliant (falai) 22:15, 14 November 2015 (UTC)
I prefer the non-templated page. There is no need to use a template instead, unless tangible benefits of the template can be shown. I still hope that editors at large do not support this type of over-templatization.
The template places the page into a category that IMHO should not exist, and links to an index page that should not exist either. Correct or incorrect is not at stake; editor preferences are at stake. --Dan Polansky (talk) 22:20, 14 November 2015 (UTC)
An aside: as a remnant of CodeCat's non-consensual changes, we still do not have "-" back in the names of rhyme pages; see also Wiktionary:Votes/2014-09/Renaming rhyme pages. --Dan Polansky (talk) 22:20, 14 November 2015 (UTC)
As for what the template itself declares, what else do you expect from CodeCat's templates? See also Wiktionary talk:Votes/2014-08/Debotting MewBot. --Dan Polansky (talk) 22:21, 14 November 2015 (UTC)
I'd say, if you want to change the practice of simple formatting at the top of Czech rhyme entries, and the flat category structure of Czech rhyme entries, please someone create a vote, and do it now. Do it yourself, so that you do not need to accuse me later of poor drafting. --Dan Polansky (talk) 22:23, 14 November 2015 (UTC)
I do not want to change any practice, I am just asking, because I have not learned anything either from the summaries of your reverts of Lo Ximiendo's edits or from the summary of your revert of my edit. I do not care very much which format is chosen, I just want to know so that next time I am not reverted again. You wrote that the template started to be introduced without having been discussed, so this might be an opportunity to find out the general opinion of it. (By the way, your work on rhymes pages was enormous; that is beyond any doubt). Jan Kameníček (talk) 23:00, 14 November 2015 (UTC)
Nobody seems to oppose Dan Polansky's arguments, so I go on in the way that he promotes. Jan Kameníček (talk) 21:18, 24 November 2015 (UTC)
You can also go on the way you have done before. There's nothing wrong with your previous edits and they were reverted without grounds. Dan is just trying to bully you into doing things his way. —CodeCat 22:01, 24 November 2015 (UTC)

Wiktionary:Votes/2015-11/Language-specific rfi categories - created vote

FYI: I created a new vote about {{rfi}}, the vote is linked in the header.

Also, I think there's no harm in repeating what I said in a previous post: There's an unrelated vote that started today. You can cast your votes on it already:

--Daniel Carrero (talk) 05:19, 15 November 2015 (UTC)

Removing rfi from talk pages

FYI: I intend to remove {{rfi}} from talk pages.


  • It is used in 19 talk pages.
  • It is used in 6,500+ entries.


  • When the entry still needs an image, I am going to move the {{rfi}} to the entry itself, in the correct language section.
  • When the entry already has an image, I am going to remove the {{rfi}} altogether as a request fulfilled.


  1. Numbers suggest that the overwhelming practice is placing the request in the entry, not the talk page.
  2. About half of those entries have images already. The people who added the images did not remove the box; I suspect this happened because the box was hidden in the talk page.
  3. Probably I just got bored enough to make this list rather than just doing the change in the 19 entries, but I'm erring on the side of announcing one's intent at the BP for openness. It would also be nice if this prevented new rfis being added to the talk pages in the future, for better consistency.
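The move-or-remove rule stated above can be sketched as a small decision function. This is a minimal illustration in Python, assuming an entry "has an image" when its wikitext contains a [[File:…]] or [[Image:…]] link; the name `plan_rfi_cleanup` is hypothetical, and images added through templates would be missed by such a simple check.

```python
import re

# Assumption: an image in an entry shows up as a [[File:...]] or [[Image:...]] link.
IMAGE_LINK = re.compile(r"\[\[\s*(?:File|Image)\s*:", re.IGNORECASE)


def plan_rfi_cleanup(entry_wikitext: str) -> str:
    """Decide what to do with an {{rfi}} found on the entry's talk page."""
    if IMAGE_LINK.search(entry_wikitext):
        # Request fulfilled: the box can simply be removed.
        return "remove"
    # Still needed: move the request into the entry's language section.
    return "move"
```

For only 19 talk pages the manual approach described here is of course just as practical; the sketch only makes the rule explicit.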

(✔ = request fulfilled, has at least one image)

--Daniel Carrero (talk) 05:44, 15 November 2015 (UTC)

Note: User:DCDuring has been adding images to some entries listed and editing my message above to update the list. And, for that, he has my thanks. --Daniel Carrero (talk) 14:27, 15 November 2015 (UTC)
Done. I finished moving all the rfis to the main entry, or deleting the ones where the request was fulfilled. --Daniel Carrero (talk) 09:49, 17 November 2015 (UTC)

Wiktionaries linked at Norwegian language, Norwegian Bokmål language and Norwegian Nynorsk language

According to meta:Wiktionary, there are two Norwegian Wiktionaries:

  • Norwegian (Bokmål) = no.wiktionary.org
  • Norwegian (Nynorsk) = nn.wiktionary.org

I don't speak Norwegian, so I'll assume that's accurate.

Our language categories link to Wiktionary editions when they exist, and they are perfectly capable of showing links to multiple Wiktionaries at once. (compare Category:English language and Category:Serbo-Croatian language) But I think our three Norwegian language categories are not linking to Norwegian Wiktionaries accurately.

I believe perhaps they should display:

--Daniel Carrero (talk) 13:46, 16 November 2015 (UTC)

  • Yes indeed, there are Wiktionaries and Wikipedias for both Norwegian languages. Are you referring to those written in Bokmål and Nynorsk? Donnanz (talk) 13:55, 16 November 2015 (UTC)
w:no:Portal:Forside (Bokmål Wikipedia), w:nn:Hovudside (Nynorsk Wikipedia)
no:Wiktionary:Forside (Bokmål Wiktionary), nn:Hovudside (Nynorsk Wiktionary) —Stephen (Talk) 00:59, 17 November 2015 (UTC)
With my ability, I am unable to do the proposed change to the modules. Maybe it has something to do with Module:wikimedia languages.
Currently, Category:Norwegian language links only to nowikt, not to nnwikt. I think it should link to both. --Daniel Carrero (talk) 18:19, 24 November 2015 (UTC)

Namespace abbreviations

There has been some recent discussion in the BP and in the recent vote adding the “Reconstructed:” namespace about adding more snazzy namespace search bar abbreviations like the preëxisting “WT:” for “Wiktionary:”. It would make things far more convenient to have a few more of these like “TP:” for “Template:”, “AP:” for “Appendix:”, and “MW:” for “MediaWiki:”. I wanted to check to see whether there was enough interest to merit a vote and also what abbreviations people would like to see. My initial list would look something like:

  • AP → Appendix
  • C → Citation
  • CT/CA/CAT → Category
  • MD → Module
  • MW → MediaWiki
  • T → Talk/Discussion
  • TP → Template
  • RC → Reconstruction

And potentially many more. These things will make life a lot easier for frequent contributors of this project and delay the impending carpal tunnel syndrome that awaits us all in a few years. It would also be cool if there were a way to jump immediately to a talk page by appending “T:” to a namespace (e.g. “AP:T:” → “Appendix talk:” or “TP:T:” → “Template talk:”). —JohnC5 01:28, 17 November 2015 (UTC)
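To make the proposed behaviour concrete, here is a rough Python sketch of how a search-bar title with one of these abbreviations, optionally followed by “T:”, could be expanded. It only illustrates the mapping; the table `ALIASES` holds a few of the abbreviations proposed above, the function name `expand` is hypothetical, and nothing here says how (or whether) MediaWiki could be configured to do this.

```python
# A few of the abbreviations proposed above (illustrative, not configured anywhere).
ALIASES = {
    "AP": "Appendix",
    "TP": "Template",
    "MD": "Module",
}


def expand(title: str) -> str:
    """Expand 'AP:Foo' to 'Appendix:Foo' and 'AP:T:Foo' to 'Appendix talk:Foo'."""
    parts = title.split(":")
    if len(parts) < 2 or parts[0].upper() not in ALIASES:
        return title  # no known abbreviation: leave the title alone
    namespace = ALIASES[parts[0].upper()]
    rest = parts[1:]
    # A "T" segment right after the abbreviation selects the talk namespace.
    if len(rest) >= 2 and rest[0].upper() == "T":
        namespace += " talk"
        rest = rest[1:]
    return namespace + ":" + ":".join(rest)
```

Under this sketch `expand("AP:T:")` gives “Appendix talk:” and `expand("TP:en-noun")` gives “Template:en-noun”, while a plain entry title such as “water” is returned unchanged.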

This would get my vote, although I find myself going to category pages a lot more than citation pages, so I might make C = category and CI = citation. I also might find it easier to remember if you use the first two letters whenever the word isn't clearly two parts, e.g. TE=template, MO=module. Benwing2 (talk) 02:09, 17 November 2015 (UTC)
I could get behind all these options.
I know it's not that long to begin with but I could imagine “U:” for “User:” to be useful, especially since “U:T:” would be great. —JohnC5 02:17, 17 November 2015 (UTC)
You could also abbreviate UT=User talk, TT/TET=Template talk, AT/APT=Appendix talk, etc. Benwing2 (talk) 02:51, 17 November 2015 (UTC)
Also a possibility; though the :T: suffix does avoid ambiguity. —JohnC5 02:59, 17 November 2015 (UTC)
I support the proposal, but "MW → MediaWiki" and "C: → Citation or Category" are unavailable, because they link to sister projects. Full list: w:Help:Interwiki linking.
Wikipedia uses "WT → Wikipedia talk" (their equivalent of Wiktionary talk), "T → Template", "CAT → Category" and "H → Help". Full list: w:Wikipedia:Shortcut#Pseudo-namespaces. IMO we should use, too, "CAT → Category" and "H → Help". "Cat" is our standard abbreviation of "category" anyway -- many category templates and at least 1 categorization gadget have "cat" in the name. --Daniel Carrero (talk) 09:38, 17 November 2015 (UTC)
I also prefer CAT over anything else for categories. Equinox 15:09, 17 November 2015 (UTC)
So for the sake of listing things, we would prefer something like:
  • AP → Appendix
  • CIT → Citation
  • CAT → Category
  • H → Help
  • MD/MO/MOD? → Module
  • MWK/MED? → MediaWiki
  • RC → Reconstruction
  • U → User
Then the question remains of what we would like for “Talk” and “Template”. I'd prefer:
  • T → Talk/Discussion
  • TP/TEM/TEMP? → Template
The above option allows for the unambiguous “u:t:” for “User talk:”. Alternatively, if we think that we link/navigate to templates more often than talk pages:
  • TK? → Talk/Discussion
  • T → Template
JohnC5 15:32, 17 November 2015 (UTC)
* DOC:en-noun → Template:en-noun/documentation
* MDOC:parameters → Module:parameters/documentation
--Daniel Carrero (talk) 15:54, 17 November 2015 (UTC)
I was curious about that. Is it possible to have a prefix translated into a circumfix (i.e. “DOC:” → “Template: … /documentation”)? —JohnC5 16:03, 17 November 2015 (UTC)
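The thread does not settle whether the wiki software can translate a prefix into a circumfix, but the mapping itself is easy to state. A minimal Python sketch, in which the table `CIRCUMFIX` and the function `expand_doc` are hypothetical names used only for illustration:

```python
# Hypothetical table: shortcut -> (text put before the name, text put after it).
CIRCUMFIX = {
    "DOC": ("Template:", "/documentation"),
    "MDOC": ("Module:", "/documentation"),
}


def expand_doc(title: str) -> str:
    """Turn 'DOC:en-noun' into 'Template:en-noun/documentation'."""
    prefix, sep, name = title.partition(":")
    if sep and prefix.upper() in CIRCUMFIX:
        head, tail = CIRCUMFIX[prefix.upper()]
        return head + name + tail
    return title  # anything else is left untouched
```

The difference from an ordinary namespace alias is the trailing part: an alias only swaps the text before the colon, whereas this needs something appended after the page name as well.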
I support new abbreviations for our unique namespaces, like citations, appendix and reconstructions. But I don't think we should be making ones for modules or templates, as this will only add confusion and incompatibility when copying content between wiktionaries / wikis. If it's only for the search bar it's ok, but we shouldn't be creating ways to reference a template in wiki markup that only work on this Wiki. Pengo (talk) 20:36, 17 November 2015 (UTC)
@Pengo: to be honest, the “Template” and “Category” namespaces are probably the ones for which I most would like abbreviations, but that is a very valid comment for which I thank you. —JohnC5 20:43, 17 November 2015 (UTC)
Re: "we shouldn't be creating ways to reference a template in wiki markup that only work on this Wiki."
But aren't some templates kind of like this? When we want to reference a template in a discussion, we often use {{temp}}, so it's like "temp" were a very particular kind of abbreviation for a certain namespace. --Daniel Carrero (talk) 20:49, 17 November 2015 (UTC)
Maybe if we do it first, all the other wikis will follow. When typing fast, I can never spell Category right on the first try (it usually ends up as Cateogry or something), so it would be very useful to have an abbreviation. --WikiTiki89 21:03, 17 November 2015 (UTC)
But {{temp}} itself can still be copied verbatim to other wikis and will work the same way, and you need to know nothing special about en.wikt to do so. Changing Template: to Temp: might seem minor, but it means exporting wikitext requires actually editing the wikitext, and suddenly requires specialized knowledge of English Wiktionary's configuration. I don't think it's an exaggeration to say it becomes an order of magnitude more difficult and error-prone. If those abbreviations work their way into module code then it becomes more difficult again and requires more expertise. I'd rather not break compatibility with the hundreds of other WMF wikis for the sake of a minor convenience. If we could introduce temp: or mod: across all wikis then that'd be fine, or if it was just for searching it'd be great, or if they were auto-expanded when you preview/save that'd be okay too. But otherwise I'd rather be cautious and not introduce the possibility of breaking things unnecessarily. Pengo (talk) 21:39, 17 November 2015 (UTC)
Is there somewhere where we can suggest the addition of "cat: -> category:" in all wikis simultaneously? meta.wikimedia.org? --Daniel Carrero (talk) 21:47, 17 November 2015 (UTC)
But you can't type {{temp}} into the searchbar. I don't care so much about actual wikilinks. --WikiTiki89 21:53, 17 November 2015 (UTC)
No, templates can be difficult to copy between wikiprojects. For example {{Navbox}} from Wikipedia is impossible to copy without hard work.--Dixtosa (talk) 10:30, 18 November 2015 (UTC)
@Daniel Carrero:, forget that. Just forget it. Just do it... xD--Dixtosa (talk) 10:30, 18 November 2015 (UTC)
Regarding "U:T:", is a colon inside a namespace name possible? We currently have links like WT:T:ADE, but that was a workaround for not having a true namespace alias: they are actual redirects (i.e. they exist as pages, unlike WT:About German which only exists as Wiktionary:About German) in the Wiktionary namespace which start with "T:". We could, of course, continue that convention. How many namespaces are long enough and sufficiently often linked to that they need aliases? Wikipedia, a much bigger project, gets by with only a few. I don't think MediaWiki pages are linked to often enough that they need an alias. Increasing the number of aliases does have a few (slight) drawbacks, e.g. pre-existing or possible future clashes with interwiki links. "Cat", although a language code, is fortunately not an interwiki; the interwiki to Catalan projects is "ca". "Mod", on the other hand, is the only code for Mobilian, so a "mod:" space would conflict with interwiki links if a Mobilian wiki were created; likewise "Cit" is the code for Chittagonian. Do we need shorthand for citations pages, anyway? I would add "AP → Appendix" and "RC → Reconstruction" and "CAT → Category". If we needed a shortcut to modules, what if we just used "M:" and didn't add a shortcut to MediaWiki pages? Likewise I would use "T:" for template; "Talk:" is not hard to type in full. - -sche (discuss) 02:18, 18 November 2015 (UTC)
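The clash check described in this thread can be written down as a small lookup. The Python sketch below uses only the prefixes named in the discussion ("MW" and "C" as already linking to sister projects, "mod" and "cit" as language codes that could become interwiki prefixes); the set names and the function `check_alias` are hypothetical, and a real check would need the full interwiki and language-code lists.

```python
# Prefixes the thread says already link to sister projects.
TAKEN_INTERWIKI = {"mw", "c"}
# Language codes mentioned above that would clash if such a wiki were created.
POSSIBLE_FUTURE_INTERWIKI = {"mod", "cit"}


def check_alias(alias: str) -> str:
    """Classify a proposed namespace alias by the clashes noted in the thread."""
    key = alias.lower()
    if key in TAKEN_INTERWIKI:
        return "conflict"
    if key in POSSIBLE_FUTURE_INTERWIKI:
        return "possible future conflict"
    return "no conflict noted"
```

"CAT" comes out as "no conflict noted" here because, as pointed out above, the interwiki prefix for Catalan projects is "ca" and not "cat".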
As I've already pointed out, many of these shortcuts would be much more useful for navigation through the search bar than strictly for linking. Also, if it is at all possible that CAT: would not have to be preceded by a colon in links, that would be very, very convenient. --WikiTiki89 02:28, 18 November 2015 (UTC)

  • I am sure U:T: can't be a namespace.
  • User:Ungoliant_MMDCCLXIV has a user script that adds two or so search inputs beside the main search input that search in specific namespaces. A good idea for those who only need namespace abbreviations to search easily. --Dixtosa (talk) 10:30, 18 November 2015 (UTC)
    Can you link to it? --WikiTiki89 15:54, 18 November 2015 (UTC)
    @Dixtosa how did you find that out?? — Ungoliant (falai) 18:44, 18 November 2015 (UTC)
    I just tried it out.
    Wikitiki, the snippet that does that is included in his monobook.js. --Dixtosa (talk) 15:30, 21 November 2015 (UTC)

Given all that I've seen above, I would still like to have abbreviations for at least Category, Template, Module, Reconstruction, Appendix since those are the most typed and least convenient. —JohnC5 16:02, 18 November 2015 (UTC)

I am planning on maybe creating a vote for at least the most wanted abbreviations on this discussion. The ones mentioned by JohnC5 (Category, Template, Module, Reconstruction, Appendix) have my support. Aside from that, I would really like to know if it's also possible to redirect "DOC:" and "MDOC:" as I suggested above. --Daniel Carrero (talk) 19:25, 18 November 2015 (UTC)
But the documentation pages are really meant to be viewed from the template/module's page itself, so I'm not sure why we would need to link directly to the documentation. --WikiTiki89 19:30, 18 November 2015 (UTC)
Please do start a vote! I feel that abbreviations for Reconstruction and Appendix are not contentious. It might make sense to have a separate section of the vote for Category, Template, and Module, since Pengo at least seems to have some reservations about those. —JohnC5 18:21, 21 November 2015 (UTC)
I would even say we should have a separate section for each namespace. --WikiTiki89 21:57, 21 November 2015 (UTC)
Sounds good to me. —JohnC5 17:29, 22 November 2015 (UTC)

Wiktionary:Votes/2015-11/Namespace abbreviations --Daniel Carrero (talk) 05:59, 24 November 2015 (UTC)
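
The interwiki concern raised above can be checked mechanically before an alias is proposed. The following is only a rough sketch: the two sets hold just the codes named in this discussion ("ca" as an existing interwiki, "mod" and "cit" as codes a future wiki might use), and the function name alias_risk is hypothetical. A real check would need the full interwiki table and language code list.

```python
# Assumption: only the prefixes named in the discussion are listed here.
EXISTING_INTERWIKIS = {"ca"}          # Catalan projects already use "ca"
POTENTIAL_INTERWIKIS = {"mod", "cit"}  # codes a future wiki might claim


def alias_risk(alias: str) -> str:
    """Classify a proposed namespace alias by its interwiki clash risk."""
    # Aliases are compared case-insensitively and without the trailing colon.
    key = alias.rstrip(":").lower()
    if key in EXISTING_INTERWIKIS:
        return "clashes with an existing interwiki"
    if key in POTENTIAL_INTERWIKIS:
        return "could clash with a future interwiki"
    return "no known clash"


for proposed in ("CAT", "Mod", "Cit", "RC", "AP"):
    print(proposed, "->", alias_risk(proposed))
```

Under these assumptions "CAT", "RC" and "AP" come back clean, while "Mod" and "Cit" are flagged as possible future clashes, which matches the reasoning given above.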

Remove the numbers from Etymology sections

Why do we have numbers on Etymology sections? The numbers themselves don't mean anything, and are subject to reordering anytime anyway. We also don't number other sections; you don't see "Noun 1" or "Conjugation 2" sections anywhere. The numbers are not necessary to understand the entry at all, since the header structure already gives this information. It's also annoying to have to number and renumber them all the time, when we don't require this treatment for other headers. Therefore I propose dropping the numbering from the headers, so that they are treated like we treat other headers already. —CodeCat 18:42, 18 November 2015 (UTC)

Previously, they were used for links, such as foo#Etymology 1. They are also useful as an indication to the reader that there are more etymologies to follow. --WikiTiki89 18:45, 18 November 2015 (UTC)
Most readers don't know or care about etymologies, so it's not very interesting information. The definitions should really come first, and entry information should be grouped by term (like other dictionaries do), not etymology. But since people keep blocking that, we have to find other means to slap some sense into our entry structure. —CodeCat 18:48, 18 November 2015 (UTC)
Re: "But since people keep blocking that, we have to find other means to slap some sense into our entry structure" -- I don't know about other people, but if someone proposes a solid layout with definition first and makes a vote for it, I would consider supporting it. --Daniel Carrero (talk) 19:32, 18 November 2015 (UTC)
Sorry, I meant "more etymology sections to follow". What does term mean in "grouped by term"? And aren't you basically saying "this might not be the right thing to do, but no one wants to do the right thing, so let's do this instead"? --WikiTiki89 18:52, 18 November 2015 (UTC)
I see it as an improvement, which is the point. Wiktionary should be improved. And by "term" I mean part-of-speech headers. I can't say "part of speech" because then people will think I want to group different nouns together. I think that entries should be formatted with etymology and pronunciation nested under the part of speech header. The other way around doesn't make much sense, since every word has its own etymology anyway, there should be as many etymology sections as there are part of speech sections. —CodeCat 19:00, 18 November 2015 (UTC)
Case in point: diff is way too much administrative work to add a simple etymology. I should just be able to add the etymology under the appropriate POS section, and not have to worry about adding additional numbered sections and adding extra equals signs to all the headers. —CodeCat 19:35, 18 November 2015 (UTC)
Other dictionaries number the headwords, e.g. ¹wind and ²wind or wind 1 and wind 2 or the like. I think it would be confusing to avoid numbers altogether (and we certainly have used "Noun 1" and "Pronunciation 2" headers in the past, though bots tend to remove them). Ultimately I think it would be least confusing (though not 100% nonconfusing) to have each entry on a page of its own, i.e. with its own URL, e.g. "/wiki/en/wind_(movement_of_air)", "/wiki/en/wind_(twist)", "/wiki/nl/wind_(wind)", "/wiki/nl/wind_(form_of_winden)", "/wiki/ang/wind", and so on. Then "/wiki/wind" would just be a disambig page, as would "/wiki/en/wind" and "/wiki/nl/wind". —Aɴɢʀ (talk) 19:36, 18 November 2015 (UTC)
It might be tricky for words with tons of meanings - e.g., is train1 "/wiki/en/train_(rail_vehicle)", "/wiki/en/train_(long_skirt)", "/wiki/en/train_(procession)", etc? Effectively, the disambiguation pages would just become the entries themselves. Smurrayinchester (talk) 20:42, 21 November 2015 (UTC)
This is certainly true, and it's a problem I'm already having with categorising suffixes. But the issue with numbers is that they don't allow reordering without breaking links. In most dictionaries, the content is created in advance so the editors know the numbering and can refer back to the numbers. But Wiktionary is always in development, so senses are split and rearranged as we go. We invented {{senseid}} to solve this, but we haven't yet used it at a level higher than a single sense. It certainly needs a solution though. —CodeCat 20:50, 21 November 2015 (UTC)
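
If the numbers were dropped, existing entries could be converted mechanically. Here is a minimal sketch of that step, assuming headers are written as plain wikitext headings such as "===Etymology 1===". The function name unnumber_etymologies is made up for illustration, and the sketch does not repair incoming links of the form foo#Etymology 1, which is the reordering problem raised above.

```python
import re

# Matches a heading like "===Etymology 2===" at any level.
# [ \t]* is used instead of \s* so the pattern never swallows a newline.
NUMBERED_ETYMOLOGY = re.compile(
    r"^(=+)[ \t]*Etymology[ \t]+\d+[ \t]*(=+)[ \t]*$",
    re.MULTILINE,
)


def unnumber_etymologies(wikitext: str) -> str:
    """Rewrite 'Etymology N' headings as plain 'Etymology' headings."""
    return NUMBERED_ETYMOLOGY.sub(r"\1Etymology\2", wikitext)


sample = (
    "==English==\n"
    "===Etymology 1===\n"
    "====Noun====\n"
    "===Etymology 2===\n"
    "====Verb====\n"
)
print(unnumber_etymologies(sample))
```

Only the heading text changes; heading levels and everything else in the entry are left alone.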

Suggestion: make "Vote started" and "Vote ends" agree on the tense

Most votes have the dates like this:

  • Vote started:
  • Vote ends:

I suggest editing the vote-generator templates to agree on the tense. I propose:

  • [struck: Starting date] Start date:
  • End date:

For reference, the "vote-generator templates" I mentioned are:

--Daniel Carrero (talk) 21:26, 18 November 2015 (UTC)

Ironically, your suggestion has the same problem, mixing a participle with a noun. It should be either "start date" and "end date" or "starting date" and "ending date" (and I prefer the former pair). --WikiTiki89 21:50, 18 November 2015 (UTC)
Sure. Struck the "Starting". --Daniel Carrero (talk) 21:59, 18 November 2015 (UTC)
I find "vote started/ended" (in whatever tense) clearer than "start/end date". The language is more active somehow. Equinox 14:30, 19 November 2015 (UTC)
  • I propose to undo the diff from 2014, resulting in "Vote starts" rather than "Vote started". The past tense is inappropriate, IMHO. --Dan Polansky (talk) 08:06, 22 November 2015 (UTC)

I also suggest retroactively editing all votes to make the "Vote started" and "Vote ends" agree on the tense. I wouldn't mind creating a vote for this. --Daniel Carrero (talk) 04:17, 24 November 2015 (UTC)

redirect pages needed

I don't think this is currently allowed, but Wiktionary desperately needs to start making redirect pages for protowords, transliterations, and spelling variants not currently permitted in mainspace. A simple "#redirect [[correct location of term]]" would be a huge help in finding these items on Wiktionary. Words in protolanguages (e.g. Proto-Indo-European, Proto-Germanic, etc.) are often very difficult to find, especially for Proto-Indo-European, for which roots can often be spelled different ways. What is even worse is if someone has a protoword they want to look up in Wiktionary but is not sure for which protolanguage; since protowords are in appendices, it can be very difficult to find. Placing redirect pages to the appendix entries for protowords would be a huge help. Another example is for Russian, Latin, Classical Greek, etc. words, which can often be spelled with or without diacritics (for stress and tone), but are only indexed in mainspace with minimal diacritics. Looking for words by copying and pasting the diacriticked versions into search can make using Wiktionary difficult as well. Redirect links for the diacriticked versions to the main non-diacriticked pages would be a huge help there as well. Trying to search a copied diacriticked term (e.g. νῑ́κη) by removing the diacritics from the copied non-Roman Unicode characters on an ASCII keyboard is often impossible, and requires replacing the characters using a character map instead (νίκη, νικη). Nicole Sharp (talk) 04:46, 19 November 2015 (UTC)

I agree. The software already redirects to pages that vary only in diacritics, but it needs to be expanded to work with more scripts. — Ungoliant (falai) 17:03, 21 November 2015 (UTC)
[I]f someone has a protoword they want to look up in Wiktionary but is not sure for which protolanguage — Probably the best approach to this is to start with one of the descendants. If a proto-word has been added on Wiktionary, it's likely linked from its descendants, too. I guess in principle someone could have neither — where you only have something like *fō "hand" and no idea what part of the world this is from — but this seems too unlikely to be useful to account for.
Proto-word redirects for transcription variants would be good to have around, and certainly preferable to treating them as "spelling variants" with separate entries. --Tropylium (talk) 21:37, 21 November 2015 (UTC)
We already do that last part. —CodeCat 00:19, 22 November 2015 (UTC)
I know, I'm just echoing the recommendation to add more. --Tropylium (talk) 02:17, 22 November 2015 (UTC)
Expanding a bit more on this, I'd like to also suggest at least as a general practice (possibly policy):
  • Any reconstruction entry that lists alternate reconstructions should
  1. not link to these
  2. set up all listed alternate reconstructions as redirects (provided that this does not clash with other reconstructed entries).
--Tropylium (talk) 16:17, 23 November 2015 (UTC)
I think links can be helpful just to ensure that the redirects exist. —CodeCat 16:27, 23 November 2015 (UTC)
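
The diacritic lookup discussed in this section can be illustrated with Unicode normalization. This is only a sketch of the general technique, not a description of how the site software does it. The two function names are hypothetical, and the choice of macron and breve as the "length marks" to drop is an assumption based on the νῑ́κη example above.

```python
import unicodedata

MACRON = "\u0304"  # combining macron
BREVE = "\u0306"   # combining breve


def strip_length_marks(term: str) -> str:
    """Drop macrons and breves but keep accents (minimal-diacritics form)."""
    decomposed = unicodedata.normalize("NFD", term)
    kept = "".join(ch for ch in decomposed if ch not in (MACRON, BREVE))
    return unicodedata.normalize("NFC", kept)


def strip_all_marks(term: str) -> str:
    """Drop every combining mark, leaving only base letters."""
    decomposed = unicodedata.normalize("NFD", term)
    # unicodedata.combining() is non-zero for combining marks such as accents.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


# nu, iota-with-macron, combining acute, kappa, eta
copied = "\u03bd\u1fd1\u0301\u03ba\u03b7"
print(strip_length_marks(copied), strip_all_marks(copied))
```

The first function gives the page title a redirect would point to (νίκη), and the second gives the fully stripped spelling (νικη), so a copied term could be resolved without retyping it from a character map.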

A cool Pleco update for Cantonese

To Chinese editors: Pleco dictionary now has a new downloadable dictionary for specifically Cantonese terms - 20,000 entries. Such entries are marked as CCY. Previously, only terms that shared the forms with Mandarin were there. Funny enough, these terms still have pinyin readings:

PY qú dì
ZY ㄑㄩˊ ㄉㄧˋ
JP keoi5 dei6

Please spread the news for Chinese editors. No need to go online to Sheik's dictionary any more, which should be larger, actually. --Anatoli T. (обсудить/вклад) 07:13, 23 November 2015 (UTC)

A def line for phrasal verbs

I would like to add a definition line to the main verb for each phrasal verb. I've made Template:phrasal verb for that purpose. You can see it in use at abide right now. I think this will make it easier for people to see that it's there -- if someone is looking for help understanding a sentence that uses "abide by", they won't know they should be looking up the phrasal verb or going down to "Related terms" to find what they're looking for. It might also help reduce the frequency with which people add phrasal verb definitions to the main verb form, since they don't realize that it is part of a different entry. What do you think? WurdSnatcher (talk) 21:43, 23 November 2015 (UTC)

Ave WurdSnatcher, nos correcturi te salutamus.​—msh210 (talk) 19:59, 25 November 2015 (UTC)

Your input requested on the proposed #FreeBassel banner campaign

This is a message regarding the proposed 2015 Free Bassel banner. Translations are available.

Hi everyone,

This is to inform all Wikimedia contributors that a straw poll seeking your involvement has just been started on Meta-Wiki.

As some of you might be aware, a small group of Wikimedia volunteers have proposed a banner campaign informing Wikipedia readers about the urgent situation of our fellow Wikipedian, open source software developer and Creative Commons activist, Bassel Khartabil. An exemplary banner and an explanatory page have now been prepared, and translated into about half a dozen languages by volunteer translators.

We are seeking your involvement to decide if the global Wikimedia community approves starting a banner campaign asking Wikipedia readers to call on the Syrian government to release Bassel from prison. We understand that a campaign like this would be unprecedented in Wikipedia's history, which is why we're seeking the widest possible consensus among the community.

Given Bassel's urgent situation and the resulting tight schedule, we ask everyone to get involved with the poll and the discussion to the widest possible extent, and to promote it among your communities as soon as possible.

(Apologies for writing in English; please kindly translate this message into your own language.)

Thank you for your participation!

Posted by the MediaWiki message delivery 21:47, 25 November 2015 (UTC)

About the active votes

These are the most recent votes that I created. All 3 of them were based on previous discussions and were announced before; I'm just repeating the announcement here for convenience/visibility/whatever.

  1. Wiktionary:Votes/2015-11/Namespace abbreviations -- I created it yesterday. Scheduled start date: Dec 1.
  2. Wiktionary:Votes/pl-2015-11/Short blocking policy -- I created it today. Scheduled start date: Dec 2.
  3. Wiktionary:Votes/2015-11/Language-specific rfi categories -- It started on Nov 22, you can vote on it now.

Also I didn't create this, but it is another recently-created vote which may be of interest:

  1. Wiktionary:Votes/bc-2015-11/User:Chuck Entz for bureaucrat -- You can vote here now, too.

On the opposite end, here are the votes that are closest to end, both on Nov 30:

  1. Wiktionary:Votes/pl-2015-09/Using macrons and breves for Ancient Greek in various places
  2. Wiktionary:Votes/pl-2015-10/Headword line

Also, I extended one vote by 1 month:

  1. Wiktionary:Votes/2015-10/Matched-pair naming format: left, space, right -- It currently has 100% support (5-0-0). It was going to end on Nov 22, but I felt uncomfortable closing the vote because only 5 people voted. That said, it's arguably a minor proposal. I guess I could have closed it but I'd rather wait a little more.

I haven't mentioned any active votes other than those that are starting now, ending now, or the one that I extended. I'm leaving the vote box here which should contain all the active votes. --Daniel Carrero (talk) 23:05, 25 November 2015 (UTC)

I don't think there was any reason to extend the matched-pair vote. --WikiTiki89 23:30, 25 November 2015 (UTC)
  • We don't have any precedent on enforcing a quorum that I know of. It might not be a bad idea; I might previously have thought of it as pointless bureaucracy given our low turnout, but the recent advances in vote visibility (and thus turnout) have made this feasible in protecting the democratic structure. —Μετάknowledgediscuss/deeds 00:18, 26 November 2015 (UTC)
    In some cases the issues involved in a vote presuppose technical or linguistic knowledge that may not be had by many. Would an abstention on the grounds of ignorance count for quorum purposes? If not, how else would the quorum be adjusted for such cases? DCDuring TALK 00:26, 26 November 2015 (UTC)
I support counting abstention votes for quorum purposes. --Daniel Carrero (talk) 00:32, 26 November 2015 (UTC)
Agreed. Benwing2 (talk) 05:24, 26 November 2015 (UTC)
I take it the quorum would be used as a rationale for extending votes? For example, we could have the rule that we should not close any votes with fewer than 10 voters; and that such votes should be extended by one month. To prevent an endless loop of extending particularly unpopular votes, however unlikely, we could also have the rule that a number of X months or Y extensions is the maximum allowed, after which the vote is closed as failed. --Daniel Carrero (talk) 06:04, 26 November 2015 (UTC)

Module errors on many pages (particularly Portuguese) regarding incorrect genders

I added a check to Module:gender and number to make sure someone didn't add some nonsensical gender like "masculine feminine" or "singular plural". These are two separate gender specifications, so they should be specified as such. Apparently, some editors have been making this mistake a lot, and as a result there are a lot of module errors. Can these be fixed? —CodeCat 00:56, 26 November 2015 (UTC)

It appears that all of these are the work of User:Ungoliant MMDCCLXIV‎, who has invented his own interpretation of gender codes, against years of consensus and common practice. "m-f" has never been a valid gender, so I don't know why this user started adding it to things without any discussion. The proper way to add multiple genders is, and has always been, to specify the second gender with another parameter, such as g=m|g2=f in {{head}} or {{l}}. —CodeCat 01:53, 26 November 2015 (UTC)

For the record, the hundreds of module errors are being caused by CodeCat’s edits to Module:gender and number, not by anything I did. — Ungoliant (falai) 01:57, 26 November 2015 (UTC)
All I did was add a check on incorrect genders. You provided the incorrect genders, and have apparently done so on a huge scale without any discussion. Or can you point me to the discussion where you got consensus for using "m-f" as a gender? —CodeCat 02:02, 26 November 2015 (UTC)
Can you point me to the discussion where you got consensus for filling Wiktionary with module errors where before we had something working properly that hadn’t gotten a single complaint despite years of use? — Ungoliant (falai) 02:07, 26 November 2015 (UTC)
There's nothing nonsensical about m-f or mf- it's just shorthand for both genders (see 父母, for a real-life example). Ungoliant isn't the only one who does this- I've cleaned up a number of instances over the past year or so (muttering under my breath the whole time). Since this isn't ambiguous, it should be allowed for in the code. Chuck Entz (talk) 04:22, 26 November 2015 (UTC)
I agree with CodeCat that you should use g=m|g2=f or similar in place of m-f, but I'd suggest having a bot clean these up. Benwing2 (talk) 05:23, 26 November 2015 (UTC)
g=m|g2=f means something different from m-f in Portuguese templates. This is the second time CodeCat has tried to unilaterally remove this distinction, even though it was discussed in WT:APT. — Ungoliant (falai) 12:17, 26 November 2015 (UTC)
What is the difference? I see WT:APT mentioning mf and morf but there's no mention of specifying multiple genders using different params. BTW one way to deal with this is to have the relevant template convert mf to g=m|g2=f underlyingly. Benwing2 (talk) 01:52, 27 November 2015 (UTC)
m-f and mf are used for words that have multiple genders, with the gender used corresponding to the referent’s sex when necessary; multiple gender parameters (and morf, which became unnecessary after {{pt-noun}} was converted to Lua) are for words that have a single gender, but a different gender is used depending on formality, dialectal, chronolectal or idiosyncratic factors (e.g. gangue is masculine in Portugal and feminine in Brazil). — Ungoliant (falai) 02:06, 27 November 2015 (UTC)
That's not at all obvious, it's just a convention you invented. There are better and more intuitive ways to denote that the gender of the noun matches the gender of the referent. —CodeCat 02:16, 27 November 2015 (UTC)
That’s not an excuse to break the template. — Ungoliant (falai) 02:17, 27 November 2015 (UTC)
You're right, it isn't. So why did you? —CodeCat 02:19, 27 November 2015 (UTC)
What you've done is like deleting a widely-transcluded template without orphaning it. I don't care how hideous you think the status quo was, it wasn't as bad as the thousands of module errors we're faced with at the moment. This is bad for the function and the reputation of our site, and you have yet to say anything that justifies leaving it this way. Please explain why your edits shouldn't be reverted until the problem with the parameters is resolved. Chuck Entz (talk) 07:16, 27 November 2015 (UTC)
(edit conflict) It seems to me it's useful to have a difference between X and Y vs. X or Y for multiple genders, which the gender module doesn't support, but AFAIK the intended meaning of g=m|g2=f is X and Y. Maybe the gender module should have a way of supporting this distinction? Benwing2 (talk) 02:22, 27 November 2015 (UTC)
Maybe the way forward is to create individual gender modules for languages with exceptional considerations. I know of at least two Spanish words that have non-semantically varying gender: puente and samba. — Ungoliant (falai) 02:42, 27 November 2015 (UTC)
I think it would be better to have consistent usage of gender across languages whenever possible. The distinction of nouns that are multiple genders corresponding to different meanings and nouns that have varying gender by usage without a meaning distinction exists in many languages, probably in all languages with grammatical gender, and it would be good if the gender module supports this. BTW I take back what I said earlier about the intended meaning of g=m|g2=f, I think it's actually intended to cover both situations, or at least I can think of words that use it for both, e.g. in дереве́нщина ‎(derevénščina, yokel) it is used to indicate an epicene noun (m or f according to the semantic referent), whereas in طَرِيق ‎(ṭarīq, road) it indicates a noun whose gender can vary without meaning distinction. Maybe the gender module can be modified to support something like g=m-or-f and g=m-and-f (for Russian this could potentially be e.g. m-an-and-f-an-and-f-in, with more than two genders possible). Benwing2 (talk) 03:26, 27 November 2015 (UTC)
I'm just gonna butt in here and say that Hindi has several nouns that can be masculine and feminine based on context, and some that can be both depending on the speaker's choice. This is not just restricted to Portuguese. See Category:Hindi masculine and feminine nouns. Aryamanarora (talk) 17:57, 27 November 2015 (UTC)
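
To make the disagreement concrete, here is a rough sketch of the kind of check described at the top of this section, together with the shorthand expansion Benwing2 suggests. It is written in Python for illustration only; it is not the code of Module:gender and number (which is Lua). The code table and both function names are assumptions.

```python
# Assumption: each hyphen-separated code belongs to exactly one category.
CATEGORIES = {
    "m": "gender", "f": "gender", "n": "gender", "c": "gender",
    "an": "animacy", "in": "animacy",
    "s": "number", "d": "number", "p": "number",
}


def validate_spec(spec):
    """Raise ValueError if one spec repeats a category, e.g. 'm-f' or 's-p'."""
    seen = {}
    for code in spec.split("-"):
        if code not in CATEGORIES:
            raise ValueError(f"unknown code {code!r} in {spec!r}")
        category = CATEGORIES[code]
        if category in seen:
            raise ValueError(
                f"{spec!r} combines {seen[category]!r} and {code!r}, "
                f"which are both {category} codes"
            )
        seen[category] = code


def expand_shorthand(spec):
    """Turn the 'mf' / 'm-f' shorthand into two separate specs."""
    if spec in ("mf", "m-f"):
        return ["m", "f"]
    return [spec]


for spec in expand_shorthand("m-f"):
    validate_spec(spec)  # each expanded spec passes on its own
```

Expanding before validating would keep existing pages from erroring, but it discards the distinction Ungoliant describes between "gender follows the referent" and "gender varies by usage". Preserving that would need an extra marker, along the lines of the m-or-f and m-and-f idea above.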

About deleting l/en, l/la, l/de and others

In some RFDO discussions dating back to April 2015 (which are still open), there seems to be some consensus towards deleting templates in the format of Template:l/de, Template:l/la, Template:l/en.

See discussions:

I am bringing this up on the BP because that is quite a big project: there are a few dozen templates like this. Most seem to be superfluous to Template:l. You could type {{l|en|buzzard}} and not {{l/en|buzzard}}. These are the templates I'd like to delete.

Some of these templates have actual language-specific purposes: this includes {{l/he}}, {{ja-l}} and {{ko-l}}, which I propose to be kept even if all others are deleted. (but I'd rather move Template:l/he to Template:he-l and leave a redirect, for naming consistency). --Daniel Carrero (talk) 02:57, 26 November 2015 (UTC)

My impression from the old RFDOs and RFMs is that there is already consensus for all of what you propose. —Μετάknowledgediscuss/deeds 03:31, 26 November 2015 (UTC)
I think I'm going to create a vote: "Allow bots to change l/la into l|la" and list the specific templates that would be deleted. Even if we have consensus to do that, this involves editing probably thousands of pages in a repeated fashion, so it is clearly bot work. --Daniel Carrero (talk) 06:30, 26 November 2015 (UTC)
Aren't {{l/en}} etc. meant to reduce page size? That's what I've seen people saying. Aryamanarora (talk) 17:53, 27 November 2015 (UTC)
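
The bot edit proposed here amounts to a text substitution. Below is a minimal sketch, assuming the l/xx templates take the same remaining parameters as {{l}} and that only {{l/he}} needs to be skipped. The function name convert_links and the KEEP set are hypothetical, and a real bot run would need to confirm both assumptions for each template.

```python
import re

# Assumption: language-specific templates that should be left untouched.
KEEP = {"he"}

# Matches the opening of a call such as "{{l/en|" and captures the code.
LINK_TEMPLATE = re.compile(r"\{\{l/([a-z][a-z-]*)\|")


def convert_links(wikitext: str) -> str:
    """Rewrite {{l/xx|...}} as {{l|xx|...}}, except for codes in KEEP."""
    def replace(match):
        code = match.group(1)
        if code in KEEP:
            return match.group(0)  # leave the call exactly as written
        return "{{l|" + code + "|"
    return LINK_TEMPLATE.sub(replace, wikitext)


print(convert_links("{{l/en|buzzard}}, {{l/la|aqua}}, {{l/he|x}}"))
```

Only the start of each call is rewritten, so any further parameters pass through unchanged.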

term cleanup in user pages and discussion pages

@Angr, CodeCat, Metaknowledge, DTLHS, Jberkel, Equinox, Benwing2:

As per User:Daniel Carrero/term cleanup, I was hired by @Angr to add lang= to all instances of {{term}} as a paid job.

There's also a newly-created vote (Wiktionary:Votes/2015-11/term → m; context → label) which proposes to convert automatically all instances of {{term}} with langcode into {{m}}.

I've been doing it on the entry namespace only. (also Help:Misspellings) But we have cleanup categories for other namespaces, too:

Is it okay if I edit others' user pages and discussions to add the language code?

Editing these pages would be a step in the direction of orphaning and probably deleting {{term}}. If these pages still use it, then the template can't be deleted. If the template is deleted, then it's going to break all pages that use it.

(P.S.: I edited this page and readded my signature a few times to add more people to the ping list. Sorry if the notification appeared multiple times to you, I didn't mean to spam.) --Daniel Carrero (talk) 15:32, 26 November 2015 (UTC)

The pinging didn't work. Anyway, wouldn't there be cases where people were intentionally showing the difference between two templates? We might not want to make the historical archives harder to read. —Μετάknowledgediscuss/deeds 17:22, 26 November 2015 (UTC)
History, shmistory. That argument has never stopped such an effort before, even though in many ways it trashes one of the key ideas of a wiki. DCDuring TALK 20:04, 26 November 2015 (UTC)
As I recall, I hired you (Daniel) to clean up Category:term cleanup and all its subcategories, i.e. including the ones for other namespaces. I don't really care whether {{term}} is deleted or not; if it's kept, then once Category:term cleanup and all subcategories are cleared, I'd prefer {{term}} to output an error message if lang= isn't present. That way Category:term cleanup won't fill up again. —Aɴɢʀ (talk) 21:15, 26 November 2015 (UTC)
I didn't get your ping either, but it's fine with me if you edit other people's user pages to fix this up. I've done this before when making incompatible changes to templates and I think it's more polite to do it than just leave the pages broken, even if many of these pages are outdated. Benwing2 (talk) 01:44, 27 November 2015 (UTC)
Honestly seems like a waste of time. How is this really helping the dictionary in any significant way? Can't you add missing Portuguese words instead? Equinox 11:00, 27 November 2015 (UTC)
I am deeply concerned that someone is getting paid to do the kind of work everyone here does on a volunteer basis. No one's contributions are more valuable than anyone else's. This is completely unacceptable. -Cloudcuckoolander (talk) 11:45, 27 November 2015 (UTC)
User:Dan Polansky questioned this project before, at Wiktionary:Grease pit/2015/November#Suggestion: "term -> m" by bot. There, some people supported the proposal of migrating {{term}} to {{m}}; the latter requires a language code so I'm adding the codes. In the earlier discussion Wiktionary:Beer parlour/2013/April#Template term and lang parameter, some people supported the proposal of making langcodes mandatory for {{term}}.
Advantages of having the language code that were already mentioned before include: all text without langcode is assumed English apparently, and uses the script "None", so using the right langcode would result in proper formatting from both MediaWiki:Common.css and Special:MyPage/common.css. The orange links gadget only works with a language code; also, if we can convert all instances of {{term}} into {{m}}, then the code would look more consistent, because both are often used side-by-side but they are basically the same template under different names. --Daniel Carrero (talk) 11:49, 27 November 2015 (UTC)
I have little interest in contributing to a project that does not value quality contributions from all editors equally. Either we all get paid, or our only "reward" for editing is knowing we've contributed to the body of free knowledge. I know the former is infeasible, so the only solution is to ban paid editing on this project. I will not contribute any more until it is. -Cloudcuckoolander (talk) 12:05, 27 November 2015 (UTC)
I suggested doing cleanup work as a paid job at Wiktionary:Beer parlour/2015/October#Boring cleanup work for money. This was based on the earlier discussion Wiktionary:Beer parlour/2012/July#Reward or bounty board, in which someone said "I see no reason why a person doing boring cleanup work should not be paid with money if someone offers that money." and "appropriately clean up Category:Translation table header lacks gloss" is mentioned as one possibility.
According to meta:Terms of use/Paid contributions amendment, Wikimedia allows paid contributions on the condition that they are publicly disclosed, which I did: it's no secret to anyone that I'm doing term cleanup for money.
Anyone can do the same job. Other people are converting {{term}} to {{m}} basically every day, but it takes a ton of time to do the whole job of manually emptying a cleanup category with 20,000+ entries, so I offered to make it my personal project. It's not that the work of person A is more valuable than the work of person B, I think it's more about the possibility of offering money for choosing what exactly person B is going to do with their time, as long as the community is okay with it. When I first opened the discussion, I even said, basically: "I need money. What do you want me to do?" The current project was not originally my idea, it was open for the community to choose something if they wanted. --Daniel Carrero (talk) 12:16, 27 November 2015 (UTC)
Paid editing is completely antithetical to a volunteer project aimed at creating a free repository of knowledge. If we were going to hire an expert to create entries for an endangered language, that would be one thing. There probably aren't a lot of, say, native Ainu speakers hanging around on Wikimedia projects, so entries that do get created might be unreliable. Hiring an expert to create reliable entries would be a reasonable solution in that case. But paying regular editors to do regular work is not reasonable. It's creating a two-tier system where some contributions are valued more than others. Why is the work you're doing considered "boring" enough to warrant monetary compensation? Finding citations, formatting citations, creating requested entries -- these are all time-consuming and sometimes tedious tasks I regularly do. I don't get paid for any of it, and I don't expect to. The point is that most of the work necessary to build and maintain a wiki is time-consuming and tedious. If you want to motivate people to take on tasks no one seems willing to do, find another way. This is counterproductive and wrong. -Cloudcuckoolander (talk) 12:51, 27 November 2015 (UTC)
Not that I'm completely happy with it, but what we're talking about here is an arrangement between two editors, not any kind of action by Wiktionary or the community. "The project" isn't assigning a different value, Angr is. As for a "two-tiered" system: it's not cold hard cash, but the system has a provision for thanking other contributors for their edits, and we certainly don't restrict people from offering verbal support for some things, but not for others. Daniel does need to be careful about avoiding conflict of interest in his actions as an admin in relation to this, and we need to keep it in mind in judging his votes and other actions as a community member on related matters, though. I probably would be a lot more concerned if he wasn't already active and contributing without this. Chuck Entz (talk) 22:22, 27 November 2015 (UTC)
@Daniel Carrero If you want another project, I have a list of over 200 sets of entries that need to be merged (that is, the entries are alternative forms of one another but have full entries). But you’ll have to convince someone else to provide the dough as I’m also unemployed lol! — Ungoliant (falai) 12:30, 27 November 2015 (UTC)
I'll do it for $10 (dollars) or R$40 (reais) through this PayPal link. :) Also I don't mind if I'm opening a "market" for paid jobs on Wiktionary -- someone else might see my message and offer R$30 to get the job. --Daniel Carrero (talk) 12:53, 27 November 2015 (UTC)
That said, (like anyone, I suppose) I've been known to do stuff for free as a favor, when people ask me to. The term cleanup is the only project I am doing, or have done in the past, that is an exception to that.--Daniel Carrero (talk) 22:45, 27 November 2015 (UTC)

When is a plural not a plural?

<Sorry for the chemistry context - probably more general than that.> The word tripalmitins has recently been added, with the definition of "plural of tripalmitin". Now tripalmitin is a specific organic compound that doesn't really have a plural. But the plural form is easily attestable, in sentences such as "Thin layer chromatography of saturated lipids in the fat indicated the presence of mono, di and tripalmitins." The author here means "the presence of monopalmitin, dipalmitin and tripalmitin" and has chosen to put the "s" on the last member of a list to signify all members of the list. So, is the term (as used in this example) a plural? If not, what is it? SemperBlotto (talk) 08:52, 27 November 2015 (UTC)

I'd say it's really "mono-, di- and tri-" + "palmitins", like how "hydrochloric and sulphuric acids" isn't really "hydrochloric and" + "sulphuric acids" but "hydrochloric and sulphuric" + "acids". "tripalmitins" itself is not a word in that sentence, any more than "mono" or "di" are. I'd say this is something that we should exclude on a commonsense basis, just as we'd exclude "thes" as the plural of the word "the" ("There were seven thes on the page"). Smurrayinchester (talk) 10:53, 27 November 2015 (UTC)
I saw examples other than that kind. There are some that are just "tripalmitins" alone. Equinox 10:58, 27 November 2015 (UTC)
The only other examples I can see are variants on "labeled tripalmitins" – these would be different tripalmitin molecules which have been synthesized with radioactive isotopes (carbon 14) in order to allow the otherwise indistinguishable tripalmitin samples to be told apart. It's a bit like orange juice – uncountable, but you could say "South American orange juices contained more vitamin C than North American ones" if you did a scientific study that artificially categorized the juices. Smurrayinchester (talk) 12:03, 27 November 2015 (UTC)
Yes, I'm not questioning its existence as a real plural (in specialised usage); I just thought there might be an aspect of grammar that I wasn't familiar with. SemperBlotto (talk) 12:07, 27 November 2015 (UTC)
To me it seems like a result of the deepening of analytical knowledge. Someone defines something that is unique in a domain or context. Then someone looks into it further and discovers or invents variations, rendering the plural necessary to encompass the variation, at least in some contexts. For our purposes we should just have a plural without a lot of explanation. DCDuring TALK 16:25, 27 November 2015 (UTC)
I agree that we have to weed out things like "mono-, di- and tri-palmitins" and e.g. "myristic and palmitic acids", per Smurray; compare Talk:Asperger's syndromes. (But, on the subject of palmitic acids, I can find 2012, Osamu Hayaishi, Molecular Mechanisms Of Oxygen Activation, ISBN 0323143261, page 45: "The mechanism in the formation of a-hydroxy acids in the leaf system has been studied with palmitic acids stereospecifically labeled with tritium at C-2 and C-3 (80, 81).")
On the other hand, if uses like "labeled tripalmitins" are attested, then I think it would be reasonable to have tripalmitins as a plural of tripalmitin. If we wanted to, we could expand the definition of "tripalmitin" by adding to the end something like "...; an instance of this triglyceride", but I don't think it would be necessary; happiness lists happinesses as its plural without bothering to expand its definition-line, and likewise other words affected by the routine phenomenon of pluralization of things that might be argued to be conceptually unpluralizable, such as (all attested on Google Books) "oxygens", "uraniums", "carbon monoxides", "angers", "Jewishnesses", etc. - -sche (discuss) 18:12, 27 November 2015 (UTC)

Category:English proper noun plural forms

Given that before we just had Category:English plurals, shouldn't we now start discriminating between noun plurals and proper noun plurals? Plenty of proper nouns have them, such as given names and surnames: Steves, Stephens, Matthews, Dianes and so on. Renard Migrant (talk) 13:05, 27 November 2015 (UTC)

They're not really any different in nature from regular noun plural forms, so I don't think a distinction is really useful. —CodeCat 16:27, 27 November 2015 (UTC)
Wiktionary:Votes/pl-2011-12/Merging proper nouns into nouns, a vote to merge proper nouns into nouns, failed. Wiktionary:Requests for moves, mergers and splits#Merge_Category:.28language.29_proper_noun_forms_into_Category:.28language.29_noun_forms showed (!vote-level, by which I mean 2/3rds) consensus not to merge "proper noun forms" into "noun forms". So, yes, "proper noun plural forms" should be separate from "noun plural forms", unless someone can demonstrate enough consensus to merge them to overrule the consensuses I just linked to. - -sche (discuss) 17:39, 27 November 2015 (UTC)

New Latin vs. Translingual

At the entry lycaenid, there is "From New Latin Lycaenidae." in the etymology. I added the code mul to the term, under the assumption that Lycaenidae won't ever have a Latin section, just the Translingual section. I didn't change the "New Latin" part. I've seen some entries saying "From Translingual blablabla.", but "New Latin" is more specific than "Translingual", so I wouldn't want to remove that information from the entry. What I did here is probably what I'd do for other entries, so feel free to suggest if I should do anything different. --Daniel Carrero (talk) 13:16, 27 November 2015 (UTC)

No, I think that's what I'd do too. Aryamanarora (talk) 17:51, 27 November 2015 (UTC)
If the etymon is clearly a taxonomic name, as Lycaenidae is, we should call it Translingual, even were it pre-Linnaean. DCDuring TALK 20:29, 27 November 2015 (UTC)
From New Latin is fine; just use the language code mul with it. Renard Migrant (talk) 20:42, 27 November 2015 (UTC)