Wiktionary:Beer parlour/2017/April

"removed Category:en:Grasses; added Category:en:Hordeeae tribe plants"

Is this really a helpful change to anyone? Ƿidsiþ 08:39, 1 April 2017 (UTC)

An issue I've raised before is that when things in category A get split instead into subcategories AA, AB and AC, there is no longer a way to get a list of everything in category A: you have to go through all the subcategories. (That might be a UI criticism rather than a criticism of how granularly we actually classify our entries.) Equinox 08:42, 1 April 2017 (UTC)
We have 89 species, 60 genus, and 9 tribe Translingual entries for members of the family Poaceae, which includes all the grasses. All of those entries, i.e., Translingual entries for grasses, are accessible using CirrusSearch without the use of a category.
According to the Angiosperm Phylogeny Group, there are 12 extant subfamilies, 707 genera, and 11,337 species. I don't know about the numbers of obsolete names and names of subspecies, cultivars, etc., but sometimes vernacular names are associated with taxa of such low rank.
I suppose we could use subfamilies instead of tribes for subcategories of grasses, but I'm not sure about the stability of the membership in those subfamilies. DCDuring (talk) 17:37, 7 April 2017 (UTC)
First of all, I responded to the concern expressed above by changing the category from "Hordeeae tribe plants" to "Hordeeae tribe grasses", because replacing a category with "grasses" in the name with one that says "plants" makes the name less informative for the vast majority of users who don't know or care what "Hordeeae" refers to. My reason for creating the category was to make Category:en:Grasses more manageable in size. The in-between categories I've been creating aren't intended for most languages, but are helpful in a language like English that has thousands of terms for grasses. In general, I don't create organism categories unless A) they're going to have at least a couple dozen members that would otherwise be too many for the parent category, or B) they're an obvious natural grouping that people are going to want to know about. Most of the categories of the B) type have already been created, so I've been concentrating on the A) ones.
As for whether to do subfamilies instead of tribes: many of the subfamilies are too big, and they're even more meaningless to non-botanists than tribes are. Chuck Entz (talk) 01:20, 8 April 2017 (UTC)
Sidenote: at the presentation level, Equinox's concern can be addressed with DynamicPageList, which was activated on en.WT a decade ago to do pretty much exactly this: show the members of a category tree. (There are more modern tools as well.) - Amgine/ t·e 18:11, 26 April 2017 (UTC)

Centralizing labels

Right now there are dialectal data modules that contain labels used by {{alter}} to label alternative forms. Confusingly, they don't just contain dialect names (for instance, Attic or att for the Attic dialect of Ancient Greek), but also spelling systems (for instance, Oxford for Oxford spelling), morphemic variations (with movable nu), sound changes (apocopic), and perhaps other things.

Dialect labels are also found in Module:labels/data/subvarieties, and are used by {{label}} and {{term-label}}, and by {{alternative form of}}.

The labels used by {{alter}} often correspond to labels in Module:labels/data/subvarieties. For instance, see ἕως (héōs), a form-of entry, where the Attic label from Module:labels/data/subvarieties is used in the definition line through {{alt form of}}, and the main entry ἠώς (ēṓs), where the Attic label from Module:grc:Dialects is used in the Alternative forms section through {{alter}}.

There are some labels that are only found in Module:grc:Dialects: for instance, apocopic and with movable nu.

The latter would be difficult to use in {{alternative form of}}: grammatically, it has to be placed after the lemma: alternative form of lemma with movable nu. That could be specified in the data module, I suppose.

It would also never be used in {{label}}. That too could be specified in the data file.

This is sort of a rambling post. The point is, I think the so-called dialectal data modules (that don't just contain dialect labels) should probably be moved into Module:labels/data and its submodules, though I am not sure precisely how every detail would work out. This would be easier for the actual dialect labels that are duplicated in Module:labels/data/subvarieties, probably harder for the other types of labels in the dialectal data modules.

I think others have made the same point before, or raised questions on this general theme; @CodeCat, @Angr? — Eru·tuon 21:46, 1 April 2017 (UTC)

I appreciate the problem, but I don't have any ideas on solving it. —Aɴɢʀ (talk) 12:23, 3 April 2017 (UTC)

Oh, and I forgot about the accent labels in Module:a/data, which are used by {{a}} in Pronunciation sections. Those would also be candidates for centralization. — Eru·tuon 02:39, 6 April 2017 (UTC)

And the labels used in {{qualifier}} contain much of the same content as those used in {{label}}, except they do not add categories. That too could be centralized. — Eru·tuon 03:06, 6 April 2017 (UTC)

Yeah, I gotta admit, I have no clue when to use {{label}} over {{qualifier}}. --Victar (talk) 02:14, 19 April 2017 (UTC)
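
To make the centralization idea in this thread concrete, here is a minimal sketch in Python (the real data modules are Lua, so this is only an illustration). The `Label` fields, the `allowed_in` and `after_lemma` flags, and all function names are hypothetical; they only model the two points raised above, namely that some labels should not be usable in {{label}} and that "with movable nu" has to follow the lemma.

```python
from dataclasses import dataclass

ALL_TEMPLATES = frozenset({"label", "alter", "alternative form of"})


@dataclass(frozen=True)
class Label:
    """One entry in a hypothetical unified label table."""
    display: str
    aliases: tuple = ()
    kind: str = "dialect"
    allowed_in: frozenset = ALL_TEMPLATES  # templates that may use this label
    after_lemma: bool = False              # True if the text must follow the lemma


# Sample data taken from the examples mentioned in the thread.
LABELS = [
    Label("Attic", aliases=("att",)),
    Label("apocopic", kind="sound change"),
    Label(
        "with movable nu",
        kind="morphemic variation",
        allowed_in=frozenset({"alter", "alternative form of"}),
        after_lemma=True,
    ),
]


def build_index(labels):
    """Map every display name and alias to its Label."""
    index = {}
    for lab in labels:
        for key in (lab.display, *lab.aliases):
            index[key] = lab
    return index


def resolve(index, key, template):
    """Return the Label for key, or None if unknown or not allowed in template."""
    lab = index.get(key)
    if lab is None or template not in lab.allowed_in:
        return None
    return lab


def format_form_of(index, key, lemma):
    """Build definition-line text, placing the label before or after the lemma."""
    lab = resolve(index, key, "alternative form of")
    if lab is None:
        return f"alternative form of {lemma}"
    if lab.after_lemma:
        return f"alternative form of {lemma} {lab.display}"
    return f"{lab.display} alternative form of {lemma}"
```

With one table, each template would consult the same entry and only differ in which flags it honours, which may be easier for the duplicated dialect labels than for the other label types.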

1000000

Why can't I find million if I search for 1000000? Siuenti (talk) 16:40, 2 April 2017 (UTC)

There are infinitely many numbers and we have voted not to include most of them. Equinox 16:41, 2 April 2017 (UTC)
So 1000000 doesn't lead to million for the same reason 97432 doesn't lead to ninety seven thousand, four hundred and thirty two? Siuenti (talk) 16:45, 2 April 2017 (UTC)
one million is the first result when you search for 1000000 (if you bypass the annoying automatic redirect to ១០០០០០០). DTLHS (talk) 16:51, 2 April 2017 (UTC)
The trick is to use the "search" button, not the "go" one. The clue is in the name. SemperBlotto (talk) 16:55, 2 April 2017 (UTC)
I don't seem to have a search button. I do have a little box that says "search wiktionary" Siuenti (talk) 17:11, 2 April 2017 (UTC)
You don't have this? [1] Equinox 17:15, 2 April 2017 (UTC)
Here's what mine looks like [2]. I'm sure there's some preference that controls this. DTLHS (talk) 17:20, 2 April 2017 (UTC)
Yeah mine looks like that if I log out. Can you please make a separate thread about the search button? Siuenti (talk) 18:40, 2 April 2017 (UTC)
Did you guys change your default skin from Vector to Monobook? —suzukaze (tc) 23:31, 3 April 2017 (UTC)
I don't think that matters if I'm logged out. Siuenti (talk) 23:47, 3 April 2017 (UTC)
Vector is the default, including for logged-out users. —suzukaze (tc) 20:57, 4 April 2017 (UTC)

"Declension" and "Conjugation" on the list of valid headers at WT:EL

If you look at the list, it says "Declension" and "Conjugation" right below the POS header, where the headword line should be. This is confusing at best (since there are also sections with these names) and misleading at worst (since there is more often no inflection in the headword line at all). So I think this should be changed to just say "Headword line", so that it matches the name of the section further down that describes the use of the headword line.

Another possible improvement, I think, would be to link the various elements in the list to the sections within the page that describe them. So the aforementioned headword line would become a link to the "Headword line" section. —CodeCat 23:26, 3 April 2017 (UTC)

I agree. I don't think that this is a substantial change that needs a consensus, so I'll go ahead and do it. --WikiTiki89 17:01, 4 April 2017 (UTC)
Thank you. The second point remains, though. —CodeCat 17:19, 4 April 2017 (UTC)

Admin v Sysop

We have two current votes - 1) x for de-admin, and 2) y for desysop. Am I mistaken in assuming these are one and the same thing? — Saltmarsh. 19:07, 4 April 2017 (UTC)

They are the same. Sysop is not a term in official use around here but some people have picked it up elsewhere. Equinox 19:09, 4 April 2017 (UTC)
My question was really rhetorical. Perhaps we should "de-sysop" our literature - mixing the terms will confuse - for example Wiktionary:Administrators uses both of them — Saltmarsh. 20:06, 4 April 2017 (UTC)
My bad. I started the votes, and just love using synonyms and grandiloquent words. I find it so...good. --G23r0f0i (talk) 20:10, 4 April 2017 (UTC)
You sound like someone who should be a sysop around here. - TheDaveRoss 20:14, 4 April 2017 (UTC)
Not yet. I need a bit more experience, but thanks for the support. --G23r0f0i (talk) 20:18, 4 April 2017 (UTC)
We should de-admin everything too. The real term is "administrator" (if you stress it on the first syllable it sounds more intimidating). After all, you don't call the Terminator a "termin". --WikiTiki89 20:33, 4 April 2017 (UTC)
I have a pet allig. --G23r0f0i (talk) 20:49, 4 April 2017 (UTC)
Actually, sysop is used by the popups that I have enabled: when I put my cursor over a link to a user page, for instance the one in @Wikitiki89's signature, it says sysop. The protection text for modules (from MediaWiki:Protectedpagewarning) also says sysop. — Eru·tuon 20:44, 4 April 2017 (UTC)
MediaWiki itself uses the terms interchangeably: the usergroup is called sysop in the database, but has been "translated" to administrator on Wikimedia projects. Some messages refer to one and some to the other. - TheDaveRoss 12:13, 5 April 2017 (UTC)

Osco-Umbrian/Sabellic languages

Should this be added as a family node? It'd be especially helpful with loanwords into Latin like rūfus, lupus, and bōs, which are currently categorized as deriving from both Oscan and Umbrian, if at all. KarikaSlayer (talk) 19:38, 4 April 2017 (UTC)

Is it actually a genetic family, though? Or just a group of related languages? —CodeCat 12:31, 5 April 2017 (UTC)
It's listed as a node in Glottolog and {{R:De Vaan 2008}} (which also references Proto-Sabellic, implying common origin), at least. KarikaSlayer (talk) 17:15, 5 April 2017 (UTC)
Some of the main distinctions between Sabellic and Latino-Faliscan are:
  • Intervocalic *b, *d, *g, and *gʷ all become *f
  • *kʷ becomes *p
  • Gen.sg. endings *-ojso and *-os were replaced by *-ejs
  • 3pl. ending -nd becomes *-ns
There are probably a bunch more, but those come from a bit of scrounging. —JohnC5 18:12, 5 April 2017 (UTC)
Strictly, I'd say intervocalic *ɸ, *θ, *x, and *xʷ all become *f in Sabellic, while in Latino-Faliscan they become *b, *d, *g, and *gʷ. But the result is the same. —Aɴɢʀ (talk) 22:37, 5 April 2017 (UTC)
So, how would we go about adding this then? I guess we'd have to decide on a name (Osco-Umbrian? Sabellic? Sabellian?) and a language code (itc-sab would be the most obvious choice). KarikaSlayer (talk) 19:32, 17 April 2017 (UTC)
I'm not really in favour of creating very small families. They make our category structure more messy. —CodeCat 19:37, 17 April 2017 (UTC)
"Derived from [family]" is probably something we should be doing less of, in general. It seems to have two separate uses: a word being derived from another family's proto-language (and added before we had many proto-languages around); or a word being derived from one of a group of languages, but we don't know which one. With Latin we probably have cases of the second kind, which does not even require the languages in question to be from a single sub-family. So filing them under both Oscan-derived and Umbrian-derived is maybe the best option so far. (Maybe we eventually want to add something like "Latin words with competing etymologies" to help with machine-readability.) --Tropylium (talk) 01:37, 29 April 2017 (UTC)

10-year-olds

An important announcement...the first ever batch of ten-year-old entries have finally matured. Top of the list is unperceptive, which has been brewing 10 whole years without being corrected. It was made by some prat called Keene (talk · contribs). I wonder what came of him... --G23r0f0i (talk) 20:17, 4 April 2017 (UTC)

Why is it the first-ever batch? There have been ten-year-old entries for at least three years now. --WikiTiki89 20:36, 4 April 2017 (UTC)
Nope, I've been monitoring OldPages for many years now. This is the first time they've matured. And to show how sad I am, I have been playing a game to try to get all WF pages in the top 20 oldest pages. --G23r0f0i (talk) 20:47, 4 April 2017 (UTC)
It's true that, according to that report, "the first ever batch of ten-year-old entries have finally matured".
But, FWIW, the report is simply wrong. For instance, the entry dictionary was created on 12 December 2002. --Daniel Carrero (talk) 21:01, 4 April 2017 (UTC)
By "oldest pages" it means pages that have not been edited for the longest time. DTLHS (talk) 21:05, 4 April 2017 (UTC)
OK then. --Daniel Carrero (talk) 21:05, 4 April 2017 (UTC)
Still though, it's quite possible entries have "matured" before and then were subsequently edited. --WikiTiki89 21:16, 4 April 2017 (UTC)
I remember Keene (talk · contribs). Most of the time, he did pretty good work. He had some excellent bots that he used. —Stephen (Talk) 19:11, 5 April 2017 (UTC)
Stephen, I wish I knew if you have a really dry sense of humour or not, because sometimes I just can't tell. —Μετάknowledgediscuss/deeds 22:27, 5 April 2017 (UTC)
My sense of humor can be dry and hard to detect, so I try to avoid humor. I'm serious about WF, I can't help liking him. I admit he used to be irritating, but I've gotten used to his antics. —Stephen (Talk) 10:09, 7 April 2017 (UTC)
I would rather say that if not even a bot has had to update a single bit of formatting in all these years, these entries are more immature than mature (no IPA, no etym, no usex, ...). Julien Daux (talk) 09:43, 8 April 2017 (UTC)
In that list, I found some entries (like in this diff) lacking a space between the headword line and the definitions. Like this:
==English==

===Noun===
{{en-noun}}
# blah blah blah blah
So, apparently our bots haven't been fixing that small formatting "mistake" in the last 10 years. --Daniel Carrero (talk) 15:14, 8 April 2017 (UTC)
Yes, in that entry, the last "blah" is definitely redundant. --G23r0f0i (talk) 17:06, 8 April 2017 (UTC)
But SRSLY though, JD makes a good point. OldPages is generally a good place to start if you want to look for stubs and update some formatting. --G23r0f0i (talk) 17:06, 8 April 2017 (UTC)
So, does that mean that I won't be breaking anybody's heart if I add a few quotations to entries on OldPages? Ben (talk) 00:57, 17 May 2017 (UTC)

to ñandú

If I was reading some Spanish which doesn't have diacritics, I might end up at the page nandu because I didn't know what it meant. What would be my next step? (what I need to find is ñandú but I don't know that yet) Siuenti (talk) 09:17, 5 April 2017 (UTC)

If you're reading Spanish with missing diacritics, your next step is to throw it away and get some Spanish that's spelled correctly. Failing that, I suppose you could go through all the pages listed under "See also:" at the top of the page nandu until you find one with a Spanish entry. —Aɴɢʀ (talk) 11:36, 5 April 2017 (UTC)
Oh come on. Don't deny reality. --WikiTiki89 15:38, 5 April 2017 (UTC)
That seems a bit inconvenient when there could be a #Spanish section that could take me straight there. Siuenti (talk) 21:30, 5 April 2017 (UTC)
This is one of the reasons why we have the see-alsos at the top of the page. --WikiTiki89 21:34, 5 April 2017 (UTC)
I've seen form-of entries for diacriticless forms in other languages; perhaps such entries can be created for Spanish words too? — Eru·tuon 21:35, 5 April 2017 (UTC)
Example: etre, which is marked as a misspelling. — Eru·tuon 21:37, 5 April 2017 (UTC)
I think it would be a little much to create these for all attested diacriticless spellings. --WikiTiki89 21:38, 5 April 2017 (UTC)
Given that French has more diacritics than Spanish and it already appears to have lots of diacriticless entries, it would make more sense to say it is too much to have such entries. — Eru·tuon 21:44, 5 April 2017 (UTC)
Actually, there aren't all that many entries in Category:French misspellings and certainly not one diacriticless spelling for every diacriticful French spelling, so I'm not sure what the criteria for including them were and how to determine whether similar Spanish entries should be added. But I would have thought any attested misspellings can have entries; isn't attestation the criterion for inclusion? — Eru·tuon 22:06, 5 April 2017 (UTC)
Misspellings are treated differently by the criteria for inclusion. They are only allowed if they are exceptionally common. — Ungoliant (falai) 22:11, 5 April 2017 (UTC)
There's a three-way distinction to be made here (I'll talk specifically about diacritics, but this can be applied to any spelling variations in general):
  • Misspellings: the author omits the diacritic(s) accidentally (whether this is simply a mistake or due to lack of knowledge of the "correct" spelling, in an otherwise diacriticized text). These need to be exceptionally common to merit inclusion.
  • Alternative spellings: the author intentionally omits the diacritic(s) (in an otherwise diacriticized text). These follow the ordinary inclusion criteria.
  • The entire text lacks diacritics (which I think is the case Siuenti is referring to). These I don't think should be included, but we don't really have a policy.
--WikiTiki89 16:13, 6 April 2017 (UTC)
What about a search for "nandu spanish" instead of plain "nandu"? —suzukaze (tc) 00:08, 6 April 2017 (UTC)
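
As an illustration of how a diacriticless spelling could be matched to diacriticized page titles, here is a small Python sketch. It is not how Wiktionary's "See also" lists or its search are actually built; the function names and the sample title set are made up for the example.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose to NFD and drop combining marks, so 'ñandú' becomes 'nandu'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def candidates(plain, known_titles):
    """Return titles that differ from `plain` only by diacritics."""
    return sorted(
        title
        for title in known_titles
        if title != plain and strip_diacritics(title) == plain
    )


# Hypothetical sample of page titles, for illustration only.
titles = {"ñandú", "nandu", "Nandu"}
print(candidates("nandu", titles))
```

A reader arriving at nandu could then be pointed at the matching titles, with the language sections of those pages deciding which one is the Spanish word.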

Should IPs be allowed to create new entries?

A good deal of IP vandalism involves creating new pages, which take much more time to delete than it takes to roll back a vandalistic edit when patrolling. Good IP edits are usually small changes to existing entries or adding translations; creating new entries requires a greater amount of experience. Those IP editors that do spend enough time to learn that can take the time to register an account anyway, and most vandals won't bother. Wikipedia has already had this system in place for a while, which has greatly reduced their patrolling effort (I should note that we have quite a backlog now). Is there support for restricting IP editors from creating new pages? —Μετάknowledgediscuss/deeds 18:22, 5 April 2017 (UTC)

I would oppose that restriction. (I hate having to sign up for things, for many reasons, and think it goes against the openness of a wiki project. I can remember when Web forums were things you could just post on, without having to give away your e-mail address etc.) However, I would support making more use of abuse filters, e.g. blocking new entries that lack certain basic elements, and automatically encouraging the user to create a valid one: trolls won't bother, while those with something to contribute probably will. Equinox 18:24, 5 April 2017 (UTC)
(edit conflict) Yes, they should. I was an IP before I became a registered user in 2013, and I believe other registered users started contributing in the same manner before they took the plunge. DonnanZ (talk) 18:37, 5 April 2017 (UTC)
I agree in principle; however, I'm always cautious when it comes to restricting participation. Believe you me, I'm the first to celebrate less vandalism, but I'm inclined to agree with Equinox that optimising abuse filters might be the way to go before having to consider, for lack of a better word, more "drastic" measures. --Robbie SWE (talk) 18:32, 5 April 2017 (UTC)
  • An Abuse Filter that prohibits IPs from adding new entries lacking a valid L2 header would be a good improvement. Can that be written so we can test it for false positives? @DTLHS, maybe? —Μετάknowledgediscuss/deeds 18:53, 5 April 2017 (UTC)
    • @Metaknowledge Invalid L2 headers will be picked up by people doing dump analysis (I know me and Ungoliant MMDCCLXIV at least check this). There's also no way to tell if a L2 header is valid in the abuse filter, just that it's present. DTLHS (talk) 00:57, 6 April 2017 (UTC)
      • However, I have created abuse filter 64 (disabled for now) and will see how many hits it gets. DTLHS (talk) 00:59, 6 April 2017 (UTC)
  • Another observation I'd make is that omitting headers isn't the real problem we want to solve. A blanket ban might further discourage users who have real content to contribute but don't know how to format it. The really "bad" kind of vandalism is the people who just want to write PENIS everywhere. Equinox 16:15, 6 April 2017 (UTC)
  • I think we can handle cleanup or deletion of new entries of anons. There are multiple ways of identifying them. Sometimes such contributions open our little world to entire areas of language that we have neglected, eg, technical jargon specific to trades and industry. DCDuring (talk) 11:00, 7 April 2017 (UTC)
  • Any restriction on anon users will increase the ratio of bad registered users to good registered users--Giorgi Eufshi (talk) 11:33, 7 April 2017 (UTC)
I'd like to extend that to requiring users to be logged in to edit reconstructed entries. Often such edits are simply reverted because we have no one to address a discussion to. --Victar (talk) 23:52, 18 April 2017 (UTC)
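
The L2-header check discussed earlier in this thread can be sketched as follows. This is a Python illustration of the idea, not the rule language of the actual abuse filter (filter 64 is not shown in this discussion), and the names are hypothetical. As DTLHS points out, a check like this can only tell that a level-2 header is present, not that it names a valid language.

```python
import re

# A level-2 header is a line of the form ==Name==, with exactly two equals
# signs on each side; ===Noun=== and deeper headers must not match.
L2_HEADER = re.compile(r"^==[ \t]*([^=\n]+?)[ \t]*==[ \t]*$", re.MULTILINE)


def l2_headers(wikitext):
    """Return the names of all level-2 headers found in the page text."""
    return [match.group(1) for match in L2_HEADER.finditer(wikitext)]


def has_l2_header(wikitext):
    """True if the text contains at least one level-2 header."""
    return bool(l2_headers(wikitext))


sample = "==English==\n\n===Noun===\n{{en-noun}}\n\n# blah blah blah\n"
print(l2_headers(sample))
```

Checking the extracted names against a list of known language names would be a separate step, which is the part the thread says is caught by dump analysis instead.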

We are missing lots of words

As an example, the numbers of words in the following categories are:-

Or we could just be missing lots of simple etymology sections. SemperBlotto (talk) 15:51, 6 April 2017 (UTC)

I'm kind of on it! Equinox 15:55, 6 April 2017 (UTC)
Are you trying to tell us Wiktionary is not finished yet? --WikiTiki89 16:13, 6 April 2017 (UTC)
Our French coverage is pretty good overall, so I suspect the low numbers have more to do with missing etymologies than anything. Often, the spellings are the same as the English word, or similar, and people tend not to put etyms for FL's when that is the case. Andrew Sheedy (talk) 02:14, 7 April 2017 (UTC)

WikibaseLexeme

Wikibase was briefly mentioned in the Beer parlour in February. A MediaWiki extension called WikibaseLexeme (mw:Extension:WikibaseLexeme) is now under development. I'm not sure where this fits in the great plan, but it looks as if it could reorganize and centralize some of Wiktionary's work, in a similar fashion to how Wikidata took over and centralized Wikipedia's interwiki links.

Should we care or not? Does it concern us? Or when will it concern us? I don't know.

Anyhow, this extension has a data model (mw:Extension:WikibaseLexeme/Data_Model), that says words have forms (bird-birds, hard-harder-hardest) that are listed, that is, they are explicitly enumerated for each word without any intermediate template or inflection pattern. For me, a speaker of Swedish and German, languages rich in forms that follow a few patterns, this sounds rather stupid, so I had to ask for clarity. Here is the discussion: mw:Topic:Tneli5zzrq5jb5bo, in case anybody is interested. --LA2 (talk) 20:59, 6 April 2017 (UTC)

It does sound stupid. Evidently they are trying to render us obsolete without even understanding how lexicography works. We should be concerned, but it seems that these people don't really have any desire to work with us, let alone listen to us. —Μετάknowledgediscuss/deeds 21:34, 6 April 2017 (UTC)
Storing forms raw makes sense to me. If the Wikidata side tries to become so complex so early, it could end up a huge mess. Languages with complex inflection patterns should have them handled on local Wiktionaries, not on Wikidata, at least until WikibaseLexeme becomes far more sophisticated. --Yair rand (talk) 21:56, 6 April 2017 (UTC)
Seems interesting but woefully incomplete... —Aryamanarora (मुझसे बात करो) 13:45, 7 April 2017 (UTC)
Seems to follow an excessively Rationalist approach. Wouldn't something more evolutionary be better? DCDuring (talk) 17:10, 7 April 2017 (UTC)
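
The two approaches contrasted in this thread, forms enumerated explicitly per word versus forms generated from a shared inflection pattern, can be sketched in Python. This is not the WikibaseLexeme data model; every name here is hypothetical, and the "patterns" are toy English rules that cover only the examples from the original post.

```python
# Approach 1: every form is stored explicitly for each word.
EXPLICIT_FORMS = {
    "bird": ["bird", "birds"],
    "hard": ["hard", "harder", "hardest"],
}


# Approach 2: each word names a pattern, and the forms are generated.
def regular_plural(lemma):
    """Toy rule: singular and plural in -s."""
    return [lemma, lemma + "s"]


def regular_comparison(lemma):
    """Toy rule: positive, comparative in -er, superlative in -est."""
    return [lemma, lemma + "er", lemma + "est"]


PATTERNS = {"noun-s": regular_plural, "adj-er": regular_comparison}
PATTERN_OF = {"bird": "noun-s", "hard": "adj-er"}


def expand(lemma):
    """Generate the forms of a lemma from its assigned pattern."""
    return PATTERNS[PATTERN_OF[lemma]](lemma)


for lemma, stored in EXPLICIT_FORMS.items():
    print(lemma, expand(lemma) == stored)
```

The stored list is simple to read and query but grows with every form of every word; the pattern table stays small but needs a rule engine for each language, which is roughly the trade-off argued over above.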

Start of the 2017 Wikimedia Foundation Board of Trustees elections

Please accept our apologies for cross-posting this message. This message is available for translation on Meta-Wiki.

On behalf of the Wikimedia Foundation Elections Committee, I am pleased to announce that self-nominations are being accepted for the 2017 Wikimedia Foundation Board of Trustees Elections.

The Board of Trustees (Board) is the decision-making body that is ultimately responsible for the long-term sustainability of the Wikimedia Foundation, so we value wide input into its selection. More information about this role can be found on Meta-Wiki. Please read the letter from the Board of Trustees calling for candidates.

The candidacy submission phase will last from April 7 (00:00 UTC) to April 20 (23:59 UTC).

We will also be accepting questions to ask the candidates from April 7 to April 20. You can submit your questions on Meta-Wiki.

Once the questions submission period has ended on April 20, the Elections Committee will then collate the questions for the candidates to respond to beginning on April 21.

The goal of this process is to fill the three community-selected seats on the Wikimedia Foundation Board of Trustees. The election results will be used by the Board itself to select its new members.

The full schedule for the Board elections is as follows. All dates are inclusive, that is, from the beginning of the first day (UTC) to the end of the last.

  • April 7 (00:00 UTC) – April 20 (23:59 UTC) – Board nominations
  • April 7 – April 20 – Board candidates questions submission period
  • April 21 – April 30 – Board candidates answer questions
  • May 1 – May 14 – Board voting period
  • May 15–19 – Board vote checking
  • May 20 – Board result announcement goal

In addition to the Board elections, we will also soon be holding elections for the following roles:

  • Funds Dissemination Committee (FDC)
    • There are five positions being filled. More information about this election will be available on Meta-Wiki.
  • Funds Dissemination Committee Ombudsperson (Ombuds)
    • One position is being filled. More information about this election will be available on Meta-Wiki.

Please note that this year the Board of Trustees elections will be held before the FDC and Ombuds elections. Candidates who are not elected to the Board are explicitly permitted and encouraged to submit themselves as candidates to the FDC or Ombuds positions after the results of the Board elections are announced.

More information on this year's elections can be found on Meta-Wiki. Any questions related to the election can be posted on the election talk page on Meta-Wiki, or sent to the election committee's mailing list, board-elections@wikimedia.org.

On behalf of the Election Committee,
Katie Chan, Chair, Wikimedia Foundation Elections Committee
Joe Sutherland, Community Advocate, Wikimedia Foundation

Posted by MediaWiki message delivery on behalf of the Wikimedia Foundation Elections Committee, 03:36, 7 April 2017 (UTC)

What to Do With New Entries for Numerals > 100

I've created an abuse filter that seems to be good at spotting new entries by non-autopatrolled accounts and IPs where the page name consists of nothing but digits and can be parsed as a number greater than 100. Right now I have it just tagging entries, because I'm not sure exactly what we want it to do when it finds such an entry.

I'm thinking we don't want to disallow, because there are no doubt many potential entries for numerals with meaning other than their numeric value (e.g. 411, 666 and 360). I think the best option would be to tag and warn, with a message explaining our policy/practice. Does anyone (perhaps @BD2412?) want to write one so I can add it to the abuse filter? Chuck Entz (talk) 03:53, 7 April 2017 (UTC)
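
The condition described above (a page name made of nothing but digits that parses as a number greater than 100) can be written out as a short Python sketch. The real abuse filter uses its own rule language and is not reproduced here; the function names, the `threshold` parameter, and the restriction to ASCII digits are assumptions made for the example.

```python
def is_large_numeral_title(title, threshold=100):
    """True if the title is all ASCII digits and its value exceeds threshold."""
    return title.isascii() and title.isdigit() and int(title) > threshold


def should_tag(title, is_autopatrolled):
    """Tag (not disallow) new numeral entries created by non-autopatrolled users."""
    return not is_autopatrolled and is_large_numeral_title(title)


for title in ["1000000", "100", "411", "million"]:
    print(title, should_tag(title, is_autopatrolled=False))
```

Note that 411 is matched just like 1000000, which is why tagging and warning looks safer than disallowing: the filter cannot tell which numerals have a meaning beyond their numeric value.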

  • I'd be glad to, but I'm turning in for the night now, and won't have time to focus on it until Monday. bd2412 T 04:21, 7 April 2017 (UTC)
  • I think you should consider carefully whether there should be entries in cases where the number is a synonym for an NSOP entry, such as 1000. Siuenti (talk) 08:37, 7 April 2017 (UTC)
    Sorry, I lost track of this task. Where do we want this policy to live? WT:CFI? bd2412 T 20:25, 12 May 2017 (UTC)

Norwegian languages

I guess this has been asked umpteen times before, but anyway: if a word is the same (e.g. "nemnd") in Norwegian Bokmål and Norwegian Nynorsk, does one make two entries, one for each language, or only one for "Norwegian"? --Hekaheka (talk) 13:23, 7 April 2017 (UTC)

Thank you. What are Norwegian (no) entries for? --Hekaheka (talk) 13:34, 7 April 2017 (UTC)
They're older entries which haven't been separated yet, which is something I will get back to. They're well and truly in the minority now. DonnanZ (talk) 13:38, 7 April 2017 (UTC)
In some cases they may stay where they are, especially proper nouns, surnames and the like which apply to both languages. DonnanZ (talk) 13:42, 7 April 2017 (UTC)
@Hekaheka: Here's Wiktionary:Votes/pl-2014-03/Unified Norwegian, ending at 50:50. Thereafter, Donnanz started to convert everything to two entries, and no one stopped them. I think that, eventually, the English Wiktionary will contain enough statistical evidence in the form of these Norwegian entries to make it very clear that these are not best thought of as two languages. --Dan Polansky (talk) 20:30, 12 May 2017 (UTC)

Proposal: Create entries for all unattestable Unicode symbols but without "real" definitions

Context:

Proposal:

  1. Create entries for all Unicode codepoints that are assigned to something, attestable or not: characters, symbols, emojis, diacritical marks, "box drawing" stuff, etc. (except control characters, I guess)
    • Example: ¤ (Talk:¤) failed RFV and doesn't exist. We could recreate it anyway.
  2. For unattestable symbols, instead of a "real" sense, the sense should be a comment like This symbol is not attested.
  3. Keep symbol redirects when multiple codepoints are the "same" symbol in some way.
    • Example 1: We can keep redirecting fancy exclamation points to ! (the normal exclamation point).
    • Example 2: ⦰ is the reverse empty symbol, which failed RFV (Talk:⦰), but today I redirected it to the normal empty symbol.
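
For a rough idea of which codepoints point 1 would cover, here is a Python sketch using the standard `unicodedata` module. Treating unassigned (Cn) and control (Cc) codepoints as excluded follows the proposal; also excluding surrogates (Cs) and private-use codepoints (Co) is an assumption of this sketch, and the results depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# General categories left out: unassigned, control, surrogate, private use.
# Cn and Cc follow the proposal; Cs and Co are an assumption of this sketch.
EXCLUDED_CATEGORIES = {"Cn", "Cc", "Cs", "Co"}


def is_candidate(codepoint):
    """True if the codepoint is assigned to a character the proposal would cover."""
    return unicodedata.category(chr(codepoint)) not in EXCLUDED_CATEGORIES


def candidates_in(start, end):
    """List the candidate characters in the inclusive codepoint range."""
    return [chr(cp) for cp in range(start, end + 1) if is_candidate(cp)]


# The Box Drawing block (U+2500 to U+257F) as an example range.
box_drawing = candidates_in(0x2500, 0x257F)
print(len(box_drawing), "".join(box_drawing[:8]))
```

A list like this only says which codepoints are assigned; whether a given symbol is attested would still have to be decided entry by entry.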

Notes:

  • This idea of not using "real" senses was inspired by {{translation only}}, used in entries like older sister and three days ago.
  • We can use the "Description" section to say what is the shape of the symbol, instead of using senses for that.

Rationale:

  • About the "not real" senses:
    1. If a symbol is not used enough by humans, maybe we can't say it "means" anything in a descriptive dictionary. The symbol ¤ is the "currency sign" according to Unicode (a prescriptive authority) but it does not seem to be used in three durably-archived sources, so in a sense it does not "mean" anything.
  • About the inclusion of all symbols:
    1. I don't have any actual numbers, but I wonder if many people are interested in searching for random symbols and emoji. At least, there are some websites dedicated to doing these searches, and Wiktionary would join them. We do have complete Unicode appendices, but they are "buried" in the appendix namespace. They may be hard to find unless someone already knows where to look. Having entries for all symbols would make them noticeably easier to find.
    2. Entries can have content that can't fit in the Unicode appendices, such as non-durable citations and references from the internet, at most two citations from durably-archived sources (which is not enough for attestation), cross-links/tables/"see also", multiple images, and categories.
    3. This should stop the current practice of creating unattestable symbol entries where the sense is the shape of the symbol. Currently, 🌵 is defined as "cactus", but this does not seem citable. This exact sense would be attested by three quotations like: "I was in the desert and I saw a 🌵!" Otherwise, we can look for other senses of 🌵, or it would probably just fail RFV.

Entry examples:

(Disclaimer: I don't know if all the descriptions are good. The descriptions can be changed in the future.)

Exhibit 1: cactus (🌵)

{{character info/new}}
==Translingual==

===Description===
A [[cactus]].

===Symbol===
{{mul-symbol}}

# ''This symbol is not attested.''

Exhibit 2: currency sign (¤) (failed RFV: Talk:¤)

{{character info/new}}
==Translingual==

===Description===
A small [[circle]] in the middle of a small [[X]] shape. (?)

===Symbol===
{{mul-symbol}}

# ''This symbol is not attested.''

Exhibit 3: a ribbon arrow () (failed RFV: Talk:⮴)

{{character info/new}}
==Translingual==

===Description===
A ribbon-shaped arrow folded in 90 degrees, coming from the right side and pointing upwards.

===Symbol===
{{mul-symbol}}

# ''This symbol is not attested.''

Exhibit 4: fuel pump ()

{{character info/new}}
==Translingual==

===Description===
A [[fuel pump]].

===Symbol===
{{mul-symbol}}

# a [[gas station]]

Exhibit 5: skull and crossbones ()

{{character info/new}}
==Translingual==

===Description===
A [[skull]] and [[crossbone]]s.

===Symbol===
{{mul-symbol}}

# [[death]]
# [[poison]]
# [[piracy]]

Exhibit 6: light bulb (💡)

{{character info/new}}
==Translingual==

===Description===
A [[light bulb]].

===Symbol===
{{mul-symbol}}

# an [[idea]]

Exhibit 7: heart exclamation mark ()

#REDIRECT [[!]]

{{R character variation}}

--Daniel Carrero (talk) 17:10, 7 April 2017 (UTC)
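
The exhibits above follow one pattern: a symbol always gets a Description, and its Symbol section lists only senses that meet the three-citation threshold, falling back to the "not attested" line otherwise. A minimal Python sketch of that pattern follows. The names `Sense`, `attested`, and `build_entry` are hypothetical, and the citation count is a plain integer that ignores the independence requirement; this is an illustration of the proposal, not an existing Wiktionary tool.

```python
from dataclasses import dataclass
from typing import List

# Threshold discussed in the thread: three durably-archived sources.
MIN_CITATIONS = 3


@dataclass
class Sense:
    gloss: str                  # wikitext of the definition line, e.g. "a [[gas station]]"
    durable_citations: int = 0  # simplification: independence of sources is not modelled


def attested(sense: Sense) -> bool:
    """A sense counts as attested once it reaches the citation threshold."""
    return sense.durable_citations >= MIN_CITATIONS


def build_entry(description: str, senses: List[Sense]) -> str:
    """Assemble entry wikitext in the layout used by the exhibits."""
    lines = [
        "{{character info/new}}",
        "==Translingual==",
        "",
        "===Description===",
        description,
        "",
        "===Symbol===",
        "{{mul-symbol}}",
        "",
    ]
    kept = [s for s in senses if attested(s)]
    if kept:
        lines += [f"# {s.gloss}" for s in kept]
    else:
        # Proposed placeholder when no sense can be cited.
        lines.append("# ''This symbol is not attested.''")
    return "\n".join(lines)
```

Under these assumptions, `build_entry("A [[cactus]].", [Sense("[[cactus]]", 0)])` reproduces Exhibit 1, while a fuel pump sense with three citations yields the "gas station" definition line of Exhibit 4.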

If we were mass creating these entries, how would we determine if they were attested or not? False negatives in other words. DTLHS (talk) 17:27, 7 April 2017 (UTC)
I would be okay with just allowing their manual creation as opposed to necessarily mass creating these entries. Maybe there are some groups of entries that can be safely mass-created, like Appendix:Unicode/Box Drawing and Appendix:Unicode/Block Elements. --Daniel Carrero (talk) 17:46, 7 April 2017 (UTC)
I certainly don't agree with creating pages whose definitional content is just "this isn't attested". And remember that there's some real junk in Unicode for compatibility reasons etc. Equinox 17:42, 7 April 2017 (UTC)
The NISOPs like older sister are defined as this, which looks very similar to "Entry not attested": "Used other than as an idiom: see older,‎ sister. (This entry is here for translation purposes only.)"
Do you have some examples of Unicode compatibility junk you think we shouldn't have entries for? --Daniel Carrero (talk) 17:49, 7 April 2017 (UTC)
The difference is that "older sister" is attested. --WikiTiki89 18:14, 7 April 2017 (UTC)
Yes, but that's beside the point as stated by Equinox above. He mentioned the definitional content, but both "older sister" and the proposed "¤" have basically the same definitional content. --Daniel Carrero (talk) 18:18, 7 April 2017 (UTC)
Still different. The definition in your proposal is basically saying "I do not exist". With "older sister", it's just saying "nothing needs to be said because the definition is obvious". --WikiTiki89 19:05, 7 April 2017 (UTC)
I oppose entries for unattested symbols. --WikiTiki89 18:09, 7 April 2017 (UTC)
I also oppose. - TheDaveRoss 18:33, 7 April 2017 (UTC)
What about just hard redirecting all unattestable symbols to the relevant page in the Unicode Appendix? Andrew Sheedy (talk) 00:05, 8 April 2017 (UTC)
Personally, I don't like that idea very much. If I search for 🍨 (ice cream symbol, which may be unattestable), I would like to see info about it, not about a whole list of food entries, which might be confusing. It's not even clear why the reader was redirected in the first place -- they might not know what Unicode is. More importantly: if the person does not have the right fonts to see the symbol, it would probably reduce significantly their ability to find the right symbol in the list. --Daniel Carrero (talk) 21:23, 16 April 2017 (UTC)
The main reason for not doing this is that it really provides no useful information at all. The user clicks on the link, to be told that they wasted their time. There's already far too much filler on the web as it is: it's easy to compile and present, and people like to think they're being comprehensive. I, for one, find it annoying to look something up and find nothing but the obvious and self-evident. I don't want to be told that there are no elephants native to my zip code, that a green leaf is a leaf that's green in color, or that a small circle inside of an x shape can be described as a small circle inside of an x shape. If the user's reaction is going to be "well, duh...", we're better off not having the entry. Chuck Entz (talk) 00:34, 8 April 2017 (UTC)
As I said below, it's alright if people don't want to do it as proposed above, and your arguments make sense to me. Still, I'd like at least to add {{no entry}} in all the unattestable entries for symbols. It shows that we don't have them because they don't actually exist, as opposed to just having an "incomplete" coverage of symbols. We might want to delete the hundreds of unattestable symbols that we already have, but people will probably keep creating them. --Daniel Carrero (talk) 02:15, 10 April 2017 (UTC)
About this: "a small circle inside of an x shape can be described as a small circle inside of an x shape". Not all people have fonts for all characters, and some characters like 🏦 have major rendering variations. The symbol ¤ can also have usage notes like "Unicode describes it as the 'currency sign' but it has not found widespread use." Plus, the entry for ¤ can have information about the same character in different encodings, and the etymology and derived terms when applicable. --Daniel Carrero (talk) 01:23, 11 April 2017 (UTC)
Does "Unicode characters" include weird hanzi Unicode encoded because they have bizarre standards for inclusion? —suzukaze (tc) 01:23, 8 April 2017 (UTC)
Yes, the proposal above would include those, too. FWIW, this hanzi is already a "no entry": 𪜁. I would support adding the "no entry" in all the weird non-existing hanzi. It shows that we don't have them because they don't actually exist, as opposed to just having an "incomplete" coverage of hanzi. But, if people don't want the "no entry" in all weird hanzi, then I'd suggest deleting that one, too. --Daniel Carrero (talk) 23:55, 9 April 2017 (UTC)
@Daniel Carrero: These ghost kanji may be disanalogous; see Category talk:Ghost kanji. — I.S.M.E.T.A. 23:35, 13 April 2017 (UTC)
It's OK if people don't want to create the entries for all the symbols without "real" definitions as I had proposed above. @Chuck Entz's reasoning is one that makes sense to me, too.
Aside from that idea, what about the symbols that we already have but don't seem to be actually used in three durably-archived sources? I assume it would be OK to just delete most of them outright? Maybe opening RFV discussions for all these symbols is not feasible, because there are too many.
I mentioned the cactus (🌵) above, which may not be attestable and could be deleted. There are also 283 entries (+1 appendix) in Category:Miscellaneous Symbols and Pictographs block, and 169 entries (+1 appendix) in Category:Miscellaneous Symbols block. (A few of these symbols already have citations and don't need to be deleted, I believe.) --Daniel Carrero (talk) 23:49, 9 April 2017 (UTC)

FWIW, I support Daniel Carrero’s proposal. At the very least, an entry’s Description section, transclusion of {{character info/new}}, and any image that may be included in a given entry serves to elucidate incomprehensible mojibake. — I.S.M.E.T.A. 23:43, 13 April 2017 (UTC)

Concerning my proposal above, so far we have:
  • 2 support "votes", counting myself
  • 3 oppose "votes", I believe
  • I'm not counting @Andrew Sheedy and @suzukaze-c in this support/oppose list
Naturally, I'd prefer doing as I proposed before (having all these symbol entries without "real" definitions, or with {{no entry}}).
But let's assume that proposal fails. (Implementing my idea would require a 2/3 majority in an actual vote, it goes without saying.) I guess the natural course of action would be doing the opposite, and deleting all the unattestable symbols. Some people here in this discussion seem to support deleting all those entries. We seem to have at least a few hundred entries for symbols likely to be unattestable: (these are all bluelinks at the moment) 🍨, 🍩, 🍪, 🍫, 🍬, 🍭, 🍱, 🍶, 🍼, 🎀, 🎂, 🎃, 🎄, 🎅, 🎆, 🎑, 🎒, etc.
I won't create a few hundred RFVs because it would flood WT:RFV (well, maybe I could start with a few RFVs, just not all the symbols at once!), but in theory we could create those RFVs or just delete the symbols outright. This would be consistent with our normal practices.
I dislike having all these unattestable symbol entries with the Unicode name used as the definition, and I know I'm not the only person with that opinion. Sure, 🍱 means "bento box" but it's just because Unicode says so, not necessarily because there's a symbol used like that in real life. It also apparently sets an uncomfortable precedent -- it may look like we could also create (these are all redlinks at the moment) 🍲, 🍰, 🍯, 🍮, 🍧, 🍥, 🍤, 🍣, 🍢, 🍡, etc. using the same kind of definitions.
Deleting or attesting all these symbols would fix that problem, or creating all the symbols without "real" senses. But I know it's not up to me. --Daniel Carrero (talk) 21:16, 16 April 2017 (UTC)

Separate proposal

I do support adding entries for symbols even though not attested, but make actual definitions for them instead of just saying "not attested". The rationale is that, though symbols such as phone emojis are usually not used in durably archived sources, they are EXTREMELY commonly used in casual texting conversation. 😊 For example, this emoji is more common than I can even say, yet we don't even have an entry for it. On a side note, some genius in this world should come up with a way to durably archive some public chat room text so Wiktionary can have more attested words. PseudoSkull (talk) 21:50, 16 April 2017 (UTC)

I created WT:RFV#😊 for that symbol, let's see if we can find some citations for it.
Thanks for supporting the idea of adding entries for unattested symbols. But let's talk a bit more about your separate proposal. Let's assume this entry is unattestable: 🍨 (it's the ice cream symbol). How would you define it? Sure, we could just say "ice cream" in the sense, but if it's not used like that in real life, then we are not being descriptive anymore, we are being prescriptive. Besides, we already use the "Description" section to say what the symbol looks like.
Compare the symbol . It's the "fuel pump" in Unicode, but it's not a symbol meaning fuel pump. (we didn't find any citations where that symbol means "fuel pump") It looks like a fuel pump, but it means "gas station". If you see that symbol somewhere, it's probably indicating a gas station. --Daniel Carrero (talk) 22:08, 16 April 2017 (UTC)
@Daniel Carrero: It would be described based on how it's really used, i.e. with a non-gloss definition: "Used to express the need to use or state of using a fuel pump." "Used to express the desire for or presence of ice cream." PseudoSkull (talk) 22:42, 16 April 2017 (UTC)
For some support, the Japanese and Swedish Wiktionaries actually have entries for these symbols. PseudoSkull (talk) 22:44, 16 April 2017 (UTC)
Unfortunately, I'm not a big fan of these definitions that you wrote. "Used to express the desire for or presence of ice cream." seems to be exactly a prescriptive definition. How can you tell that symbol means exactly that, apart from the Unicode codepoint name? Are people using the ice cream symbol in three independent, durably-archived sources? Besides all that, the first word in that definition is "Used", which is false if people are not actually using the symbol.
I'm not greatly aware of other Wiktionaries' practices and policies, but I've seen a few symbols in other Wiktionaries before. It's interesting that the Japanese and Swedish Wiktionaries have entries for these symbols, but why did they create these entries in the first place? Do they have some policy concerning symbols? We have a number of these entries too, so maybe they just copied us? I don't know their reasons, and I don't know if these Wiktionaries are striving to be descriptive rather than prescriptive. But if they created these symbols just because they exist in Unicode, and used the Unicode names as normal definitions, then those definitions are prescriptive. --Daniel Carrero (talk) 23:14, 16 April 2017 (UTC)
@Daniel Carrero, PseudoSkull: Would it be OK to write descriptive definitions for these emoji based upon non–durably-archived sources? — I.S.M.E.T.A. 20:24, 17 April 2017 (UTC)
I would support implementing new rules to allow getting citations from some (not all) places on the internet, for the purpose of attesting words and symbols alike.
For details, read this long discussion: Wiktionary:Beer parlour/2016/October#Why we don't need durable citations (especially this subsection of the same discussion: Wiktionary:Beer parlour/2016/October#Proposed CFI change).
This is not how CFI currently works. But one thing I consider important if we do this is the ability to verify and disallow these citations when they become inactive -- that is, according to this proposal, if you get three citations for a word or symbol from websites or from the Web Archive, and then these pages are deleted, then we will need to get three new citations or delete the sense. This is to ensure we can verify how the symbol is actually used. --Daniel Carrero (talk) 23:48, 17 April 2017 (UTC)
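
The rule described in the comment above (a sense stays only while three of its web citations can still be verified) can be sketched in Python. `WebCitation`, its `still_online` flag, and `missing_citations` are hypothetical names, and how a page is checked for availability is left out; this is only a rough illustration of the proposal, not existing policy or tooling.

```python
from dataclasses import dataclass
from typing import List

REQUIRED_LIVE = 3  # three verifiable citations, per the proposal above


@dataclass
class WebCitation:
    url: str
    still_online: bool  # hypothetical result of a manual or automated check


def missing_citations(citations: List[WebCitation]) -> int:
    """Return how many replacement citations the sense needs (0 means keep it)."""
    live = sum(1 for c in citations if c.still_online)
    return max(0, REQUIRED_LIVE - live)
```

A sense with three citations of which one page has been deleted would return 1: one new citation is needed, or the sense goes to deletion.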
@Daniel Carrero: Does the Web Archive not preserve things indefinitely? — I.S.M.E.T.A. 02:06, 18 April 2017 (UTC)
A few past discussions about using Web Archive to get citations:
Also, this vote (it ended with 7 support and 7 oppose, traditionally known as no consensus):
Apparently, owners of the domains can choose to delete stuff from the Web Archive (not to mention that domain owners often change), and that's why it's not reliable as a durably-archived source. --Daniel Carrero (talk) 02:25, 18 April 2017 (UTC)
@Daniel Carrero: Bugger. How about storing our own screenshots, then? I was thinking something like what Wikisource does to source some of its texts (for example, the scan at s:Page:Keats, poems published in 1820 (Robertson, 1909).djvu/141, used to source the beginning of Keats’ “Ode on a Grecian Urn”). — I.S.M.E.T.A. 23:46, 21 April 2017 (UTC)
What you said right now is one of the multiple ideas that were discussed in Wiktionary:Beer parlour/2016/October#Why we don't need durable citations.
Maybe that could work. If other people want to do it, I might want to support it too.
But one serious flaw of that idea is that these screenshots can be easily fabricated, so they don't serve as extremely reliable proof that the word was once used on the internet, especially if the original webpage gets deleted.
I support getting citations from the internet for words and symbols one way or another. If we are going to do it, the idea I like the most is still the one that I proposed in the discussion "Why we don't need durable citations" (accepting citations from the internet as long as they still exist on the internet). --Daniel Carrero (talk) 00:40, 22 April 2017 (UTC)
@Daniel Carrero: I am uneasy about the prospect of an increasing proportion of our content being perpetually contingent upon external sites' longevity. In what way can screenshots be easily fabricated? More easily than the pages of a PDF of an old book? — I.S.M.E.T.A. 12:20, 25 April 2017 (UTC)
Check your last message in this image. --Daniel Carrero (talk) 12:33, 25 April 2017 (UTC)
@Daniel Carrero: You added an extra space between that final full stop and the hair-space+em-dash of my signature, so that fabricated screenshot lacks strict verisimilitude! Still, I take your point. :-(  — I.S.M.E.T.A. 12:45, 25 April 2017 (UTC)
LOL. For the record, this is how I fabricated the screenshot: I clicked "Inspect element" on my Firefox and edited the HTML source of the page. Then I took the screenshot. --Daniel Carrero (talk) 12:50, 25 April 2017 (UTC)
@Daniel Carrero: I've seen that done with Twitter DMs, so it shouldn't've surprised me to see it done here. — I.S.M.E.T.A. 12:52, 25 April 2017 (UTC)

The last week of the 1st cycle of Wikimedia strategy conversation

Hi, I'm Szymon, a MetaWiki Strategy Coordinator. 3 weeks ago, we invited you to join a broad discussion about Wikimedia's future role in the world. The discussion is divided into 3 cycles, and the first one ends on April 15. So far, Wikimedians have mainly been discussing technological improvements, multilingual support, a friendly environment, and cooperation with other organizations and networks.

I'm pinging a few recently active admins. I hope you'll help me with passing along the news, maybe even join the discussion. @I'm so meta even this acronym, Daniel Carrero, Matthias Buchmeier, Metaknowledge.

Looking forward to your input. Thank you in advance! SGrabarczuk (WMF) (talk) 00:25, 8 April 2017 (UTC)

I (and probably most other active admins) did not fail to give feedback because we did not see the previous mass messages about this, but instead because we are largely uninterested and have little to add. I find the idea of a central movement strategy too vague to respond to; our "strategy" at Wiktionary is simply improving and expanding content, just as it always has been and will be. —Μετάknowledgediscuss/deeds 00:35, 8 April 2017 (UTC)
We made some suggestions on the French Wiktionary, and they were condensed and translated on Meta. Our ideas included focusing on minor and endangered languages via field projects (not only technical support but also workshops with communities); creating a portal, "What contribution means in open projects", with a description of what we do in each project and how we contribute; and a Wiktionary targeted at children, by children, with appropriate pictures and quotations. I am convinced that the Wiktionaries are more included in the global project when we give our opinions and ideas. But if you agree to let French Wiktionarians lead the movement, that's fine with me. Noé 12:10, 11 April 2017 (UTC)
@SGrabarczuk (WMF): My apologies: I've been insufficiently active to pay attention to this. Regardless, thank you for asking me. — I.S.M.E.T.A. 01:46, 14 April 2017 (UTC)

Edit

Rajasekhar1961 has asked me about the format of as a Telugu abbreviation. We used to use Abbreviation as a header for these things, but now that header is not allowed. As I understand it, the header has to be Noun, Adjective, Verb, and so on, and then {{abbreviation of}} is used in the definition line to link to ఉత్పలమాల and ఉత్తరము. Then it becomes more complicated. What does Rajasekhar1961 write in ఉత్పలమాల and ఉత్తరము to define the abbreviation , and how does he link it back to ? Or does he link it to at all? WT:EL is difficult enough for me to understand, and I think it would be much harder for Rajasekhar1961, whose English is en-2. —Stephen (Talk) 07:37, 8 April 2017 (UTC)

Since both ఉత్పలమాల and ఉత్తరము are nouns, I would give its POS as noun. I would list under Synonyms at each of the full entries, just as ave. is listed as a synonym of avenue and St. is listed as a synonym of Saint. —Aɴɢʀ (talk) 07:44, 8 April 2017 (UTC)

Checkbox for translation editor gadget

In the preferences under gadgets, there is one labelled "Disable the buttons that allow editing of translation tables". It's not clear, but I'm guessing that checking this will disable the translation editor. If that is the case, then I'd suggest changing it so that the editor is enabled when the box is checked, which is more intuitive. Of course it has to be checked by default then, rather than unchecked. Also, the label could be clearer too, what "buttons" is it referring to? It could just say "Enable the translation editor".

I just noticed that the button for the rhymes editor also has the sense inverted: the editor is enabled when the box is unchecked. This also ought to be reversed. —CodeCat 18:56, 8 April 2017 (UTC)

"Alternative forms" of given names and surnames

I find it EXTREMELY disturbing that we list entire names of people as "alternative forms" of other names. First of all, these are people's NAMES we're talking about here, things that they have lived being called all their lives. Imagine if you looked up your own name or surname here, just to see it listed as "(just an) alternative form of ______ (and it has no other importance than that it is just an alternative)". I feel this could be extremely offensive to some people.

Second of all, they're not actually "alternative forms". If someone's name is Jasmine and you accidentally spell their name like Jazmine on a formal document of some sort, that person will point it out to you and tell you to fix it, because that is not how you spell her name. It's not like if Jasmine goes to a different country with a different dialect, her name magically changes to "Jazmine". It's also not like if in different situations her name is spelled "Jazmine" rather than "Jasmine". No. Her name is ALWAYS Jasmine, unless she literally changes it in court or tells other people "Oh I don't like the original spelling of Jasmine, so just spell my name Jazmine please. That's my nickname." umm.... But the latter would never happen...

I propose to make rules here against listing names as alternative forms of other names. Same goes for surnames. They really aren't alternative forms. PseudoSkull (talk) 19:09, 9 April 2017 (UTC)

Additionally, I think that rather than listing similar names in "Alternative forms", we should put them in "Related terms" or "See also" instead. PseudoSkull (talk) 19:10, 9 April 2017 (UTC)
We're a dictionary; we're interested in the lexicographic properties of words, including names. It's not our job to avoid offending people. My own first name is a relatively rare alternative spelling of a fairly common first name, and if I look my first name up in a dictionary, that's exactly the information I expect to see. It doesn't offend me at all. —Aɴɢʀ (talk) 20:51, 9 April 2017 (UTC)
Try telling that to the Japanese who can't come to terms with shinjitai changing the shape of the kanji in their last names. —suzukaze (tc) 20:57, 9 April 2017 (UTC)
I don’t think it is that serious of an issue, but there is room for improvement; different forms/spellings of a name have different semantic implications from alternative forms/spellings of regular words (e.g. “I’ll analyse it” and “I’ll analyze it” mean the same thing, but “Jon will do it” implies a different person than “John will do it”).
My proposal is this: instead of treating variants of a name as completely different forms, we add parameters like variantof= and variantq= to {{given name}} such that {{given name|male|variantof=John|variantq=spelling}} displays its current text in templatised form. This would formalize the already common practice of using “variant of” instead of “alternative form/spelling”. — Ungoliant (falai) 21:11, 9 April 2017 (UTC)
I'm not sure if this will work. It's not really possible to point out which is a variant of another; John could equally be a variant of Jon. Our current definition of Jon doesn't make much sense to me, because it defines a single name twice, as if it's two different names when it's the same word, given as a name to people, in both cases. Really, etymology is what should give this kind of information. —CodeCat 22:48, 9 April 2017 (UTC)
Presumably the most common variant (or the first created when you can’t point to a single most common variant) acts as the hub, like we do with all other groups of alternative forms. — Ungoliant (falai) 22:53, 9 April 2017 (UTC)
True, but I agree with the OP that speaking of alternative forms with names is weird. They're not interchangeable, as has been pointed out. What criteria are there for treating them as alternatives of each other anyway? What makes them in any sense "the same"? —CodeCat 22:55, 9 April 2017 (UTC)
Becki is "the same as" Becky because it was directly derived from it by applying some trendy/tacky/whatever spelling rule. The names were not devised independently. How is this different from sulphur/sulfur? Equinox 23:02, 9 April 2017 (UTC)
Then the relationship is etymological. Is there any synchronic relationship? —CodeCat 23:33, 9 April 2017 (UTC)
I strongly oppose changing things because someone might be offended. Just document facts. Equinox 22:27, 9 April 2017 (UTC)
Oppose. People can be potentially offended by just about anything, and our goal is not to offend as few people as possible, it's to document language. It might be useful for someone to know that a certain spelling of a name isn't the traditional/most common/standard one. Andrew Sheedy (talk) 22:46, 9 April 2017 (UTC)
I deleted Support and Oppose because people have introduced new ideas of how to better deal with etymologically similar names. I think the discussion should continue before starting a real vote. PseudoSkull (talk) 23:11, 9 April 2017 (UTC)
As CodeCat and others have said, these are not "alternative forms", as 'Becki' refers to a different person than 'Becky'. One being derived from another is ===Etymology===. "Jon" and "John" is a particularly informative example for someone to have brought up, above, since neither one is even etymologically derived from the other! - -sche (discuss) 00:56, 10 April 2017 (UTC)
Hmm. My ex Rebecca originally used Becky, then changed to Becki to be unique in a class with several "Beckies", then switched back to Becky to get a job and not sound like a stripper. Equinox 01:02, 10 April 2017 (UTC)
Sometimes Becki and Becky refer to the same person. Sometimes Becky and Becky refer to different people. If we were to split them up by who they refer to, we'd have an entry for every single person on earth. So I agree with Andrew Sheedy and Equinox. But I'll also mention that usually Jon and John are not alternative forms of each other, but rather Jon is usually an alternative form of Jonathan. --WikiTiki89 13:45, 10 April 2017 (UTC)
"Sometimes Becki and Becky refer to the same person." Not really. Equinox's example is just some one person's personal action, which I don't even see as a common action. As for diminutives, that's also personal preference (but still in the entry it should be mentioned somewhere that that name is a diminutive of another name), but they do tend to refer to the same person as the full form. For instance, I insist that people do not call me "Maddy" as people will often jump to do, but instead to call me by my real name, Madison. And also, if I did want to be called Maddy, and someone spelled it "Maddie" when referring to me, I'd correct them. And I certainly wouldn't keep changing it from "Maddy" to "Maddie" and vice versa. Also, need I mention that there is a similar name "Mattison" which would refer to a completely different person? PseudoSkull (talk) 17:14, 10 April 2017 (UTC)
Yes, and? My point was that it doesn't matter if it refers to the same person or not. Alternative form doesn't mean interchangeable. Maddie is an alternative form of Maddy, both of which are diminutives of Madison. That doesn't mean that anyone named Madison can be called Maddy and Maddie, but that it is common for people named Madison to be called Maddy or Maddie. --WikiTiki89 17:23, 10 April 2017 (UTC)
I also knew someone who was called Daniel by some people and Dan by others (without any explicit change from one to the other; they were used simultaneously). Equinox 21:46, 13 April 2017 (UTC)
That's actually very common. --WikiTiki89 21:59, 13 April 2017 (UTC)
But that's a nickname. We're talking about things like Jazmine being an "alternative form of Jasmine" when it's not. You either call Jasmine Jasmine or you call Jazmine Jazmine. They're two different people in all but VERY few cases of personal preference. They're not alternative forms. Rather, it should say in the etymology that it is a variant rather than in the definition. The definition should say "This is a female given name." Rather than "This is an alternative form." PseudoSkull (talk) 21:42, 16 April 2017 (UTC)
It's true that calling a person named Traci "Tracy" wouldn't be an example of an alternative form, but that isn't the only way usage varies for names. If the Smiths name their daughter "Traci" and the Joneses name their daughter "Tracy", they're using different forms of the same name. The fact that "Traci" will always be "Traci" and "Tracy" will always be "Tracy" doesn't change that. Getting rid of "alternative forms" means we miss out on obvious relationships between names like Geoffrey/Jeffrey/Jefferey, or Gillian/Jillian/Jill. Besides, your logic would require that we get rid of translations for names, too, since William isn't Guillermo, and Jacques isn't Jack or Jacob. Chuck Entz (talk) 02:14, 17 April 2017 (UTC)
What is the practical alternative to listing some names as variants of others? — I.S.M.E.T.A. 23:47, 13 April 2017 (UTC)
@I'm so meta even this acronym: What Chuck Entz said directly above, in addition to the following anecdotal evidence that indicates that people don't see different variants of a name as different names: I was at an event a few weeks ago where there were two guys named Andrés, one of whom was called Andrew by his brother. A couple people remarked how there were three Andrews, even though we didn't technically have identical names. I've also heard exchanges like "What's your name?" "Kathryn" "Do you spell that the normal way, or some weird way?" If "Kathryn" and "Catherine" were truly different names, then one wouldn't hear exchanges like that, which indicate that most people think of them as different versions of the same name. It's also worth noting that if Jesse meets Jessie, they might say "Hey, we have the same name" even though they spell it differently. Andrew Sheedy (talk) 01:18, 18 April 2017 (UTC)
@Andrew Sheedy: Yes, I agree with you and Chuck. I was just wondering whether it's in any way feasible not to treat some names as variants of others. — I.S.M.E.T.A. 02:05, 18 April 2017 (UTC)
Whoops, my brain must have substituted some word in the place of "alternative"... Evidently I need to read more carefully. Andrew Sheedy (talk) 02:14, 18 April 2017 (UTC)

Wordset

Two years later (see older BP post: Wiktionary:Beer_parlour/2015/March#Wordset) they have now officially closed shop, due to “lack of interest”. A bit sad, but the good news is that there's now a data dump available on github, licensed under CC BY-SA 4.0. I'm especially interested in their example sentences, which they tried to make gender neutral (something I think we should also adopt). Are there any objections against (semi-)automatically importing some of this data? – Jberkel (talk) 22:59, 9 April 2017 (UTC)

I'm actually somewhat unsure of the role of usexes here in general, particularly for English, where actual citations are pretty much always better. —Μετάknowledgediscuss/deeds 23:13, 9 April 2017 (UTC)
I'd say they serve two different purposes. Citations are useful as a reference but often a bit long-winded, embellished or written in dated/archaic language or spelling (esp. when citing older works). Usexes in contrast are simple, to the point, written in contemporary language and therefore easy to understand (think of non-native readers). Citations often don't make sense unless a lot of additional context is added. A usex stands on its own, or should be constructed in such a way. It can be scanned quickly and also helps to clarify the sense. – Jberkel (talk) 22:07, 10 April 2017 (UTC)
That would be my assessment of them as well. The French Wiktionary uses many of each to illustrate each sense, and as a non-native speaker, I tend to find the usexes more useful since they are constructed for the specific purpose of illustrating the definition. We would be well off to add far more of each, IMO. Andrew Sheedy (talk) 03:09, 12 April 2017 (UTC)
Even if the English Wiktionary already has enough examples, other Wiktionaries lack them for their English entries. On the French Wiktionary, at least, we'd be glad to have some of those! Noé 11:34, 12 April 2017 (UTC)
IMO we often need more usexes rather than fewer, especially in L3 sections with more than one definition, and most especially for grammaticized words. I am not at all confident that usexes from a dictionary that did not have exactly our set of definitions would provide useful usexes without a great deal of manual effort to match the usexes with our definitions. Of course, we can have the same problem with citations for such L3 sections. DCDuring (talk) 12:13, 12 April 2017 (UTC)

Removing inactive editors from Category:User coders, Category:User languages, and Category:User scripts

The bottom-level categories within Category:User coders, Category:User languages, and Category:User scripts are populated in no small part by the user pages of very many users who are now inactive or who have never been active. The first sentence of Wiktionary:Babel reads: “User language templates aid multilingual communication by making it easier to contact someone who speaks a certain language.” Since it would be pointless to contact someone who, it might be assumed, would never read a message of contact, having those inactive users in those user-proficiency categories undermines those categories’ purpose. Accordingly, I propose that a bot be run, tasked with adding a |nocat=1 or |inactive=yes parameter to the {{Babel}} transclusion of every user who has not edited this project within the preceding year (past 365–366 days) of a given bot run. This parameter may function either to remove a user from his user-proficiency categories or to move that user to different categories (marked “inactive” in some way).
Does that seem desirable to everyone? Is the task automatable, as I’d hoped? Does anyone have a better idea? Pinging KIeio and Awesomemeeos because of their contribution to the relevant discussion at User talk:KIeio#Babel. — I.S.M.E.T.A. 23:58, 9 April 2017 (UTC)

@I'm so meta even this acronym I think that's a good idea! I like what bots can do. Let's do this! — AWESOME meeos * ([nʲɪ‿bʲɪ.spɐˈko.ɪtʲ]) 00:07, 10 April 2017 (UTC)
I think one year is too short a period. Otherwise yes. Equinox 00:10, 10 April 2017 (UTC)
As in the earlier discussion, I support the idea, probably something like 1-2 years inactivity as a threshold. — Kleio (t · c) 22:46, 10 April 2017 (UTC)
support Crom daba (talk) 23:36, 10 April 2017 (UTC)
This is a good idea. In some respects, 1 year does seem a bit short, but then again, when I go through the categories I usually ping only people who've been active in the last month or two! So I guess I am OK with a limit of 1 year, especially if the same bot that checks if users have been inactive and need to be removed from the category (nocat=1-ified) will also check if users have become active and need to be restored to the category. 2 years (mentioned above) would also be acceptable to me; it would leave a lot of inactive users but still winnow the categories somewhat. - -sche (discuss) 23:39, 10 April 2017 (UTC)
@Awesomemeeos, Crom daba, Equinox, KIeio, -sche: OK; how about two years? Re “check[ing] if users have become active and need to be restored to the category”, yes, I would propose that that be part of the bot’s duties (although, couldn’t a newly “reactivated” user just remove the |nocat= or |inactive= parameter from the {{Babel}} transclusion?). — I.S.M.E.T.A. 01:43, 14 April 2017 (UTC)
@Awesomemeeos, Crom daba, Equinox, KIeio, -sche: Shall I write the vote? — I.S.M.E.T.A. 23:58, 15 April 2017 (UTC)
Yes, of course — AWESOME meeos * ([nʲɪ‿bʲɪ.spɐˈko.ɪtʲ]) 00:04, 16 April 2017 (UTC)
Yeah. I think two years is reasonable. Equinox 00:44, 16 April 2017 (UTC)
@Awesomemeeos, Crom daba, Equinox, KIeio, -sche: I've created the vote; see Wiktionary:Votes/pl-2017-04/Removing inactive editors from user-proficiency categories. — I.S.M.E.T.A. 01:14, 16 April 2017 (UTC)
(By the way, I don't think it's a better idea, but if there are any technical barriers to the parameter, then a different idea might just be to have the bot comment out the template.) Equinox 01:24, 16 April 2017 (UTC)
@Equinox: Yes, that would work; however, I went for the option that was minimally disruptive whilst still salvaging the value of the user-proficiency categories. — I.S.M.E.T.A. 01:29, 16 April 2017 (UTC)
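On whether the task is automatable: the following is a minimal Python sketch of the two decisions such a bot would make, namely whether a user counts as inactive and how the parameter is added to or removed from a {{Babel}} call. The two-year threshold follows the figure settled on above; the function names, the choice of |inactive=yes over |nocat=1, and the assumption that the transclusion contains no nested templates are all illustrative. It is not an existing bot, and fetching each user's last edit date is left out entirely.

```python
import re
from datetime import datetime, timedelta

# Hypothetical threshold: two years, the figure agreed on in the thread.
THRESHOLD = timedelta(days=2 * 365)

# Matches a simple {{Babel|...}} call. Assumption: no nested templates inside it.
BABEL_RE = re.compile(r"\{\{\s*Babel\s*((?:\|[^{}|]*)*)\}\}")


def is_inactive(last_edit, now, threshold=THRESHOLD):
    """Return True if the user's last edit is older than the threshold."""
    return now - last_edit > threshold


def set_inactive_flag(wikitext, inactive):
    """Add or remove |inactive=yes on every simple {{Babel}} transclusion."""
    def fix(match):
        # Drop any existing flag so that repeated bot runs stay idempotent.
        params = [p for p in match.group(1).split("|")[1:]
                  if p.strip() != "inactive=yes"]
        if inactive:
            params.append("inactive=yes")
        return "{{Babel" + "".join("|" + p for p in params) + "}}"
    return BABEL_RE.sub(fix, wikitext)
```

Calling set_inactive_flag(text, False) on a later run restores a returning user to the categories, which covers the "reactivation" duty mentioned above.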

Category:Braj language

Shouldn't this be merged into Hindi? —Aryamanarora (मुझसे बात करो) 01:00, 10 April 2017 (UTC)

Yes, I think it probably should. For future reference, this kind of thing usually goes at WT:RFM. @-sche Μετάknowledgediscuss/deeds 02:34, 10 April 2017 (UTC)

Politics

Am a bit concerned about Special:Contributions/Romanophile, with a definite agenda to take every man phrase and create a woman one. It's like when PaM started taking every church phrase and creating a mosque version (many were deleted). I don't feel great about politics (even if they are nice "progressive" politics) dictating what entries we have. In particular I feel we need to gloss unusual things like "oh my goddess!" and "woman the lifeboats!" as non-standard, simply to help anyone trying to learn English, rather than creating them without comment. Am I being reasonable? Equinox 02:16, 10 April 2017 (UTC)

We really need to have a written-down policy for usage examples, so that when we revert edits to usage examples it won’t look like we’re targeting their politics. — Ungoliant (falai) 02:18, 10 April 2017 (UTC)
See also [3], which isn't necessarily a bad edit but did raise my eyebrows. Certainly we don't have to use women/pretty, men/athletic in a usex, and it's better to avoid those stereotypes, but changing "men and women" to "women and men" is breaking idiom. Equinox 02:45, 10 April 2017 (UTC)
Actually, "women and men" is fine, I believe. It is citable. --Daniel Carrero (talk) 03:42, 10 April 2017 (UTC)
It's citable but it's not the traditional ordering of the two words. —suzukaze (tc) 03:44, 10 April 2017 (UTC)
To me, this looks like a reason to use "women and men" more — because it's not been used often enough, in comparison with the opposite order of words. To shake things up, and have some variation. --Daniel Carrero (talk) 06:27, 10 April 2017 (UTC)
Our goal is for our usexes to be as natural as possible, not to write what we think people should say, or with the intent of "shaking things up." Andrew Sheedy (talk) 06:49, 10 April 2017 (UTC)
In this case, "women and men" fits because it says right after "but more frequently by women". However, in most cases "men and women" is more natural. See here and here for a percentagewise comparison. If you want things to change, go change them in the real world and eventually the dictionaries will reflect that. But it's not the job of a dictionary to promote any sort of social change, nor to oppose it. This is what being descriptive rather than prescriptive is all about. --WikiTiki89 14:22, 10 April 2017 (UTC)
One could argue (and many do) that there is no way to abstain from that debate, unless one refrains from usage examples which have any gender specificity. If you choose to write "men and women" you are reinforcing historical norms, if you choose "women and men" you are subverting them. Even the androgynous "people" can be considered "progressive." Regarding the question of Romanophile's changes, I am ambivalent to the ones I have read, if they wish to change the ordering of genders in a usex that is no skin off my nose. If there are changes like "firemen" to "firehumans" then I would take issue with that. - TheDaveRoss 14:49, 10 April 2017 (UTC)
@TheDaveRoss: I’d say that the entries that I added were slightly inaccurate at worst. In which case, they could be redefined in a way to make them hyponyms. If the definitions seem inadequate it’s probably because I felt overconfident from the results that I skimmed on Books, not because I wanted to enforce a ‘political agenda’ as the OP’s laughworthy and overly worried conclusion suggests. — (((Romanophile))) (contributions) 16:54, 10 April 2017 (UTC)
Look at the numbers I linked to in my previous post. "Men and women" is historically 99% and since the late 1960s has dropped to about 85%. We don't have to "reinforce historical norms", we just have to follow the current norms. And 85% is still a strong norm, especially if you assume that some of those 15% are context-specific, which shows that the hard rule that "men and women" is the only correct order has disappeared, but it is still the unmarked order. --WikiTiki89 14:55, 10 April 2017 (UTC)
Well, an 85% norm would also suggest that we should use exclusively heterosexual couples in example sentences, and probably avoid any interracial couples. If 20% is a strong norm as well then we shouldn't refer to African-Americans with college degrees. That is neither here nor there, since "strength" of the norm has nothing to do with my point. - TheDaveRoss 16:26, 10 April 2017 (UTC)
I would add an image of a gay couple at couple. It already has quotes, but it lacks an image at the moment. --Daniel Carrero (talk) 21:12, 10 April 2017 (UTC)
Do we also get to change ladies and gentlemen to gentlemen and ladies in order to shake things up? No, it’s only changing things in a way that looks like (to a Tumblrite maybe) it “benefits women” that is politically correct (but if anyone ever finds a woman who actually gains anything from us changing “men and women” to “women and men”, let me know). — Ungoliant (falai) 15:23, 10 April 2017 (UTC)
I don't see any suggestions that we should enforce changes to a "progressive" style, the question at the top was about whether it was allowable to use a less-common construction in usage examples. - TheDaveRoss 16:26, 10 April 2017 (UTC)
ladies and gentlemen is an idiom; men and women is not, otherwise we would have that entry. (Should we have that entry?) Either order ("men and women" or "women and men") is fine and natural. --Daniel Carrero (talk) 21:11, 10 April 2017 (UTC)
Both ladies and gentlemen and men and women obey Behaghel's law of increasing terms: the shorter term (the one with fewer syllables) comes before the longer term. This is why we say salt and pepper, not *pepper and salt, and why English speakers say bow and arrow but German speakers say Pfeil und Bogen (lit. "arrow and bow"). —Aɴɢʀ (talk) 11:44, 11 April 2017 (UTC)
I had never seen those before, but I like that the second "law" is effectively "bury the lede." - TheDaveRoss 11:52, 11 April 2017 (UTC)
I did not know about Behaghel's law of increasing terms, and I found it interesting. But correct me if I'm wrong: as an important exception to that rule, straight couples are usually mentioned with the man first, then the woman, regardless of the number of syllables, like: "Anderson and Mary". If that's true, it does sound like a sexist thing to me, even if it's a culturally accepted thing. --Daniel Carrero (talk) 14:28, 30 April 2017 (UTC)
  • My two cents: Changing a shorter phrase like she is pretty (adjective) to a longer phrase like she is the best athlete (multi‐word predicate) seems petty and counterproductive to me. Usage examples (and quotations) should be as short and simple as possible. (As a personal rule, I never truncate a full sentence, though.) Add only what is required to exemplify exactly what you want to show and do it in a way that does not distract from the item meant to be exemplified by it. To me that means they should always be as descriptive and close to natural language as possible, and that means avoiding crassly infrequent forms unless the example is meant to show exactly this infrequent form of speech. If users can recognise that your choice of words was a choice, i.e. is a conscious decision against a habit or is at the very least a specifically trained habit of yours, that's your indicator that it's probably not befitting a plain neutral dictionary example. If I can recognise that the author had an agenda, Sachliteratur (educational non‐fiction) loses my trust. PS: I'm no native speaker, but are these terms even idiomatic? My Google results for womaned are Urban Dictionary, we and discussions whether or not it's acceptable in Scrabble. Korn [kʰũːɘ̃n] (talk) 22:44, 11 April 2017 (UTC)
  • Regarding the topic this has strayed into: as someone once put it, if a man finds it jarring to open a book and see (for example) "Any citizen who wishes to change her registration should ..." (=gender-neutral she) — if it seems like an "agenda" to him — he should try to imagine how women feel opening every book that says "Any citizen who wishes to change his registration should ..." ("gender-neutral" he). Etc, etc. But to return to the initial topic... creating entries for attested "woman" and other terms seems fine, though we'll obviously have to figure out what context labels apply — some need no label ("neo-fascistic"), some are "rare", "nonstandard", etc. - -sche (discuss) 02:45, 12 April 2017 (UTC)
Eh, I think it has more to do with what's most common. Any women or girls I've spoken to about this sort of thing (admittedly there aren't very many) find it odd when they hear someone default to "she" when the gender of someone is unknown. I know "he" sticks out to me slightly (unless it's an older text, since I expect to see it there), since it's not commonly used by my generation, and I'm more used to a gender-neutral "they". Andrew Sheedy (talk) 03:03, 12 April 2017 (UTC)
Literally everything offends somebody. I don't like the default "he" and will try to gender-neutralise things where I can. But I was taught that two wrongs don't make a right, and wouldn't swap one bias for the opposite bias. I know that's terribly unfashionable now. Equinox 03:13, 12 April 2017 (UTC)
I agree. Otherwise we're going to need a men's rights movement in a hundred years or so. ;) More seriously, though: as concerns Wiktionary, I think we should try not to offend within reason, but not be fanatics about it. Let's not offend people's intelligence by including obvious attempts to be all-inclusive, when it doesn't reflect the language as it is used. It isn't the job of a dictionary to promote progress.... (Besides, some potentially offensive usexes have made my day, so that makes them worth keeping, right?) Andrew Sheedy (talk) 04:38, 12 April 2017 (UTC)
@Andrew Sheedy: “Suicide is inimical to the health of the participant.” — That is lovely. A smile was cracked. — I.S.M.E.T.A. 00:04, 14 April 2017 (UTC)
Yeah, I don't see anything wrong with that usex except possibly the last word — does English normally describe a single person who kills themself as a "participant" in suicide? - -sche (discuss) 17:50, 15 April 2017 (UTC)
@-sche: You're right — I'd much prefer practitioner. ;-)  — I.S.M.E.T.A. 19:36, 15 April 2017 (UTC)

It seems the only truly feasible way to resolve such conflicts is to replace the offending usage example with a real quotation. — I.S.M.E.T.A. 00:06, 14 April 2017 (UTC)

  • The news just now is on how the frequent use of career/math/science words with male names but art/family words with feminine names, and other tendencies of our (=humans') texts to accept and reproduce traditional biases, is teaching artificial intelligence to be biased, causing mistranslations of e.g. "o bir doktor". Obviously, Wiktionary's mission is not to save the world from bad AI (we do have a user with a username that looks suspiciously like an abbreviation of John Connor, but I won't blow his cover), but this helps quantify The Dave Ross's comment of 14:49, 10 April 2017 (UTC) about how reproducing traditional biases is not an automagically bias-free position. But I do think we can — and usually do — find a balance, like many above have said. :) - -sche (discuss) 17:57, 15 April 2017 (UTC)
@-sche: But o bir doktor (he is a doctor) is not a mistranslation, in the same way that o bir hemşire (she is a nurse) is not a mistranslation. Grammatically, one can substitute he, she, or it in either case without error. However, there is nothing objectionable or “problematic” (that weasel word…) about Google Translate’s default translations. And those “biases” are wholly justified, both statistically and etymologically: etymologically because of derivation from words undeniably of the corresponding grammatical and/or natural gender [Latin doctor m, Old French norrice f (wet nurse), Persian همشیره (sister)], statistically because most doctors are men and most nurses are women (no matter how “problematic” a fact that may be). It seems that Ian Johnston and writers like him are too ready to infer damnable prejudice. It is far better to maintain unconscious biases, potentially thereby preserving some subtle, unrecognised facet of a language, than it is to extirpate the lot in favour of linguistically misguided, politically motivated counter-biases. — I.S.M.E.T.A. 21:11, 15 April 2017 (UTC)
It's a misleadingly overspecific translation, especially if the doctor in question is, for example, established as a woman in a previous sentence. The usual way of indicating a pronoun is not gender-specific, at least in the languages I work with, is "he/she" or "she/he" (though even this fails to capture cases where a pronoun could equally also apply to non-binary people — and some of the languages I refer to are spoken by peoples who traditionally recognize more than two genders, like the Ojibwe two-spirits who gave English that term). Who is proposing "extirpating" any swath of things except as a straw man? The usage examples I see discussed here include things like "she is pretty" → "she is the best athlete", where the second phrase is just as fluent as the first, but provides greater variety than the hackneyed "she is pretty". - -sche (discuss) 21:31, 15 April 2017 (UTC)
May partly be an artifact of Google's overly clever "statistical" translations that don't always attempt to understand the context/flow of a text, if they can just compare it with billions of similar snippets. Equinox 21:40, 15 April 2017 (UTC)
@-sche: Sure, there's no problem with something like an isolated instance of substituting she is pretty with she is the best athlete, but I'd like to pre-empt an implicit imprimatur to go about sanitising all our “problematic” example sentences. I mean, great, Ojibwe has non-binary pronouns (or however it expresses that), but English standardly only has he and she — should we maintain that idiomatic binary, or roll out usage like the pronouns ze, zim, and zir (either when translating Ojibwe usexes or just generally)? Like Equinox wrote, “literally everything offends somebody”; I'm sure steak and juicy commonly collocate — is that fact objectionable? — I.S.M.E.T.A. 22:53, 15 April 2017 (UTC)

template:hypernyms

I just realised that I mistakenly classified capitalist as a ‘synonym’ of Keynesian. Is there a reason why we have template:synonyms but not template:hypernyms? Because if not, I’ll just make the template myself. — (((Romanophile))) (contributions) 02:53, 10 April 2017 (UTC)

No, go ahead. DTLHS (talk) 03:44, 10 April 2017 (UTC)
Don't forget the documentation page. The template is currently uncategorised. —CodeCat 14:27, 10 April 2017 (UTC)

customizable lists of terms

I think the easiest and most far-reaching implementation would be a user-friendly interface page where different "linguistic" parameters can be selected to create customized categories, instead of the pre-arranged ones that we have now. --Backinstadiums (talk) 07:13, 11 April 2017 (UTC)

You don't have to use the pre-arranged categories. Just create some new category like Category:English (something) words. But if it's a good category that works for many languages, it should be implemented into the "pre-arranged" system eventually. --Daniel Carrero (talk) 07:28, 11 April 2017 (UTC)
@Daniel Carrero: No, I am afraid that static approach is no longer useful. I meant for users to select certain terms according to the searchable info. appearing in their respective entries. For example, listing "Arabic adjectives with a certain pattern which form their plural with the pattern فعْلة". Currently, such a dynamic approach, narrowing down or intersecting linguistic features, is not available, yet a modern online dictionary might very well offer it. --Backinstadiums (talk) 07:58, 11 April 2017 (UTC)
Oh, I understand. You mentioned "intersecting linguistic features". I would definitely support having some way to intersect categories. I'm pretty sure there was an external website where you could search and intersect Wiktionary categories.
That aside, I don't speak Arabic and I don't understand anything about "the pattern فعْلة". But I would seriously support having a category like Category:English nouns with -es plurals, so maybe other people could consider creating something like Category:Arabic adjectives with فعْلة plurals too, if that makes sense. --Daniel Carrero (talk) 18:22, 11 April 2017 (UTC)
@Daniel Carrero, Backinstadiums You can intersect categories by writing in the search bar, for instance, incategory:"English 2-syllable words" incategory:"English adjectives". It doesn't show up like a regular category, but it definitely does help. Julien Daux (talk) 19:45, 11 April 2017 (UTC)

@Daniel Carrero, Julien Daux Categories should be replaced by adding such info. in the template of a term entry, enabling any user to choose the specific features of the list of terms they want to arrange. Instead of those lines of code, it would enrich the dictionary to create a new page where all the different available options (linguistic/lexicographic features actually) can be 'ticked' before hitting the 'search' button. Should this be a formal proposal? How to proceed then? --Backinstadiums (talk) 07:45, 12 April 2017 (UTC)
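To make the search-bar approach above concrete, here is a small Python sketch that builds such an intersection query from any number of category names. The function name is made up for illustration; the only thing taken from the thread is the incategory:"..." keyword syntax, and a 'tick the boxes' page like the one proposed could in principle generate the same string behind the scenes.

```python
def intersect_query(*categories):
    """Build a search-bar query that matches pages in all given categories.

    Each category name is wrapped in an incategory:"..." clause, and the
    clauses are joined with spaces, as in the example quoted in the thread.
    """
    return " ".join('incategory:"{}"'.format(name) for name in categories)


# Example: the intersection mentioned above.
print(intersect_query("English 2-syllable words", "English adjectives"))
```

Pasting the printed string into the search bar should give the same result as typing the query by hand.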

French Wiktionary monthly news - Actualités

Hi!

I am glad to inform you that the 24th issue of Wiktionary Actualités just came out! As usual, it is a short page about the project and lexicography in general. This time: a focus on a dictionary of toponyms and another focus on the jargon used in European administrations. I am sure it is poorly translated into English, and after six months of regular translations I still doubt whether translating it is useful. So, you are welcome to read it but also to improve it! Let us know if it gives you some new ideas for your project. We are celebrating two years of regular publication, and we are quite proud of it, but at the same time, we are not sure of the popularity or quality of our collective writing. We have received very little feedback. The popularity of Wiktionary is changing, and there are more conferences about the project this year than ever, like a big event in a museum in Paris in February or a one-hour discussion about lexicography and Wiktionary this month in Lyon. Another one is planned for this Thursday, also in Lyon. In all this, we do not know whether Actualités is playing a role or not. So, in preparation for next month's edition, an anniversary edition, please let us know your opinions and ideas for the future! Thank you. Noé 12:25, 11 April 2017 (UTC)

Persian conjugation tables

The currently used conjugation tables for Persian words generally do a great job listing the various forms and their transliterations. However, there are some major issues that need to be addressed. I will list them as separate points below:

  1. A large number of forms are in fact compound forms. Since these are regularly formed using auxiliary verbs, I don't think that they should be included in the table. It is as though Spanish verb tables included forms such as ha comido, había comido, está comiendo and va a comer. To my knowledge, it is not common practice on Wiktionary to include such forms in any language.
  2. The table includes an alleged "aorist" form, but there is no such form in modern Persian. It might have been described as such by some author, but it is definitely not standard.
  3. There is some redundancy in the naming of the forms "past (imperfect)" and "present (imperfect)". In my opinion, they should be termed simply as "past" and "present".
  4. I find it a little bit strange to have conjugation tables for colloquial forms, but this seems to be done in other contexts too, such as examples (e.g. گفتن#Verb).
  5. Maybe negated forms should be included, since the prefix used is not the same for all tenses.

Thoughts? Unless anybody opposes, I think the conjugation tables should be reworked to more accurately and concisely represent modern Persian verbs. HannesP (talk) 14:03, 11 April 2017 (UTC)

I haven't started learning Persian in earnest yet, so I can't address most of these issues myself. I do agree that across languages, we should avoid giving too much space to compound forms, but I think the presentation of separate colloquial tables is very valuable and should be kept. @Dick Laurent, ZxxZxxZ, Irfan, Vahagn Petrosyan Μετάknowledgediscuss/deeds 16:55, 11 April 2017 (UTC)
Fixing typo above, and because none of these people seem likely to respond but Kolmiel is active, @Irman, Kolmiel Μετάknowledgediscuss/deeds 23:52, 11 April 2017 (UTC)
While I don't know much at all about Persian, the points you make seem to be sensible, so I support. —CodeCat 23:54, 11 April 2017 (UTC)
My opinions:
1. It is common practice to include compound forms in several languages, but I agree that it is not a good thing. So, support.
Later additional comment: I would count the perfect among the non-compound tenses, though. It's written in one word.
2. The aorist form is definitely standard. First of all, our Persian entries cover all of modern Persian, which starts around the 8th century AD. But even in contemporary Persian this form is used, chiefly in poetic language. I don't know if the name "aorist" is standard. I suppose there could be a better name; particularly since the Greek aorist is chiefly a past form, while the Persian form is present.
3. I agree with calling the present "present", but not with "past". Persian has at least four past forms: "imperfect", durative, perfect, pluperfect. I think the "imperfect" could be called "preterite".
4. Colloquial conjugation tables are useful. The vernacular has different endings and different rules with stems that end in a vowel, etc. Once we get rid of the compound forms, it will also become less messy.
5. I'm not sure if negated forms would be useful. They deserve to be mentioned more than compound forms; agreed. But the prefix is generally na-. In Iran nami- now commonly becomes nemi-, but this is not part of classical Persian nor of Dari. And the thing is that adding these would make the table messy again when what we want is to make it clearer. Kolmiel (talk) 12:15, 12 April 2017 (UTC)
Later additional comment: It's true that the negative imperative is often ma- in older language, and with vowel-initial stems na- becomes nay-... So, maybe it does make sense to include negative forms. However, my preferred solution would be to give just one form (e.g. 1st p. sg.) as an example for each tense. The rest is mere repetition. Kolmiel (talk) 15:05, 12 April 2017 (UTC)
1. Neutral
2. Kolmiel is right
3. No comment
4. Those colloquial forms are very valuable; you can hardly find any source that mentions them, and this colloquial Persian (which is markedly different from standard Persian) is increasingly used, sometimes even in written published works (unfortunately).
5. I think it may be useful. In the passive, for example, the prefix is added before شدن šodan: گفته نشده است gofte našode ast. In classical Persian, too, it is not always added before می mi-: می نگوید (ha)mê nagôyad. And there is also م_ ma-, used for certain forms. I think we should add these things and make the table less messy by adding hide and show options.
5+1. An archaic form is missing: it is created by the suffix ی , I don't know what it is called.
--Z 13:07, 12 April 2017 (UTC)
Thanks for your input, Kolmiel and Z!
  • Concerning negative forms, I think you both present valid points. The negative is manifested in enough ways for it to make sense to include it (ne-/na-, nay-, be- -> na- in subjunctive, classic forms) but it is true that it would clutter the tables. Just showing it for one person/number sounds like a fair solution, but I can't recall seeing it in other languages on Wiktionary.
  • As for the perfect, I still think it qualifies as a compound form, since it is regularly formed with the participle and the enclitic copula. In that sense, it's pretty similar to the perfect in e.g. Italian. Furthermore, it's still written as two words in literary Iranian Persian (شده است etc).
    • Clarification: the last point specifically refers to third person singular. HannesP (talk) 19:13, 13 April 2017 (UTC)
  • How common is this "aorist" form in practice? And what's the correct term? My 400-page grammar doesn't even mention it, but that's not necessarily indicative of its actual use. Could you please provide examples of where, and in what situations, it's used?
  • As Z brought up, there are further forms that aren't included in the current tables. The -i indicative is one, but there are others, such as the hami- indicative and the be- perfective. Where should we draw the line? Why include "aorist" but not "hami- indicative"? Even though Modern Persian spans a long period, I think the conjugation tables should mainly reflect contemporary use. Compare with English entries, which don't include forms such as "[thou] bringest", even though such forms have certainly been used in Modern English.
  • Regarding whether forms such as رفتم should be called "past" or "preterite", I don't really have a strong opinion. In my opinion, the fact that there are several past forms isn't a convincing argument against "past". In English, forms such as "went" are commonly called "past form" but that doesn't mean English doesn't have perfect or pluperfect. However, for clarity "preterite" is fine by me, but I don't know how established it is in the literature. My Swedish grammar uses "preteritum" to describe this form in Persian, though.
    • After giving it some more consideration, I do think that preterite is a better term than past. HannesP (talk) 21:43, 13 April 2017 (UTC)
--HannesP (talk) 19:12, 13 April 2017 (UTC)
@HannesP: You may find it profitable to survey the way in which Latin verb forms are presented in Latin entries’ conjugation tables. — I.S.M.E.T.A. 01:49, 14 April 2017 (UTC)

Read-only mode for 20 to 30 minutes on 19 April and 3 May

MediaWiki message delivery (talk) 17:33, 11 April 2017 (UTC)

Moving the translations of water

Water is in CAT:E again because it hits the limit on available memory for Lua. The unusually comprehensive translation table plays a role in that: the page makes it as far as Singpho, ~2,130 out of ~2,750 translations in, before collapsing. Perhaps, in the same way that we put undisplayably long words into special pages, the undisplayable translations of the first definition of water could be moved to a subpage or appendix (the translation-adder might have to be updated to work in the appendix namespace). It would be linked-to using the regular {{trans-see}} template. Obviously, in the distant long term when every page is as complete as water, we'll need to update our modules and templates to be more efficient, or ask the developers to let us use more memory, but for now this page is a special case — the page with the next-most translations has only a third as many, and the page after that has only a sixth as many. - -sche (discuss) 21:14, 11 April 2017 (UTC)

I'd ask why adding more links causes an out-of-memory error in the first place. Surely creating one link doesn't need less memory than creating a thousand? After each link, the used memory should be discarded, so it can be reused for the next. It seems like Scribunto has faulty memory management to me. —CodeCat 21:17, 11 April 2017 (UTC)
It is not running out of physical memory; it is hitting the 50 MB limit which is set for page construction (Lua memory usage: 50.21 MB/50 MB). We ought to remove all of the calls to {{t-simple}} and create flat wiki-markup links. Even the "simple" version makes calls to modules which have to do page loads and lookups; it is all terribly inefficient without a database. It would also be a good idea to move all translations to a sub-page and transclude them, so that clicking "edit" on the article would be less sluggish. - TheDaveRoss 21:33, 11 April 2017 (UTC)
There is no way that the final page is 50 MB, so something along the line is not releasing memory when it should. Expanding one template doesn't need 50 MB, so if the memory is freed afterwards, you can do the same 1000 more times. It's obviously possible to do this with 50 MB, so if it's not, then the software's memory management is faulty. —CodeCat 21:40, 11 April 2017 (UTC)
I am not sure how the memory works, but I doubt any memory that is used is later freed up; if that were true, the limit could never be reached, given how small our module pages are in bytes. None are anywhere near 50 megabytes. — Eru·tuon 21:54, 11 April 2017 (UTC)
Before I posted here, I considered the possibility of replacing the t-simples with bare wiki markup, but I realized it would have some insidious side effects, like if a code is later removed from the language modules (due to being subsumed as a dialect of something else), there will no longer be an error if someone fails to update [[water]] — and if they search for and replace all uses of the code, the search won't turn up water. - -sche (discuss) 21:41, 11 April 2017 (UTC)
I agree that there are a lot of upsides to having the translations templatized, however when we hit the technical limits of the platform we might have to make accommodations. - TheDaveRoss 21:46, 11 April 2017 (UTC)
We would also be losing the script tagging. DTLHS (talk) 21:47, 11 April 2017 (UTC)
Or the platform needs to be fixed. I suggest filing this as a phabricator issue. —CodeCat 21:48, 11 April 2017 (UTC)
I would sooner look to fixing the languages module than Mediawiki. - TheDaveRoss 21:52, 11 April 2017 (UTC)
Any suggestions? —CodeCat 21:55, 11 April 2017 (UTC)
Migrating all of the data to WikiData, however that is not possible quite yet. If we aren't going that route, creating an extension which does the language lookups from the MW database would also speed things up tremendously and reduce the amount of memory needed by the page creation. - TheDaveRoss 21:58, 11 April 2017 (UTC)
I think you still don't understand the problem. This link doesn't need 50 MB to be created, and it works just fine: test. Why does memory usage go up when I include more of such links? Why are 999 previous transclusions affecting how the 1000th one is transcluded? Transclusions should be completely independent of each other and not share any memory. Once one transclusion has been processed, the memory should be done with and available for the next one so that the next has just as much memory available as the previous. If this is not happening, and pages are gradually running out of memory with each successive transclusion, then there's a memory leak. —CodeCat 22:03, 11 April 2017 (UTC)
That assumes that there is no parallel processing going on, and that GC is run after each object. - TheDaveRoss 22:12, 11 April 2017 (UTC)
If each template expansion uses a shared pool of memory and runs in parallel nondeterministically, then memory usage itself is nondeterministic. It means that some pages will sometimes run out of memory and sometimes won't, entirely at random, by the very nature of the system. That seems like a fairly big flaw. —CodeCat 22:15, 11 April 2017 (UTC)
I have an idea. Module:scripts uses the lang:getScripts() function from Module:languages. This function appends a table of script objects to the language object. Thus, I assume, for example, the script object for "Latn" is repeated inside the language object of every language that uses the Latin alphabet. Not sure how much memory that would use, but it seems quite wasteful. — Eru·tuon 22:05, 11 April 2017 (UTC)
It shouldn't be very wasteful unless you are handling lots of language objects at once. But each call to {{t}} only needs one language object, and as I noted above, the memory of each template call should be freed afterwards. So the amount of memory needed for each individual template is not at all high. It's only if the memory is not properly being freed and reused that it eventually gets used up, which I suspect is the case. —CodeCat 22:09, 11 April 2017 (UTC)
Can we construct a module that proves it is the scribunto implementation that is wasting memory and not something that we can control? DTLHS (talk) 22:10, 11 April 2017 (UTC)
It's hard to do that since nothing else but Lua uses Lua memory. However, if transcluding the same template call 1000 times uses more memory than just once, then something is off. —CodeCat 22:13, 11 April 2017 (UTC)
Again, I think your assumption of how Lua memory works is faulty. It seems that memory is not freed up after a template is evaluated; rather it accumulates as the software goes through each Lua-based template from the top of the page to the bottom. — Eru·tuon 22:16, 11 April 2017 (UTC)
I'm saying how it should work. If it doesn't work that way, then that's a flaw in the software that should be corrected. There's absolutely no need for a call to {{t}} to use memory from the previous call, since individual transclusions and module invocations are entirely separate. mw.loadData is the only way in which multiple modules invocations can share data, and that's tightly controlled and read-only. —CodeCat 22:23, 11 April 2017 (UTC)
A compromise might be to move the "obscure" languages (based on number of entries, or speakers, or something) and keep the common ones: it's annoying to make users do an extra click to translate water into something mainstream like French or Chinese. Equinox 21:18, 11 April 2017 (UTC)
I posted about this in the Grease pit. I'm puzzled why the module error suddenly cropped up. Perhaps the recent addition of two cognates to the Etymology section pushed it over the top. I wonder if it could be avoided by a simpler method than moving translations (though there are so many translations that I think that wouldn't be a bad idea, even without a Lua memory error). Perhaps by handling language objects differently in the etymology and translations modules. — Eru·tuon 21:26, 11 April 2017 (UTC)
Would it be possible to move translations to alphabetical subpages and then have a menu to expand them (by first letter of the language name) as needed? DTLHS (talk) 21:34, 11 April 2017 (UTC)
It would require Javascript to function unless we were willing to load them all every time. I am not sure if we have a stance on whether Javascript should ever be required for all content to be displayed. - TheDaveRoss 21:39, 11 April 2017 (UTC)
Might recent module renovations be a cause? Maybe we should be looking at the efficiency of the code. —suzukaze (tc) 21:56, 11 April 2017 (UTC)

@CodeCat: Test of Lua memory: this edit (1.22 MB) versus this edit (2.98 MB). The first consists of 1 case of {{l|en|word}}, the second of 24 cases of the same. So, it's not quite 24 times as much memory. — Eru·tuon 22:24, 11 April 2017 (UTC)
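To put numbers on the comparison above, here is a rough sketch of the arithmetic in Python. It assumes, purely for illustration, that reported memory grows linearly with the number of template calls; the thread does not establish that, and the variable names are made up. The two measurements give a ratio of about 2.4 for 24 times the calls, which is consistent with a large fixed cost plus a small increment per call.

```python
# Figures quoted in the comment above (Lua memory reported for two test edits).
MEM_ONE_CALL_MB = 1.22   # page with 1 call of {{l|en|word}}
MEM_24_CALLS_MB = 2.98   # page with 24 calls of the same template
CALLS_SMALL, CALLS_LARGE = 1, 24


def linear_fit(n1, m1, n2, m2):
    """Return (fixed_overhead, per_call) for a straight line through two points.

    Assumes memory = fixed_overhead + per_call * number_of_calls, which is
    only a working hypothesis, not something the thread confirms.
    """
    per_call = (m2 - m1) / (n2 - n1)
    fixed = m1 - per_call * n1
    return fixed, per_call


ratio = MEM_24_CALLS_MB / MEM_ONE_CALL_MB
fixed_mb, per_call_mb = linear_fit(
    CALLS_SMALL, MEM_ONE_CALL_MB, CALLS_LARGE, MEM_24_CALLS_MB
)

print(f"memory ratio for 24x the calls: {ratio:.2f}")
print(f"estimated fixed overhead: {fixed_mb:.2f} MB")
print(f"estimated increment per extra call: {per_call_mb:.3f} MB")
```

Under that assumption, most of the 1.22 MB in the single-call case is fixed overhead, and the open question in the thread is why the per-call increment is not zero.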

Yeah, that basically shouldn't be happening. Lua invocations should be entirely independent of each other, that's how Scribunto was designed. I wonder what happens when you do this with a Lua module that has a function that just returns an empty string. —CodeCat 22:33, 11 April 2017 (UTC)
Well, perhaps you are right that it shouldn't work this way or perhaps not. I don't know. Perhaps the memory does not actually accumulate, but rather the memory-measurer increments by the amount of memory used by a given template, whether or not the memory is dumped before the next template is evaluated. I suppose one would have to post on Phabricator to find out. I wouldn't know what exactly to ask. — Eru·tuon 23:30, 11 April 2017 (UTC)
  • Unqualified commentary, I know zilch about tech: This could be an incentive to centralise all translations on WikiData and then find a way to provide them to every Wiktionary at once, providing our Wiki's already‐put‐in work to all Wiktionaries with less manpower. Korn [kʰũːɘ̃n] (talk) 22:30, 11 April 2017 (UTC)
Right now, I think it pretty clear we should move the translations of water to a subpage. It may not be what we would want to do in optimal circumstances, and perhaps Wikimedia improvements could let us move it back, but perfect is the enemy of good.--Prosfilaes (talk) 23:13, 11 April 2017 (UTC)
Is that a better solution than simplifying all of the translations which are presented on the page so that they don't make module calls and thus don't use nearly as many server resources? There are lots of possible solutions, many of which make no substantive change to the user experience. - TheDaveRoss 13:30, 12 April 2017 (UTC)
Removing module calls might require removing transliteration, gender, and script classes. If so, I don't like the idea. — Eru·tuon 15:35, 12 April 2017 (UTC)
It would only remove those things being done on the fly, the exact same presentation is possible without using templates/modules. - TheDaveRoss 16:05, 12 April 2017 (UTC)
I don't understand what you mean by "removing these things being done on the fly". Could you clarify? — Eru·tuon 16:39, 12 April 2017 (UTC)
Sure. When we use {{t}} (and others) the template makes several module calls and requires a bunch of Lua overhead. As we have discovered, this overhead exceeds the limits put in place by Wikimedia's tech staff. If we do not make module calls, but instead replace the module calls with static results (e.g. instead of calling the languages module getByCode(es) to get "Spanish" we write "Spanish") then we reduce the overhead significantly. On most pages where there are relatively few module calls it is not an issue, however Water results in many thousand module invocations and thus a lot of overhead. The downside is that future changes require a "manual" update of water, but until a better solution comes along it would not require that the user experience of water change, only the wiki-markup. - TheDaveRoss 16:59, 12 April 2017 (UTC)
For example, here is a simple change we could make: {{t-simple/test}} adds an optional lang= parameter, and when a language name is included in the call it skips the module call entirely. - TheDaveRoss 17:06, 12 April 2017 (UTC)
I changed the parameter to |langname=, because |lang= usually means language code. --WikiTiki89 17:23, 12 April 2017 (UTC)
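The idea behind the |langname= parameter can be sketched roughly as follows. This is Python rather than the Lua and template code actually involved, and the names `LANGUAGE_NAMES`, `lookup_log`, `get_by_code` and `translation_label` are hypothetical stand-ins, not the real module API. The point is only that a supplied name bypasses the lookup path entirely.

```python
# Hypothetical stand-in for the language data a module lookup would load.
LANGUAGE_NAMES = {"es": "Spanish", "fr": "French", "nv": "Navajo"}

# Records every code that went through the (expensive) lookup path.
lookup_log = []


def get_by_code(code):
    """Stand-in for a languages-module lookup; raises KeyError on unknown codes."""
    lookup_log.append(code)
    return LANGUAGE_NAMES[code]


def translation_label(code, langname=None):
    """Return the language name, skipping the lookup when langname is given."""
    if langname is not None:
        return langname       # static text typed by the editor, no validation
    return get_by_code(code)  # dynamic result, checked against the data


print(translation_label("es", langname="Spanish"))  # no lookup performed
print(translation_label("fr"))                      # one lookup performed
print(lookup_log)
```

The trade-off raised earlier in the thread shows up here too: the static path never touches the data, so a code that is later removed or renamed would no longer raise an error on that page.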
This works, see User:Wikitiki89/water. --WikiTiki89 17:46, 12 April 2017 (UTC)
But the language tags are missing from most of the words. —CodeCat 17:50, 12 April 2017 (UTC)
That's a bug in the template. It can be fixed. --WikiTiki89 18:32, 12 April 2017 (UTC)
Done. --WikiTiki89 18:37, 12 April 2017 (UTC)
@-sche, this is obviously better than using a subpage. Do you have any opposition to using {{t-simple/test}}? —Μετάknowledgediscuss/deeds 00:05, 15 April 2017 (UTC)
If that template really uses less memory, then I'm all for it. (I suppose we can find a better name for it at some point.) However, we'll have to be careful about which translations we use it on, and not use it on any that get font support (like Navajo) or automatic transliteration (which I suppose is a subset of languages that get font support, since I think all our non-Latin-script languages have fonts specified for them?), or even that have genders, because it seems to disable all those functions. (It would seem possible to add the ability to specify a gender without invoking any modules; just accept whatever gender is input and trust human editors to review things on the one or two pages this should be used on.)
I'll set about adding it. - -sche (discuss) 19:01, 15 April 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I think we can just update {{t-simple}} since it is explicitly made for water. It also did not previously do any of the other things that {{t}} did, so the places where it is already in use would see no change to their displayed text. There are comparable workarounds for all of the other components of {{t}}, though, if we want to implement them in {{t-simple}}. - TheDaveRoss 19:18, 15 April 2017 (UTC)

Good point. I've just replaced the content of {{t-simple}} with the code from {{t-simple/test}}, and updated water/translations to use it as much as possible, with langname= specified. However, I haven't moved the translations back onto water; see my comment at the bottom of this thread that has the same timestamp as this one. - -sche (discuss) 02:26, 17 April 2017 (UTC)
When I moved the translations back into the main entry, it broke again; see the Grease Pit. - -sche (discuss) 09:58, 25 April 2017 (UTC)
  • Who says splitting a page can reduce memory usage? misunderstood.--Giorgi Eufshi (talk) 07:54, 12 April 2017 (UTC)
  • I suspect that memory usage could be reduced by creating lighter-weight versions of Module:languages, Module:scripts, and Module:links, which are used by {{t}}. For instance, Module:links stores a separate script object every time it makes a link. All the script code does is confirm that the script code in the language's data file is valid. This could perhaps be done more simply by a table containing { scriptCode1 = true, scriptCode2 = true, ... }. A single script code-validating table would probably use less memory than one script object for every single link in a translations table. Similarly, a comprehensive table of canonical names indexed by language name might be smaller than one language object for every link on a page. I could be wrong, but I suspect this is true, as the amount of memory used by {{#invoke:languages/templates|getByCanonicalName|English}} (which invokes Module:languages/canonical names, which creates a list of all languages indexed by canonical name) is only 10.01 megabytes. — Eru·tuon 17:51, 12 April 2017 (UTC)
    • Such a list could be generated on the fly and then loaded with mw.loadData, so that we wouldn't have to maintain it. All that would be needed in Module:scripts, then, is a function like isValidCode that verifies the existence of the code using this module. However, we would need to ask where this check should be done. When functions expect script objects, the mere fact that you need to provide such an object already forces you to go through getByCode to retrieve it, and a check is part of that process. But if functions take script codes, then this layer of validation is lost. —CodeCat 18:10, 12 April 2017 (UTC)
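The contrast between one script object per link and a single shared validation table might look roughly like this. It is a Python sketch of the idea only; the real modules are written in Lua, and `SCRIPT_DATA`, `ScriptObject`, `VALID_SCRIPT_CODES` and `is_valid_code` are hypothetical names chosen for illustration.

```python
# Hypothetical script data; the real data lives in Lua data modules.
SCRIPT_DATA = {
    "Latn": {"name": "Latin"},
    "Cyrl": {"name": "Cyrillic"},
    "Arab": {"name": "Arabic"},
}


class ScriptObject:
    """Heavier approach: a new object is built for every link that needs one."""

    def __init__(self, code):
        if code not in SCRIPT_DATA:
            raise ValueError(f"unknown script code: {code}")
        self.code = code
        self.data = SCRIPT_DATA[code]


# Lighter approach: one shared, read-only table generated once from the data,
# playing the role of { scriptCode1 = true, scriptCode2 = true, ... }.
VALID_SCRIPT_CODES = frozenset(SCRIPT_DATA)


def is_valid_code(code):
    """Check a bare script code against the shared table."""
    return code in VALID_SCRIPT_CODES


print(is_valid_code("Latn"))
print(is_valid_code("Zzzz"))
```

As noted in the reply above, the lighter approach only validates when a caller remembers to call the check; constructing an object validates as a side effect.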

Use # for senseids in raw-link parameters to templates

Right now, there is no way to indicate the senseid a link should point to, other than by wrapping a template around the link and giving it the id= parameter. The template {{ll}} was created in part to work around this problem. I think there is a better and more intuitive way to do it. Links already naturally support the # fragment notation to link to a specific section on the page. However, we never use this on Wiktionary because such links are ambiguous and prone to breaking: they always point to the first section with that name on the page, no matter what language it's in. Reordering sections or even adding a new language to the page can break links. Senseids, on the other hand, are stable and don't get messed up like this. So how about we co-opt the # syntax to indicate the senseid for links when they are given to templates?

An example (contrived) use in such a situation might be {{head|en|...|head=[[give#grant]] [[up#skyward]]}}. The {{head}} template, and any other template using Module:links, would automatically convert these links to give#English-grant and up#English-skyward, which is the format for link targets that {{senseid}} generates.

If we are going to do this, then we first have to find any existing uses of fragments within links and fix them up appropriately. —CodeCat 18:33, 12 April 2017 (UTC)

I would agree that it would be better to have a better syntax, but how would the template distinguish between language names and sense-ids? You wouldn't want it, for example, to make {{en-noun|head=[[tête#French|tête]]-[[à#French|à]]-[[tête#French|tête]]}} link to tête#English-French and à#English-French. Also, just a minor correction, {{ll}} was not created to work around this problem, but rather it was adapted later to work around this problem. --WikiTiki89 18:47, 12 April 2017 (UTC)
This is why I said we need to fix such cases first, before implementing this. —CodeCat 18:51, 12 April 2017 (UTC)
You partly misunderstood the question then. What would you replace my example with so that it still links to the French section? --WikiTiki89 18:54, 12 April 2017 (UTC)
I would not have it link to the French section, because it's an English term and the individual parts mean nothing in English. The term was borrowed as a whole from French, so the etymology should deal with that. I would write {{en-noun|head=tête-a-tête}}. —CodeCat 18:58, 12 April 2017 (UTC)
I oppose breaking this functionality. --WikiTiki89 19:03, 12 April 2017 (UTC)
I don't know what functionality there even is that would be broken. The example you gave isn't even functionality but a misuse, so I wouldn't count that. —CodeCat 19:07, 12 April 2017 (UTC)
The functionality that allows language sections to be explicitly specified in the links. It was a feature we intentionally added back when this style of linking was first developed, and it should be kept. --WikiTiki89 19:15, 12 April 2017 (UTC)
I agree entirely with Wikitiki89 here. CodeCat, your idea's a good one; why won't you use your considerable ability to preserve this desirable functionality? — I.S.M.E.T.A. 23:05, 13 April 2017 (UTC)
I agree with CodeCat. I don't think an English head should ever be linking to a French entry. — Eru·tuon 19:14, 12 April 2017 (UTC)
Don't forget that this affects nearly all templates, not just headword templates. --WikiTiki89 19:15, 12 April 2017 (UTC)
Also, it's not a good idea to omit features in templates in order to enforce "good practices", so even if we agreed that this shouldn't be done, that doesn't mean the template shouldn't support it. --WikiTiki89 19:18, 12 April 2017 (UTC)
Of course templates should enforce practices. {{l}} doesn't allow you to omit the language code, nor does it allow you to use |lang= instead. Note that we are still cleaning up the mess from {{term}}'s optional language parameter. —CodeCat 19:38, 12 April 2017 (UTC)
I'm not sure when the language-anchors would be needed, but it is probably best not to transform an anchor marked by # into a senseid. It is likely to cause confusion for editors who have not read whatever documentation pages would describe this feature. Perhaps a different collocation of characters could be chosen to mark the senseid, something not otherwise allowed in page titles. (One of the symbols listed in Appendix:Unsupported titles, I guess.) — Eru·tuon 00:12, 13 April 2017 (UTC)
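For readers following this thread, the proposed conversion and the ambiguity raised in the replies can be sketched like this. It is a Python illustration, not the Lua in Module:links; `LANGUAGE_NAMES`, `LINK_WITH_FRAGMENT` and `expand_senseids` are hypothetical names, and handling of the displayed link text is deliberately left out.

```python
import re

# Hypothetical mapping from language code to the name used in senseid anchors.
LANGUAGE_NAMES = {"en": "English", "fr": "French"}

# Matches [[target#fragment]] and [[target#fragment|display]].
LINK_WITH_FRAGMENT = re.compile(
    r"\[\[([^\[\]|#]+)#([^\[\]|]+)(\|[^\[\]]*)?\]\]"
)


def expand_senseids(text, lang_code):
    """Rewrite every #fragment in a wikilink as #<LanguageName>-<fragment>."""
    lang_name = LANGUAGE_NAMES[lang_code]

    def repl(match):
        target = match.group(1)
        fragment = match.group(2)
        display = match.group(3) or ""
        return f"[[{target}#{lang_name}-{fragment}{display}]]"

    return LINK_WITH_FRAGMENT.sub(repl, text)


# The contrived example from the proposal.
print(expand_senseids("[[give#grant]] [[up#skyward]]", "en"))

# The objection: a fragment meant as a language section is rewritten too.
print(expand_senseids("[[tête#French|tête]]", "en"))
```

A blanket rewrite of this kind cannot tell a senseid from a language-section name, which is why the replies discuss either fixing existing uses first or choosing a different marker character.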

Cognate & automatic interlanguage links

Hello all,

From April 24th, a new interlanguage link system will be deployed on all Wiktionaries. This extension, Cognate, automatically links the pages with the same title between the Wiktionaries. This means they no longer have to be added in the pages of the main namespace.

This new feature has been developed by Wikimedia Deutschland as the first step of the project Wikidata for Wiktionary, but does not rely on Wikidata.

To allow the feature to operate, all the former interlanguage links have to be removed from the wikitext. You can do this by using a bot, as it was done on Wikipedia in the past. If you leave them in they will overwrite the automatic links.

During the development we had a lot of discussions with Wiktionary editors to understand their needs, but it's possible that some automatic links don't work as you would expect. If you find some bugs or have suggestions for improvements, feel free to add a sub-task on Phabricator or add a message on this talk page.

Thanks, Lea Lacroix (WMDE) (talk) 07:23, 13 April 2017 (UTC)

Excellent! I also think it would be useful to have an edit filter that will give a warning when someone tries to add an interwiki manually that can be turned on after the switchover. —Μετάknowledgediscuss/deeds 07:34, 13 April 2017 (UTC)
Well. Then the pywikibot has to renovate LOL. I think we must halt bots that put IW links back on pages after the deployment day. --Octahedron80 (talk) 09:02, 13 April 2017 (UTC)
For your info: my global bot UT-interwiki-Bot will stop running on April 21, 2017 until further notice. Greetings from Austria --Udo T. (talk) 13:40, 13 April 2017 (UTC)

Why are we only finding out about this now? Anyway, we need to make sure that the new interlanguage links will work correctly for redirect pages (i.e. redirect pages should be treated as any other page, allowing interlanguage links from them and to them). --WikiTiki89 13:46, 13 April 2017 (UTC)

Also, would it be possible to query the existence of a page in another language's Wiktionary through a template or module? This would be useful in our translation tables. --WikiTiki89 20:24, 13 April 2017 (UTC)
@Wikitiki89 For your question I have filed phab:T163734. --Vriullop (talk) 19:51, 24 April 2017 (UTC)
Thank you Vriullop. Wikitiki89: For redirects I have filed phab:T163717. --Thibaut120094 (talk) 21:07, 24 April 2017 (UTC)
I've asked on phabricator, and belatedly answered their question of whether other scripts had issues like the English and French wikis' disagreement on apostrophes (I noted geresh and palochkas; are there others?). Btw, I can't believe they named this Cognate rather than Homograph. - -sche (discuss) 20:49, 13 April 2017 (UTC)
Again, like I said... lack of communication. As for apostrophes and things, this is why the redirect issue is important, because then as long as we have redirects on both wikis, everything should work fine. --WikiTiki89 20:53, 13 April 2017 (UTC)
I support the idea of installing that extension, but I also think the name should be "Homograph" rather than "Cognate". Can the extension name ever be changed? Relatedly, this discussion has some criticism about "Cognate" too: mw:Extension talk:Cognate#Title whinge (minor). --Daniel Carrero (talk) 23:17, 13 April 2017 (UTC)
Just posting to agree that the extension should be renamed to Homograph if possible. — Eru·tuon 23:21, 13 April 2017 (UTC)
Ditto re Homograph vs Cognate. — I.S.M.E.T.A. 01:37, 14 April 2017 (UTC)
If "Homograph" is a really good name and there are no other ideas for names, maybe we can open a request on Phabricator to request that name change. I can open the request, but maybe we should wait a few days to see if more people want to support "Homograph" or suggest something else. (Technically, I guess we could even create a vote with the idea "Rename Cognate -> Homograph" but then again, I wouldn't think this is required, since apparently the name "Cognate" was chosen without asking us first!) --Daniel Carrero (talk) 04:11, 14 April 2017 (UTC)
English and French Wiktionaries disagree on apostrophes, but they have correct redirects and there will be no problem between them. There are several nonstandard interwiki links though, such as ones between Unsupported titles/Full stop, fr:Titres non pris en charge/Point and de:Punkt (Zeichen). — TAKASUGI Shinji (talk) 05:57, 14 April 2017 (UTC)
It seems phab:T158323 is going to allow linking between Unsupported titles/Full stop, fr:Titres non pris en charge/Point and de:Punkt (Zeichen). --Daniel Carrero (talk) 06:07, 14 April 2017 (UTC)

Hello all,

Thanks a lot for your feedback. I hope I can answer all of it; if I forgot something, let me know.

  • About edit filter: admins can create a new AbuseFilter to check for the adding of manual interwikilinks.
  • The project has been announced and we collected feedback among several communities.
  • It will be possible to access the list of the links existing for a page, see explanations here.
  • We will not rename the extension. I understand your point and I apologize for not asking the community about it, but now that the extension is about to be deployed, we are not going to make this change. This name, anyway, won't be displayed anywhere for regular users, and once editors are used to the new system, only people who are deeply involved in the MediaWiki extension code will notice this name. The rest will be transparent for the users.
  • If you encounter any specific problem (for example with apostrophes) feel free to create tickets, and ping me so I make sure that a modification will be done soon.

Thanks, Lea Lacroix (WMDE) (talk) 14:40, 15 April 2017 (UTC)

None of the announcements linked from above refers to the name Cognate, nor does it seem to link to a place where the name is mentioned. This seems to be the first opportunity in which the English Wiktionary can become aware of the name, an obvious misnomer. In fact, it would have sufficed if anyone who was choosing the name had checked what the word cognate means. Misnomers like this are bad even if they do not affect most Wiktionary users. I add my voice to those who kindly ask you to reconsider and to change the extension name to something that is not a misnomer. --Dan Polansky (talk) 14:31, 16 April 2017 (UTC)
Just for the record, it's virtually impossible to rename an extension. We still have an extension called SyntaxHighlight_GeSHi, despite the fact that it hasn't actually used the GeShi framework for years. And as Lea observes, extension names are purely internal, appearing on Special:Version only. This, that and the other (talk) 04:19, 22 April 2017 (UTC)
It is certainly possible to create a new extension with the same content as another one; the question is whether that is worth the hassle. --Dan Polansky (talk) 09:08, 22 April 2017 (UTC)
@Dan Polansky: This seems to be the Phabricator project page about (Not) Cognate: https://phabricator.wikimedia.org/project/profile/2320/
And this seems to be a related patch: https://gerrit.wikimedia.org/r/#/c/332912/
I'm not an expert in extension-building, but it seems there are discussions, members, watchers, subprojects, milestones and probably other stuff. The patch page displays a list of authors and their contributions in the history. Maybe creating a new extension with the same content is not an option because the new extension would lose all that. It seems we'll have to live with that name. --Daniel Carrero (talk) 03:10, 28 April 2017 (UTC)
Can the "Title Normalization" feature be turned off? It seems to be a mistake. —RuakhTALK 18:27, 16 April 2017 (UTC)
Hello @Ruakh, the title normalization feature has been requested by the community, in order to link together pages that have similar names but differ only in certain characters, such as ellipsis vs 3 dots. We can adapt this or create new rules if necessary. Can you explain to me some use cases where this should be modified? Thanks Lea Lacroix (WMDE) (talk) 10:44, 18 April 2017 (UTC)
With all due respect to Amgine, I don't think (s)he constitutes "the community". There's no technical reason that we couldn't already treat certain variants as equivalent; rather, we have chosen not to. —RuakhTALK 14:41, 18 April 2017 (UTC)
In self-defense, I was attempting to point out that different variants are different, and should not be normalized. Neither should they generate multiple IW. My exemplar was that combined ellipses are different than three stops, and should not interwiki the same (that is, only three dots should generate an IW to three dots.) (I believe EncycloPetey was also discussing combining unicode at the time.) - Amgine/ t·e 05:09, 26 April 2017 (UTC)
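The one normalization rule named in this exchange, ellipsis versus three dots, can be illustrated with a minimal Python sketch. It is not the Cognate extension's actual code or rule set, which the thread does not reproduce; `normalize_title` and `same_after_normalization` are hypothetical helpers.

```python
ELLIPSIS = "\u2026"  # the single-character ellipsis


def normalize_title(title):
    """Apply the single rule mentioned in the thread: ellipsis -> three dots."""
    return title.replace(ELLIPSIS, "...")


def same_after_normalization(a, b):
    """True when two titles would be linked under this normalization."""
    return normalize_title(a) == normalize_title(b)


precomposed = "wait" + ELLIPSIS
three_dots = "wait..."

print(precomposed == three_dots)                          # distinct strings
print(same_after_normalization(precomposed, three_dots))  # linked anyway
```

The disagreement above is about exactly this gap: the two titles are different strings, yet normalization treats them as the same page for linking purposes.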

Hi, Colgate Cognate will be deployed at 12:00 UTC tomorrow, should I start removing all interlanguage links with the exact same title with my bot? Regards. --Thibaut120094 (talk) 20:23, 23 April 2017 (UTC)

Supposedly, the presence of manually-spelled-out interwiki links will override Colgate. Once the extension is deployed, I would suggest removing all interwikis with the exact same title from a few pages to check that the extension kicks in and works as expected, and if it does, then it would be a good idea to remove the interwikis by bot. Some might argue it would be good form to hold a quick vote (..and might block the bot otherwise...), although that seems to me like an excess of bureaucracy to accept something that makes it easier for us to obtain the same result (of pages having interwiki links). - -sche (discuss) 21:12, 23 April 2017 (UTC)
I can't wait to see how the new toothpaste works. --Octahedron80 (talk) 01:54, 24 April 2017 (UTC)
@Thibaut120094 For information, there is no need to rush to run your bot just after the deployment. When the extension is deployed, if the manual links are not removed, nothing changes. I would advise taking some time to try it first, removing links on some pages to check if the extension behaves correctly, then doing the full automatic removal. Lea Lacroix (WMDE) (talk) 10:14, 24 April 2017 (UTC)
  • Does it work immediately? I added the German word dichterisch without an interwiki link and it doesn't show that it is present on the German Wiktionary. SemperBlotto (talk) 10:16, 24 April 2017 (UTC)
@SemperBlotto The deployment is in progress, and expected to be finished at 12:00 UTC. Then, yes, the new words you create should have the automatic links. Lea Lacroix (WMDE) (talk) 11:49, 24 April 2017 (UTC)
Edit: actually, it works now between en and de :) The rest of the Wiktionaries will be populated in alphabetic order in the next minutes. Reminder: if any problem occurs, feel free to create tickets or ping me. Lea Lacroix (WMDE) (talk) 11:58, 24 April 2017 (UTC)
I removed the interwiki links on the German lemma. After that you can see the links to the Wiktionaries 'cs', 'en' and 'eo'. The links 'hu', 'io' and 'zh' are outstanding, because they haven't removed the manual links from the entries. --Alexander Gamauf (talk) 12:02, 24 April 2017 (UTC)
This is probably some caching-related issue. Now there are all languages A-L; a few minutes ago there were only languages A-I. JAn Dudík (talk) 12:31, 24 April 2017 (UTC)
As a test, I tried making Estonian the only interwiki link at grape juice, but all three interwiki links are still there. --WikiTiki89 16:50, 24 April 2017 (UTC)
As a test, if you add a Japanese interwiki at grape juice, then the entry will show all the three existing interwikis plus Japanese (which does not have that entry yet). So, rather than overriding the whole list, you can just add more interwikis manually. --Daniel Carrero (talk) 17:02, 24 April 2017 (UTC)
Which is a problem, because we can't hide incorrect interwikis (if there happen to be any). --WikiTiki89 17:20, 24 April 2017 (UTC)
Another test: if you remove all interwikis from grape juice and add only [[et:example]] in the entry, it will show three interwikis: the two correct "grape juice" and the Estonian "example". So, you can override an interwiki in a specific language, but apparently you can't just delete an incorrect interwiki. Yes, that may be a problem at some point, I guess. --Daniel Carrero (talk) 18:10, 24 April 2017 (UTC)
Thanks for noticing. Let me know if you encounter a case of wrong automatic link that you would like to change/hide. Lea Lacroix (WMDE) (talk) 09:17, 25 April 2017 (UTC)
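The behaviour observed in the grape juice tests above is consistent with a per-language merge in which a manual link wins for its own language code but cannot remove an automatic link. Here is a small Python sketch of that reading; it is an inference from the tests in this thread, not the extension's real logic, and the language codes other than `et` and `ja` are placeholders.

```python
def displayed_interwikis(automatic, manual):
    """Merge links per language code; a manual link overrides the same code."""
    merged = dict(automatic)
    merged.update(manual)
    return merged


# Three automatic links, as in the tests above. "fr" and "pl" are
# illustrative placeholders; the thread only names Estonian.
AUTOMATIC = {"et": "grape juice", "fr": "grape juice", "pl": "grape juice"}

# Adding a manual Japanese link: all three automatic links plus Japanese.
print(displayed_interwikis(AUTOMATIC, {"ja": "grape juice"}))

# A manual [[et:example]]: Estonian is overridden, the other two remain.
print(displayed_interwikis(AUTOMATIC, {"et": "example"}))

# No manual links: nothing in this model can hide an automatic link.
print(displayed_interwikis(AUTOMATIC, {}))
```

In this model there is no input that yields fewer links than the automatic set, which matches the concern that an incorrect automatic interwiki cannot simply be deleted.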

Have you notified all owners of bots that are adding IW links? I still see some bots editing here whose owners are Meta users. --Octahedron80 (talk) 03:31, 25 April 2017 (UTC)

Am I right in thinking that our bots don't always put the interlanguage links in alphabetical order by language code? I vaguely recall something about the ordering being slightly different. Is this the case, and if so, does the output of Cognate conform to these standards? —JohnC5 17:39, 25 April 2017 (UTC)

@JohnC5: You are correct: they are not alphabetized by the ISO code and Cognate also does not alphabetize by ISO code. They are alphabetized by a different standard. —Justin (koavf)TCM 17:41, 25 April 2017 (UTC)
The standard is explained at WT:EL#Interwiki links. Incidentally, that policy needs to be updated to account for (Not) Cognate. It still says that interwikis are maintained by bot. --Daniel Carrero (talk) 17:42, 25 April 2017 (UTC)
I'm guessing it uses the same ordering that we have been using. --WikiTiki89 17:48, 25 April 2017 (UTC)
By the way, the order we were using is defined here. --WikiTiki89 17:54, 25 April 2017 (UTC)
They will deal with iw sorting later, as answered in the extension's talk page.--Octahedron80 (talk) 04:08, 26 April 2017 (UTC)
I have removed that section of EL to reflect the deployment of Congregate. —Μετάknowledgediscuss/deeds 17:49, 25 April 2017 (UTC)
@Koavf: I'm surprised you need to be told this, but do not edit other people's content in a discussion. It's fine if you don't get the joke and misinterpret it as an error, but editing it — and then even edit-warring over it! — is never something you should be doing. —Μετάknowledgediscuss/deeds 19:56, 25 April 2017 (UTC)
@Metaknowledge Suffice it to say, in this instance, you intended the misspelling and I missed that. It's generally better to not edit others' comments, sure but there are also times that it's entirely appropriate--refactoring, fixing links, important misspellings, etc. —Justin (koavf)TCM 20:00, 25 April 2017 (UTC)
When in doubt, don't. Particularly "refactoring" worries me; don't refactor my messages and change their context or break sentences apart that flow into each other. The cost of a misspelling is usually much less than the cost of changing a correct spelling (or one consciously chosen by its author).--Prosfilaes (talk) 23:39, 25 April 2017 (UTC)

New proto-languages

Should the new proto-languages like Proto-Eurasiatic, Proto-Nostratic and Proto-Borean be added to Wiktionary? Although they are controversial and not fully endorsed, there is strong evidence that they might have existed. Besides, there are a lot of books and reliable websites with hundreds of reconstructed words from these proto-languages, with their etymologies. Whether or not they existed, Wiktionary already has constructed languages with their own lemmas, so until these proto-languages are accepted they could be treated as constructed languages.

Here are some sources with reconstructed proto-words:

--46.188.132.255 11:16, 13 April 2017 (UTC)

No, they shouldn't. Good luck convincing enough editors that there's strong evidence for them. Lingo Bingo Dingo (talk) 11:41, 13 April 2017 (UTC)
This isn't an attempt to convince other editors that there's strong evidence for them. This is not even a proposal. And good luck to the Dutch people to become smart. --46.188.139.96 12:44, 13 April 2017 (UTC)
Please do not be offensive to other editors. —JohnC5 16:30, 13 April 2017 (UTC)
We've discussed previously whether to let etymologies mention proposed things like Indo-Uralic and Nostratic. My view, for which there was some support, is that as long as we convey how controversial / undemonstrated a proposal is, it's fine to mention in the etymology appendices like Reconstruction:Proto-Indo-European/h₁nómn̥. I guess we could get by without codes for them, but having codes seems like it would standardize the formatting, and have benefits for categorization. We do (as noted) have codes for various constructed and reconstructed languages that are restricted to appendices, as well as etymology-only substrates, and we have a code for Proto-Altaic already. - -sche (discuss) 21:20, 13 April 2017 (UTC)
I stand by an earlier stance of mine: as soon as you can find a "Proto-Nostratic" or similar root that is accepted by three or more people with competing reconstruction schemes (that is, they roughly agree both on what the reconstruction is and what the descendants are), including it would not be a problem. But, alas… almost all sources on these continue to assume different incompatible proto-forms from one another. There is just about nothing that could be added.
The basic problem, in other words, is that we do not add entries for languages, we add entries for individual reconstructions. If the "reconstructions" are so unstable that no two sources agree on anything, there will be no point in creating an entry. (And, for what it's worth, I suspect this should be a rule of thumb also when dealing with established proto-languages.) --Tropylium (talk) 20:44, 22 April 2017 (UTC)

Two proposals: removing and/or hiding quotations sections

Previous discussions:

See this link. It is an entry that I edited. I removed a "Quotations" section and moved the two existing quotations to the respective senses:

Two proposals:

  1. Allow for all "Quotations" sections to be removed manually (not by bot!) in all entries, by moving the quotations to their respective senses. If the sense is unclear, the quotation can be moved to the citations page.
    • Rationale: Quotations serve the purpose of illustrating the senses, so they are better placed below each sense. The "Quotations" section also uses up some space in the entry, as opposed to the quotations hidden below each sense.
    • Note: The vote (which was mentioned above) proposed moving all the "Quotations" sections by bot to the citations page, but the vote failed. Moving the quotations to the senses is better, but a bot can't do that. (with the current technology, I guess! ;) )
  2. Automatically hiding all existing "Quotations" sections by adding templates like {{quote-top}} and {{quote-bottom}} in all these entries by bot.
    • Note: This was originally @Donnanz's idea in the Grease Pit discussion (also linked above). If we want to move all the quotations as explained above, then we can hide them until the work is done; or, if we don't want to move the quotations to the senses, they can just remain hidden.
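
As a rough illustration of what a bot for proposal #2 might do, here is a minimal Python sketch. It assumes that {{quote-top}} and {{quote-bottom}} can be used without parameters and that the section header is spelled exactly "Quotations"; the function name is hypothetical and this is not an existing bot script.

```python
import re

# Matches any wikitext header from level 2 to level 6, e.g. ====Quotations====
HEADER = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def hide_quotations(wikitext):
    """Wrap the body of every "Quotations" section in a collapsible box."""
    headers = list(HEADER.finditer(wikitext))
    out = []
    pos = 0
    for idx, header in enumerate(headers):
        if header.group(2) != "Quotations":
            continue
        body_start = header.end()
        # The section runs until the next header of any level, or end of page.
        if idx + 1 < len(headers):
            body_end = headers[idx + 1].start()
        else:
            body_end = len(wikitext)
        body = wikitext[body_start:body_end].strip("\n")
        # Skip empty sections and sections that are already wrapped.
        if not body or body.startswith("{{quote-top"):
            continue
        out.append(wikitext[pos:body_start])
        out.append("\n{{quote-top}}\n" + body + "\n{{quote-bottom}}\n\n")
        pos = body_end
    out.append(wikitext[pos:])
    return "".join(out)


page = (
    "==English==\n\n===Proper noun===\n# A surname.\n\n"
    "====Quotations====\n* 1900, Someone, ''Title''\n\n"
    "===Anagrams===\n* foo\n"
)
hidden = hide_quotations(page)
```

Running the function twice gives the same result as running it once, so a bot could safely revisit pages. Proposal #1 cannot be automated this way, because deciding which sense a quotation illustrates needs a human.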

I think proposal #1 is important. If #1 passes, in the long run #2 won't matter because we won't have any "Quotations" sections in the first place. But I see some merit in proposal #2 too, as I said above. Maybe other people have more to say. Feel free to agree or disagree or whatever. I think we could create a single vote with the two separate proposals, if people want. --Daniel Carrero (talk) 04:02, 14 April 2017 (UTC)

@Daniel Carrero: I wholeheartedly support your first proposal (which is what you did with Lawrence). I'm undecided w.r.t. your second proposal. — I.S.M.E.T.A. 18:26, 14 April 2017 (UTC)
I support #1. The quotations that are moved need to be formatted in the normal "under-a-sense" way, i.e. collapsible. Equinox 18:31, 14 April 2017 (UTC)
I support 1, but it's going to be a lot of work. DTLHS (talk) 04:08, 15 April 2017 (UTC)
So far, we have 4 supports for the proposal #1 (counting myself) and no one opposed it yet. Looks like we are starting to have consensus for that idea, if we didn't already have that consensus before (which can be judged by reading the previous discussions). If this keeps up, I think we can simply do as proposed -- I mean, I guess we won't need a vote to allow doing that. (that's my opinion, at least)
This is the list of entries to be edited: User:Daniel Carrero/Quotations sections. --Daniel Carrero (talk) 00:18, 16 April 2017 (UTC)
I've been moving quotes to under specific senses for a while now, so I have no objection. Andrew Sheedy (talk) 06:48, 16 April 2017 (UTC)
I proposed using {{quote-top}} and {{quote-bottom}} as a quick and easy temporary solution, but if total removal of these Quotations headers is the preferred option that's OK by me. Everyone seems to realise that's harder work. DonnanZ (talk) 10:35, 16 April 2017 (UTC)

Category:en:Kitchenware

I just created this category as a parent of Category:en:Cookware and bakeware and Category:en:Cutlery. According to w:Cookware and bakeware, this term is restricted to containers for preparing food in, but a lot of the entries in Category:en:Cookware and bakeware are really general kitchenware. I would appreciate any help in recategorising these. There may also be some entries in Category:en:Tools that can be moved. —CodeCat 20:26, 14 April 2017 (UTC)

User:x/Books/ and Category:Books

Where are these coming from? —suzukaze (tc) 19:06, 15 April 2017 (UTC)

Go to the lefthand sidebar, and near the bottom you'll see "Create a book". If we could hide or disable it, I think that'd be great. —Μετάknowledgediscuss/deeds 19:17, 15 April 2017 (UTC)

Wiktionary:Votes/pl-2017-04/Removing inactive editors from user-proficiency categories

I have just created this vote, which is set to run from the 23rd of April to the 22nd of May, this year. For details about it, see the vote page and/or the #Removing inactive editors from Category:User coders, Category:User languages, and Category:User scripts section, above. — I.S.M.E.T.A. 01:16, 16 April 2017 (UTC)

Eponyms of surnames

I keep coming across entries of eponyms of surnames with no definition for the surname itself, with the etymology saying "Named after ____ {surname}." or something. Here's a damn good example: Nemeth. "Named after Abraham Nemeth." Well whoever put that, don't you think the surname itself merits a definition? So how do you have 2 etymologies when the second etymology just comes from the first etymology? How do we deal with that here? For now, in the entry, I put:

"1. A surname. 2. (Named after Abraham Nemeth) Bla bla bla about some braille thing he made or whatnot."

Is there consensus to instead do this?

Etymology 1

Wherever Nemeth comes from

Etymology 2

Named after Abraham Nemeth.

Or what if we had a thing where we had an etymology within an etymology, for instance:

Etymology 1

Wherever Nemeth...

Etymology 1.1

Named after Abraham Nemeth.

Or instead, we could label it in the etymology.

Etymology

Wherever Nemeth comes from.

(for the eponym) Named after Abraham Nemeth.

Proper noun

1. A surname

2. Bla bla bla braille bla bla bla something else

Get the idea? This is very confusing. What do we do with etymologies within etymologies anyway? We especially need to worry about this for surnames and their eponyms as there are a ton of those I'm coming across. PseudoSkull (talk) 04:02, 16 April 2017 (UTC)

I'm sorry, technically this is not an eponym because it's still a proper noun in both definitions, but you get what I mean. PseudoSkull (talk) 04:03, 16 April 2017 (UTC)
If thing-X is named after surname-X, it might as well be under the same etymology, a lot like figurative senses of existing words. Being a separate sense doesn't make you a separate etymology: we already have sense lines for that. You get a separate ety if your derivation is separate. Equinox 05:22, 16 April 2017 (UTC)
@PseudoSkull: I second Equinox’s reasoning here. The eponymy can (and should) be explained in the etymology, more or less as in your last example of how to deal with Nemeth. — I.S.M.E.T.A. 20:31, 17 April 2017 (UTC)
I agree, with the reminder that the "named after Abraham Nemeth" bit goes in the etymology (or potentially as part of the definition: "A system of braille developed by Abraham Nemeth...") rather than as a context label: [4]. - -sche (discuss) 20:59, 23 April 2017 (UTC)

Flerd (Middle English)

Would someone fluent in Middle English provide a better translation of the quotation at flerd? Thanks. — SMUconlaw (talk) 18:36, 16 April 2017 (UTC)

@Smuconlaw, this sort of query belongs in the Tea room. Anyway, the translation was ungrammatical and missing a large chunk; I have fixed that, but I would still like somebody else to look it over and see if they can't improve it. @Leasnam, perhaps? —Μετάknowledgediscuss/deeds 06:33, 17 April 2017 (UTC)
Oops! Thanks. — SMUconlaw (talk) 06:43, 17 April 2017 (UTC)

Category:Russian words suffixed with -∅

What should we do with the -∅ part? It's not conventional. @Atitarev, Wikitiki89, Chuck Entz? — AWESOME meeos * ([nʲɪ‿bʲɪ.spɐˈko.ɪtʲ]) 01:54, 17 April 2017 (UTC)

No idea, I haven't seen any precedent. This was introduced without any prior discussion. Besides, transliterations of -∅ should be suppressed with "-". --Anatoli T. (обсудить/вклад) 02:48, 17 April 2017 (UTC)
For Dutch, we have Category:Dutch words suffixed with -en (denominative). However, the actual suffix -en is just the ending of the infinitive, which is the lemma form of verbs in Dutch. The stem of the word does not change, and this is illustrated by verb forms with no ending, such as the first-person singular present. adem (breath, noun) and adem (I breathe, verb) are the same. Because the derivation happens to be with a lemma form with a nonzero inflectional ending -en, things fit into our system neatly. However, derivations can also work in reverse in Dutch, by converting a verb stem to a noun without changing it. Thus, although it did not actually happen so historically, it's quite possible for adem to derive from ademen instead. Nouns have no inflectional ending in their lemma form, so the lemma is the stem. How would such a derivation be denoted, if not with a null suffix? Again, keep in mind that both this null derivation and -en are the same morphologically, neither of them changes the stem, the difference in form is only because of the choice of lemma forms. —CodeCat 20:49, 17 April 2017 (UTC)
Dutch -en and Russian -∅ are both inflectional suffixes, not derivational suffixes. Thus, they should not be mentioned in derivations. In other words, we should say things like "ademen is from adem" and "adem is from ademen" rather than "ademen is from adem + -en" and "adem is from ademen with -en removed". --WikiTiki89 16:39, 18 April 2017 (UTC)

Proposal: Remove "The essentials" from EL

Proposal:

Rationale:

  • That section to be deleted reads like a complete guide about "Language", "Part of speech" and "References", but it is too short and sometimes misleading, as said below. WT:EL already has three separate, more comprehensive and up-to-date sections for these items: WT:EL#Language, WT:EL#Part of speech and WT:EL#References.
  • The three separate sections mentioned above were completely created/revised by vote through the last year and a half: Wiktionary:Votes/pl-2015-12/Language, Wiktionary:Votes/pl-2015-12/Part of speech and Wiktionary:Votes/2016-12/"References" and "External sources". (Only "Ideophone" was recently added in the POS list without a vote.)
  • This statement is false, because for definitions, we use the attestation process, not references: "While we may be lax in demanding references for words that are easily found in most paper dictionaries, references for more obscure words are essential."
  • This statement is false, because we don't add references directly in the senses (apart from sometimes adding footnote links in the senses): "References may be added in a separate header of adequately chosen level or added directly to specific senses."

I already created a vote for this proposal in December 2015, but the vote never started:

Procedural notes:

  • I've been planning to propose starting that vote after revising the three actual, separate EL sections covered above. First, I created the three separate votes linked above and revised the three separate sections. Now I feel it's a good time to remove "The essentials" entirely. Feel free to agree/disagree/discuss.

Also, here's one past discussion about this, between Wonderfool and myself:

--Daniel Carrero (talk) 06:12, 17 April 2017 (UTC)

I added this vote in the list. It's going to start in 7 days. --Daniel Carrero (talk) 10:41, 22 April 2017 (UTC)

Please vote


Some votes are going to end pretty soon, and have 5 participants at most. Please vote before they end. It goes without saying that abstaining is fine too. Thanks in advance. --Daniel Carrero (talk) 02:05, 19 April 2017 (UTC)

Where does one file a complaint against an Admin?

Hi all, I've been clicking around but I can't find where to signal abuse of admin powers or unwiktionarianlike behaviour by an admin. Can someone point me in the right direction? Thanks. Great floors (talk) 08:35, 19 April 2017 (UTC)

Just post it here. Equinox 08:41, 19 April 2017 (UTC)
As Equinox has said, this is a good place, or you can bring it up directly with the person on their talk page (if appropriate). If you want to maintain anonymity you can email the OTRS service (info-en(a)wiktionary.org or see "Contact us" page on the left) and one of the volunteers there will look into the matter for/with you. - TheDaveRoss 11:02, 19 April 2017 (UTC)
Ach, don't encourage people to send even more spam our way! This user was reverted for adding a noun definition to an adjective, has been rather defensive about it, and wants to waste everyone else's time. I'd recommend not engaging any further. —Μετάknowledgediscuss/deeds 16:24, 19 April 2017 (UTC)
Even if this particular person is not in the right, this conversation might be the one which helps some future user find the right place to seek help. - TheDaveRoss 17:57, 19 April 2017 (UTC)

Rankings

Without any discussion whatever, G23r0f0i has taken it upon themself to remove ===Statistics=== sections and {{rank}} templates from English entries. I don't have any particular interest in rankings either way, but I do think widespread removal of a feature we've had for a long time needs to be discussed before it goes any further. —Aɴɢʀ (talk) 20:37, 19 April 2017 (UTC)

Seemingly, this user has a history of removing stuff like that. See their talk page. PseudoSkull (talk) 20:39, 19 April 2017 (UTC)
Sorry about that, I was removing too many in the same spurt. I should've carried on gradually removing them among other edits, hiding the deletion to avoid detection. It's just a bugbear I have, especially with blatant crap like having the rank template on the page def. And having on the page 4? Really? Useless!--G23r0f0i (talk) 20:43, 19 April 2017 (UTC)
I agree that there should be some discussion, I also agree that they should be removed. The source of the material was not well sanitized (as can be seen by the rank of "Gutenberg") which makes it not only obsolete but also misleading. We should remove the rank data until and unless we can create something which is meaningful. - TheDaveRoss 20:44, 19 April 2017 (UTC)
I don't know about you, but I use the word Gutenberg nearly as often as the. --WikiTiki89 21:02, 19 April 2017 (UTC)
Of course, just not in mixed company. - TheDaveRoss 21:32, 19 April 2017 (UTC)
@G23r0f0i: If you think that Template:rank should be deleted, then propose that. But don't just remove it haphazardly: that defeats the purpose of a deletion discussion and it also makes it impossible for other users to see how a template is actually being used. —Justin (koavf)TCM 22:49, 21 April 2017 (UTC)
It has been proposed, a long time ago. So far a supermajority of people contributing to the RFD support deletion, but it's never been closed. —Μετάknowledgediscuss/deeds 23:05, 23 April 2017 (UTC)
Maybe the user was being facetious, but hiding removal of the template would be worse than performing edits that are more clearly a removal of the template. Fortunately, the user is now blocked, so this proposed action will not happen. — Eru·tuon 22:02, 24 April 2017 (UTC)
If we do decide to include rankings, we should at least automate them in Module:Rankings or some other centralized place because word frequencies change over time. —Aryamanarora (मुझसे बात करो) 22:31, 21 April 2017 (UTC)
@Aryamanarora: I agree. It would be so much easier to generate them with a module than to manually write them out as is currently done. — Eru·tuon 21:52, 24 April 2017 (UTC)
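
To make the idea of generating ranks automatically more concrete, here is a small Python sketch. It is not the proposed Module:Rankings (which would be written in Lua and does not exist in this form); the function name, the `exclude` parameter and the sample counts are all hypothetical. The `exclude` set stands in for the sanitising step that TheDaveRoss says the original data lacked.

```python
def rank_table(frequencies, exclude=()):
    """Map each word to its frequency rank (1 = most frequent).

    frequencies: dict of word -> occurrence count in some corpus.
    exclude: words to drop before ranking, e.g. corpus boilerplate.
    Ties are broken alphabetically so the output is deterministic.
    """
    excluded = set(exclude)
    kept = [(word, count) for word, count in frequencies.items()
            if word not in excluded]
    ordered = sorted(kept, key=lambda item: (-item[1], item[0]))
    return {word: rank for rank, (word, _) in enumerate(ordered, start=1)}


# Invented counts, for illustration only.
counts = {"the": 1000, "Gutenberg": 900, "of": 800, "and": 800}
ranks = rank_table(counts, exclude={"Gutenberg"})
```

Because the ranks are derived from the counts each time, updating the underlying frequency data would update every entry at once, which is the advantage over writing ranks out by hand.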

User:CodeCat for admin again.

She seems to need admin tools still, and heck, I don't see why not. PseudoSkull (talk) 00:57, 22 April 2017 (UTC)

I'm in favor as long as (1) User:Wyang is also re-sysopped, and (2) the two of them agree not to wheel-war with each other anymore. —Aɴɢʀ (talk) 10:17, 22 April 2017 (UTC)
No offence, but I do not think CodeCat has the right temperament to be an admin; case-in-point, the example above. She's very "delete first, f**k you and your questions later". --Victar (talk) 16:06, 22 April 2017 (UTC)
I like her shake-things-up attitude, I think her contributions are largely for the better, and her not having the sysop tools is a waste of time. I agree with Angr's caveats, however. --Barytonesis (talk) 18:07, 22 April 2017 (UTC)
@Barytonesis: Do you have a previous account? Your account looks to be only a few months old. --Victar (talk) 18:15, 22 April 2017 (UTC)
Yes. Why? --Barytonesis (talk) 18:17, 22 April 2017 (UTC)
Well, obviously the quality of your opinion is only worth your experience on Wiktionary. --Victar (talk) 18:19, 22 April 2017 (UTC)
As far as I'm concerned, both CodeCat and Wyang are still admins until there's a vote to change that. The reason they're not allowed to act as admins is because 1) reinstating just one would be taking sides and 2) I don't have a commitment from both that neither will resume their side of the wheel war. In other words, Angr's statement is basically what I've said before and will continue to say until either the community decides otherwise or the dispute is resolved. Chuck Entz (talk) 19:29, 22 April 2017 (UTC)
It's important to the project that both of them have their admin tools, and we should restore them at once. If problems crop up again in the future, it is a simple matter to remove the admin bits. There is no risk in this, so we should restore their powers now. —Stephen (Talk) 14:07, 23 April 2017 (UTC)
What will happen if I restore Module:links to its correct state and Wyang undoes it again, like before? Which version has consensus? This needs to be sorted. —CodeCat 14:40, 23 April 2017 (UTC)
Wyang has still been breaking modules though, and having to fix them when someone complains. DonnanZ (talk) 15:34, 23 April 2017 (UTC)
You could start a vote to decide whether phonetic_extraction should or shouldn't be present in the links code. You will have to re-explain your views as to what's going on and why it should go; I don't recall what the argument was any more. I do think it would be helpful if you agree not to remove phonetic_extraction from Module:links, and Wyang agrees not to add it back in if someone else removes it. This is pretty much what Chuck is asking for. Benwing2 (talk) 16:11, 23 April 2017 (UTC)
There you have it, Stephen. You say 'there's no risk in it' but CodeCat's first question is basically an announcement of seamlessly continuing the edit war given the chance. Which isn't surprising, since the matter was never settled, nor did any of them concede. They were just stopped from acting. Those knowledgeable [read this as: Other people than CodeCat and Wyang] must decide to handle those Asian languages one way and not the other, full stop, before either can be reällowed to use the admin tools or we'll have an eternally repeating history. Korn [kʰũːɘ̃n] (talk) 10:59, 25 April 2017 (UTC)
When I said no risk in it, I meant that any files they edit can be returned to a previous state if we don't like the edits, and it only takes a few seconds to do it. It's not as though they could actually do any damage. I don't see CodeCat's question as a threat ... I think it's a good question. It's been over eight months since the wheelwars and I have forgotten what the dispute was. Maybe someone could explain the disagreement and the community could make a decision. —Stephen (Talk) 22:47, 27 April 2017 (UTC)
Two thoughts: 1. Yes, anything they change can be returned to a previous state. Which is what they did. Constantly. We call that an edit war. 2. Frankly, as much as I agree that CodeCat is useful as an admin and stellar as an editor, basically everything a user could aspire to be here, the hardcore stubbornness and willingness to do the very thing she was given the tools to stop isn't really an advertisement of the diplomatic skills and insight I as a user would hope to find in those in positions of judgemental power. If we solve the issue at hand, we've probably not solved the actual issue. Sorry, CodeCat. Korn [kʰũːɘ̃n] (talk) 10:50, 28 April 2017 (UTC)

Category:en:Scouting

This has been discussed briefly at Wiktionary:Tea room#Category:en:Scouting, but something must seriously be done. There are so many new scouting terms I and other users have added, and I believe it deserves a category. What I propose we do is make it so that when the label "scouting" is used in Template:lb, it automatically adds it to Category:en:Scouting. Examples of recent entries include merit badge (and I still can't believe this wasn't already here as it's such a well-known scouting concept), merit badge university, Philmont, Philmonter, Eagle, Venturer Scout, Totin' Chit, MBU, PTC, minibear, merit badging, and many more and there are still a lot more to cover. PseudoSkull (talk) 00:57, 22 April 2017 (UTC)

I don't know how/where to do it, or I would. Equinox 16:27, 22 April 2017 (UTC)
First, you need to decide which parent category it would best be placed under. Then, go to that category, click the "Edit category data" button, and in the module code, insert a definition for scouting (the list is alphabetical). —CodeCat 17:49, 22 April 2017 (UTC)
@PseudoSkull, Equinox, CodeCat: Unless I'm mistaken, I believe this and this have instituted what is desired. — I.S.M.E.T.A. 11:26, 25 April 2017 (UTC)
If it works, then I'd say so. But why the capital letter? —CodeCat 13:16, 25 April 2017 (UTC)
@CodeCat: I took my lead from the word's capitalisation in the label for the definition of merit badge. — I.S.M.E.T.A. 13:31, 25 April 2017 (UTC)
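
The mechanism requested in this thread, where a context label automatically puts the entry in a topical category, can be pictured with the following Python sketch. The real data lives in Lua modules on Wiktionary; the table and function names here are hypothetical, and only the "Scouting" label and the resulting category name come from the discussion.

```python
# Hypothetical lookup table: context label -> topical category name.
# The capitalised label follows the usage at "merit badge" mentioned above.
TOPICAL_LABELS = {
    "Scouting": "Scouting",
}


def categories_for_labels(lang_code, labels):
    """Return the topical categories implied by an entry's context labels."""
    return [
        "Category:" + lang_code + ":" + TOPICAL_LABELS[label]
        for label in labels
        if label in TOPICAL_LABELS
    ]


# A definition tagged with the labels "Scouting" and "informal":
result = categories_for_labels("en", ["Scouting", "informal"])
```

Labels without an entry in the table, such as "informal" in this sketch, simply produce no topical category.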

Category:English clippings & Category:English short forms

What's the difference? "short form" is given as a synonym of "clipping"... --Barytonesis (talk) 23:55, 22 April 2017 (UTC)

@Barytonesis: Short form sounds more professional than clipping to me. — I.S.M.E.T.A. 11:30, 25 April 2017 (UTC)
@I'm so meta even this acronym: Clipping is the linguistic technical term, though. —Aɴɢʀ (talk) 11:36, 25 April 2017 (UTC)
@Angr: It sounds to me like clipping is the process which creates short forms. — I.S.M.E.T.A. 12:26, 25 April 2017 (UTC)
The term refers both to the process (as a mass noun) and to the form so created (as a count noun). See the first sentence of the second paragraph of the Wikipedia article, for example. —Aɴɢʀ (talk) 12:41, 25 April 2017 (UTC)
@Angr: *grumbles* Fine, I suppose. I can get with that programme. Clipping it is. — I.S.M.E.T.A. 12:50, 25 April 2017 (UTC)
@I'm so meta even this acronym: Although I disagree with you that "short form sounds more professional" and would argue precisely the opposite, I don't really care one way or the other. My concern is over why we have two categories for the same thing. --Barytonesis (talk) 12:41, 26 April 2017 (UTC)
"Short form" is a generic name for anything that's a shorter form. Not all short forms are clippings. Not all clippings are short forms. In some languages, like Russian, "short form" has a specific meaning in the morphology of adjectives (and it's not a clipping, but rather the lack of a suffix in the first place). We should probably get rid of the short forms category altogether, because it doesn't really mean anything useful. --WikiTiki89 14:12, 26 April 2017 (UTC)

IPA character replacements by NadandoBot

This bot seems to be responsible for replacing r with ɹ in every case. Is there any genuine reason for this, or is it a whim of the bot operator? One example is here. DonnanZ (talk) 08:20, 23 April 2017 (UTC)

As long as it's only replacing r with ɹ in English-language sections it's okay. Our convention is to use /ɹ/ for English; see Appendix:English pronunciation. —Aɴɢʀ (talk) 13:39, 23 April 2017 (UTC)
I can't see anything in that reference to indicate why. Other references like Wikipedia, Oxford and Cambridge never use ɹ, so why is Wiktionary the odd one out? DonnanZ (talk) 13:56, 23 April 2017 (UTC)
@Donnanz: Primarily because of Wiktionary:Votes/2008-01/IPA for English r. —Aɴɢʀ (talk) 14:49, 23 April 2017 (UTC)
Ah, I wasn't a registered user then, and a lot of the supporters of that vote aren't around any more. Maybe that vote should be rerun. There's one easy solution: don't enter any IPA with r in it, that way nothing will be altered. DonnanZ (talk) 15:00, 23 April 2017 (UTC)
In fact you should probably never edit any pages at all since your contributions can be changed by anyone at any time (maybe there's a name for this type of website?). DTLHS (talk) 15:46, 23 April 2017 (UTC)
I'm not bothered by that, and my edits are rarely reverted. But one can't be forced to do something in the knowledge that it will be undone by a bot, and of course you run the bot in question. DonnanZ (talk) 16:18, 23 April 2017 (UTC)
Nobody's forcing you to do anything, but all Wiktionarians share in a mutual agreement to follow consensus as determined by votes. If you don't like the vote, feel free to start a new one to repeal it. —Μετάknowledgediscuss/deeds 17:23, 23 April 2017 (UTC)
Almost all Wiktionarians, MK...--WF April 2017 (talk) 22:18, 23 April 2017 (UTC)
Maybe, maybe not. I'm not exactly getting any support for this. DonnanZ (talk) 10:33, 24 April 2017 (UTC)
For what it's worth, I'd be for using the approximant symbol /ɹ/ if it came up for vote again. The use of the generic trill symbol /r/ by most English dictionaries annoys me. — Eru·tuon 22:45, 24 April 2017 (UTC)
Is there any difference in pronunciation? DonnanZ (talk) 08:17, 25 April 2017 (UTC)
Strictly speaking, /r/ stands for a trill, like the Italian r or the Spanish rr, while /ɹ/ stands for the usual English r. However, most English-language dictionaries and phonology reference works use /r/, because it's easier on both typesetters and readers and because the two sounds do not contrast in English. (Some varieties of English, particularly some varieties of Scottish English, do use the trill.) If we were a monolingual English dictionary, I'd be in favor of /r/, but because we're a multilingual dictionary I think it's preferable for us to use /ɹ/ for English and /r/ for languages that actually have a trill. —Aɴɢʀ (talk) 11:09, 25 April 2017 (UTC)
@Donnanz: You can listen to that difference in pronunciation in these recordings for [ra ara] and [ɹa aɹa]. I agree with Erutuon and Aɴɢʀ here. — I.S.M.E.T.A. 11:40, 25 April 2017 (UTC)
Thank you, but unfortunately I can't hear the audio on either recording for some reason. DonnanZ (talk) 13:26, 25 April 2017 (UTC)
@Donnanz: OK. Try this YouTube video instead. — I.S.M.E.T.A. 13:34, 25 April 2017 (UTC)
No problem with that, thank you. DonnanZ (talk) 13:50, 25 April 2017 (UTC)
@Donnanz: You’re welcome. Can you hear the difference between them? And, if so, does it change your opinion on this at all? — I.S.M.E.T.A. 22:23, 25 April 2017 (UTC)
I must admit it has. I didn't connect with the meaning of "trill" before. I thought it might be something to do with rolling of Rs, which incidentally I do slightly, being a Southlander by birth. DonnanZ (talk) 22:53, 25 April 2017 (UTC)
@Donnanz: I see. So the phone might really be [r] in your idiolect. Perhaps it's Southland's populace's Scottish heritage. — I.S.M.E.T.A. 23:01, 25 April 2017 (UTC)
@Donnanz: Trill is what I'd mean when saying "rolled r", but I'm not sure what a Southland rolled r is, articulatorily speaking; I'm not sure if I've ever heard it. Would you by any chance be able to find a recording of it? — Eru·tuon 23:07, 25 April 2017 (UTC)
It's more of a drawn-out r, not a trill, some of my relatives were very strong with rolled Rs. Yes, both Otago and Southland have a strong Scottish heritage, but there were Irish and English settlers too, my great-grandfather was born in Stoke-on-Trent. DonnanZ (talk) 23:18, 25 April 2017 (UTC)
It's amazing to me that you're arguing for one over the other without knowing what the difference is! Ƿidsiþ 11:49, 25 April 2017 (UTC)

Category:Russian words suffixed with -∅ and the "null suffix"

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter What should we do with this category and the "null suffix" that it implies? There are many nouns in Russian that appear to be formed by taking the root of a verb and converting it directly into a masculine noun, without any apparent suffix (or alternatively, with a null suffix). In Proto-Slavic, these words would have normally had the suffix -ъ, but this no longer corresponds to anything in Russian. Cf. взгляд (vzgljad, glance), apparently derived from the root of взгляну́ть (vzgljanútʹ, to glance (at)), where -ну́ть (-nútʹ) is a verbal suffix that in most cases suppresses a preceding consonant (in this case, d), and the root взгляд- (vzgljad-) with the d is found in other derivatives such as взгля́дывать (vzgljádyvatʹ). In this case we could probably indicate that it comes from a Proto-Slavic word *vъzględъ, but there are also productive formations that don't go back to Proto-Slavic, e.g. водово́д (vodovód, aqueduct), clearly a calque on the Latin word and transparently composed of водо- (vodo-) (combining form of вода́ (vodá, water)) and -вод (-vod) (action noun derived from the verb води́ть (vodítʹ, to lead)). User:D1gggg has added many etymologies to such words; these are somewhat OK and somewhat broken and I'm trying to fix them up but I'm not quite sure how to proceed. Benwing2 (talk) 17:31, 23 April 2017 (UTC)

Oops, I see this was brought up only a week ago. But there was no resolution then. My main question is what's the proper way of formatting such etymologies. Benwing2 (talk) 17:33, 23 April 2017 (UTC)

Null suffix is:

  1. Almost not related to etymology at all
  2. Has little to nothing with Proto-Slavic, but invented by some linguist (AFAIK)
  3. Taught in regular schools at 1-3 grade
  4. Used by professional linguists "стирай (нулевой)" by Krylova Maria Nikolaevna PhD in Philological Science, Assistant Professor of the Professional Pedagogy and Foreign Languages Department
  5. Was in use for decades and (maybe) about centuries.

Anyone who claims non-existence of null suffixation is illiterate according to standard of the language. d1g (talk) 17:42, 23 April 2017 (UTC)

Null suffixes aren't controversial among linguists. The question is how best to handle them in our etymologies (which include word formation). Benwing2 (talk) 18:33, 23 April 2017 (UTC)
my suggestion is to use common sense and name things with their names "word by etymology" "words by morpheme/allomorphs" (or lemmas or something else) d1g (talk) 19:08, 23 April 2017 (UTC)
Null suffix is the right concept but there is nothing useful in explicitly showing it. The lack of anything after the stem is enough.--Anatoli T. (обсудить/вклад) 22:21, 23 April 2017 (UTC)
But how do you write this in an etymology using {{affix}}? —CodeCat 19:57, 24 April 2017 (UTC)
Why do you have to use {{affix}}? --WikiTiki89 20:06, 24 April 2017 (UTC)
How else? —CodeCat 20:26, 24 April 2017 (UTC)
"From вода́ (vodá, water)) and the root of води́ть (vodítʹ, to lead)", for example. —Aɴɢʀ (talk) 20:30, 24 April 2017 (UTC)
That etymology doesn't really tell me much. The etymology should look like that of Dutch ademen, with the base word and the affix used to derive it, both in their lemma form. If the affix has no orthographic representation in its lemma form, -∅ should do just fine to indicate this. —CodeCat 20:34, 24 April 2017 (UTC)
I disagree. We shouldn't give inflection suffixes such as Dutch -en in derivations. --WikiTiki89 20:37, 24 April 2017 (UTC)
It's not an inflectional suffix, it's a derivational suffix. They have separate sections on the page for -en. Words derived with -en follow a particular inflectional class, which clearly not all verbs with the infinitive ending -en follow. This is analogous to e.g. Latin -ō, which derives only first conjugation verbs. —CodeCat 20:43, 24 April 2017 (UTC)
More accurately, you are deriving a weak verb stem from an adjective or noun stem (without any alterations), and the -en happens to be the inflectional suffix of the lemma form of this new weak verb. --WikiTiki89 20:57, 24 April 2017 (UTC)
Yes, just as Latin -ō derives a verb stem whose lemma form has the ending -ō and that inflects as a first conjugation verb. In Dutch, verbs of all classes have the infinitive ending -en, but this is a historical accident caused by the merging of unstressed vowels. Just as historical accident caused former *-āō to become -ō, the same ending as third conjugation verbs. —CodeCat 21:02, 24 April 2017 (UTC)
My point is the verb stem is created without any affixation, and so there is no derivational suffix. Then -ō is added as an inflectional suffix to form the first person singular present, but that shouldn't be part of the etymology. --WikiTiki89 21:07, 24 April 2017 (UTC)
We indicate the combination of null suffix + lemma ending as -en. Likewise, the combination of null suffix + null lemma ending should be -∅. —CodeCat 21:20, 24 April 2017 (UTC)
That's exactly what I'm saying we shouldn't do. Are we clear now that our disagreement is real and not a miscommunication? --WikiTiki89 21:24, 24 April 2017 (UTC)
Our practice has always been to show the morphological derivation of a word. A null suffix is also part of that derivation. Consider a hypothetical case in which a Latin second declension noun is derived from a first declension noun. We'd denote this with + -us. Now, if in another case, a lemma without an inflectional ending is derived from a lemma with one, then you'd denote this in the same way, by showing the suffix + ending as usual. If both are null, then you need another way to indicate it; -∅ seems like a good way to do so. —CodeCat 21:30, 24 April 2017 (UTC)
Actually "our practice" has always been inconsistent about this. In the hypothetical case that a Latin second declension noun is derived from a first declension noun, it would be more useful to say exactly that (something like "from the first-declension noun X, converted to the second declension") than to say "X + -us". In the other hypothetical case where a lemma without an inflectional ending is derived from a lemma with one, the same applies, the inflectional endings should be ignored in the derivation. --WikiTiki89 21:39, 24 April 2017 (UTC)
As a linguist, I don't doubt the existence of null morphemes for a moment, but as a lexicographer, I do doubt the usefulness of showing them in etymology sections of a dictionary. I don't think our readers will benefit from being told that the plural deer is formed from the singular as deer + ∅ or that the past participle run is formed from the infinitive as run + ∅. —Aɴɢʀ (talk) 20:33, 24 April 2017 (UTC)
We don't indicate etymologies for inflections. —CodeCat 20:34, 24 April 2017 (UTC)
Fair enough. I don't think our readers will benefit from being told that the verb dog is formed from the noun as dog + ∅ or that the noun break is formed from the verb as break + ∅. —Aɴɢʀ (talk) 21:00, 24 April 2017 (UTC)
I don't know. I'm not sure how it should be indicated in etymologies, or how the categories should be structured or named, but it is unfortunate that terms that are derived from another part of speech without the addition of a morpheme are currently not categorized in any way. It's another derivational process, and it should be recognized for the sake of completeness. — Eru·tuon 00:08, 25 April 2017 (UTC)
I think I agree with CodeCat and Erutuon that we should maybe show the null morpheme -∅ in derivations. It's definitely a derivational process and IMO the categories are useful. Benwing2 (talk) 01:10, 25 April 2017 (UTC)
I feel more or less as Angr does. Ƿidsiþ 05:03, 29 April 2017 (UTC)
In взлёт etymology, I see взлете́ть (vzletétʹ) + -∅. That does not look like a true null suffix to me. In fact, the stem of взлете́ть (vzletétʹ) had to be modified to produce взлёт. Similarly, in Czech, nákup is probably derived from nakoupit but I would be surprised to see nákup etymology specified as nakoupit + -∅. The derivational process involved does not look like suffixation. Not all derivational processes consist in extension of something; blending, yielding e.g. smog, is an example. For Czech, I will oppose the use of -∅ for such a purpose until I find some convincing arguments. For Russian, if at least a slight majority of Russian contributors opt to oppose the use of "+ -∅", I will add my voice to them.
Can anyone recommend some good reading on "null suffix", written in English? I am not impressed by "Philological Science" (an oxymoron?) and credentialism. --Dan Polansky (talk) 10:11, 29 April 2017 (UTC)

Template:not a morph

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter This was created by User:D1gggg. It originally just said "not a morph" and now says "unlisted as a morpheme" (i.e. in a major reference book of Russian grammar that D1gggg likes to cite), which is more accurate, but I think this entire template is unhelpful and unnecessary. Opinions? Benwing2 (talk) 19:00, 23 April 2017 (UTC)

Orphan and delete.--Anatoli T. (обсудить/вклад) 22:12, 23 April 2017 (UTC)

Oldest quotation in Wiktionary, aka elephant poop

While doing my once-every-ten-year common-misspelling Wiktionary purge, I came across a quote from around 2 millenniums BCE at 𒄠𒋛. It brought me to the obvious question...do we have quotes that date back further than that? In a way, I kinda hope not, coz it would be cool if our record-breaking quote is all about elephant poop. --WF April 2017 (talk) 21:49, 23 April 2017 (UTC)

Wow, that's hilarious. I agree, it is a great quote to have as the earliest one on Wiktionary. — Eru·tuon 22:08, 23 April 2017 (UTC)
Methinks a certain person is back. DonnanZ (talk) 22:35, 23 April 2017 (UTC)
You don't say? Here I was thinking his username referred to Wallis and Futuna. —Μετάknowledgediscuss/deeds 23:04, 23 April 2017 (UTC)
LOL, he even welcomed himself on his talk page... Andrew Sheedy (talk) 23:40, 23 April 2017 (UTC)
  • Someone could easily find an older one for Sumerian, but not me. —Μετάknowledgediscuss/deeds 23:04, 23 April 2017 (UTC)
    Good point. I was thinking, isn't Sumerian older than Akkadian? — Eru·tuon 23:57, 23 April 2017 (UTC)
    • @Erutuon: Writing-wise, yes. The Akkadians adopted Sumerian characters. Those are the oldest writing system and the ancestor of all current writing systems other than the Chinese and possibly some deliberate scripts from recent times (that is, for conlangs). —Justin (koavf)TCM 21:35, 24 April 2017 (UTC)
I just added the following quote, I doubt anyone will get anything older. - TheDaveRoss 12:26, 24 April 2017 (UTC)
  • 13.82b BCE - The Universe
    Bang!
Possibly ∅, as in "Russian words suffixed with -∅"... Equinox 14:16, 24 April 2017 (UTC)
The 2nd millennium BC is of course a very wide range. The Rig Veda and Hittite inscriptions are also from that millennium. —Aɴɢʀ (talk) 14:36, 24 April 2017 (UTC)
13.82b BCE is before the Latin alphabet was adapted to write English, so I doubt that was the original script. --WikiTiki89 14:49, 24 April 2017 (UTC)

Hmm, I'd bet some of our Old Persian/Avestan/Sanskrit stuff approaches that. —Aryamanarora (मुझसे बात करो) 01:08, 25 April 2017 (UTC)

Having looked up some stuff, the Avestan Gathas are from 1200 BC, the Old Persian Darius inscriptions are from 550 BC, and the Vedas are 1700-400 BC... so I think there's some competition. —Aryamanarora (मुझसे बात करो) 01:08, 25 April 2017 (UTC)
Our Ugaritic quotations are c. 1400-1200 BCE. I'm sure we could find Akkadian, Sumerian, or Egyptian quotations from the 3rd millennium BCE. And maybe even an Egyptian quotation from the 4th millennium. --WikiTiki89 14:24, 25 April 2017 (UTC)
Here’s an Egyptian one from the 12th dynasty (1991-1802 BCE), probably the oldest one listed here so far. Vorziblix (talk) 05:12, 23 May 2017 (UTC)

Dialect labels

I've thought further about how to clean up our system of several different templates that create labels ({{label}}, {{term-label}}, {{accent}}, {{qualifier}}, {{alter}}). Besides the details of the way these templates display, the relevant difference is that {{label}} and {{term-label}} are used in definitions and headwords, and they frequently add categories, while the others do not. But all of these labels templates often refer to language varieties.

So, I propose that we have some centralized module for language varieties, and that we allow all these templates to refer to it. Currently, we have Module:labels/data/subvarieties, and all the dialect modules, like Module:en:Dialects (which are only used to label alternative terms).

Perhaps what would be easiest is to allow {{qualifier}} to use the labels in Module:labels/data/subvarieties, while, of course, not adding categories. Then, if you put the name of a language subvariety in {{qualifier}}, it can be automatically linked to a Wikipedia article.

This would, however, require that we add language codes to instances of {{qualifier}}. That way, the module can check that, for instance, the "Australian English" label is being used in an English entry, the language to which that language variety belongs. If no language code is provided in the template, then the language variety labels module will not be used.

Categories should not be added because {{qualifier}} is used in lists of terms related to the current entry somehow (for instance, Derived terms and Synonyms). If a term has a synonym in another variety of the language, it does not mean that the term's entry should be categorized under that variety.
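The proposal above can be sketched roughly as follows. This is only an illustration in Python, not the actual Lua modules: the `SUBVARIETIES` table is a hypothetical stand-in for Module:labels/data/subvarieties, and the function names, the two sample entries, and the output format are all assumptions made for the example. It shows the three points of the proposal: both templates read the same data, the language code gates the lookup, and only the label template returns categories.

```python
from typing import Optional

# Hypothetical stand-in for the shared subvariety data; the real module's
# structure and contents are not reproduced here.
SUBVARIETIES = {
    "US": {"lang": "en", "wikipedia": "American English", "category": "American English"},
    "Australia": {"lang": "en", "wikipedia": "Australian English", "category": "Australian English"},
}


def lookup(label: str, lang: Optional[str]) -> Optional[dict]:
    """Return the subvariety entry only if a language code is given and matches."""
    if lang is None:
        # No language code supplied: the subvariety data is not consulted.
        return None
    entry = SUBVARIETIES.get(label)
    if entry is None or entry["lang"] != lang:
        # Unknown label, or a variety that belongs to a different language.
        return None
    return entry


def _display(label: str, entry: Optional[dict]) -> str:
    """Link recognised varieties to Wikipedia; leave free-form text untouched."""
    if entry is None:
        return f"({label})"
    return f"([[w:{entry['wikipedia']}|{label}]])"


def render_label(label: str, lang: str):
    """Definition-line label: displayed text plus the categories to add."""
    entry = lookup(label, lang)
    categories = [entry["category"]] if entry else []
    return _display(label, entry), categories


def render_qualifier(label: str, lang: Optional[str] = None) -> str:
    """Qualifier in term lists: same displayed text, but never any categories."""
    return _display(label, lookup(label, lang))
```

With this shape, a qualifier without a language code behaves exactly as a free-form qualifier does today, so existing uses would not need to change.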

I am not sure if I have the energy to put this idea into practice right now, but there it is. — Eru·tuon 02:16, 25 April 2017 (UTC)

  • {{label}} has been generally replaced by {{lb}}, and performs a different function to {{qualifier}} or {{q}}. DonnanZ (talk) 08:22, 25 April 2017 (UTC)
    • Right, {{lb}} is the abbreviated form of {{label}}. I am aware that {{label}} is used in different parts of entries. I have outlined the purposes of these templates in User:Erutuon/labels. — Eru·tuon 09:19, 25 April 2017 (UTC)
    • The area where the purposes of {{label}} and {{qualifier}} intersect is in giving a dialect, variety, or sociolect. {{label}} gives a dialect in the context of definitions, {{qualifier}} elsewhere. See, for instance, Wikisaurus:bathroom, which tags synonyms of the word using {{qualifier}}. And see pissed, which uses {{label}} to indicate which dialects the two senses are found in. Both templates should draw from the same label data, since they are identical apart from what situation they are found in. {{qualifier}} need not be free-form; it would be better to standardize it like {{label}}. — Eru·tuon 09:39, 25 April 2017 (UTC)
    • To give an example, the entry for pissed might contain the definition {{lb|en|US}} [[angry]], while the entry for angry might contain, in its Synonyms section, {{l|en|pissed}} {{qual|US}}. In this way, the two templates are related to each other and can contain similar content. — Eru·tuon 09:49, 25 April 2017 (UTC)
  • It would be good for general data consistency, but not sure if it adds a lot of benefits given the effort. I'm still hoping that at some point we can avoid passing redundant lang parameters everywhere, and maybe defer this decision until then. I'm cleaning up the mess from the {{qualifier}} template conversion; there does seem to be some confusion around the usage of the two templates. How about leaving {{qualifier}} freeform and adding a specialised {{dialect}} template? – Jberkel (talk) 09:16, 25 April 2017 (UTC)
    • I concur that {{dialect}} would be a better place for this. I like having {{qualifier}} be general purpose and with no-frills. —JohnC5 03:06, 27 April 2017 (UTC)
      • I sort of like the idea of having a dedicated template, but adding another qualifier template would also be a headache for editors. It would require us to go through thousands of entries replacing {{qualifier}} with {{dialect}} whenever it mentions language subvarieties. It would enforce a strict distinction: you can never use {{qualifier}} when mentioning a dialect name. And then you would have to choose between two templates specifically dedicated to language subvarieties: {{accent}}, which is used in pronunciation sections, and {{dialect}}, which is used everywhere else except in headwords and definitions. It seems to me much easier to just loop {{qualifier}} into the existing language subvariety labels module. Editors don't have to change their ways, but their chosen qualifiers will now be modified in the same way that content supplied to {{label}} is modified. (I suppose the only potential change is to require {{qualifier}} to be supplied with a language code, but that may not be necessary.) — Eru·tuon 03:42, 27 April 2017 (UTC)
  • I agree with making a data module specifically for varieties. Presumably, it would replace most of what is currently in Module:etymology languages. —CodeCat 13:21, 25 April 2017 (UTC)

Using Wikidata to store alphabets

This is a possible use case for Wikidata. @Lea Lacroix (WMDE), could you please check if Wikidata is able to do this?

This is the English alphabet:

  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz

I don't speak Turkish, but apparently this is the Turkish alphabet:

  • Aa, Bb, Cc, Çç, Dd, Ee, Ff, Gg, Ğğ, Hh, Iı, İi, Jj, Kk, Ll, Mm, Nn, Oo, Öö, Pp, Rr, Ss, Şş, Tt, Uu, Üü, Vv, Yy, Zz

Will we be able to store alphabets in Wikidata and query Wikidata for questions like this?

  • Is "H" in the English alphabet? (answer: yes)
  • What is the position of "H" in the English alphabet? (answer: 8th letter)
  • What is the 9th letter of the Turkish alphabet? (answer: capital Ğ, small ğ)
  • What are all the alphabets that use the letter "A"? (answer: a really large list of alphabets)
  • Can we store letter names too? I'd like Wikidata to answer this: What is the name of the letter "B" in English? (answer: "bee")

Thanks in advance. --Daniel Carrero (talk) 11:08, 25 April 2017 (UTC)
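To make the questions above concrete, here is a minimal sketch of the data shape and lookups they imply. It is plain Python over an in-memory table, not a Wikidata query, and every name in it (`Letter`, `make_alphabet`, `position`, and so on) is hypothetical. Only the letter name "bee" is filled in, since that is the one name given above.

```python
import unicodedata
from typing import NamedTuple, Optional


class Letter(NamedTuple):
    upper: str
    lower: str
    name: Optional[str] = None


def make_alphabet(pairs: str, names: Optional[dict] = None) -> list:
    """Build an ordered alphabet from space-separated capital/small pairs."""
    names = names or {}
    letters = []
    for pair in pairs.split():
        # Normalise so that each letter is a single precomposed character.
        upper, lower = unicodedata.normalize("NFC", pair)
        letters.append(Letter(upper, lower, names.get(upper)))
    return letters


ALPHABETS = {
    "English": make_alphabet(
        "Aa Bb Cc Dd Ee Ff Gg Hh Ii Jj Kk Ll Mm Nn Oo Pp Qq Rr Ss Tt Uu Vv Ww Xx Yy Zz",
        names={"B": "bee"},
    ),
    "Turkish": make_alphabet(
        "Aa Bb Cc Çç Dd Ee Ff Gg Ğğ Hh Iı İi Jj Kk Ll Mm Nn Oo Öö Pp Rr Ss Şş Tt Uu Üü Vv Yy Zz"
    ),
}


def position(alphabet: str, char: str) -> Optional[int]:
    """1-based position of a letter, or None if the alphabet lacks it."""
    # Compare against the stored forms directly: generic case mapping would
    # pair Turkish I with i rather than with dotless ı.
    for index, letter in enumerate(ALPHABETS[alphabet], start=1):
        if char in (letter.upper, letter.lower):
            return index
    return None


def letter_at(alphabet: str, n: int) -> Letter:
    """The n-th letter (1-based) of an alphabet."""
    return ALPHABETS[alphabet][n - 1]


def alphabets_using(char: str) -> list:
    """Names of all stored alphabets that contain the letter."""
    return [name for name in ALPHABETS if position(name, char) is not None]


def letter_name(alphabet: str, char: str) -> Optional[str]:
    """The letter's name in that language, if one has been recorded."""
    index = position(alphabet, char)
    return None if index is None else ALPHABETS[alphabet][index - 1].name
```

Storing each capital/small pair explicitly, rather than deriving one from the other, is what lets the Turkish dotted and dotless I be handled without special cases.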

Note that there are pages for characters: d:Q9992, so at least part of the data should already be there. — Dakdada 12:36, 25 April 2017 (UTC)
Also for alphabets, with the order: d:Q754673. — Dakdada 12:42, 25 April 2017 (UTC)
Hello @Daniel Carrero and thanks for these interesting questions. As @Darkdadaah mentioned, items already exist for alphabet letters (maybe not all, and not perfectly described).
To describe the alphabet letters, as concepts, items and their statements work well. For example A (Q9659) has a statement "part of: English alphabet". This could probably be improved by adding a qualifier to indicate that it's the first letter of this alphabet. Then one could query the position of the letters and answer some of your questions.
"a" will also exist as a lexeme, several lexemes in that case, to describe the use of this letter as a word, in English, French, and other languages that use this word. In the lexeme "a (English)" one will then describe the lexical category, several forms and senses, that will answer other questions.
I tried a few queries and realized that a lot of letters are still missing or need to be better described in Wikidata: the list of letters that are part of the English alphabet is almost empty, while the list of letters that are part of the Latin script has far more results. Seems like work to do together with the Wiktionary and Wikidata communities :)
I hope this answers your questions. Lea Lacroix (WMDE) (talk) 10:15, 26 April 2017 (UTC)
Thank you for your response. :) I agree that it seems like a work to do together with Wiktionary and Wikidata community. --Daniel Carrero (talk) 21:27, 27 April 2017 (UTC)
@Lea Lacroix (WMDE), you will find all sorts of alphabetic information in Category:Script appendices. Some examples are: Appendix:Latin script/alphabets, Appendix:Cyrillic script, and Appendix:Arabic script. —Stephen (Talk) 22:31, 27 April 2017 (UTC)

Splitting WT:RFV

RFV is consistently so long as to be unwieldy to edit, just by dint of how long it takes to load. A good solution would be to split it into a page for English terms and a page for non-English terms (though I am not sure what those pages should be called). I was evidently not the first person to suggest this idea, but I'm bringing it here to encourage more discussion of it. —Μετάknowledgediscuss/deeds 07:26, 26 April 2017 (UTC)

  Support - TheDaveRoss 11:58, 26 April 2017 (UTC)
  Support (I have also indicated my support at "Wiktionary talk:Requests for verification"). — SMUconlaw (talk) 12:14, 26 April 2017 (UTC)
  Support - I don't usually have issues with loading, but certainly if it helps out Leasnam (talk) 18:19, 26 April 2017 (UTC)
I wonder what the ratio of English to non-English would be. The problem is that some RFVs sit there for a long time, probably because no one knows how to confirm or reject them. DonnanZ (talk) 12:45, 26 April 2017 (UTC)
Another idea would be to highlight some of the oldest RFVs on the "Recent changes" page in the same way as wanted entries are. DonnanZ (talk) 13:10, 26 April 2017 (UTC)
I think I'd support this. It doesn't feel too hacky/arbitrary to split them, either, since English has special status on en.wikt. Equinox 13:53, 26 April 2017 (UTC)
Very much support this, that page is the length of a book at this point. — Kleio (t · c) 13:55, 26 April 2017 (UTC)
A few things I want to point out: If we split the pages, we would have to be stricter about enforcing language codes in the {{rfv}} and {{rfv-sense}} templates. Otherwise the automatically generated links from those templates (most importantly, the "+" link) will not work. Secondly, much of the slowness of page loading has to do with too much JavaScript. If we can reduce the amount of JavaScript that runs when you load a page, then problem solved. Other than that, I have nothing against splitting the page. --WikiTiki89 14:11, 26 April 2017 (UTC)
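The first point can be illustrated with a small sketch of why the language code becomes mandatory once the page is split. This is Python rather than template code, and the two subpage titles and both function names are assumptions made for the example. The `action=edit`, `section=new` and `preloadtitle` URL parameters are standard MediaWiki ones.

```python
from urllib.parse import urlencode

# Hypothetical subpage titles, used here only for illustration.
RFV_ENGLISH = "Wiktionary:Requests for verification/English"
RFV_NON_ENGLISH = "Wiktionary:Requests for verification/Non-English"


def rfv_page(lang_code):
    """Pick the discussion page for a request; fail loudly without a code."""
    if not lang_code:
        # With a single page this could be ignored; with two pages there is
        # no way to know where the "+" link should point.
        raise ValueError("a language code is required to route the request")
    return RFV_ENGLISH if lang_code == "en" else RFV_NON_ENGLISH


def rfv_add_link(lang_code, term):
    """URL that opens a new section for the term on the appropriate page."""
    query = urlencode({
        "title": rfv_page(lang_code),
        "action": "edit",
        "section": "new",
        "preloadtitle": term,
    })
    return "https://en.wiktionary.org/w/index.php?" + query
```

Raising an error, rather than silently defaulting to one of the pages, matches the point that missing codes would have to be found and fixed.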
It would help if we could do something about abuses like WT:Requests for verification#Compounds with quis. This two-part section, by itself, is 3% of RFV. Its sheer volume is compounded by the fact that it's going to hang around longer because no one is going to read it. Perhaps we could banish it to a subpage with a TLDR message in its place? Chuck Entz (talk) 14:25, 26 April 2017 (UTC)
The fact that it's only 3% of RFV shows that it's not the real issue. I agree it's annoying though. Since it's Latin perhaps User:Metaknowledge is willing to take care of it. --WikiTiki89 14:29, 26 April 2017 (UTC)
Why not just close most of the unresolved RFV discussions as "failed"? The intro text says specifically: "After a discussion has sat for more than a month without being 'cited', or after a discussion has been 'cited' for more than a week without challenge, the discussion may be closed." The RFV discussions will be moved to the entries' talk pages, which will be accessible in case someone in the future wants to try attesting them. --Daniel Carrero (talk) 14:46, 26 April 2017 (UTC)
Because that's prioritising cleaning out RFV over making Wiktionary better. Every RFV'ed term deserves a serious attempt at being cited by someone who is competent with the language in question, and if that takes more than a month (because of people who speak more obscure languages being slow to respond to pings, etc), then so be it. If we close it as failed, nobody is likely to return to check if the term is okay for a very long time. (This is the same thing as terribly unformatted new entries by anons; we should fix them up or ping people who can rather than delete them outright because they broke all our layout rules.) —Μετάknowledgediscuss/deeds 18:12, 26 April 2017 (UTC)
  • @TheDaveRoss, Smuconlaw, Equinox, KIeio, Chuck Entz: How does WT:Requests for verification/English and WT:Requests for verification/Non-English sound? Who is willing to update {{rfv}} and {{rfv-sense}} and force them to take language codes? —Μετάknowledgediscuss/deeds 18:16, 26 April 2017 (UTC)
    • Keep the main page for non-English. Just like we have WT:Requests for deletion and WT:Requests for deletion/Other. —CodeCat 19:00, 26 April 2017 (UTC)
      • I would prefer having the main page serve as a directory pointing to the other two. Otherwise, the main page should be for English, while all other languages get a separate page. --WikiTiki89 19:08, 26 April 2017 (UTC)
        @Wikitiki89: Is there a way that we could ensure that everyone who watchlisted the old page can also automatically watchlist the new ones if it is split directory-style, by moving it? —Μετάknowledgediscuss/deeds 07:10, 27 April 2017 (UTC)
        @Metaknowledge: Yes, all you have to do is before you create the new pages, move the current RFV page to the two new pages' locations and then move it back. --WikiTiki89 11:45, 27 April 2017 (UTC)
        Should the non-English page be subdivided with headings for different languages, like ==Japanese==, ==Latin==, ==Thai==, and so on? As for updating {{rfv}} and {{rfv-sense}}, perhaps someone very experienced with templates like @Erutuon or @JohnC5 can assist. — SMUconlaw (talk) 15:28, 27 April 2017 (UTC)
        I think the non-English page should continue to function as RFV does now. It would be annoying to have to add a header for each individual language that has only one RFV'd term, even if it might be useful when a whole batch of terms is RFV'd from one language. --WikiTiki89 15:36, 27 April 2017 (UTC)
I like having everything on one page because it forces us to deal with the buildup of requests when the page gets too big. If we split the page I would hope it would not result in less attention to the non-English section. DTLHS (talk) 18:20, 26 April 2017 (UTC)
  •   Support Quicker loading and saving would make the review/discussion part of the verification process easier to participate in and might speed resolution.
Also, strict time limits, eg, 3/4/6 months, for English RfVs are much easier to justify than the same limits for languages that have fewer contributors. DCDuring (talk) 18:46, 26 April 2017 (UTC)
@WikiTiki89: JavaScript shouldn't be the problem or not be the only problem - even when disabling it, the page takes some time to load.
@Chuck Entz: No, it's no abuse and the text shows that it's a justified RFV. Of course, maybe one could correct entries without having an RFV, but then guys like you could roll it back without reason. Or one could put it shorter like just having the line "For the feminine quaequam and the plural", but then guys like you could complain that no reason is given or that the person maybe didn't search enough and ignore it. Your "because no one is going to read it" might be correct and the length might be a problem for some, but the length isn't the only problem and it's the lesser problem. Another problem is the language. WT:Requests for verification#quintus from 11th October 2016 is very short but about Latin too, and there was not a single reply. WT:RFV#emodulo and WT:RFV#camminus even are from May 2016 - so almost one year old - and yet there were only two replies. And as for emodulo, even though some cites were provided, it lacks an analysis or translation to see whether or not the cites attest a certain meaning.
@Daniel Carrero/Μετάknowledge: Technically Daniel Carrero is correct. Those old RFVs would be RFV failed (and are sometimes even marked as RFV failed although they aren't archived), even though the terms could exist. But a problem with Daniel Carrero's suggestion would be that it's not easy to find those old somewhat unresolved discussions. A third way (besides having it RFV failed and placing it somewhere where nobody can find it, and having it at RFV for too long a time) could be to have it as RFV failed but to collect those discussions elsewhere where one can find them. There could be a subpage like WT:RFV/input needed linked from WT:RFV and then maybe someone sees those old entries and can give some further input.
@DCDuring: Formally by the rules - obviously not factually by practice - for English (and all other languages) it's one month without a reply or having cites. A criterion to have it like 1 month for some and e.g. 3 months for other languages could be WT:WDL/WT:LDL. But even 3 months for an LDL might often be not enough, while 1 year might be too long even for an LDL.
@CodeCat: It should be better to have two subpages like Appendix talk:List of protologisms#non-English. WT:RFV should contain the rules and links to the subpages and the subpages should contain the RFVs.
-84.161.19.68 19:24, 26 April 2017 (UTC)
I don't know if your Latin requests are appropriate for RFV. I think you could be more productive if you created an account and just started making changes. DTLHS (talk) 19:30, 26 April 2017 (UTC)
I'm not saying you shouldn't use rfv for these things, it's just that everything you post is literally ten times the size that even verbose people like me use to say the same thing. Giving us chapter and verse is bad enough, but you also seem to be throwing in several editions of the entire Bible, the kitchen sink and the chassis of a '57 Buick just to be on the safe side. It's worse than posting in Greek, because at least some of us can read Greek... Chuck Entz (talk) 04:29, 27 April 2017 (UTC)
  • Unlike some I find WT:RFV loads in an instant. But wading through the list is a different matter, and it needs to be shortened somehow. Splitting it can be tried, and if it doesn't work it can be reversed. DonnanZ (talk) 20:50, 26 April 2017 (UTC)
  Support. Like Donnanz, I don't have a big problem with it loading, but it is a pain to scroll through, and I'm much less likely to look at older discussions if I can't participate in half of them. I share DTLHS's concern though. Andrew Sheedy (talk) 01:58, 27 April 2017 (UTC)
  •   Oppose The splitting will do very little and will make it harder to add items since the adding person will have to pay attention to whether the item is English or non-English. --Dan Polansky (talk) 09:56, 29 April 2017 (UTC)

Twitter

Are tweets acceptable sources of citations? I can't recall seeing any before, but they would be useful sources for primarily oral languages like Scots or Swiss German. Any ideas for how a tweet should be cited, wording-wise? Ƿidsiþ 14:07, 27 April 2017 (UTC)

Isn't the Library of Congress archiving them now? Makes the durability issues less relevant. - TheDaveRoss 14:16, 27 April 2017 (UTC)
Is the Library of Congress archiving all Tweets? Otherwise we need a way to tell which ones are archived, and whether they will continue to be archived if they are deleted from Twitter. --WikiTiki89 14:53, 27 April 2017 (UTC)
This is just a recollection I have of news from a while back, not sure of any details. If they are archiving it would be especially great if we could link back to the archive rather than Twitter itself. - TheDaveRoss 15:03, 27 April 2017 (UTC)
Found this article. It seems we can't rely on it yet, but may be able to in the future. --WikiTiki89 15:23, 27 April 2017 (UTC)
How frustrating. It's an amazing source of data, but I can't see a solution to the problem that the original tweets might just be deleted. Ƿidsiþ 11:48, 28 April 2017 (UTC)
Can tweets be archived at the Internet Archive? — SMUconlaw (talk) 13:51, 28 April 2017 (UTC)
archive.org is not a reliable permanent archive, because a website can choose to have itself removed from it. --WikiTiki89 14:30, 28 April 2017 (UTC)

The strategy discussion. The Cycle 2 will start on May 5

The first cycle of the Wikimedia movement strategy process recently concluded. During that period, we were discussing the main directions for the Wikimedia movement over the next 15 years. There are more than 1500 summary statements collected from the various communities, but unfortunately, none from your local discussion. The strategy facilitators and many volunteers have summarized the discussions of the previous month. A quantitative analysis of the statements will be posted on Meta for translation this week, alongside the report from the Berlin conference.

The second cycle will begin soon. It's set to begin on May 5 and run until May 31. During that period, you will be invited to dive into the main topics that emerged in the first cycle, discuss what they mean, which ones are the most important and why, and what their practical implications are. This work will be informed and complemented by research involving new voices that haven’t traditionally been included in strategy discussions, like readers, partners, and experts. Together, we will begin to make sense of all this information and organize it into a meaningful guiding document, which we will all collectively refine during the third and last cycle in June−July.

We want to help your community to be more engaged with the discussions in the next cycle. Now, we are looking for volunteers who could

  • tell us where to announce the start of Cycle 2, and how to do that, so we can be sure the majority of your community is informed and has a chance to feel committed, and
  • facilitate the Cycle 2 discussions here, on Wiktionary.

We are looking forward to your feedback!

Base (WMF) and SGrabarczuk (WMF) (talk) 16:07, 27 April 2017 (UTC)

erroneous entries apparently generated semiautomatically

Apparently SemperBlotto created some or many erroneous German entries (see User_talk:SemperBlotto/2015#aufbringen and also einbringen), presumably using some kind of automated or semiautomatic process, since his user page says de-1. We should probably check them all and strongly discourage similar activities. Incorrect entries are much more harmful and waste far more time than missing entries or missing definitions. --Espoo (talk) 21:04, 28 April 2017 (UTC)

  • Both of those entries have entries in the German Wiktionary that are over ten years old. What's the problem? SemperBlotto (talk) 05:50, 29 April 2017 (UTC)
The problem is that you created at least these 2 entries whose content was erroneous or nonsense, and there's the question whether you did a lot of that. As i wrote on your talk page, since your user page says de-1, you probably didn't create the incorrect content of these articles, so it would be important to know what erroneous and unreliable source you used and whether you used a bot to create many other articles using that source. --Espoo (talk) 09:46, 11 May 2017 (UTC)
I think it is a bit melodramatic to call them "nonsense"; you changed "insert, introduce (into)" to "bring into, contribute", which is a fairly subtle change as far as it goes. "Insert" and "contribute" are close to synonymous in some senses, as are "introduce into" and "bring into". The tone of your statements is quite accusatory and negative; perhaps try to assume good faith. - TheDaveRoss 12:45, 11 May 2017 (UTC)
The definition "bring up" and its synonyms for "aufbringen" is pure nonsense and naive confusion of languages i.e. Denglish, and the definition "insert, introduce (into)" for "einbringen" is just as erroneous. "Insert" or "introduce" can never be used to translate "einbringen". Just because the words I replaced these naive and erroneous translations with can sometimes be synonyms of "insert" or "introduce" does not mean these can ever be used to translate "einbringen". For example, it would be nonsense or completely erroneous to use "insert" or "introduce" here: "bring sth into (contribute sth to) a marriage".
More importantly, i did assume good faith. I specifically said i don't think SemperBlotto made these nonsensical and naive translations up but copied them in good faith from an unreliable and bad source. That's why i suggested that we should strongly discourage similar activity before checking the reliability of such sources and try to find out what was copied from there. --Espoo (talk) 21:49, 11 May 2017 (UTC)

Words formed by respelling letters, e.g. deejay

Some words are formed by respelling letters: deejay, emcee, okay, Seabee. Is there a technical name for this? Equinox 19:22, 29 April 2017 (UTC)

I've heard they've started calling them double-you-effs on the streets. The letters stand for "Wiktionary's Free". --Celui qui crée ébauches de football anglais (talk) 20:59, 5 May 2017 (UTC)
Or Wishful Finkin’. — Ungoliant (falai) 21:24, 5 May 2017 (UTC)
This StackExchange thread mentions that the neologisms acronomatopoeia and vocologue have been proposed, but neither sees enough use to meet CFI, as far as I can tell. - -sche (discuss) 21:18, 5 May 2017 (UTC)
From this book: “English master of ceremonies is emcee as well as M.C.. Acronyms of this sort have been labeled syllable words (Marchand 1969; 368), but this distinctive term has not been widely accepted.” — Ungoliant (falai) 21:24, 5 May 2017 (UTC)
If there's no one-word term, we should come up with a phrase to describe it. "Words formed from (spelled-out) letter names"? "Words derived from letter names"? Something like that. It should have a category, anyway. — Eru·tuon 22:43, 11 May 2017 (UTC)
Written-out initialisms? —Aɴɢʀ (talk) 05:53, 12 May 2017 (UTC)
Heh. That's much conciser. I like it. — Eru·tuon 05:56, 12 May 2017 (UTC)

Wikidata interwikis for categories, templates, project pages...

@Lea Lacroix (WMDE), could you please check this use case for Wikidata?

I noticed that Wikipedia has the ability to use Wikidata to store interwikis for a number of namespaces, including categories, templates and the (site name) namespaces.

It seems Wikibooks, Wikinews, Wikiquote, Wikisource, Wikiversity and Wikivoyage can do the same thing, too.

I suppose Wiktionary will be able to use Wikidata for interwikis in some namespaces too?

Here are some pages that could use it:

Thanks in advance. --Daniel Carrero (talk) 23:36, 29 April 2017 (UTC)

This would only be practical if it handles all categories of a certain type the same way. If Category:French nouns and Category:English nouns are stored independently, implying that there will be one "nouns" data item for every possible language, then there's not much point. Also, using equivalences for templates won't work very well. Not all Wiktionaries have a concept of a headword line, for example. So, for example, de.wiktionary has nothing like our {{de-noun}}. Contrarily, many Wiktionaries have templates for generating headers, like de.wiktionary's {{Worttrennung}}, which have no English equivalent. —CodeCat 23:48, 29 April 2017 (UTC)
What about Category:English language, which has 143 interwikis? It would be nice if we centralized them all in a single place, to avoid having to use bots to update all the interwikis in all Wiktionaries every time something changes -- as in, if another Wiktionary creates a category for "English language" or if some category gets renamed. Obviously, I'm not saying the idea above would work for all categories, templates and pages, but that's because some of these pages won't even have interwikis in the first place. Point taken, {{Worttrennung}} does not have an English equivalent, which means it won't have English interwikis, stored in Wikidata or not. For the pages that have interwikis, Wikidata would help a lot.
Ideally, for categories/templates/pages repeated in many or all Wiktionaries, we should have a system to predict patterns like "Category:(language name) nouns" and "Category:(language name) language", but this probably can't work because maybe not all Wiktionaries have categories/templates/pages with consistent names. Even the English Wiktionary has inconsistently named templates like Template:fa-interjection and Template:la-interj. --Daniel Carrero (talk) 02:30, 30 April 2017 (UTC)
This is the next task, described at d:Wikidata:Wiktionary/Development/Proposals/2015-05#Task 2: Switch on Phase 1 for Wiktionary. Presumably it will be seeded with the interwiki links that currently exist on non-mainspace pages. So if Template:de-noun has no interwiki links but ca:Template:de-nom links to English, the link will appear here. Afterwards, false and conflicting links should be fixed on Wikidata, as is done for Wikipedia. --Vriullop (talk) 12:15, 30 April 2017 (UTC)
phab:T158323. --Yair rand (talk) 05:54, 2 May 2017 (UTC)
Thanks all for your feedback! Indeed, it's the next step of the plan, and I hope we can properly announce it to the communities at the end of May, for a deployment in June. Let me know, via the ticket or on wiki, about any requests, concerns or questions. Lea Lacroix (WMDE) (talk) 12:39, 2 May 2017 (UTC)
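
The pattern-prediction idea raised in this thread ("Category:(language name) nouns", "Category:(language name) language") can be illustrated with a small sketch. It is only an assumption about how such matching might look, not how Wikidata or any Wiktionary tool actually handles interwikis; the `PATTERNS` table and the `classify` function are hypothetical names introduced here.

```python
import re

# Hypothetical pattern table: one regular expression per category "type".
# The two patterns are the ones mentioned in the discussion.
PATTERNS = {
    "nouns": re.compile(r"^Category:(?P<language>.+) nouns$"),
    "language": re.compile(r"^Category:(?P<language>.+) language$"),
}


def classify(title):
    """Return (kind, language name) if the title fits a known pattern, else None."""
    for kind, pattern in PATTERNS.items():
        match = pattern.match(title)
        if match:
            return kind, match.group("language")
    return None


print(classify("Category:French nouns"))
print(classify("Category:English language"))
print(classify("Template:fa-interjection"))
```

A naive matcher like this also shows the weakness pointed out above: it depends entirely on consistent naming, so a title such as "Category:English proper nouns" would be read as the nouns category of a language called "English proper", and inconsistently named pages fall through unmatched.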

Vote: Well documented languages and constructed languages

FYI, I created Wiktionary:Votes/pl-2017-04/Well documented languages and constructed languages.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 10:47, 30 April 2017 (UTC)

Discussion should take place before a vote is created. Where is it? —CodeCat 13:08, 30 April 2017 (UTC)
If I understand correctly, you are making the following policy proposal:
  • A vote page should never be created before a discussion about the vote subject already took place.
Is that a policy that you propose? Do you need help in drafting a corresponding vote containing your policy proposal? --Dan Polansky (talk) 13:14, 30 April 2017 (UTC)
This is a minor policy edit that does not seem to change any actual regulations. So, it does not need to be discussed first, as far as I'm concerned. There are probably a number of other situations where it's OK to create a vote without discussing it first. (Or, rather, Dan Polansky created this discussion, let's discuss it here if needed.) Our policies are out-of-date and incomplete enough as it is, to say the least, so let's encourage each other to create votes to edit them whenever we can. --Daniel Carrero (talk) 13:38, 30 April 2017 (UTC)
WT:VP. —CodeCat 14:35, 30 April 2017 (UTC)
"Specifically it is a policy think tank, working to develop a formal policy." (except the "Voting eligibility" section, which was the result of a vote and can't be easily changed) --Daniel Carrero (talk) 14:39, 30 April 2017 (UTC)
How about also including an option to add something like "Any accepted constructed language in the mainspace is considered a well documented language." at the constructed languages section of CFI right before the unordered list? Lingo Bingo Dingo (talk) 12:53, 1 May 2017 (UTC)
Currently, the proposal has "... and any other constructed language indicated as approved at Wiktionary:Criteria for inclusion#Constructed languages". Thus, it refers to CFI, as it should, IMHO. I am not clear how a constructed language can be accepted in the mainspace but not approved via that part of CFI. --Dan Polansky (talk) 13:07, 1 May 2017 (UTC)
Oh, I misunderstood. You seem to want to add a sentence at Wiktionary:Criteria_for_inclusion#Constructed_languages. About that proposal, I don't know. --Dan Polansky (talk) 13:22, 1 May 2017 (UTC)
That's correct, it's about an option to add a sentence at another page. But it's clearly related in purpose and subject and both changes would be very minor IMO. Lingo Bingo Dingo (talk) 14:02, 1 May 2017 (UTC)

Proposal: add rules about the start of votes in WT:VP

Prior discussion:

Proposal:

  • Add a few rules in WT:Voting policy about when exactly votes start after their creation.

Rationale:

  • I believe these to be unwritten rules that are already in effect. But, it's better to have written rules than unwritten ones, so we can learn by actually reading the policies rather than observing previous behavior and finding patterns. This should hopefully be helpful to new editors.

Rules to be added:

(feel free to propose any changes in these rules)

Starting the vote
  1. Once a vote is created, it should have a minimum 7-day waiting period before it starts, except when otherwise stated below.
  2. The start of a vote can be postponed as much as discussion requires.
  3. A vote for granting the rights of an administrator, checkuser or bureaucrat may start immediately after the recipient accepts.
  4. A vote for granting a bot flag may start immediately.

--Daniel Carrero (talk) 14:02, 30 April 2017 (UTC)

What about this simplification:

Starting the vote
  1. Once a vote is created, it should have a minimum 7-day waiting period before it starts, except for a vote for granting users rights or a bot flag, both of which can start immediately after the user accepts.
  2. The start of a vote can be postponed as much as discussion requires.

--Dan Polansky (talk) 14:18, 30 April 2017 (UTC)

That's an improvement, thanks. Support. --Daniel Carrero (talk) 14:21, 30 April 2017 (UTC)

Let's try some more shortening:

Starting the vote
  1. At least 7 days should elapse between the vote creation and the actual start, except for votes for granting user rights or a bot flag, which can start immediately after the user accepts.
  2. The start of a vote can be postponed as much as discussion requires.

--Dan Polansky (talk) 14:28, 30 April 2017 (UTC)

Fine by me, too. Support. --Daniel Carrero (talk) 14:31, 30 April 2017 (UTC)
It seems "before" should read "between" here: "At least 7 days should elapse before the vote creation and the actual start". Lingo Bingo Dingo (talk) 14:05, 1 May 2017 (UTC)
Corrected; thanks. --Dan Polansky (talk) 14:27, 1 May 2017 (UTC)
Oppose. If the rule about discussing before creating a vote is already wilfully ignored, what guarantee is there that these new rules won't also be ignored? WT:VP should be made a full policy before I can support this. —CodeCat 14:09, 1 May 2017 (UTC)
If you'd like, feel free to create a vote with the proposal "Make WT:VP a full policy." Personally, I wouldn't do it because I'd rather edit the policy piecemeal and review each of the rules instead. Maybe there's already consensus for a lot of rules there, but we don't have any guarantee that there is consensus for all of them. That may be true of pretty much all 173 Wiktionary think tank policies.
Even WT:EL is technically a "full policy" because this vote says you can't edit EL without a vote since 2012, but each rule in the policy was mostly unvoted then, and EL was shit. Today, EL is half-shit, maybe. --Daniel Carrero (talk) 15:08, 1 May 2017 (UTC)
But CFI is a gem, after all the cleaning. It glistens. (Half serious. Also, jewelry and asset management are nicer domains than sanitation services.) --Dan Polansky (talk) 15:16, 1 May 2017 (UTC)
Personally, I like WT:CFI. I think it reflects well the consensus where it exists. It may lack some unwritten rules, like the "hot word" rule, but it's pretty good overall.
In other words, when discussing an entry, we can proudly say: "We should keep the entry, it passes CFI." or "We should delete the entry, it fails CFI."
But, maybe it isn't quite accurate yet to say "This entry is well-formatted!" or "This entry is poorly-formatted!" only because of its compliance with WT:EL rules. It's still normal to say, basically, "The entry follows/doesn't follow some unwritten rules and perceived standards from other entries." --Daniel Carrero (talk) 17:06, 1 May 2017 (UTC)

I created Wiktionary:Votes/pl-2017-05/Starting votes. --Daniel Carrero (talk) 04:38, 5 May 2017 (UTC)
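
For illustration, the shortened "Starting the vote" rule discussed above can be expressed as a simple date check. This is a minimal sketch under stated assumptions, not part of the proposal: the vote-kind labels, the `earliest_start` function, and the choice to treat a missing acceptance time as the creation time are all hypothetical.

```python
from datetime import datetime, timedelta

# Waiting period taken from the proposed rule: at least 7 days.
WAITING_PERIOD = timedelta(days=7)

# Hypothetical labels for the vote kinds that the rule exempts.
EXEMPT_KINDS = {"user_rights", "bot_flag"}


def earliest_start(created, kind, accepted=None):
    """Return the earliest permitted start of a vote under the proposed rule.

    Assumption: for exempt votes with no recorded acceptance time,
    the creation time is used in its place.
    """
    if kind in EXEMPT_KINDS:
        # Exempt votes may start immediately after the user accepts.
        return max(created, accepted) if accepted is not None else created
    # All other votes wait at least 7 days after creation.
    return created + WAITING_PERIOD


created = datetime(2017, 4, 30, 14, 28)
print(earliest_start(created, "policy"))
print(earliest_start(created, "user_rights", accepted=datetime(2017, 5, 1, 9, 0)))
```

The second rule (the start can be postponed as much as discussion requires) needs no check of its own, since the function only gives a lower bound on the start time.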