Wiktionary:Beer parlour/2015/June

Good site?

Hello. Do you like Wiktionary? --Keyboard Masher (talk) 23:31, 2 June 2015 (UTC)[reply]

@Keyboard Masher: I do. Do you? —Justin (koavf)❤T☮C☺M☯ 04:53, 3 June 2015 (UTC)[reply]

Meh, it's OK. I'd enjoy it more if it came in more colours. --Keyboard Masher (talk) 08:05, 3 June 2015 (UTC)[reply]

We need a "smellyvision" version. SemperBlotto (talk) 06:01, 4 June 2015 (UTC)[reply]

Two words: Cleveland steamer. Chuck Entz (talk) 13:41, 4 June 2015 (UTC)[reply]

Yuck! --Hekaheka (talk) 20:16, 8 June 2015 (UTC)[reply]

Inuktitut characters

Would someone like to add entries for each of the characters used in Inuktitut words (those that we haven't already got)?

For instance the word ᑕᕝᕙ (tavfa) contains the characters ᑕ (ta) (which we have), and ᕝ (v) and ᕙ (fa) (which we haven't). SemperBlotto (talk) 16:30, 4 June 2015 (UTC)[reply]

Normalization of entries vote

Wiktionary:Votes/pl-2015-05/Normalization of entries started today. --Daniel 22:47, 4 June 2015 (UTC)[reply]

Pywikibot compat will no longer be supported - Please migrate to pywikibot core

Sorry for English, I hope someone translates this.
Pywikibot (then "Pywikipediabot") was started back in 2002. In 2007 a new branch (formerly known as "rewrite", now called "core") was started from scratch using the MediaWiki API. The developers of Pywikibot have decided to stop supporting the compat version of Pywikibot due to bad performance and architectural errors that make it hard to update, compared to core. If you are using pywikibot compat it is likely your code will break due to upcoming MediaWiki API changes (e.g. T101524). It is highly recommended you migrate to the core framework. There is a migration guide, and please contact us if you have any problem.

There is an upcoming MediaWiki API breaking change that compat will not be updated for. If your bot's name is in this list, your bot will most likely break.

Thank you,
The Pywikibot development team, 19:30, 5 June 2015 (UTC)

Your usage of English is unforgivable :) —suzukaze (t・c) 23:15, 9 June 2015 (UTC)[reply]

Votes on desysopping inactive admins

WF has created votes for the desysopping without prejudice of four sysops who have been wholly inactive for years. As these votes have largely escaped public notice, I've extended them for 10 more days so that more of the community can weigh in on whether or not to remove their bits. Please vote here: Wiktionary:Votes/sy-2015-05/User:Caladon for de-sysop, Wiktionary:Votes/sy-2015-05/User:Jun-Dai for de-sysop, Wiktionary:Votes/sy-2015-05/User:Celestianpower for de-sysop, Wiktionary:Votes/sy-2015-05/User:EivindJ for de-sysop. —Μετάknowledge^{discuss/deeds} 20:54, 5 June 2015 (UTC)[reply]

Has anyone tried contacting these admins for their input? bd2412 T 02:00, 6 June 2015 (UTC)[reply]

(Follow-up) I have posted messages on the talk pages of these four admins, and have sent e-mails to the three who have e-mail set up. bd2412 T 02:09, 6 June 2015 (UTC)[reply]

Entries for ISO codes

About having entries for language codes, like en or ang — and also ISO family/script/country codes. Can we have those, provided they are attested as usual, or was there some discussion or some issue preventing creating entries for them? Granted, one can predict that comparatively only a few, far from all codes would be attestable.

It's a bit difficult finding previous discussions on this subject, as I naturally can't search for "language code entries" or "language code" without seeing a thousand unrelated discussions, but here are some, all from 2010:

Wiktionary:Votes/2010-03/All ISO 639 codes to meet CFI as Translingual entries (failed vote)
Talk:jv (1 RFD, 1 RFV)
Wiktionary:Beer parlour/2010/June#ISO language codes (again)

I've tried my hand at attesting Citations:Latn meaning Latin script. What do you think, that's good enough that we can create the entry Latn? I tried to find citations where Latn is being used in running text, in accordance with WT:CFI#Conveying meaning. --Daniel 10:11, 6 June 2015 (UTC)[reply]

Case in point: We have Category:ISO 3166-1 (country codes) with 471 entries. --Daniel 11:58, 6 June 2015 (UTC)[reply]

I suppose it comes down to the individual attestability of every single code. Some might be attestable and others might not. Remember we don't even keep units of measure (like stupid zettakilograms or whatever) unless they are attested, even if they follow official naming rules. Equinox ◑ 10:16, 6 June 2015 (UTC)[reply]

I seem to think that after jv failed we deleted all ISO 639 codes, which is dubious because they didn't all fail RFV, just one or a couple of them. Renard Migrant (talk) 11:03, 7 June 2015 (UTC)[reply]

Contemporary Old High German

This is merely a mental exercise on a dogmatic question, but who knows, one day the Alemans could descend off their mountains into our dictionary, so give it a serious shot.
In the south of Switzerland, the local dialects

Do not feature final obstruent devoicing
Do not diphthongise PGM long vowels
Have long consonants
Have not lengthened short vowels in open syllables
Know at least five different vowel qualities in unstressed syllables (i, u, e, o, a)

So nobody can tell me that's not Old High German. At the same time, there are Alemannic dialects which have merged all unstressed vowels into /ə/. How they are not Middle High German is beyond me as well. Is it really sensible to list both as Alemannic rather than as living forms of OHG and MHG? Korn [kʰʊ̃ːæ̯̃n] (talk) 22:47, 7 June 2015 (UTC)[reply]

A language is not defined solely by sound changes, but by other things as well like grammar and lexicon. I would be more convinced by your argument if OHG were more intelligible to these speakers than Old Norse is to Icelanders. —CodeCa t 23:25, 7 June 2015 (UTC)[reply]

Pardon? I wasn't making an argument, I was asking a question. Korn [kʰʊ̃ːæ̯̃n] (talk) 13:20, 8 June 2015 (UTC)[reply]

Is it really sensible to follow the lead of professional scholars on the subject? I'm going with yes.--Prosfilaes (talk) 12:36, 8 June 2015 (UTC)[reply]

Category:Northern German - Category:Southern German

Neither of these have a definition. If they're to be kept, they should. Especially in the former category, many of these terms aren't actually restricted to Northern Germany, they're used everywhere. They might be more common in some regions than others but that's not what the regional label is for. -- Liliana • 15:32, 8 June 2015 (UTC)[reply]

Define "elsewhere" and please give examples. I'm highly baffled by your statement. (Which doesn't mean I'm not believing you.) Korn [kʰʊ̃ːæ̯̃n] (talk) 21:02, 8 June 2015 (UTC)[reply]

For example, the term Rummel is definitely not restricted to Northern Germany, it's used everywhere. Same with moin, it might have originated there, but it's used in the whole country nowadays. You can find terms like this in the other category as well: händisch is definitely not restricted to Southern Germany.
There is no line you can draw on the map to denote that north of it is Northern German and south of it is Southern German, unlike (say) Swiss and Austrian German which stop pretty much at the national borders. -- Liliana • 21:15, 8 June 2015 (UTC)[reply]

That's not a reason to ditch the labels, especially given how common they are in other dictionaries, including de.Wikt and the Duden. There's no single precise definition of the "Southern US", and not all of the terms used in Category:Southern US English are used in all of the same exact places. But if certain terms are widely perceived/agreed to be "southern German" or "Southern US", categorizing them as such can still be useful. - -sche (discuss) 21:50, 8 June 2015 (UTC)[reply]

Actually, "Southern US" is a very well defined region, it refers to very specific states. You can't make that claim for the German categories discussed here. -- Liliana • 22:03, 8 June 2015 (UTC)[reply]

But Southern US English is not restricted to the states considered the South. Indiana isn't the South, but the language of the southern half of Indiana is distinctly Southern. —Aɴɢʀ (talk) 14:48, 11 June 2015 (UTC)[reply]

You rediscovered the fact that language boundaries do not conform to political boundaries. Nevertheless, there is a more-or-less defined region, even if there is a gray area in between. --Wiki Tiki 89 15:18, 11 June 2015 (UTC)[reply]

Comment 1: de.Wikt also uses these labels (e.g. in de:kross, de:Obacht), often simply linked to de:süddeutsch (süddeutsch) and de:norddeutsch (norddeutsch). Does that provide sufficient definition?

Comment 2: A while ago, I started a discussion about the redundancy of having both Category:North American English/Category:North American French and Category:American English/French and Category:Canadian English/French. The decision was to reduce the "North American" label to an alias of "US, Canada" and deprecate its category. We already have an "Austrian" label, would it be better to deprecate these two labels in favour of other state- or dialect- specific labels? OTOH, a category for "Bavarian German" regional German could be considered confusing by some. (Compare how "Swiss German" regional German was renamed "Switzerland German" by me because some users, although not me, felt the former name was too confusing.)

- -sche (discuss) 21:26, 8 June 2015 (UTC)[reply]

That might indeed be better although I have no idea how to divide the regions. There definitely are terms that are used only in Bavaria and nowhere else (grüß Gott being perhaps the most famous example). We already have {{DDR}} for terms from East Germany. -- Liliana • 21:34, 8 June 2015 (UTC)[reply]

Two things: 1. If I ever catch a Bavarian saying 'moin', I'll smack some Grüß Gott into him. 2. To me Northern Germany always seemed very strictly defined as "Bundesländer with a sea coast" and more loosely as "areas where Low German happens". "Southern Germany" seems to be defined as areas where Alemannisch+Bairisch+Oberfränkisch happen. Isn't this how it's used 99% of the time, especially in linguistic context? Korn [kʰʊ̃ːæ̯̃n] (talk) 10:16, 9 June 2015 (UTC)[reply]

And the middle states are what, nothing? -- Liliana • 20:34, 9 June 2015 (UTC)[reply]

Central Germany. - -sche (discuss) 21:56, 9 June 2015 (UTC)[reply]

I propose that we formally define Southern Germany for WT as the area south of the Speyrer and Northern Germany north of the Uerdinger line. Korn [kʰʊ̃ːæ̯̃n] (talk) 09:27, 11 June 2015 (UTC)[reply]

"Bundesländer with a seacoast" would exclude Berlin and Brandenburg (which are often considered part of Northern Germany) and Hamburg (which always is). The Uerdingen line seems better for linguistic purposes such as ours. —Aɴɢʀ (talk) 14:48, 11 June 2015 (UTC)[reply]

I've not encountered Berlin and Brandenburg being considered Northern Germany, culturally, by anyone in my life. Hamburg of course is part, but it's within the realm of coastal states, so to speak. Korn [kʰʊ̃ːæ̯̃n] (talk) 23:41, 11 June 2015 (UTC)[reply]

What to do when the lemma form (and only that form) has alternative forms?

When a single lemma has several different options for a particular inflection, we create entries for all of them and give them the appropriate definition. So for example, an English noun with two possible plurals will simply have one plural entry for each of them. But in some cases, the form that we have chosen as the lemma will have alternative forms itself. In some cases, this implies that the stem of the word is different, so that they have two separate sets of inflections. In this case, we define one of them as "alternative form" and include inflection tables on both.

But it's also possible that there is one single paradigm that happens to have two possible forms for the lemma form only. For example, a noun might have two different nominative singular forms, but there is only one form for all other inflections. Or a verb could have two possible infinitives. How should we handle these cases? If we use "alternative form of" then it's misleading because the user might think that this is an entirely different verb with its own inflections, when in reality only the lemma form happens to have an alternative form. So I'm thinking that it would make more sense to list this as, for example, "nominative singular of" or "infinitive of", just like we do with any other inflected form.

Of course, the question also arises which of the forms should be chosen as the "real" lemma. —CodeCa t 22:41, 9 June 2015 (UTC)[reply]

I think our current practice is good and needs no changing. See for example honos. —Μετάknowledge^{discuss/deeds} 22:44, 9 June 2015 (UTC)[reply]

Ok, but which lemma do you have the inflections point to, honor or honos? —CodeCa t 22:51, 9 June 2015 (UTC)[reply]

Either to the lemma, or to both the lemma and alternative form. I prefer just to the lemma, which in this case is honor. --Wiki Tiki 89 23:03, 9 June 2015 (UTC)[reply]

Alternative forms are also lemmas. They're categorised as such. —CodeCa t 23:23, 9 June 2015 (UTC)[reply]

That means nothing more than that there is a discrepancy between how I used "lemma" in my sentence and how we use it in categorization. You still understood what I said. --Wiki Tiki 89 01:03, 10 June 2015 (UTC)[reply]

I agree with Meta and WikiTiki. - -sche (discuss) 23:52, 9 June 2015 (UTC)[reply]

The problem seems to be that we use "alternative form" to mean both a single inflected wordform sometimes, and an entire paradigm at others. --Tropylium (talk) 00:06, 17 June 2015 (UTC)[reply]

That's a very good point. Maybe we should start distinguishing between them. --Wiki Tiki 89 16:44, 17 June 2015 (UTC)[reply]

I think "alternative form" should only be used for lemmas. For non-lemma forms, the distinction is moot because both are equally forms of their lemma. —CodeCa t 17:57, 17 June 2015 (UTC)[reply]

Not sure what you mean. What I thought Tropylium was talking about was that we have things like "ax is an alternative form of axe", which means nothing more than "the form ax is an alternative of the form axe", and then we have things like "plow is an alternative form of plough" which really means "plow and all of the forms plow represents are respectively alternatives of plough and all the forms plough represents". --Wiki Tiki 89 18:51, 17 June 2015 (UTC)[reply]

Yes. And I'm saying we should only be using "alternative form" for the latter. We should also not be using "alternative form" for non-lemmas, so nothing like marking octopodes as an alternative form of octopuses. Both should be marked simply as the plural of octopus. —CodeCa t 19:07, 17 June 2015 (UTC)[reply]

So then what would you do for ax? --Wiki Tiki 89 19:20, 17 June 2015 (UTC)[reply]

What we should consider is what to do with the plural. If axes should be defined as a plural of both axe and ax, then the latter is an alternative spelling of the former. But if it's only the plural of axe, then ax would just be an alternative spelling of the singular form only. —CodeCa t 19:41, 17 June 2015 (UTC)[reply]

If we follow what you said, then we can't do the latter because you can't have an alternative form of a single form. Unless you want to make an exception and only allow this for the lemma form. --Wiki Tiki 89 19:51, 17 June 2015 (UTC)[reply]

Well, axes is, in fact, the plural of ax as well as the plural of axe (not to mention the plural of axis), which is exactly what the page already says. Do you think that we should say it's the plural of axe (and axis) alone, and not mention ax on the page at all? —Aɴɢʀ (talk) 19:47, 17 June 2015 (UTC)[reply]

That's the idea of this proposal. If one word happens to have two or more lemma forms, then it's not helpful to have inflected form entries that link to each of them separately.

The way I see it is that we treat lemmas as "inflection sets". An alternative form has, by the treatment proposed here, a different (or at least partially different) inflection set from the word it's an alternative form of, and is therefore a lemma in itself, albeit one that is used in variation with another. This means that if two lemmas have the same inflection set except for the lemma form, then they clearly have the same inflection set and only the lemma form of the inflection set has several forms. This is no different from having a non-lemma form that has several forms. For example, having one inflection with two nominative singular forms is conceptually no different from one with two genitive plural forms.

This is not a simple rule, though. After all, there are cases like English nouns where the inflection set consists of only two items, singular and plural. Does a word belong to a different inflection set if the singulars differ but have the plural forms in common? This is something we would need to determine separately. —CodeCa t 20:48, 17 June 2015 (UTC)[reply]
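The "inflection set" comparison described in this thread can be sketched roughly in Python. This is only an illustration, not anything that exists in Wiktionary's modules: the function name `classify_pair`, the slot labels and the dictionary layout are all assumptions invented for the example, and the paradigms are cut down to a few slots.

```python
def classify_pair(paradigm_a, paradigm_b, lemma_slot):
    """Compare two paradigms (dicts mapping slot name -> form).

    Returns "same" if they are identical, "lemma-only" if they differ
    only in the lemma slot, and "separate" if any other slot differs.
    """
    if paradigm_a == paradigm_b:
        return "same"
    # Drop the lemma slot and compare whatever is left.
    rest_a = {slot: form for slot, form in paradigm_a.items() if slot != lemma_slot}
    rest_b = {slot: form for slot, form in paradigm_b.items() if slot != lemma_slot}
    return "lemma-only" if rest_a == rest_b else "separate"


# Simplified, hypothetical paradigm data (only a few slots each).
honor = {"nom_sg": "honor", "gen_sg": "honoris", "acc_sg": "honorem"}
honos = {"nom_sg": "honos", "gen_sg": "honoris", "acc_sg": "honorem"}
plough = {"sg": "plough", "pl": "ploughs"}
plow = {"sg": "plow", "pl": "plows"}
axe = {"sg": "axe", "pl": "axes"}
ax = {"sg": "ax", "pl": "axes"}

print(classify_pair(honor, honos, "nom_sg"))  # lemma-only
print(classify_pair(plough, plow, "sg"))      # separate
print(classify_pair(axe, ax, "sg"))           # lemma-only
```

Under this sketch, honos and ax would be handled as alternative lemma forms within a single paradigm, while plow would count as a lemma of its own. As noted above, a two-slot paradigm such as an English noun makes the "lemma-only" result much less informative, so that case would still need a separate decision.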

Citation links

Using {{seeCites}} at the entry example returns the text:

"For usage examples of this term, see the citations page."

Sometimes the citation page linked is different from the entry name, but the template text shows absolutely no indication of that. For that reason I would like to edit the text.

Proposal:

Insert the title of the citations page in the text.
If the entry is different, link to the entry too. (Use {{l-self}}.)

For example it might return this text:

"For usage examples, see the citations page of example." (unlinked entry)
"For usage examples, see the citations page of example." (linked entry, if the template is used anywhere other than example itself)

Rationale:

At the Portuguese entry como, some citations are at Citations:como, but the citations which are verb forms of comer would be at Citations:comer. Both citation pages are linked from the respective POS section, but at present there is no indication that these are actually different citation pages, and that is what I would like to change.

After editing the template, at the entry como we would have both:

Adverb/Conjunction/Interjection

"For usage examples, see the citations page of como." (unlinked)

Verb

"For usage examples, see the citations page of comer." (linked)
--Daniel 14:27, 18 June 2015 (UTC)[reply]
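The proposed behaviour can be sketched as a small Python function. This is not the template's actual code: the function name `see_cites_text` and the exact wikitext it returns are assumptions for illustration, and the unlinked case is rendered as plain text here rather than through {{l-self}}.

```python
def see_cites_text(current_page, cited_entry):
    """Build wikitext for the proposed notice.

    current_page: title of the entry where the notice is displayed.
    cited_entry:  entry whose citations page holds the quotations.
    """
    citations_link = "[[Citations:{0}|citations page]]".format(cited_entry)
    if cited_entry == current_page:
        # Same entry: name it, but do not link a page to itself.
        entry_text = cited_entry
    else:
        # Different entry: link to it so the reader sees the difference.
        entry_text = "[[{0}]]".format(cited_entry)
    return "For usage examples, see the {0} of {1}.".format(citations_link, entry_text)


# On the page "como": the adverb section cites como, the verb section cites comer.
print(see_cites_text("como", "como"))
print(see_cites_text("como", "comer"))
```

The first call names como without linking it, and the second links to comer, which is the distinction the proposal is after.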

Support. Good idea. It's not uncommon for citations of one word to be in another word's citations page, either because citations of all spellings and hyphenations have been gathered in one place (Citations:moose-misse), or because citations of all inflected forms have been placed on the lemma's page (Citations:they), or potentially for some other reason. - -sche (discuss) 15:59, 18 June 2015 (UTC)[reply]

Why not just show the actual name of the citation page? Citations:example? —CodeCa t 17:31, 18 June 2015 (UTC)[reply]

That'd be fine by me. - -sche (discuss) 22:45, 18 June 2015 (UTC)[reply]

Support per -sche. DCDuring TALK 21:37, 18 June 2015 (UTC)[reply]
Support.—msh210℠ (talk) 06:52, 21 June 2015 (UTC)[reply]

Done, with both {{seeCites}} and {{seemoreCites}}. --Daniel 00:25, 4 July 2015 (UTC)[reply]

A new (better) way to collapse inflection tables?

MediaWiki:Gadget-legacy.js contained a second, apparently unused method of collapsing elements. It's more flexible: rather than having to put everything in wrapper divs, you can specify for individual elements whether they should be hidden or not. Moreover, it's possible to specify that elements should be displayed only when the element is collapsed. This makes it possible to have a table that shows one set of table rows when collapsed, and another set when expanded. For inflection tables, the expanded version could show the full table, while the collapsed version shows only the most important/least predictable forms (principal parts).

I have created an example of this at User:CodeCat/vsExample. Compare it to the original table at gooien. Note that there are no more wrapper divs; the table itself is the outermost element now. This makes it possible, in theory, to have the table scale automatically as its contents get too wide. This was not possible with the old method. —CodeCa t 23:52, 18 June 2015 (UTC)[reply]
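The per-element visibility idea can be modelled in a few lines of Python. This is a sketch of the logic only, not the gadget's JavaScript: the three visibility labels, the function name `visible_rows` and the simplified rows are assumptions made up for the example, and the real table at gooien has many more rows.

```python
ALWAYS = "always"
EXPANDED_ONLY = "expanded-only"    # hidden while the table is collapsed
COLLAPSED_ONLY = "collapsed-only"  # shown only while the table is collapsed


def visible_rows(rows, collapsed):
    """Return the text of the rows that should be displayed.

    rows: list of (text, visibility) pairs.
    collapsed: True if the table is currently collapsed.
    """
    hidden_kind = EXPANDED_ONLY if collapsed else COLLAPSED_ONLY
    return [text for text, visibility in rows if visibility != hidden_kind]


# Hypothetical, heavily simplified rows for a Dutch verb table.
rows = [
    ("Inflection of gooien", ALWAYS),
    ("principal parts: gooien, gooide, gegooid", COLLAPSED_ONLY),
    ("present: gooi, gooit, gooien", EXPANDED_ONLY),
    ("past: gooide, gooiden", EXPANDED_ONLY),
]

print(visible_rows(rows, collapsed=True))
print(visible_rows(rows, collapsed=False))
```

Each row carries its own flag, so no wrapper element is needed, and the collapsed view can show a summary row that disappears once the table is expanded.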

Looks good. Might be a good way to hide selected portions (like cognate lists) of our too-long etymologies. DCDuring TALK 01:28, 19 June 2015 (UTC)[reply]

Yes. We already do that with the old "div" method in some pages, but with the new method, we can make it look different and more appealing. Though, to be fair, I usually remove cognates if they are just duplicated in many entries and if they can easily be found on a proto-language page. —CodeCa t 01:55, 19 June 2015 (UTC)[reply]

There are more than 7,500 entries containing "{{m|ine-pro|", "English", "Etymology", and "cognate" and/or "compare", so we have a way to go in cleaning them out. I could see why it is handy not to compel those with specialized interests to rummage around in multiple entries, but most users have no interest in such matters and find our above-the-definition material intimidating and confusing. Perhaps all lists of cognates should be enclosed in a template, which allowed them to be hidden by default and displayed for a given registered user always by gadgetry or by use of a show/hide control. Perhaps something similar would make sense for the portions of etymology related to all or some reconstructed languages. DCDuring TALK 22:55, 19 June 2015 (UTC)[reply]

I've implemented this for the Dutch inflection tables now, and I'm quite pleased with the result. See groot, goed, verbogen, zijn, werpen, uitwerpen for some examples. But now that the most important forms are shown in the inflection table even when it's collapsed, it's a bit redundant to show them in the headword line as well. Presumably they should be removed from there. If we start extending this kind of inflection table to other languages, then we should probably remove the forms from the headword line then too. For example, we would no longer need to show the principal parts of Latin words in the headword line if they are already shown in the inflection table. It's rather redundant otherwise. —CodeCa t 13:45, 23 June 2015 (UTC)[reply]

@kc kennylau pinging because he has worked on Latin templates recently. —CodeCa t 13:48, 23 June 2015 (UTC)[reply]

They do look good. I wouldn't rush to remove the redundancy on the inflection line as we have trained users to look there for core inflection information. Perhaps the redundancy is really in the new tables. DCDuring TALK 15:24, 23 June 2015 (UTC)[reply]

Category:Perching birds

Discussion moved to Wiktionary:Requests for moves, mergers and splits#Category:Perching_birds.

HTTPS

Hi everyone.

Over the last few years, the Wikimedia Foundation has been working towards enabling HTTPS by default for all users, including unregistered ones, for better privacy and security for both readers and editors. This has taken a long time, as there were different aspects to take into account. Our servers haven't been ready to handle it. The Wikimedia Foundation has had to balance sometimes conflicting goals.

Forced HTTPS has just been implemented on all Wikimedia projects. Some of you might already be aware of this, as a few Wikipedia language versions were converted to HTTPS last week and the then affected communities were notified.

Most Wikimedia editors shouldn't be affected at all. If you edit as a registered user, you've probably already had to log in through HTTPS. We'll keep an eye on this to make sure everything is working as it should. Do get in touch with us if you have any problems after this change or contact me if you have any other questions.

/Johan (WMF)

22:00, 19 June 2015 (UTC)

Apache nesting

(Has this been discussed before?) Only some Apache varieties are nested. Should they all be nested, or should none of them be nested? - -sche (discuss) 22:05, 21 June 2015 (UTC)[reply]

Judging from the Wikipedia article (w:Southern Athabaskan languages), Navajo is more closely related to Western Apache and the Chiricahua/Mescalero group than it is to Plains Apache, Jicarilla Apache or Lipan Apache. That makes it kind of pointless to talk about nesting based on linguistic criteria. We have to decide whether we're nesting based on cultural/historical commonalities, in which case Plains Apache shouldn't be included, or just convenience- lumping together everything named "Apache". I think anything but the latter is going to be confusing to the average user, so I'm inclined to either nest them all or nest none of them. Chuck Entz (talk) 23:41, 21 June 2015 (UTC)[reply]

I would nest none of them, in part because users will probably look for Plains Apache under "P", etc. I question our nesting of Ancient Greek and Mycenaean Greek under modern Greek, too. - -sche (discuss) 00:52, 22 June 2015 (UTC)[reply]

Taxonomic (family) nesting and evolutionary nesting (not yet suggested) seem to suit us, not users unlike us. Listing under hypernyms is at least accessible for ordinary users, as long as the modern language name, which is also usually the hypernym, appears where it belongs in an alphabetical sequence. Sortable tables would address this and similar issues in other data (such as definitions), but they may not be feasible, reliable etc. DCDuring TALK 01:12, 22 June 2015 (UTC)[reply]

We don't really have any rules on nesting anyway, do we? We seem to do it on a very subjective, intuition-based basis. I feel like it makes sense to nest Primitive Irish, Old Irish, and Middle Irish under Irish, but if the rule is to group ancestral forms under the equivalent name without words like "Primitive", "Old", "Middle", etc., then it isn't clear where to put Old English and Middle English (since we never have English in translation tables) or Old Norse (since there isn't a language we call "Norse"). I think I would look for Plains Apache under A rather than P, but I don't know how representative I am. —Aɴɢʀ (talk) 12:32, 22 June 2015 (UTC)[reply]

In my experience, people look up all varieties of Greek under Greek, and all varieties of Apache under Apache. Navajo is expected to be under Navajo. —Stephen ^(Talk) 12:49, 22 June 2015 (UTC)[reply]

Category:Plurals with a red link for singular

May I bring this category to your attention. It contains plural words that various people have come across but don't know how to define the singulars.

Any help in providing such a definition would be welcome. Please ignore the appendices, user pages, talk pages and the like. SemperBlotto (talk) 16:21, 22 June 2015 (UTC)[reply]

p.s. I have got as far as "g".

With some adjustments to Module:form of, this can probably be extended for any form-of entry whose lemma is missing. —CodeCa t 17:26, 22 June 2015 (UTC)[reply]
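The generalisation suggested here, flagging any form-of entry whose lemma page does not exist, comes down to a simple lookup. The following Python sketch is not how Module:form of works: the function name `forms_with_missing_lemma`, the data layout and the page titles are all hypothetical.

```python
def forms_with_missing_lemma(form_of_entries, existing_pages):
    """Return the form-of entries whose lemma has no page yet.

    form_of_entries: dict mapping a form's page title -> its lemma's title.
    existing_pages:  set of page titles that currently exist.
    """
    return sorted(
        form for form, lemma in form_of_entries.items()
        if lemma not in existing_pages
    )


# Hypothetical data: "gizmos" exists, but its singular "gizmo" is a red link.
form_of_entries = {"widgets": "widget", "gizmos": "gizmo"}
existing_pages = {"widget", "widgets", "gizmos"}

print(forms_with_missing_lemma(form_of_entries, existing_pages))  # ['gizmos']
```

Pages outside the main namespace (appendices, user pages, talk pages) would presumably need to be filtered out before such a check, which this sketch does not attempt.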

Any idea why Appendix:Proto-Algonquian/aya·pe·waki is in the category? - -sche (discuss) 18:10, 22 June 2015 (UTC)[reply]

Synchronic and diachronic derived terms

Many languages have terms which were derived through processes that are no longer productive, but where the relationship is still clear enough to be recognised by most speakers. For example, Dutch has many cases in which a noun is derived from the root of a verb through ablaut, or by using a variety of obsolete suffixes. Some examples: springen > sprong, dringen > drang, spreken > gesprek, zien > zicht. The question is whether these can be considered derived terms. I think most Dutch speakers would understand that sprong is derived from springen, even if the actual method of derivation is opaque. But the actual derivation occurred in Pre-Proto-Germanic times.

And if these are derived terms, then where should we draw the line? Is dawn still a derivative of day? lord a derivative of loaf? —CodeCa t 18:13, 22 June 2015 (UTC)[reply]

I think ====Derived terms==== should be limited to regular derivations whose process is transparent and could be applied to other words. In the examples you gave from Dutch, there are two issues: that it is not clear exactly how the vowel is determined, and that it is not clear without looking at historical evidence whether it was the noun or the verb that came first. Thus, in my opinion, all the examples you gave are better off in the ====Related terms==== section. However, I completely agree that for words whose derivations are still regular and transparent, they should be allowed in ====Derived terms==== even if the derivation took place thousands of years ago in a vastly different parent language. --Wiki Tiki 89 18:27, 22 June 2015 (UTC)[reply]

A cutoff point with the clause "could be applied to other words" can get unwieldy quite fast for agglutinative languages. These often have a wide variety of derivative suffixes that are entirely transparent, but not really at all productive in the sense of being applicable to any arbitrary word. E.g. the Finnish suffix -sto regularly yields collectives, but this does not mean it is actually possible to take any random word like roskakori and form something like ˣroskakoristo. Often they will still be productive in the weaker sense that every so often, a new instance of a word using the suffix is added to the language — but this is not really a synchronically measurable property.

On the other hand: mere transparency seems to be too weak a condition. This will generate things like Category:Finnish words prefixed with geronto- or Category:Finnish words prefixed with terato-, although I do not think there are any cases of native Finnish formations using these Hellenic prefixes.

So, perhaps: morphophonological transparency for native derivational processes, versus evidence of productive use for originally foreign derivational processes? --Tropylium (talk) 07:45, 23 June 2015 (UTC)[reply]

First of all, we are only talking about the ====Derived terms==== section, not the ===Etymology=== section. Second of all, everything you described about -sto fits my definition of "could be applied to other words". Note that I did not say "could be applied to any other word". --Wiki Tiki 89 16:46, 23 June 2015 (UTC)[reply]

Request to add glosses in etymology sections

Could we make it a policy and/or guideline that editors should add glosses for etyma when working on etymology sections? For instance, knowing that knǫrr (“a large merchant ship used in mediaeval Scandinavia”) comes from Proto-Germanic *knarzuz is interesting, but what does *knarzuz mean? It would be more useful if *knarzuz were provided with a gloss right there in the etymology section -- especially when we have no entry yet for the given etymon. ‑‑ Eiríkr Útlendi │^{Tala við mig} 18:14, 22 June 2015 (UTC)[reply]

The normal practice is to give glosses only if the word means something else than the one preceding it (its descendant). So if knǫrr means the same as *knarzuz then only the former would have a gloss. This also means that if the word never changed meaning throughout its known history, then no glosses should be present at all. —CodeCa t 18:20, 22 June 2015 (UTC)[reply]
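The practice described in this comment can be written out as a short Python sketch. It is only an illustration under simplifying assumptions: the function name `glosses_to_show` is made up, meanings are compared as exact strings (real senses rarely match that neatly), and the second example uses placeholder terms rather than real etyma.

```python
def glosses_to_show(entry_meaning, etyma):
    """Pick the etyma that get a gloss under the rule described above.

    entry_meaning: meaning of the entry whose etymology is being written.
    etyma: list of (term, meaning) pairs, nearest ancestor first.
    An etymon is glossed only if its meaning differs from that of the
    word preceding it in the chain (its descendant).
    """
    shown = []
    previous = entry_meaning
    for term, meaning in etyma:
        if meaning != previous:
            shown.append((term, meaning))
        previous = meaning
    return shown


# The meaning never changes, so no glosses are shown at all.
foot_chain = [
    ("Middle English fot", "foot"),
    ("Old English fōt", "foot"),
    ("Proto-Germanic *fōts", "foot"),
]
print(glosses_to_show("foot", foot_chain))  # []

# Placeholder chain: the meaning changes once, at the first etymon.
print(glosses_to_show("X", [("etymon A", "Y"), ("etymon B", "Y")]))
```

In the second call only "etymon A" is glossed, because "etymon B" means the same as the word before it. The suggestion made further down, always glossing the last etymon in the chain, would be a small addition to the same loop.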

That is both unclear (as a policy / practice), and not very good usability. Compare the etymologies at give, have, hand, which give more detail. ‑‑ Eiríkr Útlendi │^{Tala við mig} 18:33, 22 June 2015 (UTC)[reply]

It's only unclear if it isn't followed rigidly (which of course it isn't), but I do feel it would be tedious to see that foot comes from a Middle English word that means 'foot', which comes from an Old English word that means 'foot', which comes from a Proto-Germanic word that means 'foot', which comes from a Proto-Indo-European word that means 'foot' and is cognate with a Sanskrit word that means 'foot' and an Ancient Greek word that means 'foot' and a Latin word that means 'foot'. —Aɴɢʀ (talk) 19:03, 22 June 2015 (UTC)[reply]

At the bare minimum, it would be useful to have a gloss for the last etymon in the chain, in cases where the meaning hasn't changed. ‑‑ Eiríkr Útlendi │^{Tala við mig} 19:12, 22 June 2015 (UTC)[reply]
Another set of cases for which we need glosses involves etymon redlinks.

Still another would be an etymologically important missing definition where we have only an incomplete entry for the etymon.

Yet another involves any etymon that is/was highly polysemic, especially in a sense that is less common, archaic, or obsolete.

I find myself constantly trying to look up etymon definitions and being frustrated. When I am able to find the definitions from other sources, the "same definition as previous etymon" assumption proves unwarranted except in the loosest of senses of same. I am often interested in whether a term had achieved a specialized meaning in Ancient Greek or Latin, which specialized meanings are often neglected in our entries.

As a result I would favor having an explicit requirement that we have glosses, except in cases where we have an entry for the etymon, the applicable sense(s) are in the entry, and the applicable sense is clear. DCDuring TALK 20:44, 22 June 2015 (UTC)[reply]

A problem here is that the meanings of words in proto-languages are not necessarily even reconstructible in too much detail. Often it is easy enough to figure out that a word meaning "a" in language A and a word meaning "b" in language B are cognate, but it can be an intractable question if the original meaning was "a", "b", both of them, or perhaps something slightly different altogether. I'm in favor of glossing attested pre-forms in e.g. Latin, especially if they differ, but this policy cannot be fully generalized for all pre-forms. --Tropylium (talk) 07:26, 23 June 2015 (UTC)[reply]

For my use of a dictionary that is a reason to exclude such reconstructions, perhaps by hiding them so they don't waste screen space. DCDuring TALK 09:38, 23 June 2015 (UTC)[reply]

So this is another argument for a user setting "Hide etymologies", I guess? --Tropylium (talk) 12:48, 29 June 2015 (UTC)[reply]

The proto-form explains how the cognates fit together, and the cognates themselves give clues about the possible range of meanings for the proto-form; they're complementary. The problem with too many similar cognates is that they obscure that relationship, especially if one branch shares an innovation, and the sheer number of cognates in that branch gives the impression that they're the norm. Chuck Entz (talk) 13:33, 29 June 2015 (UTC)[reply]

Standard forms of words versus regional variants

If there is a standard version of a word, should it be used in place of regional variants? Changing from one regional variant to another regional variant is counterproductive, but what about changing from a regional variant or alternative form of a word to the word's standard version? --WikiWinters (talk) 00:26, 23 June 2015 (UTC)[reply]

This should be decided for each language individually. Some languages have standard forms, but the standard is not widely followed by speakers. So the standard form is not always the most-used or best-known form. —CodeCat 14:01, 23 June 2015 (UTC)[reply]

Proposal: Always collapse cognate lists in entries

In the discussion above, User:DCDuring suggested that cognate lists should always be hidden behind a collapsible element of some sort. I do think this is a good idea, because cognates often clutter up etymologies, and it's not unusual to see huge lists of them because of course everyone insists on including their favourite language.

Aside from this, I think it's also worth discussing what else we can do about cognates. In a lot of cases, the cognates are already listed neatly in the descendants section of the term's ancestor. Listing them in the entry as well is redundant then, and duplicates information, so we may want to remove cognates altogether if they're already listed more thoroughly on another, central page. On the other hand, having them in the entry is convenient to the user, at least. So what can we do to alleviate the duplication? —CodeCat 13:59, 23 June 2015 (UTC)[reply]

I suppose we might consider whether there is any reasonable way to decide whether a cognate for a term should:

Appear unhidden as part of the etymology in the entry for the term (Some cognates seem to be more or less essential elements of an etymology.)
Appear hidden as part of the etymology. (This might be particularly warranted if there is no entry for the term's ancestor at which the term's cognate would appear as a derived term or descendant.)
Not appear at all in the entry, but rather in descendants or derived terms of an ancestor of the term.

But hiding seems to be a good tool for handling cognate lists, pending moving the cognate to descendant or derived term in another entry or possibly creating such entry. Although this is in principle just a temporary solution, it is likely that there will always be some cognates that have no home as descendants or derived terms. DCDuring TALK 15:20, 23 June 2015 (UTC)[reply]

We could make a template similar to {{etymtree}} in order to list cognates without duplicating information. — Ungoliant ^(falai) 15:26, 23 June 2015 (UTC)[reply]

Combining hiding with avoidance of duplication seems like a good idea. I never cease to be amazed at how little the performance penalty is for well-designed templates/modules+data of such apparent complexity. DCDuring TALK 15:40, 23 June 2015 (UTC)[reply]

It should depend on the number. If there are only three or four cognates listed, there really is no need to hide them and they serve to illustrate the etymology. --Wiki Tiki 89 17:22, 23 June 2015 (UTC)[reply]

I understand and sympathize with that view, but I think many current and potential users find cognates distracting and irrelevant, even to etymology. Curious users will click on whatever unhide control we have and registered users can set it up to display by default for them. That CSS flexibility seems to me to fully address the concerns of all parties, if we are willing to do the work. DCDuring TALK 17:34, 23 June 2015 (UTC)[reply]

So you would hide them even if there is only one cognate? --Wiki Tiki 89 17:38, 23 June 2015 (UTC)[reply]

I think that cognates in a few representative major languages should be shown as presently. When the number grows beyond "a few" they could be hidden behind a "click to see longer list of cognates" feature. 217.44.208.136 21:57, 27 June 2015 (UTC)[reply]
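The threshold idea in the post above can be sketched roughly as follows. This is only an illustration, not an existing Wiktionary template or module: the function name `split_cognates` and the cutoff `max_visible` are assumptions made for the example, since the discussion does not fix what counts as "a few".

```python
def split_cognates(cognates, max_visible=3):
    """Split a cognate list into a visible part and a collapsed part.

    max_visible is a hypothetical cutoff for "a few"; the thread does not
    settle on a number.
    """
    if len(cognates) <= max_visible:
        # Short lists stay fully visible; nothing is hidden.
        return list(cognates), []
    # Longer lists: show the first few, put the rest behind the collapse.
    return list(cognates[:max_visible]), list(cognates[max_visible:])


# Languages taken from the examples mentioned in this thread.
visible, hidden = split_cognates(
    ["Swedish", "Norwegian", "Danish", "Finnish", "Estonian"]
)
```

Note that a fixed cutoff says nothing about which languages end up in the visible part; that is exactly the neutrality question raised in the replies.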

There's always going to be non-neutrality in which languages we choose. For example if we choose Swedish, then people will start adding Norwegian and Danish. Include Finnish, and soon there'll be Estonian too. That's just how it always goes. —CodeCat 22:23, 27 June 2015 (UTC)[reply]

I didn't mean that a fixed list of "major" languages should be enforced. If a Swedish word is used in one place then a Norwegian or Danish one can be used somewhere else. Of course, if people are going to be keeping score ... 217.44.208.136 22:57, 27 June 2015 (UTC)[reply]

It's not a matter of keeping score. There are editors who see it as their purpose in life to add cognates in their language to every English term with an etymology, and especially to those with cognates in languages that they see as linguistic rivals. This is most obvious with Albanian and Kurdish, but various Scandinavian and Romance languages do it too. There are also some real partisans in Turkic, Dravidian and in some African language families, but they don't have English cognates to work with. Chuck Entz (talk) 23:53, 27 June 2015 (UTC)[reply]

You can understand "keeping score" as covering all kinds of activities where individuals must have their favourite language in the non-collapsed part of the list on every occasion, rather than accepting a spirit of give and take. 217.44.208.136 00:05, 28 June 2015 (UTC)[reply]

I'll just throw in what I seem to be saying in every discussion recently: If someone wants to do non-harmful work, why would one undo it. Just collapse them where they are non-essential parts of an etymology section or at least be consistent and disallow them in etymology sections completely. No pick and choose, we must avoid every tiniest opportunity for people to argue. Korn [kʰʊ̃ːæ̯̃n] (talk) 17:58, 28 June 2015 (UTC)[reply]

Oppose. I love (all) cognates. Wyang (talk) 23:59, 27 June 2015 (UTC)[reply]

Delete cognates when they are listed in the proto-page. --Vahag (talk) 12:48, 28 June 2015 (UTC)[reply]

I've noticed, when using Century 1911, which is accessible as indexed scans of the pages of the print dictionary, that the often longish etymologies seem to defeat the role that etymologies play in grouping definitions by similarity of meaning. I think that very same defeat of user accessibility is what we have achieved in some of our entries with longer Etymology sections. I had formerly supported the current Etymology-first presentation, but I now wonder whether we should revisit the notion of putting Etymology at the bottom of the group of definitions to which it applies. That practice is what most online dictionaries follow, presumably reflecting their beliefs about user behavior, some of which are almost certainly based on actual click data. Having the Etymology sections below the definitions (in each homonym section) would allow the cognate lists to be as long as anyone wanted without much interfering with users who were interested in definitions. DCDuring TALK 19:21, 28 June 2015 (UTC)[reply]
- Definitely agree. Most users want definitions first, so why present them with etymology at the top? —CodeCat 19:50, 28 June 2015 (UTC)[reply]
- For words with multiple different etymologies, the "Etymology" headings presently also act as section headings, so some consideration would need to be given about how that would work. Would there be an "Etymology 1" heading, for example, and then later a further "Etymology" heading within the "Etymology 1" section? Having said that, I essentially agree that etymologies should come after definitions. More generally, I think there is very considerable further scope for improvement in Wiktionary page design, so as to make it more attractive and appealing to users. 109.153.244.85 20:27, 28 June 2015 (UTC)[reply]

- I agree the definition should probably come first, as it simply has to be the most important thing for most dictionary users. It's not just online dictionaries that put def before ety, either. Many put the pronunciation before the def, but they don't have a big subtitled section for it! Equinox ◑ 20:35, 28 June 2015 (UTC)[reply]
  I like the etymology-first presentation when the etymology section is short, but we don't seem to be getting very far in limiting its use of scarce screen space that users see first. Cognates are only part of the problem. Many etymologies are just verbose. An alternative to reordering sections would be to have the Etymology sections collapsed in their entirety, with a terse etymology appearing in the show/hide bars such as are produced by {{rel-top}} DCDuring TALK 21:28, 28 June 2015 (UTC)[reply]
  That's just a workaround at best. First we decide to put it first, then we decide that we don't want it there and hide it? We should just move it elsewhere completely. —CodeCat 21:35, 28 June 2015 (UTC)[reply]
  You can be dismissive as a rhetorical tactic, but I thought it was a way of having one's cake (terse etymology visibly organizing the entry) and eating it too (drastically reducing the space taken by the worst-offending lengthy etymologies). DCDuring TALK 22:49, 28 June 2015 (UTC)[reply]

The Index namespace

I think we should either keep the indexes updated, or completely delete them. It is confusing to our readers to have seriously out-of-date indexes. --Wiki Tiki 89 19:09, 23 June 2015 (UTC)[reply]

For the most part, our lemma categories have replaced these. —CodeCat 19:10, 23 June 2015 (UTC)[reply]

Which is why I favor the latter option (i.e. deleting them). --Wiki Tiki 89 19:12, 23 June 2015 (UTC)[reply]

Large categories are pretty hideous to navigate through, though. The lemma categories should be as easy to browse as the pages of a real dictionary. —CodeCat 19:22, 23 June 2015 (UTC)[reply]

Yeah, I don't know why they make the categories so difficult. On all other pages (history, watchlist, etc.), you can adjust how many entries you see on the screen and skip multiple pages or to a particular page number; but in categories, the number is fixed to 200 and you can only move forward or backward one page at a time. We need to complain harder to the devs about this. --Wiki Tiki 89 19:29, 23 June 2015 (UTC)[reply]

What about cattoc's? Like here Category:English lemmas or here (with just alphabet which I think is enough) Category:Latvian lemmas.

Agreed on "unwieldiness" of browsing cats. There've been times where I've been "shopping" for an audio file to be used in wiki to illustrate a particular sound and the fact that the only thing for navigation that I have is "Next 200" is very inconvenient, cattoc makes this much more convenient.

Also agree about indices, I understand some people have invested time (at some distant point in the past) in maintaining them but the whole point escapes me. Something like that should always be auto-updated (like categories are.) The indices, imo, should be replaced with lemma cats with cattocs. It probably takes a couple minutes of work to add this cattoc but then there could simply be a drive "want your pet language featured in the indices box on the first page? Well, then go and make a cattoc for an alphabetical index of its lemma cat." Neitrāls vārds (talk) 11:30, 25 June 2015 (UTC)[reply]

We do have "TOC"s in the lemma categories, as you have already pointed out. But having both that and more navigable pages would be much better. --Wiki Tiki 89 15:41, 25 June 2015 (UTC)[reply]

Are there any languages for which the Index is satisfactorily updated? In other words: can we delete the whole Index namespace at once or are there any languages that should be kept? "Chinese radical" index is one that comes to mind since it is different from the rest - it is not a list of Chinese words but a (large) list of Chinese characters. I don't have the ability to tell if it's accurate, of good quality, complete or near completion. Also, what about proto-language indices like Index:Proto-Indo-European/d? Can those be deleted too? --Daniel 00:22, 4 July 2015 (UTC)[reply]

See Special:Contributions/Conrad.Bot. The only languages for which the Index is potentially up to date are those that have not had any new entries since the last time Conrad.Bot updated it (which was May 2, 2012). In other words, if there is such a language, it's rather insignificant. As far as proto-languages, it seems they were updated by a different bot, NadandoBot, which last updated them on September 22, 2012, but I see no reason they should be treated any differently; they have lemma categories just like any other language. Any valid red links can be collected on a requests page. --Wiki Tiki 89 20:26, 6 July 2015 (UTC)[reply]

Looks good enough to me. Since deleting the whole index (or most of it minus Chinese radical, I guess; I don't know if it could be replaced by categories, but it seems it hasn't) is a major project, if it's alright I'm thinking of creating a vote for it sometime in the next few days. --Daniel Carrero (talk) 01:20, 13 July 2015 (UTC)[reply]

Sounds good. --Wiki Tiki 89 13:15, 13 July 2015 (UTC)[reply]

It seems that I am the only one who wants to keep the Index namespace. There are so many talented editors here who could update Conrad's code and run it as bot a couple of times a year. Reasons to keep (I am repeating what I said elsewhere):

an audio link if there is an audio
the part of speech
asterisks linking back to the English entries where translations were added
red-link entries that were added to translations but not created yet
orange-link entries that were added to translations but the FL section is missing on the entry page
it is also an excellent tool for troubleshooting and maintenance, showing mistranslations and incorrect entries
it is more compact than the lemma category (a full-size window can show even 5 columns and all this extra information)
it is easier to navigate than the lemma category (this was also mentioned by others above)

Would you all please reconsider? --Panda10 (talk) 14:31, 13 July 2015 (UTC)[reply]

Your reasons for keeping the Index sound good, but IMO the great problem is how the Index is out of date with nobody yet to update it. I propose carrying on with the project of deleting most of the Index namespace, while we could mention on the vote that this is without prejudice; that people are encouraged to "update Conrad's code and run it as bot a couple of times a year." in case someone volunteers to do so. --Daniel Carrero (talk) 15:05, 13 July 2015 (UTC)[reply]

Template:archive-top

Previous discussion: Wiktionary:Grease pit/2015/June#Template:archive-top

The terms "passed" and "failed" are not very clear when it comes to RFD/RFDO. It would be clearer to use "kept" and "deleted". However, for RFV, it does make more sense to use "passed" and "failed". Therefore I would like to propose that we change both the displayed text and the template parameter from "passed"/"failed" to "kept"/"deleted" for and only for archives of RFD/RFDO discussions. The downside would be that it would complicate the template logic and possibly confuse the users of the template to have different sets of values for RFD/RFDO and RFV. --Wiki Tiki 89 18:48, 25 June 2015 (UTC)[reply]

I will just point out that you do not even have to think about such inane details if you just use the archiving script I wrote. Which archives better than you ever could manually. — Keφr 18:52, 25 June 2015 (UTC)[reply]

@Kephir: Using "passed" and "failed" for RFD/RFDO archives is still confusing to the readers of the archive, regardless of how it was archived. Has it failed to be deleted, or has it failed to be kept? --Wiki Tiki 89 18:53, 25 June 2015 (UTC)[reply]

I surmise that readers of the archive read it in page view mode, not directly as wikitext. I have no idea how a detail they are not even aware of could confuse them. — Keφr 18:58, 25 June 2015 (UTC)[reply]

"The following information has failed Wiktionary's deletion process." Is that not the text they would see? --Wiki Tiki 89 19:00, 25 June 2015 (UTC)[reply]

Yes, but that is a completely different issue from what template parameters trigger this text. — Keφr 19:18, 25 June 2015 (UTC)[reply]

I believe I said that this concerns "both the displayed text and the template parameter". --Wiki Tiki 89 19:26, 25 June 2015 (UTC)[reply]

If you wish to change the text, just do it. (I was not particularly happy about some phrasing there anyway.) — Keφr 19:34, 25 June 2015 (UTC)[reply]

Well I want to change both, which is why I started this discussion to get consensus. I realize that we would need a bot run and you would have to change your aWa tool, but that shouldn't be too hard. --Wiki Tiki 89 19:40, 25 June 2015 (UTC)[reply]

You can change the template in ways that do not break existing uses. Using bots is unnecessary in that case. — Keφr 20:14, 25 June 2015 (UTC)[reply]

Yes you can. But then people will continue to use what they see. Not everybody uses your aWa tool. --Wiki Tiki 89 20:17, 25 June 2015 (UTC)[reply]

I do not follow. What is wrong with it? — Keφr 20:28, 25 June 2015 (UTC)[reply]

In addition to what I've already mentioned, to maintain consistency between entered content and displayed content. --Wiki Tiki 89 20:35, 25 June 2015 (UTC)[reply]

If you wish to adjust the blurb or add aliases for parameter values, I have little against it, but I think changing existing usage is too much hassle for no benefit. I am fine with current parameter names. And people who are such masochists that they would want to use the template manually should look up its documentation. — Keφr 20:58, 25 June 2015 (UTC)[reply]
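The "aliases for parameter values" option mentioned above can be sketched as follows. This is a minimal illustration in Python, not the actual wikitext or module code of {{archive-top}}; the names `RESULT_ALIASES` and `describe_result` are hypothetical, and the choice of displayed words is an assumption based on the proposal in this thread.

```python
# Hypothetical alias table: every accepted parameter value maps to one
# canonical outcome, so existing uses of "passed"/"failed" keep working.
RESULT_ALIASES = {
    "passed": "passed",
    "failed": "failed",
    "kept": "passed",     # proposed alias, clearer for RFD/RFDO
    "deleted": "failed",  # proposed alias, clearer for RFD/RFDO
}


def describe_result(kind, value):
    """Return the word to display for a closed discussion.

    kind  -- discussion type, e.g. "rfd", "rfdo" or "rfv"
    value -- the parameter value the editor typed
    """
    # Unknown values raise KeyError rather than being guessed at.
    outcome = RESULT_ALIASES[value.strip().lower()]
    if kind.strip().lower() in ("rfd", "rfdo"):
        # Deletion discussions display "kept"/"deleted" whatever was typed.
        return "kept" if outcome == "passed" else "deleted"
    # Verification discussions keep "passed"/"failed".
    return outcome
```

Under this sketch, `describe_result("rfd", "passed")` and `describe_result("rfd", "kept")` both give "kept", so no bot run over existing archives would be needed; whether editors should still be steered toward the new values is the point the thread leaves open.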

So hypothetically, if it didn't take any hassle at all, what would the ideal parameters be? --Wiki Tiki 89 21:01, 25 June 2015 (UTC)[reply]

"0" and "1". Short, sweet and to the point. — Keφr 21:03, 25 June 2015 (UTC)[reply]

<sarcasm>Not "f" and "j", so that you don't have to move your fingers?</sarcasm> You're forgetting that people read code. This is why people do #define TRUE 1 and #define FALSE 0 in C, so that they can type "TRUE" and "FALSE", even though just using "1" and "0" would be much faster. --Wiki Tiki 89 21:11, 25 June 2015 (UTC)[reply]

Well, you already have to move your fingers to type the pipe character; if you take that into account, you may try "\" and "]". However, "0" and "1" offer a nice balance between readability and brevity. They are also much more universal; they would be just as fit when someone proposes to reword the displayed text. — Keφr 09:52, 26 June 2015 (UTC)[reply]

Support adding "kept" and "deleted" as parameters in one way or another: My preference would be to have "kept" and "deleted" as additional parameters that are supported when somebody enters "kept" or "deleted" where they would normally enter "passed" or "failed". Purple backpack89 18:51, 25 June 2015 (UTC)[reply]
I would still like to see the ability to close discussions as "RFD kept" rather than "RFD passed" in archive-top template. That is, I want to be able to enter {{archive-top|rfd|kept}} and {{archive-top|rfd|deleted}}. I don't want to use AWA tool to archive discussions. --Dan Polansky (talk) 20:22, 25 June 2015 (UTC)[reply]

Yeah, that tool's too complex for most editors, and there's not really any harm in keeping templates that aWa mimics. Purple backpack89 20:43, 25 June 2015 (UTC)[reply]

Appendix:Unicode subpages no longer have article links

Due to edits by User:Kephir at Module:character info and Module:character list, redlinks (and also regular links) are no longer showing up at Appendix:Unicode. I don't recall such an action being discussed earlier anywhere in the discussion rooms, so I'm bringing it up here since I'm just curious as to what is going on. Bumm13 (talk) 19:15, 25 June 2015 (UTC)[reply]

Deprecated German spellings

Deprecated in 1996

In case of spellings that were deprecated in 1996 it's kind of easy to see what they are, even though they were inconsistently labeled as obsolete, dated, nonstandard or alternative forms here.

obsolete: obsolete here at WT is a stronger term than archaic and forms deprecated in 1996 aren't even archaic. Thus: obsolete doesn't fit.
dated: many (or even all?) forms which were in use before 1996 are still in use - though maybe rather by older than younger people and also being rarer now than they were years ago. Thus: dated doesn't fit.
nonstandard: Appendix:Glossary#nonstandard: "Not conforming to the language as accepted by the majority of its speakers."
- There were many surveys that showed that a majority is against the reform, so it's doubtful that deprecated spellings aren't "accepted by the majority of its speakers", even though deprecated spellings became rarer and might and at least sometimes do count as errors in schools.
- In many cases many people don't know which form is correct according to the spelling reform of 1996 (2004, 2006, 2011) anyway. This leads to hypercorrections such as "ausser" and "Fussball", and to the use of deprecated forms which aren't recognised as deprecated or are used anyway (and thus most likely aren't nonstandard; e.g. geschrieen).
- Thus: nonstandard is doubtful or doesn't fit.
alternative: If a form is attestable even after the reform and when there is the "the spelling became deprecated" note, this should be fine. At least it's more fitting than the other labels.
Another alternative label: instead of nonstandard (which is doubtful) and alternative (which might be "too neutral"), something like "unofficial" (German: nichtamtlich) might be better. The term isn't doubtful (in contrast to "nonstandard") and is also neutral/describing (and not prescribing like a misuse of "obsolete" or (sometimes) "nonstandard"), but might be more precise (than just "alternative").

Deprecated in 1902

ATM there's no entry which says that a spelling was deprecated in 1902, but anyway:

Forms that were deprecated in 1902 most likely aren't in use anymore and thus aren't attestable for the 21st century. Thus it shouldn't be "alternative form" or "nonstandard form".
Usually forms deprecated in 1902 are easy to understand (e.g. compare Thür and Tür). Thus "dated" should be more fitting than "archaic" or "obsolete".

So

Questions:

How about labeling forms deprecated in 1996 as "unofficial"?
How should words deprecated in 1902 be labeled?

-93.196.234.171 10:54, 26 June 2015 (UTC)[reply]

My two cents. I don't speak or edit in German, but Portuguese has similar issues. Proposal:

Creating Category:Portuguese spellings deprecated by the Agreement of 1990 for platéia, pingüim, vôo and others
Creating Category:Portuguese spellings deprecated by the Agreement of 1945 for êle, tôrre and others
Doing something similar for German, creating categories that tell exactly in which year or which agreement the changes were introduced. In my opinion, just "obsolete" or "dated" don't tell us much if we can narrow it down more. --Daniel 16:15, 26 June 2015 (UTC)[reply]

Re labels: at Wiktionary talk:About_German#pre-1996_spellings_are_.22_forms_of.22_current_spellings, we worked out to use Template:de-superseded spelling of (or Template:superseded spelling of if it would be feasible to greatly expand its functionality without making it prohibitively expensive for the servers and for users who have to add parameters and have the template know that the German spelling reform of 1996 is not the same as the Foobarese spelling reform of 1996). That template handles the variable labelling of things, based on the age of the reform that deprecated them, as "superseded", "obsolete", etc. Re categories, Daniel's basic suggestion is good (precise category names TBD); we do already have Category:German words affected by 1996 spelling reform. - -sche (discuss) 17:54, 26 June 2015 (UTC)[reply]

Regarding that template:

And what about re-superseded spellings, when one spelling superseded another and then got superseded by the former spelling, like daß/dass became daß in 1901 and then officially dass in 1996? Thus, "dass" is a superseded spelling of "daß" (as of 1901-1996), but then "daß" is a superseded spelling of "dass" (as of 1996). Not mentioning that "dass" was deprecated between 1901-1996 is no solution, as this would be a lack of information and in a way it would also be non-neutral.
"Obsolete spelling [...] deprecated in [...] 1901." -- Please read Appendix:Glossary#obsolete: "No longer in use, and no longer likely to be understood." The first part is true (at least in case of dropped "h" like in "Thür"), but the second part is not. "Thür" is likely to be understood as it looks pretty much like "Tür". Thus, as the definition in the glossary uses an "and" and not an "or", "obsolete" doesn't fit - and maybe it even is some kind of false friend of German "obsolet" in the sense of "unneeded". Even spellings which came out of use in the 17th century aren't always obsolete - e.g. uncapitalised words are likely to be understood.
The "First Orthographic Conference" failed, so it doesn't make sense to say that a word was "deprecated in the First Orthographic Conference".
"1600s" is 1600-1609, which is something different then "16th century" which is 1601-1700 - so the parameters should rather be just "1700", "1800" (like in "till 1700", "till 1800").

91.63.247.10 15:48, 1 July 2015 (UTC)[reply]

1600-1700 is the 17th century. Also, how about the label 'wrong'? I don't see a need to avoid prescriptivism when there is a legal prescription. The only way the German orthography could be even more prescriptivist would be if the state put fines on media for misspelling words. Korn [kʰʊ̃ːæ̯̃n] (talk) 20:30, 1 July 2015 (UTC)[reply]

The German language is not owned by the country of Germany. Wiktionary mentions prescriptions because they are often relevant, but does not itself prescribe. And forgive me if I am wrong, but there are people even in Germany who categorically refuse to follow the orthographic reforms; Wiktionary is not here to decide whether these people are right or wrong. --Wiki Tiki 89 20:36, 1 July 2015 (UTC)[reply]

As such, shouldn't we be marking these clearly by what orthographic prescriptions they follow or don't, and ignoring tags like dated until they're really necessary?--Prosfilaes (talk) 22:02, 1 July 2015 (UTC)[reply]

Yes. I feel that dated refers more to words that have naturally fallen out of use, rather than those that were banned. --Wiki Tiki 89 22:05, 1 July 2015 (UTC)[reply]

It's not 100% correct to say that the German language is not owned by the country of Germany, at least in the field of orthography. (As for pronunciation standards: God, no.) Germany, Switzerland, Luxemburg, and I believe Liechtenstein too, have declared the Duden as the legally binding institution for their orthographies. The Duden is based in Germany and its decisions are mainly influenced by discussions within German society and politics. I wouldn't be surprised if its editorial and panel were exclusively German as well. The legal situation is the same in Austria, with their home-based Austrian Dictionary being the entity entitled to decide the rules. Point I want to make is that they're all equally prescriptive. Korn [kʰʊ̃ːæ̯̃n] (talk) 00:31, 2 July 2015 (UTC)[reply]

Yes, but not every German writer in the world lives in the countries you mentioned. And not every German writer that does live in the countries you mentioned actually follows the legally prescribed rules. Should we also say that in the Persian language, any anti-Iranian propaganda is grammatically incorrect? --Wiki Tiki 89 14:21, 2 July 2015 (UTC)[reply]

Depends, is there an authority with any sort of binding power regulating grammar, rather than content, in such a way that anti-Iranian propaganda would automatically fail its requirements? If so, yes. Don't try to be thick on purpose just for political reasons. We do have the label 'misspelling' in English where there is no regulation whatso-fucking-ever and suddenly we're having an argument how labeling something as a misspelling is unacceptable for a language for which every country who has it as a national language has a law regulating its spelling, and all on one based on the same source? Really? Because if there are no misspellings, then we have to include a lot of stuff. I might author a German book entirely in a mixture of runes and devanagari and enter every single one of the words I used here. Descriptivist dictionary ho! Korn [kʰʊ̃ːæ̯̃n] (talk) 14:39, 2 July 2015 (UTC)[reply]

So if there were such an authority in Iran, then you think Wiktionary should follow it as well? Wiktionary is not supposed to take sides—any sides. Wiktionary is only supposed to describe the existing situation. I would have no problem saying "now considered incorrect by Duden" with a link to an appendix page explaining how authoritative Duden is. But we should definitely not mark something as simply "wrong", because that implies Wiktionary supports that view and Wiktionary does not support any views. --Wiki Tiki 89 14:50, 2 July 2015 (UTC)[reply]

The existing situation is that virtually every German speaker considers non-Duden spellings as wrong. Korn [kʰʊ̃ːæ̯̃n] (talk) 16:07, 2 July 2015 (UTC)[reply]

Let's see some evidence of that. I can find plenty of Google Books hits for daß from well after the reform. --Wiki Tiki 89 16:41, 2 July 2015 (UTC)[reply]

I'm not sure what you intended to link me, but I looked at the first twelve pages of the link you gave. The overwhelming majority of hits are from the 18th century, another fair share is from even before that and the two or so hits which are post 1996 are reprints of pre-reform texts. Korn [kʰʊ̃ːæ̯̃n] (talk) 23:12, 3 July 2015 (UTC)[reply]

It was supposed to show you hits from 2005 and later. Some of them are reprints, but it seems to me that some of them are not; although I may be wrong. --Wiki Tiki 89 20:30, 6 July 2015 (UTC)[reply]

I opened the link in another browser and looked over the first ten pages. They mainly are two things: Unedited reprints of older textbooks and books which quote historical writings. There are 5 genuinely new books which use 'daß', but two are false entries in Google, where the actual book cover uses 'dass', which serves us as a warning for looking twice. So in the small sample survey I did in that link, new books with old spellings make up 3%. Korn [kʰʊ̃ːæ̯̃n] (talk) 09:41, 7 July 2015 (UTC)[reply]

re "'1600s' is 1600-1609" = no, 1600s is 1600-1699 in most contexts in English.

As I noted on WT:T:ADE, "no longer likely to be understood" only applies to words; for spellings, the concern is only whether or not they are still in use. Spellings which fell out of use more than a century ago are obsolete unless they are still used for effect, in which case they are archaic.

- -sche (discuss) 22:38, 1 July 2015 (UTC)[reply]

Spellings deprecated by a regulatory authority but still in widespread use are alternative spellings. They are neither non-standard, nor obsolete, nor misspellings, nor wrong. The English Wiktionary, being a descriptivist dictionary, does not label entries or spellings as "wrong" based on stipulations of regulatory authorities. This revision of Eßstäbchen looks good to me: it ranks the spelling as alternative but informs the reader via a usage note that the spelling was deprecated. That is the accurate, informative reporting to the reader that we should strive for. We should not prescribe, but there is no need for us to omit the fact that an authority has deprecated the spelling, since many a reader wishes to know that. --Dan Polansky (talk) 19:26, 3 July 2015 (UTC)[reply]

It seems that "usage note" is too heavyweight for this. We could say "1902 Duden standard version (deprecated by 1996 standard) of ..."? That seems clunky, but short of notation more appropriate for a German-language dictionary, I'm not sure how to compress it. --Prosfilaes (talk) 20:46, 6 July 2015 (UTC)[reply]

Excuse me, but if a spelling which is 1. virtually not used and 2. not part of the official standard of literally every country which uses that language, is not fitting the term non-standard for you, you might need to get a new dictionary since you seem to have grabbed an edition printed in w:Bizarro World. Korn [kʰʊ̃ːæ̯̃n] (talk) 09:41, 7 July 2015 (UTC)[reply]

As for the above claim that certain spellings are "virtually not used", here is Google Ngram Viewer in German corpus, for Eßstäbchen, Essstäbchen, going to 2008. What I see there leads me to report that "Eßstäbchen" is an alternative spelling that still finds plenty of use. On another note, my understanding of the phrase "non-standard spelling" is not as "not fitting a prescriptive, stipulated standard" but rather "very rare, causing surprise to native speakers when seen in print". I oppose attempts to label spellings not fitting prescriptive standards as "non-standard". --Dan Polansky (talk) 19:36, 7 July 2015 (UTC)[reply]

What's the advantage in going all judgmental and calling it non-standard instead of correctly stating which standards it's a part of? Besides being prescriptive, it's misleading to readers to have spellings used in 1965 labeled "non-standard", as if the author was a bad speller or the editor incompetent, instead of labeling it not conformant to the 1996 standard. --Prosfilaes (talk) 20:24, 7 July 2015 (UTC)[reply]

It seems to me that what you're thinking of is what we already have, Template:de-superseded spelling of (which was indeed designed to, among other things, obviate/replace usage notes and provide descriptive language on the sense line). - -sche (discuss) 17:44, 7 July 2015 (UTC)[reply]

In Brennessel, the template produces this: "Former spelling of Brennnessel which was deprecated in the spelling reform (Rechtschreibreform) of 1996." That seems slightly misleading since the spelling is not so much former as alternative; it is not "former" because it is still in use, and furthermore, because it is still a spelling; if something is "former X", it means it is "no longer X", so "former spelling" suggests "it used to be a spelling but is no more". For brevity and accuracy, the template should IMHO better produce "Alternative spelling of Brennnessel deprecated in the spelling reform of 1996.", having changed "former" to "alternative" and having dropped "which was" and "(Rechtschreibreform)" for brevity. --Dan Polansky (talk) 19:43, 7 July 2015 (UTC)[reply]

A concern with labelling Foobar an "alternative spelling of Fubar deprecated in the spelling reform of 1996" is that it could be interpreted as saying that Foobar used to be an alternative spelling of Fubar (i.e. that both Foobar and Fubar were standard) until Foobar was deprecated in 1996, leaving only Fubar. This is actually the case with some words, e.g. geschrieen (both geschrieen and geschrien were standard before 1996, now only geschrien is). Just "spelling of Fubar which was deprecated in the spelling reform of 1996" ("which was" strikes me as necessary for clarity, especially if we drop the adjective) is one idea, but it fails to note that the spellings were formerly standard. Hmm... I have changed the wording to "Formerly standard spelling of Fubar which was deprecated in the spelling reform (Rechtschreibreform) of 1996." - -sche (discuss) 17:53, 17 July 2015 (UTC)[reply]

@-sche: By my lights, the word "standard" should not appear anywhere in the output of the template, whether as "standard", "non-standard" or the like. Put differently, "standard" is not a lexicographical category, IMHO. --Dan Polansky (talk) 10:19, 19 July 2015 (UTC)[reply]

I notice that the argument with me is beginning to turn in circles, so I'll make my final case, in order to not hinder the progress of the discussion. There is only one standard. Technically, there are two standards, Austrian Dictionary and Duden, but I can't think of anything where they would diverge in terms of spelling. Then there are a zillion reprints of university textbooks, for which it is neither necessary nor cost efficient to revise their spelling, and texts quoting works with older spellings, which fuck with robots like Google, and then you get a preciously scarce group of ultraconservatives who consciously do not follow the reform. The official standard isn't some fringe idea, it's what defines what people perceive as wrong and right. The ß you can get past people as an odd quirk, but omitting an N in Brennnessel will not be considered as a dated alternative, it will be perceived as a mistake by something between virtually and literally everyone. I know it's not the best argument, but I have to say it once, because it keeps popping up in my mind: I can't help but feel that the local foreigners have a wrong impression of how the average and absolutely crushing majority of German speakers see the reform. (They don't care. The state said 'this is correct now' and they accept that as the new given.) It might not be perfectly clear why this bothers me so much, so let me voice why I'm so irked by this debate: I am of the honest and stern conviction that we do a disservice to our users if we do not abundantly clearly tell them that these spellings are not equal to the official spellings. 'Alternative' doesn't cut it. The very least we can do is 'proscribed', personally, I'd move for 'misspelling'. If you believe that such a thing as a 'misspelling' does exist as a concept, this is it. Brennessel, adultry, who are we to judge? My answer to that is: We're a dictionary, not a copy shop. We don't embezzle context information.
Korn [kʰʊ̃ːæ̯̃n] (talk) 22:39, 7 July 2015 (UTC)[reply]

Can you express that in some way that's not so dated? Like a YouTube video, or Twitter?

When B. F. J. Scheller published their Die amerikanische Brennessel, they correctly spelled the title. No force in heaven or on Earth can turn back the hands of time and make that a misspelling. Any dictionary that presumes to cover the totality of cited human writing like Wiktionary does would fail if it were to mark a word that was correctly spelled as a misspelling because some reform one hundred years later presumed to make it so. Even if there is but one standard today, it differs from the standard then.

I have no clue what you mean by "we don't embezzle context information." It is entirely appropriate for a dictionary to clearly state what standards spellings adhere to. --Prosfilaes (talk) 07:48, 8 July 2015 (UTC)[reply]

Requests for quotations

From time to time I come across notes like "Can we find and add a quotation of <author's name> to this entry?" I am curious about these since there is no explanation (and certainly no obvious reason) why some seemingly random author should be so important for some particular word sense. And, if there is a good reason, i.e. the editor knew of a particularly pertinent quote, then why didn't the editor just add it? I would be interested to know more about the reason for these notes. 217.44.208.136 23:54, 26 June 2015 (UTC)[reply]

Webster 1913 often cited authors who used a word, but didn't give the citation; we have copied these while importing the Webster data, because sometimes there are very few authors who ever used a rare word. Also, sometimes we're too busy to fill out the entire entry at the time. Equinox ◑ 00:03, 27 June 2015 (UTC)[reply]

For further background, Webster 1913 is a core source of Wiktionary entries, being out of copyright and available in readily usable form. Also, it is simply a lot of work to add citations. It would take at least 1,000 hours for one person to add just the citations marked with the template {{rfquotek}}, assuming they can manage 10-11 per hour. Try adding just one. DCDuring TALK 00:20, 27 June 2015 (UTC)[reply]

I see, thanks for info. 217.44.208.136 00:45, 27 June 2015 (UTC)[reply]

Format of definitions

I think it looks messy that some definitions start with a capital letter and end with a full stop, while others don't. The page Wiktionary:Entry layout explained says "Each definition may be treated as a sentence: beginning with a capital letter and ending with a full stop." This seems undesirably vague to me. Is there a reason why "may" is not "should"? Another slightly messy inconsistency is that some verb definitions begin with the infinitive marker "to", while others do not. Apologies if I missed it, but I don't see any instruction about this on the "Entry layout explained" page. I think this should be covered. 217.44.208.136 00:49, 27 June 2015 (UTC)[reply]

A definition consisting of a single capitalised word followed by a full stop looks silly. One word doesn't make a sentence. —CodeCat 01:16, 27 June 2015 (UTC)[reply]

(Almost) no definitions are grammatically full sentences. It doesn't make any difference in that respect whether a noun phrase, for example, is one word or twenty. 217.44.208.136 01:19, 27 June 2015 (UTC)[reply]

How do other dictionaries do it? —CodeCat 01:20, 27 June 2015 (UTC)[reply]

Well, you can see as well as me that they vary. I should say that I am not especially certain that the "sentence" format is the best. I think the main thing is that all entries should be consistent. If you can find a dictionary that has inconsistency like Wiktionary then that would be more interesting. 217.44.208.136 01:28, 27 June 2015 (UTC)[reply]

We've sought consistency and have not achieved it. Non-English entries are overwhelmingly lower case without period. English entries are mostly upper case with period. Other formats exist but tend to be converted to the dominant format for English or FLs. DCDuring TALK 01:34, 27 June 2015 (UTC)[reply]

Do you know offhand what the barrier(s) to achieving consistency are/were? 217.44.208.136 01:38, 27 June 2015 (UTC)[reply]

Adherents of one or the other option being vehemently against codifying anything but their preference, and the sheer magnitude of work required to synchronize millions of entries that are constantly being edited by an unknown number of people at all hours of the day and night. Chuck Entz (talk) 03:45, 27 June 2015 (UTC)[reply]

The inconsistency between English and FLs is largely attributable to the fact that English definitions tend to be longer, closer to full sentences in length if not structure. (How could a substitutable definition of anything other than a sentence be a sentence?) FL definitions are most frequently a single English word (whether or not that should be the case), sometimes with a disambiguating gloss, for polysemic words (and homonyms). Inconsistency within English is due to some contributors having disagreed with me. DCDuring TALK 04:12, 27 June 2015 (UTC)[reply]

I wonder whether there is any appetite to look at this again with a view to settling on a single format, at least for English. It's not so bad if separate foreign-language sections have different formats from English, but when adjacent English definitions are formatted differently, the effect is, as I say, messy. It does not look designed or intentional, but just like different people are doing different things at random. Even if exact criteria were developed for choosing one format over the other, say based on length, I think that for ordinary readers there would always be an arbitrary-seeming cutoff point at which one would ask "Why are these two formatted differently?" I do not agree that hope of standardisation should be abandoned just because Wiktionary is user-editable. Sure, people may not follow standards, and things may have to be corrected, and they may go uncorrected for a long time, but the same is true of any layout or style requirement. If you go down that route you might as well give up on the whole of "Wiktionary:Entry layout explained". By the way, does anyone have a view on my other point about use of "to" in verb definitions? 217.44.208.136 11:55, 27 June 2015 (UTC)[reply]

I like the use of to in English verb definitions as it accelerates and confirms the recognition that the definition is for a verb. On mobile screens and for longer definitions the PoS heading may not be visible. For a non-native speaker especially the possibility that a defining word is a verb and has a homonym that is of another part of speech adds to the potential for confusing, even ambiguous definitions. DCDuring TALK 12:52, 27 June 2015 (UTC)[reply]

It does look messy when there's inconsistency, but for the most part, the definitions are capitalized and punctuated for English words, and not for non-English words. I have been taking care of inconsistencies as I come across them (that goes for the occasional headers that are the wrong sizes). In fact, those little inconsistencies are what started me editing Wiktionary not too long ago, since they were bugging me. I definitely think we should strive for consistency in every way possible. JodianWarrior (talk) 22:20, 30 June 2015 (UTC)[reply]

We do have some more important problems, like missing definitions, wording of a definition of a part of speech that makes it seem like another part of speech, confusing order of definitions, etc., but working on format is a great way to get exposure to the content of a lot of definitions. By doing so one can pick up prevailing good (and not so good) practice in definitions and other parts of entries. DCDuring TALK 23:20, 30 June 2015 (UTC)[reply]

Strange tables in Korean entries

I think that these need to be removed and the "orthoepy" sent to ko-pron. 64.40.43.48 04:24, 27 June 2015 (UTC)[reply]