Wiktionary:Beer parlour/2018/July

Entries for hyphenated attributive forms?

We have entries such as transitive-verb, at-sign, open-book, criminal-law, shoulder-blade, sea-urchin (see a more complete list here). The hyphen being a mere spelling device, I think these are pointless, and I would like to see them deleted.

However, people have argued that using a hyphen automatically turns something into a single word; I disagree with that, and I'm not aware of any policy to that effect. Has there been a vote, or might we need one?

My point is that we should restrict ourselves to creating lexicalised attributive-form entries, such as cookie-cutter (idiomatic meaning, adjectivisation). Per utramque cavernam 15:16, 1 July 2018 (UTC)[reply]

I have suggested a vote to the effect that hyphens added when a phrase is used attributively would be treated as spaces for the purposes of determining whether the phrase is SOP. It was in the middle of a long discussion last month that you may not have read, and it was marginally off-topic to that discussion. The universe of possible attributive phrases is simply too large for us to cover: "6-inch bolts", a "27-foot boat", "Reform-Jewish-rabbi-officiated weddings", etc. Chuck Entz (talk) 16:06, 1 July 2018 (UTC)
@Chuck Entz: I don't remember reading that discussion, no. Where was it?
Found it. Per utramque cavernam 13:59, 2 July 2018 (UTC)[reply]
Yes. The actual (attested) attributive forms will only be a tiny subset of all possible combinations, but even then it might be a huge set. Attestation is a necessary condition for having an entry, but I don't think it should be a sufficient one. Per utramque cavernam 17:20, 1 July 2018 (UTC)[reply]
OT: I have enough trouble with hyphens appearing in 'vernacular' organism names. Is it blue moor grass, blue moorgrass, or blue moor-grass (all attestable at Google Books), just to mention one I recently ran across? At least I'm fairly sure that blue-moor grass, bluemoor grass, blue-moorgrass, blue-moor-grass, and bluemoorgrass can be ignored. DCDuring (talk) 17:48, 1 July 2018 (UTC)
You've got my vote. DCDuring (talk) 17:48, 1 July 2018 (UTC)[reply]
For reference, this was discussed at Talk:transitive-verb#RFDE: All English attributive forms (with hyphens) of noun phrases. Note that treating hyphen as space for the SOP determination is a separate issue; here, transitive verb is kept, yet someone wants to delete transitive-verb, where the sum is not transitive + "-" + verb but rather transitive verb + hyphenation-operator, or the like. --Dan Polansky (talk) 18:54, 1 July 2018 (UTC)[reply]
In general I think we should delete entries that are purely for attributive-noun uses, like transitive-verb, but keep entries that can function as non-modifying nouns themselves, e.g. at-sign and probably shoulder-blade (which might be a legitimate British spelling of shoulder blade, as suggested by being listed as an alternative form under shoulder blade). In such entries, I'm undecided about whether to list the attributive use as a possible definition (as it is done under at-sign), and also undecided about cases like open-book, which has both a definition as a non-SOP adjective and a definition as an attributive noun. The logic here is that the hyphen in attributive-noun uses is purely a typographic convention and shouldn't be treated differently from a space. It should be similar to German compounds, where words that function only as SOP compounds aren't included even though written as a single word. Benwing2 (talk) 15:08, 2 July 2018 (UTC)[reply]
If "The logic here is that the hyphen in attributive-noun uses is purely a typographic convention and shouldn't be treated differently from a space", then transitive-verb should be kept no less than transitive verb, since, again, "hyphen ... shouldn't be treated differently from a space". --Dan Polansky (talk) 08:40, 3 July 2018 (UTC)[reply]
"It should be similar to German compounds, where words that function only as SOP compounds aren't included even though written as a single word": That is not our practice, as per e.g. Talk:Zirkusschule. --Dan Polansky (talk) 08:42, 3 July 2018 (UTC)[reply]
This conversation doesn't seem to have come to a conclusion, but I would like to see the attributive forms converted to redirects; I have yet to encounter one that provides any useful information beyond the fact that multi-part nouns are hyphenated when used attributively. - TheDaveRoss 15:23, 10 January 2019 (UTC)

Slovenian Pleteršnik orthography

@Atitarev, Guldrelokk, Dan Polansky Slovenian orthography is very confusing, as there are at least two incompatible diacritic systems (see Appendix:Slovene pronunciation). On top of this, Pleteršnik's dictionary uses yet another system that I don't understand; see [1] for an example. Apparently this system encodes a lot of additional dialectal information, but I haven't been able to find a description of this system and I can't read Slovenian. Can anyone help discover what the Pleteršnik symbols mean? Thanks! Benwing2 (talk) 17:53, 1 July 2018 (UTC)[reply]

BTW, see [2] for a somewhat blurry image of the page that explains the symbols. It's in Slovenian; maybe someone can read it? Benwing2 (talk) 17:58, 1 July 2018 (UTC)[reply]
The description is here. It seems to say:
ẹ and ọ signify close vowels, ę and ǫ signify diphthongs /ie/ and /uo/, which are always long and only occur in stressed syllables. e and o stand for open vowels.
ɐ is [ə].
ł is [u̯].
Three kinds of accent exist: two for long vowels, falling, marked with circumflex, and rising, marked with acute, and one on short vowels, marked by grave. Guldrelokk (talk) 18:12, 1 July 2018 (UTC)[reply]
(with e/c) I know very little about Slovenian. I am not sure what you are trying to do. Your link[3] shows "zdẹ̀"; are you trying to figure out how to render that in IPA? I suspect that "zdẹ̀" is not an actual attested form but rather a dictionary-only form adorned to show pronunciation, and that the attested usual form is "zde"; but I don't really know. For example, tukaj is shown in en wikt as "túkaj" and is shown in Pleteršnik as "tȗkaj" per Fran[4]. If I am right, we are not talking orthography but rather forms adorned to show pronunciation. --Dan Polansky (talk) 18:20, 1 July 2018 (UTC)[reply]
@Benwing2: It also says that macron in loanwords only signifies length and that these vowels are pronounced as ‘pure’. In the dictionary it seems to work like another kind of accent (there is only one per word), apparently it was pronounced as a long vowel with flat tone. Guldrelokk (talk) 18:52, 1 July 2018 (UTC)[reply]
From their description, can you figure out how "ȗ" is to be pronounced, used in "tȗkaj"? --Dan Polansky (talk) 18:56, 1 July 2018 (UTC)[reply]
@Dan Polansky: [ûː], i.e. [ú͜u]. Written identically in the tonal orthography from the Appendix, it’s also in the entry: Tonal orthography: tȗkaj (why are the lemmas in the ‘stress orthography’?). Guldrelokk (talk) 19:05, 1 July 2018 (UTC)[reply]
@Dan Polansky Perhaps "orthography" is the wrong word; maybe "notation" is better. In this case, the etymology for the Russian entry for здесь mentions Slovenian zde. I would rather cite Slovenian words in etymologies in the tonal orthography if possible, as it conveys more etymological information. However, some words (like this one) are available in Fran only in the Pleteršnik notation, and in that case my choices are either to cite it directly in that notation along with a note indicating that it's Pleteršnik's notation (which links to a page explaining that notation), or to try to convert it to normal tonal orthography. Cf. templates like Template:l/sl-tonal, which is used to cite Slovenian words in the tonal orthography and adds a note indicating that the word is in the tonal orthography, with a link to Appendix:Slovene pronunciation, the page that explains the diacritics. This is necessary because, unlike with Serbo-Croatian, there are (at least) two different possible notations, which are incompatible with each other, so without the note, it would be unclear which notation is being used. Benwing2 (talk) 19:06, 1 July 2018 (UTC)[reply]
@Guldrelokk IMO we should always be using the tonal orthography, but I've heard that nowadays most Slovenians pronounce words non-tonally, so they may be more familiar with the non-tonal notation. Benwing2 (talk) 19:08, 1 July 2018 (UTC)[reply]
@Benwing2: It seems that the tonal orthography is basically Pleteršnik’s notation, except that the pronunciation has changed somewhat: there is (apparently) no more short or unstressed ẹ/ọ, no diphthongs and no ‘flat tone’. No idea what happened to them. Guldrelokk (talk) 19:18, 1 July 2018 (UTC)
@Benwing2, Guldrelokk, Dan Polansky: Late response. I'm not too familiar with the Slovene tonal notation either and when adding Slovene terms in translations, etymologies or reconstructions, I mostly just use the plain spelling, unless it's already defined here by native/advanced speakers. Rather than making/copying a mistake, I prefer to use what can be confirmed. A good Slovene dictionary is [5] - no tonal notations. You can also use monolingual [6] with some tonal notations. --Anatoli T. (обсудить/вклад) 07:12, 2 July 2018 (UTC)[reply]
@Benwing2, Atitarev I wonder what monolingual dictionaries use stress notation? Guldrelokk (talk) 13:13, 2 July 2018 (UTC)[reply]
@Guldrelokk: If you learn the notation and the phonology a bit, it will give you the stress as well. The notation à la Ali govoríte slovénsko? is probably enough to know how to pronounce Slovene correctly, if you're already familiar with basic phonetic rules. And [7] I mentioned above is probably your best bet online. --Anatoli T. (обсудить/вклад) 13:35, 2 July 2018 (UTC)[reply]
@Atitarev: Yes, exactly, the ‘stress notation’ is redundant once you have the ‘tonal’, and the only reason it’s there is its alleged use by natives. However, I see that the monolingual dictionary you pointed out ([8]) uses the ‘tonal’ notation. What do other monolingual dictionaries use? Guldrelokk (talk) 13:45, 2 July 2018 (UTC)[reply]
@Guldrelokk: SSKJ2 appears to use both; headwords are in the stress notation but then they put the tonal notation in parens after. See for example [9], which has a whole bunch of dictionaries including SSKJ2 and Pleteršnik. Benwing2 (talk) 14:49, 2 July 2018 (UTC)[reply]

Default title for column templates

Views are sought on what the default title for column templates such as {{der2}}, {{der3}}, {{der4}}, {{rel2}}, {{rel3}}, and {{rel4}} should be. @Dan Polansky feels it should be the same as the relevant section heading (e.g., "Derived terms"), whereas I am of the view that it is more useful for the title to be "Terms derived from xyz", "Terms related to xyz", for two reasons:

  • It is (marginally) more useful for the template title to display the root term rather than simply repeat the section heading.
  • Where it is necessary to manually add a title, the practice is to put it in the form "Terms derived from xyz (noun)", "Terms related to xyz (verb)", and so on. Thus, for consistency, the default title should be in the same format.

SGconlaw (talk) 10:33, 2 July 2018 (UTC)[reply]

A discussion is at Wiktionary:Beer parlour/2018/June#Display text of Template:der3 and others. In that discussion, there is a post by -sche that makes a lot of sense. I acknowledge that repeating "Related terms" after "Related terms" is not so nice, so I propose other possibilities, like leaving the collapsible bar empty, or saying "Items:", or "List:", or coming up with other options that are user-friendly and non-repetitive. In that discussion, I give an example of how the entry party looks; not so nice. --Dan Polansky (talk) 08:48, 3 July 2018 (UTC)
As for "Terms derived from xyz (noun)", that is another cruft that should ideally be reduced. The derived term section in question is in the noun section, so this does not need to be repeated. By my estimate, the practice originates from some people's taking pleasure in expressly stating obvious things and things of marginal relevance. These are two very different aesthetics. --Dan Polansky (talk) 09:38, 3 July 2018 (UTC)[reply]
I have no strong views on the latter point, but would just like to highlight that having a phrase like "Terms derived from xyz (noun)" does make the section easier to locate in a long entry. — SGconlaw (talk) 10:39, 3 July 2018 (UTC)[reply]

{{look}}

News from French Wiktionary

 

Hello!

Sorry that we skipped two months; we lacked people available to translate our publication into English. But this edition is ready, and it's a great pleasure to invite you to read the June issue of Wiktionary Actualités!

A lot of content this time! Articles are about examples, a new tool to record pronunciations, the integration of a specialized lexicon offered by its authors, a dictionary of popular words from the 19th century, the linking with Wikipedia and a story about fishermen and fishes. As usual, there are also plenty of metrics, a short summary of some newspaper articles and unidentified pictures!

This issue was written by nine people and was translated for you by Dara. Readers can still improve the translation (wiki-spirit). I hope you will find it interesting to learn what's happening in your neighborhood!   Noé 09:11, 3 July 2018 (UTC)

@Noé: Merci! Was wondering what happened. I'm always willing to help out with translation work if needed. – Jberkel 09:39, 3 July 2018 (UTC)[reply]
The April and May issues are now translated too! @Jberkel: too bad I only saw your message now; you can check these issues for mistakes, and maybe we could call on you for the next one! DaraDaraDara (talk) 11:12, 4 July 2018 (UTC)

July LexiSession: sauces

This month's suggested topic is sauces! Because of Caesar sauce, maybe.

In French Wiktionary, we have just started by creating a thesaurus.

LexiSession in short: a collaborative trans-Wiktionary experiment. Several wikis, one topic, learning by looking at what our colleagues do. You're invited to participate however you like and to suggest next month's topic. The idea is to look at other communities' improvements on the same topic, to improve our own pages and learn other ways of contributing. If you participate, please let us know here or on Meta, so we can keep track of how LexiSession evolves. If you can spread the word to other Wiktionaries, you are welcome to do so. Also, sorry: I was very busy these last two months and forgot to notify you.   Noé 09:16, 3 July 2018 (UTC)

Improving sorting of items in categories via MediaWiki customization

Currently, in categories, items starting with č are sorted after items starting with z. This is very unconventional. For instance, in Category:cs:Amphibians, čolek comes after salamandr. The example is Czech, but a similar problem probably exists for other languages as well.

As a remedy, it seems we could customize the en wikt MediaWiki instance to use uca-default for sorting items in categories. Czech Wiktionary has this, introduced via cs:Wikislovník:Hlasování/Změna abecedního řazení v kategoriích. An example Czech category is cs:Kategorie:Česká substantiva; an example Russian category is cs:Kategorie:Ruská substantiva.

One consequence would be that, for instance, instead of č being after z, it would be collated together with c. That is still not the conventional Czech order, in which č is after c rather than being on equal footing, but still seems to be an improvement.

A relevant page is https://www.mediawiki.org/wiki/Manual:$wgCategoryCollation.

Maybe someone knows more and can explain the impact across languages.

--Dan Polansky (talk) 10:14, 3 July 2018 (UTC)[reply]
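To make the difference concrete, here is a minimal Python sketch of the two orderings described above. It is not MediaWiki's collation code: `folded_key` is a hypothetical helper that only approximates the effect of a UCA-style collation by comparing base letters first, and the word list is illustrative rather than the actual contents of the category.

```python
import unicodedata

def codepoint_key(title):
    """Sort by raw code points: č (U+010D) lands after z (U+007A)."""
    return title

def folded_key(title):
    """Rough stand-in for a UCA-style key (an assumption, not the real
    algorithm): compare base letters first, original string as tie-breaker."""
    decomposed = unicodedata.normalize("NFD", title)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), title)

# Illustrative titles only; "\u010dolek" is "čolek".
titles = ["salamandr", "\u010dolek", "mlok", "citron"]

by_codepoint = sorted(titles, key=codepoint_key)
by_folded = sorted(titles, key=folded_key)

print(by_codepoint)  # č-initial word ends up last
print(by_folded)     # č-initial word is collated together with c
```

With the folded key, čolek is sorted among the c-words rather than strictly after all of them, which matches the "collated together with c" behaviour mentioned above.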

There's no perfect solution: ä, for example, is sorted with a in German and after z (and å) in Swedish. But that would get it closer to how an English speaker would expect it.--Prosfilaes (talk) 03:39, 7 July 2018 (UTC)[reply]
Ideally what we need is a mechanism to enable per-category collation: that is, Czech collation for Czech categories, German collation for German categories, and so on; multiple sortkeys per page, for Japanese; and a way to write our own collation algorithms for languages that do not have collation algorithms available, such as Ancient Greek, Egyptian, and Coptic. (See Module:egy-utilities for a makeshift collation algorithm used for Egyptian words in Module:columns, Module:cop-sortkey for one that is used in Coptic categories. Module:zh-sortkey provides a sortkey for Chinese categories, but MediaWiki might have an equivalent collation algorithm that would be available if we had per-category collation.) At least per-category collation has been proposed (see phab:T30397), but I don't know what's happened with it since 2012.
Besides changing the default collation, a workaround is to add some sort_key replacements to the language table. It would be ugly, but I think you could impose the order c < č < d by replacing c with cc and č with . — Eru·tuon 00:06, 8 July 2018 (UTC)[reply]
Surely the "natural kludge" would be to use cˇ instead of č, o´ instead of ó, etc. (though of course this doesn't really work for cases like ł). --Tropylium (talk) 10:10, 14 July 2018 (UTC)[reply]
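A quick Python sketch of the "natural kludge" follows, assuming the sort key is simply the NFD-decomposed title compared by code point. This is an illustration of the idea, not the sort_key mechanism of the language table; `decomposed_sort_key` is a hypothetical name, and the sample words are made up for the example.

```python
import unicodedata

def decomposed_sort_key(title):
    """Split letters into base letter + combining mark (č -> c + U+030C).
    Combining marks have higher code points than ASCII letters, so "č..."
    sorts after every "c..." word but still before "d"."""
    return unicodedata.normalize("NFD", title)

# "\u010daj" is "čaj"
words = ["dar", "\u010daj", "cukr", "car"]
ordered = sorted(words, key=decomposed_sort_key)
print(ordered)

# ł (U+0142) has no canonical decomposition, so the trick does nothing for it.
l_stroke_unchanged = unicodedata.normalize("NFD", "\u0142") == "\u0142"
print(l_stroke_unchanged)
```

One caveat of this sketch: the same rule applies to every accented letter, so á would sort after all plain a + letter sequences as well, which may or may not be wanted for a given language.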

Most searched-for entries

Do we have a list of the most searched-for – or even viewed – pages here? I couldn't see anything under the Special Pages list. It would be nice to make a concerted effort to work on things that most people are looking up. Ƿidsiþ 11:43, 4 July 2018 (UTC)[reply]

[10]. DTLHS (talk) 16:45, 4 July 2018 (UTC)[reply]
Thanks for both the question and the answer. DCDuring (talk) 17:58, 4 July 2018 (UTC)[reply]
Nice! Thanks. Ƿidsiþ 06:42, 5 July 2018 (UTC)[reply]
Well, it was a good thought, but I think that I'd rather not waste your efforts on improving our pornographic content. —Μετάknowledgediscuss/deeds 06:54, 5 July 2018 (UTC)[reply]
:-D — SGconlaw (talk) 06:48, 6 July 2018 (UTC)[reply]
I think this is part of the same phenomenon as the rash of bogus "xx" content being added: as far as I can figure out, Wiktionary is being bundled with mobile operating systems in Africa and Asia, and there are lots of users who don't speak English well enough to realize that it's a dictionary and not part of the user interface. They apparently think they're searching the web for porn sites, but they're actually searching Wiktionary. Chuck Entz (talk) 07:31, 6 July 2018 (UTC)[reply]
What's odd is that people are searching for Roman numerals like XXXIX and XXIX. Must be typos, I guess! — SGconlaw (talk) 08:05, 6 July 2018 (UTC)[reply]
Not odd at all, considering the search engine's auto-suggestion feature. Chuck Entz (talk) 18:28, 6 July 2018 (UTC)[reply]
These are views, not searches. So if someone searches for "XXXIX definition" in google (because it's the number of the current Superbowl or something) they may end up here. DTLHS (talk) 18:38, 6 July 2018 (UTC)[reply]
In context, this is clearly not about SuperBowls... —Μετάknowledgediscuss/deeds 18:39, 6 July 2018 (UTC)[reply]
The February 2018 Super Bowl was LII. DCDuring (talk) 18:54, 6 July 2018 (UTC)[reply]
Look at the logs for Abuse Filters 54, 70, and 74 (to start with). Obviously people (probably horny teenagers) from the areas I mentioned are entering a lot of "x"es in the search engine, and when the auto-suggest doesn't land them in actual entries, they're going to the "not found" page, which gives them plenty of buttons for creating entries. Chuck Entz (talk) 19:50, 6 July 2018 (UTC)[reply]

Livonian alphabet

After some discussion, it seems Livonian (e.g. the word Lețmō) should use ţ with cedilla (like Latvian), not the Romanian ț (t with comma below). The Latvian-Livonian-English Phrase Book (gramata.pdf[11]) uses the cedilla, but the Tartu University Estonia-Lețkēļ dictionary[12] uses the Romanian ț in its entries for some reason. --Mikko Paananen (talk) 13:20, 4 July 2018 (UTC)

I think the usage of the Romanian ț is due to technical reasons. I think that Latvian ţ should be used here though, since there are no restrictions like that here. SURJECTION ·talk·contr·log· 23:00, 5 July 2018 (UTC)[reply]
I can't imagine what technical reasons would require the use of ț instead of ţ. Unicode-wise, ț was added in a later edition (3.0), whereas ţ was there from the start. Though I'm confused about Latvian; w:Latvian alphabet doesn't show any modified t's. w:ţ says it's only used in a Turkic language.--Prosfilaes (talk) 03:30, 7 July 2018 (UTC)[reply]
This is probably an artifact related to how k g n l r with cedilla as also used in Livonian (and Latvian) are rendered with a comma-like diacritic in most fonts; it seems clear that a single palatalization diacritic is what's aimed for here. --Tropylium (talk) 10:16, 14 July 2018 (UTC)[reply]
See w:T-comma#Software_support. It was only added in a later Unicode version, and as a result, many fonts did not support it initially, replacing all instances with T-cedilla instead. That is still done for many Romanian texts (at least according to the article). SURJECTION ·talk·contr·log· 18:52, 14 July 2018 (UTC)[reply]
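For anyone checking entries, the two letters can be told apart programmatically. The Python sketch below uses only the standard `unicodedata` module; the helper `to_cedilla` is a hypothetical name, and converting comma-below to cedilla is just the direction proposed in this thread, not an established policy.

```python
import unicodedata

T_CEDILLA = "\u0163"      # ţ
T_COMMA_BELOW = "\u021b"  # ț

# Map comma-below forms (upper and lower case) to their cedilla counterparts.
COMMA_TO_CEDILLA = str.maketrans({"\u021a": "\u0162", "\u021b": "\u0163"})

def to_cedilla(text):
    """Replace t with comma below by t with cedilla (hypothetical helper)."""
    return text.translate(COMMA_TO_CEDILLA)

for ch in (T_CEDILLA, T_COMMA_BELOW):
    # Print the code point and the official Unicode character name.
    print("U+%04X" % ord(ch), unicodedata.name(ch))

# "Le\u021bm\u014d" is the word quoted above, spelled with comma below.
converted = to_cedilla("Le\u021bm\u014d")
print(converted == "Le\u0163m\u014d")
```

The two characters also decompose differently under NFD (t + combining cedilla versus t + combining comma below), so a plain normalization pass will not unify them; an explicit mapping like the one above is needed.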

Hi. I just want to announce again that the English Wiktionary has a Discord server. If you are a Discord user and a Wiktionary editor, we would very much appreciate if you join in via this permanent invite. We would love to have you there. Cheers, and happy editing! PseudoSkull (talk) 05:44, 6 July 2018 (UTC)[reply]

Multi-stage borrowing

When a word is borrowed from language A into B, and then from B into C, would you say that C has borrowed the word from both A and B, or just from B? So for example, at kakao (Nahuatl > Spanish > Danish), one could say

From {{bor|da|es|cacao}}, from {{der|da|nah|cacahuatl||cocoa}}.

as I've put, or

From {{bor|da|es|cacao}}, from {{bor|da|nah|cacahuatl||cocoa}}.

depending on whether one thinks Danish can be said to have borrowed from Nahuatl.__Gamren (talk) 14:48, 6 July 2018 (UTC)[reply]

The first one. Whether Spanish borrowed from Nahuatl or not doesn't really matter. —AryamanA (मुझसे बात करेंयोगदान) 15:21, 6 July 2018 (UTC)
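The convention in that answer (only the immediate source gets {{bor}}, every earlier stage gets {{der}}) is mechanical enough to sketch in Python. `etymology_chain` is a hypothetical helper written for this illustration; it only reproduces the template syntax already quoted above and is not an existing tool.

```python
def etymology_chain(target_lang, stages):
    """Build etymology wikitext for a borrowing chain.

    stages: list of (source_lang, term, gloss_or_None), nearest source first.
    The first stage uses {{bor}}; all later stages use {{der}}.
    """
    pieces = []
    for index, (source_lang, term, gloss) in enumerate(stages):
        template = "bor" if index == 0 else "der"
        params = [template, target_lang, source_lang, term]
        if gloss:
            # An empty parameter precedes the gloss, as in the quoted example.
            params += ["", gloss]
        lead = "From" if index == 0 else "from"
        pieces.append(lead + " {{" + "|".join(params) + "}}")
    return ", ".join(pieces) + "."

kakao = etymology_chain("da", [("es", "cacao", None),
                               ("nah", "cacahuatl", "cocoa")])
print(kakao)
```

For the kakao example this yields the first of the two variants quoted above.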

Entries descending from themselves

On reconstruction pages the different Persian varieties are now normally grouped as descendants of Classical Persian (see, for example, Wiktionary talk:About Persian#Tajiki_Persian_is_not_descended_from_Iranian_Persian.). An example of such a layout is at *wŕ̥kah.

However, Classical Persian has (understandably) not been given separate headings and entries; instead Classical and Modern Persian words are united under the ‘Persian’ heading. See, for example, گرگ. The only difference between links to fa and fa-cls is the language name before the lemma.

Now, a page آهو listed itself as its own descendant. On the talk page @Victar argued that it is correct; it represents the same inheritance of the Modern Persian word from the identical Classical Persian word. However, I don’t think that the layout of reconstruction pages should allow entries to list themselves as their descendants, for the following reasons:

  1. It is confusing; the heading says Persian and one of the descendants is Persian as well, being no different. Moreover, Dari and Tajik words are first listed as regional variants, and then again as descendants; I understand that it’s supposed to represent different instances of the word, one as a modern Persian, one as a Classical, but it is still confusing.
  2. Nor am I aware of any other language that does so. As for Persian, @Victar says it is normal, but I haven’t been able to find any other entries that name themselves their descendants, not even among ones that are descended from Classical Persian on the reconstruction pages, like گرگ.
  3. @Victar argued that such descendants are there to be included into Reconstruction pages with {{desctree}}. However, in reality they break {{desctree}}, as @Chuck Entz pointed out on the talk page. Even if this can be fixed, I fear that such an unusual layout may cause other technical issues.

In my opinion, either Modern and Classical Persian should be separated, with one inheriting from another, or Persian entries shouldn’t list the same Persian entries as their descendants. Continuity with Classical Persian can be implied whenever a word is inherited; a modern word descending from Middle Persian or further must have gone through the Classical Persian stage. Likewise, English entries don’t list themselves as descendants meaning that they come from Early Modern English; it is sufficient to provide a further etymology, or label words that have come out of use since EMnE obsolete. In either case the current layout of (Indo-)Iranian reconstruction entries can be kept – they would simply be the only place to distinguish consistently Modern and Classical Persian in the latter case, as they have reasons for that.

Alternatively, if a آهو-like layout is to be the accepted one, then I think a bot can be made to automatically list every inherited Persian entry as its own descendant, together with useful notes like those @Chuck Entz has added.

Guldrelokk (talk) 21:48, 6 July 2018 (UTC)[reply]

Continuing from the original discussion linked above, here is an example, building on what was discussed there. On the Old Persian entry 𐎠𐎰𐎥 (a-θ-g) we have the descendant tree constructed like so:
* Middle Persian: (/⁠sang⁠/)
*: Book Pahlavi script: 𐮽𐮵𐯋𐮲 (sng), [script needed] (KYPA)
** Bakhtiari: سنگ (sang)
** {{desctree|fa-cls|سنگ|tr=sang}}
Now on the Classical/Modern Persian entry سنگ, we have the descendants like this:
* Iranian Persian: سنگ (sang)
* Tajik: санг (sang)
* Coptic: ⲃⲁⲥⲛϭ (basnc)
* → Hindustani:
** Hindi: संग (sang)
** Urdu: سنگ (sang)
* Ottoman Turkish: سنگ (seng)
  1. I've added a fa-ira etymology code, per @Calak's example in the previous discussion. If that isn't clear enough, I'm not vehemently opposed to having some text in Persian descendant sections that reads (Descendants listed reflect those of Classical Persian). Most borrowings from Persian are from the Classical period, which means that virtually all Persian descendants sections would have that note, so I do find it a tad excessive.
  2. The example for this, which is the basis of the Persian model we currently use, can be seen on Latin entries, where we find descendants in the form of Medieval Latin, Late Latin, etc.
  3. There is nothing mechanically "broken" about this method, if that's what you mean, as you can see it working just fine on both pages.
--Victar (talk) 00:28, 7 July 2018 (UTC)[reply]
I see 𐎠𐎰𐎥 (a-θ-g) works fine, but *HaHĉúkah doesn’t. Guldrelokk (talk) 00:36, 7 July 2018 (UTC)[reply]
@Guldrelokk:, it's broken because an {{rfc}} tag got thrown in there. --Victar (talk) 00:39, 7 July 2018 (UTC)[reply]
Good. Even if the technical issues are resolved, others are not. Latin entries do not list ML and LL forms as their descendants when they are identical, so I don’t see how it’s comparable. Guldrelokk (talk) 00:40, 7 July 2018 (UTC)[reply]
And yet we do do that, especially on reconstructed entries, like *blavus. --Victar (talk) 00:53, 7 July 2018 (UTC)[reply]
What about non-reconstructed entries? These are very different cases: *blāvus and blāvus are different entries, Classical Persian: سنگ and Iranian Persian: سنگ are one and the same. Guldrelokk (talk) 01:00, 7 July 2018 (UTC)[reply]
Another point: your new solution makes {{fa-regional}} redundant. The ‘descendants’ will always be the same; either of these will have to go. Guldrelokk (talk) 01:00, 7 July 2018 (UTC)[reply]
Both Latin headers, both using the same language code la, both identical. There are non-reconstructed examples, as I know I've made some, but hard to sift through thousands of entries; I'll look though. It makes a difference, for example, in Latin descendants from a -w- form, and those from a -v- form.
Not really. It would generally only be used on pages with Classical Persian borrowings. --Victar (talk) 01:12, 7 July 2018 (UTC)[reply]
If the younger descendants are only for pages with Classical Persian borrowings and for others {{fa-regional}} will suffice, then I don’t see what additional info they provide by duplicating {{fa-regional}} and introducing confusion by descending from themselves. If it’s important to show that the borrowings are from Classical Persian (if it indeed is important in general in Persian entries), then a note can be added that ‘the following words were borrowed at the Classical Persian stage’. Similar entries having distinct layouts are bad for consistency: it can confuse readers as well as less experienced editors, who may list New New Persian descendants in entries without borrowings and omit them in entries with borrowings. Guldrelokk (talk) 01:21, 7 July 2018 (UTC)
And no, *blāvus and blāvus are not identical: mind the link colours. Guldrelokk (talk) 01:26, 7 July 2018 (UTC)[reply]
You've already given your opinion, and I mine. Let others chime in so we're not just discussing this in a circle. --Victar (talk) 01:29, 7 July 2018 (UTC)[reply]
I don't think that showing descendants that are all the same as the headword and have no descendants themselves is at all useful- it doesn't explain anything, and the same information is included in the regional template- it feels like a tautology. In cases where dialectal variation at the Classical Persian stage is reflected in differences among the regional forms, or where there's borrowing from one of the descendants into another language, that's something you would want to show. Chuck Entz (talk) 02:44, 7 July 2018 (UTC)[reply]
I agree, @Chuck Entz. I'm certainly not advocating adding Iranian Persian, Tajik, and Dari to the descendants section of all Persian entries, as that would be needlessly redundant. This only becomes an issue when we have borrowings from Classical Persian or from Dari and Tajik, i.e. whenever a Persian entry merits a descendants section. --Victar (talk) 03:57, 7 July 2018 (UTC)
My main concern was with the state of the entry when I saw it and the complete lack of explanation in it of what was descending from what. Guldrelokk's initial reaction is basically what I would expect from any of our readers who don't know the finer points of Persian's historical stages and dialectology. Changing the language name in the descendants to "Iranian Persian" was helpful, and better than the qualifier method I used, but I'm still concerned that it's a bit opaque to the average reader. My edits were just a quick mock-up to show what I was talking about- the usage note, especially, is probably overkill.
As for my comment that "{{desctree}} can't use such things": it was based on a quick (mis)reading of the code and the comment about substituting to prevent template loop errors. When I looked at it again it was obvious that it was just substituting a module invocation for a template invocation that did the same thing. I never said anything about it being broken, just that (I thought) the code had a safety mechanism that prevented it from working. I still don't see how it avoids infinite recursion, but I'm not all that good with Lua. Chuck Entz (talk) 02:31, 7 July 2018 (UTC)[reply]
@Chuck Entz: The constraint that the module is avoiding is in the parser: a template isn't permitted to contain another instance of itself (for instance, if you put {{sandbox}} on its own page), and a module that's invoked on a page can't expand another instance of that page. But apparently you can have an invocation of a module function print the result of preprocessing another invocation of the same function. (I tested this with Module:doublet table. It just made the tables in Appendix:English doublets disappear; the preprocessing generated the empty string once for each invocation of the function. No recursion. Maybe I did something wrong, or the developers made the loop terminate somehow.) — Eru·tuon 05:49, 7 July 2018 (UTC)[reply]
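To illustrate why a self-listing entry is a recursion hazard for any tree-expanding tool, here is a small Python sketch. It is emphatically not how {{desctree}} or its module works; `render_tree`, the data layout, and the "stopping" note are all assumptions made for the illustration. It expands descendants recursively and stops when an entry is reached that is already being expanded.

```python
def render_tree(entries, key, depth=0, active=None):
    """Expand descendants recursively, guarding against self-reference.

    entries: dict mapping (language, term) to a list of descendant keys.
    active: keys currently being expanded on this branch.
    """
    active = set() if active is None else active
    language, term = key
    line = "*" * (depth + 1) + " " + language + ": " + term
    if key in active:
        # The entry lists itself (directly or indirectly): do not expand again.
        return [line + " (already shown; stopping)"]
    lines = [line]
    active.add(key)
    for child in entries.get(key, []):
        lines.extend(render_tree(entries, child, depth + 1, active))
    active.discard(key)
    return lines

# Hypothetical data: the Persian entry lists itself as its own descendant.
entries = {
    ("Old Persian", "a-θ-g"): [("Middle Persian", "sang")],
    ("Middle Persian", "sang"): [("Persian", "sang")],
    ("Persian", "sang"): [("Persian", "sang"), ("Tajik", "sang")],
}

tree = render_tree(entries, ("Old Persian", "a-θ-g"))
print("\n".join(tree))
```

Without the `active` check, the Persian entry in this toy data would expand itself forever. Giving the descendant a distinct key (as with the separate fa-cls and fa-ira codes mentioned above) avoids the collision in the first place.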
It seems to me like there is a technical gap. {{alter}} could perhaps accept additional parameters which would mark alternative forms as being parents or children or siblings. Forms created by {{fa-regional}} are probably too loosely connected and should go into the {{fa-noun}} template so they can be picked up appropriately. Perhaps this would be realized in a way generalized for pluricentric languages by saving siblings into language data (being effective aside from Persian for Hindustani, Serbo-Croatian, perhaps Aramaic …).
Without wise technical solutions discord will stay real. What is wanted at the end is a “semantic web” where the logic as being imagined by the dictionary editor can also be picked up to be displayed in a different fashion (as in descendant trees) by machines. (A drawback would be that correct wikitext would become increasingly byzantine for new editors.)
In any case of course nobody wants to read duplicated entries. Fay Freak (talk) 02:34, 7 July 2018 (UTC)[reply]
I'll sit on the fence for now - interested in the outcome, though. (Notifying ZxxZxxZ, Dijan, Irman, Kaixinguo~enwiktionary), @Vahagn Petrosyan. --06:49, 7 July 2018 (UTC) — This unsigned comment was added by Atitarev (talk • contribs).[reply]
Personally I think listing Modern Persian descendants on Modern Persian entries is somewhat redundant. However, if a term is only used in Classical Persian and has a different Modern Persian descendant, then the Classical Persian entry should have the Modern Persian listed as a descendant. But yeah, IMO having "entries descending from themselves" is unnecessary. —AryamanA (मुझसे बात करेंयोगदान) 16:08, 7 July 2018 (UTC)[reply]
@AryamanA: The true redundancy is having all the borrowings from Classical Persian manually duplicated on the Persian and OP/PIr entries. Imagine if we had to do that for Sanskrit. Also, when we have borrowings from CP, Dari, Tajik, Tati, etc., we run into the same problem as in the original discussion: it looks as though Tajik and the CP borrowings descend from Modern Persian. I think it is better to just have the descendants section on Persian entries represent CP, so we can be consistent and clear. And again, this only applies when we have borrowings from chiefly CP and Tajik; otherwise there would be no descendants section. --Victar (talk) 18:30, 7 July 2018 (UTC)[reply]
Repeating my stuffed-up call to Persian speakers and Vahagn, people who might be interested: (Notifying ZxxZxxZ, Dijan, Irman, Kaixinguo~enwiktionary), @Vahagn Petrosyan. --Anatoli T. (обсудить/вклад) 02:08, 10 July 2018 (UTC)[reply]
As noted by others, Latin is not a good model in this matter for other languages; this new system causes redundancy for Persian, as borrowings from language variants other than Classical Persian are rare (the same goes for Hebrew and Arabic). In those rare cases, we can simply use something like {{qual|via Iranian Persian}} in the descendants section.
Entries having the header "Persian" correspond to Classical Persian, Iranian Persian, Dari, and other regional forms of Persian that use the Perso-Arabic script. {{fa-regional}} should probably be deprecated in favor of {{alter}}. This model has already been used for other languages as well, e.g. Ancient Greek. The Descendants section would cover the descendants from all of these variants of Persian; in practice, it reflects mostly descendants of Classical Persian. Rare cases from other forms can be indicated using {{qual}} as I noted earlier or by other similar means. --Z 08:50, 10 July 2018 (UTC)[reply]
@ZxxZxxZ We are currently using the Latin model, which is treating Classical Persian and Modern Persian as the same language. Did you mean something else?
You haven't addressed the problem of CP borrowings appearing as if they were from Modern Persian, or the original issue of Tajik appearing as a descendant of Modern Persian when it has borrowings and we add it to the descendants section, which is surprisingly common. You also haven't addressed the content duplication of borrowings on PIr/OP entries and Persian ones without the use of {{desctree}}. Do you have any thoughts on those points? --Victar (talk) 16:39, 10 July 2018 (UTC)[reply]
Let me clarify my previous comment: "Persian" in Wiktionary should refer to all forms of Persian (including Classical, Iranian, and Dari Persian), except Tajik. This has been our practice for a long time. This includes the descendants section, so seeing "Persian" in this section does NOT refer to, for example, modern Iranian Persian unless otherwise stated (a rare situation). So in my view the problem you mentioned does not actually exist.
Latin is good as a model in that particular area you mentioned right above, I meant it is a bad model in the descendant section, because, as I understood, it's not uncommon to have many descendants from each form of Latin for a single lemma. This is not the case for most other languages, and causes redundancy.
I'm not exactly familiar with the functionality of {{desctree}} and how and why it is causing problem here, so I can't comment on this. --Z 17:27, 10 July 2018 (UTC)[reply]
@ZxxZxxZ, OK, so let's use a real world example of Persian خرما (xormâ).
  1. As you can see, all the root borrowings are actually from CP, but to the untrained reader, they appear to be borrowed from Modern Persian.
  2. Tajik is also listed because of its borrowing into Uzbek, which again, makes it appear as descended from Mod. Persian. Now you could argue that the Uzbek borrowing should just be on the Tajik page, but that would a) hide the borrowing away from readers, and b) be contrary to the premise that Persian reflects both Mod. Persian and CP.
The fact is, readers are always going to assume Persian means Mod. Persian because we give them little to no indication otherwise. Personally, I think the only solutions are to
a) treat all Persian descendants lists as CP,
b) treat all Persian descendants lists as Mod. Persian,
c) have a note at the top of all Persian descendants lists specifying that it reflects one or the other, or
d) split Mod. Persian entries from CP, aka the Armenian method.
For the functionality of {{desctree}}, see 𐎠𐎰𐎥 (a-θ-g) and سنگ. Hopefully, that is enough for you to understand its use and the problem at hand. --Victar (talk) 17:52, 10 July 2018 (UTC)[reply]
On the other hand, that untrained reader may also be confused by seeing the names "Dari" and "Iranian Persian". Indeed, many times we follow this practice of mentioning the language variants simply as a poor replacement for more accurate information regarding the time of the borrowings. Instead, I suggest adding a new feature to {{desc}} to add this information (year or century, e.g. "before 14th century"). We ultimately should be adding such information in Wiktionary. Doing that would automatically eliminate such problems with Persian and other languages. --Z 18:27, 10 July 2018 (UTC)[reply]
@ZxxZxxZ, I'm not understanding. Could you give us an example of your suggestion in the context of the Persian خرما (xormâ) and {{desctree}} problems mentioned? --Victar (talk) 19:40, 10 July 2018 (UTC)[reply]
See my last edit there. This way it becomes clear to all readers that it's not a modern borrowing. --Z 11:52, 11 July 2018 (UTC)[reply]
Thanks, @ZxxZxxZ. So basically you're recommending that we should add the date prefix [11-15th century] to every CP borrowing in descendants lists. Although I do think adding the date of the earliest attestation of a borrowing is a good idea (I do that for Frankish borrowings), I don't think it's a very elegant solution, nor does it address the Tajik or {{desctree}} issues. --Victar (talk) 14:53, 11 July 2018 (UTC)[reply]
Why is it even important to indicate that the borrowing is from Classical and not Modern Persian? If we don't make this distinction explicit in our Persian entries why make it explicit in descendant trees? Crom daba (talk)
For the same reasons made here. --Victar (talk) 17:54, 11 July 2018 (UTC)[reply]
Political correctness? That's a drag. Dating prefixes don't sound so bad, if we have the necessary data I'd say go for it. Crom daba (talk) 18:19, 11 July 2018 (UTC)[reply]
Accuracy, clarity to readers, etc. As I pointed out, date prefixes don't address the various other issues listed above. --Victar (talk) 18:54, 11 July 2018 (UTC)[reply]

More entries than English Wikipedia

As of now, and for the first time ever, we have more mainspace entries than Wikipedia. That makes us better than them. Now who can delete their main page to show them who's really in charge around here? DTLHS (talk) 02:51, 7 July 2018 (UTC)[reply]

THIS. --Victar (talk) 04:11, 7 July 2018 (UTC)[reply]
They say the main page is undeletable. They say it can't be done. But *handing out briefing materials, playing suspenseful music* we're assembling a team to do it.
  • SemperBlotto: the lookout. Ever-watchfully patrolling RecentChanges here, he has the skills to keep a lookout for any admins over on 'pedia who might get in our way.
  • Wonderfool: the demolition specialist. He knows how to delete pages that shouldn't be deleted. Pages that "can't" be deleted. He can evade any blocks and get us inside, especially with the help of...
  • BD2412: the inside man. He's been an admin on WP since 2005. Studying them. Gaining their trust (and sometimes their ire, like any rouge admin). He can edit and unprotect protected pages.
  • Equinox: the getaway driver. I don't know how we're gonna incorporate a getaway car into this, but most of the movies I've seen about this kind of thing have one, so we're bringing one. :)
  • Other spaces are still available: volunteer below.
The devs have made it impossible to use the "delete" function on Wikipedia's Main Page, but our haxx0rs have found a backdoor: replace the text of the page with the text of MediaWiki:Noarticletext.
</joke> don't ban me WMF...
- -sche (discuss) 05:49, 7 July 2018 (UTC)[reply]
Wow! It's slightly unfair though because we have one or two non-English entries and they have none. Equinox 12:47, 7 July 2018 (UTC)[reply]
No, it is very unfair, because we have a lot of bot-created entries that only contain non-lemma forms. Dixtosa (talk) 13:20, 7 July 2018 (UTC)[reply]
Hmm, I wonder if there should be an entry for rouge admin? Imaginatorium (talk) 14:40, 7 July 2018 (UTC)[reply]
If it's attested outside WP. Otherwise, w:Wikipedia:Rouge admin covers it. - -sche (discuss) 17:28, 7 July 2018 (UTC)[reply]
Conversely, Wikipedia has lots of entries like w:List of Indian states and union territories by literacy rate, w:List of Indian states and union territories by GDP, w:List of Indian states and union territories by access to safe drinking water, w:List of Indian states and territories by highest point and w:List of Indian states and territories by Human Development Index, in addition to its entries on the actual states themselves. - -sche (discuss) 17:28, 7 July 2018 (UTC)[reply]
Wikt entries that have made me laugh today: National Teacher Appreciation Week. Equinox 17:29, 7 July 2018 (UTC)[reply]
  • Also, I hope we have told the whole world about this feat - on our Twitter page, Facebook page, Instagram feed, in the Online Club of Dictionaries, on Wikicommons, and the Wikipedia Signpost itself. I'll do my bit and try to have it published in El Pais. --Harmonicaplayer (talk) 15:18, 9 July 2018 (UTC)[reply]
  • Yes, I could literally delete the Wikipedia main page. It is highly unlikely that I would do that, as I have zero familiarity with Equinox's getaway driving skills. By the way, Dixtosa, Wikipedia has millions of bot-created entries that have nothing but, e.g., census data for obscure localities. bd2412 T 19:42, 8 July 2018 (UTC)[reply]

Are comparative or superlative forms lemmas?

There seems to be some inconsistency when it comes to entries: in Finnish alone, there are ones marked as lemmas: katalin, and ones that are not: kallein. (The exceptions would naturally be if the forms are themselves used in some idiomatic way) SURJECTION ·talk·contr·log· 10:20, 7 July 2018 (UTC)[reply]

I don't think they should be. Big is the lemma; bigger/est are inflections. Equinox 12:46, 7 July 2018 (UTC)[reply]
It is probably better to unify either way for most languages. One could argue that because they can be inflected (in Finnish at least), they could be classified as lemmata, although I also think that they shouldn't be classified as such. Anyone got a bot lying around? SURJECTION ·talk·contr·log· 12:47, 7 July 2018 (UTC)[reply]
It seems Finnish and Spanish are primarily affected; I tried to check the comparative and superlative categories of other languages, and they do not seem to be set as lemmas. SURJECTION ·talk·contr·log· 13:59, 7 July 2018 (UTC)[reply]
Update: Also affects the adverbs. Russian adverb comparatives are also set to be lemmas, when they should not be. SURJECTION ·talk·contr·log· 14:03, 7 July 2018 (UTC)[reply]
I have started working on the Finnish entries - Russian and Spanish seem more numerous, so it is probably better to automate the conversion there. The head templates are what needs to be changed. SURJECTION ·talk·contr·log· 16:06, 7 July 2018 (UTC)[reply]
In Ancient Greek, many comparative and superlative adjectives are treated as lemmas. I think this makes more sense than it does in English, because they have inflected forms of their own, and a few adjectives have more than one comparative associated with them, sometimes with a different range of meaning. For an extreme example, see the bottom of the declension table for ἀγαθός (agathós), which currently lists six comparatives and five superlatives. I agree that consistency is a good idea, but would request that you get agreement from the editors who've worked hardest on a language before making any changes. For the record, I prefer treating Ancient Greek comparative and superlative adjectives as lemmas. — Eru·tuon 18:26, 7 July 2018 (UTC)[reply]
I did actually point this out a bit earlier: "One could argue that because they can be inflected -- , they could be classified as lemmata". The reason why I do not believe so though, is because how to actually derive the comparative and superlative forms is usually predictable and very much resemble how inflection works, making them inflected forms instead. SURJECTION ·talk·contr·log· 18:32, 7 July 2018 (UTC)[reply]
@Surjection: I guess comparatives and superlatives are usually predictable (in English and Ancient Greek at least), but I'm not sure if that is a feature that is used to distinguish inflected forms from derived forms. — Eru·tuon 20:02, 7 July 2018 (UTC)[reply]
The difference is made based on the words you can logically do it to. Most adjectives have comparatives and superlatives, with uncomparable adjectives being the exception. Being comparable is the status quo. That is the opposite for derived forms, where not being able to derive from a specific word is the status quo. The existing categories too say that comparatives are "adjectives that are inflected to display relative degrees of given qualities between nouns". SURJECTION ·talk·contr·log· 20:06, 7 July 2018 (UTC)[reply]
Okay, that makes more sense. I am not sure how to verify "most adjectives are comparable" though. I'm guessing that, for English at least, that would have to include phrasal comparatives like more fun, most fun (as opposed to the silly-sounding funner, funnest). — Eru·tuon 20:21, 7 July 2018 (UTC)[reply]
Based on the English entry for fun listing those as the comparative and superlative, I would assume they are included. SURJECTION ·talk·contr·log· 20:24, 7 July 2018 (UTC)[reply]
Well, despite that, I think funner and funnest usually sound silly, as if they are almost ungrammatical. I have no idea why, because short adjectives usually can have totally normal-sounding comparatives. But longer adjectives such as intelligent usually don't have synthetic comparatives. (Intelligenter, intelligentest sound even sillier than funner and funnest. That is, they are felt as more ungrammatical.) So, while I do think synthetic comparatives and superlatives in English can be considered inflectional forms, or at least that it is traditional to do so, and would be most practical to categorize them as such on Wiktionary, I'm not sure about the generalization that adjectives are comparable by default. — Eru·tuon 20:36, 7 July 2018 (UTC)[reply]
The fact that Category:English comparable adjectives is not a thing but Category:English uncomparable adjectives is should be sufficient evidence to say that comparable adjectives are the default. (This also applies to other languages) SURJECTION ·talk·contr·log· 20:41, 7 July 2018 (UTC)[reply]
There are various factors involved: I'd say, roughly, -er, -est are likelier to "sound right" on words that are older, Germanic, and have fewer syllables. Comparability is certainly not the default for long, modern, Latinate scientific words as found in biology/chemistry. Equinox 20:45, 7 July 2018 (UTC)[reply]
Scientific words tend to be uncomparable due to their rigorous definition, as well as the fact that many describe a "set" and you cannot really compare the degree something is included in a black-and-white set like that. SURJECTION ·talk·contr·log· 20:48, 7 July 2018 (UTC)[reply]
I don't agree: if something can be "rounder", why not "*subovater"? If "smaller", why not "*microscopicer"? Equinox 20:50, 7 July 2018 (UTC)[reply]
"more subovate", "more microscopic". I did not say all scientific words are uncomparable, but that they tend to be. SURJECTION ·talk·contr·log· 20:51, 7 July 2018 (UTC)[reply]
Well, category structure is based on a variety of concerns besides the linguistic concern of which state is the default. If the number of entries is any guide, uncomparable is the default in English because there are somewhat more adjectives in the uncomparable category (63,451) than outside it (116,594 - 63,451 = 53,143). — Eru·tuon 21:15, 7 July 2018 (UTC)[reply]
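The subtraction quoted in the comment above can be checked with a short sketch. It uses only the two figures cited there; the variable names are illustrative, and treating 116,594 as the total number of English adjective entries is an assumption read off the comment's own arithmetic.

```python
# Figures quoted in the comment above; names are illustrative only.
total_adjectives = 116_594   # assumed: all English adjective entries
uncomparable = 63_451        # entries in the uncomparable category

# Adjective entries outside the uncomparable category.
outside = total_adjectives - uncomparable

# Share of adjective entries categorized as uncomparable.
uncomparable_share = uncomparable / total_adjectives

print(outside)                       # 53143
print(f"{uncomparable_share:.1%}")   # 54.4%
```

On these figures the uncomparable category holds a little over half of the entries, which matches the "somewhat more" wording in the comment.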
The large number of uncomparable adjectives is due to two distinct reasons: 1. large number of uncomparable scientific terms and 2. nationality and language terms (which are naturally not comparable). For other languages, there are more comparable than uncomparable adjectives. Beyond that, most basic adjectives in everyday use are comparable. SURJECTION ·talk·contr·log· 21:22, 7 July 2018 (UTC)[reply]
Okay, I guess that makes sense. (Though some nationality adjectives are given as comparable, like English and Russian: after all, one can display more of the typical characteristics of a nationality.) — Eru·tuon 22:24, 7 July 2018 (UTC)[reply]
It actually would seem Ancient Greek is different - no "adjective comparative form" categories, but "comparative adjective" categories. Whether that is done should be decided on a language-by-language basis, and if we are going to do that, this is probably the time to decide for some languages. SURJECTION ·talk·contr·log· 18:38, 7 July 2018 (UTC)[reply]
No, there actually are adjective comparative forms and adjective superlative forms categories for Ancient Greek: see Ancient Greek adjective comparative forms and Ancient Greek adjective superlative forms. Remember to look under adjectives for comparative adjectives and superlative adjectives and under adjective forms for adjective comparative forms and adjective superlative forms. — Eru·tuon 19:37, 7 July 2018 (UTC)[reply]
I did actually find the former category later, and it only has a single entry, which is not an adjective comparative form but a comparative adjective form; it's an inflected form of a comparative adjective. As to the latter category, it seems inconsistent; I cannot find a rule to differentiate between the entries at Category:Ancient Greek adjective superlative forms and ones under Category:Ancient Greek superlative adjectives. SURJECTION ·talk·contr·log· 19:43, 7 July 2018 (UTC)[reply]
@Surjection: Oh, you're right. μεῖζον (meîzon) is the only entry in Ancient Greek adjective comparative forms, and it is the neuter form of μείζων (meízōn), the comparative of μέγας (mégas). That reminds me of another concern: if comparatives and superlative adjectives are categorized as adjective comparative forms and adjective superlative forms, what will we name the category for their inflected forms? (And are there practical difficulties with having a three-link chain: inflected forms of comparative or superlative forms of adjectives? Not sure.) — Eru·tuon 19:53, 7 July 2018 (UTC)[reply]
All of that will depend on whether we will consider comparatives or superlatives lemmas or not. If we do, comparative adjectives > comparative adjective forms, while if we don't, leaving only comparative adjective forms, we will probably have to rename to something else, like adjective comparatives. SURJECTION ·talk·contr·log· 19:56, 7 July 2018 (UTC)[reply]
And then there's words like northernmost, which can be turned around to read "most northern". DonnanZ (talk) 21:09, 7 July 2018 (UTC)[reply]
I meant, if comparatives and superlatives are not categorized as lemmas, then inflected forms of comparatives are a non-lemma form of a non-lemma form of a lemma. That is confusing. I think it is less confusing to treat Ancient Greek comparatives and superlatives (not English ones though) as lemmas. A similar case is participles, which are inflected forms of verbs, but in some languages have their own inflected forms. But actually participles are categorized as non-lemma forms that have their own non-lemma forms; there is no lemma–nonlemma split for participles. — Eru·tuon 21:15, 7 July 2018 (UTC)[reply]
I do not really find it that confusing - polysynthetic languages could go even further than that. Drawing the line between lemmas and non-lemmas based on whether they can be inflected comes across as somewhat disingenuous, as English adjectives are an exception here - in many languages, comparative and superlative forms at least have plural forms. I would be completely okay with having comparatives and superlatives be adjective forms, while those categories would have their own subcategories for inflected forms of those. SURJECTION ·talk·contr·log· 21:30, 7 July 2018 (UTC)[reply]
Yeah, well, I can't comment on how to treat polysynthetic languages, because I haven't really studied any. — Eru·tuon 21:47, 7 July 2018 (UTC)[reply]
I've studied a couple, but not in the depth to help much here (and not recently- I'm fuzzy on the details)- @Stephen G. Brown could give you chapter and verse. Basically you may have one undisputed lemma, and then you have concentric layers of affixes that, depending how you look at it, could be derivation, inflection, or even parts of complete sentences- in lots and lots of different combinations. I remember Dr. Bright pronouncing for our class many years ago a string of 13 consonants, which he said was a single Bella Coola "word" for "I saw those two women come this way out of the water". Suffice it to say, you don't want to even try a binary distinction like this for polysynthetic languages- that way lies madness! Chuck Entz (talk) 23:28, 7 July 2018 (UTC)[reply]
As for participles, weelllllll... SURJECTION ·talk·contr·log· 21:33, 7 July 2018 (UTC)[reply]
Based on this, my proposal is: Category:LANGUAGE comparatives and Category:LANGUAGE superlatives, both of which are under Category:LANGUAGE adjective forms (and therefore not lemmata), with both having their respective Category:LANGUAGE comparative forms and Category:LANGUAGE superlative forms subcategories for inflected forms of such. SURJECTION ·talk·contr·log· 21:37, 7 July 2018 (UTC)[reply]
Hmm, but then where do you put comparative and superlative adverbs? — Eru·tuon 21:42, 7 July 2018 (UTC)[reply]
That is a good point, maybe the categories need the actual part-of-speech after the language to get Category:LANGUAGE adjective comparatives and Category:LANGUAGE adverb comparatives. I will admit that is a bit of a mouthful (especially the forms subcategories), but it is still better than the status quo or classifying comparatives or superlatives as lemmata. SURJECTION ·talk·contr·log· 21:44, 7 July 2018 (UTC)[reply]

Well, comparative adverb and comparative adjective sound better. The reverse order seems quite awkward; I doubt it's very often used, if at all. — Eru·tuon 21:55, 7 July 2018 (UTC)[reply]

That is also good. Based on a quick Google search, it is actually used somewhat often, albeit comparative adjective is not as common as adjective comparative. SURJECTION ·talk·contr·log· 22:01, 7 July 2018 (UTC)[reply]
Wait no, that was all wrong. comparative adjective is more common and is the better option here. So LANGUAGE comparative adjectives and LANGUAGE comparative adjective forms. SURJECTION ·talk·contr·log· 22:02, 7 July 2018 (UTC)[reply]
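The naming scheme arrived at above can be sketched as a small helper that builds the category names for a given language. This is only an illustration: the function name is hypothetical, and extending the "comparative adjectives" pattern to superlatives and to adverbs is an assumption based on the earlier comments in this thread, not something the proposal spells out in full.

```python
def proposed_categories(language):
    """Build category names following the pattern
    'LANGUAGE comparative adjectives' / 'LANGUAGE comparative adjective forms'.

    Hypothetical helper: coverage of superlatives and adverbs is assumed.
    """
    names = {}
    for degree in ("comparative", "superlative"):
        for pos in ("adjective", "adverb"):
            # Category for the comparatives/superlatives themselves.
            names[f"{degree} {pos}"] = f"Category:{language} {degree} {pos}s"
            # Subcategory for inflected forms of those comparatives/superlatives.
            names[f"{degree} {pos} forms"] = f"Category:{language} {degree} {pos} forms"
    return names


for name in proposed_categories("Finnish").values():
    print(name)
```

For Finnish this yields eight names, starting with "Category:Finnish comparative adjectives" and "Category:Finnish comparative adjective forms".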
Since this would be quite a major change, it is probably a good idea to create a vote. SURJECTION ·talk·contr·log· 22:10, 7 July 2018 (UTC)[reply]
Created: Wiktionary:Votes/2018-07/Restructure comparative and superlative categories. SURJECTION ·talk·contr·log· 22:28, 7 July 2018 (UTC)[reply]

Isotopes

Do we want systematic names of isotopes, like uranium-235 and oxygen-18? Including theoretical ones, a great deal of these can be attested, but I don't see them as being of lexical interest. (Some isotopes, like deuterium, have a special name that should obviously be kept.) —Μετάknowledgediscuss/deeds 04:00, 10 July 2018 (UTC)[reply]

The contents of the ones that are formed systematically as element - number seem to be predictable from the entry name, so they seem lexically uninteresting; even pronunciation information is coverable by the entries for the element and the number. They seem as (mostly) useless as 58-degree (angle or day), 59-degree, etc (other hyphenated strings), so I am inclined towards deleting them. They also seem (mostly) harmless, so I don't feel too strongly about deleting them. (But it would be absurd, IMO, to keep these but delete attributive-hyphen forms.) - -sche (discuss) 05:20, 11 July 2018 (UTC)[reply]
Yes, they are predictable - so are most plurals. All words should be treated alike. Either keep them as well as attributive-hyphen forms (if they could pass RfV) or delete both. SemperBlotto (talk) 05:27, 11 July 2018 (UTC)[reply]
I'd favor not including them. They are predictable and uninteresting. It is also hard to imagine a human looking them up on Wiktionary. Almost any compound of a word and any of a range of numbers would seem unworthy of inclusion, though there could conceivably be exceptions. Obviously an expression like cloud 9 would be different, but 9 is, I think, the only number that can occupy that slot to create an expression with a distinctive meaning. DCDuring (talk) 14:20, 11 July 2018 (UTC)[reply]

Global preferences are available

19:19, 10 July 2018 (UTC)

Live vlogging fr.WT

Lyokoï has been occasionally live-editing fr.WT on video as an introduction and contributor recruiting project. The next event is scheduled for 12 July at 20:30 (not sure if that is UTC, url has a countdown) on YouTube. Commentary and editing in French, of course. - Amgine/ t·e 16:52, 11 July 2018 (UTC)[reply]

Do German participles get their own inflection tables?

I tried looking it up in the archives, but I couldn't find a clear answer. I'm talking about regular Attributive verbs, where the adjectival form has the same meaning as the verb.

  1. Do German participles get their own (non-comparative) inflection tables?
  2. If so, does the inflection table go in the existing Verb section or a new Adjective section?

Mofvanes (talk) 20:05, 11 July 2018 (UTC)[reply]

I tried to check what Finnish does, and it seems inconsistent... some participles have no declension tables, others do under the Verb section, others have their own adjective section and a declension table there, some have a "Participle" section... SURJECTION ·talk·contr·log· 15:50, 12 July 2018 (UTC)[reply]
Examples of all four: juotu, ajettu, keitetty, hakkeroitu. SURJECTION ·talk·contr·log· 15:54, 12 July 2018 (UTC)[reply]
@Mofvanes, Surjection: AFAIK, there is no general practice. Sometimes German participles have an entry as verb form pointing to the verb (e.g. abbezahlt), sometimes they have an adjective and a verb form entry (e.g. gefragt). At least now there are also German participle entries (e.g. tötend) similar to Dutch participle entries (e.g. gevallen).
Some participles might be comparable, e.g. fragen -> gefragt -> gefragtest- (gefragteste, gefragtesten etc.). -17:49, 31 July 2018 (UTC)

Moving all Volapük entries to the appendix

A few months ago all Lojban entries were moved to the appendix. I think this should also happen to all Volapük entries. In the category Category:Volapük lemmas there are 2643 entries, but only one of them has any citations. There are currently 27 entries on the page Wiktionary:Requests for verification/Non-English and I doubt any of them will pass. Maybe it would be better if everything were moved to the appendix instead. Robin van der Vliet (talk) (contribs) 15:46, 12 July 2018 (UTC)[reply]

My small contribution to this is that you should do a great job of communicating this if you do so. I felt hurt that the Lojban words were moved when I was busy with other non-Wiktionary business. I was actively editing, came back after a few months, and couldn't find out what had happened to my hard work. I understand the desire to not spend energy and time on a language that you don't know and don't use, I just would like to register that rare languages mean the number of people actively working on them is small, and there needs to be a LOT of communication to make sure the ones who care can find out about changes. Jawitkien (talk) 17:51, 12 July 2018 (UTC)[reply]
@DtheZombie, Lingo Bingo Dingo, Lunaris filia, Malafaya, Nielsheur, Pereru, Raekmannen: I would like to invite you all to this discussion, as you all have indicated Volapük in your Babel box. Robin van der Vliet (talk) (contribs) 18:02, 12 July 2018 (UTC)[reply]
The fact that a lot of Volapük words have had to be sent to RFV is not a reflection of the corpus so much as the fact that they were all created by a single problematic editor who seems to have made many of them up on the spot. Volapük does indeed have a corpus on Google Books that shows that a lot of vocabulary is indeed attestable, as was pointed out to me by User:Mx. Granger. The constructed languages that really need moving to the appendix are, in my opinion, Interlingua, Interlingue (Occidental), Novial, and potentially Ido. —Μετάknowledgediscuss/deeds 19:39, 12 July 2018 (UTC)[reply]
I agree that there is a literature in Volapük which simply does not exist in (e.g.) Lojban where all of the "literature" is purely used in experimental contexts amongst the dozen or so speakers exclusively to extend the language and test its underlying philosophy. There are several thousand lemmas in Volapük that can be attested from literature and that is not true for most constructed languages. —Justin (koavf)TCM 21:27, 12 July 2018 (UTC)[reply]

Before we do such a thing, I think we should solve the issues that have been raised at Wiktionary:Beer parlour/2018/June § On the placement of constructed languages, and on the attestation of appendix-only languages. Per utramque cavernam 21:31, 12 July 2018 (UTC)[reply]

  • Thanks for the ping. I agree with Metaknowledge and Koavf that the Volapük corpus seems to be large enough for us to cover a good number of words under CFI (unlike Lojban). It's true, though, that I've been RFVing a lot of Volapük words, mainly because one prolific user has been adding a huge number of unattestable Volapük words. It's tedious getting rid of all these entries through RFV, though, and it might be better to do it faster. Here's one suggestion: temporarily give me (or some other administrator) permission to delete on sight any entry for a Volapük noun or adjective that gets no hits on Google Books or Wikisource. Or something along those lines. That would at least put a dent in the mountain of unattestable Volapük entries, and there wouldn't be much risk of losing good entries, because a Volapük word that has no hits on Google Books or Wikisource would be unlikely to pass RFV if nominated. —Granger (talk · contribs) 01:13, 13 July 2018 (UTC)[reply]
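The rule suggested above (a noun or adjective with no hits on either Google Books or Wikisource) can be sketched as a simple filter. Everything here is hypothetical: the function name, the record fields, and the sample entries are placeholders invented for illustration, and the hit counts would have to come from manual searches or some other source not specified in the discussion.

```python
def deletion_candidates(entries):
    """Return words that match the suggested rule: the entry is a noun or
    adjective and has zero hits in both corpora.

    Hypothetical sketch; field names are assumptions, not an existing tool.
    """
    return [
        entry["word"]
        for entry in entries
        if entry["pos"] in {"noun", "adjective"}
        and entry["google_books_hits"] == 0
        and entry["wikisource_hits"] == 0
    ]


# Placeholder records, invented purely to exercise the filter.
sample = [
    {"word": "example-1", "pos": "noun", "google_books_hits": 0, "wikisource_hits": 0},
    {"word": "example-2", "pos": "noun", "google_books_hits": 3, "wikisource_hits": 0},
    {"word": "example-3", "pos": "verb", "google_books_hits": 0, "wikisource_hits": 0},
]

print(deletion_candidates(sample))  # ['example-1']
```

Verbs and any entry with at least one hit fall outside the rule and would still go through RFV as usual.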

See Also vs Related Terms

Could someone tell me if it is better to have a sub-header of "See Also" or "Related Terms" ? I'm seeing both used in the Lojban entries, and I'd like to standardize if it has already been decided. Jawitkien (talk) 17:51, 12 July 2018 (UTC)[reply]

Related terms are for terms that are somehow etymologically related. "See also" is not really defined, you can put whatever you want in it. DTLHS (talk) 17:55, 12 July 2018 (UTC)[reply]
@DTLHS I see entries using "Derived terms" also.
My current usage will be:
if it is syntactically derived, I use "===Derived terms==="
if it is etymologically derived, I use "===Related terms==="
if it is related but not derived, I use "===See Also==="
Does this sound reasonable ? Jawitkien (talk) 23:30, 12 July 2018 (UTC)[reply]
Sure. DTLHS (talk) 04:18, 13 July 2018 (UTC)[reply]
What I do is put all derived terms in the Derived terms section, terms that are etymologically related but not derived (like etymological sisters, cousins, aunts, or nieces) in the Related terms section, and random odds and ends in See also (though I probably haven't used See also as much as the other two). I'm not sure what syntactically derived means. If it means phrases that contain the term in the current entry, then I put those in Derived terms. I think it's misleading to put etymologically related terms in See also rather than Related terms! But sometimes people put terms that are really derived in Related terms, or do other odd things. — Eru·tuon 04:41, 13 July 2018 (UTC)[reply]
One example of See also is in the Spanish entry gallo, meaning "rooster", where "pollo" (chicken meat) is listed. The words themselves aren't related and they aren't synonyms, but there is a clear connection between the two words. Andrew Sheedy (talk) 21:48, 15 July 2018 (UTC)[reply]
@Erutuon: "sometimes people put terms that are really derived in Related terms". There could be two good reasons for doing so: (a) One might see the etymological relation, but not know how the terms are related. (b) One might not know how WT handles the difference between superficial-synchronical and historical-diachronical derivation. Synchronically, dancer looks like dance + -er, while diachronically it's from Middle English (according to dancer). Does dancer count as a derived term of dance, or are they only related? Similar questions could be asked for backformations, and borrowed terms which can be analysed synchronically. Especially for backformations, it would be strange to give superficial-synchronical derivations as DTs, like giving burglar as DT in burgle (as if burgle + -ar), while burgle is actually derived from burglar through backformation. (WT:EL#Derived terms doesn't clearly state anything about this matter and thus is not helpful. dance gives dancer as DT, but could be wrong.) - 84.161.63.96 18:48, 31 July 2018 (UTC)[reply]
Well, typically words like dancer, which were constructed in an ancestor of Modern English, but any English speaker could newly construct (because the suffix -er is productive), could probably be counted as derived. And I wouldn't put burglar in the Derived terms for burgle because the derivation goes the other way around (burgle is derived from burglar).
The way I see it, whether a word is derived sort of relates to etymology sections: if an etymology section mentions word one as the origin of word two, the Derived terms section in the entry for word one should mention word two. But if word one is mentioned as a same-language cognate of word two, it could be placed in Related terms. I guess all of this is still not very clear, though. — Eru·tuon 22:46, 2 August 2018 (UTC)[reply]

Eau, blast!

Is there a way to nominate pages with only unattestable entries for deletion? I am thinking of eaublast. See also Wiktionary:Requests for verification/English#eaublast.  --Lambiam 09:38, 13 July 2018 (UTC)[reply]

{{speedy}}, if you're sure the page is too bad to merit discussion through normal channels. Equinox 12:34, 13 July 2018 (UTC)[reply]
I feel it was sufficiently discussed at WT:RFVE.  --Lambiam 14:20, 13 July 2018 (UTC)[reply]

Eye-dialect phrase alternative form entries

In 2017, the deletion discussion for thank ya so much happened. The result of this discussion was to delete the page, along with others like thank u so much, etc. Also in 2017, there was a deletion discussion for fer cryin' out loud. The result of this discussion was different; the page was not deleted, but was instead redirected to the entry for crying out loud.

These two discussions had different results. This is inconsistent; we need a consistent way to deal with these entries, a clear community consensus on it. The problem with entries like these is that many of them have overabundant possibilities; see the comments by User:Mihia in the discussions. By current rules, technically, as I summarize some of Chuck Entz's statements in Talk:thank ya so much, these entries are not sums of their own parts, since you're inserting an eye-dialect variable (or more than one) into a phrase that is already not a sum of its own parts. Thus, I've brought up this discussion to propose that we modify WT:Criteria for inclusion to make a brief statement about these eye-dialect phrase entries, based on the consensus reached by this discussion.

Consensus from both deletion discussions clearly is that trivial eye-dialect forms of phrases should not have dictionary entries. However, there's also a similar discussion for for cryin' out loud. The result was to keep as a dictionary entry due to how common of an alternative form this one actually is. So the exception to the CFI policy I propose would presumably be if a phrase was particularly common in its eye dialect form (i.e. see ya < see you).

However, we can go either one of two ways with this. 1.) Entries such as let's get dis party started should hard-redirect to let's get this party started. 2.) Entries such as let's get dis party started should be deleted completely.

This might be a tough one to figure out. So, before starting a policy vote, I'm gonna need help forming such a vote, as I'm not even sure which direction this should necessarily go. For instance, how should we treat entries that are particularly common as eye dialect forms (such as see ya)? Should another exception be that phrases with only two words in them (X Y) or three (X Y Z) should allow as many eye dialects as possible for entries, or redirects for the second proposal? Please help me out here, much appreciated. Thanks for any input.

I'll go ahead and make some subsections here for some pre-support votes for either side of this debate. (As usual, if there was already a similar discussion to this, I don't recall it and wasn't able to find it, so don't pounce on me if there was.) PseudoSkull (talk) 22:35, 13 July 2018 (UTC)[reply]

Trivial eye dialect forms should redirect

Put support votes here if this is your opinion. PseudoSkull (talk) 22:35, 13 July 2018 (UTC)[reply]

  • Support. If someone goes to the trouble of typing an attested variation of a CFI-worthy phrase into the search bar, it should take them someplace useful. I do not trust the search function to produce useful results. I would require citations first (and shoot on sight the uncited), and use the citations page to gather citations for all incoming variations. bd2412 T 01:46, 14 July 2018 (UTC)[reply]
  • Support hard redirects, with exceptions for the eye dialect form being equally or more common than standard spellings. If someone wants to go through the effort of creating these, that's fine with me. Andrew Sheedy (talk) 02:13, 14 July 2018 (UTC)[reply]
    [thank|fank] [you|ya|yer|ye] [very|verra] much would be 2*4*2 = 16 combinations for just one phrase (and I'm sure each word has many spellings I've not thought of). The issue here is not disk space... Equinox 02:37, 14 July 2018 (UTC)[reply]
    Are each of these variations, as phrases, attestable? How many of them will ever be created if we require attestation in advance? bd2412 T 14:38, 14 July 2018 (UTC)[reply]
  • I doubt they are all attestable but I disagree with creating any of them. Eye dialect/nonstandardness IMO should be dealt with at word level and not phrase level, since one word with n variants will otherwise (potentially, depending on attestation) multiply the number of derived phrases by n. I bet there are reasons other than "space on paper" why professional dictionaries wouldn't countenance this. Equinox 17:28, 14 July 2018 (UTC)[reply]
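
The multiplication in Equinox's example can be checked with a minimal sketch. It assumes only the spelling variants listed in that comment (the fixed final word "much" is treated as a slot with one option), and it says nothing about which of the resulting phrases are actually attestable:

```python
from itertools import product

# Spelling variants per word slot, taken from the example above.
# "much" has a single spelling here, so it contributes a factor of 1.
slots = [
    ["thank", "fank"],
    ["you", "ya", "yer", "ye"],
    ["very", "verra"],
    ["much"],
]

# Every candidate phrase is one choice from each slot, in order.
phrases = [" ".join(words) for words in product(*slots)]

# The count is the product of the slot sizes: 2 * 4 * 2 * 1.
print(len(phrases))
```

Adding a single further variant to any one slot multiplies the total again, which is the growth that handling nonstandard spellings at word level rather than phrase level would avoid.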

Trivial eye dialect forms should be deleted

Put support votes here if this is your opinion. PseudoSkull (talk) 22:35, 13 July 2018 (UTC)[reply]

  • Delete entries like "let's get dis party started" and "thank ya so much" (obvious bullshit, nobody would ever search for them), keep "for cryin' out loud" since that's how the expression is usually written and said. It's important to be as restrictive as possible here since someone will inevitably add thousands of these. DTLHS (talk) 01:53, 14 July 2018 (UTC)[reply]
  Support or somebody's going to go cray-cray and slippery-slope a billion stupid (but citeable) phrases in here. Oh, I just glanced upward and DTLHS said exactly what I am saying. Equinox 01:56, 14 July 2018 (UTC)[reply]
  Support based on the examples, but what exactly is a “trivial” eye dialect? Something like cah or masta could be described as trivial. — Ungoliant (falai) 02:21, 14 July 2018 (UTC)[reply]
I understand the Skull to be talking about phrases, not individual words. Equinox 02:27, 14 July 2018 (UTC)[reply]

(edit conflict)

Stuff like let's get dis party started would be trivial, and that's just assuming it happens to be attested at all. By trivial I meant the phrases, not the words themselves. for cryin' out loud is a particularly common one, and it's even more often said that way than "for crying out loud". Also see ya is a very common collocation of this nature, so it should be kept as is too.
But that's part of the problem with this proposal; we need a way to measure by consensus how useful any particular one of these phrases is, but obvious trivial ones should be deleted/redirected according to either of these two proposals. Perhaps we should make the policy with these similar to how we treat misspellings (as in, if "desaire" is not a particularly common misspelling of "desire" it is not kept, regardless of if it has 3 citations as would normally be accepted). PseudoSkull (talk) 02:34, 14 July 2018 (UTC)[reply]
  • Comment on terminology. This thread makes frequent reference to "eye dialect". According to Wiktionary's own definition, as well as definitions that I have found elsewhere, "eye dialect" means nonstandard spellings that indicate a standard pronunciation. Therefore, words such as "cryin'", "dis", "fank", etc. are not, as far as I can see, eye dialect. As I think has been mentioned before, the word "eye dialect" seems to be widely misused within Wiktionary. Or, if it is not being misused, the Wiktionary definition should be changed to reflect the fact that it can include spellings that represent nonstandard pronunciations. Mihia (talk) 11:39, 24 July 2018 (UTC)[reply]
  Support Per utramque cavernam 15:01, 31 July 2018 (UTC)[reply]

What questions concerning the strategy process do you have?

Hi!

I'm Tar Lócesilion, a Polish Wikipedia admin and a member of Wikimedia Polska. Last year, I worked for Wikimedia Foundation as a liaison between communities and the Movement Strategy core team. My task was to ensure that all online communities were aware of the movement-wide strategy discussion. This year, my task is similar. Phase II of the strategy process was launched in April. Currently, future Working Groups members are being selected, and related pages on Meta-Wiki are being designed.

I’d like to learn what questions concerning the strategy process you would like to be answered on the FAQ page. Please answer here, on my talk page, or on a dedicated talk page on Meta-Wiki. Thanks!

If you have any questions or concerns, please, do ask!

Thanks, SGrabarczuk (WMF) (talk) 18:29, 14 July 2018 (UTC)[reply]

I'm live streaming my editing!

I'm live streaming my Wiktionary activity on YouTube right now, if anyone wants to watch. https://www.youtube.com/watch?v=r3-rNoIA7cU PseudoSkull (talk) 22:06, 14 July 2018 (UTC)[reply]

The stream is over but I might do it again sometime perhaps. However, you can still see the contents of the stream. I timed out at 56 minutes. PseudoSkull (talk) 23:04, 14 July 2018 (UTC)[reply]
Just remember that being an admin shows you things that shouldn't be visible to the public- be careful to limit the kinds of things you do while streaming, and make sure what you're working with is clear of any vandalism so you won't be giving it undue attention. Chuck Entz (talk) 20:48, 15 July 2018 (UTC)[reply]
Thanks for the video. It is interesting to see how other people contribute and especially on another Wiktionary (I contribute mainly on the French Wiktionary). Pamputt (talk) 06:12, 16 July 2018 (UTC)[reply]
I disagree with Chuck. Next time, PS, do all the craziest and most hardcore admin stuff possible. I'm thinking Unblocking Vandals, Showing Deleted Edits, Mass Deletion, Hiding Edit Summaries, Editing Protected Pages and, my favourite as I've never done it before...Whitelisting. --Harmonicaplayer (talk) 17:03, 18 July 2018 (UTC)[reply]
Perhaps a Core War-style multiplayer game on Twitch, with one admin against a team of three vandals. Equinox 19:42, 18 July 2018 (UTC)[reply]
I think that's the nerdiest comment ever on WT. --Harmonicaplayer (talk) 18:40, 19 July 2018 (UTC)[reply]
No deleting of the main page? —AryamanA (मुझसे बात करेंयोगदान) 21:49, 21 July 2018 (UTC)[reply]

What's with the new layout? Are we in Europe?? Wyang (talk) 22:05, 16 July 2018 (UTC)[reply]

The layout is by User:Per utramque cavernam (see his talk page for recent discussion of it). It reflects the approximate division of FWOTDs, which is in turn based on our strengths at Wiktionary. Hopefully more non-European languages can be featured in the future, but that also means I'll need more such words to be nominated. —Μετάknowledgediscuss/deeds 22:31, 16 July 2018 (UTC)[reply]
This ‘strength’ at Wiktionary is something to be ashamed about. < 10% of the world’s population is in Europe, yet we still pride ourselves on this Eurocentrism. All words in all languages (in Europe)... with a smattering of words elsewhere? Wyang (talk) 23:20, 16 July 2018 (UTC)[reply]
Remember this is en.wikt and has a user base that somewhat reflects that. There is no automatic way to pull in all the content from other Wiktionaries. Equinox 23:23, 16 July 2018 (UTC)[reply]
The layout of the nominations page has nothing to do with the proportion of words from different regions that are featured. DTLHS (talk) 23:31, 16 July 2018 (UTC)[reply]
Then what's the point? Don't forget that the project's main page reads "Welcome to the English-language Wiktionary, a collaborative project to produce a free-content multilingual dictionary. It aims to describe all words of all languages using definitions and descriptions in English." NOT all words of all European languages. Some editors have been working very hard to increase the coverage of the world's major languages, such as Chinese ― the language with the most native speakers in the world, outnumbering the rest by a wide margin. Yet there are some who view Europe as the centre of the world and actively try to suppress the rest: National European language vs Minor or extinct European language vs Non-European Language. Are you kidding me??? Might as well split it into Wiktionary:European Word of the Day and Wiktionary:Non-European Word of the Day. Wyang (talk) 03:34, 17 July 2018 (UTC)[reply]
I have no idea what the fuck you're talking about. Again, how does the layout of the nominations page affect what words are chosen? Are you volunteering to run the FWOTD project? Are you actually complaining about the distribution of words that are actually featured, in which case why are you talking about the nomination page? DTLHS (talk) 03:47, 17 July 2018 (UTC)[reply]
I have no fucking interest in editing in this system either. Wyang (talk) 03:49, 17 July 2018 (UTC)[reply]
During the 60s in the US South, I'm sure there are white southerners who were asking "what's the big deal about separate lunch counters? The colored folks get served the same food as everyone else?" Chuck Entz (talk) 14:05, 17 July 2018 (UTC)[reply]
When I took linguistics at UCLA, we were required to take at least one year of a non-Indo-European language in order to graduate (I chose Mandarin). The fact is that European languages were so dominant that it was hard to find courses outside of major universities in other languages, so even linguistics students tended to have no exposure to other language families before they came to UCLA, and it was too easy to stick with what was already familiar (things have improved since then, but it's still true to some extent). In that case, it was necessary to address the bias explicitly in order to do something about it. Chuck Entz (talk) 14:05, 17 July 2018 (UTC)[reply]
The current layout is definitely wrong. Not only because it really is Eurocentric but because it doesn't reflect the huge contributions in some non-European languages, such as Chinese or Japanese, etc., the current or any future true distribution of lemmas and it shouldn't. I don't approve Wyang's slamming the doors, though. It doesn't achieve anything.
The layout has to change back to what it was. --Anatoli T. (обсудить/вклад) 13:49, 17 July 2018 (UTC)[reply]
What about convenience to the FWOTD caretaker (i.e. Metaknowledge)? Unless he says otherwise, I think it might help him run the thing.
However, I agree with -sche below that it shouldn't send an undesirable message either, and if it does it's a problem (Maybe I should have named the headers "Type 1", "Type 2" and "Type 3" :p). Per utramque cavernam 15:45, 17 July 2018 (UTC)[reply]
Yes, the split into European and non-European, while probably well-intentioned, is sending an undesirable message/effect and should be undone... the current layout with all the continents seems like an improvement...? What do you think? And though we're constrained by what words people enter in enough detail to feature, a la Equinox's and other people's point, maybe we could try to explicitly counter the preponderance of Indo-European a la Chuck's point by featuring one word from each continent per week? (So people might realize they could copy the formatting of when adding more words from that language?) With two days leftover for constructed languages and repeats of continents? Or at least we could try to feature, say, at least four different continents per week? - -sche (discuss) 15:09, 17 July 2018 (UTC)[reply]
We really don't have the ability to do one word per continent per week. You ran WOTD, so you know how hard it is already to avoid burnout. If anyone volunteers to help with these issues, I'd be happy, but I haven't seen any volunteering yet in this thread. —Μετάknowledgediscuss/deeds 16:14, 17 July 2018 (UTC)[reply]

What a lame discussion. And what a twisted accusation! Obviously, the layout has only reflected what had already amassed for long, not to segregate, just to sort, bringing what the mildest system of order has to comprise. It could even help to get away from Eurocentrism, but that progressive dogma whereby disparities disappear when they aren’t exposed is apparently too attractive. No, @Atitarev, that page, as a medium, cannot just simply reflect contributions across the Wiktionary, people are still invited to post them thither, and if the managers don’t have a secret agenda, then apparent unevennesses are just. And they are also expected, a priori, for a Wiktionary of a European language attracts users of European ties and the economic and even individual probabilities (who gets educated in which languages, becomes computer-literate and has the spare leisure to come hither) play an innegligible role too. Fay Freak (talk) 01:04, 18 July 2018 (UTC)[reply]

@Fay Freak, what is it that you want to add here? I see name-calling and pooh-poohing. Was that your intent?
I may not share Wyang's tetchiness, but I understand his concerns and I do share them, albeit perhaps to a lesser extent. Please also see Chuck's comment above about lunch counters. ‑‑ Eiríkr Útlendi │Tala við mig 17:15, 18 July 2018 (UTC)[reply]
@Eirikr I don’t see any of it. The point is that there is nothing surprising in the appearance of “Eurocentrism” and people see only things that aren’t there instead of the things that are there, in which latter case nobody would move an eyebrow. Fay Freak (talk) 17:20, 18 July 2018 (UTC)[reply]
If you don't see any of it, why comment? It seems you're trying to make the case for Wyang being wrong, and for Eurocentrism being right. I cannot agree with either proposition.
Again, see Chuck's comments above. ‑‑ Eiríkr Útlendi │Tala við mig 17:41, 18 July 2018 (UTC)[reply]
@Eirikr No it doesn’t seem like that. The whole point is that in the depth there is no Eurocentrism there. It’s just a certain distribution of edits to Wiktionary, to that page, and what the managers find, streamlined. It is a mapping to that “Eurocentrism” of the editors seen together. Which isn’t “Eurocentrism” either, of course because one cannot see the editors together but everyone has different motivations, but a natural result of economic and individual probabilities. Thus I conclude that there is nothing to complain about. Fay Freak (talk) 18:10, 18 July 2018 (UTC)[reply]
I'm not saying whether there's Eurocentrism in content. My point is that it was unnecessarily giving the appearance of Eurocentrism. It doesn't matter whether the appearance isn't a reflection of the reality or not: if people are put off by the appearance, they're not going to stay around long enough to find out about the reality. "Other than that, Mrs. Lincoln, how was the play?" Chuck Entz (talk) 05:05, 19 July 2018 (UTC)[reply]
Okay; this, that people are put off by the appearance, has now been proved by that exemplary case. I am glad that we have attained a common view of the things, we should not give up to view things clearly; in an ex-post perspective Wyang can own to have fallen victim to bait and come back. Fay Freak (talk) 11:15, 19 July 2018 (UTC)[reply]
What "proof"? Chuck was merely stating a proposition. And why does it matter to you so much to "prove" this?
What "bait"? Reading related threads, it appears to me that the initial format change to European/non-European was a good-faith effort at organizing an unwieldy dataset, albeit an effort that seemed to entail an implicit bias. (Note I say "seemed" as I have no immediate knowledge of the state of mind of the people who implemented this change: an important consideration when discussing motivations.) The term "bait" implies that someone was deliberately trying to irritate or otherwise provoke a reaction. Is that your view of the motivation behind the change in format?
You seem quick to ascribe motivations to others, and quick to assume that you know best. To me, your conduct in this thread feels like gaslighting. I honestly can't tell if you're trolling. ‑‑ Eiríkr Útlendi │Tala við mig 16:42, 19 July 2018 (UTC)[reply]
I just agreed with that proposition because it has been proven by at least Wyang’s reaction. Nothing else there.
Then isn’t there good-faith bait? Or when things just effect as bait or trolling when they weren’t intended as such. Bait-in-effect. I don’t believe much in implications. Implications are when one hasn’t the room to be explicit; I am not sure if “good faith” should be a category of cognition, the word “faith” is suspicious alone for he who tries to scrutinize the real nexus. Well actually I tried to make the expression helpful reinforcement, to help him to cope with his Eurocentrism impressions, but you have spoiled it now I am asked to explain it. If “gaslighting” intends to imply that it is objectionable then the label does not apply to me, but according to the Wikipedia definition it doesn’t: Denial, misdirection, contradiction, and lying. I haven’t done this. Don’t search for it. Questioning the own perception is ok and is not unusually what is needed to get over disagreements, and this is also the first thread topic, to posit questions about which perceptions should be maintained – after the perceptions only may come the acting. Fay Freak (talk) 19:09, 19 July 2018 (UTC)[reply]
Please don't get banned again, we need quality Arabic editors. Crom daba (talk) 05:01, 21 July 2018 (UTC)[reply]
Arabic edits are commendable but the understanding of what's right and wrong leaves much to be desired. --Anatoli T. (обсудить/вклад) 05:08, 21 July 2018 (UTC)[reply]

Replace {{unreferenced}} with {{rfr}}

Hey, could we replace {{unreferenced}} with {{rfr}}? It would fit to the scheme we use for {{rfe}} and {{rfv}}. --Victar (talk) 01:28, 17 July 2018 (UTC)[reply]

I agree the templates should be merged. If you can make {{rfr}}'s parameter 1 default to en or und when not specified (so existing uses of {{unreferenced}} don't break), we could just redirect {{unreferenced}} to {{rfr}}. (We should keep the redirect, of course, because why not? Some people might be used to typing it.) - -sche (discuss) 01:33, 18 July 2018 (UTC)[reply]

Has there been a vote on deleting bagua

Does anyone know if there has been a vote on deleting bagua (the components of an I Ching hexagram)? Just to clarify, I am not talking about the word "bagua" but the Unicode characters which make up the class of bagua. Jawitkien (talk) 15:39, 18 July 2018 (UTC)[reply]

  • Why would there be? SemperBlotto (talk) 05:01, 19 July 2018 (UTC)[reply]
    Some of them have been deleted. I think because they were created as a skeleton from their Unicode description, which is not very useful. Is it possible to get them un-deleted? They have a particular format that I'm not sure I can re-create. There are less than 8 entries. I can list them here if it helps someone un-delete. Jawitkien (talk) 12:42, 19 July 2018 (UTC)[reply]
I don't think anything useful has been deleted in terms of these trigrams. Two still have entries (, ) whereas five (, , , , ) were never created in the first place. The remaining trigram () was deleted, but the deleted content was not anything like the nice formatting of the two existing entries; the entire text of the page was "I Chang symbol for fire". —Granger (talk · contribs) 14:20, 19 July 2018 (UTC)[reply]
I successfully created the next six, and have augmented the basic form a bit. I intend to update these more by tying them to the various hexagrams that are composed of them. Does anyone else have ideas about how to enhance them ? Just taking a break from Lojban on a small finite project. Jawitkien (talk) 18:27, 19 July 2018 (UTC)[reply]

Klingon language copyright

What is its status? Should we be allowing mass creation of entries in the appendix? DTLHS (talk) 20:16, 20 July 2018 (UTC)[reply]

This has been discussed before, and my understanding is that no, we should not host any large number of entries. @Mofvanes, please stop creating Klingon entries. Most of what has been created should probably be deleted, down to just a few representative entries (which, my understanding of previous discussions is, might be beneath judicial notice). IANAL, of course. - -sche (discuss) 21:11, 20 July 2018 (UTC)[reply]
@BD2412 is an actual lawyer and believes that a language can be intellectual property (this has not been tested in the courts in the US, so that remains a belief). In practice, it is clear that CBS believes otherwise, because of the existence of a great deal of material published in and about Klingon, including for commercial gain, that has not been subject to lawsuits (including publications by the Klingon Language Institute). There have also been legal arguments put forward that dispute that a language can be copyrighted at all; see here. I would therefore tend to think that we can include Klingon in the appendix until there is a court case that clarifies it. —Μετάknowledgediscuss/deeds 22:24, 20 July 2018 (UTC)[reply]
In the last 2 years there has been some new legal activity. DTLHS (talk) 22:27, 20 July 2018 (UTC)[reply]
I'm not aware of it. Mind sharing a link? —Μετάknowledgediscuss/deeds 22:30, 20 July 2018 (UTC)[reply]
[13] I guess they declined to rule specifically on the language so it's somewhat irrelevant. DTLHS (talk) 22:36, 20 July 2018 (UTC)[reply]
Indeed. That's the same case for which the memo that I linked above was drafted, and as you noted, the legal situation remained unchanged. —Μετάknowledgediscuss/deeds 22:59, 20 July 2018 (UTC)[reply]
I don't think CBS inaction is because they believe otherwise; as any Ferengi would tell you, don't do it if there's no profit in it, and there's no profit in attacking your fans for something that's not competing with you. Maybe a full dictionary would be too close to competing with them; and even if languages aren't copyrightable, dictionaries are, and 80-90% of Klingon's vocabulary is direct from The Klingon Dictionary.
The US Copyright Office has the Copyright Compendium to explain their practices: "In other words, a work may be eligible for copyright protection if it qualifies as a literary work; a musical work; a dramatic work; a pantomime; a choreographic work; a pictorial, graphic, or sculptural work; a motion picture or other audiovisual work; a sound recording; or an architectural work. Works that do not fall within the existing categories of copyrightable subject matter are not copyrightable and cannot be registered with the U.S. Copyright Office."(313.3) They link to a policy statement that rejected compilations of exercises as copyrightable material. There's no court case, but a court would give deference to the Copyright Office interpretation, and that says that a language qua language is not copyrightable.--Prosfilaes (talk) 21:58, 24 July 2018 (UTC)[reply]
1. The problem with The Klingon Dictionary is that it's simply a wordlist. There's not really a way to describe nger (n) theory in other words. Where possible, I do try to make sure the Appendix definitions are different. The etymologies, IPA pronunciations, verb transitivities, and quotations are not in the dictionary. It's also missing words coined after 1992.
2. I'm purposefully excluding Star-Trek-only mainspace-equivalent-less words like pIvchem warp field and excluding any non-linguistic trivia. The Appendix won't be a complete dictionary.
3. http://klingonska.org/ and http://mughom.wikia.com/ have been getting by fine so far with a CC BY-SA licence and no trademark notice.
Mofvanes (talk) 03:09, 25 July 2018 (UTC)[reply]

Word coined by a blog

What should be done in the situation where the exact blog post in which a word was coined is known? Should the blog be named or the post even linked in the Etymology section? In my case, there is no WT:CFI violation, since the word is citable and is in wider use, and while the reliable source does not name the exact blog, I think I have managed to locate it (the dates match, it is the earliest blog post that has the term and the overall tone and comments would suggest that it indeed is a coinage by the blogger). SURJECTION ·talk·contr·log· 21:29, 21 July 2018 (UTC)[reply]

Don’t see a reason why not to name. An example I can think of is snowclone which has been invented by the Language Log (it has been wanted to track somehow etymologies attributable to specific entities, but this has not happened yet, so this is what I have only). Also such blogs that are influential enough to influence language can be assumed to have durable links, more than many journalistic products, so that is not where it should fail. You could think only whether it is undue promotion to link, and what plays into this consideration is if the blog is commercially oriented or scholarly. If the blogs are journal-tier it is a respectable service to the reader to link them. Like I don’t exactly think we shouldn't link to the Kiwi Hellenist or Marijn van Putten’s blog except “muh blog”; then there are all those archivists’ and librarians’ blogs. WT:CFI applies to quotations, not to further readings or etymology references, and even WT:CFI gives the general yardstick “it is better to cite sources that are likely to remain easily accessible over time”. “Blogs” are a spook, and even though one would troll by saying that one doesn’t know what a blog is, it is an underspecified genre by content, because a blog can contain anything from keyword spam to legit scientific writing, and by creation, because journalists blog and bloggers do journalism with press cards and all.
What do we do about Buzzfeed and Medium.com? Answer: We still decide by durability and the spam criterion. Fay Freak (talk) 22:20, 21 July 2018 (UTC)[reply]
The blog is not one of those that would be considered influential enough to influence the language; it is a blog of a private person, who simply happened to come up with a translation for a word which then spread from there. SURJECTION ·talk·contr·log· 22:40, 21 July 2018 (UTC)[reply]
Named, yes. Linking to an otherwise non-notable private blog might raise issues of whether we would be promoting the blog, especially if the term or the subject matter (or even the blog) is offensive or controversial (Not that we should never link- just that we would need to consider those issues before deciding to do so). I think the main reason for linking would be if there were aspects of the context necessary to understanding the original usage that couldn't be explained easily in an etymology. If it's just a matter of completeness, i.e. to show you know the exact spot, I would say not to bother. Chuck Entz (talk) 22:44, 21 July 2018 (UTC)[reply]
An archive.org link is recommended if possible. DTLHS (talk) 03:13, 25 July 2018 (UTC)[reply]
Last time I tried, one could not, using the proper templates, have the archive and the original link at the same time; the archive.org link displaces the original one. The quotation/citation templates need some overhaul in this respect. Fay Freak (talk) 08:55, 25 July 2018 (UTC)[reply]
@Fay Freak: with {{quote-web}} (and, indeed, all the quote-family templates), use |url= for the original URL, and |archiveurl= and |archivedate= for the Archive.org version. This is explained in the template description page. — SGconlaw (talk) 11:10, 25 July 2018 (UTC)[reply]
@Sgconlaw: I know, I know. But the template looks too bad when we still have both URIs accessible, like on تَرْفِيهِيّ (tarfīhiyy): “archived from the original on [date]” is too bait if we have the original linking working and just give an archive link precautionarily as an alternative. It should be rather hidden like in a <sup> tag, for example IA 2018-07-25 (I don’t even really know why we need the archive date, it is excessive). I always wanted to give both the original and an archive link in quotes, but the layout is too butters. Fay Freak (talk) 12:23, 25 July 2018 (UTC)[reply]
The quote templates here were modelled after the ones at the English Wikipedia, and those over at the English Wikipedia use both |archiveurl= and |archivedate= in the format used in our quote templates. If I had to explain why it is necessary to have both |url= and |archiveurl=, it may be because the archive URL is not intended to replace the original URL, just to supplement it if the original URL becomes inaccessible. The reason why |archivedate= is necessary is because websites change, so the date indicates which version was archived. I'm not sure an element like "IA 2018-07-25" would be easily understandable by readers. — SGconlaw (talk) 14:08, 25 July 2018 (UTC)[reply]
It isn’t really an argument what the templates do on Wikipedia because they do not provide quotes for senses.
Currently I think the layout is confusing for readers, independently of how understandable my first suggestion is. Currently the templates insinuate the archive link to be the main link somehow which it regularly isn’t. I don’t really get why the archive link takes the place of the main link. I’d rather expect the archive link to be visually formatted as a fallback.
I’d like this too:
Muraselon[14], archived 2018-07-25
Currently:
Muraselon[15], archived from the original on 2018-07-25
Saved space and highlighted what is important. (Don’t know how the date format should be.) Now, the reader isn’t too obtuse for that format, is he?
The layout can have the archive link more prominent if the original link is not provided (particularly, if a bot finds it to be dead, so people save a click). Fay Freak (talk) 14:33, 25 July 2018 (UTC)[reply]
I might make one at some point, but I thought there was a bot that went around doing that. SURJECTION ·talk·contr·log· 13:06, 25 July 2018 (UTC)[reply]
There isn't. DTLHS (talk) 19:22, 25 July 2018 (UTC)[reply]
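
The layout proposed above (original link first, archive copy as a short fallback, and the archive link promoted only when no original is given) can be sketched roughly as follows. This is only an illustration of the proposal, not how the quote templates actually behave; the function name `render_source` and its parameters are hypothetical, and the output uses plain external-link wikitext.

```python
def render_source(title, url=None, archive_url=None, archive_date=None):
    """Render one citation line, treating the archive copy as a fallback.

    All names here are hypothetical; this models the proposed layout only.
    """
    date_note = f" {archive_date}" if archive_date else ""
    if url and archive_url:
        # Original link stays the main link; the archive is a short side note.
        return f"[{url} {title}], [{archive_url} archived{date_note}]"
    if archive_url:
        # No original link given (e.g. it is dead): promote the archive link.
        return f"[{archive_url} {title}] (archived{date_note})"
    if url:
        return f"[{url} {title}]"
    return title


print(render_source("Muraselon",
                    "https://example.org/article",
                    "https://web.archive.org/web/2018/https://example.org/article",
                    "2018-07-25"))
```

The example URLs are placeholders. The point of the design is that a reader sees the original link first whenever it exists, which is the complaint raised about the current display.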

Japanese: Categories for kanji readings in cases of jukujikun and other irregularities

We have at least one anon, apparently editing from multiple IP addresses, who is creating categories for kanji readings that classify, at best, as irregular. The most recent examples I'm aware of are [[Category:Japanese terms spelled with 海 read as え]] and [[Category:Japanese terms spelled with 老 read as び]], inspired by the jukujikun spelling of 海老 (ebi, shrimp, prawn). Historically, the reading ebi is phonetically wholly unrelated to the spelling, and no Japanese-language resource I know of treats e or bi as readings of either 海 or 老. I'm generally of the opinion that we should not be categorizing irregular readings for kanji, but I might be able to see some utility...

JA editors, your thoughts? Should we generate these, or delete these?

@Suzukaze-c, Tooironic, Dine2016, MGorrone, kc_kennylau, TAKASUGI Shinji, Poketalker, anyone else I haven't thought of. ‑‑ Eiríkr Útlendi │Tala við mig 16:27, 24 July 2018 (UTC)[reply]

My opinion is delete. —Suzukaze-c 18:42, 24 July 2018 (UTC)[reply]
I agree, these categories don’t make sense. Like for 煙草 (tabako), is that “spelled with 煙 read as タ” + “spelled with 草 read as バコ”, or “spelled with 煙 read as タバ” + “spelled with 草 read as コ”? Either one is equally nonsensical.  --Lambiam 21:19, 24 July 2018 (UTC)[reply]
I agree. Delete. — TAKASUGI Shinji (talk) 15:55, 25 July 2018 (UTC)[reply]
@mello -- Ah, that last was probably prompted by a mistake on my part in categorizing the reading as "kun" instead of "irregular". ‑‑ Eiríkr Útlendi │Tala við mig 19:04, 25 July 2018 (UTC)[reply]
  • More anon fun, this time going through given names and creating these reading categories for nanori -- which can be extremely irregular, where kanji and readings are mixed together in a blender of cross-associations.
Are folks okay with a stated position of not creating categories for nanori-only readings, and also of deleting any such existing categories? ‑‑ Eiríkr Útlendi │Tala við mig 20:58, 17 August 2018 (UTC)[reply]

Tools for easy find-and-replace?

I think there might be tools on Wiktionary, possibly involving JavaScript, which make finding and replacing text on a wiki page easier. Currently, I have to keep a file with the replacement text open as I'm editing, and if I see something I want to replace, copy it out of the file into the page, replacing the existing text. Is there an easier way? Jawitkien (talk) 16:25, 25 July 2018 (UTC)[reply]

The regex gadget at Special:Preferences#mw-prefsection-gadgets or meta:TemplateScript. —Suzukaze-c 18:33, 25 July 2018 (UTC)[reply]
Thanks! Now I need to figure out how to load 5,000+ definitions into it. Jawitkien (talk) 20:53, 25 July 2018 (UTC)[reply]
Hmm... if you're pulling chunks of text out of a premade file, maybe you should get a bot. —Suzukaze-c 01:56, 26 July 2018 (UTC)[reply]
@Suzukaze-c Is that possible? How do I do that? Most of the changes are simply normalizing the entries for Lojban words, so they all have the same components and spelling, etc. Jawitkien (talk) 16:16, 26 July 2018 (UTC)[reply]
I must admit that I am not sure. The name "Pywikibot" is usually mentioned a lot.
If the changes can be done with a chain of regexes, perhaps AutoWikiBrowser could help too. —Suzukaze-c 07:50, 27 July 2018 (UTC)[reply]
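
For what "a chain of regexes" might look like in practice, here is a minimal sketch in Python (the language Pywikibot is written in). It only covers the text transformation; loading and saving pages would be the job of whichever bot framework or tool is used. The rules in `RULES` are hypothetical examples and are not the actual Lojban normalizations.

```python
import re

# Hypothetical normalization rules, applied in order: (compiled pattern, replacement).
RULES = [
    # Strip trailing spaces and tabs at the end of each line.
    (re.compile(r"[ \t]+$", re.MULTILINE), ""),
    # Remove padding inside headings: "== Noun ==" becomes "==Noun==".
    (re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1$", re.MULTILINE), r"\1\2\1"),
    # Collapse runs of three or more newlines into a single blank line.
    (re.compile(r"\n{3,}"), "\n\n"),
]


def normalize(text):
    """Apply every rule in RULES to the page text, one after another."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


sample = "== Lojban ==  \n\n\n\n=== Gismu ===\ntext \n"
print(normalize(sample))
```

Keeping the rules in one ordered list makes it easy to review them before running anything over thousands of pages, and the same list could be tried on a handful of entries first.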

There are a lot of pages here, would anyone like to help? DTLHS (talk) 04:52, 26 July 2018 (UTC)[reply]

The best thing you could probably do is to tell Word dewd544 (talk • contribs) to double-check when adding etymologies. --Chicken is fun (talk) 14:47, 29 July 2018 (UTC)[reply]

An anon is adding lots of verbs into this category. I'm not convinced that most of them belong there. Any ideas? SemperBlotto (talk) 14:42, 28 July 2018 (UTC)[reply]

Is there a way to test the theory that they don't belong? The few I checked fit the description on Wikipedia. Jawitkien (talk) 23:12, 29 July 2018 (UTC)[reply]
Looking at our definition, most of them seem to belong. The "break" ones seem to be out of place though. Andrew Sheedy (talk) 23:41, 29 July 2018 (UTC)[reply]
I don't think that any of the ones that involve copulative (be), modal ('d), or auxiliary (some have) verbs belong. DCDuring (talk) 00:17, 31 July 2018 (UTC)[reply]
Other suspect inclusions in the category include Verb + PP (eg, those beginning with go) and those involving phrasal verbs. DCDuring (talk) 00:20, 31 July 2018 (UTC)[reply]
I feel strongly that we should take a restrictive stance with respect to inclusion of individual entries. If there are other categories that more controversially might be included, they can be included by reference on the category description page for Category:English light verb constructions. Similarly, we could refer to appendices etc. DCDuring (talk) 00:34, 31 July 2018 (UTC)[reply]
IMO the category needs a major cleanout. How many of the "pay" expressions are just metaphorical, but lexical, uses of pay? Each entry using a verb not on the list of core light verbs needs to be examined. The specific definitions that qualify each term as a member of the category should be labelled, visibly or invisibly. DCDuring (talk) 01:21, 31 July 2018 (UTC)[reply]
I just noticed this with keep mum, where keep means remain: doesn't seem particularly to be a "light verb". Equinox 23:23, 21 August 2018 (UTC)[reply]

Lojban words

I'm sure most folks here are very busy dealing with words that are actively known to speakers of a natural language which are not adequately covered in Wiktionary. For those few who are not, I'd like to ask for some advice. I am trying to make a fair representation of Lojban words, and don't know what the appropriate action should be. When I go to a chat board for Lojban, such as https://lojban.slack.com/messages I see a lot of words that I don't see here. Some of them are a class that isn't even present on wiktionary (attitudinals) and some of them are just compound words that are composed of the parts (rafsi) mentioned here, but which don't exist as a page. Should I try to make the Lojban pages here reflect those words which are actively used in "speech"/chat in the wild? Or is wiktionary primarily interested in the base words and not the way they are combined together? I see other languages have compound words on wiktionary, I just don't know what my efforts should be. Jawitkien (talk) 23:25, 30 July 2018 (UTC)[reply]

Wiktionary is interested in the way words are really used in the wild. Compound words (lujvo) and attitudinals should be included in a dictionary of Lojban—both are essential parts of the language. If Lojban were still covered in the mainspace, I would encourage you to add any compound words and attitudinals that are attestable. Since the language is now in the appendix where the standards for inclusion are weaker and less clearly defined, I suppose you can even add them when not CFI-attestable. —Granger (talk · contribs) 00:16, 31 July 2018 (UTC)[reply]
That's no reason not to add references / quotations if possible, even if they aren't technically needed for inclusion and even if they aren't durably archived (archive.org can help). DTLHS (talk) 00:43, 31 July 2018 (UTC)[reply]
I agree. —Granger (talk · contribs) 05:54, 31 July 2018 (UTC)[reply]

New user group for editing sitewide CSS/JS

I propose that we treat the "interface admin" right like the "rollbacker" right, and give it to all admins. We can then choose to give it to non-admins if we decide they could use it, like with "template editor". —Μετάknowledgediscuss/deeds 16:08, 30 July 2018 (UTC)[reply]
@Metaknowledge: That is a fundamentally incorrect approach. Template editors can mess up the appearance of the wiki content, but interface admins can potentially take over others' user accounts or add various malicious scripts to Wiktionary if they go rogue. Tgr notes that "a malicious user or a hacker taking over the account of a careless interface-admin can abuse it in far worse ways than admin permissions could be abused". This, that and the other (talk) 06:07, 5 August 2018 (UTC)[reply]
Your comment suggests to me that you don't understand what this is about. Interface admins cannot "take over others' user accounts". Do you even realise that every admin here currently has the powers of an interface admin? Please read the docs if this remains unclear to you. —Μετάknowledgediscuss/deeds 17:20, 5 August 2018 (UTC)[reply]
They're not completely wrong: this is the only thing on the site that executes code on people's computers rather than on Wikimedia's servers. In theory, malicious JS could do bad things, though I wouldn't know to what extent. I agree, though, that allowing admins to continue to have access that they've had for a decade and a half without incident isn't exactly a dire threat. Indeed, as long as the same people who are choosing interface admins are choosing regular admins, and by similar processes, this doesn't seem like much of an added safeguard. Chuck Entz (talk) 20:32, 5 August 2018 (UTC)[reply]
I understand exactly what's going on. Yes, admins currently have the power to edit JavaScript, leading to the ability to compromise others' user accounts and do other evil things, but that will cease to be the case in a few weeks. Appointing people who haven't been vetted by the community as interface admins is reckless. This, that and the other (talk) 02:06, 7 August 2018 (UTC)[reply]
See cross-site scripting, cryptomining, phishing, cross-site request forgery, etc. --Rschen7754 04:40, 8 August 2018 (UTC)[reply]
I agree that we should limit this to people who would use the tool, however we don't need to become overly alarmist. As long as everyone who knows what they are doing is following the relevant pages it should be hard for anyone to abuse the global js pages. It might be a worthwhile security policy that we limit or prohibit inclusions on the global pages so that all changes to global js are tracked in one place. - TheDaveRoss 12:53, 8 August 2018 (UTC)[reply]
As I understood from the discussion on Meta, the main concern is the large number of accounts that can be compromised. The objective is to reduce the set of target accounts to those that are really used for editing JS, supposedly belonging to techy-minded people with strong passwords or 2FA. A simple opt-in for current admins solves the question. A restrictive policy for inactivity is recommended. --Vriullop (talk) 12:28, 9 August 2018 (UTC)[reply]

Requested edit discussion on WT:Blocking policy

See the discussion here. Thank you! Salvidrim! (talk) 16:58, 30 July 2018 (UTC)[reply]

Derogatory != pejorative?

Module:labels/data distinguishes between pejoratives and derogatory terms. Is this desired/intentional? Aren't they the same thing?__Gamren (talk) 14:58, 31 July 2018 (UTC)[reply]

You ask because you haven’t a strong opinion on it. Other people hadn’t either. People just weren’t sure to merge. One can also ask oneself about the relation to offensive. Fay Freak (talk) 20:23, 31 July 2018 (UTC)[reply]
derogatory = pejorative, both != offensive. -20:41, 31 July 2018 (UTC)
Offensive is not a synonym for those because a word can be offensive for other reasons than being devaluing, but this is mostly context-dependent, so in the cases when the word “offensive” is supposed to be used for labels, it will be because a word is generally intended to be offensive by its derogativeness – i.e. the distinction is there but as far as I see not for lexical labelling. Or give me examples of words or word senses which merit the label “offensive” without meriting the label “derogative” or “pejorative”. Fay Freak (talk) 20:53, 31 July 2018 (UTC)[reply]
My opinion is that we could merge "pejorative" and "derogatory" without making much difference to anything (and we probably should!): it's like "humorous" vs. "jocular". But "offensive" means something different. (For example, calling an old car a "rustbucket" is derogatory but not offensive.) Equinox 21:00, 31 July 2018 (UTC)[reply]
I thought "derogatory" was stronger than "pejorative" (like "rare" vs "uncommon"), but from our entries and other dictionaries', that doesn't seem to have a basis. Neither word is defined in Appendix:Glossary. Some entries do contrast the two labels, e.g. swine, but even if there is a distinction, over-fine distinctions will probably never be applied consistently on a project with multiple editors, so I guess they could be merged. I'd prefer to keep "derogatory", which seems to be more common and of longer history. I agree the label(s) should not be merged into "offensive". In addition to Equinox's example, one could consider in the other direction something like (some) Christians taking offense at "Xmas" (or "Happy Holidays"!) despite (most) users of those terms not intending derogation. - -sche (discuss) 21:16, 31 July 2018 (UTC)[reply]
It is offensive to the old car though, or no? It offends the car symbolically. From a behaviorist perspective, the words are synonymous in lexical usage because they refer to the quality of a lexeme effectuating negative reinforcement towards the significate. Even if you have difficulties with such a view, consider that it is probably not necessary anyway to make the distinction of offensiveness and derogativeness and pejorativeness. This point made with “rustbucket” would only lead to labeling words for things “derogative” and words for humans “offensive”. Only based on the quality of the significate being able to understand the language? As I see the things it would be viable to merge all into “derogative”. The point is “why distinguish?”, when the only thing we want to say is that the word is somehow directed “against” something, or “why would the reader need such a distinction?”
For the “Xmas” thing, it is what I have mentioned about the offensiveness being context-dependent. When those people who are offended by it aren’t there it is not offensive thus. If we want to go that far with the labels we will have to state to whom terms are offensive. (We don’t want to go in that direction, it would be dealing with fluctuating PC habits too much).
I can’t understand the distinctions at swine btw. Fay Freak (talk) 21:36, 31 July 2018 (UTC)[reply]
I think it's useful to distinguish words that are used pejoratively towards humans vs. other objects. DTLHS (talk) 22:12, 31 July 2018 (UTC)[reply]
I do too, this is why we have the category “ethnic slur” - it should be a subcategory of pejorative/derogative as it seems to me. So how to implement the distinction practically? The distinction between “pejorative”/“derogative” and “offensive” is either not there or not transparent to casual dictionary editors. Somehow we need to think about what we can sustain. While the argument about the reality of any distinction is probably sophistical, as I see ex post, the inciting question here is how we want to categorize. Even distinguishing offensive from pejorative/derogative, whatever that distinction may be, labeling with these words isn’t a way to bring about a presentable categorization.
An easy way, easy to grasp for the editors and having the appeal of being readily available (unlike what we yet have to think out), is to allow all three terms, displaying “pejorative” as “derogative” or inversely but displaying “offensive” as “offensive”, while still categorizing all into one category. Fay Freak (talk) 23:00, 31 July 2018 (UTC)[reply]
... everyone else above is suggesting that we merge derogatory and pejorative, but keep offensive distinct. Why would we allow the display of all three, but categorize them the same? No one else has proposed this, and given the thread above, this proposal doesn't make sense to me. ‑‑ Eiríkr Útlendi │Tala við mig 23:41, 31 July 2018 (UTC)[reply]
Because why have Cat:English offensive terms and Cat:English derogatory terms? Who wants to look things up under two different pages? Why should we have that distinction? They overlap so hard, and apparently a huge part of terms can bear both epithets while it does not happen often in practice that editors label senses “derogatory, offensive”. This is a sign that the terms are synonymous in the context of dictionary labels.
I have not suggested the display of all three, but of two only. Because editors might want to distinguish in the label display between “offensive” and “derogatory”, while perhaps it is not worth for distinct categories.
Still nobody has proposed a distinction that is feasible and does not result in random categorizations. So what are the relations of the terms between each other? With the occasional examples of one label being not applicable while the other is, no ordering principle is attained. Fay Freak (talk) 00:10, 1 August 2018 (UTC)[reply]
A compliment can be offensive if it includes a stereotype, for instance saying that someone is a great lawyer because they're Jewish and know all the tricks to get what they want from people, or a great accountant because Jews are really good with money. Chuck Entz (talk) 03:17, 1 August 2018 (UTC)[reply]
Cars are not sentient and can't be offended; one cannot even try to offend a car because it won't listen. You can surely see why "nigger" or "bitch" is offensive and "rustbucket" isn't? I say this as somebody who has a problem with the current trend for censoring language that might "offend" someone (thus giving minorities an easy way to silence majorities); yes I know everyone hates my white ass for this. Equinox 00:13, 1 August 2018 (UTC)[reply]
Then it is offensive to the current or previous owner of the car? Or, in other words for things, for the producer? The ctrl-left could surely tell you something about such usage being classist, elitist and offensive to queer-cared people? I don’t trust so much in a common sense of label users, if an autistic sense can be contrived it is used, and what is currently in the categories suggests that the distinction is only good in theory (and useless it is, because what do we get from such categories even if utmost diligence in the usage of the labels “derogative” and “offensive” has been kept? I still see only a hotchpotch of words.) Something idiot-proof please. Fay Freak (talk) 00:55, 1 August 2018 (UTC)[reply]
I think all we can do is flag words as "offensive" if it is (more or less) impossible to use those words in a way that isn't intended to offend (see snarl word). We certainly can't gloss every word that might possibly offend somebody, because anyone can claim they were offended by any word. Equinox 01:12, 1 August 2018 (UTC)[reply]
As far as "calling a car a rustbucket is offensive to the car's owner", that's surely extralexical, because it relies on cultural knowledge (owning a cheap or worthless thing when you could upgrade it supposedly makes you a loser). Equinox 01:49, 1 August 2018 (UTC)[reply]
Our application of the labels pejorative and derogatory seems unsystematic. Is the reason that ugly is not labeled pejorative or derogatory that the definition makes it obvious? Ie, we apply Occam's razor in cases where the label is redundant to the definition. DCDuring (talk) 01:37, 1 August 2018 (UTC)[reply]
Note: before the labels are merged, we should check for entries which use both labels, especially any entries that use both on the same line, e.g. "{{label|en|derogatory|slang|pejorative}}" which would become "(derogatory, slang, derogatory)". I'll take care of the ~20 English entries which use both labels right now. - -sche (discuss) 01:33, 1 August 2018 (UTC)[reply]
I certainly agree that offensive != derogatory/pejorative. A pejorative is used to belittle the person or thing to which it is applied. An offensive term isn't necessarily so. Calling someone grønlænderstiv (drunk as a Greenlander) isn't an insult to the interlocutor, but the term relies on a negative stereotype, hence it's offensive. Many people use straight-acting as a compliment, but it, too, relies on a stereotype. Some people use neger and nigger without meaning to offend anyone (yes, even in 2018), yet people are offended by it -- regardless of whether there is any "rational" reason for being offended, we must acknowledge that offense is commonly taken. Yes, of course every word could hypothetically offend someone, but there are still words that are more offensive than others. @-sche Thank you, but that's not actually necessary. {lb|en|derogatory|derogatory} becomes "derogatory", and {lb|en|humorous|jocular} becomes "humorous". @DCDuring I don't think ugly is necessarily pejorative, it's just that ugliness is usually undesirable. Suppose an artist wants to shock her audience into, say, caring for the environment. She paints a picture of an ecological dystopia, attempting to make it gaze-lockingly hideous and its effect long-lasting. She communicates her intent to someone else, and asks him of his opinion. He says that it is very ugly, and means it as a compliment. The artist understands this, and is pleased by his assessment.__Gamren (talk) 18:50, 1 August 2018 (UTC)[reply]
That seems like a very lame, disabled questionable explanation. With that kind of elastic scenario-creation there is no word, however offensive, that could not be used in a positive way. In the world of art, fiction, role-playing, stagecraft etc Nazi could be a positive way of describing a staged march or a manner of saluting or a certain kind of uniform. DCDuring (talk) 19:10, 1 August 2018 (UTC)[reply]
Well, sure. Nazi isn't necessarily offensive or pejorative.__Gamren (talk) 11:01, 2 August 2018 (UTC)[reply]
Can you give any example of a word that IS "necessarily offensive or pejorative"?
The point of either a "pejorative" or "offensive" label is that the labeled term is typically (or merely often or dangerously?) pejorative or offensive. Almost any characteristic of a word can be shown not to be true in some subset of its uses. DCDuring (talk) 11:21, 2 August 2018 (UTC)[reply]
Regardless, I've merged pejorative -> derogatory. Delete Category:Pejoratives by language and subcats at your leisure.__Gamren (talk) 20:49, 3 August 2018 (UTC)[reply]
Thank you so much. DCDuring (talk) 21:58, 3 August 2018 (UTC)[reply]
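
For anyone wondering why merged labels do not show up twice, the behaviour described above (aliases resolve to one canonical label, and repeats are dropped) can be modelled with a small Python sketch. This is not the code of Module:labels, which is written in Lua; the `ALIASES` table and the function `display_labels` are hypothetical and only mirror the two examples given in the thread.

```python
# Hypothetical alias table: alias -> canonical label that gets displayed.
ALIASES = {
    "pejorative": "derogatory",
    "jocular": "humorous",
}


def display_labels(labels):
    """Resolve aliases, drop repeated labels, and keep first-seen order."""
    seen = set()
    shown = []
    for label in labels:
        canonical = ALIASES.get(label, label)
        if canonical not in seen:
            seen.add(canonical)
            shown.append(canonical)
    return ", ".join(shown)


print(display_labels(["derogatory", "slang", "pejorative"]))
print(display_labels(["humorous", "jocular"]))
```

Under these assumptions an entry tagged with both labels would display the canonical label once, so the merge would not by itself produce output like "(derogatory, slang, derogatory)".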