Wiktionary:Beer parlour/2021/November

The sorry state of gsw

I want to majorly change the policies for this lect and I need input over at Wiktionary_talk:About_Alemannic_German#The_sorry_state_of_gsw. Fytcha (talk) 17:04, 1 November 2021 (UTC)

I made this as it was a redlink, but it's not yet included in Module:languages, Module:languages/canonical names, and Module:languages/code to canonical name (it also evidently doesn't have an ISO code or even a Glottolog entry). I want to be conservative about editing those modules, so I'm providing transparency so that someone else can add it more thoughtfully. —Justin (koavf)TCM 17:29, 2 November 2021 (UTC)

Macedonian secondary imperfectives

In Macedonian, if an imperfective verb is derived from a perfective verb which is itself derived from an imperfective verb, provided that there are no semantic shifts across the derivational chain (e.g. that the perfective verb does not add any meaning beyond perfectivity (such as an inchoative, diminutive or intensive meaning) to the original imperfective verb), the secondary imperfective combines the telic meaning of the perfective verb with grammatical imperfectivity and consequently appears in a limited range of contexts compared to both the perfective and the primary imperfective verb, namely the following:

1. Narrative present (where perfective actions are being recounted, possibly ones which were actually completed in the past, but expressed as imperfective forms to create an impression that they are ongoing and involve the reader or hearer)

2. Iterative or habitual contexts (where there is a series of sub-events which are individually perfective in that they attain their telos, but which comprise a global event which is not itself perfective; this sense of the secondary imperfectives is not to be confused with lexically frequentative verbs whose basic semantic content implies that an action is repeated several times in succession, e.g. with verbs like "knock" or "hum")

3. Irrealis clauses which allow or require an imperfective verb, whereas the context requires a perfective (telic) meaning.

Unlike the primary imperfective verbs, the secondary ones cannot be used in canonical present context, where the time of the event overlaps with the time of speech, e.g. in contexts where English would use a progressive construction like "I am cleaning".

There might be more uses of the secondary imperfectives than the three enumerated above, but I have not researched the subject well enough to come up with an exhaustive account.

An example:

  • скока - primary imperfective - to jump (to be in the process of jumping or to habitually jump, with an atelic meaning, i.e. with the endpoint of the event being neither asserted, implied nor denied)
  • скокне - primary perfective (to jump, with a semelfactive, instantaneous, telic meaning, with explicit assertion of the endpoint)
  • скокнува - secondary imperfective, used in contexts such as (authentic English examples which could be translated with the Macedonian secondary imperfective):
    Narrative present: "Enter MONKEY leisurely, looking about, throws up one or two things, then jumps in the box" (stage directions from an 1875 play)
    Iterative, habitual: "When an exception occurs, the processor always jumps to this instruction address, regardless of the cause" (from a modern IT textbook)
    Irrealis: "Intuitively, the sentence is not true if uttered in our world, whether the speaker actually jumps out of the window or not" (from a modern linguistics textbook)

My question is: how can I label these secondary imperfectives on Wiktionary? Currently, I've labelled them {{lb|mk|iterative}} to group them together in a list, but I think that some more meaningful label should be used, especially since {{auto cat}} has printed a misleading explanation at Category:Macedonian_iterative_verbs and since "iterative" can be confused with the frequentative meaning of verbs like knock, as mentioned above. A label that reads "secondary imperfectives" would also be unsatisfactory, since imperfectives derived from perfectives derived from imperfectives with a change of meaning are also technically secondary imperfectives, but do not display the properties laid out above. For example, in the derivational chain оди (to go, walk) > изоди (to walk until the end of a route, to complete a route) > изодува, the secondary imperfective is just a normal imperfective counterpart to изоди, because the meaning changes during the derivation of изоди from оди. Consequently, изодува does not mean "to go, walk" in narrative, habitual or irrealis contexts, and must be systematically distinguished from cases like скокнува. Finally, not labelling secondary imperfectives in any way is no good, because translating both скока and скокнува as "to jump" would mislead readers unfamiliar with Slavic languages.

The phenomenon in question should exist in other Slavic languages too, so perhaps Slavic-speaking contributors might be able to contribute more pertinently than others. Martin123xyz (talk) 08:26, 3 November 2021 (UTC)[reply]

@Martin123xyz: I don't think Russian has this construction (unprefixed derivations are already rare, and double derivations don't exist as far as I know), but I think {{lb|mk|frequentative}} is what fits best from what I understand. Do I understand correctly that second imperfectives denote repeated action over different occasions? Iterative means repeated action on one given time, rather than spread out over time. Thadh (talk) 09:21, 5 November 2021 (UTC)[reply]
@Thadh Thank you for the reply. Russian definitely has secondary imperfective verbs, i.e. imperfective verbs derived from a perfective verb itself derived from an imperfective verb (тратить > утратить > утрачивать; казать > показать > показывать). It's just that in most such derivational chains, the meaning changes at some point, mostly during the derivation of the perfective verb. Consequently, the secondary imperfective is not semantically equivalent to the primary imperfective, as is the case with the Macedonian secondary imperfectives at Category:Macedonian iterative verbs. I proposed починять as a Russian verb of this kind, since it seems to mean the same as чинить, but the fact you have not picked up on it leads me to assume that I was on the wrong track.
It should be noted that prefixation vs. suffixation is irrelevant. In Macedonian there are also cases like гори > изгори > изгорува (perhaps comparable to Russian гореть > сгореть > сгорать?), where the perfective is prefixed rather than suffixed, but the secondary imperfective still exhibits the properties listed above (narrative, habitual/frequentative and irrealis use), with the same semantics as the basic гори. This is different from cases like бие (to beat) > убие (to kill, perfective) > убива (to kill, imperfective), where the meaning changes during the perfectivization, such that убива and бие are two imperfective verbs with different meanings and a full range of imperfective uses, whereas гори and изгорува are two imperfective verbs with the same meaning such that only the former has a full range of imperfective uses, while the latter is restricted and statistically rarer.
Going back to the term "iterative", if you say that "frequentative" means repeated over a longer period, that's fine - we can use that term. However, we will not capture the narrative and the irrealis use discussed above. Could this be resolved by writing a Macedonian-specific definition for "frequentative", to be displayed at the top of the category of Macedonian frequentative verbs? As for verbs like "to knock", "to hum" and "to hop", would they then be labelled "iterative"?
Currently, the Wiktionary entries for хаживать and сиживать describe them as "iterative" rather than "frequentative", even though they refer to events spread out over time, contrary to what you are now proposing. Should this be changed? Martin123xyz (talk) 09:38, 5 November 2021 (UTC)[reply]
@Martin123xyz: According to en.wp (which we mostly use for grammatical explanations), frequentative is spread out repetition and iterative is momental repetition. I don't know what the options are for language-specific category explanations (you should ask others), but in the worst case scenario you could just say that's a grammatical feature that you expect readers to know, or you could create a dedicated template that redirects to a Macedonian appendix explaining this feature and automatically adds the cat.
Re Russian: yes, most prefixed (im)perfectives carry a certain change in meaning, like сгорать means "to burn up", so the second imperfective carries that meaning. Words like "починять" don't exist and would mean the same thing as the regular imperfectives. Thadh (talk) 09:50, 5 November 2021 (UTC)[reply]
Thank you for the discussion. I will look into defining "frequentative" and "iterative" for Macedonian somewhere. As for починять, there is a link to it at чинить and a page at the Russian Wiktionary with a real quote. It has also been included in several dictionaries, as you can see here. It is for this reason that I mistook it for an existing word. Martin123xyz (talk) 09:57, 5 November 2021 (UTC)[reply]
Oh, huh, guess I learned a new word today. Looks like an archaism, and it seems to mean the same thing as чинить, without any frequentative/iterative meaning attached to it apart from the regular imperfective functions. Thadh (talk) 10:14, 5 November 2021 (UTC)[reply]
Then you agree that this починять is a secondary imperfective not only in terms of its position in the derivational chain, but also in terms of how usable/normal it is. This is the case with all the verbs from Category:Macedonian iterative verbs, which must somehow be subordinated to their primary imperfective counterparts. If we just create an entry for изгорува and write that it means "burn", foreigners might think that it is the basic word for "burn" and start saying things like "??селото изгорува" for "селото гори" (the village is burning). Similarly, if we just create an entry for починять and write that it means "repair", foreigners might wrongly use it everywhere where they should be using чинить instead. This is the reasoning that prompted me to start labelling Macedonian secondary imperfectives and start the discussion. For the Russian cases, perhaps usage labels like "archaic", "rare term" and the like suffice, but because in Macedonian the secondary imperfectives also have a special grammatical status and are perfectly natural in the examples I gave in my original post (starting with the one with the monkey), a grammatical label is necessary. Martin123xyz (talk) 11:00, 5 November 2021 (UTC)

We need more and better labels (and glossary entries) for the higher register

At present, there are (literary), (formal), (solemn), which are good labels but don't cover the entirety of the higher registers. What's missing are labels (and corresponding glossary entries) for educated/erudite speech (Bildungssprache) and a general higher register (gehobene Sprache). I've heard this complaint uttered by at least one other editor before and I think there are non-standard labels like (lofty), (posh), (exalted) in circulation right now. What's more, loads of German entries currently lack a proper indication that they belong to the higher register, which probably goes back to the lack of adequate labels in general. Fytcha (talk) 13:21, 3 November 2021 (UTC)

I agree, but I'm a bit divided on this. Clarity and order are usually nice things, but too much bureaucratic paternalism (of the authors) can also be discouraging. The user, on the other hand, regards online reference works as a tool to be used and does not appreciate the work of art. The engineer is always inculcated: "as filigree as necessary, not as possible! The work is sufficient if it fulfills its purpose." And a user who knows how to look up is not too stupid to think and understand. So I have no problem finding labels in completely different categories after or in front of a term. I prefer that to inventing 3 different formats.
On the one hand, the attributes, like the text examples, should be easy to understand, even for people with 30% language skills. Well and good. On the other hand, sometimes more precision doesn't hurt. Not every phrase from fairy tales, for example, is automatically correct under dated or obsolete. Freedom is needed here because it is impossible to foresee all cases. I therefore advocate recommendations instead of hard rules and ask for a sense of proportion with regard to hair-splitting and usefulness - especially with terms that hardly anyone looks up. And have respect for the work of generations before us, including those before the invention of the Internet. Everyone who reinvents the bicycle should have a good reason for doing so.
But it can't hurt to create a correspondence table showing which word fields could be used in which superordinate expression. You can perhaps orientate yourself on existing, recognized dictionaries. A bot could try to gently unify different things that mean the same thing. (rare, seldom, rarely, uncommon, unusual) -> rare. You have to agree beforehand whether to abbreviate or not. I am in favor of obs. instead of writing obsolete. Short wins.
And if an input mask offers a selection of attributes when editing, these will certainly be used more often. With all the goodwill, I always have problems finding formatting templates and instructions at the right time. And it will be the same for many committed helpers. There is no ill will in this; the help system could still be expanded a little. Herr de Worde (talk) 17:53, 3 November 2021 (UTC)[reply]
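As a rough illustration of the correspondence-table idea in the comment above, here is a minimal Python sketch. The table contents are taken from the example given there (rare, seldom, rarely, uncommon, unusual -> rare); the names `CANONICAL_LABELS`, `unify_label` and `unify_labels` are hypothetical and do not correspond to any existing Wiktionary module or bot.

```python
# Hypothetical correspondence table: variant label -> canonical label.
# Only the "rare" group comes from the discussion; a real table would be
# agreed on beforehand and would be much larger.
CANONICAL_LABELS = {
    "rare": "rare",
    "seldom": "rare",
    "rarely": "rare",
    "uncommon": "rare",
    "unusual": "rare",
}


def unify_label(label: str) -> str:
    """Return the canonical form of a label, or the trimmed label if unknown."""
    key = label.strip().lower()
    return CANONICAL_LABELS.get(key, label.strip())


def unify_labels(labels):
    """Unify a list of labels, dropping duplicates but keeping first-seen order."""
    seen = []
    for label in labels:
        canonical = unify_label(label)
        if canonical not in seen:
            seen.append(canonical)
    return seen
```

Unknown labels pass through unchanged, which fits the "recommendations instead of hard rules" stance: the table only merges labels that have been explicitly agreed to mean the same thing.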
I'm inclined to agree. Wiktionnaire regularly uses terms like "soutenu" for an educated register (though "formal" could be our approximate equivalent, I suppose). English tends to have fewer distinctions when it comes to formality, which makes it hard to come up with universally understandable terms. That's likely why we haven't implemented them. Andrew Sheedy (talk) 18:44, 3 November 2021 (UTC)[reply]
I would also be on board with language-specific labels. A language usually has more precise terms to meta-describe its own registers and jargons than outsider languages (in this case English). See e.g. the already mentioned terms Bildungssprache, gehobene Sprache, soutenu, or things like 敬語, Burschensprache. Fytcha (talk) 18:50, 3 November 2021 (UTC)[reply]
Russian lexicography makes wide use of the label просторечие (prostorečije, simple speech), to denote a non-geographical register that should be avoided by educated speakers in most contexts, which doesn't really correspond to "informal". Allahverdi Verdizade (talk) 13:49, 5 November 2021 (UTC)[reply]
@Allahverdi, Fytcha In Russian we have typically rendered просторечье as {{lb|ru|low|_|colloquial}} for want of a better term; this classifies under CAT:Russian informal terms although perhaps it should do something else. Benwing2 (talk) 06:03, 10 November 2021 (UTC)[reply]
@Allahverdi Verdizade Oops. Benwing2 (talk) 06:04, 10 November 2021 (UTC)
Which is exactly why I think that forcibly merging the categories "informal terms" and "colloquial terms" for all languages was a naughty thing to do. Logically, the antonym of "informal terms" is "formal terms", right? However, "formal terms" is not the opposite of просторечье. Allahverdi Verdizade (talk) 12:56, 10 November 2021 (UTC)[reply]
@Fytcha: I use both Wiktionary and DWDS (in German) to look up German words. One issue with DWDS is that the labels are rather cryptic to me as an English speaker. I've resorted to keeping a mini-dictionary just for German dictionary labels, and it's one of the main advantages of an English dictionary that I don't have to keep referring to an additional list of words that don't often appear in everyday language. The words I've collected so far are: abwertend, derb, papierdeutsch, spöttisch, gespreizt, umgangssprachlich, bildlich, übertragen, salopp, gehoben, landschaftlich, veraltend, scherzhaft, fachsprachlich, & vertraulich. I don't think there are any which don't have a reasonably close English equivalent. RDBury (talk) 14:08, 19 November 2021 (UTC)
Interesting, maybe we could collect them in some appendix. – Jberkel 14:43, 19 November 2021 (UTC)

Proposal: add a colon in {{audio}} if the third argument is present

See for instance house#Pronunciation. All other pairs in the pronunciation sections are formatted like <Key>: <Value>, so the audio line stepping out of line in that regard is really inconsistent. To answer any concerns about double colons, I can run a quick grep on the Wiktionary database dump, or we could just run a bot to remove trailing colons from the third argument. Note that editors are already manually adding the colon to the third argument. Fytcha (talk) 16:29, 3 November 2021 (UTC)

  1.   Support Martin123xyz (talk) 12:31, 4 November 2021 (UTC)
  2.   Abstain I don't have strong feelings on this but I will say that the Key/Value thing isn't visually jarring for me, since the other entries are "Text: Text" and this is "Text: Media". —Justin (koavf)TCM 15:53, 4 November 2021 (UTC)
  3.   Support I had thought of doing this myself; I agree it looks better with a colon. Benwing2 (talk) 04:39, 10 November 2021 (UTC)
  4.   Support More aesthetically pleasing, IMO. - excarnateSojourner (talk | contrib) 09:01, 24 December 2021 (UTC)
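For illustration, the behaviour proposed in this section could be sketched roughly as follows. This is a Python sketch of the logic only, not the template's actual code, and the function name `audio_label` is hypothetical. It assumes the third argument is the caption text: any colon an editor added by hand is stripped before exactly one is appended, which is one way to address the double-colon concern without a bot run.

```python
def audio_label(label):
    """Return the caption to display before the audio player.

    A colon is appended only when a caption is present; trailing colons
    already typed by editors are removed first so none is doubled.
    """
    if label is None:
        return ""
    text = label.strip()
    # Drop any manually added trailing colon(s), plus whitespace before them.
    while text.endswith(":"):
        text = text[:-1].rstrip()
    return text + ":" if text else ""
```

With this approach existing entries that already end in a colon and entries that do not would render identically, so the cleanup of the third argument could happen gradually.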

Request for those who work on Samoan or Tongan (or anyone else)

I made matalafi and want to make sure that it looks correct to others. Thanks. —Justin (koavf)TCM 18:30, 4 November 2021 (UTC)

I'm very surprised that you don't even know how to format an English etymology! —Μετάknowledgediscuss/deeds 18:37, 4 November 2021 (UTC)
I want to be conservative about "borrowed", "descended from", "adapted from", "derived from", "cognate of", etc. so I don't use any structured data or templates but standard running text in English. Someone else who knows better can fix it. Cf. all the problems with the etyl template: no need to introduce more errors. —Justin (koavf)TCM 19:44, 4 November 2021 (UTC)[reply]
This is the wrong approach. Never use running text like this. If you don't know something, don't add it (or add a request template like {{rfe}} with your guess, or ping someone like me). This just goes under the radar and forces someone else to fix your mess. —Μετάknowledgediscuss/deeds 20:55, 4 November 2021 (UTC)[reply]
Another way to think of it is that having nothing there for etymology is a mess that needs to be fixed. Yes, templates should be used whenever possible but something is better than nothing. In the future, I'll add {{rfe}} along with the running text. —Justin (koavf)TCM 00:55, 5 November 2021 (UTC)[reply]

Names of people referred commonly by their surname

For example, at Gandhi, there is a sense “Mohandas Karamchand Gandhi”; similarly at Hitler, there is the sense “Adolf Hitler”. The problem is that there are countless people with the surname who are referred to in this way, for example, Indira Gandhi (the WP article itself refers to her many times as simply Gandhi). This also doesn't seem like dictionary material. While there are words like Gandhian which it is logical to include in a dictionary, because it means "relating to that particular person with surname Gandhi", their etymology can be given as From {{w|Mohandas Karamchand Gandhi}} {{suf|en||an}}. Hence, I do not think that we should include such senses. —Svārtava [tcur] 04:43, 5 November 2021 (UTC)

I disagree with eliminating these senses. If a figure has become famous enough that they have subsumed the meaning of the word, they should be included as one of its definitions. Here, "Mahatma Gandhi" is essentially the definition of Gandhi. When someone refers to "Gandhi," it can safely be assumed they are referring to Mahatma. That's useful information to include in a dictionary. Imetsia (talk) 15:55, 5 November 2021 (UTC)[reply]
My inclination is to include these when (a) the last name is used without any prior reference to the first name (people are expected to know which Hitler is being referred to; not so with Lincoln--could be Abraham Lincoln, but more context is required to make it clear); (b) the name is used in a general way outside of a given subject field (is "Montessori" unambiguously Maria Montessori outside of pedagogical literature?); (c) the name is only used this way in reference to one person (for instance, I think "Gandhi" without context always means Mahatma, not Indira Gandhi). Andrew Sheedy (talk) 16:49, 5 November 2021 (UTC)[reply]
I agree with most of this except that I think Lincoln also refers to Abraham Lincoln in most situations (at the very least in Europe). Thadh (talk) 17:02, 5 November 2021 (UTC)

Proposal: New abuse filter for Rhyme categories

As we have some pretty strict rules as to the permissible characters in {{IPAchar}} and its derivatives, it is only logical that we apply those same standards to the titles of rhyme category pages and as such trigger an abuse filter whenever somebody tries to create a rhyme category page containing such an impermissible character. See for instance the erroneously created page Category:Rhymes:Polish/aga. I'm not sure whether there needs to be special care taken on the part of bots or whether an abuse filter is enough. @Benwing2 as the owner of User:WingerBot. Fytcha (talk) 15:39, 5 November 2021 (UTC)

An abuse filter doesn't seem like the correct method. I would suggest putting these rules in a module and throwing an error. DTLHS (talk) 18:03, 5 November 2021 (UTC)
I agree with User:DTLHS here, although it has to be done carefully so as not to disallow legitimate rhymes. Benwing2 (talk) 06:00, 10 November 2021 (UTC)
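A minimal sketch of the "rules in a module, throw an error" approach suggested above, written in Python purely for illustration. It assumes that the problem with Category:Rhymes:Polish/aga is an ASCII g where the IPA letter ɡ (U+0261) is expected; that is an assumption, as is the small `SUSPECT_CHARS` table, which stands in for whatever rules the IPA templates already enforce. `check_rhyme` is a hypothetical name.

```python
# Hypothetical table of look-alike characters and their IPA counterparts.
# The real list would come from the existing IPA validation rules.
SUSPECT_CHARS = {
    "g": "ɡ",  # ASCII g (U+0067) vs. IPA script g (U+0261)
    ":": "ː",  # ASCII colon vs. IPA length mark (U+02D0)
    "'": "ˈ",  # apostrophe vs. IPA primary stress mark (U+02C8)
}


def check_rhyme(rhyme: str) -> None:
    """Raise ValueError if the rhyme contains a known impermissible character."""
    problems = [(ch, SUSPECT_CHARS[ch]) for ch in rhyme if ch in SUSPECT_CHARS]
    if problems:
        details = ", ".join(
            f"{bad!r} (did you mean {good!r}?)" for bad, good in problems
        )
        raise ValueError(f"invalid character(s) in rhyme {rhyme!r}: {details}")
```

Because the check only rejects characters on an explicit list, legitimate rhymes are not disallowed by accident; the trade-off is that new kinds of mistakes stay undetected until the list is extended.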

Other names for Narua

The Narua language (nru) seems to have many names. Wikipedia calls it "Na", while there is also "Naxi", "Mosuo" and "Moso". I originally made a direct request at Module talk:languages/extradata3/n for these to be added, but Surjection didn't feel comfortable unilaterally adding these names. Any thoughts, objections, comments? This, that and the other (talk) 02:44, 6 November 2021 (UTC)[reply]

@Surjection No-one seems to care. Any objections to adding them? As far as I can tell, this only impacts the display of the table on Category:Narua language, but I may well be missing something. This, that and the other (talk) 05:23, 24 November 2021 (UTC)[reply]
@This, that and the other Looking into this, one issue is determining which of these names actually refer to nru vs to nxq. Wikipedia appears confused on this point, linking the Mosuo to Naxi in some entries (including the main entry) but using them contrastively in w:Naic languages. It seems like Na and Mosuo do signify nru, but Naxi is nxq (not nru), unless we take the Chinese approach of merging them...? - -sche (discuss) 20:52, 6 December 2021 (UTC)[reply]

Usefulness of translations for most Latin present participles

I was recently astonished to find that Latin present participles contained translations unlike German present participles that merely redirect to the infinitive by means of Template:present participle of (like schwimmend). The entries dedicated to the German infinitive and to the Latin 1st person singular present form offer comprehensive and detailed translations. Only a few Latin present participles have a meaning that evolved beyond the mere translation with the English -ing gerund (like repens, colens etc.), but even those can be equipped with two sections, one with the aforementioned template and one with the additional meaning (like the German entry laufend). Is there any justification for the widespread use of full-fledged translations for those verb forms instead of the Template:present participle of? Bogorm converſation 10:52, 7 November 2021 (UTC)[reply]

Agreed that it is (in most cases) unnecessary to provide a translation. Is it worth going to the trouble of deleting them, though? The Nicodene (talk) 21:25, 8 November 2021 (UTC)

Translingual animal emojis

Looking through some of the animal emojis, I've noticed that @Kephir and @Koavf have deleted quite a few of these entries (🐥, 🐪, 🐫, 🐭) while leaving many others up that lack attestation just as much (click on the ones I've listed and scroll to the left or right, you'll come across many, many emoji entries that consist of nothing more than the Unicode name, e.g. 🐬). I personally don't really care about what we do: I don't think these entries are particularly useful but they're not harmful either. The only peculiar thing is the difference in enforcement. We should either allow them all without attestation (and as such recreate the already twice deleted ones I've listed above) or get rid of the others too. I don't like inconsistency. --Fytcha (talk) 21:37, 8 November 2021 (UTC)[reply]

I think they should all be kept and we should have entries on all emoji characters and sequences. I just deleted them due to consensus. —Justin (koavf)TCM 01:16, 9 November 2021 (UTC)
They should all be deleted. DTLHS (talk) 01:17, 9 November 2021 (UTC)
I am of the opinion that we should provide an entry (or at least a redirect) for every printable Unicode codepoint, with a description and obvious definitions (🐫 = camel, ☂ = umbrella). It seems like an eminently useful thing for a modern Internet dictionary to do. Most other sites that have a page per codepoint are fully automatically generated. I'm fully aware that CFI as it stands does not necessarily allow this, but that doesn't shift my opinion. This, that and the other (talk) 11:25, 9 November 2021 (UTC)[reply]
Meh. I bet all this will get more complicated over time, e.g. Apple changing the gun emoji to a water-pistol in response to complaints about violence; will the meaning of the codepoint change from "gun" to "water pistol"? And that "information kiosk woman" who has somehow become a complaint emotion on Twitter. Ha, emoji in Unicode were always a bad idea. I don't regard explaining 🐫 as "camel" as a definition, more of a technical mapping. Not to mention the appallingly-thought-out flags based on ISO codes, not properly allowing for historical flags and future changes to flags. Equinox 11:33, 9 November 2021 (UTC)[reply]
We have Translingual entries for Han characters that lack a definition, so there is precedent for that. (Incidentally I've always wondered why the backstory-type info for Han characters is under the "Chinese" header. There must be a good reason, but it's lost on me.) Flag emojis, well, that's a whole different ball game (the pedant in me is obliged to point out that they're technically not codepoints). This, that and the other (talk) 11:50, 9 November 2021 (UTC)[reply]
IMO, the ones like 💁, 🍑, 💅 that have acquired special meanings beyond their Unicode character names are among the emojis most worthy of entries, since someone could be unaware of the significance, or possibly even have a font where it renders differently, and therefore look for the meaning. I'm not sure what someone looking up 🐬 would expect to find, though, other than the obvious "dolphin". 70.175.192.217 01:50, 15 November 2021 (UTC)[reply]
I'd say they should all be deleted unless they have some additional idiomatic senses, as discussed above. We don't have technical characters and just list their Unicode names as their definitions either, and this isn't far removed from that. — surjection ⟨??⟩ 21:38, 9 November 2021 (UTC)
I'm open to the possibility of putting them all in an appendix and setting up redirects to that appendix in case someone tries to look them up. But I think the Mainspace should only have the idiomatic ones. Andrew Sheedy (talk) 22:49, 9 November 2021 (UTC)[reply]
I "sub-commented" above. I feel it's in our interests to be able to "say something" about every character; however, I don't think it's worth creating special entries to say that 🐫 is a camel (when that is a reiteration of Unicode standards rather than an actual definition). We already have some nice templates. Under current rules (and pretending this was RFD; of course it isn't) I would say delete to these; however, in the long term, I wouldn't object to some automatic "Unicode page" thing that merely shows the codepoint, numbers, and official name, etc. Because someone will look these things up. It just isn't lexicography. Equinox 07:44, 10 November 2021 (UTC)[reply]
And yes I know we already have the special template that shows what a character is. But right now it isn't a page until it's created, and usually this would require a meaning beyond the "picture of a camel". Oh well you get the idea. Equinox 07:45, 10 November 2021 (UTC)[reply]
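To make the "automatic Unicode page" idea mentioned in this thread concrete, here is a small Python sketch using the standard `unicodedata` module. It only shows the kind of information (codepoint, number, official name) such a page could generate automatically; the function name `describe_character` and the dictionary layout are hypothetical, and the names returned depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def describe_character(char: str) -> dict:
    """Return the codepoint, decimal value and official Unicode name of one character."""
    if len(char) != 1:
        # Sequences such as flag emojis consist of several code points
        # and would need separate handling.
        raise ValueError("expected a single code point")
    return {
        "codepoint": f"U+{ord(char):04X}",
        "decimal": ord(char),
        "name": unicodedata.name(char, "<unnamed>"),
    }
```

For the two characters used as examples above, `describe_character("🐫")` reports the name BACTRIAN CAMEL and `describe_character("☂")` reports UMBRELLA, which illustrates the point that this is a technical mapping from the Unicode standard rather than a definition.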

Looking for Community Consensus

There is an inactive discussion on Wiktionary talk:Administrators. If this subject is not an allowed topic, please kindly remove it. GareginRA (talk) 19:47, 9 November 2021 (UTC)

Complaints about administrators can be brought to the Beer Parlour (i.e. here), as can discussions about language-specific policies. But since Armenian looks to me like chicken scratches drawn with an Etch A Sketch I will defer to the Armenian administrators on policy and blocking related to that language. Vox Sciurorum (talk) 18:04, 10 November 2021 (UTC)[reply]

Make Template:rhymes point to the new category pages

@Surjection I see you've done a lot of good work moving info (e.g. the syllable count) from the old Rhymes:... pages to the {{rhymes}} template. I think we should change {{rhymes}} to point to the new category pages instead of to the old Rhymes:... pages, and eventually delete the latter. Thoughts? Apologies if this has been discussed already, I've been gone for a couple of months and don't see any such proposal in the Beer Parlour in Sep, Oct or this month. Benwing2 (talk) 04:43, 10 November 2021 (UTC)[reply]

It has been discussed before at Wiktionary:Beer parlour/2021/August#Retiring Rhymes:. Out of my eight-step program, basically only the first step has now been done. — surjection ⟨??⟩ 10:10, 10 November 2021 (UTC)
@Surjection Thanks. However, I think it's too conservative to defer changing the {{rhymes}} template to step 7; at this rate, this will never happen, and people will still feel the need to manually update the old Rhymes: pages. I'd instead suggest moving the Rhymes:... pages to the appendix, like you suggest, and then going ahead and changing the {{rhymes}} templates to point to the category space (or switching the order of these two steps). Once we change the {{rhymes}} template, we can delete any of the old Rhymes: pages that don't have any extra information on them (by bot if done carefully), which makes it clearer which Rhymes: pages need to manually have info moved to the corresponding Category: page. BTW it should also be possible, I think, to autogenerate intermediate pages like the category equivalent of Rhymes:Italian/u-. Benwing2 (talk) 04:08, 12 November 2021 (UTC)[reply]
I don't have the time to look into it in greater detail, but I'm not strictly opposed to any plans to expedite the change. There are other considerations to be taken as well, such as the "Rhymes" link on the main page. — surjection ⟨??⟩ 11:44, 12 November 2021 (UTC)

Sudovian (Narew Baltic) language: code, orthography?

Hello,

There is a manuscript called "the pagan speeches of Narew" (which I have copied here), and you can read more about it on Wikipedia. Basically, it's a dictionary of ~200 words in an unknown Baltic language, copied by an amateur (the original source is lost).

Many scholars believe that the language attested is Sudovian/Yotvingian (language code: xsv), but others claim that it is e.g. a dialect of Lithuanian with strong Germanic (Yiddish) influence. A lot of Latvian etymologies here include {{cog|xsv|...}}, and some other sources about Baltic etymology just list it as Sudovian/Yotvingian, although the Altlitauisches etymologisches Wörterbuch (ALEW) calls it "narewisch" ("nar." for short) to be agnostic. This raises the question, should the code xsv be used? I think it could be appropriate, but maybe a disclaimer could be added that the language is uncertain. I'm not sure that Category:Undetermined language is the right way to go though.

Although the more well-known extinct Baltic language, Old Prussian, is a bit of a quagmire, I think that Sudovian/Narew Baltic doesn't have to end up like that. The main problems with Old Prussian are outright neologisms, confusion over normalized/reconstructed forms, and lack of standardized orthography.

Like Old Prussian, there are people who have created Sudovian neologisms or reconstructions, e.g. the "Suduva" website, which states: "Today, the Prussian language is enjoying a revival [...]. Perhaps a restored Sūdovian-Yotvingian language [...] will also fare as well.". Even lt.wiktionary.org has a bunch of neo-Yotvingian entries: "Category:New Yotvingian words", e.g. lt:wendorėdas is not attested. Since there is only one source for Narew Baltic (not even definitely Sudovian) words as far as I am aware, except for reconstructions based on toponyms, the issue of telling whether a word is attested is pretty easy to resolve.

That still leaves orthography as an issue. The actual script used in the Narew manuscript is based on Polish, so the characters ż and ł occur in some words. Moreover, an s-character that looks more like ſ or ʃ is used. (ALEW uses ſ, but uses a font that looks like ʃ. The original papers explaining the document used ʃ. lt.wiktionary.org and some other sources use s.) I'm not sure how faithful we want to be to the original writing format.

Thanks, 70.175.192.217 23:57, 11 November 2021 (UTC)[reply]
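One way to picture the orthography choice above is to keep the manuscript spelling as the headword and derive a plainer search form from it. The following is only a minimal sketch under that assumption: the function name, the mapping, and the sample string are hypothetical (the sample is an invented placeholder, not an attested Narew form), and folding both ſ and ʃ to s is just one of the conventions mentioned above.

```python
# Hypothetical folding table: both long-s-like characters become a plain "s".
# U+017F (ſ, as in ALEW) and U+0283 (ʃ, as in the original papers).
LONG_S_TABLE = str.maketrans({"\u017f": "s", "\u0283": "s"})


def normalize_long_s(form: str) -> str:
    """Return a search-friendly spelling of a manuscript form.

    Only the long-s variants are folded; Polish-based letters such as
    ż and ł are left untouched.
    """
    return form.translate(LONG_S_TABLE)


# Invented placeholder string, used only to exercise the mapping.
placeholder = "ſłʃż"
normalized = normalize_long_s(placeholder)
```

Keeping the diplomatic spelling and the folded spelling side by side would let the entry stay faithful to the manuscript while still being findable under s.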

Suggestion on Malay: Soft-redirecting Jawi entries to Latin

Currently, we have separate full-fledged entries for the same Malay word in both Jawi (Arabic script) and Rumi (Latin script). For example, we have ribu and ريبو, both defining the word as "thousand". I suggest soft-redirecting the less commonly looked-up Jawi entries to their equivalents in Rumi, like how we soft-redirect entries in simplified Chinese to traditional Chinese (e.g. 单位 (dānwèi) to 單位/单位 (dānwèi)), or entries in hiragana to kanji when the word is more commonly spelt in kanji (e.g. たんい to 単位). Jonashtand (talk) 14:20, 14 November 2021 (UTC)[reply]

Support. This will help to reduce redundant (and potentially also divergent) information about the same thing. Basic information (word class, meaning) should IMO however remain visible in the Jawi entry, even if it is just one click away. I have done something similar (with the kind help of User:Fenakhay) for Makassarese Lontara entries. –Austronesier (talk) 18:26, 14 November 2021 (UTC)[reply]
You can add a gloss to the link, like:
  1. Jawi spelling of tupai (squirrel).
Vox Sciurorum (talk) 14:56, 15 November 2021 (UTC)[reply]
We are doing that ↑ with the template {{ms-jawi}}, unless we are lazy because there are a lot of them. You can help. It will be appreciated. --Octahedron80 (talk) 00:46, 16 November 2021 (UTC)[reply]
@Octahedron80 OK. So there has been a consensus that Jawi entries are to be soft-redirected to Rumi entries? Should we write this in WT:About Malay?
Could someone write up the consideration, please? --Octahedron80 (talk) 00:35, 21 November 2021 (UTC)[reply]

Punjabi pairī̃

Why should I be unable to find the transliteration pairī̃ for Punjabi for 'in the foot'? The Punjabi noun ਪੈਰ (pair, foot) is in Wiktionary, and a sparse declension is shown for it. I wanted to link to it from Wikipedia to explain English pairin ("Gurmukhi subscript") (includability TBD). --RichardW57m (talk) 13:04, 15 November 2021 (UTC)[reply]

Wikipedia states that the locative/instrumental case is now considered vestigial and is mostly confined to a few set adverbial expressions. It looks like an IP removed both the locative/instrumental and ablative cases from the main Punjabi noun declension table template in this 2017 edit. I could not find any discussion about the change. Apparently nobody has cared enough until now to complain about the cases' absence. 70.175.192.217 00:40, 16 November 2021 (UTC)[reply]
I suspect the 'IP' was actually @AryamanA. Anyway, it seems that the answer is that it now counts as a derived term, and would be a lemma of its own. Or are there objections to that approach? --RichardW57 (talk) 08:18, 17 November 2021 (UTC)[reply]
I don't think AryamanA was the IP, given that the two talked on the anon editor's talk page. Anyway, I'm not sure whether it would be a noun form or a lemma adverb, or both. Wiktionary:About Punjabi doesn't exist. Perhaps someone else would know the policy here, or you could just be bold with what makes the most sense to you. 70.175.192.217 21:42, 17 November 2021 (UTC)[reply]
I think I should be bold, it's just that the quotations won't be very good, and might even be wrong. The form is common enough. I brought the question here because it's a matter of policy, but we don't seem to have a community of editors of Punjabi. --RichardW57 (talk) 20:14, 18 November 2021 (UTC)[reply]
The situation reminds me of the illative case in Lithuanian, which we also don't provide in declension tables, despite still being used in spoken language. However, I don't think the templates ever included that. (Edit: on the other hand, Hindi's vocative is considered "obsolete" according to Wikipedia yet we provide that form of nouns. I wonder if a compromise of providing the terms, but with a little asterisk and footnote at the bottom of the table, would be okay. Demo here.) 70.175.192.217 00:40, 16 November 2021 (UTC) edited at 21:42, 17 November 2021 (UTC)[reply]

A user has been mass-replacing {{female equivalent of}} with {{n-g|feminine equivalent of}} in (mostly) German entries. The reason they are doing this is because these nouns can not only be used to refer to female people but also to other female nouns in similes/metaphors for which they provide correct examples in their edit messages: Lebensgefährtin, Komplizin. It's worth noting that the documentation of {{female equivalent of}} states: "It is used for nouns which occur in pairs for different natural genders of the referent,". The word Lebensgefährtin is overwhelmingly used in the sense of Lebensgefährte while referring to a woman but it is also true that it can and has been used for merely grammatically female entities in the context of metaphors and other rhetorical devices. The annoying thing about their replacement is that the word doesn't show up anymore in Category:German_female_equivalent_nouns, a category bearing the description "German nouns that refer to female beings with the same characteristics as the base noun." which indisputably applies to Lebensgefährtin, it's just that in addition to referring to female beings, it may also refer to grammatically female entities in general. If we were to go with the logic this editor is applying, then the above category would be rather sparse.

Can we discuss and come to a consensus on this? I'm really not a fan of the status quo; I believe that these terms belong in that category and I don't see any reference to metaphorical senses over at Lebensgefährte either (I think there's even a policy against that), so why honor the rare metaphorical use of Lebensgefährtin by removing it from the category? If we really wanted to honor it, the way to go in my opinion would be to add a second sense below {{female equivalent of}}. Fytcha (talk) 14:47, 15 November 2021 (UTC)[reply]

This anon's (perhaps B-Fahrer (talk • contribs)?) main problem seems to be the use of "female" instead of "feminine". There used to be a template {{feminine equivalent of}}, but it was deleted with a redirect to {{female equivalent of}} (see related discussion). As pointed out there, we already have {{feminine of}} which can be used in these cases. Removing entries from the category is not OK; I wasn't aware the edits had this side effect. – Jberkel 15:25, 15 November 2021 (UTC)[reply]
Revert these edits and add additional senses if necessary. Ultimateria (talk) 16:11, 15 November 2021 (UTC)[reply]
@Ultimateria: In that case, tweaking the definition of {{female equivalent of}} might be helpful. The only reason why I hesitated to roll back those changes was because of the wording "It is used for nouns which occur in pairs for different natural genders of the referent, one referring to a male individual and another referring to a female individual.". Fytcha (talk) 16:17, 15 November 2021 (UTC)[reply]
@Ultimateria: I've reverted them but the editor reverted my reverts. To reiterate: All those words are grammatically the female equivalent but they are not exclusively used to address entities with a natural gender of female. We should either 1. change the documentation of {{female equivalent of}} so as to make it clear that it is the grammatically female form without necessarily having to only refer to entities of female natural gender or 2. create a new template that we can paste into all these articles as a second sense. However, writing out a {{ngd}} and applying a category really can't be the solution (inconsistent, error prone, much more typing) and the longer we wait now, the more we will have to fix later. --Fytcha (talk) 23:09, 17 November 2021 (UTC)[reply]
@Fytcha: There is nothing wrong with the template. Option 1 doesn't work; see my comments below about productora. Just because it refers to a company doesn't mean it isn't also "a productor who is female". If I had to guess, B-Fahrer is assuming that stubs with "female equivalent of X" as the only definition are as complete as they'll ever be, but doesn't realize that they are missing senses (even though BenWing made it clear in the RFD discussion a year ago and I mentioned it below a couple days ago). I'll revert the rest of their edits. Ultimateria (talk) 02:15, 18 November 2021 (UTC)[reply]
Your reverts were reverted again (with snarky summaries). In the majority of these cases {{female equivalent of}} is correct as the primary sense, but some need to be checked individually. – Jberkel 13:53, 18 November 2021 (UTC)[reply]
@Ultimateria: Unfortunately, half your edits were reverted again so now it's all very inconsistent again. Some articles now look like this: Unterstützerin. Is there any value in having both female equivalent and feminine equivalent as different senses? To cite them separately perhaps? B-Fahrer's only argument seems to be that female doesn't apply to words that are only grammatically female but sexless regarding the natural gender (which is debatable, see sense 4 of female; which is why I initially proposed changing the wording in the documentation of {{female equivalent of}} because it currently does bolster his claim). The changes made to Unterstützerin are equivalent to changing the article Unterstützer to having three senses: 1. supporter (male human) 2. supporter (human of unspecified sex) 3. supporter (sexless entity). I don't see any value whatsoever in doing this. Even if we decided that we wanted female and feminine as two separate senses (I hope not), {{n-g}} is not the way to go. Fytcha (talk) 20:45, 20 November 2021 (UTC)[reply]
@Fytcha: No value. I have no idea how that's supposed to be parsed; I'll remove the feminine equivalent definition from that page. I'll clean up the rest of these entries and deal with B-Fahrer if they come back. I did block one of their IPs for 24 hours for edit warring and invited them to participate in the discussion. Ultimateria (talk) 02:02, 22 November 2021 (UTC)[reply]
I think there are a few forms where a special treatment is warranted (like Herstellerin, Klägerin), but a usage note is probably better than having confusing separate senses. – Jberkel 10:57, 22 November 2021 (UTC)[reply]
@Jberkel: I haven't touched the pages with quotes yet because I'm undecided on how to handle them. How about the definition "female equivalent of X" with the note "May also refer to non-human entities of feminine grammatical gender"? It explains the situation, but I'm hesitant to propose anything that could probably be spread across thousands of pages, when the information belongs in a grammar and not a dictionary IMO. I guess I can live with it. Thoughts on the wording? Ultimateria (talk) 00:28, 23 November 2021 (UTC)[reply]
@Ultimateria: Is reintroducing {{feminine equivalent of}} a possibility? We could just enable it for German for the time being and harshly demand citations for every use of the template, which would alleviate the issues brought up in the RfD (namely that it is just used interchangeably with {{female equivalent of}} with no semantic distinction). Fytcha (talk) 16:21, 25 November 2021 (UTC)[reply]
@Jberkel: Thanks for linking to that discussion. By the way, I don't think {{feminine of}} may be used here. Its documentation states: "This template should be used when there is no singular/plural distinction, or this distinction is irrelevant." I also think that it's likely that that user is behind the IP: The IP started editing right about when the user stopped[1] and additionally that user was very vocal in the discussion surrounding exactly these templates. Fytcha (talk) 16:26, 15 November 2021 (UTC)[reply]
As pointed out before, not just by me but also e.g. @Mahagaja (here), "female" simply is incorrect, as dozens of examples (provided in version history and sometimes in entries) prove.
And in Romance languages too, it's often about gender than sex (see here for more if needed).
And as for the category, albeit it only fits partially (for a limited usage, when the -in term refers to living beings like humans or some animals), it can also be added manually by adding [[Category:German female equivalent nouns]] to the bottom of the entry.
--18:40, 15 November 2021 (UTC)
Re: "it's often about gender [more] than sex": those are simply separate senses, and we already treat them as such. A feminine term for a type of company is obviously not the "equivalent" of anything, it just means that that specific definition is not covered by the "female equivalent" template. I've rearranged Spanish productora to preserve your quotes while matching the page to our formatting norms. (Unfortunately the English glosses weren't great, so I had to improvise.) Ultimateria (talk) 00:11, 16 November 2021 (UTC)[reply]
Re: ""female" simply is incorrect" See female: 4. (grammar, less common than 'feminine') Feminine; of the feminine grammatical gender. --Fytcha (talk) 23:09, 17 November 2021 (UTC)[reply]

"sufficient", "ample" etc. as determiners

(This issue has arisen out of the RFD for the adjective sense of "enough".) It appears to me that the underlined words in the following contexts, and perhaps some other similar words too, are not adjectives, as we (and other dictionaries) presently imply, but are in fact determiners.

we have sufficient bread
we have ample bread
we have adequate money

This is on the basis that these words do not describe what kind of bread, or kind of money, as adjectives should. Certainly, if we believe that "enough" in "enough bread" is a determiner, then there seems no reason why e.g. "sufficient" in "sufficient bread" should not also be a determiner (yet in "a sufficient reason" it could be construed as an adjective).

However, before I make potentially quite wide-ranging changes along these lines, please say whether you agree/disagree that we should add determiner sections for words in usages such as these (there is in fact already a determiner section at "sufficient", but it is presently an oddity dealing only with pronoun-like usage). Mihia (talk) 18:35, 15 November 2021 (UTC)[reply]

CGEL (2002) (which calls the word class determinatives) lists enough and sufficient as sufficiency determinatives. The authors mention that enough and sufficient can appear in "fused head" constructions, which seems to be a requirement for something to be a determinative, e.g.
You've said enough to convince me.
I don't have much money with me, but sufficient for a taxi.
I don't think ample and adequate can be used in such constructions.
When used as a determinative, sufficient has a quantifying sense. Otherwise it seems like an adjective. I think enough may not ever be an adjective in current English, at least not in my idiolect. DCDuring (talk) 16:35, 16 November 2021 (UTC)[reply]
@DCDuring: Thanks for that information, which confirms my own feelings about "sufficient" at least. To me, "I have ample for a taxi", implying ample money, is fine, while "I have adequate for a taxi" is a bit more marginal. I did originally think that other similar words ("quantifying" determiners presently listed as adjectives) would come to light, but so far I haven't been able to think of any. Mihia (talk) 19:01, 20 November 2021 (UTC)[reply]
In fact, another example appears to be scant. In the usage example for sense #1, allegedly an adjective, "Mary had scant reason to believe John", it seems that Mary did not have reason that was "scant", but in fact had little reason, and indeed even the definition of #1, "Very little, very few" is that of a determiner. Use of "scant" in the "fused head" construction seems somewhat unusual, but even so I readily found e.g. "Grace smiled, while inside her heart fury brewed and boiled, and it had scant to do with Brenda", "Cap had scant to say but he began to do John small kindnesses in return", "As a philosopher of religion I have had scant to offer", etc. Mihia (talk) 22:17, 20 November 2021 (UTC)[reply]
Those cites, if durable, are enough for inclusion, but should some of these be marked rare for now? Sufficient seems to clearly have become a determiner. I wonder how long ago. It's not easy for words to break into the relatively closed sets of function words. DCDuring (talk) 15:31, 21 November 2021 (UTC)[reply]
I would say that "sufficient" has always been what it is and meant what it does, but of course traditionally all(?) determiners were called adjectives, so I think it's a question of the terminology catching up to the meaning rather than the meaning changing. Ditto for "ample" and "scant". "adequate" is perhaps slightly more borderline, but even here I see two interpretations of e.g. "We have adequate supplies", one where "adequate" describes a property of the supplies (adjective), and one where it describes the quantity of supplies (determiner). Furthermore the determiner uses e.g. "she has scant reason", or "she has ample reason" seem to me to be routine uses of the words. The only cases mentioned that I see as not common/normal are the "fused head" uses of "scant" and "adequate". Mihia (talk) 18:46, 21 November 2021 (UTC)[reply]

can the page for "reexida" be removed?

The creation of the page "reexida" was an accident and it's a spelling mistake for the Catalan word "reeixida". Kyning (talk) 04:02, 16 November 2021 (UTC)[reply]

Done.   AugPi 04:29, 16 November 2021 (UTC)[reply]

Section statistics

I made a page with some statistics about existing sections. It shows how many times a section is used, and at what level. The tables are split into Languages, POS sections mentioned in WT:POS, other sections mentioned in WT:ELE, and other Nonstandard sections.

The list of Nonstandard sections gives some interesting insight into possible additions to WT:ELE: At the top of the list is Statistics, with 27643 entries, which looks like just a bunch of information about the popularity of a given name. Meanwhile, Trivia is explicitly allowed, but only used 48 times.

Other non-sanctioned categories with more than 1000 uses include: Compounds, Readings, Idiom (explicitly disallowed), Derived characters, Alternative scripts, and Adjectival noun.

Per WT:POS, a number of section names are explicitly forbidden. Each is listed below, followed by the matching sections found in use:

  • Abbreviation, Acronym, Initialism
    Abbreviation (5), Abbreviations (383)
  • “(POS) form”: Verb form, Noun form, etc.
    Affixed forms (537), Runic forms (7), and many more with just a handful of uses
  • “(attribute) (POS)”: Transitive verb, Personal pronoun, etc. (with the exception of Proper noun)
    Adjectival noun (1389), Verbal noun (408), Dependent noun (37), Stative verb (23)
  • Cardinal number, Ordinal number, Cardinal numeral, Ordinal numeral
    Ordinal number (519)
  • Clitic, Gerund, Idiom
    Clitic (25), Gerund (75), Idiom (7266), Idioms (428)

I'm not making any suggestions here, but maybe someone with more knowledge than I have can use this to propose something concrete. In the meantime, it's proven to be helpful for catching some existing typos. If a given section has fewer than 100 uses at any level, its name is clickable and you can see which pages it's used on. JeffDoozan (talk) 01:16, 19 November 2021 (UTC)[reply]
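For readers curious how per-level section counts of this kind can be produced, here is a minimal sketch. It is not the script used for the statistics page; the function name and the two sample pages are hypothetical, and it assumes headings follow the usual wikitext form with balanced equals signs (for example ===Noun===).

```python
import re
from collections import Counter

# A wikitext heading: 2 to 6 equals signs, a title, then the same run of
# equals signs again (the backreference \1 enforces the balance).
HEADING_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$", re.MULTILINE)


def count_sections(pages):
    """Count (section name, heading level) pairs over an iterable of wikitext strings."""
    counts = Counter()
    for text in pages:
        for match in HEADING_RE.finditer(text):
            level = len(match.group(1))  # "===" means level 3
            counts[(match.group(2), level)] += 1
    return counts


# Hypothetical sample pages, only to exercise the counter.
sample_pages = [
    "==English==\n===Noun===\n====Derived terms====\n",
    "==English==\n===Idiom===\n",
]
section_counts = count_sections(sample_pages)
```

Keying the counter on both name and level is what makes it possible to spot a section that is valid at one level but misplaced at another.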

Nice. Just FYI, the display of the last section, "WT:ELE", screws up in my browser (Edge) such that not all the numbers are visible. However, if I reduce the zoom from 100% to 80% it is all visible. It may be a browser bug. On a point somewhat related to the list, present usage of sections "Derived terms", "Related terms" and to some extent "Coordinate terms" is presently a random and inconsistent mess across many English-language entries. I sometimes wonder whether we should auto-merge at least "derived" and "related" since it seems impossible to enforce usage of these to be kept to what the documentation stipulates. Mihia (talk) 20:25, 20 November 2021 (UTC)[reply]
I don't know why it's not displaying for you; it's just a wiki table wrapped with {{rel-top}} and {{rel-bottom}} to make it collapsible. If anyone knows of a better way to display it, I'm happy to make changes. JeffDoozan (talk) 14:39, 21 November 2021 (UTC)[reply]
The columnised table may be "too much" for browsers. I looked at it in Chrome and the section that breaks in Edge displays OK, but other sections are wrong, with overlapping and clipped text. Mihia (talk) 23:25, 21 November 2021 (UTC)[reply]
Are you using custom CSS or an extension that might be altering the table? The table should only be 7 columns wide, it fits comfortably on screen even on my phone's browser. JeffDoozan (talk) 01:06, 22 November 2021 (UTC)[reply]
As far as I know, I have not customised anything. When I say "columnised table", I mean that the whole table is run across two vertical columns on the page, side by side, reading all the way down the left column, then back up to the top of the right column and down again, so 14 columns in all across the screen. In certain cases the right page column (seven table columns) overlaps the left page column and/or is right-clipped so that content is not visible. Mihia (talk) 10:28, 22 November 2021 (UTC)[reply]
That's weird; on my devices I see just a single 7-column table with nothing alongside it. I wonder if the skin you're using has some effect. If you open the page in a new private browser window, where you're not logged in to Wikimedia, does it still show up as a "columnised table" for you? Is anyone else seeing this? JeffDoozan (talk) 00:29, 23 November 2021 (UTC)[reply]
Yes, it still displays the same. Don't the "rel-top" .. "rel-end" templates always create this two-column format? What do you see below? Do you not see two columns?
To me, it looks as if applying this two-column layout to a table is "too much" for browsers (well, Edge and Chrome anyway), or cannot be made to fit, even though probably you have not actually done anything "wrong". I don't know why you would not be seeing two columns though. Mihia (talk) 15:31, 23 November 2021 (UTC)[reply]
You're absolutely right, I do see two columns with your example, but not on my page (using Firefox). I didn't realize that {{rel-top}} split the data into two columns. I've removed the {{rel-top}} and just added <div> tags to apply the same formatting without the two column weirdness. I hope it's more readable now. Thank you for your help troubleshooting this. JeffDoozan (talk) 00:27, 24 November 2021 (UTC)[reply]
Yes, it all looks to display correctly for me now. Mihia (talk) 13:03, 26 November 2021 (UTC)[reply]
Thank you for generating this list. I've had a lot of fun doing cleanup thus far.
The only proposal I could think of right now is codifying that parts of speech may appear in their plural under a ====Derived terms==== header as is the case in e.g. ق_ط_ف. This only really applies to Arabic roots from what I've seen so it should be written into Wiktionary:About_Arabic but on the other hand, there's no real reason why other languages' derived terms sections may not also be categorized by the part of speech (there's already =====Compounds===== after all) apart from the fact that lemmas usually don't have so many derived terms of so many different parts of speech that this is merited. Fytcha (talk) 19:42, 24 November 2021 (UTC)[reply]

Is there a reason why we don't have a template for interlinear glosses?

It would be useful for some {{ux}}es, see for instance the examples that I wanted to clean up in yardli or gardidi. Wikipedia seems to already have a template for that: Template:Interlinear. Fytcha (talk) 02:36, 19 November 2021 (UTC)[reply]

I guess the reason is we're a dictionary, not a grammar book. Interlinears help to understand grammar, but are next to useless to understand words. MuDavid 栘𩿠 (talk) 03:57, 19 November 2021 (UTC)[reply]
But they would help with understanding examples and quotations if one had only a shaky grasp of the language. --RichardW57 (talk) 05:00, 19 November 2021 (UTC)[reply]
@MuDavid: They should just be one more possible mode of translation ({{ux}} already allows literal translation in addition to idiomatic translations) which could be very useful for documenting some of the rarer and more obscure languages. If I were to go through a textbook of a rare language that contains examples in interlinear, wouldn't I want to conserve this information here on Wiktionary too? Fytcha (talk) 20:50, 20 November 2021 (UTC)[reply]
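To make concrete what an interlinear gloss adds to a usage example, the sketch below aligns each word with its gloss in two padded rows. It is an illustration only and says nothing about how Wikipedia's Template:Interlinear is implemented; the function name and the German sample sentence are assumptions chosen for the demonstration, and the padding counts characters, so it presumes a monospaced display without combining or double-width characters.

```python
def interlinear(words, glosses, gap=2):
    """Return two lines in which each word sits above its gloss.

    Raises ValueError when the word and gloss counts differ, since a
    misaligned gloss is worse than none.
    """
    if len(words) != len(glosses):
        raise ValueError("each word needs exactly one gloss")
    # Each column is as wide as the longer of the word and its gloss.
    widths = [max(len(w), len(g)) for w, g in zip(words, glosses)]
    sep = " " * gap
    top = sep.join(w.ljust(n) for w, n in zip(words, widths)).rstrip()
    bottom = sep.join(g.ljust(n) for g, n in zip(glosses, widths)).rstrip()
    return top + "\n" + bottom


# Sample sentence (German "ich habe Hunger", literally "I have hunger").
example = interlinear(["ich", "habe", "Hunger"], ["1SG", "have.1SG", "hunger"])
```

The idiomatic translation ("I am hungry") would still be given separately, as {{ux}} already does; the aligned rows only add the word-by-word layer.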

Clarify what web pages count as "permanently recorded" for WT:ATTEST

After this post, I made my own proposal to officially define Internet-Archived pages as "permanently recorded", and it failed with a tie, just like a similar proposal in 2012. From the discussion, it appears likely that there's a supermajority of voters who would support some reform in this direction; the tricky part is nailing down a specific compromise that would get enough support. I have little faith in my ability to craft one, both because I'm not very active on Wiktionary and because I ultimately failed to understand some of the details of the objections, but I hope somebody else tries their own proposal for this. —Kodiologist (t) 21:36, 19 November 2021 (UTC)[reply]

What about applying those laxer attestation criteria only to internet and gaming jargon terms as a first step and seeing how it goes? Fytcha (talk) 21:59, 19 November 2021 (UTC)[reply]
The reasons for people's "oppose" votes are there to see, but my impression is that what counts as "permanently recorded" was not much of an issue. I think the main concerns were around whether allowing entries on the basis of three Internet-sourced attestations risks opening the floodgates to lots of crap. While the present CFI wording in fact does not clearly exclude such entries, I think the feeling amongst objectors was that if we are doing anything around this wording, we ought to codify stronger "reliable source" requirements for Internet material. I think this is the area that needs addressing. Mihia (talk) 18:49, 20 November 2021 (UTC)[reply]

Example sentences copied?

Am I correct in understanding that the various pages using Template:RQ:ja:Xin Shidai Ri-han Cidian (101 transclusions at the moment, according to Toolforge) have copied usage examples from that dictionary? I'm not sure if that is a violation of copyright – it may well be perfectly legal, and even consistent with Wikimedia policies – but it feels wrong to me.

If I understand correctly, 新時代日漢辭典 (Xin Shidai Ri-Han Cidian) is a Chinese-Japanese bilingual dictionary. It appears to me that some portion of those 101 pages, which I think are all Japanese entries, contain usage examples copied from the dictionary. I am not suggesting that there is anything wrong with the examples as examples. But it feels contrary to norms around intellectual property to copy example sentences from that dictionary into this one, even with proper attribution. It seems to me that de minimis quotations from a range of sources would be better than multiple quotations taken from a published dictionary. Am I alone in feeling this way? Or, am I mistaken about what is going on here?

@Onionbar, Suzukaze-c, Benwing, I think you have been involved with the pages that include the template. Apologies if you're not interested in this discussion.

Cnilep (talk) 03:37, 21 November 2021 (UTC)[reply]

@Cnilep: Talk:相応しい Suzukaze-c (talk) 03:40, 21 November 2021 (UTC)[reply]

Extended mover

I would like to have the extended mover right, so that I can move pages without redirects. It's quite a useful right to have. I will use this tool responsibly and rationally, and should too many doubts/controversies arise over the pages I move, I will willingly surrender this. —Svārtava [tcur] 16:23, 26 November 2021 (UTC)[reply]

  Support. Good editor and, judging by their contributions, they could use these rights. Those redirects can sometimes really be annoying. Fytcha (talk) 16:37, 26 November 2021 (UTC)[reply]
  Oppose. He was given this right on a temporary basis before and abused it so quickly that Surjection had to remove it. —Μετάknowledge discuss/deeds 20:35, 27 November 2021 (UTC)[reply]
Yes, I was given it in January, when I ‘abused’ it. I had no idea that deleting {{d}}-tagged pages by moving them and replacing their content was ‘abuse’. Next, I got the right again in August for a week and there was no such problem. I am well aware of what to do and what not to do, and will not let any such abuse happen if I am granted the mover right. Lastly, should such abuse happen, take away the right and block me for as long as seen fit; I will unobjectingly accept it in that case. —Svārtava [tcur] 09:50, 28 November 2021 (UTC)[reply]
  Support. I think the extended-mover right would go to good use in this case. And the arguments against granting it are unsatisfying, for the reasons Svartava points out above. In the worst case, the right can always be removed if it becomes misused. But I don't see the point of preemptively denying a right because of a hypothetical but unproven doubt that it might be abused. Imetsia (talk) 01:28, 29 November 2021 (UTC)[reply]
I withdraw my support. Imetsia (talk) 15:08, 4 December 2021 (UTC)[reply]

Talk to the Community Tech: The future of the Community Wishlist Survey

 

Hello!

We, the team working on the Community Wishlist Survey, would like to invite you to an online meeting with us. It will take place on 30 November (Tuesday), 17:00 UTC on Zoom, and will last an hour. Click here to join.

Agenda

  • Changes to the Community Wishlist Survey 2022. Help us decide.
  • Become a Community Wishlist Survey Ambassador. Help us spread the word about the CWS in your community.
  • Questions and answers

Format

The meeting will not be recorded or streamed. Notes without attribution will be taken and published on Meta-Wiki. The presentation (all points in the agenda except for the questions and answers) will be given in English.

We can answer questions asked in English, French, Polish, Spanish, German, and Italian. If you would like to ask questions in advance, add them on the Community Wishlist Survey talk page or send to sgrabarczuk@wikimedia.org.

Natalia Rodriguez (the Community Tech manager) will be hosting this meeting.

Invitation link

We hope to see you! SGrabarczuk (WMF) (talk) 20:03, 26 November 2021 (UTC)[reply]

Etymologies of usernames page in the Wiktionary namespace

A number of years back, I created User:PseudoSkull/Etymologies of usernames, which has been extensively used by the community since its creation. I believe it's useful enough and has enough community interest that it could be moved to the Project space. That would make it easier to find, and would be more inviting to edit than a userspace page. Any objections? PseudoSkull (talk) 16:22, 27 November 2021 (UTC)[reply]

The project space is for pages that serve the project. You called it "useful", but it actually doesn't serve the project in any way — it's actually just interesting. I don't think we should fill up project space with content about ourselves — it's fine where it is, and you can even add a notice encouraging people to edit it if you so choose. —Μετάknowledgediscuss/deeds 20:30, 27 November 2021 (UTC)[reply]
  • PseudoSkull I would   Support a Wiktionary-namespace shortcut, like we have WT:WF for Wonderfool which doesn't serve the project per se, it just tracks the edits and accounts of one particular user. Maybe create WT:EU as a redirect? —Svārtava [tcur] 08:21, 29 November 2021 (UTC)[reply]
I support this too per nom. Imetsia (talk) 16:22, 29 November 2021 (UTC)[reply]

Gender-neutral Spanish neologisms (amigx, maestrx, etc.)

We've had entries for Latinx/latinx, Chicanx/chicanx, lxs, novix, amigx, and probably a few others for many years now. Recently, however, the gender-neutral x in Spanish has reached a level of acceptance (within very limited circles) where we are starting to see entire journal articles, academic papers, or even books[2] in which all generically-gendered words are replaced by their gender-neutral equivalents. It's still a very limited phenomenon, but at this point it's relatively easy to find three durable citations for almost any common gendered Spanish word in an x form. My questions are:

  1. Should we start creating entries for all these words now that they pass WT:CFI?
  2. Should we modify {{es-noun}} and similar templates to accommodate these forms?
  3. Some of the existing x entries are marked "informal", but I'm not sure that's accurate at this point. Would "uncommon" be the best usage label?
  4. Should we create a special category for them?
  5. Should we create a standardized usage note for them?

Nosferattus (talk) 05:09, 28 November 2021 (UTC)[reply]

1. Yes: we create both masculine and feminine forms for all words that have both, so there's no reason not to include them. 2. Not yet. They are not widely accepted, so it would not be an accurate representation of the language to put them on equal standing with masculine and feminine forms. 3. Yes, I think so. In fact, I suspect they're more often used in formal rather than informal writing. 4. I'm not sure that's necessary, but I don't really care. 5. Yes. A nice quick template would be good. Andrew Sheedy (talk) 06:30, 28 November 2021 (UTC)[reply]
  • 1.) For sure: no different than any other attestable word. 2.) Probably not, as this is still a new development. 3.) Also correct: maybe "non-standard"? 4.) Yes. 5.) That would probably be helpful: we could explain how it functions grammatically and how it's a new development. —Justin (koavf)TCM 17:17, 29 November 2021 (UTC)[reply]
After looking at some other examples, I think "gender-neutral" and "neologism" are probably the best usage labels for these. Here's another article about the rise of 'e' and 'x' endings. Nosferattus (talk) 01:49, 1 December 2021 (UTC)[reply]
I agree with koavf, and also, I merged the two usage notes templates. - -sche (discuss) 21:12, 6 December 2021 (UTC)[reply]

Mini RFC: What is the best way to define these terms?

Currently, all of the following forms are being used:

  1. (gender-neutral, neologism) gender-neutral form of maestro.
  2. (gender-neutral, neologism) gender-neutral form of maestro, teacher
  3. (gender-neutral, neologism) gender-neutral form of maestro or maestra.
  4. (gender-neutral, neologism) gender-neutral form of maestro or maestra, teacher
  5. (gender-neutral, neologism) teacher

Are any of these better or worse than the others? Nosferattus (talk) 02:47, 1 December 2021 (UTC)[reply]

@Andrew Sheedy, Eirikr, The Editor's Apprentice, Vriullop, Jberkel, Koavf. Nosferattus (talk) 02:54, 1 December 2021 (UTC)[reply]
  • Personally, my preference is for #4 above: this gives the user the most information, all of it pertinent to the question of "what is this word?" (presumably why most users would look up a word in a dictionary). However, I spend very little time working with Spanish entries, and I am ignorant of any formatting conventions specific to that area. ‑‑ Eiríkr Útlendi │Tala við mig 04:04, 1 December 2021 (UTC)[reply]
Option 3 is my preference (though I would rather see "(gender-neutral, neologism) gender-neutral form of maestro/maestra", since "or" can imply that maestrx/maestre is intended to replace one or the other, when it is in fact intended to be ambiguous and replace both at the same time). I'm not opposed to option 1, but I suspect that those who actually use these forms would see it as unfairly—and perhaps inaccurately—favouring the masculine term. I'm also not opposed to option 4. (As an aside, I oppose adding periods/full stops at the end of any of the options, since the typical standard for non-English entries is to omit them.) Andrew Sheedy (talk) 04:11, 1 December 2021 (UTC)[reply]
Seems like we need to include both -o and -a forms as this replaces them but it also replaces the -o form as a kind of gender neutral. This is tricky, since an @/e/x version of a Spanish word is both intended for individuals whose gender is understood to be not feminine or masculine but also for individuals whose sex is unknown. It's hard to not make it too wordy but it's important to point out that it's both, just like the masculine is both masculine and the default. I think that #4 explains this the most but the ambiguity of #5 actually works well and as long as the -a and -o forms are linked in the head above it and we have the gender-neutral usage note, that should explain it enough. —Justin (koavf)TCM 04:58, 1 December 2021 (UTC)[reply]
Agree with Justin and Andrew. From a grammar POV I would say "alternative form of maestro as gender-neutral". From an activist POV: "gender-neutral form of maestro/maestra". Really tricky. Vriullop (talk) 12:39, 1 December 2021 (UTC)[reply]
As Justin notes, the forms are also said to be used for known individuals with non-binary genders, and ostensibly not used for known individuals with binary genders. When used that way, they are definitively not gender neutral, instead having a specific gender denotation, specifically a non-binary one. Because of that, the forms can communicate multiple gender meanings (which can bleed into one another) depending on specific usage: a gender neutral (agnostic) one for unknown individuals, a gender neutral (indicating mixedness) one for multiple-gender groups, and a genderqueer one for known individuals with non-binary genders. I would argue that some or all of these deserve their own sense lines. I think something along the lines of option 1 would be most honest to the current prominence of these forms and nicely parallels the format used for female and feminine forms (-a). I think there is a great need to collect a variety of actual uses of e/x/@ forms so that we can analyze them instead of merely philosophizing. Pinging @Metaknowledge, Ungoliant MMDCCLXIV, Ultimateria since they list themselves under the "Spanish" part of Template:wgping and have yet to comment or be pinged. —The Editor's Apprentice (talk) 23:57, 1 December 2021 (UTC)[reply]
Out of those, I prefer #3 and #4. I’d also remove the redundant “neologism” and format the gloss like usual - quoted and parenthesised. Here’s a suggestion for a more straightforward option:
  1. gender-neutral neologism for maestro and maestra (“teacher”)
— Ungoliant (falai) 00:52, 2 December 2021 (UTC)[reply]
I like Ungoliant's proposal best. Ultimateria (talk) 19:39, 5 December 2021 (UTC)[reply]
I like Ungoliant's idea, too. I agree with Justin above that we should mention the -a and not just the -o forms if we're mentioning either of them, since the raison d'etre for the -x/-e forms is to sub in for both, not to replace just the -o forms. And "and" addresses the issues with "or" that Andrew pointed out, although a slash would also work. - -sche (discuss) 21:25, 6 December 2021 (UTC)[reply]
@Ungoliant MMDCCLXIV I like your suggestion as well. However, if we remove the redundant usage labels, we lose the categorization. Is it OK to add the categories manually at the end of the language section? I'm not an experienced Wiktionary editor so I apologize if this is a dumb question. Nosferattus (talk) 23:18, 6 December 2021 (UTC)[reply]
@Nosferattus, if we create a template for this, it will include an automatic category. Alternatively the entries can be categorised by the Usage notes template. — Ungoliant (falai) 23:38, 6 December 2021 (UTC)[reply]
Would it be premature for the template (if created) to be general, i.e. not specific to Spanish? My understanding is that similar changes are happening/being advocated for in other European languages—Romance, Germanic, and Slavic. —The Editor's Apprentice (talk) 05:13, 7 December 2021 (UTC)[reply]
Follow-up: the wording above has been applied to a variety of entries by various users, but manually and therefore still with considerable variation. I took a stab at templatizing it: Template:gender-neutral neologism for. - -sche (discuss) 07:19, 6 March 2023 (UTC)[reply]
@Koavf, AG202, -sche, Ultimateria Further followup: Based on discussions on my talk page, I created a new gender tag gneut, which categorizes into 'LANG gender-neutral POS' (e.g. Category:Spanish gender-neutral nouns), similarly to other gender tags. This is now being used in the entries for various Spanish gender-neutral terms. (I believe that "neuter" is inapplicable here because the neuter gender is normally used for inanimate objects rather than for gender-neutral terms referring to people. I originally used mfbysense but some editors objected to that based on the agreement rules being different: mfbysense terms like artista take masculine or feminine agreement depending on the gender of the referent, but gneut terms take gender-neutral agreement in associated adjectives, pronouns and determiners.) Benwing2 (talk) 22:31, 18 July 2023 (UTC)[reply]
@Benwing2: After your change at alumnx, the plural is indicated as alumnx. This is not correct (see quotations). J3133 (talk) 22:39, 18 July 2023 (UTC)[reply]
@J3133 This should be fixed (at least for nouns). Benwing2 (talk) 22:58, 18 July 2023 (UTC)[reply]

Present text at WT:ATTEST:

 

Where possible, it is better to cite sources that are likely to remain easily accessible over time, so that someone referring to Wiktionary years from now is likely to be able to find the original source. As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups, which are durably archived by Google. Print media such as books and magazines will also do, particularly if their contents are indexed online. Other recorded media such as audio and video are also acceptable, provided they are of verifiable origin and are durably archived.

 

In my humble opinion this text is not fit for purpose. It strangely (from a 2021 perspective) gives great prominence to Usenet, a corner of the Internet that fewer and fewer people have even heard of, while failing to make clear the general position as regards Internet-sourced content, or at least failing to reflect what the position is in actual reality. It mentions books, which are the major source of quotations (as far as I see), as a kind of grudging afterthought. I am aware that proposals to change this text to clarify the position regarding Internet-sourced content have failed to gain consensus. My proposal is an INTERIM one, intended to last only until such consensus can be reached, so we do not have to retain the present unsatisfactory text in the meantime (which may be a long time, judging by present rate of progress). To that end, I propose we change the text to read like this:

 

Where possible, it is better to cite sources that are likely to remain easily accessible over time, so that someone referring to Wiktionary years from now is likely to be able to find the original source. Traditional print media, including books, newspapers and magazines, is a major source of citations. Other recorded media such as audio and video is also acceptable, provided this is of verifiable origin and is durably archived.

The community has not presently reached a consensus on which if any citations from Internet-only sources, such as social media, forums, blogs, or general website content, may count towards attestation requirements, or in which circumstances these sources should be treated as permanently recorded. Citations from Usenet have traditionally been recognised, but those from other Internet-only sources often have not, though editors' practices may vary.

 

Please comment. Mihia (talk) 19:42, 29 November 2021 (UTC)[reply]

I love the idea and the proposal with only minor comments. To nit pick, instead of "provided this is of verifiable" maybe "provided it is of verifiable"? I also feel like "Traditional print media [...] is a major source of citations" is weaker than "will [...] do", maybe "are generally accepted" is a better match? The phrase "recorded media" seems redundant with just "media". Good luck and take care. —The Editor's Apprentice (talk) 07:06, 30 November 2021 (UTC)[reply]
Thanks, I will address your comments along with other comments arising below in due course in version 2 of the text. Mihia (talk) 12:51, 2 December 2021 (UTC)[reply]
It seems that perceptions of "will do" in this context must vary, because to me this phrase feels grudging or unenthusiastic, implying that citations from books and magazines are not ideal but can be tolerated if nothing better is available. Mihia (talk) 15:29, 2 December 2021 (UTC)[reply]
I am thinking that pages in the Internet Archive should be considered to be durably archived. bd2412 T 07:33, 30 November 2021 (UTC)[reply]
I think that the intention is here to not consider anything anew but only be more clear to newcomers who read the text. Fay Freak (talk) 04:22, 1 December 2021 (UTC)[reply]
Right, this text is not supposed to be making any new proposals for agreement, just explaining the present de facto situation. Mihia (talk) 12:51, 2 December 2021 (UTC)[reply]
Just saying. bd2412 T 02:40, 4 December 2021 (UTC)[reply]
This proposal seems like a definite improvement. I like the idea of actually explaining where there is uncertainty. Nosferattus (talk) 03:01, 1 December 2021 (UTC)[reply]
This has my support. Andrew Sheedy (talk) 04:12, 1 December 2021 (UTC)[reply]
Support on my end. Though I still feel that online journals and newspapers should be explicitly included somewhere. AG202 (talk) 01:20, 2 December 2021 (UTC)[reply]
I can certainly mention these if I know what to say. What is the present position? Has there been a demonstrated consensus that we can cite from these, or are they a part of the disputed/unclear territory? Mihia (talk) 12:51, 2 December 2021 (UTC)[reply]
It’s contentious, since even Vice, where you can find every bs, and Medium.com, where anyone can author, are journals or newspapers. Or why not? Why not even include blogs, since they are kind of redacted or edited and nowadays often work the same way as journalism or are just a specialized form of it. Essentialist distinctions are contentious and in this likely do not even comprise that scope of vocabulary that is sought. I already see from the edits on User:Fay Freak/words that would actually deserve entries our current criteria force us to miss out on that people want memey or slangy words used when arguing on the internet which journalists or posh publishers are too hoity-toity or ivory tower to include in their parlance—we just need to make sure it’s not something but some 4channer and his friends used (which I believe is sufficiently circumscribed in my proposal, but we are collecting entropy for which kinds of words would be included). Fay Freak (talk) 00:55, 4 December 2021 (UTC)[reply]

Version 2, incorporating the above comments:

 

Where possible, it is better to cite sources that are likely to remain easily accessible over time, so that someone referring to Wiktionary years from now is likely to be able to find the original source. Citations from traditional print media, including books, newspapers and magazines, are normally fine. Citations from other media, such as audio and video, are also acceptable, provided these are of verifiable origin and are durably archived.

The community has not presently reached a consensus on which if any citations from Internet-only sources, such as online journals and newspapers, social media, forums, blogs, or general website content, may count towards attestation requirements, or in which circumstances these sources should be treated as permanently recorded. Citations from Usenet have traditionally been recognised, but those from other Internet-only sources often have not, though editors' practices may vary.

 

Any more comments? Any objections? Mihia (talk) 18:19, 10 December 2021 (UTC)[reply]

Thanks, you may rest assured that I am familiar with the formal voting procedure for changes to policy, but imagined or hoped that we might bypass it in this instance, since the proposed changes to the text are not only uncontested but in fact do not change de facto policy, only update the CFI text to reflect reality. I hope that an administrator will respond. Mihia (talk) 10:51, 29 December 2021 (UTC)[reply]
Support on my end, and @Mihia re: online magazines & newspapers (apologies for the late response), I've been told that they can be used for RFV provided that they have a "print edition", but being that a lot of newspapers are moving towards being online-only, it's definitely been difficult to verify and make sure, and in practice, it seems that it's not always needed regardless. AG202 (talk) 21:45, 2 January 2022 (UTC)[reply]

Etymology

When a particular word (like literal or problema) is used with identical spelling and essentially identical meaning in multiple languages (typically all of them related European languages in the cases I've encountered), must each language section always have its own etymology subsection? Obviously so if the precise etymological path varies even slightly between languages; but repeating exactly the same etymological information for each language seems redundant and cluttersome. Is there ever an option of placing the etymology at the very top of a page, with the understanding that it serves all of the various languages in the entry? -- 2603:6081:8004:DD5:6451:2AC4:EB73:1BE 01:08, 30 November 2021 (UTC)[reply]

I doubt we would ever agree to such a thing Notusbutthem (talk) 09:10, 30 November 2021 (UTC)[reply]
  • Consider also that certain display settings result in only one language being shown at a time. In such cases, only including the ===Etymology=== section in one language and not all of them would result in the etymology effectively "vanishing" when viewing the entries for other languages. ‑‑ Eiríkr Útlendi │Tala við mig 18:44, 30 November 2021 (UTC)[reply]
There's no reason to present a full etymology in many cases: consider only showing the most recent step, which can be expanded upon in the parent entry. DTLHS (talk) 20:05, 30 November 2021 (UTC)[reply]
What DTLHS says, and if the etymology is moderately convoluted then you can refer to an exposition at one place, as shown on مَرَنْد (marand), but often an etymology only needs to consist of a most recent step or two—depending on which parts are essential and are not likely to change. You avoid potential asynchronicity and duplication which would challenge the reader to read long texts to look for eventual differences. Fay Freak (talk) 04:28, 1 December 2021 (UTC)[reply]

Makes sense; these are helpful answers. Thanks, everyone. -- 2603:6081:8004:DD5:6451:2AC4:EB73:1BE 13:43, 1 December 2021 (UTC)[reply]