Wiktionary:Grease pit/2022/March

Edit request: Module:headword/data

Chadong (cdy) is a "language with spaces between syllables". Please add it into data.no_multiword_cat. Thanks. --沈澄心 ✉ 15:46, 2 March 2022 (UTC)[reply]

Cleaning up after a bot

Special:Diff/32422359, a 2015 edit by a now-blocked bot, introduced an error by removing one of the two plural forms of French nouveau-né. I fixed the entry today. There may be other errors. Is it possible to filter edits by this bot with the edit summary "removing redundant parameter from {{fr-noun}}"? Vox Sciurorum (talk) 17:03, 2 March 2022 (UTC)

At the bottom of Special:Contributions/MglovesfunBot, there is a link that says "Edit summary search". You can use it to find such edits: [1]. 70.172.194.25 17:56, 2 March 2022 (UTC)[reply]

Incorrect deletion message

The message that appears when you are about to recreate a previously deleted page, MediaWiki:Recreate-moveddeleted-warn, finishes with the sentence

Note that administrator comments older than one year may be inaccurate, as explained in Deletions.

The "older than one year" is clearly not correct any longer; the message has been in place since 2010! Would an admin like to alter it so it says:

Note that deletion comments prior to 2010 may be inaccurate, as explained in Deletions.

(also taking the liberty to remove the unexplained italics)? This, that and the other (talk) 03:56, 3 March 2022 (UTC)[reply]

Hah, good catch. Done. - -sche (discuss) 04:30, 3 March 2022 (UTC)[reply]

Category:German terms from or influenced by Burschensprache

I want to create this category for terms such as schiffen or Gaudium; I take it it should be a sub-category of Category:German terms by etymology, right? Should I change the module data or should I just manually create the category as a sub-cat thereof? — Fytcha〈 T | L | C 〉 17:49, 3 March 2022 (UTC)[reply]

Burschensprache is kind of historical as Bursche no longer means “student”. For current terms one might lever peruse Studentensprache. Fay Freak (talk) 18:27, 3 March 2022 (UTC)[reply]

@Fay Freak: For current terms (i.e. 20th / 21st century student register) sure, but schiffen and Gaudium emerged in the 18th - 19th century; for these terms I think both names are defensible with Burschensprache having the benefit of being more "endonymic". — Fytcha〈 T | L | C 〉 20:31, 3 March 2022 (UTC)[reply]

Do we anticipate ever categorizing modern Studentensprache? Iff so, it may be unlikely random editors categorizing a term would all grasp or maintain a distinction whereby Studentensprache is Burschensprache before year #### but Studentensprache after that, so a single category for Studentensprache might make more sense. (I'm not sure. Maybe it's odd to consider 18th century and modern Studentensprache all part of the same register at all, but that's a wiki-wide issue; 18th century obsolete Americanisms and 2010s ones all go in Category:American English, etc etc for most registers.)
Regarding the rest of the proposed name: existing category names would seem to suggest either "German terms derived from ___sprache" (dropping "or influenced by"), or else just "Category:___sprache" like Category:Rotwelsch, or "Category:German ___sprache" (like Category:English Polari slang). The latter, without "derived from", does seem to be employed even when a term has spread outside register X, like many of the Rotwelsch and Polari terms had to do in order to meet CFI, or like Category:Scottish English terms labelled {{lb|en|originally|_|Scotland}}, etc etc for other dialects. - -sche (discuss) 03:32, 5 March 2022 (UTC)[reply]

Implementing the coauthors parameters in quotation templates

It seems the coauthors parameter is not actually a part of Template:quote-journal, Template:quote-book, and others despite being mentioned in their documentation. Can someone actually add support for the parameter? Thanks and take care. —The Editor's Apprentice (talk) 19:25, 4 March 2022 (UTC)[reply]

Circumfixes and `{{af}}`

{{af}} is truly a wonderful template, allowing for compounds, any type of affix (no duh), and even allowing multiple language parameters for when a foreign/inherited word has been affixed. However, the syntax for circumfixes is unintuitive, as it is {{af|LANG|cf--cf|word}}. @Fenakhay has suggested the syntax of {{af|LANG|a_|WORD|_b}}. Do other editors feel this would be a good change to the template? Vininn126 (talk) 15:13, 5 March 2022 (UTC)

Hi

I was trying to put the word of the day and it kept saying this is harmful or something like that. I didn't do anything harmful. — This unsigned comment was added by Imhotie9274 (talk • contribs) at 20:40, 8 March 2022 (UTC).

Why were you trying to change the word of the day? FishandChipper (talk) 21:24, 8 March 2022 (UTC)[reply]

Category:Western Fijian language

Abuse filter or w/e it's called is stopping me from creating this. Can someone do it please? 37.110.218.43 09:25, 10 March 2022 (UTC)[reply]

Also, Category:Chak language, using code {{auto cat|Bangladesh|Myanmar}}, 37.110.218.43 09:29, 10 March 2022 (UTC)[reply]

*sighs* Here's another one: Category:Djimini language, using {{auto cat|Côte d'Ivoire}}

And so, I've ended up fulfilling my own requests since no one else took notice lol. User: The Ice Mage ^{talk to meh} 21:17, 10 March 2022 (UTC)[reply]

Possible vandalism in translations

For some reason I'm getting certain tab names ("Read", "Edit" and "Add topic") in another language, while others are displaying correctly in English. I also have no idea what language they are: Read is "Kenkan", Edit is "siesie", and Add topic is "Fa tiban ka ho". Googling shows up nothing.

My suspicion is that someone has vandalised the British English 'translations', as it fixes itself when I switch preferences from en-gb to en but I don't know how to fix it myself. Theknightwho (talk) 14:20, 10 March 2022 (UTC)[reply]

If you add ?uselang=qqx to the end of the URL like so, you can see the MediaWiki message names. For example, "Read" is "view-view". Translations are provided by MediaWiki:View-view/en-gb for British English, for example. The place where MediaWiki messages are translated by volunteers is TranslateWiki.net. On TranslateWiki.net, someone mistakenly created MediaWiki:View-view/en-gb with the text "Kenkan", which has since been deleted. Based on the talk page of the page creator mentioned in the deletion log, they were trying to translate into Twi, but saved their work as British English accidentally. The mistaken translations have already been removed, so I think the fix should propagate the next time Wikimedia updates their software version. 70.172.194.25 02:33, 11 March 2022 (UTC)[reply]
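The ?uselang=qqx tip above can be sketched in a few lines. This is only an illustration using the Python standard library; the helper name with_uselang and the example URL are assumptions, not part of MediaWiki.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_uselang(url: str, lang: str = "qqx") -> str:
    """Return `url` with the uselang query parameter set to `lang`.

    Any existing uselang value is replaced; other parameters are kept.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical usage: open the resulting URL in a browser to see message
# names such as "view-view" instead of the translated tab labels.
print(with_uselang("https://en.wiktionary.org/wiki/fork"))
```

Replacing rather than appending the parameter avoids ending up with two conflicting uselang values when the URL already carries one.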

Thanks for this - makes sense! Theknightwho (talk) 07:33, 12 March 2022 (UTC)[reply]

Conjugation of vuaj

Not sure if this is the right place to post this, but I just added a conjugation table at vuaj assuming it conjugates exactly like shkruaj. If there are any differences, someone please correct the table. MGorrone (talk) 21:25, 10 March 2022 (UTC)[reply]

Diverging sense numbers

We often use (sense i) as part of image captions. This is a bad implementation of a good idea because nobody goes and checks the image captions after working on the definitions which leads to divergences between the sense numbers (cf. fork). While this can be detected and even automatically be fixed from database dumps, I wonder whether there's not a better solution. The ideal solution in my mind is linking the sense with {{senseid|en|foo}} and then having a template {{senseno|en|foo}} in the image caption that automatically displays the correct number. Is this template implementable? In the late game of Wiktionary, practically every sense is going to have an id anyway. — Fytcha〈 T | L | C 〉 21:59, 10 March 2022 (UTC)[reply]

Created {{senseno}}. (It had a bug, but I think I fixed it, User:Fytcha.) 70.172.194.25 09:01, 11 March 2022 (UTC)[reply]

Phenomenal work! I was just about to look into what caused the garden fork to have too high a number but you beat me to it. Thank you a lot! — Fytcha〈 T | L | C 〉 09:16, 11 March 2022 (UTC)[reply]

Some future ideas could be making the output link to the sense in question and whether the word "sense" should be included in the output. But I'm not sure whether those bells and whistles are desirable or not. In any case, you can easily implement those things on top of the template I just made, so I won't bother to do so for now.

What's more complicated is making a template that handles {{etymid}} too (not like it's super hard or anything, especially with the knowledge of how this one was built, but it does need some work). I'm not sure whether this is worth it or not, but I can look into this too if desired. 70.172.194.25 09:20, 11 March 2022 (UTC)[reply]

From a human-user point-of-view reliance on bare sense numbers in image captions forces the user to find the numbered sense in order to fully understand the application of the image. It requires some micro-effort to do so in an L2 section with all the definitions on the same screen. It requires more effort, including memory effort, to do so in a long L2 section in which all the definitions are not on the same screen. The difficulty is worse if someone has zoomed the text because of impaired vision (eg, due to aging). I think a caption should have a terse version of the definition and also allow a link to the highlighted full specific definition, such as {{senseid}} provides. DCDuring (talk) 15:04, 11 March 2022 (UTC)[reply]

I think making the sense number clickable, highlighting the corresponding sense in blue as usual, also solves this. — Fytcha〈 T | L | C 〉 15:15, 11 March 2022 (UTC)[reply]

I made it clickable. See fork for an example. 70.172.194.25 18:17, 11 March 2022 (UTC)[reply]

I would have pressed the thank button but it's not available for IPs. Oh well, let me just thank you in text instead then! — Fytcha〈 T | L | C 〉 10:07, 12 March 2022 (UTC)[reply]

In the case that the image and the definition are not on the same screen, it still requires more effort (and fine motor skills) to find the clickable spot and jump to the definition than to look at the image and see the mini-definition in its caption. Also, how does one get back to the image? Note that one has the cognitive load of remembering the definition if one goes back to the image. DCDuring (talk) 23:27, 12 March 2022 (UTC)

The back button brings me back to the image. — Fytcha〈 T | L | C 〉 23:33, 12 March 2022 (UTC)[reply]

This is an excellent idea! Thanks, 70.172.194.25 and Fytcha! That being said, I seem to be having trouble implementing it at "queen". It's giving me a Lua error message. Would someone be able to have a look at my attempt to see what I'm doing wrong? Graham11 (talk) 06:11, 13 March 2022 (UTC)[reply]

Fixed it by replacing the complex module parsing with a simple regex. There may be a bug in Module:templateparser, but that's not code I wrote. 70.172.194.25 06:16, 13 March 2022 (UTC)[reply]

Excellent, thanks again! Graham11 (talk) 06:18, 13 March 2022 (UTC)[reply]

I may have spoken too soon. All of the sense numbers seem to have increased by one at "queen". For example, the caption of the portrait of Queen Victoria refers to "sense 3" despite the sense referring to female monarchs being the second sense of the noun. Graham11 (talk) 06:23, 13 March 2022 (UTC)[reply]

New solution. [2] The algorithm is basically to find the most recent section break (in this case, the ===Noun===) and start counting lines with # from there. However, previously it did not actually find the most recent section break (it would have counted all senses on the page up until the senseid). Now I fixed it so it should be robust. My apologies, I guess I didn't actually test it on any other pages where that would have made a difference, good catch. 70.172.194.25 06:31, 13 March 2022 (UTC)[reply]
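The counting approach described above can be sketched roughly as follows. This is not the code of Module:senseno (which is written in Lua); it is a simplified Python illustration. It assumes headings look like ===Noun===, that only top-level # lines count as senses, and that the {{senseid}} call appears literally on the sense line. The function name sense_number is hypothetical.

```python
import re

# A heading line such as "==English==" or "===Noun===".
HEADING = re.compile(r"^=+[^=].*=+\s*$")
# A top-level sense line: "#" not followed by "#", ":" or "*"
# (so subsenses, usage examples and quotations are not counted).
TOP_LEVEL_SENSE = re.compile(r"^#(?![#:*])")


def sense_number(wikitext: str, lang: str, sense_id: str):
    """Return the number of the sense carrying {{senseid|lang|sense_id}}.

    The counter restarts at every heading, so only senses after the most
    recent section break are counted. Returns None if the id is not found
    on a top-level sense line.
    """
    marker = "{{senseid|%s|%s}}" % (lang, sense_id)
    count = 0
    for line in wikitext.splitlines():
        if HEADING.match(line):
            count = 0  # new section: forget senses from earlier sections
        elif TOP_LEVEL_SENSE.match(line):
            count += 1
            if marker in line:
                return count
    return None


page = "\n".join([
    "==English==",
    "===Adjective===",
    "# Some adjective sense.",
    "===Noun===",
    "# First noun sense.",
    "# {{senseid|en|monarch}} A female monarch.",
    "#: An example sentence.",
])
print(sense_number(page, "en", "monarch"))
```

Resetting the counter at each heading is what avoids the off-by-one seen at "queen": without it, the adjective sense in the sample page would be counted as well.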

No apology necessary – it's a work in progress! And your most recent change seems to have fixed it. Graham11 (talk) 06:33, 13 March 2022 (UTC)[reply]

I wonder if the link should be made more prominent in some way, perhaps an underline or bold? Since it's often only one character long it may be easy to miss that it's blue. 70.172.194.25 06:40, 13 March 2022 (UTC)[reply]

Bold would probably be overly prominent. And an underline would definitely look out of place. Instead, I think I'd add an option (probably set by default) to include the word "sense" before the numeral. Graham11 (talk) 06:48, 13 March 2022 (UTC)[reply]

Implemented. I also added documentation for the template. 70.172.194.25 08:04, 13 March 2022 (UTC)[reply]

Is the weird line break in art related to the recent changes? — Fytcha〈 T | L | C 〉 14:32, 13 March 2022 (UTC)[reply]

I didn't touch anything other than Module:senseno and Template:senseno, neither of which is invoked or transcluded on that page. So I don't think so. [Edit: okay, I see what you mean.] 70.172.194.25 15:41, 13 March 2022 (UTC)[reply]

I think this is because of how the function at Module:senseid#L-176 is implemented. In particular, it works by returning an unclosed <li> element. The code

# ABC
# <li style="background-color:aqua">DEF GHI
# JKL <li style="background-color:orange">MNO

generates this output:

ABC
DEF GHI
JKL
MNO

(Note the break between "JKL" and "MNO".) So, basically, {{senseid}} has to be the very first thing on a sense line or it will create a line break. In that page, it is placed after the {{lb}}. Again, this has nothing to do with recent changes, but it should be documented somewhere. 70.172.194.25 15:54, 13 March 2022 (UTC)[reply]

Thank you for investigating. Apparently, this behavior is actually documented at {{senseid}} (should have read that first, sorry!) but it's still wrong in a couple of articles (diff). When I get a bot up and running to "patrol" recent changes, I'll make sure it also looks out for this. — Fytcha〈 T | L | C 〉 16:24, 13 March 2022 (UTC)[reply]
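A check of the kind a patrolling bot might run for this can be sketched as below. It is a minimal illustration under stated assumptions: it only inspects lines beginning with #, treats any {{senseid| call that is not the first thing after the list marker as misplaced, and does not try to parse nested templates. The function name misplaced_senseid_lines is hypothetical.

```python
import re

# A list line: one or more "#" markers, optional spaces, then the body.
SENSE_LINE = re.compile(r"^(#+)\s*(.*)$")


def misplaced_senseid_lines(wikitext: str):
    """Return (line_number, line) pairs where {{senseid}} is not first.

    Line numbers are 1-based. A line is reported when it contains a
    {{senseid|...}} call that does not directly follow the "#" marker.
    """
    hits = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        match = SENSE_LINE.match(line)
        if not match:
            continue
        body = match.group(2)
        if "{{senseid|" in body and not body.startswith("{{senseid|"):
            hits.append((number, line))
    return hits


sample = "\n".join([
    "# {{senseid|en|craft}} {{lb|en|uncountable}} Skill.",
    "# {{lb|en|uncountable}} {{senseid|en|works}} Creative works.",
])
for number, line in misplaced_senseid_lines(sample):
    print(number, line)
```

Only the second sample line is reported, since its {{senseid}} follows the {{lb}} label.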

Category:en:Constitutional law

If we're having this we should incorporate it into the existing topic category system, right? I would do it but I don't know how. Acolyte of Ice (talk) 12:50, 11 March 2022 (UTC)[reply]

Done: diff. Others may feel free to object or send to RFD (Sgconlaw, Chuck Entz, Ultimateria, Fay Freak: pinging from a previous RFD). —Svārtava (t/u) • 16:16, 11 March 2022 (UTC)[reply]

👍. No, it makes sense. Due to Normenhierarchie this is a field of peculiar terminology. Fay Freak (talk) 17:26, 11 March 2022 (UTC)[reply]

Template:de-proper noun

...displays "Foobar n (proper noun, [...])". Other German and other-language templates I looked at don't repeat the part of speech, since the header already states it in big bold letters. So shouldn't we drop the redundant "proper noun" text here? - -sche (discuss) 05:43, 16 March 2022 (UTC)[reply]

I agree that it's not very consistent with other headword formatting. The code that produces that was added in this recent edit. Maybe User:Benwing2 can explain their reasoning. 70.172.194.25 05:57, 16 March 2022 (UTC)[reply]

+1 Jberkel 21:34, 26 March 2022 (UTC)[reply]

Hiding semantic relations

Do we need a vote on modifying the various semantic relations templates to make them function as quotations etc. do: optionally hidden? I am finding some of our entries increasingly difficult to scan or read for the definitions themselves. I doubt I am the only human to have this problem. If we do not need a vote could someone make the required changes in the various semantic relations templates? DCDuring (talk) 17:36, 21 March 2022 (UTC)

Quite honestly, I think the layout presentation needs a total revamp. I'm fine with the order and content, but you're absolutely right about longer entries being hard to use, and it's particularly true when you have multiple lengthy etymologies within one language. I don't know anyone who would actually want to use this page as a guide to anything at present, as parsing it is beyond ridiculous. Wiktionary feels more like a database dressed up as a dictionary, sometimes. Theknightwho (talk) 17:44, 21 March 2022 (UTC)[reply]

I am hoping that we can have our cake and eat it too, without requiring radical revisions that are unlikely to pass a vote. The extensive (and often worthy) treatments of etymologies and usage notes, and derived and related terms all can be hidden, as many are when properly formatted. I have done so on occasion. Better, simpler, and also easier to justify is the notion of hiding properly formatted citations, which appear under individual definitions. Semantic relations have only recently been allowed to be placed under definitions, previously having been in their own L4/5 sections. Now that their adverse impact on the readability of entries is demonstrable, we may be able to make some technical changes to restore readability.

I also don't see why our quotation templates have detailed publication information that is far longer than the quotation itself. I thought that kind of thing would belong in another namespace or in footnotes in the entry. DCDuring (talk) 18:29, 21 March 2022 (UTC)

I completely agree. I suppose I'm just venting frustration that the current state of affairs is the worst of both worlds: all of the user unfriendliness of a database, without actually being structured data. Theknightwho (talk) 18:48, 21 March 2022 (UTC)[reply]

FWIW some dictionaries, namely the Polish WSJP, split everything off into different openable sections that you have to navigate to, defaulting to a list of definitions, e.g.. If you click on a definition, it defaults to a single semantic definition, while you have to purposefully navigate to other information such as nyms and ety. On one hand, it might be easier for the reader, on the other hand, it can also be difficult to navigate if there's a lot of information. Vininn126 (talk) 09:58, 22 March 2022 (UTC)

I think some kind of middle-ground would be preferable.

I am not suggesting this for DCDuring's suggestion above, but we could definitely achieve a lot by introducing a structured data format for parts of entries in the way that Wikimedia Commons does for files. It would not only ensure standardisation in the areas that we want it (such as entry layout), but would also remove a lot of the faffing around with data entry and formatting, while making it easier for people to adjust the layout to cater to their preferences. That's not to mention the possibilities it opens up for displaying extensive information on non-lemma entries without the need for data entry duplication, creating more powerful etymology trees, translation and thesaurus links etc. I am sure there are many reasons why it isn't feasible in the short or medium term, however. Theknightwho (talk) 13:55, 24 March 2022 (UTC)[reply]

Though the entry structure is not that of a relational database, there is a great deal of structure, both of wikitext (for definitions, citations, usage examples, and inflection lines) and content-enclosing templates. These structures allow for much flexibility in display and search (especially using regexes). Having an even more structured database would probably make manual corrections and minor additions and other changes more cumbersome for normal folks. That would in turn make those with a feel for language, but weaker technical skills, harder to recruit. There may be good ways of altering the user interface to a relational-type database to facilitate that process, but I'd like to see some examples. DCDuring (talk) 15:00, 24 March 2022 (UTC)

Yes - you would essentially be building Wiktionary 2, and it would be a massive undertaking. Theknightwho (talk) 15:09, 24 March 2022 (UTC)[reply]

Most of our entries are still like that. I see the advantages of the under-each-definition placement of the semantic relations, but I'd like to see them only when I click on something like the "[quotations]" control for showing or hiding quotations appearing under a definition. DCDuring (talk) 14:51, 24 March 2022 (UTC)[reply]

This was already a concern when we allowed semantic relations under definition lines, and as part of the vote it was suggested to automatically shorten long lists to keep the entry readable. A simpler solution is to reuse the quotation style toggling logic, and we already have a working script for this which we could enable globally. I remember talking to @Erutuon to install it in MediaWiki:Gadget-VisibilityToggles.js but this hasn't happened yet (this was back in 2019). – Jberkel 16:18, 24 March 2022 (UTC)[reply]

Now that the under-individual-definition placement has become more common, it would seem to be time to implement it. There shouldn't be too much risk. DCDuring (talk) 17:16, 24 March 2022 (UTC)[reply]

If it wasn't already clear, I also support this. Theknightwho (talk) 17:29, 24 March 2022 (UTC)[reply]

Not sure what happened back then. Maybe I was testing it and never finished. Added my version to MediaWiki:Gadget-VisibilityToggles.js. Semantic relation toggling should work for everyone now. If there are any problems, post a message in this section or on the gadget talk page. — Eru·tuon 19:59, 24 March 2022 (UTC)[reply]

I disagree with this change and I hope that we 1. expand everything but quotations by default 2. make it possible to hide all toggle buttons except the ones for quotations (which doesn't seem to be possible currently as all toggle buttons have the class HQToggle). Pinging @Erutuon. — Fytcha〈 T | L | C 〉 15:02, 25 March 2022 (UTC)[reply]

Yeah, not a fan either. It's not as if semantic relations take up a big chunk space under the senses, like quotes do. It feels weird to have a collapsey thing for what is in most cases a single line of text. This, that and the other (talk) 05:58, 26 March 2022 (UTC)[reply]

I don't know but assume that this discussion has led to the recent hiding of the synonyms (directly under definitions) as we see on the Japan page. This is not an acceptable solution to any problem in my view. As it now stands, some absurdly obsolete alternative forms for Japan are given more prominence on that page than the wildly more common Land of the Rising Sun, etc. You have to make an extra click to see "Land of the Rising Sun", but "Giapan" is hanging out there at the top like it's important to know- this is a fundamentally flawed juxtaposition of the value of the two points of knowledge- 'Giapan' is worthless to any but specialists whereas someone who doesn't know 'Land of the Rising Sun' needs to know it. You have to hide both or neither in my opinion. Brings to mind- "Neither do men light a candle and put it under a bushel, but upon a candlestick, that it may shine to all that are in the house." Matt 5:15 --Geographyinitiative (talk) 13:33, 26 March 2022 (UTC)

If you want to change WT:ELE, which specifies the location of alternative forms, start a WT:BP discussion. DCDuring (talk) 16:28, 26 March 2022 (UTC)[reply]

I don't like semantic relations being hidden by default, either; as This,that says, they don't take up nearly as much space as quotes. I also wonder who really wants to see (and click to toggle on) only one semantic relation at a time. Two ideas: make all the semantic relationships one class that users can collapse or expand all at once, and make them expanded by default, but with the option (perhaps made more durable as a thing in Special:Preferences?) for users to set them to always be collapsed? - -sche (discuss) 17:50, 26 March 2022 (UTC)[reply]

One good thing is that the quotations control and the single -nym control or multiple -nyms controls seem to appear on the same line. As we get more definitions with citations and synonyms the vertical screen space taken up by these controls will probably seem less and less intrusive. DCDuring (talk) 21:13, 26 March 2022 (UTC)[reply]

Another good thing is that these controls appear at the end of each definition so that they often do not take up any additional vertical screen space. DCDuring (talk) 21:19, 26 March 2022 (UTC)[reply]

The way it was originally implemented was to show synonyms by default, but leave the other (less 'important') relations collapsed. I think Erutuon then changed it so that everything is collapsed by default. – Jberkel 19:25, 26 March 2022 (UTC)[reply]

I can’t get used to having to click to show the semantic relations. Your screen does not need to be too large to be predestined to show all at once, and I also considered them as kind of definitional: Definition by differentiation from other terms. As clicking is extra work I can’t like it unless the hiding supports the apperception of other content. Maybe let the code decide by the size of the user’s display interface whether the semantic relations are hidden or displayed by default? Or is it possible to calculate from the total number of semantic relations lines in a language section whether they should be hidden by default? Fay Freak (talk) 14:38, 29 March 2022 (UTC)[reply]

The left-hand side of the Wiktionary page has a series of show and hide controls, which provide a way for you to express your preferred set of shown and hidden headings (quotations, translations, synonyms, derived terms, other, etc). These preferences are durable, at least for a browser session. DCDuring (talk) 14:53, 29 March 2022 (UTC)[reply]

Based on the discussion above, I undid the collapsing-by-default of the synonyms. As DCDuring notes, there is a control on the left-hand side of the page for those who wish to toggle the synonyms to be collapsed. - -sche (discuss) 23:32, 29 March 2022 (UTC)[reply]

Polish cleanup wishlist

(Notifying BigDom, Hythonia, KamiruPL, Tashi, Luxtaythe2nd, Max19582, Hergilei, Shumkichi): and also looking for anyone with a bot and technical experience who's willing to help.

I would like to make the following mass changes to Polish entries, most of which are hopefully bot-able. My list is as follows:

1. {{con}}, {{confix}}, {{pre}}, {{prefix}}, {{suf}}, {{suffix}}, {{com}}, {{compound}}, and {{affix}} should be {{af}}, with the appropriate use of - in the right box, of course, for the sake of consistency. This is small, but it would be nice. I think this list covers all possibilities, but I'm not sure. Some will have a t1= etc.; these should be preserved. Secondarily, if a page has one of these but has no "From {{af}} ." (as in the "From" and "."), it would be nice to have these added as well. Also potentially changing {{univerbation}} to {{univ}}.
2. ~~a list of pages without etymology sections so I can potentially add them~~
3. Change "analyzable as" to {{surf}}?? I'm not sure I want this, but I think it might be more in line with the formatting we have now. Shumkichi said he likes it, some other editors don't care, I'm curious what BigDom thinks. Hythonia also approved.
4. ~~Can we somehow change ()'s in definition lines to {{gl}} (and not {{gloss}})? Also small, but would be nice for the sake of consistency.~~ Done.
5. Fixing the IPA module so that -cja, -sja, -zja, and -istka (and their non-lemma forms) automatically syllabify correctly. Currently it goes Vc(z/s).jV (ac.jV/az.jV/as.jV) and is.tkV (the end vowel represents inflected forms), and it should be V.CjV (a.cjV/a.zjV/a.sjV) and ist.kV. The exceptions are some genitive plural forms (a.cyj/a.zyj/a.syj), which will syllabify correctly naturally, and ist.ek, which also syllabifies correctly already. It would also be nice to teach the template other prefixes, but I'm not sure how possible that is logistically, given how janky the module is. If not, we can continue to add these manually, as it's not TOO much of a hassle. It'd also be nice if /r/ attached to the previous consonant word-medially if possible.
6. Finding pages with related or derived terms that are missing qualifiers and generating a list, or ones that have {{s}} instead of {{q}} because someone changed a few... This would be in preparation for the next change:
7. Converting pages to use {{col3}} (was gonna be 5, changing to 3) in derived/related terms. ~~I was considering other options, but 5 seems to be the best option.~~ Aesthetically, I prefer what we have now, but I recognize that col is designed for that and works better on a technical level. Logistically, each part of speech would have its own col template, which is why I want to find pages missing POS qualifiers - I can manually convert those to have col5. Pages with qualifiers (e.g. azymut) should be able to naturally absorb them and become words with few terms on my sandbox. Pages without related/derived of course would be skipped by the bot. Some verb pages will have a derived box, which will be covered by the next point. A very small handful of other POS will also have a drop-down box, which I might have to do manually (e.g. bajać, baba, nowy, and perhaps one or two more). Hopefully a bot will be able to find these pages and skip them, providing me a list.
8. I think something like {{hu-verbpref}} would be useful for Polish verb pages but it would be more manual to allow for perfective/imperfective forms in their entirety. This would have to be developed first, and the bot would find a derived box such as on rzucić and convert it to this new template. Organizing them otherwise in a regular col template would be a nightmare, so it would be nice to have something sort of between what's on, for example, делать, where you can have "no imperfective/perfective equivalent", and next to each other physically but have them physically closer to each other so it's more obvious (compare rzucić again, how it's very obvious przerzucać and przerzucić are pairs). The way I envision this is that it would be a sort of col2 box, and we would enter each form next to each other like you would normally, but instead of going top to bottom left to right, it would go left right top to bottom. If there was a missing pair, such as for urzucić, we could tell the template that a specific imperfective or perfective form has no equivalent. I'm not sure what the best way is to do that. I have a list of the possible prefixes on my userpage, I'm not sure how we can incorporate that. I found a solution to make {{col3}} work. I would like a list of words with der boxes, however, so I can fix those. Vininn126 (talk) 01:17, 26 March 2022 (UTC)
9. Also if it's possible, add acceleration to {{pl-adj}} and {{pl-noun}}, as I have verbs covered.
10. Somehow merge some of the verb templates - would there be ANY way to merge all of the -ę verbs? It seems unlikely. If not, whatever.

There's probably more, but this is everything I could come up with for now. Vininn126 (talk) 10:43, 24 March 2022 (UTC)[reply]
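The syllabification change requested in the wishlist above (c.jV, s.jV, z.jV becoming .cjV, .sjV, .zjV, and is.tkV becoming ist.kV) can be illustrated as a post-processing step. This is only a sketch under assumptions: it works on orthographic strings with "." as the syllable separator, whereas the real IPA module is written in Lua and may represent syllables differently. The function name fix_syllable_breaks and the vowel list are hypothetical.

```python
import re

# Assumed set of Polish vowel letters for this illustration.
VOWELS = "aeiouyąęó"


def fix_syllable_breaks(syllabified: str) -> str:
    """Move the syllable break in -cja/-sja/-zja and -istka type endings.

    "c.jV" -> ".cjV" (likewise for s and z), "is.tkV" -> "ist.kV".
    Forms such as "a.cyj" or "ist.ek" contain neither pattern and are
    returned unchanged.
    """
    # Move the break to before the consonant when "j" + vowel follows.
    fixed = re.sub(r"([csz])\.j(?=[%s])" % VOWELS, r".\1j", syllabified)
    # Keep "t" with the preceding syllable when "k" + vowel follows.
    fixed = re.sub(r"is\.tk(?=[%s])" % VOWELS, "ist.k", fixed)
    return fixed


for word in ["stac.ja", "pas.ja", "pia.nis.tka", "sta.cyj"]:
    print(word, "->", fix_syllable_breaks(word))
```

The vowel lookahead is what leaves the genitive plural forms alone, matching the exceptions noted in the list.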

Here's #2, a list of Polish lemmas without etymology, it's huge so let me know if there's anything I can do to filter it to something more manageable. I can probably help with #1. JeffDoozan (talk) 15:09, 24 March 2022 (UTC)[reply]

Oh yeah, I overlooked that. Could we filter out non-lemmas? Also #1 seems the easiest. Vininn126 (talk) 15:16, 24 March 2022 (UTC)[reply]

Okay, I cleaned up the compounds that needed it, I believe we should be ready to make the first change. Is there any way you could do number 4 at the same time? Vininn126 (talk) 19:19, 24 March 2022 (UTC)

Looking closer at #1, are you suggesting that everything should just use {{af}}? I can do that with a bot, but it seems like you're losing some readability, eg {{prefix|es|un|do}}, to me, is more understandable than {{af|es|un-|do}}, even though the latter is easier to type. Admittedly, I don't work on etymologies, but I want to verify that's not a controversial change.
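The conversion being discussed, for example {{prefix|es|un|do}} to {{af|es|un-|do}}, could be sketched along these lines. This is not JeffDoozan's bot code. It is a simplified illustration that assumes the first positional parameter is the language code, leaves named parameters such as t1= in place, and skips any call containing nested templates. The names convert_to_af and KIND are hypothetical.

```python
import re

# Template names to convert, mapped to how hyphens should be added.
KIND = {
    "pre": "prefix", "prefix": "prefix",
    "suf": "suffix", "suffix": "suffix",
    "con": "confix", "confix": "confix",
    "com": "plain", "compound": "plain", "affix": "plain",
}
# Matches only calls without nested braces; others are left untouched.
TEMPLATE = re.compile(r"\{\{(%s)\|([^{}]*)\}\}" % "|".join(KIND))


def _convert(match):
    parts = match.group(2).split("|")
    # Positions of positional parameters; the first one is the language.
    terms = [i for i, p in enumerate(parts) if "=" not in p][1:]
    if len(terms) < 2:
        return match.group(0)  # unexpected shape: leave the call alone
    kind = KIND[match.group(1)]
    before = terms[:-1] if kind == "prefix" else terms[:1] if kind == "confix" else []
    after = terms[1:] if kind == "suffix" else terms[-1:] if kind == "confix" else []
    for i in before:  # prefixes get a trailing hyphen
        if parts[i] and not parts[i].endswith("-"):
            parts[i] += "-"
    for i in after:  # suffixes get a leading hyphen
        if parts[i] and not parts[i].startswith("-"):
            parts[i] = "-" + parts[i]
    return "{{af|" + "|".join(parts) + "}}"


def convert_to_af(wikitext: str) -> str:
    """Rewrite affix-type template calls in `wikitext` to {{af}}."""
    return TEMPLATE.sub(_convert, wikitext)


print(convert_to_af("From {{prefix|es|un|do}}."))
```

A real run would still need manual review for calls the regex skips, and for positional values that themselves contain "=".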

#2: I updated the list to remove the non-lemmas. There's still stuff like amnestiowanie which, knowing nothing of Polish, looks like a non-lemma to me, but it's currently classified as a lemma.

#4: there are 13,000 senses that contain (, which means there are going to be too many corner cases to just blindly convert them to {{gl}}. I should be able to make you a file with all of the affected senses that you can edit in your favorite text editor that we can then feed back to the bot to fix all of the pages. JeffDoozan (talk) 20:42, 24 March 2022 (UTC)[reply]

The Polish editors all agree about these changes, and in terms of using just {{af}} (of course for compounds or affixes, as opposed to other things), we all agree this should be used. Within that community, this is not controversial.

Verbal nouns are in a weird place, as some dictionaries list them as lemmas, but the vast majority do not. They are sometimes true lemmas, so we can leave them and I should be able to work with this list. Thanks a lot!

For number 4, a list would be fine. Vininn126 (talk) 10:27, 25 March 2022 (UTC)[reply]

Would it also be possible to do that for number 6? That's what I was imagining anyway. Or perhaps like with number 2. Also, would it be possible to do number 7, but skipping pages that have the derived box? Vininn126 (talk) 10:39, 25 March 2022 (UTC)[reply]

Here's the #4 list, you can copy the text to your favorite text editor and then update the page when you're ready.

don't alter the text before the first :
lines that are unchanged will not be changed by the bot
lines that are deleted will not be changed by the bot
If an entry has been edited since 2022-03-20 and the current line no longer matches the original line in this list, it will not be overwritten automatically. I'll manually review them to merge the changes.
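
The rules above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function names, the assumption that the text before the first : uniquely identifies a line, and the `current` mapping (the line as it stands on the wiki right now) are all hypothetical.

```python
def parse_list(text):
    """Map each line's key (the text before the first ':') to the rest of the line."""
    entries = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep:  # ignore lines that contain no ':' at all
            entries[key] = rest
    return entries


def plan_edits(original_text, edited_text, current):
    """Return (to_apply, to_review) for an edited copy of the list.

    original_text: the list as first published
    edited_text:   the list after manual editing
    current:       hypothetical mapping of key -> the line as it is on the wiki now
    """
    original = parse_list(original_text)
    edited = parse_list(edited_text)
    to_apply, to_review = {}, []
    for key, old in original.items():
        new = edited.get(key)
        if new is None or new == old:
            continue  # deleted or unchanged lines are left alone
        if current.get(key) != old:
            to_review.append(key)  # entry changed on the wiki since the list was made
        else:
            to_apply[key] = new
    return to_apply, to_review
```

Keeping the key untouched is what lets an edited line be matched back to its original, which is why the text before the first : must not be altered.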

I'll see what I can come up with re #6/7. JeffDoozan (talk) 18:47, 25 March 2022 (UTC)[reply]

Thanks a bunch Jeff! Vininn126 (talk) 19:02, 25 March 2022 (UTC)[reply]

I'm running the bot now and I think you might want to double-check anything with {{taxlink and {{gl|''. If there are any changes you want to make, just update the file again and I'll run another pass. JeffDoozan (talk) 22:39, 25 March 2022 (UTC)[reply]

Okay, I will. Thanks! Let me know when you start the others, and if there's anything I can do to help. Vininn126 (talk) 00:08, 26 March 2022 (UTC)[reply]

It looks like we will probably do col3, not col5. I updated the above list to match my current wishes. Vininn126 (talk) 01:13, 26 March 2022 (UTC) And we probably don't need a template for verbs anymore. Vininn126 (talk) 01:13, 26 March 2022 (UTC)[reply]

@JeffDoozan I have acquired a list for #8. What is the status for #1? Vininn126 (talk) 20:13, 28 March 2022 (UTC)[reply]

{{prefix}} {{suffix}} {{confix}} have been converted to {{af}}. Here's a short list of the pages that use {{compound}}, you'll have to check and convert those manually. JeffDoozan (talk) 11:53, 30 March 2022 (UTC)[reply]
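
For reference, a conversion of this kind might look something like the sketch below. It is not the bot's real code: the function names are made up, it only handles simple calls with no nested templates, and it assumes any parameter containing = is a named parameter. Hyphens are only added where they are missing, so an affix that already carries one does not end up with two.

```python
import re

# Matches simple {{prefix|...}}, {{suffix|...}} and {{confix|...}} calls
# that contain no nested templates.
AFFIX_CALL = re.compile(r"\{\{(prefix|suffix|confix)\|([^{}]*)\}\}")


def hyphen_after(part):
    """Mark a prefix, e.g. 'un' -> 'un-', without doubling an existing hyphen."""
    return part if part.endswith("-") else part + "-"


def hyphen_before(part):
    """Mark a suffix, e.g. 'ek' -> '-ek', without doubling an existing hyphen."""
    return part if part.startswith("-") else "-" + part


def convert_call(match):
    name = match.group(1)
    params = match.group(2).split("|")
    positional = [p for p in params if "=" not in p]
    named = [p for p in params if "=" in p]
    # Leave anything unusual (too few parts, empty parts) for manual review.
    if len(positional) < 3 or "" in positional:
        return match.group(0)
    lang, parts = positional[0], positional[1:]
    if name == "prefix":
        parts = [hyphen_after(p) for p in parts[:-1]] + parts[-1:]
    elif name == "suffix":
        parts = parts[:1] + [hyphen_before(p) for p in parts[1:]]
    else:  # confix: first part is a prefix, last part is a suffix
        parts = [hyphen_after(parts[0])] + parts[1:-1] + [hyphen_before(parts[-1])]
    return "{{af|" + "|".join([lang] + parts + named) + "}}"


def convert_affix_templates(wikitext):
    """Rewrite every simple affix template call in the text as {{af}}."""
    return AFFIX_CALL.sub(convert_call, wikitext)
```

With this sketch, {{prefix|es|un|do}} becomes {{af|es|un-|do}}, while {{compound}} calls are not matched at all and stay as they are.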

Thanks, I noticed. There's also a little more cleanup I'll have to do manually. If you'd be willing to help finally with #7 I'd be very grateful and temporarily satisfied. A list of words to skip can be found here, but the bot should in theory be able to absorb the {{q}}, {{qualifier}} or {{s}} in the derived/related terms as titles for col3 boxes. Also I've found inconsistency with the headers - sometimes a der/rel section is duplicated (and should be merged), and sometimes Rel is above der and should be moved. The other changes on my list can probably wait, but I would be very grateful if you helped with this last one~(for now lol). Vininn126 (talk) 11:57, 30 March 2022 (UTC)[reply]

@JeffDoozan Also I found one small problem, a few of the af's have double hyphens, could we get that down to one? Vininn126 (talk) 13:37, 30 March 2022 (UTC)[reply]

@Vininn126 Is this what you're looking for with #7? JeffDoozan (talk) 17:54, 30 March 2022 (UTC)[reply]

@JeffDoozan Yes, exactly! Vininn126 (talk) 18:01, 30 March 2022 (UTC)[reply]

Okay, it's running now. Right now it's only going to operate on sections without any unexpected formatting or text or templates. Sections consisting only of lines in the format * {{lb|pl|label}} {{l|pl|word}} will get converted. It will skip any sections with extra formatting (pies), without a title (chaos), with extra template parameters (Amsterdam), or text outside of the {{lb}} (Sierra Leone). JeffDoozan (talk) 18:11, 30 March 2022 (UTC)[reply]
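
As a rough illustration of the rule described above (not the bot's actual implementation), a section could be converted only when every line matches the expected pattern exactly, and skipped otherwise. The function name and the exact {{col3}} output layout here are assumptions.

```python
import re

# A line must be exactly "* {{lb|pl|label}} {{l|pl|word}}" with no extra
# parameters, formatting or surrounding text.
LINE_RE = re.compile(r"\* \{\{lb\|pl\|([^|{}]+)\}\} \{\{l\|pl\|([^|{}]+)\}\}")


def section_to_col3(lines):
    """Return one {{col3}} call per label, or None if the section should be skipped."""
    groups = {}
    for line in lines:
        match = LINE_RE.fullmatch(line.strip())
        if match is None:
            return None  # unexpected formatting: leave the whole section alone
        label, word = match.group(1), match.group(2)
        groups.setdefault(label, []).append(word)
    if not groups:
        return None
    return [
        "{{col3|pl|title=" + label + "|" + "|".join(words) + "}}"
        for label, words in groups.items()
    ]
```

Returning None for the whole section as soon as one line does not fit is the conservative choice: a partial conversion would be harder to spot and clean up than an untouched section.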

Awesome, thank you. Would it be possible to generate a list of words skipped (but have a der/rel section)? Vininn126 (talk) 18:13, 30 March 2022 (UTC)[reply]

User:JeffDoozan/lists/pl/rel_to_col3_skipped, User:JeffDoozan/lists/pl/der_to_col3_skipped JeffDoozan (talk) 19:03, 30 March 2022 (UTC)[reply]

Thanks so much for all your help, Jeff. Vininn126 (talk) 19:16, 30 March 2022 (UTC)[reply]

@JeffDoozan, hey it's adding the col box with a white space after the Derived terms header, and directly on top of the Further reading. Vininn126 (talk) 20:21, 30 March 2022 (UTC)[reply]

Also will words like ewokacja get fully converted? It only did one section. I think we can make your list a lot smaller if we tell the bot to get rid of any g= or double pipes with text in them, we don't do that now. And finally, could we get it to do that on pages without a qualifier, to just put all the terms in one template? Vininn126 (talk) 20:43, 30 March 2022 (UTC)[reply]

The bot converted Derived terms in one pass and is cleaning up Related terms in a second pass, so everything it can fix should be fixed soon. I can strip out the g= params and run it again. Can you clarify what you mean by "double pipes with text in them" and what you want to happen?

I'll run a cleanup later to fix the whitespace and also to fix some other mistakes JeffDoozan (talk) 02:04, 31 March 2022 (UTC)[reply]

Ah, I see, thanks. In your list, there are words like bezprawny skipped: complex template: {{l|pl|bezprawie||lawlessness, anarchy; crime}}, we can get rid of the English translation. Vininn126 (talk) 08:07, 31 March 2022 (UTC)[reply]

Let me know when the bot is done, so I can make sure everything got hit. (It seems related terms with qualifiers haven't been done yet). Vininn126 (talk) 12:46, 31 March 2022 (UTC)[reply]

Here's the list of pages it skipped. JeffDoozan (talk) 18:39, 31 March 2022 (UTC)[reply]

Thanks a ton, Jeff. Were you able to move the col3 box to directly under the rel/der sections, away from the further reading? Vininn126 (talk) 18:47, 31 March 2022 (UTC)[reply]

It looks like most of these just have a bar after the lemma with an alt form. We could delete that. So {{l|pl|troskać|troskać się}} -> {{col3|troskać|title=WHATEVER}}, and also pos=. We could delete those. There's a bunch of old der-boxes that I converted manually. Vininn126 (talk) 19:35, 31 March 2022 (UTC)[reply]

The bot's finished with all of the cleanup tasks. I don't think blindly removing all of the alternate text is an improvement over just leaving it alone, see: możesz mi pomóc. I think you'll have to handle the remaining entries manually. JeffDoozan (talk) 01:55, 1 April 2022 (UTC)[reply]

Ok. Vininn126 (talk) 09:16, 1 April 2022 (UTC)[reply]

The bot damaged the wikicode here: [3] 78.11.223.83 21:23, 21 April 2022 (UTC)[reply]

Dealing with orthographic representations

I have been adding/improving the definitions for some of the medieval sigla that commonly crop up in manuscripts. For example, ꝫ, ꝑ, ꝓ and, most famously, &.

One issue with these is that they often stand for sequences of letters, rather than complete words. For example, ꝑ sometimes means the Latin word per, but it's much more frequently used as part of longer words to stand for the letters "per", "par" or "por". They aren't affixes, either, as you'll commonly come across ꝑs for pars and so on. The only way to define these is orthographically (i.e. saying what letters they stand for). The proper way to enclose orthographic representations is to use angle brackets ⟨ ⟩ (which are not the same as greater than/less than < >). For example, ⟨per⟩, ⟨par⟩, ⟨por⟩.

I'd therefore like to propose two things:

A new {{orthographic}} template, which encloses some sequence of letters in angle brackets. This is very simple to do. Shortcut {{ortho}}.
A modification to Module:form_of that prevents the input being linked to if it's enclosed in angled brackets. The reason for this is so that you don't end up with pointless redlinks when you have an entry like "Scribal abbreviation of ⟨per⟩." I would also suggest removing the bold as well.

Theknightwho (talk) 13:35, 24 March 2022 (UTC)[reply]
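
To make the two proposals concrete, here is a small sketch of the intended behaviour. It is written in Python purely for illustration (the real change would be in a template and in Module:form_of), and every function name in it is hypothetical.

```python
def ortho(text):
    """Wrap a letter sequence in angle brackets, as the proposed template would."""
    return "⟨" + text + "⟩"


def is_orthographic(text):
    """True if the text is already enclosed in angle brackets."""
    return len(text) >= 2 and text.startswith("⟨") and text.endswith("⟩")


def format_form_of(label, target):
    """Format a form-of definition line in wikitext.

    Bracketed letter sequences are shown as plain text; anything else is
    linked and bolded in the usual way.
    """
    if is_orthographic(target):
        return label + " " + target + "."
    return label + " '''[[" + target + "]]'''."
```

So a definition pointing at the word per would still be linked, while one pointing at ⟨per⟩ would not produce a redlink.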

This is a good thing to bring up. To the formatting question, I think the bold is fine, consistent with other 'mostly non-gloss' definitions highlighting the 'substitutable' part so it stands out, whether it's linked or not (as where it'd "link" to the same page, or an entry for an acronym where [[of]] might not be linked in the 'definition' but is still bold), although I see we don't bold George in the "equivalent to" clause of Georg (but maybe we should). Where would {{orthographic}} be used? Inside {{scrib of}}, like {{scrib of|la|{{ortho|per}}}}? Wouldn't it be tidier to just have a form-of template for this kind of form-of, that would add brackets itself? Such a template (or 'type of form-of' that the module recognizes) could also refrain from linking, which seems like it might be a better approach than making the form-of module look for angle brackets and suppress linking inside them, since some entries reference single symbols inside angle brackets like in ⠸⠕ or ray, e caudata, and linking there seems fine (it's not done by a form-of template at the moment, but it could be like "Name of the character ⟨ę⟩").
Do we want to distinguish ꝑ as an abbreviation of "per" the word, which could/should be linked-to, vs ꝑ as an abbreviation of the letter-sequence? I do think it should be made clear whenever something can abbreviate a non-morphemic string even inside a word, since that's not(?) the norm with abbreviations today, and I could've sworn our entries pointed it out in the situations where it is the case, like that @ can be used in abbreviating b@ (bat) or & to abbreviate b& (band) in texting, but our current entries on @ and & actually don't seem to cover such use at all, huh. (⁊ has a usage note.) So, should {{scrib of}} / whatever form-of template we use for this itself (have a parameter/option to) generate wording like "Scribal abbreviation of per whether representing the word per or the sequence within a longer word.", or is it enough to have short wording + usexes like on ꝑ? - -sche (discuss) 21:41, 25 March 2022 (UTC)[reply]

Thanks for this. To address your points in order:

I'm not hugely bothered about bold, but I always assumed it was there to make the link more obvious. Plus by default, a link to the current page will appear unlinked but in bold, so it was to avoid replicating that formatting. Up to you, really, but I just thought Scribal abbreviation of ⟨per⟩ was a bit much.
Yeah, I did envision {{ortho}} being used inside the main template, but I thought there might be other use-cases so felt it deserved its own template. Again - no major feelings either way, though.
I agree that we should be distinguishing the orthographic abbreviation from the word abbreviation, but I thought that was covered by having an entry under "letter" (which maybe should be "siglum", but that's a different conversation) and an entry under "preposition", which is what I've done on ꝑ. We could add something similar to @ etc. I do agree it might be worth signposting it a little more, as angled brackets are a bit obscure, but I'm not sure combined wording like you've suggested would work, as "per" can only refer to the word while "⟨per⟩" refers to the letter sequence.

Theknightwho (talk) 21:58, 25 March 2022 (UTC)[reply]

I just realized hieroglyphic entries like 𓃭 show one way the issue of "X can abbreviate abc either as a word or as a string within a word" has been handled, namely they have separate sense lines for Biliteral phonogram for rw. (when writing a word that contains rw) and Logogram for rw (“lion”). (the word). It also suggests a potentially better header than "letter", namely "symbol". Interestingly, the bold around both instances of rw is apparently "automagically" added by the interplay of {{ngd}} and {{m}}'s competing italics (which I consider fine/desirable).
Anyway, what still needs to be done? Creating {{scrib of}}...? - -sche (discuss) 00:41, 30 May 2022 (UTC)[reply]

Uncertain numerals and Template:cardinalbox (Etruscan)

In Etruscan, the numerals 𐌑𐌀 (śa) and 𐌇𐌖𐌈 (huθ) could be either 4 and 6 or 6 and 4, respectively, although currently 𐌑𐌀 is listed as 4 and 𐌇𐌖𐌈 is listed as nothing. Because the cardinalbox template gives the number before and after the entry, how could you indicate it has two possible positions? Two cardinalboxes would be a good solution, but then which entry do you link to from 3, 5, and 7? If a preference ends up being needed, I will note that 𐌇𐌖𐌈 as 4 and 𐌑𐌀 as 6 is actually supported by all linguistic and archaeological evidence we have, and the reverse is only assumed from a common -- not absolute -- Roman dice pattern. airy—zero (talk) 03:07, 28 March 2022 (UTC)[reply]

You can actually do something like this, if desired. I just realized that's not the right formatting since the box has tally marks in that position, not the word (but you could e.g. list the two tally options if desired). I think two boxes may actually be the clearest way to go, though. 70.172.194.25 05:06, 29 March 2022 (UTC)[reply]

Adding Law French

Hiya - the discussion at the Beer Parlour re Law French seems to have concluded, and I was wondering if someone with the relevant rights could please add it as an etymology-only language to Module:etymology languages/data?

m["xno-law"] = {
    canonicalName = "Law French",
    parent = "xno",
    wikidata_item = 2044323,
}

Theknightwho (talk) 17:10, 28 March 2022 (UTC)[reply]

Formatting issues with Descendants lists

I noticed at least two formatting issues with "Descendants" lists. I believe they are different issues.

1) At pisang#Malay the Descendants is properly formatted (with Afrikaans appearing under Dutch) in Chromium (screenshot) while there's a misalignment with Firefox ESR 91.7 (screenshot).

2) At حق#Arabic the list is misaligned (too far left) starting with "Indonesian" (this happens both with Chromium and Firefox). I believe the problem is in the desctree from Persian because حق#Persian also shows a problem (Telugu is too far to the right).

MartinMichlmayr (talk) 04:56, 29 March 2022 (UTC)[reply]

The first issue is due to the multi-column CSS wrapping too early. The {{mid2}} template is supposed to tell the browser where to split the columns, but it doesn't seem to do anything.

The second issue is due to the output of getDescendants() in Module:descendants tree, which is adding </ul> inappropriately. The function parses the list of descendants, formatted using * and :, into HTML-style wikicode. I didn't figure out exactly what caused the problem, but this change to the Persian descendant list seemed to fix it. 70.172.194.25 05:57, 29 March 2022 (UTC)[reply]
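
For anyone looking into this, the general idea of turning a * / : list into nested HTML while keeping the tags balanced can be sketched like this. It is not the code of Module:descendants tree (which is Lua), just an assumed minimal version of the technique in Python, with a made-up function name.

```python
def nested_list_to_html(lines):
    """Convert lines prefixed with * or : into nested <ul> lists.

    The number of leading * / : characters gives the nesting level.
    Every <ul> and <li> that is opened is closed exactly once.
    """
    out = []
    depth = 0
    for line in lines:
        text = line.lstrip("*:")
        level = len(line) - len(text)
        if level == 0:
            continue  # not a list line
        if level > depth:
            # Open new lists; a filler <li> is needed if a level is skipped.
            while depth < level:
                out.append("<ul>")
                depth += 1
                if depth < level:
                    out.append("<li>")
        else:
            out.append("</li>")  # close the previous item
            while depth > level:
                out.append("</ul></li>")  # close a nested list and its parent item
                depth -= 1
        out.append("<li>" + text.strip())
    while depth > 0:
        out.append("</li></ul>")
        depth -= 1
    return "".join(out)
```

The key point is that a closing </ul> is only emitted when the depth counter says a list is actually open at that level, so the output cannot end up with a stray closing tag.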

Update Wiktionary:Statistics

Hi, @Ungoliant MMDCCLXIV used to update Wiktionary:Statistics/generated on a monthly basis but they haven't been active for a while. How could we update this page? A455bcd9 (talk) 17:45, 30 March 2022 (UTC)[reply]

Ungoliant gave me access to the code to generate the stats years ago, but I lost the files… Just sent an email to ask for them again. Then we should try to get it hosted on toolforge so that anyone can regenerate the stats. – Jberkel 18:56, 30 March 2022 (UTC)[reply]

Thanks. If not toolforge, we could at least put the code on GitHub/GitLab :) A455bcd9 (talk) 11:13, 31 March 2022 (UTC)[reply]

Yes, code running on toolforge is required to be available externally anyway. –Jberkel 11:39, 31 March 2022 (UTC)[reply]

Unfortunately I haven't heard back from Ungoliant yet. However, as I've already written a lot of parsing code for the wanted pages project, it'll be relatively straightforward for me to re-build the statistics. It might be a bit tricky to match the exact numbers, without knowing how the old process works internally, but I'll give it a try. Maybe it's also an opportunity to rethink how the stats work, or if we need any additional information from them. – Jberkel 22:08, 5 April 2022 (UTC)[reply]

Ah too bad... The French Wiktionary offers similar stats (I find them even better than the ones on the English Wiktionary actually). @Unsui said they could help if someone wants to implement similar stats in the English Wiktionary. A455bcd9 (talk) 08:06, 6 April 2022 (UTC)[reply]

@Jberkel, A455bcd9, Unsui. Hey guys, sorry I’ve been AWOL for so long. Are you still going to take up the mantle, Jberkel? If not, I’ll start generating them again. — Ungoliant ^(falai) 20:26, 9 April 2022 (UTC)[reply]

@Ungoliant MMDCCLXIV: Thanks! It's not ready yet, I think I should be able to generate the stats for the next month. – Jberkel 22:03, 9 April 2022 (UTC)[reply]

@Ungoliant MMDCCLXIV, I hope everything is okay for you :) Thanks a lot for updating the stats. Did you put the code on GitHub or GitLab by any chance? A455bcd9 (talk) 08:28, 10 April 2022 (UTC)[reply]

Quiet Quentin gadget's quote-book language codes

The Quiet Quentin gadget 1) occasionally labels a text's language wrong, like "en" for an Italian book or vice versa (perhaps this is unavoidable, whether it's getting the data from Google or guessing on its own), and 2) uses "un" for undetermined, which is not a valid code (the gadget should supply "und"). Old examples linked-to here, example from this week here. - -sche (discuss) 23:47, 30 March 2022 (UTC)[reply]

Also if we're fixing this, I'd like for it to automatically be able to delete the spaces after punctuation... Vininn126 (talk) 00:24, 31 March 2022 (UTC)[reply]