Wiktionary:Grease pit

Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and thesaurus and as a website.

The Grease pit is a place to discuss technical issues such as templates, Lua modules, CSS, JavaScript, the MediaWiki software, extensions to it, Toolforge, etc. It is also the second-best place, after the Beer parlour, to think in non-technical ways about how to make the best free, open online dictionary of “all words in all languages”.

Others have understood this page to explain the “how” of things, while the Beer parlour addresses the “why”.

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Find information and helpful links about modules, Lua in general, and the Scribunto extension at WT:LUA.
  • Everyone is encouraged to expand both pages, or to come up with more such stuff. Other known pages with “tips-n-tricks” are to be listed here as well.

October 2023

Chinese Mandarin or Min Nan interwikis with alt forms

Hi. @Theknightwho, @Benwing2, @Erutuon:

無線電無綫電无线电 (zh-min-nan) (bô-sòaⁿ-tiān) should link to 無線電#Min Nan and nan:無線電 just like 無線電无线电 (zh-min-nan) (bô-sòaⁿ-tiān)

Or 無線電無綫電无线电 (zh) (wúxiàndiàn) should link to 無線電#Mandarin and zh:無線電. Compare with 無線電无线电 (zh) (wúxiàndiàn) or 無綫電无线电 (zh) (wúxiàndiàn), which still work fine. I think this used to work, but it has stopped working now. Anatoli T. (обсудить/вклад) 07:01, 4 October 2023 (UTC)

I've mentioned this problem before to TKW - we thought that the most reasonable approach is linking to the first term (if my memory serves correctly). It would be trivial to change Module:translations#L-70 to term = wmlangcode .. ":" .. mw.text.split(target_page, "//")[1], for this behaviour. – wpi (talk) 14:44, 4 October 2023 (UTC)
@Theknightwho, @Benwing2: Any thoughts or updates? Anatoli T. (обсудить/вклад) 23:54, 8 October 2023 (UTC)
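For readers who don't read Lua, the change wpi suggests above can be sketched in Python. This is only an illustration of the "link to the first term" idea; it assumes that alternative forms arrive as a single string joined by "//" (as the quoted Lua line implies), and the function name is made up for the example.

```python
def interwiki_target(wmlangcode: str, target_page: str) -> str:
    """Build an interwiki link target from the first alternative form.

    Assumption: `target_page` holds one or more forms joined by "//",
    mirroring mw.text.split(target_page, "//")[1] in the quoted Lua
    (Lua indexes from 1, Python from 0).
    """
    first_form = target_page.split("//")[0]
    return f"{wmlangcode}:{first_form}"


# A translation listing two traditional forms links to the first one only.
print(interwiki_target("zh", "無線電//無綫電"))
```

A single-form page name passes through unchanged, so the same code path would cover entries without alternative forms.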

@Benwing2 I've noticed that for some weeks now, when you specify the accent of an audio link, the audio file doesn't appear on the entry. Rodrigo5260 (talk) 23:59, 4 October 2023 (UTC)

Rhyme tool broken on Rhymes:English/ɪtʃ

When I try to add Chich, it says: "ERROR:TypeError: Cannot read properties of null (reading 'index')". Equinox 01:56, 5 October 2023 (UTC)

@Equinox I noticed that the template used was {{rhyme list start}}, which is a redirect to {{rhyme list begin}}. I switched it to the latter, and it seems to work now (I didn't save the edit, so you can repeat your original attempt to see if it works for you). Chuck Entz (talk) 00:20, 9 October 2023 (UTC)
On further investigation, it appears that {{rhyme list start}} was created by @OmegaFallon, who added it to a number of rhyme pages, with similar bad results. I'm guessing that the rhyme-adder gadget is looking for the template name in the wikitext and not finding it. I think we need to orphan and delete the redirect. Chuck Entz (talk) 00:39, 9 October 2023 (UTC)
@Chuck Entz: Thank you for the investigation. I have no interest in any of this template/code bullshit and I hope somebody else will sort it out! Hooray! Equinox 00:59, 9 October 2023 (UTC)
(edit conflict) Never mind. There were only 28 pages, so I did it myself. There's no content dependent on the template that would get obscured in page histories, so I just went ahead and deleted the redirect. It just shows the danger of experimenting without making sure you understand what you're working with, and apparently without testing to see if it would actually work. Chuck Entz (talk) 01:00, 9 October 2023 (UTC)
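To illustrate the failure mode guessed at above (a tool searching the wikitext for one exact template name and getting no match when a redirect name is used), here is a small Python sketch. It is not the gadget's code, which isn't shown in this thread; the regular expressions and the alias list are assumptions made for the example.

```python
import re

# Assumed names: the canonical template plus the redirect discussed above.
KNOWN_NAMES = ("rhyme list begin", "rhyme list start")


def find_strict(wikitext: str):
    """Look only for the canonical name; returns None for a redirect name."""
    return re.search(r"\{\{\s*rhyme list begin\s*[|}]", wikitext)


def find_tolerant(wikitext: str, names=KNOWN_NAMES):
    """Accept any known alias of the template."""
    alternation = "|".join(re.escape(name) for name in names)
    return re.search(r"\{\{\s*(?:" + alternation + r")\s*[|}]", wikitext)


page = "{{rhyme list start}}\n* [[itch]]\n"
print(find_strict(page))            # no match: using .start() on this would fail
print(find_tolerant(page).start())  # position of the opening braces
```

A caller that uses the result of `find_strict` without checking for "no match" fails in much the same way as the reported TypeError. Deleting the redirect, as was done here, avoids the problem without touching the tool.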

Creation of Jaxxon is being disallowed because of "no xx", however, English Wikipedia [1] has uses of this proper noun as [2] a given name and a surname, so it is a valid "xx" term. People such as Harriet Jaxxon [3] or Jaxxon D. Silva [4], Jaxxon R. Roberts [5], Jaxxon Brosnan [6], Jaxxon Tucker [7], Jaxxon Sullivan [8] -- 04:21, 5 October 2023 (UTC)

  Done Created. We get huge amounts of xx vandalism so we do not allow IP addresses to create these. Equinox 08:21, 5 October 2023 (UTC)

Quick add button

This is admittedly likely to be a fairly niche use-case, but I'm looking for a solution to get a quick-add type function for adding single words into userpages (my own), appendices, etc. without having to edit the whole page every time, something along the lines of what we already have in place for translation boxes. Do we already have such a feature or, alternatively, is there in fact a very strong argument for why we would not want such a thing? German Wiktionary seems to employ such a thing in a few different implementations, all of which appear to run on code originally derived from here, English Wiktionary. In general it would be cool to also have something along the lines of the aforementioned translation boxes, where the user can specify the lang code, and optionally include some basic checkboxes for gender, number, etc. I do note that {{trans-top}} is quite clear that it should not be used for anything else. Any thoughts/tips/rotten tomatoes would be welcome. Helrasincke (talk) 13:09, 6 October 2023 (UTC)

You can edit individual sections by giving them headers using =. Compare User:Vininn126/Sandbox. Vininn126 (talk) 13:14, 6 October 2023 (UTC)
@Helrasincke on user pages, you can use {{trans-top}} for whatever you want. Here is an example of someone using it to keep to-do lists. This, that and the other (talk) 09:14, 8 October 2023 (UTC)

New Thai Entries in CAT:E

In the past week, newly created Thai entries have been appearing regularly in CAT:E, apparently due to {{th-x}} not being able to extract the pronunciation information from the entry itself in order to transliterate/romanize the occurrence of the entry name in the sentence. If the new entry does have valid pronunciation information, the page itself displays no module error and a null edit clears the entry from CAT:E. I do notice that the module error reappears in preview if I edit a section rather than the whole page, but goes away on save.

Is this the result of something new on the back end, or is it just a result of @GinGlaep creating the entries with {{th-x}} already included instead of adding them in a later edit?

More to the point: is there a way to change things so this doesn't keep happening? Chuck Entz (talk) 18:48, 6 October 2023 (UTC)

Creation of Galician accelerated conjugations

I'd like to make a request for templates such as {{gl-conj-ar}}, {{gl-conj-er}}, {{gl-conj (ir)}} (among others) to have green links in them. Some conjugations aren't correct and there's pretty much zero support for defective verbs it seems, but plenty of verbs are accurate and links could be created for them. It feels like I could count on my fingers the number of conjugations created for Galician so far. If the tables worked like {{ast-conj-ar}}, that'd be incredible.

Galician has two orthography norms and the paragraph above concerns the official one. I made a table for the reintegrated orthographies by using {{pt-conj}} as a base. They're seriously really close in how they're coded, but the accelerated links, although they are green, don't quite work. I'm thinking that if the reintegrationist Galician tables got some sort of subpage in Module:accel, they'd start functioning. I might be wrong though; perhaps a "Module:gl-inflections" would need to be created to match Module:pt-inflection. Or maybe it'd be best to recode how the links are made? I don't know, I'm not knowledgeable enough to do it myself or really understand how it works. I'd seriously appreciate if someone could do it for me though; I'd definitely try to help by giving as many pointers as I can! MedK1 (talk) 13:02, 8 October 2023 (UTC)

@MedK1 You don't actually need to add a subpage to Module:accel for Galician; instead you just need to set the |accel-form= parameter appropriately in the various calls to {{l-self}} in {{gl-conj-table}}, something like this:
(for the first person singular present indicative).
I also see you created Module:gl-reinteg-verb. What state is it currently in (i.e. how well does it work)? Probably we should do the same for the standard Galician orthography, in place of the various templates. The code in Module:pt-verb (which you based Module:gl-reinteg-verb on) generates accelerated entries that use {{pt-verb form of}} to autogenerate the correct inflections; that's what Module:pt-inflections does. You could do the same for Galician, or you could change the module to have it directly generate the appropriate forms in accelerated entry by changing a few lines (the support for this is already present in the underlying Module:inflection utilities). Probably the former approach is better if you can get it to work. Benwing2 (talk) 00:02, 9 October 2023 (UTC)
@Benwing2 About Module:gl-reinteg-verb's current state, there are a few kinks under the hood that might need fixing; it still mentions Module:pt-common (gl-common isn't a thing), I'm unsure if it's coded 'optimally' per se (a more skilled coder could definitely save some bits) and I didn't touch clitic support at all (dunno how I'd even begin doing that considering I've seen no Portuguese tables with clitics in them either).
That aside, it's working perfectly from what I see. I can't think of any single verb the module trips up on, and all the problems I managed to find myself have been fixed.
I can't say the same for the standard table though. Having a single module for that too would be great, especially since I've spotted inaccuracies in pages more than a few times now (mostly regarding the imperative and defective verbs). Since the accentuation rules are different, maybe it'd be best to use the Spanish table as a base for it...?
Thanks for the info on how to get it done! I think I can do the l-self fixes in gl-conj-table myself, but the other one seems seriously complicated; it'd be a while until I get time to not just work on it but figure how exactly it functions too... MedK1 (talk) 02:48, 9 October 2023 (UTC)
@MedK1 Good to hear! I think you may be right about starting from the Spanish module, or at least reusing the functions in Module:es-common that take care of accent placement. As for clitics, there are definitely reflexive verbs that use the reflexive support, e.g. esbaldar-se, but I forget whether there is separate non-reflexive clitic support (as there is in the Spanish module); if so it's likely unused. Benwing2 (talk) 04:28, 9 October 2023 (UTC)
@Benwing2 It seems like there's no clitic support for non-reflexives in pt-verb (and therefore not in gl-reinteg-verb either). I've run into a problem with the gl-reinteg-verb module, and I don't think I can fix it myself.
It's about clitic placement; some verbs have multiple forms for a few tenses, and it usually goes 'a unique form' (arrependim-me, for example) followed by 'a form matching Portuguese' (arrependi-me).
  • The ones that match Portuguese should have mesoclisis when applicable (future, conditional)
  • The 'unique' forms for those tenses should have enclisis instead of mesoclisis.
  • The conditional form with -ais (the one that goes to the footnote) should have both.
I have absolutely no idea how to code it. Could you please help me with that? I'll try to figure out the pt-verb form thing in the meantime. MedK1 (talk) 02:54, 12 October 2023 (UTC)
@MedK1 Let me take a look. Benwing2 (talk) 04:21, 12 October 2023 (UTC)
@MedK1 I am working on cleaning up the code. How attached are you to the format of the footnotes that say things like "dixérais exists as well"? Having forms like this in footnotes causes a lot of problems for the underlying code; it would be simpler (and IMO potentially clearer) to include them as additional forms with a footnote saying "less common" or similar. What do you think of that? Benwing2 (talk) 00:22, 14 October 2023 (UTC)
@Benwing Awesome! Thanks so much in advance! I'd actually been considering changing the "exists as well" text to make the wording less repetitive, so I'm definitely not exactly attached to that aspect, but I would prefer that they be kept separate from the other forms somehow; I'm thinking the cells might end up a bit cluttered if these forms were to be included right alongside the other ones... (They're kept separate in the Estraviz dictionary, too)
Although, maybe if the table were wider to compensate? Like, there's some white space to the right of the table columns. If that white space could be reduced to make sure each cell [most of the time] only has one line's worth of content, I think the "less common" footnote could work just fine while still looking aesthetically good...
I'd tried doing it myself, but setting a different width for each column would just break the entire thing for some reason... MedK1 (talk) 01:09, 14 October 2023 (UTC)
@MedK1 I see what you mean about the whitespace. I'm pretty sure if we set the CSS width to 100% that will go away. One thing that would also be possible is to italicize the "less common" forms, as well as include a footnote. That is what the Portuguese verb tables do with superseded forms; see averiguar for an example (BTW this table doesn't have the whitespace on the right). Also I am trying to simplify the irregular verb specifications. One thing I'm not sure about is the fourth_foot setting, which seems to be used only for fazer and derivatives. Are there any other verbs like this? If not it should be easier to handle this by just setting some extra overrides for fazer. In addition, the Estraviz tables don't mention most of the short past participles that exist in Portuguese; the only ones I've found so far are frito, preso, pago, ganho, gasto, entregue, expulso. Examples that aren't mentioned are bento, morto, eleito, aceso, suspenso, envolto, revolto, emerso, expresso, distinto, aceite, enxuto, assente, anexo, completo, expulso, findo, limpo, pasmo, pego, solto. It seems many of the latter exist as adjectives but can they be used as past participles (especially in passive constructions)? If not we should clean up the verb entries appropriately. Benwing2 (talk) 03:26, 14 October 2023 (UTC)
Italicizing sounds like a good idea. About fourth_foot, yeah, it's only used for fazer verbs. I used a variable instead of just adding extra overrides due to the footnotes having links in them; trying to make them work the way they do by setting more parameters ended up causing errors related to concatenation. The variable's useless now that the less common forms are going back into the table haha.
Yeah, I noticed the absence of the participles too. I tried contacting them via e-mail to ask for clarification, and I ended up being misunderstood; the answer was (in better words than these, of course) that 'the regular forms are valid too'. Considering how I didn't get told that the short forms aren't valid and how "aceites" as a participle is actually utilized in the "Acerca do Estraviz" page, I'm thinking the irregular forms that aren't marked as 'Brazil' in the pt table are all valid here as well (the ones marked as 'Portugal' are actually closer to 'not-Brazil' since Africa uses them too). MedK1 (talk) 14:56, 14 October 2023 (UTC)
@MedK1 It should be working now, including mesoclisis. Please let me know if you see any errors. I don't think a regular Galician verb conjugation module will be hard, starting from Module:gl-reinteg-verb and incorporating stuff from the Spanish modules as necessary; in particular, a lot of the stuff in Module:gl-reinteg-verb to handle alternative variants ("Portuguese-like", "Galician-like", "less common", etc.) just goes away. One hitch is I can't find any tables of reflexive or cliticized verbs using standard Galician spelling. The only resource I can find that seems to contain more or less error-free tables is the Real Academia Galega, and its reflexive verbs e.g. [9] refer you to the nonreflexive equivalent, which includes only a nonreflexive table, even for verbs that are reflexive-only like this one (arrepentirse). BTW the current {{gl-conj-*}} templates seem to be full of errors; whoever created them didn't really know what they were doing. Benwing2 (talk) 04:55, 15 October 2023 (UTC)
@Benwing2 Thank you so much! The code looks way better now; seriously, thank you again. Looking at the table, however, I can see a few errors.
  • For irregular verbs like fazer/trazer/pôr etc., -che forms for 2nd person singular preterite are gone.
  • For irregular verbs like fazer, there are duplicates for most persons of the preterite, the pluperfect, the subjunctive imperfect and future of the subjunctive cases. One of the duplicates is marked as 'less common', while the other one isn't. The one that's not marked should persist imo, as the duplicate forms are ones matching Portuguese.
  • For pôr, 3p.p. form "póm" is marked as 'less common'. I don't think it should be marked as such since it's a form that matches the one for standard Galician. A similar case happens with "ler"/"crer", where "le-" (instead of "lei-") forms are marked as 'less common' even though they match standard Galician.
There might be a few more items worth noting, but I'm actually running out of free time for today already... I'll come back with a more in-depth response soon!! MedK1 (talk) 13:12, 15 October 2023 (UTC)
@MedK1 Thanks. I marked the le- and cre- forms as less common because they're in the 'Outras Variantes' section of Estraviz; let me know if you still want the footnote gone (e.g. you could argue that since standard Galician has '-ches' in the pret_2s, we should have that or at least not mark '-che' as less common; I take it that Estraviz's tables are based on a different dialect of Galician than the standard spelling). I'll see about fixing the other issues. Benwing2 (talk) 18:43, 15 October 2023 (UTC)
@MedK1: I created a standard Galician verb module. See User:Benwing2/test-gl-conj for various conjugations. It's close to being correct for most or all non-reflexive verbs, but the reflexive support is all messed up because I don't have any good reference on how reflexive verbs work in standard Galician. If you could point me to any resources that have a table of reflexive forms and/or a detailed explanation for how they work, it would be very helpful. Note that there are some reflexive verb tables at sources like Verbix but they seem to have lots of mistakes in them. Benwing2 (talk) 02:36, 16 October 2023 (UTC)
@MedK1 I pushed the module live to Module:gl-verb and created template {{gl-conj}}, and converted all the -ar and -er verbs (except for some irregular ones) that were using the old templates (which have now been deleted). I don't know of any remaining errors in the conjugation tables except for the problem with reflexive verbs (if you try to use the module on a reflexive verb you get an error). Benwing2 (talk) 07:20, 16 October 2023 (UTC)
Still to do is to fix {{gl-verb}} to work like {{pt-verb}}; this should not be hard, nor should it be hard to create {{gl-reinteg-verb}} for the headword of reintegrated verbs. Benwing2 (talk) 07:23, 16 October 2023 (UTC)
Awesome! We're super close to being able to reliably use the accel gadget with Galician now! I actually tried editing gl-verb once, but it was protected... I presume gl-reinteg-verb works much the same way what with having to be protected, right...? So godspeed Benwing, and again, thank you so much for all this help!! MedK1 (talk) 15:34, 16 October 2023 (UTC)
Yeah, I really would prefer that the footnotes be gone. I don't think "-ches" should be in the table as it's not an 'accepted' spelling. As for "-che", well, it's close to "-ches" but it's not quite the same thing, so I actually can see the argument for keeping the footnote on it hmm. Keeping forms like "le-" and "cre-" without the footnote makes it easier on the pages too, as they won't have to be edited with some sort of, say, "(Reintegrationist) less common form of leiamos" sense; it'd match how "-ade" imperative forms are treated as they don't have footnotes on them either. MedK1 (talk) 15:18, 16 October 2023 (UTC)
Upon having read through informational pages on Estraviz and in the AGAL websites, I've changed my mind. The forms that are separated in the Estraviz conjugation tables aren't separated because they're less common (even if sometimes that is the case, as with -ais); indeed, some forms like imperative "-ade", and "cre-" for crer are very common. They're there though because they're less preferred by AGAL in comparison to the other ones. So now I actually believe it's best to keep the footnotes where they are. I'm still unsure about adding them to the imperative "-ade" forms: they're separated in the Estraviz and the AGAL website tables, but they match up with the perfectly-acceptable regular "-des" forms just fine. I couldn't find any mentions of them being less preferable or less acceptable the way I could for actual verb forms like "traer". MedK1 (talk) 14:25, 18 October 2023 (UTC)
Also, for -ais forms specifically, they're dialectal forms, from the eastern parts of Galician (oriental Galician). Still not preferred, though. MedK1 (talk) 16:00, 18 October 2023 (UTC)
@Benwing2 I've got some time now, so I can finally write up about the positioning for clitics in standard Galician. These aren't tables, but they do tell the reader where to put the pronouns so they should work, right? Searching for "colocación dos pronomes en galego" gives you some helpful links, but I think the best one is this one from the Xunta. The positioning works pretty much exactly like in European Portuguese (and therefore just like the table in pt-conj): verb+clitic by default, clitic+verb when preceded by "non" or in a subjunctive sentence (among other situations that aren't relevant for the table; the exact specifications are clitic+verb when preceded by a negative/interrogative adverb or in the subjunctive). Their spelling works like in Spanish where instead of the word being written normally and then the clitic is tacked onto it, 'word+clitic' works as a single unit and that dictates how accents are placed (so "arrepentinme" without the accent, not "arrepentínme").
I did notice the previous Galician templates were weird, for lack of a better word. I'm happy to learn it wasn't just me, and really glad you've remade it. Thanks so much for that! MedK1 (talk) 15:14, 16 October 2023 (UTC)
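The placement rule described above (verb+clitic by default, clitic+verb after "non" or in the subjunctive) can be sketched in Python as follows. This is only an illustration, not code from any module: the function name and the trigger set are assumptions, the set contains only the one trigger word named in the comment, and accent placement on the combined 'word+clitic' unit is deliberately left to the caller.

```python
# Only the trigger explicitly named above; other negative or interrogative
# adverbs would need to be added from a proper reference.
PROCLISIS_TRIGGERS = {"non"}


def place_clitic(verb_form, clitic, preceding_word=None, subjunctive=False):
    """Return the verb with its clitic placed before or after it.

    The caller must pass `verb_form` already spelled as it should appear
    in the result (e.g. "arrepentin" for the enclitic unit "arrepentinme").
    """
    triggered = (
        preceding_word is not None
        and preceding_word.lower() in PROCLISIS_TRIGGERS
    )
    if subjunctive or triggered:
        return f"{clitic} {verb_form}"  # clitic+verb, written separately
    return verb_form + clitic           # verb+clitic, written as one unit


print(place_clitic("arrepentin", "me"))
print(place_clitic("arrepentín", "me", preceding_word="non"))
```

A real table generator would still need the accent rules applied to the whole unit, which is the part Benwing2's edge-case questions are about.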
@MedK1 I'm concerned about edge cases in spelling; "works like Spanish" can hide a lot of edge cases that don't work like Spanish. For example, I gather that 'ui' and 'iu' are normally pronounced in Galician with stress on the first letter, whereas in Spanish the stress goes on the second letter. So Spanish writes argüir and concluir but Galician writes argüír and concluír. There must be other such cases; do you have a reference for this? Benwing2 (talk) 05:08, 18 October 2023 (UTC)
Also for feminines and plurals of nouns and adjectives. I gather, for example, that fácil might have a different masc plural in reintegrated spelling vs. standard spelling, but neither dictionary consistently gives the feminines and plurals of adjectives or the plurals of nouns like túnel. Do you have a reference for this? Benwing2 (talk) 05:21, 18 October 2023 (UTC)
Another such case is with prohibir, which has Spanish forms prohíbo, prohíbes but Galician forms prohibo, prohibes (both standard and reintegrationist). Do you know why? Benwing2 (talk) 05:59, 18 October 2023 (UTC)
In standard Galician, plurals of words ending in -l are the same word with -es tacked on at the end (plus possible accent changes made to make sure the stressed vowel is the same in both words). This is how Galician Wikipedia/Wiktionary both do it and it's also the recommended form by official grammar manuals by the Xunta. However, the same Xunta who says that's how plurals are made has also written "dous túneis".
"Túneles" and "túneis" both have plenty of hits, but since the manual says -les is recommended rather than -is, I'd keep the -is forms as alternative forms. Alternative, not nonstandard, because a) I managed to get it from an official website b) Pretty much every other variation like this is marked as alternative forms; it'd be inconsistent to change it now.
There's no such thing as "fácilas" in Galician from what I've seen. Only "fáciles" and "fáceis" (as mentioned in the paragraph above) exist.
I'll answer the other questions soon; I want to back up my words with sources, but the websites are all unusable on mobile... MedK1 (talk) 12:18, 18 October 2023 (UTC)
For the reintegrated orthography, the only plural for words with -l is -is, as in Portuguese, save for a few specific exceptions like til (Portuguese has the same irregular exceptions).
As for "prohíbo"/"prohibo", I'm thinking it has to do with a difference in interpretation/treatment for the H. If we analyze "H" as a merely etymological grapheme rather than an actual consonant, it'd make sense for the accent to be there in Spanish. Hence, "proibo/prohibo" would be diphthongs. This checks out: see cohibir. As for Galician, H is being treated as an actual consonant, one whose sound/value is / /. Just like "protibo" would be a grave word with TI as its stressed syllable, the same goes for "prohibo" and HI. This is similar to how Portuguese Bahia isn't accentuated; compare baia and baía.
When I mentioned "similar to Spanish", I was talking specifically about the treatment of cliticized words. Regarding -uír, this is common behavior in standard Galician. saír and fluír work the same way. I'm thinking it's because "ai" and "ui" can form diphthongs, while "ae" for example can't, hence traer and not *traér. I don't have a reference per se, but RAG is very consistent concerning this. MedK1 (talk) 13:56, 18 October 2023 (UTC)
About the irregular participles, I was actually able to find a source of their acceptability in reintegrated Galician. As we thought, the missing forms in Estraviz aren't missing intentionally. MedK1 (talk) 14:25, 18 October 2023 (UTC)
@MedK1: Thanks for the references. I have fixed the reflexive handling for the standard norm and added {{gl-verb form of}} and {{gl-reinteg-verb form of}}, which are now used with accelerators (which work). Please make a list of the conjugation table changes you want for reintegrationist verbs. Also note that {{gl-verb}} takes the same param as {{gl-conj}} and in addition can take a |reinteg=1 param. If the latter is specified, both the standard and reintegrationist spellings are generated in the header using the same value of |1=. See conseguir for an example. There's also {{gl-reinteg-verb}} which generates only the reintegrationist spellings in the header; if the value of |1= differs between the standard and reintegrationist norms, you need to write the header this way (e.g. for odiar (to hate)):
{{gl-verb}}<br />
The same thing applies to {{gl-noun}} and {{gl-adj}}, both of which take a |reinteg=1 param but for which there also exist {{gl-reinteg-noun}} and {{gl-reinteg-adj}} variants. Benwing2 (talk) 09:15, 22 October 2023 (UTC)
Now I can go around creating a bunch of verb forms, woohoo! Thank you Benwing!! Here's the list of changes as you asked (there really aren't many, as you've done some excellent work!):
  • Fix accent rules so that -aim, -oim and -uim forms (-Vim in general) don't get changed to -aím, -oím or -uím. -im should work just like -ir when it comes to accentuation.
  • Tweak "less common" so it says something along the lines of "less preferred" or "less recommendable". Some forms there aren't necessarily less common.
  • (this goes for the Portuguese module too) Fix clitic forms so that forms like "arrepender-se-ia" don't get the "arrepender" part bolded just because it's in the "arrepender" page (this form links to "arrependeria" after all).
  • (this one's actually for standard Galician) Seeing how we're including 'less recommendable' forms for reintegrationist Galician, maybe the same should go for standard Galician too and forms like "mundo" as the participle of moer and "túneis" as the plural of "túnel" should be included?
  • (this is also only for standard Galician) "moer" is losing its preceding "m" in some conjugations. What could be causing that? MedK1 (talk) 14:46, 22 October 2023 (UTC)
@Benwing2 Just now I noticed there are some irregular participles missing from both gl-verb and gl-reinteg-verb! See III.9.9. Particípios duplos here. Some (most?) of the forms shown here are in the RAG standard Galician conjugation tables as well, like "comesto" for "comer". MedK1 (talk) 22:41, 22 October 2023 (UTC)
@MedK1: I've fixed issues #1, #2 and #5. I used the wording "less recommended" but I have no issue with "less recommendable" if you prefer that. For #3, the issue is that a form like 'arrepender-se-ia' is linked like [[arrepender]]-[[se]]-[[arrependeria|ia]], and [[arrepender]] occurring on the arrepender page is a "self-link", and self-links by default get converted into unlinked bold. This can be changed so that self-links are allowed (meaning they will remain as links), although I'm not sure if we want this for all self-links or only for the ones occurring as part of mesoclitic forms. What do you think? As for #4, I did intend to include 'mundo' as an alternative past participle of moer because it's listed in the RAG's conjugation tables, but I didn't do it for túnel because I couldn't find any reference to forms like túneis in the RAG's dictionary or grammar sections. My instinct is to only include such forms if they're mentioned by some standard source, otherwise we could end up including all sorts of dialectal variants and it might be difficult to know where to draw the line. The "less recommended" forms are found in Estraviz's tables, and I assume Estraviz can be considered an authoritative source for the reintegrationist norm, so that's why I'm fine with including them. I'll work on the irregular past participles next. Note that the way I tried to handle this for Portuguese was to include all irregular participles of -er and -ir verbs in the module, but require that -ar verbs specify short participles in the {{pt-conj}} call using short_pp:entregue or similar. I did this because there's a large number of short participles of -ar verbs and it seemed it would be less surprising to the user if these are found in the wikicode of the page rather than the module; also, irregular participles of -er and -ir verbs often apply to all derived verbs as well, whereas the irregular -ar participles typically only apply to a single verb. 
It looks like there are rather more -er and -ir verbs with irregular participles listed in your reference, but for the moment I'll stick with the same principle. BTW one question concerning chover: the RAG says this is impersonal and only has third person singular forms, but we currently have a bunch of verb-form entries for forms like chovemos and choviches. I'm not sure whether to keep these and change chover to have a full conjugation, or delete the forms. Estraviz's dictionary gives a full conjugation for chover, but I don't really trust it because it also gives full conjugations for chuviscar, chuvinhar, relampar and other weather verbs that seem unlikely to ever be used outside the third singular. Benwing2 (talk) 01:55, 23 October 2023 (UTC)
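The participle principle described above (irregular participles of -er and -ir verbs live in the module and carry over to derived verbs; short participles of -ar verbs are given in the page's template call) can be sketched in Python. This is not the code of Module:pt-verb: the function name, the table and its two entries (well-known Portuguese participles picked for illustration) are assumptions, and verbs like pôr are ignored.

```python
# Illustrative built-in table for -er/-ir verbs; the real list is longer.
BUILTIN_IRREGULAR = {"escrever": "escrito", "abrir": "aberto"}


def past_participles(infinitive, short_pp=None):
    """Return the list of past participles for a verb.

    `short_pp` stands in for a per-page override such as short_pp:entregue.
    """
    if infinitive.endswith("ar"):
        forms = [infinitive[:-2] + "ado"]   # regular long participle
        if short_pp:
            forms.append(short_pp)          # short form comes from the page
        return forms
    for base, participle in BUILTIN_IRREGULAR.items():
        if infinitive.endswith(base):       # derived verbs inherit the form
            prefix = infinitive[: -len(base)]
            return [prefix + participle]
    return [infinitive[:-2] + "ido"]        # regular -er/-ir participle


print(past_participles("entregar", short_pp="entregue"))
print(past_participles("descrever"))
```

The suffix match is what lets one table entry cover a whole family of derived verbs, which is the reason given above for keeping the -er/-ir forms in the module rather than on each page.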
@Benwing2: For mesoclitic links, I think they should work like [[arrependeria|arrepender]]-[[se]]-[[arrependeria|ia]]. We don't link "arrependerem-se" as [[arrepender]][[arrependerem|em]]-[[se]] nor "renderia" as [[render]][[renderia|ia]] or anything, so I think this take makes sense; the mesoclitic forms are supposed to be just 'normal' future/conditional forms with the -se- shoved in the middle of the word after all.
RAG might not mention túneis (it doesn't mention túneles directly either haha), but considering the Xunta is an official government body, forms that encounter usage in it should probably count well enough, no...? It should at least get a mention, lest it'd be considered a purely reintegrationist word when that's really not the case.
I'm alright with the module verb conjugation principles of course!
About "chover" specifically, according to Estraviz, it's impersonal when it means literally "to rain", but it can have a couple other figurative meanings too (just like English rain actually! See sense #3) and in those cases, it can be conjugated normally. This is how it's treated in Portuguese too (even places that mark it as defective have an asterisk of sorts allowing it for figurative uses), and indeed, Wiktionary's PT module has a full conjugation for it too!
It's a similar story for chuviscar and many other verbs that are defective in standard Galician. In Portuguese, it's allowed mostly for figurative senses. For reintegrated Galician, I believe it's much the same, but AGAL's exact words are "Following the criteria in the Practical Guide of conjugated Galician verbs, we have written down all the verbal tenses for they are possible forms – albeit of rare occurrence – and can serve as a model for other verbs." I'm thinking the reason behind these differences is a matter of philosophy; marking the verbs as defective is to outright prohibit some conjugations no matter the context behind them. MedK1 (talk) 02:03, 25 October 2023 (UTC)
Also... I was creating the conjugated forms for roer just now and I noticed they sort of came out 'duplicated', like in roerdes. Is that intentional? I was imagining that the reinteg-specific template would only be used for terms/spellings that are only found in reintegrated Galician, but this works too! MedK1 (talk) 02:28, 25 October 2023 (UTC)
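
For illustration, the mesoclitic linking proposed above (both halves of the verb pointing at the plain conditional form) could be generated along these lines. This is only a Python sketch, not the module's actual Lua code; the function name and its arguments are hypothetical.

```python
def mesoclitic_links(stem, clitic, ending):
    """Build wikitext for a mesoclitic form such as arrepender-se-ia.

    Hypothetical helper: both verb parts link to the plain form
    (stem + ending), and the clitic links to its own entry.
    """
    plain_form = stem + ending  # "arrepender" + "ia" gives "arrependeria"
    return "[[{0}|{1}]]-[[{2}]]-[[{0}|{3}]]".format(
        plain_form, stem, clitic, ending
    )


print(mesoclitic_links("arrepender", "se", "ia"))
```

Because neither link targets the lemma page itself, no self-link arises on the arrepender entry under this scheme.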

sandbox modules and templates in the mainspace

Just FYI, I am moving mainspace sandbox modules and templates to the userspace of the person who created them as I encounter them, or deleting them if that's not possible (esp. if they're more than 6 months old). As a general rule we should not have sandbox modules and templates in the mainspace for various reasons:

  1. They are essentially litter, and junk up the autocomplete lists.
  2. You can only have one sandbox named '.../sandbox'; you can't have per-user sandboxes using this mechanism.
  3. You can't tell who the sandbox belongs to without looking at the history.
  4. Errors in the sandbox show up in CAT:E (i.e. Category:Pages with module errors) rather than in Category:Pages with module errors/hidden.

Sandbox modules should typically be named e.g. Module:User:Benwing2/category tree/topic cat (sandbox equivalent of Module:category tree/topic cat) and sandbox templates should typically be named e.g. User:Benwing2/topic cat (sandbox equivalent of Template:topic cat; no Template prefix). Benwing2 (talk) 23:50, 8 October 2023 (UTC)
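
As a rough illustration of the naming convention above, here is a small Python sketch that derives the suggested userspace title from a module or template title. The function name is made up for this example, and dropping a trailing "/sandbox" is an assumption of the sketch rather than a stated rule.

```python
def user_sandbox_name(page, user):
    """Return the suggested userspace sandbox title for a page.

    Assumes the convention described above: modules keep the Module:
    prefix and gain a User:<name>/ segment, while templates move to
    User:<name>/ with no Template: prefix. A trailing "/sandbox" is
    dropped (an assumption made for this sketch).
    """
    namespace, _, title = page.partition(":")
    if title.endswith("/sandbox"):
        title = title[: -len("/sandbox")]
    if namespace == "Module":
        return "Module:User:{}/{}".format(user, title)
    if namespace == "Template":
        return "User:{}/{}".format(user, title)
    raise ValueError("expected a Module: or Template: page, got " + page)


print(user_sandbox_name("Module:category tree/topic cat", "Benwing2"))
print(user_sandbox_name("Template:topic cat", "Benwing2"))
```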

Accelerated inflections may create both Verb and Participle headers

You can see this by trying to create the green (accelerated) "got the lead out" link from get the lead out. It seems wrong to create two separate parts of speech here. Equinox 14:07, 10 October 2023 (UTC)

stray comma in Czech verb conj templates

hardly the most important thing around, but i figured i'd post. on padnout#Conjugation and probably many other verbs, there is a stray comma before padší, the active adjective form, in both templates. it jumped right out at me because for some reason it also puts the word in a different CSS class which i noticed because i use custom CSS. For everyone else, it's barely noticeable, but still there ... and it leads me to wonder if there's a missing form that's not being filled in. All the best, Soap 23:56, 10 October 2023 (UTC)

Note if it helps at all ... the special CSS class appears to be <p> ... it's creating an HTML paragraph for some reason, for that form of the verb only, and not any others. Soap 23:57, 10 October 2023 (UTC)
@Soap BTW I have an old half-completed project to redo the Czech verb conjugations using a module, which should fix all the issues. Unfortunately it's in an incomplete state as of now. Benwing2 (talk) 04:56, 15 October 2023 (UTC)

Missing Bulgarian category?

Category:en:Physical quantities is alive and well. When I created the equivalent bg version, there's a warning - Category:bg:Physical quantities. I had assumed that things like science categories are mirrored across languages? At least that's been my experience so far. Chernorizets (talk) 10:07, 11 October 2023 (UTC)

See Wiktionary:Beer parlour/2023/October#non-integrated topical categories and what to do about them. The category needs to be placed into Module:category tree/poscatboiler/data (which I don't have the slightest idea of how it works). Also note that the English category isn't using {{auto cat}}. – wpi (talk) 16:47, 11 October 2023 (UTC)
@Wpi thanks! I'd seen that thread but didn't realize this was one of those categories. Chernorizets (talk) 21:00, 11 October 2023 (UTC)

Automating mass moves from English to Translingual

See WT:TR#Why are stock symbols English?. DCDuring (talk) 15:26, 12 October 2023 (UTC)

@DCDuring This needs to be done semi-manually, but given that there are only 20 entries in question, this won't be hard. However, I notice that obvious symbols like AMZN, AAPL and META are missing. What's the CFI for stock symbols? There's no mention of symbols of any sort in WT:CFI. Benwing2 (talk) 01:57, 13 October 2023 (UTC)
There are many other items in English sections of entries for letters and abbreviations that arguably and, IMHO, obviously belong in Translingual sections. Identification needs to be manual using search, eg, with regexes, but the execution of the move is quite time-consuming. I do things like that using an expanded Clipboard and sometimes the regex editor, but my regex skills are limited and the regex editor has limitations. Plus, I have limited coding skill in general and miss opportunities to use these tools to their full potential. DCDuring (talk) 12:35, 13 October 2023 (UTC)
@DCDuring If you can identify the terms to be moved, I can help with the moves, which are much easier when done semi-manually using regex search and replace in a text editor or with a script. Benwing2 (talk) 00:46, 14 October 2023 (UTC)
The recurring, general problem is moving content between L2 headings, eg, from English to Middle English, from English to Translingual. I suppose the content to be moved is too variable for a really convenient solution. So cut-and-paste it is.
A simpler problem is recategorizing hard-categorized pages. My case is moving the remaining contents of Category:Taxonomic hypernym templates to subcategory Category:Taxonomic hypernym templates (family). The purpose of the subcategories is to simplify rooting out errors. DCDuring (talk) 14:12, 15 October 2023 (UTC)
@DCDuring Moving those templates is easy; as long as you're sure they're all family templates, I can move them en masse. As for moving between L2 headings, yeah that requires some manual effort. But what I typically do is load all the pages to be modified into a single file, edit the file with a text editor, and push the changes all at once. This makes it easy to do regex search-and-replace or other sorts of edits that would be much more painful when you have to do them one by one. This is a bit conceptually similar to using AWB or JWB except that you don't make any actual changes to pages until the end, so if you get partway through and realize you have to go back and do things a bit differently, it's pretty easy to do so. That's why I say I can do these things in a semi-manual fashion fairly easily. Benwing2 (talk) 07:17, 16 October 2023 (UTC)Reply[reply]
Tell me about the efficient ways of downloading and uploading large numbers of entries. I occasionally create useful lists from dumps (though not very efficiently), but I don't get how to upload efficiently. DCDuring (talk) 13:36, 16 October 2023 (UTC)
@DCDuring You will find the source code to my bot scripts here: [10] If you're comfortable running Python scripts, you can use find_regex.py to download pages using various criteria (e.g. all in a specified category, all pages referencing a specified template, additionally optionally only pages that match a given regex). This writes the contents to a single text file, with a line like this before each page:
Page 39 ripiar: -------- begin text --------
and a corresponding line after each page:
-------- end text --------
Generally I follow the following steps:
  1. Use find_regex.py to download a set of pages, maybe like this:
    • python3 find_regex.py --refs 'Template:gl-verb-old' --text --lang Galician > find_regex.gl-verb-old.out.1.orig
    • (this says to download the full text of all pages that contain a reference to the template {{gl-verb-old}}, fetching only the Galician content)
  2. Copy the .orig file to another file that will be modified:
    • cp find_regex.gl-verb-old.out.1.orig find_regex.gl-verb-old.out.1
  3. Edit the latter file (NOT the .orig file), making any changes you want but making sure not to touch the -------- begin text -------- and -------- end text -------- lines.
  4. Push the changes like this:
    • python3 push_find_regex_changes.py --direcfile find_regex.gl-verb-old.out.1 --origfile find_regex.gl-verb-old.out.1.orig --comment "use {{gl-verb}}, {{gl-reinteg-verb}} or {{head|gl|verb}} in place of {{gl-verb-old}} (manually assisted)" --lang Galician --diff --save > push_find_regex_changes.find_regex.gl-verb-old.out.1.out.1.save
    • Here we specify the modified file; the original/unmodified file; a comment saying what was done (I usually add "(manually assisted)" into the comment to indicate that I made this change in this fashion, rather than through an automated script); --lang Galician to change only the Galician section (since we only downloaded the Galician section of each file); --diff to write a unified diff to stdout showing what was changed; and --save to actually save the changes (without this it will show what was done but not make any changes).
It is also possible to use find_regex.py to read from a recent dump file and extract pages matching certain criteria, saving it out to the same format as above. I find this approach is very efficient, and I habitually use it if I have more than 10 or so similar pages to change. Note that the purpose of the .orig file is to track what the state of the page was at the time we downloaded it; if someone else changes it in the meantime, push_find_regex_changes.py will refuse to make a change to that page, so we don't end up overwriting someone else's change.
If this makes sense to you, let me know and I can give you more information (e.g. how to set up my scripts, which only takes a few minutes). Note also that I always make these changes using my bot account. If you don't have a bot or similar account, it might be OK to make them through your regular account (since these are semi-manual changes rather than automated script-based changes), but to be kosher it's probably best to get a bot account. (Such an account can also be used for AWB/JWB changes; e.g. Dan Polansky had such an account, named User:DPMaid.) Benwing2 (talk) 02:53, 18 October 2023 (UTC)
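
To make the file format described above concrete, here is a minimal Python sketch that reads a file of pages delimited by the "begin text" and "end text" lines and reports which pages differ between the .orig copy and the edited copy. It is not taken from the scripts mentioned above, whose internals may differ; the function names and the regular expression are assumptions made for illustration.

```python
import re

# A page starts with e.g. "Page 39 ripiar: -------- begin text --------"
BEGIN_RE = re.compile(r"^Page (\d+) (.*): -------- begin text --------$")
END_LINE = "-------- end text --------"


def parse_pages(text):
    """Map page title -> page text for a file in the format shown above."""
    pages = {}
    title = None
    lines = []
    for line in text.split("\n"):
        match = BEGIN_RE.match(line)
        if match:
            title = match.group(2)
            lines = []
        elif line == END_LINE and title is not None:
            pages[title] = "\n".join(lines)
            title = None
        elif title is not None:
            lines.append(line)
    return pages


def changed_titles(orig_text, edited_text):
    """Return titles whose text differs between the .orig and edited files."""
    orig = parse_pages(orig_text)
    edited = parse_pages(edited_text)
    return [t for t in edited if edited[t] != orig.get(t)]
```

Only the pages returned by changed_titles would need to be saved, which is also why the delimiter lines must be left untouched while editing.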
It more-or-less makes sense to me, but I've never used Python and have used only the tiniest fraction of what Perl can do. My skills are back at grep. Even AWB scares me. It looks like I'd have to ask for favors to get stuff done or rely on cleaning up multiple things in L2 sections when making necessarily manual changes.
But maybe I can make AWB and Python part of an anti-Alzheimer's mental exercise program. DCDuring (talk) 14:13, 18 October 2023 (UTC)

Template:SI-unit

The SI-unit template needs to be updated with the new 2022 metric prefixes: ronna, quetta, ronto, and quecto. Currently, the templates look like this (using grams as an example):
ronna: (metrology) An SI unit of mass equal to 10²⁷ grams. Symbol: Rg
quetta: (metrology) An SI unit of mass equal to 10³⁰ grams. Symbol: Qg
ronto: (metrology) An SI unit of mass equal to 10⁻²⁷ grams. Symbol: rg
quecto: (metrology) An SI unit of mass equal to 10⁻³⁰ grams. Symbol: qg
The exponents should be 10²⁷ for ronna, 10³⁰ for quetta, 10⁻²⁷ for ronto, and 10⁻³⁰ for quecto.
The symbols should be R for ronna, Q for quetta, r for ronto, and q for quecto. Netizen3102 (talk) 21:19, 13 October 2023 (UTC)
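
The requested values can be summarised as a small lookup table. The Python below is only an illustrative sketch; the names NEW_SI_PREFIXES and describe are invented here and are not part of {{SI-unit}}.

```python
# The four prefixes adopted in 2022: name -> (symbol, power of ten).
NEW_SI_PREFIXES = {
    "ronna": ("R", 27),
    "quetta": ("Q", 30),
    "ronto": ("r", -27),
    "quecto": ("q", -30),
}


def describe(prefix):
    """Return a gram-based definition line for one of the new prefixes."""
    symbol, exponent = NEW_SI_PREFIXES[prefix]
    return "An SI unit of mass equal to 10^{} grams. Symbol: {}g".format(
        exponent, symbol
    )


for name in NEW_SI_PREFIXES:
    print(name + ": " + describe(name))
```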

@Netizen3102 Done. Ahiise2 (talk) 05:04, 16 October 2023 (UTC)
Big thanks! Netizen3102 (talk) 05:05, 16 October 2023 (UTC)

Relabeling with auto-cat

Is it possible to change the automatically displayed text in categories using the {{auto cat}} template? Specifically, in Category:English terms coined by Tetsuya T. Fujita I want the name Tetsuya T. Fujita to link to Wikipedia, as he was primarily known as Ted Fujita and it's possible people might not realize this is the same person (at first I thought maybe he had had a son or a father in the same line of work). Soap 07:38, 15 October 2023 (UTC)

@Soap There is currently a handler for all 'LANG terms coined by COINER' that works the same for all coiners. Maybe we should link all coiners to Wikipedia by default; we could add a special case for this name but it doesn't seem scalable as there certainly are other names needing the same treatment. Benwing2 (talk) 07:46, 15 October 2023 (UTC)
I would point out that {{coinage}} auto-links the coiner's name to Wikipedia. I'm not convinced this is particularly desirable behaviour, but that might strengthen the case for adding it to all coinage categories.
The other option is just to manually add {{wikipedia}} or similar to the categories of coiners with Wikipedia articles about themselves. This, that and the other (talk) 09:42, 16 October 2023 (UTC)
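
As a sketch of what "link all coiners to Wikipedia by default" might look like, the Python below extracts the coiner from a category title of the form 'LANG terms coined by COINER' and wraps it in a Wikipedia link. This is not the actual category handler; the regular expression and function name are hypothetical, and it assumes a Wikipedia page or redirect exists under exactly the coiner's name as written in the category.

```python
import re

# Matches titles such as "Category:English terms coined by Tetsuya T. Fujita".
COINED_BY_RE = re.compile(r"^Category:(.+?) terms coined by (.+)$")


def coiner_wikipedia_link(category):
    """Return wikitext linking the coiner to Wikipedia, or None if no match."""
    match = COINED_BY_RE.match(category)
    if not match:
        return None
    coiner = match.group(2)
    return "[[w:{0}|{0}]]".format(coiner)


print(coiner_wikipedia_link("Category:English terms coined by Tetsuya T. Fujita"))
```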

Tooro script

Please add Latn for the script of the Tooro language at Module:languages/data/3/t. Thank you. Please also add Rutooro as an alias for the Tooro language. Ahiise2 (talk) 18:06, 15 October 2023 (UTC)

  Done. - -sche (discuss) 19:42, 16 October 2023 (UTC)
Thank you! Ahiise2 (talk) 20:25, 16 October 2023 (UTC)

der4 (col4)

Can someone please check the Derived terms at English spot? Leasnam (talk) 02:37, 16 October 2023 (UTC)

@Leasnam: [[heaven spot] was all it took to do that. Most people won't notice what's wrong, but all the programmers' eyes are no doubt twitching... Chuck Entz (talk) 02:55, 16 October 2023 (UTC)
Thank you! Leasnam (talk) 03:32, 16 October 2023 (UTC)

Live code inside of Template:attention

I just fixed a couple of module errors caused by templates in the text inside {{attention}}. Back in April, @Theknightwho had included the template syntax they replaced in their comment, and changes to the module meant that syntax now throws a module error. The result was an invisible module error: the entry showed up in CAT:E, but nothing displayed in the entry itself.

Is there any reason we need to execute things inside of a non-displayed comment? Attention templates can sit unnoticed for years and the people who leave them tend to forget about them, which makes them a maintenance headache. It might even cause problems with memory, time and expansion limits on some pages. Chuck Entz (talk) 13:46, 16 October 2023 (UTC)

@Chuck Entz I agree - these should be treated like HTML comments (i.e. left inert). Unfortunately, it’s not possible to transclude <nowiki> tags directly (for reasons I won’t go into here), but it is possible to nowiki-fy text in Lua, which should prevent other templates from processing the output in turn. Actually, ignore that - that wouldn’t solve the issue at hand, since templates are processed from innermost to outermost. I’ll have a think. Theknightwho (talk) 13:51, 16 October 2023 (UTC)

Grouped Recent Changes

Would there be value in having the recent changes of, for example, Category:Slavic languages display recent changes to all pages listed as a Slavic language (or, say, as belonging to other language families)? Is it even possible? Vininn126 (talk) 18:22, 16 October 2023 (UTC)

Spacing at Huelva. For some reason, there's no space in the first line, which reads "A provincein the southwest of Andalusia, Spain". I hate modules. P. Sovjunk (talk) 18:48, 17 October 2023 (UTC)

Be consistent in using the new format with <<..>> or the old format with multiple params. Don't mix them. Benwing2 (talk) 01:13, 18 October 2023 (UTC)

to syntax highlight or not to syntax highlight? that is the mystery ...

Some pages like Module:it-verb and Module:ar-verb don't have syntax highlighting, but others like Module:be-verb and Module:uk-verb do. There seems to be absolutely no pattern whatsoever as to why some Lua modules are correctly syntax highlighted and others aren't. Does anyone (e.g. User:This, that and the other, User:Erutuon, User:Theknightwho) have any ideas why this might be happening? It doesn't seem obviously related to the content of the module, to the name of the module, to whether there's a doc page or to which categories the module is in. Maybe somehow it's connected to how recently the last change was, but that seems very strange if so. Benwing2 (talk) 02:35, 18 October 2023 (UTC)Reply[reply]

@Benwing2 I think it’s to do with some kind of page size or line length limit, since I’ve noticed this only tends to happen with very large modules. Theknightwho (talk) 03:43, 18 October 2023 (UTC)
@Benwing2 the Mediawiki SyntaxHighlight extension won't highlight anything larger than 102,400 bytes = 100 KB, or longer than 1000 lines. This, that and the other (talk) 04:26, 18 October 2023 (UTC)
Well, I can see that Module:be-verb is longer than 1000 lines and yet it is still highlighted. So perhaps that threshold is ignored or overridden elsewhere. But the bytes criterion does appear to be honoured. This, that and the other (talk) 04:27, 18 October 2023 (UTC)
@This, that and the other Hmmmph. Can we change this configuration value? We have a lot of modules bigger than 1,000 lines, and I don't think it makes sense to force them to be split up just for this reason. Benwing2 (talk) 04:35, 18 October 2023 (UTC)
OK, I checked a bunch of modules, and quite a lot of them are > 1000 lines but < 102,400 bytes, and all are highlighted. But there are still plenty of larger modules > 100,000 bytes, although it's rare for them to be > 250,000 bytes except for a few data modules. Benwing2 (talk) 04:43, 18 October 2023 (UTC)
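
A quick way to predict whether a given module will be highlighted, going only by the byte limit reported above (the line-count threshold did not seem to apply in the cases checked), might look like this in Python. The constant and function names are made up for this sketch, and it assumes the limit is counted on the UTF-8 encoded source.

```python
HIGHLIGHT_MAX_BYTES = 102400  # 100 KB, the figure reported in this thread


def is_highlightable(source):
    """Return True if the module source is within the reported byte limit.

    Assumption: size is measured in UTF-8 bytes, so non-ASCII characters
    count for more than one byte each.
    """
    return len(source.encode("utf-8")) <= HIGHLIGHT_MAX_BYTES


print(is_highlightable("x" * 102400))
print(is_highlightable("x" * 102401))
```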

Italian conjugation table links shouldn't strip accents when the accents are used in normal Italian spelling

Not sure exactly what the best fix is for this or what other pages it affects, but currently the conjugation table at essere links to e when it should link to è. Urszag (talk) 14:33, 18 October 2023 (UTC)

Sarò links correctly there, and so do può and poté at potere, looks like something of è in particular. @Benwing2. Catonif (talk) 15:14, 18 October 2023 (UTC)
@Urszag, Catonif Fixed. The module knows to strip accents in non-final syllables and preserve them in final syllables, but monosyllables need special handling because sometimes the accent is preserved, sometimes it's stripped (cf. dare with do, dai, dà). The default is to strip the accent in monosyllables, so you need to add a special character to preserve the accent, which was wrongly omitted. Benwing2 (talk) 23:16, 18 October 2023 (UTC)
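
The rule described above can be sketched in Python as follows. This is not the module's Lua code: it takes pre-split syllables as input instead of syllabifying the word itself, and it models the "special character" as a hypothetical keep_monosyllable_accent flag.

```python
import unicodedata


def strip_accents(text):
    """Remove combining accent marks, e.g. turning "è" into "e"."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def link_target(syllables, keep_monosyllable_accent=False):
    """Compute the page to link to from an accented form.

    Accents are stripped in non-final syllables and kept in the final
    one; monosyllables are stripped by default unless the (hypothetical)
    flag asks for the accent to be preserved.
    """
    if len(syllables) == 1:
        only = syllables[0]
        return only if keep_monosyllable_accent else strip_accents(only)
    head = "".join(strip_accents(s) for s in syllables[:-1])
    return head + syllables[-1]


print(link_target(["sa", "rò"]))
print(link_target(["è"]))
print(link_target(["è"], keep_monosyllable_accent=True))
```

With the default, a monosyllable such as è loses its accent, which corresponds to the wrong link reported for essere; setting the flag corresponds to the fix.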

Typo in WOTD link

Today's (19 Oct, "honeycomb") word of the day has a typo in a link anchor: [[honeycomb#Noub|honeycomb]], Noub should be Noun. This is in verb sense 1.1. Probably invisible for a logged-out user, but I have the yellow links setting on in my preferences so I noticed it. I tried to fix it but was (understandably) automatically blocked, and the message told me to come here. — Oatco (talk) 15:22, 19 October 2023 (UTC)

Fixed. J3133 (talk) 15:38, 19 October 2023 (UTC)

Cite/quote template title language tagging

It seems the cite and quote templates do not apply language tagging to the title in the output. Is there a reason for this? — SURJECTION / T / C / L / 16:03, 20 October 2023 (UTC)

You mean the content of |lang=, currently just applied to the cited text and for categorization of quotes if |termlang= is not present? The cited work can be part of an anthology with essays in different languages, |chapter= in one language and |title= in another, though the positioning of the text “(in langname)” is misleadingly placed after the title of the work and one has to assume that the language in |lang= would correspond to that of |chapter=.
However since this year quotation templates (but not citation templates) allow the parameter contents to begin with a coloned langcode, Wingerbot changed a lot of these quotes in August such as in أُشْتُرْغَار(ʔušturḡār). Fay Freak (talk) 18:06, 20 October 2023 (UTC)Reply[reply]
Isn't this exactly why |worklang= exists? — SURJECTION / T / C / L / 19:38, 21 October 2023 (UTC)

Serbo-Croatian headword

Something happened to the Serbo-Croatian headword. It shows "?" instead of the gender and no longer shows the Cyrillic/Latin forms. I think it may have to do with @Stujul's edits.

Removing |g= removes the display and Cyrillic forms

Example: Beògrad m (Cyrillic spelling Бео̀град) Anatoli T. (обсудить/вклад) 21:42, 20 October 2023 (UTC)

@Atitarev: Fixed. An IP has removed the Lua-based implementation of {{sh-proper noun}}. — Fenakhay (حيطي · مساهماتي) 22:43, 20 October 2023 (UTC)Reply[reply]
We should really protect a lot of these modules that are widely used and pretty set in stone. Vininn126 (talk) 11:28, 21 October 2023 (UTC)

Malay translation error?

I was editing the page tomorrow, and a message flashed up preventing me saving the page, saying "Warning: Please be careful when editing translations of Malayic languages: we treat them independently, which means that they should each be listed separately in the translation table. For example, Indonesian should not be listed as a variety of Malay, or vice versa.". There's no reason why I should get that warning, as I didn't touch any stupid Malayic languages. Please fix this, and fix every single other bug in the Wiktionary. P. Sovjunk (talk) 18:54, 21 October 2023 (UTC)

@Theknightwho: First of all, this filter is very inefficient- it should eliminate edits that don't include translations and don't include the relevant language names before performing an expensive regex search on the entire new wikitext (and why is this checking the Reconstruction namespace?). That will also eliminate giving warnings to people who aren't doing anything with the translations in question. Having the filter check unnecessary stuff slows down page loading and puts unnecessary extra load on the servers. Remember that execution of a string of ANDed expressions is from left to right and stops at the first one that evaluates to false, so you should have simple conditions that eliminate as many edits as cheaply as possible at the start. While I'm at it, I should mention that an expression with lots of arguments like ccnorm_contains_any is better than a string of individual conditions. Also, I'm not so sure the new wikitext is the right variable, anyway, since you're only interested in what the edit has changed, not in the state of the page as a whole.
I also don't see anything wrong with the translation nesting on that page. Javanese has multiple lines (added in March of 2014), but those seem to be for sublects, not separate languages. Chuck Entz (talk) 20:35, 21 October 2023 (UTC)
@Chuck Entz There's a full explanation of why it's structured in the way it is here, including why it uses the wikitext variable, but obviously we want to eliminate the false positives ASAP. It's not really slowing page loading, though - it takes 0.16ms to run. Theknightwho (talk) 20:44, 21 October 2023 (UTC)
As far as page loading is concerned: don't forget that every single active abuse filter runs on every single page on Wiktionary for every single user action. Tests for things like namespaces, action type and user classes minimize the resources spent on each run, but the abuse filters are in addition to everything else and they can add up. Any time spent looking for Javanese translations in a page like is a completely unnecessary waste, even if it's only a fraction of 1.6ms. Chuck Entz (talk) 22:34, 21 October 2023 (UTC)
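
The ordering point made above (cheap conditions first, expensive regex last, relying on left-to-right short-circuiting of ANDed conditions) can be illustrated with a Python analogy. This is not the actual filter, which is written in the AbuseFilter rule language; the language list, the regular expression and the parameter names are all hypothetical, and which text variable the real filter should test is exactly what is being debated above.

```python
import re

# Illustrative subset only; the real filter covers more language names.
MALAYIC_NAMES = ("Malay", "Indonesian")

# Hypothetical pattern: Indonesian nested directly under a Malay line.
NESTING_RE = re.compile(
    r"^\*\s*Malay:.*\n\*[:*]\s*Indonesian:", re.MULTILINE
)


def should_warn(namespace, added_text, new_wikitext):
    """Evaluate cheap tests first; `and` stops at the first false one."""
    return (
        namespace == 0  # cheapest test: main namespace only
        and any(name in added_text for name in MALAYIC_NAMES)  # plain substring
        and NESTING_RE.search(new_wikitext) is not None  # costly regex last
    )


page = "* Malay: esok\n*: Indonesian: besok"
print(should_warn(0, "* French: demain", page))
print(should_warn(0, "* Malay: esok", page))
```

In the first call the regex is never run, because the added text mentions none of the listed languages.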

Template:l and Template:m fall out of popups

They yield "missing words" in a popup. It would be cool if this could be fixed, but perhaps it is already known to be something that is (1) not feasible to fix or (2) not high enough priority because most users will not see popups anyway? Quercus solaris (talk) 23:10, 21 October 2023 (UTC)

@Quercus solaris Yeah, that gadget is 7,286 lines of JavaScript. Good luck getting someone to fix it :) ... I think User:-sche did some work on this, but I don't know how much or what state the gadget is currently in. Benwing2 (talk) 04:30, 24 October 2023 (UTC)

Custom input on {{prefixsee}}

I sorted all the pedo- words into the three categories for soil, children, and feet. Is there a way to get the {{prefixsee}} template to accept custom input, so that I could feed it pedo- (soil), pedo- (child), and pedo- (foot)? Then we could replace the manually curated ====Derived terms==== headers on the pedo- page with the three automatically generated categories.

If it helps, I can confirm that going to pedo- (soil) and previewing a page with just the {{prefixsee}} template on it pulls down the correct words, and only the correct words. Thanks, Soap 00:07, 22 October 2023 (UTC)

{{prefixsee}} accepts an |id= parameter, see x-. Einstein2 (talk) 00:14, 22 October 2023 (UTC)
OK thank you. I will try to get around to adding it to the documentation later in case someone else finds themselves in the same situation. Soap 00:49, 22 October 2023 (UTC)

Error in "Q" template: "bad argument #1 to 'lc' (string expected, got nil)"

I noticed this error on this revision of illic, which caused the quotation of the pronoun to be blanked out and unviewable except when editing the page. It seems the error shows up only when the fourth parameter (for the location of the quote within the work) is left empty. Ideally, that information would never be absent, but it seems like an error for the entire template to silently (to the reader) fail to display anything if it is left out: I'd imagine the intended behavior is to simply display what information is present. I'm guessing this is caused by some recent edit to an upstream module. Urszag (talk) 16:56, 22 October 2023 (UTC)Reply[reply]

VisualEditor disabled in some namespaces

Wiktionary has VisualEditor enabled in only some of its namespaces, such as Main, User and Help. But since VisualEditor is very useful when making and editing tables (its best asset imho), could we consider enabling it in the Appendix namespace, which tends to have lots of tables? –Vuccala (talk) 17:13, 22 October 2023 (UTC)

Yes, VE is arguably more useful in Appendix space than in the main namespace. If you have enabled VE in your preferences, you can access it in any namespace by manually adding ?veaction=edit to the URL, for example, https://en.wiktionary.org/wiki/Appendix:French%20verbs?veaction=edit. I guess we could file a configuration change request to make the tab available in the Appendix namespace. This, that and the other (talk) 00:37, 23 October 2023 (UTC)
@This, that and the other Thank you, I've now been using ?veaction=edit to access VE whenever I need it on the Appendix. I waited to see if anyone would raise an objection to "I guess we could file a configuration change request to make the tab available in the Appendix namespace."; since there was none, would you be able to initiate that request? –Vuccala (talk) 01:00, 10 November 2023 (UTC)
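
For anyone who wants to generate such links in bulk, here is a small Python sketch that builds the ?veaction=edit URL for a page title. The function name is made up for this example, and only the URL pattern quoted above is assumed.

```python
from urllib.parse import quote


def visual_editor_url(title, host="en.wiktionary.org"):
    """Return the URL that opens the given page in VisualEditor."""
    # Keep ":" and "/" readable, percent-encode spaces and other characters.
    return "https://{}/wiki/{}?veaction=edit".format(
        host, quote(title, safe=":/")
    )


print(visual_editor_url("Appendix:French verbs"))
```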

Lua memory errors and Lua garbage collection

FYI Tim Starling is looking into this: [11]. - -sche (discuss) 08:22, 23 October 2023 (UTC)

Omg, what happened, has Wikipedia now memory problems as well and they remembered us? :) Amazing news. Jberkel 09:10, 23 October 2023 (UTC)
Indeed, this is very good news. Their experiments showed that doing an emergency GC right before running out of memory can save half or so of the memory. Benwing2 (talk) 09:26, 24 October 2023 (UTC)
The limit has been doubled to 100MB and my GC patch is waiting for review. As I wrote elsewhere, I truly did not know about the trouble Wiktionary was having with the Lua memory limit. Nobody told me. I happened to see a screenshot of the issue on social media. -- Tim Starling (WMF) (talk) 22:44, 24 October 2023 (UTC)
Well, thank you very much! I was starting to just accept it as a permanent feature of Wiktionary... I do notice that our longest entries now take a very long time to load, but perhaps that's simply the price we have to pay. Editors of Zulu will be pleased, haha! Andrew Sheedy (talk) 02:58, 25 October 2023 (UTC)
Yes, I also want to say a big thank you @Tim Starling (WMF). In future, would you prefer us to contact you directly if there are perennial issues like this? The only other limit I can think of that's caused me issues is the tiny C stack-size limit (200), but reporting things on the Phabricator often feels quite futile. Theknightwho (talk) 16:19, 25 October 2023 (UTC)
@Theknightwho There are some things you can do to improve visibility of tasks on Phabricator. Tag all relevant software components (e.g. LuaSandbox). Tag all teams that might be responsible for maintenance, since teams typically have triage meetings. Look at the git logs and subscribe developers who you think could do the work. If you have a task which you think is low effort and high reward, you can pitch it to me directly. The C stack size limit (LUAI_MAXCCALLS) is hard-coded in the upstream liblua5.1-0 package, I can't just bump it as part of the next LuaSandbox release. If you think it's worth a few of hours of work per year for an SRE to maintain a fork of the package, you can ask for that in Phabricator, and tag serviceops. Include a detailed rationale. -- Tim Starling (WMF) (talk) 22:13, 25 October 2023 (UTC)Reply[reply]
@Tim Starling (WMF) Thanks for the advice. I was hoping it was simply a matter of bumping up the limit (i.e. low effort, high reward), but if not I'll see if it's worth making a case for it. So far, it's only happened with very complex edge-cases, thankfully. Theknightwho (talk) 23:12, 25 October 2023 (UTC)Reply[reply]
This is amazing news. I came here from Wiktionary:News_for_editors#October_2023 and pages like a seem to be working correctly already. Has this been fully deployed and is entirely functional? —Justin (koavf)TCM 01:07, 28 October 2023 (UTC)Reply[reply]
@Koavf The doubled memory limit has already been applied. I think the fix to do an emergency GC when memory is about to be exhausted is still in progress; this requires some code review and a testing period followed by a new MediaWiki deployment, which AFAIK happens weekly. Benwing2 (talk) 05:53, 28 October 2023 (UTC)Reply[reply]
  • Let's see if we can complicate Modules even more and get back to the 50+ pages with memory errors again. P. Sovjunk (talk) 23:32, 1 November 2023 (UTC)

fans

can Category:en:Fans and Category:en:Fans (people) link to each other? i know i can just edit the category pages, but i haven't seen a single example yet of a page that has {{auto cat}} plus some other text within it, so i get the impression we tend to leave autocat pages alone. someone finding the first category especially might not know that the other exists. thanks, Soap 08:58, 23 October 2023 (UTC)Reply[reply]

All you have to do is find the data module, figure out what to edit, and hope what you do is legal. Easy-peasy. DCDuring (talk) 13:00, 23 October 2023 (UTC)Reply[reply]
There are some pages that have 'manual' disambiguation links added atop autocat, e.g. Category:Louisiana Creole language, Category:Switzerland German, Category:Cyrillic script, Category:el:All topics. Whether it'd be better to keep doing that or to add some kind of disambiguationlink= parameter to auto cat, I don't know. - -sche (discuss) 16:34, 23 October 2023 (UTC)Reply[reply]
Apropos here: I have added see-also links to various categories for "English terms prefixed with X" and "English terms suffixed with X". A typical example is discoverable at Category:English terms suffixed with -algia. I recommend not banning/disallowing these links, because I am convinced that they are useful in certain worthwhile ways. I am not hellbent on the exact method or formatting — the useful connection is the point, however it be expressed. Quercus solaris (talk) 23:05, 23 October 2023 (UTC)Reply[reply]
Yes, I think this should be done from Lua somehow, as the disambiguation would apply to all the language versions of these categories. The obligatory ping to @Benwing2 here. This, that and the other (talk) 04:01, 24 October 2023 (UTC)Reply[reply]
@This, that and the other Agreed. You can do this using the additional key to place the text after the description, or the preceding key to place the text before the description. Usually it's placed after the description, like this: additional = "{{also|Category:{{{langcode}}}:Weather}}" (an actual example taken from the 'Meteorology' category; see Module:category tree/topic cat/data/Sciences). (This is documented for poscat categories in Module:category tree/poscatboiler/data/documentation, but topic categories are currently lacking in proper documentation. It's on my list of things to do to clean up this documentation.) I'm thinking of adding a special key for see-also notes, to simplify the syntax. It's definitely also possible to place see-also links manually in the wikitext of individual categories, but it only really makes sense to do this for one-off categories (like most of the categories mentioned by User:-sche). Note also that there's a separate key topright for right-aligned elements that go above the existing right-aligned stuff, such as Wikipedia boxes. The reason for this is that all right-aligned elements need to be placed inside of a right-aligned HTML div, and if you just stick the Wikipedia box manually in the wikitext, it will go outside this div and typically end up to the left of the obligatory right-aligned elements instead of above them. (Formerly there wasn't a div around the right-aligned elements and it was possible to manually stick Wikipedia boxes into the wikitext, but this led to some unwanted effects where left-aligned text was getting pushed down below the right-aligned elements.) User:-sche's suggestion of adding a parameter to {{auto cat}} is something I have also thought of doing but it leads to problems for non-one-off (i.e. language-specific) categories, as TTO mentions; it's potentially possible to work around this by placing the params in the umbrella category of the language-specific category and scraping it for calls to {{auto cat}}, but (a) this adds a lot of complexity to the code, and (b) it's fragile. I do do scraping of this nature for dialect categories like Category:Switzerland German; for these categories there are several params to {{auto cat}} that can be given to customize the display (see the documentation of {{auto cat}} for the exact params, and see Category:Texas German for a fairly simple example and Category:Durham University English, Category:Katharevousa or Category:Issime Walser for slightly more complex examples). But the code to do that (in Module:category tree/poscatboiler/data/language varieties) is significantly complex and requires some nasty hacks such as the dialect_parent_cats_to_scrape list at Module:category tree/poscatboiler/data/language varieties#L-221. Benwing2 (talk) 04:27, 24 October 2023 (UTC)
See what I meant: easy-peasy. DCDuring (talk) 12:29, 24 October 2023 (UTC)Reply[reply]
@DCDuring Can you please cut the snark? Theknightwho (talk) 23:14, 25 October 2023 (UTC)Reply[reply]
As soon as there's some response: like documenting things before advancing to the next bright shiny object, like less imperialistic structures, fewer grand schemes. DCDuring (talk) 01:53, 26 October 2023 (UTC)Reply[reply]
@DCDuring People are more likely to do what you want if you stop insulting them, which is probably why you keep being ignored. We get it - you don't like learning how to do things differently, but that doesn't mean you get to be an obstructionist about it unless you start compromising by offering up genuine solutions that keep everyone happy. Last time I tried to get you to do that you flat-out told me nothing would work, so it's starting to feel like the real reason behind all of this is that you don't want to learn new things, and don't give a shit about how much that inconveniences the rest of us, since you've repeatedly said you want to do everything manually. No thanks. Theknightwho (talk) 02:16, 26 October 2023 (UTC)Reply[reply]
@Soap see Cat:Fans now: https://en.wiktionary.org/w/index.php?title=Module:category_tree/topic_cat/data/Technology&diff=prev&oldid=76471325
While we're at it, fan death shouldn't be in a "type" category like Cat:en:Fans, but it is a word connected to fans so it "feels like" it should be in some kind of fan-related category... This, that and the other (talk) 06:58, 24 October 2023 (UTC)Reply[reply]
Thanks ... I've bookmarked the diff in case I need to do something like this on my own, though I'm not comfortable with editing code, so unless it's the exact same situation again I'll still post a quick request here first. As for fan death, I don't work with categories much so I don't know if it would upset things to change the scope of the category. If we must keep the category to be for the tangible object only, then I think that's okay. The fan death page will always be linked from fan and probably from death. Soap 07:05, 24 October 2023 (UTC)
@Soap @This, that and the other User:-sche and I have talked about cleaning up the categories even more and maybe including the class of topic category in the name of the category, so it would then be possible to make a "related to fans" category in addition to a "types of fans" category. But in general, more specific classes like "type" are strongly preferred to "related-to" categories because the latter can become a morass of vaguely related terms and we don't yet have clear criteria for what belongs in such categories. In this particular situation I'd just remove fan death from CAT:en:Fans and make sure it is listed among the ==Derived terms== of fan. Benwing2 (talk) 07:23, 24 October 2023 (UTC)Reply[reply]

Why does Google show "Wikipedia" as page title for our entries?

Try this search: [12]. The top result is our entry, but the page title shown above the snippet just says "Wikipedia", which is neither the true page title, nor accurate (we are not Wikipedia). Why is this? Equinox 15:27, 23 October 2023 (UTC)Reply[reply]

WMF rebranding strategy, phase 1 :) Nah, looks like a Google f*up. phab:T348203. Jberkel 16:00, 23 October 2023 (UTC)Reply[reply]
I've encountered this before, too. My guess is that maybe Google reached Marioesque via a [[:wikt:Marioesque]] link in a Wikipedia article and couldn't figure out that it had been taken to a different website since [[ ]]-style links are normally to other pages on the same site... and it for some reason doesn't notice the different URL? - -sche (discuss) 16:39, 23 October 2023 (UTC)
In my mobile browser it correctly titles the entry Wiktionary, but if I enable 'Desktop site' it says Wikipedia instead. I'm on Android Firefox. –Vuccala (talk) 17:02, 23 October 2023 (UTC)Reply[reply]
I am getting "Wikipedia" in both Laptop and Mobile. Definitely an issue. Geographyinitiative (talk) 23:07, 23 October 2023 (UTC)Reply[reply]
I've noticed this earlier today [but apparently observed 10 days ago]. It seems to be for any word, including function words (like for) not likely to be linked to from WP. DCDuring (talk) 00:14, 24 October 2023 (UTC)Reply[reply]
For reference, this is how it looks for me in mobile view where it's correct. Odd that it seems to change according to user agent string. –Vuccala (talk) 01:36, 24 October 2023 (UTC)


Google Search is evidently using some pretty gross genAI now. A search for "wiki commons" (no quotes) shows top results that are correct in terms of link resolution, but the result that resolves to Wikimedia Commons at commons.wikimedia.org is labeled as "Wikipedia Commons" at "commons.wikipedia.org". The genAI is evidently faithfully reflecting most humans' view that sloppy catachresis is fine for practical purposes informally, but unfortunately it is also currently misapplied as if it were also fine to pollute formal contexts that are traditionally supposed to be technically accurate, such as the labels telling the name of the site. Humans are dancing with the devil in this era by letting genAI ape and greatly amplify all of their most unadmirable mental habits but without appropriate guardrails. Quercus solaris (talk) 18:01, 18 November 2023 (UTC)Reply[reply]

Wasn't there an effort to change the name of the whole shebang to Wikipedia? The rationale was that WP, not WM, was what people know. Most folks' eyes glaze over when I try to explain the basic organization. This reflects what people seem to think. DCDuring (talk) 00:06, 19 November 2023 (UTC)

Various entries use Template:sense, Template:a, Template:q, or manual formatting (and probably other things I haven't seen yet, given that I found those three while editing just a dozen entries) instead of {{lb}} on definition-lines. Is anyone tracking this, e.g. periodically generating lists of such entries to fix? - -sche (discuss) 22:23, 24 October 2023 (UTC)Reply[reply]

@-sche I found about 8500 of them in this fairly cursory search: Wiktionary:Todo/manually crafted labels. Many of these could be "botted" (for instance, {{q|obsolete}} can comfortably be converted to {{lb}}), but a lot of them reflect misuses of context labels - for instance, in the Akan "transportation" entries, the context label there is simply not necessary - while others are trying to communicate something other than a context label, like many of the Spanish entries on the list. This, that and the other (talk) 23:10, 24 October 2023 (UTC)
Thank you! Yeah, if someone can {{lb}}-ify the ones that are bottable ({{q|obsolete}}, {{q|archaic}}, {{q|dated}}, {{q|by extension}}, (rare), (uncommon), (literal) + (literally), (figurative) + (figuratively), (by extension), what else?), we can get a better overview of what needs to be handled more manually (or with AWB). - -sche (discuss) 14:01, 25 October 2023 (UTC)Reply[reply]
I see Benwing2 decided to help out with this! [13] for instance, and [14] for the full list. Awesome stuff as usual. This, that and the other (talk) 08:16, 27 October 2023 (UTC)Reply[reply]
Although what was this about? This, that and the other (talk) 08:17, 27 October 2023 (UTC)Reply[reply]
@This, that and the other I'm not sure why User:Donnanz undid this. I moved all qualifiers involving 'of ...' after the definition, which is where they normally should be. Benwing2 (talk) 08:18, 27 October 2023 (UTC)Reply[reply]
Quite honestly, it looked better before. The bot edit made it look back-to-front. I could have reverted many more, but stopped at one. DonnanZ (talk) 08:31, 27 October 2023 (UTC)Reply[reply]
I'm confused by the way WingerBot rearranged it. I've never seen {{q}} used like that. Maybe {{gloss}}, sure, but even then, not after a full stop... This, that and the other (talk) 11:11, 27 October 2023 (UTC)Reply[reply]
I don't know of any reason why {{q}} can't be used like that. DonnanZ (talk) 12:57, 27 October 2023 (UTC)Reply[reply]
FWIW, there's nothing unusual about {{lb|en|of a|whatever}} before definitions, so changing the titular issue that {{q}} was being used in place of {{lb|en}} would've left the entry in a good state. (T:of a was a label template already in the earliest days when all the labels were in their own separate templates like T:archaic etc, and is preserved in Module:labels/data/qualifiers's "of a" and "of an". insource:"lb en of a" finds 1,500 entries in English that start with "of a", not to speak of ones that have other labels first like "dated, of a".) I've also seen longer texts just made part of the definition, like "# Of a variable in a Meissner equation in which the other variables are the result of Foo transforms: suitable to be Bar transformed". I don't know that I've seen {{q|of a}} after definitions much? So my initial inclination would be to format entries like well-wooded with {{lb|en|of a}}, or even omit the text entirely (what else would "having many trees and shrubs" be taken to apply to, that we need to clarify it doesn't apply to? a well-stocked gardening supply store, I guess?). - -sche (discuss) 14:49, 27 October 2023 (UTC)
@-sche: I think the key here is English vs. other languages. I'm more used to editing non-English entries, where the definitions are short glosses and the "of a X" stuff definitely should be after; cf.
# [[comb]] {{q|of a rooster}}
# [[beard]] {{q|of a goat}}
# etc.

where writing it in the opposite order would read rather strangely:

# {{q|of a rooster}} [[comb]]
# {{q|of a goat}} [[beard]]
# etc.

In general, I think it could be argued both ways for 'of a FOO' tags in English; they are similar to glosses, which go after, but also similar to context labels, which go before. Either way, however, it's unclear to me that it makes sense to use {{lb}} for these tags; the use of {{lb}} doesn't gain anything over {{q}} in these cases since they're neither getting categorized nor linked specially. Benwing2 (talk) 05:47, 28 October 2023 (UTC)Reply[reply]

FWIW, IMO things like that shouldn't be {{qualifier}}s (or labels) at all, they should just be in plain parentheses as part of the definition: "comb (of a rooster)", "beard (of a goat)". (But in the case of things like well-wooded I maintain that "{{lb|en|of an|area}} Having many trees." is a better, and more standard, approach than "Having many trees. {{q|of an area}}".) I appreciate you formatting so many of these mis-templated labels, though! - -sche (discuss) 04:30, 4 November 2023 (UTC)
@-sche @This, that and the other BTW I have done two rounds of cleanup already and have a third one that will be pushed soon, but I'm hitting diminishing returns. There were maybe 15,000-20,000 cases originally (when checking for labels that are formatted using {{qualifier}}, {{sense}}, {{accent}} or any alias, or manually using italics with parens on the inside or outside) and we're down to around 8,000 but they are highly varied in nature and there isn't too much low-hanging fruit remaining. Benwing2 (talk) 05:56, 28 October 2023 (UTC)Reply[reply]
@Benwing2 can you generate a cleanup list to replace Wiktionary:Todo/manually crafted labels? Clearly your search criteria were more extensive than mine. This, that and the other (talk) 06:25, 28 October 2023 (UTC)Reply[reply]
To make a long story short, English entries using {{q}} should have been excluded from the cleanup, if possible. DonnanZ (talk) 09:07, 28 October 2023 (UTC)Reply[reply]
I may revert some more bot edits, the problem I have is picking them out, as my watchlist was flooded at the time with a mixture of English and Norwegian words. The Norwegian ones seem to be OK. DonnanZ (talk) 09:19, 28 October 2023 (UTC)Reply[reply]
(Later) I have identified 48 possible reverts. DonnanZ (talk) 10:56, 28 October 2023 (UTC)Reply[reply]
I disagree about excluding English entries with {{q}} from cleanup in general, since labels like colloquial and obsolete were often wrongly tagged using {{q}} instead of {{lb}} in English, just as in other languages. If you feel like reverting the specifically English changes that specifically moved of ... qualifiers after the definition, I won't object, although please leave such changes alone for non-English entries, as I think it does make the most sense to put "of ..." qualifiers after glosses for non-English terms. Benwing2 (talk) 01:58, 29 October 2023 (UTC)
OK, I have noted in a notebook only English entries, no non-English ones. I will look at each individually in the next few days. DonnanZ (talk) 08:38, 29 October 2023 (UTC)Reply[reply]
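For anyone who wants to experiment with the "bottable" conversions listed above, here is a minimal Python sketch. It is not WingerBot's actual logic. It assumes the qualifier is the first thing on a definition line, that it holds a single label from -sche's list, and that the caller supplies the language code; the names `BOTTABLE`, `LEADING_Q` and `labelify` are made up for illustration. Qualifiers such as "of a rooster", wherever they sit, are deliberately left untouched, since their placement is exactly what is disputed in this thread.

```python
import re

# Labels that -sche listed as safe to convert mechanically.
BOTTABLE = {
    "obsolete", "archaic", "dated", "by extension", "rare", "uncommon",
    "literal", "literally", "figurative", "figuratively",
}

# A definition line ("#", "##", ...) that begins with {{q|...}} or {{qualifier|...}}
# holding exactly one unnamed parameter.
LEADING_Q = re.compile(r"^(#+\s*)\{\{(?:q|qualifier)\|([^{}|]+)\}\}\s*")


def labelify(line: str, langcode: str) -> str:
    """Rewrite a leading bottable qualifier as {{lb|langcode|label}}.

    Lines that do not match, or whose qualifier is not in BOTTABLE,
    are returned unchanged.
    """
    match = LEADING_Q.match(line)
    if match is None:
        return line
    label = match.group(2).strip()
    if label not in BOTTABLE:
        return line
    rest = line[match.end():]
    return (match.group(1) + "{{lb|" + langcode + "|" + label + "}} " + rest).rstrip()


if __name__ == "__main__":
    for sample in (
        "# {{q|obsolete}} A [[horse]].",
        "# [[comb]] {{q|of a rooster}}",
    ):
        print(labelify(sample, "en"))
```

Anything the function leaves alone would still need the manual (or AWB) review discussed above.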

The Wiktionary page only lists the Latin script, but Wikipedia says the N'Ko script (and Arabic??) is used for it as well. MedK1 (talk) 23:01, 24 October 2023 (UTC)Reply[reply]

Regex question; also /y/ in English IPA

I want to find instances of one thing between other things (such as the start and end of a template). For example, I want to find anywhere there is a y inside {{IPA|en|/.../}} or {{IPA|en|[...]}}, because people (copying e.g. Dictionary.com) sometimes misnotate /j/. Or I want to find curly apostrophes inside {{m|twf|...}} or {{l|twf|...}} for this. I am trying to figure out how to write that as a regex string that I can swap out the "start", "thing I'm looking for / middle", and "end" of, so I can search a database dump for various things with AWB.
In the case of the IPA, AFAICT I want to look for \{\{IPA\|en\|(/|\[) (either of the possible "opening" strings I want), followed by any number (including zero!) of any characters other than y or }}, followed by y, followed by any number (including zero) of any characters other than }} (including any other ys), followed by }}, so... if I write \{\{IPA\|en\|(/|\[)[^y}]y[^}]\}\}, will that work (as far as the general question of finding instances of y anywhere between the opening and closing of the template, ignoring for a moment that that particular search would also find uses of y in places within the template I don't care about like qual= or ref=)?
I just tried searching a database dump for that string and got no results, which I suspect means I did something wrong, since it seems unlikely that there are truly no pages with y in {{IPA|en}}. Is there a better way of writing what I want in regex? And is there a way of only finding instances of y in the actual pronunciations and not ref1= etc? (If the answer to that question is "not in regex / AWB", but someone wants to check using another method for any entries that have y in English IPA pronunciations, I appreciate it.) - -sche (discuss) 16:48, 25 October 2023 (UTC)Reply[reply]

Try this search with \{\{IPA\|en\|([^|{}]+=[^|]+\|)*[^|}=]*y. It isn't necessary to do regex matching after y unless you're trying to generate a list of wikicode of the problematic stuff, but I think that would be much more difficult to accomplish considering one has to also match and count the curly braces.
The on-site search engine likes to time out when there are too many wildcards to match so this search (\{\{IPA\|en\|[^}]*y) would also work, with some false positives in the results, though it should be relatively easy to sift through the 100-or-so instances of false positives.
It seems that the use of /y/ in English IPA is mostly in loanwords or dialectal (usually in MLE with the RP equivalent being /uː/), with Tanchangya and Baiyun Ebo and being the only entries that are likely misuses of y. – wpi (talk) 17:47, 25 October 2023 (UTC)Reply[reply]
@-sche Your regex looks mostly OK but you didn't include a * quantifier after [^y}] or after [^}], which is probably why you didn't match anything. Benwing2 (talk) 23:58, 25 October 2023 (UTC)Reply[reply]
Never underestimate the ability of local accents in the US South/Appalachia to come up with vowel sounds and intonation contours that everyone assumes are impossible for English. I vaguely remember decades ago running across an example of a woman from Tennessee pronouncing spoon as something like [spyːn]. Chuck Entz (talk) 04:16, 27 October 2023 (UTC)Reply[reply]
You can use my IPA search engine (only {{IPA}} at the moment) like this. — Eru·tuon 17:24, 26 October 2023 (UTC)Reply[reply]
give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime. DCDuring (talk) 17:55, 26 October 2023 (UTC)Reply[reply]
Oh, that's a useful tool! We could probably (after some discussion about what to standardize on) even use it to find and standardize things like the competing representations of V˞ vs Vɚ vs Vɹ vs Vəɹ vs whatever else (e.g. we currently have aardvark as /ˈɑ˞d.vɑ˞k/ but hard, mark as /hɑɹd/, /mɑɹk/). - -sche (discuss) 15:07, 27 October 2023 (UTC)Reply[reply]
@-sche Ah yes, the sometimes excessive flexibility of IPA strikes again. This reminds me of resurrecting my project to create an {{en-IPA}} template that takes in English respelling and generates IPA; if used everywhere, that would avoid the problem of competing IPA representations and also allow for automatic generation of rhotic vs. non-rhotic representations, etc. Benwing2 (talk) 02:01, 29 October 2023 (UTC)Reply[reply]
Aha! thank you to Wpi and Benwing for the *; adding it makes the search work. I will try my hand at modifying the search to look for other starts and middles (like the twf apostrophes) later. :) - -sche (discuss) 15:07, 27 October 2023 (UTC)Reply[reply]
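The three patterns from this thread can be compared side by side with a small Python sketch. The sample wikitext strings are invented for illustration (they are not taken from real entries), and Python's `re` module may differ in small ways from the regex engines used by AWB or the on-site search.

```python
import re

# -sche's first attempt: exactly one character allowed on each side of "y".
ORIGINAL = re.compile(r"\{\{IPA\|en\|(/|\[)[^y}]y[^}]\}\}")

# The same pattern with the missing * quantifiers added.
FIXED = re.compile(r"\{\{IPA\|en\|(/|\[)[^y}]*y[^}]*\}\}")

# wpi's pattern: skip any leading named parameters, then look for "y"
# inside the first positional parameter only.
WPI = re.compile(r"\{\{IPA\|en\|([^|{}]+=[^|]+\|)*[^|}=]*y")

# Hypothetical test strings.
SAMPLES = {
    "misnotated": "{{IPA|en|/ˈyɛloʊ/}}",                # /y/ written where /j/ was meant
    "correct": "{{IPA|en|/ˈjɛloʊ/}}",                   # no "y" at all
    "y_in_ref": "{{IPA|en|/fuː/|ref=Dictionary.com}}",  # "y" only in a named parameter
}


def hits(pattern: re.Pattern) -> list[str]:
    """Return the names of the samples in which the pattern is found."""
    return sorted(name for name, text in SAMPLES.items() if pattern.search(text))


if __name__ == "__main__":
    for label, pattern in (("original", ORIGINAL), ("fixed", FIXED), ("wpi", WPI)):
        print(label, hits(pattern))
```

On these samples the original pattern finds nothing, the version with `*` finds the misnotated entry but also the one where "y" appears only in `ref=`, and wpi's pattern finds only the misnotated entry. To look for other things (such as the twf apostrophes), swap the literal opening string and the character being searched for.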

In 緊急発進, the correct romanization with accent should be [kìńkyúú háꜜsshìǹ] instead of [kìńkyúú wáꜜsshìǹ]. However, if I set the accent pattern as |acc=6, the output will be correct (ha but not wa). --TongcyDai (talk) 18:16, 25 October 2023 (UTC)Reply[reply]

{{neutral equivalent of}} or something thereof

What would it take to create a template for neuter forms of words? I.e. in Polish you have a few that already exist such as wnuczę, but also "neuteratives" are gaining ground, and I would be able to even attest some, such as "artyszczę". @Benwing2 Would this be difficult to make? Vininn126 (talk) 21:58, 27 October 2023 (UTC)Reply[reply]

"Neutral" by itself seems like too ambiguous a term (before seeing this, I thought it would be about e.g. neutral as opposed to pejorative or positive). Is the situation that these words are neuter in grammatical gender and used as gender-neutral counterparts to gendered pairs of words for human beings? It's not related to grammatical gender, but the category used for words like English "nibling" is Category:English gender-neutral terms.--Urszag (talk) 03:04, 28 October 2023 (UTC)
@Vininn126 Maybe you want {{gender-neutral form of}} or something. Benwing2 (talk) 05:39, 28 October 2023 (UTC)Reply[reply]
It is both grammatically neuter while referring to a gender-neutral or sometimes neuter person. Vininn126 (talk) 09:36, 28 October 2023 (UTC)Reply[reply]
I made an example entry at aktywiszcze. Vininn126 (talk) 12:07, 28 October 2023 (UTC)Reply[reply]
The wording {{gender-neutral form of}} isn't exactly correct, as the neologistic nature of the forms pretty clearly points to a gender identity outside of the binary :p I feel like {{neuter form of}} would be best for this kind of thing. Hythonia (talk) 12:14, 28 October 2023 (UTC)Reply[reply]
having thought on this, agreed. Vininn126 (talk) 12:42, 28 October 2023 (UTC)Reply[reply]
@Vininn126 User:AG202, User:Ultimateria and I had a similar discussion about the use of "neuter" in Spanish to refer to gender-neutral terms; at least in Spanish I don't much like "neuter" in general for people because "neuter" traditionally refers to inanimate objects and Spanish already has forms like ello that are traditionally considered neuter and don't refer to the gender-neutral. The result was I added a new gender gneut = "gender neutral" to capture the emerging new gender in -e. Here it's a bit different because these terms are in fact grammatically neuter, but I still think this needs a bit more thought before just creating a "neuter equivalent of" or whatever. What is the exact semantics here? Are they normally gender-neutral, or specifically non-binary (in which case a nbin = "non-binary" gender might make more sense)? Or either one, depending on the situation? Benwing2 (talk) 01:53, 29 October 2023 (UTC)Reply[reply]
@Benwing2 That's exactly the situation - if you look at the entry you'll get a bit more insight. People are using neuter forms and some people dislike it because it's for objects, but that's exactly what's being used. It's not gender-neutral, it's neuter. I know it's a weird situation or hard to believe, but that's what's going on. They use neuter verb forms and neuter grammar, and in Polish are called neutratywy; "neuteratives". Vininn126 (talk) 09:17, 29 October 2023 (UTC)Reply[reply]
I made {{neuter equivalent of}}. Vininn126 (talk) 13:58, 30 October 2023 (UTC)Reply[reply]
@Benwing2 I think I effed something up - or maybe not, but I can't figure out where I need to go to add the category to the tree and how to turn off the appendix link on the word "neuter". Vininn126 (talk) 14:55, 30 October 2023 (UTC)
Speaking of which, could we add this functionality to {{fr-noun}} as well for entries like frœur? AG202 (talk) 18:40, 30 October 2023 (UTC)Reply[reply]

Can't create user page

I attempted to create a user page but it's been flagged as matching "various specific spammer habits" and won't allow the page to go through. I followed the guidelines from Wiktionary:Usernames_and_user_pages when creating the page so I don't know what happened. It directed me here if I think the flag was an error. This was the attempted text if that's relevant: "

en This user is a native speaker of English.
Search user languages or scripts
UTC-5 This user's time zone is UTC-5 and observes Daylight Saving Time from March to November.

My Wikipedia profile Welcome to my user page." ARZ100 (talk) 03:16, 28 October 2023 (UTC)Reply[reply]

  Done I added the relevant bits. Note that these kinds of flags are typically for users with very low edit counts or who have very new accounts and these restrictions will go away over time. Happy to have you here. —Justin (koavf)TCM 03:36, 28 October 2023 (UTC)
Thank you! ARZ100 (talk) 04:46, 28 October 2023 (UTC)Reply[reply]

"go" topic label

I want to adjust the link of {{lb|...|go}} (go) so it links directly to Etymology 2 via the etymid provided, that is, {{l|en|go|id=game}}. Is this possible?

Moreover, the documentation for the data structure format used at Module:category tree/topic cat/data/Human is found all the way over at Template:topic cat/documentation, with no direct link provided. Sure, I can fix this myself, but I feel like this is the responsibility of those who implemented the functionality, especially as this code is so instrumental to our category system. It's imperative that (a) the documentation is provided in appropriate locations, and (b) those editing the data modules keep the documentation up-to-date - in this case, the special syntax for description that begins with the = sign does not appear to be documented. This, that and the other (talk) 00:39, 30 October 2023 (UTC)Reply[reply]

I've now tried to resolve the documentation situation. However, my question about the "go" topic label still stands. This, that and the other (talk) 01:13, 30 October 2023 (UTC)Reply[reply]
@This, that and the other Thanks for your work on documentation. I am in the process of writing proper documentation, which should be done in a day or two. As for your other question, you are trying to link {{lb|en|go}} to the section of go for the game instead of just the overall entry? I think this should be possible using display = "{{l|en|go|id=game}}" but I'm not completely sure. If not it should be implemented by passing the value of display through mw.getCurrentFrame():preprocess(). Benwing2 (talk) 08:31, 30 October 2023 (UTC)
@Benwing2 this doesn't appear to do anything, no matter what I set display to. I tried label too, with no apparent effect. This, that and the other (talk) 10:12, 30 October 2023 (UTC)Reply[reply]
I suppose that [[go#Etymology 2|go (game)]] is too old-school. DCDuring (talk) 12:40, 30 October 2023 (UTC)Reply[reply]
I use that method in some cases, like etymologies. DonnanZ (talk) 13:56, 30 October 2023 (UTC)Reply[reply]
@This, that and the other I think you must have been editing the topic cat data file. If you edit the label data file where the label go is defined, it works (Module:labels/data/topical). Benwing2 (talk) 19:24, 30 October 2023 (UTC)Reply[reply]
Note also for some reason there are two labels, go (alias Go) and game of Go. We probably should merge them. Benwing2 (talk) 19:27, 30 October 2023 (UTC)Reply[reply]
@Benwing2 that was it! Thanks for the somewhat psychic debugging.
@DCDuring that's a bit dangerous, as someone may decide to rearrange the etymologies at some point so that the game becomes Ety 3 or something. I'd suggest to always use the id parameter in your links, along with the etymid template in the entry concerned. This, that and the other (talk) 22:49, 30 October 2023 (UTC)Reply[reply]

A short form for "unincorporated community"?

I don't see anything in Template:place. Can I suggest "ucomm"? It would save a lot of typing. DonnanZ (talk) 09:57, 30 October 2023 (UTC)Reply[reply]

@Benwing2: Thanks - "ucomm" works well (e.g. Boston), and saves 19 bytes per application, though that may not be a primary concern. "small ucomm" doesn't work, but I didn't expect it to, and it's rare anyway. DonnanZ (talk) 11:01, 31 October 2023 (UTC)Reply[reply]
@Donnanz Hmm, 'small ucomm' ought to work. Something to look into at some point. Benwing2 (talk) 22:29, 31 October 2023 (UTC)Reply[reply]
@Donnanz: I have fixed the issue with "small ucomm". Benwing2 (talk) 06:55, 8 November 2023 (UTC)
@Benwing2: Ah, cheers. I am still discovering useful short forms like "rcomun" (regional county municipality) and "twpmun" (township municipality). DonnanZ (talk) 09:18, 8 November 2023 (UTC)Reply[reply]
@Donnanz They should all be documented under {{place}}. Benwing2 (talk) 09:31, 8 November 2023 (UTC)Reply[reply]
@Benwing2: Yeah, that's where I found them. I could have used them before now! DonnanZ (talk) 09:46, 8 November 2023 (UTC)Reply[reply]

Frequent Errors during Term Searches

For the past couple of weeks I have been receiving error messages multiple times every day when looking up terms on Wiktionary. Is there a lot of site maintenance going on or could it be because of something else? This hasn't happened to me before. Note that I am currently located in Montenegro.
Thanks for any clarification. Vedac13 (talk) 21:50, 30 October 2023 (UTC)Reply[reply]

@Vedac13 I have not seen this. Generally this site maintenance is out of our control (it's performed centrally by MediaWiki), and I'd expect it to be global in its effects, but what you're seeing could also be caused by server overload (depending on the particular error message you're receiving), and that might have local effects. Can you report the exact error message you're getting? If it persists, maybe we can report a Phabricator bug. Benwing2 (talk) 23:10, 30 October 2023 (UTC)Reply[reply]

November 2023

Chinese fallback

I was wondering if this can be applied to Wiktionary: When I'm reading about Mandarin (shuǐ) or Hokkien and want to click the links, I want to be taken to the Chinese (shuǐ) definition right away. Sure, I can just scroll down a bit, but can I (or every other user) be taken straight to the Chinese part instead of scrolling? I think this improves the user experience. This could be done when the term in a Chinese "dialect" is written in Chinese characters rather than Latin script (like Mandarin shuǐ or Hokkien chúi), so that the Chinese definition can serve as a fallback. Ysrael214 (talk) 01:20, 1 November 2023 (UTC)

Lingua Libre: Is it okay to just directly link to .WAV files for pronunciation without any .OGG reuploading gymnastics?

I found some earlier discussion on this topic. The instructions about Lingua Libre in Help:Audio_pronunciations are very short and unclear about what we are supposed to do with the resulting .WAV files. I also see that the {{audio|pl|LL-Q809 (pol)-Poemat-spartaczyć.wav|Audio}} template from spartaczyć conveniently turns into the following information in the kaikki.org's exported json:

{"audio": "LL-Q809 (pol)-Poemat-spartaczyć.wav", "text": "Audio", "ogg_url": "https://upload.wikimedia.org/wikipedia/commons/transcoded/b/b2/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav.ogg", "mp3_url": "https://upload.wikimedia.org/wikipedia/commons/transcoded/b/b2/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav.mp3"}

So the autoconverted .OGG and .MP3 files are nicely referenced from it. Are there no downsides or have I missed something important? Ssvb (talk) 16:46, 1 November 2023 (UTC)Reply[reply]
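For anyone consuming that export, here is a minimal Python sketch of picking a playable URL out of such a record. The field names (`audio`, `ogg_url`, `mp3_url`) are taken from the JSON quoted above; the helper `preferred_audio_url` is a hypothetical name used only for illustration, and the sketch assumes every record has the same shape as this one.

```python
import json

# The record quoted above, as it appears in the kaikki.org export.
RECORD = json.loads("""
{"audio": "LL-Q809 (pol)-Poemat-spartaczyć.wav", "text": "Audio",
 "ogg_url": "https://upload.wikimedia.org/wikipedia/commons/transcoded/b/b2/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav.ogg",
 "mp3_url": "https://upload.wikimedia.org/wikipedia/commons/transcoded/b/b2/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav/LL-Q809_%28pol%29-Poemat-spartaczy%C4%87.wav.mp3"}
""")


def preferred_audio_url(record, keys=("ogg_url", "mp3_url")):
    """Return the first transcoded URL found in the record, or None."""
    for key in keys:
        url = record.get(key)
        if url:
            return url
    return None


print(preferred_audio_url(RECORD))
```

The order of `keys` decides which transcoded format is preferred; a record without either field simply yields `None`.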

BTW, I'm trying to use the audio recording data from Lingua Libre this way. Please let me know if this is considered to be inappropriate. Ssvb (talk) 16:51, 1 November 2023 (UTC)Reply[reply]
@Ssvb: yes, that's fine. — Sgconlaw (talk) 19:23, 1 November 2023 (UTC)Reply[reply]
@Sgconlaw: Thanks! I can indeed see many other examples of .WAV files usage. Should Help:Audio_pronunciations be updated to make it less misleading? The way it is now, it's probably scaring away many potential pronunciation audio contributors. — Ssvb (talk) 08:59, 5 November 2023 (UTC)Reply[reply]
@Ssvb: please indicate which parts of that page you think require updating. — Sgconlaw (talk) 11:55, 5 November 2023 (UTC)Reply[reply]
@Sgconlaw: Please try to put yourself in the shoes of a new user. You visit some article and see the following message from the {{rfap|en}} template:
Coincidentally, you are a native speaker with a microphone and actually want to help. The link "record some" brings us to the Help:Audio_pronunciations page. What kind of ideas do you get after reading it? The emphasis there is on using the .OGG file format for storage efficiency. It also states "We recommend that you download Audacity" as if it were the current best practice. Then there are walls of text about the file naming conventions and some tips about the cumbersome process of renaming/uploading many files. A small notice about Lingua Libre is not even immediately visible, because it's just sandwiched somewhere in the middle of the page. But let's suppose that you found Lingua Libre and recorded a few hundred pronunciation audio files using it. Now what? Lingua Libre uses .WAV format instead of .OGG and does not follow the prescribed file naming conventions. How do we convert to .OGG format, rename and upload the files? The article mentions that "a bot was made for adding the audios onto different wikis (e.g. Wiktionary in French)" without going into any details. Having no other information, a reasonable assumption is that the automated process of converting/renaming/uploading the resulting .OGG files is available only to the lucky French Wiktionary users via some sort of a bot. But we are in the English Wiktionary! Do we need to download Lingua Libre files one by one, manually rename them to comply with the required naming conventions and upload them manually too? That's the idea that any newcomer would get after reading the current not very helpful Help:Audio_pronunciations page.
My suggestion is to explicitly recommend Lingua Libre as a convenient unified solution, which already takes care of both recording and uploading. Mention that the .WAV files created by it are okay and can be linked directly. Maybe mention User:DerbethBot and what it does. The bits about Audacity surely have some archaeological value and may be kept as a fallback option, but they shouldn't be dumped on the unsuspecting new users the way they are now. Do I need a special permission and consensus to edit Help:Audio_pronunciations myself? — Ssvb (talk) 05:36, 6 November 2023 (UTC)Reply[reply]
@Ssvb I'd encourage you, as someone with recent experience of this process, to consider updating as much of Help:Audio pronunciations as you can. I see that you already made a start, but explaining exactly how to get the pronunciations added to the page after recording them using Lingua Libre seems like a critical detail that we are missing. Honestly our entire Help namespace is in a dire state of outdatedness... please just go ahead and pitch in. This, that and the other (talk) 09:56, 10 November 2023 (UTC)Reply[reply]
@This, that and the other: Done. At least content-wise I have nothing more to add. Now the others are welcome to proofread it for grammar/spelling/formatting issues and maybe give Lingua Libre a test drive to see whether the instructions are comprehensive enough. —Ssvb (talk) 14:21, 10 November 2023 (UTC)Reply[reply]

Wikitable syntax gets mangled when using "Reply" functionality on Discussion pages

Wikitable syntax gets mangled when using the "Reply" functionality on Discussion pages, as a consequence of prepending a colon (:) to every line of 'code'.

Example at Talk:lay#To_lay_and_to_lie.

I'm not sure whether this is already a widely known problem. I'm also uncertain about how to resolve it: my first thought was that a parser could watch for key 'code phrases' to know when to forgo the prefixing of a colon, but I'm not sure that will be robust to all user intentions.

—DIV ( 07:42, 2 November 2023 (UTC))Reply[reply]
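A minimal Python sketch of the problem and of the "watch for key code phrases" idea. It assumes the reply tool simply prefixes every line with colons, and it also assumes that a table whose opening `{|` line is indented with colons still renders as long as the remaining table lines are left unprefixed. Both function names are hypothetical; this is not how the Reply tool is actually implemented.

```python
def naive_indent(reply: str, depth: int = 1) -> str:
    """Prefix every line with colons, as the report above describes."""
    prefix = ":" * depth
    return "\n".join(prefix + line for line in reply.split("\n"))


def table_aware_indent(reply: str, depth: int = 1) -> str:
    """Prefix prose lines and the table opener, but leave table rows alone."""
    prefix = ":" * depth
    out = []
    in_table = False
    for line in reply.split("\n"):
        if in_table:
            out.append(line)              # rows and cells stay unprefixed
            if line.startswith("|}"):
                in_table = False          # table closed, resume prefixing
        elif line.startswith("{|"):
            out.append(prefix + line)     # indent only the opening line
            in_table = True
        else:
            out.append(prefix + line)
    return "\n".join(out)


reply = "See:\n{|\n|-\n| lie || lay\n|}\nThanks"
print(naive_indent(reply))
print(table_aware_indent(reply))
```

As noted in the post, a line-based check like this will not be robust to every user intention (for example, table syntax quoted inside `<nowiki>` or `<pre>`), so it is only a starting point.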

Yes I have seen this when trying to use <pre> in Reply responses. BTW in your table about lie vs. lay I don't think the reflexive use of "lie" is correct; cf. Bridge Over Troubled Water, "I will lay me down"; saying "I will lie me down" sounds quite wrong to me. Benwing2 (talk) 21:03, 2 November 2023 (UTC)Reply[reply]
"Now I lay me down to sleep, / I pray the Lord my soul to keep." On the other hand, "lie myself" sounds wrong to me. Equinox 21:19, 2 November 2023 (UTC)Reply[reply]
Seems like a use of lay ("to put, to place") rather than lie ("to be placed horizontally"). — Sgconlaw (talk) 21:30, 2 November 2023 (UTC)Reply[reply]

Double derived terms

There are plenty of examples where the same term has been added to a table twice, e.g. in Derived terms with {{col-auto}}. Any way we can generate a list of them and/or delete them all? P. Sovjunk (talk) 22:30, 2 November 2023 (UTC)Reply[reply]

Can you give some examples? Benwing2 (talk) 10:03, 3 November 2023 (UTC)Reply[reply]
I don't remember any, so I made my own error! |foolproof appears twice at fool. P. Sovjunk (talk) 10:10, 3 November 2023 (UTC)Reply[reply]
A few appear among multi-word derived terms of many one-word English vernacular names of plants, mammals, birds, fish. Also alternative forms are shown separately in such lists. It is tedious to try to edit the duplicates out manually and group the alternative forms because the edit window does not show them alphabetized. This is one of the things that I hold against automatic alphabetization in column templates. If both removal of duplicates and combining alternative forms on a single line were automated for all content in column templates, that would go a long way toward making autoalphabetization desirable for editors. DCDuring (talk) 13:00, 3 November 2023 (UTC)
@DCDuring How would turning off automatic alphabetisation help with these issues in any way? Theknightwho (talk) 22:23, 3 November 2023 (UTC)Reply[reply]
It would probably double the speed with which I can find duplicates and alt form near-duplicates and make me more willing to do so. It might even cause other contributors to do so. DCDuring (talk) 23:37, 3 November 2023 (UTC)Reply[reply]
@DCDuring So what you're really saying is that we should inconvenience everyone else for the sake of pointlessly sorting the wikicode. No. We can certainly eliminate duplicates automatically, but we don't need to waste everyone's time for the sake of making an unnecessary job a bit more convenient - it would be much more helpful if you spent time doing things that can't be done via automation. Theknightwho (talk) 23:39, 3 November 2023 (UTC)Reply[reply]
But the point is that they have not been done by automation and are not being done by automation. DCDuring (talk) 00:00, 4 November 2023 (UTC)Reply[reply]
@DCDuring The display form is what actually matters. If you need to turn off automatic sorting for yourself, just use sort=0 in a preview. Theknightwho (talk) 00:18, 4 November 2023 (UTC)Reply[reply]
  • Speaking of alphabetization, I've been dumping loads of Derived terms with blatant disregard for alphabetical order. The idea is that one day a bot is gonna 'betize 'em anyway.P. Sovjunk (talk) 22:08, 3 November 2023 (UTC)Reply[reply]
    How hard would it be to alphabetize using a text editor, word-processor, or text-sorting utility before dumping? DCDuring (talk) 23:37, 3 November 2023 (UTC)Reply[reply]
    I don't think anyone else wants what you want here, @DCDuring. Asking everyone to sort everything just for a particular case is somewhat selfish. Vininn126 (talk) 23:50, 3 November 2023 (UTC)Reply[reply]
    What I've asked for all along is that the auto-alphabetizing column templates not be applied in an automated way. IOW, I've asked for less work from others than they insist on doing. DCDuring (talk) 00:00, 4 November 2023 (UTC)Reply[reply]
    @DCDuring Forcing everyone else to manually sort things for your personal convenience is extremely selfish. Most of us don't want to do that, and the convenience issues you raise are very straightforward for you to get around. Theknightwho (talk) 00:49, 4 November 2023 (UTC)Reply[reply]
    @DCDuring It sounds like you actually want auto-alphabetisation, but only if "removal of duplicates and combining alternative forms on a single line were automated" as well? I think any reasonable editor would want those things too in the long run (although maybe I should speak for myself). So I am sure we will get there. Wiktionary is a work in progress on every front, including the technical fronts. This, that and the other (talk) 01:09, 4 November 2023 (UTC)Reply[reply]
The draft template {{derived terms}} removes duplicates, just to prove that it is possible. However, that template is only a draft. On my to-do list is to fix it up so that it can be put into production. This, that and the other (talk) 01:01, 4 November 2023 (UTC)Reply[reply]
@This, that and the other It's something that can certainly be integrated into the main column templates, but the issue is making sure that it can account for various oddities like language code differences, qualifiers etc. Theknightwho (talk) 01:05, 4 November 2023 (UTC)
  • There's probably a way to fiddle with a template and add a parameter to avoid alphabetization in a template, like alpha=no. I certainly can't do it, but I'm sure it's doable. P. Sovjunk (talk) 10:37, 4 November 2023 (UTC)Reply[reply]
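The duplicate removal and optional sorting discussed in this thread can be sketched in a few lines of Python. This is only an illustration on plain strings: the function name `dedupe_terms` and its `sort` flag are hypothetical, and it deliberately ignores the oddities mentioned above (language code prefixes, qualifiers, alternative forms), which the real column templates would have to handle.

```python
def dedupe_terms(terms, sort=True):
    """Drop exact duplicates, keeping the first occurrence of each term.

    With sort=True the result is alphabetized case-insensitively;
    with sort=False the editor's original order is preserved.
    """
    seen = set()
    unique = []
    for term in terms:
        if term not in seen:
            seen.add(term)
            unique.append(term)
    if sort:
        unique.sort(key=str.casefold)
    return unique


derived = ["foolproof", "fool's gold", "foolproof", "April fool"]
print(dedupe_terms(derived))
print(dedupe_terms(derived, sort=False))
```

Passing `sort=False` mirrors the idea of switching alphabetization off for a preview while still removing the duplicate.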

template include size doubles when transcluded??

@Erutuon, Theknightwho Can you help me understand why the template include size more than doubles between Template:label/documentation and Template:label? I added a table to the former that shows all the defined labels. It's rather large at 1,026,442 bytes, but well below the 2M limit. However, it exceeds the 2M limit when transcluded into Template:label. I checked other pages with large documentation tables in them, and e.g. on Template:inflection of/documentation the size is 100,907 bytes, but it increases to 282,012 bytes when transcluded into Template:inflection of. Is there a way to avoid this? Benwing2 (talk) 05:10, 3 November 2023 (UTC)

@Benwing2 Have a look at w:WP:Post-expand include size, which has some info about it. There are multipliers that apply in various situations, so that’s almost certainly what’s going on here. Theknightwho (talk) 05:21, 3 November 2023 (UTC)Reply[reply]
(e/c) I "fixed" this by moving the table directly to the noinclude portion of {{label}}, but it's a nasty hack. Benwing2 (talk) 05:21, 3 November 2023 (UTC)Reply[reply]
@Theknightwho There are no if statements or anything in the transclusion; {{documentation}} directly calls Lua. Apparently this is a known bug from way, way back, marked as "will not fix"; see [15]. Benwing2 (talk) 05:24, 3 November 2023 (UTC)Reply[reply]
@Benwing2 Very frustrating. I guess we could use the template parser which would get around this, but it’s not finished yet. Theknightwho (talk) 05:27, 3 November 2023 (UTC)Reply[reply]
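A back-of-the-envelope check of the numbers in this thread, as a Python sketch. It assumes that the "2M limit" means 2 MiB (2,097,152 bytes) and that the growth factor observed on Template:inflection of would apply unchanged to Template:label; neither assumption is confirmed above.

```python
LIMIT = 2 * 1024 * 1024          # assumed meaning of the "2M limit", in bytes

# Sizes reported for Template:inflection of (documentation vs. transcluded).
doc_size = 100_907
transcluded_size = 282_012
ratio = transcluded_size / doc_size

# Size reported for the table on Template:label/documentation.
label_doc_size = 1_026_442
estimate = label_doc_size * ratio

print(f"observed growth factor: {ratio:.2f}")
print(f"estimated transcluded size: {estimate:,.0f} bytes")
print(f"fits on its own: {label_doc_size < LIMIT}")
print(f"fits when transcluded: {estimate < LIMIT}")
```

Under these assumptions the table fits comfortably on the documentation page itself but overshoots the limit once the observed multiplier is applied, which is consistent with what was reported.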

Bot request: Thesaurus language code tidying

The Thesaurus namespace is stuck in an awful timewarp. One of the most severe issues is that the main templates used, {{ws sense}} and {{ws}}, do not require a language code. This means that thesaurus entries are not properly categorised according to language, and per-language/script text formatting and automatic transliterations are missing on the listed terms.

{{ws header}} takes a |lang= parameter, but this doesn't make sense, as that template occurs once on each page, while there may be multiple L2s for different languages (just as for our regular entries). Over 3500 thesaurus entries do not have this parameter set, so counting the thesaurus entries in a particular language is difficult. Moreover, it seems impossible to truly be sure of how many different languages are represented in the thesaurus without using dumps.

It's clear that the community wants to retain non-English content in this namespace (Wiktionary:Votes/pl-2017-11/Restricting Thesaurus to English), so to assist with categorisation, could I please request a friendly bot owner to help? We would need to add a language parameter (corresponding to the L2) as the first parameter of every occurrence of at least {{ws sense}}, to allow for proper categorisation, and ideally also {{ws}} as well, to allow for automatic transliterations on pages like Thesaurus:അമ്മ. We may as well also remove the lang= parameter of {{ws header}} while we're there. This, that and the other (talk) 11:56, 3 November 2023 (UTC)Reply[reply]

@This, that and the other Given we now have a much larger buffer when it comes to memory issues, it might be worth reconsidering whether we can integrate the thesaurus into the mainspace instead. It's a lot more likely to get attention that way. Theknightwho (talk) 23:03, 3 November 2023 (UTC)
Content-wise, the namespace isn't especially neglected; it gets occasional, but reasonably consistent, contributions, although I doubt many of the edits are patrolled that closely by experienced contributors. It's more the technical side that is in a state of almost complete neglect, probably not helped by the fact that one of the main and most prolific thesaurus contributors was an editor who was notoriously difficult to work with (but is no longer editing Wiktionary). This bot work is necessary to gain a proper understanding of the content that's in the namespace and I'd say it would be premature to consider any suggestions to integrate the content elsewhere before this work is done. This, that and the other (talk) 00:57, 4 November 2023 (UTC)Reply[reply]
@This, that and the other I have a script to do this. Can you create new sandbox versions of {{ws}} and {{ws header}}, where the first takes a mandatory langcode param in |1= and the second doesn't take a langcode param? Once you do that, I will do the following:
  1. Rename {{ws}} to {{ws-old}}, leaving the former name as a redirect.
  2. Do a bot run to replace all uses of {{ws}} with {{ws-old}}.
  3. Replace the definition of {{ws}} with the new sandbox one.
  4. Do a bot run to rename {{ws-old}} back to {{ws}}, in the process adding the lang code in |1= and moving all numbered params up by one.
  5. Delete {{ws-old}}.
  6. In the process, insert the lang code in |1= for {{ws sense}}.
  7. Do a bot run to remove the |lang= param from {{ws header}}.
The reason for this process is because {{ws}} is being changed in an incompatible way and it doesn't appear possible to distinguish the old calling convention from the new one, since the old calling convention allows for a variable number of numbered params. {{ws sense}} seems to already take a language code and handle both new and old calling conventions, so we don't need to do the same thing for it. Benwing2 (talk) 06:55, 6 November 2023 (UTC)Reply[reply]
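The parameter shuffle in step 4 (adding the language code in |1= and moving all numbered parameters up by one) could look roughly like the following Python sketch. It is not the actual bot script: `add_lang_code` and `render` are hypothetical helpers, the parameters are assumed to have already been split on pipes by a proper wikitext parser, and an unnamed value that itself contains "=" is not handled.

```python
def add_lang_code(params, lang):
    """Insert lang as the new first parameter and renumber explicit ones.

    params is a list of raw parameter strings, e.g. ["beer", "2=drink"].
    Unnamed parameters shift implicitly; explicitly numbered ones
    ("2=...") are bumped by one; named ones ("tr=...") are untouched.
    """
    out = [lang]
    for param in params:
        name, sep, value = param.partition("=")
        if sep and name.strip().isdigit():
            out.append(f"{int(name.strip()) + 1}={value}")
        else:
            out.append(param)
    return out


def render(template, params):
    """Rebuild the template call from its name and parameter list."""
    return "{{" + "|".join([template] + params) + "}}"


old = ["beer", "2=drink", "tr=x"]
print(render("ws", add_lang_code(old, "en")))
```

Because the old calling convention allows a variable number of numbered parameters, the old and new forms cannot be told apart by inspection, which is why the temporary rename to {{ws-old}} is used to track which calls have already been converted.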
@Benwing2 thanks for pitching in! I was considering making a vote to give my bot TTObot the flag, but I would be much more likely to make a silly mistake!
It would be possible to potentially reduce the number of edits required at the cost of marginal additional complexity. Consider that the vast majority of thesaurus entries (97% by my reckoning) link only to terms that are longer than two characters. We could implement temporary logic in {{ws}} that treats parameter 1 as a language code if it is two characters, otherwise, as the term to be linked. That way, the only entries needing to be touched by step 1 would be those containing instances of {{ws}} with two-character terms (and three-character language codes, I guess), which would have an optional, temporary |lang= parameter added.
If you think this is worth doing, let me know and I'll code up a version of {{ws}} with extra smarts. Otherwise I'll just do a basic one where all the parameter numbers are increased by 1. This, that and the other (talk) 09:41, 6 November 2023 (UTC)Reply[reply]
@This, that and the other I think this could get complicated because there are longer language codes, e.g. ine-pro and ine-bsl-pro and zlw-ocs and such. So it might not be worth the extra work to save some edits given the much higher possibility of mistakes and the fact that there are only about 3,500 pages that employ {{ws}} on them currently. Benwing2 (talk) 09:54, 6 November 2023 (UTC)Reply[reply]
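To make the objection concrete, here is a small Python sketch of the temporary heuristic proposed above (treat parameter 1 as a language code if it is two characters long). The function name is hypothetical and the check is reduced to its bare essentials.

```python
def looks_like_lang_code(param: str) -> bool:
    """Proposed temporary heuristic: a two-character first parameter is
    treated as a language code, anything else as the term to link."""
    return len(param) == 2


for value in ["en", "ox", "ine-pro", "zlw-ocs"]:
    print(value, looks_like_lang_code(value))
```

A two-letter term such as "ox" is misread as a language code, while longer codes such as ine-pro or zlw-ocs are misread as terms, so both kinds of entry would still need the extra temporary |lang= handling.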
@Benwing2 I doubt there is too much Proto-Indo-European in the thesaurus :) Anyway, sure thing, it would have got pretty complex the way I suggested even without accounting for fancy language codes.
I've made {{ws/new}}, but I don't think any changes are required to {{ws header}}, as the |lang= parameter is optional, so it can just be removed from all the invocations. This, that and the other (talk) 10:15, 6 November 2023 (UTC)Reply[reply]
@This, that and the other Can you add support for |pos= to {{ws sense}}? It was formerly present in {{ws header}}, and I just removed the support but it's used e.g. on Thesaurus:प्रसिद्धि and some other Sanskrit pages. Benwing2 (talk) 03:27, 7 November 2023 (UTC)Reply[reply]
@Benwing2 thanks for your awesome work on this! I wonder what would be the impact of removing the |lang= parameter from {{ws header}}? It seems redundant now that {{ws sense}} is doing categorisation. This, that and the other (talk) 04:29, 7 November 2023 (UTC)Reply[reply]
@This, that and the other Once you add support for |pos= and fix up the few places it's used, the impact should be none. Benwing2 (talk) 04:30, 7 November 2023 (UTC)Reply[reply]
@Benwing2 I'm not really sure that |pos= belongs in {{ws header}}; it is a property of the language section, not of the thesaurus entry itself. The param was only used for categorisation, and since the categorisation logic has moved to {{ws sense}}, I guess the |pos= parameter and associated logic should be moved to that template... although I'm not totally convinced that it is needed at all... This, that and the other (talk) 04:34, 7 November 2023 (UTC)Reply[reply]

Enabling Module:form of/lang-data/ttj

Please add ["ttj"] = true to Module:form of in order to enable the language specific tags for Tooro located at Module:form of/lang-data/ttj. Thank you. Ahiise2 (talk) 16:46, 5 November 2023 (UTC)Reply[reply]

@Ahiise2  Done. — Fenakhay (حيطي · مساهماتي) 17:01, 5 November 2023 (UTC)Reply[reply]
Thank you! Ahiise2 (talk) 20:42, 5 November 2023 (UTC)Reply[reply]

I can't create a sandbox?

The bot is flagging me as a spammer for trying to create a sandbox page. Is it because my previous user page was deleted? Saph668 (talk) 23:04, 6 November 2023 (UTC)Reply[reply]

Not sure. User:Saph668/Sandbox has now been created. Please make sure that your content is on topic. Sandboxes and user pages in general are held to lower scrutiny than the main dictionary, but we don't provide free hosting for just any material. —Justin (koavf)TCM 23:06, 6 November 2023 (UTC)Reply[reply]
Thanks Saph668 (talk) 23:09, 6 November 2023 (UTC)Reply[reply]

Finding the diacritic-stripped link target for a page title

I am trying to fix a problem in {{ws}} that concerns Arabic entries. Thesaurus:أحزن has an antonym, أَبْهَجَ‎(ʔabhaja). This word has a corresponding thesaurus entry, Thesaurus:أبهج, so the [⇒ thesaurus] cross-reference should be shown. However, the {{ws}} template does not recognise the existence of Thesaurus:أبهج because it is looking for Thesaurus:أَبْهَجَ‎, which contains diacritics that are not used in page titles.

I'd like to adjust the parameter to {{#ifexist:}} so that it looks for the correct title, but I cannot find an existing template that will generate this title. Essentially I just want the target of the link generated by {{l}} or {{m}}, not the link itself. It looks like Module:links contains a function export.getLinkPage that would do this, but I don't know enough Lua to make it callable from a template. This, that and the other (talk) 01:21, 8 November 2023 (UTC)Reply[reply]

@This, that and the other You can do it with {{entryname|ar|أَبْهَجَ‎}}, which gives أبهج‎. In all honesty, it's probably easier to rewrite it in Lua, because otherwise you'll keep running into issues like this. Theknightwho (talk) 02:45, 8 November 2023 (UTC)Reply[reply]
@Theknightwho Ah, thanks, exactly what I was after. I've put it in a category so hopefully it can be found more easily by others in the future. This, that and the other (talk) 02:51, 8 November 2023 (UTC)Reply[reply]
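For readers working outside the wiki, the effect of {{entryname}} on this particular Arabic example can be approximated with the Python sketch below. The set of stripped characters (tatweel, the vowel marks U+064B to U+0652, superscript alef, and the invisible directional marks) is an assumption made for illustration and is not copied from Wiktionary's language data; `page_title` is a hypothetical name.

```python
import re

# Characters assumed to be display-only in Arabic page titles:
# tatweel, harakat (U+064B..U+0652), superscript alef, and LRM/RLM marks.
DISPLAY_ONLY = re.compile("[\u0640\u064B-\u0652\u0670\u200E\u200F]")


def page_title(display_form: str) -> str:
    """Strip vowel marks and directional marks from a vocalized term."""
    return DISPLAY_ONLY.sub("", display_form)


vocalized = "\u0623\u064E\u0628\u0652\u0647\u064E\u062C\u064E\u200E"  # أَبْهَجَ plus LRM
print("Thesaurus:" + page_title(vocalized))
```

Note that the sketch removes individual code points rather than decomposing the text first, so the hamza carried by أ (U+0623) is preserved.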

Tabbed languages?

Is the Tabbed languages gadget failing for anyone else who usually uses it? —Mahāgaja · talk 18:52, 8 November 2023 (UTC)Reply[reply]

Yes, it suddenly died. I'll look. This, that and the other (talk) 22:28, 8 November 2023 (UTC)Reply[reply]
Should be back now. I have no idea why this suddenly broke. I guess something must have changed in the latest MediaWiki deployment. Nor do I have any idea why the code I commented out was present in the TabbedLanguages script in the first place. This, that and the other (talk) 00:13, 9 November 2023 (UTC)Reply[reply]
Reported at phab:T350080#9318075. This, that and the other (talk) 00:17, 9 November 2023 (UTC)Reply[reply]
Thanks! It's back for me now. —Mahāgaja · talk 07:31, 9 November 2023 (UTC)Reply[reply]

Why does Template:senseid use HTML li tag?

@Erutuon, This, that and the other The {{senseid}} template sticks its anchors inside of <li .../> by default. This is highly problematic because it prevents anything from occurring to the left of the {{senseid}}. In particular, {{transclude sense}} puts a {{senseid}} at the left of the generated text, and if the user quite reasonably inserts a label before that, you get an unexpected blank line. Why is 'li' necessary? Why not just use 'span'? That's what was there originally. Benwing2 (talk) 06:22, 9 November 2023 (UTC)Reply[reply]

@Benwing2 It's so the following CSS in MediaWiki:Common.css works:
/* senseids */
.senseid:target { background-color: #DEF; }
The :target pseudoclass applies to the element whose HTML id attribute matches the URL hash. If an empty span were to be used, this pure-CSS approach would need to be replaced by JS code. Once the :has CSS selector is fully supported and becomes established, it would be easy to switch to using an empty span instead. This, that and the other (talk) 07:31, 9 November 2023 (UTC)Reply[reply]
@This, that and the other Thanks. Does this mean that the link to the sense ID is somehow colored? Presumably the text of the sense ID itself isn't colored because it's empty. Can you give me an example where this works? For {{place}} and {{transclude sense}} in particular, I'm going to hack it to use <span>, because having missing background color (in links to sense ID's that probably are never going to be linked to) is better than a highly visible extra newline. Can you go ahead and add the :has selector code to MediaWiki:Common.css so it works correctly with <span> wherever it's supported? (Which browsers are these?) Benwing2 (talk) 07:53, 9 November 2023 (UTC)Reply[reply]
@Benwing2 sorry, I didn't really explain myself! I started writing something on my phone and then switched to my computer, but forgot to re-write the stuff I wrote on my phone.
This CSS rule powers the effect you see when you follow a link like {{m|en|sun|id=Q525}} sun, where the sense linked to is highlighted in blue. Does that help to explain?
I'll look at adding the extra CSS rule in a bit. This, that and the other (talk) 08:05, 9 November 2023 (UTC)Reply[reply]
@This, that and the other Ahh, I see. So for example if I define Picardie in Norman using {{transclude sense}} to transclude the English definition of Picardy, and it generates a sense ID for that definition, someone who links to the Norman definition of Picardy e.g. {{m|nrf|Picardie|id=Q1249603}} will (ideally) see that definition highlighted in blue. You can see on that definition how the auto-generated {{senseid}} doesn't work well with {{lb}}. I think I'd rather lose the blue highlighting than have the extra newline always inserted, esp. if you can add the extra CSS rule so that it works with spans on newer browsers. Thanks! Benwing2 (talk) 09:10, 9 November 2023 (UTC)Reply[reply]
@Benwing2: The tag can be changed in Module:transclude/sense. That requires directly calling the function in Module:senseid rather than expanding {{senseid}}, because {{senseid}} only permits li and p tags, which will generate highlighted text without JavaScript, so it is only a solution because {{transclude sense}} is implemented in Lua, rather than wikitext. Currently no extra processing is done in {{senseid}}, so no features are missed by calling directly into Module:senseid. — Eru·tuon 15:49, 9 November 2023 (UTC)Reply[reply]
@Benwing2 I added the CSS, but I seemed to have some trouble getting it to work - I'd be interested to learn of your results. This, that and the other (talk) 01:11, 10 November 2023 (UTC)Reply[reply]
@This, that and the other It seems to work for me, thanks! See User:Benwing2/test-senseid-link. This is a link using {{m|nrf|Picardie|id=Q1249603}} to the Norman entry for Picardie, using the sense ID added by {{transclude sense}}. User:Erutuon already changed the code in Module:transclude/sense to use a "span" instead of "li". This is using Chrome version 119.0.6045.105 on Mac OS Ventura 13.3. Benwing2 (talk) 01:30, 10 November 2023 (UTC)Reply[reply]
@Benwing2 works for me too on Chrome. My testing was admittedly somewhat artificial, so it's good to see that a real example works.
In Firefox, as expected, only the content of {{transclude sense}} is highlighted, not the whole list item. This, that and the other (talk) 01:35, 10 November 2023 (UTC)Reply[reply]
@This, that and the other Hmm, which is more correct? Presumably the Firefox behavior? Does "as expected" refer to Firefox's generally better implementations of W3 standards? Benwing2 (talk) 01:51, 10 November 2023 (UTC)Reply[reply]
@Benwing2 it refers to the fact that MDN's compatibility table (which I linked in my first, somewhat incomprehensible, reply) says that Firefox currently doesn't support the :has selector. The :has selector is in the standard so I suppose Firefox is yet to catch up. This, that and the other (talk) 03:03, 10 November 2023 (UTC)Reply[reply]
@This, that and the other Aha, OK; thanks for implementing it! Benwing2 (talk) 03:41, 10 November 2023 (UTC)Reply[reply]
@Benwing2 {{transclude sense}} has a parameter for labels. (But I agree it should be changed). Vininn126 (talk) 10:13, 9 November 2023 (UTC)Reply[reply]
I'm not 100% sure whether the changes made here have affected this, but I believe redirects to senseid anchors also used to result in blue highlighting, e.g. clicking on the talk used to result in the relevant sense of talk being highlighted upon landing on that page, but there is no longer any highlighting for me in Firefox or Chrome (although the Picardie link above does result in highlighting on that page). Highlighting was helpful due to the known bug that the screen can "jump" when tables collapse (see Wiktionary:Tea_room/2023/November#wrong_link_on_tetigere for a recent description), so it'd be nice if we could find a way to restore the functionality. (If not, I agree with Benwing that if we have to choose, not having the newlines seems like a higher priority than having the blue.) - -sche (discuss) 06:28, 10 November 2023 (UTC)Reply[reply]
@-sche By "used to" do you mean up until the last 12 hours or so, or some time back? Benwing2 (talk) 06:58, 10 November 2023 (UTC)Reply[reply]
I don't encounter redirects to senseid senses often, so I can't be sure whether the changes made here are what removed the blueness, but Internet Archive's most recent archive, from 26 September, had blue highlighting. (Were there other changes made between then and now to how senseid links work?) - -sche (discuss) 07:25, 10 November 2023 (UTC)Reply[reply]
I can still see the blue on that redirect. This, that and the other (talk) 09:53, 10 November 2023 (UTC)Reply[reply]
@This, that and the other I don't see it (Chrome version 119.0.6045.105 on Mac OS Ventura 13 as above); but strangely, I do see it upon refresh. Same behavior with Safari version 16.4 (I don't have Firefox installed). On which browser are you running? Benwing2 (talk) 10:08, 10 November 2023 (UTC)Reply[reply]
The blue highlight works consistently on Edge 119.0.2151.44 and Chrome 118.0.5993.120, both on Windows, when I click the link [[the talk]], even if I have to scroll up a bit to see the actual blue-highlighted definition. When browsing directly to the URL https://en.wiktionary.org/wiki/the_talk it seems intermittent.
On Firefox the highlight only appears after refreshing the page.
Not sure why this would happen: I find it hard to believe that my recent change is to blame, but I could undo it for an experiment if it would help.
When clicking a {{m|en|the talk}} link (the talk), it does not work in any browser, probably because Tabbed Languages interferes with the hash component of the URL. This, that and the other (talk) 10:22, 10 November 2023 (UTC)
{{m|en|the talk}} translates to the talk#English. I don't have Tabbed Languages enabled, and when I click that, I go to talk#English. It seems that the fragment #English overrides the fragment #English:_the_talk specified in the redirect page. — Eru·tuon 15:52, 10 November 2023 (UTC)
Fascinating, refreshing the page causes the blue highlighting to appear for me, too, in both Firefox and Chrome, but going to the talk and being redirected for a second time does not cause any blue. I'm not sure why the blue only appears when refreshing the page (not when navigating from the talk) in this case, but appears straight away (without requiring any refreshing) when navigating from Benwing's test page to Picardie, or when clicking a {{l}} link to get to talk, like the one I inserted here, or even just using a direct URL link like the one I inserted here. So, it seems like a soft redirect that made the user click through to talk would result in the sense of talk being highlighted, but a hard redirect doesn't. This is probably a bad idea, but FWIW if we had e.g. some javascript that would notice when a user had been redirected to an anchor and refresh the page after a second, it would solve both the "there is no blue until refresh" issue and the "page jumps, so link appears to go to wrong section" (tetigere) issue. But it would probably annoy people who'd started reading or scrolling in the meantime. - -sche (discuss) 16:18, 10 November 2023 (UTC)
In Firefox I also don't see the "customary conversation" definition highlighted on clicking the talk and redirecting to talk#English_the_talk. This, that and the other didn't change the CSS rule that highlights the definition, and I can't remember visiting a redirect where the target was a sense ID anchor, so I can't verify whether the definition used to be highlighted. The archive link is a direct link to the sense ID target, so there is no redirect involved and it has a highlighted definition just as talk#English:_the_talk does. It seems like it's a browser bug where the element that's identified by the URL fragment #English_the_talk isn't being selected by the :target CSS selector when the page talk has been reached by a redirect. That's assuming the web standards say :target selector is meant to work after a redirect, or at least don't say it's not supposed to work. — Eru·tuon 15:42, 10 November 2023 (UTC)

ditto in it-verb

On prudere#Verb, the conjugation reads

prùdere (first-person singular present prùdo, first-person singular past historic (rare) prudétti or (ditto, traditional) prudètti, no past participle)

Can it just say rare, traditional instead? "Ditto" seems unacademic and breaks the readers' train of thought. And it might be somehow mis-parsed if the code changes and someone is looking at an old diff. Thanks, Soap 18:39, 9 November 2023 (UTC)

@Soap The reason I put in the ditto notation was to avoid repeating long sentences that sometimes occur. This can definitely be made smarter, so that e.g. it only does this if the qualifier(s) in question are longer than a certain length, and the word itself can be changed (any ideas?). Benwing2 (talk) 20:41, 9 November 2023 (UTC)
@Benwing2 The footnotes that Russian headwords have work quite well:
трюм (trjum) m inan (genitive трю́ма, nominative plural трю́мы or трюма́*, genitive plural трю́мов or трюмо́в*) (* In professional speech.)
Theknightwho (talk) 00:33, 13 November 2023 (UTC)

Sorting mutated form categories

Can someone (e.g. @Benwing2) edit either the "mutated form of" templates ({{aspirate mutation of}}, {{eclipsis of}}, {{h-prothesis of}}, {{hard mutation of}}, {{lenition of}}, {{mixed mutation of}}, {{nasal mutation of}}, {{soft mutation of}}, {{t-prothesis of}}) or Module:form of/templates (which they invoke) so that the corresponding categories are sorted alphabetically by the base form (i.e. |2= of the form-of template) rather than by the page name? Thanks! —Mahāgaja · talk 20:37, 10 November 2023 (UTC)

Should be added to all of these templates. Sorry for the multiple pings. Benwing2 (talk) 22:38, 11 November 2023 (UTC)
Great, thanks! —Mahāgaja · talk 08:01, 12 November 2023 (UTC)

Request for deletion of "persón", not a Catalan word.

Request for deletion of "persón", not a Catalan word. Such a word does not exist in Catalan, and even if it did, it would not be written with "ó". Esberginia (talk) 16:36, 11 November 2023 (UTC)

This is the Grease Pit, for technical matters, not substantive language matters. Remove {{rfd|ca}} and add {{rfv|ca}} (Request for Verification) at "persón", and use the small "+" to add this to the right page. DCDuring (talk) 18:07, 11 November 2023 (UTC)
Ping @Esberginia in case you missed DCDuring's message. Thanks for your contributions! This, that and the other (talk) 00:22, 13 November 2023 (UTC)
Thanks, definitely missed previous reply.
Also, could you direct me to the place where to ask questions on grammatical templates and the like?
Esberginia (talk) 12:05, 13 November 2023 (UTC)
@Esberginia This is the place to ask for technical advice on templates. If your question is more linguistic in nature you could try the Tea Room, but I'm not sure if we have many (any?) active Catalan contributors. This, that and the other (talk) 23:02, 13 November 2023 (UTC)
@Esberginia: What's your question regarding the template? If it's a question about a specific template, you can also try starting a discussion on its talk page, although editors don't always watch these pages and might miss it, so best to {{ping}} someone in particular. Check the history to see who's worked on the template. There's also a talk page for general language level discussions, Wiktionary talk:About Catalan (almost empty, though). Jberkel 00:04, 14 November 2023 (UTC)

Hard-coded Albanian reference links

While cleaning up hard-coded URLs linking to Wikipedia, I came across a whole bunch of references to a certain Albanian-Latin dictionary, added without any templates by someone who was later blocked for other reasons. For example, at Albanian pak:

  • <ref>[https://archive.org/details/fialuurivoghels00junggoog/page/n112/mode/2up Fialuur i voghel Sccyp e ltinisct (Small Dictionary of Albanian and Latin), page 94], by P. Jak Junkut, 1895, [https://en.wikipedia.org/wiki/Shkodër Sckoder]</ref>

Click on Special:Search/insource:"Wikipedia.org/wiki/Shkodër" to see all 68 of them. I don't really want to just convert the Wikipedia part to {{w|Shkodër|Sckoder}} in all of these, since it seems like a lot of work that will only address one aspect of the problem. I'm not sure whether we want to create a template for these, given that no one else will probably ever link to this source, or convert them to something like {{quote-book}}, or just remove them, but whatever we do, this looks like bot work. Anyone interested? Chuck Entz (talk) 21:49, 11 November 2023 (UTC)
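
For anyone picking this up, the narrow {{w}} conversion mentioned above could be sketched roughly as follows, in Python since this looks like bot work. The regular expression and the function name convert_wikipedia_links are assumptions made for illustration, not an existing bot script. The sketch only rewrites bare [https://en.wikipedia.org/wiki/TITLE LABEL] links and leaves the rest of the reference alone.

```python
import re
from urllib.parse import unquote

# Matches external links of the form [https://en.wikipedia.org/wiki/TITLE LABEL].
# Assumption: TITLE contains no spaces or closing brackets (underscores instead).
WP_LINK = re.compile(r"\[https?://en\.wikipedia\.org/wiki/([^\s\]]+) ([^\]]+)\]")


def convert_wikipedia_links(wikitext: str) -> str:
    """Replace hard-coded English Wikipedia URLs with {{w|TITLE|LABEL}}."""

    def repl(match: re.Match) -> str:
        # Decode percent-escapes and turn underscores back into spaces.
        title = unquote(match.group(1)).replace("_", " ")
        label = match.group(2)
        # Omit the label when it is identical to the title.
        if label == title:
            return "{{w|" + title + "}}"
        return "{{w|" + title + "|" + label + "}}"

    return WP_LINK.sub(repl, wikitext)


sample = (
    "<ref>[https://archive.org/details/fialuurivoghels00junggoog/page/n112/mode/2up "
    "Fialuur i voghel Sccyp e ltinisct (Small Dictionary of Albanian and Latin), page 94], "
    "by P. Jak Junkut, 1895, [https://en.wikipedia.org/wiki/Shkodër Sckoder]</ref>"
)
print(convert_wikipedia_links(sample))
```

This deliberately leaves the archive.org link untouched, so it addresses only the one aspect described above. Titles or labels containing "|" or "=" would need extra escaping that this sketch does not attempt.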

Every entry in this dictionary is assigned a unique ID, and the reference template takes the entry's ID as the main parameter. However, due to substantial updates to the dictionary this year, nearly all entry IDs have been modified. If feasible, I think we should use a bot to rectify this issue. --TongcyDai (talk) 01:34, 12 November 2023 (UTC)

Fixing up Portuguese pre-AO forms with bots

@WingerBot The Portuguese entries for spellings before the orthographic agreements are a mess. They use like three hundred templates and for the agreements made before 1990, they're absurdly inconsistent. Sometimes a word that was standard until the 70s was flagged as "obsolete", etc. etc. For the past couple of days, I've been endeavoring to straighten up every single old form, and to this end, after talking to people on Discord, I've created {{pt-pre-reform}}.

I need it to be applied to pages using the previous templates (there are about 1k of them), and I heard this would be really really easy with bots. I don't know how to do it though. If someone could apply these changes for me, I'd be really really grateful. Here are the exact replacements that need done:

I've tested and re-tested this in Wiktionary:Sandbox, so it should be working 100% as intended... It's a lot of lines, but that's because a bunch of these are redirects to each other; I'd been trying different solutions and learning about each reform deeply as I went. I hope this isn't too big an ask. MedK1 (talk) 04:35, 13 November 2023 (UTC)

I enjoy how you pinged the bot itself... maybe it is only a matter of time before our bots are hooked up to LLMs and can carry out tasks like this autonomously...
I'm not sure how I feel about the numbers in this template. I suppose there is a good reason why they have been used, but at the very least, they must be documented at Template:pt-pre-reform/documentation. This, that and the other (talk) 05:43, 13 November 2023 (UTC)
Yeah, of course! Making documentation is the plan. It should be there by tonight. The TL;DR though is that they mark what reform got rid of what kinds of words (there was more than one reform and they weren't exactly synchronous). I went for numbers rather than years because 1) it means there are fewer bits you have to replace/write when adding the template to a page or changing how a specific word is classified; and 2) the 1943 reform was technically only applied in 1946, and the 1990 one wasn't applied instantly either; if someday people decide to change the dates in the template to reflect that, no pages would need edits to go along with it. 2804:1B0:1903:FF5F:6580:2887:B51E:2729 10:21, 13 November 2023 (UTC)
@MedK1 Unfortunately I don't get pings addressed to my bot. I can carry out this change but I definitely think you should replace the numbers with something less opaque: Either named abbreviations of the reforms in question or years, and use country codes for the different countries referenced. Benwing2 (talk) 22:58, 13 November 2023 (UTC)
@Benwing2 My bad, I wasn't aware you didn't actually get pinged; I was imagining you had to log into the account in order to start applying the changes. Guess it just shows how little I know about this haha!
I was thinking about listing the reforms in the documentation, with a short summary of what each one does (and a few examples of which words fall where) similar to what I did in {{pt-archaic-sc}}. The numbers aren't random, they refer to the order in which the reforms happened: Portugal had 4 reforms (1911, 1945, 1973, 1990) and Brazil had 3 (1943, 1971, 1990). So doing "2" for the Portugal parameter would get you the "1945" description and "3" for Brazil, the 1990 description. Would this solution work?
As for the country codes, I actually wish I'd thought of that! I did consider that there could be confusion a la "does Brazil come first in the template or does Portugal?". All I could think about was putting it in alphabetical order and calling it a day. Making, for example, {{pt-pre-reform|WORD|br=1|pt=1}} is a much more elegant solution and I'm applying it right away! MedK1 (talk) 00:54, 14 November 2023 (UTC)
@MedK1 I use the pywikibot library to implement my bot code, and it logs into my bot account automatically (except when I need to do bot actions that need admin privileges, like deleting pages; for these I have it log into my admin account). I still think it would be better to use years rather than numbers; it seems especially confusing that the numbers refer to different reforms for Brazil vs. Portugal. In general, it's better to use human-parsable/memorable abbrevs rather than numbered or lettered ones unless the items in question are well-known by their numbers. So for example the seven classes of Germanic strong verbs are usually identified by number, same with the declensions and conjugations of Latin, but Germanic noun declensions don't have well-known numbers so we refer to them by name (e.g. i-stem, a-stem, ō-stem, etc.). I don't think the issue with the Brazilian 1943 reform being adopted in 1946 is a big issue with using years; Wikipedia, for example, identifies the reforms by years and refers to the "orthographic reform of 1943". Benwing2 (talk) 02:22, 14 November 2023 (UTC)
@Benwing2 There! I've coded it so you can use years; br=1971|pt=1945 instead of br=2|pt=2. I thought it'd be nice if br=71|pt=45 worked as well, so it does! I'm gonna work on the documentation now, but other than that, it should be all set. MedK1 (talk) 02:55, 14 November 2023 (UTC)
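
For readers following along, the parameter forms discussed so far (ordinal, two-digit year, full year) can be illustrated with a rough Python sketch. This is not the template's actual wikitext. The names REFORMS and normalize_reform are hypothetical, and accepting all three spellings at once is an assumption; only the reform years and the br=/pt= examples come from this thread.

```python
# Reform years in chronological order, as listed earlier in this thread.
REFORMS = {
    "pt": [1911, 1945, 1973, 1990],
    "br": [1943, 1971, 1990],
}


def normalize_reform(country: str, value: str) -> int:
    """Return the four-digit reform year for an ordinal, a two-digit year or a full year."""
    years = REFORMS[country]
    number = int(value)
    # Full year, e.g. br=1971.
    if number in years:
        return number
    # Two-digit year, e.g. br=71 (every reform listed here falls in the 1900s).
    if 1900 + number in years:
        return 1900 + number
    # Ordinal, e.g. pt=2 for Portugal's second reform.
    if 1 <= number <= len(years):
        return years[number - 1]
    raise ValueError(f"unrecognized reform {value!r} for {country!r}")


print(normalize_reform("pt", "2"))     # ordinal form
print(normalize_reform("br", "71"))    # two-digit form
print(normalize_reform("br", "1971"))  # full-year form
```

The ordinals and the two-digit years cannot collide here, because no country has more than four reforms and no listed reform year ends in 01 to 04.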
@MedK1 Thanks! Let me see about coding up a bot script. Benwing2 (talk) 04:33, 14 November 2023 (UTC)
Can you change the table above to reflect the new template calling syntax? Benwing2 (talk) 04:37, 14 November 2023 (UTC)
Ah, of course. It's done, @Benwing2! MedK1 (talk) 14:19, 14 November 2023 (UTC)
Btw @Benwing2, I've made the documentation as well! MedK1 (talk) 02:11, 15 November 2023 (UTC)
@MedK1 I applied the above changes except to the following, which have extraneous params:
Page 344 proêmio: WARNING: Unrecognized param: from=Brazilian Portuguese form
Page 95 metempsychose: WARNING: Unrecognized param: 2=
Page 53 metempsychose: WARNING: Unrecognized param: 2=
Page 88 abobada: WARNING: Unrecognized param: t=vault, arched ceiling
If you can fix these last three, I'll delete the old templates. Benwing2 (talk) 07:41, 15 November 2023 (UTC)
@MedK1 In addition to what Benwing has noted, I noticed that sciencia is in Cat:Portuguese forms superseded by AO1990, whose description states it contains terms current from 1971 to 2008, but the entry itself says the term was obsolete by 1943. Can you check what's going on here? This, that and the other (talk) 12:17, 15 November 2023 (UTC)
The change from 1/2/3 to actual years was affecting the calculations for what categories to apply. I've since fixed that and the pages in @Benwing2's list; every page should be working alright now. MedK1 (talk) 16:08, 15 November 2023 (UTC)

Maybe improve Template:rfap to add a link to the existing pool of Lingua Libre records?

Let's take a look at the Russian word молот (molot) as an example. Right now it has a pronunciation audio created by @Ivnadur, which is nice. Except that the humming noise in the background is a bit annoying. Do we have any possible replacements for it? Yes! Going to https://commons.wikimedia.org/wiki/Category:Lingua_Libre_pronunciation-rus?from=молот allows us to easily see 4 existing pronunciation records of the same word "молот" from different speakers. I think that it would be useful if the Template:rfap template could generally give the users a hint to check the Lingua Libre records pool with an appropriate link. Currently it looks like this:

This entry needs audio files. If you are a native speaker with a microphone, please record some and upload them. (For audio required quickly, visit WT:APR.)

I suggest modifying it like this:

This entry needs audio files. If you are a native speaker with a microphone, please record some and upload them. But there may be even some existing Lingua Libre records here. (For audio required quickly, visit WT:APR.)

Maybe the other parts of the message could be changed too (is WT:APR still relevant?). And on the technical side, the two-letter language code needs to be substituted with a three-letter code in the link. Ssvb (talk) 11:38, 13 November 2023 (UTC)

Great idea. I've always been confused by the phrase "For audio required quickly" - who on earth "requires" an audio pronunciation "quickly"? @Ssvb check it out now. This, that and the other (talk) 00:59, 16 November 2023 (UTC)
I can think of many reasons why one might need an audio recording of a pronunciation quickly, but they're not going to get it here. If you need an audio of a pronunciation right now, you go to YouTube and find someone talking about an issue that includes the word in question. —Mahāgaja · talk 07:17, 16 November 2023 (UTC)
@Mahagaja: I think that right now the presence of the {{rfap|en}} template in an article already effectively means "this pronunciation is needed more urgently than the others". Because the lists of words for Lingua Libre (such as this one) are constructed automatically regardless of the presence or absence of the rfap template in Wiktionary articles. —Ssvb (talk) 12:34, 16 November 2023 (UTC)
@This, that and the other Thanks, this looks better. Though I believe that it would be useful to have a link to a more precise location. Not to the whole Lingua Libre category, but also pinpoint the right language and the right word itself. A Lingua Libre bot automatically maintains the list of English words lacking pronunciation audio here. In the latest update of this list, the bot removed words decoction, lawlessness and maidservant from the list, because some contributors already recorded pronunciation samples for these words. Now let's look at the lawlessness Wiktionary article. If somebody adds a pronunciation section with an rfap template to it, then it would be useful to precisely link to https://commons.wikimedia.org/wiki/Category:Lingua_Libre_pronunciation-eng?from=lawlessness from the template notice banner. Rather than sending people just in a general direction. —Ssvb (talk) 12:16, 16 November 2023 (UTC)
@Ssvb The reason I didn't add the language name is I'm not aware that we maintain a list of three-character ISO codes anywhere on this wiki, and I couldn't be bothered to start one... This, that and the other (talk) 22:30, 16 November 2023 (UTC)
@Ssvb, This, that and the other: I can make a module for this based on List of ISO 639-1 codes on Wikipedia, but I need to know which three-letter codes are used by Lingua Libre, as the Wikipedia article lists two codes for some languages (e.g. 'sqi' and 'alb' for Albanian). [I am guessing that it's the ISO 639-2/T codes (e.g. 'sqi' not 'alb'), because these match ISO 639-3.] Benwing2 (talk) 23:08, 16 November 2023 (UTC)
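
To make the proposed link concrete, here is a minimal Python sketch of how a word-specific Lingua Libre category URL might be built from a two-letter code. It is illustrative only: an on-wiki version would be a Lua module, the three-entry table and the function name lingua_libre_url are hypothetical, and it assumes (as guessed above) that Lingua Libre categories use the 'sqi'-style codes. The URL shape is the one quoted earlier in this thread.

```python
from urllib.parse import quote

# Tiny illustrative table from two-letter to three-letter codes.
# Assumption: Lingua Libre uses these ('sqi', not 'alb', for Albanian).
THREE_LETTER = {"en": "eng", "ru": "rus", "sq": "sqi"}


def lingua_libre_url(lang_code: str, word: str) -> str:
    """Build a Commons category URL that starts listing at the given word."""
    three = THREE_LETTER[lang_code]  # raises KeyError for codes not in the table
    return (
        "https://commons.wikimedia.org/wiki/"
        f"Category:Lingua_Libre_pronunciation-{three}?from={quote(word)}"
    )


print(lingua_libre_url("en", "lawlessness"))
print(lingua_libre_url("ru", "молот"))
```

Non-ASCII words are percent-encoded by quote, which resolves to the same address as the unencoded form shown in the original post. A real table would have to cover every language code that {{rfap}} is used with.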