Wiktionary:Grease pit/2023/December

is popups disabled?

i suppose recent changes to the CSS may have had far-reaching effects. i notice the clock in the upper right is gone, and reverting my highly customized CSS to a much smaller script doesn't seem to bring it back. i want to know if popups was specifically disabled by these new changes or if it's just a side effect that may be happening only to some of us or even only to me. i rely on the popups gadget for a lot of things and i know we are working on a replacement that in many ways is actually superior to popups, but popups still has some features that the new script does not, as it's meant to be a means of navigation (hence the full name of the script, navigation popups) and not just a means to preview articles.

if it's not just me, is there a work-around i can use to get the clock back, and more importantly, restore navpopups? thanks, —Soap— 07:10, 1 December 2023 (UTC)[reply]

To clarify, i have navpopups explicitly enabled in my Preferences, rather than copying the code manually. So I suspect this means that navpopups has been explicitly disabled in the Wiktionary global CSS, meaning I can't override it through Preferences, but might still be able to override it if I find the code and copy it manually here. But even assuming that it has been disabled on purpose, I want to know if the popups has been disabled by name or by function. In other words, if i copy the code over, will it still not work because the functions have been disabled by another script, or is it addressed by name, such that it just looks for a script called "popups" and disables that? thanks, —Soap— 07:13, 1 December 2023 (UTC)[reply]

I've removed the custom CSS I used to use and reverted to essentially a normal CSS sheet at User:Soap/vector.css. This is just to eliminate any questions that it might be my custom CSS that's getting in the way. The custom CSS helps me visually isolate and read small text, but I can restore it bit by bit or even write a new CSS sheet that restores the enhancements with different code. If we can eliminate these questions I can restore the custom CSS that highlights different paragraphs in different colors. But before I restore the code I want to get navpopups back and ideally also restore the clock at the top right. Thanks, —Soap— 07:25, 1 December 2023 (UTC)[reply]

also, enabling popups on my touchscreen laptop (where it's useless) automatically hides the clock, so I'm pretty sure the two bugs are directly related. —Soap— 07:49, 1 December 2023 (UTC)[reply]

A bunch of stuff isn't working for me: popups, but also tabbed languages and clicking "reply" on a page like this one, and the character-insertion menus below the edit box. —Mahāgaja · talk 08:42, 1 December 2023 (UTC)[reply]

Thanks, now that I look I've noticed two other things missing: the citations tab, and the "Visibility" section normally present in the left sidebar that allows me to collapse sections such as use-examples, quotes, and other things. What this has to do with popups, I can't imagine, so I wonder if recent CSS changes, possibly inherited from Meta, caused some sort of unpredictable collision of two scripts that otherwise wouldn't conflict with each other. —Soap— 09:48, 1 December 2023 (UTC)[reply]

Those are gone for me too, but hitting "[reply]" works again. But translations boxes can't be opened at all, not just using the Visibility section in the left sidebar. —Mahāgaja · talk 09:52, 1 December 2023 (UTC)[reply]

If you add the line

importScript('MediaWiki:Gadget-popups.js');

to your .js file (mine is User:Soap/vector.js), and then disable navpopups in Preferences (so it doesn't try to load twice), it may restore the functionality of the popups gadget without breaking the hitherto unrelated other functions such as Citations, show/hide, tabbed languages, and the clock at the top right. (I suspect all four of these, and more, are relying on a single function that is being somehow missed.) The order of script execution may be at the core of the problem, since the code of MediaWiki:Gadget-popups.js and the internal Wiktionary gadget should be exactly the same, but they might load at different times. —Soap— 09:54, 1 December 2023 (UTC)[reply]

In fact, anything collapsible isn't working: inline synonyms etc. after #: don't collapse, and inline quotations after #* aren't showing up at all. Family tree boxes on language category pages also don't open. —Mahāgaja · talk 09:56, 1 December 2023 (UTC)[reply]

Orange links also aren't working. —Mahāgaja · talk 15:03, 1 December 2023 (UTC)[reply]

I too find that nav popups simply aren't working at all. I don't have anything special or customized regarding CSS or JS at User > Preferences (just popups setting = yes). Quercus solaris (talk) 16:17, 1 December 2023 (UTC)[reply]

{{trans-top}} doesn't show the translations; {{syn}} doesn't allow the synonyms to be hidden.

Could this be the result of vandalism? DCDuring (talk) 17:35, 1 December 2023 (UTC)[reply]

This stuff (navigation popups, expanding translations tables) still works for me; perhaps the change hasn't propagated out to me yet, or has been reverted/fixed. If it wasn't a change to some local thing, Wiktionary:Wikimedia_Tech_News/2023#Tech_News:_2023-48 mentioned some WMF changes to MediaWiki and to the Javascript system for gadgets and user scripts. Although nothing announced there jumps out as having been likely to break existing scripts, nav popups breaking was also reported on Wikipedia, so it's not just a local issue. - -sche (discuss) 20:20, 1 December 2023 (UTC)[reply]

I went to Preferences and turned off Navigation popups, and all the other missing features returned, so it's definitely some problem with popups. ~~@-sche, perhaps it was this edit of yours that caused the trouble (the timing is right).~~ Sorry, I misread the year on that! The last edit to the popups gadget was coincidentally one year ago today. —Mahāgaja · talk 20:42, 1 December 2023 (UTC)[reply]

For those who depend on popups, turning off the gadget in Preferences and then adding the line

importScript('MediaWiki:Gadget-popups.js');

to your user JS file might work, because it's working for me. Popups returned and all of the other things returned too. I wrote this up above but people seem not to have noticed it, perhaps because I wrote so many other things in a short time. —Soap— 21:32, 1 December 2023 (UTC)[reply]

@Soap: Thanks, that works for me too. —Mahāgaja · talk 22:15, 1 December 2023 (UTC)[reply]

Thanks, all. Wouldn't we want the gadget to work properly or be removed? DCDuring (talk) 19:15, 2 December 2023 (UTC)[reply]

I'm hoping that because it broke on at least two WMF sites, someone is looking into why. Even if loading it another way restores functionality, why did loading it the first way suddenly break, and cause other things to break? But perhaps we need to be the someone who looks into it. - -sche (discuss) 23:09, 2 December 2023 (UTC)[reply]

All I need to help is the most basic understanding of CSS and its application layers at Wiktionary and WM. DCDuring (talk) 18:52, 3 December 2023 (UTC)[reply]

I was the one it broke for on enwiki. But I was loading popups in a weird way. I was loading the enwiki one in my meta global.js file, but I was not using any mw.loader.using statements to make sure its dependencies also loaded. (The enwiki popups has a lot of dependencies. Yours doesn't though.) There were no other error reports on enwiki. So in conclusion, it's possible that my issue is unrelated to the issue with popups on this wiki. Hope this helps. Novem Linguae (talk) 21:17, 4 December 2023 (UTC)[reply]
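To illustrate the point about `mw.loader.using`, here is a minimal sketch of loading a script from a user JS file only after its ResourceLoader dependencies are ready. The helper name `loadAfterDependencies`, the dependency list, and the script URL are assumptions made for illustration; they are not the actual dependencies of either wiki's popups gadget.

```javascript
// Sketch only: `mwApi` stands for MediaWiki's global `mw` object.
// The dependency list and script URL passed in below are assumptions.
function loadAfterDependencies(mwApi, dependencies, scriptUrl) {
  // mw.loader.using() resolves once the named ResourceLoader modules are ready.
  return mwApi.loader.using(dependencies).then(function () {
    // Only now fetch the script that relies on those modules.
    mwApi.loader.load(scriptUrl);
  });
}

// In a browser on a MediaWiki site, `mw` exists; elsewhere this does nothing.
if (typeof mw !== 'undefined') {
  loadAfterDependencies(
    mw,
    ['mediawiki.util'], // assumed dependency, for illustration only
    'https://en.wiktionary.org/w/index.php?title=MediaWiki:Gadget-popups.js&action=raw&ctype=text/javascript'
  );
}
```

Passing `mw` in as a parameter is only a convenience so the ordering can be checked outside a browser; in a real user script one would normally call `mw.loader.using(...)` directly.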

@Erutuon, do you think the issue you fixed here is what was breaking popups on en.Wikt? Any idea why other users above (and below) reported that popups breaking also broke other stuff, and that loading the popups gadget differently (before your fix) caused the other things to work? - -sche (discuss) 21:30, 4 December 2023 (UTC)[reply]

@-sche Nothing has broken for me, and unfortunately I don't know front-end stuff very well so I'm not sure I can help debug this. Maybe User:Erutuon and/or User:This, that and the other. Benwing2 (talk) 20:24, 5 December 2023 (UTC)[reply]

@-sche BTW Erutuon's one-char change seems unlikely to have broken anything; if it did, something more fundamental was broken previously. Benwing2 (talk) 20:26, 5 December 2023 (UTC)[reply]

FTR I was asking (since it seemed at the time that the users for whom things had broken were reporting that the things were working again) whether Erutuon's edit had fixed, not caused, the problem. - -sche (discuss) 22:54, 5 December 2023 (UTC)[reply]

@-sche Oh, I see. I don't know about that, but it seems it was specifically to fix a syntax error on Firefox. I suppose it's possible that would have blocked the loading of various things on Firefox; if all the people reporting problems were using Firefox, that could be the cause (I use Chrome and sometimes Safari, but not normally Firefox). Benwing2 (talk) 22:56, 5 December 2023 (UTC)[reply]

FWIW I have almost nothing in User:Benwing2/common.css and User:Benwing2/common.js. Benwing2 (talk) 20:29, 5 December 2023 (UTC)[reply]

that diff is newer than when the problems started, so i don't think it could be the cause of all of this. —Soap— 20:32, 5 December 2023 (UTC)[reply]

I haven't witnessed any breakage myself. When I turn on the "Navigation popups" gadget in my preferences, the popups function as expected, as does other JS-based functionality like Tabbed Languages. This, that and the other (talk) 22:52, 5 December 2023 (UTC)[reply]

See #Expand_sections_broken? below for an explanation of my edit and why it fixed the problem for me. I didn't see this thread before I posted. — Eru·tuon 18:55, 6 December 2023 (UTC)[reply]

Help with a Javascript extension for Chrome/Firefox

There are two gadgets mentioned at Wiktionary:Quotations#Quotation_gadgets in the line "Another pair of gadgets allow users to generate properly formatted quotations straight from the search pages of Google Books and Google Scholar. You can learn more about how these gadgets work and how to manually install them at this 2022 Grease Pit discussion." I use both of these and they are incredibly handy, and I am wondering if someone experienced with Javascript would be able to help make two similar extensions for two Polish websites. The first is http://nkjp.pl/. There are a couple of different ways to attack that one (i.e. there are two places we can extract quotes from). The second is https://polona.pl/ or its old version https://polona2.pl/; it might be better to have it work with the first website. If needed I could provide the APIs, I believe for both, definitely for polona. Vininn126 (talk) 10:31, 1 December 2023 (UTC)[reply]

@Vininn126: Additionally, it would be great to have a convenient Javascript gadget for searching quotations in Wikisource and automatically formatting them into appropriate Wiktionary templates. I see that this topic has been around at least since 2016: https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2023/Wiktionary#Insert_attestation_using_Wikisource_as_a_corpus Ssvb (talk) 00:19, 2 December 2023 (UTC)[reply]

Toolforge Grid Engine shutdown

FYI if any of the Toolforge tools we use run on Grid Engine, it's being shut down starting on December 14th. WP VPT announcement, mailing list announcement. - -sche (discuss) 20:22, 1 December 2023 (UTC)[reply]

My enwikt-translations and templatehoard tools have been running on the Grid Engine, and they can't be fully migrated away from the Grid Engine yet because the Toolforge maintainers haven't finished setting up a way of compiling Rust programs on Kubernetes. (I could try compiling them on my own computer and copying them over, because that has worked in the past.) The status of the tools is tracked in task T319724 and task T320083, and Rust compilation is tracked in task T194332. There are other smarter people who also run Rust-based tools, so this will likely be fixed eventually, but maybe not before the official Grid Engine shutdown. — Eru·tuon 18:51, 6 December 2023 (UTC)[reply]

Allow to restrict the Quiet Quentin search results to a specific language

Right now searching for Belarusian quotations via Quiet Quentin often results in finding a lot of undesirable hits in Russian and Ukrainian books whenever the Cyrillic spelling of certain words is identical. But it's possible to restrict search to just a single language via the langRestrict Google Books API query parameter. As a test, I tried to apply this change as my custom modification of Quiet Quentin and it really works. I think that it would be interesting to have language selection (via an extra edit box or a dropdown menu) in the Quiet Quentin gadget itself for all Wiktionary users. Does the community want this? And if yes, then who can implement it? —Ssvb (talk) 23:47, 1 December 2023 (UTC)[reply]
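As a rough illustration of the `langRestrict` idea, the sketch below builds a Google Books API volumes query limited to one language. The helper `buildVolumesQuery` is hypothetical and is not taken from Quiet Quentin's code; how the gadget actually assembles its request is not shown in this thread.

```javascript
// Build a Google Books API "volumes" query restricted to one language.
// `buildVolumesQuery` is a hypothetical helper, not part of Quiet Quentin.
function buildVolumesQuery(term, langCode) {
  const url = new URL('https://www.googleapis.com/books/v1/volumes');
  // Quote the term so the search looks for the exact spelling.
  url.searchParams.set('q', '"' + term + '"');
  if (langCode) {
    // langRestrict takes a two-letter language code such as "be" for Belarusian.
    url.searchParams.set('langRestrict', langCode);
  }
  return url.toString();
}

// Example: search for a Belarusian word, excluding Russian and Ukrainian books.
const belarusianOnly = buildVolumesQuery('мова', 'be');
```

A language dropdown in the gadget would then only need to supply the `langCode` argument; leaving it empty keeps the current unrestricted behaviour.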

Not sure if this has been fixed yet, but another language issue to fix is that some of the language codes Google and hence QQ plugs into the quotation templates are (or used to be; apologies if this was fixed already and I missed it) different from the ones we use, e.g. Google/QQ uses "un" for our "und" (described here, here, basically QQ just needs to systematically change certain invalid codes to specific other codes).
If anyone has time, it'd be great to also make QQ do the equivalent of clicking the "Search instead for "what you searched for, as opposed to what we changed your search to"" link in Google Books... as it is, sometimes I search for words and find that QQ can't be used because Google feeds it the results of some unrelated search which finds unrelated or no results. - -sche (discuss) 00:46, 2 December 2023 (UTC)[reply]
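The code fix-up described above could be as small as a lookup table applied before the code is placed in a quotation template. This is a sketch under assumptions: `normalizeLangCode` and `CODE_FIXES` are hypothetical names, and only the "un" to "und" pair comes from this discussion; any further pairs would have to be collected separately.

```javascript
// Map codes as delivered by Google to the codes Wiktionary uses.
// Only "un" -> "und" is taken from the discussion; the table is otherwise empty.
const CODE_FIXES = new Map([['un', 'und']]);

// Return the corrected code, or the input unchanged when no fix is listed.
function normalizeLangCode(code) {
  return CODE_FIXES.has(code) ? CODE_FIXES.get(code) : code;
}
```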

I tend to find using quotation marks fairly effective with QQ, often in conjunction with a common word (like "the") to narrow results to a given language. What Ssvb is proposing would be fantastic though. Andrew Sheedy (talk) 01:09, 2 December 2023 (UTC)[reply]

Expand sections broken?

I'm using up-to-date Win 10 and the Edge browser.

Starting today, most expandable sections no longer display correctly for me. Looking at 柜#Chinese, for instance, the Characters in the same phonetic series box has no clickable "expand" text, and I can find no way to get it to display properly. Similarly, any English entry with a Translations section shows me the one-line Translations box, but with no way to expand.

Meanwhile, the Pronunciation section on 柜#Chinese is entirely expanded, with no way to shrink it down -- which is irksome, as there is a lot of pronunciation information for most Chinese single-character entries, requiring a good bit of scrolling to get past.

Any ideas what might have happened? Is this a change in CSS, JS, something else? ‑‑ Eiríkr Útlendi │Tala við mig 01:26, 2 December 2023 (UTC)[reply]

See Wiktionary:Grease_pit/2023/December#is_popups_disabled?. There are similar problems on Wikipedia; it seems like something broke somehow with how MediaWiki processes navigation popups when loaded via PREFs (?) and this also messes up other .js somehow (??) - -sche (discuss) 01:41, 2 December 2023 (UTC)[reply]

Thanks for the pointer, I skimmed the section titles but not the content and never thought of these expand sections as "popups". 😄 ‑‑ Eiríkr Útlendi │Tala við mig 21:40, 4 December 2023 (UTC)[reply]

Oh, they're not, but when popups broke, it broke expanding sections (and loading popups differently fixed expanding sections), at least for some users, as discussed there. - -sche (discuss) 22:27, 4 December 2023 (UTC)[reply]

Yeah, the fact that these are apparently linked somehow on the back end is a bit disturbing. Seems to suggest spaghetti code. ‑‑ Eiríkr Útlendi │Tala við mig 23:00, 4 December 2023 (UTC)[reply]

The translation adder tool seems to be gone too. —Mahāgaja · talk 07:23, 5 December 2023 (UTC)[reply]

Hmm. Well, if it's any help in figuring out what exact set of things someone has to have turned on in order to experience vs not experience this breakage, all this stuff (navigation popups, expanding collapsed boxes, adding translations with the translation-adder, ...) still works for me. I have legacy scripts, "Format fake headings", "Add country flags", "Add default styles to text in non-English languages", navigation popups, "Enable targeted translations" (I don't think I actually use this), "Show a "Citations" tab on entry pages", "Display more accurate information about blocks", "Display excerpts from the revision deletion log" (don't recall turning this on), orange links, the anchor, ACCEL, "Enable the buttons that allow editing of translation tables", Hotcat, M Patrolling enhancements, Rhyme buttons, AWA, QQ, and Chinese dialectal maps (not sure if this works? given the WMF shutdown of graphs a while ago) enabled via Special:Preferences, and my common.js is just "importScript('User:Yair rand/orangelinks2.js'); importScript('User:Erutuon/scripts/changeCaseOrder.js');". I haven't changed any of this since before this issue was reported. All the stuff that broke recently for some users (navigation popups, etc) has kept right on working for me in Firefox and Chrome, even after clearing my cache, and the stuff I'm able to test while viewing a page logged out (e.g. expanding collapsed boxes) also works. One thing that jumps out at me that some people have enabled that I don't is Tabbed Languages; is Tabbed Languages somehow not playing nicely with the other stuff? Is it the issue mentioned here rearing its head in a new way? - -sche (discuss) 16:23, 5 December 2023 (UTC)[reply]

OK, so actually there's nothing wrong with the translation-adder tool; I just forgot to switch it back on in my Preferences after I switched everything off a few days ago to try to find the source of the problem (which turned out to be the Popups gadget). —Mahāgaja · talk 16:40, 5 December 2023 (UTC)[reply]

@This, that and the other, Benwing can you help troubleshoot this? More discussion is at Wiktionary:Grease_pit/2023/December#is_popups_disabled? but this thread summarizes it if you're busy (and the even shorter summary is: navigation popups and expanding collapsed stuff broke for some users, and loading navigation popups differently seemed to make not only them but the other things work again, but for other users like me nothing has broken this whole time). Although navigation popups broke for one user on Wikipedia too, he remarked above that he was loading - in a particularly janky way - the en.WP version which has lots of dependencies, which seems unrelated to our own version and its breakage, despite the timing. - -sche (discuss) 16:33, 5 December 2023 (UTC)[reply]

@Eirikr do you see the same issues if you log into your account using an incognito or Private Browsing window? This, that and the other (talk) 23:24, 5 December 2023 (UTC)[reply]

@This, that and the other, it's been fine for me since some time that following day, 2 December here in the US Pacific Northwest. Not sure if @Erutuon's edit did the trick, or some unrelated change made things work again. I've checked using the Edge browser both logged in, and not logged in, and via my phone's Safari browser as well. No problems currently. ‑‑ Eiríkr Útlendi │Tala við mig 00:32, 6 December 2023 (UTC)[reply]

Very strange. @-sche is anyone actually still suffering from this glitch? This, that and the other (talk) 04:02, 6 December 2023 (UTC)[reply]

Apparently not, given Mahagaja's comment above; I'm sorry to have pinged you, then, if Erutuon's edit to the one gadget resolved the issues with all of them. The remaining thing I'd be curious about is why one character being missing from the popups gadget would've made other things like expanding/collapsing boxes stop working, but I suppose as long as everything does work it's not urgent to peer too deeply into why. - -sche (discuss) 05:24, 6 December 2023 (UTC)[reply]

@-sche: The edit in question. MediaWiki:Gadget-popups.js stopped other gadgets from being run because multiple gadgets were pasted into one file by ResourceLoader (which handles loading of gadgets) and that file was given to the browser. Then the browser tried to parse that JavaScript and encountered a parsing error in one gadget, so it didn't run any of the other gadgets in that file, including ones that would have been parsed just fine. The parsing error showed up in the browser JavaScript console, and I copied the content of the JavaScript file to find where parsing failed. I don't understand how it was a syntax error or why it only showed up recently (because that part of the file hasn't changed for more than a year), but apparently a semicolon fixes it. I guess there must have been some change in ResourceLoader (perhaps a change in how the source code is minified) that caused the error to crop up. — Eru·tuon 18:39, 6 December 2023 (UTC)[reply]
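The failure mode described above can be reproduced in miniature. The sketch below is a hypothetical illustration, not the real gadget code or ResourceLoader's actual output: two tiny "gadgets" are joined into one source string, once with a newline between them and once without (as a minifier might do). When the first gadget lacks its trailing semicolon, the joined text fails to parse, so neither gadget runs.

```javascript
// Hypothetical gadget sources; these are NOT the real gadget code.
const gadgetA = 'var popups = function () { return "popups"; }'; // no trailing semicolon
const gadgetB = 'var tabs = function () { return "tabs"; };';

// Parse and run a bundle as one unit, the way a browser treats one script file.
// Returns the types of the two gadget variables, or the error name if parsing fails.
function runBundle(source) {
  try {
    const run = new Function(source + '\nreturn [typeof popups, typeof tabs];');
    return run();
  } catch (err) {
    return err.name;
  }
}

// With a newline between the gadgets, automatic semicolon insertion rescues gadgetA.
const joinedWithNewline = runBundle(gadgetA + '\n' + gadgetB);
// => ['function', 'function']

// Without the newline, "}var" is a syntax error and the whole bundle is rejected,
// including gadgetB, which is perfectly valid on its own.
const joinedMinified = runBundle(gadgetA + gadgetB);
// => 'SyntaxError'

// Adding the missing semicolon makes the minified bundle parse again.
const joinedFixed = runBundle(gadgetA + ';' + gadgetB);
// => ['function', 'function']
```

This is consistent with the observation that the affected part of the file had not changed for over a year: under this assumption, a change in how the sources are joined or minified is enough to turn a harmless missing semicolon into a parse error.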

pls let me publish the page

i worked really hard on it and i don't know why it isn't publishing 2600:1700:6E6:A610:80EE:BD4D:6540:31A9 03:50, 2 December 2023 (UTC)[reply]

Please read our Entry layout page. Your text is so badly formatted that it's fooled an abuse filter into blocking your edit as spam or vandalism. Someone is going to have to spend a great deal of time cleaning up what you've added so far. Chuck Entz (talk) 03:58, 2 December 2023 (UTC)[reply]

On closer examination, everything seems to be made-up nonsense dressed up to look like real terms in the various languages, complete with references to dictionaries that don't have entries on the words in question. Time to put a stop to it. Chuck Entz (talk) 05:52, 2 December 2023 (UTC)[reply]

Some highlights: an Old English entry in the Reconstruction namespace with a reference to Bosworth-Toller, but no explanation of why it's reconstructed when one can still look it up in a dictionary. It also links to our English lemma with only a Middle English parent in the etymology, and the MED entry for that term says it's probably borrowed from Old Norse. A Latin cognate "sniffō". An Old French entry with Du Cange's dictionary of Medieval Latin as a reference. An Old English "example" with an inflected form of the OE term in question in the middle of a modern English sentence. This person obviously has some familiarity with Old English, Sanskrit and Proto-Indo-European, among other languages; it's a shame they decided to waste their time and ours on this. Chuck Entz (talk) 06:25, 2 December 2023 (UTC)[reply]

It might be the same person who gave us the beadle hoax a few weeks ago. —Soap— 07:30, 2 December 2023 (UTC)[reply]

i want to correct the translation

https://en.wiktionary.org/wiki/nachui - i want to change the "Let it go on dick - blood how much it stinks here" to "Holy fuck - it really fucking stinks here" eina nachui should be translated as "holy fuck" and there's bliat which also is a swear word that can be translated as "fuck/fucking" 46.204.109.2 16:17, 3 December 2023 (UTC)[reply]

This seems a reasonable request so I fixed the obviously wrong translation, but someone who knows Lithuanian needs to review the usex to make sure it's grammatical etc. (The only words I recognize here are nachui and bliat, because these come from Russian; otherwise I have no idea.) Benwing2 (talk) 01:57, 4 December 2023 (UTC)[reply]

Template:collocation

This template follows {{ux}}, allowing only one collocation per line. See off the bat for its current display. It would be better, IMO, if it followed {{syn}}, allowing multiple items per line, possibly with something fancy to adjust to available screen/frame width. Most collocations are short. For most screen widths (not phones), more than one collocation could appear on the same line without crowding. DCDuring (talk) 13:07, 4 December 2023 (UTC)[reply]

@DCDuring: Does anyone else have thoughts about whether this makes sense to do? Also, if we were to do it, it would only make sense for desktop browsers, as you note; mobile browsers would still want one collocation per line. Does someone who knows CSS (e.g. User:Erutuon, User:This, that and the other) know if it's possible to have different output of this sort for desktop vs. mobile? Maybe we'd have two outputs, one wrapped to display only on non-mobile and one wrapped to display only on mobile? Also I'm not sure about whether the fancy adjustment to frame width can be done, but maybe we can follow the lead of the existing translation tables. Benwing2 (talk) 20:51, 4 December 2023 (UTC)[reply]

Didn't we have something for mobile phones? It's not unusual for some parts of website content to be suppressed in mobile use. The WP app doesn't let you see lots of things unless you go to the desktop site, where you risk ugliness, endless side-to-side scrolling to read content etc. DCDuring (talk) 21:50, 4 December 2023 (UTC)[reply]

Yes, this should be possible using CSS flex. I'll look. This, that and the other (talk) 21:41, 5 December 2023 (UTC)[reply]

@Benwing2, This, that and the other, Vininn126: If this is really too complicated I can use {{collocation}} as I have changed it at [[off the bat]]. It's just a tad more work and may not be consistent across entries and even lines on the same English L2. Coming up with a more compact display, yet usable and attractive, when there are translations seems harder. DCDuring (talk) 22:03, 5 December 2023 (UTC)[reply]

Hmm, I spoke too soon. I forgot that each collocation appears on its own colon-indented line, and those colon indents are not specifically identified as collocations. In view of this, it's not possible to achieve this effect without a new template like {{coi-multiple}} that works only for English and takes multiple collocations as its various parameters, like {{coi-multiple|en|'''chocolate''' bar|'''chocolate''' cake|hot '''chocolate'''}}. This, that and the other (talk) 22:33, 5 December 2023 (UTC)[reply]

@This, that and the other Right, I had assumed we would need a new template. Thanks for looking into this. Benwing2 (talk) 22:52, 5 December 2023 (UTC)[reply]

@DCDuring I would definitely not recommend doing what you have done at off the bat; this seems really hacky and unlikely to work across multiple browsers, devices, etc. Benwing2 (talk) 22:53, 5 December 2023 (UTC)[reply]

That ship has already sailed and perhaps demonstrates that there is a need for a {{coi-multiple}} template... This, that and the other (talk) 07:11, 6 December 2023 (UTC)[reply]

@DCDuring Fuck me, please don't do this any more. I'm gonna have to write a bot script to fix them all up. Benwing2 (talk) 07:54, 6 December 2023 (UTC)[reply]

Has anybody ever complained about the appearance? We seem to have inadvertently run a 10-year user-response test, finding no user problem with this. What is the technical problem caused by the HTML? DCDuring (talk) 15:01, 6 December 2023 (UTC)[reply]

@Benwing2 to be absolutely clear, these were not all added by DCDuring! It seems they were mostly added by ReidAA about 10 years ago. Here is another one added by a different user. I just meant my comment "that ship has already sailed" as a general statement. This, that and the other (talk) 11:21, 6 December 2023 (UTC)[reply]

@Benwing2 I added exactly one illustrative use. Isn't {{collocation}} more recent than ReidAA's brief tenure here? [I see. He was using {{ux}} for what we now call collocations. I don't think I joined in on that.] As I recall ReidAA was driven off enwikt for adding non-breaking spaces before dashes (which kept them from sometimes appearing at the start of lines). Does CSS do that automagically? DCDuring (talk) 14:51, 6 December 2023 (UTC)[reply]

@DCDuring My apologies for assuming the additions were yours. I wouldn't take silence as consent here; a lot of obvious problems can persist for years before someone points them out. Benwing2 (talk) 21:56, 7 December 2023 (UTC)[reply]

More than 400 entries over 10 years is a lot of exposure to users and contributors. DCDuring (talk) 22:26, 7 December 2023 (UTC)[reply]

@DCDuring As I said, I've seen tons of real breakage that has had years of "exposure" without people commenting on it. In this case, for example, the use of the hackiness will cause issues for mobile, but in general our mobile experience is so fucked up that people won't comment on it. Benwing2 (talk) 22:29, 7 December 2023 (UTC)[reply]

Well, first things first. In any event, we also don't know whether anyone ever simply removed the non-breaking spaces. DCDuring (talk) 22:34, 7 December 2023 (UTC)[reply]

This is a bad idea for languages that aren't English. Vininn126 (talk) 20:57, 4 December 2023 (UTC)[reply]

Why? DCDuring (talk) 21:09, 4 December 2023 (UTC)[reply]

Are you suggesting that they be alongside each other as well as in lines, or just all next to each other? Imagine having 10+ collocations (often on Polish pages) with translations all on one line. Vininn126 (talk) 21:13, 4 December 2023 (UTC)[reply]

I don't know the best way to handle translations of collocations. It may be that the current display is the best we can do for languages other than English. But English doesn't need to be handicapped by things required for other languages. DCDuring (talk) 21:54, 4 December 2023 (UTC)[reply]

Bot for automating the creation of conjugated verb pages

I want to make a bot that automatically creates the page links made by {{pt-conj}}. See isentar for an example; I wanna fix those redlinks. This bot does pretty much exactly what some older, now-inactive bots did, like User:NadandoBot and User:BuchmeierBot. How can I do it? I presume I can't just copy-paste their code (especially since neither of them have made their Portuguese code public seemingly...). Ideally, I'd be able to extend this to {{gl-conj}} and {{gl-reinteg-conj}}. I figure it can't be all that impossible for someone experienced at this, but that's not me haha. Is there any chance the code by those other bots (or by ones focusing on {{es-conj}}) can be reused for these purposes of mine? I want to try and create all these entries myself (making sure to look out for any mistakes in irregular verbs as I go), and making a bot would really make this actually realistically doable — according to Jberkel's latest dump, there's so much to be done. MedK1 (talk) 02:15, 6 December 2023 (UTC)[reply]

@MedK1 {{pt-conj}}, {{gl-conj}}, {{gl-reinteg-conj}} and {{es-conj}} all work the same so writing a script for all 4 isn't the issue. The issue is rather writing a script for just one of them. I have a script to create Russian inflections (nouns, verbs, adjectives) and it runs to 3,500 lines of Python. Granted, there's a lot of complexity with Russian (in particular with correctly handling irregular transliterations) that is not present for Latin-script languages. But there are still lots of cases to handle, e.g. whether there's an existing entry for the language (and if so, does it already encode this particular form?), whether there are multiple etymology sections, whether there's no entry for the language but an entry for other languages, various formats for the existing entry, etc. Nonetheless I can probably adapt my Russian script by deleting much of the complexity and giving up in overly complex cases (which would have to be cleaned up manually), but it will take a little while before I get there as I have some other projects in flight. Benwing2 (talk) 09:48, 6 December 2023 (UTC)[reply]

@MedK1: I went ahead and worked up a script to add conjugations for these four lang/spelling combinations. Because of a lot of factors it's much simpler than the Russian one -- only 322 lines currently, which is < 1/10 the size of the Russian script. Before letting it loose I need to do full test runs for all four combinations to make sure nothing goes wrong; these are running currently and should be done within around 24 hours (for Spanish, which will take the longest given the number of verbs involved). Benwing2 (talk) 08:31, 7 December 2023 (UTC)[reply]

@Benwing2: That's some great news, and so quick too! Thank you so much, Benwing! I'm a bit curious about how it'll actually work though: you mentioned "letting it loose", so would the bot just automatically scout and create any redlinks in the template tables? Would editors manually feed it verbs like with User:BuchmeierBot? Would I be able to somehow operate it? I have no idea how these scripts work... MedK1 (talk) 13:58, 7 December 2023 (UTC)[reply]

@MedK1 I don't have a "feedme" setup or anything. It would operate by me manually running it on some set of verbs, probably all the verbs in CAT:Galician verbs or CAT:Portuguese verbs or whatever. Given a verb, it uses the |json= flag to get all the inflections, and creates any that are missing. This includes those for which a definition already exists that isn't a verb inflection; e.g. it will convert the existing definition of abaixo to Etymology 1 and add an Etymology 2 for the verbal sense. It also includes cases where there's already a verb inflection but for a different verb, e.g. afago is currently defined as a verbal inflection of afacer but it's also an inflection of afagar; in this case, it inserts another definition directly after the existing one. OTOH if it sees a verbal definition of the same verb already present, it will do nothing if the conjugation matches, but issue a warning (and take no action) if the conjugations don't match (e.g. if it's trying to insert {{gl-reinteg-verb form of|aderir<i-e>}} and it sees {{gl-reinteg-verb form of|aderir}}). What isn't completely resolved is what to do for standard vs. reintegrationist forms of the same verb; currently if for example the word amaremos is being processed for the reintegrationist norm and there's a {{gl-verb form of|amar}} definition already in existence but no {{gl-reinteg-verb form of|amar}}, it will insert the latter directly after the former. Possibly it should do something smarter, but I don't quite know what (see whether the actual inflections are the same or something?). Benwing2 (talk) 21:19, 7 December 2023 (UTC)[reply]
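The cases described above can be sketched as a small decision function. This is only an illustration of the logic as stated in this thread, not the actual bot code: the function names, the returned action strings and the regular expressions are all assumptions made for the example, and real entries have many more formats than this handles.

```python
import re

def language_section(page_text, language):
    """Return the wikitext of the ==language== section, or None if absent."""
    start = re.search(r"^==[ \t]*" + re.escape(language) + r"[ \t]*==[ \t]*$",
                      page_text, re.MULTILINE)
    if start is None:
        return None
    rest = page_text[start.end():]
    # A level-2 header starts with exactly two '=' characters.
    nxt = re.search(r"^==[^=]", rest, re.MULTILINE)
    return rest if nxt is None else rest[:nxt.start()]

def decide_action(page_text, language, template, spec):
    """Pick an action for one inflected form (hypothetical action names).

    `spec` is the conjugation spec passed to the form-of template,
    e.g. "aderir<i-e>"; the lemma is whatever precedes the first "<".
    """
    if page_text is None:
        return "create_page"
    section = language_section(page_text, language)
    if section is None:
        return "add_language_section"
    pattern = r"\{\{\s*" + re.escape(template) + r"\s*\|([^|{}]*)"
    existing = [m.strip() for m in re.findall(pattern, section)]
    lemma = spec.split("<", 1)[0]
    if spec in existing:
        return "skip"                    # same verb, same conjugation
    if any(e.split("<", 1)[0] == lemma for e in existing):
        return "warn_mismatch"           # same verb, different conjugation
    if existing:
        return "insert_after_existing"   # inflection of a different verb
    return "wrap_in_etymologies"         # only non-verb-form definitions
```

For instance, with a page that already defines afago as a form of afacer, calling `decide_action(page, "Galician", "gl-verb form of", "afagar")` would give `"insert_after_existing"`.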

One thing I forgot to add: I am skipping reflexive verbs entirely for now because there are additional complexities. In particular, if the corresponding non-reflexive verb exists, the inflected forms should be based off of that verb; only when there's no corresponding non-reflexive verb should the inflected forms (which don't include the reflexive pronoun) be defined as forms of the reflexive verb. That requires additional logic that I don't feel like dealing with now. Benwing2 (talk) 22:35, 7 December 2023 (UTC)[reply]

One other thing: the effect of wrapping existing sections with Etymology 1 and adding Etymology 2 is that Pronunciation sections end up inside of Etymology 1 when in some cases they may refer to both etymologies and (hence) should be placed above both of them. The problem is that in general my bot can't know whether it's OK to move the Pronunciation section up above both Etymology N sections. Specifically, it's probably OK for Spanish but often not for Portuguese or Galician, where you have e.g. governo "government" with /e/ but governo "I govern" with /ɛ/. As a result I think it's best done in a separate pass, either manually or by bot when it's possible to automatically determine that it's OK to do. Benwing2 (talk) 22:55, 7 December 2023 (UTC)[reply]

And one final comment: I know that stressed e and o in Portuguese -ar verb forms are generally /ɛ/ and /ɔ/, not /e/ or /o/, but are there exceptions and if so under what circumstances? I have heard that there are exceptions before palatal consonants like j and lh, hence eu desejo and eu aconselho with /e/. But how general are these exceptions? For example, eu molho is currently given in Wiktionary with /ɔ/. For reference, here are the verbs in -ejar and -elhar that we have entries for:

verbs in -ejar, -elhar

aconselhar
ajoelhar
almejar
alvejar
aparelhar
apedrejar
arejar
arpejar
arquejar
assemelhar
avermelhar
azulejar
bafejar
bocejar
cacarejar
centelhar
chamejar
cortejar
cotejar
dardejar
deixar a desejar
desaconselhar
desejar
despejar
destelhar
drapejar
embotelhar
emparelhar
ensejar
entelhar
esbracejar
esbravejar
espelhar
farejar
festejar
flamejar
fraquejar
gaguejar
gargarejar
golfejar
gorgolejar
gotejar
gracejar
grelhar
grugulejar
invejar
lacrimejar
lampejar
latejar
manejar
marejar
mercadejar
motejar
mourejar
ornejar
padejar
palejar
palidejar
pelejar
pentelhar
pestanejar
planejar
porejar
praguejar
rastejar
rastrejar
relampejar
remanejar
sacolejar
sejar
semelhar
sobejar
telhar
traquejar
trebelhar
trovejar
velejar
vellejar
verdejar
vermelhar
vermelhejar
versejar
vicejar

Benwing2 (talk) 23:03, 7 December 2023 (UTC)[reply]

So I wouldn't be able to do any of it myself huh.

Still, that's way more than I was expecting; I didn't think it'd be able to handle irregular and <i-e> forms too. The latter is especially useful since I remember fixing a few verbs that had it wrong before. Definitely wasn't expecting automatic pronunciations either haha. This is great.

For standard vs reintegrationist Galician, I like the idea of seeing whether the inflections are the same; there really shouldn't be many cases where they use the same form for different tenses or anything.

Reflexive verbs are a mess right now, I totally understand you skipping them.

As for -elhar, -ejar verbs, this is actually a little embarrassing, but there were a lot of verbs where I straight up don't know? I've never used or seen them used before in the 1st person singular. Every single one I've actually used before is "ê"/"ô" though. Except for "invejar", that is. That one's "é".

100% sure these are "ê"

aconselhar
ajoelhar
apedrejar
assemelhar
bocejar
cortejar
cotejar
deixar a desejar
desaconselhar
desejar
despejar
farejar
festejar
gaguejar
pentelhar
planejar
rastejar

Not 100%, but I feel these are "ê"

arquejar
azulejar
bafejar
centelhar
emparelhar
espelhar
gotejar
gracejar (noun has "ê")
praguejar
semelhar

100% "é"

invejar

MedK1 (talk) 02:26, 8 December 2023 (UTC)[reply]

@MedK1: You would be able to do it yourself if you have a bot account and are comfortable running Python scripts from your local computer. If that sounds doable, let me know and I can help you with the setup. In order to get things working à la User:BuchmeierBot, I'd have to set up some sort of Toolforge process that runs continuously in the background, checking a feedme page. This is possible but it might take a lot of work as I haven't done this before and don't know what would be required to get it working.

As for the reintegrationist vs. standard Galician verb forms, are you suggesting that if both occur in the same sets of slots, we should just skip including a separate call to {{gl-reinteg-verb form of}}? If so, that is certainly possible and I'll see about implementing it.
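To make the "same sets of slots" check concrete, here is a minimal sketch. It assumes each conjugation is available as a mapping from slot name to a list of forms; the slot names and the function name are made up for illustration and are not taken from the templates' actual |json= output.

```python
def needs_separate_reinteg_call(standard_slots, reinteg_slots, form):
    """Return True only if `form` fills different slots in the two norms.

    Both arguments map a slot name (hypothetical, e.g. "fut_1p") to the
    list of forms occupying that slot.
    """
    std = {slot for slot, forms in standard_slots.items() if form in forms}
    rei = {slot for slot, forms in reinteg_slots.items() if form in forms}
    if not rei:
        # Not a reintegrationist form at all, so nothing to add.
        return False
    return std != rei
```

If amaremos occupies exactly the same slots under both norms, the function returns False and the existing {{gl-verb form of|amar}} line would be left to stand on its own.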

Thanks very much for the comments on -ejar and -elhar verbs. What about -ojar and -olhar verbs? Are they similarly mostly ô or is it different? Here's a list of them (there are many fewer than -ejar/-elhar verbs, only about 15 total):

verbs in -ojar, -olhar

alojar
antojar
antolhar
arrojar
bojar
demolhar
desalojar
despojar
enojar
entreolhar
espojar
folhar
molhar
olhar
realojar
rojar

Benwing2 (talk) 02:48, 8 December 2023 (UTC)[reply]

BTW I just implemented some other fixes, e.g. checking for cases where {{gerund of}} is used instead of {{*-verb form of}} and inserting missing participle definitions as a separate subsection directly before adjective subsections instead of creating a new etym section, on the assumption that they're related. I also have the code skip short past participles as well as any other verb forms that have the same form as a short past participle form (e.g. (que ele) entregue and (que tu) entregues), because they require special handling that is best done manually. Benwing2 (talk) 02:55, 8 December 2023 (UTC)[reply]

@Benwing2: Awesome, love to see it! By the way, yeah, I'm definitely interested in running Python scripts from my computer and whatnot. I'd have to create the bot account and I don't know a single thing about Python (other than no ;s what the!), but I can learn for sure!

You said you were manually running on a set of verbs? If it were possible for me to do something like that after getting the Python-related program and the bot account set up, maybe you wouldn't need Toolforge/feedme at all? MedK1 (talk) 03:05, 8 December 2023 (UTC)[reply]

@MedK1 Running bot scripts in a bot account generally requires permission (meaning you need to set up a vote for this and post some information on what you're planning on doing with the bot). See Wiktionary:Votes/bt-2023-09/User:KamusiBot for bot status and Wiktionary:Votes/bt-2023-06/User:KovachevBot for bot status for the two most recent such votes. Generally these votes are one week long (and they usually pass). In the meantime you can do test runs out of your own account, or go ahead and create the bot account as long as you use it only for tests. Setting up the Python script environment isn't generally so hard; it depends on whether you're running on a Mac or a Windows machine but the Python environment as well as the Pywikibot library that my bot code uses to interface with Wiktionary are both quite stable and mature, so you shouldn't run into too many issues. My bot scripts let you specify the set of verbs to run on in various ways, e.g. all verbs in a category (optionally filtered down in some way), a specified list of verbs, etc. If you got that working, yes I wouldn't need to set up a Toolforge process or anything. Benwing2 (talk) 03:23, 8 December 2023 (UTC)[reply]

Alright! I should be able to find time to properly set it up in a week from now; currently I'm a little swamped with real-life affairs... MedK1 (talk) 19:32, 9 December 2023 (UTC)[reply]

@Benwing2: Yeah, that's exactly what I'm suggesting! Especially since the reintegrated terms category only includes forms that are exclusively reintegrationist...

For -olhar and -ojar, alojar and its derivatives (desalojar, realojar) are all "ó". molhar, olhar and entreolhar are "ó" too. enojar is "ó" as well. I've never seen most of the other ones, and arrojar/despojar are most commonly (as in, pretty much only ever) seen in "arrojado"/"despojado" so I don't know what they'd be in 1st person singular. My gut says "ó" for those too though.

There's not a single verb there where I'm like "oh, that's definitely 'ô'". MedK1 (talk) 03:02, 8 December 2023 (UTC)[reply]

@MedK1 Thanks, that's very helpful. Note that AFAIK the third-singular and third-plural present for -ar verbs have the same vowel as the first-singular present, so for verbs that occur in the present tense it should be fairly clear which vowel is used even if the verbs don't normally occur in the first-singular. When I have a chance I'm going to clean up the pronunciations of the forms of these -elhar/-olhar/-ejar/-ojar verbs; I'm sure a lot of them are currently wrong. Benwing2 (talk) 03:25, 8 December 2023 (UTC)[reply]

Tag German Low German

This tag displays "German Low German". Example: German Low German hering. Is this duplication intended? Duchuyfootball (talk) 04:09, 6 December 2023 (UTC)[reply]

That is the name of the language coded nds-de:

{{langname|nds-de}} ⇒
German Low German

As to where this code came from, see the links to discussions in the relevant row of WT:LT. This, that and the other (talk) 07:13, 6 December 2023 (UTC)[reply]

Briefly, German Low German is Low German as spoken in Germany as opposed to the Netherlands. —Mahāgaja · talk 07:25, 6 December 2023 (UTC)[reply]

False positive detection of vandalism when adding section heading to Discussion

I tried to add ===Formatting needed=== or ==Formatting needed== to the top of Talk:positive to deal with the irritating 'orphaned' comment at the top. But the edit was wrongly blocked as vandalism. —DIV (49.186.112.234 06:42, 6 December 2023 (UTC))[reply]

There was a stray ''Italic text'' in your edit, presumably because you clicked something. You're supposed to replace the "Italic text" part with the actual text you want in italics, so it should never be there by the time you click Publish Changes. Some vandals like to click random things just to make a mess, which is why the abuse filter blocks edits with that kind of thing. Chuck Entz (talk) 07:16, 6 December 2023 (UTC)[reply]

Update data for Kamayo

According to w:Kamayo language it has some alternate names we should probably document on the category page and also the script (Latin surely?) needs to be specified. Acolyte of Ice (talk) 13:25, 6 December 2023 (UTC)[reply]

"Older Latin" noun declension for modern Latin terms

Template:la-ndecl adds a footnote "Found in older Latin (until the Augustan Age)." to the output for the singular genitive of 2nd-declension nouns. This gives the false impression that modern-day Latin terms like paramecium were used in Ancient Rome (when Paramecium, for example, was coined in 1752). Thoughts on how best to address this? Perhaps a flag to suppress the older form where it is not relevant? -Stelio (talk) 12:30, 7 December 2023 (UTC)[reply]

It is possible to suppress it, as per the documentation at Template:la-ndecl#Second-declension_nouns: the format is "la-ndecl|absārius<2.-ius>" (or "2.-ium" for a neuter noun). There was discussion a while back about flipping the defaults that got some support, but hasn't gone forward yet (see User:Benwing2/la-noun-ius-ium).--Urszag (talk) 12:45, 7 December 2023 (UTC)[reply]

fixed by Urszag. Thank you very much! :-) -Stelio (talk) 16:20, 7 December 2023 (UTC)[reply]

Template:la-IPA and Category:Latin terms with Ecclesiastical IPA pronunciation only

The documentation at Template:la-IPA alleges that there is a parameter "|classical=0 or |classical=no: Don’t generate the Classical pronunciation (generated by default)." However, Category:Latin terms with Ecclesiastical IPA pronunciation only is empty, plus the parameter didn't work when I tried to add it to foetus#Latin (a variant spelling of fetus that seems to have arisen postclassically, only after "oe" and "e" had merged in pronunciation). Urszag (talk) 16:53, 7 December 2023 (UTC)[reply]

@Urzsag I vaguely remember doing some preliminary work to fix either this or a related issue, but it never got finished. Benwing2 (talk) 21:01, 7 December 2023 (UTC)[reply]

@Urszag Oops. Benwing2 (talk) 21:51, 7 December 2023 (UTC)[reply]

@Urszag This was due to a recent change by User:Theknightwho to Module:parameters, which has broken various things by not correctly distinguishing between `false` and `nil`. BTW, TKW, there are several places you have written things like `foo and bar` instead of `foo and bar or nil`. These are not the same; the former can evaluate to `false` (for example, when `foo` is `false`), whereas the latter evaluates to `nil` whenever `foo and bar` is `false` or `nil`. In general I'd strongly recommend being more careful with `false` vs. `nil`, and always use `foo and bar or nil` unless you specifically intend to have the variable assume the value of `false`. Benwing2 (talk) 07:32, 8 December 2023 (UTC)[reply]

@Benwing2 Thanks. Theknightwho (talk) 08:05, 8 December 2023 (UTC)[reply]

Inflections with a red link for singular

@This, that and the other, Benwing2: Is there any particular purpose to the hidden Category:Inflections with a red link for singular? If so, it would make sense to split them by language, rather than merging them into 26 currently difficult to access sections. However, I regard such cases as a way of eliminating optionally orange links (i.e. links to non-existent entries that would go on extant pages) for inflections or other scripts' forms when there is no page for the citation form. (Example: Pali วน (vana).) I think 'singular' is a misnomer for 'citation form'. Not all citation forms are even words! --RichardW57 (talk) 09:28, 8 December 2023 (UTC)[reply]

@RichardW57 This is the first I've seen this. User:This, that and the other how is this being generated? I'm pretty sure this category is duplicative of other already-in-existence categories that are properly split by language. Benwing2 (talk) 09:58, 8 December 2023 (UTC)[reply]

@Benwing2 I can't remember how it is being generated; perhaps in the code of {{inflection of}}? As for the reason, it was a sort of experiment to see if anyone found it useful. It could certainly be split by language if that's something we want to do. This, that and the other (talk) 10:04, 8 December 2023 (UTC)[reply]

@Benwing2 yes, it is generated there. We also have Category:Plurals with a red link for singular. The whole thing is very crude, as what we are really interested in is whether the link is "orange", and that can only be verified by dumps (or expensive Lua parsing, which we should not do). This, that and the other (talk) 10:06, 8 December 2023 (UTC)[reply]

@Benwing2: It's not reduplicative; at least I don't see anything similar in the categories of Pali sīdati. Those who believe that every inflected form should have a possibly shared page will probably find it useful. I would also find it useful if it could find orange links - at the moment I am only dealing with them if I stumble across them, and often just hold my nose for non-Pali orange links. --RichardW57 (talk) 10:08, 8 December 2023 (UTC)[reply]

Incidentally, most Pali verbs fall into this category, for the lemma is usually a homonym of a case form of the present active participle, and I normally only add pages for present participles if I have quotations for them. --RichardW57 (talk) 10:02, 8 December 2023 (UTC)[reply]

The category is clogged with errors, at least in English. If a term appears in the categorizing template with a colon (":") before it or a section heading (e.g. "#Noun"), then it appears in this category. (There may be other cases.) Thus anyone occupying their time by adding the singulars will find that they are occupying more time than necessary, having to separate wheat from chaff. DCDuring (talk) 15:54, 8 December 2023 (UTC)[reply]

@DCDuring: I'm afraid your second sentence does not compute. You mean, 'If the lemma parameter has a colon before the lemma or a section heading after the lemma, ...' I had to look for an example to work out what you meant. I feel one should use |pos=noun in such a case, though the fragment identification might work better for English. However, the fragment notation takes one to the first noun antic, leaving no hint of the second English noun! I think that one is a job for an etymology or sense ID and one definition line per etymology or sense. --RichardW57 (talk) 16:29, 8 December 2023 (UTC)[reply]

@Theknightwho A curious upshot is that a lot of these spurious English errors need attention anyway! Or is the general view that these fragment IDs are (still?) legitimate? --RichardW57 (talk) 16:29, 8 December 2023 (UTC)[reply]

I would have thought that the non-alphanumeric characters surrounding the lemma would make it possible to correctly categorize these. If section links are against rules (sometimes? all the time?), this is hardly the way to identify and correct such violations. The colon is (apparently) essential. DCDuring (talk) 16:57, 8 December 2023 (UTC)[reply]

@DCDuring: It's not an ideal way of enforcing such a rule (if it exists), but it is better than nothing for enforcing it. Meanwhile, investigating anomalous Pali examples has found at least one error, namely where the lemma parameter was blank. I'm poor at distinguishing black and visited blue. I often have to hover over links to tell whether they're visited blue or blackened red (as in inflection tables). --RichardW57 (talk) 17:41, 8 December 2023 (UTC)[reply]

@RichardW57 @DCDuring I renamed the category to the more logical name Cat:Inflections with a red link for lemma and fixed the code so there are far fewer false positives. Do you think this is something of value such that it is worth investing more effort to break it up into language subcategories? This, that and the other (talk) 01:03, 10 December 2023 (UTC)[reply]
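For readers wondering how the colon and section-heading false positives arise, here is a rough sketch of the kind of normalisation that avoids them. It is written in Python purely for illustration (the real check lives in a Lua module), and the function names and the `existing_titles` set are hypothetical.

```python
def lemma_page_title(lemma_param):
    """Reduce a lemma parameter such as ":antic" or "antic#Noun" to a page title."""
    title = lemma_param.strip()
    if title.startswith(":"):
        title = title[1:]              # a leading colon is not part of the title
    title = title.split("#", 1)[0]     # drop any "#Section" fragment
    return title.strip()

def is_red_link_lemma(lemma_param, existing_titles):
    """True if the lemma names a page that is not in `existing_titles`.

    A blank lemma is treated as a separate error, not as a red link.
    """
    title = lemma_page_title(lemma_param)
    return bool(title) and title not in existing_titles
```

Checking the raw parameter text against the list of pages, without this step, would report "antic#Noun" as missing even though antic exists.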

Yes, I think it would be worth breaking up by language. I could then add something to the Pali category to say which ones should be eliminated. I've made several enhancements to Pali entries in response, but now I'm trying to work out how 'cc' looked like 'ñc' in some script relevant for Pali. I think my best hope is to trawl through an old Sinhalese grammar. --RichardW57 (talk) 02:37, 10 December 2023 (UTC)[reply]

@User:RichardW57, @User:This, that and the other I find it easy enough to identify English entries by looking under the heading "E". Pali would be under "P" and has a script that probably distinguishes it from other languages beginning with "P" that are in the category. DCDuring (talk) 17:30, 10 December 2023 (UTC)[reply]

@DCDuring, This, that and the other: Yes, that half of the battle was won by my enabling the 'list of contents' to take one to the right page. What I want to do is add some notes to the category about when it's OK for the lemma to be red-linked. For example, it's OK if the lemma has a blue-linked transliteration. --RichardW57 (talk) 18:40, 10 December 2023 (UTC)[reply]

I don't think we should have "forever redlinks". A redlink implies a missing entry. If the entry for a term will never be linked, I feel like it should be displayed using the alt parameter:

third-person nominative passive of something

This, that and the other (talk) 21:44, 10 December 2023 (UTC)[reply]

@This, that and the other: In principle, they're not red forever, they're just waiting until an editor turns up a quotation so that the lemma can be added. A past participle doesn't always prove the existence of the finite verb, though I could be persuaded that a Pali present tense justifies adding an entry giving the inflection of a present participle. My time is bounded by my lifespan and my money is bounded by my stinginess. --RichardW57 (talk) 09:54, 11 December 2023 (UTC)[reply]

Adding items to Watchlist automatically, but for a limited time

One can add items to one's watchlist for a limited time (1 week, month, or year or 3 or 6 months) or one can add items to one's watchlist automatically, but permanently, AFAICT. (See Preferences: Watchlist.) It would be handy to allow limited time as a possibility for each of the conditions which lead to an addition to one's watchlist. Is that something that can be done here or does it require Wikimedia action? DCDuring (talk) 16:24, 8 December 2023 (UTC)[reply]

This is something I would like too. I even wrote the MediaWiki code for it: gerrit:789321 but I got stuck at the point of writing tests and left it for others to complete. The patch is described in the comments as "super close" so hopefully @MusikAnimal will be able to find time in the next few months to look at it. This, that and the other (talk) 02:50, 9 December 2023 (UTC)[reply]

Also ping @TheresNoTime on this (I'm not sure if you're a pure volunteer or also a WMF staff member!) This, that and the other (talk) 02:10, 10 December 2023 (UTC)[reply]

@This, that and the other: Hey, also staff, but replying with my "volunteer hat" on 🙂 I'll see if MusikAnimal wants to work on this during an upcoming hackathon. I'm pretty sure the code was fairly close to being ready! TheresNoTime (talk) 10:38, 11 December 2023 (UTC)[reply]

As it happens, we're in sort of a hackathon this week, and I have @This, that and the other's patch at the top of my list of priorities. I expect it to finally, finally be DONE come next week at the latest :) MusikAnimal (talk) 23:55, 11 December 2023 (UTC)[reply]

@User:MusikAnimal. Thanks. We're keeping our fingers crossed. DCDuring (talk) 15:41, 12 December 2023 (UTC)[reply]

@DCDuring @This, that and the other Update at phab:T265716#9414068. I apologize I underestimated how much work was left! I'm 100% going to finish this project, it just won't happen this week as I had hoped. MusikAnimal (talk) 19:35, 18 December 2023 (UTC)[reply]

@MusikAnimal Thanks for the update. Good luck. DCDuring (talk) 19:48, 18 December 2023 (UTC)[reply]

`{{RQ:Vulgate}}` is broken and it's unclear how to proceed.

It wasn't even listed in {{Bible quotation templates}}, but that's fixed now.

It links to Latin Wikisource, but its target, s:la:Biblia Sacra Vulgata (Stuttgartensia), no longer exists. s:la:Vulgata Clementina appears to be a complete version, but it doesn't have page images and it uses a different naming convention. The text is a newly published version (2002, updated more recently), as well.

Should we be linking to scans of a physical copy? The {{RQ:Wycliffe Bible}}, for example, links to an 1850 copy of the text. grendel|khan 17:12, 8 December 2023 (UTC)[reply]

@Sgconlaw Can you respond to this? You seem generally aware of how these templates work. Benwing2 (talk) 03:04, 11 December 2023 (UTC)[reply]

Am rather busy in real life these couple of days, but I’ll have a look later in the week. — Sgconlaw (talk) 04:40, 11 December 2023 (UTC)[reply]

@Benwing2, Grendelkhan: I have revamped the quotation template based on this 2007 version of the Vulgate at the Internet Archive. Please have a look and see if it requires tweaking. In particular, because I don't know Latin, I am not sure if I have selected the correctly declined forms of the books of the Bible (and possibly other chapters of the work)—please help to check these. Thanks. — Sgconlaw (talk) 18:05, 13 December 2023 (UTC)[reply]

I've also realized that the Latin names of books of the Bible which I used in the template seem to differ from the original ones used at the now-missing version at the Latin Wikisource (which I cannot refer to). We'll have to decide whether to use the original names, or to update all the entries using the template. — Sgconlaw (talk) 18:12, 13 December 2023 (UTC)[reply]

@Sgconlaw Thanks! Can you give me some examples of the differences in the book names? Benwing2 (talk) 20:55, 13 December 2023 (UTC)[reply]

@Benwing2: I looked randomly at senex just now, and realized it stated the book of the Bible as |book=Samuelis I, which is not the name I used in the updated template. Fortunately, the list of entries using the template is not massively long, so if we had to replace the names manually it wouldn't be too laborious.

However, I do think the issue of what the best names to use are needs to be confirmed by a Latin speaker. I don't know if there are standard names for books of the Bible in Latin; if there are, perhaps we should use them. I can't really tell with certainty what names to use based on the work. For example, in the work 1 Chronicles is entitled "Liber Dabreiamin id est Verba Dierum qui Graece Dicitur Paralipomenon". The running head is "Verba Dierum", with "I Par" on the page corners. I therefore went with "I Paralipomenon" since it was most similar to "1 Chronicles". — Sgconlaw (talk) 21:10, 13 December 2023 (UTC)[reply]

@Sgconlaw Hmm, I have studied Latin but I don't know the Vulgate in detail. The meaning of "Liber Dabreiamin id est Verba Dierum qui Graece Dicitur Paralipomenon" is "The Book of Dabreiamin (?), that is Words of the Days, which is called Paralipomenon in Greek". I don't know what Dabreiamin means; it's not Latin. Here it essentially gives three versions of the titles, one in some unrecognized language, one in Latin, one in Greek. Benwing2 (talk) 21:26, 13 December 2023 (UTC)[reply]

@Sgconlaw I think Dabreiamin is a transliteration of דברי יומימ davréi yomím, which means "the words of days" in Hebrew. Benwing2 (talk) 21:31, 13 December 2023 (UTC)[reply]

I think the issue is these books aren't titled in the original Hebrew, so the early translators kind of made things up. Benwing2 (talk) 21:32, 13 December 2023 (UTC)[reply]

First of all, Hebrew דָּבָר also means "thing", so it might have been used like Latin rēs. As for "iamin": the plural in -in reminds me of Aramaic, but then it would have a וֹ, which would end up as "v" or "o" in Latin: Aramaic יוֹמִין. As for "Paralipomenon", that would be Ancient Greek Παραλειπομένων (Paraleipoménōn), which refers to things that were "left out" of the books of Kings. Chuck Entz (talk) 04:52, 14 December 2023 (UTC)[reply]

@Sgconlaw: Wow, that is really thorough. It's a little tough because there aren't chapter headings, I don't speak Latin, and it's in an odd script, but this is pretty much the best case one can hope for for a fourth-century source. Great work! grendel|khan 23:43, 25 December 2023 (UTC)[reply]

@Grendelkhan: you’re most welcome. I’m still waiting for some help confirming the Latin declensions, though. I may have got some of the forms wrong. — Sgconlaw (talk) 04:09, 26 December 2023 (UTC)[reply]

Multipart labels

I think I have queried on this general topic before.

My current question is what template to use to produce

(countable and uncountable)

as at bloom. Something similar is shown at the headword(?), but I want to add it to the 8th sense (algal bloom). Temporarily I've added

(countable, uncountable)

but I find these commas tiresome, because it's never clear to me whether they're supposed to mean AND or OR. Obviously

(countable and uncountable)

loses the ability to look up the glossary.

Hmmm.... ...or, now that I think further, perhaps this sense should be split???

(countable) An algal bloom.
(uncountable) Occurrence of an algal bloom or blooms.

Sorry, I realise that's now deviated from a purely technical enquiry.

—DIV (49.181.60.115 01:21, 9 December 2023 (UTC))[reply]

Follow-up on the AND/OR ambiguity of the comma

In sense 2 of learned#Adjective

(law, formal) A courteous description used in various ways to refer to lawyers or judges.

and in sense 2 of potato#Noun

(informal, UK) A conspicuous hole in a sock or stocking.

the comma seems to mean AND (formal legal settings; informal UK expression).

But then at cavity#Noun

(engineering, manufacturing) The female part of a mold: the depression itself or (metonymically) the half of the mold that contains it.

and at bowl#Noun

(sports, theater) An elliptical-shaped stadium or amphitheater resembling a bowl.

the comma seems to mean OR (engineering or manufacturing; amphitheatres used for sport or theatre).

This vexes me. —DIV (49.181.60.115)

You can write {{lb|en|this|or|that|or|the-other}} though I think that often looks silly: some common sense is expected of readers. Equinox ◑ 01:40, 9 December 2023 (UTC)[reply]
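The joining behaviour can be pictured with a short sketch: labels are separated by commas unless one of the neighbours is a conjunction such as "and" or "or". This is only a model of the displayed result written in Python for illustration, not the code of the label module, and the `CONJUNCTIONS` set and function name are assumptions.

```python
CONJUNCTIONS = {"and", "or"}  # assumed set of joining words

def join_labels(labels):
    """Render a list of label parameters the way the displayed label reads."""
    pieces = []
    for i, label in enumerate(labels):
        if i > 0:
            previous = labels[i - 1]
            # No comma on either side of a conjunction.
            if label in CONJUNCTIONS or previous in CONJUNCTIONS:
                pieces.append(" ")
            else:
                pieces.append(", ")
        pieces.append(label)
    return "(" + "".join(pieces) + ")"
```

Under this model `join_labels(["countable", "and", "uncountable"])` reads "(countable and uncountable)", while `join_labels(["law", "formal"])` keeps the ambiguous comma.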

Ah. Thanks for the pointer. And, in hindsight, I guess I should have looked at the template's page, which gives several examples. It didn't occur to me that it already had that functionality. Mea culpa. —DIV (49.181.60.115 04:32, 9 December 2023 (UTC))[reply]

My new signature: {{lb|en|This|that|and|the other}} This, that and the other (talk) 05:59, 9 December 2023 (UTC)[reply]

Template:Swadesh list auto and Lua timeouts

Lua timeout errors (as well as template include size and memory errors) in Swadesh list appendices have been an intermittent problem for many years, but when it didn't go away with a null edit or two, I was able to fix things by splitting the appendix into two or more smaller pages. Now Appendix:Sinitic Swadesh lists seems to have permanently settled in to timing out, but this time there's nothing I can do. That's because there's only a single call to {{Swadesh list auto}} instead of hard-coded tables, so it's all or nothing: either it shows everything, or there's just the line with the module error and no content. I'd rather not limit the number of languages, so the logical method is by the numbered rows/terms.

To allow splitting, I would like to propose adding two parameters: |from= and |to= (or something equivalent). These would allow showing only a subset of the list. If not given, |from= would default to 1. If |to= is not given, it would default to the last number (207). The range displayed would then be from |from= to |to=.

As for how one might split things, it turns out that the Swadesh list isn't random:

1-35 are pronouns, determiners, adverbs, adjectives, numbers and other core vocabulary
36-91 are nouns
92-146 are verbs
147-165 are things in the sky, earth, landscape, etc.
166-207 are adjectives, colors, prepositions and some other odds and ends

There are further subgroupings, as well, so a number of other logical dividing lines are possible. The main thing is, this problem isn't going to go away, so we need to be able to split the lists one way or another. Chuck Entz (talk)
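The proposed |from= / |to= behaviour could be sketched roughly as follows. This is only a Python illustration (the real template is backed by a Lua module, which is not shown here); the names `select_rows` and `GROUPS` are hypothetical, while the defaults of 1 and 207 and the group boundaries are taken from the proposal above.

```python
# Hypothetical sketch of the proposed |from= / |to= parameters.
LAST_ROW = 207  # last numbered term in the Swadesh list

# Group boundaries as listed in the proposal (inclusive row numbers).
GROUPS = {
    "core vocabulary": (1, 35),
    "nouns": (36, 91),
    "verbs": (92, 146),
    "sky, earth, landscape": (147, 165),
    "adjectives, colors, prepositions, etc.": (166, 207),
}

def select_rows(from_=None, to=None):
    """Return the row numbers to display; missing bounds default to 1 and 207."""
    start = 1 if from_ is None else int(from_)
    end = LAST_ROW if to is None else int(to)
    if not 1 <= start <= end <= LAST_ROW:
        raise ValueError(f"invalid range {start}-{end}")
    return list(range(start, end + 1))
```

With something like this, a split appendix page for the nouns would pass from=36 and to=91, and a page that passes neither would still show all 207 rows.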

Should `{{alter}}` support the same context labels as `{{label}}`?

I am looking at Hafer trying to specify appropriate dialect labels for the alternative form listed there, using specifically the same information included in {{label}} at the corresponding entry: Haber (Southern Germany, Austria, Switzerland dialectal, otherwise obsolete). However we get: Haber (Southern Germany, Austria, Switzerland dialectal, otherwise obsolete), which leads me to conclude that {{alter}} doesn't support the same variety of labels and punctuation that {{label}} does. Module:de:Dialects also only recognises four national varieties, but none of the other national or sub-national varieties which populate Category:Regional_German, is this intentional? What is the correct way to indicate this information? Evidently one way to get around this would be to simply use {{label}} after {{alter}} instead but then we probably run into the problem of pages being wrongly categorised. {{qualifier}} has the same downside as {{alter}} in that the punctuation isn't supported and in addition, none of the terms are wikilinked or presumably normalised. Would {{alter}} ideally support the same labels as {{label}}? If not, when do we provide the labels within {{alter}} if not here, as in the existing example Haber (obsolete)? I can't find any mention in the respective documentation pages of the purpose or scope of labels within {{alter}}. Helrasincke (talk) 15:46, 9 December 2023 (UTC)[reply]

I agree, these should be unified. It’s something I’ve been meaning to bring up for a long time, so thanks for raising this. Theknightwho (talk) 17:27, 9 December 2023 (UTC)[reply]

It's always bothered me that we have several sets of dialectal information. Some languages (e.g. Albanian?) borrow the dialectal information in labels from that used in {{alt}}, but all probably should. Note that we also have {{accent}}, which now borrows the list of punctuation from {{label}}. Benwing2 (talk) 20:55, 9 December 2023 (UTC)[reply]

Verb category (gerund, participle) descriptions for Romance languages

There's a problem with the category descriptions of gerunds and participles, and it seems to carry over to multiple languages; Portuguese, Spanish, Galician, Italian, French...

Here's the description of each of the two:

Gerunds: <language> forms that generally act as an action noun for the verb that they are formed from.
Participles: <language> verbs not fully conjugated, usually to be used in compound conjugations.

This... really, really doesn't look accurate. It's seriously very English-centric; you can use gerunds as nouns in English ("I love swimming", "Why can't I stop hiccuping?", "His constant crying and whining really annoys me."), but not in any of these Romance languages afaik. See Portuguese translations for the three example sentences: "Eu amo nadar", "Por que não consigo parar de soluçar?", "O choro e reclamações constantes dele me irritam demais". All of them use either the infinitive or nouns derived from the verbs, and I have no reason to believe it's any different for any of the other languages I mentioned.

The participle isn't nearly as bad, but it still comes across as anglocentric to me. Using it to create verb tenses is exceedingly common in English, but it's not as... predominant? in Romance. For Portuguese again, you have "ter <participle>", sure, but using the forms for adjectives and nouns is super common too. While I use "Eu tinha saído" (I had left) often, I barely (if ever) use "present of ter+participle" forms (as in "Eu tenho feito um bom trabalho" — I've been doing a good job); I much prefer "present of vir+gerund" instead ("Eu venho fazendo um bom trabalho" for the previous example).

This issue might be present with other verb forms' categories too, though I didn't check. MedK1 (talk) 19:50, 9 December 2023 (UTC)[reply]

@MedK1: I agree, these are problematic. I think of participles as generally verbal forms that behave like adjectives, and gerunds as verbal forms that behave like (usually indeclinable) nouns, although Russian has both adjectival and adverbial participles and Lithuanian has something like 13 or 14 participles, some of which are adjectival and some adverbial; and as you point out, Romance gerunds don't really function like nouns. Can you propose better wording? Benwing2 (talk) 20:52, 9 December 2023 (UTC)[reply]

Honestly, my dislike for the wording I could come up with for the gerunds was what prompted me to come here to talk about it instead of just doing it myself haha.

Gerunds: <language> forms that express ongoingness of a verb's action, or one's state during another action.

I find that using "action" twice is repetitive. I thought about "while carrying out" or "while performing" rather than "during", and, although those are more obvious, they're pretty wordy. And "ongoingness" isn't exactly the most common word, either... though it still feels more accurate than "continuity" to me, for some reason.

Participle: <language> forms that can act as adjectives and form compound conjugations.

I didn't really understand the "not fully conjugated" part for the first one. Maybe it's got to do with how you mostly only have past participles and no future participles...? I think this works fine though... MedK1 (talk) 22:11, 9 December 2023 (UTC)[reply]

@MedK1 Thanks. I don't understand the "not fully conjugated" part either; (adjectival) participles in most languages can be fully conjugated as adjectives, although in some languages they're invariable when used in compound tenses (like in Spanish and Portuguese, but not French or Italian, where they agree with a preceding direct object). Benwing2 (talk) 22:20, 9 December 2023 (UTC)[reply]

@MedK1: I presume you've raised the issue because you're looking for consensus. We agreed some time ago that the term 'gerund' had so many different meanings that a universal wording wasn't possible, and there's therefore an instruction on the category page for changing the wording. Some of the actual code enables one to select the wording by language family. --RichardW57 (talk) 02:14, 10 December 2023 (UTC)[reply]

Fully conjugated largely refers to changes for person. In Indo-European languages, participles often decline for number, gender and case, just like adjectives. --RichardW57 (talk) 02:14, 10 December 2023 (UTC)[reply]

@MedK1, RichardW57: I took a shot at rewording the participle description as follows:

"{{{langname}}} verbal forms that behave syntactically like adjectives (or sometimes adverbs), and in some languages are often used in compound conjugations and/or reduced relative clauses."

This is a bit verbose, and I don't really like the juxtaposition of "some" and "often", but I think it is at least accurate. We can customize it for specific languages or families, as is done for gerunds. The bit about reduced relative clauses is intended to be less European-specific; e.g. Turkish regularly forms relative clauses using participles instead of complementizers or relative pronouns. Benwing2 (talk) 01:52, 11 December 2023 (UTC)[reply]

@Benwing2: The bit about relative clauses also applies to English compared to Russian. --RichardW57 (talk) 02:58, 11 December 2023 (UTC)[reply]

Different title fonts for entry vs revision history

I seem to remember this being discussed a while back, but I cannot find the conversation or remember if there was a resolution. For some reason I am getting a different font displayed for Hebrew script page titles on the revision history page than at the top of the entry itself. This also occurs on Cyrillic entries, cf. [1] and [2]. Yet for Latin entries both have the same font. Is this intentional, and is there an easy way to change this either on a site-wide or individual user level? Helrasincke (talk) 21:33, 9 December 2023 (UTC)[reply]

@Helrasincke I think this is because on regular pages, Module:headword uses the DISPLAYTITLE functionality to wrap the page title in the appropriate language tag for non-Latin fonts (if there are multiple languages on the page, the last one alphabetically takes precedence). This doesn't happen on pages with Latin-script titles and it doesn't happen on revision history pages because Module:headword doesn't run on those pages. So on those pages you're getting the system default font, whatever it is. If you want them to display the same, you can set your own private CSS file to use the system default font for a particular language (if you really want to do this), but I don't think we have control over how MediaWiki handles revision history pages. Benwing2 (talk) 22:17, 9 December 2023 (UTC)[reply]

@Benwing2 To clarify, I prefer the way MediaWiki handles it (how it looks at the revision history pages) since it seems to be more consistent with the way Latin is handled (I find it jarring that Hebrew and Cyrillic for instance get a sans serif font, where Latin has a serif one). Out of curiosity, what was the reason we changed it for non-Latin fonts, readability perhaps? Helrasincke (talk) 22:37, 9 December 2023 (UTC)[reply]

@Helrasincke I'm not really sure, maybe User:Erutuon, User:This, that and the other or User:Surjection can comment, as I think they've worked more on the CSS. Benwing2 (talk) 03:02, 10 December 2023 (UTC)[reply]

This sounds like a historical question. In the old "Monobook" skin, all page titles were sans-serif (the Arial font was used on my computer, but I believe it was system-specific). The font rules were probably copied across from Monobook to Vector without considering whether a change was required. I wouldn't be opposed to changing the per-script page title fonts to serif fonts where an appropriate font can be identified in each case. This, that and the other (talk) 03:05, 10 December 2023 (UTC)[reply]

No objections from me to changing the defaults to use serif fonts. Benwing2 (talk) 01:13, 11 December 2023 (UTC)[reply]

It'll be a laborious process to do it for all languages. There's no automatic way to select the serif fonts out of the existing lists of fonts in our CSS. If someone writes up lists of preferred header fonts that work for each script in MediaWiki:Gadget-LanguagesAndScripts.css on the CSS file's talk page, an interface administrator can eventually add them to the CSS file. They just need an additional h1 in the CSS selector. For Hebrew, for instance, if Times New Roman is okay in headers, a rule like h1 .Hebr { font-family: 'Times New Roman'; } would set the font for top headers with a class="Hebr" display title. — Eru·tuon 23:38, 11 December 2023 (UTC)[reply]

Module:lo-translit - terms with brackets

Compare ຂ້ອຍບໍ່ມີເງິນ (khǭi bǭ mī ngœn) (correct) vs ຂ້ອຍ ບໍ່ມີ ເງິນ (khǭibǭmīngœn) (incorrect). Adding brackets between words removes a space in the transliterations.

Calling @Octahedron80, @Theknightwho, @Benwing2, @Fish bowl. Anatoli T. ^{(обсудить}/^вклад) 22:05, 9 December 2023 (UTC)[reply]

@Atitarev I don't know anything about Lao or how it's being handled currently but I assume it should (eventually) work like Thai. Benwing2 (talk) 22:12, 9 December 2023 (UTC)[reply]

@Benwing2: Thanks. The implementation is different and also simpler. Tones are ignored.

I actually don't know if scraping will be required for Lao as with Thai or Khmer. It's much more phonetic and less complicated by nature (consonant clusters are almost absent). There are far fewer words pronounced irregularly. In short, the current module can be tweaked to work better (without the need to use a transliteration scraper). Also, importantly, you can notice that the output of {{lo-pron}} doesn't produce a transliteration. Anatoli T. ^{(обсудить}/^вклад) 22:17, 9 December 2023 (UTC)[reply]

@Atitarev I see. Something for the TODO list I suppose, unless someone else gets to it. Benwing2 (talk) 22:21, 9 December 2023 (UTC)[reply]

@Benwing2: Thank you very much. It's something you can start over time, if you're still interested and show, perhaps, a better way of handling South East Asian languages.

Here's an example of an irregular pronunciation in Lao: ບັດເຄຣດິດ (bat khē dit). The automated translit ບັດເຄຣດິດ gives "bat khēn dit" (wrong).

Three things happen here:

The module can't distinguish rare consonant clusters. Compare with Thai บัตรเครดิต (bàt-kree-dìt) where a true cluster "kr" has to be respelled with "คฺร". ◌ฺ (the phinthu marks a lack of a vowel), otherwise the เคร (kree)- part "kree" can also be "keen" (the Thai module should actually fail this for ambiguity). Lao doesn't employ such a symbol, AFAIK. With a "vowel killer" symbol it would produce something like "bat khrē dit".
Letter ຣ (ra) "r" is always pronounced as "l", so many Thai cognates with ร (rɔɔ) are now spelled with ລ (la). Words with ຣ (ra) are mostly obsolete or loanwords. Sanskrit and Pali derivations, unlike Thai, are now respelled phonetically after reforms.
True clusters are mostly missing in Lao, so even if they are occasionally spelled with "l" or "r", they are not pronounced so, a respelling is required only for such cases (l/r is silent). We now use only manual transliterations at Wiktionary when the transliteration fails, since irregular pronunciations are less common but syllabification can still be a problem, also with letter ຫ (ha), which also acts as a tone changer, like Thai หมา (mǎa) compare with Lao ໝາ (mā) (or Thai มา (maa) without "h" has a different tone). Here, two consonants together don't represent a cluster but the following vowel has a different tone and "h" is silent.

There is a rather expensive Lao dictionary (small format), which I can easily get from our city library. Their transliteration method is similar to Paiboon that we use for Thai (with some enhancements for missing vowel length) - with tone marks. I mentioned it on User_talk:Octahedron80#Module:lo-pron_and_Module:lo-translit but received no response. Anatoli T. ^{(обсудить}/^вклад) 02:22, 11 December 2023 (UTC)[reply]

@Atitarev: The Thai language doesn't include phinthu either. However, since Unicode 12.0, it's in both scripts, Thai and Lao, at algorithmically corresponding codepoints, primarily to support Pali written as an abugida rather than as an alphasyllabary that meets Daniels' definition of an alphabet. The Lao code point is U+0EBA LAO SIGN PALI VIRAMA. Rendering support for it has, unsurprisingly, been generally poor so far. RichardW57 (talk) 03:14, 11 December 2023 (UTC)[reply]

@RichardW57: phinthu IS used in respellings, in the input to {{th-pron}}, not in entry titles, if I didn't make it clear.

Try creating an entry with a cluster and {{th-pron}}. You'll get Lua error: Please replace ** in the respelling with **. This includes syllables with the initial ห (hɔ̌ɔ). E.g. หมา (mǎa) is respelled as หฺมา.

The Lao version of virama could be used for the same purpose but a module change would be required. Anatoli T. ^{(обсудить}/^вклад) 03:27, 11 December 2023 (UTC)[reply]

I've made a redirect ◌຺ to the Lao virama, otherwise it's impossible to even discuss it. Anatoli T. ^{(обсудить}/^вклад) 03:34, 11 December 2023 (UTC)[reply]

Manually creating `{{infl of}}` entries for rtl languages.

Manually creating for example {{infl of|yi|פֿאַרשיידנ||2nom|m|s}} yields "2nom masculine singular of פֿאַרשיידנ (farsheydn)", where it appears to be a problem related to the direction agnostic non-alphanumeric characters messing with the parameter order. Accelerated entry creation does not appear to be affected, e.g. [3]. Substituting url encoding for the RTL script term solves this particular issue but introduces another in that the translit module then doesn't work: see for example [4]. Any suggestions? Helrasincke (talk) 22:32, 9 December 2023 (UTC)[reply]

The problem is that the clash of text directions means you're not typing or pasting things where you think you are. I avoid this by setting up the boundaries of the rtl text first before I add it: first I type the "||" part, then I put the cursor between the pipes, then I enter the rtl text. If I'm going to have ltr and rtl together, I start out with a temporary pipe between the ltr and the rtl, enter both, then delete the extra pipe. In this case, I would type {{infl of|yi|||2|nom|s}}, then {{infl of|yi|_||2|nom|s}} (I'm using "_" to represent the cursor), then {{infl of|yi|פֿאַרשיידנ||2|nom|s}} giving second-person nominative singular of פֿאַרשיידנ (farsheydn). Chuck Entz (talk) 23:36, 9 December 2023 (UTC)[reply]

@Chuck Entz This does indeed solve the problem, thankyou! Helrasincke (talk) 09:49, 11 December 2023 (UTC)[reply]
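The underlying point, that the stored (logical) order of the parameters is what matters and not the order a bidirectional editor happens to draw, can be illustrated with a small Python sketch. The helper names `build_template` and `split_params` are hypothetical and only handle simple, non-nested template calls.

```python
# Build the template call from an explicit parameter list so that each value's
# logical position is unambiguous, however a bidi-aware editor displays it.
def build_template(name, *params):
    return "{{" + "|".join((name,) + params) + "}}"

def split_params(call):
    """Split a simple (non-nested) template call back into its parts."""
    return call[2:-2].split("|")

call = build_template("infl of", "yi", "פֿאַרשיידנ", "", "2", "nom", "s")

# Print each part with its logical position to check where things really are.
for position, value in enumerate(split_params(call)):
    print(position, repr(value))
```

Printing the parts one per line sidesteps the display problem: the RTL term sits alone on its line, so it cannot visually swap places with the neighbouring LTR parameters.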

Help creating template for Yele language nouns

I have recently begun studying Yélî Dnye and wish to include the vocabulary and grammar from Levinson's grammar. However, I am unsure how to create templates for Wiktionary. The only things I believe I need from it are a noun's animacy and specified form. I'm not very good at the technical part of editing quite yet. Thank you~ ACertainNumberFive (talk) 18:24, 10 December 2023 (UTC)[reply]

R:ar:Wehr-4 reference template issue

{{R:ar:Wehr-4}} doesn't link correctly on مَرْعًى (marʕan). The Hans Wehr entry is on page 401 but the link opens on page 366 (!) with this rubbish in the search window "ر%Cا%95ي".

A manual search with the root letters رعي (also with spaces in between) brings to the right page.

Is it the last edit by an IP?

Pinging those who worked with the template or might be interested. @Erutuon, @Benwing2, @Fay Freak, @Fenakhay Anatoli T. ^{(обсудить}/^вклад) 00:52, 11 December 2023 (UTC)[reply]

Oh, @Fenakhay just fixed it. Thanks! Anatoli T. ^{(обсудить}/^вклад) 00:53, 11 December 2023 (UTC)[reply]
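For comparison with the garbled search string quoted above, this is what a well-formed percent-encoding of the three root letters looks like. It is a Python sketch using the standard library only; it does not reproduce the template's actual code, and it assumes the letters are U+0631, U+0639 and U+064A encoded as UTF-8.

```python
from urllib.parse import quote, unquote

# The root letters, written with escapes so the code points are unambiguous:
# U+0631 (reh), U+0639 (ain), U+064A (yeh).
root = "\u0631\u0639\u064A"

encoded = quote(root)  # each UTF-8 byte becomes %XX
print(encoded)         # %D8%B1%D8%B9%D9%8A

# Decoding restores the original letters, so nothing is lost in the round trip.
assert unquote(encoded) == root
```

Every Arabic letter here becomes exactly two %XX groups, so a string that mixes raw letters with stray fragments like "%C" is not a valid encoding of the root.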

more aggressive GC

I just got a message that Tim Starling's more aggressive Lua GC patch (T349462) has been merged, so hopefully it will make it here soon. We'll have to see whether and by how much the memory of some of the biggest pages (e.g. a) goes down. Benwing2 (talk) 22:52, 11 December 2023 (UTC)[reply]

@Benwing2 it would be interesting to consider the idea of splitting large pages into sub-pages, and having some sort of top-level navigation (e.g. a "< Previous" and "Next >" at the top of the page). So a page like "a" would be more of a container, and it would start out by loading just the first of its subpages, showing "Next >" at the top. Chernorizets (talk) 00:14, 15 December 2023 (UTC)[reply]

I proposed this at Wiktionary:Grease pit/2023/July#Lua errors: back to the bad old days, but the reception was lukewarm. I still think these gigantically long pages are a terrible user experience though! This, that and the other (talk) 00:21, 15 December 2023 (UTC)[reply]

It doesn't look like there was much interaction with your post, probably because not as many people follow Grease Pit discussions. I think it might be worth re-proposing in the Beer Parlour. I've had reservations about the idea, but I think your mock-up is a great way to handle it. Andrew Sheedy (talk) 00:29, 15 December 2023 (UTC)[reply]

@Chernorizets @Andrew Sheedy Is there really a need for this, with the recent changes to limits? I suspect this will end up simply making page maintenance harder, without much benefit. Theknightwho (talk) 01:10, 15 December 2023 (UTC)[reply]

@Theknightwho I see it as an idea worth discussing. We endeavor to, over time, include every word in every language. If we're successful, this entails the longest pages in common scripts becoming even longer, and other pages also moving over to the "very long" category. I suppose we can continue to play the limit-increase game, and perhaps the less-Lua-in-your-templates game, but one requires external action (devs), and the other adds burden for editors (more choices to make) and template devs (more templates to make/maintain). Splitting to me is analogous to pagination in API responses - you set an acceptable working window and it gives you the ability to gracefully handle an arbitrarily long collection. Chernorizets (talk) 02:30, 15 December 2023 (UTC)[reply]

I will say though - if we were to add support for split pages, there would need to be some infrastructure set up to make the maintenance process easier. E.g. I don't want an editor to have to move language sections from one subpage to another, because the subpage they're editing has now gotten "too big" per our criteria, and we need to rebalance. Most of that should be invisible to editors, and taken care of by the infra. Otherwise, we end up with the same state as today, except even messier. Chernorizets (talk) 02:36, 15 December 2023 (UTC)[reply]

It depends how well the more aggressive GC patch affects page load times. Right now, any of the letter pages take forever to load for me, and when they do load, they tend to be so long that it's hard to actually find the information I want. Add a couple thousand more languages and you're looking at a completely unwieldy page, regardless of load time or any other technical factors. Andrew Sheedy (talk) 02:31, 15 December 2023 (UTC)[reply]
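The pagination analogy above could be sketched like this: language sections are packed into subpages under a fixed size budget, in order, without editors having to rebalance by hand. This is a hypothetical Python illustration; the function name, the size units and the budget are assumptions, not an existing Wiktionary mechanism.

```python
def split_into_subpages(sections, budget):
    """Greedily pack (language, size) pairs into subpages of at most `budget`
    size units, keeping the original order. A section that is larger than the
    budget on its own simply gets a subpage to itself."""
    pages, current, used = [], [], 0
    for name, size in sections:
        # Start a new subpage when the next section would overflow this one.
        if current and used + size > budget:
            pages.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        pages.append(current)
    return pages

sections = [("English", 5), ("French", 3), ("German", 4), ("Zulu", 1)]
print(split_into_subpages(sections, budget=8))
```

Because the split is recomputed from the section sizes, adding a new language section changes the subpage boundaries automatically, which is the "invisible to editors" property asked for above.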

Candrabindu as Consonant Nasaliser

Candrabindu has two main functions: vowel nasaliser and consonant nasaliser. Transliterating the vowel nasaliser seems not to be much of a problem. As a consonant nasaliser, it appears in most Indic scripts as part of the consonant cluster, which in Sanskrit is yy, ll or vv. In Sanskrit, the consonant nasaliser only occurs as a result of external sandhi, so it would only be a problem for terms when they are multiword terms. Quotations are a different matter. If the consonants are stacked, it is visually ambiguous with the vowel nasaliser, but it may contrast with the vowel nasaliser if the consonants are joined horizontally. Where this difference can be encoded, should we not transliterate the two differently? How, though? My preferred solution would be to use the Latin candrabindu diacritic on the first consonant, or should we write, for example, lm̐l?

(Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat, Dragonoid76, AryamanA, Atitarev, Benwing2, Smettems, Kutchkutch, Bhagadatta, Msasag, Svartava, Getsnoopy, Rishabhbhat, Dragonoid76, कालमैत्री): Now, the consonant-nasalising candrabindu in the Sinhala script seems to act very differently to that in Devanagari. It is marked on the consonant cluster before the one with the nasalised consonant, as in the quotation for විද්‍වාංස් (vidvāṃs). Is candrabindu used in Sanskrit in Sinhala script for any purpose but marking the nasalisation of the consonants y, l, v? If not, then the option of the transliteration marking the transliterated nasalised consonant with candrabindu remains open, and seems feasible, even though it acts across white space. (It seems that candrabindu in the Sinhala script is restricted to Sanskrit.) --RichardW57m (talk) 15:49, 12 December 2023 (UTC)[reply]

@RichardW57m: I don't quite understand the question. Are you able to provide an example of cases in Devanagari? Are you talking about दाँत (dā̃t) vs दांत (dānt) (also rarely as दन्त (dant)) but with some other consonants? Anatoli T. ^{(обсудить}/^вклад) 00:08, 13 December 2023 (UTC)[reply]

@Atitarev: Yes. No. --RichardW57 (talk) 12:54, 13 December 2023 (UTC)[reply]

For the first question, the example of Sanskrit त्रील्‍ँलोकान् (trīl‍m̐lokān) (with त्रील्ँ लोकान् (trīlm̐ lokān) preferred) is given at https://en.wikisource.org/wiki/Sanskrit_Grammar_(Whitney)/Chapter_III Paragraph 206(a). Sometimes the consonants are stacked, so the text looks the same as ल्लाँ (llām̐). --RichardW57 (talk) 12:54, 13 December 2023 (UTC)[reply]

If I don't apply language appropriate formatting, I get ल्लाँ. Such are the vagaries of Devanagari rendering! --RichardW57 (talk) 13:02, 13 December 2023 (UTC)[reply]

But I used the wrong vowel! I should have used a non-spacing vowel, so ल्लुँ for an example of ambiguity. --RichardW57 (talk) 13:07, 13 December 2023 (UTC)[reply]

@RichardW57m: Are you saying that त्रील्‍ँलोकान् (trīl‍m̐lokān) should be "trīl‍̃lokān" or "trīl̐lokān" instead of "trīl‍m̐lokān". The candrabindu ◌ँ (◌m̐) appears on top of the first ल (this is the decompiled order of symbols: त ् र ी ल ् ‍ँ ल ो क ा न्.

By default ◌ँ by itself transliterates as "m̐" in Sanskrit but in e.g. Hindi, it defaults to ◌̃. It's fine with vowels. You're right, I think consonants with a candrabindu should be transliterated differently but not sure how. To me personally, "trīl‍̃lokān" looks confusing and I can't even say which letter is nasalised but "trīl̐lokān" looks more intuitive. You can try experimenting and make a suggestion. I don't think it's being handled well. --Anatoli T. ^{(обсудить}/^вклад) 04:18, 15 December 2023 (UTC)[reply]

@Atitarev: The candrabindu belongs on the first 'l' - "trīl̐lokān". I made a suggestion above, and I don't think the mechanics of transliteration are complex for Devanagari if we don't try to guess what a candrabindu apparently on a vowel actually applies to. Rather, it is the behaviour of the Sinhala script which may cause problems, as it seems that we may have to 'guess' there, going for the option that applies in 100% of cases. --RichardW57m (talk) 09:54, 15 December 2023 (UTC)[reply]

@RichardW57m: You described your request in words, which is not immediately clear unless you clearly show some "before" vs "after" without mixing all possible scripts. My guess (based on your "I made a suggestion above") was that you wanted "trīl̐lokān". Is that the "Latin candrabindu"?

Don't expect any actions, especially from people who may have little knowledge of Sanskrit, Devanagari and other scripts but might be able to apply a possibly simple module trick (to me, the change seems simple, again, if I understood you correctly). I also don't think you can request a change for all scripts in one go, unless you refer to a module line that does all the conversions. Of course, if you just want to chat about it, without engaging any module writers ... Anatoli T. ^{(обсудить}/^вклад) 01:57, 18 December 2023 (UTC)[reply]

@Atitarev: I was proposing to do the change myself. However, my other question has not been addressed - does candrabindu ever apply to vowels in the Sinhala script? --RichardW57 (talk) 02:37, 18 December 2023 (UTC)[reply]

@RichardW57: As you yourself suggested, it may only be relevant in Sanskrit written in the Sinhalese script. Anatoli T. ^{(обсудить}/^вклад) 03:29, 18 December 2023 (UTC)[reply]

Nil method 'display_difference' in new testcase module

I am trying to check round-tripping between the Sinhala script and SLP1, and am finding that the test process, implemented in Module:sa-utilities/translit/SLP1-to-Sinh/testcases, is failing with what looks like, but surely isn't, an internal error. Can someone please advise me how to avoid this error. I've written several testcase modules using Module:UnitTests, and those I've looked at are still working. I can't see what I've done wrong this time. I've set up monitoring to detect the use of global variables, but their usage seems internal to Module:UnitTests. --RichardW57 (talk) 18:36, 13 December 2023 (UTC)[reply]

@RichardW57 I think you have to invoke equals using tests:equals() instead of tests.equals(). Benwing2 (talk) 22:31, 13 December 2023 (UTC)[reply]

@Benwing2: Thanks, that fixes the problem. --RichardW57 (talk) 23:56, 13 December 2023 (UTC)[reply]

Internationalism calques and semantic loans

I think it would be useful to modify {{internationalism}} to have a parameter for calques and loans (compare międzynarodowy and król#Polish), and potentially even to have categories "International calques" and "International semantic loans" (or perhaps "Internationalism"? That sounds worse to me). I realize the current setup I have is non-standard, which is why I would like to change it. Vininn126 (talk) 10:09, 14 December 2023 (UTC)[reply]

Everything that involves more than one country (anything English in Polish, for instance) is literally "international", so using that word would just be confusing. Chuck Entz (talk) 14:45, 14 December 2023 (UTC)[reply]

Apropos of nothing really, personally I've never liked this template much; it seems to encourage lazy etymologizing. I understand it may not be possible to determine the chain of borrowing of recent scientific terminology and such but IMO using {{internationalism}} should be a last resort. Benwing2 (talk) 21:46, 14 December 2023 (UTC)[reply]

@Benwing2 I don't disagree, but how would you change it? (Also I'd still like to have this functionality...) Vininn126 (talk) 22:35, 14 December 2023 (UTC)[reply]

@Vininn126 Most of the etymologies I've added were done before this template existed (usually for Russian terms, as I created several thousand entries for Russian terms c. 2015-2017). Generally if I didn't know the chain of borrowing, I have tended to say "ultimately borrowed from ..." listing wherever the term was first coined, and if it's formed of Latin and/or Greek roots, including those. Sometimes I have written "probably borrowed from" when a given intermediate source was likely, e.g. a lot of slightly older Russian terms were borrowed through German. I admit this is imperfect but I'm not quite sure what the benefit of using the {{internationalism}} template is in most cases, other than to categorize under "Foo internationalisms" (and I'm somewhat skeptical of the benefit of this category, as I'm not sure there are clear criteria for what counts as an internationalism). Benwing2 (talk) 23:03, 14 December 2023 (UTC)[reply]

The wording I use typically is "internationalism, possibly borrowed from XYZ, ultimately from {der}." Vininn126 (talk) 23:05, 14 December 2023 (UTC)[reply]

@Vininn126 that's in line with what the documentation of {{internationalism}} recommends:

You should always link either to the term that contains the full etymology or the term that the internationalism is based on if you use this template. Often that term is the English term, but it may also be in another language. You should prefer using the ultimate source language (the language in which the word was first used) if it is known.

Chernorizets (talk) 00:20, 15 December 2023 (UTC)[reply]

I should know, I helped write the documentation when it was made. Vininn126 (talk) 00:26, 15 December 2023 (UTC)[reply]

sco-verb template: past2 parameter does not work

See hae#Scots: it is not showing the supplied "haed" past2 form. Equinox ◑ 15:21, 14 December 2023 (UTC)[reply]

@Equinox Yeah the whole Scots support here is a big stinking pile of caca. I'm thinking we should just generalize the existing English noun and verb modules to also support Scots, although I don't know how much Scots-specific stuff will have to be added; do nouns and verbs follow the same spelling conventions as for English? (plurals in -es after s/x/z/ch/sh, -y changes to -ies after a consonant, verbs may double a consonant before -ing to preserve the short vowel sound, etc.?) I don't know much about Scots spelling conventions (or even if such conventions are standardized at all). Benwing2 (talk) 21:55, 14 December 2023 (UTC)[reply]

@Benwing2 Could we possibly group Middle English (and eventually Yola and Fingallian) in as well? They're all sufficiently close to English that it could potentially work, I think. Theknightwho (talk) 22:38, 14 December 2023 (UTC)[reply]

@Theknightwho Sure, why not. I just need to know how the spelling conventions work for each language. Benwing2 (talk) 22:55, 14 December 2023 (UTC)[reply]
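The English noun conventions listed above (-es after s/x/z/ch/sh, consonant + y becoming -ies) amount to something like the following Python sketch. It is not the existing English module, which is written in Lua and not shown here; `pluralize` is a hypothetical name, and whether Scots follows the same rules is exactly the open question.

```python
import re

def pluralize(noun):
    """Regular English plural, following the conventions listed above."""
    # -es after a sibilant ending: s, x, z, ch, sh
    if re.search(r"(s|x|z|ch|sh)$", noun):
        return noun + "es"
    # consonant + y  ->  -ies (but vowel + y just takes -s)
    if re.search(r"[^aeiou]y$", noun):
        return noun[:-1] + "ies"
    return noun + "s"

for word in ["box", "church", "city", "day", "cat"]:
    print(word, "->", pluralize(word))
```

If a language needed different rules, a shared module could keep this function's shape and swap in a per-language table of endings, which is roughly what "generalizing" the module would mean.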

Edit filter request

I'm wondering if it's possible to set up an edit filter to warn against removing all content from a page and replacing it with {{delete}}. Ultimateria (talk) 19:54, 14 December 2023 (UTC)[reply]

@Ultimateria Should be possible, yes, and IMO would be a good idea. Benwing2 (talk) 21:47, 14 December 2023 (UTC)[reply]

Punctuation symbols removed from links

At some stage, the removal of punctuation symbols from links stopped happening for a number of scripts, which mainly affected the phrasebook space.

Just one example: (esm-e šomâ čist?) no longer links to (esm-e šomâ čist), for which an entry exists.

Should the removal of punctuation symbols be restored, or should entries now contain the punctuation? Or should |alt= be used? Anatoli T. (обсудить/вклад) 03:49, 15 December 2023 (UTC)[reply]

We had long discussions about which symbols to remove for which script, but I haven't seen any discussion about the removal, which has already happened. --Anatoli T. (обсудить/вклад) 04:19, 15 December 2023 (UTC)[reply]

@Theknightwho Any ideas here? I don't recall removal of punctuation happening previously, but maybe it did. Did you make any recent changes relating to this? User:Atitarev, do you recall when this changed? Benwing2 (talk) 05:16, 15 December 2023 (UTC)[reply]

@Benwing2: Sorry, no idea when. You will find that the template documentation still mentions that diacritics and punctuation will be removed from links (Template:link/documentation, Template:doublet/documentation), which is not happening for all languages.

It works for English: what is your name?

Doesn't work for Greek: πώς σε λένε; (pós se léne?), Spanish: ¿cómo se llama usted?, etc.

Here is someone complaining about the removal in 2022: Template_talk:link#diacritics_and_punctuation_and_confusion. Anatoli T. (обсудить/вклад) 05:28, 15 December 2023 (UTC)[reply]

@Benwing2 Not sure - I was under the impression these were still being removed. Theknightwho (talk) 05:31, 15 December 2023 (UTC)[reply]

@Atitarev It "works" for English because there's a redirect from what is your name? to what is your name. Benwing2 (talk) 07:10, 15 December 2023 (UTC)[reply]

@Theknightwho: The code is still there: Module:languages#L-942 The regex is obviously not working, though, probably as an unintended side-effect of your core changes to Module:languages. Can you take a look at why it isn't working? Benwing2 (talk) 07:15, 15 December 2023 (UTC)[reply]

@Benwing2, Theknightwho: The new regex matches if the middle text before the optional first punctuation mark and the last punctuation mark only contains non-whitespace and non-punctuation characters, except that a whitespace character can occur before the last punctuation mark. The Greek and Spanish examples above have some whitespace characters in the middle text that are not right before the last punctuation mark, so they don't match the new regex. I haven't looked at the whole evolution of the regex, but in a randomly selected version on November 11, 2019, the old regex only required one of the characters in the middle text to be non-whitespace and non-punctuation, so it would have applied to the Greek and Spanish examples above. The newer version of the regex might be faster, but it's different from the old version. — Eru·tuon 21:31, 16 December 2023 (UTC)[reply]
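The difference described above can be illustrated with a small sketch. The two patterns below are reconstructions from the prose description only; they are not the actual patterns in Module:languages (which is written in Lua), and the punctuation set `PUNCT_CHARS` is an arbitrary assumption chosen for the example.

```python
import re

# Assumed punctuation set, for illustration only.
PUNCT_CHARS = "!?.,;:¿¡"
P = "[" + re.escape(PUNCT_CHARS) + "]"        # one punctuation mark
NP = "[^\\s" + re.escape(PUNCT_CHARS) + "]"   # neither whitespace nor punctuation

# "New" behaviour as described: the middle text may contain no whitespace or
# punctuation, apart from one optional whitespace before the final mark.
NEW = re.compile(rf"^{P}?({NP}+)\s?{P}$")

# "Old" behaviour as described: the middle text only needs to contain at
# least one character that is neither whitespace nor punctuation.
OLD = re.compile(rf"^{P}*(.*{NP}.*?)\s*{P}+$")

def entry_name(text: str, pattern: re.Pattern) -> str:
    """Strip outer punctuation if the pattern matches, else return text unchanged."""
    match = pattern.match(text)
    return match.group(1) if match else text

for phrase in ("what?", "¿cómo se llama usted?"):
    print(repr(phrase), "new:", repr(entry_name(phrase, NEW)),
          "old:", repr(entry_name(phrase, OLD)))
```

Under these assumptions a single word is handled identically by both patterns, while a multi-word phrase is left untouched by the newer one, which is consistent with the phrasebook links breaking.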

@Erutuon Thank you for looking into this! I think we should probably revert to the old regex then. Benwing2 (talk) 21:39, 16 December 2023 (UTC)[reply]

@Erutuon Yes, thank you. Agreed @Benwing2 - I honestly couldn't tell you why I changed this (I assume it was me a while ago - I haven't gone through the diffs). Theknightwho (talk) 22:47, 16 December 2023 (UTC)[reply]

@Theknightwho No worries, same thing happens to me :) ... Benwing2 (talk) 23:15, 16 December 2023 (UTC)[reply]

@Erutuon, @Benwing2, @Theknightwho: Hi. Any updates? The last ping didn't work, anyway.

I had to remake the whole list of translations on [[what time is it]]. It's especially bad currently for languages requiring transliterations, when it looks like this:

समय क्या है (samay kyā hai)?

but should be:

समय क्या है? (samay kyā hai?)

Anatoli T. (обсудить/вклад) 23:32, 20 March 2024 (UTC)[reply]

@Erutuon, @Theknightwho, @Benwing2: Hello. Any updates on restoring the functionality? I have recreated some old redirects, since it doesn't seem it will be fixed quickly. — This unsigned comment was added by atitarev (talk • contribs).

Ahh, I never followed through with the edit. Done. The new pattern was introduced in these edits. There was a previous change to the pattern in this edit. I am reintroducing the pattern from these edits. Please post here if there are any problems caused by my fix. — Eru·tuon 00:21, 21 March 2024 (UTC)[reply]

Thank you so much, @Erutuon! Anatoli T. (обсудить/вклад) 00:39, 21 March 2024 (UTC)[reply]

@Erutuon: Re: diff Thank you for fixing some translations. There are some more complex cases or rather, other symbols, where your regex will not work but the links are fixed by your edit, e.g. Armenian ժամը քանի՞սն է (žamə kʻani^?sn ē) is now correctly linking to ժամը քանիսն է (žamə kʻanisn ē). I guess your regex didn't pick up diacritics; that's fine, not urgent to redo. Thanks again! Anatoli T. (обсудить/вклад) 00:46, 21 March 2024 (UTC)[reply]

The Armenian one has worked for years because of entry name replacements that remove ՞. I guess someone added the alt parameter because they didn't know. — Eru·tuon 19:36, 21 March 2024 (UTC)[reply]

Sanskrit Script Conversion Templates

(Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat, Dragonoid76): Do any exist? If not, I intend to create a template {{sa-convert}} that will implement function tr from Module:sa-convert to convert from Devanagari. I just don't want to invest effort in documenting it only to find it deleted as a duplicate. My chief application for it will be to convert existing declension tables, in which each cell is manually populated individually, from Devanagari to, for now, Sinhala. If there were no objection to using #invoke: in terms' pages, I wouldn't need to bother with it. --RichardW57 (talk) 01:42, 18 December 2023 (UTC)[reply]

I've now converted the template. --RichardW57 (talk) 23:03, 29 December 2023 (UTC)[reply]

@AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat, Dragonoid76, RichardW57:

What about the Kawi writing system? --Apisite (talk) 23:10, 29 December 2023 (UTC)[reply]

Sanskrit terms in the Kawi script don't seem to meet WT:CFI. Anyway, what about it? Are you proposing to create monstrosities, fictions such as Sanskrit ශ්රී (śrī)? --RichardW57 (talk) 23:40, 29 December 2023 (UTC)[reply]

@RichardW57: I figure that, at least in the etymology section for Old Javanese terms, the Sanskrit terms could be displayed in Kawi. --Apisite (talk) 03:35, 30 December 2023 (UTC)[reply]

@Apisite: For the most part, that would only hinder. And how would that help them meet CFI? --RichardW57 (talk) 07:57, 30 December 2023 (UTC)[reply]

@RichardW57: I wouldn't be surprised, if nobody would be interested in looking for any relevant resources upon encountering something along the lines of {{der|kaw|sa|(insert Devanagari script here)|(insert Kawi script here)}}; but at least there's the option of making entries of the Kawi characters themselves nonetheless. --Apisite (talk) 08:04, 30 December 2023 (UTC)[reply]

Northern French

Trying to make a proper category for Northern French terms. See Category:Northern French. They are often labelled in articles as "Northern France"; see chicon for instance. Synotia (talk) 10:56, 18 December 2023 (UTC)[reply]

We have a label {{lb|fr|Picardy}}; between that and {{lb|fr|Belgium}}, doesn't that cover Northern French pretty well? —Mahāgaja · talk 11:05, 18 December 2023 (UTC)[reply]

Does not cover Nord-Pas-de-Calais Synotia (talk) 15:32, 18 December 2023 (UTC)[reply]

I suppose the label needs to be added somewhere. Should it go in Module:labels/data/lang/fr or Module:labels/data/regional? This, that and the other (talk) 03:09, 19 December 2023 (UTC)[reply]

The other option would be to change "Picardy" to "Hauts-de-France". Are there regionalisms distinctive to Picardy that are absent from Nord-Pas-de-Calais? This, that and the other (talk) 03:10, 19 December 2023 (UTC)[reply]

Of course, I suggest you take a look at this page. Synotia (talk) 09:10, 19 December 2023 (UTC)[reply]

@Synotia What terms are characteristic of this particular administrative region? It doesn't seem to correspond to any traditional region, so there may be no terms that are specific to this region and this region only, and not part of Picard, for example. Benwing2 (talk) 09:16, 19 December 2023 (UTC)[reply]

The Nord is not a traditional region? It includes the area known as French Flanders to begin with, which brings its share of words like witloof, drève etc.

I suggest you take a look at these links too. Synotia (talk) 09:43, 19 December 2023 (UTC)[reply]

@Benwing2 maybe you meant to ping me, since I was the one who brought Hauts-de-France into the discussion.

I'm fairly convinced that we might need a new label, but I think it should display as (___, Belgium) where ___ is the appropriate English name for the regions of France concerned (@Synotia ?). I'd rather not use "Northern French", unless it can be shown this is an accepted term in English. This, that and the other (talk) 10:51, 19 December 2023 (UTC)[reply]

It can be "Northern France" or "Nord-Pas-de-Calais", or perhaps just "Nord" if you want to keep it short, although Nord is only department 59. Synotia (talk) 14:01, 22 December 2023 (UTC)[reply]

I redesigned category breadcrumbs

Currently category breadcrumbs are just a bare paragraph of links in HTML:

<p><small>[[:Category:Fundamental|Fundamental]] » [[:Category:All languages|All languages]] » [[:Category:English language|English]] » [[:Category:English lemmas|Lemmas]]</small></p>

which produces

Fundamental » All languages » English » Lemmas

This is a poor practice for very many reasons.

I redesigned them

visually;
to make them compatible with the accessibility guidelines provided by W3C Web Accessibility Initiative.

Example output:

<templatestyles src="Module:category tree/styles.css" /><div role="navigation" aria-label="Breadcrumb" class="ts-categoryBreadcrumbs"><ol><li>[[:Category:Fundamental|Fundamental]]</li><li><span aria-hidden="true" class="ts-categoryBreadcrumbs-separator"> > </span>[[:Category:All languages|All languages]]</li><li><span aria-hidden="true" class="ts-categoryBreadcrumbs-separator"> > </span>[[:Category:English language|English]]</li><li><span aria-hidden="true" class="ts-categoryBreadcrumbs-separator"> > </span>[[:Category:English lemmas|Lemmas]]</li></ol></div>

which produces

You can test it by previewing {{User:JWBTH/poscatboiler|en|lemmas}} at Category:English lemmas. (Maybe there is a better way to test it, but your system is too complicated and I couldn't figure it out.)

If everything is OK, please replace Module:category tree with my sandbox version of it. The TemplateStyles stylesheet used by it is available at Template:poscatboiler/styles.css. JWBTH (talk) 16:28, 18 December 2023 (UTC)[reply]

@JWBTH This is great! I would only propose two changes: remove the corner-rounding (noting that no other visual elements in Wiktionary's skin have corner rounding) and retain the » character in place of > (it gives Wiktionary breadcrumbs a distinctive, more dictionary-like character).

I'll go ahead and implement this with my proposed changes.

If you'd also like to attack the Rhymes breadcrumb system to match (e.g. Rhymes:English/uː-) that would be marvellous. This, that and the other (talk) 00:21, 19 December 2023 (UTC)[reply]

@JWBTH, This, that and the other: I also agree this is better; however I'd like to retain the invisible background rather than having it be gray. Benwing2 (talk) 02:56, 19 December 2023 (UTC)[reply]

The reason I used the background is that:

the breadcrumb line was 11.67px in Vector, 10.58px in Monobook;
it has an uncomfortably small size and is, say, beyond the accessibility guideline outlined in w:MOS:SMALLFONT (<small> in enwikt is even smaller than <small> in enwiki);
if we are to increase the font size, then, without structural isolation, it's hard to see the difference/boundary between the breadcrumbs and the main content of the page; see an example.

For this reason I hold that a clearer visual separation is required. JWBTH (talk) 03:20, 19 December 2023 (UTC)[reply]

An alternative would be to use a separator like this. JWBTH (talk) 03:31, 19 December 2023 (UTC)[reply]

Yes, you explained it better than I could have. Some kind of visual separation is required, otherwise the breadcrumbs do not stand out from the category description text.

A border below as in the screenshot (plus, I take it, a small amount of additional margin below) would achieve this purpose just as well, and would also resolve the issue of the first letter of the breadcrumbs not being flush with the first letter of the category description. @JWBTH would you like to try implementing this at Module:category tree/styles.css? This, that and the other (talk) 03:49, 19 December 2023 (UTC)[reply]

@This, that and the other @JWBTH Whatever you can do to fix it up, please do it. The gray background IMO looks really terrible. Benwing2 (talk) 09:17, 19 December 2023 (UTC)[reply]

FWIW, I also much prefer the » character (and I find it more intuitive, actually; I tend to initially read > as a greater-than sign every time I see it). Andrew Sheedy (talk) 17:36, 19 December 2023 (UTC)[reply]

@This, that and the other: If you'd also like to attack the Rhymes breadcrumb system to match (e.g. Rhymes:English/uː-) that would be marvellous.
Here you go: {{User:JWBTH/rhymes nav|it|a|llo}}. I played with other designs, taking into account how the breadcrumbs are placed on "Rhymes:" pages, but didn't arrive at any good ideas, and I think having the same design as for categories is good enough. However, I see there are a lot of pages like Rhymes:English/ɔɪtə(ɹ) where there is a horizontal line under the navigation. Probably some bot should remove it.
The source is at Module:rhymes/sandbox. JWBTH (talk) 16:52, 22 December 2023 (UTC)[reply]

@JWBTH thanks for this! Implemented.

As well as removing the redundant horizontal lines, the pronunciations on those rhyme pages also need to be templatised. I could have a go using AWB... This, that and the other (talk) 00:07, 23 December 2023 (UTC)[reply]

@This, that and the other I am doing a bot run to remove the horizontal rules. Templatizing the pronunciation might be better suited for AWB. There are also pages that have manual breadcrumbs that need fixing, e.g. the Czech pages like Rhymes:Czech/at. Benwing2 (talk) 00:30, 23 December 2023 (UTC)[reply]

Why do you use > instead of →? I don't understand the incentive for arrow-like hacks when 99.9%+ of computers will correctly render an actual arrow. —Justin (koavf)❤T☮C☺M☯ 17:54, 19 December 2023 (UTC)[reply]

If you google "breadcrumbs ui", you will mostly see image- or CSS-based separators that look like a flattened > (chevron, like here) or slash. The use of "→" is rare. JWBTH (talk) 21:55, 19 December 2023 (UTC)[reply]

Chaghatay (chg) descent from Qarakhanid (xqa)

A line containing ancestors = "xqa" should be added to the entry for Chaghatay (chg) in Module:languages/data/3/c, as Template:inherited does not currently work properly on pages for Chaghatay terms descending from Qarakhanid, and displays an error when used. I mentioned this earlier at Module talk:languages/data/3/c, but the page does not appear to be monitored. Samiollah1357 (talk) 06:53, 19 December 2023 (UTC)[reply]

How does Khorezmian Turkic (zkh) fit in there? Wikipedia seems to say it's a descendant of xqa and an ancestor of chg, but I'm not sure I've read it correctly. —Mahāgaja · talk 07:23, 19 December 2023 (UTC)[reply]

It would be fine to put Khwarezmian Turkic between Qarakhanid and Chaghatay, as that would not interfere with the inherited template. Samiollah1357 (talk) 16:57, 19 December 2023 (UTC)[reply]

OK, see the family tree at CAT:Karakhanid language. Karakhanid is the ancestor of Khorezmian Turkic, which is the ancestor of Chagatai, which is the ancestor of Uyghur and Uzbek. —Mahāgaja · talk 19:10, 19 December 2023 (UTC)[reply]
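Why a missing ancestors entry produces an error can be sketched as below. The table `LANGUAGES` and the function `is_ancestor` are hypothetical stand-ins written for this illustration; the real data lives in the Lua submodules of Module:languages/data and is not reproduced here.

```python
# Hypothetical, simplified language data: each code lists its direct ancestors.
LANGUAGES = {
    "xqa": {"name": "Karakhanid", "ancestors": []},
    "zkh": {"name": "Khorezmian Turkic", "ancestors": ["xqa"]},
    "chg": {"name": "Chagatai", "ancestors": ["zkh"]},
    "uz": {"name": "Uzbek", "ancestors": ["chg"]},
}

def is_ancestor(data: dict, source: str, target: str) -> bool:
    """Return True if `source` is reachable by walking up from `target`."""
    seen = set()
    todo = list(data[target]["ancestors"])
    while todo:
        code = todo.pop()
        if code == source:
            return True
        if code in seen:
            continue
        seen.add(code)
        todo.extend(data[code]["ancestors"])
    return False

# With the chain in place, inheritance from Karakhanid is accepted.
print(is_ancestor(LANGUAGES, "xqa", "chg"))

# Without any ancestors recorded for chg, the same check fails.
broken = dict(LANGUAGES, chg={"name": "Chagatai", "ancestors": []})
print(is_ancestor(broken, "xqa", "chg"))
```

Because the check walks the whole chain, placing Khorezmian Turkic between the two languages still lets an inheritance from Karakhanid pass.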

Chagatay is a literary standard based on the previous literary standards Khorezmian Turkic and Qarakhanid. So indeed there are at least two parallel ancestors in distance languages only. Also we think it representative of Uzbek and Uyghur. This is too hard for Wikipedia’s language trees, but anyone can name languages descending from multiple concurrent languages if they are artificial in the first place, and in fact even oral-only languages can be mixed languages. Fay Freak (talk) 07:59, 19 December 2023 (UTC)[reply]

Faroese Conjugation Table Mishap

How could anyone here describe this bug of the Faroese conjugation tables? -- Apisite (talk) 08:04, 19 December 2023 (UTC)[reply]

trans-title in Template:quote-web

|trans-title= in {{quote-web}} is not working for me. Please see Скочиле́нко (Skočilénko). Anatoli T. (обсудить/вклад) 21:54, 19 December 2023 (UTC)[reply]

Tagalog column

Please replace all der(x) and rel(x) templates with col(x) templates in Tagalog in source code. Thank you. Ysrael214 (talk) 11:42, 20 December 2023 (UTC)[reply]

post-expand include size

FYI there's a discussion on Wikipedia about increasing the post-expand include size, Wikipedia:Village_pump_(technical)#Increasing_the_post-expand_include_size, which might lead to pinging some specific developers in a bid to get that longstanding, long-closed-as-wontfix issue a second look like the one that got the Lua memory limits issue reopened and fixed. If anyone here wants to chime in over there with expertise about the issue or who to ping (or if we want our own PEIS limit increased), jump in. - -sche (discuss) 21:27, 20 December 2023 (UTC)[reply]

for reference, here is Category:Pages where post-expand include size is exceeded. Lately it's been mostly documentation pages for templates and modules, where there are multiple examples using a template that everywhere else occurs only once per page. There are also appendices that sometimes hit this limit and sometimes the time limit, but those are better split into sub-pages anyway. Chuck Entz (talk) 22:17, 20 December 2023 (UTC)[reply]

For anyone who's tracking this, the feedback is that the PEIS limit will not be raised: [5]. - -sche (discuss) 05:38, 3 April 2024 (UTC)[reply]

Remove obsolete global styles

Please remove these obsolete styles for diffs from MediaWiki:Common.css:

/* Old revisions */

#mw-revision-info {
	border: 2px solid #8888FF;
	border-left: 0px;
	border-right: 0px;
	font-size: 110%;
	margin: 5px;
	margin-left: 0px;
}

#mw-editingold {
	margin-left: 15px;
	margin: 5px;
	padding: 5px;
	border: 2px solid #CC0000;
	border-width: 2px 0px;
}

They are from the times when diff info wasn't wrapped in a warning box. Now they are redundant and produce side effects.

JWBTH (talk) 22:44, 20 December 2023 (UTC)[reply]

Wiktionary has had these lines since time immemorial; I remember them from 2010ish. I was never quite sure why they were there - I imagined it was to make it easier to know you were looking at an archived revision. I think we have all got used to them and there is no good reason to get rid of them! I'd invite the opinions of others. This, that and the other (talk) 22:58, 20 December 2023 (UTC)[reply]

I hate to break it to you, but their entire purpose is to make a warning box (or rather just a box in the case of MediaWiki:Revision-info), something that is now done with a real warning box. With these styles, it is just duplicated. English Wikipedia and other wikis had similar styles which looked like this and have since been removed. Not getting rid of them is the real-life analogy of this xkcd. JWBTH (talk) 23:08, 20 December 2023 (UTC)[reply]

(If it's not clear or you developed a false memory, old version pages looked something like this until the change in MediaWiki came some time recently and the styles, indeed, made sense on their own. Not anymore.) JWBTH (talk) 23:44, 20 December 2023 (UTC)[reply]

Quotations do not collapse under Template:transclude

@Benwing2, Fytcha: See, e.g., abszach. J3133 (talk) 12:05, 21 December 2023 (UTC)[reply]

The template is creating a span.senseid that is not closed (check at Special:ExpandTemplates), and MediaWiki closes it but also encloses the ul containing quotations, creating ol > li > span.senseid > ul instead of ol > li > (span.senseid +) ul. — This unsigned comment was added by Fish bowl (talk • contribs).
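A quick way to spot this kind of problem in expanded output is to track open tags and report any that are never closed. The sketch below uses Python's standard `html.parser`; the sample markup is invented for illustration and is not the actual output of {{transclude}}, and void elements such as `<br>` are not handled.

```python
from html.parser import HTMLParser

class UnclosedTagFinder(HTMLParser):
    """Collects tags that were opened but never explicitly closed."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open tags
        self.unclosed = []   # tags implicitly closed by an outer end tag

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag; ignored in this sketch
        # Pop until the matching tag; anything popped on the way was unclosed.
        while self.stack:
            top = self.stack.pop()
            if top == tag:
                break
            self.unclosed.append(top)

def find_unclosed(html: str) -> list:
    finder = UnclosedTagFinder()
    finder.feed(html)
    finder.close()
    return finder.unclosed + finder.stack

# Invented sample: the span is opened but not closed before the quotations list.
broken = '<ol><li><span class="senseid">sense<ul><li>quotation</li></ul></li></ol>'
fixed = '<ol><li><span class="senseid"></span>sense<ul><li>quotation</li></ul></li></ol>'
print(find_unclosed(broken))
print(find_unclosed(fixed))
```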

Template:coinage

Is there a way to use Template:coinage, or a similar template, to refer to organizations or groups of people rather than individuals? For indigenous languages, I think it would be interesting to categorize terms that were coined by missionaries, Jesuits etc.

I thought of using {{lb|x|neologism}}, but it feels weird with 500-year-old coinages. Trooper57 (talk) 14:20, 21 December 2023 (UTC)[reply]

You can use {{coinage}} for organizations and groups, e.g. {{coinage|nv|Jesuit missionaries|in=the 17th century|nobycat=1|w=-}}, which yields "Coined by Jesuit missionaries in the 17th century". The parameter |w=- prevents the template from adding a link to Wikipedia; |nobycat=1 avoids creating CAT:Navajo terms coined by Jesuit missionaries; you can omit that parameter if you think having such a category is a good idea. —Mahāgaja · talk 14:30, 21 December 2023 (UTC)[reply]

Non-IAST Romanisations we use for Sanskrit

WT:About Sanskrit#Transliteration says that the use of commonly used transliteration schemes for Romanising Sanskrit is forbidden other than for IAST. What uncommon transliteration schemes do we deliberately use? I know there is an uncommon system in use for Mongolian (script Mong at least), and Tibetan seems to have another one.

(Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat, Dragonoid76), @Benwing2, Theknightwho: The reason I ask is that these uncommon schemes undocumentedly break the automatic conversion to Devanagari used in {{sa-sc}}. In particular, for Sanskrit written in the Sinhala script, do we deliberately use 'gn' for the character ඥ (gna) that IAST renders as 'jñ'? I hit the problem with the page for Sanskrit ප්‍රඥා (prajñā, “wisdom”), and I don't know whether to fix the transliteration in Module:si-translit, the script {{sa-sc}}, or just eschew default transliteration for such pages. We also need to supplement the transliteration repertoire for non-IAST schemes so as to reduce errors in inflection tables. --RichardW57 (talk) 15:44, 21 December 2023 (UTC)[reply]

We should be fixing {{sa-sc}}. There seems to be an overarching problem throughout Wiktionary for Indic languages where people seem to be confusing transcription (related to pronunciation) with transliteration (related to writing), which seems to have led to people inventing ad hoc transliteration schemes for every language. The transliteration for all Indic languages should in fact be the same for all the same corresponding characters, and it's only the IPA given for those words where the local pronunciation would be different. This is akin to, for example, the word banana being pronounced differently in the various English dialects, but the spelling is the same. Getsnoopy (talk) 17:37, 21 December 2023 (UTC)[reply]

I don't know if ALL Indic languages should be transliterated the same way. Bengali, e.g., differs from Hindi, and Hindi has minor differences from Sanskrit. In this particular case Sinhalese ඥ (gna) should be changed to "jña" from "gna", IMO. Anatoli T. (обсудить/вклад) 00:14, 22 December 2023 (UTC)[reply]

But that's what I'm saying: ISO 15919 was specifically made to cover all the cases of Indic scripts, and they all have fundamentally the same letters (at least for the ones that correspond). Insofar as there isn't a corresponding character in another script for that letter (e.g., Bengali's य़ equivalent or the Dravidian languages' ऱ and ऴ equivalents), it makes sense to deviate. But even in those cases, ISO 15919 covers those cases with a standard transliteration. Outside of that, it doesn't make sense at all to confuse transcription with transliteration. The Bengali inherent vowel should not be transliterated any differently from any of the other Indic languages because it represents the same thing fundamentally; it's merely a difference in pronunciation. It's no different to, for example, the "aw" in awful being pronounced [ɔ(ː)] in the UK and non-cot–caught merger US dialects vs. the ones that have the cot–caught merger, where it is pronounced [ɑ], but both are written the same way because they represent the same thing. Getsnoopy (talk) 01:01, 22 December 2023 (UTC)[reply]

@Getsnoopy: I understand, but how does it help? I don't think anyone would be happy to change transliterations and rules for each language. The way to go, possibly, is to add a Sanskrit function to each module or create a separate translit module for each script. Anatoli T. (обсудить/вклад) 01:13, 22 December 2023 (UTC)[reply]

Well I'm just pointing that out as something that should be the guiding principle when approaching Indic-language transliteration. As for anyone being happy, many (if not most) Indic languages currently already either are 1-to-1 with ISO 15919 or are very close to it, so it wouldn't even be a noticeable difference to update it.

Regarding the relevance to this case, I think just having the Sanskrit transliteration module be able to handle all scripts which are used to write Sanskrit words would be the way to go, since at least in the narrow Sanskrit case, we want everything to align with IAST. Getsnoopy (talk) 03:40, 22 December 2023 (UTC)[reply]

@Getsnoopy: I have no strong personal objection to all Indic languages using ISO 15919 or getting as close to it as possible, but there are always some small individual things that make it undesirable in some cases, e.g. nasalisation in Hindi, shwa-dropping. Editors have spent a lot of time making modules more phonetic by catering for shwa dropping or catering for letters and sounds not present in Sanskrit. So, it won't happen in the real world but can be made closer. I agree with your second point, so I think there is no point in pursuing the first. Anatoli T. (обсудить/вклад) 04:27, 22 December 2023 (UTC)[reply]

While IAST is limiting since it only accounts for sounds in Sanskrit, that's not the case with ISO 15919, which accounts for all Indic languages. It has provisions for representing nasalization properly as well, so the only thing that would be an exception is the schwa syncope, which I'm not really opposed to because I view it as an issue independent of transliteration. But it's this "making modules more phonetic" that I'm pointing out as a problem, since that reveals the fundamental issue where editors seem to be confusing transliteration with transcription. Getsnoopy (talk) 23:48, 22 December 2023 (UTC)[reply]

I'm not a Sanskrit or Indic languages editor, but it seems perverse to me to interpret "Standard transliteration system for Sanskrit on Wiktionary is exclusively IAST - all the others of dozen or so commonly used transliteration schemes such as Harvard-Kyoto or ISO 15919 are forbidden" as meaning that uncommon transliteration systems are allowed--is that really what this is supposed to mean? I also don't see how it's helpful to discuss other Indic languages at this point (unless there's an issue with there being no bright line between Sanskrit and those languages): the cited policy seems to be specifically about Sanskrit. It makes sense to me to transliterate Sanskrit ප්‍රඥා with 'jñ', regardless of whether Sinhalese ඥ is transliterated as gna.--Urszag (talk) 00:27, 22 December 2023 (UTC)[reply]

We're forced to use a non-IAST system in cases like Mongolian, because the correspondence isn't straightforwardly one-to-one as it is with most Indic scripts, for example. Tibetan has the same issue to a lesser extent. Theknightwho (talk) 00:33, 22 December 2023 (UTC)[reply]

That makes sense, but in the case of scripts like Mongolian the issue isn't related to how common or uncommon the transliteration scheme is, right? It depends rather on the source script. I guess if the Devanagari form given in the definition line always had its own transliteration (i.e. if the definition of forms like ᠱᠷᠢᠢ (šrii) were given as "Mongolian script form of श्री (śrī́)") it wouldn't really matter how the Mongolian form was transliterated; is there some reason not to have sa-sc show a transliteration of the Devanagari form?--Urszag (talk) 01:10, 22 December 2023 (UTC)[reply]

Yes, agreed. One solution would be to have {{sa-sc}} check the headword transliteration, and to display one if they're different. Theknightwho (talk) 01:24, 22 December 2023 (UTC)[reply]

@Theknightwho: For 'headword transliteration', you meant the invocation of {{sa-alt}}. In general, that's the only Devanagari besides invocations of {{sa-sc}}. --RichardW57 (talk) 10:26, 22 December 2023 (UTC)[reply]

@Urszag: Well, with the piecemeal development of the heading structure we are getting a lot of repetition. Historically, Romanisation has been slow to develop. When transliteration was added to Pali, non-Roman entries would typically have a transliteration on the headword and then the same form as part of the definition line. Module:headword contains tooling to stop this sort of thing, though it breaks down when distinctions are lost, as in the traditional Bengali writing of Sanskrit, or most Lao script writing of Pali. (For Pali, I got the suppression moved to the Pali headword module and templates, so the suppression can be overridden when appropriate.) If IAST were insisted on for 'transliteration', then we would have the transliteration in both the headword and in the definition line, which would be excessive.

It has been going through my mind that requiring |Deva= for {{sa-alt}} in non-Devanagari entries (and likewise |Latn= for {{pi-alt}} in non-Roman Pali entries) is excessive. I also noticed that switching to deducing most of the parameters for {{pi-alt}} significantly reduced cut and paste errors.

Incidentally, the problem isn't that the Romanisation isn't being shown by {{sa-sc}}, it's that the generated Devanagari isn't coming up correctly, and my eyes for one tend to glaze over when presented with Devanagari. Also, I don't know by heart how to enter the Devanagari into its parameter for that template. --RichardW57 (talk) 10:55, 22 December 2023 (UTC)[reply]

@Theknightwho: Can you give me an example of where converting the Romanisation of Mongolian Sanskrit into IAST can't be automated. One-to-one conversion stops at the Burmese border (/au/ becomes mandatorily multi-part until one gets to Cambodia) and we manage to convert Thai script Pali to IAST, though sometimes we need to be told the writing system. --RichardW57 (talk) 11:05, 22 December 2023 (UTC)[reply]

@Urszag: It makes sense. If we transliterate from one script, e.g. Sanskrit in Devanagari, we don't have to worry about what the transliteration would be in other scripts (as long as they are consistent and correct).

However, the module at Module:sa-headword#L-27 doesn't generate correct alternative forms and that needs to be solved, is that right? How?

प्रज्ञ (prajña) shows Sinhalese ප්‍රඥ (prajña), not ප්‍රඥා (prajñā). Is that the problem, @RichardW57? Anatoli T. (обсудить/вклад) 00:55, 22 December 2023 (UTC)[reply]

@Atitarev: No. The first Sinhalese script form shows final /a/ and the second shows final /ā/, which is as it should be. The immediate problem is that if one does not supply the Devanagari form for {{sa-sc}}, the script converts {{SUBPAGENAME}} to Devanagari by first Romanising it and then converting from IAST to Devanagari. This doesn't work if the Romanisation isn't to IAST, and in this case the Devanagari form presented has ग (ga) where it should have ज (ja).
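The failure mode can be reproduced with a toy IAST-to-Devanagari converter. The mapping tables below cover only the letters needed for this one word and are assumptions made for illustration; they are not taken from Module:sa-convert or {{sa-sc}}.

```python
import unicodedata

VIRAMA = "\u094D"
# Minimal, illustrative tables; a real converter needs the full inventory.
CONSONANTS = {unicodedata.normalize("NFC", k): v for k, v in {
    "p": "प", "r": "र", "j": "ज", "ñ": "ञ", "g": "ग", "n": "न",
}.items()}
VOWEL_SIGNS = {unicodedata.normalize("NFC", k): v for k, v in {
    "a": "",    # inherent vowel: no sign written
    "ā": "ा",
}.items()}

def iast_to_deva(text: str) -> str:
    """Convert a consonant-initial romanised word using the tiny tables above."""
    text = unicodedata.normalize("NFC", text)
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch not in CONSONANTS:
            raise ValueError(f"unsupported character: {ch!r}")
        out.append(CONSONANTS[ch])
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if nxt in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[nxt])  # consonant followed by a vowel
            i += 2
        else:
            out.append(VIRAMA)            # consonant followed by a consonant
            i += 1
    return "".join(out)

print(iast_to_deva("prajñā"))  # IAST romanisation of the Sinhala-script form
print(iast_to_deva("pragnā"))  # 'gn' romanisation fed to the same converter
```

The second call yields a cluster built on ग rather than ज, which is the symptom described above.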

Incidentally, depending on the font used, the graphic difference between the two Sinhala script forms can just be the shape of the hook at the bottom right - consonant and final vowel optionally ligate. --RichardW57 (talk) 09:46, 22 December 2023 (UTC)[reply]

@Atitarev, Urszag: The obvious answer to the problem of Module:sa-headword serving up the wrong equivalents from another script is to properly test the module providing them, Module:sa-convert. I've set up a test case module, Module:sa-convert/testcases, to do the testing, but obviously the test cases need to be added, and sometimes discussed, and there are problems with variability. For example, I've been thinking overnight about how I am to deal with Sinhala script form of Sanskrit बुद्ध (buddha) having the two Sinhala script forms බුද‍්ධ (buddha) (touching letters) and බුද්‍ධ (buddha) (conjunct that often looks very like ඞ (ṅa)). In the Sanskrit included in the BJT Pali edition of the Dhammapada, for this cluster, touching letters outweigh conjuncts by 12 to 4. We generate Sanskrit inflection tables in SLP1, and SLP1 doesn't distinguish the two ways of writing the cluster. I think I can handle it by applying a global change to the generated inflection table. --RichardW57 (talk) 10:19, 22 December 2023 (UTC)[reply]

I agree with @Urszag on this matter of interpretation. I should have spotted this oddity when I adopted Module:si-translit for Pali, but I might have overlooked it because ඥ doesn't occur in Pali. Since then, I've found @Theknightwho relying on this perverse interpretation. My first thought had been to add another dependency on language to Module:si-translit.

Incidentally, no-one's answered the question of whether transliterating Sanskrit ඥ (jña) distinctly from ज्ञ is intended, or whether it's just an accident. It might just be an accident. --RichardW57 (talk) 11:22, 22 December 2023 (UTC)[reply]

(Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat, Dragonoid76): If no-one's argued for transliterating Sanskrit ඥ (jña) as 'gn' by Christmas, I'll bring Module:si-translit into line with IAST for Pali and Sanskrit. The one argument I've seen for doing something like that is that the Tipitaka transliterations listed at https://www.accesstoinsight.org/tipitaka/sltp/index.html use 'gñ'. That is less incompatible. --RichardW57 (talk) 19:16, 22 December 2023 (UTC)[reply]

@RichardW57 What are you talking about? I pointed out that Mongolian has to rely on a different transliteration system because it is a fact of the script that it does not have a one-to-one correspondence with Devanagari, not because of some "perverse interpretation" of a policy. We would have exactly the same issue if we were to create Sanskrit entries in other non-Indic scripts. Your bizarre accusations are nonsensical. Theknightwho (talk) 20:54, 22 December 2023 (UTC)[reply]

@Theknightwho: Are you saying that the Mongolian script doesn't make all the contrasts that IAST does? That would be rather like Lao-repertoire Lao script Pali, which still gets transliterated to IAST here. Even Lao-rules Lao script Pali gets transliterated to IAST, even though -ss- and -cch- then get merged. --RichardW57 (talk) 23:50, 22 December 2023 (UTC)[reply]

@RichardW57 I can't speak for Lao, but it would be nonsense to apply IAST rules to Mongolian since it obscures the fact that the two scripts don't line up one-to-one. Mongolian necessarily uses multigraphs in some places, and we should be clear about that in the transliteration. Theknightwho (talk) 23:55, 22 December 2023 (UTC)[reply]

@Theknightwho: Why? The Burmese script, including for Sanskrit, uses multigraphs for dependent /o/ and /au/. I wouldn't propose transliterating Burmese script /o/ as 'eā'. Most non-Indian Indic scripts write independent /ā/ as independent /a/ plus dependent /ā/. There are also lots of two-part vowels in Indian scripts used for Sanskrit that hide their nature from a programmer by being written as a single character in form NFC. --RichardW57 (talk) 00:48, 23 December 2023 (UTC)[reply]
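The point about two-part vowels hiding behind a single NFC character can be checked with Python's standard unicodedata module. This is a small sketch; the two vowel signs are picked only as examples, and the helper name parts is made up for illustration.

```python
import unicodedata

# Two-part vowel signs that NFC stores as a single code point.
BENGALI_O = "\u09cb"   # BENGALI VOWEL SIGN O
SINHALA_O = "\u0ddc"   # SINHALA VOWEL SIGN KOMBUVA HAA AELA-PILLA

def parts(ch):
    """Return the canonical decomposition (NFD) of a character as code points."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]

bengali_parts = parts(BENGALI_O)   # ['U+09C7', 'U+09BE']
sinhala_parts = parts(SINHALA_O)   # ['U+0DD9', 'U+0DCF']
```

Each sign looks like one character to code that iterates over an NFC string, but decomposes into a left-hand and a right-hand piece under NFD.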

@RichardW57 Because "Mongolian necessarily uses multigraphs in some places, and we should be clear about that in the transliteration." We shouldn't brush it under the rug - this is precisely the issue @Getsnoopy was referring to: it's the difference between transliteration and transcription. Theknightwho (talk) 00:50, 23 December 2023 (UTC)[reply]

(Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat, Dragonoid76):

Done Transliteration of Sanskrit in Sinhala script now conforms to IAST as understood here. --RichardW57 (talk) 09:31, 29 December 2023 (UTC)[reply]

Another broken interface stuff

A table on MediaWiki:Blockedtext gets broken and displayed like this when there is more than one reason for a block (for example, when a public proxy IP is blocked both locally and globally, which is a common occurrence). This happens because

the block reasons are displayed in a list (when there is more than one);
MediaWiki:Blockedtext begins with table markup.

To fix this, add a newline to the beginning of MediaWiki:Blockedtext. (Ideally, instead of this tables should be transformed to <div>s, but that's too much time to invest for me...) Not 100% sure this will work though, gotta check afterwards. If that doesn't work, try to add <nowiki /> to the first (empty) line. JWBTH (talk) 21:21, 21 December 2023 (UTC)[reply]

Thanks for this, fixed.

I also note that the unblock message refers to the {{unblock}} and {{unblock-ip}} templates as a way for users to appeal their block. General question for admins: Does anyone monitor the use of these templates? Is it still worth mentioning them there? Perhaps more of a BP question... This, that and the other (talk) 23:27, 21 December 2023 (UTC)[reply]

I confirm it is fixed. JWBTH (talk) 23:33, 21 December 2023 (UTC)[reply]

(Classical) Mandaic

@Nebulousquasar has been creating "Mandaic" entries but the Wiktionary language module(s) currently generate(s) the name "Classical Mandaic" for the code they are using (myz). This discrepancy should be fixed by modifying the module(s) or the entry headers as appropriate. User: The Ice Mage (talk to meh) 14:29, 22 December 2023 (UTC)[reply]

Classical Mandaic is a liturgical language that is the equivalent of Biblical Hebrew. "Mandaic" is Neo-Mandaic or colloquial Mandaic. The two are very different. Nebulousquasar (talk) 14:36, 22 December 2023 (UTC)[reply]

Is there a way to use a bot to change all ==Mandaic== headers to ==Classical Mandaic== where [myz] is used instead of [mid]? Nebulousquasar (talk) 16:05, 22 December 2023 (UTC)[reply]
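A bot pass of the kind asked about could look roughly like the following. This is a minimal sketch under stated assumptions: it works on raw page text only, it treats any template call with myz as a positional argument as evidence that the section is Classical Mandaic, and the function name retitle_mandaic is hypothetical. Fetching and saving pages is left out.

```python
import re

# Level-2 language headers such as "==Mandaic==" (but not "===Noun===").
L2_HEADER = re.compile(r"^==([^=\n]+)==[ \t]*$", re.M)
# Any template call that passes the language code myz as a parameter.
USES_MYZ = re.compile(r"\{\{[^{}]*\|myz[|}]")

def retitle_mandaic(text):
    """Rename ==Mandaic== to ==Classical Mandaic== in sections that use myz."""
    headers = list(L2_HEADER.finditer(text))
    out = []
    pos = 0
    for i, m in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        body = text[m.end():end]
        out.append(text[pos:m.start()])
        if m.group(1).strip() == "Mandaic" and USES_MYZ.search(body):
            out.append("==Classical Mandaic==")
        else:
            out.append(m.group(0))
        out.append(body)
        pos = end
    out.append(text[pos:])
    return "".join(out)

sample = ("==Mandaic==\n\n===Noun===\n{{head|myz|noun}}\n\n"
          "==Mandaic==\n\n===Noun===\n{{head|mid|noun}}\n")
fixed = retitle_mandaic(sample)
```

Sections whose templates use mid are left untouched, so a page carrying both varieties would end up with one header of each kind.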

@Benwing2 You run bot stuff, right? Perhaps you can make these changes. User: The Ice Mage (talk to meh) 16:43, 23 December 2023 (UTC)[reply]

@The Ice Mage Are they correctly Classical Mandaic or Neo-Mandaic? Is User:Nebulousquasar using the right codes? User:Fay Freak maybe you can take a look? Benwing2 (talk) 23:32, 23 December 2023 (UTC)[reply]

@Benwing2, Nebulousquasar: Nebulousquasar appears to have moved the codes himself, but did not correct the L2 headers from Mandaic to Classical Mandaic. Before, when someone added “Mandaic” lemmas, I used to ask him whether, for vocabulary treatment, we should not merge Mandaic and Classical Mandaic anyway, as we did when we removed “Syriac” and only left Classical Syriac, because it is too confusing and apparently “Mandaeologists” published all materials, classical or modern, they could get hold of, together, since after all there are only a handful of this scholar species anyway who fit in a small room—where is the chronological line between Classical and Neo-Mandaic anyway? I may believe that the two are “very different”, but is it only a superficial synonymy or can you imagine a basis on which all could have the same header and only occasionally distinguished by labels “Neo-Mandaic”, “Classical”? Only questions. Those people who added Mandaic were then always banned for editing languages they did not know/abusing multiple accounts so I have not obtained an answer about this fringe topic I have avoided (since the corpora are smaller and remoter or in general topically more specialized than those written in Hebrew square or Syriac script or even inscriptional ones like Hatran Aramaic which are more hype amongst scholars; CAL just copied Drower/Macuch’s 1963 dictionary and after it only occasionally anything happened), maybe after the pandemic there is a more real community now, since in the last three years the article Mandaic language has been greatly expanded by multiple people (with some duplication at Neo-Mandaic). Fay Freak (talk) 01:23, 24 December 2023 (UTC)[reply]

@Fay Freak: It's similar to the differences among Classical Syriac and Neo-Aramaic varieties of Syria, Turkey, and Iraq. Neo-Mandaic is spoken in Iran by about 100 people. The grammatical particles are quite different, and so are the spellings of words. Charles Häberl has done a lot of detailed documentation of Neo-Mandaic. Take a look at Appendix:Mandaic Swadesh list for example.

'sun': Neo-Mandaic šamaš vs. Classical Mandaic šamiš

'moon': Neo-Mandaic sara vs. Classical Mandaic sira

'star': Neo-Mandaic kakua vs. Classical Mandaic kukba

(1) Both varieties are well documented enough to have entries included in Wiktionary. (2) Both varieties are distinct enough from each other to have separate headings/sections/categories.

Nebulousquasar (talk) 21:50, 25 December 2023 (UTC)[reply]

@Nebulousquasar: Thanks for the list. No. 205 kills me. 205 ‘if’ agr [ˈægæɹ] i.e. Persian , whereas I see other words known from Classical Mandaic in opposition to other Aramaic lects, e.g. qar ‘in’, and most known in any (like over 80% is also the similarity between Slavic languages, that would easily be 170 of your 207), [oˈkuːmɔ ‘black’] with some derivatives classically only Mandaic so I also see how Neo-Mandaic descends from Classical Mandaic. You really know what you talk about, though the three phonological variants do not cut it (stressing phonological details is usually a poor sign for two languages being very different, people can argue for Bosnian against Croatian against Serbian in the same fashion, see, though we all know I could just talk in one to some Yugo I meet in Germany and know not to classify though to understand which he answered, could I also just learn Classical Mandaic and start speaking to Mandaeans in Khorramshahr without giving it off?); the last (as also in No. 46) happened in antiquity already as e.g. in ܠܘܓܡܐ (luḡmā) vs. ܠܓܡܐ (lgāmā), and as with Modern English vs. Old English we probably don’t have the exact descendants of the dialects that were usually written back in classical times. I suppose there is a large gap of no attestation for centuries between Classical and Modern Mandaic anyway so there is practically no question of when the cut between the two should be made (like Middle and Modern English at 1500)? Fay Freak (talk) 22:48, 25 December 2023 (UTC)[reply]

The issue has been resolved. I have manually replaced "Mandaic" with "Classical Mandaic." Nebulousquasar (talk) 14:10, 30 December 2023 (UTC)[reply]

'please add an English translation': Crimean Gothic the

The Busbecq quotation for Crimean Gothic the has a note attached to it: '(please add an English translation of this quotation)'. But it already has one. The Latin (quoting the Gothic) is under a param "|passage="; this then has just an HTML br tag followed by the English translation. Obviously this is not enough; we want some new param, and I guessed "|tr=" but that's not it and I don't know how to hunt through modules looking for correct params. --2A04:4A43:979F:F3CE:D57B:C4E0:2DDB:EA7E 17:33, 22 December 2023 (UTC)[reply]

Fixed; the parameter is |t=. J3133 (talk) 18:17, 22 December 2023 (UTC)[reply]

Thesaurus topic categories

I would like to extend the mainspace topic category infrastructure to the Thesaurus. So, alongside Cat:en:Fruits we would have Cat:Thesaurus:en:Fruits (see WT:RFM#Add a language code to all Thesaurus topic categories for a discussion of the naming structure, which has seen minimal input).

My current idea is to make a new template, {{ws topic}}, that functions just like {{topic}} but applies the Thesaurus: version of the category. The Thesaurus topic categories themselves would follow an identical hierarchy to the mainspace cats, but each cat would also be in the corresponding mainspace cat - so Cat:Thesaurus:en:Fruits would be in Cat:en:Fruits as well as Cat:Thesaurus:en:Foods, Cat:Thesaurus:en:Plants and Cat:Thesaurus:Fruits.

What would be the best way to implement this? I feel like adding logic to Module:category tree/topic cat would be preferable to implementing a new "Module:category tree/ws topic" submodule. This, that and the other (talk) 00:33, 23 December 2023 (UTC)[reply]
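The proposed parentage can be written down as a small function. This is a sketch of the naming scheme as described in the proposal only; the function name and the idea of passing the mainspace parents in as a list are assumptions, and nothing here reflects how Module:category tree/topic cat is actually structured.

```python
def thesaurus_parents(lang, label, mainspace_parents):
    """List the parent categories of Cat:Thesaurus:<lang>:<label>.

    mainspace_parents holds the labels of the parents that the mainspace
    category Cat:<lang>:<label> already has in the topic hierarchy.
    """
    parents = [f"{lang}:{label}"]  # the corresponding mainspace category
    parents += [f"Thesaurus:{lang}:{p}" for p in mainspace_parents]
    parents.append(f"Thesaurus:{label}")  # the language-neutral umbrella
    return parents

fruit_parents = thesaurus_parents("en", "Fruits", ["Foods", "Plants"])
```

For the Fruits example this yields en:Fruits, Thesaurus:en:Foods, Thesaurus:en:Plants and Thesaurus:Fruits, matching the four parents listed in the proposal.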

Sanskrit inflection and variant spellings

@Benwing2, (Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat, Dragonoid76): I have a problem with generating Sanskrit inflection tables when there are different ways of writing the same sound. I first confirmed it in the Sinhala script with the nouns Sanskrit බුද‍්ධ (buddha) and බුද්‍ධ (buddha, “the Buddha”). When I naively use {{sa-decl-noun-m}}, the inflected forms for the second spelling appear as the inflected forms for the first spelling. The cause of the problem is that the Sinhala script form gets converted to SLP1, which has no record of how the consonants are combined, the inflected forms are generated in SLP1, and then converted to the Sinhala script. Now, I can partially fix this by using function replace in Module:string to change the stem in the inflected forms back to the second spelling, but the inflected form that is identical to the lemma remains a blue link rather than being bold black mere text. I thought I could use the form override, in this case the undocumented |voc_s=, to make the form correspond to the lemma before the application of replace, but I was defeated. The internal processing round trips override forms in the script of the lemma to SLP1 and then back to the script of the lemma - except the 'round' trip does not restore the original form.

How should I fix this problem? (I never did like inflection via normalisation to a reference script.) Or is it not a problem?

In this case, as Wiktionary's search function disregards the spelling difference, perhaps I can get away with only presenting the table for one spelling. A good reason for including the non-Devanagari lemmas is to enable users to look up the inflected forms. --RichardW57 (talk) 01:39, 23 December 2023 (UTC)[reply]
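The round-trip problem and the replace workaround can be modelled in a few lines. This is a toy sketch, not the behaviour of the Sanskrit inflection modules: stripping the zero-width joiner stands in for conversion to SLP1, the output step is assumed always to emit the touching-letters spelling, and the single anusvara ending is only a placeholder for a real paradigm.

```python
ZWJ = "\u200d"
AL_LAKUNA = "\u0dca"  # Sinhala virama

# The two spellings of the stem discussed above.
TOUCHING = "\u0db6\u0dd4\u0daf" + ZWJ + AL_LAKUNA + "\u0db0"
CONJUNCT = "\u0db6\u0dd4\u0daf" + AL_LAKUNA + ZWJ + "\u0db0"

def to_reference(form):
    """Stand-in for conversion to SLP1: the joining style is not recorded."""
    return form.replace(ZWJ, "")

def from_reference(form):
    """Stand-in for conversion back: always emits touching letters."""
    return form.replace(AL_LAKUNA, ZWJ + AL_LAKUNA)

def inflect(lemma, endings):
    """Generate forms via the reference encoding, losing the lemma's spelling."""
    stem = to_reference(lemma)
    return [from_reference(stem + ending) for ending in endings]

def restore_spelling(forms, lemma):
    """Put the lemma's own spelling of the stem back into generated forms."""
    default = from_reference(to_reference(lemma))
    return [form.replace(default, lemma) for form in forms]

ENDINGS = ["", "\u0d82"]  # placeholder endings: bare stem, stem + anusvara
raw = inflect(CONJUNCT, ENDINGS)
repaired = restore_spelling(raw, CONJUNCT)
```

Here raw comes out in the touching-letters spelling even though the lemma used the conjunct, and repaired restores the conjunct throughout, including in the form identical to the lemma.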

Quote not collapsible

@Justinrleung, Fish bowl, Wpi, theknightwho The quote at 紐西蘭／纽西兰 is not collapsible. It seems to be due to {{tcl|zh|New Zealand|id=Q664}}, which when removed resolves the issue. RcAlex36 (talk) 04:04, 23 December 2023 (UTC)[reply]

@RcAlex36: I mentioned this two days ago (§ Quotations do not collapse under Template:transclude). J3133 (talk) 04:31, 23 December 2023 (UTC)[reply]

Monsun : This action has been automatically identified as harmful, and therefore disallowed

A brief description of the abuse rule which your action matched is: no blank line before subsequent heading

What I wanted to enter: {{quote-journal|1=de|year=1852|author=|title=Vermanisches Reich.|journal=[[w:Oeconomische Encyclopädie|Oeconomische Encyclopädie]]|url=https://www.google.be/books/edition/Oeconomische_Oekonomisch_technologische/8cwUAAAAQAAJ?hl=en&gbpv=0|volume=212|issue=|page=|text=Der zweite Tagemarsch ( 18. März ) führte von Panlabang über weite Ebenen im Irawadithale durch Reisfelder, die während der Südwest- '''Monsune''' weithin überschwemmt zu einer großen Wasserfläche werden.|t=The second day's march (March 18) led from Panlabang over wide plains in the [[Irrawaddy]] valleys through rice fields, which are widely flooded during the southwest monsoons, turning into a large water area.}} Synotia (talk) 10:51, 23 December 2023 (UTC)[reply]

I've added it on your behalf. Note that Irawadithale is singular "Irawadi valley"; that -e is the old-fashioned dative singular ending, not the plural ending. (The plural of Tal is Täler anyway.) Sometimes it's hard to predict why the software considers certain actions harmful, especially coming from a user who isn't all that new (almost 2600 edits in 13 months). —Mahāgaja · talk 11:03, 23 December 2023 (UTC)[reply]

Did you change anything in it? (Besides my translation mistake, thanks for that) Synotia (talk) 11:09, 23 December 2023 (UTC)[reply]

No; I kept it exactly as above, except for changing "valleys" to "valley" and removing the spaces between "18. März" and the surrounding parentheses as well as the space after "Südwest-". —Mahāgaja · talk 11:13, 23 December 2023 (UTC)[reply]

Oops! ...I did it again

Tried to enter this in Ebene: Sense #2 is a {{semantic loan|de|la|planum|nocap=1}}, whereas #3 is one from {{semantic loan|de|en|level|notext=1}}.

(Fyi my source is the Duden) Synotia (talk) 11:16, 23 December 2023 (UTC)[reply]

Have my enemies cursed my account? Synotia (talk) 11:17, 23 December 2023 (UTC)[reply]

@Synotia You need to put a blank line between headings.

This is wrong:

==French==
===Etymology===
From ...
==Noun==

This is right:

==French==

===Etymology===
From ...

==Noun==

@Theknightwho implemented this rule, and a few others, that prevent users from saving edits with certain WT:NORM violations. I don't think making these rules "disallow" is a great idea. This has the effect of preventing potential legitimate new contributors from editing Wiktionary just because they can't quite get the formatting right. I'd support making these filters "warn" only for the time being. This, that and the other (talk) 12:06, 23 December 2023 (UTC)[reply]
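The check behind a rule like "no blank line before subsequent heading" can be approximated as follows. This is a rough sketch and not the abuse filter's actual condition, which is not shown in this discussion; the function name and the exact heading pattern are assumptions.

```python
import re

# A wikitext heading line: 2 to 6 equals signs on each side, balanced.
HEADING = re.compile(r"^(={2,6})[^=\n].*?\1[ \t]*$")

def headings_missing_blank_line(text):
    """Return 1-based line numbers of headings not preceded by a blank line."""
    lines = text.split("\n")
    bad = []
    for i, line in enumerate(lines):
        if i > 0 and HEADING.match(line) and lines[i - 1].strip() != "":
            bad.append(i + 1)
    return bad

wrong = "==French==\n===Etymology===\nFrom ...\n==Noun=="
right = "==French==\n\n===Etymology===\nFrom ...\n\n==Noun=="
wrong_hits = headings_missing_blank_line(wrong)
right_hits = headings_missing_blank_line(right)
```

Applied to the two examples above, the first reports its second and fourth lines and the second reports nothing. A periodic bot run could use the same test to insert the missing blank lines instead of blocking the save.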

It's the equivalent of telling someone "hey, watch where you're going!" then gouging their eyes out. Synotia (talk) 12:48, 23 December 2023 (UTC)[reply]

Getting flashbacks. Are we becoming a lousy social media site where contributions are filtered by AI? The example is more transparently bottable afterwards. As said, the psychological effect is enough to be detrimental on average—to our conversion rate that we have to watch like a web-shop—, though I imagine some examples of spastics who really need some filters before they save their edits. Fay Freak (talk) 13:47, 23 December 2023 (UTC)[reply]

@This, that and the other I initially did exactly that, but the reason I changed it is because I had a look at the patterns of edits that got caught by these filters. There are three main ones:

(1) Inexperienced editors who don't know about WT:NORM (and sometimes experienced editors who've made a typo/mistake), who just need a reminder. These users only tend to trigger the warning, and then correct whatever the issue is. They don't try to continue anyway, which is the point when the "disallow" notice is triggered.
(2) Opportunistic vandals, who would invariably click through the warning anyway. These are now stopped.
(3) People who ignore the warnings anyway, even though they explain what the issue is. These tend to correlate quite strongly with users known for other issues (to a greater and lesser degree).

I've also noticed a pretty strong correlation between triggering these filters and low quality edits.

Theknightwho (talk) 16:30, 23 December 2023 (UTC)[reply]

@Theknightwho @This, that and the other I would also rather these simply be warnings, otherwise it's too hostile. Benwing2 (talk) 23:34, 23 December 2023 (UTC)[reply]

I'm not bothered about contributors in categories 2 and (to some extent) 3. It's contributors in category 1 who I worry about. The rules in WT:NORM are non-trivial for someone who has never edited wikis before. Especially if they are editing a long or complex entry, it could be almost impossible for them to appreciate what they have done wrong and fix it themselves. I fear that we will scare off potential good new contributors who simply need time to learn our practices.

The real issue here, in my view, is that we need a regularly operative bot to address these NORM violations, rather than putting the onus on users who might not have the technical know-how to repair their own edits.

I've set abuse filters 115, 163 and 167 back to "Warn" for now. (I left 164, 165 and 166 at disallow for now, as they ought to be pretty easy for any newbie to solve, and the warning messages provide explicit, clear guidance.) If there is an intent to set any NORM-related filters back to Disallow, it would be good to discuss this more widely at BP. This, that and the other (talk) 09:06, 25 December 2023 (UTC)[reply]

FWIW I got caught in filter 163 myself earlier after putting an {{rfv}} tag without a blank line after it, and on balance am inclined to agree with you and other folks above that these should only Warn and not Disallow, because these are cosmetic issues and better handled by periodic bot runs; I am sympathetic to the point that this requires a bot to make periodic runs, but we're probably always going to need a bot to make periodic cleanup runs because I doubt we could (or should) ever block and require users to fix every formatting issue before submitting. - -sche (discuss) 05:39, 26 December 2023 (UTC)[reply]

@This, that and the other @-sche Yeah, fair enough. It's probably a good idea to keep an eye on the hits for these, in any event. Theknightwho (talk) 22:44, 27 December 2023 (UTC)[reply]

Mobile version broken in Farsi Wiktionary

The mobile version of Farsi Wiktionary is broken — at least on the latest version of Chrome. Try searching for a word and you'll be faced with a blank page.

Looking at the Chrome console, the page throws the following exception:

Uncaught TypeError: Cannot read properties of undefined (reading 'appendChild')
   at HTMLDocument.makeLanguageTabs (ext.gadget.TabbedLanguages-script-0.js:126:48)
   at mightThrow (jquery.js:3489:29)
   at process (jquery.js:3557:12)

This wasn't the case a few months ago. Danial23 (talk) 15:31, 23 December 2023 (UTC)[reply]

@Danial23 We're the English Wiktionary not the Farsi Wiktionary. You should report this to the Farsi Wiktionary devs. Benwing2 (talk) 23:35, 23 December 2023 (UTC)[reply]

Oops, I didn't realize each Wiki has a separate dev base. Danial23 (talk) 01:05, 24 December 2023 (UTC)[reply]

Make `{{uxi}}` work like `{{ux}}` on mobile phones

Several months ago @Allahverdi Verdizade noted in Discord that {{uxi}} looks terrible on mobile phones (generally, devices with narrow screens) and suggested an idea that {{uxi}} should only work on desktop and should automatically work as {{ux}} on mobile.

I went ahead to write a prototype based on a quite simple TemplateStyles sheet. When you narrow the window horizontally, at some point {{uxi/sandbox}}'s look will turn into the look of {{ux}}.

Then I was distracted and forgot about it (for which I apologise to Allahverdi Verdizade). Since then @Benwing2 has rewritten Module:usex, so I would need to catch up.

But let's discuss the idea and possible complications. I currently see only one: when {{uxi}} is placed inside the flow of text, the paragraph gets split into 3 parts. I have found a few such cases – many of them don't look good anyway and can be replaced. Changing the code of the prototype itself would be undesirable, because that would require getting rid of the <dl> and <dd> tags and using an improvised tag, trying to make its styles match the styles of <dl> and <dd> in different skins.

Ping @CitationsFreak and @Equinox who also participated. JWBTH (talk) 00:18, 26 December 2023 (UTC)[reply]

How much work would it take to make template:infl of and others link directly to the parts of speech?

We could save readers a bit of clicking and scrolling if {{infl of}} was able to sense, from the parameters given, whether the word being described was a noun or a verb (or something else) and thus direct the person to the appropriate section on the target page. This seems like obvious low-hanging fruit but I don't know how difficult the coding would be and how much we'd weigh that against the benefit. —Soap— 10:59, 26 December 2023 (UTC)[reply]

{{inflection of}} already takes |pos=, though it doesn't link to an anchor on the relevant part of speech. —Mahāgaja · talk 13:16, 26 December 2023 (UTC)[reply]

Probably not worth the trouble. The system knows nothing about language sections or etymology sections when it assigns anchors, so linking to the POS will almost always be wrong for non-English entries and no better than a coin-toss for English entries with multiple etymologies. Chuck Entz (talk) 13:57, 26 December 2023 (UTC)[reply]

So the software can't tell the difference between different sections with the header "Verb"? Okay, that makes sense. I consider this a bug, or at least a deficiency, in MediaWiki ... it seems it works well on every project except Wiktionary, since we're the only project that uses the same header more than once per page. Oh well, not much we can do about it. I'll consider this a CANTFIX then because there's nothing we can do on our end. —Soap— 14:06, 26 December 2023 (UTC)[reply]

I suppose this would be half-possible if we made headword templates generate anchors based on their language and POS, so then foos could link to the anchor "foo#English-noun" generated by the {{en-noun}} at foo. This would still fail in cases where there were multiple English nouns all spelled foo and foos was only a plural of the second one — unless we made the code more complicated and expensive / memory-intensive than I think would be reasonable (e.g. making all headword templates parse the texts of their pages to determine whether there were multiple "noun" sections, etc) — but it would work in most cases. - -sche (discuss) 02:13, 28 December 2023 (UTC)[reply]

@Soap, -sche, Mahagaja, Chuck Entz: The case of multiple nouns spelled foo is what we have {{senseid}} and {{etymid}} for, though pedantically we need {{wordid}} for the intermediate level. --RichardW57m (talk) 12:11, 9 January 2024 (UTC)[reply]

@-sche It's possible to avoid doing massive repetition of work by putting something like that in a module which is called by mw.loadData, which we already do for quite a few things. By its nature, it can't take any arguments, but that's okay because the only thing that changes is the pagename. All you need to do is put the relevant info into a table, subdivided by language, which can then be read by any headword templates on the page. It's not particularly memory intensive, and I doubt we would be generating tables with many hundreds or thousands of keys (which is when memory use stops being negligible). Theknightwho (talk) 17:02, 9 January 2024 (UTC)[reply]
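The table-per-page idea can be sketched as follows. This is only an illustration in Python of the data shape being discussed, not Lua module code: the function names, the input format (a list of language and part-of-speech pairs found on the page) and the fallback of linking to the bare page when a section is ambiguous are all assumptions. The anchor format follows the "foo#English-noun" example above.

```python
from collections import defaultdict

def build_pos_table(headwords):
    """Count headword sections per language and part of speech for one page."""
    table = defaultdict(lambda: defaultdict(int))
    for language, pos in headwords:
        table[language][pos] += 1
    return {language: dict(counts) for language, counts in table.items()}

def link_target(page, language, pos, table):
    """Link to a language-POS anchor only when it identifies a single section."""
    if table.get(language, {}).get(pos, 0) == 1:
        return f"{page}#{language}-{pos}"
    return page  # ambiguous or absent: fall back to the page itself

# Hypothetical page "foo": one English noun, one English verb, two French nouns.
table = build_pos_table([("English", "noun"), ("English", "verb"),
                         ("French", "noun"), ("French", "noun")])
english_link = link_target("foo", "English", "noun", table)
french_link = link_target("foo", "French", "noun", table)
```

The table is computed once per page and only holds a handful of keys, which is the property that makes the approach cheap; the ambiguous French case simply degrades to today's behaviour.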

Toggle button for inflection in pluricentric languages?

in some pluricentric languages the inflection is almost the same, save for a few forms which are different. a nice solution would be a toggleable button or a list one can access in the table that would change said forms. does anyone know of any examples of this in action that i can look at? or is the best way for this just to have multiple tables on the same page, or shove both/all standard variants in one cell? RagingPichu (talk) 00:22, 27 December 2023 (UTC)[reply]

Category:Terms with manual sortkeys different from the automated ones/en

I hope that the (hidden) categorization of, say, malva pudding in this category is the result of some mistake in some module. The category header provides no information on how the category is populated, apart from the category title itself. I see no manual sortkey in the entry. Indeed the entry seems completely unexceptionable. I haven't looked to find out how many similar cases there are, but there are nearly 23,000 members of the category. DCDuring (talk) 14:12, 27 December 2023 (UTC)[reply]

It's not only happening in English; entries in other languages are being put into the corresponding categories as well. —Mahāgaja · talk 14:30, 27 December 2023 (UTC)[reply]

After playing with HTML comments, I can verify that {{en-noun}} (or something it transcludes, rather) is what is causing this in the entry in question. My guess is it has something to do with template-parsing code added by @Theknightwho. Chuck Entz (talk) 14:51, 27 December 2023 (UTC)[reply]

@Chuck Entz @DCDuring @Mahagaja So this needs a little explanation, and probably a (slight) rename of the category. Initially when I created this, it did exactly what it says on the tin: it checks for the use of a sort= parameter, and then categorises it there if the input differs from the automatic sortkey, so it made sense to name it in the same vein as Category:Terms with manual transliterations different from the automated ones. However, I figured that it would be useful if this also encompassed raw categories (e.g. [[Category:en:some category]]), since what we really care about from an entry-maintenance perspective is whether the sortkey is actually correct. If it finds any raw categories on the page for the relevant language, it checks the language's automatic sortkey against the page's default sortkey, since that's what raw categories use. This is obviously useful for any languages like Welsh, where certain digraphs/trigraphs/whatever need to be sorted differently on a language-specific basis, which won't happen if someone adds raw categories.

There are two reasons why English entries are disproportionately affected: firstly, there are a huge number of English entries that use raw categories, and secondly because English sortkeys ignore spaces (whereas default sortkeys do not). So in the example given above, it's comparing the page's default sortkey MALVA PUDDING against the English automatic sortkey MALVAPUDDING and finding that they're different, which is why it's placing it in this category. That's why the vast majority of terms in the English category look to be multiword terms.
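The comparison described here reduces to a few lines. This is a simplified sketch: the real sortkey logic lives in the language modules and handles far more than case and spaces, and the function names below are made up for illustration.

```python
def default_sortkey(title):
    """Simplified page default sortkey: uppercase, spaces kept."""
    return title.upper()

def english_sortkey(title):
    """Simplified English automatic sortkey: uppercase, spaces ignored."""
    return title.upper().replace(" ", "")

def is_tracked(title, has_raw_category):
    """Would the page land in the tracking category via the raw-category route?"""
    return has_raw_category and default_sortkey(title) != english_sortkey(title)

multiword = is_tracked("malva pudding", has_raw_category=True)
single_word = is_tracked("malva", has_raw_category=True)
```

A multiword title with a raw category is tracked because MALVA PUDDING differs from MALVAPUDDING, whereas a single-word title produces identical keys and is left out.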

To avoid confusion, it's probably best to take the word "manual" out of the name, and call it "Terms with sortkeys different from the automated ones", and also to put an explanation of the two different ways that entries can get added. I was going to do the explanation anyway, but hadn't got around to it yet (since I only just added the functionality anyway). Theknightwho (talk) 15:04, 27 December 2023 (UTC)[reply]

So it is a supercategory that conflates distinct alleged noxious departures from unvoted-on norms.

We already have a category specifically for manually entered categories, for which alleged defect there is a distinct mode of correction.
Compound words should obviously be excluded as there is nothing to correct.

The mass of hidden categories with long titles camouflages what might be of interest to any individual user. This category as applied is at best a red herring for most entries it has been categorizing. DCDuring (talk) 15:46, 27 December 2023 (UTC)[reply]

@DCDuring No, it simply categorises terms which are being sorted differently from the automatic sort, and there are two different ways in which that can happen. As with transliterations, there may be perfectly good reasons for that to happen. If the raw category results in the same sortkey, then it does not get put into this category. At a low estimate, around 100,000 English entries use raw categories (90K in Category:English entries with topic categories using raw markup and 57K in Category:English entries with language name categories using raw markup, assuming that some will have both). By comparison, there are only 23K English entries in this one, and even if that grows a little it will certainly not apply to all of them.

Compound words do need to be corrected: if we want English sortkeys to ignore spaces (as I remember you were keen to have), then raw categories added to compound terms violate that, since the spaces will not be ignored during sorting. This category simply keeps track of where that's happening, and the issue won't go away if we stop tracking it.

Finally, the categories are hidden for a reason - the majority of users won't even know they're there. Theknightwho (talk) 15:57, 27 December 2023 (UTC)[reply]

Right, it's only active contributors who face the annoying clutter. Maybe we need gadgets that filter the excessive number of hidden (and other) categories. I am reasonably sure that most (all?) taxonomic categories are of no interest to anyone but me. The more "permanent" categories may not have to be displayed at all. I certainly find translation requests of very little value to me.

What is the nature of the compound words "correction"? Is it something that those not allowed to edit templates can do anything about? It doesn't seem a real "issue" to all those of us who use the listing of hidden categories. Will the "issue" be promptly resolved, say, in the next two weeks? It seems to me that what needs to be corrected is the categorization. If there are plenty of reasons for entries being in the category that don't require correction, then the category definition is overbroad. I don't understand why manual categorization is inherently bad, but at least it is an understandable category. DCDuring (talk) 16:41, 27 December 2023 (UTC)[reply]

@DCDuring Yes, it would be nice if categories weren’t all shoved at the bottom, but there’s no straightforward way to fix that. Even without hidden categories visible, a page like "a" renders them effectively unusable unless you know exactly what you’re looking for.

The issue of compounds is simple: do you want English sorting to ignore spaces or not? If you do (and I know that you do, since you argued strongly in favour of it), then we need to use the category templates to make sure that happens, or otherwise it won’t. It’s as simple as that. The issue is more acute with languages that have more complex sorting, where the default sortkey is sometimes completely different from the automatic one. That applies to a lot of languages. Theknightwho (talk) 17:01, 27 December 2023 (UTC)[reply]
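To make the effect on compounds concrete, here is a minimal sketch of the difference between a space-ignoring sortkey and a raw category link that falls back to the page title. Both helper functions are hypothetical stand-ins, not the actual sortkey code, and the uppercase-and-strip-spaces rule is an assumption made for illustration only.

```python
def automatic_sortkey(title: str) -> str:
    """Hypothetical stand-in for the automatic English sortkey:
    uppercase the title and drop spaces (assumed rule, for illustration)."""
    return title.replace(" ", "").upper()


def raw_sortkey(title: str) -> str:
    """Assumed behaviour of a raw category link with no sortkey:
    the page title is used as-is, so spaces take part in the sort."""
    return title.upper()


def differs_from_automatic(title: str, given_key: str) -> bool:
    """True when a supplied sortkey would sort the page differently
    from the automatic one, i.e. when the page would be tracked."""
    return given_key != automatic_sortkey(title)


titles = ["ice cream", "iceberg", "ice age", "iced"]

# Spaces ignored: "ice cream" sorts after "iceberg".
by_automatic = sorted(titles, key=automatic_sortkey)

# Spaces kept: every "ice ..." compound sorts before "iceberg".
by_raw = sorted(titles, key=raw_sortkey)
```

Under these assumptions a single-word title gets the same key either way and would not be tracked, while a compound with a raw category would be.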

What is the problem with mass changing of categories from "manual" to template-using? Shouldn't that be possible to do with minimal risk for large classes of terms like Category:English compound terms? Are there known problems/risks with doing so? DCDuring (talk) 18:28, 27 December 2023 (UTC)[reply]

@DCDuring Yes, it's possible, and the category makes it much easier to do that. Theknightwho (talk) 20:19, 27 December 2023 (UTC)[reply]

@Theknightwho IMO these categories are badly named. Since most of them haven't been created yet, can you rename them to e.g. 'Terms with redundant script codes/lad' -> 'Ladino terms with redundant script codes', 'Terms with manual sortkeys different from the automated ones/prg' -> 'Old Prussian terms with manual sortkeys different from the automated ones' etc.? The same change needs to be made to existing categories of this form, e.g. 'Terms with redundant transliterations/ru' -> 'Russian terms with redundant transliterations' and 'Terms with manual transliterations different from the automated ones/pi' -> 'Pali terms with manual transliterations different from the automated ones'. I can take care of renaming the existing transliteration categories but please go ahead and fix the code that generates these script code and sortkey-related categories to use the new names; I'll wait for you to do that before running my script to auto-populate the categories in Special:WantedCategories. Benwing2 (talk) 10:30, 28 December 2023 (UTC)[reply]

@Theknightwho Thank you for renaming the categories. I notice you named them e.g. 'terms with non-automated script codes', although this isn't quite correct because terms with redundant non-automated script codes go into a separate category ('terms with redundant script codes'). What do you think of the name 'terms with non-redundant non-automated script codes' (or maybe better, 'terms with non-redundant manual script codes')? If you agree, I can make the change and rename the categories. I'm thinking categories like Category:Terms with manual transliterations different from the automated ones/ru could have similar names, e.g. 'Russian terms with non-redundant manual transliterations' (here I use "manual" in place of "non-automated" -- I think you are using "non-automated" for sortkeys because they can be specified either using the |sort= param, or using raw wikitext, or using DEFAULTSORT, but that doesn't apply to script codes or transliterations, so maybe we should just use "manual" for them). Benwing2 (talk) 05:07, 29 December 2023 (UTC)[reply]

@Benwing2 Yeah, I did have a think about this, and I settled on "non-automated" as a compromise between accuracy and clarity, because I felt like the more qualifiers there are the more confusing it'll be to someone who isn't familiar with the categories, and the category description makes things clear. You're right about why I settled on "non-automated" for sortkeys, and I used it for script codes for the sake of consistency, but I don't mind if you want to rename them. Theknightwho (talk) 05:13, 29 December 2023 (UTC)[reply]

@Theknightwho OK, thanks. I also think maybe we should use "entries" instead of "terms", at least for the transliteration categories, because very often they're triggered by a translation or other term that isn't the term of the page itself. This might also make sense for script codes (?), but not so much for sortkeys. Benwing2 (talk) 05:59, 29 December 2023 (UTC)[reply]

@Benwing2 Yeah, that's something I've also wondered about, but using "entries" doesn't work when you take into account the language name, because listing non-English entries in "English entries with redundant script parameters" would be really confusing. I couldn't think of anything better than "terms", but it's not ideal. Theknightwho (talk) 06:07, 29 December 2023 (UTC)[reply]

@Theknightwho Right, I see. Maybe we should just clarify what's going on in the category text. Benwing2 (talk) 06:34, 29 December 2023 (UTC)[reply]

Request to disable or modify spam filter regarding "no-ip"

I have encountered a spam filter which blocked an edit of mine because it contained the string no-ip. I am wondering if it would be possible to modify or disable that filter so that the contribution could be made.
To be more detailed, earlier this month, I was trying to add the resource list hosted at the homepage of the PolyU Corpus of Spoken Chinese to Wiktionary:Corpora. The home page is specifically located at http://wongtaksum.no[dash]ip.info:81/corpus.htm. A record of blocks I encountered can be seen in this part of the spam block list log.
Any advice or help would be greatly appreciated. Thanks and take care. —The Editor's Apprentice (talk) 23:40, 27 December 2023 (UTC)[reply]

Looking into this, you were stopped by the global spam blacklist having \bno-ip\. on it, which was added in 2013, for being a weird redirect service: "if you are on a changing IP, you can link the site to your changing IP, and the service is set up to follow you". Does the Corpus not have another URL? Meta folks in 2015 were declining to un-blacklist any subdomains, so if we wanted to allow the PolyU Corpus no-ip link here, we would have to whitelist it locally (at MediaWiki:Spam-whitelist), either that specific subdomain or specific URL(s). - -sche (discuss) 01:40, 28 December 2023 (UTC)[reply]

Hey -sche, I really appreciate you taking time to look into the issue and collect context about what was going on. To clarify, the PolyU Corpus does have another URL, but the corpus itself is not of specific interest to me. Instead the "Links to Other Corpora and Databases" section below the corpus' description is what interests me. Doing a search for an excerpt of the page led me to an older version of the page hosted on the City University of Hong Kong website. That page has a "last updated" date in 2018 while the version at the "no-ip" domain has a "last updated" date from this month (December 2023). This age difference corresponds to differences in their lists, with the Cantonese section, for example, being significantly longer at the "no-ip" version. Given the outdated quality of the only alternative URL, locally whitelisting the subdomain would be my preferred approach. Thanks again and take care. —The Editor's Apprentice (talk) 01:32, 29 December 2023 (UTC)[reply]

@-sche, pinging you to follow up on this discussion and possible next steps. Thanks again. —The Editor's Apprentice (talk) 23:48, 12 January 2024 (UTC)[reply]

OK, I have whitelisted wongtaksum\.no-ip\.info:81/corpus\.htm in diff. - -sche (discuss) 03:01, 13 January 2024 (UTC)[reply]
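For anyone following along, here is a simplified model of how the two patterns quoted in this thread interact. The patterns themselves are copied from the posts above; the `is_blocked` function and the "whitelist wins over blacklist" ordering are assumptions for illustration, and the real spam filter's matching may differ in detail.

```python
import re

# Pattern quoted above from the global spam blacklist.
BLACKLIST = [re.compile(r"\bno-ip\.")]

# Pattern quoted above from the local whitelist entry.
WHITELIST = [re.compile(r"wongtaksum\.no-ip\.info:81/corpus\.htm")]


def is_blocked(url: str) -> bool:
    """Hypothetical check: a whitelisted URL is always allowed;
    otherwise any blacklist match blocks the edit."""
    if any(pattern.search(url) for pattern in WHITELIST):
        return False
    return any(pattern.search(url) for pattern in BLACKLIST)
```

Because the whitelist entry names one specific URL, other pages on the same subdomain would still be caught by the blacklist in this model.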

Thank you! It seems to have successfully worked as I was now able to add a link to the site to Wiktionary:Corpora. I once again appreciate your help and hope you take care. —The Editor's Apprentice (talk) 00:04, 14 January 2024 (UTC)[reply]

Requested addition to bad image list

A user is repeatedly adding File:ShortForeskin.jpg, depicting a human penis, to غلغلستان. It is requested that the image be added to MediaWiki:Bad image list to prevent further abuse. Normally at enwiki, we do this every time someone uses an obscene image for vandalism. LaundryPizza03 (talk) 02:13, 28 December 2023 (UTC)[reply]

OK, I've added the image to the blocklist and protected the page, and koavf blocked the IP for a week. - -sche (discuss) 02:18, 28 December 2023 (UTC)[reply]

Some context. This is apparently the same person who edit-warred here for a while a few years ago because Connie Glynn did something he didn't like and he felt that he had to verbally abuse her in any page remotely connected to her (apparently raindrop cake is the only such page at Wiktionary- see some of his activities at WP). He apparently uses vandalism of public spaces to compensate for feelings of anger and powerlessness, like some arsonists do. A couple of days ago, he had some kind of interaction with some girl, and he was so hurt that he's been adding penis pictures to random pages such as paravoza, subglacial, and now غلغلستان. He's currently using some kind of app or service that provides him with random IP addresses from Morocco, so blocking specific IPs won't stop him for long. After a while, he'll get tired of this and wake up to the realization of how pathetic it looks to the people who see what he's doing. We just need to deal with the mess he's making until then. Chuck Entz (talk) 08:15, 28 December 2023 (UTC)[reply]

Also see an interaction I had with him on my talk page back then: User_talk:Chuck_Entz/2020#What makes you think no one cares about it?. Chuck Entz (talk) 08:29, 28 December 2023 (UTC)[reply]

So that’s what the incident was all about? Geez. And it had to be during one of the periods when I was busy and forgot about protecting the WOTD pages. — Sgconlaw (talk) 08:41, 28 December 2023 (UTC)[reply]

I wonder if you could have a bot auto-protect the WOTD pages. CitationsFreak (talk) 18:21, 28 December 2023 (UTC)[reply]

The WOTD page itself is protected by a filter, I think. I seem to recall some discussion about whether the WOTD (and, by extension, the FWOTD) entry pages should also be protected in some way (cascading protection?) but this was decided against—I don't remember why. @Chuck Entz may recall. — Sgconlaw (talk) 19:44, 29 December 2023 (UTC)[reply]

We have to be very sparing with cascading protection, because it hits every template or module that the cascading-protected page uses. In the past, when the main page was cascading-protected at e.g. "admins only" (with the goal of protecting the transcluded WOTD template against vandalism), it meant that on any day when that WOTD used e.g. {{lb}}, which was most days, the protection also locked Module:labels/data so that no-one but admins was able to add labels for use on other entries, unrelated to the WOTD (some discussion). Note that because cascading protection only applies to things the protected page transcludes, cascading-protecting the main page also couldn't protect the FWOTD mainspace entry, because the entry is not transcluded, only the FWOTD template that summarizes the entry (and transcludes other things, but not AFAICT the mainspace entry itself) is transcluded. - -sche (discuss) 21:11, 29 December 2023 (UTC)[reply]
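A rough sketch of why this happens: cascading protection follows transclusions transitively, so it reaches everything the protected page pulls in and nothing else. The page names and the `TRANSCLUDES` graph below are made up for illustration and do not reflect the actual transclusion data.

```python
from collections import deque

# Hypothetical transclusion graph: page -> pages it transcludes.
TRANSCLUDES = {
    "Main Page": ["Template:WOTD", "Template:FWOTD"],
    "Template:WOTD": ["Template:lb"],
    "Template:lb": ["Module:labels/data"],
    "Template:FWOTD": [],          # summarizes the entry, does not transclude it
    "some FWOTD entry": ["Template:lb"],
}


def cascade_protected(root: str) -> set[str]:
    """Return every page reached by following transclusions from root,
    i.e. the pages that cascading protection on root would also lock."""
    locked, queue = set(), deque([root])
    while queue:
        page = queue.popleft()
        for used in TRANSCLUDES.get(page, []):
            if used not in locked:
                locked.add(used)
                queue.append(used)
    return locked


locked_pages = cascade_protected("Main Page")
```

In this toy graph the labels data module ends up locked, while the mainspace entry does not, because nothing reachable from the main page transcludes it.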

@-sche: ah, that was probably it. I guess @CitationsFreak's suggestion of running a bot regularly to protect the WOTD entries could be explored, then. — Sgconlaw (talk) 21:14, 29 December 2023 (UTC)[reply]

Surjection created a couple of new abuse filters to address this. See "Why can't I add images?" below. Chuck Entz (talk) 21:36, 29 December 2023 (UTC)[reply]

Just to set the record straight: in an edit that got reverted due to gratuitous verbal abuse, the IP did raise a valid point: I know nothing about their gender, so I was off-base in calling them "he". The context and their insistence on displaying penises made it a half-way reasonable guess, but when dealing with something so deeply personal guessing is always a bad idea. Chuck Entz (talk) 21:57, 29 December 2023 (UTC)[reply]

Allow accents, not just qualifiers, in T:IPA to enable checking

Just spitballing, not sure if this would have downsides: and perhaps this is moot if we're switching to T:en-IPA soon, but OTOH, perhaps this'd be quicker to implement: what if we allowed accent labels within {{IPA}}, the same way it's possible to put qualN=? I thought of this because currently {{IPA}} can only know what language a pronunciation is, but if it also knew what accent a pronunciation was from, we could code it to flag issues like the widespread [until I fixed it] use of /e/ in accents that don't have /e/ (usually /ɛ/ is meant, but sometimes /eɪ/), which it can't currently flag AFAIK because other accents like Australian do have /e/. Or use of /ou/ for /oʊ/, or of /ou/ for /aʊ/. - -sche (discuss) 18:39, 28 December 2023 (UTC)[reply]
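A minimal sketch of the kind of check this would enable, assuming the template passed along an accent label with each pronunciation. The accent labels, the `SUSPECT` table, and the `flag` function are all hypothetical; which sequences are suspect for which accent would need to be decided by editors.

```python
import re

# Hypothetical per-accent table of suspicious sequences (illustrative only).
SUSPECT = {
    "GenAm": [
        # Bare /e/ not followed by ɪ; /ɛ/ or /eɪ/ is usually meant.
        (re.compile(r"e(?!ɪ)"), "bare /e/: /ɛ/ or /eɪ/ probably meant"),
        (re.compile(r"ou"), "/ou/: /oʊ/ or /aʊ/ probably meant"),
    ],
    # An accent that does have /e/ gets no rule for it.
    "AusE": [],
}


def flag(accent: str, ipa: str) -> list[str]:
    """Return a warning for each suspicious sequence found in ipa
    for the given accent label; unknown accents yield no warnings."""
    return [msg for pattern, msg in SUSPECT.get(accent, []) if pattern.search(ipa)]
```

Without the accent label the same transcription cannot be judged, which is the limitation described above: /bed/ is suspicious under one label and fine under another.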

@-sche Not opposed to this. Even assuming User:Theknightwho's {{en-IPA}} is ready soon, it will be quite a while before all the manually specified pronunciations get converted; and having standardized accent tags instead of non-standard qualifiers will help the conversion process to the extent it can be automated. (This is indeed possible; I've done it in the past in a semi-automated fashion with thousands of Portuguese and Russian manually-specified pronunciations.) Benwing2 (talk) 06:02, 29 December 2023 (UTC)[reply]

It won’t be ready soon - it’s an enormous task, and will take months. Theknightwho (talk) 02:50, 30 December 2023 (UTC)[reply]

Allow Template:t-check for "fa-ira", "fa-cls"

Error:

Lua error in Module:languages/errorGetBy at line 16: Please specify a language code; the value "fa-ira" is not valid (see Wiktionary:List of languages).

For (please verify) اَرْژ (arž)

@Benwing2: Could you please take a look :) Anatoli T. (обсудить/вклад) 00:26, 29 December 2023 (UTC)[reply]

Works here but gives an error at price#Translations. Anatoli T. (обсудить/вклад) 00:27, 29 December 2023 (UTC)[reply]

Template fa-IPA at giyâh

What's with the seemingly minor error message from {{fa-IPA}} at the entry (giyâh)?

As a side-note, where can one find any good literary resources that are applicable to the logic of the pronunciation template?

Thanks for reading. -- Apisite (talk) 22:00, 29 December 2023 (UTC)[reply]

Sources for Chinese Dialectal Synonyms and Dialectal Pronunciation?

Hey!

I have no idea if this is the right place to ask this and I'm sorry if it isn't, but does anyone know where the Chinese dialectal synonyms and pronunciations are sourced from? Respectively they are Module:zh/data/dial-syn and Module:zh/data/dial-pron. I suspect the latter is from zhongguoyuyan.cn, but I can't find anything online for the first one. Unfortunately, the page of the user who created them, Fish Bowl, no longer exists. I'd love to do some research on them. Thanks! 74.101.125.9 02:50, 31 December 2023 (UTC)[reply]

The dialectal synonyms are compiled manually by editors from various sources (check the history tab on each individual page, as well as User:Justinrleung#Subpages)

The dialectal pronunciations are from a source whose name I had to ask User:Wyang to add: the comment at the very bottom of that page reads 'Credit: The book series "Phonetic Database of Modern Chinese Dialects". Pronunciation information is uncopyrightable.' Since then, that data has also been modified here and there (also check the history tab, unfortunately)

(I'm not sure where you got the information that I created these, though.) —Fish bowl (talk) 03:27, 31 December 2023 (UTC)[reply]

The specified language Proto-Indo-European is unattested, while the given word is not marked with '*' to indicate that it is reconstructed.

@Theknightwho I assume this was a change of yours. On πρᾶος, for example, you see "Lua error in Module:links at line 176: The specified language Proto-Indo-European is unattested, while the given word is not marked with '*' to indicate that it is reconstructed." This page has not been edited recently, so something changed in Module:links, apparently to disallow display forms without * for reconstructed languages, which was formerly allowed. Can you either revert this change or correct the errors if they're legitimate errors you want corrected? There are now over 100 pages in CAT:E with this error. Benwing2 (talk) 09:07, 31 December 2023 (UTC)[reply]

@Benwing2 I had been working through them, but they've built up again. I'll handle them. Theknightwho (talk) 09:10, 31 December 2023 (UTC)[reply]

@Theknightwho Awesome, thank you! Benwing2 (talk) 09:14, 31 December 2023 (UTC)[reply]

@Benwing2 It's about a 50-50 split between instances which really should have the asterisk, and compound terms where people have put (e.g.) {{l|ine-pro|*term1}}-{{l|ine-pro|*term2|term2}}, which is easy to solve if you make sure they're in the same template: {{l|ine-pro|*[[term1]]-[[term2]]}}. Theknightwho (talk) 09:17, 31 December 2023 (UTC)[reply]

DOPAC (excessive conversion of commas to "and")

Discussion moved from WT:TR.

There's a template error in the def; idk how to fix. Thanks, Kritixilithos (talk) 11:06, 31 December 2023 (UTC).[reply]

And I notice that even changing the {{w}} template to a "manual" link {{abbreviation of|en|[[:w:3,4-dihydroxyphenylacetic acid|3,4-dihydroxyphenylacetic acid]]}} doesn't fix it. There appear to be two issues: commas are being converted to "and" even inside of the link, and separately, this is making the link not link, rather than just wrongifying the target or display form. The first thing seems unintended / undesirable, and it seems fortuitous that the second thing happened to make it noticeable. (This is an issue in the general case, independent of whether or not we want to view this specific term DOPAC as an abbreviation or just a synonym or short name or something.) - -sche (discuss) 14:55, 31 December 2023 (UTC)[reply]

@Benwing2 Seems like implementing an escaping \ would be helpful here. Progress on the wikitext parser is going well, but it's not ready yet, so we'll need to do this the conventional way. Theknightwho (talk) 15:04, 31 December 2023 (UTC)[reply]

Fixed for the time being with , as a workaround. Chuck Entz (talk) 15:32, 31 December 2023 (UTC)[reply]

@-sche This is happening inside of form-of templates when there's a comma not followed by a space. The code that splits on commas isn't smart enough to recognize links and avoid touching them, otherwise it would work correctly even with the {{w}} template (which is processed before the form-of code runs, and just converts to a manual link). I will probably implement both the backslash escaping of commas and try to make it smart enough to recognize links. Benwing2 (talk) 20:20, 31 December 2023 (UTC)[reply]
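A sketch of the two fixes mentioned here, assuming the splitter only acts on commas not followed by a space: skip over anything inside `[[...]]`, and treat `\,` as a literal comma. The function name and exact rules are assumptions for illustration, not the form-of module's implementation.

```python
def split_on_commas(text: str) -> list[str]:
    """Split text on commas not followed by a space, leaving wikilinks
    intact and turning a backslash-escaped comma into a literal one."""
    parts, current, i = [], [], 0
    while i < len(text):
        if text.startswith("[[", i):
            # Copy the whole link through untouched (to end of text if unclosed).
            end = text.find("]]", i)
            end = len(text) if end == -1 else end + 2
            current.append(text[i:end])
            i = end
        elif text.startswith("\\,", i):
            # Escaped comma: keep the comma, drop the backslash.
            current.append(",")
            i += 2
        elif text[i] == "," and text[i + 1:i + 2] != " ":
            # Separator comma: close the current part.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(text[i])
            i += 1
    parts.append("".join(current))
    return parts


link = "[[:w:3,4-dihydroxyphenylacetic acid|3,4-dihydroxyphenylacetic acid]]"
joined_naive = " and ".join("3,4-dihydroxyphenylacetic acid".split(","))
joined_fixed = " and ".join(split_on_commas(link))
```

The naive split reproduces the reported symptom on the chemical name, while the link-aware version leaves the link as a single piece.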

Should be fixed. Benwing2 (talk) 22:38, 31 December 2023 (UTC)[reply]

Thank you! - -sche (discuss) 01:37, 1 January 2024 (UTC)[reply]