Wiktionary:Grease pit

Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and as a website.

The Grease pit is a place to discuss technical issues such as templates, CSS, JavaScript, the MediaWiki software, extensions to it, the toolserver, etc. It is also a place to think in non-technical ways about how to make the best free and open online dictionary of "all words in all languages".

Others have understood this page to explain the "how" of things, while the Beer parlour addresses the "why".

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Everyone is encouraged to expand both pages, or to come up with more such stuff. Other known pages with "tips-n-tricks" are to be listed here as well.

Grease pit archives
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017


May 2017

Category:langrev subtemplates

This category contains 8,085 templates, all of which need to be deleted. Can this be done by bot? — I.S.M.E.T.A. 00:50, 2 May 2017 (UTC)

I can do this, but I want to make sure there's consensus for this before doing something major like this. Any objections? Benwing2 (talk) 03:39, 2 May 2017 (UTC)
Where on earth are those templates used? It seems they convert language name to code, which can be done with Module:languages/templates instead. — Eru·tuon 03:46, 2 May 2017 (UTC)
@Benwing2, Erutuon, CodeCat: They can't be deleted because MediaWiki:Gadget-TranslationAdder.js still relies on them, and CodeCat (who marked them for deletion) refused to fix it. —Μετάknowledgediscuss/deeds 04:09, 2 May 2017 (UTC)
CodeCat can't fix this. Only administrators can edit js pages. --Giorgi Eufshi (talk) 05:27, 2 May 2017 (UTC)
She can fix this. If she (or anyone else) wrote out exactly what needed to be changed and pinged me, I'd make the edit. —Μετάknowledgediscuss/deeds 05:29, 2 May 2017 (UTC)
@CodeCat, do you care to explain here the necessary changes that need to be made to MediaWiki:Gadget-TranslationAdder.js in order to obsolete the members of Category:langrev subtemplates completely? — I.S.M.E.T.A. 11:02, 2 May 2017 (UTC)

{{cardinalbox}} should classify into Category:LANG cardinal numbers, and same for {{ordinalbox}}

Any objection if I make these changes? Benwing2 (talk) 02:23, 2 May 2017 (UTC)

The only objection is that we shouldn't rely on the boxes to provide these categories. For languages for which no one has added these boxes, the numbers should still be properly categorized with the headword templates and such. --WikiTiki89 21:25, 2 May 2017 (UTC)

Sauraseni Prakrit transliteration

When I link to a word in Sauraseni Prakrit (psu) written in Devanagari (according to Module:languages/data3/p the only script for this language) with a template like {{l}} or {{m}}, the transliteration comes out in Devanagari as well, which seems counterproductive (e.g. वेदस). Either we should automatically transliterate the Devanagari into the Latin alphabet, or we shouldn't transliterate it at all. But "transliterating" it into the exact same script can't be right. —Aɴɢʀ (talk) 14:06, 2 May 2017 (UTC)

I think the problem is that it invokes Module:Brah-translit as if it were using Brahmi script, when it should (if it is written in Deva) invoke Module:Deva-Latn-translit. Any character that is not recognized by a transliteration module is "transliterated" as itself. ... But having made that change, I see that the instance above now produces a module error, so I undid my edit. - -sche (discuss) 17:21, 2 May 2017 (UTC)
Module:Deva-Latn-translit seems to transliterate from Latin to Devanagari or other scripts. Hindi and Sanskrit each have their own transliteration modules: Module:sa-translit and Module:hi-translit. I suppose Sauraseni Prakrit will have to have a transliteration module created for it. — Eru·tuon 19:50, 2 May 2017 (UTC)
Or at the very least, edit some module so that no transliteration appears at all, as already happens for other Devanagari-script languages like Awadhi and Bhojpuri: {{l|awa|वेदस}} and {{l|bho|वेदस}} simply surface without transliteration, and that would be preferable for Sauraseni rather than this ridiculous "transliteration" into itself. —Aɴɢʀ (talk) 21:19, 2 May 2017 (UTC)
Yeah, I think interlingual script translit modules should check the script before transliterating (and return nil if it's the wrong script). --WikiTiki89 21:24, 2 May 2017 (UTC)
It's quite bizarre that Module:Brah-translit was added as the transliteration module for psu, since the only script listed is Deva. So, I removed the transliteration module from the language data file. — Eru·tuon 21:30, 2 May 2017 (UTC)
I also made Module:Brah-translit return nil if the script code isn't Brah. Not necessary to solve this problem, but probably a good idea anyway. — Eru·tuon 22:31, 2 May 2017 (UTC)
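
As a rough illustration of the guard discussed above (check the script code first, return nothing if it is the wrong script), here is a minimal sketch. The real modules are written in Lua; this JavaScript version is only an analogy. The function name, the two-entry character table, and the use of null in place of Lua's nil are assumptions made for the example.

```javascript
// Hypothetical sketch of a script-specific transliteration routine that
// refuses to run on text tagged with a different script code.
// The table holds two illustrative entries only, not real module data.
const BRAHMI_TO_LATIN = new Map([
  ["\u{11013}", "ka"],
  ["\u{11005}", "a"],
]);

function transliterateBrahmi(text, scriptCode) {
  // Return null (standing in for Lua's nil) when the script is wrong,
  // so the caller shows no transliteration at all.
  if (scriptCode !== "Brah") {
    return null;
  }
  // Characters the table does not know pass through unchanged, which is
  // why Devanagari text used to come back "transliterated" as itself.
  return Array.from(text, ch => BRAHMI_TO_LATIN.get(ch) ?? ch).join("");
}
```

With a guard like this, a Devanagari term tagged as Deva yields no transliteration instead of a copy of itself.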

Automatic interlanguage links

So we have automatic interlanguage links now. I don't know where this is coordinated, but it apparently isn't working for entries containing certain non-ASCII characters. When I tried removing the interlanguage links from mało, łopata, and Sigolène, they vanished. The same problem is occurring in some other languages as well, but not all: cy:łopata is showing the links (although they've been removed from the page by bot), but pl:łopata has no links showing. —Aɴɢʀ (talk) 20:08, 3 May 2017 (UTC)

This has been discussed at length in the BP (see WT:Beer parlour/2017/April#Cognate & automatic interlanguage links for one discussion of it) and I posted it in WT:N4E. Anyway, this looks like a bug, so I'm summoning @Lea Lacroix (WMDE) to explain. —Μετάknowledgediscuss/deeds 20:11, 3 May 2017 (UTC)
The Cognate extension has been discussed at length and announced at N4E, but the problem that it works on some pages and not on others has AFAICT not been discussed yet anywhere. —Aɴɢʀ (talk) 20:30, 3 May 2017 (UTC)
I know, which is why I pinged someone who can address it. My links were in response to your statement "I don't know where this is coordinated". —Μετάknowledgediscuss/deeds 23:07, 3 May 2017 (UTC)
Is it only some articles for which it doesn't work, or did it all work yesterday but is all broken now? LA2 (talk) 23:10, 3 May 2017 (UTC)
Currently it is broken altogether: phab:T164407. --Vriullop (talk) 06:13, 4 May 2017 (UTC)

Thanks for your messages. Indeed, if you find some bugs or other things that Cognate doesn't do correctly, you can ping me and I'll transmit it to the developers. I created a ticket with the examples you mentioned.

Right now, as Vriullop mentioned, there is a problem with the extension. We are trying to solve it as soon as possible. Thanks for your understanding. Lea Lacroix (WMDE) (talk) 07:47, 4 May 2017 (UTC)

Broken script and links in new request categories

None of the request categories have links that point to the appropriate language section anymore. They are also no longer formatted with the appropriate script. This was still done when they were implemented by {{poscatboiler}}. —CodeCat 20:51, 3 May 2017 (UTC)

I've added the catfix. It currently adds language attributes, but not script classes. — Eru·tuon 21:05, 3 May 2017 (UTC)
Translation request categories should point to #English. DTLHS (talk) 01:58, 4 May 2017 (UTC)
Okay, I've made that category use an English catfix. If there are other categories like that, it can be done the same way. — Eru·tuon 02:19, 4 May 2017 (UTC)

Cognate Extension Malfunctioning?

Is the Cognate extension really malfunctioning? Or is something else happening? --Lo Ximiendo (talk) 16:40, 4 May 2017 (UTC)

Apparently they have made quite a mess of it. See phab:T164407, mentioned two sections up. —Μετάknowledgediscuss/deeds 16:46, 4 May 2017 (UTC)
I think it can be attributed to the strain of introducing a major new extension at the same time they've been doing major work connected to the new backup-server location. With all that going on at the same time, it's a wonder that WMF's already-overextended technical staff hasn't completely lost it... Chuck Entz (talk) 02:15, 5 May 2017 (UTC)

Listing scripts for etymology languages

Now that we are using etymology language codes not only in {{etyl}} but also in templates like {{der}} and {{desc}} which take a term, we should list scripts for these languages in Module:etymology languages/data as well. In particular, I wanted to add Zzzz, Mani, Syrc to sog-bud, sog-man, and sog-chr, but I guess there won't be a difference since our modules are developed in a way to fetch data from Module:languages/data2 (etc.) only. --Z 07:55, 5 May 2017 (UTC)

@ZxxZxxZ: It probably wouldn't hurt to add script codes to Module:etymology languages/data, since Module:etymology languages will probably just ignore them. But I can only imagine them being useful if the etymology language uses a subset of the scripts used by its parent language. In that case, they could be used to provide an error message if the script supplied to the template is not used for that etymology language, even if it is used for other varieties of the parent language. I can't think of examples where that would be needed, but perhaps there are some. — Eru·tuon 08:56, 5 May 2017 (UTC)
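
The error check suggested above could look roughly like this. It is a sketch only: the data tables are made up for the example (the thread names the three script codes, but the parent list and the object layout are assumptions), and the real data lives in Lua modules rather than JavaScript.

```javascript
// Made-up data for illustration; the script lists are assumptions, not the
// contents of Module:etymology languages/data.
const parentScripts = { sog: ["Zzzz", "Mani", "Syrc"] };
const etymologyLanguages = {
  "sog-bud": { parent: "sog", scripts: ["Zzzz"] },
  "sog-man": { parent: "sog", scripts: ["Mani"] },
  "sog-chr": { parent: "sog", scripts: ["Syrc"] },
};

// Returns an error message when the script is not listed for the etymology
// language, or null when it is acceptable.
function checkScript(etymologyCode, scriptCode) {
  const entry = etymologyLanguages[etymologyCode];
  if (!entry) return "Unknown etymology language code: " + etymologyCode;
  // Fall back to the parent's scripts when the variety lists none of its own.
  const allowed = entry.scripts || parentScripts[entry.parent] || [];
  return allowed.includes(scriptCode)
    ? null
    : "Script " + scriptCode + " is not used for " + etymologyCode;
}
```

Here checkScript("sog-man", "Syrc") reports an error even though Syrc is listed for the parent language, which is the subset case described above.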

Make Template:place categorise free counties

@Daniel Carrero, Ungoliant MMDCCLXIV Middle Dutch hollant is a county that doesn't belong to a larger entity. Countries as such didn't exist yet, and the status of various polities as county, duchy, bishopric and such is largely arbitrary and probably not worth subcategorising. So I made brabant, a duchy, categorise in Category:dum:Polities. Can hollant be also made to categorise in that? I was able to add a new place type (duchy) but since county is already used by existing entries, I don't know how to do it. A thorough documentation of Module:place/data would certainly be appreciated for future reference! —CodeCat 20:57, 5 May 2017 (UTC)

  Done. The documentation for editing the data module is at Template:place/documentation, as I felt it better to keep documentation in a single page at the time. — Ungoliant (falai) 21:04, 5 May 2017 (UTC)

Help needed at is.wikt

We have got a slight issue at the Icelandic Wiktionary. If anyone from here could help us, that would be very much appreciated.

As one can read at this discussion, some pages have some kind of wiki code on top of the page and we have no idea how to fix this... Does anyone know what is wrong here? --Ooswesthoesbes (talk) 13:52, 6 May 2017 (UTC)

@Ooswesthoesbes, is:frakki looks just like it should. What's the problem? Actually, there really is HTML shown as text when I log out. --Dixtosa (talk) 15:19, 6 May 2017 (UTC)
I see it too, even when logged in: <img src="//upload.wikimedia.org/wikipedia/commons/thumb/7/72/Disambig.svg/25px-Disambig.svg.png" alt="" style="width: 25px; height: 20px;" />. A quick guess is that perhaps the person who translated the banner at the top of the page that links to [1] into Icelandic may have messed something up in the code for it. I suggest leaving a post at meta:Meta:Babel asking them to check that everything is alright on their end. Or perhaps a recent change to some is.wikt site-wide .js or .css has done it? - -sche (discuss) 17:10, 6 May 2017 (UTC)

It is fixed now. Thank you for your help! :) --Ooswesthoesbes (talk) 11:01, 11 May 2017 (UTC)

No worries. It was probably caused by a wiki-volcano or something. --Celui qui crée ébauches de football anglais (talk) 11:10, 11 May 2017 (UTC)

Display of vertically written languages (Mongolian, Manchu)

Is it possible to convert a space in terms of these languages to a new line character (while keeping the transliteration unchanged), when they are generated by the link templates?

i.e. on 松花江 (Sōnghuā Jiāng), the code

{{m|mnc|ᠰᡠᠩᡤᠠᡵᡳ ᠪᡳᡵᠠ||Milky Way}}

should generate:

ᠰᡠᠩᡤᠠᡵᡳ
ᠪᡳᡵᠠ
(sunggari bira, Milky Way),

instead of the current

ᠰᡠᠩᡤᠠᡵᡳ ᠪᡳᡵᠠ (sunggari bira, Milky Way).

(These languages are written vertically, in columns that run from left to right.)

Thanks! Wyang (talk) 10:24, 7 May 2017 (UTC)

I'll look into it. It should be possible. — Eru·tuon 23:10, 7 May 2017 (UTC)
I've made the change, and it appears to work! Let me know if there are any problems. — Eru·tuon 23:23, 7 May 2017 (UTC)
One problem: it will only transform spacing characters into <br> tags when the space is inside a link. Any spaces outside a link will not be converted to newlines. For instance, {{m|mnc|[[ᠰᡠᠩᡤᠠᡵᡳ]] [[ᠪᡳᡵᠠ]]||Milky Way}} displays as ᠰᡠᠩᡤᠠᡵᡳ ᠪᡳᡵᠠ (sunggari bira, Milky Way). There may be a way to fix this. — Eru·tuon 23:27, 7 May 2017 (UTC)
@Erutuon Excellent, thank you! Wyang (talk) 06:26, 8 May 2017 (UTC)
@Wyang: Fixed the problem that I mentioned above. — Eru·tuon 06:34, 8 May 2017 (UTC)
@Erutuon Thanks. One correction is needed though: The character ‹᠎› (U+180E, MONGOLIAN VOWEL SEPARATOR) should not be converted to a line break. An example is at тарвага (tarvaga); ᠲᠠᠷᠪᠠᠭ᠎ᠠ is currently displayed as ᠲᠠᠷᠪᠠᠭ᠎ᠠ (tarbaɣ-a). Wyang (talk) 12:40, 8 May 2017 (UTC)
@Wyang: Ahh, okay. I've now made it so that the function only converts plain spaces (U+0020), not other spacing characters. — Eru·tuon 12:52, 8 May 2017 (UTC)
Very nice! Other ideas:
In ᠭᠠᠩᠰᠠ, the first instance of "ᠭᠠᠩᠰᠠ", above the language header displays horizontally, while the headword line displays vertically. I know we can italicize particular pages' titles (see w:Template:Italic title), can we cause these pages' titles to display vertically?
Also at ᠭᠠᠩᠰᠠ, it might be a more economical use of space to display the usexes side-by-side (perhaps in box elements?) instead of one after the other; can/should this be done? Ideally, the boxes would "break" onto new lines/rows based on screen width, or at least the current display would be preserved on the mobile version of the site.
Alternatively, perhaps {{usex}} (or a script-specific template) could take usexes of Mongolian-script and either link each individual word with a "black link" (a link that looks like regular text, as used in e.g. some inflection tables), or else somehow subject them to Erutuon's excellent line-breaking feature without linking anything, to reduce the amount of vertical space the usexes take up.
- -sche (discuss) 06:55, 8 May 2017 (UTC)
@-sche: I think the code that makes Mongolian script display vertically is found in MediaWiki:Common.css:
.Mong {
	...
	-webkit-writing-mode: vertical-lr;
	-moz-writing-mode: vertical-lr;
	writing-mode: vertical-lr;
	layout-flow: vertical-ideographic;
}
As the class Mong isn't applied to the heading at the top of the entry ᠭᠠᠩᠰᠠ (ɣangsa), the heading displays horizontally instead of vertically. {{DISPLAYTITLE:}} or a JavaScript function can be used to apply the class to the article title (see this edit, for instance). JavaScript would require less editing than {{DISPLAYTITLE:}}, but {{DISPLAYTITLE:}} would work even for readers who don't have JavaScript enabled. — Eru·tuon 08:33, 8 May 2017 (UTC)
Moving the newline-adding code to Module:script utilities would make it apply to usage examples as well as links. I should probably do that. — Eru·tuon 08:36, 8 May 2017 (UTC)
It seems to be possible to obtain verticalization even by transcluding a template that includes {{DISPLAYTITLE:{{lang|mn|{{PAGENAME}}|sc=Mong}}}}. Would it be feasible and desirable to edit the headword-line templates of languages that use Mongolian script to use such an approach (perhaps different templates could be used on Mongolian- vs Cyrillic- pages, or one template could be smart enough to only apply DISPLAYTITLE on Mongolian-script pages), so that rather than separately spelling out {{DISPLAYTITLE:}} on every page, merely adding a standard headword-line template would verticalize the page title? - -sche (discuss) 09:01, 8 May 2017 (UTC)
Hmm, that's an interesting idea. I wonder if a Lua module can use the {{DISPLAYTITLE:}} magic word (or if it can correctly expand a template that contains that magic word). If it can, the headword module could certainly do what you suggest. That would also be nice, though perhaps less important, in entries using other scripts. For instance, it would be great if Ancient Greek entries could apply the class polytonic to the title, though in some cases the Modern Greek entry would be found on the same page, and Modern Greek uses the class Grek instead. — Eru·tuon 09:58, 8 May 2017 (UTC)
It works with Lua, tested in sandbox: Special:Permalink/42835570. --Vriullop (talk) 11:37, 8 May 2017 (UTC)
For line breaks after each word, it works by adding display:table-caption to the class Mong: Special:Permalink/42835630. --Vriullop (talk) 12:07, 8 May 2017 (UTC)
@Vriullop: I copied your Lua code to Module:User:Erutuon/Mongolian and am testing it at User:Erutuon/ᠭᠠᠩᠰᠠ, but it doesn't seem to be working. In preview mode, it gives the message Warning: Display title "<span class="Mong">Erutuon/ᠭᠠᠩᠰᠠ</span>" was ignored since it is not equivalent to the page's actual title. Rather odd. — Eru·tuon 12:33, 8 May 2017 (UTC)
@Erutuon: It misses the namespace, "Erutuon/ᠭᠠᠩᠰᠠ" is not equivalent to "User:Erutuon/ᠭᠠᠩᠰᠠ". It works on page ᠭᠠᠩᠰᠠ. --Vriullop (talk) 12:42, 8 May 2017 (UTC)
@Vriullop: I see! I changed my code to include the namespace, and now it works. — Eru·tuon 12:55, 8 May 2017 (UTC)
@Vriullop: I find display: table-caption; causes the quotations in ᠭᠠᠩᠰᠠ (ɣangsa) to overlap, and the headword to overlap the bullet that is after it. See the screenshot.
[Screenshot: the entry ᠭᠠᠩᠰᠠ on the English Wiktionary, with display: table-caption; applied to the class Mong (for Mongolian script) in my common.css.]
Eru·tuon 13:04, 8 May 2017 (UTC)

Thanks to @Vriullop, I tried again and figured out how to make the display title thing work. So the top header in ᠭᠠᠩᠰᠠ (ɣangsa) and other Mongolian-script entries will now have the class Mong and will display vertically. The same can perhaps be done with other scripts. — Eru·tuon 13:26, 8 May 2017 (UTC)
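
The namespace pitfall from the exchange above can be sketched like this. The function names are hypothetical, the equivalence check is a deliberately simplified stand-in for whatever MediaWiki does internally, and placing the namespace outside the span is just one possible choice.

```javascript
// Hypothetical sketch: build the wikitext for a script-tagged display title.
// The namespace prefix is kept, because a display title whose visible text
// differs from the page's full title is ignored.
function buildDisplayTitle(namespace, pageName, scriptClass) {
  const prefix = namespace ? namespace + ":" : "";
  return "{{DISPLAYTITLE:" + prefix +
    '<span class="' + scriptClass + '">' + pageName + "</span>}}";
}

// Simplified model of the equivalence check: strip the tags and compare the
// remaining text with the full page title.
function matchesPageTitle(displayTitleHtml, fullTitle) {
  return displayTitleHtml.replace(/<[^>]*>/g, "") === fullTitle;
}
```

Under this model, a span containing only "Erutuon/ᠭᠠᠩᠰᠠ" does not match the page "User:Erutuon/ᠭᠠᠩᠰᠠ", while the same span with "User:" in front does.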

I moved the newline-adding code to Module:script utilities, so now it is applied to the usage examples in ᠭᠠᠩᠰᠠ (ɣangsa). — Eru·tuon 13:45, 8 May 2017 (UTC)

Very cool!
I notice that the entries in Category:Manchu lemmas are vertical (and remain so even when I add testentry to the category, which itself becomes vertical), whereas Mongolian-script entries in Category:Mongolian adjectives are horizontal (probably so that the more numerous Cyrillic entries display correctly?). I suppose there's no way around that.
- -sche (discuss) 17:59, 8 May 2017 (UTC)
What does this newline conversion code do when it encounters spaces in code, like say in a HTML tag or Wikimarkup? —CodeCat 18:05, 8 May 2017 (UTC)
Well, {{usex|cmg|ᠮᠥ<a href{{=}}"https://en.wiktionary.org/wiki/ice">ᠰ</a>ᠦᠨ|ice}} results in the "<a href="https://en.wiktionary.org/wiki/ice">" being broken at the space and displayed as text (see here), but under what circumstances would such a thing occur validly? I had to use {{=}} just to make that "work" (i.e. not just resolve to a Lua "parameter not used" error). (If there are valid usecases, perhaps we could add a switch by which users could suppress space-to-newline conversion in individual instances?) If the concern is that it might sometimes be necessary to call a template that includes a space in its name inside a {{usex}} or the like (as in this contrived example), that could be solved by having a redirect to the template from an unspaced name. - -sche (discuss) 19:54, 8 May 2017 (UTC)
The code I've written assumes that the text being script- and language-tagged only contains wikilinks, nothing else. I'm not satisfied with it, because it's overly verbose. And it should be able to find and ignore anything that should not be transformed (HTML tags, the target part of wikilinks) and then just replace spaces in the rest. That would be much simpler. I'm trying to figure out how to do that. — Eru·tuon 22:17, 8 May 2017 (UTC)
Done. Anything that should not have its spaces transformed (currently, wikilink targets and HTML tags) is escaped before the replacement, then unescaped. — Eru·tuon 22:50, 8 May 2017 (UTC)
This is very cool. Thank you! Wyang (talk) 01:31, 10 May 2017 (UTC)
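
For anyone curious, the escape, replace, unescape approach described above can be sketched as follows. This is a JavaScript analogy, not the Lua code in Module:script utilities; the function name, the private-use placeholder character, and the decision to give unpiped links explicit display text are all assumptions of the sketch.

```javascript
// Hypothetical sketch: turn plain spaces (U+0020) into <br> tags while leaving
// HTML tags and wikilink targets untouched. Other spacing characters, such as
// U+180E MONGOLIAN VOWEL SEPARATOR, are deliberately not matched.
function spacesToLineBreaks(text) {
  // Give unpiped links an explicit display part so only that part changes.
  const piped = text.replace(/\[\[([^\]|]*)\]\]/g, "[[$1|$1]]");
  const protectedSpans = [];
  // Step 1: swap each HTML tag and each wikilink target for a placeholder.
  // \uE000 is a private-use character assumed not to occur in the input.
  const escaped = piped.replace(/<[^>]*>|\[\[[^\]|]*/g, span => {
    protectedSpans.push(span);
    return "\uE000" + (protectedSpans.length - 1) + "\uE000";
  });
  // Step 2: replace only plain spaces in what remains.
  const broken = escaped.replace(/ /g, "<br>");
  // Step 3: restore the protected spans.
  return broken.replace(/\uE000(\d+)\uE000/g, (_, i) => protectedSpans[Number(i)]);
}
```

The placeholder trick keeps the replacement itself trivial: anything that must not change is simply out of reach while the spaces are rewritten.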

Category interwiki linking

It's good that our articles nowadays automatically get interwiki (inter-language) links. But our semantic categories still need manual linking (right?), the way it used to be in Wikipedia. So, once I found out that Category:Hormones should link to ru:Категория:Гормоны and sv:Kategori:Hormoner, is there some tool (on the WMF toolserver) that helps me in interwiki linking all the subcategories such as Category:en:Hormones to their counterparts in the other languages? --LA2 (talk) 21:25, 7 May 2017 (UTC)

That's supposedly the next step that will happen at Wikidata. —Μετάknowledgediscuss/deeds 21:32, 7 May 2017 (UTC)

{{desctree}} changes

I'm working on some changes to {{desctree}} in {{desctree2}}, which I outlined here. I'm trying to logic out how to deal with the alternative forms. I could employ some method using |altN=, but I would prefer it if I could use * {{desctree|goh|apful}} {{l|goh|aphul}}, ... and have the imported list of descendants start on the next line. Is there some way to do this in Lua, like table.insert(arr, currentline+1)? --Victar (talk) 04:37, 8 May 2017 (UTC)

@Victar: I don't think that's possible. A Lua-based template can only insert content at the position where you put it in the text. — Eru·tuon 04:39, 8 May 2017 (UTC)
@Erutuon: Drat. What about getting the current line, surrounding the template? --Victar (talk) 04:59, 8 May 2017 (UTC)
@Victar: The module might be able to grab the content from the {{l}} after the {{desctree}}, but I'm not aware of any way in which it could delete the content of that {{l}}. So, you would still have the alternative form aphul repeated after the descendant tree. — Eru·tuon 05:54, 8 May 2017 (UTC)

Apply a font to cuneiform page titles

Erutuon, with some help from Vriullop and me, excellently found a way to apply a class to the page titles of Mongolian-script pages so that they display correctly. Can we similarly apply a class to the pagetitles of cuneiform pages, so that they are rendered in the same font as they are when they are linked (𒋼), rather than the default font they are currently rendered in, which displays them as boxes (for me)? More generally, we could perhaps apply classes and hence relevant fonts to many non-Latin scripts which might otherwise display as boxes. - -sche (discuss) 19:39, 8 May 2017 (UTC)

You can add script-tagging to the title for any scripts you want, by adding the script code to the to_be_tagged table near the top of Module:headword. — Eru·tuon 01:27, 10 May 2017 (UTC)
(I've moved the list of scripts to Module:headword/data.) — Eru·tuon 01:33, 10 May 2017 (UTC)
Huh, all the Cuneiform titles show up for me without tofu. —Aryamanarora (मुझसे बात करो) 19:45, 10 May 2017 (UTC)
You mean they do now, or they always did? I think that whether or not they display without script-tagging depends to some extent on whether a user already has a font installed that can display them, and on how recent their operating system and browser are. Script-tagging them helps them display for more people. - -sche (discuss) 00:46, 11 May 2017 (UTC)
How "expensive" is this script-tagging of page titles? Do we know? If it's "inexpensive", perhaps tagging most non-Latin scripts would be appropriate. - -sche (discuss) 00:46, 11 May 2017 (UTC)
I doubt it's very expensive. The script code is already present in Module:headword, and it is quite quick to look it up in the table of to-be-tagged scripts. — Eru·tuon 01:22, 11 May 2017 (UTC)
Something to consider: the software disallows overriding the display title more than once. I believe the second attempt results in an error. That means that any entry with more than one headword line will break. —CodeCat 00:59, 11 May 2017 (UTC)
I don't think it does cause an error. I have added tagging for polytonic, and that would probably make the display title function be called twice in a page such as ὅς (hós), yet there's no error there. — Eru·tuon 01:25, 11 May 2017 (UTC)
Yes, 𒈗 has a number of language sections, each with their own headword line, and also does not seem to suffer an error. - -sche (discuss) 01:28, 11 May 2017 (UTC)
However, if I tried to add tagging for Grek along with polytonic, on a page like λόγος (lógos) that has an entry for Modern Greek as well as Ancient, I suspect the <span class="Grek"></span> added by the Greek headword template would override the <span class="polytonic"></span> added by the Ancient Greek one, because the alphabetization makes it be the last on the page. It seems the last display title on the page is what is actually used. I tested this by adding {{DISPLAYTITLE:ὅς}} on ὅς (hós): if I place this DISPLAYTITLE parser function at the top, it's overridden by the polytonic-class display title added by the headword template; if I place it at the bottom, it overrides the headword template. So, the last DISPLAYTITLE on the page is the one that wins out, but there is no error. — Eru·tuon 01:33, 11 May 2017 (UTC)
Oops! There is an error, at the bottom of the page, in preview mode at least: Warning: Display title "ὅς" overrides earlier display title "<span class="polytonic">ὅς</span>". When the display titles added by subsequent headwords are identical, there seems to be no such message. — Eru·tuon 01:43, 11 May 2017 (UTC)
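
The behaviour observed in this thread (the last display title wins, and a warning appears only when it differs from an earlier one) can be modelled in a few lines. This is a toy model built from the observations above, not MediaWiki's actual implementation, and the function name is made up.

```javascript
// Hypothetical model: the last display title on the page wins, and a warning
// is produced only when a title differs from the one set before it.
function resolveDisplayTitles(titlesInPageOrder) {
  let current = null;
  const warnings = [];
  for (const title of titlesInPageOrder) {
    if (current !== null && title !== current) {
      warnings.push('Display title "' + title +
        '" overrides earlier display title "' + current + '".');
    }
    current = title;
  }
  return { displayTitle: current, warnings };
}
```

Several headword lines that all emit the same tagged title therefore produce no warning, which matches what was seen on pages with multiple language sections.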

template:fr-noun

This template continually miscategorizes ‘mf’ nouns as having no plural, which is patently false. Please fix this, it’s very annoying. — (((Romanophile))) (contributions) 02:38, 10 May 2017 (UTC)

@Romanophile: I think what "missing plurals" means is that there is currently no entry for the plural. Module:fr-headword, which is used by {{fr-noun}}, checks to see if the entry title for each plural form exists. Feminine nouns without entries for their plurals are also put in this category: see accessoirisation. — Eru·tuon 02:58, 10 May 2017 (UTC)
Indeed. And the only reason you see it is because you've opted to see hidden categories in your prefs. —Μετάknowledgediscuss/deeds 04:42, 10 May 2017 (UTC)
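
To make the logic concrete: the category reflects missing entries, not missing plurals. A minimal sketch of such a check follows. The category name, the function name, and the entryExists callback are assumptions for illustration; the real check is done in Lua by Module:fr-headword.

```javascript
// Hypothetical sketch of the check described above: a noun lands in a
// maintenance category when no entry exists yet for one of its plural forms.
// `entryExists` is an assumed lookup supplied by the caller.
function missingPluralCategories(langName, plurals, entryExists) {
  const hasMissing = plurals.some(form => !entryExists(form));
  return hasMissing ? [langName + " nouns with missing plurals"] : [];
}
```

Gender plays no part in this check, so an mf noun is categorized exactly like any other noun whose plural has no entry yet.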

spelling templates

The current spelling templates produce confusing and misleading entries, at least the one for American spelling. Now almost all users interpret demonize, for example, as saying that its meanings are American, not its spelling (which isn't even true). --Espoo (talk) 07:03, 10 May 2017 (UTC)

I'd thought the spelling e instead of ae might not be in use in UK English, but it seems we have editors who don't even know the basics of UK spelling and think that -ize is not also UK. Is there a bot that can find and fix this kind of nonsense? --Espoo (talk) 07:13, 10 May 2017 (UTC)

RegEx flags in addToToolbar

Hey, how can I add regex flags, like global, to my addToToolbar function? Thanks for any help! --Victar (talk) 17:40, 10 May 2017 (UTC)

That's done by putting g after the slashes of the regex: /\b.*\:\s*\{\{l\|(.*)\|(.*)\b\}\}/g. — Eru·tuon 19:11, 10 May 2017 (UTC)
So weird! It wasn't working for me before. Thanks. --Victar (talk) 19:41, 10 May 2017 (UTC)
Does it work? I thought the curly braces had to be escaped ({} → \{\}), as they're used in quantifiers. — Eru·tuon 19:49, 10 May 2017 (UTC)
Sorta! =P Now that global is working, I'll try to make it better. --Victar (talk) 19:57, 10 May 2017 (UTC)
@Erutuon: There we go! Thanks again. Feel free to steal. --Victar (talk) 20:27, 10 May 2017 (UTC)

@Erutuon I'm trying to use callback instead, but I can't seem to get it to work. Could you take a peek and see what I'm doing wrong? --Victar (talk) 02:35, 11 May 2017 (UTC)

@Victar: I'm sort of a newbie with JavaScript, but perhaps it's not working because you only escaped the first curly bracket in each group? Each one should be individually escaped. — Eru·tuon 03:02, 11 May 2017 (UTC)
Ohh, the regex shouldn't be enclosed in quotes either. /\b.*\:\s*\{\{l\|([^\}]*)\}\}/g Eru·tuon 03:03, 11 May 2017 (UTC)
Good eye, but still no luck. --Victar (talk) 03:10, 11 May 2017 (UTC)
I copied your code to my common.js and correctly escaped the brackets, but it still doesn't add anything to the toolbar while I'm in editing mode. — Eru·tuon 03:33, 11 May 2017 (UTC)
You might have better luck trying to use TemplateScript. That's what I use to create my customized regex tools. — Eru·tuon 03:34, 11 May 2017 (UTC)
Oh!! I see it's being added to the edit toolbar. Duh! But it's not working... — Eru·tuon 03:41, 11 May 2017 (UTC)
The instructions for adding your own toolbar button that uses a JavaScript function leave me at sea as to just how the callable function is supposed to work. — Eru·tuon 03:54, 11 May 2017 (UTC)
@Erutuon: Yeah, the documentation really sucks. Thanks for looking into it. I just reverted it back to my original version so I can at least use it for now. --Victar (talk) 04:17, 11 May 2017 (UTC)
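
The regex points raised in this thread (the g flag after the closing slash, individually escaped curly brackets, a regex literal rather than a quoted string) can be tried outside the toolbar. The sketch below uses the pattern quoted above; the helper function names and the sample wikitext are made up, and none of this touches addToToolbar itself.

```javascript
// The pattern from the thread, written as a regex literal (no quotes) with
// every curly bracket escaped and the g (global) flag after the closing slash.
const linkLine = /\b.*\:\s*\{\{l\|([^\}]*)\}\}/g;

// Collect the arguments of every {{l|...}} that follows a "label:" on a line.
// matchAll() requires the g flag and returns one match per line here.
function extractLinkArgs(wikitext) {
  return Array.from(wikitext.matchAll(linkLine), match => match[1]);
}

// Without the g flag, replace() stops after the first match.
function replaceFirstOnly(wikitext) {
  return wikitext.replace(/\{\{l\|/, "{{m|");
}

// With the g flag, every match is replaced.
function replaceEvery(wikitext) {
  return wikitext.replace(/\{\{l\|/g, "{{m|");
}
```

Running replaceFirstOnly and replaceEvery on the same text shows the difference the flag makes.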

Show-and-hide templates not working properly

The templates in the "derived terms" and "translations" sections of our entries ({{trans-}} and {{rel-}}) are not working anymore. They normally have a button to show and hide the content inside them, but this button apparently disappeared, making the templates nonfunctional. I've tried different browsers and computers and still get the same problem. What is the matter? - Alumnum (talk) 04:06, 11 May 2017 (UTC)

Whatever this is also breaks tabbed languages. DTLHS (talk) 04:12, 11 May 2017 (UTC)
Additionally with Chrome (but not Netscape or Edge) I have a long blank space between the bottom line and the category section - but only when there is a pull down table (like inflections) present. — Saltmarsh. 04:46, 11 May 2017 (UTC)
It seems to work when I log out. DTLHS (talk) 05:16, 11 May 2017 (UTC)
Really? Not for me. What browser and OS? --Espoo (talk) 08:09, 11 May 2017 (UTC)
I also can't see the quotations I just added when creating the entry κᾰτήγορος (katḗgoros). They just entirely vanish. (Another thread on a similar topic: Wiktionary:Beer parlour/2017/May § cannot open translation boxes.) — Eru·tuon 05:19, 11 May 2017 (UTC)
Using Chrome, logging out makes the long blank space (see above) disappear, but the show/hide problem is still there — Saltmarsh. 05:22, 11 May 2017 (UTC)
I just added a quote to Kennebec, the only way to display it is by taking out * DonnanZ (talk) 14:20, 11 May 2017 (UTC)

I have filed https://phabricator.wikimedia.org/T165015. - -sche (discuss) 08:24, 11 May 2017 (UTC)

Compare the earlier issue described at Wiktionary:Grease pit/2017/April#Show_translations. - -sche (discuss) 08:29, 11 May 2017 (UTC)
Yup, all quotations are no longer visible, and the "Show translations" and "Show quotations" links on the left side of the screen are gone. Quiet Quentin also no longer appears. — SMUconlaw (talk) 08:48, 11 May 2017 (UTC)
Today I had the same problem on ca.wikt. My web console says, among other weird stuff: 'Gadget "LegacyScripts" styles loaded twice. Migrate to type=general. See <https://www.mediawiki.org/wiki/RL/MGU#Gadget_type>.' Fixing it on gadgets definition has solved the issue. --Vriullop (talk) 08:54, 11 May 2017 (UTC)
Any administrator around? I am pretty sure all the problems reported today can be solved by fixing MediaWiki:Gadgets-definition. Add type=general to the ResourceLoader definition of the gadgets that load both JS and CSS: LegacyScripts, TabbedLanguages, DocTabs, DefSideBoxes, aWa, and QQ, as explained at mw:ResourceLoader/Migration_guide_(users)#Gadget_type. --Vriullop (talk) 14:07, 11 May 2017 (UTC)
@Vriullop I have added type=general to the scripts. Tabbed languages appears to still be broken. DTLHS (talk) 15:18, 11 May 2017 (UTC)
But collapsible tables are working again, so that's good. —Aɴɢʀ (talk) 15:20, 11 May 2017 (UTC)
Not loading LegacyScripts was provoking a cascade of errors in unexpected places. Once fixed, the problem with TabbedLanguages is beyond my understanding, but obviously they are related and provoked by the latest version of MediaWiki, installed last night (UTC). --Vriullop (talk) 15:57, 11 May 2017 (UTC)
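
The check suggested above (find gadgets that load both a script and a stylesheet but lack type=general) can be sketched in Python. This is only an illustration: it assumes each gadget line in MediaWiki:Gadgets-definition has the shape "* Name[options]|page|page", and the page names in the usage note are made up.

```python
import re

# Assumed shape of one gadget line in MediaWiki:Gadgets-definition:
#   * Name[option|option|...]|page1.js|page2.css
GADGET_LINE = re.compile(r"^\*\s*([^\[\|]+?)\s*(?:\[([^\]]*)\])?\s*\|(.*)$")


def gadgets_missing_type_general(definition_text):
    """Return names of gadgets that list both .js and .css pages
    but do not declare type=general among their options."""
    missing = []
    for line in definition_text.splitlines():
        match = GADGET_LINE.match(line.strip())
        if not match:
            continue  # section headings, blank lines, other text
        name, options, pages = match.groups()
        option_list = [o.strip() for o in (options or "").split("|")]
        page_list = [p.strip() for p in pages.split("|")]
        has_js = any(p.endswith(".js") for p in page_list)
        has_css = any(p.endswith(".css") for p in page_list)
        if has_js and has_css and "type=general" not in option_list:
            missing.append(name)
    return missing
```

Run over a copy of the definition page, this would list the gadgets that still need the option; a gadget with only a .js page is not reported.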

Help, declension tables no longer are openable

I think this applies to all collapsible tables. I see it e.g. on the page любимица, where the declension table is missing the button to open it. User:JohnC5 noted something similar in the April Grease Pit. I don't have any JavaScript, and I don't even know what skin I have. (How does one switch skins?) I tried this on Chrome and Safari (on Mac OS 10.9.5), same thing. It even happens when I log out. Any ideas? Benwing2 (talk) 05:58, 11 May 2017 (UTC)

OK, I think the above thing about show-and-hide templates is the same. Benwing2 (talk) 06:01, 11 May 2017 (UTC)
Can someone file an urgent phabricator bug about this? Maybe User:Daniel Carrero, who knows how to do this? Benwing2 (talk) 06:02, 11 May 2017 (UTC)
Is the JavaScript function that handles this collapsible content a local Wiktionary thing or a global MediaWiki feature? — Eru·tuon 06:06, 11 May 2017 (UTC)
So, I believe all the code for this is housed at MediaWiki:Gadget-legacy.js, but nothing has changed recently to cause these problems. —JohnC5 07:38, 11 May 2017 (UTC)
I have filed https://phabricator.wikimedia.org/T165015. Let me know if I should add anything to the report, or comment on it yourself! - -sche (discuss) 08:24, 11 May 2017 (UTC)

translation drop-down function broken?

Is it my phablet or is there a bigger problem? I restarted the phablet, but nothing happens when I click on a translation box in Android Chrome and Firefox. It worked yesterday. --Espoo (talk) 07:40, 11 May 2017 (UTC)

Same problem here, I can't get it to work with any browser and it worked fine yesterday. --Bricklayer (talk) 07:52, 11 May 2017 (UTC)
Just discovered that all other drop-down boxes are broken too, e.g. conjugations. Also discovered it's only broken in desktop view, not mobile -- both on phablet. Will now test on laptop. --Espoo (talk) 07:58, 11 May 2017 (UTC)

I have filed https://phabricator.wikimedia.org/T165015. - -sche (discuss) 08:24, 11 May 2017 (UTC)

Translation js drop downs stopped working

Checked with 3 browsers. Arrows do not appear, nor selected language translations. When inspecting pages I can see:

Gadget "LegacyScripts" styles loaded twice. Migrate to type=general.
Gadget "DocTabs" styles loaded twice. Migrate to type=general. <https://www.mediawiki.org/wiki/RL/MGU#Gadget_type>.
VM60:243 This page is using the deprecated ResourceLoader module "es5-shim". Use of the "es5-shim" module is deprecated since MediaWiki 1.29.0
--Spiros71 (talk) 09:38, 11 May 2017 (UTC)

Thank you, this is informative. Vriullop describes the same message on ca.Wikt, and a fix which could be tried. I wonder what caused the gadgets to break / to (apparently) need |type=general to be declared... - -sche (discuss) 09:51, 11 May 2017 (UTC)

NEC is not working either. DonnanZ (talk) 10:10, 11 May 2017 (UTC)

Tabbed languages

Tabbed languages also broke, which makes Wiktionary completely unnavigable. —CodeCat 12:36, 11 May 2017 (UTC)

I've turned my tabbed languages off, which helps. —Aɴɢʀ (talk) 14:13, 11 May 2017 (UTC)
Apparently there is a general issue where "Gadgets that use both scripts and styles, but do not specify type=general, are never loaded (JS file not loaded but CSS file is)" (per a report on Phabricator). But type=general appears to have been set for TabbedLanguages in this diff; does the problem persist? - -sche (discuss) 19:02, 11 May 2017 (UTC)
PS, Aklapper mentions on Phabricator that "on an unrelated note, https://en.wiktionary.org/wiki/MediaWiki:Gadget-purgetab.js is broken (link is undefined)". - -sche (discuss) 19:06, 11 May 2017 (UTC)
This was one of the collateral effects of LegacyScripts not loading. Once fixed, this error message does not appear any more and purgetab is working fine. But the problem with TabbedLanguages persists. --Vriullop (talk) 20:21, 11 May 2017 (UTC)

CSS classes for transliterations

It would be beneficial to have CSS classes for transliterations. Many transliteration systems use characters that will not display well in all fonts. With predictable classes, fonts could be specified in MediaWiki:Common.css to ensure they display well. Or users can set whatever fonts they like in their common.css. I've wanted to do the latter. Or users can write JavaScript functions that convert one transliteration system to another, allowing people to see whatever transliteration system they prefer.

Currently, all transliterations are tagged with class="tr", if generated by {{l}}, or with class="tr mention-tr", if generated by {{m}}. So, there is no way to distinguish between transliterations of different languages when using JavaScript, and absolutely no way when using CSS. They are all identical.

I was thinking the class names could be tr- plus the language code. Then, the transliteration class for Ancient Greek (language code grc) would be tr-grc. That would be following the tradition of using |tr= for transliteration, and of using the class tr to mark transliterations in general. Or perhaps translit- plus the language code could be used instead (translit-grc). Or the order could be reversed (grc-tr or grc-translit). This reversed order would resemble a language code combined with script code (fa-Arab), though translit isn't a script. Not sure which of these ideas is best.

These class names would be added on to the default ones: thus, the class for an Ancient Greek transliteration in the {{m}} template would be class="tr mention-tr tr-grc". So, no existing functionality would be broken.

This idea is perhaps not absolutely necessary, but it would allow for easier customization of transliterations with CSS and JavaScript. — Eru·tuon 05:49, 11 May 2017 (UTC)
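
To make the naming scheme concrete, here is a small Python sketch of how the class attribute could be assembled. The function name and its parameters are hypothetical, and the real templates are Lua-based; this only illustrates the proposed tr- plus language code pattern added on top of the existing classes.

```python
from html import escape


def transliteration_span(text, lang_code, mention=False):
    """Build a <span> for a transliteration, keeping the existing
    classes and appending the proposed per-language class."""
    classes = ["tr"]
    if mention:
        classes.append("mention-tr")  # existing class used by {{m}}
    classes.append("tr-" + lang_code)  # proposed: tr- plus language code
    return '<span class="{}">{}</span>'.format(" ".join(classes), escape(text))
```

For example, transliteration_span("katḗgoros", "grc", mention=True) yields a span with class="tr mention-tr tr-grc", which a stylesheet could then target with the selector .tr-grc.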

I don't think this is necessary, as it is possible with CSS selectors to style by language (e.g. .tr:lang(grc)). This was added in CSS2 and has wide support. – Krun (talk) 11:28, 11 May 2017 (UTC)
@Krun: But at the moment there's no language attribute (lang="grc") added to a language's transliteration, so that selector wouldn't work. For instance, in the headword of ἀρχιμανδρῑ́της (arkhimandrī́tēs), the transliteration is coded as follows: <span class="tr" lang="" xml:lang=""><span class="tr" lang="" dir="ltr" xml:lang="">arkhimandrī́tēs</span></span>. This can only be selected with .tr or :dir(ltr). Perhaps Module:links (which currently handles transliterations) should add the language attribute to the transliteration. This would be fine since MediaWiki:Common.css currently doesn't use language codes alone to set fonts (if it did, adding language attributes to transliteration might cause the wrong fonts to be applied to transliteration). — Eru·tuon 16:43, 11 May 2017 (UTC)
On the other hand, it might cause problems for screen readers, which would try to read the transliteration because it was tagged with a language code, and would likely fail because they would only be able to read the language's native script. Hence, separate classes for each language's transliteration might be a better idea. — Eru·tuon 16:45, 11 May 2017 (UTC)
@Erutuon: Ah, I didn't realize the language code was missing. The transliteration should definitely be tagged with the relevant language code (in this case grc), as it is text in that language, whatever the script. It's definitely a problem if it isn't, because in that case it inherits the language designation of a parent/ancestor tag or the entire page, namely en, and would e.g. be read by a screen reader as text in English! When the script is different, as in this case, that can and should be indicated in the HTML (lang="grc-Latn"). – Krun (talk) 10:17, 12 May 2017 (UTC)
@Krun: Hmm. lang="grc-Latn" sounds reasonable, but what about class="(tr mention-tr )Latn" lang="grc"? Putting in Latn as a class would be more consistent with the way we usually handle scripts. — Eru·tuon 01:20, 14 May 2017 (UTC)
I've gone and done the script class and language attribute solution. Now a language's transliteration can be selected with, for instance, :lang(grc).Latn. — Eru·tuon 02:03, 14 May 2017 (UTC)

What annoys me about adding language attributes to transliterations is that my browser (Chrome) adds undesirable styling to them: for instance, full-width font to transliterations of Japanese. That is what I hope to avoid by naming the classes tr-language code. — Eru·tuon 01:52, 16 May 2017 (UTC)

I agree we have to figure out how to fix that. --WikiTiki89 15:03, 16 May 2017 (UTC)
This is also happening to me in Safari. The transliteration for Japanese, e.g. is being displayed in Hiragino rather than the basic site-wide font. I tried changing the language code on the page (using the browser inspector) to ja-Latn and that worked (i.e. that makes it use the site-wide font). I really think it's more appropriate to append the script tag in this way in every case where we are using something other than the regular script for the language, and, in any case, it seems to work styling-wise. – Krun (talk) 16:03, 17 May 2017 (UTC)
@Krun: Changes to transliteration tagging can now be done quickly and easily by editing code in Module:script utilities or Module:script utilities/data. I think I will make the change you suggest; it's the most immediate solution to the problem. — Eru·tuon 22:55, 17 May 2017 (UTC)

I've added script classes and language attributes to the transliterations generated by Module:headword, Module:usex, and Module:ru-adjective as well as Module:links. It would be an improvement if a single function tagged transliterations in all modules. — Eru·tuon 02:14, 16 May 2017 (UTC)

@Justinrleung mentioned on Module talk:links § Script tags for transliterations that for him, lang="language code-Latn" still applies incorrect fonts to the transliteration. It seems his browser (Firefox) ignores the -Latn part and applies fonts based on the language code. This could be solved by replacing the language attribute with the class I suggested above: class="tr-language code". However, I tried using Firefox and didn't see the same problem. Not sure why that would be. — Eru·tuon 05:57, 19 May 2017 (UTC)

Automatic sorting

Greetings. Is there any way to sort lists automatically in the wiki? It would be useful, as it is painstaking to sort long lists by hand (let's say, without a text editor). Perhaps something like this could be enabled? https://www.mediawiki.org/wiki/Extension:Sort2 Another option I can think of could be the table-sorting JavaScript ("wikitable sortable") https://en.wikipedia.org/wiki/Help:Sorting Anyone care to find the best way and implement it (in case we have nothing)? --Hartz (talk) 06:03, 11 May 2017 (UTC)

It is possible to make a list auto-sorted by Lua modules. However, every language is alphabetically ordered in its own way, so it must be a per-language concept. The extension just globally sorts them in one way, which might not be preferable. --Octahedron80 (talk) 11:52, 11 May 2017 (UTC)
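
The point about per-language ordering can be illustrated with a short Python sketch (Python is used here only for illustration; an on-wiki version would be a Lua module). The helper name make_sort_key is hypothetical. It builds a sort key from an explicit alphabet, so that Swedish å, ä, ö sort after z instead of by code point.

```python
def make_sort_key(alphabet):
    """Return a key function that orders words by the given alphabet.
    Characters outside the alphabet sort after all listed letters."""
    rank = {letter: i for i, letter in enumerate(alphabet)}
    fallback = len(alphabet)

    def key(word):
        return [(rank.get(ch, fallback), ch) for ch in word.lower()]

    return key


# Swedish alphabet: the letters å, ä, ö come after z, in that order.
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"
```

With this key, sorted(["öl", "zebra", "ål", "apa", "ärm"], key=make_sort_key(SWEDISH)) gives apa, zebra, ål, ärm, öl, whereas a plain code-point sort would put ärm before ål.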

New Entry Creator

Possibly allied to the problems outlined above, NEC has been disabled. DonnanZ (talk) 11:13, 11 May 2017 (UTC)

  • Fixed now. Cheers. DonnanZ (talk) 16:23, 11 May 2017 (UTC)
  • I just tried to create a new entry and the preloaded template did not come up. SpinningSpark 17:49, 11 May 2017 (UTC)
That's odd. I just used it for Westchester County. DonnanZ (talk) 17:51, 11 May 2017 (UTC)
It also works for me. Try a w:WP:REFRESH. --Vriullop (talk) 18:03, 11 May 2017 (UTC)
Possibly I'm doing something wrong. I'm typing a new word into the search box, eg ghzz. Then I click on the "create it" link in the search results. This takes me to this page, which the url claims to be the User:Yari_rand app for new creations, but the edit box is completely blank and there is no template for selecting entry parameters. Monobook, Firefox 53.0.2, Windows 7 SP1 64bit. SpinningSpark 18:10, 11 May 2017 (UTC)
Update, it's only broken in Monobook, Vector works fine. SpinningSpark 18:13, 11 May 2017 (UTC)
Looking at this further, it seems that it is something in my own monobook.js that is breaking it. Since I haven't changed anything in my personal js recently, I guess this must be an old script-pocalypse issue. SpinningSpark 21:05, 11 May 2017 (UTC)

Translation pop-ups not opening?

When I checked a word today (11 May), the translation pop-ups did not display an 'open' button. The html text of the translations is still there, if I click on 'edit', so I am assuming a technical problem of some sort. I checked a number of other words, and they are all doing the same thing. —This unsigned comment was added by 2602:302:D10C:83B0:226:8FF:FEEB:4F85 (talk) at 13:44, 11 May 2017 (UTC).

Categorise JavaScripts?

Is there a way to place site-wide JS pages in a category? I think this would be useful as half the time I can't find them. —CodeCat 19:20, 11 May 2017 (UTC)

I second that. — Eru·tuon 19:29, 11 May 2017 (UTC)
Apparently, Mediawiki pages can be categorized, because some are in Category:Wiktionary scripts. We could add others. (If Mediawiki pages couldn't be categorized, the talk pages could stand in for them.) - -sche (discuss) 21:18, 11 May 2017 (UTC)
Yeah, you apparently just add documentation pages like MediaWiki:Common.js/documentation. —JohnC5 22:12, 11 May 2017 (UTC)
I'm not able to create or edit such documentation pages, which seems kind of counterproductive. Could someone else categorise all gadgets in Category:Wiktionary gadgets? —CodeCat 12:21, 13 May 2017 (UTC)
I'm putting every *.js in the Mediawiki namespace that begins with "Gadget-" in Category:Wiktionary gadgets, and every other *.js in that namespace into its parent category, Category:Wiktionary scripts, except MediaWiki:Gadgets-phantom.js which I left uncategorized because it seems to be intentionally blank. Some pages do not appear in the categories yet due to the usual issues with caching. - -sche (discuss) 17:36, 13 May 2017 (UTC)
If this is of interest to anyone, the following pages consist exclusively or almost exclusively of importing other sites' javascript:

And the following pages consist of extremely short text for disabling various things:

- -sche (discuss) 17:36, 13 May 2017 (UTC)

Module:translit-redirect

I created Module:translit-redirect to do the work of choosing between transliteration modules, for languages written in several different scripts. It replaces modules such as Module:pa-translit and Module:khb-translit (Punjabi and Lü), which simply redirect to another module, depending on which script is being used. The other module does the actual work of taking the native script and generating a transliteration. The redirecting code in these two modules is remarkably similar. It seems simpler to centralize all this transliteration-redirecting in one module. So far, I've made Punjabi use the redirecting module, and it appears to work. — Eru·tuon 04:04, 12 May 2017 (UTC)
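
The redirecting idea can be sketched in Python as follows. This is not the code of Module:translit-redirect; it is an illustration that assumes the script is detected from Unicode block ranges and that per-script transliterators are supplied by the caller. All function names are hypothetical.

```python
# Unicode block ranges for the two scripts used in this illustration.
SCRIPT_RANGES = {
    "Guru": (0x0A00, 0x0A7F),  # Gurmukhi block
    "Arab": (0x0600, 0x06FF),  # Arabic block
}


def detect_script(text):
    """Return the script code whose block contains the most characters
    of the text, or None if no character matches any listed block."""
    counts = {code: 0 for code in SCRIPT_RANGES}
    for ch in text:
        for code, (low, high) in SCRIPT_RANGES.items():
            if low <= ord(ch) <= high:
                counts[code] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def redirect_transliteration(text, transliterators):
    """Pick the transliterator registered for the detected script and
    apply it; return None when no script or handler is found."""
    handler = transliterators.get(detect_script(text))
    return handler(text) if handler is not None else None
```

The caller passes a mapping such as {"Guru": gurmukhi_translit, "Arab": shahmukhi_translit}, so a single redirecting function can serve every language that is written in several scripts.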

Bot request: External links → Further reading

Can a bot do this, please?

  • Change all instances of the heading "External links" into "Further reading".

See this diff for an example.

This action was voted and approved at Wiktionary:Votes/2017-03/"External sources", "External links", "Further information" or "Further reading". Thanks in advance. --Daniel Carrero (talk) 09:31, 12 May 2017 (UTC)

While someone is running a bot doing this, you could also check for any instances of "Future reading". I may have made that typo from time to time when renaming the sections manually. —Aɴɢʀ (talk) 09:47, 12 May 2017 (UTC)
I just grabbed the latest dump and can work on this. TheDaveBot has started running, if anyone wants to look at the changes. And I am looking for "Future reading" as well. - TheDaveRoss 15:04, 12 May 2017 (UTC)
I finished everything in NS:0 from the latest dump, there may be a few remaining. Wasn't sure if non-NS:0 pages should be updated, but I updated a few of those before I thought better of it. - [The]DaveRoss 12:41, 16 May 2017 (UTC)
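
For anyone curious what the replacement step might look like, here is a minimal Python sketch. It is not TheDaveBot's code; it assumes headings are written as matching runs of equals signs on their own line, and it also covers the "Future reading" typo mentioned above.

```python
import re

# A heading line: matching runs of 2 to 6 equals signs around the title,
# with optional spaces or tabs that are preserved in the replacement.
HEADING = re.compile(
    r"^(={2,6})([ \t]*)(External links|Future reading)([ \t]*)\1[ \t]*$",
    re.MULTILINE,
)


def rename_headings(wikitext):
    """Rename 'External links' (and the typo 'Future reading') headings
    to 'Further reading', leaving heading level and body text untouched."""
    return HEADING.sub(
        lambda m: m.group(1) + m.group(2) + "Further reading" + m.group(4) + m.group(1),
        wikitext,
    )
```

Because the pattern is anchored to whole heading lines, the phrase "External links" inside ordinary text is left alone.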

Wrong Greek transliteration

νηῦς (nēûs) is being transliterated with a short e, but it should be a long ē. —CodeCat 14:17, 12 May 2017 (UTC)

Also, it's being transcribed as a two-syllable word, but it's a monosyllable with a long diphthong. The 5th-century BC pronunciation should be /nɛ̂ːu̯s/, not /nɛː.ŷːs/. —Aɴɢʀ (talk) 14:56, 12 May 2017 (UTC)
Pinging @Erutuon and @JohnC5 as the people most likely to be able to help. —Aɴɢʀ (talk) 14:58, 12 May 2017 (UTC)
I've fixed the transliteration. The IPA transcription may be harder to fix. — Eru·tuon 16:30, 12 May 2017 (UTC)

Hi, is there a tool that checks for loops in Wiktionary definitions?

That is, two terms each defined by the other. I was asked to write such a bot for the Hebrew Wiktionary, so I had better first check whether such a bot already exists. —This unsigned comment was added by Shavtay (talk • contribs) at 21:07, 12 May 2017 (UTC).

I am not aware of one. It sounds useful! Please let us know if you do design one. - -sche (discuss) 04:06, 13 May 2017 (UTC)

Hi, yes, I'm thinking of writing one. Do you know of any tools that I can use? I know of Dbnary [2] Shavtay (talk) 12:10, 13 May 2017 (UTC)
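
As a starting point, the core check (two terms each defined by the other) could be sketched in Python like this. It assumes the bot has already extracted, for each term, the set of terms linked from its definition lines; that extraction step and the data shape are assumptions, not an existing tool.

```python
def find_mutual_definitions(definitions):
    """Given a mapping of term -> set of terms linked in its definitions,
    return the sorted list of pairs where each term links to the other."""
    pairs = set()
    for term, linked in definitions.items():
        for other in linked:
            if other != term and term in definitions.get(other, ()):
                pairs.add(tuple(sorted((term, other))))
    return sorted(pairs)
```

Longer loops (A defined by B, B by C, C by A) would need a general cycle search over the same mapping; this sketch only covers the two-term case asked about.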

{{calque}}

{{calque|zh|mnc|ᠵᡠᡵᡠ ᡥᠣᡨᠣᠨ|t=two cities}} doesn't display the gloss on 雙城子双城子 (Shuāngchéngzǐ). Please help. Wyang (talk) 07:00, 13 May 2017 (UTC)

@Erutuon Something seems to have stopped working since March 25. —suzukaze (tc) 20:11, 13 May 2017 (UTC)
@Wyang, Suzukaze-c: Fixed. Problem related to parameter aliases. Thanks for pinging me. — Eru·tuon 22:23, 13 May 2017 (UTC)
@Erutuon Thanks! Wyang (talk) 02:12, 14 May 2017 (UTC)

MediaWiki:Gadget-WiktAssistedEditing.js

Do we still need this? It claims "This script changes timestamps such as those in comments to be relative to the local time", but all it does is import MediaWiki:Gadget-TranslationAdder.js. (Is this a reminder that the "editor" functions need to be split out of the "translation adder", and if so, can someone do that?) Also on the subject of gadgets which seem to only load other gadgetry: MediaWiki:Gadget-WiktBlockedNotice.js. - -sche (discuss) 16:51, 13 May 2017 (UTC)

MediaWiki:Monobook.js

Do we still need this .js page? It consists exclusively of a note that everything should be placed on a different page instead! See also MediaWiki:Monobook.js/sv. - -sche (discuss) 17:31, 13 May 2017 (UTC)

Category:English nouns with irregular plurals

A discussion from 2010, archived on the talk page, suggests this category should be renamed, but a bigger issue IMO is that it should be generated automatically in entries that use {{en-noun}}, rather than being added manually to such entries. (Then we have to decide what is "irregular": any plural other than "s" or "es"?) - -sche (discuss) 20:03, 13 May 2017 (UTC)

I'm ok with this. I also think we should get rid of Category:English irregular plurals. —CodeCat 20:13, 13 May 2017 (UTC)
Yes, the only cases that come to mind where a plural could be in the "irregular plural" category while the singular was not in the "nouns with irregular plurals" category are cases where the plural is too nonstandard and uncommon to give in the headword of the singular, like dogz, but I wouldn't expect casual users to notice such a distinction and I'm not convinced it'd be useful. (Plus it could be argued dogz isn't even irregular, but a regular eye-dialectal change.) So I'd be OK with getting rid of Category:English irregular plurals. - -sche (discuss) 04:08, 14 May 2017 (UTC)
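
The strictest reading of the question above ("any plural other than s or es") is easy to state as a Python sketch. The function name is hypothetical and this is not how {{en-noun}} works; it only shows what that definition would and would not catch.

```python
def is_irregular_plural(singular, plural):
    """Strict definition: a plural is regular only if it is the
    singular plus 's' or plus 'es'; anything else is irregular."""
    return plural not in (singular + "s", singular + "es")
```

Under this definition, children is irregular and dogs and boxes are regular, but cities (from city) would also count as irregular, which shows why the definition needs deciding before the category is automated.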

De-link transliterations please

Please see Lao entry ເງິນຕາ (ngœn tā). The transliteration is currently displayed as ngœn, which is wrong. --Anatoli T. (обсудить/вклад) 02:07, 14 May 2017 (UTC)

@Erutuon... —Μετάknowledgediscuss/deeds 02:09, 14 May 2017 (UTC)
@Atitarev, Metaknowledge: I was puzzled at first, because Module:headword removes links before generating a transliteration. The problem was in Module:lo-headword, which was unnecessarily generating an automated transliteration. Fixed. — Eru·tuon 03:37, 14 May 2017 (UTC)
Thanks! —Μετάknowledgediscuss/deeds 03:42, 14 May 2017 (UTC)

"Lua error: not enough memory" at man, too

Like water did before we split its translations onto a subpage (see GP, BP), man is now running out of memory.
I've null-edited the page several times and the error is consistently in the Swedish etymology section (whereas the error in water would move several sections up or down after null edits), but it does display a small amount of randomness: sometimes the memory runs out right at Proto-Germanic, sometimes it runs out at PIE.
I have a Phabricator report ready to file about water and man, but before I file it, are we still suspecting that individual tasks like instances of {{t}} not releasing memory after they finish is a cause? And what assistance would we want from the folks at Phabricator if they say they can't change their implementation of Lua? - -sche (discuss) 06:44, 14 May 2017 (UTC)

I don't have answers to your questions, but I changed {{t-simple}} so that it frees up a little memory, pushing the error to the Declension section for Volapük. Strangely, when I switched more translations to {{t-simple}}, it made the problem worse. — Eru·tuon 08:23, 14 May 2017 (UTC)
I think there may be some wasted memory in our language and script objects, though I'm not sure how to confirm that. Each language object contains one or more script objects. This probably results in duplication. For instance, when we have many languages in the translations list that use the Latn script, the Latin script object will be repeated inside each of those language objects. That duplication could account for part of the memory problem.
(I don't know about the whole memory-releasing thing, but I'm imagining there's a counter that tracks how much memory is used, whether or not it is released when the functions have done their work.) — Eru·tuon 08:34, 14 May 2017 (UTC)
After recent edits, the page consistently makes it as far as the Volapük Coordinate terms section before running out of memory. Interesting that the error on this page is so much more consistent than the error on water, and that adding t-simple to this page apparently makes it worse, apparently unlike water. - -sche (discuss) 03:28, 15 May 2017 (UTC)
I don't see the same increase; I was able to decrease the memory to 45 MB with {{t-simple}}. DTLHS (talk) 03:34, 15 May 2017 (UTC)
Quite odd. The edit with the memory increase is linked above. You added {{t-simple}} to non-Latin-script terms; do we want that? — Eru·tuon 03:39, 15 May 2017 (UTC)
If that's the only way to fix it. I imagine non-Latin script languages use a lot more memory, since they require script classes and transcription. DTLHS (talk) 03:45, 15 May 2017 (UTC)
Re "able to decrease the memory": but not by adding t-simple to the same terms. In theory, adding it to Latin-script terms should either simplify things slightly or have no effect, but instead it had a bad effect (when tried).
Adding it to non-Latin-script terms is not a good fix, since it breaks script support and transliteration. We could try simplifying t-simple further to allow script to be set manually the same way it allows langnames to be set. And, of course, we could try to simplify our modules. I am revising the bug report I intend to file on Phabricator to see if folks there have suggestions. 04:49, 15 May 2017 (UTC)
I've created a light_link function in Module:links, hopefully as barebones as possible. I tested it with {{t-simple/test}} and User:Wikitiki89/water, but it doesn't free up nearly enough Lua memory. I guess that proves the template can't be Lua-based. — Eru·tuon 07:57, 15 May 2017 (UTC)

I kind of wish there were a function to create a "cache" of language and script objects that can be used by multiple functions. It would be similar to mw.loadData, but for tables that include functions. So, then, before calling getByCode, you could check if the language or script object has already been loaded to the cache, and use that instead. Or getByCode could check for the existence of that function. Not sure if this is possible programming-wise. mw.loadData requires the tables that it loads to be static, not containing any functions. The caching function I'm envisioning would have to allow functions to be within the tables. Perhaps it would be possible for language and script objects, because all of their variables stay the same, except for the variables supplied as arguments. Or maybe not. — Eru·tuon 16:48, 19 May 2017 (UTC)


So let's write down our options. Things we can do:

  1. Ask for an increase of max memory.
    It is quite possible they would let us.
  2. Optimize every bit of our Lua.
  3. Write an extension, something along the lines of mw:Extension:Transliterator.
  4. Make some or all of the rendering JS-based.
  5. Or, similar to the above, let translation tables be Ajax-based (per-request/per-click).
  6. Use t-simple or an even simpler template while we can (the number of problematic articles is only a handful) and hope something will change.

Is there anything else? --Dixtosa (talk) 19:18, 19 May 2017 (UTC)

Also the idea above, of a new Scribunto function, similar to mw.loadData, to cache language and script objects so they're not duplicated in memory on a given page. Again, not sure if it's possible. — Eru·tuon 19:25, 19 May 2017 (UTC)
Probably asking for an increase of Lua memory would be the simplest option, and it would probably suffice for quite a while. However, it would be good to pursue another option even if we get an increase in memory, to prepare for the ideal future, in which every translation table for basic words like water would contain every attested translation in every language possible (and be as full as or fuller than the water-translations are now). It's conceivable that we increase memory now, and then our pages get longer and longer, and we end up needing a further increase of memory because we haven't done anything to optimize the memory-hogging modules. — Eru·tuon 21:01, 19 May 2017 (UTC)
Any JS-based rendering needs to degrade gracefully to non-JS. Equinox 21:08, 19 May 2017 (UTC)
I filed T165935 about this. - -sche (discuss) 04:33, 21 May 2017 (UTC)
The ticket has been closed as not something they will resolve, because it quite plausibly could be our modules which are the issue. They will also not increase the amount of memory pages can use, because "at some point in the future you'll probably wind up with an even-huger page that would exceed the new limit." They note that transliteration does seem to eat up a lot of memory. Perhaps deploying t-simple even on languages that would otherwise be transliterated — possibly simplifying t-simple to not even call modules, but have the script class (needed so the correct fonts are picked) input manually — is the way to go. Perhaps translations do not absolutely need to have transliterations? Or they could be supplied manually? - -sche (discuss) 22:01, 21 May 2017 (UTC)
I find the response somewhat frustrating. @Anomie suggests that somehow our modules are being cached, but it would help to have more explanation on how that can happen and how that could cause the problem. And he does not respond to the request for help with determining which modules or functions are using so much memory. Perhaps there is another place to submit a request for help so that they will actually respond to it, but I don't really understand the Phabricator framework enough to determine where that would be.
I think inputting manual transliterations with {{subst:xlit}} would be a good idea. Of course, the transliterations would not be updated when the module is changed, but the same will happen with the manually inputted language names. We can modify {{t-simple}} to add transliteration and gender annotations if they have been supplied as parameters.
Manually inputting script codes should be possible, if we add a findBestScript function in Module:languages/templates. — Eru·tuon 22:19, 21 May 2017 (UTC)
Wait, why would a module need to be involved at all if we input the script code manually? The template would just tag the translation as being whatever script was input, and common.css would apply the right font. (Or would that not work?) Of course, this would mean we would need to tightly control how many pages used t-simple, and periodically manually- or by-bot- check that only valid codes were being supplied. - -sche (discuss) 22:32, 21 May 2017 (UTC)
I mean, we'd use the module function once, when adding the script codes. Something like {{subst:#invoke:languages/templates|getByCode|language code|findBestScript|text}}. — Eru·tuon 22:52, 21 May 2017 (UTC)
Huh. What was I thinking? The findBestScript function is in Module:scripts/templates. — Eru·tuon 03:49, 22 May 2017 (UTC)
That would ensure that a valid script code was being supplied, and maybe we could do it "inexpensively" enough for it to be (to extend the metaphor) affordable. OTOH, we could save memory by not invoking a module at all: the template could just apply the supplied script code to the translation, like templates did before we had Lua. - -sche (discuss) 06:22, 22 May 2017 (UTC)
Okay, I guess I am being unclear. I am not proposing that the module be invoked, except when saving the page. I would just put a parameter such as |sc={{subst:#invoke:scripts/templates|findBestScript|eau|fr}} into {{t-simple}}. This would transform to |sc=Latn when the page is saved, and no template would be invoked once substing has happened. Then the script code in the wikitext will be valid unless someone changes the term or the language code on the page, or gives the script a new code in the script data module. findBestScript is the function by which {{t}} and other linking and tagging templates get script codes, so the output will be identical to the equivalent instance of {{t}}. — Eru·tuon 07:17, 22 May 2017 (UTC)

Adding ids to enable linking to headwords

I think we need ids in headwords, for form-of templates to link to.

Form-of templates currently allow an |id= parameter to link to a sense id. The need for some kind of id is clear, when there are multiple POS sections each with their own alternative forms. But using a sense id implies that an alternative form or inflected form only belongs to one sense. That's usually not true. Alternative or inflected forms generally apply to all senses inside a particular POS section. In order to use a sense id in a form-of template, you have to arbitrarily choose a sense to link to.

Ideally, we would link to the POS header itself. That's currently impossible, because sections for the same part of speech often occur multiple times in the same page. For instance, the page set has no less than 16 Noun sections, two of which are inside the English section. Adding the anchor #Noun will link to the first Noun section on the page, which is not necessarily correct. There is no way to add unique anchors to POS headers. It could be done by putting {{anchor}} within the equals signs of the header (for instance, ===Noun{{anchor|English noun 1}}===), but that's not currently allowed, and for good reason: it makes the anchor template display in the edit summary when you edit that section, and it breaks the link from the edit summary to the section.

The current solution is to link to a sense id in one of the definition lines in the desired POS section. This does not make sense for a form-of template. Forms generally apply to all or most definitions within the POS section, not to only one.

A better solution would be to link to an anchor within their headwords. We could add a |id= parameter to the headword template, and have Module:headword add id="headword id" to the first form in the headword. The form-of templates could then link to this id, instead of linking to the POS header directly above the headword.

(An alternative would be to place {{senseid}} in the same line as the headword template. That's sloppy, because it's called a sense id, not a headword id. As a sense id, it should only be used in senses [definition lines].)

To implement this, we would have to replace the existing |id= parameter in existing form-of templates with |senseid=. Then we would decide on a format for the headword ids, and allow Module:headword to create these ids, and Module:links to link to them. This may be complicated, but I am interested in making it happen. — Eru·tuon 22:07, 14 May 2017 (UTC)

Why does it have to be that complicated? Just add the id= parameter to {{head}} and that's it. —CodeCat 22:17, 14 May 2017 (UTC)
Huh? Add a parameter to headwords and do nothing to the form-of templates? — Eru·tuon 22:19, 14 May 2017 (UTC)
Yes. The form-of templates already have an id= parameter, they don't need changing. —CodeCat 22:22, 14 May 2017 (UTC)
Well, that means the headword ids have to be formatted identically to the sense ids. I didn't want to do that because it could result in a conflict, if a sense id had the same string as a headword id. — Eru·tuon 23:02, 14 May 2017 (UTC)
Ids of individual senses could also conflict with each other, but that doesn't seem to happen. I don't think it's an issue, unless I am missing something. —CodeCat 23:07, 14 May 2017 (UTC)
Hmm, you're right. So just the headword modules or templates have to be modified. — Eru·tuon 23:09, 14 May 2017 (UTC)
How are we going to name these headword ids though? We can't name them by sense or by part of speech (which isn't guaranteed to be unique). Adding a number can work, but of course the sections may be reordered later and then the number no longer matches its order on the page. This isn't necessarily an issue and highlights a strong point of ids, but if there's a better option that would be nice too. —CodeCat 23:19, 14 May 2017 (UTC)
That's a thorny question. Using a keyword from one of the most common senses would be easiest. As you say it might result in conflicts, but it might not be all that hard to avoid a conflict: an editor might be able to see both the id in the headword and any sense ids in the definitions while editing, unless the conflicting sense id was in another POS section. — Eru·tuon 00:52, 15 May 2017 (UTC)
This is why creating entirely separate ids, despite the complexity, would be better for editors. No need to worry about conflicts between headword and sense ids. — Eru·tuon 01:01, 15 May 2017 (UTC)
Like CodeCat, I think we should use the existing id= framework. Let's not make things unnecessarily complicated! We should probably add "have a bot check for duplicate ids" to the list of semi-regular tasks at WT:TODO, even to check for duplicate senseids under our already-existing system, like if someone put the senseid "pair" at a sense of the noun couple, and someone else put the same senseid at a sense of the verb couple. - -sche (discuss) 01:49, 15 May 2017 (UTC)
Okay, I'll work on making Module:headword add ids then. — Eru·tuon 02:05, 15 May 2017 (UTC)
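
The duplicate-id check suggested above could be sketched roughly as follows. This is only an illustration in Python, not bot code that exists anywhere: the function name `find_duplicate_ids` is hypothetical, and the patterns assume that sense ids are written as `{{senseid|lang|id}}` and that headword ids would be written as an `|id=` parameter on `{{head}}`, as proposed in this thread.

```python
import re
from collections import Counter

# Assumed wikitext shapes (hypothetical, following the proposal in this thread):
#   {{senseid|<lang>|<id>}} on a definition line
#   {{head|<lang>|...|id=<id>}} on a headword line
SENSEID_RE = re.compile(r"\{\{senseid\|([^|}]+)\|([^|}]+)\}\}")
HEAD_ID_RE = re.compile(r"\{\{head\|([^|}]+)\|[^{}]*?\bid=([^|}]+)")


def find_duplicate_ids(wikitext):
    """Return the sorted (lang, id) pairs that occur more than once on a page."""
    pairs = SENSEID_RE.findall(wikitext) + HEAD_ID_RE.findall(wikitext)
    counts = Counter((lang.strip(), ident.strip()) for lang, ident in pairs)
    return sorted(pair for pair, n in counts.items() if n > 1)


page = """
==English==
===Noun===
{{head|en|noun|id=pair}}
# {{senseid|en|two}} Two of the same kind.
===Verb===
{{head|en|verb}}
# {{senseid|en|pair}} To join together.
"""
print(find_duplicate_ids(page))
```

Because headword ids and sense ids would share one namespace under the agreed approach, the sketch counts both kinds together, so a headword id that collides with a sense id is reported the same way as two colliding sense ids.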

in t-check categories, "Unspecified is an invalid script"

Category:Requests for review of Lü translations and Category:Requests for review of Punjabi translations are displaying errors, apparently because the default text they create gives users an example of how to use {{t-check}} to put an entry into the category, but that example uses the Latin-script word "example" as the Lü/Punjabi translation. If there is no easier way of resolving the error, I suggest just dropping "It results in the message below:" and the displayed example after it, because it does not seem to be helpful. The wikitext to copy and paste seems to be all that a casual reader would need. - -sche (discuss) 01:43, 15 May 2017 (UTC)

Also, all the pages in the categories link to the #Lü and #Punjabi L2 (of library, etc) for some reason. - -sche (discuss) 01:44, 15 May 2017 (UTC)
That's an error message generated by the transliteration module, which in this case is Module:translit-redirect. I can disable the error message, either in non-entry namespaces or everywhere. — Eru·tuon 01:51, 15 May 2017 (UTC)
Eh, for the transliteration module to check that it is getting valid input, and to display an error if it is not, seems fine. So far, it seems to be only this edge case where there is a problem, and it would seem easier (or maybe not easier! but maybe better) to solve it by removing the actual display of "this is what it would look like if you put Latin script into a Punjabi t-check for some reason". - -sche (discuss) 02:49, 15 May 2017 (UTC)
Added English catfix to "requests for review of translations" categories, as I did before for "requests for translations". Now those categories will link to the English section of the entry. Thanks for noticing that error. — Eru·tuon 02:59, 15 May 2017 (UTC)
Thanks for fixing it! :) - -sche (discuss) 00:35, 16 May 2017 (UTC)

German attention needed

There are over 200 entries in Category:Requests for attention in German entries. Some of these need attention before the German bot runs again - otherwise it might generate lots of rubbish. SemperBlotto (talk) 10:26, 15 May 2017 (UTC)

Oh, I'm glad to hear the bot will run again soon. :) I will start looking over these. - -sche (discuss) 21:37, 15 May 2017 (UTC)
@SemperBlotto I checked with AWB and found that there were only ~30 entries in both the "attention" category and Category:German terms with red links in their inflection tables, i.e. entries where the bot would be creating inflected forms for an entry with an attention tag. Of those, the inflection tables were correct in most of the entries: the attention tags were only asking for the definitions to be expanded, or in the case of a few adjectives, for "verb form" sections to be added. I will keep working to address and remove the attention tags, but I think that if your bot runs, the only entry it would create a wrong inflected form for is Wikiwörterbuch, where the inflection table lists a form Wikiwörterbuche which it should not list. - -sche (discuss) 23:01, 20 May 2017 (UTC)

Modified Russian translit

I've created a JavaScript function that modifies Russian transliteration to show vowel reduction and other things: User:Erutuon/modifyRussianTranslit.js. So, for instance, the transliteration of Воро́неж (Vorónež) shows up as Varóniž: a little more representative of the actual pronunciation [vɐˈronʲɪʂ]. Kind of silly, but I like the more transcription-like transliteration variant.

I've wanted to do this for a while, but it wasn't possible until I added tagging for transliterations, as discussed above. — Eru·tuon 22:16, 15 May 2017 (UTC)

There is a longstanding disagreement between those who want transliterations of Russian to be transliterations, letter for letter, and those who want them to be like a secondary IPA. It has occasionally been suggested that both could be displayed, first the actual transliteration and then the "pronounced as". - -sche (discuss) 23:55, 15 May 2017 (UTC)
I just want to warn you that your tool probably makes a lot of mistakes. The reduction rules are not as regular as they may seem at first. --WikiTiki89 15:04, 16 May 2017 (UTC)
Probably. For instance, it transliterates ви́део (vídeo) as v̦íd̦ia when it should be v̦íd̦io. — Eru·tuon 19:11, 16 May 2017 (UTC)

Accel template for German diminutives needs updating

The accelerated-creation script for German diminutives uses outdated syntax in the declension-table it generates ([3]). I would fix it myself, but I don't recall where the relevant text that needs to be changed is located. - -sche (discuss) 00:43, 16 May 2017 (UTC)

@-sche, it's here: User:Conrad.Irwin/creationrules.js. —Μετάknowledgediscuss/deeds 00:48, 16 May 2017 (UTC)
Thanks! Fixed. - -sche (discuss) 01:04, 16 May 2017 (UTC)

Qualifying comparative and superlative forms in Template:de-decl-adj

I suggest Template:de-decl-adj should accept optional parameters comp1_qual=, comp2_qual=, sup1_qual=, and sup2_qual= (or whatever names would be better) which, if present, would add a qualifier after the phrases "comparative"/ "superlative forms of x" that form the titles of the tables of forms. Then it could be explained why there are two tables of comparative forms on rot (one is for the inflection with umlaut, one without) and why there are two tables of superlative forms on blau (one with epenthetic -e-, one without). This could also be accomplished by repeating the tables with semicoloned headers, like this, but comp1_qual= would be neater. - -sche (discuss) 02:06, 16 May 2017 (UTC)

strange header

What's up with undetermined [Term?] at the beginning of the templates inherited, derived, and borrowing? --Espoo (talk) 11:00, 16 May 2017 (UTC)

Formatting for individual Japanese readings

There are several formatting issues with Japanese readings (yomi). The readings are displayed (collectively) by {{ja-readings}}, which provides fields for classes of readings (such as Goon, Kan’on, Kun, etc.). Each field may contain multiple readings (see e.g. ), but there is inconsistency in how the individual readings are formatted. The readings currently use plain wikilinks, so neither the linked hiragana nor the transliterations are currently tagged with lang=ja or classes Jpan or tr. Furthermore, there is inconsistency in how readings with okurigana are presented (sometimes the kanji spelling with furigana is included, sometimes not; sometimes there is a dot in the hiragana, etc.). When historical readings are thrown into the mix, things get more complicated, and other information is also sometimes supplied, such as rarity, and if the reading has been excluded from the Jōyō kanji table, i.e. it is 表外 (hyōgai). See , for some of the variety of display approaches. I think it is time we figured out how to present all this information and format it properly so that fonts are consistent on the site and language/script tags are properly applied. We could include this functionality in {{ja-readings}} (i.e. code it in the readings section of Module:ja), or we could create a new template for a single reading with its historical form (and automatic transliteration). I rather lean toward the latter route, as it would continue to allow extra information to be supplied alongside the readings as needed, while standardizing the display of modern/historical pairs and their transliteration. I also think it is particularly important to divorce the categorization of the historical readings from that of the modern readings, because they overlap, and it is useful to be able to search for kanji with a particular historical reading. – Krun (talk) 15:53, 17 May 2017 (UTC)

Input from editors of Japanese and also from Lua-savvy contributors would be much appreciated. @Eirikr, @Hippietrail, @Nibiko, @Haplology, @エリック・キィ, @TAKASUGI Shinji, @Nbarth, @Stephen G. Brown, @Wyang, @Atitarev, @suzukaze-c, @kc kennylau, @Erutuon, @CodeCat, any ideas?
I think it would be fairly easy to write a function to search for the wikilinked Japanese words and create a {{l}}-style link for each, without transliteration since the transliteration seems to be written out manually. It would also be possible to search for the italicized transliterations and add the correct formatting to them. If there are tables with transliteration missing, it would probably be possible to search for that and add automatic transliteration generated by the transliteration module. Is that the kind of thing you want? — Eru·tuon 00:32, 18 May 2017 (UTC)
Just to clarify: these are my ideas after looking at the sample input on the template page for {{ja-readings}}. I'm somewhat bewildered by your post because I know very little about Japanese and its entries (aside from the fact that there's one ideographic script and two syllabic ones). — Eru·tuon 00:36, 18 May 2017 (UTC)
Well, I was looking for technical input as well as linguistic, and you seem to manage well with the Lua module coding. I haven’t really gotten into it myself and don’t understand it very well. Anyway, about the transliterations: those should probably be removed and autogenerated instead. – Krun (talk) 01:49, 18 May 2017 (UTC)
I guess what we need might be a template, say ja-y that took a modern reading, its corresponding historical reading (optionally for now due to lack of data, but preferably always included), an optional more ancient reading (such as くゐやう for ) and potentially more, e.g. a parameter to indicate presence or absence in the Jōyō kanji lists. I’m not sure how it should be presented, but the connection of historical readings to the modern counterpart should be clearly indicated, all readings must be transliterated (there needs to be a separate automatic transliteration for historical/ancient readings from the modern ones), and the readings should be labeled succinctly, preferably with links to Wiktionary appendix pages explaining things like historical kana usage, Jōyō kanji, etc. – Krun (talk) 02:04, 18 May 2017 (UTC)
I've done a restructuring of the readings function, so that each reading can be handled in the same way (links redone using Module:links, for instance). If you notice any problems (such as readings in the template code vanishing from the displayed list), please let me know. — Eru·tuon 02:08, 18 May 2017 (UTC)
Hmm, I don’t really notice anything different now from before, but anyway, here is a stab at what I mean for each individual reading: Something like
{{ja-y|きょう|きやう|くゐやう|+}}
which would yield something like:
きょう (kyō) (Jōyō reading; historical きやう (kyau), ancient くゐやう (kwyau))
– Krun (talk) 02:40, 18 May 2017 (UTC)
@Krun: Right, it shouldn't look different. I was restructuring to make consistent script and language tagging or linking possible. — Eru·tuon 23:56, 19 May 2017 (UTC)
(I don't think the ping worked the first time. @Eirikr, @Hippietrail, @Nibiko, @Haplology, @エリック・キィ, @TAKASUGI Shinji, @Nbarth, @Stephen G. Brown, @Wyang, @Atitarev, @suzukaze-c, @kc kennylau, @Erutuon, @CodeCat) suzukaze (tc) 22:12, 19 May 2017 (UTC)
I've been thinking about the template too, but my own idea I had in mind was to keep the general structure but make more of it automated, like
{{ja-readings
|goon=こう<くわう
|kanon=きょう<きやう<くゐやう
|kun=まして,いわんや<いはんや,おもむき
}}
which would produce the same thing we presently have at .
I think Krun's idea is interesting too, since it allows for detail to be added easier.
Formatting of kun'yomi should definitely be solidified. 結#Readings is comprehensive but intimidating. —suzukaze (tc) 22:12, 19 May 2017 (UTC)
Suzukaze's suggestion is what I had in mind too. Wyang (talk) 00:18, 20 May 2017 (UTC)
結#Readings is indeed hard to read. Do you have any ideas for how it could be entered using the briefer input format that you are suggesting (or how it could be made more readable)? — Eru·tuon 00:21, 20 May 2017 (UTC)
My suggestion:
{{ja-readings
|kanon=けつ
|goon=けち
|kun=結(むす)ぶ, 結(むす)び, 結(むす)ばる, 結(むす)ばわる, 結(むす)ぼる, 結(むす)ぼうる, 結(むす)ぼれる, 結(ゆ)う, 結(ゆ)い, 結(ゆ)わう, 結(ゆ)わえる, 結(いわ)える, 結(いわ)く, 結(す)く-“to knit a net”, 結(かた)なす-“to gather or tie together into one bunch”, 結(かた)める-“to bind together; to open and read out the content of official documents”, 結(かた)ぬ
|nanori=
}}
Jouyou readings should be handled by a backend data module. Of course, the automatic link formatting algorithm should be disabled if any of the parameters is formatted as a link (e.g. by {{ja-r}}). Wyang (talk) 00:34, 20 May 2017 (UTC)
I've disabled the link-formatting thingy whenever "<span" is found in the parameter, which indicates that {{ja-r}} (or another linking template that adds language and script tagging) has been used. — Eru·tuon 01:16, 20 May 2017 (UTC)
@Wyang: How will information like いわんや<いはんや be handled in your approach? --kc_kennylau (talk) 03:37, 20 May 2017 (UTC)
@Kc kennylau As 況(いわ)んや<況(いは)んや - so the algorithm for generating the link(s) would be (1) split by dash; (2) split by the less-than symbol; and (3) analyse parentheses. Wyang (talk) 11:39, 20 May 2017 (UTC)
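
The three steps just described (split by dash, split by the less-than symbol, analyse parentheses) can be sketched as below. This is a rough Python illustration of the idea only, not the Lua that would go into Module:ja; the names `parse_form` and `parse_reading` and the returned dictionary layout are hypothetical.

```python
import re

# One written form: kanji, then its reading in parentheses, then optional okurigana.
FORM_RE = re.compile(r"^(?P<kanji>[^()]+)\((?P<furigana>[^()]+)\)(?P<okurigana>.*)$")


def parse_form(form):
    """Step 3: analyse the parentheses of a single form such as 況(いわ)んや."""
    form = form.strip()
    m = FORM_RE.match(form)
    if m is None:
        # No parentheses: treat the whole string as a plain kana reading.
        return {"kanji": None, "furigana": form, "okurigana": ""}
    return m.groupdict()


def parse_reading(spec):
    """Steps 1 and 2: split off the gloss at the first dash, then split on '<'."""
    reading_part, _, gloss = spec.partition("-")
    forms = [parse_form(f) for f in reading_part.split("<")]
    return {"forms": forms, "gloss": gloss.strip().strip("“”") or None}


print(parse_reading("況(いわ)んや<況(いは)んや"))
print(parse_reading("結(す)く-“to knit a net”"))
```

The first form in each chain is taken to be the modern reading and any later ones to be progressively older readings. Splitting a whole `|kun=` parameter on commas is left out here, since a gloss could itself contain a comma.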

(unindent) Nice. After seeing your suggestions, I really like them, @suzukaze-c and @Wyang, particularly since a way to add extra info has been envisioned. My original suggestion mostly has merit for making it easier for editors to add extra info, but if everything necessary is provided for, I actually prefer the single template solution. The code is then also shorter and easier to read without all the pipes and stuff in there. I still wonder how specific things like qualifiers about context of readings, rarity, classical Japanese only, etc. would be put in, and of course about the implementation of Jōyō readings. I do very much like the idea of having the Jōyō status of readings be automatically determined by lookup in a data module. I guess that would be very similar to how {{ja-kanji}} automatically determines the grade (Kyōiku (1–6) / Jōyō / Jinmeiyō / Hyōgai) of the kanji itself, which I also like. One change from Wyang’s example I would like to make is to remove the kanji (it is, after all, always the same kanji). That would also make the input simpler and easier to read:

{{ja-readings
|kanon=けつ
|goon=けち
|kun=むす.ぶ, むす.び, むす.ばる, むす.ばわる, むす.ぼる, むす.ぼうる, むす.ぼれる, ゆ.う, ゆ.い, ゆ.わう, ゆ.わえる, いわ.える, いわ.く, す.く-“to knit a net”, かた.なす-“to gather or tie together into one bunch”, かた.める-“to bind together; to open and read out the content of official documents”, かた.ぬ
|nanori=
}}

I still think the spelling with kanji and the plain hiragana spelling should both be displayed in full, though, when the page is rendered. – Krun (talk) 00:28, 22 May 2017 (UTC)

@Krun, Erutuon, Wyang, Kc kennylau: https://en.wiktionary.org/w/index.php?title=Wiktionary:Sandbox&oldid=43068363. 鬱 is a placeholder since using PAGENAME in links would look stupid in the sandbox. The broken appearance of きょう<きやう<くゐやう is on purpose for now since I want to see more evidence that we should add/support more than one layer of historical kana readings ([4] only has きょう<きやう). —suzukaze (tc) 07:09, 22 May 2017 (UTC)
Looks great. Question: are there any circumstances in which {{ja-r}} may be invoked with the kanji in a non-initial position? Wyang (talk) 07:13, 22 May 2017 (UTC)
@Wyang I am not aware of such. The only thing I could think of is things like お母さん, but then the かあ part seems to be considered the reading. – Krun (talk) 15:29, 22 May 2017 (UTC)
@Suzukaze-c Pretty good. I particularly like the special formatting for the historical readings. As for the additional layer, it is certainly true that dictionaries will normally not include it (like the one you linked to, which I believe is the 2nd edition of the Daijirin). Wiktionary is special, however, as we are not bound by the same constraints. I also think it’s a very good thing that we can offer easy access to otherwise obscure information. In any case, these readings have already been added to Wiktionary pages, and I do believe they are authentic, if rare. There are several sources for them; this page has a nice overview, with reference to several printed sources at the bottom; see also this one.
I am a little bit concerned with the furigana/ruby though; I think I prefer the full on plain kana spelling accompanied by the kanji-with-kana spelling for several reasons:
1. To have a link to both (the kanji spelling links to the main entry for a word, where the senses, pronunciation, etymology, etc. are to be found, but the plain kana spelling links to the page that gives the different kanji spellings for that reading).
2. With ruby, unfortunately, the kanji-with-kana spelling is not searchable (e.g.  (おく)れる (okureru) is 遅 followed by おく followed by れる in the HTML text). In my browser (Safari), a page search will find “おくれる”, but not “遅れる”. This is of course a wider issue with our use of ruby, and in cases where multiple kanji are used, often neither the spelling with kanji nor the plain kana can be searched for. It will presumably become less of an issue as our Japanese coverage gets better and we get individual entries for everything (then the headword line has both spellings in full), but it would be nice to help users to make the most of the information we currently have.
3. The kana are more readable when written in full. After all, in this particular page section the kana are all-important, as people are looking in it specifically for the reading; it’s not just a reading aid or extra information in this case.
I would also like to have consistency in the appearance of readings that have okurigana and those that do not, e.g. 行う (おこなう (okonau)) and (くだり (kudari)), rather than having the latter show only the plain kana spelling. We already have it e.g. at . I want to make the presence of okurigana explicit like that in every case, as it isn’t otherwise obvious whether there are no okurigana or the information is merely missing. There probably are cases where we cannot be certain for a while what the okurigana are, and not specifying (until someone figures it out) would be desired in those cases. There is always a question of whether the kanji or the kana spelling should come first; I don’t really have a strong opinion on that.
Another thing we need to be prepared for is variations in the okurigana. How should we format e.g. おこなう for , which has the variants 行う and 行なう?
Then there are readings that are words normally spelled with two kanji, like the kun-reading さとう for . This is, of course, a word which is normally spelled 砂糖 (including the kanji , but with another kanji as well). Has this word really been spelled with alone, or do kun-readings of this sort perhaps only serve as glosses or as standard word transformations for reading kanbun and the like? Perhaps you know something about that, @Eirikr, TAKASUGI Shinji, Nibiko?
– Krun (talk) 15:29, 22 May 2017 (UTC)
re: historical readings: Alright, how many levels are there? Is "modern" < "historical" < "ancient" all there is?
re: furigana: Being unable to Ctrl+F ruby is a problem that lies in the very HTML structure of ruby. However, I've changed the way kanji is displayed to be more like 食. I think there is no problem with having okonau listed twice ([5]).
re: weird kun'yomi: No idea.
suzukaze (tc) 18:21, 22 May 2017 (UTC)
Thank you guys for your effort to improve kanji entries. I’m wondering if we could show official readings more clearly, probably with bold letters. Wiktionary sometimes has too much information including archaism, in which case it can be misleading. The current version of erroneously makes you believe  () (osu) is acceptable in modern Japanese. — TAKASUGI Shinji (talk) 22:36, 22 May 2017 (UTC)
@TAKASUGI Shinji: I wouldn’t say it’s implied that everything on the page is in contemporary use, but I say we should definitely indicate all the Jōyō readings. Perhaps bold would be an appropriate choice, but then just for the kana-only spelling (don’t know about the transliteration, but definitely not the kanji). It would probably also be a good idea to specially indicate readings that only apply to classical Japanese.
@Suzukaze-c: Nice work! I very much like the look of it now. Also, nice touch to add the underline to the transliteration to indicate furigana. Regarding the “ancient” readings: Yes, I think there is only that one extra layer, and it is only when there was originally /kw/, /gw/ before /e/, /i/, /y/ (later simplified to drop the /w/). These are borrowed clusters from Chinese that only occur in on’yomi. The name isn’t anything standard, but I suppose “ancient” will suffice as long as we have an appendix to explain what we mean.
I have two things to add:
First, when the historical reading is the same as the modern reading, I would still like it to be indicated in some way. This helps users and editors to know whether the historical reading is in fact the same or just hasn’t been added, and enables categorization as well.
Second, we need to have a separate transliteration scheme for historical/ancient readings. It has to differentiate between (du) and (zu) and (di) and (zi), ぢゃ and じゃ, etc. Perhaps Nihon-shiki would be a good basis for that, although perhaps the -line had better be fa, fi, fu, fe, fo? I guess I’d be okay with it being ha, hi, fu, he, ho, as in Hepburn. Yōon should also be indicated without an extra vowel before the /y/ or /w/, even though all the kana are written full size, e.g. きやう (kyau), くゑ (kwe), きよ (kyo), くゐよく (kwyoku); /ou/ should not yield ō, e.g. きよう (kyou), not kyō. – Krun (talk) 00:12, 23 May 2017 (UTC)
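
To make the requested historical romanization rules concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the scheme that was eventually implemented: the kana table is deliberately partial (only what the examples in this comment need), the function name `romanize_historical` is hypothetical, and the merging of く with a following わ/ゐ/ゑ is applied unconditionally, which would be wrong for native words where the two kana are separate syllables.

```python
# Partial kana table in a Nihon-shiki-like scheme (assumption: the は-line
# follows the Hepburn-like ha, hi, fu, he, ho option mentioned above).
KANA = {
    "か": "ka", "き": "ki", "く": "ku", "け": "ke", "こ": "ko",
    "し": "si", "じ": "zi", "ぢ": "di", "ず": "zu", "づ": "du",
    "は": "ha", "ひ": "hi", "ふ": "fu", "へ": "he", "ほ": "ho",
    "や": "ya", "ゆ": "yu", "よ": "yo",
    "わ": "wa", "ゐ": "wi", "ゑ": "we", "を": "wo",
    "う": "u", "ん": "n",
}
Y_KANA = "やゆよ"
W_KANA = "わゐゑ"


def romanize_historical(kana):
    """Romanize full-size historical kana; raises KeyError for kana not in KANA."""
    out = []
    i = 0
    while i < len(kana):
        ch = kana[i]
        nxt = kana[i + 1] if i + 1 < len(kana) else ""
        nxt2 = kana[i + 2] if i + 2 < len(kana) else ""
        if ch == "く" and nxt and nxt in W_KANA:
            if nxt == "ゐ" and nxt2 and nxt2 in Y_KANA:
                # くゐや, くゐよ: kw + y + vowel, no intermediate vowels
                out.append("kwy" + KANA[nxt2][1])
                i += 3
            else:
                # くわ, くゐ, くゑ: kw + vowel
                out.append("kw" + KANA[nxt][1])
                i += 2
        elif len(KANA[ch]) == 2 and KANA[ch].endswith("i") and nxt and nxt in Y_KANA:
            # きや, きよ, ぢや: consonant + y + vowel, even with full-size kana
            out.append(KANA[ch][0] + KANA[nxt])
            i += 2
        else:
            out.append(KANA[ch])
            i += 1
    # No long-vowel merging: "ou" stays "ou" rather than becoming "ō".
    return "".join(out)


for word in ["きやう", "くゑ", "きよ", "くゐよく", "きよう", "くわん"]:
    print(word, romanize_historical(word))
```

With the table above, the six sample words come out as kyau, kwe, kyo, kwyoku, kyou and kwan, matching the forms given in the comment.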
In fact we have so many “Japanese” entries that are actually Old Japanese. Why don’t we separate them using the codes ja and ojp? — TAKASUGI Shinji (talk) 03:14, 23 May 2017 (UTC)
I get the feeling that it is largely User:Bendono's work. We do have a small number of Old Japanese entries. —suzukaze (tc) 03:36, 23 May 2017 (UTC)
re: bolding modern readings: I'm not sure using bold formatting is a good idea. The reasoning for such formatting might not be immediately obvious.
re: modern < historical < ancient: ✔
re: historical reading == modern reading: How would this information be presented? (+ what kind of category structure are you thinking of?)
re: romanizing historical readings differently: From the face of it, it seems feasible. I'll look into it.
suzukaze (tc) 03:36, 23 May 2017 (UTC)

(unindent)

@Suzukaze-c: Re: bolding: We were talking about putting boldface on readings that are found in the Jōyō list, not all readings that are used in the modern language. Even if we went for bolding them, their status should also be indicated more explicitly (it could be an abbreviation or symbol, but it must have a link to an appendix explaining the Jōyō list). Actually, one thing I noticed from your sandbox example is that you’ve marked things as non-Jōyō (which is already done in entries), but I envisioned that we would rather specially indicate readings that are in the Jōyō list; the + in the code was meant to indicate that (a + is weird as a marker for exclusion anyway). For the kanji that have many readings, I’m sure most of the readings are non-Jōyō, and it’s not good to imply that some random readings (maybe ones added later to the page) are in Jōyō because they’re not marked as such. Anyway, @Wyang already suggested above that a data module be used to keep track of the Jōyō readings. We can extract all of them from w:List of jōyō kanji.
Re: modern < historical < ancient: Looks good!
Re: if historical == modern: Hmm, I guess we don’t need to add any new feature for that. Some shorthand display would be possible; many dictionaries don’t indicate the historical spelling if it’s the same or if it only differs in use of small kana vs. full size (e.g. しょう / しよう) – but I guess it’s probably best to just add the historical reading in full even if it’s the same as the modern one, especially if it might have different romanization. Just so you are aware, in case it has any effect on the implementation, there are cases where the modern and historical readings are the same, but there exists a more ancient version that is different, e.g. ( (ki) < (ki) < くゐ (kwi)). This won’t affect anything of course, if we just go for the full display of modern and historical readings regardless of the values.
Re: categorization: Consider (かん (kan) < くわん (kwan)) and (かん (kan) < かん (kan)). These should both go into Category:Japanese kanji read as かん, but there would additionally be something similar for the different historical readings (e.g. [Category:Japanese kanji with historical reading くわん] and [Category:Japanese kanji with historical reading かん]). That way one can search specifically for kanji with the historical reading kan without them being conflated with the ones that have the historical reading kwan, as it is currently.
– Krun (talk) 19:23, 23 May 2017 (UTC)
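
The categorization described in this comment could be sketched as follows. This is a Python illustration only: the function name `reading_categories` is hypothetical, the input is assumed to be a `modern<historical` chain as in the proposed template syntax, and the category names are the ones suggested in the comment rather than settled names.

```python
def reading_categories(chain):
    """Return category names for a 'modern<historical' reading chain.

    Assumption: the first element is the modern reading and the second,
    if present, is the historical reading. When no historical reading is
    given, no historical category is added, because a missing value does
    not mean the historical reading equals the modern one.
    """
    parts = [p.strip() for p in chain.split("<")]
    modern = parts[0]
    categories = ["Category:Japanese kanji read as " + modern]
    if len(parts) > 1:
        categories.append("Category:Japanese kanji with historical reading " + parts[1])
    return categories


print(reading_categories("かん<くわん"))
print(reading_categories("かん<かん"))
print(reading_categories("かん"))
```

Both of the first two inputs land in the same modern-reading category but in different historical-reading categories, which is the separation asked for above.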
@Suzukaze-c I see you’ve been working on the historical romanization :). Just one more thing: needs to be wo (not o) in the historical mode. – Krun (talk) 12:59, 24 May 2017 (UTC)
Jōyō readings are now indicated by both an inline note and a hideous yellow background, with the help of data at Module:Sandbox/1. I also added historical romaji according to what you've told me and "historical reading" categories. [6] suzukaze (tc) 04:57, 25 May 2017 (UTC)
@Suzukaze-c Cool. We’ll need to organize a bot run through all the kanji entries to switch to the new format. WT:AJA will need to be updated as well. Then we’ll need to add the historical readings and the okurigana dot marking manually, so perhaps we should add maintenance categories, one for readings where the historical form is missing and one for readings that don’t have a dot separator (if the kanji covers the whole reading, i.e. there are no okurigana, the dot should be at the end). Also, is everybody okay with this change (@Eirikr, Hippietrail, Nibiko, Haplology, エリック・キィ, TAKASUGI Shinji, Nbarth, Stephen G. Brown, Wyang)? – Krun (talk) 00:24, 28 May 2017 (UTC)
(redoing ping: @Eirikr, Hippietrail, Nibiko, Haplology, エリック・キィ, TAKASUGI Shinji, Nbarth, Stephen G. Brown, Wyang and also @Fumiko Take) suzukaze (tc) 19:45, 28 May 2017 (UTC)
I am in support. Wyang (talk) 21:18, 28 May 2017 (UTC)
Great. Thank you, Suzukaze. — TAKASUGI Shinji (talk) 22:45, 28 May 2017 (UTC)
I support this and I think that this is a step towards making Japanese entries easier to edit. @Krun Those weird kun'yomi come from the Unihan database, which seems to treat kun'yomi as a field for giving Japanese glosses for kanji. Theoretically, they could be used in kanbun, but as they are not a part of any standard, I don't think that we should include them. Nibiko (talk) 07:38, 30 May 2017 (UTC)

Deploying the new code

(unindent) @Suzukaze-c I am wondering whether we can move the new code to Module:ja immediately, but keep the old code and just add a switch that would handle the readings as they are handled now if e.g. they start with [[ (but add the entry to a maintenance category, say Category:Japanese kanji readings that need to be updated), and use the new system otherwise. That way, we can immediately start converting some entries manually and tweaking the new code as needed, without breaking existing entries. We can then just do the big bot change whenever we feel like it. – Krun (talk) 14:26, 31 May 2017 (UTC)
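
The switch proposed here, combined with the earlier `<span` check for parameters already formatted by a linking template, might look roughly like this. It is a Python sketch with hypothetical names (`choose_handler`, the handler labels), not the actual logic in Module:ja; the maintenance category name is the one suggested in the comment.

```python
MAINTENANCE_CATEGORY = "Category:Japanese kanji readings that need to be updated"


def choose_handler(param):
    """Decide how one readings parameter should be processed.

    Returns a (handler, maintenance_category) pair, where handler is one of
    the hypothetical labels "preformatted", "old" or "new".
    """
    text = param.strip()
    if "<span" in text:
        # Already tagged by a linking template such as {{ja-r}}: leave it alone.
        return ("preformatted", None)
    if text.startswith("[["):
        # Old-style plain wikilinks: keep the old display, but flag the entry.
        return ("old", MAINTENANCE_CATEGORY)
    return ("new", None)


print(choose_handler("[[こう]] (''kō'')"))
print(choose_handler("こう<くわう"))
```

Keeping the decision in one place would let entries be converted gradually, since unconverted pages keep rendering as before while appearing in the maintenance category.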

@Krun The code already is able to cope with "old" formatting, due to User:Erutuon's work. Adding a category would be very easy. The code could theoretically be deployed now but there is still a small portion of it that may not be good coding (the part marked with TODO: this is probably bad, mod:ja-link should be callable from modules). @Erutuon, could you help? (also maybe review the rest of the code if you want, since you're so clearly more experienced at this stuff than I am) —suzukaze (tc) 18:44, 1 June 2017 (UTC)
@Suzukaze-c: I'm taking a look at it now, initially trying to figure out the code and making a few small changes. I started a module-callable function in Module:ja-link. Not sure if it will work yet. — Eru·tuon 22:24, 1 June 2017 (UTC)
Link function works now. — Eru·tuon 00:15, 2 June 2017 (UTC)
Thank you. —suzukaze (tc) 21:22, 3 June 2017 (UTC)

@Krun It seems there is no objection, so I am ready to make the change live, but there is one thing I forgot to ask. Currently the "old" {{ja-readings}} only adds categories for on'yomi, and the way these categories are named is different from what you proposed above (Category:Japanese kanji read as きょう vs. Category:Japanese kanji with reading きょう). How should these differences be reconciled?

Also, are there any kanji readings that exist as both on'yomi and kun'yomi readings? (Do we need to add "on" and "kun" into the category name?) —suzukaze (tc) 21:22, 3 June 2017 (UTC)
@Suzukaze-c Yes, there are definitely readings that exist both as on’yomi and kun’yomi. Those will generally be the shorter ones, such as , いく, まつ, and so on. I think it would be a good idea to categorize the on- and kun-readings separately, e.g. Category:Japanese kanji with on-reading ひ and Category:Japanese kanji with kun-reading ひ. We could also separate them into the different classes of on-readings (goon, kan’on, tōon, sōon, kan’yōon), but in that case we would need duplicate generic on’yomi categories on everything (when one is looking up readings without knowing which type of on’yomi they might be), so maybe that would just make the categories overly complicated? I don’t know, maybe it would be an interesting new way to discover kanji and their different types of readings. I do think separate categories for Nanori will be needed. Also, I noticed that the dot separator is currently in the category names. We’ve never had that before, but it could be interesting, as okurigana are not always obvious and, again, the same kana can be applied differently (e.g. ひる: and 簸る, both kun-readings). – Krun (talk) 14:53, 4 June 2017 (UTC)
Perhaps we should also keep the current categories (Category:Japanese kanji read as きょう, etc.) to allow more generic lookup as well, for when one doesn’t know whether a reading might be on or kun, or when one is simply looking for homophonous characters. – Krun (talk) 16:16, 4 June 2017 (UTC)
Alright, the main code has been changed. Now templates have to be made for the new categories. —suzukaze (tc) 01:50, 5 June 2017 (UTC)
Could you make a list of the types of category names that should be recognized, and an idea of what the category tree should look like? I should be able to create a module with that information. — Eru·tuon 02:26, 5 June 2017 (UTC)
@Erutuon: I think it would look something like this. (@Krun, please feel free to tweak Wiktionary:Sandbox.) —suzukaze (tc) 06:10, 5 June 2017 (UTC)
@Suzukaze-c: Okay, I've created the beginnings of a function at Module:ja-kanji-readings. Not updated to reflect the changes you've made in the sandbox. Perhaps the readings function from Module:ja could be moved there. — Eru·tuon 07:03, 5 June 2017 (UTC)
The new format looks really nice. Just a small suggestion: perhaps the background colour could be made half of current + white so that it looks more soothing. Wyang (talk) 02:49, 5 June 2017 (UTC)
@Wyang:   Done (diff) —suzukaze (tc) 06:10, 5 June 2017 (UTC)
@Suzukaze-c: I've changed some things in the Sandbox: [7] Still needs some work; I am extremely tired and can barely wrap my head around it right now. – Krun (talk) 01:33, 6 June 2017 (UTC)
@Suzukaze-c I was just trying the new functionality out on a new kanji, , and I encountered a problem: the reading しおうお is incorrectly romanized as shiōo instead of shiouo, because we forgot to provide for morpheme boundaries. In other places, a dot (full stop) is used for this (e.g. {{ja-r|鮑|しお.うお}}, yielding  (しおうお) (shiouo)), whereas we are using the dot here for the okurigana boundary. Perhaps we should change the separator for okurigana to - (hyphen) so that we can uniformly use . for its existing purpose in transliteration generation? – Krun (talk) 14:41, 7 June 2017 (UTC)
Sounds alright to me. —suzukaze (tc) 16:29, 7 June 2017 (UTC)
Done —suzukaze (tc) 03:09, 8 June 2017 (UTC)
re: "if the kanji covers the whole reading, i.e. there are no okurigana, the dot should be at the end": Hmm, IMO it is kind of illogical. It seems weird to me. —suzukaze (tc) 03:09, 8 June 2017 (UTC)

@Suzukaze-c, Krun: Should there be "modern" categories, or is there another way for someone to find current readings as opposed to historical ones? — Eru·tuon 22:18, 8 June 2017 (UTC)

Ohh. Or is it that if the reading category is not qualified as "ancient" or "historic", it is modern? — Eru·tuon 22:27, 8 June 2017 (UTC)

@Erutuon: Yes, that’s it. You raise a valid point; perhaps it’s not obvious that they are modern readings.
@Suzukaze-c: re: dot/thingy at the end for readings without okurigana: Well, it’s perfectly logical if you consider that the part before the delimiter is the part covered by the kanji, and so if there is nothing after the kanji, everything comes before the delimiter. I was particularly concerned with uniform appearance, and putting the delimiter at the end accomplished that, namely in the form of identical formatting (underlining) of the part covered by the kanji. However, I am not completely satisfied. Like you, I do find it weird to see the hyphen (or dot) at the end; it looks like there is something missing. I don’t have anything against dropping the delimiter per se, when there are no okurigana, as long as it is unambiguous and displayed consistently, but I can’t see how it could be unambiguous with our editing model. We already have an abundance of existing readings without delimiters, many of which are readings that do have okurigana. Even if we go through all the kanji entries and standardize, there are always new editors who put in readings without necessarily knowing such details, and it’s useful to be able to do so (just like adding an on-reading without knowing whether it’s kan’on, goon or kan’yōon, or even tōon, etc.). Therefore, I think we do need to have some sort of explicit marker for this kind of reading. I haven’t added it for on-readings however, as I don’t feel that would make sense; on-readings can never have okurigana anyway.
@Suzukaze-c: Also, there are some small issues with the module: 1. the display order of the reading categories is getting messed up; it should always be: goon, kan’on, tōon, kan’yōon, on (unclassified), kun, nanori; 2. jōyō matching has stopped working for readings with okurigana after the delimiter was changed. – Krun (talk) 00:59, 9 June 2017 (UTC)
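The fixed display order requested above can be kept with a rank table rather than by relying on iteration order. The following is only a minimal Python sketch of that idea, not the code in Module:ja-kanji-readings; the key names (`goon`, `kanon`, and so on) and the function name are assumptions made for illustration.

```python
# Display order taken from the comment above; the key names are hypothetical.
READING_ORDER = ["goon", "kanon", "toon", "kanyoon", "on", "kun", "nanori"]
RANK = {name: index for index, name in enumerate(READING_ORDER)}


def sort_reading_types(types):
    """Return reading types in the fixed display order.

    Types that are not in READING_ORDER are placed last. Because sorted()
    is stable, such unknown types keep their relative input order.
    """
    return sorted(types, key=lambda t: RANK.get(t, len(READING_ORDER)))
```

A type whose position has not been decided simply ends up after the known ones, so adding it later only means inserting its key into `READING_ORDER`.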
@Krun: The readings should display in that order now (or else my method of maintaining the order isn't working). Though, where should sōon go? — Eru·tuon 03:11, 9 June 2017 (UTC)
After thinking about it I'm totally okay with the trailing delimiter now. —suzukaze (tc) 10:09, 9 June 2017 (UTC)
@Erutuon: It’s not working here, at least: (the order shown is Goon, Kun, Kan’on). I’ve even tried purging the page’s cache. Also, shows Nanori before Kun. – Krun (talk) 00:11, 10 June 2017 (UTC)
@Krun: Okay, fixed. Turns out my method was faulty. — Eru·tuon 01:30, 10 June 2017 (UTC)
@Suzukaze-c, Erutuon Something’s up with the reading (e.g. in and ): it’s being romanized as wa. Also, the Jōyō reading matching doesn’t work when the trailing delimiter is used. – Krun (talk) 01:18, 11 June 2017 (UTC)
The kun+trailing delimiter problem is fixed. The romaji, on the other hand, is a more complex issue...... I'm looking into it. —suzukaze (tc) 02:13, 11 June 2017 (UTC)

@Krun Should kun'yomi categories be added if a reading is missing a hyphen? —suzukaze (tc) 02:02, 15 June 2017 (UTC)

@Suzukaze-c I think we won’t be able to insert all the hyphens anytime soon, so it’ll probably be more useful (at least at the moment) to skip the hyphen in category names (see also Module_talk:ja-kanji-readings#Period_and_hyphen_in_readings_in_category_names). I’ve personally come to the conclusion that the benefits are minimal. We could always revisit the issue later and add the hyphens back. – Krun (talk) 13:50, 15 June 2017 (UTC)

Combining table utilities

@CodeCat, Wikitiki89, Benwing2, Erutuon, I18n: I notice that we have several modules (mod:table tools, mod:table, mod:TableTools) which all contain, you guessed it, tools for tables. Could we combine these into one module so that we don't have to keep reïnventing the wheel? —JohnC5 04:07, 18 May 2017 (UTC)

I can't understand what Module:table tools does, but the module I just made, Module:table, could certainly be merged with Module:TableTools. There might be a duplicate function in the latter. — Eru·tuon 04:46, 18 May 2017 (UTC)
Module:table tools was created originally by User:Wikitiki89 and may be misnamed, but it has functionality orthogonal to the other modules -- it is for handling footnotes in table entries. OTOH, Module:utils overlaps heavily with Module:table and Module:TableTools and the three should probably be merged, perhaps into Module:utils since many of the things in the other two aren't specific to tables. Benwing2 (talk) 05:08, 18 May 2017 (UTC)
Speaking of Module:utils, does anyone know what @Vitalik was up to with Module:inflection and Module:inflection-docs? —JohnC5 05:22, 18 May 2017 (UTC)
Module:inflection was invented as a kind of universal platform for inflection modules. It was first implemented and used for Uzbek nouns, and then I started implementing it for Russian nouns. But other users asked me to stop that implementation, so feel free to do anything you want with them. Vitalik (talk) 23:27, 19 May 2017 (UTC)
Module:table tools does not only handle footnotes. It contains functions that linkify and format comma-separated lists, and footnotes are only a part of that. Overall, it's meant to be a set of tools for making inflection table templates without dedicated modules. --WikiTiki89 14:35, 18 May 2017 (UTC)

@JohnC5: So, it seems that Module:TableTools and Module:table could be merged. Module:TableTools has better functions, but I like the name Module:table better. Some of the functions in Module:utils might be duplicates of functions elsewhere, and the module name is too similar to Module:utilities anyway. — Eru·tuon 01:36, 10 June 2017 (UTC)

@Erutuon: Would you be willing to undertake this endeavor? You've been doing so much refactoring lately. —JohnC5 06:17, 10 June 2017 (UTC)
@JohnC5: I'm intermittently working on a few things, but I can put it on the list. — Eru·tuon 06:33, 10 June 2017 (UTC)

Bot request: delete 600+ "Translations to be checked" categories

Bot request:

Please delete all the "Translations to be checked" categories. For example:

I guess a bot can delete categories, right? If any of these categories has a couple of entries, a null edit in the entries should un-categorize them.

This action was voted and approved at Wiktionary:Votes/2017-03/Request categories 2. These categories were superseded by Category:Requests for review of translations by language.

Thanks in advance. --Daniel Carrero (talk) 13:54, 18 May 2017 (UTC)

A bot can do it, but it has to be logged in as an administrator. Not that it matters, because logged events cannot be flagged as bot edits in most cases (so they will show in RC). Also, not all of these categories are empty; wouldn't it be easier to clean them up before deleting? - [The]DaveRoss 14:28, 18 May 2017 (UTC)
@TheDaveRoss: Do you think you can make the bot delete ONLY the categories that don't have any pages? This is just to be sure, because actually I believe I successfully emptied all the categories now. I did null edits where needed, and also removed a few entries that were categorized manually (i.e., without using templates).
Some categories have a reported number of entries like "Translations to be checked (Tupinambá)‎ (0 c, 1 e)" (emphasis mine) but they are actually empty. --Daniel Carrero (talk) 14:56, 18 May 2017 (UTC)
Yes, I can check if a category has members before deleting it. I'll delete all of the empty ones first and then check back. - [The]DaveRoss 15:02, 18 May 2017 (UTC)
Sounds great to me, thanks. --Daniel Carrero (talk) 15:03, 18 May 2017 (UTC)
It looks like the couple with members got cleaned up between then and now. The category is empty. - [The]DaveRoss 17:56, 18 May 2017 (UTC)
Great, thanks. I cleaned up Estonian at the last second when you were working on this. --Daniel Carrero (talk) 21:10, 19 May 2017 (UTC)
@TheDaveRoss, Daniel Carrero: Is there a way to re-add the boxes that showed the newest and oldest additions to the categories? I was checking French translations from oldest to newest, but that information is no longer available to me. Has it been lost altogether with the category name changes? If so, I would have voted against renaming them had I known this would be a consequence.... Andrew Sheedy (talk) 03:03, 8 June 2017 (UTC)
@Andrew Sheedy See Category:Requests for review of French translations. I re-added the boxes that showed the newest and oldest additions to the categories. I apologize for only doing this now; it could have been done earlier. However, because of the category move, technically all entries qualify as "Recent additions to the category". This will be true until 10 new entries get added to the category. --Daniel Carrero (talk) 06:58, 8 June 2017 (UTC)
Thanks. I figured the information about the oldest members of the category would be lost, which is a shame. Oh well, hopefully I'll get them all done anyway. Andrew Sheedy (talk) 02:57, 9 June 2017 (UTC)

Template:zh-der

This template is failing in the Compounds section of by running out of Lua memory. It is powered by Module:zh and Module:columns. I tried turning off sorting (by previewing the page while editing Module:zh), but that doesn't fix it. It does have a huge number of links: 1275. — Eru·tuon 00:27, 19 May 2017 (UTC)

I (indiscriminately) removed some “compounds of compounds” to fix it. Wyang (talk) 00:23, 20 May 2017 (UTC)
The error can be avoided by splitting the list into several smaller zh-der calls, as I did somewhere long ago. --Octahedron80 (talk) 04:01, 21 May 2017 (UTC)
Ah! I found it . It used to have the same error before. With , we can do this. --Octahedron80 (talk) 04:07, 21 May 2017 (UTC)

Function for inserting Template:t-simple

I've created a JavaScript function, User:Erutuon/simpleTranslations.js, to quickly convert {{t}} to {{t-simple}} for Latin-script translations with no parameters besides lang and term. (The function creates a link just above the edit box to allow you to trigger the function, if it finds the Translations header on the page.) I had been using regex in gedit (which I had to retype each time); might as well automate it. This will allow quicker fixing of Lua memory errors related to huge translations sections. I just had to do this in I, which recently developed a Lua memory error. So I decided to make a function. — Eru·tuon 03:44, 20 May 2017 (UTC)

Edit tag for missing headword template

Would it be possible to create an edit tag for missing headword templates? I've seen many edits like this, which have a definition and no headword template. They would be easier to find if they were tagged.

Perhaps the tag could be applied if there is a header containing one of the allowed parts of speech, but something other than a newline and a template is found after it, or if there is a POS header, newline, and #, as is true in the edit I linked above.

I don't know how edit tags work, so perhaps this is impossible. — Eru·tuon 00:43, 21 May 2017 (UTC)

They use regular expressions. It's technically possible, but the regular expression would have to match on a location where a headword-line template should be placed, but isn't. Both of these things run into problems: where does a headword-line template go, and what is a headword-line template to begin with? The answer to the first is a rather long list of valid POS headers, which makes for a very long regular expression. The second is even harder to answer, as it's essentially an open set; people create new headword-line templates all the time. At best, it could match for a template immediately following one of a long list of headers. —CodeCat 01:34, 21 May 2017 (UTC)
I think the location of the headword template is defined: one line below the POS header (though rarely someone might place it two lines below, which is incorrect). Yes, there are lots of headword templates. Checking for any template in that position would be easiest. (Could compile a full list from all the templates listed in Category:Headword-line templates and its subcategories, but I imagine that'd be a very long list, and quite frequently people don't add a category when creating a new template.) — Eru·tuon 01:47, 21 May 2017 (UTC)
There is no canonical list of parts of speech that appear in headers. DTLHS (talk) 03:03, 21 May 2017 (UTC)
What do you mean? What about the lists in WT:POS? — Eru·tuon 03:33, 21 May 2017 (UTC)
That's not a formal policy. In practice the set of all parts of speech in use is much larger. DTLHS (talk) 03:40, 21 May 2017 (UTC)
So, it would be impossible to generate a list of all the POS headers in use, and it is probably impossible to generate a list of all headword templates. But at the very least the filter could search for more commonly used POS headers, and then search for any template at the beginning of the first or second line below the header, and add a filter if there isn't a template there. That would be incomplete, but might not result in any false positives. — Eru·tuon 20:24, 21 May 2017 (UTC)
It must be possible to generate a list of POS headers (or a list of L3 headers from which non-POS headers could be sifted out manually), since lists of headers have been generated before and used to clean up errors like "Etmology". The list of POS headers would be large, but probably not larger than 300, especially once errors (like "Nouns" for "Noun") are removed. Then we could update WT:POS, with the understanding that new headers are not forbidden, but should be discussed. An edit filter checking so many things might be too expensive, though. - -sche (discuss) 22:24, 21 May 2017 (UTC)
Another idea: listing all the non-POS headers and assuming something is a POS header if it isn't a level 2 header (language) and isn't one of the non-POS headers? That would only be tenable if the list of non-POS headers is smaller than the list of POS headers. — Eru·tuon 01:17, 22 May 2017 (UTC)
I edited the shortcut WT:POS to point to WT:EL#Part of speech, which is the actual voted and approved policy. We do have a comprehensive list of parts of speech. (New parts of speech may be added depending on the needs of each language, of course.) --Daniel Carrero (talk) 01:33, 22 May 2017 (UTC)
For anyone curious about why this is a non-trivial thing to analyze, I suggest you download a dump and try to do it yourself. DTLHS (talk) 01:40, 22 May 2017 (UTC)
A possible approach: all headword templates should add a category in one of the following formats: [language name] lemmas or [language name] non-lemma forms. It should be possible to use the regular search to find entries that have no headword templates at all. As for entries with headword templates for some, but not all language sections: do the dumps include categories generated by templates, or are they all pre-transclusion? If they do have such categories, it would be a simple matter of comparing category names with L2 headers. If not, it might be possible to go through the subcategories of Category:Lemmas by language and Category:Non-lemma forms by language and create a list of language headers to compare with the list of L2s in the dumps. That still won't find cases where there's at least one headword template in the language section, but one or more headword templates are missing. It's a start, though. Chuck Entz (talk) 06:04, 22 May 2017 (UTC)
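The comparison of L2 headers with lemma and non-lemma categories described above could look roughly like the Python sketch below. It assumes the page's category names are already available as a collection of strings (for example, extracted from the category dump); the function name and that input format are assumptions, not an existing tool.

```python
import re

# Level-2 headers only: exactly two "=" on each side, e.g. "==English==".
L2_RE = re.compile(r"^==[ \t]*([^=].*?)[ \t]*==[ \t]*$", re.MULTILINE)


def languages_without_headword_category(wikitext, categories):
    """List language sections with no '<language> lemmas' or
    '<language> non-lemma forms' category on the page."""
    categories = set(categories)
    missing = []
    for language in L2_RE.findall(wikitext):
        expected = {language + " lemmas", language + " non-lemma forms"}
        if not (expected & categories):
            missing.append(language)
    return missing
```

As noted in the comment above, this only finds language sections with no headword template at all; a section where one headword template is present and another is missing would not be reported.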
There is a category dump, but it's separate from the main dump, is in SQL instead of XML and is somewhat hard to work with. DTLHS (talk) 06:24, 22 May 2017 (UTC)
This filter might do the trick; it looks for the "known" POS headers and checks to see if there is any sort of template used before the definition line.
rx := "(s?)([=]{3,7})(Adjective|Adverb|Ambiposition|Article|Brivla|Circumfix|Circumposition|Classifier|Cmavo|Combining form|Conjunction|Contraction|Counter|Determiner|Diacritical mark|Gismu|Han character|Hanja|Hanzi|Infix|Interfix|Interjection|Kanji|Letter|Ligature|Lujvo|Noun|Number|Numeral|Participle|Particle|Phrase|Postposition|Prefix|Preposition|Prepositional phrase|Pronoun|Proper noun|Proverb|Punctuation mark|Rafsi|Romanization|Root|Suffix|Syllable|Symbol|Verb)\1([^{]*?)#(.+)";
new_wikitext rlike rx
Not sure how many false positives this would result in, but we can try it out and if it proves unhelpful amend or disable it. Added it as AF #68, tagging with "no head temp". - [The]DaveRoss 15:54, 22 May 2017 (UTC)
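For anyone who wants to try the same check offline (for instance against a dump), here is a rough Python sketch of the idea behind the filter: find a POS header, then see whether a definition line (`#`) is reached before any template (`{{`) is opened. It is not the AbuseFilter rule itself; the short header list and the function name are assumptions for illustration.

```python
import re

# A short, hypothetical subset of POS headers; the filter above lists many more.
POS_HEADERS = ["Adjective", "Adverb", "Noun", "Proper noun", "Verb"]

# A POS header line such as "===Noun===" (levels 3 to 6), on its own line.
HEADER_RE = re.compile(
    r"^(={3,6})\s*(?:%s)\s*\1[ \t]*\n" % "|".join(map(re.escape, POS_HEADERS)),
    re.MULTILINE,
)


def missing_headword_sections(wikitext):
    """Return the POS header lines whose section reaches '#' before any '{{'."""
    hits = []
    for match in HEADER_RE.finditer(wikitext):
        rest = wikitext[match.end():]
        # Only inspect text up to the next header line.
        next_header = re.search(r"^=", rest, re.MULTILINE)
        section = rest[: next_header.start()] if next_header else rest
        hash_pos = section.find("#")
        brace_pos = section.find("{{")
        if hash_pos != -1 and (brace_pos == -1 or hash_pos < brace_pos):
            hits.append(match.group(0).strip())
    return hits
```

Like the filter, this inspects the whole page text rather than only the changed lines, so it reports old untemplated headword lines as well as new ones.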
@TheDaveRoss: Thank you! Is there any way to find edits that have triggered that tag, to check how it's working? — Eru·tuon 00:34, 23 May 2017 (UTC)
Here is the log. It caught some edits to Wiktionary: pages; it should only look in the main-namespace and maybe also Reconstructions:. It also caught e.g. diff not because of any error in that edit, but because the page elsewhere has an untemplatized headword line, in the Scots section. That's probably OK, if we treat the log as a source of entries to clean up rather than edits to disallow. - -sche (discuss) 05:59, 23 May 2017 (UTC)
Ahh, so the filter checks the entire entry, not just the part that has been changed. I think it should be restricted to namespaces that contain entries: main and Reconstruction. The pages in the Wiktionary namespace are just noise to be ignored. Some Appendix pages contain entries (for instance, Appendix:Quenya/Elda), but most probably don't, so it would be best to exclude them. — Eru·tuon 06:45, 23 May 2017 (UTC)
DTLHS fixed the namespace (currently it is NS:0 only). I actually thought that looking at the whole entry (instead of newly added lines) was a feature, since it gives an opportunity to fix old problems as well as new. If that is not ideal it can check only new lines instead. - [The]DaveRoss 11:49, 23 May 2017 (UTC)
I agree it's fine to check the whole entry, as long as it remains set to "tag" and not "warn". :) - -sche (discuss) 21:28, 23 May 2017 (UTC)

Whew! TheDaveBot is really uncovering a lot of untemplated headwords... — Eru·tuon 21:43, 23 May 2017 (UTC)

When the next dump comes out I can try and find all of them, seems like there might be quite a few. - [The]DaveRoss 21:57, 23 May 2017 (UTC)

I just spotted قیلمق. But the edit wasn't tagged? —CodeCat 13:25, 3 June 2017 (UTC)

Edit request for language data

Please add a rule to replace ë with e for dum. —CodeCat 16:29, 21 May 2017 (UTC)

@CodeCat: Like this? —Aɴɢʀ (talk) 17:11, 21 May 2017 (UTC)
Yes, thank you. —CodeCat 17:12, 21 May 2017 (UTC)

Working with Categories

I recently added "Category:English words prefixed with de-" to the page for "decant", but decant still doesn't show up in the list of words. Where is the documentation for this particular aspect of working with categories? Thanks! —This unsigned comment was added by Raspberrybeloved (talk • contribs).

  • It takes a few minutes for the category-mechanism to catch up. And I'm not totally convinced that decant has that prefix (it certainly starts with "de"). SemperBlotto (talk) 14:01, 22 May 2017 (UTC)
    • These categories are automatically added by etymology templates like {{affix}}, {{prefix}} and such, so you never need to add them manually. —CodeCat 14:17, 22 May 2017 (UTC)
The prefix de- was added to decant while the word was in Latin. We only add "prefixed with" categories when the prefix was added in the current language (English). So decant should not be in the category English words prefixed with de-. — Eru·tuon 01:33, 23 May 2017 (UTC)

Template:termetym

I'm working on a module to grab the etymology of a term, based on {{desctree}}, called {{termetym}} (originally {{termetyl}}). The idea is to nest etymologies to reduce duplication and make entries more accurate and consistent. It's pretty awful how so many parent and child entries have conflicting etymologies. I'm running into a looping issue right now, though. Could someone have a look and see what's wrong? You can find an example here: duvet. --Victar (talk) 01:11, 23 May 2017 (UTC)

I fixed the template looping. — Eru·tuon 01:26, 23 May 2017 (UTC)
@Erutuon: Thanks! It's always the obvious fixes that are the hardest to find. --Victar (talk) 01:34, 23 May 2017 (UTC)

The modules are Module:User:Victar/term etymology and Module:User:Victar/term etymology/templates. — Eru·tuon 01:20, 23 May 2017 (UTC)

What do people think? Good idea worth pursuing? Any foreseeable problems? --Victar (talk) 01:51, 23 May 2017 (UTC)

One issue I see is that some child languages may have more than one word/form (which may or may not have the same meaning). How do you handle this? --Octahedron80 (talk) 03:43, 23 May 2017 (UTC)
@Octahedron80: Check out the example. It's child to parent, so in the child etymology, you select the correct parent manually. --Victar (talk) 03:46, 23 May 2017 (UTC)

Does this alter the fetched templates for the appropriate language? So if we had a French entry:

From {{inh|fr|frm|foo}}.

being imported into an English entry:

From {{bor|en|fr|foo}}, {{termetyl|fr|foo}}.

Would this be equivalent to:

From {{bor|en|fr|foo}}, from {{der|en|frm|foo}}.

Otherwise, you might be miscategorizing entries. —JohnC5 04:07, 23 May 2017 (UTC)

That might be somewhat complicated. It might require compiling a full chain of derivation, listing the proximate relationships between each word in the chain, and then a function to determine the relationship between the word of the current entry and each word in the chain of derivation. Two examples: ❶ If an English word is inherited from a Middle English word that's inherited from an Old English word, the English word is inherited from the Old English word. ❷ But if an English word is borrowed from a French word that's inherited from a Latin word, the English word is derived and not inherited from the Latin word. The module has to somehow distinguish those two cases. Not sure how to do that. But it can't be done by checking if Latin is an ancestor of English. That wouldn't work in the case of ❸ a French word borrowed from a Spanish word that is inherited from a Latin word. There, the French word is derived and not inherited from the Latin word, even though Latin is an ancestor of French. — Eru·tuon 05:24, 23 May 2017 (UTC)
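One way to think about the three cases above is as a walk along the chain of proximate relationships: a later step counts as inheritance only if every step before it was also inheritance. The Python sketch below illustrates that rule under this assumption; the function name and the list-of-codes input are hypothetical and are not how the module is actually written.

```python
def relations_from_entry(steps):
    """Relate the starting entry to each successive ancestor in a chain.

    steps: proximate relations, nearest ancestor first, each one of
    "inh", "bor" or "der". The first step is kept as it is. A later
    step stays "inh" only while every earlier step was "inh";
    otherwise it becomes "der".
    """
    result = []
    all_inherited = True
    for index, step in enumerate(steps):
        if step not in ("inh", "bor", "der"):
            raise ValueError("unknown relation: %r" % (step,))
        if index == 0:
            result.append(step)
        elif all_inherited and step == "inh":
            result.append("inh")
        else:
            result.append("der")
        all_inherited = all_inherited and step == "inh"
    return result
```

With this rule, case ❶ (`["inh", "inh"]`) stays inherited all the way, while cases ❷ and ❸ (`["bor", "inh"]`) both give a borrowing followed by a derivation, without ever consulting the language family tree.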
I did a simple mw.ustring.gsub on the results, which should remedy that, although now you have to enter {{termetym|en|fr|foo}} (originally {{termetyl|en|fr|foo}}). @Erutuon, do you want to have a look and see if it checks out for you? --Victar (talk) 05:50, 23 May 2017 (UTC)
{{inh}} does not always become {{der}} though.... —JohnC5 05:59, 23 May 2017 (UTC)
I believe there are two schools of thought on the use of {{inh}}. Some only use it for the first derivation in the etymology. Others use it for every inherited step. Which is correct, I don't know. If we did the latter, all you would need to do is only start replacing {{inh}} with {{der}} after the first instance of {{bor}} or {{der}}. We'd also have to add a |noinh= parameter which is a bit annoying. --Victar (talk) 06:16, 23 May 2017 (UTC)
I hadn't heard of this other school of thought, in which only the relationship to the nearest ancestor word counts as inheritance. I'm doubtful. Where did you find this idea expressed? — Eru·tuon 06:49, 23 May 2017 (UTC)
I've never heard anyone espouse the "first-only" school of thought, and I certainly can say that it is not the intended usage of the template (no offense). Inheritance should be shown all the way down where accurate (I'm sure @CodeCat can back me up on this). —JohnC5 06:55, 23 May 2017 (UTC)
LOL, oh man, the pitchforks came out. No need to freak out. I can program the latter, no problem. --Victar (talk) 07:26, 23 May 2017 (UTC)
There you go. That should do it. --Victar (talk) 07:55, 23 May 2017 (UTC)
Also {{etyl}}. —JohnC5 15:10, 23 May 2017 (UTC)
Done. --Victar (talk) 15:34, 23 May 2017 (UTC)
I can't figure out what your code is doing... Also, the syntax (x|y|z) doesn't work in Lua regex. — Eru·tuon 17:13, 23 May 2017 (UTC)
@Erutuon: No? What's the Lua equivalent? --Victar (talk) 17:24, 23 May 2017 (UTC)
@Victar: There is none. You have to write out each alternative separately and put it in a separate instance of the function. — Eru·tuon 17:42, 23 May 2017 (UTC)
Lua does not have regexes; it has patterns, and patterns lack any equivalent to (x|y|z). —JohnC5 17:42, 23 May 2017 (UTC)
That's pretty limiting of Lua. Fixed one of them, but still need to figure out how to replace this one:
local pattern = ".*?(\{{(bor|der).*)"
local match = mw.ustring.match(etymology, pattern, match)
@Erutuon, JohnC5:, could you have a look at my sloppy Lua work? --Victar (talk) 20:53, 23 May 2017 (UTC)
I added some comments. Hope that helps. --Victar (talk) 17:41, 23 May 2017 (UTC)

I think it would be great if it could be triggered with |etyl=1 in {{bor}}, {{der}} and {{inh}}. --Victar (talk) 15:36, 23 May 2017 (UTC)

I would also like that, though "etyl" stands for "etymological language", so I would prefer |ety=1. —JohnC5 17:42, 23 May 2017 (UTC)
HAH! I actually just thought it was short for etymology. I would have called the template {{termetym}} otherwise. --Victar (talk) 20:53, 23 May 2017 (UTC)
But yeah, if this goes well and people agree to this, then I'd love to see this integrated into {{bor}}, {{der}} and {{inh}}. That would also simplify the logic as well ({{inh}} maintains inheritance, {{bor}} and {{der}} do not. All of them replace {{bor}} with {{der}}). —JohnC5 03:56, 24 May 2017 (UTC)

@JohnC5, Erutuon What's the policy for making such templates live? I'd like to try out some real life examples. --Victar (talk) 20:53, 23 May 2017 (UTC)

@Victar: I mean, there's no harm in trying it out on a few mainspace templates just to test. —JohnC5 03:31, 24 May 2017 (UTC)
OK, I guess I'll move them out of my user space. --Victar (talk) 03:34, 24 May 2017 (UTC)
Just for testing at the moment, of course. —JohnC5 03:41, 24 May 2017 (UTC)
Moved. Appendix:duvet categories look good. --Victar (talk) 04:08, 24 May 2017 (UTC)
@JohnC5, Erutuon, spoke too soon. Looks like the last step of the loop isn't being parsed now for some reason? --Victar (talk) 04:15, 24 May 2017 (UTC)
@Victar: I've fixed it. You were calling preprocess on each recursive call, so it was rendering the html for the internal calls. This way, they weren't editable like we wanted. Now, it only preprocesses once before returning. I also made it more efficient: you only need to run all the template language manipulation on the top call, not all the recursive sub-calls. Please check that everything is working! —JohnC5 05:54, 24 May 2017 (UTC)
@JohnC5: You rock! I wonder if some of those optimizations could/should be made to {{desctree}} as well. --Victar (talk) 14:01, 24 May 2017 (UTC)
@Victar: Yes, I think they could. Please remind me about this later. —JohnC5 14:55, 24 May 2017 (UTC)
Okay, the categories look good. But what about an English word inherited from Middle English, borrowed from French, inherited from Latin? I think an example of that is peace. Is there a way the module can be made to handle that? — Eru·tuon 07:27, 24 May 2017 (UTC)
Well, from the English page's perspective, that would be English inherited from Middle English and then derivation the rest of the way. The fact that Middle English borrowed from Old French and Old French inherited from Latin is irrelevant to the English entry. I've reworked peace to show this. The template would handle this correctly. —JohnC5 07:41, 24 May 2017 (UTC)
Okay, I don't understand how it can handle that example, since it either keeps all {{inh}} templates or converts them to {{der}}. Am I misunderstanding how it works? — Eru·tuon 07:57, 24 May 2017 (UTC)
@Erutuon: So I should clarify the possible distributions of the templates. If I were to make a pseudo-regex for the possible orders of templates ({{inh}} = i, {{der}} = d, {{bor}} = b, {{cal}} = c) within an etymology section, these would be all of them:
  • An etymology containing inheritance: i+d* (at least one {{inh}} followed by any number of {{der}})
  • An etymology containing borrowing: bd* (only one {{bor}} followed by any number of {{der}})
  • An etymology containing calquing: cd* (only one {{cal}} followed by any number of {{der}})
  • An etymology containing derivation: d+ (at least one {{der}})
You cannot have a section like iidid or iibdd. These would be considered ill-formatted according to the intended usages of the templates. I also noticed we need to add coverage for {{cal}} and all the aliases of these templates.
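The four pseudo-regexes above can be combined into one real regular expression and used to check a sequence of template codes. This is only a small Python sketch of that check, using the same one-letter codes; the function name is an assumption.

```python
import re

# i = {{inh}}, d = {{der}}, b = {{bor}}, c = {{cal}}
WELL_FORMED = re.compile(r"i+d*|bd*|cd*|d+")


def is_well_formed(sequence):
    """Return True if the whole sequence matches one of the allowed orders."""
    return WELL_FORMED.fullmatch(sequence) is not None
```

For example, `is_well_formed("iid")` and `is_well_formed("bdd")` are true, while `is_well_formed("iidid")` and `is_well_formed("iibdd")` are false, matching the ill-formatted cases named above.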
After writing all of this, I realized that you might have been asking how, when transcluded, peace would have a distribution of ibi before processing, which would get processed to idi. I'll look into fixing this, but it shouldn't be too bad. I may not get to this immediately though. —JohnC5 14:54, 24 May 2017 (UTC)
This is all very complicated, and I realized peace may be a bad example. The first "inherit" would be indicated using {{inh}}, while the rest of the derivation (bi) would be handled by {{termetym}}. And hypothetically each entry in the chain of derivation would have {{termetym}}, and the Middle English entry would turn the {{inh|fr|la}} into {{der|enm|la}}, so there might not end up being a problem after all. I was thinking of hypothetical scenarios: iibi (English word inherited from Old English, borrowed from Latin, inherited from Proto-Indo-European). Not sure if that actually occurs. (I mean, it wouldn't occur in one etymology section; it might occur if you were transcluding the etymology from each item in the chain.)
@Erutuon: I think, like me, and as JohnC5's and my edits can testify, you might be overthinking it. You're never going to have an entry that is iibi because it should always become iidd from the perspective of the source. As soon as the chain hits a {{bor}}, that step in the chain and the rest going forward become {{der}}. --Victar (talk) 18:41, 24 May 2017 (UTC)
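The rule described here (from the first {{bor}} onward, everything becomes {{der}}) could be sketched roughly as follows. The function name is hypothetical, and the sketch assumes that only b triggers the switch; how {{cal}} or an earlier {{der}} should be treated is not settled in this thread.

```python
def normalize_chain(chain: str) -> str:
    """Turn the first 'b' and every later step into 'd'.

    Codes: i = {{inh}}, d = {{der}}, b = {{bor}}.
    Assumption: only a borrowing step breaks the inheritance chain.
    """
    first_bor = chain.find("b")
    if first_bor == -1:
        # No borrowing in the chain, so nothing changes.
        return chain
    return chain[:first_bor] + "d" * (len(chain) - first_bor)
```

Under this assumption the hypothetical iibi chain comes out as iidd, and the ibi distribution mentioned for peace comes out as idd rather than idi.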
Still, using a parameter |noinh=1 seems messy. The module should be able to tell on its own when inheritance should be changed to derivation, because it is completely predictable. Editors, on the other hand, are prone to error. I like the idea of using a code to represent the chain of derivation. There are difficulties, though. I'm considering the idea of a template showing the proximate relationships between words in the chain, but am not sure how the parameters would be structured. — Eru·tuon 18:04, 24 May 2017 (UTC)
@Erutuon: Yeah, |noinh=1 is a bit annoying, but the only time you would ever use it is after a {{bor}} and {{der}}, so it's pretty straightforward. Ideally, it would be great if we could simply use |ety=1 within {{bor}}. --Victar (talk) 18:41, 24 May 2017 (UTC)
Hmm, so essentially to make this automatic, {{termetym}} would have to communicate with the etymology template that comes before it, and that is impossible. Humph. — Eru·tuon 19:01, 24 May 2017 (UTC)
Exactly. --Victar (talk) 19:11, 24 May 2017 (UTC)
This is why we suggested integrating this functionality into {{bor}}, {{der}}, and {{inh}}, so that it would know. —JohnC5 19:13, 24 May 2017 (UTC)
Heh, now I finally get it. I have some suggestions, though. — Eru·tuon 21:00, 24 May 2017 (UTC)
A good test would be pikake, which has a little bit of everything, plus a sort of a loop. Chuck Entz (talk) 08:45, 24 May 2017 (UTC)
@Chuck Entz: does this serve, Appendix:pikake? --Victar (talk) 15:51, 25 May 2017 (UTC)
@Victar: "likely a from"? Looks like something got overlooked ... Chuck Entz (talk) 04:56, 26 May 2017 (UTC)
@Chuck Entz: Just needed to remove the extra "a" from the Latin etymology. More to the point though, is it functioning how you would hope/expect? --Victar (talk) 05:39, 26 May 2017 (UTC)

Another thing that needs to be done is to add |nocat=1 to various templates, like {{compound}}. Should be simple though. --Victar (talk) 14:24, 24 May 2017 (UTC)

@JohnC5, Chuck Entz, CodeCat Nothing truly to do with {{termetym}}, but I just cleaned up peacock and it got me thinking again about how to deal with {{compound}}. Should it just actually end the etymology, requiring people to click on either element? In this case, I only followed the tree up through the first element, since the second is really just a meaning intensifier. Or is this just an example of how it should be left to the editor's discretion and there is no one standard? --Victar (talk) 16:17, 24 May 2017 (UTC)
There are always going to be things like rebracketing and backformation to mess us up (not to mention calques), and complex compounds with lots of potential derivation chains. Perhaps we need parameters to control whether to follow the derivation of specific morphemes. Chuck Entz (talk) 04:56, 26 May 2017 (UTC)

I have a suggestion. To me, the name of the template and the modules are somewhat inscrutable. It would make more sense to me if the template were called {{getetym}}. That is a short description of what the template does: grab an etymology from another entry's etymology section. termetym sounds like it means etymology of a term, which could describe an entire etymology section, and it's unclear, from the name, what the template does to or with the etymology section. — Eru·tuon 21:00, 24 May 2017 (UTC)

Hmm, I'm not a fan. "get" is associated with functions and not in keeping with other template naming schemes. {{termetym}} was fashioned after {{desctree}}. We could use {{etym}}, instead of it being a redirect to {{etyl}}, or we could also make a shorter redirect, like {{tetym}}, {{tety}} or the reversed, {{etyt}}. --Victar (talk) 21:17, 24 May 2017 (UTC)

The other suggestion is that process_etymology and frame.preprocess should both be moved to a function in Module:term etymology so they can, in future, be called by Module:etymology or Module:etymology/templates and included in the various etymology templates. That's of course assuming no one objects. — Eru·tuon 22:51, 24 May 2017 (UTC)

This sounds very promising, but also very expensive in terms of memory. Would it work on a page like mole, which has a lot of etymology sections in several language sections? Can it handle the fact that one English etymology section refers to Spanish mole, which itself has multiple etymology sections? (Perhaps it could use anchors similar to senseids?) Will it use so much memory that it breaks the page? - -sche (discuss) 05:54, 25 May 2017 (UTC)

Answers to two of your questions: It can handle there being several language sections, each with its own Etymology section. (That can be seen in the demonstration page Appendix:duvet.) But it can't handle multiple etymologies in the same language section yet. — Eru·tuon 06:04, 25 May 2017 (UTC)

Welsh singulative parameter

Could someone who's better at editing templates than I am please add (1) options for 1=m-p, 1=f-p, 1=f-m-p, 1=m-f-p (also for g= and g2=) to {{cy-noun}}, and (2) the function that 2= displays "singulative" instead of "plural" whenever 1= is set to one of the plural options? See abwyd for what I'd like the end result of {{cy-noun|m-p|abwydyn}} to look like. Thanks! —Aɴɢʀ (talk) 12:05, 23 May 2017 (UTC)

Very oddly designed template. It uses an instance of {{head}} for the first form and then manually tags the rest of the forms. — Eru·tuon 03:40, 24 May 2017 (UTC)
I would try to do what you ask, but I don't know how to make sure the acceleration-related HTML stuff continues to work. — Eru·tuon 03:54, 24 May 2017 (UTC)
Hmm, maybe it would be better for someone to make a module for Welsh headword lines. Unfortunately, I'm not the one to do that. Would anyone like to try? —Aɴɢʀ (talk) 09:23, 24 May 2017 (UTC)
I've started a new template {{cy-noun/new}} that shows "singulative" when the gender is set to m-p or f-p. It doesn't have accelerated entry creation (green links), though. —Aɴɢʀ (talk) 14:30, 2 June 2017 (UTC)

bot problem

SemperBlottoBot hasn't worked since they changed http to https. I've had a go at updating the bot software and now get this error message when the bot runs:-

File "C:\<whatever>\pywikibot\data\api.py", line 2560, in getCookie
   prefix = login_result['login']['cookieprefix']

KeyError: u'cookieprefix'

Any ideas? SemperBlotto (talk) 16:04, 23 May 2017 (UTC)

What version of pywikibot do you have? DTLHS (talk) 16:09, 23 May 2017 (UTC)
Sorry. Fixed it. Bot now working. SemperBlotto (talk) 04:00, 24 May 2017 (UTC)

template:unk

I'd like to add Category:Etymology templates to this template, but I don't know how to do that. Also, I think that unk. as a notation isn't extremely clear, nor useful. --Barytonesis (talk) 21:31, 23 May 2017 (UTC)

@Barytonesis: Done. {{unk}} is very helpful for categorization. It should be used on entry pages in conjunction with a source that cites the etymology as unknown. Otherwise {{rfe}} should be used instead. --Victar (talk) 00:22, 24 May 2017 (UTC)
Thanks. I don't dispute the usefulness of the template, only its name. --Barytonesis (talk) 01:06, 24 May 2017 (UTC)
@Barytonesis: How so? What should it be named instead?--Victar (talk) 01:12, 25 May 2017 (UTC)
@Victar: {{unknown}}? --Barytonesis (talk) 22:26, 25 May 2017 (UTC)
@Barytonesis: Oh, you're talking about the template name. No, {{unk}} is a good name, and in line with {{der}}, {{bor}}, etc. I do wonder if we should be using {{etyl|und}} or {{der|lang|und|-}} instead, or have {{unk|lang}} be a redirect to the former. --Victar (talk) 22:33, 25 May 2017 (UTC)
{{der}} is a shortcut to a longer name, and I support giving {{unk}} the same treatment. —CodeCat 22:37, 25 May 2017 (UTC)
Yeah, that's fine as a template page, but people should still use {{unk}} and not {{unknown}}, as we do with {{der}} and {{derived}}. If you use {{derived}}, you're going to get your wrist slapped. --Victar (talk) 22:43, 25 May 2017 (UTC)
People are free to use either the full name or the shortcut. If we don't want people to use the full name, we shouldn't have it. —CodeCat 22:54, 25 May 2017 (UTC)
That's incorrect, and states so on the template page. --Victar (talk) 23:39, 25 May 2017 (UTC)
I don't see it mentioned anywhere that {{unknown}} is not to be used. The fact that it's the name of the template moreover invites people to use it. —CodeCat 23:41, 25 May 2017 (UTC)
@CodeCat: It seems you missed the fact that I just moved {{unk}} to {{unknown}}. No {{unknown}} previously existed. I was referring to {{der}} which cites the usage der, inh and bor. You can write out {{mention}} and {{link}} as well but there it is also not the recommended usage. --Victar (talk) 01:43, 26 May 2017 (UTC)
@Victar: I don't see anything on the template documentation page for {{der}} that says you shouldn't use the full spelling {{derived}}. Only thing is that you should use {{inh}} or {{bor}} whenever possible, to make the derivation more specific. — Eru·tuon 02:25, 26 May 2017 (UTC)
Exactly that. Also, if you ever do use {{borrowing|en|fr|duvet}} or {{mention|en|apple}}, rest assured, someone like @Angr is going to come along and change it to {{bor}} or {{m}} and probably write something on your talk page. --Victar (talk) 02:35, 26 May 2017 (UTC)
ಠ_ಠ —Aɴɢʀ (talk) 07:17, 26 May 2017 (UTC)
LOL, yes, and give you those eyes. --Victar (talk) 07:30, 26 May 2017 (UTC)

Pronunciation template for Arabic (MSA)?

@Benwing, Atitarev, Mahmudmasri, Wikitiki89, Kolmiel, Erutuon Is this doable? Examples:

صَدِيق (ṣadīq):

دُكْتُور (duktūr):

(The template can perhaps do most (if not all) of its work by extracting all the headword templates (tashkil forms, manual transliterations, etc.) on the page, and hence be parameter-less.) Wyang (talk) 23:12, 24 May 2017 (UTC)

@Benwing2 and others interested. Wyang (talk) 23:18, 24 May 2017 (UTC)
Sounds interesting. I've been struggling to think about how the consonants ظ and ع are pronounced. (Also, I've been thinking of an Arabic script for German and related Germanic languages and dialects.) --Lo Ximiendo (talk) 23:55, 24 May 2017 (UTC)
Have you seen (heard!) this audio IPA chart? [8] Equinox 00:02, 25 May 2017 (UTC)
I started a module that generates the Arabic pronunciation from the transliteration: Module:ar-pronunciation. It's not quite ready; it doesn't show stress, or transcribe ـَة (-a) as /ah/, and it might not be able to handle phrases. — Eru·tuon 01:57, 25 May 2017 (UTC)
@Wyang: Having the module extract voweled forms from the headword templates sounds great, if we can make it work when there are multiple Pronunciation sections. (It would also be a great feature to have {{grc-IPA}} extract macroned forms from the headword template.) — Eru·tuon 18:50, 25 May 2017 (UTC)

دُكْتُور (duktūr):

As far as presentation is concerned, I would format the examples as shown on the right. This is the way I typically format dialectal pronunciations in English entries. I admit it's repetitive to have two of the prefix IPA (key). — Eru·tuon 18:58, 25 May 2017 (UTC)
The pronunciation template could be similar to the one for Chinese. Also, @Erutuon, what made you use /ɡ/ instead of /d͡ʒ/ for the jeem? --Lo Ximiendo (talk) 21:05, 25 May 2017 (UTC)
@Lo Ximiendo: The fact that it was supposedly a palatalized [ɡʲ] in Classical Arabic. But I don't mind it being changed. — Eru·tuon 21:11, 25 May 2017 (UTC)
The pronunciations are for MSA, aren't they? And if you wanted to do Classical, then wouldn't you have used [ɡʲ]? It would also be ideal if it could do syllabification. --WikiTiki89 21:15, 25 May 2017 (UTC)
Yeah, but MSA has a bunch of different regional pronunciations. Again, you can change it. — Eru·tuon 21:18, 25 May 2017 (UTC)
Yeah, but there's a sort of "Standard MSA". And the /g/ pronunciation is heavily marked as Egyptian. And I did change it. --WikiTiki89 21:45, 25 May 2017 (UTC)
At least the pronunciations /ʒ/ and /dʒ/ are equally standard. You'll never hear a Syrian or Lebanese person use /dʒ/ instead of his native /ʒ/. It is also rather rare for northern Egyptians to use /ʒ/ or /dʒ/ instead of their native /g/, but that does happen. Kolmiel (talk) 11:39, 26 May 2017 (UTC)
@Erutuon: Thanks! I wasn't aware Module:ar-pronunciation exists, but it seems like a pretty good start. The format is amenable to change, and I think putting the IPA tag in front of the individual pronunciations looks more aesthetic. Re headword templates: An example of such parsing is Module:th-headword and Module:th-pron―which do it in the reverse order, i.e. the headword template interprets input in the pronunciation template. There is probably a way around multiple pronunciation sections, but it will require some investigation. Wyang (talk) 21:47, 25 May 2017 (UTC)
Let's focus on getting the transcriptions right before we focus on parsing headword templates. --WikiTiki89 21:50, 25 May 2017 (UTC)
Okay. I added syllabification in; the two special cases of اللّٰه (allāh) still need to be fixed. Wyang (talk) 22:18, 25 May 2017 (UTC)
I will join the efforts to make Arabic transcriptions work. It may never be perfect because of the lack of references and dialectal differences, even with MSA pronunciations. There are also variants and different styles, but we can agree on what and how we transcribe. --Anatoli T. (обсудить/вклад) 22:24, 25 May 2017 (UTC)

Tabbed languages broken again?

After being fixed for a few weeks the script seems to have broken again. DTLHS (talk) 23:31, 24 May 2017 (UTC)

Yes, I can't select any of the tabs. The only language I can see or edit (without opening the entire page) is the top language, usually English. I'm guessing it only affects the Firefox browser. —Stephen (Talk) 05:11, 25 May 2017 (UTC)
Broken for me on Firefox too. —Aɴɢʀ (talk) 08:55, 25 May 2017 (UTC)
Just wondering, maybe it is provoked by new div tag Wiktionary:Wikimedia Tech News/2017#Tech News: 2017-21. --Vriullop (talk) 09:02, 25 May 2017 (UTC)
  • Change the 12th line to this:
    var bodyContent = $(".mw-content-ltr .mw-parser-output")[0], // NOT #bodyContent
    This fixes all pages that have the new .mw-parser-output div. Others need null-editing. --Dixtosa (talk) 15:51, 26 May 2017 (UTC)
@Dixtosa Thanks. I'm not sure if this is an unrelated bug, but when I click on certain section links (for example lap cheong#French) it breaks until I do a hard refresh again. DTLHS (talk) 16:40, 26 May 2017 (UTC)

Bot task: Latin comparatives and superlatives

Rather oddly on here, the comparative and superlative degrees of Latin adjectives (e.g. laetus) are generally neither listed in the headword line (even though the various Latin adjective headword line templates, such as {{la-adj-1&2}}, do support them) nor the declension table itself but are simply given below the table. The consensus here is to move to the headword line comparatives and superlatives that are already there below the table. Would someone mind having a bot do that for us? Esszet (talk) 23:43, 24 May 2017 (UTC)

Wikidata access

What was the outcome of the proposal Wiktionary:Beer_parlour/2017/February#Proposal:_Implementing_Wikidata_access? (I wasn't sure if I should ask directly in the proposal when it was 3+ months old). Is it possible to include modules from wikidata now? –dMoberg 10:18, 25 May 2017 (UTC)

PS. What kind of permissions/features do I need to activate in sv.wikt to get the same possibilities?

@Moberg: Nothing happened. Judging by how the Wikidata business has gone so far, the process will be entirely passive on our part, and they will not consult with us about what will happen and when. —Μετάknowledgediscuss/deeds 17:40, 26 May 2017 (UTC)
@Metaknowledge: Ugh, why not? -.- :( Do you have insight into what has to be done? –dMoberg 23:38, 27 May 2017 (UTC)
No. As I said, I gather that we don't have to do anything until it happens (when it will surely be announced in the Beer parlour). —Μετάknowledgediscuss/deeds 03:26, 28 May 2017 (UTC)
Note that there is an ongoing vote Wiktionary:Votes/2017-05/Installing Wikidata, although it is not clear what installation is requested, probably mw:Extension:Wikibase Client and d:Wikidata:Arbitrary access. It would interfere with current development plans d:Wikidata:Wiktionary/Development/Proposals/2015-05 or it will happen anyway when ready. --Vriullop (talk) 10:54, 28 May 2017 (UTC)
@Vriullop Sorry if I'm being stupid. Why would it interfere with those plans? –dMoberg 21:51, 1 June 2017 (UTC)
It may or may not, I am not sure at all. @Lea Lacroix (WMDE) At what point will we have Wikibase Client and arbitrary access enabled? Is it worth requesting it for testing purposes? --Vriullop (talk) 06:12, 2 June 2017 (UTC)
I am less concerned about the unilateral project that is being pushed on Wiktionary communities and more concerned about enabling local communities to decide how they want to use the Wikidata structure and data. - [The]DaveRoss 11:47, 2 June 2017 (UTC)

Hello, thanks for your questions. Enabling arbitrary access on Wiktionary is the next step after enabling sitelinks for non-main namespaces. We didn't plan anything yet, but since several users from English Wiktionary asked us to enable it, we may adapt our schedule to allow you to try it soon :) Of course, we will take your community vote into account: if the result is negative, we won't deploy anything. If you're interested in trying arbitrary access on English Wiktionary, then we will make the necessary changes so you can include Wikidata data in your pages.

That doesn't mean that we will force you to use the data. That means that you will be able to "embed" some information stored in Wikidata, using a simple code such as {{#statements:part of|from=Q9264}}. The community will remain totally free to decide where, when, and for which uses you want to use data. We will allow the possibility to do it, nothing more.

Of course, we can also deploy it on demand for other Wiktionaries.

About "installing Wikidata", we should talk about what you mean exactly. For enabling sitelinks, arbitrary access, etc. there is no need to install something. It's not a new database, it's about improving the code of the Wiktionaries to allow the access I describe above.

If you have any question about the development plan, the process or other things, feel free to ask. Lea Lacroix (WMDE) (talk) 14:09, 2 June 2017 (UTC)

By "improving the code" do you mean installing the Wikidata, Wikibase and DataValues extensions? I think that is what is meant when we speak of installing Wikidata. - [The]DaveRoss 14:18, 2 June 2017 (UTC)
Specifically, installing the necessary extensions in order to have data transclusion enabled with #statements and #property parser functions, as Lea points out, and also Lua functions as explained in mw:Wikibase/Installation/Advanced configuration#Data transclusion. Thanks Lea for your answer, it clarifies some doubts. --Vriullop (talk) 15:20, 2 June 2017 (UTC)

Template:la-adv

This template produces "comparable x, superlative y". How can I correct that to "comparative"? The erroneous label isn't on the documentation page https://en.wiktionary.org/wiki/Template:la-adv/documentation.

Could someone answer the question that's been unanswered on https://en.wiktionary.org/wiki/Template_talk:la-adv for 7 years? --Espoo (talk) 13:19, 25 May 2017 (UTC)

@Espoo: The problem's located in the adverbs function in Module:la-headword. I'll see if I can figure it out, but @JohnC5 might have more success. — Eru·tuon 21:15, 25 May 2017 (UTC)
@Espoo, Erutuon: D'this fix it? —JohnC5 22:45, 25 May 2017 (UTC)
@JohnC5: I guess so. I didn't want to do that because I thought maybe there was a reason for putting in "comparable" rather than "comparative". — Eru·tuon 23:40, 25 May 2017 (UTC)
Thanks --Espoo (talk) 06:43, 26 May 2017 (UTC)

How to empty Category:Unspecified script characters

I see it only has a few entries but I'm not clear on which category or module needs to be edited per entry in order to actually empty this maintenance category. —Justin (koavf)TCM 01:24, 26 May 2017 (UTC)

The thing to do is update Module:scripts/data so that each character is included in the range of one of the scripts on that page. :) A couple of the characters are Thai; it seems our current 'range' for Thai is too narrow. - -sche (discuss) 05:41, 26 May 2017 (UTC)
@-sche: You may have noticed that I have a very similar request at Module_talk:scripts/data#Updating_to_clear_out_Category:Unspecified_script_languages. —Justin (koavf)TCM 06:10, 26 May 2017 (UTC)

I think the problem comes from "findBestScript" in my Module:mul-letter, because "mul" has not been assigned EVERY available script; it may be solved by putting a script code into the mul-letter template. But I wish there were a better solution for "findBestScript" with "mul" that needs no manual input, if someone can help modify Module:scripts. --Octahedron80 (talk) 06:24, 26 May 2017 (UTC)

Yeah, I wish there were a function to return the script code for a codepoint. There are scripts that share characters: Cyrs and Cyrl, Grek and polytonic, the various Arabic script classes, and the various Latin script classes. But those conflicts can probably be resolved in some way or another. — Eru·tuon 07:01, 26 May 2017 (UTC)
In that case, we must choose a generic one for conflicting ranges, like Cyrl and Grek. --Octahedron80 (talk) 06:31, 27 May 2017 (UTC)
I think it should work this way: The function should only choose Cyrs if the character is not in the character list for Cyrl; same with Grek and polytonic. So letters that are only in polytonic, like (ha), would be considered polytonic, while letters like α (a) that are in both Grek and polytonic would be considered Grek. The modern script wins over the older one. — Eru·tuon 06:45, 27 May 2017 (UTC)
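The priority rule proposed here could be sketched like this. The character sets below are tiny and made up purely for illustration (the real ranges would have to come from Module:scripts/data), and the function name is hypothetical.

```python
# Toy character sets for illustration only; they are NOT the real ranges.
SCRIPT_CHARS = {
    "Grek": set("\u03b1\u03b2\u03b3"),             # α β γ
    "polytonic": set("\u03b1\u03b2\u03b3\u1f01"),  # α β γ plus ἁ
    "Cyrl": set("\u0430\u0431\u0432"),             # а б в
    "Cyrs": set("\u0430\u0431\u0432\u0463"),       # а б в plus ѣ (toy choice)
}

# The modern script is listed before the older one, so it wins
# whenever a character belongs to both.
PRIORITY = ["Grek", "polytonic", "Cyrl", "Cyrs"]


def script_for_char(char: str):
    """Return the first script in PRIORITY containing char, else None."""
    for code in PRIORITY:
        if char in SCRIPT_CHARS[code]:
            return code
    return None
```

With these toy sets, α resolves to Grek because Grek is checked first, while a character found only in the polytonic set resolves to polytonic. The ordering of PRIORITY is the whole design: conflicts are resolved by list position rather than by special-casing pairs of scripts.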
The function I'm proposing could also be used to add script classes to the links in the {{also}} template, to ensure they display well. — Eru·tuon 07:03, 26 May 2017 (UTC)
@Octahedron80: the number of entries in Category:Unspecified script characters is tiny enough that I think your module could accept a manually-defined script in those cases, to handle them. - -sche (discuss) 21:41, 26 May 2017 (UTC)
The module already accepts manual sc; thanks to Erutuon. But it should ideally detect the script on its own first. Running a bot to add sc everywhere would be backwards thinking. --Octahedron80 (talk) 06:40, 27 May 2017 (UTC)

Bot task: indicate what script “unspecified script languages'” entries are in

It is unfortunately just beyond the reach of my limited coding skills, but I wonder if one of you might run a script (if it would not be too difficult) to look through the "lemmas" categories of all the languages in Category:Unspecified script languages, and add : scripts = {"Latn"}, to the entry (in Module:languages) of each language which has entries in the Latin script ("A-Za-zÀ-ÖØ-öø-ɏḀ-ỿ"). This would presumably knock out most of them. More ambitiously, the script might also add script data for languages with entries in other scripts, possibly using their definitions in Module:scripts/data. (Great minds think alike, since I notice Koavf proposed something similar on Module talk:scripts/data, having only now read that discussion which I assumed was just restating the section above this...) - -sche (discuss) 06:19, 26 May 2017 (UTC)

Do we follow the Unicode standard for these script codes? If so it would be fairly easy for a bot to determine the script code of each character without having to read that module. - [The]DaveRoss 12:27, 26 May 2017 (UTC)
Closely enough that it would work for this task, I think. (But be careful that if an entry uses a Latin script letter and then an "IPA"/"modifier letter", the language should be said to use Latin script only, not also "Zsym".) Of course, one could always just knock out the Latin script characters first and then see what was left. - -sche (discuss) 17:36, 26 May 2017 (UTC)
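As a starting point for the first pass, the Latin character class quoted in the request can be applied to lemma titles roughly as below. This is a sketch only: the input format (a dict from language code to lemma titles) and the function name are assumptions, and fetching the category members is left out entirely.

```python
import re

# The class quoted in the request, "A-Za-zÀ-ÖØ-öø-ɏḀ-ỿ",
# written with escapes so the ranges are unambiguous.
LATIN = re.compile(r"[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F\u1E00-\u1EFF]")


def languages_with_latin_entries(lemmas_by_language):
    """Return the language codes that have at least one lemma title
    containing a Latin-script letter.

    lemmas_by_language: dict mapping a language code to a list of titles
    (hypothetical input format).
    """
    return sorted(
        code
        for code, titles in lemmas_by_language.items()
        if any(LATIN.search(title) for title in titles)
    )
```

Because the check only looks for Latin letters and ignores everything else in the title, a trailing modifier letter does not cause any second script to be reported, which is the caution raised above.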

IPA pharyngealization character

It has come to my attention that there are two similar Unicode codepoints that can be used to indicate pharyngealization (ironically, the "small" one appears larger):

  • U+02C1 MODIFIER LETTER REVERSED GLOTTAL STOP (e.g. /sˁ/)
  • U+02E4 MODIFIER LETTER SMALL REVERSED GLOTTAL STOP (e.g. /sˤ/ invalid IPA characters (ˤ), replace ˤ with ˁ)

Which one should we be using? It also seems that our IPA template complains about the latter one. It seems that Erutuon decided it was the "wrong symbol", but based on what? --WikiTiki89 16:14, 26 May 2017 (UTC)

  • I have no idea what U+02E4 is for or why it was added to Unicode, but I'd say U+02C1 is correct for normal IPA purposes, since it immediately follows U+02C0 MODIFIER LETTER GLOTTAL STOP and there is no MODIFIER LETTER SMALL GLOTTAL STOP. And of the full-size letters, U+0294 LATIN LETTER GLOTTAL STOP is the one intended for IPA, not U+0242 LATIN SMALL LETTER GLOTTAL STOP. —Aɴɢʀ (talk) 16:26, 26 May 2017 (UTC)
It was on the basis of the Wiktionary entries: the entry for the first sign, ˁ, says it's an IPA symbol, while the entry for the second, ˤ, says it's an Egyptological symbol. I'm beginning to be doubtful: perhaps the entries are actually wrong. We should have some external verification for this. Phonetic symbols in Unicode on Wikipedia says that the second symbol, U+02E4, is an IPA symbol. — Eru·tuon 17:02, 26 May 2017 (UTC)
In the official Unicode chart, U+02E4 is in the section "Additions based on 1989 IPA". I'm not sure exactly what that means. The Wikipedia article Pharyngealization says both characters can be used and uses them inconsistently and interchangeably. --WikiTiki89 17:20, 26 May 2017 (UTC)
I support standardizing on U+02C1, at least in IPA (perhaps Egyptian editors prefer to use the other character in romanizations?), for the reason Angr gives — it is the counterpart to ˀ and matches it in size. - -sche (discuss) 21:38, 26 May 2017 (UTC)
In what way is it the counterpart to ˀ? They represent completely unrelated things in IPA. It's really the counterpart to ʼ. --WikiTiki89 21:54, 26 May 2017 (UTC)
They immediately follow each other in their codepoints and are mirror images with parallel names, with ˁ derived from its counterpart ˀ by reversal. - -sche (discuss) 05:24, 27 May 2017 (UTC)
The makers of the Gentium font seem to think the neighboring characters are IPA, because they give them identical letterforms, while the more distant one is taller and has a serif: ˁˤˀ. Here, the distant one is in the middle. However, Doulos SIL and Charis SIL make no distinction: ˁˤˀ, ˁˤˀ. (If you don't have these fonts installed, just ignore this post.) — Eru·tuon 06:29, 27 May 2017 (UTC)
In fact U+02E4 ˤ seems to be the IPA character. It is not a standard Egyptological symbol; we just use it as a kludge instead of the more correct ꜥ because the latter used to not have wide font support. According to Unicode, U+02C1 is a ‘typographical alternate for U+02BF’, i.e. ʿ, whereas U+02E4 canonically decomposes to a superscript U+0295, i.e. ʕ. The official IPA website also has a link labeled ‘IPA and Unicode’ that leads here, where U+02E4 is given as the hex code for ‘pharyngealized’. So I would guess that U+02E4 is the official IPA character. —Vorziblix (talk) 09:07, 1 June 2017 (UTC)
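The names and the decomposition mentioned above can be checked locally with Python's standard unicodedata module. One small caveat: the Unicode data tags the U+02E4 mapping with "<super>", which marks it as a compatibility mapping. The helper name below is hypothetical.

```python
import unicodedata


def describe(char: str):
    """Return (code point, Unicode name, decomposition field) for one character."""
    return (
        f"U+{ord(char):04X}",
        unicodedata.name(char),
        unicodedata.decomposition(char),
    )


for char in ("\u02c1", "\u02e4"):
    print(describe(char))

# Compatibility normalization (NFKD) replaces U+02E4 with U+0295,
# while U+02C1 has no decomposition and is left unchanged.
print(unicodedata.normalize("NFKD", "\u02e4") == "\u0295")
print(unicodedata.normalize("NFKD", "\u02c1") == "\u02c1")
```

This only confirms what the Unicode character data says; it does not settle which of the two the IPA template should accept.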

Gadget-JavascriptHeadings

It does not work. I rewrote it. The code is here. Please update it. --Dixtosa (talk) 17:44, 26 May 2017 (UTC)

What does it do? --WikiTiki89 18:27, 26 May 2017 (UTC)
It transforms text surrounded by equals signs in JavaScript comments into HTML headers. — Eru·tuon 19:10, 26 May 2017 (UTC)
Oh. We should put short descriptions of what JS files do in a comment at the top. --WikiTiki89 19:17, 26 May 2017 (UTC)
I have updated the code and added Erutuon's description of what it does. :) - -sche (discuss) 21:35, 26 May 2017 (UTC)
Oops, the new version fails to even parse. It seems MediaWiki does not support ECMAScript 6. @-sche I have updated the code at Giorgi's. Update the gadget too please. --Dixtosa (talk) 22:18, 26 May 2017 (UTC)
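For readers wondering what the transformation described above amounts to, here is a rough illustration in Python rather than the gadget's own JavaScript. It is not the gadget's code: the assumed comment syntax (a // comment whose text is wrapped in matching runs of equals signs) and the function name are guesses made for the sake of the example.

```python
import html
import re

# Assumed pattern: a line that is only a // comment whose text is
# wrapped in matching runs of 1-6 equals signs, e.g. "// == Title ==".
HEADING = re.compile(r"^\s*//\s*(={1,6})\s*(.+?)\s*\1\s*$")


def comment_to_heading(line: str):
    """Return an HTML heading for a heading-style comment line, else None."""
    match = HEADING.match(line)
    if match is None:
        return None
    level = len(match.group(1))  # number of equals signs sets the level
    return f"<h{level}>{html.escape(match.group(2))}</h{level}>"
```

Lines that are ordinary code or ordinary comments return None and would be left untouched.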

Template de-decl-noun-f

This template seems to generate rubbish plurals. See Ausflucht as an example. Can someone please fix it? SemperBlotto (talk) 04:37, 27 May 2017 (UTC)

I see that entry has been fixed by inputting the plural form as "pl=" rather than as an unnamed parameter. Whether the template should be changed to allow the plural to be put in as the first unnamed parameter, or whether something else more often needs to be put in that slot, I don't know. - -sche (discuss) 20:08, 27 May 2017 (UTC)
I'm confused; what was the incorrect plural that the template was generating? I only saw Ausflüchte, which is correct according to German Wiktionary. Oh, it was *Ausfluchte. — Eru·tuon 00:06, 28 May 2017 (UTC)
It was *AusfluchtAusflüchte. See this diff. Redboywild (talk) 17:58, 28 May 2017 (UTC)

Bot task: CAT:Taos lemmas entries should use modifier apostrophes

Could someone move all the Taos entries (at the moment there are no non-lemma entries, so all entries are in CAT:Taos lemmas) which use ' or ’ to instead use the modifier-letter apostrophe ʼ, pursuant to this RFM? Overwriting redirects is fine. Ideally, the bot would also fix links to the pages it moved, e.g. from translations tables. Links inside Taos entries which use ’ should also be updated to use ʼ. (Any links inside Taos entries that use ' will probably need to be handled in AWB by me or someone else on the lookout for false positives / links that are OK as-is, like a link to brewer's yeast.) - -sche (discuss) 20:06, 27 May 2017 (UTC)
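The title conversion itself is simple; a sketch of how a bot might compute the list of moves is below. It assumes the characters to replace are the ASCII apostrophe (U+0027) and the curly apostrophe (U+2019), with U+02BC as the target; the function names and sample titles are hypothetical, and the actual page moves and link fixes are not shown.

```python
MODIFIER_APOSTROPHE = "\u02bc"  # ʼ MODIFIER LETTER APOSTROPHE

# Assumed source characters: ASCII apostrophe and right single quotation mark.
APOSTROPHE_MAP = str.maketrans({
    "'": MODIFIER_APOSTROPHE,
    "\u2019": MODIFIER_APOSTROPHE,
})


def normalize_title(title: str) -> str:
    """Replace apostrophe-like characters with the modifier-letter apostrophe."""
    return title.translate(APOSTROPHE_MAP)


def planned_moves(titles):
    """Return (old, new) pairs for the titles that would actually change."""
    return [
        (title, normalize_title(title))
        for title in titles
        if normalize_title(title) != title
    ]
```

This is only safe for page titles known to be Taos entries. Applied blindly to links inside entries, it would also rewrite legitimate English apostrophes such as the one in brewer's yeast, which is why those links are set aside for manual review above.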

{{top2}} / {{mid2}} columns no longer line up

See межевать for an example. The left column is a half line lower than the right one, at least on my screen and browser (Mac OS X 10.9.5, Chrome). They used to line up fine, so something has gotten broken in the meantime. Benwing2 (talk) 21:07, 27 May 2017 (UTC)

One of the items is placed below the bottom of the list. —CodeCat 23:09, 27 May 2017 (UTC)
That is not the problem. DTLHS (talk) 23:13, 27 May 2017 (UTC)

problems with categories

I'm having problems with the automatic categorization linking to categories that words don't belong in. For example, I edited the formatting of the etymology of begynne to make it link to the words that it mentions, and now for some reason it's in the category Old English words prefixed with be-, which is definitely wrong because it's not even Old English. Also, yerd, which I'm currently adding an English definition for, is automatically categorized in English twice-borrowed terms even though it's not borrowed at all. The current etymology is

Presumably related to yard, from Old English ġerd (branch, twig, stick) or gierd, from West Germanic *gazdijo

Presumably related to {{m|en|yard}}, from {{inh|en|ang|ġerd||[[branch]], [[twig]], [[stick]]}} or {{m|ang|gierd}}, from {{etyl|gmw|en}} {{m|und||*gazdijo}}

Did I do something wrong with this? 2601:246:C602:67B3:91B6:583:F5B4:8242 05:47, 28 May 2017 (UTC)

The problem related to the Norwegian entries in begynne is that you used {{prefix|ang|be|}}. The template {{prefix}} is for etymologies, and it automatically categorizes words in a category such as Old English words prefixed with be-, using the language code (in this case ang, for Old English). You should be using the linking template {{m}} instead. And probably you should use the etymology given in the Old English entry beginnan, which derives the word from Proto-Germanic. — Eru·tuon 06:04, 28 May 2017 (UTC)
As to your second question, when I enter that code, I don't get an "English twice-borrowed terms" category at all, so I don't know what's going on there. — Eru·tuon 06:22, 28 May 2017 (UTC)
Aha, the Scots section had {{etyl|en|en}}. That's the source of the "double-borrowed" category. — Eru·tuon 16:42, 28 May 2017 (UTC)

Maybe remove a few "redlinks" categories to avoid module errors

Maybe it's a good idea to disable the redlink categories of a few languages, if they are not being used right now, because they use expensive functions and a lot of those languages enabled at the same time can cause module errors.

This can be done by editing Template:redlink category.

Full list:

--Daniel Carrero (talk) 01:44, 29 May 2017 (UTC)

I could be wrong, but I don't think expensive functions contribute much more than other functions to the Lua memory load. — Eru·tuon 02:05, 29 May 2017 (UTC)
Why not include a suitably hard-to-find switch to facilitate turning these redlink categories on and off in response to actual interest from a user or lack of interest from any users? Perhaps a switch for each language could be in a subpage of its "About" page, which page could be protected from editing by anyone but an admin.
How long would it take for such a category to be repopulated to 90% of complete? How long does depopulation take? DCDuring (talk) 02:10, 29 May 2017 (UTC)
The list is at Template:redlink category, which can be edited by any user currently. A few days probably to fill / unfill. DTLHS (talk) 02:13, 29 May 2017 (UTC)
I removed English from the template; there were already a couple dozen entries with "too many expensive function calls" due to it. Links to English entries aren't in the translation tables, but they're everywhere else, with Derived terms alone more than enough to take many entries over the 500-per-page limit. Chuck Entz (talk) 03:09, 29 May 2017 (UTC)
Are categories really the best way to generate the lists? That is, do they have to be in (near-)real time? Dumps are available now every two weeks? Dump processing can give additional information as well. Obviously there is less labor involved, but downloading the entire dump and running a script with regex doesn't have to take very long. The possibility exists as well of generating counts of the redlinks for each missing term. DCDuring (talk) 03:22, 29 May 2017 (UTC)
I agree. We're clearly using too many different Lua functions, because errors have started cropping up more and more frequently and in more and more entries: water, man, I, iron, etc. And this is something we don't need near-real-time categories for. Something that parsed a database dump, and for more languages than just this, would be ideal. Another thing to consider is disabling automatic transliteration in translations tables (of alphabetic scripts, or of all scripts), because that is known (by us and also noticed by e.g. the folks at Phabricator) to eat up a looooot of memory. - -sche (discuss) 03:56, 29 May 2017 (UTC)
The expensive parser function calls error is separate from Lua memory usage. I agree there's no good reason to have these categories though, especially since they don't really seem to be used. DTLHS (talk) 04:03, 29 May 2017 (UTC)
I witnessed @SemperBlotto creating a few Italian entries based on Category:Italian redlinks, so at least that one seems to be used. Probably others are too. --Daniel Carrero (talk) 18:05, 29 May 2017 (UTC)
But we could easily generate redlink lists by bot, so why use expensive parser functions? --WikiTiki89 18:17, 29 May 2017 (UTC)
Yes, please. Easily generating redlink lists by bot is something I would like to see. --Daniel Carrero (talk) 18:25, 29 May 2017 (UTC)
The inclusion of languages is mostly due to people who edit those languages asking for them to be included. The categories' usefulness isn't the issue, it's the inefficient way they're generated: this template gets executed every time a page is loaded that uses the links module- once for every module-generated link. The only thing that keeps it from overloading things on every entry is that it does the expensive stuff for just the selected language codes. Chuck Entz (talk) 04:41, 29 May 2017 (UTC)
As I think I've said in a previous discussion, it would be great if someone created redlinks lists based on dumps. They don't have to be categories. --Daniel Carrero (talk) 18:06, 29 May 2017 (UTC)
  • I'm using the Spanish redlinks cat. It's my favourite page this month. -WF
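
The dump-based approach suggested above (a script with a regex, plus counts of redlinks per missing term) could look roughly like the sketch below. It is a simplified illustration under stated assumptions: the function name is invented, the dump is represented as an in-memory dictionary of page title to wikitext, only the {{m}} and {{l}} link templates are recognized, and a link counts as "red" when no page with that title exists at all (a real tool would also need to check for the right language section).

```python
import re
from collections import Counter

# Matches link templates such as {{m|it|parola}} or {{l|es|palabra}} and
# captures the two positional parameters: language code and term.
LINK_TEMPLATE = re.compile(r"\{\{(?:m|l)\|([^|{}]+)\|([^|{}]+)")


def count_redlinks(pages: dict[str, str], lang_code: str) -> Counter:
    """Count links in one language that point at titles missing from pages.

    pages maps every existing page title to its wikitext.
    """
    existing = set(pages)
    counts: Counter = Counter()
    for text in pages.values():
        for code, term in LINK_TEMPLATE.findall(text):
            if code == lang_code and term not in existing:
                counts[term] += 1
    return counts
```

Because the result is a Counter, calling most_common() on it gives the missing terms ordered by how often they are linked, which is the extra information a category cannot provide.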

More brokenness in {{top3}}, {{mid3}}

See Reconstruction:Proto-Slavic/vьrgnǫti. The three columns should be headed "East Slavic:", "South Slavic:", and "West Slavic:", respectively, but instead I see East Slavic continue onto the second column, and South Slavic placed at the bottom. This happens both in Safari and Chrome on Mac OS X 10.9. Benwing2 (talk) 21:41, 29 May 2017 (UTC)

@CodeCat If this isn't able to be resolved we should revert the changes that make {{top3}} auto balancing. We could make another template {{top3-a}} that has the auto balancing CSS. DTLHS (talk) 21:45, 29 May 2017 (UTC)
I agree. I verified that both this brokenness and the above {{top2}}/{{mid2}} brokenness are due to CodeCat's changes of April 13. Benwing2 (talk) 21:50, 29 May 2017 (UTC)
In the vast majority of cases, columns should be balanced. Cases like the above were really misuses of these column templates. They should be replaced with specialized templates and/or tables. --WikiTiki89 16:33, 30 May 2017 (UTC)
First of all, there are two brokennesses, and only one of them concerns unbalanced columns. The other one mentioned somewhere above concerns extra blank space in an entirely balanced two-column layout. Secondly, I don't agree that it is reasonable to make a change like this that will cause significant brokenness for existing pages and then simply tell those pages that they're "misuses" and need to be fixed. It is the responsibility of the template changer to handle any breakage that ensues; if they're unwilling to do this, the change should be reverted. Benwing2 (talk) 07:36, 31 May 2017 (UTC)

Template:zh-pron: In Firefox, Mandarin audio player covers up Mandarin IPA text

I kept wondering why so few Chinese characters had IPA for their Mandarin, but then I found out they do! It is just that in Firefox, the audio player control is graphically glitched so it covers up the IPA text.

Easy example here (click the Expand link in upper right corner of the box): Template:zh-pron#.E4.B8.AD.E5.9C.8B

The issue

  • appears in Firefox 53.0.3 for Windows and Mac OS (and probably many earlier versions)
  • does NOT appear in Chrome 58.0.3029.110 for Windows or Mac OS
  • does NOT appear in Edge 40.15063.0.0 (edgehtml.dll build 11.0.15063.332)
  • does NOT appear in Safari 10.1.1

Hope you can take a look... -- Gianttrombone (talk) 01:32, 30 May 2017 (UTC)

I have also experienced it for some time. It may relate with CSS. --Octahedron80 (talk) 01:42, 30 May 2017 (UTC)

@Gianttrombone Leave a complaint here. God, I hate the MW media player. suzukaze (tc) 01:51, 30 May 2017 (UTC)
(edit conflict) I have the same version of Firefox, and I see the same thing. Another difference to note is that all of the indented lines below the language names have bullet points on Firefox, but not on Safari, and when the audio player is followed by one of those indented lines instead of a language name, there are two bullet points. Also, the bullet point for the line the audio player should be on is correctly placed, but the audio player and everything after it is shifted up a line. Chuck Entz (talk) 02:11, 30 May 2017 (UTC)
Not sure what I was looking at before, but when I looked at it again, both browsers showed the bullet points. Chuck Entz (talk) 02:52, 30 May 2017 (UTC)

Template:zh-pron: obsolete HTML tag

thwikt has the Linter extension installed, and it reports an obsolete HTML tag (tt) inside zh-pron [9], which is adapted from enwikt. This produces about 40,000 soft errors. Please replace tt with another tag. --Octahedron80 (talk) 02:00, 30 May 2017 (UTC)

Apparently, it's triggered by the (tt)s in Template:zh-pron/documentation...? - -sche (discuss) 08:32, 2 June 2017 (UTC)
I think not. thwikt does not copy the documentation page. But I found tt's in the linked Module:cmn-pron instead. --Octahedron80 (talk) 10:43, 2 June 2017 (UTC)

Template:character info: missing end tag

thwikt has the Linter extension installed, and it reports a missing end tag inside character info [10], which is adapted from enwikt. This produces about 80,000 soft errors. Please fix this. --Octahedron80 (talk) 02:03, 30 May 2017 (UTC)

Two broken gadgets which break expanding some items

Going to https://en.wiktionary.org/wiki/pot%C5%99ebovat?debug=true#1%20Czech I could not expand the "conjugation table". I had all gadgets enabled and the web browser's developer tools show two errors:

Would someone repair them? phab:T164242 explains how. I checked that expanding works when disabling these gadgets. :) --Malyacko (talk) 16:25, 30 May 2017 (UTC)

Labels at Middle Dutch stede

At the moment, there's two labels, Flemish and Hollandic. The former has no link and doesn't even add the entry to a category, while the latter does. The "Hollandic" label links to the Wikipedia article about Holland, but this is the modern area of Holland. Since this is a medieval language, a link to w:County of Holland would be more appropriate, and also w:County of Flanders. Can this be done? —CodeCat 20:14, 30 May 2017 (UTC)

@CodeCat: Are the labels Flemish and Hollandic used for any languages besides Middle Dutch? If they are, it is currently impossible to make them have different content depending on the language, so all languages would have to link to these Wikipedia articles, even if they are spoken at a time when these Counties didn't exist. — Eru·tuon 21:05, 30 May 2017 (UTC)
Yes, they're used for Dutch as well. I thought we now had separate labels per language? —CodeCat 21:07, 30 May 2017 (UTC)
No, all we have is the ability to restrict labels to a specific language. So, if you use {{lb|en|Doric}} rather than {{lb|grc|Doric}}, it won't use the label data file, and it'll add a tracking template. But at the moment there is no way to have a label for Doric Scots and a label for Doric Greek that both have the same key ("Doric"). — Eru·tuon 21:10, 30 May 2017 (UTC)
Yeah, I asked about this years ago as I wanted {{lb|en|Ulster}} for CAT:Ulster English, {{lb|sco|Ulster}} for CAT:Ulster Scots, and {{lb|ga|Ulster}} for CAT:Ulster Irish, but that apparently isn't possible. —Aɴɢʀ (talk) 21:38, 30 May 2017 (UTC)
It shouldn't be hard to make it possible, though. All you need is to define the labels per language, like
["Flemish"] = {
        ["dum"] = {......}
        ["nl"] = {......}
}
CodeCat 22:00, 30 May 2017 (UTC)
@Erutuon Can you make this work? —CodeCat 10:39, 2 June 2017 (UTC)
@CodeCat: I don't know. It doesn't seem a fully developed idea yet. It can be attempted at Module:User:Erutuon/labels. — Eru·tuon 19:03, 2 June 2017 (UTC)
Think of these subtables as fully-fledged labels in their own right. So there's an entirely separate label definition for each language. The module would first see if a table index exists with the code of the current language, and if so, it uses the data there. Otherwise, it uses the data directly under the label as before. —CodeCat 19:08, 2 June 2017 (UTC)
I understand the idea, but so far it's not working in the sandbox module. And I have other concerns. — Eru·tuon 19:20, 2 June 2017 (UTC)

@CodeCat: I would rather not replace the existing language-specific labeling system with the new system, as you suggested in this edit summary. The new system offers no way to track labels that are being used in the wrong language, which may be useful in some cases. (For instance, perhaps someone would want to go through and correct labels that are used with the Cantonese language code when they are supposed to be used with generic Chinese, or clarify whether a label is actually being used the way its label data assumes it would be.) And it doesn't offer a way to use the same label data for more than one language. I'm not pleased with there being two ways of handling language-specific labels, though. — Eru·tuon 19:44, 2 June 2017 (UTC)

A possible solution would be to introduce something like invalid = true on the top-level label. If that value is true, then that means the label isn't valid. Since the language-specific data doesn't have that tag, it works as normal. This lets you do the inverse too: make the label valid for all except a selected language. As for using the same label data for more than one language, that's true, but how big of an issue is that? —CodeCat 19:47, 2 June 2017 (UTC)

@Erutuon Any progress? —CodeCat 12:17, 4 June 2017 (UTC)
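
The lookup order proposed in this thread (language-specific subtable first, then the shared data, with an optional invalid flag on the top-level label) can be sketched as below. This is a Python illustration of the logic only, not the Lua in Module:labels or in the sandbox module; the data, key names and function name are assumptions made for the example.

```python
# Hypothetical label data in the proposed shape: a top-level entry that may
# contain per-language subtables keyed by language code.
LABELS = {
    "Flemish": {
        "display": "Flemish",
        "dum": {"display": "Flemish", "wikipedia": "County of Flanders"},
    },
    "Doric": {
        "invalid": True,  # not valid unless a language-specific entry exists
        "grc": {"display": "Doric"},
    },
}


def resolve_label(label: str, lang_code: str):
    """Return the label data to use for this language, or None if unusable."""
    entry = LABELS.get(label)
    if entry is None:
        return None
    specific = entry.get(lang_code)
    if isinstance(specific, dict):
        # A subtable for this language is a full label definition of its own.
        return specific
    if entry.get("invalid"):
        # No subtable, and the shared label is flagged as invalid.
        return None
    # Fall back to the shared data, leaving out the per-language subtables.
    return {key: value for key, value in entry.items()
            if not isinstance(value, dict)}
```

With this data, resolve_label("Flemish", "dum") returns the Middle Dutch definition, resolve_label("Flemish", "nl") falls back to the shared one, and resolve_label("Doric", "en") returns None, which is the point at which a caller could add a tracking category. The sketch does not address the remaining concern about sharing one definition between several languages.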

Update HotCat or create a second version of it

As more and more entries use templates to add categories (and especially if Wiktionary:Votes/2017-05/Templatizing topical categories in the mainspace were to pass), those entries can no longer be quickly edited with HotCat. Anyone fancy a go at updating the gadget to handle templatized categories? But, caution: there is some preparatory discussion underway leading to a proposal that might replace categories named like "en:Foo" with "English foo", which might change how/whether templates are used to add categories. - -sche (discuss) 23:53, 30 May 2017 (UTC)

I've long wanted HotCat to be able to recognize and use the categorizing templates {{cln}} and {{C}}, but the script (c:MediaWiki:Gadget-HotCat.js) is above my ability to update. — Eru·tuon 20:54, 2 June 2017 (UTC)
  • @Dixtosa, Giorgi Eufshi, please help. —Μετάknowledgediscuss/deeds 00:17, 4 June 2017 (UTC)
    I will wait for the vote to end first. --Dixtosa (talk) 18:05, 4 June 2017 (UTC)
    While we're at it, can a new version of HotCat be written to put the category in the correct language section? As of now, it puts the category at the bottom of the page, regardless of the language. So for example, if I wanted to use HotCat to add a category to cha#English, it would actually be added under cha#Zulu, which isn't good, especially in tabbed languages view. —Aɴɢʀ (talk) 19:39, 4 June 2017 (UTC)
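
The placement problem in the last comment (a category meant for cha#English ending up under cha#Zulu) comes down to finding the end of the right level-2 section. Below is a rough Python sketch of that step, not HotCat's actual JavaScript. It assumes language sections start with a ==Language== heading and may be separated by a ---- line; the function name is invented for the example.

```python
import re

# Level-2 headings such as ==English==; deeper headings (===Noun===) are
# excluded because the character after the leading == must not be "=".
LEVEL2 = re.compile(r"^==([^=].*?)==[ \t]*$", re.MULTILINE)


def add_category(wikitext: str, language: str, category: str) -> str:
    """Append a category link at the end of the given language section."""
    headings = list(LEVEL2.finditer(wikitext))
    for i, heading in enumerate(headings):
        if heading.group(1).strip() != language:
            continue
        # The section runs until the next level-2 heading or the end of page.
        end = headings[i + 1].start() if i + 1 < len(headings) else len(wikitext)
        section = wikitext[heading.start():end].rstrip()
        separator = ""
        if section.endswith("----"):
            # Keep the horizontal rule after the newly added category.
            section = section[:-4].rstrip()
            separator = "\n\n----"
        rest = wikitext[end:]
        gap = "\n\n" if rest else "\n"
        return (wikitext[:heading.start()] + section
                + f"\n[[Category:{category}]]" + separator + gap + rest)
    raise ValueError(f"no {language} section found")
```

Handling templatized categories such as {{cln}} and {{C}} would be a separate step on top of this.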

{{blend}} should add categories for prefixes and suffixes, like {{affix}} does

I recently used {{blend}} to add an etymology to видеоте́ка (videotéka, video library), which is composed of видео- (video-, video-) + библиоте́ка (bibliotéka, library). This should add the word to CAT:Russian words prefixed with видео-, but it doesn't. Benwing2 (talk) 07:32, 31 May 2017 (UTC)

Wouldn't it make more sense to say it was a blend with видео (video)? DTLHS (talk) 17:37, 31 May 2017 (UTC)
I always wonder if "blend" is the appropriate terminology for words where morphemes are transparent; for me it's not exactly the same situation as in blurse, for example. Also, I don't understand the point about "blend versus portmanteau" on the documentation page. "blend is the correct linguistic term for a word made by merging two words" looks like the definition of a compound. --Oxytonesis (talk) 17:56, 31 May 2017 (UTC)
@Erutuon: since you're keen on improving documentation pages, do you have an opinion on {{blend}}? --Barytonesis (talk) 20:55, 22 June 2017 (UTC)
@Barytonesis: Well, if it's going to add suffix and prefix categories, it probably has to be Luaified. — Eru·tuon 21:00, 22 June 2017 (UTC)
@Erutuon: I meant, what's your opinion on that usage note about the difference between "blends" and "portmanteaux"? --Barytonesis (talk) 21:04, 22 June 2017 (UTC)
@Barytonesis: Hm, I'm not really familiar with the difference, so I have no opinion. — Eru·tuon 21:17, 22 June 2017 (UTC)
The problem with adding that capability is that the inputs to the blends don't necessarily end up as discrete morphemes with one attached to the other. A blend strikes me as more like a compound, because it can combine two or more independent terms without making one subordinate to the other. For instance, cockapoo is a blend of cocker spaniel and poodle. Is cocker spaniel a prefix? Is poodle a suffix? When {{confix}} was first introduced, we had a great deal of trouble with people using it improperly for compounds, with a flood of bogus affix categories as the result. I think adding the capabilities you're asking for would encourage similar abuses, though not on the same scale. Chuck Entz (talk) 02:31, 23 June 2017 (UTC)
I agree with Chuck's assessment: if a word is formed by combining an affix with another word, {{blend}} is not the appropriate template. If the word is formed by blending two words then those words should not be categorized as affixes. - [The]DaveRoss 12:56, 23 June 2017 (UTC)
Then what is the appropriate classification for prequel? It looks like a blend consisting of a prefix and a full word. Either that or a portmanteau. I don't know what the difference is. The first syllable of prequel is replaced with the prefix pre-. Does the prefix pre- have to be considered a full word because it is combined with another word in a blend? — Eru·tuon 16:39, 23 June 2017 (UTC)
prequel is not prefixed with pre-, so it doesn't belong in the category. This is about categorization. {{blend}} is for terms like smog which are not roots with affixes, but rather are words formed by the blending of two other words. I don't think prequel falls into that category, however there may be blends which have a constituent word which is an affix, in those cases I think the category should be added separately as a special case. The worst case scenario is one in which smoke or fog are added to an affix category because they are the components blended in smog. - [The]DaveRoss 16:51, 23 June 2017 (UTC)
Well, how can prequel not be a blend or a portmanteau? The first syllable (or the first consonant) of its second component is replaced with that of its first component. I thought the definition of a blend or portmanteau was that one component replaces part of the other component. If it were presequel, it would not be a blend. But I'm not sure what I think about the "words formed with prefix" categorizing issue. — Eru·tuon 16:59, 23 June 2017 (UTC)
If you want to assert that prequel is a blend that is fine, it doesn't change my point. I consider it to be one of those words which is patterned after another word, but that is subjective. - [The]DaveRoss 17:19, 23 June 2017 (UTC)
What's the difference? --WikiTiki89 17:27, 23 June 2017 (UTC)
(edit conflict) Okay, so prequel may be a blend formed with the prefix pre-, but it should not be considered as prefixed with pre-. Interesting. It seems a distinction that is likely to confuse readers, but okay. I mean, I would rather simply stick all words that are formed with the prefix pre- in some way or another in the category English words prefixed with pre-, and avoid making finer distinctions about what "prefixed" means. For what it's worth, the OED gives the derivation as pre- + -quel. Probably the suffix -quel originated with prequel. — Eru·tuon 17:30, 23 June 2017 (UTC)
Prefix is a giant red herring. How about every other blend which isn't formed with an affix? - [The]DaveRoss 17:53, 23 June 2017 (UTC)
Huh? I'm not proposing that non-affixes be converted to affixes in {{blend}}. {{blend|smoke|fog}} would not result in Blend of smoke- + fog. You'd have to enter it as {{blend|smoke-|fog}}, with a hyphen, to get that result. — Eru·tuon 18:10, 23 June 2017 (UTC)
Bah. I totally missed the part that affix checks for the hyphen, my bad. - [The]DaveRoss 13:06, 26 June 2017 (UTC)
According to Blend_word#Formation, prequel and smog are both blends ("prequel" would be an instance of case 4., "smog" of case 1.), but smog would furthermore be a portmanteau, that is, a special subset of blends. --Barytonesis (talk) 17:37, 23 June 2017 (UTC)
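
The hyphen check mentioned above, where only components entered with a hyphen are treated as affixes, can be sketched as follows. This is a Python illustration, not the code of {{affix}} or of any module; the function name is invented, and the "suffixed with" category wording is assumed by analogy with the "prefixed with" categories named in this thread.

```python
def affix_categories(lang_name: str, parts: list[str]) -> list[str]:
    """Return affix categories for the components that are marked as affixes.

    A trailing hyphen marks a prefix and a leading hyphen marks a suffix;
    plain words such as "smoke" or "fog" produce no category at all.
    """
    categories = []
    for part in parts:
        if part.endswith("-") and not part.startswith("-"):
            categories.append(f"{lang_name} words prefixed with {part}")
        elif part.startswith("-") and not part.endswith("-"):
            categories.append(f"{lang_name} words suffixed with {part}")
    return categories
```

Under this rule {{blend|smoke|fog}} would add nothing, while a blend entered with видео- as its first component would be placed in CAT:Russian words prefixed with видео-.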

Revamping {{named-after}}

Hey, I'd like to expand and simplify the template {{named-after}}:

  • {{nam|en|fr|Champagne|w:Champagne (wine region)|loc=wine region|nat=France}}
    • named after wine region of Champagne in France
    • Category:
      • Category:English terms named after proper nouns
      • Category:English terms named after French proper nouns
      • Category:English terms named after place names
      • Category:English terms named after wine regions
      • Category:English terms named after place names in France
      • Category:English terms named after wine regions in France
      • Category:English terms named after Champagne
  • {{nam|en|it|Christopher Columbus|occ1=explorer|occ2=navigator}}
    • named after Italian explorer, navigator Christopher Columbus
    • Category:
      • Category:English terms named after proper nouns
      • Category:English terms named after Italian proper nouns
      • Category:English terms named after people
      • Category:English terms named after explorers
      • Category:English terms named after navigators
      • Category:English terms named after Italian people
      • Category:English terms named after Italian explorers
      • Category:English terms named after Italian navigators
      • Category:English terms named after Christopher Columbus
  • {{nam|en|en|Amelia Bloomer|occ=feminist|nat=American}}
    • named after American feminist Amelia Bloomer
    • Category:
      • Category:English terms named after proper nouns
      • Category:English terms named after English proper nouns
      • Category:English terms named after people
      • Category:English terms named after feminists
      • Category:English terms named after American people
      • Category:English terms named after American feminists
      • Category:English terms named after Amelia Bloomer

Any thoughts from people? --Victar (talk) 06:43, 2 June 2017 (UTC)

My first thought is that the combination categories ("Italian explorers", etc) might be too "granular", to use the going word. And since I gather they could fairly easily be added in later and would automatically become populated, one could wait and only add them if the "Italian people", "American people" etc categories become too large. But my second thought is of how many Italian people, American people, etc things are named after; those categories will surely become very large, so we could go ahead and create at least some "granular" categories from the start.
However, we probably want to avoid an over-abundance of possible occupations, and especially synonyms that would result in entries being split among two categories and thus made harder to find (one entry might say "alderman" where another says "city councillor", all where "politician" is probably sufficient), so maybe the template should rely on a list (the way {{label}} does) and silently put the entry into an "attention" category if an un-listed occupation (location, etc) is given. - -sche (discuss) 08:26, 2 June 2017 (UTC)
Using the lists defined in {{label}} is a great idea! I suppose one could do this, if they really want to: {{nam|en|it|Mario Rossi|occ=[[politician|city councillor]]}}. --Victar (talk) 15:26, 2 June 2017 (UTC)
If we're relying on {{lb}} lists, perhaps we can format it like this: {{nam|en|American|Amelia Bloomer||feminist|activist|politician}} --Victar (talk) 15:40, 2 June 2017 (UTC)
To be clear, I'm not suggesting that Module:labels/data itself be used for this (it doesn't contain labels for "explorers" or "Italian people", does it? and it shouldn't), but just that a data module like that could be used. - -sche (discuss) 04:21, 3 June 2017 (UTC)
Yeah, I was thinking more about the regional labels, which do already exist, and can be expanded upon for the sake of both templates. --Victar (talk) 06:50, 3 June 2017 (UTC)
It might also be useful to have a notext= parameter, that would result in no visible text (at all!) being displayed, but categories being added. Then the template could be used even in entries where its particular wording might not be desirable, e.g. where {{blend}} or {{compound}} is already used. - -sche (discuss) 08:30, 2 June 2017 (UTC)
For sure, |notext=, |nocat=, |cap=/|nocap=. --Victar (talk) 15:26, 2 June 2017 (UTC)
@Ungoliant MMDCCLXIV --Victar (talk) 16:48, 4 June 2017 (UTC)
I support some granularisation of eponym categories (without removing pages from the main category; we could do something similar to what has been done with Category:English terms derived from Italian-type categories in the last few years). Category:English eponyms is already too large for comfort. This is what I had in mind when I devised {{named-after}} with parameters for occupation and nationality several years ago, even if the way I implemented it wasn’t well thought out.
Concerning your proposed categories above, I think that “derived from [French] proper nouns” is better than “named after [French] proper nouns”; and “terms named after place names” feels redundant to “terms named after places”. — Ungoliant (falai) 21:37, 4 June 2017 (UTC)
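
To make the list-based idea concrete, here is a rough Python sketch of how a template backed by a data list might derive the person-related categories and flag unlisted occupations. Everything in it is an assumption for illustration: the function name, the occupation list, and the name of the attention category. The proper-noun categories from the proposal are left out to keep the example short.

```python
# Hypothetical list of accepted occupations, mapped to their plural forms.
OCCUPATIONS = {
    "explorer": "explorers",
    "navigator": "navigators",
    "feminist": "feminists",
    "politician": "politicians",
}

# Hypothetical maintenance category for occupations missing from the list.
ATTENTION_CATEGORY = "named-after with unrecognized occupation"


def named_after_categories(lang: str, name: str, nationality: str,
                           occupations: list[str]) -> list[str]:
    """Build the person-related categories for one named-after call."""
    prefix = f"{lang} terms named after"
    categories = [f"{prefix} people", f"{prefix} {nationality} people"]
    for occupation in occupations:
        plural = OCCUPATIONS.get(occupation)
        if plural is None:
            # Unlisted occupation: flag the entry instead of inventing a category.
            categories.append(ATTENTION_CATEGORY)
            continue
        categories.append(f"{prefix} {plural}")
        categories.append(f"{prefix} {nationality} {plural}")
    categories.append(f"{prefix} {name}")
    return categories
```

For Amelia Bloomer this yields the people, American people, feminists, American feminists and Amelia Bloomer categories from the proposal, while "city councillor" would only produce the attention category until someone maps it to "politician".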

June 2017

Template:IPA letters - rhoticity

Template:IPA letters produces ɑː for the letter R. This should, I suppose, be ɑː(ɹ) or suchlike. Equinox 01:01, 2 June 2017 (UTC)

Only if Wiktionary adopts Wikipedia's trans-dialectal (diaphonemic) transcription system. Up till now, we haven't (though I suppose rhymes pages use some sort of diaphonemic system, or choose a particular dialect). The transcription of the above-mentioned vowel would be /ɑː(ɹ)/ for RP, /ɑɹ/ for General American, and /ɐː(ɹ)/ in Australia and New Zealand. — Eru·tuon 16:54, 2 June 2017 (UTC)
I've updated it. The other pronunciations, like /əʊ/, are also British, so this was obviously intended to be ɑː(ɹ). Ideally it would be updated to give multiple dialects' pronunciations, possibly through the use of separate Template:IPA letters/en-US and Template:IPA letters/en-UK subpages and through accepting en-US and en-UK as codes. - -sche (discuss) 18:21, 3 June 2017 (UTC)

Terms derived from Chinese

Why are the various "terms derived from Chinese" categories (e.g. CAT:Irish terms derived from Chinese) not subcategories of the corresponding "terms derived from Sinitic languages" categories (e.g. CAT:Irish terms derived from Sinitic languages)? Can and should it be fixed? —Aɴɢʀ (talk) 12:24, 2 June 2017 (UTC)

Chinese is a synonym of the Sinitic languages. —CodeCat 12:43, 2 June 2017 (UTC)
Then why do we have both? —Aɴɢʀ (talk) 13:26, 2 June 2017 (UTC)
The Chinese category is used when someone hasn't specified which Chinese language something is derived from, while the Sinitic languages category is an umbrella category in which the categories for particular Chinese languages are placed. It is illogical to have both, though, because they're synonymous... — Eru·tuon 16:51, 2 June 2017 (UTC)
Then I'd say we should merge them to "Chinese". —Aɴɢʀ (talk) 11:00, 3 June 2017 (UTC)

Oldest tagged RFVs

The lists of "Old tagged RFVs" atop the RFV pages are displaying "No pages meet these criteria." They should be fixed, since they're the main way we come to notice tagged-but-not-listed RFVs. As a separate matter, maybe a cleanup bot could periodically list unlisted RFVs... it wouldn't have to run often / represent a major time investment; even once every few months would work. - -sche (discuss) 18:11, 3 June 2017 (UTC)

It is now working for English. The problem is in the split of the RfV page and associated elimination of a single category for all RfVs. This approach never worked for RfV items that didn't have an RfV template.
The non-English RfV category has only subcategories, for each one of which we could have an oldest RfV list. There are solutions. One class of solutions would require a new category, eg, for all non-English RfVs, or new categories, eg, for all languages in a given family or in a given script or differentiated by nature of attestation. Another might be some dump-based solution. It is beyond my paygrade to even participate seriously any further in the discussion. DCDuring (talk) 19:04, 3 June 2017 (UTC)
  • @Daniel Carrero, was this you? —Μετάknowledgediscuss/deeds 14:04, 5 June 2017 (UTC)
    Thanks, DCDuring, for identifying the problem and solving it on the English page. There are two simple ways the remaining problem could be addressed. All RFVs could double-categorize into both the language-specific category and a general category of all RFVs (like the one they previously went into). This would have benefits besides just finding Oldest tagged RFVs (the same benefits as the per-language categories: it'd be there if someone wondered what terms needed verification, if they wanted to help cite them; for many languages and RFVs). The other approach, which is not mutually exclusive, and which is more useful for this specific issue, is to have non-English RFVs double-categorize into a "RFVs (non-English)" category, to feed the Oldest list at the non-English page. If it would be too difficult to implement the idea of categorizing all non-English RFVs, or conceptually untidy/undesirable, I could just add an "all RFVs" category. - -sche (discuss) 15:38, 5 June 2017 (UTC)
    I fixed Wiktionary:Requests for verification/Non-English. See if you like this approach or if you would change something. --Daniel Carrero (talk) 15:45, 5 June 2017 (UTC)
    Your solution seems to require constant maintenance, obliging people to notice and update that template whenever a term in a new language is RFVed (or do I misunderstand?). But one point of the Oldest Tagged list is to catch cases that people don't notice have been RFVed. So, I suggest we use one of the two double-categorization approaches I outline above. (Potentially perhaps your template could be updated to fetch the contents of Module:languages' data modules and check every language's RFV category, so as to catch whenever a code is added, whenever a term in a new language is RFVed, etc, but that seems like a very unnecessary and probably expensive use of Lua for something {{rfv}} could do by just automatically inserting a certain category.) - -sche (discuss) 19:38, 5 June 2017 (UTC)
    I see. What about using "CategoryTree"? I added an example here in the discussion now. This way the list would still be organized by language, and it would be updated in real time as new language categories are created. It would be nice if we could use JavaScript to replace "Requests for verification in Volapük entries" by just "Volapük", and do the same for all languages. --Daniel Carrero (talk) 20:31, 5 June 2017 (UTC)
    @Daniel Carrero: I've created a JavaScript function that does that; see User:Erutuon/scripts/sandbox.js. It does make the list much easier to read. — Eru·tuon 20:52, 5 June 2017 (UTC)
    @Erutuon: Thanks! I added the categorytree in WT:RFD, WT:RFC and WT:RFV. Do you think you could update the JS code to do the same for the other two? Note that RFD and RFC also have a "language code missing" category at the start, but RFV doesn't (because you can use RFD and RFC in entries without a language code, but you can't use RFV without a language code without triggering a module error).
    Your code worked when I added it to User:Daniel Carrero/common.js, but I basically don't understand anything of JS and I don't know if we need to edit somehow MediaWiki:Common.js to implement it properly for everyone to see. --Daniel Carrero (talk) 21:22, 5 June 2017 (UTC)
    @Daniel Carrero: Yep, I've modified it to do the same for RFD and RFC categories. The "language code missing" category could perhaps stand to be shortened, but I'm not sure what to shorten it to. — Eru·tuon 21:30, 5 June 2017 (UTC)
    Oh, the code is working now. Maybe we could replace "Requests for (deletion, cleanup) with the language code missing" by just "language code missing". I also implemented a page count in all the categories. Do you think you could use JS to remove the "0 c," from all the categories? A category like Category:Requests for deletion in Lithuanian entries obviously doesn't have any subcategories. --Daniel Carrero (talk) 21:42, 5 June 2017 (UTC)
    The sandbox script now removes the useless "0 c, ". It will probably do that to any category tree generated in the same way as the one above, though it doesn't do anything on category pages like Category:English nouns; those must use different classes. — Eru·tuon 22:53, 5 June 2017 (UTC)
    Thanks, it's working perfectly. --Daniel Carrero (talk) 23:07, 5 June 2017 (UTC)
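The label clean-up discussed above can be sketched roughly as follows. This is only an illustration of the idea, not the code in User:Erutuon/scripts/sandbox.js: the function names are made up, the real script works on the category tree's DOM rather than on plain strings, and the "(0 c, 12 e)" count format is an assumption based on the "0 c, " text quoted above.

```javascript
// Hypothetical helpers, named here for illustration only.

// "Requests for verification in Volapük entries" -> "Volapük"
// "Requests for deletion with the language code missing" -> "language code missing"
function shortenRequestLabel(label) {
  const perLanguage = /^Requests for (?:verification|deletion|cleanup) in (.+) entries$/;
  const missingCode = /^Requests for (?:deletion|cleanup) with the language code missing$/;
  // Drop stray left-to-right marks that sometimes trail category names.
  const trimmed = label.replace(/\u200E/g, "").trim();
  const match = trimmed.match(perLanguage);
  if (match) return match[1];
  if (missingCode.test(trimmed)) return "language code missing";
  return trimmed; // anything else is left alone
}

// "(0 c, 12 e)" -> "(12 e)"; non-zero subcategory counts are kept as they are.
function dropEmptySubcategoryCount(countText) {
  return countText.replace(/\b0 c, /, "");
}
```

Labels that do not follow the request-category naming scheme pass through unchanged, so the same helpers could be run over any list without damaging unrelated category names.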
  • To retain some focus on problematic older entries, perhaps each language with more than, say, 20 entries in one of the categories, could use the "oldest" listing, using the approach now used in RfV/English. DCDuring (talk) 23:24, 5 June 2017 (UTC)
    I believe it's not possible to use "categorytree" to focus on the older entries anymore, because of the change in categories. Technically, all the "older" entries in Category:Requests for verification in English entries are the ones that were added when I created the new category on May 18, 2017. --Daniel Carrero (talk) 03:58, 6 June 2017 (UTC)
    Well, right now all the entries in each category were added at about the same time (when you created the categories, as you said), but several months from now, when old RFVs have been dealt with and new ones have come up, it will once again be possible to sort (a category) by age, won't it? - -sche (discuss) 15:59, 7 June 2017 (UTC)
    That is correct. --Daniel Carrero (talk) 16:02, 7 June 2017 (UTC)
I consider it suboptimal that I have to click to expand each language's category to see its entries; however, assuming that the sum of all those language categories contains all of the entries which would be contained in a (list of the oldest RFVs in a) single category of all non-English RFVs similar to the one we previously had (which was for all RFVs, before WT:RFV was split by language), this is adequate; thank you for your help. I do still think we could benefit from restoring a general category for all RFVs to double-categorize into, so that the oldest ones (independent of language) can be found. (Likewise for RFCs and RFDs.) - -sche (discuss) 15:59, 7 June 2017 (UTC)
I'm pretty sure we could leave the list completely un-collapsed by default, but assuming we want that, I'm not sure how to do it.
Here's an idea for a different categorization approach, though it would require constant maintenance. Wikipedia has cleanup categories organized by date, such as w:Category:Articles with unsourced statements from February 2017. I guess we could have categories for each year (the month is not necessary because we don't have that many requests to deal with, I guess). If we created categories like Category:Requests for verification from 2017 (and even Category:Requests for etymology from 2017, Category:Requests for pronunciation from 2017, you get the point), maybe some bot could constantly add the correct year in entries and automatically create new categories each year? I could help by making {{auto cat}} usable in these categories. I could also fill the categories by checking the current entries with RFVs and perhaps other types of request. --Daniel Carrero (talk) 16:16, 7 June 2017 (UTC)

Bug in inserting IPA characters

If you edit a page, and then delete a character and then immediately click on one of the IPA symbols under the "IPA and enPR" section at the bottom of the page (at least, this happens with ɛ), it inserts the character one character to the right of where it should be. This happens to me consistently on Chrome under Mac OS X 10.9. It doesn't happen if the last thing you did before clicking on the IPA symbol is to insert rather than delete a character. Do people see this on other systems? Someone should file a Phabricator bug if there isn't one already (I'm still not quite sure how to do that). Thanks! Benwing2 (talk) 19:19, 3 June 2017 (UTC)

Flags for Hijazi Arabic and Najdi Arabic

See MediaWiki_talk:Gadget-WiktCountryFlags.css#Hijazi_and_Najdi for more details. --Lo Ximiendo (talk) 02:31, 4 June 2017 (UTC)

Citations at citations

Why do we bother putting {{citations}} at the top of every Citations page? I reckon it should be automatically included in the software. --Celui qui crée ébauches de football anglais (talk) 11:17, 4 June 2017 (UTC)

Well, we still have to specify the language the citations are in, and we need a template to handle that. —Μετάknowledgediscuss/deeds 14:03, 5 June 2017 (UTC)
True. But on a related note which I've mentioned before, why do we put {{reconstruction}} atop most reconstruction pages? Could that be done by a Mediawiki page or JS? Or a bot... - -sche (discuss) 15:44, 5 June 2017 (UTC)
I think we can do that if we install PageNotice extension. - [The]DaveRoss 18:31, 5 June 2017 (UTC)
Well, that sounds like a good idea, then. Before we set up a vote on installing that extension, does anyone have any comments / see any obvious problems? - -sche (discuss) 19:42, 5 June 2017 (UTC)

rel-top columns messed up the 2015 NORM vote

For the record, the recently-added automatic columnization (?) of {{rel-top}} messed with this 2015 vote: Wiktionary:Votes/pl-2015-11/NORM: 10 proposals.

All the collapsed parts of the vote are formatted as two columns now. They were normal text without columns before. --Daniel Carrero (talk) 12:21, 5 June 2017 (UTC)

Deletion of {{script}}

Why was {{script}} deleted? I think it would be really useful to be able to type {{sc|Mani}} and get Manichaean. Yes, I'm aware that {{sc}} is currently for {{smallcaps}}. @CodeCat, Daniel Carrero? --Victar (talk) 03:12, 6 June 2017 (UTC)

Why would that be useful? It isn't a significant change in character count, it isn't likely to change frequently over time, why not just type the name of the script? - [The]DaveRoss 18:43, 6 June 2017 (UTC)

Memory Weirdness at

How is it that this diff pushes the memory usage from 44.6 MB to over the 50 MB limit? If I edit and preview the section containing the template, I see "Lua memory usage 7.95 MB/50 MB" in the parser profiling data table. If I then remove that template and preview again, I get "Lua memory usage 7.01 MB/50 MB".

0.94 MB ≠ 5.4 MB+ Chuck Entz (talk) 03:48, 6 June 2017 (UTC)

Weird. I even added as an exception in Template:redlink category and then did a "hard purge" on the diff, but this did not stop the memory errors. --Daniel Carrero (talk) 03:54, 6 June 2017 (UTC)
Well, that template uses Module:zh-cat, which uses Module:zh; perhaps the problem lies there. Perhaps it would help if Module:zh were split into smaller modules. — Eru·tuon 15:47, 6 June 2017 (UTC)
Never mind; Module:zh would already be invoked in any Chinese entry. — Eru·tuon 15:49, 6 June 2017 (UTC)
I went through many of the Chinese modules and removed global variables. I hoped that it would help with the memory error, but apparently not. — Eru·tuon 18:26, 6 June 2017 (UTC)

Header vs headword-line POS mismatch

I wrote some probably inefficient, possibly horrifyingly inelegant, case sensitive regex which I used to search the most recent database dump and find 1666 entries where the part of speech header gave one part of speech but the headword-line template gave another. I cleaned up many pages which matched this same regex in 2014, so it appears we're seeing an edit or two a day (on average) introducing a header-headword POS mismatch. Would it be worthwhile to have an edit filter check for this in real-time? Would it be overly expensive? My regex can probably be improved upon; perhaps it can be made case-insensitive by adding = at the start of each header and by ignoring entries with nonstandard spaced headers (=== Noun ===), although the current approach also catches cases of "Proper Noun". It does not find all cases of mismatch, just common ones. (A related but separate approach would be to monitor for edits that introduce a header without also introducing a corresponding headword-line template.) If the filter works, it could even be updated to warn users against the edits. - -sche (discuss) 16:55, 7 June 2017 (UTC)

As to the regex, the header part could easily be simplified: Noun(===| ===|====| ====) → Noun ?====?. So could the language code part: (..|...) → [^\|]+. (That would also allow it to catch the exceptional codes from Module:languages/datax.) — Eru·tuon 17:14, 7 June 2017 (UTC)
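A quick way to check that the shorter header pattern accepts exactly what the longer one does is to run both over the possible header endings. The sketch below is only an illustration in JavaScript: the sample strings and the `|noun` tail are made up, and the anchors (`^`) are added here for testing rather than taken from the regex used on the dump.

```javascript
// Long and short forms of the header fragment from the comment above.
const headerLong = /Noun(===| ===|====| ====)/;
const headerShort = /Noun ?====?/;

// Long and short forms of the language-code fragment, anchored for testing.
const codeLong = /^(..|...)\|/;
const codeShort = /^[^\|]+\|/;

// True when both patterns give the same verdict on every input.
function sameVerdict(a, b, inputs) {
  return inputs.every((s) => a.test(s) === b.test(s));
}

const headerSamples = [
  "===Noun===",
  "=== Noun ===",
  "====Noun====",
  "==== Noun ====",
  "===Noun==", // malformed: too few closing equals signs
];

console.log(sameVerdict(headerLong, headerShort, headerSamples));
console.log(codeLong.test("ine-pro|noun"), codeShort.test("ine-pro|noun"));
```

The two header patterns agree on all five samples, while the code patterns differ by design: only the shorter one accepts codes longer than three characters.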

Wikidata not working?

To try out, I queried a single Wikidata item at Module:User:CodeCat. However, I get an error "attempt to index field 'wikibase' (a nil value)". I thought Wikidata access was enabled for Wiktionary already? —CodeCat 17:29, 11 June 2017 (UTC)

See phab:T159316. Wikidata is expected to work sometime, but it's not working yet. --Daniel Carrero (talk) 17:43, 11 June 2017 (UTC)
Ah, so we only have to wait for someone to enable it then? —CodeCat 17:46, 11 June 2017 (UTC)
I believe the next step of the big Wikidata plan is to enable custom interwikis. This is phab:T158323. The interwiki thing will be enabled on June 20th as said in Wiktionary:Beer parlour/2017/June#Enable sitelinks on Wikidata for Wiktionary pages (outside main namespace).
Apparently this has to be done first, before implementing normal Wikidata queries ("arbitrary access") as described in phab:T159316. I don't know when they'll do it, but it's marked "ready to go". --Daniel Carrero (talk) 17:53, 11 June 2017 (UTC)

Template:place problem with Parijs

I just converted the entry Parijs to use {{place}}, but now the category Category:nl:Cities in France has disappeared from the page. It is the capital city of France, so isn't it also a city in France by definition? —CodeCat 18:05, 11 June 2017 (UTC)

It's a known bug (Template_talk:place#Capital_cities). The whole thing needs to be rewritten. DTLHS (talk) 18:13, 11 June 2017 (UTC)

Requested change to Template:trans-top

Partially related: Template talk:trans-top#link to wikidata item

Right now, on the first line, there is this:

{{#switch:{{{1|}}}||Translations to be checked=|id{{=}}"Translations-{{anchorencode:{{{1}}}}}"}}

I'm requesting that this be changed to:

{{#ifeq:{{{1|}}}|Translations to be checked||{{#if:{{{id|{{{1|}}}}}}|id{{=}}"Translations-{{anchorencode:{{{id|{{{1}}}}}}}}"}}}}

This change will allow the id= parameter on translation tables, so that each table can be matched to the corresponding senseid on an individual sense. It will create an anchor on the page with the format Translations-{{{id}}}, so that the translation table can be directly linked to. It is probably desirable to, eventually, remove {{{1}}} from the anchor (if no id is present), so that only ids are used as anchors and not glosses. Moreover, it would be nice if there was a small link to the corresponding senseid in the top bar. —CodeCat 11:51, 12 June 2017 (UTC)

Done. It is probably good to have a separate |id= parameter. — Eru·tuon 19:38, 13 June 2017 (UTC)
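For readers who find nested parser functions hard to follow, the decision the new wikitext makes can be restated as ordinary code. This is only a sketch of the logic: `translationAnchor` and `roughAnchorEncode` are names invented here, and the encoding step is a simplified stand-in for {{anchorencode:}}, which handles more characters than spaces.

```javascript
// Simplified stand-in for {{anchorencode:}} (assumption: only whitespace is handled).
function roughAnchorEncode(text) {
  return text.trim().replace(/\s+/g, "_");
}

// Mirrors the requested template logic:
//  - no anchor for the "Translations to be checked" table,
//  - otherwise use |id= if it was passed, falling back to the gloss ({{{1}}}),
//  - no anchor if the chosen value is empty.
function translationAnchor({ gloss = "", id } = {}) {
  if (gloss === "Translations to be checked") return null;
  const key = id !== undefined ? id : gloss;
  if (key.trim() === "") return null;
  return "Translations-" + roughAnchorEncode(key);
}

console.log(translationAnchor({ gloss: "country", id: "Q221" }));
console.log(translationAnchor({ gloss: "genre of music" }));
```

With an id the anchor no longer depends on the wording of the gloss, which is what allows links such as Macedonia#Translations-Q221 to survive a reworded gloss.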

preg_match_all equivalent

Basic question. What's the equivalent to preg_match_all in Lua? I was hoping that I could create a table/array like so: mw.ustring.match(content, "\n=+[^=]*=+[^=]*"). --Victar (talk) 14:32, 12 June 2017 (UTC)

@Victar: There isn't any equivalent. But I created the function matchToArray in Module:string, which hopefully will work for that purpose. — Eru·tuon 17:48, 12 June 2017 (UTC)
Either that or if you plan to do something to each match that doesn't require an index of which number match it was, you can use a for loop with mw.ustring.gmatch. — Eru·tuon 18:12, 12 June 2017 (UTC)
@Erutuon: Thanks again! Man, you'd think that would be a basic function in Lua. --Victar (talk) 18:19, 12 June 2017 (UTC)
@Erutuon: I'm pretty sure I'm asking for the impossible, but I'm trying to use this to create an array with one entry per section on a page: local content = require("Module:string").matchToArray(content, "\n=+[^=]*=+[^=]*"). I need this part, [^=]*, to match everything and stop at 2 repeating ==. I thought maybe [^={2}]* or .*(==), but no luck. Again, I assume this is impossible, but I thought I'd ask. --Victar (talk) 03:38, 13 June 2017 (UTC)
@Victar: If you want to stop at just two equals, I think .-==[^=] would work. .- is a non-greedy quantifier, equivalent to JavaScript .*?. — Eru·tuon 03:44, 13 June 2017 (UTC)
@Erutuon: \n=+[^=]*=+.-==[^=] is better, but it's stealing the first ='s and letter of the next section. You can find my test here. --Victar (talk) 04:43, 13 June 2017 (UTC)
@Victar: Well, I found a solution that grabs the content of the Descendants section. Was that what you wanted to do, or did you want to grab the whole rest of the language section below the Descendants header? — Eru·tuon 04:54, 13 June 2017 (UTC)
@Erutuon: I need to grab the section header and the content until the next section header. --Victar (talk) 04:58, 13 June 2017 (UTC)
@Victar: You can move the parentheses to capture whatever you want. — Eru·tuon 05:03, 13 June 2017 (UTC)
@Erutuon: Cool, but the problem with matching content that I'm not including is that it is taken from the next array item, which causes every other section to be skipped. --Victar (talk) 05:09, 13 June 2017 (UTC)
@Victar: Ouch. I can't think of a way to fix that, except by coming up with an entirely different approach to finding the Descendants section. — Eru·tuon 05:15, 13 June 2017 (UTC)
@Erutuon: If I just wanted to grab the Descendants section, that would be a breeze, but I'm trying to do something like GET KEY_1 of MATCH for "{{senseid|xxx}}" THEN FIND MATCH for "Alternative forms" with KEY_2 +/-2 from KEY_1. --Victar (talk) 05:35, 13 June 2017 (UTC)
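One way around the "every other section is skipped" problem is to stop trying to match header and body in a single pattern: find the header positions first, then slice the text between consecutive headers. The sketch below shows the idea in JavaScript, not Lua; `splitSections` is a name invented here, and the header pattern is a simplification that assumes well-formed `==Header==` lines. The same two-step approach should carry over to Lua using position captures, though that is not shown or tested here.

```javascript
// Split wikitext into { level, title, body } records, one per section header.
// Each header is matched on its own, so no match eats into the next section.
function splitSections(wikitext) {
  const headerRe = /^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$/gm;
  const headers = [...wikitext.matchAll(headerRe)];
  return headers.map((m, i) => {
    const start = m.index + m[0].length;
    const end = i + 1 < headers.length ? headers[i + 1].index : wikitext.length;
    return {
      level: m[1].length, // number of equals signs
      title: m[2],
      body: wikitext.slice(start, end).trim(),
    };
  });
}

const sample = "==English==\n\n===Noun===\nfoo\n\n====Descendants====\n* bar\n\n===Verb===\nbaz\n";
for (const s of splitSections(sample)) {
  console.log(s.level, s.title, JSON.stringify(s.body));
}
```

Because every section keeps its index in the returned array, a lookup of the kind described above (find the entry containing a given senseid, then look at the entries within two positions of it) becomes plain array arithmetic.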

Automatic Palindromes and नून

The automatic palindrome categorizer is good, but seems to be incorrect for abugidas. नून (nūn) isn't a palindrome, because backwards it would be ननू (nanū). Things like ननून, नूनू, प्रतंप्र, etc. would be considered palindromes. DerekWinters (talk) 19:24, 12 June 2017 (UTC)

Is there a rule that can be added to Module:palindromes/data to fix it? DTLHS (talk) 19:28, 12 June 2017 (UTC)
I think the reason why it's considered a palindrome is that नून consists of three characters, न ू न, and Module:palindromes determines palindromes based on individual characters, not combinations of letter and diacritic. It can be made to find letter plus diacritic combinations instead. (More technically, it has to put letter plus diacritic combinations into slots in the table that it uses, rather than individual characters.) — Eru·tuon 19:41, 12 June 2017 (UTC)
That's a language-independent process though, isn't it? If we know in advance which characters need to combine. Can we use our Unicode database for this? —CodeCat 19:45, 12 June 2017 (UTC)
It might be possible, but it would be more complicated than simply doing it for one script where there's a relatively small number of combining diacritics. With one script it would be fairly simple to make a pattern to search for letter–diacritic combinations, whereas the full list of Unicode combining characters would be huge and might require a more complicated function that uses the is_combining function in Module:Unicode data. — Eru·tuon 20:37, 12 June 2017 (UTC)
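The difference between reversing individual characters and reversing letter-plus-diacritic units can be shown in a few lines. This is a JavaScript sketch of the idea only, not what Module:palindromes does: the function names are invented, `\p{M}` (any Unicode mark) stands in for a per-script list of diacritics, and treating consonant + virama (U+094D) + consonant as one unit is a Devanagari-specific assumption added so that प्रतंप्र comes out as described above.

```javascript
// Break a word into units: a base letter, any conjunct consonants joined to it
// by the Devanagari virama (assumption), and the combining marks that follow.
function toClusters(word) {
  return word.match(/(?:\P{M}\u094D)*\P{M}\p{M}*/gu) || [];
}

// Naive check: compares individual code points.
function isPalindromeByCodePoint(word) {
  const cps = Array.from(word);
  return cps.join("") === cps.slice().reverse().join("");
}

// Cluster-aware check: compares letter-plus-diacritic units.
function isPalindromeByCluster(word) {
  const units = toClusters(word);
  return units.join("") === units.slice().reverse().join("");
}

for (const w of ["नून", "ननून", "नूनू", "प्रतंप्र"]) {
  console.log(w, isPalindromeByCodePoint(w), isPalindromeByCluster(w));
}
```

The trade-off is the one raised in the thread: a script-specific pattern like this is short, whereas a script-independent version would need a full table of combining characters.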

more pages exceeding the memory limit

fire now runs out of memory, following the addition of two Tamil translations. This lends support to the point, also made by some on Phabricator, that our auto-transliteration modules seem to be among the main culprits behind the errors. (They demonstrably are culprits behind a large chunk of the errors that plagued water.) As more and more pages run out of memory, perhaps we should rethink whether translations really need to provide transliterations. - -sche (discuss) 06:08, 13 June 2017 (UTC)

I'd rather move the translations to a subpage /translations than remove transliterations, which are very helpful. However, even that solution will run into problems in the future, when the translations themselves go over the 50MB limit. — Eru·tuon 06:57, 13 June 2017 (UTC)
@Erutuon In that case, how about /translations (A) for languages, whose names start with the letter A (for example, Arabic); /translations (B) for language names starting with B, and so forth. --Lo Ximiendo (talk) 07:04, 13 June 2017 (UTC)
If I remove any 2 translation languages with transliterations then the page renders fine. It seems that the most problematic transliterations are Korean and Thai. If I remove the Korean translations then it's fine. If I fill in the Korean transliteration in the translation template then it still uses 4 modules for Korean and it breaks. There is something inefficient there: it calls modules that it doesn't really use in the end. If that is fixed then a bot could subst missing tr parameters, avoiding the loading of large translit modules. The downside is having to refresh them after an update to the translit modules. --Vriullop (talk) 09:14, 13 June 2017 (UTC)
I think we should stop using modules for static text which is unlikely to change much or often, like transliterations. Make the script code and transliteration into parameters and only call the module when there is no value provided. As Vriullop says, a bot can do the work to keep them up to date. - [The]DaveRoss 11:15, 13 June 2017 (UTC)
Module:links generates a transliteration whenever a transliteration module is available, in order to compare manual to automated transliterations, so providing a manual transliteration actually uses slightly more memory. This problem of escalating memory usage is fairly recent- a matter of a few months- so we should look at changes made this year to see if any of them have side effects on memory usage. Chuck Entz (talk) 13:01, 13 June 2017 (UTC)
It may do that, it doesn't need to do that. - [The]DaveRoss 14:04, 13 June 2017 (UTC)
I second that: checking transliterations is obviously too taxing. Ideally the transliterations for a word should be created once and stored in the page (or in the Future: in Wikidata). It can be automated (by bot or with a gadget or an extension). — Dakdada 15:06, 13 June 2017 (UTC)
I agree we should stop using modules to automatically provide transliterations in translations (if not also in links), and "subst" them in by bot. (@Dakdada, accessing Wikidata is expensive and relatively slow, so that would not improve things over what is currently done, and might make things worse. And, of course, transliterations vary between wikis. It would be better to store transliterations directly on the page.) So, we need to (1) rewrite {{t}} to stop expensively checking the auto-translit against the manual translit, and (2) "subst" the auto translits into all entries. - -sche (discuss) 15:46, 13 June 2017 (UTC)
Saying "use a bot" to solve all our problems is intellectually lazy. Whose bot? How often is it expected to be run? What happens if the bot runner leaves the project or dies? How will the bot code be updated if we decide to change a formatting detail? DTLHS (talk) 15:51, 13 June 2017 (UTC)
For translations, I would think that a bot could run once to subst all existing {{t}}s, and then perhaps the translations-adder script could be changed to fetch and add (spelled out) the transliteration that the automatic transliteration module provides. Perhaps we should require the code of the bot to be available, something we did not require of some previous AutoFormat-esque bots whose users subsequently became much less active or were globally banned.
But even just step 1, rewriting {{t}} so that it does not compare a manual transliteration to the automatic one, together with the addition of manual transliterations to the {{t}}s in the entries that are currently broken, would fix most of the breakages we are seeing by stopping the invocation of the expensive translit modules. Even just {{t-simple}}, if it never invokes a transliteration module and only provided a transliteration when one was manually given, ought to work... - -sche (discuss) 16:20, 13 June 2017 (UTC)
Shared bot projects are possible if the toolserver is used. They can be maintained by a group and can be run against either the replica databases or via the API. - [The]DaveRoss 17:03, 13 June 2017 (UTC)
Another test on the fire page, on the previous version without t-simple: if I remove the translations with a tr parameter then the page renders fine, with Lua memory usage 47.64 MB/50 MB. The current version, with some translations changed to t-simple, uses 49.93 MB. Comparing manual and automatic transliterations is really a waste of resources. It is fine on a language-by-language basis for checking purposes, but it should be temporary and it should be switched off when the task is finished or nobody is checking it. If the translation-adder script adds the tr parameter then maybe a bot is not needed. --Vriullop (talk) 17:06, 13 June 2017 (UTC)

I've added manual transliterations for the Korean and Thai words on the page fire with {{subst:xlit|lang|term}}. Now transliterations are shown, but they don't use any Lua memory. — Eru·tuon 17:10, 13 June 2017 (UTC)

Fantastic. Hmm, would it be possible to update your t-simplifier gadget to convert translations in non-Latin scripts, adding the (manual) translit and script parameters? - -sche (discuss) 19:20, 13 June 2017 (UTC)
Probably. I just haven't done it yet because I imagine it will be complicated. I should, though. Once made, it will simplify things bigly. — Eru·tuon 19:27, 13 June 2017 (UTC)

{{t-simple}} now supports interwiki links. Turn them on with |interwiki=1. — Eru·tuon 19:28, 13 June 2017 (UTC)

Okay, I have an idea regarding the transliteration comparison, mentioned above: how about disabling the transliteration comparison on particular pages that are running into memory errors? So, on water and such pages, {{t}} would only invoke a transliteration module if no |tr= parameter was provided. — Eru·tuon 19:16, 15 June 2017 (UTC)

I prefer a solution which will solve the problem rather than playing whack-a-mole. As a stop-gap I think your solution is reasonable. - [The]DaveRoss 17:31, 16 June 2017 (UTC)
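The behaviour proposed in this thread (use the manual transliteration when one is given, and only call the transliteration module when it is missing or when checking is explicitly switched on) can be sketched as follows. This is an illustration in JavaScript, not the code of Module:links or {{t}}; `resolveTransliteration`, the `compareWithAutomatic` option and the shape of `params` are all assumptions made for the example.

```javascript
// params: { term, lang, tr? }
// transliterate: the expensive function, e.g. a wrapper around a translit module
// options.compareWithAutomatic: opt-in checking, off by default
function resolveTransliteration(params, transliterate, options = {}) {
  const manual = (params.tr || "").trim();
  if (manual !== "") {
    if (options.compareWithAutomatic) {
      // Only pay for the automatic transliteration when someone asked for the check.
      const automatic = transliterate(params.term, params.lang);
      return { tr: manual, mismatch: automatic !== null && automatic !== manual };
    }
    return { tr: manual, mismatch: false };
  }
  // No manual value: fall back to the automatic transliteration.
  return { tr: transliterate(params.term, params.lang), mismatch: false };
}

// Stub standing in for a transliteration module; counts how often it is called.
let calls = 0;
function stubTransliterate(term, lang) {
  calls += 1;
  return "bul";
}

console.log(resolveTransliteration({ term: "불", lang: "ko", tr: "bul" }, stubTransliterate), calls);
console.log(resolveTransliteration({ term: "불", lang: "ko" }, stubTransliterate), calls);
```

The per-page stop-gap suggested above would amount to leaving the comparison off on the pages that hit the memory limit, while the general fix would make that the default everywhere.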

Asia in Category:en:Countries

Why is the entry Asia in Category:en:Countries? It doesn't belong there as far as I can tell, but I can't find where on the page the category is being added to the page. —CodeCat 18:08, 13 June 2017 (UTC)

{{list:countries of Asia/en}} DTLHS (talk) 18:09, 13 June 2017 (UTC)
I've made it no longer categorize if the pagename is "Asia", and re-added the template (after @CodeCat removed it) under Hyponyms. — Eru·tuon 18:19, 13 June 2017 (UTC)
The same is happening to Europe as well. —CodeCat 12:58, 14 June 2017 (UTC)

Auto-expand translation table when linking to its anchor

The translation table on Republic of Macedonia contains a link to Macedonia#Translations-Q221, which takes you to the right translation table. However, the translation table stays collapsed. Could it be made so that the translation table expands whenever you link to its anchor (if it has one)? —CodeCat 18:10, 14 June 2017 (UTC)

The anchor to Macedonia#Translations-Q221 doesn't work for me (it links further down the page, probably because of things loading in the wrong order) (Chrome). DTLHS (talk) 18:16, 14 June 2017 (UTC)
That happens with fragments in general, very commonly on discussion pages. But if you wait for the page to load and then click the address bar and hit enter, it jumps to the right place for me. —CodeCat 18:18, 14 June 2017 (UTC)

Swahili on Wiktionary

"Welcome to the English-language Wiktionary, a collaborative project to produce a free-content multilingual dictionary. It aims to describe all words of all languages using definitions and descriptions in English."

Swahili has some words that do not fit into verb, noun, adjective, etc because they are sentences when translated into English, such as 'mtaona' (sic) which means 'you (plural) will see'.

How can we make pages for words such as this? Anjuna (talk) 09:51, 16 June 2017 (UTC)

I believe those are considered "verb forms". —suzukaze (tc) 10:11, 16 June 2017 (UTC)
Yes. It's no different than Latin vidēbitis. —Aɴɢʀ (talk) 12:03, 16 June 2017 (UTC)
I'll also point out the About_Swahili page, which covers some of the nuances with regards to the Swahili language treatment here on Wiktionary. I am not sure it addresses this issue directly, but the "conjugated form" section demonstrates how verb forms should be formatted. - [The]DaveRoss 13:33, 16 June 2017 (UTC)

Thank you all! Yes, I've made a derived verb page. I'll file them all under verb. I'm afraid that some will be taken down, though. They're a little long sometimes. I won't bother to make pages for words with object infixes, such as 'ninakuchukua', literally 'I am taking you', since those words contain the nominative, and the predicate- there are too many combinations.
Anjuna (talk) 00:56, 17 June 2017 (UTC)

How about this one? ataona Is there any more formatting that I must do to make this acceptable?Anjuna (talk) 01:31, 17 June 2017 (UTC)

Put {{head|sw|verb form}} under the Verb header. DTLHS (talk) 01:44, 17 June 2017 (UTC)
Also make sure you check for any irregular forms before you add {{sw-conj}} to a page. DTLHS (talk) 01:49, 17 June 2017 (UTC)

Thank you all so much! I found the verbal derivation. I found it helpful. This section can be deleted now, I suppose. —This comment was unsigned.

I see that many of the questions have already been answered. @Metaknowledge is probably able to help with any questions that remain, such as whether the more heavily inflected forms need to be created or not. - -sche (discuss) 03:17, 17 June 2017 (UTC)
Inflected forms are much lower priority when a language coverage is nowhere near a decent level for a dictionary. Before e.g. a Russian inflected entry for уви́дите (uvídite, you (plural) will see) was created, many lemmas like уви́деть (uvídetʹ) were made. --Anatoli T. (обсудить/вклад) 07:01, 17 June 2017 (UTC)

User:Metaknowledge hasn't replied to me, so, how do I signify negative?

@Science Bird: they replied on their talk page (which is a common practice), have a look there. Also it is helpful if you sign all your comments with four tildes (~~~~) so that everyone knows who is making a statement or asking a question. - [The]DaveRoss 18:48, 21 June 2017 (UTC)

Renaming a senseid?

Is there a particular way to rename a senseid? Right now, the music sense on house has genre of music as its senseid, but there is also a Wikidata item, d:Q20502. If the senseid is to be changed to match the Wikidata item, how do we deal with all the existing uses of the genre of music id? —CodeCat 16:58, 16 June 2017 (UTC)

The simplest solution would seem to be to allow multiple senseids, so that the one which is intelligible to humans can be kept, and the Wikidata one can be added. - -sche (discuss) 17:01, 16 June 2017 (UTC)
I suppose that would work. Should {{senseid}} take multiple id parameters then? —CodeCat 17:03, 16 June 2017 (UTC)
Yes. (I mean, AFAICT we could just use multiple instances of {{senseid}}, but obviously your suggestion is better.) Some senseids and anchors are linked-to from Wikipedia entries, and others may be linked-to from other off-wiki sites. I'm not entirely opposed to rotting links sometimes, for example if a particular senseid is badly named (very misleading, offensive, etc), but in general adding additional IDs seems preferable, with the understanding that there should be no vast proliferation of them (one human-readable ID and one Wikidata ID seems reasonable; five human-readable IDs, if one person wanted orange#the_colour and one wanted orange#the_color, etc, would be too many). - -sche (discuss) 17:52, 16 June 2017 (UTC)
How are we going to make sure that senseids are not changed or removed without links to them being fixed too? — Ungoliant (falai) 17:10, 16 June 2017 (UTC)
We can't, really, unless we have a way to track down all uses of a particular id. —CodeCat 17:11, 16 June 2017 (UTC)
So we're just supposed to accept inevitable link rot? Would anyone else support banning senseids altogether? DTLHS (talk) 17:19, 16 June 2017 (UTC)
Not without an adequate replacement. — Ungoliant (falai) 17:20, 16 June 2017 (UTC)
Yes, glosses, which are independent of what they reference. DTLHS (talk) 17:21, 16 June 2017 (UTC)
Senseids are a complement to, not a replacement of, glosses. Even glosses + senseids with link rot is an improvement over just glosses. Glosses form the informational connection between link and definition, while senseids form the software connection. — Ungoliant (falai) 17:27, 16 June 2017 (UTC)
I support only using them if they are reasonably robust. I think the long-term goal is that the editor and back-end infrastructure get to the point that a senseid would be intelligible to humans and also not suffer from the possibility that it becomes out of sync with the unique identifier. Not sure how long that term is though. - [The]DaveRoss 17:23, 16 June 2017 (UTC)
@TheDaveRoss: Would it be possible to make an edit filter that would catch changes of senseids? — Eru·tuon 18:15, 16 June 2017 (UTC)
I think it would be possible, I will take a look. - [The]DaveRoss 18:32, 16 June 2017 (UTC)
Changed my mind. I can make a filter which can detect if a senseid is added or removed, or a line which contains a senseid is changed. Possibly also if the first instance of senseid has been changed. But without flow control I don't think there is a way to see if any instance of senseid has been changed. - [The]DaveRoss 19:32, 16 June 2017 (UTC)
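Outside the edit filter, where loops are available, comparing the senseids before and after an edit is straightforward. The sketch below shows that comparison in JavaScript as something a bot or gadget could do; it is not an AbuseFilter rule, the function names are invented, and the pattern assumes the plain {{senseid|lang|id}} form with no extra parameters.

```javascript
// Collect "lang:id" keys for every {{senseid|lang|id}} in the wikitext.
function extractSenseids(wikitext) {
  const ids = new Set();
  for (const m of wikitext.matchAll(/\{\{senseid\|([^|{}]+)\|([^|{}]+)\}\}/g)) {
    ids.add(m[1].trim() + ":" + m[2].trim());
  }
  return ids;
}

// Report which senseids disappeared and which appeared between two revisions.
function diffSenseids(oldText, newText) {
  const before = extractSenseids(oldText);
  const after = extractSenseids(newText);
  return {
    removed: [...before].filter((x) => !after.has(x)),
    added: [...after].filter((x) => !before.has(x)),
  };
}

const oldRev = "# {{senseid|en|genre of music}} House music.";
const newRev = "# {{senseid|en|Q20502}} House music.";
console.log(diffSenseids(oldRev, newRev));
```

A rename shows up as one removed and one added id on the same page, which is the case where incoming links would need to be checked.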

Search results from sister projects

TOW has just added a nice search feature called "results from sister projects", so that searching Wikipedia also shows results in a sidebar from Wiktionary, Wikibooks, Wikivoyage, Wikiquote, and Wikisource. I wonder if we can get this...? I remember suggesting it a long time ago in response to the argument "we should have entries about TV series because people might not find them on other sites". Equinox 00:36, 19 June 2017 (UTC)

I like it. - [The]DaveRoss 13:27, 19 June 2017 (UTC)

Is my abuse filter not working?

[11] is supposed to prevent edit summaries of the word "nothing". It has blocked a few edits. How did this one get through? [12] Equinox 17:51, 21 June 2017 (UTC)

You can add whitespace to the end of the edit summary and it will be stripped in the history, but your edit filter won't recognize it. I'm guessing that's what happened. DTLHS (talk) 18:02, 21 June 2017 (UTC)
Converting the rule to regex could probably deal with that. — Eru·tuon 18:26, 21 June 2017 (UTC)
Aha. I have changed it to ^\s*[Nn]othing\s*$. Hope that's correct. Equinox 18:33, 21 June 2017 (UTC)
That works only if they aren't editing a section (unless section markers are automatically stripped?). To account for that, it would have to be something like ^\s*\/\*.*?\*\/\s*[Nn]othing\s*$ (though there could be errors, because I'm not sure what version of regex is used). — Eru·tuon 18:36, 21 June 2017 (UTC)
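The two patterns above can be tried against sample edit summaries. The check below runs them in JavaScript, whose regex dialect may differ in details from the one the edit filter uses, so it only confirms the general shape. The third pattern, with the section marker made optional, is an addition here and not something proposed in the thread.

```javascript
// Pattern from the filter: the whole summary is "nothing", ignoring surrounding whitespace.
const bare = /^\s*[Nn]othing\s*$/;

// Pattern that expects a leading "/* Section */" marker.
const withSection = /^\s*\/\*.*?\*\/\s*[Nn]othing\s*$/;

// Assumption: one pattern covering both cases by making the marker optional.
const either = /^\s*(?:\/\*.*?\*\/)?\s*[Nn]othing\s*$/;

const summaries = ["nothing", "Nothing   ", "/* Etymology */ nothing", "nothing much"];
for (const s of summaries) {
  console.log(JSON.stringify(s), bare.test(s), withSection.test(s), either.test(s));
}
```

Note that the section-aware pattern on its own no longer matches a bare "nothing", so the filter would need either both patterns or the combined one.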

Lua error: attempting to index upvalue 'm_data'

Many of the templates (of various kinds) on the page for the Malay/Indonesian term burung kakaktua are showing this error:

Lua error in Module:script_utilities at line 167: attempt to index upvalue 'm_data' (a boolean value).

Each of the templates looks fine to me individually when I look at it in edit mode. Other Malay/Indonesian words look fine. --46.226.49.232 11:33, 22 June 2017 (UTC)

This happens sometimes. Someone made an error when changing the code in one of the modules and then it was fixed, but there are so many entries using the module that it takes a while for the system to update all of them. If you find any more like that, simply edit the entry and save/publish it without making any changes (what we call a "null edit"). This will make the system apply all the edits waiting for that entry and bring it up to date. If that doesn't solve the problem, then something still needs to be fixed and you can let us know here. Thanks! Chuck Entz (talk) 13:43, 22 June 2017 (UTC)

Language name to ISO converter

I just discovered that the language name to ISO code converter that I used to be able to see in the sidebar is no longer showing up for me in either Firefox or Chrome, although it is still selected in my per-browser preferences. Any idea why this might be? It does warn that it's one of the potentially buggy gadgets, but it was also working before... Andrew Sheedy (talk) 03:34, 23 June 2017 (UTC)

See also links from Cyrillic е to ё

I recently noticed that зелений (zelenyj) didn't have a see-also link at the top to зелёный (zeljónyj). I'm surprised a bot didn't add it, and wanted to check that this was just an anomaly. — Eru·tuon 17:42, 23 June 2017 (UTC)

@Erutuon: But one has a soft ending, the other a hard one. --Barytonesis (talk) 17:43, 23 June 2017 (UTC)
Ohh... duh. Well, they both have hard endings, actually, just spelled in different alphabets. — Eru·tuon 17:44, 23 June 2017 (UTC)
I hadn't even noticed one was Ukrainian, not much better... --Barytonesis (talk) 17:48, 23 June 2017 (UTC)
The see-also links are mostly added by hand anyway. There was one bot that briefly added some, but it missed a lot. I just recently added {{also|weder}} to the top of Weder, for example. —Aɴɢʀ (talk) 06:54, 24 June 2017 (UTC)

Affix sense differentiation

Several affix entries include two or more (often completely unrelated) meanings, yet the words they derive are all categorized together:
e.g. modesty and messy both fall under [[Category:English words suffixed with -y]], with no mention of how -y was used in two completely different fashions.
I'm fairly new to this discussion page, so I can't say for certain this hasn't been discussed before, but I would like to hear if anyone else is... well, kinda bothered by this. – GianWiki (talk) 15:20, 24 June 2017 (UTC)

I agree it's not ideal. Another good example is e-. Are there any practical ways to get around this? Equinox 15:22, 24 June 2017 (UTC)
You can add a gloss, for example merger. DTLHS (talk) 15:29, 24 June 2017 (UTC)
Better still, use a senseid. —CodeCat 16:29, 24 June 2017 (UTC)
As is currently done with Category:English words suffixed with -y (diminutive). — Eru·tuon 17:46, 24 June 2017 (UTC)

Category:head tracking/no lang category

I seem to remember someone bringing this up before, though I don't remember the outcome: this tracking category currently has 33,393 entries in it, most of which seem to be Serbo-Croatian and proto-languages. I believe the reason for the latter is a statement in the block of code in Module:headword#full_headword that generates the category:

if not mw.ustring.find(cat, "^" .. data.lang:getCanonicalName()) then

I'm sure there are efficiency/speed benefits to converting the language name into a pattern this way, but it seems to me like any language name with pattern characters in it such as "-" would give unexpected results. Is there any way to use an escaped version of the language name instead? Or maybe we should skip language names with characters like "-" in them?

If there are reasons not to fix this, the question then becomes: how can you use a tracking category with 99% false positives? I ran into a couple of Latin entries in the category that seem to have some other problem (Alba Pompeia, for one), but looking through 33,393 entries for similar cases seems pointless (there is Special:WhatLinksHere/Template:tracking/headword/no lang category/lang/la, though, if you know you want to look specifically for Latin entries).

Of course, I'm not well-versed enough in Lua to write even the simplest of code, so I may not be understanding this correctly, but I figure it's worth bringing up anyway. Thanks! Chuck Entz (talk) 01:37, 26 June 2017 (UTC)

Looking further, I notice that there are 53,191 entries in Category:Serbo-Croatian lemmas, so a simple language-name explanation (literally) doesn't add up. After a quick look through the tracking category, I notice that there don't seem to be any verb or adjective endings. Perhaps it has something to do with the code for nouns in function export.noun of Module:sh-headword? There seem to be a good number of Esperanto terms in the tracking category as well. Chuck Entz (talk) 02:26, 26 June 2017 (UTC)
For some reason, the category name Serbo-Croatian lemmas doesn't get processed by the code mentioned above. In adsorpcija, the category name that put the entry in Category:head tracking/no lang category was Serbo-Croatian feminine nouns, not Serbo-Croatian lemmas. (I printed out the offending category name in the Lua log.) So perhaps only the S-C entries that have categories besides the lemma category ended up being put in the tracking category, or categories besides the lemma and basic part-of-speech category. — Eru·tuon 03:30, 26 June 2017 (UTC)
I think you're right that the problem was due to the hyphen-minus characters in the language names. I pattern-escaped the language name, then tested the entry adsorpcija; this removed it from the category. — Eru·tuon 03:22, 26 June 2017 (UTC)
I took a look at abateco, an Esperanto noun. It was in the tracking category because the Esperanto headword module Module:eo-headword put the category Category:Missing Esperanto noun forms into the table of categories supplied to Module:headword. I solved that problem by having Module:eo-headword put that category in a separate table. That should begin emptying Esperanto entries from the tracking category. — Eru·tuon 05:05, 26 June 2017 (UTC)
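For reference, the pattern-escaping issue discussed above can be reproduced with a minimal sketch in plain Lua. It assumes the standard string.find in place of mw.ustring.find (both treat "-" as a pattern character), and the helper name pattern_escape is hypothetical; the actual change made to Module:headword may look different.

```lua
-- Hypothetical helper: prefix every Lua pattern magic character with "%"
-- so that the text is matched literally.
local function pattern_escape(text)
    -- The parentheses drop gsub's second return value (the match count).
    return (text:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Stand-ins for data.lang:getCanonicalName() and one entry of the category list.
local lang_name = "Serbo-Croatian"
local cat = "Serbo-Croatian feminine nouns"

-- Unescaped: "o-" is read as "zero or more 'o', as few as possible",
-- so the literal hyphen in the category name is never matched.
local unescaped = string.find(cat, "^" .. lang_name)

-- Escaped: the pattern becomes "^Serbo%-Croatian" and the hyphen is literal.
local escaped = string.find(cat, "^" .. pattern_escape(lang_name))

print(unescaped, escaped)
```

Here unescaped is nil (the category looks as if it lacks the language name, which is what triggers the tracking category), while escaped is 1. Another option, if only a prefix test is needed, would be a plain-text search such as string.find(cat, lang_name, 1, true) == 1, which avoids patterns altogether.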

Misspelling kinda

Hello! In many languages I've found words that haven't been misspelled but rather misconjugated. In Swedish the verbs skära and bära are both strong verbs, but many wrongly conjugate them as weak ones, hence skärde, skärt, skärd, skärda, bärde, bärt, bärd and bärda. Is there a template for misconjugations? If there isn't one, I think it would be good to have one, since a misspelling isn't really what it is. I see that most English examples (fighted, swimmed, shooted) seem to have just been labeled (nonstandard), as if they aren't incorrect, just not as common. Should the Swedish verb forms also be labeled (nonstandard), or what do the guidelines say? Jonteemil (talk) 02:00, 26 June 2017 (UTC)

nonstandard is for terms that are generally considered incorrect: “Not conforming to the language as accepted by the majority of its speakers.” — Ungoliant (falai) 03:33, 26 June 2017 (UTC)