Wiktionary:Grease pit/2013/November

Template:ro-form-adj

In feminine#Romanian, this template displays "feminine pluralnominative form of feminin", "feminine pluralaccusative form of feminin", etc. There should be a space between "plural" and "nominative", etc. - -sche (discuss) 08:18, 1 November 2013 (UTC)[reply]

Done, though could we remove some of the wikilinks? I find them distracting. Also shouldn't the name be {{ro-adj-form of}}? Mglovesfun (talk) 11:36, 1 November 2013 (UTC)[reply]

I support de-linking these words ("feminine", "plural", "nominative", etc.) in form-of templates and inflection tables; a reader of a dictionary must already know what they mean. --Z 12:10, 1 November 2013 (UTC)[reply]

I've delinked the words. - -sche (discuss) 02:40, 2 November 2013 (UTC)[reply]

גיי אין דר׳ערד

In the headword-line, the geresh is not in {{l}} but all the other words are. As a result, it looks abnormally small because the fonts for Yiddish seem not to have been applied to it. (You can see how it should appear by looking at the page title). Why is this? Doesn't {{head}} "know" that the text in head= is Yiddish the same way that {{l}} does? If so, why are they not receiving the same font specifications? —Μετάknowledge^{discuss/deeds} 21:51, 2 November 2013 (UTC)[reply]

The problem is that {{head}} sees that you've already applied special internal styling, so it doesn't add its own styling. (This is correct behavior: if it did try to apply its own styling, then the text inside {{l}} would end up exceedingly large, because both templates would be enlarging it.) CodeCat has fixed the entry now, by switching to normal wikitext links and letting {{head}} handle the styling. —Ruakh_TALK 05:55, 3 November 2013 (UTC)[reply]

Template:grc-cite

I really like using this template. It allows me to enter minimal code and have absolutely beautiful results. However, I really, really hate maintaining and expanding it, as it requires a host of helper templates, many of which are exceedingly complex. Lua seems to have made quite the splash here since I was last active, and I suspect that I can rewrite it with Lua into a much simpler and more easily maintained form. So, firstly, can someone point me to a Lua primer, and perhaps a "Lua on the English Wiktionary" primer? The language doesn't seem too complicated (it can't really be any more esoteric than Wikicode), so I imagine a brief intro should be sufficient. The rest I can probably pick up by breaking large swathes of Wiktionary and then desperately trying to fix them before anyone notices. Additionally, if anyone has any specific thoughts or recommendations for the template, that'd be just swell. If anyone doesn't know the template or what it does, just look at any of my recent creations, showing the quotations, and then comparing the results with the code, and also looking at the templates used. -Atelaes λάλει ἐμοί 01:09, 3 November 2013 (UTC)[reply]

I would recommend moving the specific citation templates (Template:grc-cite-Euripides-Herakles) to a single data page with a table similar to Module:labels/data. The grc-cite module can then apply the wiki markup / links. In regards to Lua, you can download a local interpreter to experiment with without breaking Wiktionary, or just use Module:User:Atelaes. See the mediawiki lua reference manual for anything specific to the mediawiki implementation. DTLHS (talk) 02:01, 3 November 2013 (UTC)[reply]

By the way, it seems like there is nothing in this template specific to Ancient Greek. Perhaps you could generalize to work for any language? --Wiki Tiki 89 02:14, 3 November 2013 (UTC)[reply]

Except that Ancient Greek has a fixed corpus that will never grow (except with new discoveries), whereas languages like English have a virtually infinite set of works that we could use. DTLHS (talk) 02:28, 3 November 2013 (UTC)[reply]

So perhaps it could be extended to other languages with "fixed corpuses" (they're only fixed until something new is discovered, and then they are once again fixed). --Wiki Tiki 89 02:46, 3 November 2013 (UTC)[reply]

Not just those with fixed corpora, but even those with relatively small corpora like Tok Pisin, or even major modern languages where certain works are cited extensively. I think a unified multilingual quotation template system could be a great advance in that regard. —Μετάknowledge^{discuss/deeds} 17:12, 3 November 2013 (UTC)[reply]

Thanks for the links. Yes, a single page with all the algorithms is probably the best route....and I can probably have it send the info to the existing templates in the meantime, until I get everything transferred. The reason for the language specificity is so that I can use all of the abbreviated forms that I'm using, which certainly couldn't work if the template was used for other languages. This template will have a lot of info in it when it's complete, and it would be unreasonable to have a single template have all of that info for every language. It might be reasonable to have a single starting and ending template, such that the editor uses {{meta-cite|lang=grc|author|work|etc.}}, which is translated to {{grc-cite|author|work|etc.}}, which uses {{cite}} as its final formatting template. In any case, if other languages end up using similar templates and we want to unify them down the road, it's an easy after-market add. Thanks again for the tips and thoughts. -Atelaes λάλει ἐμοί 03:08, 3 November 2013 (UTC)[reply]

Your example there of a generalized template would be pretty useless. What I meant was to have one template or module that does everything except look up the details of the text. For example, {{cite-text|grc|Homer|Odyssey|...}} would call the function cite_text() in Module:cite. Module:cite will do all the work except knowing about the text. When it needs to know about the text, it will call get_text_info() in Module:cite/grc, which will give it all the information about Homer's Odyssey. That way, to add a language's corpus to this template, you only have to create another Module:cite/xxx with all the information about every text in the corpus. Anyway, that's just an example. I'm not telling you how you should structure your code, just giving some advice. --Wiki Tiki 89 03:29, 3 November 2013 (UTC)[reply]
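The split described above (one language-agnostic formatter plus one data lookup per language) can be sketched roughly as follows. This is written in JavaScript rather than Lua purely for illustration. The names `citeData`, `getTextInfo` and `citeText`, the shape of the data table, and the output format are all assumptions standing in for Module:cite/grc, get_text_info() and cite_text(); none of it is existing Wiktionary code.

```javascript
// Hypothetical per-language data, standing in for a data page such as Module:cite/grc.
const citeData = {
  grc: {
    Homer: {
      author: "Homer",
      works: { Odyssey: { title: "Odyssey" } },
    },
  },
};

// Stand-in for get_text_info(): the only part that knows about specific texts.
function getTextInfo(lang, author, work) {
  const langData = citeData[lang];
  if (!langData) {
    throw new Error("no corpus data for language: " + lang);
  }
  const authorData = langData[author];
  const workData = authorData && authorData.works[work];
  if (!workData) {
    throw new Error("unknown text: " + author + ", " + work);
  }
  return { author: authorData.author, title: workData.title };
}

// Stand-in for cite_text(): formatting that is the same for every language.
function citeText(lang, author, work, location) {
  const info = getTextInfo(lang, author, work);
  return info.author + ", " + info.title + " " + location;
}

console.log(citeText("grc", "Homer", "Odyssey", "1.1")); // Homer, Odyssey 1.1
```

Under this layout, supporting another language would only mean adding another entry to the data table; the formatting function would not change.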

Shouldn't this template be renamed {{RQ:grc-something}} for consistency with the other RQ templates? —Aɴɢʀ (talk) 16:55, 3 November 2013 (UTC)[reply]

Perhaps. If anything, it'd be {{RQ:grc}}, as it's the only one. It probably wouldn't be too difficult to make it {{RQ}}, and simply do {{RQ|grc|Homer|Iliad}} or something, with the overall machinery being language agnostic, and each language having its own database to access. -Atelaes λάλει ἐμοί 20:15, 3 November 2013 (UTC)[reply]

Separate articles for inflected forms

Discussion moved to Wiktionary:Beer parlour/2013/November#Separate articles for inflected forms.

Automatic watchlisting of created pages is back.

The automatic addition of pages you create to your watchlist is back. I hate this feature; can we turn it off? Mglovesfun (talk) 00:22, 4 November 2013 (UTC)[reply]

Preferences, in the watchlist tab. — Ungoliant ^(Falai) 00:29, 4 November 2013 (UTC)[reply]

And remember to do it for any bots you run. Else they will build ginormous, unmanageable watchlists. SemperBlotto (talk) 08:08, 4 November 2013 (UTC)[reply]

It would be desirable, I think, to be able to watch entries for a period after they are added to a watchlist, for whatever reason. There was a bugzilla item about this, but it lost momentum. Would a time-of-addition-to-watchlist value be a helpful additional field to use for pruning watchlists, possibly automatically? (The ability to watch based on category membership was another suggestion, which is a practical step toward language-specific watchlists, much wished here.) DCDuring TALK 13:37, 4 November 2013 (UTC)[reply]

Bug in translation adder (User:Conrad.Irwin/editor.js)

As can be seen in this diff, I was adding Hebrew translations for the Nile and the preview put them in the right place, but when I pressed save, the translations ended up in the Aramaic in Hebrew script section, rather than the Hebrew section. --Wiki Tiki 89 00:13, 5 November 2013 (UTC)[reply]

Old bug. Yes, it should be fixed. For this reason Serbo-Croatian uses nested "Roman" (script), not "Latin", which is also a language. The other old bug is when a {{trreq}} or {{ttbc}} doesn't let adding a translation if the language name precedes it. E.g., I can't add a German translation if there is a Georgian {{trreq}} (next one alphabetically) but there is no German {{trreq}} immediately before it.--Anatoli ^{(обсудить}/^вклад) 02:58, 8 November 2013 (UTC)[reply]

Put JavaScript in the URL

Hello,

When I put some JavaScript in the URL, like:

javascript:(function(){importScript("Uzanto:Yair rand/TabbedLanguages.js")})()

in eo:Mexico, it creates an error like: "importScript is not defined". I have the same error when I use the "mw" or "document" objects directly in the URL. Do you know why, and if I can avoid it?

Thank you in advance for your answers, — Automatik (talk) 13:16, 5 November 2013 (UTC)[reply]

If you don't load the library that includes "importScript" then you obviously can't use it (same for mw and document). When we write javascript for Wikimedia sites (gadgets or pages.js), those libraries are loaded by the software automatically. Dakdada (talk) 09:48, 7 November 2013 (UTC)[reply]

Addition: if you want to use the script mentioned above on a wiki, you have to create your own common.js and write the importScript line there. Dakdada (talk) 09:49, 7 November 2013 (UTC)[reply]

Thank you! But when I use the window object, I get the same error. Is there a library for the window object? Isn't it included in the major browsers? — Automatik (talk) 02:07, 8 November 2013 (UTC)[reply]
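As a rough illustration of the point made above (a function such as importScript exists only where the wiki software has already loaded it), here is a small JavaScript sketch that checks for a function before calling it. The helper `callIfDefined` and the two stand-in scope objects are hypothetical and exist only so the sketch can run outside a wiki page; it does not explain why a particular browser rejects a javascript: URL.

```javascript
// Hypothetical helper: call a function on a scope object only if it is defined there.
function callIfDefined(scope, name, ...args) {
  const fn = scope[name];
  if (typeof fn !== "function") {
    return { ok: false, reason: name + " is not defined in this context" };
  }
  return { ok: true, value: fn.apply(scope, args) };
}

// Stand-in scopes (assumptions): one without the library, one that mimics a page
// where the wiki software has already provided importScript.
const bareScope = {};
const wikiScope = { importScript: (page) => "loading " + page };

const page = "Uzanto:Yair rand/TabbedLanguages.js";
const missing = callIfDefined(bareScope, "importScript", page);
const present = callIfDefined(wikiScope, "importScript", page);

console.log(missing.reason); // importScript is not defined in this context
console.log(present.value);  // loading Uzanto:Yair rand/TabbedLanguages.js
```

On a real wiki page the first argument would be the page's own global object, and the simpler route remains the one suggested above: put the importScript line in your own common.js.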

Language-specific CSS at MediaWiki:Common.css

Is it a good idea to add language-specific formatting to MediaWiki:Common.css, such as the following:

.Hebr:lang(jrb) { font-family: Arial Unicode MS, Times New Roman, Arial, Arial Hebrew, Helvetica, sans-serif; }

(It turns out that those fonts are the best at displaying Judeo-Arabic's sole diacritic.)

Or is it better to create a new script tag entirely (jrb-Hebr)? --Wiki Tiki 89 15:01, 5 November 2013 (UTC)[reply]

IE 6 and IE 7 don't support the :lang pseudoclasses, but if you're relatively happy with .Hebr for Judeo-Arabic, and just consider this a minor improvement, it should be fine. (One thing we might consider is allowing multiple classes for a single script. That way, for example, the scripts that use a certain increased font-size could all get that behavior from a shared class; the scripts that disable italics could all do so from a shared class; etc.) —Ruakh_TALK 17:55, 7 November 2013 (UTC)[reply]

I am actually not relatively happy with .Hebr for Judeo-Arabic, mostly because one of the best and most common Hebrew fonts, namely David, which is second on our list, does not seem to display the HEBREW MARK UPPER DOT (U+05C4) at all. --Wiki Tiki 89 18:08, 7 November 2013 (UTC)[reply]

Can we remove David from the current .Hebr font stack and still meet everyone’s needs? Or move it to the end? That would be the best choice. For reference, it is currently:

font-family: SBL Hebrew, David, Narkisim, Miriam, Arial Hebrew, Arial, serif;

Otherwise, it’s best to stick to the framework that we are using, and create a class for .jrb-Hebr, in the short run. The font stack should be similar to the one for .Hebr. Is there a reason to omit SBL Hebrew, Narkisim, and Miriam from this stack, or to omit Arial Unicode MS, Times New Roman, and Helvetica from the other stack? Why reverse the order of Arial and Arial Hebrew? Times New Roman should probably be shuffled to the bottom, because it is not a sans-serif.

There is a much bigger discussion to be had about modernizing our font specs. —Michael Z. 2013-11-07 21:33 z

Also, anyone know why our .Hebr font stack ends with serif? It says nothing about this in the style sheet comments or in WT:AHE. I will change that to sans-serif, to go with the rest of the site. —Michael Z. 2013-11-07 21:39 z

"Arial Unicode MS" and "Times New Roman" are (quite surprisingly) the only fonts I found that display the HEBREW MARK UPPER DOT in the right place, most of the other ones display it awkwardly high, and David doesn't display it at all. For Hebrew itself, David is probably my favorite font in terms of distinguishing Hebrew vowel points. Therefore, your idea will not work. The ideal, and I don't see why we don't do this, is to come up with our own collection of compatibly licensed fonts that can be served from Wiktionary. That way we don't have to worry about what people have installed except as a backup.

Regarding serifs, Hebrew "serifs" are not the same thing as Latin serifs. Hebrew generally looks better with serifs and sans-serif Hebrew fonts are a computer-age innovation. --Wiki Tiki 89 21:49, 7 November 2013 (UTC)[reply]

Do you know if any of the already-installed ULS fonts will work for Judeo-Arabic? If so, then we can just serve them up immediately. If not, then do you know of any free fonts that do? – we could ask the developers to install them.

So sans-serifs are completely foreign to the Hebrew writing system, or just non-traditional? Serif fonts or equivalents might be more appropriate for Latin, Cyrillic, Greek, and probably everything else in Wiktionary. Sans-serifs are divorced from traditional manuscript and type letterforms.

But our house style is sans-serif, and we shouldn’t specify serif fonts for an individual language or writing system unless the community agrees. —Michael Z. 2013-11-07 22:33 z

Sans-serif fonts are fine for Hebrew in general, but they're never used for writing with vowel diacritics, so no sans-serif font has very good vowel support. A lot of vowels are hard to distinguish in sans-serif fonts at normal font sizes. —Ruakh_TALK 23:49, 7 November 2013 (UTC)[reply]

Wikitiki89, can you provide a text sample to try out fonts for both Judeo-Arabic and Hebrew? —Michael Z. 2013-11-07 22:38 z

Serif fonts can be easier to read for certain scripts so I think we should make the decision per script. I think using serifs for Chinese would be a good idea for example. —CodeCat 22:50, 7 November 2013 (UTC)[reply]

The only requirements for Judeo-Arabic are that the dot be visible above each letter in the following: {{lang|jrb|גׄדׄטׄכׄךׄצׄץׄתׄ}}, and the {{lang|jrb|ﭏ}} ligature needs to exist. For Hebrew itself, the basic requirements are that each of the vowel diacritics be clearly distinguishable at our normal font size. Here's the letter א with each of the vowels: אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ. But the ideal requirement for Hebrew is to be able to display all the cantillation marks also, which most fonts don't do very well. The open-source Taamey Frank CLM font here is probably the best for cantillation marks ~~(if we could get Wikimedia to install it that would be awesome!)~~, but it does not work for Judeo-Arabic. From all the searching I've done, as I've said above, the only good fonts for Judeo-Arabic that I've found are Arial Unicode MS and Times New Roman (and Courier New, but that's not a good display font). --Wiki Tiki 89 23:12, 7 November 2013 (UTC)[reply]

Scratch that, Taamey Frank CLM is installed based on the link you gave me. --Wiki Tiki 89 23:15, 7 November 2013 (UTC)[reply]
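The Judeo-Arabic requirements stated above come down to two code points a font must handle: the upper dot (U+05C4) combining with a base letter, and the alef-lamed ligature. A small JavaScript sketch for listing the code points in a test string follows; the helper name `codePoints` and the shortened sample (gimel plus upper dot, a space, then the ligature) are illustrative choices, and the strings are written as escapes so that the code points are unambiguous.

```javascript
const UPPER_DOT = "\u05C4";  // HEBREW MARK UPPER DOT
const ALEF_LAMED = "\uFB4F"; // HEBREW LIGATURE ALEF LAMED
const GIMEL = "\u05D2";      // HEBREW LETTER GIMEL

// Shortened test sample: one dotted letter and the ligature.
const sample = GIMEL + UPPER_DOT + " " + ALEF_LAMED;

// List each code point in U+XXXX form.
function codePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));
}

console.log(codePoints(sample).join(" ")); // U+05D2 U+05C4 U+0020 U+FB4F
```

Whether a given font actually draws these correctly still has to be checked by eye in each browser and operating system.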

Let’s see if we can get the ULS to kick in.

גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ

And here are some fonts I have installed, and display problems on my machine. WikiTiki, most of the fonts in your jrb stack break in some way, in Safari/Mac. It’s possible that I have old versions of the fonts.

default – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ
bad font spec – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ
Arial – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ
Arial Unicode MS – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (no diacritics combine)
Arial Hebrew – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (upper dot breaks characters)
Code2000 – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ
Courier New – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (diacritics above don’t combine)
Helvetica – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (nothing works)
Linux Libertine – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (upper dot breaks the display, ligature absent)
Lucida Grande – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (ligature absent)
Microsoft Sans Serif – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ
Raanana – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (ligature absent)
Tahoma – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ
Times New Roman – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ
Titus Cyberbit Basic – גׄדׄטׄכׄךׄצׄץׄתׄ ﭏ אֹאֻאָאַאֶאֵאִאֳאֲאֱאְ (diacritics below are offset one character left)

By the way, which browser/OS does not display these correctly? As far as I can tell, they all work fine in Mac OS X/Safari, unless I start imposing fonts on them (although I see two forms of the ligature, in different fonts).

CodeCat, if we decide what improves readability on a per-script basis, we may as well just change the global fallback from sans-serif to serif. Serif is generally easier to read, as long as you have decent font rendering. —Michael Z. 2013-11-08 00:11 z

I just realized that when I tested Times New Roman, I tested it on my iPhone, mistakenly thinking that if it works there it would work on my computer. We should take into account the difference between PC and Mac versions of the same fonts. --Wiki Tiki 89 00:46, 8 November 2013 (UTC)[reply]

Using ULS fonts

Now that I know that Taamey Frank CLM is installed in the Universal Language Selector, we should use it for Hebrew (but not for Judeo-Arabic). Does anyone know how to do this? --Wiki Tiki 89 23:33, 7 November 2013 (UTC)[reply]

Okay!! While TaameyFrankCLM and MiriamCLM, given in the WMF FAQ, don’t work, it appears that Taamey Frank CLM and Miriam CLM do work on my machine. I have updated the items above. Please let me know how it looks for you. —Michael Z. 2013-11-22 16:37 z

Two Arabic diacritics in a row and a Wikimedia bug

Does anyone know how to overcome this Wikimedia bug? Can the characters be forced to appear in the right order? Impacted - Module:ar-verb (please don't change, make a copy if you want to try) or Arabic conjugation templates. E.g. 2nd person plural feminine of the verb فَعَلَ (faʿala, “to do”) is فَعَلْتُنَّ romanised as "faʿaltunna" but the last symbols are displayed incorrectly, the auto-transliteration module shows "faʿaltunaa" (incorrect)

In this form دَلَلْتُنَّ (dalaltunna - if spelled correctly), I'm trying to make the last two diacritic symbols ـّ (šadda) and ـَ (-a) (fatḥa) appear in this order. They invariably swap places and become ـَ (-a) (fatḥa) + ـّ (šadda). Having both symbols as variables in a module doesn't seem to have any effect.

This is an old bug. I've seen old discussions, but there seems to be no resolution. See User_talk:MK#Arabic_vowels posted by Stephen G. Brown (talk • contribs) back in 2007. It's not a problem with Unicode; examples of correctly ordered diacritics can be found on the web, like on this site, where the combination appears twice - أَنْتُنَّ فَعَلْتُنَّ (ʾantunna faʿaltunna) - "you (females) did" (displayed in the wrong order here but correctly on that web site).

--Anatoli ^{(обсудить}/^вклад) 08:55, 9 November 2013 (UTC)[reply]

testing - Keφr

Hmm... I just pasted the following string of Unicode characters below:

00062D  ARABIC LETTER HAH
000633  ARABIC LETTER SEEN
00064E  ARABIC FATHA
000651  ARABIC SHADDA
000646  ARABIC LETTER NOON
000020  SPACE
00062D  ARABIC LETTER HAH
000633  ARABIC LETTER SEEN
000651  ARABIC SHADDA
00064E  ARABIC FATHA
000646  ARABIC LETTER NOON

DejaVu Sans: حسَّن حسَّن

Droid Arabic Kufi: حسَّن حسَّن

Droid Naskh Shift Alt: حسَّن حسَّن

Code2000: حسَّن حسَّن

Both words render identically in each font for me, with shadda above fatha in Code2000 and below in the others. Now I shall look if the software swaps the order of diacritics.

Keφr 09:12, 9 November 2013 (UTC)[reply]

Apparently, yes:

00062D  ARABIC LETTER HAH
000633  ARABIC LETTER SEEN
00064E  ARABIC FATHA
000651  ARABIC SHADDA
000646  ARABIC LETTER NOON
000020  SPACE
00062D  ARABIC LETTER HAH
000633  ARABIC LETTER SEEN
00064E  ARABIC FATHA
000651  ARABIC SHADDA
000646  ARABIC LETTER NOON

Keφr 09:19, 9 November 2013 (UTC)[reply]

more testing - Keφr

Trying U+200D ZERO WIDTH JOINER:

DejaVu Sans: حسَّن حسّ‍َن

Droid Arabic Kufi: حسَّن حسّ‍َن

Droid Naskh Shift Alt: حسَّن حسّ‍َن

Code2000: حسَّن حسّ‍َن

Keφr 09:21, 9 November 2013 (UTC)[reply]

Apparently putting U+200D ZERO WIDTH JOINER prevents the software from swapping the order of diacritics, but only Code2000 renders this identically. And User talk:MK#Arabic vowels tells me that it is the wrong rendering. Suggestion: drop Code2000 on the floor. Keφr 09:28, 9 November 2013 (UTC)[reply]

Using ZWJ is a great idea! I don't quite understand though. I'm not using fonts in the module and the last example seems to work correctly as well. How do I "drop Code2000 on the floor"? What do you mean? --Anatoli ^{(обсудить}/^вклад) 09:34, 9 November 2013 (UTC)[reply]

(before EC)I tried to apply the fix in Module:ar-verb and at one stage I saw a good result in User:Atitarev/ar-conjug-I-geminate-test (romanisation "dalaltunna" is correct) but then it changed back to what it was. There is still "دَلَلْتُنَّ" with the wrong order ("dalaltunaa"). --Anatoli ^{(обсудить}/^вклад) 09:52, 9 November 2013 (UTC)[reply]

Remove it from MediaWiki:Common.css. This is how it renders for me (left: ZWJ, right: no ZWJ). Which is the correct rendering? Font links: DejaVu, Droid. Keφr 09:49, 9 November 2013 (UTC)[reply]

OK, thanks. Please have a look at my query above. --Anatoli ^{(обсудить}/^вклад) 09:52, 9 November 2013 (UTC)[reply]

Re rendering. On my PC they visually look correct, all of them (interestingly shadda is shown under (i.e. before) fatha, even if placed in the other order). When I check the order, those with ZWJ are correct, the others not - all fonts. On my iPad the order shows correctly for ZWJ examples, others (without ZWJ) are incorrect for all fonts. My issue with the module persists though, even if I use a variable that includes ZWJ, which you have added. --Anatoli ^{(обсудить}/^вклад) 09:58, 9 November 2013 (UTC)[reply]

But which of the renderings in the picture is correct? And I have not added ZWJ to the module. I just wrote the UTF-8 octet values in decimal. Keφr 10:05, 9 November 2013 (UTC)[reply]
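For reference, the decimal UTF-8 octet values mentioned above can be computed as in the JavaScript sketch below. The helper names `utf8Decimal` and `luaDecimalEscape` are made up for this illustration, and the idea that the module spells the marks as Lua decimal escapes is a reading of the comment above, not something checked against Module:ar-verb.

```javascript
const SHADDA = "\u0651"; // ARABIC SHADDA
const FATHA = "\u064E";  // ARABIC FATHA

// UTF-8 octets of a string, as decimal numbers.
function utf8Decimal(text) {
  return Array.from(new TextEncoder().encode(text));
}

// The same octets written as Lua-style decimal escapes (\ddd).
function luaDecimalEscape(text) {
  return utf8Decimal(text).map((octet) => "\\" + octet).join("");
}

console.log(utf8Decimal(SHADDA + FATHA));      // [ 217, 145, 217, 142 ]
console.log(luaDecimalEscape(SHADDA + FATHA)); // \217\145\217\142
```

Written this way, the source text contains only ASCII characters, so there are no combining marks in it for the software to reorder when the page is saved.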

Strangely, VISUALLY, the first test looks OK on PC - (shadda+fatha order), even if other order is used. The 2nd test without ZWJ looks wrong on iPad. I'm using SC Unipad to see the real order of characters when saved here. It's hard to demonstrate anything, as the order of diacritics changes when saved here. Honestly, I am now confused about what's happening. --Anatoli ^{(обсудить}/^вклад) 10:54, 9 November 2013 (UTC)[reply]

You can do it by using XML codes (without spaces):

shadda+fatha = &#x0 651;&#x0 64E;
shadda+kasra = &#x0 651;&#x0 650;
shadda+dhamma = &#x0 651;&#x0 64F;

An easier way is to use {{ar-dia}}. It will insert whatever compound diacritic you want. —Stephen ^(Talk) 10:07, 9 November 2013 (UTC)[reply]

Thank you, Stephen, but I need it to work with a module. --Anatoli ^{(обсудить}/^вклад) 10:54, 9 November 2013 (UTC)[reply]

Any thoughts on the root entries themselves? How about the modules surrounding the forms of verbs that use the aforementioned roots? --Lo Ximiendo (talk) 10:15, 9 November 2013 (UTC)[reply]

It seems to work now but I'm not entirely sure, which change fixed it. --Anatoli ^{(обсудить}/^вклад) 22:28, 9 November 2013 (UTC)[reply]

I think this change fixed the transliteration module. The issue is simple: MediaWiki converts all Unicode strings to NFC (I think) before saving the page and passing it to Lua modules. (Anatoli's fix did not work, because it substituted a string for itself; the normalisation algorithm reordered the combining characters in the module code. Using numeric escapes prevents it from happening.) I think we could fix a later shadda substitution, so that it catches and passes through any interleaved fatha or whatever else appears there, instead of changing the combining-mark order before the transliteration proper.

(TLDR version: this is a transliterator and/or font renderer bug, not a MediaWiki bug.) Keφr 22:41, 9 November 2013 (UTC)[reply]
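The reordering described above can be reproduced outside MediaWiki with any NFC implementation. The JavaScript sketch below assumes, as the explanation above does, that NFC normalisation is what the software applies on save; the variable and helper names are illustrative. Fatha has a lower canonical combining class than shadda, so canonical ordering puts fatha first, while a zero width joiner between the two marks stops them from being reordered.

```javascript
const HAH = "\u062D", SEEN = "\u0633", NOON = "\u0646";
const FATHA = "\u064E", SHADDA = "\u0651", ZWJ = "\u200D";

const shaddaFirst = HAH + SEEN + SHADDA + FATHA + NOON;
const fathaFirst = HAH + SEEN + FATHA + SHADDA + NOON;
const withZwj = HAH + SEEN + SHADDA + ZWJ + FATHA + NOON;

// Show a string as space-separated hexadecimal code points.
function hex(text) {
  return Array.from(text, (ch) =>
    ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")).join(" ");
}

console.log(hex(shaddaFirst));                  // 062D 0633 0651 064E 0646
console.log(hex(shaddaFirst.normalize("NFC"))); // 062D 0633 064E 0651 0646
console.log(hex(withZwj.normalize("NFC")));     // 062D 0633 0651 200D 064E 0646
```

Because the escapes are only turned into characters when the code runs, the source text itself holds nothing for normalisation to reorder, which matches the observation that numeric escapes survive saving.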

Thanks for that, mate. --Anatoli ^{(обсудить}/^вклад) 22:48, 9 November 2013 (UTC)[reply]

For transliteration, it's not a big problem. The transliterator should be written not to care what order the diacritics are in. I can help with that if need be. --Wiki Tiki 89 23:34, 9 November 2013 (UTC)[reply]
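One way to make a transliterator indifferent to mark order, as suggested above, is to gather all the marks that follow a consonant before emitting anything. The JavaScript sketch below does this for a toy table that covers only the letters of the example word from this thread; the table, the function name `transliterate` and the overall approach are assumptions for illustration and are not the actual transliteration module.

```javascript
// Toy tables (assumptions): only what the example word needs.
const CONSONANTS = { "\u062F": "d", "\u0644": "l", "\u062A": "t", "\u0646": "n" };
const VOWELS = { "\u064E": "a", "\u064F": "u", "\u0650": "i", "\u0652": "" }; // sukun maps to nothing
const SHADDA = "\u0651";

function transliterate(text) {
  const chars = Array.from(text);
  let out = "";
  let i = 0;
  while (i < chars.length) {
    const base = CONSONANTS[chars[i]];
    i += 1;
    if (base === undefined) continue; // skip anything outside the toy table
    let doubled = false;
    let vowel = "";
    // Collect the marks on this consonant, in whatever order they appear.
    while (i < chars.length && (chars[i] === SHADDA || VOWELS[chars[i]] !== undefined)) {
      if (chars[i] === SHADDA) doubled = true;
      else vowel = VOWELS[chars[i]];
      i += 1;
    }
    out += (doubled ? base + base : base) + vowel;
  }
  return out;
}

const stem = "\u062F\u064E\u0644\u064E\u0644\u0652\u062A\u064F\u0646"; // d-a-l-a-l-sukun-t-u-n
console.log(transliterate(stem + "\u0651\u064E")); // dalaltunna (shadda, then fatha)
console.log(transliterate(stem + "\u064E\u0651")); // dalaltunna (fatha, then shadda)
```

Both mark orders give the same result because the doubling and the vowel are only applied after every mark on the consonant has been read.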

Script direction in Old Italic script

The entry 'uerfale' using Chrome. Note how the image and the head word are not in the same direction. Click to see enlarged version. Mglovesfun (talk) 13:37, 12 November 2013 (UTC)[reply]

I notice that Umbrian entries (language code "xum") display on my computer as left-to-right, but w:Umbrian language says that the Old Italic script ({{Ital}}) was right-to-left. Is it supposed to be that way, is it an oversight, or is it some quirk of my system/installed fonts (Mac: browsers are Firefox & Safari- I checked both)? Chuck Entz (talk) 17:16, 10 November 2013 (UTC)[reply]

I believe that Old Italic, like Greek inscriptions of the same era, was actually freely variable between being left-to-right and right-to-left, and the letter orientation varied depending on the reading order, so if the letter 𐌄 looked like E then the text was left-to-right, and if it looked like Ǝ then the text was right-to-left. (Back in the days of slide projectors, this fact made it very difficult for archaeology professors to tell whether their slides were in backwards or not.) —Aɴɢʀ (talk) 17:47, 10 November 2013 (UTC)[reply]

According to Unicode, Old Italic is defined as an LTR script, although I know that Firefox deviates and displays Old Italic as RTL for one reason or another. -- Liliana • 17:51, 10 November 2013 (UTC)[reply]

I've added a screenshot above of what 𐌖𐌄𐌓𐌚𐌀𐌋𐌄 (uerfale) looks like in Chrome with my stylesheet. Mglovesfun (talk) 13:52, 12 November 2013 (UTC)[reply]

I think that's OK. The digital text is LTR with letters oriented for LTR writing, and the image is RTL with the letters oriented for RTL writing. I suspect if there were a literate native speaker of Umbrian around, he would say both versions are correct and the word can be written either way. —Aɴɢʀ (talk) 15:08, 12 November 2013 (UTC)[reply]

If desired, RLO can always be used to force right-to-left direction at least in the wikitext. -- Liliana • 19:11, 12 November 2013 (UTC)[reply]
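For illustration, wrapping text in the override characters mentioned above might look like the JavaScript sketch below. The function name `forceRightToLeft` is hypothetical. The sketch only builds the string; how a browser or font then draws Old Italic letters is outside its control.

```javascript
const RLO = "\u202E"; // RIGHT-TO-LEFT OVERRIDE
const PDF = "\u202C"; // POP DIRECTIONAL FORMATTING, which ends the override

// Wrap text so that a bidi-aware renderer lays it out right to left.
function forceRightToLeft(text) {
  return RLO + text + PDF;
}

const letterE = "\u{10304}"; // OLD ITALIC LETTER E
const wrapped = forceRightToLeft(letterE);
console.log(Array.from(wrapped).length); // 3
```

Closing the override with U+202C keeps the forced direction from leaking into whatever text follows.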

Template:exthomophones

I think that template needs to be updated so it can be used for Modules. But how? --ElisaVan (talk) 11:45, 12 November 2013 (UTC)[reply]

Also Template:question. --ElisaVan (talk) 11:46, 12 November 2013 (UTC)[reply]

If we Lua-ize {{homophones}} then we can just redirect {{exthomophones}} to it. --Wiki Tiki 89 13:24, 12 November 2013 (UTC)[reply]

Modules can't call on templates, but templates can call on modules. Mglovesfun (talk) 13:29, 12 November 2013 (UTC)[reply]

That is a correct statement, but what do you mean by it? I did not suggest a module calling on a template. --Wiki Tiki 89 13:32, 12 November 2013 (UTC)[reply]

Actually frame:expandTemplate can be used to expand templates from modules (if absolutely necessary- seems like a bad idea to me). DTLHS (talk) 01:36, 13 November 2013 (UTC)[reply]

Module:labels does it, to call the old context templates, as a compatibility measure. It's not a bad idea as long as we do it sparingly or if there's a good reason. —CodeCat 14:37, 19 November 2013 (UTC)[reply]

Category:Pages with script errors

The category currently contains several language code templates (mostly variants of Quechua) which are missing from Module:languages. Can we get rid of the errors, one way or the other? Keφr 18:12, 12 November 2013 (UTC)[reply]

Looks like the Template:langt template was altered. Personally, I'd delete the templates for the Quechua variants since we don't recognize them as separate languages. -- Liliana • 19:17, 12 November 2013 (UTC)[reply]

Huh, I thought the Quechua variants' codes had already been deleted. Ah well, I've started deleting them now. As Liliana says, we do not and should not treat them as separate languages. - -sche (discuss) 20:16, 12 November 2013 (UTC)[reply]

What about the Pashto ones? Keφr 20:57, 13 November 2013 (UTC)[reply]

This fellow writes, regarding Ethnologue's decision to split the language, that "speakers of all the three dialects of Pashto i.e. Northern, Central and Southern can understand each other up to 99%" of the time, and "moreover the structure of all of the three dialects is exactly the same." Our sister project, citing D. N. MacKenzie, says "the morphological differences between the most extreme north-eastern and south-western dialects are comparatively few and unimportant, and the criteria of dialect differentiation in Pashto are primarily phonological." That suggests WT:LANGTREAT is right to say that only Pashto proper (code: 'ps') should be treated as a language. I happen to know someone who speaks Pashto and has worked in Afghanistan as a translator, though, so I'll ask them for their thoughts on the matter. - -sche (discuss) 22:23, 13 November 2013 (UTC)[reply]

My friend confirms that the differences between the Pashtun lects are mainly in pronunciation, the kind of thing en.Wikt handles with {{a}} rather than with separate L2s. Speakers of the various lects can communicate freely, though it's clear which region a speaker is from. The dialectal differences are exemplified by the different pronunciations of the language's name, پښتو, as Pashto/Pəshto (Southern) vs Pəkhto (Northern). There are also a few lexical differences, some of which are longstanding and others of which are due to Pakistani Pashto speakers' tendency to borrow words from English whereas Afghanistani Pashto speakers borrow from Dari, but on balance, the dialects are no more different than, say, GenAm and NZ English. (Also, it is to be noted that most Pashto speakers are illiterate.) I've deleted the dialect codes accordingly and will add a link to this discussion to WT:LANGTREAT. - -sche (discuss) 19:03, 14 November 2013 (UTC)[reply]

FYI, Abuse Filter 26 has been disabled

As you know, we were hit a few months ago by spambots that created pages with links to "personal blogs" (a.k.a. sites selling diet pills, Gucci handbags, etc). We tried to block the bots with Abuse Filter 21, but its side effects were undesirable enough that it was disabled; Filter 25 was likewise disabled because of its side effects. Our latest defence was Filter 26, but it was recently pointed out (here) that it, too, has undesirable side effects; as a result, it too has been disabled (or, technically, prevented from stopping edits). The spambots are still trying to create pages (see e.g. this recent diff), but Filter 23 (which someone imported from WP) seems to be catching them... perhaps we should just wait a while and see if we actually need any additional anti-link filters or not. - -sche (discuss) 01:24, 13 November 2013 (UTC)[reply]

Today I encountered User:RickieWisniewsk and User:GertrudeTost. They look like mechanically generated user pages but there's no promotional material AFAICT. Rickie also has a page on a small Italian wiki elsewhere on the web which is definitely spam, so it must be a spambot at work, but why add promotional material that doesn't promote? Does this spam page impact another somehow? Haplogy (話) 15:10, 14 November 2013 (UTC)[reply]

Maybe it is to fool filter conditions like "if user has less than [a handful of] contributions". Or maybe they are legitimate users (for some definitions of "legitimate", at least). Keφr 16:25, 14 November 2013 (UTC)[reply]

They're definitely produced by a NTSAMR-type bot. My guess is that the bot owner has realized that the edits were being stopped, and they're trying different combinations of words to see what gets through. The problem with Filter 23 is that it depends on spotting word patterns, and the word patterns can be changed by the bot programmer. Once they figure out what the filter is looking for, they'll just rephrase around it. Chuck Entz (talk) 07:00, 15 November 2013 (UTC)[reply]

Would it be helpful to find out (via checkuser?) the IP addresses of the spam users (GertrudeTost etc), to see if they're all in one range that can be blocked? (See also meta:NTSAMR.) - -sche (discuss) 08:40, 15 November 2013 (UTC)[reply]

Category:en:Quran

It's not visible yet. I'm not that familiar with Wiktionary formats. Can someone help, please? Pass a Method (talk) 05:32, 13 November 2013 (UTC)[reply]

Move to RFD. No entries belong to this category in any language (currently four in Category:Quran), so the categories display nothing. --Anatoli ^{(обсудить}/^вклад) 06:02, 13 November 2013 (UTC)[reply]

Pass a Method just created it, so of course it has no entries. I created Category:Quran with {{topic cat||Quran}} in it, and {{topic cat parents/Quran}} with Islam as the only parent (so far), and created language-specific subcategories for some languages with large percentages of Muslim speakers. Now the only thing left is to add [[Category:en:Quran]] to the appropriate entries. As for visibility: [[Category:en:Quran]] will put the page it's added to in the category and hide the wikitext. You have to put a colon inside the double square brackets to have it show up and not categorize the page: [[:Category:en:Quran]]. Chuck Entz (talk) 06:27, 13 November 2013 (UTC)[reply]

I can't say that I see much need for this category, but if we're going to have it, I think it should be at Category:en:Qur'an. See https://books.google.com/ngrams/graph?content=Koran,Quran,Qur'an. —Ruakh_TALK 07:42, 13 November 2013 (UTC)[reply]
- Ok, I replaced all the Quran categories with Qur'an ones, and deleted all of the former except [[Category:en:Quran]] and [[Category:Quran]] (for the time being, anyhow).

There seem to be few enough entries that this could be easily merged into Category:Islam. Mglovesfun (talk) 13:13, 13 November 2013 (UTC)[reply]

I would actually prefer Quran without the apostrophe because I have already tagged some entries without an apostrophe. But I don't feel strongly about it. 80.42.72.138 18:02, 13 November 2013 (UTC)[reply]

Surely accuracy takes precedence over convenience. Mglovesfun (talk) 20:59, 13 November 2013 (UTC)[reply]

Surely helping users find a category trumps other considerations. At least restore as a redirect. DCDuring TALK 23:11, 13 November 2013 (UTC)[reply]

Redirects for categories don't work like redirects for pages.
How do users find categories besides seeing them in use? If one starts typing into the search bar, whichever category name we use will come up once one gets as far as "Category:en:Qur" (it does not at the moment because the category is too new, but it should within the next 24 hours). - -sche (discuss) 00:17, 14 November 2013 (UTC)[reply]

Let's not overestimate the familiarity of a user with our practices. A new user from another wiki might just look for categories by typing them in the search box, probably "Category:Koran" or "Category:Quran" as like as anything, probably not being aware of our language prefixes. I'd forgotten about the hints under the search box, which, in this case, address the question adequately for someone typing "Category:Quran" AND recognizing Qur'an as an English alternative spelling, but not for someone not aware of "Qur'an" as an alternative and not for someone using alternative spellings beginning with other letters, as "Koran".

I can't imagine that WP doesn't have some (semi-)automatic means for creating the many redirects they have. Even if they don't, we could benefit from having such. We could use it for categories and possibly appendices (based on (non-obsolete, non-archaic, non-rare) alternative spellings with different initial letters) and for the common variants of idioms (varying inflection, determiners, placeholders, pronouns). Some means of detecting how many characters are required to put the desired target or a truly obvious alternative DCDuring TALK 01:36, 14 November 2013 (UTC)[reply]

I think it's OK to have redirects to categories, not just for alternative spellings but possibly for very long category names (as long as there is no conflict and too much ambiguity). --Anatoli ^{(обсудить}/^вклад) 02:34, 14 November 2013 (UTC)[reply]

It's not about whether we want to have redirects to categories. As -sche said, "Redirects for categories don't work like redirects for pages." Even if Category:Quran redirects to Category:Qur'an, a page that puts itself into Category:Quran will NOT appear in Category:Qur'an. --Wiki Tiki 89 02:39, 14 November 2013 (UTC)[reply]

Of course, they won't but as DCDuring suggested, people may look for categories by their names (as they are used to) and find the ones that are actually used here. Restored Category:en:Quran as a redirect to Category:en:Qur'an for now. --Anatoli ^{(обсудить}/^вклад) 03:04, 14 November 2013 (UTC)[reply]

For a simple test of what happens, see Angiosperms, which I have made categorize into Category:mul:Taxonomic names (cladus), which redirects to Category:mul:Taxonomic names (clade). Clicking on the italicized(!) category ("cladus") link at the bottom of the page takes you to the "clade" category. [[Angiosperms]] is not in the "clade" category. If one clicks on the "redirects from" link, one can see [[Angiosperms]] as the sole entry in the "cladus" category.

Thus having redirect categories would mean that we would need to identify entries that were so categorized into the redirected categories and properly categorize them. This could certainly be done by processing the dump from time to time. This would probably be more reliable for single miscategorizations than our current system for monitoring miscategorization: Special:WantedCategories, which is rarely reviewed beyond the first one or two thousand entries. We simply don't know how many wanted categories there are beyond the first 5,000. DCDuring TALK 03:27, 14 November 2013 (UTC)[reply]

You seem to be implying, or presupposing, that we can't identify the wanted categories by processing the dumps; but that's not true. I've done so just now, and will happily e-mail the resulting list to anyone who wants it. There are about 20k; and a great many belong to regular patterns (Category:Language conjunctions and so on) that probably lend themselves to bot-creation. —Ruakh_TALK 09:33, 14 November 2013 (UTC)[reply]

No. I was saying that the default, built-in system (that doesn't require any requests for the time and skills of busy people) is limited. Dump processing can do anything if someone with some skill is willing to devote the time and there is processing power enough (which there usually is). For example, I'm sure that it would be dead easy, though it might take noticeable computer time, to get all capitalized English words not in sentence-initial position used in (English) definitions. (That's on my wish list to compare with headwords from Wikispecies to create a list to use to find unlinked taxonomic names.) DCDuring TALK 14:39, 14 November 2013 (UTC)[reply]

Module:es-noun

Could someone create Module:es-noun, please? For example on entries like civilización it should be able to work out the plural automatically from the string -ón at the end of the word. Mglovesfun (talk) 12:07, 13 November 2013 (UTC)[reply]

Yes please! We talked about this a long time ago but nothing got done. In any case, all the rules are written out neatly for anyone who feels like doing this but doesn't know Spanish: User talk:Ungoliant_MMDCCLXIV/Archive3#Module:es-plural.

(Although Spanish verbs need a module even more... but that will be harder.) —Μετάknowledge^{discuss/deeds} 01:55, 15 November 2013 (UTC)[reply]
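For illustration only, here is a minimal Lua sketch of the kind of ending-based logic requested above. The function name pluralize and the small rule set are assumptions made for this sketch; they are not taken from the rules on the linked talk page and deliberately ignore many cases (words ending in a stressed vowel, unchanged plurals like lunes, other accent shifts). Strings are handled as UTF-8 bytes, so "ón" is matched as three bytes.

```lua
-- Simplified sketch of Spanish plural formation (NOT the full rule set).
-- "\195\179" is the UTF-8 byte sequence for "ó".
local function pluralize(word)
  -- -ón -> -ones (the written accent is dropped), e.g. civilización
  if word:sub(-3) == "\195\179n" then
    return word:sub(1, -4) .. "ones"
  end
  -- -z -> -ces, e.g. luz -> luces
  if word:sub(-1) == "z" then
    return word:sub(1, -2) .. "ces"
  end
  -- plain (unaccented) final vowel -> add -s
  if word:match("[aeiou]$") then
    return word .. "s"
  end
  -- anything else (treated here as a final consonant) -> add -es
  return word .. "es"
end

print(pluralize("civilizaci\195\179n"))
print(pluralize("luz"))
```

A real Module:es-noun would also need a way to override the generated form by hand for irregular nouns.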

Strange new User pages

Between 08:10 and 08:11 this morning, a bunch of new User Pages were created. All very similar, not actually spam, but they seem coordinated. Don't we have something in place to stop them? (p.s. They need to be deleted). SemperBlotto (talk) 08:22, 16 November 2013 (UTC)[reply]

p.p.s. Very similar to User:RosellaBusch who arrived yesterday.
Maybe you could link to some examples? --Wiki Tiki 89 15:04, 16 November 2013 (UTC)[reply]

I also saw five in a row created last night, and deleted and blocked them. They are back because some filters have been disabled. They usually include the (imaginary) user's full name, home town, and hobbies or job, often separated with <br> breaks, but the exact form is not too predictable. Equinox ◑ 15:09, 16 November 2013 (UTC)[reply]

Just a note: some of these pages didn't contain links, and so wouldn't have been stopped by the filter that was disabled. (It, in turn, was disabled for stopping unacceptably many good edits.) - -sche (discuss) 16:01, 16 November 2013 (UTC)[reply]

What caused the outbreak of script errors etc?

Does anyone know what caused the outbreak of script error messages about 10 minutes ago? I couldn't even get to this page. DCDuring TALK 19:16, 16 November 2013 (UTC)[reply]

I don't know what happened, but editing the page seemed to work. On the subject of script errors, what's wrong at Talk:glob? —Μετάknowledge^{discuss/deeds} 19:19, 16 November 2013 (UTC)[reply]

I bet it's related to my absence lately, which means people get to break stuff without control. -- Liliana • 19:34, 16 November 2013 (UTC)[reply]

Ha. Yes, I'm sure there are editors who are normally well-behaved, but as soon as they notice you're slightly less active, BAM! Script errors! —Ruakh_TALK 19:38, 16 November 2013 (UTC)[reply]

Whatever it was, it seems to have fixed itself. I was working on [[case]] at the time and was afraid I had broken something on that page, but it's OK now. —Aɴɢʀ (talk) 19:45, 16 November 2013 (UTC)[reply]

[1]. It has been fixed already. — Ungoliant ^(Falai) 20:20, 16 November 2013 (UTC)[reply]

Yup, I thought I could get away with (in reality: didn't notice I was) removing some commas while Liliana wasn't looking ... never again! - -sche (discuss) 21:14, 16 November 2013 (UTC)[reply]

Is anybody going to own up to it? DCDuring TALK 22:15, 16 November 2013 (UTC)[reply]

I'd try looking at the two comments immediately preceding yours, if I were you. —Μετάknowledge^{discuss/deeds} 22:41, 16 November 2013 (UTC)[reply]

I thought -sche was joking. DCDuring TALK 01:35, 18 November 2013 (UTC)[reply]

Backups

I am curious about the backup strategy for Wiktionary. I assume the Wikimedia Foundation handles this. What do they back up, and how often, and in what format/medium, and where is it stored geographically? Anyone know, or know whom to ask? Equinox ◑ 21:24, 17 November 2013 (UTC)[reply]

Well, the dumps are a form of backup (Ruakh knows a lot about these). But I am not sure if any of the dumps backs everything up. --Wiki Tiki 89 21:30, 17 November 2013 (UTC)[reply]

I guess I've never pulled out the database schema and compared it to the list of dumps to make sure, but my understanding is that the dumps do indeed, taken as a whole, back everything up. (Of course, some of the dumps are "private", meaning that you or I can't get at them. Apparently the Foundation doesn't share LinkedIn's view that user account data should be freely available. :-) —Ruakh_TALK 04:06, 18 November 2013 (UTC)[reply]

I'm pretty sure that, as well as the dumps, there is a log of changes since the last dump. After a disaster the latest dump is restored and the change log is rerun. (I seem to remember it happening a few years ago). SemperBlotto (talk) 08:20, 18 November 2013 (UTC)[reply]

We don't use the dumps as part of our backup strategy, not really; they are produced for researchers, bot operators, folks who want to produce offline readers or set up their own copies, etc. And of course to support the right to fork. But restoring from those would take a very long time and would miss days of edits. What we do have is db replication to slaves in two data centers, in case one data center is hit by a disaster, and database snapshots (via LVM) in case of data or db corruption that is replicated to the slaves before we catch it. -- ArielGlenn (talk) 06:37, 19 November 2013 (UTC)[reply]

automatic categorization

I would like to know how to automatically categorize when I place the tag "qur'anic" before an entry. In entries such as Dhul-Qarnayn, the tag "qur'anic" would be more appropriate than "islam" because this figure is not really a part of any creed, faith or worship. He's simply a Quranic figure and we need such a tag. Pass a Method (talk) 23:31, 17 November 2013 (UTC)[reply]

You would have to get someone to add it to Module:labels/data. I wouldn't do it myself, since a simple typo can cause script errors site-wide.

First you should figure out:

The keyword(s) that would be used in the {{context}} template (there can be "aliases")
What you want to be displayed on the page
The name of the category you want it to add

I would advise against a separate "Qur'anic" category- you already have some people questioning whether Category:Qur'an is redundant to Category:Islam. Chuck Entz (talk) 00:17, 18 November 2013 (UTC)[reply]
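To make the three decisions listed above concrete, here is a hypothetical Lua sketch. The table layout, the field names display and categories, the aliases table, and the resolve function are all invented for illustration; the real Module:labels/data may be organised quite differently, so treat this only as a picture of what needs deciding (keywords, displayed text, category).

```lua
-- Hypothetical label data; NOT the actual structure of Module:labels/data.
local labels = {
  ["Qur'anic"] = {
    display = "Qur'anic",          -- text shown on the page
    categories = { "Qur'an" },     -- topical category to add
  },
}

-- Alternative keywords that map to the canonical label (also hypothetical).
local aliases = {
  ["Quranic"] = "Qur'anic",
  ["qur'anic"] = "Qur'anic",
}

-- Returns the display text and a list of full category names,
-- or nil if the keyword is unknown.
local function resolve(keyword, langcode)
  local canonical = aliases[keyword] or keyword
  local data = labels[canonical]
  if not data then
    return nil
  end
  local cats = {}
  for i, name in ipairs(data.categories) do
    cats[i] = "Category:" .. langcode .. ":" .. name
  end
  return data.display, cats
end

local display, cats = resolve("qur'anic", "en")
print(display, cats[1])
```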

By that logic "biblical" is redundant to "christianity". Pass a Method (talk) 00:30, 18 November 2013 (UTC)[reply]

No, because the Bible is shared by several religions. In the same way, however, the Quran could be adopted by other religions or other religions could split off from Islam and retain the Quran, thereby differentiating Quranic from Islamic. --Wiki Tiki 89 00:48, 18 November 2013 (UTC)[reply]

The Quran is also shared by several religions, such as Baha'is and Druze among others. Pass a Method (talk) 08:11, 18 November 2013 (UTC)[reply]

I was not aware that the Druze share the Quran, but I can believe it. The Baha'i I'm not sure about; where did you find this information? --Wiki Tiki 89 13:20, 18 November 2013 (UTC)[reply]

I meant "Qur'anic" as opposed to "Qur'an". I, personally, don't see any reason to not have a "Qur'an" category. Chuck Entz (talk) 13:43, 18 November 2013 (UTC)[reply]

I don't think anyone was suggesting a Category:Qur'anic. --Wiki Tiki 89 13:49, 18 November 2013 (UTC)[reply]

Reconstruction:Proto-Sino-Tibetan/k-m-raŋ ~ s-raŋ

Can someone please un-italicize the Ainu on this page? I can't find the module that controls this. Ultimateria (talk) 18:17, 18 November 2013 (UTC)[reply]

It's not controlled by a module, or at least, not in the sense that you mean. We just use MediaWiki:Common.css to prevent certain scripts from appearing in italics (even inside HTML <i>). Kana currently isn't covered (because we haven't really been using it: for Japanese we use .Jpan), but we could certainly add it there. —Ruakh_TALK 19:29, 18 November 2013 (UTC)[reply]

I have neutered the italics for Hiragana and Katakana.[2]

I suppose they should both have the same format as .Jpan?

/* Japanese (Hiragana, Katakana, Kanji) */

.Jpan {
  font-family: Hiragino Kaku Gothic Pro, MS PGothic, Arial Unicode MS, Code2000, sans-serif;
  font-size: 110%;
  }

.Jpan, .Jpan * {
  font-style: normal;
  font-weight: normal;
  }

big.Jpan,
strong.Jpan,
b.Jpan,
b .Jpan {
  font-size: 137%;  /* Fonts are really big in Japan */
  }

.Jpan b {
  font-size: 125%;
  }

—Michael Z. 2013-11-19 03:47 z

I think that instead of listing Kana and Hira (and perhaps Hrkt) separately, we should include them in the Jpan selectors, and in the alphabetical positions where Kana and Hira would go, to just have CSS comments saying they're grouped with Jpan. Otherwise it's hard to (remember to) keep them all in sync. —Ruakh_TALK 04:29, 19 November 2013 (UTC)[reply]

Roger. On it. —Michael Z. 2013-11-19 21:32 z

Okay, done that.[3] How does everything look?

Do the bold sizes work consistently? I see that .Jpan has font-size 110% (13px × 110% = 14.3px), b.Jpan has 137% (= 17.81px). I suppose nested .Jpan b would multiply 110% × 125% = 138% (17.875px) – shouldn’t we use precise values? And wouldn’t b .Jpan (nested) multiply 137% × 110% = 151% (19.591px)? As far as I know, Chrome and Safari round to the nearest pixel, but Firefox renders precise font size. —Michael Z. 2013-11-19 21:53 z
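The arithmetic in the comment above can be checked with a short Lua sketch. It assumes, as the comment does, a 13px base size and that nested percentage font sizes simply multiply; the helper names px and round3 are made up for this example.

```lua
local BASE_PX = 13  -- base font size assumed in the comment above

-- Multiply the base size by each percentage factor in turn,
-- the way nested percentage font sizes compound.
local function px(...)
  local size = BASE_PX
  for _, factor in ipairs({ ... }) do
    size = size * factor
  end
  return size
end

-- Round to three decimals to hide floating-point noise.
local function round3(x)
  return math.floor(x * 1000 + 0.5) / 1000
end

print(round3(px(1.10)))        -- .Jpan
print(round3(px(1.37)))        -- b.Jpan
print(round3(px(1.10, 1.25)))  -- .Jpan b  (nested)
print(round3(px(1.37, 1.10)))  -- b .Jpan  (nested)
```

The two nested cases come out about 1.7px apart, which is the inconsistency being asked about.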

frequencies

Any chance to include lemma / word frequencies to entries? It would be helpful for a number of applications, including second language learning. — This comment was unsigned.

Are you interested in English? In contemporary spoken English, contemporary written English (newspaper, textbook, fiction, business?), or older English literature? UK, US, or other? It is not as easy as it might seem to do this, not that it can't be done or shouldn't be done. DCDuring TALK 19:08, 18 November 2013 (UTC)[reply]

Module:languages

This module is way too large to be easy to work with. I can think of two good ways to solve this problem:

Create a submodule for each language code and store its info there.
Create submodules for each first letter of the language code (or some other subset).

These options may or may not cause noticeable slowdown. But I am sure that at least the second option can be made to work fast. --Wiki Tiki 89 05:18, 19 November 2013 (UTC)[reply]

I guess the question is, what factors are we optimizing for? For example, if we want to minimize job-queue disruption, the best approach would probably be to put the stablest and most widely referred-to language-codes in one module that's always checked first, and to split all the other languages in some algorithmically-determined way (such as by first letter as you suggest). That way only a small minority of pages would have to be updated for any given edit.
It may be worth asking the MediaWiki devs to weigh in. (Note that as of the current database dump, Module:languages is transcluded in 3,544,079 pages, which is about 92.6% of the pages on the wiki; so while there's a general principle of not worrying too much about performance, I think this is a clear case where it's worth giving it at least a nod.)
—Ruakh_TALK 06:46, 19 November 2013 (UTC)[reply]

How exactly does the size of the module affect the user experience? Is it downloaded with every page that uses a template that needs language information? How big is what is downloaded? Doesn't it make the page so big that users with older machines (esp. less RAM) suffer bad performance, for example, when paging through a large page?

Why would language information be needed at all for pages that use only our default script? Isn't that a fundamentally bad architecture for the way we handle languages? DCDuring TALK 13:32, 19 November 2013 (UTC)[reply]

The size of the module does not really affect page load speeds of pages that transclude it. I was just referring to the fact that it is difficult to edit the module. Splitting the file up, however, will require considering how much such a change would affect performance. --Wiki Tiki 89 14:54, 19 November 2013 (UTC)[reply]

I think it might help to create 28 submodules: one for the two-letter codes, 26 for the three-letter codes split by first letter, and one for custom Wiktionary codes. This makes it very easy for a module to determine which submodule to look in: you only need to look at the string length and, if it's 3, at the first letter. —CodeCat 14:33, 19 November 2013 (UTC)[reply]
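As a rough sketch of that lookup rule: the function below picks a submodule name from the code's length and first letter. The names data2, data3 and datax are the ones mentioned later in this discussion; the per-letter subpage layout (data3/a, data3/b, ...) and the function name data_module_for are assumptions for illustration.

```lua
-- Decide which data submodule should hold a given language code.
local function data_module_for(code)
  if #code == 2 then
    -- all two-letter codes live together
    return "Module:languages/data2"
  elseif #code == 3 and code:match("^[a-z][a-z][a-z]$") then
    -- three-letter codes are split by first letter
    return "Module:languages/data3/" .. code:sub(1, 1)
  else
    -- everything else is treated as a custom Wiktionary code
    return "Module:languages/datax"
  end
end

print(data_module_for("fr"))
print(data_module_for("cmn"))
print(data_module_for("roa-oca"))
```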

I like CodeCat's idea. --Wiki Tiki 89 14:54, 19 November 2013 (UTC)[reply]

There is a logistical problem with changing the module though. Many modules access the data table directly, so if we change how the data is accessed, all existing modules that use it will break. I do think we can work around this, but it will take some time and thought. We'll need to decide on which way the data is meant to be accessed. We can keep the direct access, but that would mean that every module that uses language data will have to have its own version of the "find out which module the code is in" code. A better way might be to write an accessor function that does this, and require all access to the data modules to go through it. It would be similar to the lookup_language function in Module:language utilities, but meant to be called in Lua. —CodeCat 15:04, 19 November 2013 (UTC)[reply]

Yeah, I'm surprised we didn't already have an accessor function. It shouldn't be too hard to update though. First we add the accessor function, then we change all modules that use Module:languages to use the accessor function, and then (after creating all the submodules) we update the module. --Wiki Tiki 89 15:09, 19 November 2013 (UTC)[reply]

Scratch that. The way the module is designed will require a few more steps. --Wiki Tiki 89 15:11, 19 November 2013 (UTC)[reply]

Would metatables be useful here? I'm not sure how they work. —CodeCat 15:24, 19 November 2013 (UTC)[reply]

Data modules can't have metatables, so while I'm not sure quite what use you're picturing, I think the answer to your question is "probably not". —Ruakh_TALK 15:37, 19 November 2013 (UTC)[reply]

@Wikitiki89: Yes, as I've noted previously, the direct external use of nontrivial data is unfortunate. To fix it, I think the right sequence of steps is:

Copy the contents of Module:languages to Module:languages/data.
Modify all references to Module:languages to instead be references to Module:languages/data.
Change the now-unused Module:languages to be a normal Lua module backed by Module:languages/data.
Modify all code that uses Module:languages/data directly to instead use it via Module:languages.

(Note: the same technical effect can be achieved by putting the accessor function in a different module, and leaving the data proper in Module:languages. That would let us eliminate the first two steps above. But using Module:languages/data would be better in the long term by making it much clearer that it's an internal data-store for use by Module:languages, and better in the short term in that, by emptying out the list of transclusions of Module:languages, we'll be able to build an authoritative list of all modules that were using it, so we can be sure we're migrating all of them to use the accessor function.)

—Ruakh_TALK 15:37, 19 November 2013 (UTC)[reply]

I would like to add an extra intermediate step. Copy the contents to Module:languages/data, but then make Module:languages/data/temp_redirect which just returns Module:languages/data. The reason is that we don't want pages to use Module:languages/data directly, and it can be hard to track those uses down, because we can't distinguish cases that use Module:languages/data via the accessor from those that use it directly. By using an extra redirect, we can keep track. —CodeCat 17:50, 19 November 2013 (UTC)[reply]

How do redirects work for modules and how does it help keep track of them? I already moved the contents of Module:languages to Module:languages/alldata and tried to orphan Module:languages, but I don't know how to tell if I orphaned it completely or not. --Wiki Tiki 89 17:55, 19 November 2013 (UTC)[reply]

It's ok, I think I was just confused. What you did is a "redirect" the way I imagined it. I created Module:languages/data2, Module:languages/data3 and Module:languages/datax now. —CodeCat 17:57, 19 November 2013 (UTC)[reply]

But Module:languages/data3 is not really a module, just a "directory" for submodules. --Wiki Tiki 89 18:00, 19 November 2013 (UTC)[reply]

Ok, it can go then. —CodeCat 18:06, 19 November 2013 (UTC)[reply]

I created all the submodules of Module:languages/data3. --Wiki Tiki 89 18:20, 19 November 2013 (UTC)[reply]

I think splitting all the codes by their first letter is a good compromise between keeping the page small and keeping the data centralized. (I would oppose creating separate modules for every code, especially as long as we still haven't gotten rid of the separate templates for every code.) CodeCat's idea is also good, particularly because "has a two letter code" is a decent proxy for "is stable and widely-transcluded". (The only languages with more than 20 000 entries which don't use two-letter codes are Mandarin and Translingual, with merely 64022 and 45208 entries, respectively. The other languages with 20k entries, which together account for almost 3 million of our 3.5 million entries, have two-letter codes.) - -sche (discuss) 15:20, 19 November 2013 (UTC)[reply]

I'm not sure if two-letter codes necessarily imply stability; the diacritic-removal rules, for example, seem potentially quite subject to tweaking. The good news is, we don't necessarily need any proxy for "is stable and widely-transcluded": the stable and widely-transcluded languages are used in such a large majority of pages that there's little harm in having a Module:languages/data/stable that we always check before falling back to the split-up modules. (I think the problems with #ifexist: have made people a bit gunshy about fallbacks, but in this case I'm talking about first checking a data-module that is most likely already loaded anyway.) —Ruakh_TALK 16:08, 19 November 2013 (UTC)[reply]

What advantage do we gain from separating the stable languages other than ensuring no errors pop up in the stable module? I don't think -sche removes commas frequently enough for it to be an issue. --Wiki Tiki 89 16:14, 19 November 2013 (UTC)[reply]

Job queue. We should be able to edit the family of a minor Papua New Guinean language without forcing the regeneration of 92.6% of the pages on the wiki. —Ruakh_TALK 17:07, 19 November 2013 (UTC)[reply]

That makes sense. I agree now. --Wiki Tiki 89 17:12, 19 November 2013 (UTC)[reply]

But if both the stable and unstable data pages are accessed through Module:languages, wouldn't even a change in the unstable data force regeneration of everything that uses the module? --Wiki Tiki 89 17:21, 19 November 2013 (UTC)[reply]

It depends what you mean by "accessed through". If you use the approach that Darkdadaah describes below, where Module:languages is still just a data module (that assembles all the others), then yes, every module is transcluded on every page. But I meant that Module:languages would have accessor functions that always try the stable data-module first, and only load the unstable data-module if needed. On the vast majority of pages, they won't need to load any unstable data-modules. (Probably we'd have a single local "helper" function in Module:languages that handles the logic of loading data for a given language, and then the other functions in that module would be truly outward-facing, just calling the "helper" function and not needing to worry about where it comes from.) —Ruakh_TALK 22:15, 19 November 2013 (UTC)[reply]
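A minimal sketch of that "stable first, fall back only if needed" helper follows. So that it runs outside Scribunto, load_data is a stand-in for mw.loadData and the data tables are fake; Module:languages/data/stable is the name suggested earlier in this thread, while the per-letter module name, the name field, and the function names are assumptions. The loaded table just records which data modules were touched.

```lua
-- Fake data standing in for separate data modules.
local fake_modules = {
  ["Module:languages/data/stable"] = { fr = { name = "French" } },
  ["Module:languages/data/t"] = { tpi = { name = "Tok Pisin" } },
}

local loaded = {}  -- which modules were actually loaded

-- Stand-in for mw.loadData: returns the table, or nil if there is none.
local function load_data(name)
  loaded[name] = true
  return fake_modules[name]
end

-- Local helper: try the stable module first, and load a split-up
-- module only when the code is not found there.
local function get_language(code)
  local stable = load_data("Module:languages/data/stable")
  if stable[code] then
    return stable[code]
  end
  local rest = load_data("Module:languages/data/" .. code:sub(1, 1))
  return rest and rest[code]
end

print(get_language("fr").name)
print(loaded["Module:languages/data/t"])  -- not loaded for "fr"
```

Because the fallback load sits behind a condition, a page that only mentions stable languages never reaches it, which is what keeps such pages out of the fallback module's transclusion list.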

So if you make a change to an unstable language module, wouldn't that cause every page that transcludes Module:languages to have to be refreshed even if that page itself did not require loading the unstable languages? If that is not the case, then how does it work? --Wiki Tiki 89 23:11, 19 November 2013 (UTC)[reply]

Also, the notion of "stable language" is not as useful anymore. I frequently have to update things like the entry_name property for Arabic, for example. --Wiki Tiki 89 23:17, 19 November 2013 (UTC)[reply]

Re: every page having to be refreshed: Sorry, I don't really understand your confusion, which makes it hard for me to clarify how it works. If a code module has an mw.loadData expression inside an if or else or and or or in such a way that the expression is only reached on certain pages, then the specified data-module will only be loaded on those pages, and the links table will only count it as being transcluded on those pages, so only those pages will need to be refreshed. (Think of the implementation you wrote earlier, where you would load something like Module:languages/alldata/a by doing string manipulation to build the name. Surely you didn't think that this would also cause Module:languages/alldata/b to be loaded, even on pages that didn't use any language-codes starting with b? How would the software even know about Module:languages/alldata/b on such pages?)
Re: stability: Yeah, I wouldn't consider Arabic to be "stable" yet; but how often do we modify French? (Also, "stable" is relative. If we can get the most-widely-transcluded module to be edited only (say) once a month, that's already an enormous improvement over the status quo. And it's also something we can evolve toward; do you think you'll still be frequently editing such basic properties of top-40 languages in, say, six months to a year?)
—Ruakh_TALK 00:03, 20 November 2013 (UTC)[reply]

Ok, I understand how it works now: the modules that a page transcludes are determined from scratch on each page. The next question is how do we determine which languages are transcluded the most? --Wiki Tiki 89 00:09, 20 November 2013 (UTC)[reply]

Re: "how do we determine which languages are transcluded the most": That's a difficult question, and one I wondered about when you first started the discussion. One of -sche's comments in this discussion got me really excited, because I thought (s)he was giving transclusion-counts for major languages, and I was going to ask how (s)he did that, until I realized (s)he was just giving entry-counts for those languages. :-P I think we can get a decent approximation by counting up all pages that use l, t, t+, term, or etyl with a given language code, plus all entries or category-pages that are in the language, minus any double-counting implied by the above. —Ruakh_TALK 00:28, 20 November 2013 (UTC)[reply]

That would probably be a good enough approximation. Another less important question is wouldn't it make it more difficult to edit language tags if you have to first check whether it is in the stable submodule and then whether it is in the default location? And should we duplicate the stable languages or should we remove them from the default location? --Wiki Tiki 89 00:43, 20 November 2013 (UTC)[reply]

I'm afraid that is the case, yes. Dakdada (talk) 17:37, 19 November 2013 (UTC)[reply]

What matters is transclusion. Modules themselves don't have transclusions, because they don't have any content on their pages. So they're not affected by updates to a page that is used by that module. Only content pages are affected. The only changes that would affect Module:languages directly are the changes to Module:languages itself, but those would be rare. —CodeCat 17:46, 19 November 2013 (UTC)[reply]

Does it mean that no update made in Module:languages/alldata would propagate? Dakdada (talk) 18:53, 19 November 2013 (UTC)[reply]

Yes, it would propagate. Look at dictionary, for example: it now transcludes Module:languages/alldata but not Module:languages anymore. —CodeCat 18:57, 19 November 2013 (UTC)[reply]

I think it would eventually after a day or two, but it would not propagate instantly. --Wiki Tiki 89 19:00, 19 November 2013 (UTC)[reply]

Implementation

I've created a sample of how I think this should work at Module:User:Wikitiki89/languages. It has not yet taken into account Ruakh's latest suggestion. --Wiki Tiki 89 15:44, 19 November 2013 (UTC)[reply]

The m3 and mx tables are not needed. Within a single Lua entry-point, modules loaded by require are already cached. (And across Lua entry-points, of course, your own module is discarded, so can't help with the caching of other modules.) —Ruakh_TALK 16:02, 19 November 2013 (UTC)[reply]

Addendum: I guess there are a bunch of kinds of "caching", so to be explicit about it: within a single Lua entry-point, all require's of a module will return the same object. (Only the first require will actually run the module.) Something like

require('Module:foo').bar = 'baz'; return require('Module:foo').bar

will return 'baz'. —Ruakh_TALK 16:14, 19 November 2013 (UTC)[reply]
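The behaviour described here can be modelled outside Scribunto. What follows is a Python model only, not Scribunto code: the EntryPoint class and its fields are invented for illustration. Each entry point keeps its own cache, so repeated require calls return the same object, and a second entry point starts from scratch.

```python
class EntryPoint:
    """Models one Lua entry point (one #invoke) with its own require cache."""

    def __init__(self, sources):
        self.sources = sources  # module name -> function that builds the module
        self.loaded = {}        # cache, discarded when the entry point ends
        self.runs = 0           # how many times a module body actually ran

    def require(self, name):
        if name not in self.loaded:
            self.runs += 1
            self.loaded[name] = self.sources[name]()
        return self.loaded[name]


# Hypothetical module that returns an empty table.
sources = {"Module:foo": lambda: {}}

first = EntryPoint(sources)
first.require("Module:foo")["bar"] = "baz"
same_invoke = first.require("Module:foo").get("bar")   # the cached object is reused

second = EntryPoint(sources)                           # a new entry point starts clean
other_invoke = second.require("Module:foo").get("bar")
```

In this model the write made through the first require is visible through the second one, and nothing carries over to the next entry point.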

Then I guess as long as the call to require itself is not too slow, I'll get rid of them. --Wiki Tiki 89 16:08, 19 November 2013 (UTC)[reply]

Is mw.loadData (caching) usable with this implementation? Otherwise it will be inefficient and slow. Also, I'm afraid that "requiring" a template with something provided by the user is prone to errors (the big red error that we want to avoid), unless we catch the error or control the input perfectly. I believe it would be much simpler to concatenate all the tables into one in the Module:languages: there would not be a lot of work to do and we can control what modules are required/loaded. Speedwise, it should be as fast as the current implementation, and with the use of mw.loadData for loading module:languages, all the work would be done just once per page (this is already very fast). Dakdada (talk) 16:42, 19 November 2013 (UTC)[reply]

I think you're missing the point. We are trying to split up the data into multiple files to make it easier to maintain.

The script error will only occur if the length of the language code is 3 bytes and the first byte is not [a-z]. In that case, it should cause a script error. --Wiki Tiki 89 16:47, 19 November 2013 (UTC)[reply]
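A minimal sketch of the lookup rule, written in Python for illustration. The submodule names data2, data3/<letter> and datax are assumptions based on names mentioned in this discussion, not a copy of the sample module. The error is raised only in the one case described above.

```python
def submodule_for(code):
    """Return the name of the data submodule that would hold `code`."""
    raw = code.encode("utf-8")  # the rule is stated in bytes, not characters
    if len(raw) == 2:
        return "data2"
    if len(raw) == 3:
        first = chr(raw[0])
        if not ("a" <= first <= "z"):
            # Mirrors the deliberate script error described above.
            raise ValueError(f"invalid three-byte language code: {code!r}")
        return f"data3/{first}"
    return "datax"  # everything else (assumed catch-all)
```

Codes of any other length fall through to the catch-all and never raise here; whether the code exists at all is a separate check.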

I don't think Darkdadaah is missing the point. (S)he's saying that Module:languages would be a small module that just loads/combines all of Module:languages/a (etc.), so we'd still have the maintainability benefits of splitting into multiple modules, while preserving the existing interface of Module:languages and the use of mw.loadData. (I'm not sure if I totally agree with that approach, but — as you seem to have realized from your latest edits — (s)he's right that any significant amount of data must be loaded via mw.loadData, not via require.) —Ruakh_TALK 17:04, 19 November 2013 (UTC)[reply]
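To illustrate why this matters, here is a toy Python model, not Scribunto itself; the class and method names are invented. It assumes what is described in this thread: a required module is re-run for every entry point, while data loaded the loadData way is evaluated once per page.

```python
class PageRender:
    """Models one page render that contains several #invoke calls."""

    def __init__(self):
        self.data_cache = {}  # shared by every #invoke on the page
        self.executions = 0   # how many times the big data module was evaluated

    def build_data(self):
        self.executions += 1
        return {"en": "English", "fr": "French"}  # stand-in for the real table

    def invoke_with_require(self):
        # Each #invoke gets a fresh require cache, so the module runs again.
        return self.build_data()

    def invoke_with_load_data(self):
        # The loadData-style cache lives for the whole page render.
        if "languages" not in self.data_cache:
            self.data_cache["languages"] = self.build_data()
        return self.data_cache["languages"]


def count_executions(invokes, use_load_data):
    page = PageRender()
    for _ in range(invokes):
        if use_load_data:
            page.invoke_with_load_data()
        else:
            page.invoke_with_require()
    return page.executions
```

On a page with fifty template calls, the first strategy evaluates the table fifty times and the second once, which is the difference being argued about.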

Yeah, I was not aware of mw.loadData before. --Wiki Tiki 89 17:08, 19 November 2013 (UTC)[reply]

I think your implementation is good. —CodeCa t 17:46, 19 November 2013 (UTC)[reply]

Module:languages/alldata

I have split up all the data and meant to replace Module:languages/alldata with the contents of Module:User:Wikitiki89/languages/alldata, but when previewing pages, I get script errors complaining about mw.loadData not allowing meta-tables. But I have not used any meta-tables and therefore cannot figure out what the problem is. --Wiki Tiki 89 22:27, 19 November 2013 (UTC)[reply]

Turns out that data loaded with mw.loadData will contain meta-tables and therefore cannot be returned by another data module. --Wiki Tiki 89 23:08, 19 November 2013 (UTC)[reply]

STOP. You really need to stop what you're doing, take a pause for breath, and a pause for thought. You are making drastic, untested changes to a module that, as we've recently established, is used on 92.6% of pages. It is a known fact that switching from mw.loadData to require causes a serious performance degradation. (This isn't just speculation on my part; Tim Starling actually created mw.loadData specifically after CodeCat showed him Module:languages.) Please roll back your changes to Module:languages, and do not re-proceed until you have decided what you are actually going to do — after discussion, after dev weigh-in (if they do weigh in: I've left requests at some WP talk-pages), and most importantly, after testing. —Ruakh_TALK 00:13, 20 November 2013 (UTC)[reply]

The module itself is still used with mw.loadData, so that shouldn't make a difference in performance. You may be right that I am moving too fast, but I have not really changed anything yet except for where the data is stored. --Wiki Tiki 89 00:19, 20 November 2013 (UTC)[reply]

O.K., yes, you're right: since nothing besides Module:languages is using Module:languages/alldata yet, and Module:languages is loaded at most once per page, Module:languages/alldata is also going to be loaded at most once per page. *whew* Sorry if I overreacted. (But yeah, I do still feel that you're moving too fast.) —Ruakh_TALK 00:22, 20 November 2013 (UTC)[reply]

In the current setup, things will be loaded at most twice. Once via /alldata, and once via the "new" location. —CodeCa t 00:59, 20 November 2013 (UTC)[reply]

But since nothing uses the "new" location except /alldata, they can only be loaded once. --Wiki Tiki 89 01:04, 20 November 2013 (UTC)[reply]

For now, yes. But even when we do switch over, it will only be twice, that shouldn't be too bad. —CodeCa t 01:26, 20 November 2013 (UTC)[reply]

During the switch, yes. But after we switch over, /alldata won't be used anymore and it will once again be only once. --Wiki Tiki 89 01:40, 20 November 2013 (UTC)[reply]

Merge Module:language utilities into Module:languages?

If Module:languages will soon not be a data module anymore, then we could consider merging these two. Do we want to do that? —CodeCa t 16:38, 20 November 2013 (UTC)[reply]

I would support that in theory, but I don't know how difficult it would be and whether it would be worth it. --Wiki Tiki 89 16:46, 20 November 2013 (UTC)[reply]

From Anomie

From Anomie (talk • contribs), at his/her Wikipedia talk-page:

> It seems like you all have it well in hand there: putting "stable" widely-used languages in one module and the "unstable" less-widely used languages in another (or multiple others), so editing one of the unstable language's modules doesn't add 92% of the pages on the wiki to the job queue, is a great idea. Feel free to ping me if you want code review of a prototype implementation, or if you get stuck on something and need suggestions. Anomie ⚔ 02:30, 20 November 2013 (UTC)[reply]

[link]

—Ruakh_TALK 21:04, 20 November 2013 (UTC)[reply]

If we only edit the unstable module, how can Mediawiki know that it doesn't have to update all the pages? The ~~"stable"/"unstable"~~ data modules are never used directly and are called by the same templates/modules. Dakdada (talk) 09:53, 21 November 2013 (UTC)[reply]

This was discussed above (see Ruakh's comment starting with "Re: every page having to be refreshed"). --Wiki Tiki 89 13:44, 21 November 2013 (UTC)[reply]

What are the real effects of this? I remember a time when we kept some changes that I can’t remember in templates for 30 days because there was a rumour about the CSS being cached a month for some readers. Then we forgot all about it and I have never heard a single complaint. —Michael Z. 2013-11-22 21:07 z

CSS caching is client-side and completely different from MediaWiki's page caching, which is server-side. --Wiki Tiki 89 21:16, 22 November 2013 (UTC)[reply]

The episode I refer to was about WikiMedia’s proxy servers or something. I know about MW page caching and clearing it with null edits. The question is, what would be the nature and scope of the real breakage that readers would see because of the Lua data being cached? Can we test and measure the effect with a harmless change? —Michael Z. 2013-11-22 21:24 z

Lua stuff wouldn't be affected by proxy servers either. A proxy server is basically a client that re-serves data. Anything cached on the proxy server would already have had its Lua evaluated on the server. CSS is evaluated by the client, which is why it could be out of sync with the page cache. Lua cannot get out of sync. --Wiki Tiki 89 21:38, 22 November 2013 (UTC)[reply]

Yeah, I know. But my point is that in the past we thought that some wide-ranging effect would degrade the experience of Wiktionary for millions. But it didn’t. —Michael Z. 2013-11-23 04:05 z

It didn't because you took the right precaution. Either way, that issue is irrelevant to this one. The only cache related issue here is when you update a Lua module, every page that transcludes it will have to be refreshed in the cache; the more there are, the longer it takes to refresh them all. In this particular module, that means we should try to ensure that the most frequent changes won't have to refresh the entirety of Wiktionary, especially when the change does not apply to most of the pages. --Wiki Tiki 89 04:21, 23 November 2013 (UTC)[reply]
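A small sketch of the refresh rule just described, in Python with invented page and module names. It assumes the software records, per page, which modules that page loaded, and queues a refresh only for pages that loaded the edited one.

```python
def pages_to_refresh(transclusions, edited_module):
    """Return the pages that must be re-rendered after `edited_module` changes.

    transclusions: dict mapping a page title to the set of modules it loaded.
    """
    return {page for page, modules in transclusions.items() if edited_module in modules}


# Invented example: every page loads the stable data, few load the unstable data.
transclusions = {
    "dictionary": {"languages/stable"},
    "water": {"languages/stable"},
    "feminine": {"languages/stable"},
    "obscure entry": {"languages/stable", "languages/unstable"},
}
```

Editing the unstable module here touches one page, while editing the stable one touches all four, which is the reason for keeping frequent edits out of the widely used module.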

I think it didn’t degrade Wiktionary because the problem never existed, had been only fabricated by speculation. I don’t remember anyone announcing that something had been fixed. It just went out of our attention and we eventually forgot about it. We stopped taking any precaution, and nothing changed because of it. (If someone remembers better, please correct me.)

I am also reminded of the fanatical avoidance of transcluding templates within templates, until a developer told someone to stop worrying about it, because it is designed to work the way it works.

So my question is, what real evidence is there of a problem, and what is the nature and magnitude of it? E.g., how long does it actually take to refresh, and what is wrong before it finishes? —Michael Z. 2013-11-23 16:33 z

Note: sorry, I didn't realize an announcement was necessary, but yes, the problems with Squid caching for CSS were addressed a while back — maybe about two years ago? And client-side caching was also mostly addressed, by including a timestamp in the URL, though there are some limitations. In particular, the 'importScript' stuff doesn't have enough information to do that, so JS included by that mechanism is still cached. —Ruakh_TALK 17:15, 23 November 2013 (UTC)[reply]

Thanks. It’s just that I was never able to experience most of those problems, and they were kind of a folklore. Certainly never had to wait 30 days for anything to update. We were acting on faith in humouring some of the suggestions, and I only have a vague idea that it had gone away. —Michael Z. 2013-11-23 21:03 z

I'm not 100% sure, but I believe that 30 days would only ever have been due to client-side caching (potentially including, especially for logged-out users, caching by transparent ISP proxies), never Squid caching. If someone said or implied otherwise, I'm guessing that they were confused about the differences between the two. (And I know what you mean. There are a lot of different kinds of "caching" — client/proxy-side, Squid-side, server-side Memcached, pre-computed values in the DB, and so on — all of which have different characteristics, and I don't think anyone on Wiktionary really understands them all very well, and they're very hard to test and reproduce, so folklore develops, especially as people repeat theories without pointing to the (often old and obsolete) evidence that originally led to those theories. Performance is alchemy sometimes. Fortunately the good folks at WMF seem to have made a lot of performance improvements in the platform.) —Ruakh_TALK 21:23, 23 November 2013 (UTC)[reply]

As for how long: depending on how backed up the job queue is, it could be instant or could take a few days. Or a page might never be refreshed, and we would have to null-edit everything that the change applies to. The symptoms are that any changes in the module will not be reflected on a page that has not been refreshed. For example, if a language's name was changed, the old one will still be displayed (until the page is refreshed either by the job queue or a null edit). --Wiki Tiki 89 16:38, 23 November 2013 (UTC)[reply]

I don't know how the queue works exactly, though I've read the documentation, but the queue has been in the vicinity of 30,000 jobs for at least a week. On November 15, I added categorization to a template with about 7K transclusions (300th on the list). I think that is about 14 jobs, assuming each entry-category update is one "task". Items are still being added to the category. This seems slower than the rate at which such things are added when the queue is at levels below 100. No wonder CodeCat has to null-edit in order to see the results of changes to templates and modules. I would imagine that bots would run slower, but not if their priority was the same as that of a normal editor. DCDuring TALK 16:54, 23 November 2013 (UTC)[reply]
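The "about 14 jobs" estimate can be reproduced with one line of arithmetic, assuming 500 page updates ("tasks") per job, which is the figure guessed at elsewhere in this discussion rather than a confirmed setting. The function name is invented.

```python
import math


def jobs_for(transclusions, tasks_per_job=500):
    """Number of jobs needed if each job refreshes up to `tasks_per_job` pages."""
    return math.ceil(transclusions / tasks_per_job)


estimate = jobs_for(7000)
```

If the batch size is different on the live site, only the default argument needs to change.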

Well, a language being shown under its previously used name for a few days does not destroy Wiktionary. I can live with it. Whole disciplines somehow keep existing with multiple historical names for things in use. What other scenarios can result? Does it get worse?

How long is “a few days?” What is the longest actual period of non-updatedness recorded? Is “it might never be refreshed” actually proven, or speculative? —Michael Z. 2013-11-23 17:03 z

I find your reaction here frankly bizarre. A long job queue has obvious downsides. Currently, if I update the name of a language that doesn't appear even once on Wiktionary, that interferes with others' ability to populate cleanup categories for quick tasks, with their ability to make template changes that require concurrent manual effort (like moving template-populated categories), etc., etc., etc. I don't think anyone is saying "OMG Wiktionary is assploding", but you don't seem to have any reason for opposing the proposal? No matter how small you think the advantages are, they're obviously bigger than the disadvantages you've presented, because you've studiously avoided presenting any. —Ruakh_TALK 17:27, 23 November 2013 (UTC)[reply]

I am neither for nor against the details. I just want to understand the problem and its effects. —Michael Z. 2013-11-23 21:03 z

If you want to better understand how backed up the job queue can get, check out how many pages still transclude Module:languages despite the fact that Module:languages was orphaned on November 19. --Wiki Tiki 89 20:34, 26 November 2013 (UTC)[reply]

Ouch. Fun to refresh the list and see it change every time. —Michael Z. 2013-11-26 23:23 z

@ Wikitiki: How do you make inferences based on Special:WhatLinksHere/Module:languages? Do you page through 500 links at a time and count how many times it took to get to the end? It really seems quite useless for something like Module:languages.

I also notice that we do not have a special page that has "most-linked-to modules". Shouldn't we? It would tell us something about the rate at which the queue is updating language related stuff by looking at the difference between totals every three days and/or it might disclose something about the modules not working as expected. The corresponding templates coulda/shoulda been a warning of overdependence on {{head}}. I suppose flying blind by choice is braver than flying blind by necessity. DCDuring TALK 00:16, 27 November 2013 (UTC)[reply]

It would help if "WhatLinksHere" gave a count, because currently I don't know of any way to tell. Paging through shows that there is still a good amount even after a week. --Wiki Tiki 89 01:37, 27 November 2013 (UTC)[reply]

'What links here' is fairly useful if the number is not too large, i.e., less than 5K. It gets very tedious trying to count the pages one has to click through to get to the end to get a count on demand. The count of pages linked for Module:languages must be on the order of a million. The run for high-use pages (the 5,000 most-linked-to pages/templates/categories/files) is only every 3 days, but it allows one to see a trend over time if one records the relevant information, and to compare total links with what one thinks ought to be linked. But the pages do not include modules, AFAICT, even though the raw data seems to be there. Doesn't this merit a bug report? DCDuring TALK 02:23, 27 November 2013 (UTC)[reply]
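One way to get a count without clicking through the pages by hand is the MediaWiki web API's embeddedin list, which reports transclusions and supports continuation. The sketch below is untested against the live site and rests on some assumptions: the function names and the User-Agent string are invented, and for something transcluded on the order of a million pages it would need a very large number of requests.

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wiktionary.org/w/api.php"


def fetch_json(params):
    """Perform one API request and decode the JSON reply."""
    url = API + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "transclusion-count-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_transclusions(title, fetch=fetch_json):
    """Count the pages that transclude `title`, following API continuation."""
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": title,
        "eilimit": "max",
        "format": "json",
    }
    total = 0
    while True:
        reply = fetch(params)
        total += len(reply["query"]["embeddedin"])
        if "continue" not in reply:
            return total
        # Carry the continuation tokens forward into the next request.
        params = {**params, **reply["continue"]}
```

Calling count_transclusions("Module:languages") would perform the live requests. The fetch parameter exists so the paging logic can be exercised with canned replies first.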

I guess it would merit a bug report. But what do you mean by "the raw data seems to be there", where does it seem to be? --Wiki Tiki 89 02:30, 27 November 2013 (UTC)[reply]

If you're asking why 'Special:WhatLinksHere' doesn't include other modules that invoke a given module, it's because the module page itself doesn't include that other module. The same behavior applies to templates: if a template contains <noinclude>[[foo]]</noinclude><includeonly>[[bar]]</includeonly>, then the template itself will show up only at Special:WhatLinksHere/foo, not at Special:WhatLinksHere/bar. (With templates, frequently any transcluded templates will still appear on the template-page itself, either because we didn't use <includeonly> or because they appear in the documentation. With modules, that's still a possibility — transclusions on the documentation subpage do count — but it's less common, because we haven't been including that kind of demonstration in our module documentation.) —Ruakh_TALK 04:39, 27 November 2013 (UTC)[reply]

Let me ask the questions directly, without relying on my wholly inadequate grasp of include/includeonly/onlyinclude/noinclude:

1. Do we get information that is in principle useful from having the module namespace with its own 'most-linked-to pages' listing available every three days? How much interpretation of that page is needed to make it useful? Is the information really only reliable for templates that are directly and only transcluded in entry pages?
2. Given that we, or at least CodeCat, apparently don't care that we have templates that are transcluded on virtually every mainspace page, whatever the consequences, is there any practical value to spending MW time getting a special most-linked-to pages for modules?
3. Would we achieve more useful results in this by doing our own dump analysis? (I'm not volunteering.)

As always, please let me know if I am missing the point. DCDuring TALK 04:59, 27 November 2013 (UTC)[reply]

Sorry, it is probably I who was missing the point. Were you referring to Special:MostLinkedTemplates, and how it (apparently) doesn't include modules? If so, then — I agree, either it should include modules, or a separate page should exist for that purpose. And — yes, Bugzilla is the way to make that request. —Ruakh_TALK 05:06, 27 November 2013 (UTC)[reply]

As I was asking, I started to wonder what the point of it would be, as the most-linked-to templates page seemed more useful for measuring CodeCat's progress toward transcluding {{head}} on every page than for warning about dependence on a simple template. DCDuring TALK 06:04, 27 November 2013 (UTC)[reply]

The job queue remains above 30,000. Have more "one-time" changes been made or is this the new normal? DCDuring TALK 19:22, 27 November 2013 (UTC)[reply]
Yes, there are constantly language mergers coming in from WT:RFM. Meanwhile, as you can see here, there are still a lot of pages that transclude the temporarily orphaned Module:languages. And I am afraid to make further changes to Module:languages before I am 100% sure it is orphaned, even though these changes will make language mergers affect fewer pages. --Wiki Tiki 89 22:16, 27 November 2013 (UTC)[reply]
Current status of the job queue: jobs="2958371"

Up to the minute status here.

This is 100~~ten~~ times the level we had been experiencing that caused category updates to be very slow. DCDuring TALK 11:53, 11 December 2013 (UTC)[reply]

It's the first time I've seen such a number. 30,000 is acceptable, but 3 million? That's, like, the total number of pages of Wiktionary! (Well, not quite, but close.) In comparison, the French Wiktionary has 658 jobs and the English Wikipedia 73,965. Dakdada (talk) 17:22, 11 December 2013 (UTC) NB: The French Wiktionary also has a Module list of language data, regularly updated. Dakdada (talk) 17:24, 11 December 2013 (UTC)[reply]

I don't think that folks have any certain knowledge of what this means. I fear we would have to petition MW to shut down Wiktionary and reprioritize to allow all the jobs (500 "tasks" to a "job", AFAICT) to complete. Otherwise it may take many months to clear it out. I have the suspicion that each edit made to the language modules creates a task for every entry. There are lots of "little" edits, each of which might generate 3,000,000 tasks. I don't think undoing helps. It might just add an equal number of jobs. I can only hope that the queue doesn't really matter, except for things like the listings of members of categories. I guess we'll find out. DCDuring TALK 19:40, 11 December 2013 (UTC)[reply]

One thing that can be done is to make edits in batch, not one at a time (e.g. once every week). Moreover, this would allow a kind of review before adding/modifying things. Dakdada (talk) 09:26, 12 December 2013 (UTC)[reply]

The need would be to batch those for the widely transcluded templates (e.g., {{t}}, {{l}}, {{head}}, {{context}}, and perhaps some like {{en-noun}}) and any module that is used by such templates (e.g., language). The need may be obvious, but those involved seem quite uninterested in testing, review, and any restraint whatsoever, beyond what they can muster from themselves.

But we don't even seem to have a certain understanding of exactly what goes into the queue, nor even of what triggers removal. For example, why is the queue down to 1.3MM today from nearly 3.0MM a day ago? Was this the result of normal operation or was it flushed? Are changes that do not involve writing new HTML "cheaper" and therefore faster to complete than those that do? DCDuring TALK 13:31, 12 December 2013 (UTC)[reply]

The job queue is supposed to be processed every time there is a page query, i.e. every time a page is visited, which is quite often (one page viewed = one page processed from the job queue). So it can be quite fast. Dakdada (talk) 18:15, 12 December 2013 (UTC)[reply]

How many requests does Wiktionary have in a day or annually? [It looks like we have something on the order of 100-150K requests per hour.] I wonder whether the 3MM was an estimate, which was subsequently revised to a lower number. DCDuring TALK 19:13, 12 December 2013 (UTC)[reply]

It's down to 1 million in only a day. So it is processed quite fast (2 million per day is possible, apparently). That tells me that a big job queue really isn't a cause for concern. The worst it does is delay category/transclusion updates, but those are not a high priority for a functioning Wiktionary anyway. "Restraint" is only relevant if we have to choose between equal sacrifices, which is clearly not the case here as one sacrifice is both minor and temporary in nature. —CodeCa t 20:59, 12 December 2013 (UTC)[reply]
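A back-of-envelope check of these figures, taking at face value the description above that one page view processes one job; Wikimedia sites may well run jobs differently. The queue length and request rates are the ones quoted in this thread, and the function name is invented.

```python
def days_to_clear(queue_length, requests_per_hour):
    """Days needed to empty the queue if each page view processes one job."""
    jobs_per_day = requests_per_hour * 24
    return queue_length / jobs_per_day


# Queue length quoted above, at the low and high request-rate estimates.
for rate in (100_000, 150_000):
    print(rate, round(days_to_clear(2_958_371, rate), 2))
```

Under that assumption the whole queue would drain in roughly a day, which is at least the same order of magnitude as the drop reported here.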

"1 million" in the sense of 1.36 million. Let's see where it is tomorrow. I guess your bet would be about 0-300K. My bet is 1.0-1.3 MM. DCDuring TALK 21:18, 12 December 2013 (UTC)[reply]

I was wrong. ~~It's still at 1.36MM. That is, no change. Evidently our understanding of how the queue is emptied is not right (Is the documentation accurate?) or our count of page requests is wrong for this purpose.~~ DCDuring TALK 13:55, 13 December 2013 (UTC)[reply]

That's strange. It shows 78 thousand for me. —CodeCa t 14:00, 13 December 2013 (UTC)[reply]

Yep, 78688 for me. --Wiki Tiki 89 14:03, 13 December 2013 (UTC)[reply]

Yep. Wrong again. Please don't ask why. Sigh. Why are my template-change-caused category changes dribbling in so slowly? I suppose the job sequence must be random. DCDuring TALK 14:40, 13 December 2013 (UTC)[reply]

WTH is the job queue anyway? Mglovesfun (talk) 13:40, 12 December 2013 (UTC)[reply]

See this at MediaWiki. It's been rewritten a bit and is easier to understand than it has been. DCDuring TALK 14:00, 12 December 2013 (UTC)[reply]

Items are still being added to a category resulting from changes made to a template of November 15, 2013. This seems long. DCDuring TALK 18:32, 12 December 2013 (UTC)[reply]

My suggestion

Drop the Lua thing and return to templates. There are many things Lua is good for, but this isn't one of them.

Sounds like a workable idea, no? -- Liliana • 21:52, 23 November 2013 (UTC)[reply]

If it were only language names, you'd be right. But this module contains a lot of data that is very useful and can only work as a module. --Wiki Tiki 89 21:55, 23 November 2013 (UTC)[reply]

So? You can make a template return different data depending on the parameter passed. -- Liliana • 21:59, 23 November 2013 (UTC)[reply]

Only text data, not more complex structures such as the entry_name and sort_key replacement rules. Also, modules can't easily call templates, which I think is a major flaw but there's nothing I can do about it in the short run. --Wiki Tiki 89 22:04, 23 November 2013 (UTC)[reply]

You'd have to tell me what entry_name is, but sort_key is something that could stay as a Lua template. I doubt we will get sort keys for all 7,000 languages anytime soon... -- Liliana • 22:08, 23 November 2013 (UTC)[reply]

entry_name is like sort_key only it gives you the name of the page to link to rather than the sort key (by removing diacritics and such). Anyway, many of the more simple features such as alternate language names would suffer a great performance hit if they were changed to templates. --Wiki Tiki 89 22:12, 23 November 2013 (UTC)[reply]
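A toy illustration of the distinction, in Python. The real rules are per-language replacement tables stored in the module data; the blanket diacritic stripping and the case-folded sort key below are assumptions made only to keep the example short.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


def entry_name(display_form):
    """Page title to link to: the display form without its diacritics."""
    return strip_diacritics(display_form)


def sort_key(display_form):
    """Category sort key: here simply the entry name, case-folded (an assumption)."""
    return entry_name(display_form).casefold()
```

For a Latin form written with a macron, entry_name gives the plain spelling used as the page title, while sort_key is what a category would order by.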

How often, if at all, is the alternate name feature used? -- Liliana • 22:13, 23 November 2013 (UTC)[reply]

I'm not sure exactly, but it is useful for reverse lookups. Especially with Kephir's xte javascript tool, you can look up any of the alternate names and it gives you the language code (but I don't count that as a real use). Anyway, I don't see what problem you're trying to solve, because now that we have split up the language data into manageable chunks, it is much easier to use and make changes than it was with the template pattern. --Wiki Tiki 89 22:31, 23 November 2013 (UTC)[reply]

Is it? I find Lua unusable because the special edit window breaks the search function of my browser. And how often do you ever edit multiple languages at once? -- Liliana • 22:32, 23 November 2013 (UTC)[reply]

I agree with you that the built-in Lua editor is horrible, but that is only a small problem. I'm sure there is a way to disable it with JavaScript anyway. As for editing multiple languages at once, I've seen -sche doing exactly that quite a bit recently. --Wiki Tiki 89 22:37, 23 November 2013 (UTC)[reply]

You can disable codeEditor by clicking on the square button on the top left of the editbox normally, can't you? — Automatik (talk) 06:54, 24 November 2013 (UTC)[reply]

You're right, thanks for the tip! --Wiki Tiki 89 07:05, 24 November 2013 (UTC)[reply]

Awesome! :-D —Ruakh_TALK 08:12, 24 November 2013 (UTC)[reply]

My gut reaction is to say "no, keep the data centralised, don't scatter it back into templates". However, centralisation does seem to have had the effect that hundreds of thousands of pages need to update (and we see how long it takes for that to happen — the job queue is still over 30k and countless pages still haven't caught up to the disuse of the main Module:languages) every time we update the family or script info of a lect we don't even have entries in (and we know how often we do that — at least once a week)... frankly, maybe Liliana's right. (We could put each language into its own module or module subpage, but at that point, we might as well go back to the templates and regain the minor but useful ability to subst the templates to produce language names.) - -sche (discuss) 22:20, 27 November 2013 (UTC)[reply]

Does anyone even use the substitution feature? I thought that was in days long gone. -- Liliana • 20:22, 1 December 2013 (UTC)[reply]

The accelerated creation gadget that automagically produces a first draft of plural forms in English uses subst: AND a module: "{{subst:#invoke:language utilities|lookup_language|en|names}}". DCDuring TALK 22:32, 1 December 2013 (UTC)[reply]

It doesn't need to, though. If needed, we could work around that very easily so that it just puts the language name in directly. In fact, I think even if there is no need, it would still be cleaner to see "English" instead of a substed module call in the draft text. —CodeCa t 22:36, 1 December 2013 (UTC)[reply]

I just looked at the list of templates transcluded by Category:English terms derived from Occitan. It has a trivial amount of header context all of which is provided through the operations of {{derivcatboiler|en|oc}}. It seems nevertheless to transclude many language modules: all 26 of the data3 modules, the data2 module, and the alldata module.
1. Is this a temporary state of affairs?
2. What useful services do the modules actually perform for a page like this?
3. If no useful services are performed, is this an artifact of something like the old template ifexist tests? Weren't we supposed to be getting rid of those when we got to the promised land of Lua/Scribunto?
Hope-giving responses preferred, but truth is acceptable. DCDuring TALK 19:45, 1 December 2013 (UTC)[reply]

Yes, that's a temporary state of affairs: any page that used to include Module:languages currently includes all the individual language modules, because we haven't finished the refactoring. —Ruakh_TALK 20:34, 1 December 2013 (UTC)[reply]

Module:languages is orphaned

There are no more transclusions, they've all been moved over to the temporary Module:languages/alldata. There are many pages that link to it but they're just links, not transclusions, so they aren't an issue. We're free to make any changes now. —CodeCa t 02:58, 3 December 2013 (UTC)[reply]

That's what I suspected, but how did you verify that? --Wiki Tiki 89 03:11, 3 December 2013 (UTC)[reply]

At the top of Special:WhatLinksHere/Module:languages, click on the 'Hide links' link. It will take you to https://en.wiktionary.org/w/index.php?title=Special:WhatLinksHere/Module:languages&hidelinks=1, which shows all transclusions and redirects (though it's not a great example, since it's empty; see https://en.wiktionary.org/w/index.php?title=Special:WhatLinksHere/Module:languages/alldata&hidelinks=1 for what it would look like if there were still transclusions). —Ruakh_TALK 04:27, 3 December 2013 (UTC)[reply]

Oh, I misunderstood. I thought CodeCat meant there were still phantom transclusions left. --Wiki Tiki 89 04:37, 3 December 2013 (UTC)[reply]

So I copied the accessor function to Module:languages. The next step will be to fix the implementation of Module:language utilities to use Module:languages and possibly even move everything from Module:language utilities to Module:languages if we decide that it's worth the effort. It would be nice to also have a list of all modules that directly call Module:languages/alldata. I think that we will not need to completely orphan Module:languages/alldata because some modules, such as Module:JSON data are better off using all the data at once. --Wiki Tiki 89 04:50, 3 December 2013 (UTC)[reply]

Template:enm-proper noun

Can someone please fix this template to not use {{Latinx}}? Ultimateria (talk) 07:05, 19 November 2013 (UTC)[reply]

If you just blank out the template, it won't be using {{Latinx}} anymore. :-P
O.K., so, seriously: what are you requesting? What's broken about using {{Latinx}}? What would you like to see instead?
—Ruakh_TALK 07:08, 19 November 2013 (UTC)[reply]

For Middle English, {{Latinx}} is apparently needed to make sure all browsers render Ȝ and ȝ correctly. —Aɴɢʀ (talk) 21:32, 19 November 2013 (UTC)[reply]

I thought we were trying to deprecate script templates for the same reason as language code templates. In any case, I've been working on entries that call them directly because they tend to be horridly formatted. Is there a better way to allow the template's script to be Latinx without using the template itself? Ultimateria (talk) 01:26, 20 November 2013 (UTC)[reply]

I thought <strong lang="enm"></strong> was the most modern way of doing it. Mglovesfun (talk) 10:35, 20 November 2013 (UTC)[reply]

Proposal for a new module policy of prohibiting top level data modules

(Perhaps this should be at the Beer parlour, I'm not sure which is better.)

I hereby propose that we prohibit all top-level data modules. All data modules should be in a submodule of a top-level accessor module and should not be directly accessed except by its own parent. This will ensure that all modules can be safely loaded with require, restricting mw.loadData to loading a module's own data submodules. This is similar in concept to making data modules "private", but I don't know of a way to enforce this privacy except by convention, which will be dictated by this policy.

We would not have to implement this right away. We could start by only enforcing this on newly created modules and slowly migrating our existing data modules to fit this pattern.

--Wiki Tiki 89 17:48, 20 November 2013 (UTC)[reply]
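The accessor pattern proposed above can be illustrated outside of Scribunto. The following is a minimal sketch in Python rather than Lua, assuming a made-up data table and a hypothetical accessor name (`get_language`); it only shows the shape of the convention, where the data is treated as private and everything else goes through the parent's functions.

```python
# Hypothetical stand-in for a data submodule (for example a "/data" subpage).
# By convention, only the accessor below reads this table directly.
_LANGUAGE_DATA = {
    "en": {"names": ["English"], "scripts": ["Latn"]},
    "enm": {"names": ["Middle English"], "scripts": ["Latinx"]},
}


def get_language(code):
    """Return a copy of the data for one language code, or None if unknown."""
    entry = _LANGUAGE_DATA.get(code)
    if entry is None:
        return None
    # Copy the lists so that callers cannot mutate the shared table.
    return {key: list(value) for key, value in entry.items()}
```

As in the proposal, nothing technically stops other code from reading `_LANGUAGE_DATA`; the leading underscore is only a convention, which matches the point that the privacy can be enforced by policy alone.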

I don't think a policy is needed nor desired. I do think it's a good idea, but making it a hard rule is the wrong approach. —CodeCa t 18:39, 20 November 2013 (UTC)[reply]

I agree (w/CodeCat). —Ruakh_TALK 19:57, 20 November 2013 (UTC)[reply]

Would an unofficial policy work better? --Wiki Tiki 89 19:59, 20 November 2013 (UTC)[reply]

You mean a guideline? I would agree with that. It would be nice to codify such informal conventions somewhere. We have WT:Templates and WT:Scribunto but neither of them really do that so well. —CodeCa t 20:04, 20 November 2013 (UTC)[reply]

I think "guideline" is too lax of a term. It should be something all modules "should" do. We can create WT:Module coding conventions for these kinds of things. --Wiki Tiki 89 20:26, 20 November 2013 (UTC)[reply]

Could we "require" that departure from the guideline be documented? Is anything like this "enforceable" in any practical way? Is it just moral suasion? DCDuring TALK 22:19, 22 November 2013 (UTC)[reply]

Is anything on Wiktionary enforceable in a practical way? In this case I think that for enforcement all we need to do is document and eventually fix all exceptions (I use eventually in the same sense as in "Wiktionary will eventually be finished."). --Wiki Tiki 89 22:31, 22 November 2013 (UTC)[reply]

Reversion and blocking work to enforce standards for entries. Reversion worked for templates. I don't know that we risked blocking someone with undocumented templates (Lack of documentation equals job security?). DCDuring TALK 23:12, 22 November 2013 (UTC)[reply]

Bot getting HTTP Error 503: Service Unavailable

My bot keeps getting this error from time to time. Sometimes a lot, other times less frequently. I don't know what's causing it, but it is rather annoying. It's not even making any changes, they're just null edits. Is anyone else having this? —CodeCa t 15:19, 21 November 2013 (UTC)[reply]

I'm getting 503s occasionally right now on my normal account, so I don't think it has anything to do with the fact that it's a bot. --Wiki Tiki 89 15:31, 21 November 2013 (UTC)[reply]

You are not the only one (there is a recent feedback on the subject). The wiki is just getting slower and slower. Any ideas why that might be? SemperBlotto (talk) 15:33, 21 November 2013 (UTC)[reply]

Someone on feedback just complained about the same thing. —Stephen ^(Talk) 15:34, 21 November 2013 (UTC)[reply]

It just might have something to do with there being 27,000 items in the job queue. There were 29,000 about 12 hours ago. This usually results from editing widely transcluded templates, in my non-expert understanding. Why not ask one of our technology mavens? DCDuring TALK 15:57, 21 November 2013 (UTC)[reply]

Then that must be because of the changes to Module:languages. We are currently working on updating it in a way that would let us make changes to language data without putting all of Wiktionary in the job queue (see #Module:languages above). --Wiki Tiki 89 16:06, 21 November 2013 (UTC)[reply]

As I understand it, the job queue reflects the number of lower-priority tasks that are to be done, while more valued tasks are given priority. I assume that views by normal users are given priority. I hope that "small" contributions by individual users to entries are also given priority. My attempts to load large lists into my user pages have not been successful while the queue has been this large, so that kind of thing may be lower in priority as well. DCDuring TALK 17:00, 21 November 2013 (UTC)[reply]

I think all edits are given the same priority, just smaller edits take less processor time and so are more likely to succeed. Page views take very little processing time and are the last things to fail. --Wiki Tiki 89 17:08, 21 November 2013 (UTC)[reply]

This MW page has more about the specifics, but doesn't mention any kind of priorities nor does it make clear what kind of operations are adversely affected by a large queue. Probably, then, it is as you say, at least to a first approximation. DCDuring TALK 17:17, 21 November 2013 (UTC)[reply]
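Whatever the underlying cause turns out to be, a bot can be made more tolerant of intermittent 503 responses by retrying with an increasing delay. The sketch below is only an illustration of that general technique: the `ServiceUnavailable` exception and the `request` callable are hypothetical placeholders for whatever the bot framework actually raises and calls.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical error that a bot's request function raises on HTTP 503."""


def with_retries(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request(); on ServiceUnavailable, wait and retry, doubling the delay."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without actually waiting, and backing off rather than retrying immediately avoids adding load while the servers are already struggling.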

This category was designed to be called

If you create, let's say, Category:Whovian headword-line templates with the text {{tempcatboiler|pt|headword-line}}, you get a big red error message saying "This category was designed to be called Portuguese headword-line templates." Can we make the bold part of that error message into a link to whatever the category was supposed to be called (in this case, Category:Portuguese headword-line templates)? That would make it slightly easier for me to change categories around when languages are renamed. - -sche (discuss) 03:15, 22 November 2013 (UTC)[reply]

Done. —Ruakh_TALK 03:23, 22 November 2013 (UTC)[reply]

Thanks! With a little tweak, it works great. - -sche (discuss) 03:29, 22 November 2013 (UTC)[reply]

Anyone else think it's time to overhaul our category backend? One unreasonable idea would be to ban explicit categorization (with [[]]) and make all categories come from templates. Each template that categorizes would be required to call Module:categorize which would give the appropriate categories to the page (if we wanted to, this could propagate to metacategories, so that anything in "organic chemistry" would also be in "chemistry", etc). Which would make changing large category trees a lot easier- the only problem would be updating the actual category pages. DTLHS (talk) 03:38, 22 November 2013 (UTC)[reply]
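The propagation idea in the post above (anything in "organic chemistry" also lands in "chemistry") can be sketched as a walk up a parent table. This is a Python illustration only; the `PARENTS` table and the `categorize` function are hypothetical and do not describe an existing Module:categorize.

```python
# Hypothetical parent links; a real tree would come from category data.
PARENTS = {
    "organic chemistry": "chemistry",
    "chemistry": "sciences",
}


def categorize(label):
    """Return the label's category followed by all of its ancestors."""
    categories = []
    seen = set()
    while label is not None and label not in seen:
        seen.add(label)  # guard against accidental cycles in the data
        categories.append(label)
        label = PARENTS.get(label)
    return categories
```

With a single table like this, renaming or re-parenting a category is a one-line data change, which is the maintenance benefit being suggested; the category pages themselves would still have to be created or moved separately.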

FWIW, with the language I just updated (Mari→Eastern Mari, to distinguish it from the Sepik language called Mari, and the Austronesian language called Mari), almost all of the entries were categorised by template rather than by explicit categorisation, so the only problem I had was updating the actual category pages. Given that the contents (the descriptions, not the members) of categories are themselves also usually generated by template... what would make moving categories easiest would be if they could be moved (via "move" button) like regular pages. - -sche (discuss) 03:48, 22 November 2013 (UTC)[reply]

Yes... I really wish there was an extension or something so category pages didn't have to be created at all. DTLHS (talk) 03:51, 22 November 2013 (UTC)[reply]

@DTLHS: Is there a single piece of user interface designed by Wiktionarians that is bullet-proof, meeting semi-professional standards? It seems that the more ambitious the project the more detritus is left behind. (Take a look at Special:WantedPages and Special:WantedCategories.) Do we have good error messages yet? Do the designers stay around to correct the problems? Do they even document their work?

Lots of things here work well, but the most ambitious schemes: not so much. Please be realistic. DCDuring TALK 04:02, 22 November 2013 (UTC)[reply]

Don't worry, I don't have any intention to actually implement anything. I'm not responsible for other people's behavior however. DTLHS (talk) 04:05, 22 November 2013 (UTC)[reply]

The error message on Category:Batak Angkola language does not currently link. Is that a caching issue, or are top-level language categories' error messages controlled by a different page? - -sche (discuss) 04:28, 22 November 2013 (UTC)[reply]

Since Edit > Preview does not fix it, it is not a caching issue. Anyway it is now fixed. --Wiki Tiki 89 04:33, 22 November 2013 (UTC)[reply]

Template:ru-decl-noun-table and Template:ru-decl-noun-table-single

These templates have a perfect width when they are collapsed, but when you expand them, they take up the whole width of the screen. Is there any way to make them retain their perfect width when expanded? I'm not experienced enough with CSS and JS to figure it out. --Wiki Tiki 89 13:33, 22 November 2013 (UTC)[reply]

For me (using Firefox, XP-Pro), the expanded and collapsed forms have the same width (50% of the screen width). I have never seen them expand all the way across. —Stephen ^(Talk) 08:44, 24 November 2013 (UTC)[reply]

You are right, I just tested it in Firefox and expanding it does not change the width. This bug seems to only apply to Google Chrome and Internet Explorer. --Wiki Tiki 89 17:28, 24 November 2013 (UTC)[reply]

I figured it out. The 50% and 100% widths were interpreted as a percent of page width rather than container width. They are unnecessary anyway, so I removed them. --Wiki Tiki 89 18:31, 24 November 2013 (UTC)[reply]

showI3raab parameter in Module:ar-translit

Duplicating, see Module_talk:ar-translit#showI3raab_parameter. User:Atitarev/ar-conjug-I-test should transliterate all verb endings in full, e.g. فَعَلَ (faʕala) should be "faʾala", not "faʾal", etc. --Anatoli ^{(обсудить}/^вклад) 03:34, 25 November 2013 (UTC)[reply]

I would say yes. That's what I've been doing in headword lines anyway. --Wiki Tiki 89 04:19, 25 November 2013 (UTC)[reply]

Yes for what, sorry? I'm asking for technical help here. The parameter "showI3raab" in Module:ar-translit (from line 47) is not working now but it did before (without any obvious edit in Module:ar-translit or Module:ar-verb). --Anatoli ^{(обсудить}/^вклад) 04:38, 25 November 2013 (UTC)[reply]

Sorry, I misread your question. I thought you asked whether you should transliterate verb endings in full. I'll take a look. --Wiki Tiki 89 05:21, 25 November 2013 (UTC)[reply]

Yoga

How come {{context|yoga}} doesn't add pages to Category:Yoga? Can this be fixed easily? Ƿidsiþ 09:43, 25 November 2013 (UTC)[reply]

You are correct (even when supplying lang=). I suspect that "yoga" does not have an entry in "Module:labels/data". SemperBlotto (talk) 10:11, 25 November 2013 (UTC)[reply]
Got it. Man, this stuff is a lot more obscure than it used to be… Ƿidsiþ 13:54, 25 November 2013 (UTC)[reply]

How can we convert the category templates to Lua?

I'm talking about templates like {{poscatboiler}}. Of our major templates, these are really the only ones left that still rely on template logic. But they are very complex because of this; templates aren't really well suited to programming. I think they could be simplified considerably by converting them to Lua. But it's not so straightforward how to do it. From what I can see, it has some requirements:

Internals should be relatively straightforward. This should be easy because we don't have to work around template limitations like Daniel Carrero's original version had to.
It should be easy for people with relatively little knowledge to add new labels. The templates used one subpage for every label. We could do this with data modules (one data subpage per label), but we can also use a larger data module without too much trouble. Having one subpage per label makes it harder to get an overview of all the labels, and it also makes writing the code harder when it needs to do things like checking whether a label exists, or writing out a list of possible labels (this could be a big usability advantage, and should not be dismissed). But it makes it easier to create "edit" links from category pages, to create/edit them.
Speed is not an issue. We shouldn't make things needlessly slow, of course, but these templates are the only thing used on the category pages most of the time, so they can take their time.
Whatever we do, it should be able to do the same as what the existing system does.

Any ideas? Oh, and please don't start a tirade against any kind of progress, that's really not productive (you know who you are). —CodeCa t 15:41, 25 November 2013 (UTC)[reply]
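To make the "one larger data module" option concrete, here is a minimal Python sketch (not Lua, and not the actual design) of a single label table with the two operations mentioned in the requirements: checking whether a label exists and listing all labels. The labels, descriptions and function names are invented for illustration.

```python
# Hypothetical label table; real entries would live in a data module.
LABELS = {
    "lemmas": {"description": "{lang} lemmas.", "parent": None},
    "nouns": {"description": "{lang} nouns.", "parent": "lemmas"},
    "verbs": {"description": "{lang} verbs.", "parent": "lemmas"},
}


def label_exists(label):
    """Check for a label without loading a separate page per label."""
    return label in LABELS


def all_labels():
    """List every known label, e.g. for an overview or an error message."""
    return sorted(LABELS)


def describe(lang_name, label):
    """Fill in the description for a category, or return None if unknown."""
    if not label_exists(label):
        return None
    return LABELS[label]["description"].format(lang=lang_name)
```

Adding a label is then a matter of adding one entry to the table. The trade-off noted above still applies: with everything in one table, an "edit this label" link from a category page can only point at the whole table rather than at a dedicated subpage.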

Ugh, I was totally with you until your last paragraph. People say "the current system is better than what you're proposing" and you reply "you monster, why do you hate progress?", which is absurd. It's not "progress" if it's worse than what we've got. —Ruakh_TALK 15:58, 25 November 2013 (UTC)[reply]

But "worse" seems to be determined mostly by "I have to learn something new, which is difficult for me" or "anything associated with Lua is bad on principle" and not by judging something on its merits. That is what I tried to pre-empt. —CodeCa t 16:28, 25 November 2013 (UTC)[reply]

Until you show some respect for other people's time and opinions, I don't see that you should be trusted with any development task other than cleaning up the messes you created and restoring missing functionality. DCDuring TALK 16:31, 25 November 2013 (UTC)[reply]

So until I show respect for people who disrespect me? Right. Makes sense. —CodeCa t 16:53, 25 November 2013 (UTC)[reply]

It's a hard-earned disrespect. DCDuring TALK 04:33, 26 November 2013 (UTC)[reply]

My opinion here is even though the templates are complex, they are already in place, work perfectly fine, and are not very difficult to use in creating new category pages. Unless there are new useful features that will be able to be added with Lua, I don't see a reason to do this. --Wiki Tiki 89 16:07, 25 November 2013 (UTC)[reply]

The current implementation uses hundreds of subtemplates, and is hard to read and maintain for anyone not familiar with all the main sub-templates. In Lua, there would only be two pages (label data + a set of functions). So, from a maintenance point of view, it would be way better. Dakdada (talk) 16:32, 25 November 2013 (UTC)[reply]

But how frequently do things there actually need to be changed? --Wiki Tiki 89 16:46, 25 November 2013 (UTC)[reply]

That, I don't know. I can only say that if someone finds a bug, wants to add features or to copy this template on other project, it would be almost impossible without a lot of work. At least with Lua one does not need to be a template guru to look at the code. In short, if someone is willing to change this template to Lua, then by all means... Dakdada (talk) 17:13, 25 November 2013 (UTC)[reply]

The advantage we have here is that we don't have to "hot-swap" a heavily-used template right away. We can develop a parallel system with Lua, under different names, and only replace the current catboiler templates here and there to test. We can make sure that it works well and is bullet-proof before doing any sort of mass replacement. I would say it would make sense to develop it now, but to allow the mass replacement only after everyone has had a chance to check the code, see how it works with different combinations of variables, and otherwise give it a thorough vetting, with a solid consensus for implementation after thorough discussion being an absolute minimum requirement. As long as we follow such a protocol, there should be minimal chance for the kind of problems we've had previously with the Lua conversion.

Let's not let the fact that it's CodeCat proposing this keep us from giving it the impartial consideration it deserves.

As for why to do it: as long as we're dependent on code that looks like a cross between spaghetti and an M. C. Escher engraving, we're going to be stuck in the straitjacket of always doing things the way we've always done them. Eventually something will change that requires major reworking, and we won't have the time then to do it right like we could now. Chuck Entz (talk) 03:45, 26 November 2013 (UTC)[reply]

Sure, all of the recent projects have sounded good in principle. I've supported some at the sales-pitch stage. But in practice they have been characterized by regression of capability; cartoon-like, self-parodying error messages; conversion of sensible context labels to ridiculous categories; a job queue that never empties; etc. I no longer believe the sales pitch as long as the product is from the same old factory, not even troubling to act as if there will be any change in the mode of implementation. DCDuring TALK 04:52, 26 November 2013 (UTC)[reply]

I'm not talking about blindly accepting things at the sales-pitch stage. This wouldn't be implemented until the finished product is deemed acceptable. Also, in this particular case, these catboiler templates already have language codes and byzantine implementation- they can only get better. Have you ever tried {{topic cat}} on a topic name that's naturally capitalized? The default description created by the template forces the topic name in the sentence to lower case, and it's not obvious how to fix that (see this hypothetical example, for instance). There's nothing to stop CodeCat from doing the pre-implementation steps I outlined above, but she's asking for input. CodeCat shouldn't have made that remark about a "tirade against any kind of progress", but blind opposition to everything she proposes is uncomfortably close to an illustration of what she was talking about. Chuck Entz (talk) 20:59, 27 November 2013 (UTC)[reply]

CodeCat has no one else to blame. One cannot continually repeat the same pitch, fail to repair the damage done, and expect forgiveness and a license to continue. If you want to be an apologist for this nonsense, you may. I was fooled too for a time. DCDuring TALK 21:56, 27 November 2013 (UTC)[reply]

I'm no apologist. It's just that this requires neither forgiveness nor a license to continue. I'm not saying we should give her carte blanche to do anything she wants. We should check everything, and have people who know template and module code check everything, only then giving permission for implementation of exactly what we've approved- and no more. Besides, categories aren't transcluded the way templates are, so substituting the candidate template into any number of categories, then reverting the change, or perhaps just adding it and viewing the preview without saving, will have absolutely zero effect on any actual entries- we can test every possible case before we make any replacement of the templates themselves- no trust needed. Chuck Entz (talk) 22:33, 27 November 2013 (UTC)[reply]

I've written Module:User:CodeCat/category boilerplate and Module:User:CodeCat/category boilerplate/short, and I changed {{shortcatboiler}} (which is not widely used) as a first trial. It seems to work well from what I can see, but of course the real test is how maintainable it will prove to be. Neither module is in its final state, there will still be many changes needed to support all of our category templates, but now it can support a majority. —CodeCa t 21:06, 27 November 2013 (UTC)[reply]

You should write a test page with various (correct and incorrect) uses to see how the template/module behaves. For example, it throws a Script Error when the language code is not defined. Also, is there no interface to get language data (like lang.get_lang_data(code)), instead of loading the raw table? Dakdada (talk) 10:26, 29 November 2013 (UTC)[reply]
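The two suggestions above (an accessor instead of the raw table, and a test page covering correct and incorrect uses) can be sketched together. This is an illustrative Python sketch with a made-up language table; `get_lang_name` and `category_title` are hypothetical names, not functions of the modules under discussion.

```python
# Hypothetical language table used only for this illustration.
LANGUAGE_NAMES = {"pt": "Portuguese", "ru": "Russian"}


def get_lang_name(code):
    """Accessor: return the language name, or None for an undefined code."""
    return LANGUAGE_NAMES.get(code)


def category_title(code, label):
    """Build a category title, reporting bad input as a readable message."""
    name = get_lang_name(code)
    if name is None:
        return "Error: unknown language code '{}'".format(code)
    return "Category:{} {}".format(name, label)


# A small "test page": correct and incorrect uses side by side.
CASES = [
    ("pt", "headword-line templates"),
    ("xx", "headword-line templates"),
]
for code, label in CASES:
    print(category_title(code, label))
```

The point of the second case is that an undefined code produces a message a category editor can act on, rather than an unexplained script error.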

Questions about the logo

What is the IPA font used in [ˈwɪkʃənrɪ] in the logo of Wiktionary at the top left of the page?
Why was the chiefly British pronunciation [ʃənrɪ] chosen rather than [ʃənˌɛri]? At least the ending [ʃənəri] is also permissible in British English.
Since that transcription appears to be very precise, why was [r] used rather than [ɹ]?

--Mahmudmasri (talk) 06:54, 26 November 2013 (UTC)[reply]

You seem to have posted your questions on the wrong page. This is the Grease pit, for technical discussions: template/module design, bot discussions, design of Gadgets and site JS and site CSS and so on, etc., etc., etc. —Ruakh_TALK 07:42, 26 November 2013 (UTC)[reply]

OK, where to post that? I searched and only found that was the relevant page, that you are now telling me it's not. --Mahmudmasri (talk) 12:37, 26 November 2013 (UTC)[reply]

To answer your questions: (1) Offhand, I don't know what font that is; I'm sorry ... someone will probably come along who does, though. (2) At the time the logo was created, it was easy to find documentation of and references for RP, and it was a prominent standard pronunciation register, so it was used. (At least, that's a romanticised version of what happened. Those who were actually in the wiki-trenches at the time may have a different recollection.) Notably, our Rhymes pages also use RP in instances where the US and RP pronunciations are not different diaphonemes and it would be redundant to have pages for both the US and RP pronunciations (e.g. in the case of Rhymes:English:-əʊ vs *Rhymes:English:-oʊ). See also our FAQ. (3) At the time the logo was created, Wiktionary was lax in its transcription of the 'r' sound, and followed the practice (somewhat common among monolingual English dictionaries) of transcribing 'r' as /r/, [r]. In recognition of the fact that Wiktionary is multilingual and it's actually quite confusing to label the English sound a trill, Wiktionary later voted (in 2008) to switch to the technically-correct /ɹ/. (4) Perhaps the Information Desk would be a better place to post general questions? The Grease Pit does tend to be reserved for technical questions. There's an overview of the various discussion rooms here. - -sche (discuss) 14:46, 26 November 2013 (UTC)[reply]

Mahmudmasri, usually questions about the logo are placed at Wiktionary:Feedback or Information Desk. I can tell you that the logo is an image file that was created in 2004 by Erik Möller. I imagine that the British pronunciation was chosen because Erik had learned British English. It is likely that Erik was not aware that the r symbol was pronounced differently in IPA. —Stephen ^(Talk) 14:57, 26 November 2013 (UTC)[reply]

Thanks. I may ask there. --Mahmudmasri (talk) 16:44, 26 November 2013 (UTC)[reply]

According to the logo’s info page, the logo includes Times New Roman, Lucida Sans, and Bitstream Vera Sans. —Michael Z. 2013-11-26 16:55 z

Thanks. --Mahmudmasri (talk) 09:45, 27 November 2013 (UTC)[reply]

Bot request: update template used in the etymologies of Hungarian compounds per RFDO

On RFDO, consensus is to delete Template:hu-compound as redundant to Template:compound + lang=hu. However, before it can be deleted, it must be orphaned. A bot should replace all uses of {{hu-compound|foo|bar}} with {{compound|lang=hu|foo|bar}}. - -sche (discuss) 02:26, 28 November 2013 (UTC)[reply]
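The requested replacement can be sketched as a single regular-expression substitution on the page text. This is a minimal Python illustration, not the code of any actual bot: the function name `convert` is hypothetical, and fetching and saving pages is left out entirely.

```python
import re

# Matches the opening of {{hu-compound|...}} or a bare {{hu-compound}},
# but not longer template names that merely start with "hu-compound".
_HU_COMPOUND = re.compile(r"\{\{\s*hu-compound\s*(?=[|}])")


def convert(wikitext):
    """Rewrite {{hu-compound|foo|bar}} as {{compound|lang=hu|foo|bar}}."""
    return _HU_COMPOUND.sub("{{compound|lang=hu", wikitext)
```

For example, `convert("{{hu-compound|foo|bar}}")` returns `"{{compound|lang=hu|foo|bar}}"`. Uses with unusual parameters would still be worth reviewing by hand after a run.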

Done, please review háromfejű karizom, kétfejű karizom. DTLHS (talk) 19:40, 28 November 2013 (UTC)[reply]

Thank you! - -sche (discuss) 20:38, 28 November 2013 (UTC)[reply]

Searching in Just One Language

Hey All,

I use Wiktionary frequently to search Latin terms because it's the most effective dictionary online that preserves macrons (ex: ut valētis). However, I easily search twenty/thirty terms in a one-hour period, and I would like to save myself time by not needing to scroll down to the Latin entry. Is there any way to enter a string into the search bar and have it scan Wiktionary for just a single language?

-Ryan

Append #Latin to the term you are searching. — Ungoliant ^(Falai) 16:43, 28 November 2013 (UTC)[reply]

One small caveat is that you have to write the whole string yourself (including #Latin) every time. It may be a good idea to add a field to select a language, that would automatically append the #Latin (or any language) anchor for every search. Dakdada (talk) 10:17, 29 November 2013 (UTC)[reply]
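Until such a field exists, the anchor can be appended by a small script or browser shortcut. The following Python sketch builds an entry URL with the language anchor; the helper name `entry_url` is hypothetical, and it assumes the usual `/wiki/` page path with spaces written as underscores.

```python
from urllib.parse import quote


def entry_url(term, language="Latin"):
    """Return the URL of an entry, jumping straight to one language section."""
    title = term.replace(" ", "_")
    anchor = language.replace(" ", "_")
    return "https://en.wiktionary.org/wiki/{}#{}".format(quote(title), quote(anchor))
```

For example, `entry_url("amo")` gives `https://en.wiktionary.org/wiki/amo#Latin`. If the page has no section for that language, the browser simply stays at the top of the page.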

If you use tabbed languages (available under "Preferences" - "Gadgets" - bottom of "User Interface Gadgets") it will remember what language you looked at last and switch to it, if available. -Atelaes λάλει ἐμοί 19:00, 29 November 2013 (UTC)[reply]

I tested this and it is not persistent: if I read the Latin amo, then switch to a page with no Latin heading, then switch back to a page with a Latin heading, it will have been forgotten. Also, this gadget does not work for IPs. Dakdada (talk) 08:57, 2 December 2013 (UTC)[reply]

By the way, if I look for a Latin word and an article exists but without a Latin heading, it would be nice to be told that "Wiktionary does not have an entry for xxx in Latin", instead of searching for a non-existent heading. Dakdada (talk) 08:59, 2 December 2013 (UTC)[reply]

Ah. Sorry it didn't work out for you. I will keep those critiques in mind; it is possible they could be rectified in the future. -Atelaes λάλει ἐμοί 09:04, 2 December 2013 (UTC)[reply]

When you go to a page that doesn't have the language, it will try to choose a default language that does appear. Once it has chosen that language, that's the one it remembers for the following page. It may not be ideal, but changing it has its own downsides. —CodeCa t 15:05, 2 December 2013 (UTC)[reply]
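The behaviour described above can be modelled in a few lines. This Python sketch is only a model of the explanation, not the gadget's code; in particular, treating the first language section on the page as the default is an assumption.

```python
def choose_language(remembered, languages_on_page):
    """Return (language shown, language remembered afterwards) for one page view."""
    if not languages_on_page:
        return None, remembered
    if remembered in languages_on_page:
        shown = remembered
    else:
        # Assumed default: the first language section on the page.
        shown = languages_on_page[0]
    # Whatever was shown becomes the remembered language for the next page.
    return shown, shown
```

Replaying the sequence reported earlier in the thread: starting with Latin remembered, a page with only English and Italian shows English and remembers it, so a following page that does have a Latin section still opens on English.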