Wiktionary:Grease pit

Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and as a website.

The Grease pit is a place to discuss technical issues such as templates, Lua modules, CSS, JavaScript, the MediaWiki software, extensions to it, the toolserver, etc. It is also a place to think in non-technical ways about how to make the best free and open online dictionary of "all words in all languages".

Others have understood this page to explain the "how" of things, while the Beer parlour addresses the "why".

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Find information and helpful links about modules, Lua in general, and the Scribunto extension at WT:LUA.
  • Everyone is encouraged to expand both pages, or to come up with more such stuff. Other known pages with "tips-n-tricks" are to be listed here as well.

Grease pit archives

February 2018

Python problem (non-Wiki)

Hi. I have been away for some days recovering from a laptop disaster. I have reinstalled Python 2.7.14. Now, when I right-click a Python program and select Edit with IDLE, nothing happens (no edit window, no message, nothing). Any ideas? SemperBlotto (talk) 19:32, 2 February 2018 (UTC)

What operating system? DTLHS (talk) 19:51, 2 February 2018 (UTC)
Windows 10. SemperBlotto (talk) 19:55, 2 February 2018 (UTC)
"Try deleting the contents of the .idlerc folder in your profile. To open the folder just type and enter %USERPROFILE%\.idlerc." DTLHS (talk) 19:59, 2 February 2018 (UTC)
I can't find any file or folder named idlerc (with or without a dot) anywhere on my machine. SemperBlotto (talk) 20:57, 2 February 2018 (UTC)
I uninstalled / reinstalled Python - now works OK. Now I have to get the bot working again. SemperBlotto (talk) 16:08, 3 February 2018 (UTC)
  • I reinstalled pywikibot. Now, when I try running the bot, I get:-

Traceback (most recent call last):

 File "C:\Python-it\", line 7, in <module>
   import pywikibot, config
 File "C:\Python-it\pywikibot\", line 15, in <module>
   from textlib import *
 File "C:\Python-it\pywikibot\", line 17, in <module>
   import wikipedia as pywikibot
 File "C:\Python-it\", line 7559, in <module>
   get_throttle = Throttle()

NameError: name 'Throttle' is not defined

Any ideas? SemperBlotto (talk) 06:10, 4 February 2018 (UTC)

Uniform usage labels template

Hi! We (@Isbms27 and myself) would like to propose a single unified system of usage labels for Wiktionaries in all languages. Such labels exist in some Wiktionaries (English, Russian, etc.), but their use is not systematic, and in some other languages they are not used at all.

We suggest the following categories; the tags are taken as an example from English Wiktionary: 1. Usage/Register/Stylistic: neutral / colloquial / informal / formal / slang / jargon / nonstandard / familiar / periphrastic / official / vulgar / taboo / obscene; 2. Speakers: to specify special social groups (by age, gender, social status, occupation (for slang / jargon), etc.); 3. Academic subject area: chemistry / biology / zoology, etc. (for terminology only); 4. Regional/Geography: American English / Australian / etc.; 5. Temporal: dated (outdated) / archaic / obsolete / neologism / historical / hot word / nonce word; 6. Expressiveness: approving / disapproving / humorous / ironic / offensive / euphemism; 7. Word type: abbreviations / acronyms / initialism

Sorry, if wrong thread!

I don't see how 7 relates to usage context. —Rua (mew) 18:06, 3 February 2018 (UTC)
I wanted to reformat, to more easily understand the proposal:
  1. Usage/Register/Stylistic
  2. Speakers: to specify special social groups (by age, gender, social status, occupation (for slang / jargon), etc.)
  3. Academic subject area:
    • chemistry / biology / zoology, etc. (for terminology only)
  4. Regional/Geography:
    • American English / Australian / etc.
  5. Temporal:
  6. Expressiveness:
    • approving
    • disapproving
    • euphemism
    • humorous
    • ironic
    • offensive
  7. Word type:
A couple of points:
  1. Although many of these are defined in the Appendix:Glossary for English, not all of them are because they are quite subjective. The proposal is pan-project; how would these concepts be normalized across the languages involved?
  2. What qualifies as academic subject area? e.g. Pharmaceuticals, particularly trade marks like Viagra (or, more controversially, Aspirin, which is a trade mark in most of the world other than the USA) which are not a chemical name or pharmaceutical compound but simply a brand. - Amgine/ t·e 18:30, 3 February 2018 (UTC)
I completely agree that particular concepts of some categories (esp. Register, Temporal and Expressiveness) should be normalized across languages; we used existing tags from English Wiktionary as an example. Russian tags are also quite messy, and I could find no system for Spanish or German. However, fixing language-specific tags is the second step.
The first and the main step is to introduce a uniform set of categories as a fixed template for all languages and to encourage users, 1) to specify as many categories as possible for a particular word-usage, 2) to use the same pattern describing words in different languages.
We already have it in grammar description: for every word part of speech is specified, etc. It will be useful to have this for 'semantics'/'usage' information as well.
For example, at Wikimedia Pre-hackathon in Olot they discuss an idea about the integration of lexical wikidata into some machine translation systems. These usage-labels if uniformly presented in all languages and integrated can help to choose translation equivalents. —Isbms27 10:30, 4 February 2018 (UTC)
<nod> There certainly would be work to do. But I was asking how you would address normalizing the categories. It is the part which seems most difficult for Wiktionary.
Also, I believe Rua would like for you to address the concern raised about #7. - Amgine/ t·e 16:08, 4 February 2018 (UTC)
  • Also, we don't have a part of speech for every word -- POS concepts don't apply very well to Chinese terms, for instance, and Lojban is just plain wacky, and there are aspects of Japanese that haven't been tackled yet, and the whole issue of idioms has gone back and forth a few times, among many other issues. Looking at the numbered list above, some aspects are rather confusing -- #3 says "for terminology only", which is very ambiguous given the context.
While I support the underlying idea (semantic tagging for better correspondences across languages), I caution planners that this is a vastly complicated issue. Please do not expect rapid progress.  :) ‑‑ Eiríkr Útlendi │Tala við mig 18:41, 6 February 2018 (UTC)
In my opinion there are certain elements here which are broadly accepted - for example the temporal tags. I believe most languages of Wiktionary will have the concept, e.g. Catégorie:Langage désuet, Kategorie:Zastaralé_výrazy, etc., of dated/archaic terms. Those should be immediately implemented as they serve as exemplars how this project can work inter-language. - Amgine/ t·e 17:21, 7 February 2018 (UTC)

CAT:head tracking/unrecognized pos

How does {{head}} add words to this category? Asking since I'm importing some modules from here to Hindi Wiktionary, and hi:માટે is being put in this category. —AryamanA (मुझसे बात करेंयोगदान) 23:26, 3 February 2018 (UTC)

Any part of speech that isn't in the lemmas or non-lemmas table in Module:headword/data will be put into that category. DTLHS (talk) 00:01, 4 February 2018 (UTC)
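
The check DTLHS describes can be sketched roughly as follows, in Python rather than Lua. The two sets stand in for the lemmas and non-lemmas tables in Module:headword/data; their contents, the function name, and the return convention are assumptions made for illustration, not the module's actual code.

```python
# Hypothetical stand-ins for the lemma and non-lemma tables in
# Module:headword/data; the real tables are much longer.
LEMMAS = {"nouns", "verbs", "adjectives", "postpositions"}
NON_LEMMAS = {"noun forms", "verb forms", "adjective forms"}

TRACKING_CATEGORY = "head tracking/unrecognized pos"


def tracking_category(pos_category):
    """Return the tracking category for an unrecognized part of speech, else None."""
    if pos_category in LEMMAS or pos_category in NON_LEMMAS:
        return None
    return TRACKING_CATEGORY
```

If this is what is happening on Hindi Wiktionary, the part of speech passed to {{head}} there is presumably missing from the imported data tables.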

Old English declension infrastructure

Old English noun declension templates are woefully incapable of presenting what should, by all rights, not be too complex. For example, see feond for an entry where each form has to be specified because the templates can't handle it. If someone could Luacise it so that it worked like our Latin declension templates, that would help a great deal. @Rua, JohnC5 as people with a likely interest. —Μετάknowledgediscuss/deeds 21:38, 4 February 2018 (UTC)

For feond, the problem came from trying to squish two separate declensions into the same table, so I have separated them. mellohi! (僕の乖離) 04:06, 8 February 2018 (UTC)

Module:wikipedia (I think)

When viewing {{pedia}} in the mobile version of Safari, it comes out pretty much like this:

  Grease pit on Wikipedia.Wikipedia Wikipedia

Note that there's no space between the period and the "Wikipedia" link when the problem actually occurs. Esszet (talk) 17:56, 5 February 2018 (UTC)

@Esszet: There is a link after the period even in the desktop version, though it's invisible: for instance, <span class="interProject"><a href="" class="extiw" title="w:elephant">Wikipedia</a></span> in elephant. Not sure what its purpose is. The CSS file for the desktop version, MediaWiki:Common.css, hides it, while the CSS for the mobile version, MediaWiki:Mobile.css, doesn't. — Eru·tuon 05:24, 6 February 2018 (UTC)
Can it be deleted? It seems totally redundant. Esszet (talk) 13:23, 6 February 2018 (UTC)
@Esszet: Apparently it isn't. It seems to populate the "in other projects" section of the sidebar. So an admin needs to add a rule to MediaWiki:Mobile.css like the one in MediaWiki:Common.css. Perhaps @-sche, who was working on that file? — Eru·tuon 17:54, 6 February 2018 (UTC)
  Done. Let me know if there are any unwanted side effects; I looked for anything else that used that class, and didn't see anything that seemed likely to cause problems. - -sche (discuss) 18:29, 6 February 2018 (UTC)

Odd behavior from template

On jocose, I added a citation, originally with {{cite-book}} but then changing it to {{quote-book}}. In both, the only date I used was the year (1886) but when I use the latter, correct template, it displays "1886. February 5." Where is it getting the date??? —Justin (koavf)TCM 18:04, 5 February 2018 (UTC)

@Sgconlaw? —Μετάknowledgediscuss/deeds 18:11, 5 February 2018 (UTC)
The quotation templates use PHP functions to parse dates. This is unfortunate because PHP will never throw an exception, ever, no matter what garbage input you give it. So if you give it a date of "1886" it will attempt to fill in the other parameters of the date with the current date. Anyway, if you just want a year, use the year parameter. Only use date if you have a specific date. DTLHS (talk) 18:13, 5 February 2018 (UTC)
Yup. This is a feature of {{#time:}}, not of the templates themselves. — SGconlaw (talk) 18:26, 5 February 2018 (UTC)
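
To make the behaviour described above concrete, here is a minimal Python sketch of a lenient parser that borrows missing date fields from the current date. It is not how {{#time:}} is implemented; the function name and the accepted formats are assumptions chosen only to reproduce the "1886. February 5." effect.

```python
from datetime import date


def fill_partial_date(text, today):
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD', borrowing missing parts from today."""
    parts = [int(p) for p in text.split("-")]
    year = parts[0]
    month = parts[1] if len(parts) > 1 else today.month
    day = parts[2] if len(parts) > 2 else today.day
    return date(year, month, day)


# A bare year silently picks up the month and day of the current date.
print(fill_partial_date("1886", today=date(2018, 2, 5)))  # 1886-02-05
```

Passing only the year through a separate year parameter avoids the problem because no date parsing takes place at all.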

Japanese example quoting using ja-usex

I want to write some examples for Japanese Jukujikun (熟字訓) kanji readings

    • 2000, "Tsunami", in Ballad 3: The Album Of Love, performed by Southern All Stars, track 24:
      {{ja-usex|人は誰も愛求めて 闇にさまよう運命|ひとはだれもあいもとめてやみにさまようさだめ|The fate of a man wandering in the dark, looking for love}}

Is there a way to bypass this system (exceptions for cases found in famous literature) or add kanji readings that I believe are legitimate?

Thanks. Jayshinkw (talk) 03:49, 6 February 2018 (UTC)

Did you mean this? (After adding spaces)
人(ひと)は誰(だれ)も愛(あい)求(もと)めて 闇(やみ)にさまよう運命(さだめ)
hito wa dare mo ai motomete yami ni samayō sadame
The fate of a man wandering in the dark, looking for love
Wyang (talk) 04:12, 6 February 2018 (UTC)

Diacritic stripping request

Similar to how Latin links display macrons but strip them from the link target, could the following be done for Livonian (language code: liv)?

  • remove any apostrophes
    • e.g., {{m|liv|ke'ž}} would land on kež
  • also remove this weird apostrophe: ’
  • remove ogonek from long ō: ǭ --> ō
    • e.g., {{m|liv|rǭ’}} would land on

Neitrāls vārds (talk) 01:16, 8 February 2018 (UTC)

Done. DTLHS (talk) 01:34, 8 February 2018 (UTC)
Thank you! Neitrāls vārds (talk) 01:55, 8 February 2018 (UTC)
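
For reference, the stripping rules requested in this thread can be sketched as a small Python function. This is only an illustration of the mapping from display text to entry name; the function name is hypothetical, and the actual change was made in the Lua language data, not with this code.

```python
import unicodedata

# Characters to drop or replace when turning display text into a page title.
# The rules follow the request in this thread; the names are hypothetical.
_LIV_TABLE = str.maketrans({
    "'": None,           # ASCII apostrophe
    "\u2019": None,      # right single quotation mark
    "\u01ed": "\u014d",  # o with ogonek and macron -> o with macron
    "\u01ec": "\u014c",  # same replacement for the capital letter
})


def liv_entry_name(display_text):
    """Map a Livonian display form to the entry it should link to."""
    # Compose first so that decomposed input matches the table keys.
    composed = unicodedata.normalize("NFC", display_text)
    return composed.translate(_LIV_TABLE)
```

Normalizing to NFC first means a decomposed o + ogonek + macron sequence is handled the same way as the precomposed letter.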

Another diacritic stripping question

Latvian (lv) has most of the scholarly tone diacritic removal/replacement rules covered (thanks to whoever did this) but it would be awesome to get rid of a spelled out <uo> diphthong (just o in standard orthography) but one of the two letters (seems to vary by author which one) gets a tone diacritic.


To avoid having to make any complicated "logic statement" I could spell out all of them (not that many because it's not possible for 2 different tone marks to be within the same diphthong):

(uo) (ũo ûo ùo ūo) (uõ uô uò uō) (ũõ ûô ùò ūō) --> o

But the part with macrons should only apply to <uo> sequence because macron is a legitimate diacritic (technically it's not even a tone mark but, I guess, they use it as a replacement for tilde, since <uo> is not part of orthography to begin with, that prevents any confusion, I suppose.) Neitrāls vārds (talk) 03:01, 8 February 2018 (UTC)
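
The list of sequences above can be covered without spelling out every combination, for example by decomposing the text and matching "u" and "o" with an optional tone mark on either letter. The Python sketch below assumes exactly the four marks listed (tilde, circumflex, grave, macron); the function name is hypothetical. It is meant only to show that the rule is expressible, not to argue that it should be applied.

```python
import re
import unicodedata

# Combining grave, circumflex, tilde and macron: the marks listed above.
TONE_MARKS = "\u0300\u0302\u0303\u0304"

# "u" and "o", each optionally followed by one of the tone marks.
_UO = re.compile(f"u[{TONE_MARKS}]?o[{TONE_MARKS}]?")


def normalize_uo(text):
    """Collapse a spelled-out <uo> diphthong, with or without tone marks, to plain o."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", _UO.sub("o", decomposed))
```

A macron outside an uo sequence is left alone, so a form such as ābola is unchanged. The sketch would also collapse any genuine u + o sequence, which is one practical reason to be cautious with it.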

If it was even possible, I disagree with this one. These are orthographic variation and should be treated as such. --Victar (talk) 01:21, 9 February 2018 (UTC)
I guess it makes sense that replacing a sequence would be more tricky than a single character...
However, uo is not a variant spelling only a "dictionary notation convention" (for lack of a better term), to my knowledge there isn't any spelling tradition (that has seen any use) that would spell out uo's, only a convention that is used in headword lines of more specialized dictionaries. Neitrāls vārds (talk) 18:14, 9 February 2018 (UTC)

This page gives a quick rundown of the stages of lv orthography

1st (chaotic) attempts (ca. 16~18th century), just plain o

  • Ahbola /a:buola/ (modern ābola)

or occasionally a doubled oo

  • goodtc /guots/ (gods)

So-called "German orthography" (~19th century), just o

Modern orthography, conceptualized at the turn of 19th/20th centuries but introduced after WW1 (1918-ish?) Initially the plan was for it to spell out uo's but it didn't materialize (rightly so if you ask me, when every (native) o is implicitly uo, what's the point of explicitly spelling them out, but I digress.) So, outside of dictionaries it has never really been used. Neitrāls vārds (talk) 18:14, 9 February 2018 (UTC)

The issue is that we strip diacritics to make linking easier, but not as a tool to normalise an orthography. The problem is not a technical one, but that this is an inappropriate application. —Μετάknowledgediscuss/deeds 18:38, 9 February 2018 (UTC)
So there has never been a precedent? I.e., a language actually adding extra letters for their dictionary notation as opposed to just adding diacritics (every example I can think of falls in the latter category actually.)
not as a tool to normalise an orthography – as I outlined above uo has never been part of any orthography tradition, only "dictionary notation" / faux transcription.
Latvian is not that relevant in etymologies (Lithuanian can usually do the same job while being more archaic) but looking forward how can this problem be tackled? Suppose I magically fix all the links right now, then the year 2020 rolls over and there are another 200 red links when there are perfectly fine entries that they should land on, say, for example, link for uozuols when there's ozols (and the former is not a valid form attestable in prose, only in dict headword lines.) It's not that I care that much but it sounds like something one would constantly need to look after. Neitrāls vārds (talk) 20:57, 9 February 2018 (UTC)

Double boldface

Is it possible to avoid the double boldface that occurs when, say, {{past participle of|acquit|lang=en}} is used on the acquit entry page itself? — SGconlaw (talk) 10:54, 8 February 2018 (UTC)

I agree this is a problem that should be resolved. I recall this kind of thing coming up before (there was a discussion of it involving msh210 and Rua—CodeCat at the time). - -sche (discuss) 05:27, 9 February 2018 (UTC)
That is what would normally appear if an inflection has the same spelling as the main entry. I don't think it should be done away with. DonnanZ (talk) 16:45, 9 February 2018 (UTC)
Hmm, it seems to be more accentuated than at run#Verb. DonnanZ (talk) 16:56, 9 February 2018 (UTC)
Wouldn't the normal level of boldface be sufficient? The extra boldface seems excessive. — SGconlaw (talk) 17:43, 9 February 2018 (UTC)
Yes, it should only be as bold as e.g. a (sloppy, #Noun-less) link from an adjective section to a noun section on the same page, not twice as bold. I found the prior discussion, which should help resolve the current case: Wiktionary:Beer parlour/2014/June § boldfaced forms of invariant lemmata. - -sche (discuss) 18:56, 9 February 2018 (UTC)
We don't normally create inflection entries if they are the same spelling as the main entry (I don't anyway) so this case is a little odd. DonnanZ (talk) 19:12, 9 February 2018 (UTC)
@Sgconlaw: I have fixed it using "old-fashioned" technology, but had to re-add the past participle category. See if it meets your approval now. DonnanZ (talk) 01:04, 10 February 2018 (UTC)
Thanks! — SGconlaw (talk) 15:19, 11 February 2018 (UTC)
@Erutuon, Rua, DTLHS or anyone else who might know: can we find an actual, general solution to this? Simply removing templates as was done on [[acquit]] seems like an undesirable and unmaintainable approach that only "fixes" individual entries as they crop up. - -sche (discuss) 19:24, 11 February 2018 (UTC)
@-sche: I had taken a look at the HTML, and thought it was because two CSS selectors were both emboldening the same word: the strong tag and .form-of-definition-link .mention class selector. But when I look at it with browser-internal styles displayed (I'm in Firefox), it's clear that the reason is slightly different: <strong>, <b> tags have the rule font-weight: bolder;, which means that the "form of definition mention" text, which is already bold, is made even bolder by the <strong> tag. One solution would be to override this browser-internal rule with strong, b { font-weight: bold; } in MediaWiki:Common.css, though I'm not sure if that's the best solution. — Eru·tuon 20:54, 11 February 2018 (UTC)
Is there any way to standardize the css classes to only use either strong or bold in order to match what the wikitext produces? Having both floating around seems destined to create oddly-distributed coincidental combinations. Chuck Entz (talk) 21:44, 11 February 2018 (UTC)
I don't quite understand your question, because strong and bold are HTML tags and have nothing to do with CSS rules applying to classes. I might not have explained things well. In the entry acquit, a selflink <strong class="mw-selflink selflink">acquit</strong> (acquit) was generated from the wikitext [[acquit]]. The strong tag, as well as the b tag generated by wikitext bolding syntax, has the CSS property font-weight: bolder; applied to it by my browser, and apparently by other people's browsers. The selflink was wrapped by <span class="form-of-definition-link"><i class="Latn mention" lang="en">...</i></span>, and MediaWiki:Common.css applies the property font-weight: bold; to this configuration of classes. So acquit starts out bold because of the Wiktionary CSS property and becomes even bolder because of the browser-internal CSS property. (The resulting bold value is 900 according to my browser: font-weight: bold; + font-weight: bolder; = font-weight: 900;.) — Eru·tuon 22:30, 11 February 2018 (UTC)
How would such a line in the css interact with e.g. the lines that specify that "bolded" Hebrew has a normal (non-bolded) font weight and is big instead? Would it override them and cause Hebrew to be "bolded"? If not, that sounds like a good fix. What did we do to fix the "fishbone" problem linked above, of self-links on the headword line being double-bolded? - -sche (discuss) 01:18, 12 February 2018 (UTC)
@-sche: Based on this article, CSS selectors that include class names (.Hebr) will have precedence over those that only include tag names (b, strong). So the Hebrew-related styles will behave in the same way.
When I test it in my browser, b, strong { font-weight: bold; } fixes the double-bolded headword selflink problem too, so it could replace the rule that currently fixes the problem, b .selflink, strong .selflink { font-weight: inherit; }. I wonder if there are any cases in which Wiktionary needs any levels of bolding besides normal and bold. — Eru·tuon 04:56, 12 February 2018 (UTC)
I wouldn't imagine so. — SGconlaw (talk) 02:34, 14 February 2018 (UTC)
I have added that code to MediaWiki:Common.css. It seems to solve the issue. If there are no adverse side-effects, the old code (currently commented out) can be removed. - -sche (discuss) 03:14, 14 February 2018 (UTC)
Oh, I thought the template had been modified. Didn’t realize it had simply been removed. — SGconlaw (talk) 19:39, 11 February 2018 (UTC)
I tried to explain that, it can be regarded as a temporary solution if the "boffins" can work out what to do. DonnanZ (talk) 12:06, 12 February 2018 (UTC)
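
The arithmetic in this thread (bold plus bolder giving 900) can be checked with a short Python sketch of how a relative font-weight: bolder resolves against an inherited numeric weight. The thresholds follow the relative-weight table in the CSS Fonts specification as understood here; treat them as an assumption to verify against the spec, and note that browsers may differ when a font lacks the requested weight.

```python
def bolder(inherited_weight):
    """Resolve CSS 'font-weight: bolder' against an inherited numeric weight."""
    if inherited_weight < 350:
        return 400
    if inherited_weight < 550:
        return 700
    if inherited_weight < 900:
        return 900
    return inherited_weight  # already at the maximum step; no change


BOLD = 700    # numeric value of 'font-weight: bold'
NORMAL = 400  # numeric value of 'font-weight: normal'
```

With these values, bolder(NORMAL) is 700 (ordinary bold) while bolder(BOLD) is 900, which matches the double-bold effect seen on the selflink. Replacing bolder with an absolute bold on b and strong removes the dependence on the inherited weight.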

FastRevert no longer working correctly?

I noticed that FastRevert is no longer leaving clean edit summaries. See this diff, for example. Anyone know what happened?

For the record, I was using Safari 11.0.2 when I made that edit, although I'm not sure it makes a difference. --Ixfd64 (talk) 19:48, 8 February 2018 (UTC)

Protected page edit request

Hello, for Wiktionary:Per-browser_preferences, please change:

<div id="isPreferencePage" name="isPreferencePage" />

to:

<div id="isPreferencePage" name="isPreferencePage"></div>

to resolve Special:LintErrors/self-closed-tag. Thank you, Xaosflux (talk) 16:14, 9 February 2018 (UTC)
  Done. Thanks for pointing out the lint error. - -sche (discuss) 16:20, 9 February 2018 (UTC)
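
For anyone cleaning up the same lint error on several pages, the rewrite requested above can be sketched in Python. The tag list is a small illustrative subset chosen for this example, not MediaWiki's own list, and the function name is hypothetical.

```python
import re

# Non-void HTML elements that must not be written as <tag ... />.
# A small illustrative subset, not MediaWiki's own list.
NON_VOID_TAGS = ("div", "span", "p", "b", "i")

_SELF_CLOSED = re.compile(
    r"<(" + "|".join(NON_VOID_TAGS) + r")\b([^<>]*?)\s*/>"
)


def expand_self_closed(wikitext):
    """Rewrite <div ... /> as <div ...></div>, leaving void tags such as <br /> alone."""
    return _SELF_CLOSED.sub(r"<\1\2></\1>", wikitext)
```

Void elements such as <br /> are deliberately left out of the tag list, since self-closing them is valid.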

Renaming 'Azeri' to 'Azerbaijani' with a bot

Per WT:RFM#Renaming_az I have renamed the language code az to 'Azerbaijani'. Can someone with a bot update the language headers, translations tables and descendants lists? --Vahag (talk) 17:51, 9 February 2018 (UTC)

Many categories also need to be moved. DTLHS (talk) 17:51, 9 February 2018 (UTC)
@Vahagn Petrosyan: I see that you have gone ahead and renamed it, which is probably not a good idea until we're ready to switch everything over. As DTLHS notes, a big part of it will be the categories, so if you want to start moving those over, now would be a good time. —Μετάknowledgediscuss/deeds 21:26, 9 February 2018 (UTC)
It's fine, I'll move them tonight. DTLHS (talk) 21:32, 9 February 2018 (UTC)
Looks like the categories are done now. The mainspace changes await. —Μετάknowledgediscuss/deeds 06:24, 10 February 2018 (UTC)
A few newly created entries have kept "==Azeri==", such as artıq. But I guess the process is still ongoing, so there's no need for concern. Allahverdi Verdizade (talk) 12:15, 10 February 2018 (UTC)
@DTLHS, Metaknowledge, thank you for moving and sorry for hurrying. --Vahag (talk) 13:18, 10 February 2018 (UTC)
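
The mainspace part of such a rename is essentially a text substitution on language headers and list items. The Python sketch below shows only that substitution; it does not use any bot framework, the function name is hypothetical, and the patterns are assumptions about how headers, translation tables and descendants lists are usually written.

```python
import re

OLD_NAME = "Azeri"
NEW_NAME = "Azerbaijani"

# Level-2 language header, e.g. "==Azeri==".
_HEADER = re.compile(r"^==[ \t]*" + OLD_NAME + r"[ \t]*==[ \t]*$", re.MULTILINE)
# Translation-table or descendants-list item, e.g. "* Azeri: ...".
_LIST_ITEM = re.compile(r"^([*:]+[ \t]*)" + OLD_NAME + r":", re.MULTILINE)


def rename_language(wikitext):
    """Replace the old language name in headers and list items only."""
    wikitext = _HEADER.sub("==" + NEW_NAME + "==", wikitext)
    return _LIST_ITEM.sub(r"\g<1>" + NEW_NAME + ":", wikitext)
```

Restricting the match to headers and list items leaves running text that merely mentions the old name untouched.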

Need Help

Can anyone please help me in removing transliteration of Urdu, Persian & Arabic languages from Urdu Wiktionary? The imported modules there cause Urdu entries to say "transliteration needed" but that shouldn't be necessary because it's the Urdu Wiktionary. — Bukhari (Talk!) 11:55, 12 February 2018 (UTC)

Well, even on the Urdu Wiktionary, don't you think Urdu entries should have some way of showing how they're pronounced, like IPA or using vowel marking on the headword line? In any case, @Aryamanarora can probably help. —Μετάknowledgediscuss/deeds 17:38, 16 February 2018 (UTC)

Template "de-nom" at Occitan Wiktionary

The template "de-nom" at the Occitan Wiktionary looks like it could benefit from an expansion to include the other three grammatical cases. --Lo Ximiendo (talk) 13:17, 13 February 2018 (UTC)

UPDATE: I added three rows for the noun forms, but not the columns for the names of the grammatical cases, which are "nominatiu, genitiu, datiu, acusatiu" in Occitan. --Lo Ximiendo (talk) 13:38, 13 February 2018 (UTC)

Bug with /api/rest_v1...

definition 5 has this:

"Hän kysyi, mistä puhuin Samin kanssa. Sanoin, että se ei kuulu hänelle.\n

He asked what I was talking to Sam about. I told him it was none of his business.


3 "He asked what I was talking to Sam about. I told him it was none of his business."

As you can see the sentence is duped. on the other hand, has no such duplicate sentence.

What is this? I've never heard of this API. Who is developing it? DTLHS (talk) 21:40, 13 February 2018 (UTC)
!/Page_content/get_page_definition_term
  I guess I should've gone to to report this? I don't feel like making an account. If anyone would do this, then good.
@Jberkel has worked on that project. - TheDaveRoss 21:52, 13 February 2018 (UTC)
That API endpoint was added to permit dictionary lookups from within the Android Wikipedia app (documentation). I'm not involved in the development of the API, I only suggested that our templates generate some extra markup to facilitate the parsing. I'm not sure the endpoint is still maintained/developed at the moment. The discussion on phabricator has stalled in any case. – Jberkel 23:16, 13 February 2018 (UTC)
@Jberkel do you think it is worth submitting a bug? If so would you mind doing so? A little familiarity with the project goes a long way in making bugs meaningful. If you would rather not I can take a stab at it. - TheDaveRoss 13:23, 14 February 2018 (UTC)
@TheDaveRoss: – Sure, I'll do it. Jberkel 15:03, 14 February 2018 (UTC)
ticket T187430 on phab. – Jberkel 10:55, 15 February 2018 (UTC)

Listing all daughter languages

Is there any way to see a complete list of all languages that have a given language X as their ancestor? For example, both German and Yiddish have Middle High German as their immediate ancestor, while Cimbrian has Bavarian as its immediate ancestor and MHG as a more distant ancestor; is there any convenient way to see a complete list of languages that have MHG anywhere in their ancestor tree? —Mahāgaja (formerly Angr) · talk 15:36, 14 February 2018 (UTC)

Here (this is not my work). --Per utramque cavernam (talk) 15:43, 14 February 2018 (UTC)
Awesome, thanks! (And thanks, JohnC5, too!) —Mahāgaja (formerly Angr) · talk 15:50, 14 February 2018 (UTC)
@Mahagaja: Thank @Erutuon. He's the one who has been cleaning it up recently. We're hoping to integrate it into {{langcatboiler}} at some point so that it is easily available. —*i̯óh₁nC[5] 20:42, 14 February 2018 (UTC)
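
The question amounts to walking the ancestor data in reverse. A minimal Python sketch, using only the languages named in the question as data (the dictionary layout and function name are assumptions, not how the linked page is built):

```python
# Immediate-ancestor data taken from the example in the question; a real
# run would read this from the language data modules instead.
ANCESTORS = {
    "German": ["Middle High German"],
    "Yiddish": ["Middle High German"],
    "Bavarian": ["Middle High German"],
    "Cimbrian": ["Bavarian"],
}


def descendants(language, ancestors=ANCESTORS):
    """Return every language with `language` anywhere in its ancestor chain."""
    found = set()
    frontier = [language]
    while frontier:
        current = frontier.pop()
        for child, parents in ancestors.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return sorted(found)


print(descendants("Middle High German"))
# ['Bavarian', 'Cimbrian', 'German', 'Yiddish']
```

Cimbrian is found through Bavarian even though Middle High German is not its immediate ancestor.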

Two competing categories

Hi all,

I have encountered these two entities:

Category:Bashkir terms borrowed from Arabic

Category:Bashkir terms derived from Arabic

I am surprised to see these two are separate entities rather than one. Also, note that the two lists are different.

Ideally, these two should be merged, and only one should be kept. I need somebody technical to help me with this. Borovi4ok (talk) 09:02, 15 February 2018 (UTC)

No, "derived terms" categories (for all languages) include all forms of derivation, including inheritance and borrowing, whereas "borrowed terms" are only for terms borrowed directly. Consider e.g. French terms derived from Latin, which includes borrowed terms and a large inherited vocabulary. But the distinction also holds for Bashkir; a word borrowed into Bashkir from, say, English, which borrowed it from French, which inherited it from Latin, which borrowed it from Arabic, is thus a Bashkir word which is ultimately derived from Arabic, but it's not a Bashkir word borrowed from Arabic. (The category boilerplate text could be expanded to explain this better, IMO.) - -sche (discuss) 09:46, 15 February 2018 (UTC)
OK thanx,
I will sort it out manually then. Borovi4ok (talk) 12:41, 15 February 2018 (UTC)
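
The distinction explained above can be written down as a rule: every language in the etymological chain yields a "derived from" category, but only a direct borrowing yields a "borrowed from" category. A small Python sketch using the Bashkir chain from the explanation (the function name and data layout are assumptions, not the etymology templates' actual logic):

```python
def etymology_categories(language, chain):
    """chain: list of (source_language, relation) pairs, nearest source first."""
    categories = []
    for index, (source, relation) in enumerate(chain):
        categories.append(f"{language} terms derived from {source}")
        # Only the immediate source can yield a "borrowed from" category.
        if index == 0 and relation == "borrowed":
            categories.append(f"{language} terms borrowed from {source}")
    return categories


# The hypothetical word from the explanation: Bashkir < English < French < Latin < Arabic.
chain = [
    ("English", "borrowed"),
    ("French", "borrowed"),
    ("Latin", "inherited"),
    ("Arabic", "borrowed"),
]
```

For this chain the word lands in "Bashkir terms derived from Arabic" but not in "Bashkir terms borrowed from Arabic", which is why the two category lists differ.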

Can WT:ACCEL include English gerunds for -ing constructions?

I'm struggling to think of any English verb which couldn't be conjugated in the present participle as fooing but which also isn't a gerund-style noun at the same time. The Accel gadget is too intricate for me to tinker with it, so can someone please add the gerund forms to it? —Justin (koavf)TCM 17:56, 15 February 2018 (UTC)

Adding wikidata ids to Module:data/languages

I'm proposing the addition of another field to {{Module:languages}} data which is the Wikidata item for that language. This would supersede the wikipedia_article property, since these links could easily be generated from the Wikidata item id. – Jberkel 09:54, 16 February 2018 (UTC)

I support that idea. It will have to be done with great care, though, as some of our languages may not map as intuitively as you'd expect to Wikidata items, and many won't have items at all. —Μετάknowledgediscuss/deeds 17:35, 16 February 2018 (UTC)
Can you generate a preliminary list of mappings so that it can be reviewed? DTLHS (talk) 17:46, 16 February 2018 (UTC)
@DTLHS: raw data: {{Module:User:Jberkel/languages}}, matched table: {{User:Jberkel/languages}}, unmatched table: {{User:Jberkel/languages/unmatched}}, ambiguous: {{User:Jberkel/languages/ambiguous}}. The matched table is just a sample, since lua runtime constraints prevent the full render. Matching is done via ISO 639-X codes. – Jberkel 22:22, 16 February 2018 (UTC)
I've done a test edit in {{Module:languages/data3/a}}, Special:Diff/49017034. Please take a look, the entries I checked were all correct. If there are no objections I'll run the script on the other data modules. – Jberkel 21:28, 18 February 2018 (UTC)
Looks good. @DTLHS, in case he wants to take a look. —Μετάknowledgediscuss/deeds 22:19, 18 February 2018 (UTC)
@Jberkel, Metaknowledge, Erutuon Look like you boys broke some stuff. I'm getting Lua error in Module:languages/data3/p at line 1252: attempt to call global 'wikidata_item' (a nil value) from {{desc|psu|𑀡𑀻𑀟}}. --Victar (talk) 16:17, 19 February 2018 (UTC)
@Victar: sorry should be fixed now. i've done a couple of manual edits and a typo snuck in. – Jberkel 16:23, 19 February 2018 (UTC)
Thanks. --Victar (talk) 16:27, 19 February 2018 (UTC)
@Jberkel Hi, pmh is still broken: can't use the {{der}} or {{inh}} tags. -- माधवपंडित (talk) 17:00, 19 February 2018 (UTC)
@माधवपंडित: maybe caching? can you post a link to a broken entry? – Jberkel 17:09, 19 February 2018 (UTC)
@Jberkel: रित्तें and अस्वल show this error but others don't. I'll clear cache and re-check. -- माधवपंडित (talk) 17:14, 19 February 2018 (UTC)
Odd, the error persists on these entries but entries like हांव are intact. -- माधवपंडित (talk) 17:19, 19 February 2018 (UTC)
@माधवपंडित: I don't see errors in the entries you linked to. – Jberkel 17:53, 19 February 2018 (UTC)
@Jberkel: Yeah, it's fine now. -- माधवपंडित (talk) 01:03, 20 February 2018 (UTC)

RTL reconstructions

Is there some way to fix the spacing problem in RTL reconstructions, like Avestan: *𐬛𐬁𐬥𐬀 (*dāna)? I'm wondering if the solution is to have a |recon= param. @Erutuon --Victar (talk) 20:14, 19 February 2018 (UTC)

@Victar: What is the problem? It looks fine to me. I see an asterisk and an Avestan word, right-to-left, followed by a space and transliteration in parentheses, left-to-right. — Eru·tuon 20:46, 19 February 2018 (UTC)
@Erutuon: Compare Avestan: *𐬛𐬁𐬥𐬀 (*dāna) to Avestan: 𐬛𐬁𐬥𐬀 (dāna). Note the space. --Victar (talk) 20:59, 19 February 2018 (UTC)
@Victar: Still not seeing it. Perhaps you could post a screenshot of the issue? — Eru·tuon 21:03, 19 February 2018 (UTC)
I sometimes have this issue as well, though not here. But at अज़दहा (azadhā), I'm seeing this: [screenshot: Avestan spacing issue]. --Per utramque cavernam (talk) 21:18, 19 February 2018 (UTC)
@Erutuon: --Victar (talk) 21:26, 19 February 2018 (UTC)
@Victar: Wow. So you are seeing several spaces between the reconstructed Avestan and the opening bracket, while I am seeing just one. I am using Firefox Quantum 59, but I just viewed this page in Chrome 64 and saw this spacing problem. It seems to be related to the unicode-bidi: embed; CSS property that is assigned to Avestan in MediaWiki:Common.css. If I remove that property in the developer tools (right-click on the text and click "Inspect"), the text displays with only one space, but the asterisk is then on the left side. In fact, when I switch between the different unicode-bidi property values, the property values that put the asterisk on the right (where it should be) also have the spacing problem. That's got to be a bug. Something about including an asterisk in right-to-left text is screwing things up. — Eru·tuon 21:54, 19 February 2018 (UTC)
@Erutuon: Good to know it's not broken cross-platform. What if we filtered out the asterisk from the Avestan text and then add it back with CSS, something like .Avst::after { content: "*" }? --Victar (talk) 22:17, 19 February 2018 (UTC)
@Victar: Interesting idea. I tried it in the developer tools (through JavaScript) and it does work, though I had to modify the selector to .Avst a::before to get the asterisk to display inside the link and in the correct position (on the right side). So it amounts to removing the asterisk and then adding it back. Heh. — Eru·tuon 03:06, 20 February 2018 (UTC)
@Erutuon: Hah, well, if it works! I wonder if other RTL reconstructions are afflicted by the same bug. --Victar (talk) 03:10, 20 February 2018 (UTC)

light and Lua error: not enough memory

Previous discussion of our bumping into the Lua memory limit, including a rejected phabricator request to raise that memory limit: WT:GP/2017/April § water is broken.

Starting from English etymology 2, light is full of "Lua error: not enough memory" error messages. Can anyone diagnose/fix? Tetromino (talk) 22:21, 19 February 2018 (UTC)

@Tetromino: Maybe it's because of the wikidata ids that were recently added to the language and language family data modules by @Jberkel. — Eru·tuon 22:55, 19 February 2018 (UTC)
Weird. I wouldn't expect this to consume that much more memory. Or there's something seriously wrong inside the wikidata extension. – Jberkel 23:11, 19 February 2018 (UTC)
It's probably just that there is a delicate balance with the memory that the modules use- many pages are right on the edge and any addition can put them over. DTLHS (talk) 23:18, 19 February 2018 (UTC)
I removed the sitelink lookup and it still fails. If there's something seriously wrong I'd expect a lot more pages to fail. – Jberkel 23:43, 19 February 2018 (UTC)
The problem is the size of the data module, not whether anything is being done with that data. DTLHS (talk) 23:44, 19 February 2018 (UTC)
Hm, it's just a few extra bytes per language, but given the number of languages it could add up to something like 200kb, assuming that all data modules get loaded. – Jberkel 23:50, 19 February 2018 (UTC)
One solution could be to mirror the language data into another module with only the wikidata IDs mapped to our language codes. DTLHS (talk) 23:52, 19 February 2018 (UTC)
@Jberkel: The total memory is probably more than 200 KB. I don't entirely understand how Scribunto memory works, but this Lua function is an attempt to get a handle on how much memory the new Wikidata items might take up. World of Warcraft wiki says that each table index not in the array part of the table takes up 40 bytes, plus the bytes taken up by the value. And apparently each string uses 24 bytes along with its byte length. So 7447 Wikidata items times 40 bytes = 297,880 bytes; the total of the bytes in each of the strings is 56,306 bytes; then 24 bytes times 7447 strings = 178,728 bytes. Total of all of those, 532,914 bytes. And if any tables had to be expanded to the next larger size (a power of two), that added memory too. So assuming this all is correct, more than 500 KB has been added by the recent edits on any page where all the language data modules are transcluded, even when not considering the memory used by mw.loadData when it wraps the data modules, and by the new getWikidataItem function, and so on. — Eru·tuon 00:05, 20 February 2018 (UTC)
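The arithmetic in the comment above can be re-run as a small sketch. The per-item costs (40 bytes per hash slot, 24 bytes per string header) are the figures quoted from the World of Warcraft wiki, not measurements taken on Scribunto, and the function and constant names are made up for illustration.

```lua
-- Back-of-the-envelope estimate only; the constants are the figures quoted
-- in the discussion, not values measured on Scribunto.
local HASH_SLOT_BYTES = 40      -- assumed cost of one non-array table entry
local STRING_HEADER_BYTES = 24  -- assumed fixed overhead per string value

-- item_count: number of string values stored under non-array keys
-- total_string_bytes: sum of the byte lengths of those strings
local function estimate_added_bytes(item_count, total_string_bytes)
    local slots = item_count * HASH_SLOT_BYTES
    local headers = item_count * STRING_HEADER_BYTES
    return slots + total_string_bytes + headers
end

local estimate = estimate_added_bytes(7447, 56306)
print(estimate) -- 532914
```

The result ignores table resizing and the mw.loadData wrappers, so it is probably a lower bound rather than a prediction.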
FYI, there are actually errors on about 30 high-volume pages (CAT:E). —*i̯óh₁n̥C[5] 03:42, 20 February 2018 (UTC)
Yesterday I added a bunch of words from CAT:E into the pagename blacklist in the source of Template:redlink category, and it did help. But today it looks like a bunch of those pages are back in CAT:E and more. FWIW. —Internoob 04:52, 20 February 2018 (UTC)
  • @Jberkel: Given the scope of the memory errors that are being produced and the very limited usefulness of the Wikidata IDs, I would like you to undo your changes to the modules for now. (We can keep the field for the IDs, just not use them.) —Μετάknowledgediscuss/deeds 05:27, 20 February 2018 (UTC)
    So, comment out the Wikidata IDs rather than undoing/entirely removing them? (Seems sensible, whether it's what you're suggesting or not.) We still need to address the pre-existing problems of our entries using Lua for so much, e.g. auto-transliteration and the redlink finder, of course. If more memory is used the more codes there are, memory usage will continue to go up for that reason, too, because we are always adding more codes... - -sche (discuss) 06:26, 20 February 2018 (UTC)
    Unfortunately, our project is well suited to automation, and I feel that this issue will continuously be coming up. As Meta has mentioned, it seems like we should ask for a software solution to this problem from the devs, whether that be increasing the memory limit or having them streamline some of our base processes (though I don't know how that might work). —*i̯óh₁n̥C[5] 06:32, 20 February 2018 (UTC)
    Commenting them out would be perfect, yes. We should get the ball rolling with a Phabricator ticket, perhaps, but I'm not the right person to write one. —Μετάknowledgediscuss/deeds 06:44, 20 February 2018 (UTC)
    Le sigh. Ok, I'll undo my changes. I was initially thinking about splitting the data modules into smaller pieces (data/a/a1, /a/a2 etc.) but there will always be some outlier pages which transclude everything, and more pieces also means more overhead (and inconvenience for editors). Another solution could be to increase the memory limit exceptionally for a few high traffic pages (but how would that be set up?). In any case we need to find a "proper" solution soon. We also need better tools for profiling memory usage. I'll start a ticket on phabricator to get some ideas. – Jberkel 07:55, 20 February 2018 (UTC)
    @Jberkel: To be fair, if every language is going to have canonical name and wikidata code, couldn't you put those in indices [1] and [2] to save a lot of memory in the language modules? —*i̯óh₁n̥C[5] 08:23, 20 February 2018 (UTC)
    Indeed, we're approaching full coverage of the family parameter, so it might make sense to put that in [3] and just assign an "uncategorized" family to those yet to be added. I believe I'm right in thinking that the array memory is much more efficient if used at the declaration time of the table than the hash table, right? —*i̯óh₁n̥C[5] 08:33, 20 February 2018 (UTC)
    @JohnC5: Sorry, I don't follow. I don't see how an extra index would save memory here. As Erutuon has indicated, the storage requirements for strings are around 24+length * number of instances. It's difficult to get below this baseline. The table keys should be handled by Lua's string interning and only count once. I'll verify this though to be sure. – Jberkel 09:28, 20 February 2018 (UTC)
    @Jberkel: I'm saying that instead of putting the values of canonicalName, wikidata_item, family under those named entries (i.e. in the table's underlying hashtable), put them as entries [1], [2], [3] of the table's underlying array. For instance, convert:
    ["zaa"] = {
    canonicalName = "Sierra de Juárez Zapotec",
    otherNames = {"Ixtlán Zapotec", "Atepec"},
    scripts = {"Latn"},
    family = "omq-zap",
    wikidata_item = "Q12953989",
    }
    to:
    ["zaa"] = {
    "Sierra de Juárez Zapotec",
    "Q12953989",
    "omq-zap",
    otherNames = {"Ixtlán Zapotec", "Atepec"},
    scripts = {"Latn"},
    }
    This will mean that the table creation is much more efficient for these mandatory entries as well as the lookups and will save memory in that way. —*i̯óh₁n̥C[5] 09:43, 20 February 2018 (UTC)
    Ah, ok I misread [1] as missing wiki references, not indexes :). Yes, this should save 3 * 40 bytes (string keys) - 32 bytes (3 int keys) = 88 bytes per entry? I can't believe it's 2018 and we're discussing byte-level optimisations :) – Jberkel 10:08, 20 February 2018 (UTC)
    That is a good idea. Another idea is to share script arrays between languages, particularly for {"Latn"}, which is used more than 3000 times (see the "script combinations" table in User:Erutuon/language stuff). That is, define local Latn = {"Latn"} at the top and use that in each applicable data table on the page. mw.loadData is clever enough to cache only one copy of the table then. That would in theory save 40 + 16 bytes for every "Latn" script table after the first, plus about 24 + 4 bytes for the string (84 bytes?). I tried it in one module, but didn't notice any effect. I suppose it would save even more, at least in the data module, to use a string, { --[[...]] scripts = "Latn", --[[...]] }, instead of an array, but the functions relying on the scripts item would have to be modified. — Eru·tuon 10:47, 20 February 2018 (UTC)
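A minimal sketch of the shared-table idea, assuming a simplified data module with only a name and a scripts field per language. It shows only that the languages end up pointing at one and the same table object; it does not measure memory.

```lua
-- Every Latin-script language refers to the same table instead of
-- constructing its own {"Latn"}.
local Latn = { "Latn" }

local m = {}
m["en"] = { "English", scripts = Latn }
m["fr"] = { "French", scripts = Latn }
-- For comparison: a fresh table with equal contents is a distinct object.
m["de"] = { "German", scripts = { "Latn" } }

-- rawequal compares table identity, ignoring any metamethods.
local shared = rawequal(m["en"].scripts, m["fr"].scripts)
local separate = not rawequal(m["en"].scripts, m["de"].scripts)
print(shared, separate)
```

Both values print as true: the first two entries share one table, while the third pays for a table of its own.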
    @Erutuon: Even better, assume script Latn as default avoiding to define it to each language. --Vriullop (talk) 18:49, 20 February 2018 (UTC)
    That seems mostly sensible. Latin is by far the most used script, especially for the obscurer lects where no script is yet specified. However, I'd like if we could make a list of languages which don't currently have a script set, before we make Latn the default, so we know which languages we need to check the script of. (Or, add an "undetermined" script code to those languages, which can be converted to specific script codes at leisure.) - -sche (discuss) 19:31, 20 February 2018 (UTC)
    @-sche: If you look at the "script combinations" table in User:Erutuon/language stuff, languages with no script are in the None row; there are 3718 of them at the moment. If you sort by the "languages" column, you will see they are the largest group, larger than Latn. — Eru·tuon 20:23, 20 February 2018 (UTC)
    Good point; I meant that Latin is the most used script in the world [by number of languages using it], but in our modules, there are still a lot of gaps. But I've filled in a bunch of those gaps; Latin is used by more than half of all the lects we have codes for. But [it occurs to me to back up and ask] would it actually save us any memory to treat Latn as the default, or would the same amount of memory still be used just by the check that would be performed to see whether or not a script was set for a particular language? - -sche (discuss) 06:20, 21 February 2018 (UTC)
    @-sche: Right, I was mainly just pointing to the page. It would save some memory to leave out {"Latn"} in the tables. I guess at least 96 bytes is used per instance (based on the World of Warcraft wiki explanation, ignoring Scribunto-specific stuff), if a local Latn variable is not being shared between the tables, which would come to a few hundred kilobytes if all the data modules are being transcluded. By contrast, it's cheap to check for the presence of the "scripts" item in a language's data table: you just check whether data_table.scripts is nil. I wonder if there are languages that need to have their script specified as None? I guess I can't see why. — Eru·tuon 07:40, 21 February 2018 (UTC)
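The nil check mentioned above might look roughly like this. The names DEFAULT_SCRIPTS and get_scripts are hypothetical, and the real accessor in Module:languages may be organised differently.

```lua
-- Hypothetical default, created once and returned for every language
-- whose data table has no "scripts" item.
local DEFAULT_SCRIPTS = { "Latn" }

-- Returns the scripts listed for a language, or the default when none is set.
local function get_scripts(lang_data)
    if lang_data.scripts == nil then
        return DEFAULT_SCRIPTS
    end
    return lang_data.scripts
end

local english = { "English" }                    -- no scripts item: falls back
local hindi = { "Hindi", scripts = { "Deva" } }  -- explicit scripts are kept

print(get_scripts(english)[1], get_scripts(hindi)[1])
```

The check costs one table lookup per call and allocates nothing, which is why it is cheap compared with storing {"Latn"} thousands of times.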
    @-sche: I created a list of languages without scripts at User:Erutuon/languages with no scripts, as I realized that's what you were actually asking for. 21:12, 21 February 2018 (UTC)
    Thanks! Right now, while I'm just adding scripts to the modules, the list doesn't offer much advantage over just noticing which languages have no script set (unless it's of help to someone fulfilling the idea I suggested a few threads down for adding missing scripts). My point is that it would be necessary (or at least, helpful) to save or subst: a copy prior to any switch to not declaring Latn at all and assuming that languages with no script specified can be assumed to be written in Latn (a fine assumption, but one we'll want to fix the edge cases of). (I've saved a copy now.) - -sche (discuss) 15:04, 22 February 2018 (UTC)
    From the perspective of the module, I guess there's probably no advantage to specifying "None" over assuming "Latn". But from the perspective of people trying to go through and ensure that languages with identifiable scripts have those scripts specified (most are Latin, but in a few cases the script has been Deva, or Ethi, or Thai), if we switch to specifying no script when the script is Latn, it would be good to know which languages have no script specified because the script is known to be Latn, vs which have no script specified because the script is not known. Perhaps this could be accomplished by first adding a commented-out scripts = {"None"} or script unknown to languages with no script specified, so the module doesn't have to spend any time processing that "script", but humans can still see while editing the module which languages we still need to track down script info for. - -sche (discuss) 16:06, 21 February 2018 (UTC)
    That's a good general principle, especially for a wiki that requires elapsed-time-consuming research. We need more allowance for work in process at a highly granular level. I don't really need to get a red Lua message for typing "g=f?, m". I need to have an acceptable entry to which I can come back when I have more information or am working on that class of problem. DCDuring (talk) 17:02, 21 February 2018 (UTC)
  • @Erutuon: Well, mine will require a script change as well. The transition for mine would also be fairly easy: change the accessors to check the positional params as well as the hashtable during the transition period, then remove the check in the hashtable after the transition is over. Could you possibly get some statistics on your page concerning how many languages don't have family params? Thanks! —*i̯óh₁n̥C[5] 11:01, 20 February 2018 (UTC)
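The transition described in the comment above could be sketched as follows. The POSITION table and get_field are hypothetical names; the positions follow the [1], [2], [3] assignment proposed earlier in the thread.

```lua
-- Hypothetical accessor for the transition period: prefer the positional
-- slot, fall back to the old named field if the slot is empty.
local POSITION = { canonicalName = 1, wikidata_item = 2, family = 3 }

local function get_field(lang_data, name)
    local index = POSITION[name]
    if index ~= nil and lang_data[index] ~= nil then
        return lang_data[index]
    end
    return lang_data[name]
end

-- Old format: named fields only.
local old_style = { canonicalName = "Sierra de Juárez Zapotec", family = "omq-zap" }
-- New format: the three mandatory values in positions 1 to 3.
local new_style = { "Sierra de Juárez Zapotec", "Q12953989", "omq-zap" }

print(get_field(old_style, "family"), get_field(new_style, "family"))
```

Once every data module uses the positional form, the final `return lang_data[name]` fallback would only be needed for the fields that stay named.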
    @JohnC5: you can see it at {{Module:sandbox}}: 8031 - 5778 = 2253. I'll change the data modules to use indexes plus inline the Latn scripts. – Jberkel 12:30, 20 February 2018 (UTC)
    @JohnC5: I've added a table of the total number of languages and the number that has each data item (with notes on what the numerical indices represent). — Eru·tuon 20:41, 20 February 2018 (UTC)
I am wondering if, at some point in the near future, we can all agree that the concept and execution of the languages module is just not going to work and try and come up with some novel solutions. The current process of making a change, breaking a bunch of things, then trying to scale back changes until nothing quite breaks is not what I would call an optimal design paradigm. If we want to persist in using the current solution, I then propose that we mandate any changes made will demonstrably *not* break content which is currently unbroken. - TheDaveRoss 18:26, 20 February 2018 (UTC)
You're looking at novel solutions right above your comment. There's no optimal design paradigm possible when we don't have control over how it all works, and we don't even fully understand how memory is allocated. (And that's also why your mandate would not be feasible, because it's hard to demonstrate without trying it first.) —Μετάknowledgediscuss/deeds 18:36, 20 February 2018 (UTC)
Tweaks to the existing design do not qualify in my book, moving from a large flat-file format which needs to be read in its entirety during every invocation to almost anything else would be a marked improvement. Perhaps restructuring the module so that it can read a small page specific to the language code rather than reading a large module with all language data. Perhaps figuring out how to migrate to Wikidata and leveraging an actual structured database. Perhaps something else entirely. - TheDaveRoss 21:20, 20 February 2018 (UTC)
@TheDaveRoss: Yes, we could (and should) make better use of Wikidata. That's why I wanted to incorporate ids in our database. Things like language script data already exists in Wikidata. So in the long term our (reusable) data should be stored there, not in big lua chunks. – Jberkel 02:07, 21 February 2018 (UTC)
Note that arbitrary access to Wikidata is an expensive function. See mw:Extension:Wikibase Client/Lua function mw.wikibase.getEntity. I'm afraid it will not be an alternative for intensive uses. --Vriullop (talk) 10:47, 21 February 2018 (UTC)
@Vriullop: that's the case for this particular function call, since it loads all the data. However it's also possible to only query the fields needed which is much cheaper. – Jberkel 10:51, 21 February 2018 (UTC)
Actually, perhaps that would be a great first step. If every language had its own data module it would reduce the amount read tremendously. Why does the data need to be in such large chunks? It would be easier to maintain if it were in discrete pages as well. A bot could probably generate all of the submodules in minutes, without a disruption in the existing structure. Then we would only have to update the data module lookup function and the rest should remain functional as is. - TheDaveRoss 21:46, 20 February 2018 (UTC)
I suspect that having to load thousands of individual modules would not be a performance improvement over having to load a single module (or 26 modules as we do now). DTLHS (talk) 22:04, 20 February 2018 (UTC)
@TheDaveRoss: I'm not sure what you mean by "read in its entirety"; the first time mw.loadData is called on a data module, it creates a cached copy that is then used by later calls to mw.loadData. So a given data module is read only once on a page, provided it is always loaded with mw.loadData and not with require. I am curious what the memory difference would be if the data modules were split up.
There is a certain amount of overhead for each data module loaded with mw.loadData. If I'm reading the source code right, the data-wrapping function creates one table (seen) every time mw.loadData is called to map between the actual (cached) tables and the empty tables that are returned, and for each table in the data module it creates 2 tables (an empty table and the empty table's metatable) and 6 functions. Four of these functions, __index, __newindex, __pairs, __ipairs, are placed in the metatable of the virtual table and two (pairsfunc, ipairsfunc) are returned when pairs and ipairs are called on the empty table returned by mw.loadData. (Whew, it actually re-wraps the data every time the function is called, so these tables and functions are duplicated for every invocation! That's got to be a major contributor to our memory problems, because we load data modules so many times.)
Okay, so I guess the only item that would be duplicated if the data modules are split is the seen table. [Edit:] Actually, only the top level of a data module is wrapped. Subtables are wrapped only if they are visited by indexing. (For instance, mw.loadData("Module:languages/data2")["en"]["scripts"] wraps the top-level table, the English data table, and the English scripts table.) So if you iterate through a loaded data module that contains subtables, each of the subtables will be wrapped, and memory usage will be greater than if you load it without doing anything else. — Eru·tuon 22:29, 20 February 2018 (UTC)
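To make the lazy wrapping easier to picture, here is a much-simplified sketch of a read-only proxy. It is not Scribunto's actual code: it only implements __index and __newindex, and the names wrap and seen are borrowed from the description above for illustration.

```lua
-- Simplified read-only proxy; NOT the real mw.loadData implementation.
-- "seen" maps each real table to the empty proxy that stands in for it.
local function wrap(data, seen)
    if seen[data] then
        return seen[data]
    end
    local proxy = {}
    seen[data] = proxy
    setmetatable(proxy, {
        __index = function(_, key)
            local value = data[key]
            if type(value) == "table" then
                return wrap(value, seen) -- subtables are wrapped only when visited
            end
            return value
        end,
        __newindex = function()
            error("this table is read-only", 2)
        end,
    })
    return proxy
end

local function count(t)
    local n = 0
    for _ in pairs(t) do n = n + 1 end
    return n
end

local cached = { en = { "English", scripts = { "Latn" } }, fr = { "French" } }
local seen = {}
local data = wrap(cached, seen)
local wrapped_after_load = count(seen)  -- only the top-level table so far
local script = data.en.scripts[1]       -- visits "en" and its "scripts" table
local wrapped_after_index = count(seen)
print(wrapped_after_load, wrapped_after_index, script)
```

In this sketch one proxy exists after loading and three after indexing down to the script code, which mirrors the point that iterating over a whole data module wraps far more tables than a single lookup does.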
@Erutuon: Re performance, the reality is that we are up against an artificial performance problem: Wikimedia decided that 50 MB of Lua memory usage would be the limit whether or not some other amount would be usable without compromising actual performance (e.g. page load time, server cost). The solution, until we start hitting other performance issues, can be as simple as minimizing the use of Lua memory in favor of resources which are less restricted (processor time). Splitting the data module into a per-code format would, I completely agree, increase the overhead in terms of function calls, but since most pages contain very few languages, I suspect that on average it would reduce overall server resource consumption. Since it is very hard for us to profile the things we do on wiki, we will be mostly stuck guessing about these types of things. (edit) Also, since not every invocation returns the same table in the current format, I am curious how MW decides to optimize. - TheDaveRoss 13:13, 22 February 2018 (UTC)
Re "most pages contain very few languages": English lemmas with translations tables contain lots of languages, and the number of those is only going to increase as we become more and more complete. They are already the entries we're having trouble with. - -sche (discuss) 20:25, 22 February 2018 (UTC)
@-sche: True. However currently every page with any invocations needs to read a large data file into memory, even if it only needs one language. There will be a tipping point somewhere when the average page needs to read a sufficiently large portion of the current module, but we are VERY far from that. - TheDaveRoss 21:03, 22 February 2018 (UTC)
@TheDaveRoss: Actually, I've changed my mind; splitting up the language data modules is worth a try. It makes sense, because a given module typically uses only one or two language data tables. However, as there are 8031 language codes and there would be that many modules, it would probably be best to keep the current large modules for human editing and create a bot that would maintain the small modules. They would need to be protected and Module:documentation could display a message like "This module is generated from module x by a bot. Please edit module x instead of this one." (Heh, this would make the list of transclusions incredibly long. I wonder how many language codes are used on the pages with the most translations.) — Eru·tuon 21:34, 22 February 2018 (UTC)
I assume you're volunteering to write and maintain said bot. DTLHS (talk) 21:37, 22 February 2018 (UTC)
But would this actually help (m)any entries? We aren't having problems on entries that use only one language code, e.g. Evenki entries that never need to invoke any other language code besides Evenki, so we don't need to "fix" all those pages. We might see improvements on the few very large pages that are breaking now, but we'd be letting that tail wag the dog, in a way that would require much more upkeep (8000+ separate modules, possibly maintaining a bot to handle them,...). Our most complete pages, that transclude thousands of language codes, might still break. - -sche (discuss) 22:30, 22 February 2018 (UTC)
@DTLHS: The first step is determining if it's worth it. If so, I might consider learning bot-writing just for this purpose.
@-sche: I don't know. Maybe loading one of several large modules many times is more costly than loading many small modules with the same data, or maybe not. There is probably a way to test this without creating 8000-plus modules. — Eru·tuon 23:13, 22 February 2018 (UTC)
I was thinking of replacing the languages/dataX modules with something like languages/data/en and keeping the languages module exactly as it is. Once the module has been split into languages (perhaps by bot) it seems like it would be easier for humans to maintain the smaller, specific data files. They are easy to find (since they are just at their ISO code subpage) and they will be very small and simple. - TheDaveRoss 13:47, 23 February 2018 (UTC)
The current system has the advantage that it's easier to quickly add data to a lot of languages, e.g. paging between Wikipedia, Ethnologue and one large lettered data module at a time, I've added script data to almost a thousand languages. It's also easier to watchlist and monitor changes to a few data modules. If we split it up, it'd seem like a step backwards, to when we had templates. We would seem to need to protect not only all existing subpages (/en, /fvr, /aav-ban-pro, etc), but all nonexistent subpages of valid form (/xx, /xxx, /xxx-xxx, /xxx-xxx-xxx) against being created by vandals, that could otherwise be created and would then AFAICT be accepted by the modules without complaint. And it doesn't seem like it would help that many pages. I'm not totally opposed to it, it just seems like it has a lot of drawbacks and not such great benefits. - -sche (discuss) 15:07, 23 February 2018 (UTC)
If we didn't care that the language data modules were human readable, how much could we reduce the size? I'm thinking of something like a minifier that periodically "compiles" the human readable modules (what we have now) into something smaller. DTLHS (talk) 18:41, 20 February 2018 (UTC)
@DTLHS: One idea: concatenate all data into a string and provide another string with numerical data (printed in some non-decimal system) to indicate how to read the data. But I don't know exactly how to implement that or if it would really use less memory. — Eru·tuon 21:12, 20 February 2018 (UTC)
Language modules seem to be used more intensively in the translation tables, but translations templates only need to know the script (and transliteration?), and probably other templates only need the script as well. Smaller modules with script data could be a good approach. --Vriullop (talk) 10:38, 21 February 2018 (UTC)
@Vriullop: Ideally we would still store all the data in one place and have a mechanism to selectively load only the fields needed, sort of like a specialized view of the data. – Jberkel 10:44, 21 February 2018 (UTC)

What is the intended format for cases where a lect does not, at the time it is added to the data module, have a Wikidata ID? (This could easily be the case for some of the more obscure lects we add exceptional codes for.) A blank "",? Use the old format where the canonical name and family are named parameters/fields? - -sche (discuss) 19:28, 20 February 2018 (UTC)

@-sche: Plain nil. I've just bulk-changed the data modules, the memory errors are gone now. – Jberkel 19:42, 20 February 2018 (UTC)
@Jberkel Could you publish the script that converts to the new format? On my wiki, many language names are already translated and it is too tiring to convert manually. --Octahedron80 (talk) 20:08, 20 February 2018 (UTC)
@Octahedron80: sure, it's for Python3 + pywikibot: 01:52, 21 February 2018 (UTC)
Category:Rebracketings by language and many other categories display "Lua error in Module:languages/by_name at line 5: table index is nil" now. - -sche (discuss) 21:55, 20 February 2018 (UTC)
Now gone, although it had persisted even after null edits earlier. - -sche (discuss) 22:53, 20 February 2018 (UTC)
Would it help to deploy the "local Latn" 'hack' that Module:languages/data3/a uses to the other submodules? - -sche (discuss) 22:53, 20 February 2018 (UTC)
It probably wouldn't hurt. — Eru·tuon 00:02, 21 February 2018 (UTC)
Ok people, after cleaning up some unrelated errors and debugging, the only page remaining with an error is do. Anyone got any ideas? —*i̯óh₁n̥C[5] 08:14, 21 February 2018 (UTC)
@JohnC5: No, but I'll check if the mw.wikibase.sitelink calls can benefit from caching, not sure how much memory is allocated there. – Jberkel 11:41, 21 February 2018 (UTC)
Just a random idea: I noticed that Lua has weak tables which could be used to hold the language data. If more memory is needed some of it can be garbage collected (and later reloaded if necessary). The problem at the moment is that all language modules are loaded and never reclaimed. – Jberkel 16:27, 21 February 2018 (UTC)
@Jberkel: Unfortunately, data modules that will be loaded with mw.loadData can't be weak, because you can't add metatables to them, and I don't know if the weakness of tables actually even affects Scribunto memory usage. — Eru·tuon 20:20, 22 February 2018 (UTC)
@JohnC5: It might reduce memory to put scripts and other_names in indices 4 and 5. Those are the next most frequent items, in that order. However, going from 4 to 5 array items may enlarge the size of the array part of the table from 4 to 8; if so, leaving other_names in the hash part would be best. — Eru·tuon 22:01, 22 February 2018 (UTC)
@Erutuon, Jberkel: So last night, while doing some other work, I found what I think is a more efficient and user-friendly way of doing this. I've created Module:languages/global which contains the names of all the fields in the language data ordered by frequency, all the standard diacritics, and the common scripts. We load this into all the language modules and use it as the one source of truth. So what is now:
m["be"] = {
otherNames = {"Belorussian", "Belarusan", "Bielorussian", "Byelorussian", "Belarussian", "White Russian"},
scripts = Cyrl,
ancestors = {"orv"},
translit_module = "be-translit",
sort_key = {
from = {"Ё", "ё"},
to = {"Е" , "е"}},
entry_name = {
from = {"Ѐ", "ѐ", GRAVE, ACUTE},
to = {"Е", "е"}},
}
would become:
local g = mw.loadData("Module:languages/global")
m["be"] = {
[g.canonical_name] = "Belarusian",
[g.wikidata_item] = "Q9091",
[g.family] = "zle",
[g.other_names] = {"Belorussian", "Belarusan", "Bielorussian", "Byelorussian", "Belarussian", "White Russian"},
[g.scripts] = Cyrl,
[g.ancestors] = {"orv"},
[g.translit_module] = "be-translit",
[g.sort_key] = {
from = {"Ё", "ё"},
to = {"Е" , "е"}},
[g.entry_name] = {
from = {"Ѐ", "ѐ", g.CHARS.GRAVE, g.CHARS.ACUTE},
to = {"Е", "е"}},
}
Under the hood this would come to be stored as (with the current project-wide frequencies):
m["be"] = {
[1] = "Belarusian",
[2] = "Q9091",
[3] = "zle",
[5] = {"Belorussian", "Belarusan", "Bielorussian", "Byelorussian", "Belarussian", "White Russian"},
[4] = Cyrl,
[6] = {"orv"},
[8] = "be-translit",
[10] = {
from = {"Ё", "ё"},
to = {"Е" , "е"}},
[9] = {
from = {"Ѐ", "ѐ", g.CHARS.GRAVE, g.CHARS.ACUTE},
to = {"Е", "е"}},
}
In this case, fields 1–6 will go into the array, whereas 8–10 will go into the hashtable because [7] is omitted. However, we never iterate over these tables, and so the simplest tables will only have a few bytes' worth of storage overhead. Then when you want to get something out, you do something like:
local g = mw.loadData("Module:languages/global")
local language_name = self.__data[g.canonical_name]
There will be a bit more lookup overhead, but it will always be O(1). This system also means that if one field becomes more common than another, all we need to do is change the order in Module:languages/global to rebalance the entire project. What do you think? —*i̯óh₁n̥C[5] 22:52, 22 February 2018 (UTC)
Also, @-sche, DTLHS. —*i̯óh₁n̥C[5] 23:45, 22 February 2018 (UTC)
@JohnC5: Lua Performance Tips mentions "If you write something like {[1] = true, [2] = true, [3] = true}, however, Lua is not smart enough to detect that the given expressions (literal numbers, in this case) describe array indices, so it creates a table with four slots in its hash part, wasting memory and CPU time." I'll have a look at the implementation, it's still not clear to me how it decides between array/hash parts. – Jberkel 08:21, 23 February 2018 (UTC)
@Jberkel: I'm not sure why it says that since it's definitely not true. If you look at Module:User:JohnC5/Sandbox3, you can see that the first 3 elements which are inserted in the table under indices 1, 2, and 3 get printed out by ipairs, which only prints from the array. The object at index 5 gets put in the hashtable because it is nonconsecutive. Note also that the order in which the indices are entered is not relevant, as the compiler will still recognize that 2, 1, 3 is actually 1 to 3 consecutively. Perhaps those tips come from before Lua 5.1, when they souped up the constructor for the tables? Does this make sense? —*i̯óh₁n̥C[5] 08:40, 23 February 2018 (UTC)
@Jberkel: Looking more carefully now that I've made some changes to my test module, the behavior is weirdly more robust than I expected. All the tests I know for checking the size of the array (#a, ipairs(a), and table.getn(a)) point to my being correct, but I'm startled by these results. —*i̯óh₁n̥C[5] 08:58, 23 February 2018 (UTC)
@Jberkel: I take it back. After some fiddling around with memory stuff, these functions are just clever, but they are not being put in the array. Lemme think on this for a bit. —*i̯óh₁n̥C[5] 09:18, 23 February 2018 (UTC)
@Jberkel: Damn, it won't work. I tried a bunch of things, but we'd just have to hard code them in order. Damn Lua for being the worst. —*i̯óh₁n̥C[5] 10:50, 23 February 2018 (UTC)
@JohnC5: It seems that the length operator first looks at the array part, then looks in the hash part. In the latter case, it finds the largest power of 2 i such that t[i] isn't nil, then does the search for an i less than that where t[i + 1] is nil and t[i] isn't. (So it returns the wrong result if a power of two is empty: x = { [1] = true, [3] = true, [4] = true, [5] = true } assert(#x == 1).) table.getn does some other stuff that I don't understand, but if that fails, it calls the # operator. 21:48, 23 February 2018 (UTC)
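A short sketch of why the ipairs test discussed above could not settle the question: ipairs simply reads t[1], t[2], and so on until it hits nil, so it gives the same answer whether or not the values live in the array part. The helper name count_with_ipairs is made up for illustration.

```lua
-- Two constructors with the same keys and values. According to the Lua
-- Performance Tips passage quoted above, only the first is certain to
-- fill the array part.
local positional = { "a", "b", "c" }
local keyed = { [1] = "a", [2] = "b", [3] = "c" }

-- Counts how many elements ipairs visits before reaching nil.
local function count_with_ipairs(t)
    local n = 0
    for _ in ipairs(t) do
        n = n + 1
    end
    return n
end

print(count_with_ipairs(positional), count_with_ipairs(keyed))
```

Both counts are 3, so this kind of test says nothing about where Lua stored the values; distinguishing the two needs memory measurements, as in the later comments.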

@JohnC5, Jberkel: A way to use numerical indices would be to preprocess the data before outputting it: replacing string keys with numbers. "scripts" could be replaced with 4, "otherNames" with 5, and so on. Because the modules are loaded into memory once on a page, this processing would also be done only once. Unfortunately, it would confuse people that the exported table didn't match the table in the module (as would the previous idea). — Eru·tuon 21:55, 14 March 2018 (UTC)
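
A rough sketch of that preprocessing step, written in Python rather than Lua purely for illustration. Only the two key-to-index pairs ("scripts" to 4, "otherNames" to 5) come from the comment above; the function name, the sample entry, and the choice to leave unknown keys untouched are assumptions.

```python
# Key-to-index pairs named in the comment above; any further pairs
# would have to be agreed on separately.
KEY_TO_INDEX = {"scripts": 4, "otherNames": 5}


def compact_keys(entry):
    """Return a copy of `entry` with known string keys replaced by numbers."""
    return {KEY_TO_INDEX.get(key, key): value for key, value in entry.items()}


# Hypothetical language-data entry, for illustration only.
entry = {1: "Example", "scripts": ["Latn"], "otherNames": ["Sample"]}
print(compact_keys(entry))  # {1: 'Example', 4: ['Latn'], 5: ['Sample']}
```

The mismatch between this output and the table as written in the module is exactly the confusion mentioned above.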

@Erutuon: Yes, I think this will be a maintenance nightmare. My call for profiling help on phabricator didn't go anywhere unfortunately. And setting up an instance to do profiling locally seems to be a lot of work. Ideally there would be a sandbox instance with extra debugging and profiling enabled. – Jberkel 23:45, 14 March 2018 (UTC)

Translation template error

After using "Edit source" in a translation section (trans-top template) and returning from the editor via "Publish changes", all translation sections are missing the [show ⏷] button on the right side to unfold them (and also the ± sign to edit the header). It is necessary to refresh the page afterwards to get back to normal operation of the template. With thanks and regards, Peter 10:17, 20 February 2018 (UTC)

@Peter K. Livingston: I've experienced that as well, when using the AjaxEdit script. The "show" buttons and "±" sign are powered by JavaScript scripts, and I guess the scripts don't reload when "Publish changes" is pressed. Unfortunately I don't know how to fix this. — Eru·tuon 21:13, 20 February 2018 (UTC)
@Peter K. Livingston: What should we do to find somebody who is qualified to solve this issue? Peter 22:17, 20 February 2018 (UTC)
@Peter K. Livingston: @Dixtosa might be able to help. He knows JavaScript better than I do. — Eru·tuon 21:53, 23 February 2018 (UTC)
I am afraid we can't do anything about it that is not a hack. Loading Common.js manually solves show/hide problems. --Dixtosa (talk) 08:21, 24 February 2018 (UTC)

Red links not turning blue

I have had this problem for a couple of days where red links don't turn blue straight away when an entry is done, say for an inflection. I'm not sure whether it's just happening to me, or whether anyone else has noticed it. It can be rectified by doing a null edit, but this shouldn't be necessary. DonnanZ (talk) 19:25, 20 February 2018 (UTC)

I've noticed it as well; when I created buck-hoist as an alt form of buck hoist before I created buck hoist itself, the link from buck-hoist to buck hoist stayed red until I did a null edit. It might have something to do with all the changes to the language modules filling up the job queue(?). - -sche (discuss) 19:33, 20 February 2018 (UTC)
Is 10K a big number for jobs? DCDuring (talk) 21:04, 20 February 2018 (UTC)
The situation has improved somewhat, but still a few seconds slow on occasion. DonnanZ (talk) 21:14, 20 February 2018 (UTC)

"Lemma" categories and non-morphemic sinograms

@Wyang, Justinrleung, Suzukaze-c I just noticed the page 鳺. This Chinese character is obviously not a lemma (at least in Chinese). But this page is currently categorized into Translingual lemmas, Middle Chinese lemmas, Old Chinese lemmas, Chinese lemmas, and Mandarin lemmas. What should we do? Dokurrat (talk) 20:15, 20 February 2018 (UTC)

The lemma/non-lemma distinction is useless for Chinese, since there is no non-lemma form in Chinese by default. I think we should leave it as it is, since the "lemma" categories effectively function as a catch-all place for the words that one would find in a traditional dictionary, which is what 鳺 would belong to. Wyang (talk) 23:06, 20 February 2018 (UTC)

Arabic etymology

Is there any way I can pull out all the Arabic word entries in Wiktionary that contain etymological info, please? —This unsigned comment was added by Rdurkan (talk • contribs).

The best I can suggest is ploughing your way through Category:Arabic lemmas, which is not terribly helpful. DonnanZ (talk) 10:34, 21 February 2018 (UTC)
@Rdurkan You can start with this list and extract the relevant sections from their contents using some kind of script, such as one from Pywikibot, or you can write your own script to extract from links which use &action=raw as a parameter to index.php (e.g. this example). Wyang (talk) 13:21, 21 February 2018 (UTC)
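
A minimal sketch of the "write your own script" route, assuming the usual index.php?title=...&action=raw URL form and the standard entry layout with a level-2 ==Arabic== heading and ===Etymology=== subsections. The helper names are made up, the sample wikitext is invented for the demonstration, and no request is actually sent here.

```python
import re
from urllib.parse import urlencode

BASE = "https://en.wiktionary.org/w/index.php"


def raw_url(title):
    """Build the URL that returns a page's raw wikitext."""
    return BASE + "?" + urlencode({"title": title, "action": "raw"})


def language_section(wikitext, language):
    """Return the text under the level-2 heading for `language`, or ''."""
    # Only "==Name==" lines match; "===Name===" cannot, because the
    # captured heading text may not contain '='.
    parts = re.split(r"^==[ \t]*([^=\n]+?)[ \t]*==[ \t]*$", wikitext, flags=re.M)
    # parts is [preamble, name1, body1, name2, body2, ...]
    for name, body in zip(parts[1::2], parts[2::2]):
        if name == language:
            return body
    return ""


def etymology_sections(section):
    """Return the bodies of ===Etymology=== / ===Etymology N=== subsections."""
    pattern = r"^===[ \t]*Etymology(?: \d+)?[ \t]*===[ \t]*\n(.*?)(?=^==|\Z)"
    return [m.strip() for m in re.findall(pattern, section, flags=re.M | re.S)]


# Invented sample text; a real run would fetch raw_url(title) instead.
sample = """==Arabic==

===Etymology===
Sample Arabic etymology.

===Noun===
Sample headword line.

==Persian==

===Etymology===
Sample Persian etymology.
"""

print(raw_url("water"))
print(etymology_sections(language_section(sample, "Arabic")))
```

Entries with no etymology simply yield an empty list, so filtering the list of Arabic lemmas down to those with etymological info is a matter of keeping the titles with a non-empty result.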


On MediaWiki_talk:Recentchangestext, there is a request to add a link to the Urdu version, but (a) the link is not of the same format as all the rest of the links (which use "foo:Special:Recentchanges" and rely on the site software to redirect to the local name of the page), and (b) I would imagine every wiki has a Recentchanges page, right? So I wonder if there are some criteria for deciding which languages to add interwiki links to, and whether Urdu meets those criteria. - -sche (discuss) 05:37, 21 February 2018 (UTC)

Huh? I see no difference in the format of the link. As for your (b), I don't think we have any criteria, but it would be sensible to choose a cutoff of article count, and limit it to those wikis. —Μετάknowledgediscuss/deeds 05:42, 21 February 2018 (UTC)
The request is to add [[ur:خاص:حالیہ تبدیلیاں]] (the Urdu-language name of the page), whereas the link to e.g. Arabic is not to [[خاص:أحدث_التغييرات]] but rather to [[ar:special:recentchanges|ar]] which then resolves to [[خاص:أحدث_التغييرات]]. - -sche (discuss) 05:51, 21 February 2018 (UTC)
Oh, I see. They both end up at the same place, but I guess you're right that we should standardise with the easier one. —Μετάknowledgediscuss/deeds 06:06, 21 February 2018 (UTC)
If we make the cutoff 10,000+ articles (since we already link to Arabic and Simple, and since that is the cutoff for the Main Page's sidebar links), we need to add quite a few more. I'll do that now, I suppose. I wonder if this is the kind of thing Wikidata wants to handle, the way they handle interwikis between different wikis' editions of Category:English nouns etc. - -sche (discuss) 15:11, 22 February 2018 (UTC)
Done: I've updated the list to provide the same languages, using the same cutoff, as the Main Page. - -sche (discuss) 18:39, 22 February 2018 (UTC)

Moving ordinal numerals from category:Azerbaijani adjectives

Could someone with a bot please help me move all Azerbaijani ordinal numerals from the adjective category over to numerals? That is, renaming ===Adjective=== and changing {{head|az|adjective}}. Thank you. Allahverdi Verdizade (talk) 12:58, 21 February 2018 (UTC)

@Allahverdi Verdizade: Done on 100 pages in that category which contain {{ordinalbox}} in their content; see the changes. Wyang (talk) 13:41, 21 February 2018 (UTC)
@Wyang Thanks a million! Allahverdi Verdizade (talk) 15:19, 21 February 2018 (UTC)
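
For anyone repeating this kind of move, the text transformation can be sketched as below. It is a rough reconstruction of the edit described above, not the script that was actually run; the function name is made up, and the "Numeral" and "numeral" replacement strings are assumed from the request.

```python
def move_ordinal_to_numeral(wikitext):
    """Retag an Azerbaijani ordinal entry from adjective to numeral.

    Pages without {{ordinalbox}} are returned unchanged, mirroring the
    filter mentioned above.
    """
    if "{{ordinalbox" not in wikitext:
        return wikitext
    return (
        wikitext
        .replace("===Adjective===", "===Numeral===")
        .replace("{{head|az|adjective}}", "{{head|az|numeral}}")
    )


# Invented page text, for illustration only.
page = "==Azerbaijani==\n\n===Adjective===\n{{head|az|adjective}}\n\n{{ordinalbox}}\n"
print(move_ordinal_to_numeral(page))
```

Note that the plain string replacement would also rename an ===Adjective=== heading in another language's section on the same page, so a real bot run would want to restrict it to the Azerbaijani section.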

Languages with entries but no script specified

If anyone feels up to the task, it would be helpful if someone found every language which has no script specified in Module:languages, but which has entries (or even: which has translations in water), identified which scripts those entries/translations are in, and mass-added the scripts to Module:languages. - -sche (discuss) 23:34, 21 February 2018 (UTC)

(For the current list of these languages, see User:Erutuon/languages with no scripts.) — Eru·tuon 06:10, 22 February 2018 (UTC)
It would probably even be useful to simply add to the modules the scripts that all the languages we have entries for are de facto written in (meaning, the scripts our entries are in), not just ones that don't already have scripts specified. - -sche (discuss) 15:07, 22 February 2018 (UTC)

Parameter request for {{homophones}}

Can someone add a "qN=" functionality to this template? Homophones are so often rooted in regional pronunciations, and I've seen some pretty bad workarounds and incomplete accent tagging due to the absence of this function. Or it could be "aN=" in keeping with {{a}}, or it could be like {{alter}}, but personally I find that template confusing. Ultimateria (talk) 12:04, 22 February 2018 (UTC)

Module:form of

@Rua, DTLHS, Erutuon, any chance we could update the usage of Module:form of from {{comparative of|good|lang=en}} to {{comparative of|en|good}}? --Victar (talk) 04:09, 23 February 2018 (UTC)

Just that one template, or all of them? Right now it would be weird to use {{comparative of|en|good}} when all the other form-of templates use |lang=. —Mahāgaja (formerly Angr) · talk 12:29, 24 February 2018 (UTC)
@Mahagaja: Oh yes, I mean all templates under that module. --Victar (talk) 14:12, 24 February 2018 (UTC)
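
If the switch goes ahead, existing uses would need a bot pass. A minimal sketch of the rewrite follows, assuming the simplest case only: a single term, with lang= as the last parameter and no nested templates or extra parameters. The pattern and function name are made up for illustration, and real entries would need more careful parsing.

```python
import re

# Matches e.g. {{comparative of|good|lang=en}}; assumes lang= comes last
# and the term contains no nested templates or further parameters.
OLD_STYLE = re.compile(r"\{\{([^{}|]+ of)\|([^{}|=]+)\|lang=([^{}|]+)\}\}")


def lang_first(wikitext):
    """Rewrite {{X of|term|lang=xx}} as {{X of|xx|term}}."""
    return OLD_STYLE.sub(r"{{\1|\3|\2}}", wikitext)


print(lang_first("{{comparative of|good|lang=en}}"))  # {{comparative of|en|good}}
```

Calls that are already in the new order are left alone, so the pass can be run repeatedly without harm.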

Can {{ux}} take an additional |audio= argument?

As titled. There are some entries that have usage examples with audio, for example Korean 헤아리다 (hearida). The audio can be displayed after the example in inline examples, and on a line under the example in multiline ones. Thanks!

(By the way, the current audio players on that page are displayed incorrectly for me, covering the line above.) Wyang (talk) 09:32, 25 February 2018 (UTC)

How is it displaying incorrectly? It looks OK to me. — SGconlaw (talk) 17:05, 25 February 2018 (UTC)
@Sgconlaw The current display for me: [1], where the audios are shifted upward, almost completely covering the lines above. Wyang (talk) 22:32, 25 February 2018 (UTC)
I see. This isn't something I can help with, unfortunately. In any case, what browser are you using, and what version? I have no problem with Mozilla Firefox Quantum 58.0.2. — SGconlaw (talk) 22:46, 25 February 2018 (UTC)
@Wyang: It's displaying more or less the same way for me. Changing the inline CSS properties in the table tag that surrounds the audio player fixes it: vertical-align: bottom; display: inline;. The player is then centered on the bullet. (I used the developer tools to tinker with it. I'm in Firefox Quantum 59.) — Eru·tuon 22:53, 25 February 2018 (UTC)
@Erutuon Thank you, that also makes it display better on mine. I'm using Chrome 64.0.3282.167. Although not completely level, the line above is visible at least: [2]. Wyang (talk) 23:04, 25 February 2018 (UTC)
@Wyang: Interesting. In your browser it looks a little different. I don't know what is going on. — Eru·tuon 02:33, 26 February 2018 (UTC)
@Erutuon I changed it to vertical-align: bottom in {{audio}}. That seems to do the trick for solving the display issue on 헤아리다, while keeping the other uses unchanged. Wyang (talk) 02:49, 26 February 2018 (UTC)
This probably needs someone who can work with the javascript to solve the positioning issues and maybe make a slimmer player, before it can be added to {{ux}}. DTLHS (talk) 18:07, 25 February 2018 (UTC)

Russian translit - болого - g, not v - an exception in the exception

Can someone please add a new exception to Module:ru-translit? Look for the line starting with -- handle Того, То́го (but not того or Того́, which have /v/)

The translit should produce the regular "g" and the pronunciation should use [ɡ]. --Anatoli T. (обсудить/вклад) 22:43, 25 February 2018 (UTC)

I attempted this in diff, modelled on the handling of до́рого (dórogo) above, but it didn't work for some reason. --Anatoli T. (обсудить/вклад) 23:01, 25 February 2018 (UTC)
Fixed in diff by User:Per utramque cavernam. I see that I wasn't attentive. Thanks! --Anatoli T. (обсудить/вклад) 23:59, 25 February 2018 (UTC)

Transliterations in {{head}}

Where is the code that disables the "transliteration needed" prompt for Latin-script languages? Asking for Hindi Wiktionary, where I've imported some modules. I believe @BukhariSaeed also has this issue on Urdu Wiktionary. —AryamanA (मुझसे बात करेंयोगदान) 22:52, 25 February 2018 (UTC)

I think it's in the language-specific modules. E.g. if Persian noun headwords (Module:fa-noun) can't find "tr=", the entries show "transliteration needed". You can search the modules (tick "Module" only and search for the word "needed") and scan for "transliteration" in the browser. --Anatoli T. (обсудить/вклад) 00:59, 26 February 2018 (UTC)
Perhaps lines 175–178 in Module:headword? Wyang (talk) 01:45, 26 February 2018 (UTC)
@AryamanA: any solution? — Bukhari (Talk!) 04:34, 7 March 2018 (UTC)
@Wyang: Thanks! Solved finally. —AryamanA (मुझसे बात करेंयोगदान) 22:00, 8 March 2018 (UTC)

Consensus required for move protection code

Dear Community,

In this phabricator task an admin requested deletion protection that was backed up by community consensus. The patch is currently on hold since, if it were merged, move protection would also be enabled for the main page. This deletion and move protection, if implemented, would block all users (including sysops) from moving or deleting the page. What are the community's thoughts on it? If the community is not yet ready to fully commit to this protection, maybe enable it for a reasonable trial period (6 months or so) to see its effects?


Sau226 (talk) 12:24, 26 February 2018 (UTC)

I can't think of any case where we would need to move the main page! Equinox 12:57, 26 February 2018 (UTC)
We have done so before, but I think there are other ways to accomplish the same results without moving if the need arises in the future. - TheDaveRoss 13:59, 26 February 2018 (UTC)
If there is appropriate consensus I can make a link to the discussion (or an admin/locally trusted user can post it) on the phab task. As soon as this is given the aim is to merge the change ASAP --Sau226 (talk) 17:11, 1 March 2018 (UTC)
This is the best you're going to get — we have no need to move the main page, and I created the task with a link to a discussion about deleting the main page. —Μετάknowledgediscuss/deeds 17:17, 1 March 2018 (UTC)

Language categories and indexes

The language categories (CAT:Belarusian language, CAT:Assamese language, CAT:Hindi language, etc.) automatically link to Index:xxx, whether it exists (Index:Hindi) or not (Index:Belarusian, Index:Assamese).

AFAIK we don't use or create these anymore, so could we remove the link from {{langcatboiler}} when the Index doesn't exist? --Per utramque cavernam (talk) 12:36, 26 February 2018 (UTC)

Another Lua Memory Shortage

There's another Lua memory shortage at the entry wind. --Lo Ximiendo (talk) 15:56, 26 February 2018 (UTC)

I mean, shouldn't the solution be similar to the one for the entry water? --Lo Ximiendo (talk) 07:29, 28 February 2018 (UTC)
That's a workaround, not a solution :) We really need to fix this properly. I've opened a ticket (T188492) to get some suggestions for better memory profiling. – Jberkel 10:59, 28 February 2018 (UTC)

Tech News

Just a reminder to anyone who wants to keep up with Wikimedia tech news (which sometimes includes explanations for our mysterious bugs), be sure to watchlist Wiktionary:Wikimedia Tech News/2018‎. —Μετάknowledgediscuss/deeds 22:03, 26 February 2018 (UTC)

Global preferences available for testing

Please help translate to your language.


Global preferences, a highly requested feature in the 2016 Community Wishlist, is available for testing.

  1. Read over the help page, it is brief and has screenshots
  2. Login or register an account on Beta English Wikipedia
  3. Visit Global Preferences and try enabling and disabling some settings
  4. Visit some other language and project test wikis such as English Wikivoyage, the Hebrew Wikipedia and test the settings
  5. Report your findings, experience, bugs, and other observations

Once the team has feedback on design issues, bugs, and other things that might need to be worked out, the problems will be addressed and global preferences will be sent to the wikis.

Please let me know if you have any questions. Thanks! --Keegan (WMF) (talk) 00:24, 27 February 2018 (UTC)

Buryat Mongol script

Could someone please make Buryat written in Mongol script display vertically? Crom daba (talk) 14:38, 27 February 2018 (UTC)

Done; diff suzukaze (tc) 00:52, 28 February 2018 (UTC)
@Suzukaze-c That makes Cyrillic script also appear vertically: see ᠬᠥᠳᠡᠭᠡ_ᠠᠵᠤ_ᠠᠬᠤᠢ. DTLHS (talk) 00:57, 28 February 2018 (UTC)
@DTLHS: OK, I reverted it for now, but I think that has to do with the way {{head}} assigns scripts... See [3]. @Erutuon? —suzukaze (tc) 01:01, 28 February 2018 (UTC)
@DTLHS, Suzukaze-c: Yes, Module:headword assumes that all forms share the same script as the headword. So in this case, the Cyrillic was being tagged as Mongolian. This probably saves some Lua resources because findBestScript doesn't have to be called on each form, but I don't know how much. Headword modules for languages that regularly use multiple scripts (Module:mn-headword, Module:sh-headword) supply the script for the alternative form. So in this case the solution would be {{head|bua|noun|tr=xüdöö ažaxy|Cyrillic|хүдөө ажахы|f1sc=Cyrl}}: automatic script detection for the headword, manually supplied script for the alternative spelling. — Eru·tuon 01:14, 28 February 2018 (UTC)
I guess {{bua-noun}} already accomplishes this, so if the existing entries that just use {{head}} are switched over, Mongolian can be added to the script list. DTLHS (talk) 01:24, 28 February 2018 (UTC)

So now I’m a spammer?

Creation of a simple user page was blocked citing "various specific spammer habits." Suspect an over-zealous reaction to a single link to a page about my late wife. Also said "if I believe it constructive," I could resubmit. That message is wrong, because resubmitting only made the complaint stronger and removed the resubmit offer.

Having read the entire user page guidelines, I am persuaded my three paragraphs contain nothing prohibited and everything asked for. —This unsigned comment was added by 伟思礼 (talk • contribs).

It's an automated preventive measure against those weird people who believe Wiktionary user pages are an appropriate place to post ads. :/
I believe it should deactivate if you make more edits. —suzukaze (tc) 06:44, 28 February 2018 (UTC)
If all links are prohibited, (1) the guidelines should say so, instead of "may describe your real-life activities and/or link to your own website"; and (2) the rejection should not invite a resubmission which will only be rejected again. I removed the link and the rest of it was allowed. 伟思礼 (talk) 06:51, 28 February 2018 (UTC)
@伟思礼: It is not the case that all links are prohibited, as you will see many pages contain links to a wide variety of places. The restriction on placing links is tied to the status of the account, with brand new accounts being restricted completely. It is certainly inconvenient for new editors, but it is a necessary evil to prevent spam bots from adding links all over the place.
The text of the message is, I agree, unhelpful, that is something we can do something about.
Thanks for your interest in Wiktionary, and I hope you stick around and contribute some of your knowledge to the project. - TheDaveRoss 13:23, 28 February 2018 (UTC)

Another bug

When I make an entry here, go to the bottom and click "publish," it adds a captcha to the bottom of the page, then scrolls to the top and adds "incomplete or missing captcha." Trying to publish an edit in other places adds the captcha to the top of the page. If a captcha is going to always be required, why not make it part of the page right away, instead of making us scroll to the bottom twice and click the same publish button twice? —This comment was unsigned.

I think captchas are only required before submitting edits if the edit contains an external link (either to an unexpected site, and/or prior to the editor making a certain number of edits? I'm not sure). I am also fairly certain that captchas are not something we as an individual wiki control (unlike the so-called "abuse filters" which stopped you from adding a link to your userpage). - -sche (discuss) 20:53, 1 March 2018 (UTC)

Pinging problems are really annoying

I've been using wikis for years and years now, and I'm embarrassed to say that I'm unsure about how to ping people properly. I've gotten messages time and time again like, your ping didn't work, your ping didn't work. I'm not trying to sound like I'm ranting or something, but it's really annoying to have to keep hearing that (and I'm not annoyed at people themselves for telling me, I'm just annoyed that it keeps not working).

I'm not asking to tell me about how pings work (though it'd be nice). I'm just asking if there's some way that pinging can become easier; for instance, if the symbol @ is put before [[User:, then it should automatically ping in every situation. Something like that. Because the current way to do it I THINK is to put the ping right before your signature or on the same line as your signature or something like that.

I'm not into the technical stuff, so I don't know how much work implementing something like that would require, but I'm just asking if there's something we can implement in regards to this. PseudoSkull (talk) 00:05, 1 March 2018 (UTC)

1. The guide is at mw:Manual:Echo :)
2. I believe "automatically ping in every situation" could be quite disastrous, like when making manual archives of talk pages.
suzukaze (tc) 00:13, 1 March 2018 (UTC)

March 2018

Tabbed languages + MediaWiki update

Is anyone else having problems with tabbed languages and MediaWiki edit URLs? When I save a page, ?action=edit is still part of the URL and messes up the tabs. I have to remove the "edit" part and reload the page (just reloading the page will go back into edit mode). I suspect this behaviour was introduced with a recent MW upgrade. – Jberkel 09:10, 1 March 2018 (UTC)

Empty arguments to {{suffix}}

I'd like to be able to give an empty argument (or -) to {{suffix}} together with a corresponding |argN= parameter to produce an effect similar to {{m|und||*term}} in order to discuss hypothetical suffixes. Could someone make this happen? Crom daba (talk) 00:05, 2 March 2018 (UTC)

@Crom daba: I enabled that in the suffix template (though not the other affix templates yet): {{suffix|und|base|alt2=*hypothetical suffix}} → base + *-hypothetical suffix. You leave parameter N + 1 empty and put in the corresponding |altN= parameter instead. — Eru·tuon 00:24, 2 March 2018 (UTC)

Terms by script

I noticed we don't have Category:Terms needing native script by script or Category:Cyrillic terms needing native script. Any chance someone might be able to put that together? --Victar (talk) 00:40, 2 March 2018 (UTC)

Template:hot word throws error for lang=mul

Translingual once more gets no respect. See [[Tupanvirus]] for example of module error resulting from use of lang=mul. But it's a more general problem than just one template. DCDuring (talk) 17:52, 3 March 2018 (UTC)

You have to use lang as an explicit named parameter. DTLHS (talk) 18:00, 3 March 2018 (UTC)
Indeed, there's no secret conspiracy against Translingual. But we should try to switch over this kind of template to using the langcode as the first positional parameter, in line with many other templates. —Μετάknowledgediscuss/deeds 18:50, 3 March 2018 (UTC)
It wouldn't be a shock if "mul" were unintentionally neglected. I thought I had tried lang=. Sorry. DCDuring (talk) 21:51, 3 March 2018 (UTC)

Have edit summaries got longer?

When I add to an entry I'm in the habit of pasting my additions into the edit summary box, to show exactly what I did. The box would cut them off short. Now it's allowing far more characters, and I might be clogging up the Recent Changes a bit. See e.g. [4]. Does anyone know why this changed? Equinox 17:53, 3 March 2018 (UTC)

See here. I think that such copy-pasting is a waste of time, to be honest; if someone cares, they can check the diff very quickly (almost instantaneously if they're a logged-in user who's got the proper gadget). —Μετάknowledgediscuss/deeds 18:53, 3 March 2018 (UTC)

Help in installing the gadget

Good day to all! I'm a sysop of the Belarusian Wiktionary; please help me install this gadget on my wiki. --OlegCinema (talk) 14:50, 4 March 2018 (UTC)

@OlegCinema: Прывітанне. You mean the Belarusian wiktionary? The translation adder is not meant to be used on Wikipedia (indeed, I don't see how it could). --Per utramque cavernam (talk) 20:24, 4 March 2018 (UTC)
Yes, Wiktionary. I mixed them up. Not Wikipedia. --
@Per utramque cavernam: Sorry, I was confused. Not Wikipedia. --OlegCinema (talk) 09:03, 5 March 2018 (UTC)


This template, which invokes Module:linkbar, is used at WT:WL, where it pings the user whose name is linked to. It seems undesirable, or at best unnecessary, to bring this process to the attention of the person under deliberation. Can we prevent the ping from occurring? —Μετάknowledgediscuss/deeds 20:20, 4 March 2018 (UTC)

FWIW, I don't remember getting pinged when the suggestion was made, but only when my user rights were changed. --Per utramque cavernam (talk) 20:27, 4 March 2018 (UTC)
I might be mistaken, then! Want to do a quick test? —Μετάknowledgediscuss/deeds 20:34, 4 March 2018 (UTC)
@Metaknowledge: Yes, go ahead. --Per utramque cavernam (talk) 20:37, 4 March 2018 (UTC)
@Metaknowledge: Well, I did get pinged. I'm gonna check my ping history because I'm curious now. --Per utramque cavernam (talk) 20:40, 4 March 2018 (UTC)
I got pinged every time. I must be senile... --Per utramque cavernam (talk) 20:46, 4 March 2018 (UTC)
A user will be pinged when their username is linked to. If you don't want the pings you need to remove the link. DTLHS (talk) 19:15, 7 March 2018 (UTC)
@Metaknowledge: Go ahead. --Per utramque cavernam (talk) 08:49, 13 March 2018 (UTC)
@Per utramque cavernam, is this your only ping from me? —Μετάknowledgediscuss/deeds 17:49, 13 March 2018 (UTC)
@Metaknowledge: Yes. --Per utramque cavernam (talk) 18:13, 13 March 2018 (UTC)

A more drastic Lua memory solution

So, we've significantly reduced the number of pages with memory errors now, but do and wind are still overflowing, despite now being in the memory-intensive entry list. I was wondering whether another solution might alleviate our difficulties. It is often the pages that contain many translations that cause a problem and as the project grows, the number of these pages will only increase. For these giant pages, I was thinking that we could move their translations to a subpage and leave the translation box as a redirect. Unfortunately, this feels like a kludge to me.

A more powerful solution might be to make a "Translation" namespace with an associated tab like we have for Citations. With this system, we could remodel MediaWiki:Gadget-TranslationAdder.js to edit this Translation namespace (if that is actually possible. @Dixtosa?). Then you have the translation boxes call a Lua module that checks the size of the Translation page. If it is above a certain threshold, you don't render the contents but instead leave a link. Some details would need to be worked out, like how to represent the data on the Translation page to be parsed back to the mainspace, but this would lower the amount of memory, as it would not actually load all the linking and language data if the translation section is deemed too large. Of course, we'd also have to get a new namespace, etc. and the programming changes would not be insubstantial. Unless we do something, however, we're steadily going to run out of memory in all the most semantically or orthographically common entries. @Chuck Entz, Metaknowledge, Erutuon, Jberkel, Rua, -sche, Vriullop, TheDaveRoss, DTLHS. —*i̯óh₁n̥C[5] 12:21, 5 March 2018 (UTC)

Another, less extreme option would be to just have the translations be flat wiki markup instead of calling modules (without much benefit, really). I am not sure why we have to look up a language code for every translation which exists on the site. The transliteration magic is nice, but that only needs to happen once, and after that it could easily be "flattened" by a bot so that it doesn't need to make module calls again afterward.
Re translation namespace, there are a lot of problems which would need to be solved before this would work. Many terms have lots of senses to be translated, so figuring out which portion to transclude, and keeping it in sync, is one major hurdle. We should also test whether the memory usage of a transcluded page is counted against the total for the page, otherwise this may actually increase Lua memory usage. - TheDaveRoss 12:29, 5 March 2018 (UTC)
@TheDaveRoss: I imagine the memory usage of fetching a page's data (local content = mw.title.new("Translations:" .. PAGENAME):getContent()) is comparable to having the bytes in the entry itself. The issue would be parsing through the data to find the appropriate bit to render. For my money, the most effective long-term solution would be to move all the translations to a Translations namespace and then leave redirects in the mainspace. This keeps the memory out of the entry and gives you all of your fancy rendering. The problem would then be having to go to a different page to look at the translations. —*i̯óh₁n̥C[5] 12:42, 5 March 2018 (UTC)
I missed the portion where you said we would not transclude very large pages, which would certainly reduce resource usage. - TheDaveRoss 13:53, 5 March 2018 (UTC)
There's an easy way to check the basic principle: edit this subpage and look at the memory usage, then edit the main Grease pit page and compare its memory usage. The main Grease pit page uses about three times the memory of this one, so I suspect the entire memory usage of the transcluded page is included in the transcluding page. Chuck Entz (talk) 13:22, 5 March 2018 (UTC)
Our biggest translations tables currently include under 3000 languages. We have codes for roughly 8000 languages. Your idea is interesting, and might even have other utility — it reminds me of the proposed Collocations namespace, which would've allowed, as this also theoretically could, collecting translations of common collocations of words that are SOP but that translation dictionaries often include. However, I suspect that, even if in a separate namespace, translations tables will eventually run out of memory. (I suppose this could be tested by generating a translation table with some nonsense string written in each language in each of the scripts that the language has. That last part is important.) I also think that, conceptually, something more like what The Dave Ross proposes might be more desirable. In particular, my (testable/falsifiable) understanding has been that subst:ing in transliterations, and just accepting them and ceasing to compare manual to automatic transliterations, might significantly reduce memory usage; subst:ing in language names would also seem useful. - -sche (discuss) 15:17, 5 March 2018 (UTC)
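
The size check proposed at the top of this thread (render the Translation page only if it is below a threshold, otherwise leave a link) comes down to something like the following. This is a Python sketch of the decision only, not a Scribunto module; the function name, the link text, and the 100,000-byte cut-off are all hypothetical, and the "Translation:" prefix is just the namespace name floated above.

```python
MAX_BYTES = 100_000  # hypothetical cut-off; the right value would need testing


def render_translations(title, content, max_bytes=MAX_BYTES):
    """Return the translation wikitext, or a link if the page is too large."""
    if len(content.encode("utf-8")) > max_bytes:
        # Too big: skip all linking and language lookups and just point
        # the reader at the separate page.
        return "[[Translation:" + title + "|translations]]"
    return content


print(render_translations("wind", "x" * 10, max_bytes=5))
```

Whether this actually saves memory depends on the open question raised above, namely whether fetching the page content counts against the transcluding page's Lua limit.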

Audio playback cut off too early

See Talk:Plymouth. I can reproduce the problem in current Google Chrome. Equinox 18:15, 6 March 2018 (UTC)

I have the same problems, and in following it upstream it appears that it is not a local problem. My guess is that MW has versions of the file for downloading and versions of the file for playing in browser, and the in-browser versions are somehow broken. The recording is also not very good, maybe someone can replace it. - TheDaveRoss 13:39, 7 March 2018 (UTC)

Inputbox (or something similar) passing text input to a module for processing ― possible?

Something similar to mw:Extension:InputBox: could be a string or paragraph box, that passes the text input to a backend module for processing upon button-clicking, and returns a processed output. Is this possible? Any guide on how to write this would be greatly appreciated. Thanks! Wyang (talk) 10:33, 7 March 2018 (UTC)

I did it to test a module: ca:Module:ca-general. The output of a wiki inputbox can only be obtained as a page name, so it edits a subpage of a sandbox showing the result in the editintro. --Vriullop (talk) 17:10, 7 March 2018 (UTC)
Hmm, thanks. I think I was expecting more than the MediaWiki preloader, as in whatever is written in the inputboxes (e.g. 'язык') can be passed to a module, to produce an output like 'jazyk'.
I remember @Dixtosa (sorry to disturb your wikibreak!) mentioned that this may be possible (?) at Wiktionary:Grease pit/2016/November#Recent searches for non-existent entries. It would be very useful if it is indeed feasible.
I'd like to learn a bit of coding for this, although not quite sure where to start. Wyang (talk) 22:42, 7 March 2018 (UTC)
I'm curious: would it be something of this kind, or of this one? --Per utramque cavernam (talk) 22:56, 7 March 2018 (UTC)
Yes, something similar. It doesn't have to be limited to transliterations; could also be IPA, script conversion, text segmentation, translation, etc. Wyang (talk) 23:10, 7 March 2018 (UTC)
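One way to picture the request is a call to the MediaWiki API's expandtemplates action with an #invoke of the module. The sketch below only builds the request URL and sends nothing. The module name "example-translit" and the function name "tr" are hypothetical placeholders, not real Wiktionary modules, and arguments containing a pipe character would need escaping that is not handled here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API = "https://en.wiktionary.org/w/api.php"


def invoke_wikitext(module: str, function: str, *args: str) -> str:
    """Build the wikitext that asks Scribunto to run module.function(args)."""
    parts = ["#invoke:" + module, function, *args]
    return "{{" + "|".join(parts) + "}}"


def build_expand_url(wikitext: str) -> str:
    """Build an action=expandtemplates URL that would expand the wikitext."""
    params = {
        "action": "expandtemplates",
        "prop": "wikitext",
        "text": wikitext,
        "format": "json",
    }
    return API + "?" + urlencode(params)


# "example-translit" and "tr" are hypothetical names used only for illustration.
url = build_expand_url(invoke_wikitext("example-translit", "tr", "язык"))
print(url)
```

A gadget with a text box would issue the same kind of request from JavaScript and display the returned text; the URL shape is the part that carries over.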

@Wyang: Something that I'd find invaluable would be a small search engine for "language name > language code" and the reverse. I suppose it would use Module:languages/code to canonical name and Module:languages/canonical name to code. I think that's a similar tool from a conceptual point of view. --Per utramque cavernam (talk) 23:07, 7 March 2018 (UTC)

@Per utramque cavernam: The xte gadget does that already. —Μετάknowledgediscuss/deeds 03:20, 8 March 2018 (UTC)
Indeed! Thanks. --Per utramque cavernam (talk) 13:30, 8 March 2018 (UTC)
This seems relevant. —suzukaze (tc) 23:09, 7 March 2018 (UTC)
Thanks @Suzukaze-c! I will very slowly investigate... Wyang (talk) 23:12, 7 March 2018 (UTC)
@Suzukaze-c: Sorry, I forgot to answer. I'm confused :3 How should I use this? --Per utramque cavernam (talk) 23:58, 13 March 2018 (UTC)
This doesn't do anything, right? (did you get my second ping, btw?) --Per utramque cavernam (talk) 00:00, 14 March 2018 (UTC)
Ah, I was replying to Wyang's original post. (;・∀・) (I don't know what you mean by "second ping"...) —suzukaze (tc) 00:07, 14 March 2018 (UTC)
The search results are for scripts that seem to have the ability to render wikitext. I don't know what CodeLinks specifically does. —suzukaze (tc) 00:13, 14 March 2018 (UTC)
I'm trying to ping you from the edit summary. That shit doesn't work! --Per utramque cavernam (talk) 00:15, 14 March 2018 (UTC)
#Notification_from_edit_summary says you have to wait a few more hours. —suzukaze (tc) 00:58, 14 March 2018 (UTC)
And templates on edit summary are not expanded, you should use a real link. --Vriullop (talk) 09:02, 14 March 2018 (UTC)
@Suzukaze-c, Vriullop: Oh, ok, thanks and thanks! --Per utramque cavernam (talk) 10:31, 14 March 2018 (UTC)

Linking to parts of speech sections with {{l}}

Would it be possible to have {{l|en|walk|pos=verb}} (for example) link to the verb section of walk rather than to the top of the entry page? — SGconlaw (talk) 23:06, 7 March 2018 (UTC)

And what if there are multiple verb sections (like with different etymologies)? - 23:37, 9 March 2018 (UTC)
Good point. Perhaps there should be some way to specify which section (e.g., "verb 2"). — SGconlaw (talk) 04:31, 13 March 2018 (UTC)
@Sgconlaw: Use {{senseid}}. —Μετάknowledgediscuss/deeds 04:39, 13 March 2018 (UTC)
Ah. But that would entail having to create all the sense-ids ... It would be nice to have {{l}} at least link by default to the first appropriate part-of-speech section. — SGconlaw (talk) 06:24, 13 March 2018 (UTC)
The problem is that POS headers are just headers as far as the system is concerned. If you look at the links in the table of contents at the top of any page with lots of headers (wind, for example) you'll see that the system sticks a number at the end of each header name that's a duplicate, but doesn't distinguish between language sections, and also bases it strictly on order. That means your "Noun_2" becomes "Noun_3" if someone adds a new noun section before it. Chuck Entz (talk) 14:12, 13 March 2018 (UTC)
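The numbering behaviour described above can be illustrated with a simplified Python sketch. It is not MediaWiki's actual code: it only replaces spaces with underscores and numbers repeated headers in page order, which is enough to show why a link to "Noun_2" breaks when a section is added earlier on the page.

```python
from typing import Dict, List


def heading_anchors(headings: List[str]) -> List[str]:
    """Assign anchors in page order; repeats of a header get _2, _3, ..."""
    seen: Dict[str, int] = {}
    anchors: List[str] = []
    for heading in headings:
        base = heading.strip().replace(" ", "_")
        seen[base] = seen.get(base, 0) + 1
        anchors.append(base if seen[base] == 1 else f"{base}_{seen[base]}")
    return anchors


# The French noun section is "Noun_2" at first...
before = heading_anchors(["English", "Noun", "Verb", "French", "Noun"])
# ...but becomes "Noun_3" once a second English noun section is added above it.
after = heading_anchors(["English", "Noun", "Noun", "Verb", "French", "Noun"])
print(before[-1], after[-1])
```

Because the counter ignores which language section a header belongs to, any link that relies on the numeric suffix is only as stable as the page layout above it.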

feminine of

Hey. How easy would it be to categorise, or make a list of, all Spanish nouns that use "feminine of"? Things like comercializadora are what I'm after. Ideally, of course, we would have all of them as lemmas, and make a link to the masculine form. It's all about equality of the genders, obviously. Even though Spanish is inherently sexist. --Otra cuenta105 (talk) 23:13, 7 March 2018 (UTC)

Running a search for incategory:Spanish_nouns insource:/\{\{feminine of/ should yield a complete list. — Vorziblix (talk · contribs) 03:13, 8 March 2018 (UTC)
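For anyone who wants the results as data rather than a search page, the same query can be passed to the MediaWiki API's list=search module. This is a minimal Python sketch that only builds the request URL. The helper name and the limit of 50 are arbitrary choices, and a complete list would additionally require following the API's continuation parameters, which is not shown.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API = "https://en.wiktionary.org/w/api.php"

# The search string from the comment above, unchanged.
QUERY = r"incategory:Spanish_nouns insource:/\{\{feminine of/"


def build_search_url(query: str, limit: int = 50) -> str:
    """Build an action=query&list=search URL for the given search string."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


search_url = build_search_url(QUERY)
print(search_url)
```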
Personally, I think {{feminine noun of}} is a good compromise. —Μετάknowledgediscuss/deeds 03:18, 8 March 2018 (UTC)

Font creation

Does anyone have any experience creating fonts? I'm having trouble getting the .fina lookups to work using FontForge. --Victar (talk) 13:20, 8 March 2018 (UTC)

Albanian, Irish and Old Irish cognates

Is there any way I can pull out Wiktionary etymologies which contain references to Albanian, Irish and/or Old Irish words?

@Rdurkan User:DTLHS/Irish-Albanian DTLHS (talk) 02:54, 9 March 2018 (UTC)

Please test pings in edit summary

1. Read this:

"You can notify users in edit summaries. They will get a ping just as if they had been mentioned on a wiki page. phab:T32750" -- meta:Tech/News/2018/10

2. Sign up at using a different user name and password (not the one you use here). You may create multiple accounts if you like, just put a note on their user pages.

3. Edit a page and put a username link in edit summary. Confirm that you are receiving the notification correctly.

4. Test at different pages and in different ways.

5. Report bugs to Phabricator.

6. Share this comment with other people on other wikis, in different languages.

--Gryllida (talk) 23:54, 8 March 2018 (UTC)

Sound laws app

Hello. I've just had this idea I wanted to put out there: a sound law application. It would apply all known sound laws leading from one language to another (say Proto-Germanic to Swedish) to any given word, in an algorithmic fashion.

The first step would be to list all the relevant sound laws (chronologically whenever possible). These would have to be confirmed by an appropriate set of words.

Implementing the reverse functions as well would be nice:

  • 1) inputting the etymon and the descendant and seeing if they're a match or not.
  • 2) inputting the descendant and outputting the etymon (or the different possible etyma, if there's been a sound merger at some point).

It would help tremendously in seeing how strong and prevalent analogy can be.

Does something of the sort already exist? --Per utramque cavernam (talk) 17:00, 9 March 2018 (UTC)

@Per utramque cavernam Here's something: [5]. —AryamanA (मुझसे बात करेंयोगदान) 17:18, 9 March 2018 (UTC)
@Per utramque cavernam Here’s a more general sound change application, if you want to input your own set of sound laws and words and see what comes out. — Vorziblix (talk · contribs) 17:55, 9 March 2018 (UTC)
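The core of the idea in this thread, applying an ordered list of sound laws to a word and checking a proposed descendant against the result, can be sketched in a few lines of Python. The rules and the word "pata" below are invented toy data, not attested sound laws of any language, and regular expressions are just one possible way to encode a rule.

```python
import re
from typing import List, Tuple

# A rule is (regex pattern, replacement); list order stands for chronology.
Rule = Tuple[str, str]


def apply_sound_laws(word: str, rules: List[Rule]) -> str:
    """Apply each rule in turn to the output of the previous one."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


def is_regular_descendant(etymon: str, descendant: str, rules: List[Rule]) -> bool:
    """True if the rules alone derive the descendant from the etymon."""
    return apply_sound_laws(etymon, rules) == descendant


# Invented toy rules: loss of final a, then voicing of final t.
TOY_RULES: List[Rule] = [
    (r"a$", ""),
    (r"t$", "d"),
]

print(apply_sound_laws("pata", TOY_RULES))                   # rules in order
print(apply_sound_laws("pata", list(reversed(TOY_RULES))))   # order swapped
```

Swapping the two rules changes the outcome, which is why the laws need to be listed chronologically. A mismatch reported by is_regular_descendant would flag a word as a candidate for analogy or borrowing. The reverse direction (descendant to possible etyma) is harder, since mergers make rules non-invertible, and is not attempted here.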

Request: categorize names when both 'male' and 'female' are set

{{given name|male|or=female}} and {{given name|female|or=male}}, i.e. any use of the "or=" parameter, should categorize into "Category:(language) unisex given names". (This would resolve some of the entries which are categorized as both male and female given names, but not as unisex). Probably the template should even display "unisex" rather than "male or female" in such cases, although it's fine if only the categorization request can be fulfilled. (I also wonder whether "gender-neutral" might be better than "unisex"; Ngram results are inconclusive.) Reposting from January because unisex given names came up again at RFDO. - -sche (discuss) 17:42, 9 March 2018 (UTC)
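The requested categorization logic, written out as a small Python sketch. This is not the template's actual implementation (that lives in wikitext/Lua), and it assumes one reading of the request: when both genders are given, only the unisex category is emitted. Whether the male and female categories should be kept as well is left open above.

```python
from typing import List, Optional


def given_name_categories(
    language: str, gender: str, or_gender: Optional[str] = None
) -> List[str]:
    """Return category names for a given-name entry (illustrative only)."""
    genders = {gender}
    if or_gender:
        genders.add(or_gender)
    # Assumption: male + female together means "unisex" and nothing else.
    if genders == {"male", "female"}:
        return [f"{language} unisex given names"]
    return [f"{language} {g} given names" for g in sorted(genders)]


print(given_name_categories("English", "male", or_gender="female"))
print(given_name_categories("English", "female"))
```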

Converting Wiktionary dumps to a Kindle dictionary file

Hello all,

Here is a project of mine that will download and parse a Wiktionary dump, and then generate a MOBI dictionary file that Kindles can use for in-book word look-up. My work has been mostly to "glue" together different pieces of existing software: JWKTL to parse the Wiktionary dump, tab2opf to convert text files to OPF and HTML ones, and KindleGen to create the MOBI file.

It is far from perfect, as JWKTL does not handle templates (I implemented some code that does, but only a few of them are supported). Inflected word forms are not supported either. Entries are also very basic: only definitions and examples make it into the dictionary file.

This is more of a proof of concept than anything else, but as I could not find any Greek-English dictionary from Amazon or elsewhere I wanted to see what could be done with Wiktionary. The project is on GitHub and you'll find download links for three dictionaries I just generated (EN-EN, FR-EN, EL-EN). Please feel free to comment and suggest. — nyg gh (talk) 23:07, 9 March 2018 (UTC)

@Nyg gh: Nice! I worked on something similar a while back, also using JWKTL. We'll soon have dumps in HTML format, I think those are better suited for parsing than XML (or maybe a combination of both). – Jberkel 23:22, 9 March 2018 (UTC)
@Jberkel: Thanks :). I've looked at the dumps mailing lists but have seen no mention of these HTML dumps. Do you have more info? — nyg gh (talk) 18:56, 10 March 2018 (UTC)
@Nyg gh: related Phabricator tickets: T133547, T182351. – Jberkel 19:02, 10 March 2018 (UTC)

Cascading protection of User:DerekWinters

Cascading protection of the user page User:DerekWinters is preventing me from editing Module:parameters. What on earth??? It's so bizarre it's hilarious, but I'd appreciate it if an admin could fix this, so I can look into a bug in Module:headword/templates that may reside in Module:parameters. @AryamanA, it looks like you added the protection; was it meant to be cascading? — Eru·tuon 04:41, 12 March 2018 (UTC)

Removed the cascading protection. DTLHS (talk) 04:45, 12 March 2018 (UTC)
@Erutuon: That's really weird, sorry for the trouble. I meant to protect his subpages. —AryamanA (मुझसे बात करेंयोगदान) 10:10, 12 March 2018 (UTC)

Notification from edit summary

The only script I can think of that mentions a user page is the rollback one, and the rolled-back user will already get a ping by dint of having been reverted. I guess this means they'll get a double ping? If so, we should edit the script to prevent that. —Μετάknowledgediscuss/deeds 04:45, 13 March 2018 (UTC)

Add curly braces around ISBNs in QQ

Can someone find a way to safely make this edit work? ISBNs should be templatized (like this) so as not to end up in Category:Pages using ISBN magic links (magic links will be removed at some point). - -sche (discuss) 05:06, 13 March 2018 (UTC)
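The substitution being asked for can be sketched as a regular-expression replacement. The Python below is not the QQ gadget's code (which is not shown in this thread); the function name is hypothetical, and the pattern only covers 10- and 13-digit ISBNs written with optional hyphens or spaces. ISBNs already inside {{ISBN|...}} are left alone because there the word ISBN is followed by a pipe, not whitespace.

```python
import re

# "ISBN", whitespace, then 10 or 13 digits with optional hyphen/space
# separators; the last character of an ISBN-10 may be X.
ISBN_MAGIC = re.compile(
    r"(?<![|{\w])ISBN\s+((?:97[89][- ]?)?(?:\d[- ]?){9}[\dXx])\b"
)


def templatize_isbns(wikitext: str) -> str:
    """Wrap bare 'ISBN <number>' magic links in the {{ISBN}} template."""
    return ISBN_MAGIC.sub(r"{{ISBN|\1}}", wikitext)


print(templatize_isbns("See ISBN 0-19-861186-2 for details."))
print(templatize_isbns("Already done: {{ISBN|0-19-861186-2}}"))
```

Running the function twice on the same text gives the same result as running it once, which is one simple way to check that an edit of this kind is safe to repeat.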

Austronesian etymology

Can anyone tell me how I can pull out all entries with references in their etymology sections to Proto-Austronesian, Proto-Polynesian, Proto-Nuclear Polynesian and/or Proto-Malayo-Polynesian, please? —This unsigned comment was added by Rdurkan (talk · contribs).

Ugh why do they ruin everything?

Some change has messed up a keyboard shortcut that's important to me. Go to delete a page and the "reason" box is focused. I usually hit Shift+Tab to focus back to the preceding list of reasons, then e.g. "V" to choose "vandalism", then tab back and type the sub-reason. Now the list can't be tabbed into, because it's apparently become some IDIOTIC custom element, not a real HTML list, that won't respond to input properly. Where can I complain? I really don't want yet another login/signup so if anyone can help/post on my behalf that would be wonderful. Equinox 17:18, 14 March 2018 (UTC)

Poking around in the Delete interface in Chrome, it appears that the main deletion reason dropdown is still selectable using Shift-Tab, when starting from the "Other/additional reason" textbox -- if you hit Shift-Tab 54 times. If you hit just Tab to cycle forward, you have to hit it 13 times.
It appears that someone screwed up the z-index value somehow. Both are shown in the CSS inspector with the default value of auto, FWIW. ‑‑ Eiríkr Útlendi │Tala við mig 18:45, 14 March 2018 (UTC)
Thanks for investigating. Is anyone able to report this bug? I am really unwilling to create an extra account. (Also ask them to let registered wiki users write bug reports. w00t!) Equinox 15:41, 16 March 2018 (UTC)
For that matter, where does one report bugs? ‑‑ Eiríkr Útlendi │Tala við mig 16:54, 16 March 2018 (UTC)
On Phabricator. I've had the pleasure of using it once before. -Stelio (talk) 17:18, 16 March 2018 (UTC)

Display of current votes in watchlist

I remember the Template:votes appearing in my watchlist, but it's been missing now for a small handful of months. I'm guessing I've changed a preference that has removed it, but I don't see which setting it is. Any thoughts on how I can add it back in to my watchlist? Thanks, Stelio (talk) 11:05, 16 March 2018 (UTC)

That's strange, it still appears in mine, as it always has. --WikiTiki89 15:46, 16 March 2018 (UTC)
It was removed from some people's watchlist as a test, following this conversation. --Per utramque cavernam (talk) 16:07, 16 March 2018 (UTC)
Without telling them? DonnanZ (talk) 16:25, 16 March 2018 (UTC)
This is a bad change and should have been voted on. --Victar (talk) 17:01, 16 March 2018 (UTC)
Who did this, @TheDaveRoss? I agree that it is very inappropriate to take action, especially selective action, after a discussion like that. —Μετάknowledgediscuss/deeds 17:24, 16 March 2018 (UTC)
@Metaknowledge, I initially removed the votes, then immediately reverted and started the discussion you linked. If that edit and revert somehow removed it from a bunch of people's watchlists permanently then something is broken somewhere, and it wasn't my intent. - TheDaveRoss 14:20, 20 March 2018 (UTC)

Wait no, I'm definitely confusing this with something else. --Per utramque cavernam (talk) 17:25, 16 March 2018 (UTC)

@Stelio What happens if you restore the default settings? --Per utramque cavernam (talk) 17:28, 16 March 2018 (UTC)
Fascinating. Restoring to the default preferences does indeed put the votes back on my watchlist. I then tried editing settings one at a time and checking the watchlist. First thing I did was change language from "en - English" to "en-GB British English". Bam! No votes on my watchlist. So there's the culprit identified.
This also explains why a change to a special message didn't actually work for me (it was changed for en but not en-GB).
So in the short term I'll revert to using "en". Longer term, I'll look at a way to synchronise the various en* language pages for special messages. -Stelio (talk) 20:17, 18 March 2018 (UTC)
Or rather for all languages. --WikiTiki89 14:52, 20 March 2018 (UTC)
@Wikitiki89, of course we (the site as a whole) should synchronise messages in all languages. I personally don't have the language knowledge to do that, so I would like to (at some point) look at the en* languages only. If, when I spend that time, it turns out to be a straightforward job then perhaps I can extend it. At least I can report back for others to do that work. But note that this involves more than just copying template use from one language's messages to another. For example the proper job would include translating the header row from Template:votes (stored in Template:votes/layout) to each user language as well, by making that text into a system message itself and translating appropriately. Confining my own personal scope to just the types of English avoids that particular hurdle. As it is, editing the system messages is restricted to a higher level of user rights than I have — sysop, I believe — so I can merely investigate what changes should be made rather than actually implementing them. -Stelio (talk) 22:15, 20 March 2018 (UTC)
@Stelio: What I mean is that it should appear no matter what for all languages, even if we haven't translated it to that language, in which case it should appear in English. --WikiTiki89 23:10, 20 March 2018 (UTC)

eq= in {{given name}}

Can someone link this parameter to English sections? That's its sole purpose, and it creates a bunch of black links on pages where the form is the same as the English name. See Teresa. Ultimateria (talk) 21:27, 16 March 2018 (UTC)

Polish m-pr declension template

I get an error message when using {{pl-decl-noun-m-pr}} for proper nouns with certain endings (-ła, -la, -ka, -sa, -wa, -za, -zia, -cia, -nia, -na). For example, see

Am I the only person who can't get the template to work for these endings? I can type out the declension manually but this does take more time. Hergilei (talk) 20:49, 17 March 2018 (UTC)


Why is ab a different format in the namespace? (larger sans serif font vs. smaller serif font) Is it just me or is it like this for everyone? Are there other pages like this? – Gormflaith (talk) 00:47, 21 March 2018 (UTC)

That's the page title; a namespace is a different sort of thing (like Main vs Talk vs User).
As for why: it's because the modules think Äynu should be in Arabic script, and its being on a Latin-script page is messing with the fonts. Perhaps @-sche can assess whether we should move the entry or add Latn as a valid script. —Μετάknowledgediscuss/deeds 00:58, 21 March 2018 (UTC)
It's tricky, because the language is secret and documentation is sparse. All that I can get ahold of is either in Latin script (but old), or phonetic (complete with square brackets), or phonetic-like but using e.g. š not ʃ and no brackets. The two "big" references known to Glottolog are the one cited in the entry, which cites words phonetically in square brackets, and Hayasi's Šäyxil Vocabulary, which I haven't been able to get ahold of, but Hayasi's piece in Copies Versus Cognates in Bound Morphology, and other works which cite the Šäyxil Vocabulary for the Eynu words they give, use (phonetic-like but bracket-less) Latin script spellings. Given the current paucity of documentation in the Arabic script, my inclination would be to add Latn as another script for the language. - -sche (discuss) 01:26, 21 March 2018 (UTC)
Pardon my ignorance, but why should the page title be affected by the inclusion of a particular language? If there is anything in the content of a page which affects presentation outside of its context we should fix that. If we need to account for the page-title script we should do so in the global .js or .css, not within the page content and certainly not within modules. - TheDaveRoss 15:03, 21 March 2018 (UTC)
@TheDaveRoss: I suspect that what was probably happening was that Module:headword was adding a display title for Arabic script. Arabic is one of the scripts for which this is done, listed in Module:headword/data. Arabic was detected by findBestScript, which returned what was the only script listed for Aynu: Arabic. I introduced the display title feature a while back during a discussion about vertical scripts here. Script classes can be added with JavaScript (see scriptTitles.js), but there isn't yet a script that does everything that Module:script utilities does, like replacing spaces with newlines in vertical scripts, and there may be some users who don't have JavaScript enabled. On the other hand, adding display titles with Lua is inefficient, because it's done multiple times on a page if there are multiple headwords. — Eru·tuon 20:20, 21 March 2018 (UTC)

A passage-converting script

For anyone who knows how to code, I've always dreamed of having a personal Wiktionary script that converts passages of text into organized lists of links for each word in the text. For example:

"I really love to eat sugar."

would turn into

{{l|en|[[eat]] [[I]] [[love]] [[really]] [[sugar]] [[to]]}}

I wouldn't put these lists in the dictionary namespace obviously though; only in my personal userspace pages. I want to do it so I can automatically generate lists of blue links/redlinks/yellowlinks for Danish words from specific passages of text (since there are a LOT of red/yellow links for Danish that need attention) instead of having to do it manually. Also, in the cases where there's a capitalized word after a period, exclamation point, or question mark, the letter that follows will be automatically lowercased, since the script will detect this.

Is anyone willing to make a script like this just for me? It might actually do good for other users too in their situations. Thanks! PseudoSkull (talk) 16:27, 23 March 2018 (UTC)

@PseudoSkull: See Module:sandbox for a mockup version. It's not entirely functional yet though. — Eru·tuon 18:19, 23 March 2018 (UTC)
I do this manually for Irish in User:Mahagaja/Sandbox. One thing to keep in mind is multiword phrases: if the script takes the words out of their natural order and puts them in alphabetical order, how will you keep the words of multiword phrases together? —Mahāgaja (formerly Angr) · talk 18:25, 23 March 2018 (UTC)
This is approximately what User:Equinox/code/ExtractBookWords does. Equinox 18:34, 23 March 2018 (UTC)
I do it with a regex (that also allows me to get rid of common words so there's less to scan through). But I would welcome something more efficient. —Μετάknowledgediscuss/deeds 19:06, 23 March 2018 (UTC)
My quickie low-tech method: in a text editor like TextEdit or Notepad, use cmd-f or ctrl-h find-and-replace to convert all punctuation to spaces, then get rid of extra spaces, then convert all spaces to ]] [[. You can then copypaste it into the edit window. If you don't want to sort it by hand, convert the spaces to newlines so you can copypaste it into a spreadsheet (first set the format in the column(s) to text so it won't choke on things like leading hyphens), then copypaste it back to the text file to finish the formatting. It works well enough that I haven't bothered to find a more elegant way to do it. Chuck Entz (talk) 22:09, 23 March 2018 (UTC)
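The conversion discussed in this thread can also be sketched offline in Python. This is separate from the Module:sandbox mockup and the other tools mentioned above, and the function name is made up. It lowercases a letter that follows sentence-ending punctuation, extracts the words, removes duplicates, and sorts them case-insensitively. As Mahāgaja points out, multiword phrases are not kept together, and proper nouns at the start of a sentence would be wrongly lowercased.

```python
import re


def passage_to_links(text: str) -> str:
    """Turn a passage into an alphabetized, de-duplicated list of [[links]]."""
    # Lowercase a letter that directly follows ., ! or ? plus whitespace.
    text = re.sub(
        r"([.!?]\s+)([^\W\d_])",
        lambda m: m.group(1) + m.group(2).lower(),
        text,
    )
    # A word is a run of letters, optionally joined by apostrophes or hyphens.
    words = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text)
    unique = sorted(set(words), key=lambda w: (w.casefold(), w))
    return " ".join(f"[[{w}]]" for w in unique)


print(passage_to_links("I really love to eat sugar."))
```

For the example sentence this yields the same link list as in the original request, without the surrounding {{l|en|...}} wrapper, which can be added around the output if wanted.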