Wiktionary:Grease pit/2014/August

Pages Running Out Of Time and Memory

I've been seeing a good number of pages lately in Category:Pages with module errors with module errors saying "The time allocated for running scripts has expired" or "Out of memory". It's not language-specific: I've seen it in monolingual pages of various languages (including English), as well as multilingual ones. So far, they've all cleared with a null edit, but it's a clear sign that some recent change has brought us to the limits of what the system can sustain. Does anyone have any idea what's going on? Chuck Entz (talk) 19:11, 1 August 2014 (UTC)[reply]

Those messages may also be the result of an infinite loop. Some of Kephir's recent changes may have caused them. —CodeCat 19:18, 1 August 2014 (UTC)

Non-template items are affected, too: the interwikis are showing up as redlinks to Wiktionary destinations in the entry and not in the sidebar. The language-name links from the {{etyl}} template are redlinks, as well, so it looks like everything stops before the system does some kind of link-conversion step. Chuck Entz (talk) 19:35, 1 August 2014 (UTC)[reply]

The times I've seen it, purging the page fixed it. I think it's just a random delay. @Chuck: If any module on the page takes too long, it will screw everything up that comes after it, not just that instance of the module. --Wiki Tiki 89 23:15, 1 August 2014 (UTC)[reply]

The point is, it only started happening recently, but now it happens regularly. I was careful to say "a recent change" without being specific about whether it's in a module or in the system. The fact that I've cleared these and had new ones come back over several days is worrying.

If there is something wrong, it's going to be hard to track down, because the breaking point is where the cumulative execution time hits a certain point, which is often not the point at which the delay occurs. It might even be something minor common to multiple items on the page, which only in combination are enough to push the total over the line. Chuck Entz (talk) 00:29, 2 August 2014 (UTC)[reply]
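A toy illustration of the point about cumulative execution time. The per-invocation costs and the 10-second budget below are made-up numbers, not measurements from the servers, and `first_failure` is a hypothetical helper. It only shows that the invocation where the budget is crossed need not be the slow one:

```python
def first_failure(costs, budget):
    """Return the index of the invocation at which the running total
    first exceeds the budget, or None if the budget is never exceeded."""
    total = 0.0
    for index, cost in enumerate(costs):
        total += cost
        if total > budget:
            return index
    return None


# Hypothetical costs in seconds: one slow call early on, cheap calls afterwards.
costs = [0.1, 6.0, 0.1] + [0.5] * 10

failing = first_failure(costs, budget=10.0)
slowest = max(range(len(costs)), key=costs.__getitem__)

print("error appears at invocation", failing)
print("slowest invocation is", slowest)
```

Here the error would surface at a cheap invocation late on the page, while the expensive call sits near the top, which is why the visible error location is a poor guide to the cause.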

One idea for how to troubleshoot this issue is to invoke each common module and template in turn in a sandbox an increasing number of times (i.e. put 100 {{l}}s in a row, then 200, etc.) until a timeout error occurs with reliable frequency. If {{foobar1}} runs into an error only when invoked 200 times, but {{foobar2}} runs into one when invoked a mere 60 times, that will tell us which module is eating the most resources. Disclaimer: such testing might put strain on the servers and be a bad idea (I don't know)... OTOH, the problem itself would seem to put strain on the servers, so going through some strain to identify the problem might be worth it in the long run. - -sche (discuss) 03:30, 2 August 2014 (UTC)

Presumably the purpose of the timeout is to avoid strain on servers. Therefore I don't think this form of testing should cause much strain. --Wiki Tiki 89 20:50, 2 August 2014 (UTC)[reply]

OK, I tested the following:

{{term|lang=en|foobar}}: Saving the page, which is still quick with up to 350 transclusions, starts to take longer once 400 transclusions are present. It takes an average of 1 min 35 sec to save the page when 4400 transclusions are on it, and it times out after an average (based on four data points) of 4240 transclusions.
{{l|en|foobar}}: Saving the page is quick until about 600 transclusions, whereupon it starts taking longer. Saving the page with 4400 transclusions on it takes an average of 1 min 8 sec. I tested all the way up to 6000 transclusions and it never timed out.
{{m|en|foobar}}: Saving the page is quick until about 600 transclusions, whereupon it starts taking longer. Saving the page with 4400 transclusions on it takes an average of 1 min 18 sec. I tested all the way up to 6000 transclusions and it never timed out.
{{head|en|noun}}: Saving 100 transclusions takes 4 seconds; 300 takes 10 seconds; 600 takes 10 seconds; 1000 takes 20 seconds; 2000 takes 33 seconds; 3000 takes 44 seconds. Saving the page with 4400 transclusions on it takes an average of 1 min 9 sec. Even with 6000 transclusions on the page (it took 1 min 37 sec to save the page with that many transclusions on it) it doesn't time out.
{{head|en|noun|g=f|plural|foobar}}: It times out after an average (based on four data points) of 4255 transclusions. After that threshold is crossed, further uses of {{head}} curiously do not all necessarily turn into red "module error" messages; sometimes, after a swatch of a few hundred module error messages, the remaining transclusions turn into simple links to "Template:head".

- -sche (discuss) 08:04, 3 August 2014 (UTC)[reply]

Next, I'll test the linking templates with more parameters set. I'm intrigued that {{term}} eats noticeably more resources than {{m}}; is there some benefit to using positional vs named parameters? I suppose I can test that by testing {{l|en|foobar||this is a gloss}} against {{l|en|foobar|gloss=this is a gloss}}. - -sche (discuss) 08:10, 3 August 2014 (UTC)[reply]

More results:

{{term|lang=en|foobar|gloss=glossy}}: when 4400 transclusions are present on the page, saving the page takes an average of 1 min 53 sec, and the template times out and starts spewing module errors after an average (based on five data points) of 4050 transclusions.
{{term|lang=en|foobar||gloss}}: when 4400 transclusions are present, it takes an average of 1 min 53 sec to save the page, and starts spewing module errors after an average (based on five data points) of 4092 transclusions.
{{m|en|foobar||gloss}}: with 4400 transclusions present, saving the page takes an average of 1 min 37 sec. There are no module errors. Saving the page after adding enough transclusions to bring the total to 6000 takes 2 min 10 sec, and it fails in an interesting way. 5089 transclusions work just fine, and then two transclusions resolve into links to "#invoke:links/templates", and the rest of the transclusions resolve into links to "Template:m". The page finds itself in Category:Pages where template include size is exceeded, but does not display any module errors.
{{m|en|foobar|gloss=glossy}}: with 4400 transclusions present, saving the page takes an average of 1 min 37 sec; all transclusions work, there are no module errors. Saving the page after adding enough transclusions to bring the total to 6000 takes 2 min 17 sec, 5064 transclusions work, the 5065th resolves to "#invoke:links/templates", and the rest resolve to "Template:m".

It seems there is exactly no difference in how long it takes to save a page with the gloss set as a numbered parameter vs a named parameter, and there is little difference (attributable to mere noise) in how many resources each eats. It is apparent that there is, however, something about Template:term which causes it to eat more resources and fail more quickly than Template:m or Template:l. - -sche (discuss) 19:35, 4 August 2014 (UTC)

Perhaps it is the call to {{#invoke:term cleanup|cleanup}}. --Wiki Tiki 89 19:44, 4 August 2014 (UTC)[reply]

@-sche For comparative testing: are the contents of the links always identical and to a page that exists?

I wonder how long it takes to save a page with 300, 4400, and 50,000 bare links, piped links, bare links to section headers, and piped links to section headers. DCDuring TALK 21:31, 4 August 2014 (UTC)

For all the tests of Template:l, Template:m, and Template:term, I used an identical mix of ~60% links to the English sections of the pages 1-10 and one-ten, and ~40% links to the French sections of those pages. Of the numerals, 1, 2, 4, 6 and 8 have English sections, while only the number 2 has a French section; of the words, all have English sections, while only four and six have French sections. In the wild, all of the linking templates sometimes point to entries/sections which exist, and sometimes point to entries/sections which don't. I wouldn't expect it to make a difference from the template's point of view whether the section exists, but I suppose I could test that. For the tests of Template:head, I used 50% {{head|en}} and 50% {{head|fr}}. - -sche (discuss) 21:57, 4 August 2014 (UTC)[reply]
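For anyone wanting to reproduce a comparable mix, here is a rough sketch. The 60/40 split and the twenty page titles come from the description above; the ordering of the links, the function name and the exact rounding are assumptions. It only covers templates that take the language as the first positional parameter (as {{l}} and {{m}} do in the tests), not the `lang=` form used with {{term}}:

```python
NUMERALS = [str(n) for n in range(1, 11)]
WORDS = ["one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine", "ten"]
TITLES = NUMERALS + WORDS


def mixed_links(template, count, english_share=0.6):
    """Return wikitext with `count` transclusions of `template`, the first
    `english_share` of them pointing at English sections and the rest at
    French sections, cycling through the twenty page titles."""
    english_count = round(count * english_share)
    lines = []
    for i in range(count):
        lang = "en" if i < english_count else "fr"
        title = TITLES[i % len(TITLES)]
        lines.append("{{%s|%s|%s}}" % (template, lang, title))
    return "\n".join(lines)


sample = mixed_links("l", 100)
print(sample.split("\n")[:3])
```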

Thanks, especially for the rationale. I subsequently realized that I can copy the exact pages you ran by looking at your userpage history, rerun some for calibration, and modify them to answer my specific questions. DCDuring TALK 00:27, 5 August 2014 (UTC)[reply]

I think I'm beginning to see a pattern: there are days when this doesn't happen at all, but today and yesterday there were quite a few. There are at least a couple of the maintenance reports at Special:SpecialPages that were updated yesterday. I think the extra load on the system might be slowing things down to the point where some of the more script-intensive pages are timing out. Chuck Entz (talk) 14:08, 11 August 2014 (UTC)[reply]

I was also making a lot of changes to modules so that might also be it. —CodeCat 14:16, 11 August 2014 (UTC)

Can Persian be added to the input keyboards?

I notice that there are input tools available for languages other than English now. Is there any chance that a Persian keyboard could be added? Kaixinguo (talk) 18:34, 3 August 2014 (UTC)[reply]

@Kaixinguo. Do you mean MediaWiki:Edittools#Arabic? You can type in Persian (Arabic, Urdu, Kurdish) using the tool and add transliteration symbols missing from a standard keyboard, e.g. â, š, ž, č and ğ. --Anatoli T. (обсудить/вклад) 02:50, 1 September 2014 (UTC)

-1

Any idea what is generating the 937 links to -1? See Special:WhatLinksHere/-1. I presume it's a template? It doesn't seem to be a problem, it's just intriguing... - -sche (discuss) 15:40, 5 August 2014 (UTC)[reply]

What all those links seem to have in common is a Scots entry. --Wiki Tiki 89 15:43, 5 August 2014 (UTC)[reply]

It is the line {{#if:{{#ifeq:{{{1|-}}}1|-||{{#invoke:ugly hacks|is_valid_page_name|{{{1|-}}}1}}}} in {{sco-noun}} (and likely other Scots headword templates). Although I'm not quite sure how to cleanly fix it in a way that both works and does not link to something random. --Wiki Tiki 89 15:57, 5 August 2014 (UTC)[reply]

Perhaps User:Kephir can help, since he has been working with the ugly hacks stuff. --Wiki Tiki 89 15:58, 5 August 2014 (UTC)[reply]

Just convert {{sco-noun}} to use {{head}}. It will go away. — Keφr 18:43, 7 August 2014 (UTC)[reply]

I still don't quite understand why it's generating these links. I searched the page encyclopaedia (one of the "linking" pages to -1), but I didn't find any links to "-1" anywhere, either in the edit form or on the actual page... Strange. Rædi Stædi Yæti {-skriv til mig-} 07:22, 16 August 2014 (UTC)

Well actually, what's going on here is User:Kephir created the ugly hacks module with a is_valid_page_name function to replace the old {{isValidPageName}}, and he made the module essentially create a fake link to the page being queried. I didn't realize this before, but now I realize that this is the wrong behavior, since checking whether a page name is valid does not mean that you depend on the existence of the page, unlike checking whether a page exists, which should create a fake link. --Wiki Tiki 89 13:33, 16 August 2014 (UTC)[reply]

And anyway, couldn't -1 be an actual entry? I can kind of see why it isn't, but isn't it at least considerable?

Also, well maybe I shouldn't be speaking since I know very little about module code, but couldn't we make it so that it links to a Wiktionary user page or something so that it's not hectic in the mainspace namespace? Something that generally is not used and maybe deleted, such as Wonderfool's old user page. (Well maybe not that, lol, but you get the idea) Rædi Stædi Yæti {-skriv til mig-} 04:21, 18 August 2014 (UTC)[reply]

If you go to one of the pages that "links" to -1, I don't think you'll find anything to click on- look at graip, for instance. The only way you can tell that anything is referencing the page is by going to the nonexistent entry and clicking on "what links here", or by looking at Special:WantedPages. It's just part of the way templates and modules check to see if a page exists, and it doesn't show up in their final output. Chuck Entz (talk) 05:20, 18 August 2014 (UTC)[reply]

Like I already explained, checking whether a page name is valid is not the same as checking whether it exists. That is why I think something is wrong and there should not be a fake tie to the page. --Wiki Tiki 89 14:38, 18 August 2014 (UTC)
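The distinction drawn in this thread can be sketched as follows. This is a simplified model, not the actual code of the ugly hacks module or of MediaWiki: the set of forbidden characters is a deliberately incomplete assumption, and `PageChecker` with its `recorded_links` list is a hypothetical stand-in for the link table behind "What links here":

```python
# Simplified assumption: a few characters that cannot appear in a title.
INVALID_TITLE_CHARS = set("#<>[]|{}")


class PageChecker:
    def __init__(self, existing_pages):
        self.existing_pages = set(existing_pages)
        self.recorded_links = []  # stand-in for "What links here" entries

    def is_valid_page_name(self, title):
        """Pure string check: never consults the page store, so it has no
        reason to record a dependency on the page."""
        return bool(title) and not (set(title) & INVALID_TITLE_CHARS)

    def page_exists(self, title):
        """Existence check: the result depends on the page, so the
        dependency is recorded."""
        self.recorded_links.append(title)
        return title in self.existing_pages


checker = PageChecker(["graip"])
print(checker.is_valid_page_name("-1"), checker.recorded_links)
print(checker.page_exists("-1"), checker.recorded_links)
```

Under this model only the existence check would leave a trace pointing at -1, which is the behaviour being argued for above.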

Creating an entry for the Unicode replacement character

I just tried to create an entry at Unsupported titles/Replacement character for the Unicode replacement character. This triggered an abuse filter, so I would appreciate it if an administrator could help me. I was trying to create the page with the following text:

==Translingual==

{{Specials character info|hex=FFFD|name=REPLACEMENT CHARACTER|image=[[File:Replacement character.svg]]}}

# Unicode [[replacement character]], used to represent a symbol that the system is unable to [[render]].

Thanks —Mr. Granger (talk • contribs) 05:13, 7 August 2014 (UTC)[reply]

Created. {{unsupportedpage}} is not working for some reason. — Ungoliant (falai) 05:24, 7 August 2014 (UTC)

Thank you. I seem to be unable to add a link from Appendix:Unsupported titles so that it displays correctly. Help with this would also be appreciated. —Mr. Granger (talk • contribs) 05:30, 7 August 2014 (UTC)[reply]

Done as well. — Ungoliant (falai) 05:40, 7 August 2014 (UTC)

Duplicate mineral elements - fixable by bot?

I just became aware of this problem, inadvertently caused by me a while ago, where minerals' elements (auto-extracted from their chemical formulae) have been duplicated in some cases: [1]. It might only be hydrogen, or there might be other cases. If anyone with a bot and spare time feels like trying to resolve it, that would be kind! Equinox ◑ 03:37, 9 August 2014 (UTC)[reply]

Module:etymology language/data

I (and User:I'm so meta even this acronym) think that it would make etymologies more accessible (see e.g. οξύτονος) if Koine Greek were output by {{etyl}} rather than Koine. Unfortunately, my simple change to Module:etymology language/data led to new categories being assigned. However, the terms apparently derived from Koine (see Category:Terms derived from Koine) are only Latin: 1, English: 0, Greek: 34, which I would be happy to sort out.

Are any other effects likely? — Saltmarsh (απάντηση) 15:43, 9 August 2014 (UTC)

Probably not. Renaming it should be ok, as long as you monitor Category:Categories with incorrect name. —CodeCat 15:57, 9 August 2014 (UTC)

Thanks! — Saltmarsh (απάντηση) 16:02, 9 August 2014 (UTC)

Yeah, thanks CodeCat. — I.S.M.E.T.A. 16:40, 9 August 2014 (UTC)[reply]

A Peculiar Edit

I must be missing something obvious, but why did this edit (diff) have {{also|liriope}} display as See also: {{{1}}}? I also noticed that the list of templates below the edit window was lacking {{also}} and the module it invokes. Chuck Entz (talk) 00:23, 10 August 2014 (UTC)[reply]

I cut and paste from something that has that in it. It was not intentional and not a sign of something technically wrong. DCDuring TALK 00:32, 10 August 2014 (UTC)[reply]

If {{also}} on a page has as an argument the name of that page, it shows as you saw it. That is supposed to draw an editor's attention - and usually does. DCDuring TALK 00:40, 10 August 2014 (UTC)[reply]

[Post–edit-conflict]: @Chuck Entz: {{also}} is designed to omit self-referential links; see Template talk:also#Implement self-hiding. — I.S.M.E.T.A. 00:43, 10 August 2014 (UTC)[reply]

I corrected the ones for taxonomic names and English, but 240 or so remain. They are easy to search for. DCDuring TALK 00:49, 10 August 2014 (UTC)[reply]

@DCDuring: Do you mean like this? Wouldn't changing {{also}}'s code to this:

{{#if:{{{1|}}}|{{#invoke:Template:also|main}}|}}

fix this problem? — I.S.M.E.T.A. 00:59, 10 August 2014 (UTC)[reply]

I particularly like intègrerai: it was created by a WF bot five years ago, and all the edits since have been by other bots. Chuck Entz (talk) 01:12, 10 August 2014 (UTC)[reply]

Ah, that's the "something obvious" I missed. It did lead me to spot an erroneous interwiki ([[en:Template:also]]) in the template documentation. I also noticed [[Category:Limit of template reached| ]], which seems unnecessary now that the template is based on a module. Chuck Entz (talk) 01:00, 10 August 2014 (UTC)[reply]

Why do we insert these manually anyway? If a bot is good for anything it should be good for detecting headwords that differ only by diacritical marks and capitalization or other orthography. There is some need to avoid duplication between what appears in {{also}} and under the Alternative forms header. DCDuring TALK 02:14, 10 August 2014 (UTC)[reply]
I agree. I agitate from time to time for bot implementation of {{also}}. Somewhere around here I had a fairly detailed proposal for it, which I'll find if anyone is interested. - -sche (discuss) 03:32, 10 August 2014 (UTC)[reply]
Here was the proposal to make existing {{also}} links symmetrical: Wiktionary:Grease pit/2012/July#symmetric_use_of_Template:also. And here was the proposal to "create {{also}} links between any pagetitles which are identical except that one contains character(s) with diacritic(s) and the other contains the same character(s) but with different or no diacritic(s)": Wiktionary:Grease pit/2012/August#Template:also_links_by_letter_decomposition. It was pointed out that some entries with very short titles might have more "twins" than {{also}} could handle, but we already have a way of handling such cases, namely Appendix:Variations of "a" et al. - -sche (discuss) 03:51, 10 August 2014 (UTC)[reply]
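The "letter decomposition" idea could look roughly like this. It is a sketch of one possible approach, not the 2012 proposal's actual code: titles are decomposed with Unicode NFD, combining marks are dropped, and case is folded (the case folding reflects the earlier remark about capitalization and is an assumption here). The function names are illustrative:

```python
import unicodedata
from collections import defaultdict


def fold_title(title):
    """Reduce a page title to a comparison key: decompose it, drop
    combining marks (diacritics), and fold case."""
    decomposed = unicodedata.normalize("NFD", title)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.casefold()


def also_groups(titles):
    """Group titles that share a comparison key; only groups with at
    least two members would need {{also}} links."""
    groups = defaultdict(list)
    for title in titles:
        groups[fold_title(title)].append(title)
    return {key: sorted(members)
            for key, members in groups.items() if len(members) > 1}


print(also_groups(["liriope", "Liriope", "intègrerai", "integrerai", "graip"]))
```

Letters that are distinct code points rather than base letter plus diacritic (for example ø or ł) are not touched by this decomposition, so a real implementation would need an extra mapping for them.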

Module:ru-verb help needed

In function "conjugations["6c"]" I'm passing an additional parameter no_iotation=y to change the behaviour of the "present_je_c" function. It didn't work, though.


<pre>
	local no_iotation = args[no_iotation]; if no_iotation == "" then no_iotation = nil end
...
	if no_iotation then
		present_je_c(forms, stem, no_iotation)
	else
		present_je_c(forms, stem)
	end
</pre>

I have removed "or no_iotation" but this is what I tried:

<pre>
	-- Verbs ending in a hushing consonant do not get j-vowels in the endings.
	if mw.ustring.find(iotated_stem, "[шщжч]$") or no_iotation then
		forms["pres_futr_1sg"] = iotated_stem .. "у"
	else
		forms["pres_futr_1sg"] = iotated_stem .. "ю"
	end
</pre>

Purpose: For the verb стона́ть (stonátʹ) the 1st person singular is "стону́", so it should use the first branch, before "else". --Anatoli T. (обсудить/вклад) 07:41, 10 August 2014 (UTC)

Problem fixed by User:Wyang. Thank you! --Anatoli T. (обсудить/вклад) 23:48, 10 August 2014 (UTC)

No worries. The present adverbial participle is стоня́, is this correct? Wyang (talk) 23:49, 10 August 2014 (UTC)[reply]

The forms "стона́в" and "стона́вши" are much more common and preferred (from the 2nd conjugation pattern) (also "стена́я" from related verb "стена́ть"). However, "стоня́" is also used, which may be considered a less standard form. I will check if "стоня́" should be overwritten with "стона́в" and "стона́вши" but it seems attestable and looks like a dated form. BTW, many verbs are seldom used in certain forms; the adverbial participle for стона́ть seems relatively rare. --Anatoli T. (обсудить/вклад) 00:21, 11 August 2014 (UTC)

Module:fro-verb is open for business

The purpose of this module is to implement Old French verb conjugations. These conjugations have a number of problematic aspects:

Complex but regular phonological adjustments in parts of the present-tense singular, when adding morphological zero, -s and -t.
Multiple sets of endings, used in different circumstances (e.g. there are distinct endings when the final consonant of the stem was once a palatal consonant).
Multiple alternatives for endings, more or less interchangeable but often marked for particular dialects or time periods.
Multiple possible formations for e.g. the preterite: weak-a, weak-a2, weak-i, weak-u, strong-i, strong-o, strong-u, strong-st, strong-sd, etc.
All manner of irregularities, which often involve the stem (leaving the endings alone) but sometimes affect particular forms (e.g. 1st-singular present indicative).
Very frequently, multiple more-or-less-interchangeable alternative ways of conjugating a particular tense for a particular verb.

I handle all of this. You can

specify multiple stems;
specify distinct stressed and unstressed stems in the places where such a distinction makes sense (present indicative, present subjunctive, imperative, preterite);
override particular forms either by replacing all alternatives, inserting alternatives at the beginning or end or replacing a particular alternative by index;
specify the entire conjugation of a given tense in a compact fashion, useful when there are lots of irregularities, e.g. pres=vois,vai/vais,vas/vait,va/alons/alez/vont for the present tense of aler, where slashes separate forms in the paradigm and commas separate alternatives within a given form, i.e. 1st singular is either vois or vai, 2nd singular is either vais or vas, etc.;
request palatal endings;
request two distinct types of archaic/dialectal endings (useful if the stem is also archaic or dialectal);
add a prefix to all forms, to easily implement verbs like convenir or secorre, with the same conjugation as venir or corre;
specify a search/replace that applies to all forms, useful when listing the conjugation of a verb with non-standard spelling (e.g. nestre should conjugate like naistre but with all occurrences of nais- replaced by nes-) -- note that prefixes could be implemented this way, since search/replace uses Lua patterns (aka crippled regexes);
etc.
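The compact tense notation described in the list above (slashes between forms of the paradigm, commas between alternatives within a form) can be illustrated with a small parser. This is a Python sketch of the notation only, not the module's Lua code; the function names and the person labels are assumptions, with the label order following the "1st singular, 2nd singular, ..." reading given above:

```python
PERSONS = ["1sg", "2sg", "3sg", "1pl", "2pl", "3pl"]


def parse_tense_spec(spec):
    """Split a compact tense spec into a list of forms, each of which is a
    list of interchangeable alternatives."""
    return [form.split(",") for form in spec.split("/")]


def label_forms(spec):
    """Attach person labels to the parsed forms, checking that the spec
    covers the whole six-form paradigm."""
    forms = parse_tense_spec(spec)
    if len(forms) != len(PERSONS):
        raise ValueError("expected %d forms, got %d" % (len(PERSONS), len(forms)))
    return dict(zip(PERSONS, forms))


present = label_forms("vois,vai/vais,vas/vait,va/alons/alez/vont")
print(present["1sg"])  # ['vois', 'vai']
print(present["3pl"])  # ['vont']
```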

This is used by four primary templates:

{{fro-conj-er}} (for group I -er verbs);
{{fro-conj-ier}} (for group I -ier verbs);
{{fro-conj-ii}} (for group II verbs, i.e. -ir verbs with -iss- infix);
{{fro-conj-iii}} (for group III verbs, i.e. -ir verbs without -iss- infix, plus -re, -oir, and -eir verbs as well as a few -er verbs where the -er here is an Anglo-Norman reduction of -eir).

These templates are thoroughly documented.

This code was originally based on Module:it-conj but greatly expanded and rewritten. In turn it could serve as the basis for another language with similarly complex and irregular verbal morphology.

Please feel free to look over the code and/or the documentation and give me comments, suggestions, critiques, etc.

Benwing (talk) 10:06, 12 August 2014 (UTC)[reply]

Stray word "valid"

Some recent edit to {{ga-proper noun}}, presumably one of these two, has had the effect that the word "valid" appears at the end of the headword line; cf. [[Cáisc]]. I have no idea how to fix it, though; could someone else please look into it? —Aɴɢʀ (talk) 15:04, 12 August 2014 (UTC)[reply]

Actually, it was broken before. (Happens.) But fixed. — Keφr 15:13, 12 August 2014 (UTC)[reply]

㞍 (U+378D) and Template:ja-readings not working correctly

For some reason, the "on=" (Japanese on form) and "kun=" (Japanese kun form) reading information in Template:ja-readings isn't displaying for me (in two different browsers). I don't know if it has something to do with the character being in the CJK Unified Extension A range or if it's just some other miscellaneous bug. This is similar to the issue I reported back in June regarding 𩙿 (U+2967F) in Extension B (which, for now, is displaying correctly). Bumm13 (talk) 17:04, 13 August 2014 (UTC)[reply]

Not my area of expertise, or even work, but I notice that Module:ja is coded only to accept CJK characters, and, in doing so, missed the Extension A block causing this problem. I have to wonder why this restriction exists in the first place. ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 17:29, 13 August 2014 (UTC)[reply]
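To make the range issue concrete, here is a sketch of a character check that includes the extension blocks. It is not the code of Module:ja (which is Lua and whose actual test is not quoted here); the block boundaries are the Unicode ranges for the main CJK Unified Ideographs block and Extensions A and B, and later extension blocks are left out for brevity:

```python
CJK_RANGES = [
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0x3400, 0x4DBF),    # CJK Unified Ideographs Extension A
    (0x20000, 0x2A6DF),  # CJK Unified Ideographs Extension B
]


def is_cjk_ideograph(ch):
    """Return True if the single character `ch` falls in one of the
    listed CJK ideograph blocks."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in CJK_RANGES)


print(is_cjk_ideograph(chr(0x378D)))   # the Extension A character from this thread
print(is_cjk_ideograph(chr(0x2967F)))  # the Extension B character reported in June
```

A check that lists only the first range would reject U+378D, which matches the symptom described above.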

Polysynthesis help

Is the Grease pit the right place to ask this? What's the general policy with creating conjugation/declension templates for heavily polysynthetic languages: is it possible or even desired? —Jakeybean TALK 21:53, 13 August 2014 (UTC)

I'd say the Beer Parlour is the more appropriate place to discuss the question of whether it's desirable, and the Grease Pit is the place to discuss how to do it if it is desirable. We do have a few entries like xłp̓x̣ʷłtłpłłs and xłp̓x̣ʷłtłpłłskʷc̓, and the latter survived an RFD, but I don't think we've ever discussed the possibility of inflection templates that would generate such forms. —Aɴɢʀ (talk) 22:20, 13 August 2014 (UTC)[reply]

There are some somewhat-large tables in Category:Finnish verb inflection-table templates. There are also some specific Georgian inflected forms like გვფრცქვნი which we have because they, like xłp̓x̣ʷłtłpłłs(kʷc̓), are interesting — but they don't seem to be linked-to from their lemma entries (ფრცქვნის?). So, what Angr said. - -sche (discuss) 22:58, 13 August 2014 (UTC)[reply]

Thanks for your input. My main contributions at the moment are in Greenlandic so the problem for me is knowing where to stop. For example if you take the verb 'atuarpoq' (to read), I could provide purely the indicative forms, but each one comes with a negative counterpart:

atuarpunga (I read) + atuanngilanga (I don't read)

atuarputit (you read) + atuanngilatit (you don't read) etc. for each person

There are also interrogative, imperative, optative, conjunctive, past subordinative, future subordinative, habitual subordinative, participial moods and others that are sort of semi modes, with each of those having positive and negative forms. Also, each mode has a transitive inflection so for the verb 'asavoq' (to love) you could have 36 different combinations purely in the indicative e.g. You love him (asavat), they love me (asavaanga), you pl. love us (asavassigut) and so on... I think the best thing to do would be to start by creating an indicative form template, including positive and negative forms and maybe the transitive combinations as well. There already exists a basic noun declension table which only includes the bare noun cases without any further affixing such as possessive markers, so maybe simplicity is best? —Jakeybean TALK 00:22, 14 August 2014 (UTC)

In addition to those mentioned above, there's Hebrew, which has binyanim (different forms of the stem that conjugate through the whole paradigm), and Turkish- but there we have a contributor for that language who is the poster child for the "don't know when to stop" problem, and has created all kinds of unnecessary templates and categories. I haven't gotten that far with Cahuilla verbs, but those will definitely have similar issues, so I've started to think a little about it. How much do the affixes interact phonologically with the stems and with each other? If they stay discrete, you could have one complete subparadigm, and a list of the counterparts to the part that stays the same within the subparadigm (I don't want to say the root, because I'm sure each one contains its own combination of affixes). At any rate, you would want to make heavy use of multiple collapsible boxes, which, if memory serves, can even be nested within each other. Chuck Entz (talk) 02:59, 14 August 2014 (UTC)[reply]

I'm not worried about the layout of the tables, but of whether we need to mass-create all the red links produced by these tables. (Also, I wouldn't consider Hebrew binyanim, or their equivalents in other Semitic languages, to be examples of polysynthesis.) --Wiki Tiki 89 14:30, 14 August 2014 (UTC)[reply]

Finnish has the case of possessive suffixes and other suffixed particles, which exponentially increase the number of possible word forms. Latin also has a few clitic suffixes which we agreed not to include as entries. Zulu and other related languages include not just the subject but also the object noun class into the verb, so that makes for some 100+ forms for each mood-tense-voice combination, totalling perhaps 1000 forms per verb. And there are probably more still. —CodeCat 14:32, 14 August 2014 (UTC)

English has John 's 're, Hebrew has, as an extreme example, וּ כְ שֶׁ בְּ בֵיתְ כֶם (u-kh'-she-b'-veit'-khém, “and when in your house”), with a similar situation in other Semitic languages, and we do not include any of these constructs as individual words. --Wiki Tiki 89 15:04, 14 August 2014 (UTC)[reply]

The mention of Finnish seems apt. If some of the categories of Greenlandic inflected forms (e.g. negative forms, possessive forms, etc) differ from other, 'simpler' categories of inflected forms (e.g. plain present tense) only in some predictable way, such as by the addition of an affix, it would probably make sense to leave those (negative, possessive, etc) forms out of the inflection tables and not bother creating entries for them except in exceptional cases, such as when the same string that is a negative possessive third person verb form also happens to be a noun, or when the string is oft-mentioned on account of some exceptional quality. The full paradigms of a few representative verbs could (and IMO should) then be provided in an appendix. - -sche (discuss) 16:01, 14 August 2014 (UTC)[reply]

I think that sounds like a good idea. I'll come up with a basic indicative template and provide an appendix with full paradigms (or as full as one can be without becoming ridiculous!). —Jakeybean TALK 17:47, 14 August 2014 (UTC)

But what about looking up such forms? Given that we have no real space restrictions, can we still have entries for them? —CodeCat 17:51, 14 August 2014 (UTC)

I'm against it. --Wiki Tiki 89 18:16, 14 August 2014 (UTC)[reply]

I wasn't saying Hebrew was polysynthetic, just that it might be helpful to look at because it's an example of how experienced, sophisticated editors have dealt with an extra dimension in the paradigm. Polysynthesis isn't really that useful of a term because no one seems to completely agree on what it means, but I'm not so sure that any of the languages discussed are universally considered polysynthetic, anyway. Chuck Entz (talk) 02:48, 15 August 2014 (UTC)[reply]

With these sorts of languages I find it hard to know what to prioritise for inclusion: whether the viewer would benefit more, for instance, from a breakdown of the different moods, or if the indicative is enough in itself but with both intransitive and transitive paradigms... neither, or both... and with both outcomes whether the negative of each inflection is desired. Greenlandic in particular doesn't really indicate tense in ways we're used to in Indo-European; the indicative verb can mean both past and/or present, but instead they add morphemes which define tense, such as: vague future, inevitable future, recently, a long time ago etc. and I think we'd all agree those are better off in an affix Appendix. Hard to know where the line should be drawn. —Jakeybean TALK 12:39, 15 August 2014 (UTC)

I think the best thing to think about is which parts of the inflection can differ depending on the word itself (these should always be in the table), and which parts are added unchanged to any word of that type (these are ok to move to an appendix). --Wiki Tiki 89 12:47, 15 August 2014 (UTC)[reply]

Is there a way without a bot to determine if an entry is in category A but not category B?

In particular, I'm trying to find Old French verbs lacking conjugations, which will be in Category:Old French verbs but not in any of the categories created by the conjugation module. Benwing (talk) 04:33, 15 August 2014 (UTC)[reply]

There's the API, but I think it would be easier to download an XML dump and scan it offline. DTLHS (talk) 04:39, 15 August 2014 (UTC)[reply]

Bots can retrieve lists of pages in a category. If you use a language like Python that can handle sets, then it's a matter of doing some set operations to get the result you want. —CodeCat 13:06, 15 August 2014 (UTC)

Any language "can handle sets". --Wiki Tiki 89 13:58, 15 August 2014 (UTC)[reply]

I think his point is that some semi-crippled languages like Lua don't have built-in APIs that implement sets, so you end up needing to implement them yourself using hash tables. Benwing (talk) 16:08, 15 August 2014 (UTC)[reply]

A hash table is a set, but with a few extra features, and without built-in set operations like unions, intersections, etc., which are all very easy to implement. --Wiki Tiki 89 16:12, 15 August 2014 (UTC)[reply]
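
To illustrate the point that set operations are easy to build on top of a hash table, here is a rough sketch in Python that deliberately avoids the built-in `set` type and uses a plain dictionary mapping each member to `True`, much as one would with a Lua table. All function names are made up for the example.

```python
def make_set(items):
    """Represent a set as a hash table whose keys are the members."""
    return {item: True for item in items}


def union(a, b):
    """Members of a or b."""
    result = dict(a)
    result.update(b)
    return result


def intersection(a, b):
    """Members of both a and b."""
    return {key: True for key in a if key in b}


def difference(a, b):
    """Members of a that are not in b."""
    return {key: True for key in a if key not in b}


verbs = make_set(["aler", "amer", "boivre"])
conjugated = make_set(["amer", "aler"])
print(sorted(difference(verbs, conjugated)))
# prints ['boivre']
```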

Objectionable coding convention saying "don't comment code"

In WT:Coding conventions it says

Comments should be used sparingly; good code does not need much commenting. Keep comments brief and to the point. Do not put ASCII art in comments.

Comments should not be used for documentation; use the documentation subpage instead.

I disagree with all of this. When you have 1000+ lines of code, you'd better have comments in there, otherwise it will take 3x as long to understand. I've worked with systems with 200,000+ lines of code and commenting is incredibly important. I get that the goal is to discourage stupid comments that don't add anything, but in reality lack of commenting is a much bigger issue than too much commenting. And too much commenting, if it ever happens, is easy to fix. As for "comments should not be used for documentation", (1) this will discourage people from commenting internal functions, whereas in reality pretty much *all* internal functions should be documented, and (2) yes in theory the doc page should be kept up to date but in practice most of them aren't, and it's more likely that someone will comment in the source code than in the doc page.

I think we should add something like "add a comment at the top of all functions describing what it does". This doesn't have to be long if it's obvious, but much of the time, it's not.

Also, I think we should add something like "always write a change summary for every change you make". I've noticed that certain users (CodeCat, among others) don't normally do this, and it makes it much harder to sort through the change history.

Benwing (talk) 04:54, 15 August 2014 (UTC)[reply]
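
As an illustration of the proposed "comment at the top of every function" rule, a short internal helper might look like the following. The function is hypothetical and written in Python rather than Lua; the point is only the shape of the comment, which says what the function does and why it fails the way it does.

```python
# Return the stem of `verb` by removing the infinitive ending `ending`.
# Raises ValueError if the verb does not end in `ending`, so that a wrong
# conjugation class is noticed instead of silently producing a bad stem.
def strip_ending(verb, ending):
    if not verb.endswith(ending):
        raise ValueError(f"{verb!r} does not end in {ending!r}")
    return verb[: len(verb) - len(ending)]


print(strip_ending("chanter", "er"))
# prints chant
```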

Agree about "always write a change summary"; this should be very strongly encouraged if not actually mandated. Equinox ◑ 07:18, 15 August 2014 (UTC)[reply]

This is a crazy convention. I used to understand Lua functions but these days I find that most of them are totally incomprehensible - bring back comments! SemperBlotto (talk) 08:11, 15 August 2014 (UTC)[reply]

The only part of the first line I agree with is "Keep comments brief and to the point." (well the ASCII art thing too, but I don't think it needs to be mentioned explicitly). But I entirely agree with the second line. --Wiki Tiki 89 14:01, 15 August 2014 (UTC)[reply]

Fortunately, it's just two people's opinion, not a real policy. DCDuring TALK 17:46, 15 August 2014 (UTC)[reply]

Language links

By the way, this has to do with the entire Wiktionary project, not just the English one, and I put this discussion here since this is like the central Wiktionary anyway.

So my experience with language links here is very tedious. One usually would add language links to the bottom of a page as such:

[[ar:word]]

[[da:word]]

[[fr:word]]

etc.

And as far as I know, on all language wikis, this is supposed to be handled by bots. In many Wiktionaries in other languages, thousands of pages have no language links at all, and bots have not added any for several years. There is also the problem of having to sort all of the language links alphabetically, which sometimes causes bugs in the bots' programs. And I know I have a lot of big ideas, but this one could actually be very useful to the project, well, ALL of the projects.

My idea is that we should have what Wikipedia has, where they have a central database that holds all of the Wikipedia articles and the articles in other Wikipedias with which they are associated. In other words, if we had this on Wiktionary, one could just add a language link to "edit links", and enter the name of the page that you want to add to the language links. The bots could also come in handy for adding these links, such as if a Wiktionary page for a word is added in another language, it would automatically be added to the list by a bot.

If we had a system like this, as opposed to the one we have now, it would be so much easier for Wiktionaries in all languages to have links to all of their associated pages. We may also want to have a filter, like we do now, that filters out pages for other words. For instance, it would filter out someone trying to add a language link to Danish menneske from the word human on the English Wiktionary.

I am not actually sure if we've had a discussion about this kind of thing before, and I also do understand that it would take a lot of work to set up and implement something as huge as this, especially for the entire project, including all of its subdomains, but I really think we should (re)consider this idea, since I personally think this would be very helpful for all of us. Rædi Stædi Yæti {-skriv til mig-} 04:51, 16 August 2014 (UTC)[reply]

Yes, this is probably what wikidata is supposed to do (seems simple to me, but who the fuck knows what it would actually take to make that happen). DTLHS (talk) 04:54, 16 August 2014 (UTC)[reply]

Yes, one of the things Wikidata was supposed to do was take interwiki linking off projects' hands. And Wiktionarians such as me have repeatedly suggested that Wikidata start taking Wiktionary's interwikis off our hands, since Wiktionary interwikis are easier to handle than other projects' — all the mainspace pages which are to be linked have the same title. However, Wikidata-ers seem uninterested in that, and instead have technically complex and IMO linguistically unsound and unmaintainable ideas about taking definitions off our hands and putting them in a repository so sea and de:Meer could transclude a single unified definition. (Never mind that even denotatively synonymous words are rarely connotatively synonymous.) Meh. You can read about the plans here; my most recent suggestion that Wikidata just take interwikis off our hands is here if you'd like to add your voice. - -sche (discuss) 06:56, 16 August 2014 (UTC)[reply]
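
Because Wiktionary interwikis only ever connect pages with the same title, the central lookup discussed in this thread reduces to a membership test per wiki. The following is a minimal sketch under that assumption; the data structure `titles_by_wiki`, the function names, and the sample titles are all hypothetical and do not describe how Wikidata or any existing bot works.

```python
def interwiki_links(title, titles_by_wiki, home="en"):
    """List [[code:title]] links for every other wiki that has a page with this exact title."""
    return [
        f"[[{code}:{title}]]"
        for code in sorted(titles_by_wiki)          # alphabetical by language code
        if code != home and title in titles_by_wiki[code]
    ]


def is_valid_link(source_title, target_title):
    """Filter from the proposal: reject links between pages for different words."""
    return source_title == target_title


# Hypothetical central index: wiki code -> set of page titles on that wiki.
titles_by_wiki = {
    "en": {"human", "word"},
    "da": {"menneske", "word"},
    "fr": {"word"},
    "ar": {"word"},
}

print(interwiki_links("word", titles_by_wiki))
# prints ['[[ar:word]]', '[[da:word]]', '[[fr:word]]']
print(is_valid_link("human", "menneske"))
# prints False
```

Sorting the wiki codes inside the function also removes the manual alphabetical ordering step that the original post describes as a source of bot bugs.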

New entries by language

3 August 2024: υποσυνείδητα
3 August 2024: σκαλώνω
3 August 2024: υποσυνείδητο
3 August 2024: συναισθηματικός
3 August 2024: μορφωμένος
3 August 2024: πρωτεύω
3 August 2024: συλλογισμός
3 August 2024: σκληρά
3 August 2024: συνειδητά
3 August 2024: συνειδητό
3 August 2024: συνειδητός
3 August 2024: υποσυνείδητος
3 August 2024: διαφωτισμός
3 August 2024: κατίνα
3 August 2024: Κατίνα
3 August 2024: κούρσα
2 August 2024: ατενής
2 August 2024: ατενίζω
2 August 2024: ατεμάχιστος
2 August 2024: ατελώνιστος

There used to be a list/feed (some years ago - not always reliable) of new entries in a chosen language - has this gone, or is it hidden somewhere? — Saltmarsh^{απάντηση} 19:12, 20 August 2014 (UTC)[reply]

That very useful feature, maintained by User:Visviva at http://www.fraktionary.com/, stopped working long ago. Now people can add crap in the languages I work with and I wouldn't know. --Vahag (talk) 19:25, 20 August 2014 (UTC)[reply]

Actually, we could enable this another way. There is a feature that shows when entries were last added to a category, and we've used that for a few cleanup categories to show the oldest and newest members. Now that we have the lemmas/non-lemmas categories, we could apply something similar there? —CodeCat 22:05, 20 August 2014 (UTC)[reply]

Yes, let's do it. Can we just put a collapsed list of the newest entries on the page of each lemmas category? --Wiki Tiki 89 00:12, 21 August 2014 (UTC)[reply]

It turns out that you don't even need to include the list on the category it monitors. You can put it anywhere. Here's the list for Category:Greek lemmas: —CodeCat 00:15, 21 August 2014 (UTC)[reply]

Yes, but I think the category is the best place to permanently host such a list. Is it possible to show who created the page? --Wiki Tiki 89 00:30, 21 August 2014 (UTC)[reply]

You can see all the features here. I'm not sure about putting it on the lemma category page itself, because this is primarily useful for editors, while the lemma category is for users. —CodeCat 00:39, 21 August 2014 (UTC)[reply]

Why wouldn't a reader be interested in the new entries created for a language? Anyway, it shouldn't be too bothersome for those who don't care if it is collapsed. --Wiki Tiki 89 00:43, 21 August 2014 (UTC)[reply]

Visviva's tool tracked also newly added translations by language and, IIRC, had an RSS feature. --Vahag (talk) 13:07, 21 August 2014 (UTC)[reply]

It tracked all changes by language. It was very useful for patrolling a specific language. --Panda10 (talk) 13:16, 21 August 2014 (UTC)[reply]
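
For anyone who wants the same "newest members of a category" information from outside the wiki, the MediaWiki API can list category members ordered by the time they were added. This is not the on-wiki feature mentioned above (the thread does not name it); it is only a sketch that builds the query URL, and the function name `newest_members_url` is hypothetical. The request itself is left to the reader.

```python
from urllib.parse import urlencode

API = "https://en.wiktionary.org/w/api.php"


def newest_members_url(category, limit=20):
    """Build an API URL listing the pages most recently added to a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmsort": "timestamp",         # order by when the page joined the category
        "cmdir": "desc",               # newest first
        "cmprop": "title|timestamp",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


print(newest_members_url("Greek lemmas"))
```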

lalala Index:Georgian lalalala--Dixtosa (talk) 13:33, 23 August 2014 (UTC)[reply]

Well, it worked! I have an idea: let's return this string <span style = "display:none">[[Index:langname]]</span> whenever {{t}} or {{head}} is used, and then we can use the Special:RecentChangesLinked page (in this case, this). With this we will be able to monitor newly added translations too.

(inspired by User:ZxxZxxZ's Recent changes for fa.) --Dixtosa (talk) 13:40, 23 August 2014 (UTC)[reply]
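
The hidden-link idea above can be sketched as two small functions: one producing the invisible link a template would emit, and one naming the page on which the resulting changes could be watched. This is a rough illustration in Python, not the actual template or module code, and both function names are hypothetical.

```python
def hidden_index_link(langname):
    """Invisible link that makes the entry count as 'linked from' the language's index page."""
    return f'<span style="display:none">[[Index:{langname}]]</span>'


def related_changes_page(langname):
    """Special page showing recent changes to pages linked with Index:<langname>."""
    return f"Special:RecentChangesLinked/Index:{langname}"


print(hidden_index_link("Georgian"))
# prints <span style="display:none">[[Index:Georgian]]</span>
print(related_changes_page("Georgian"))
# prints Special:RecentChangesLinked/Index:Georgian
```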

RQ Template

I'm an inexperienced editor who concentrates on adding quotes to senses. To save time and space, and to allow systematic updating, I would like to use RQ templates. My first investigation suggests that I could use something like:
<includeonly>'''1915''', {{w|George A. Birmingham}}, ''[http://gutenberg.org/ebooks/24394 Gossamer]''{{#if: {{{1|}}}| , ch.{{{1}}} }}</includeonly>
Does this present any problems? If I go ahead with it, should I add an entry for it in Wiktionary:Quotations/Templates? — ReidAA (talk) 08:48, 21 August 2014 (UTC)[reply]

If you plan to add many (not just a few) quotations from that particular source, then go ahead. Also, there is no reason for the <includeonly> tags; in fact, they get in the way of testing out the template on its own page. --Wiki Tiki 89 13:00, 21 August 2014 (UTC)[reply]

Thank you for your advice. I do plan to add many, and this will save both time and space. — ReidAA (talk) 00:17, 22 August 2014 (UTC)[reply]

Category:English words prefixed with auto-

Why does autoaway appear at the bottom of the A list, after autozygous? It was added more recently, but how does this explain the incorrect alphabetical positioning? Equinox ◑ 04:47, 22 August 2014 (UTC)[reply]

Template:confix isn't applying the sort key correctly; there is some problem with Module:compound. DTLHS (talk) 04:55, 22 August 2014 (UTC)[reply]

Why are long pages slow to load?

I've been looking at ways to make Wiktionary:List of languages load faster, and at first I focused on making the Lua code more efficient. But when previewing the page, it showed in the statistics that Lua was only taking about 2-3 seconds to do its work, whereas the whole page took 19 seconds. So I tried expanding the page's code completely, so that all module invocations were eliminated. This only brought down the load time to 16-17 seconds, which is basically what it was before minus the Lua load time.

It seems that Lua is not the culprit in this case, but that the wiki software just does very badly at handling large pages. Is there anything we could do to speed this up somewhat? And if we can't, what can we do to make this list load faster so that it's more useful for quick lookups? I imagine the majority of people who use the page only go there to look up names and codes, so we could get rid of the extra information like script and family information? —CodeCat 14:59, 23 August 2014 (UTC)[reply]

I don't have any problem loading long pages, but I do notice that they take a long time to save after they have been edited. Donnanz (talk) 15:04, 23 August 2014 (UTC)[reply]

I think we should keep the List of languages as it is, but it would be nice to have indexes of codes and of the names they stand for. I would like to be able to go to a single page that has every code, including languages, families, etymology-only languages, etc., with their name, their type and the name of the data module they're in. Likewise for another index with the same data, but arranged by name, and having separate lines for each alias. Chuck Entz (talk) 16:33, 23 August 2014 (UTC)[reply]

Such a page would be possible, but it would be prohibitively slow to load, so not really suited to quick lookups. But I've just been working on something that may be more feasible. I wrote some JavaScript that looks up language data by code on the fly. It's not really full-featured yet, but it's a good start. To try it out, copy the code at User:CodeCat/common.js to your own common.js, and then go to User:CodeCat/lookup language. A small form should appear where you can type in a language code. @Kephir, Yair rand Could you review my code and post any comments you may have about any specific coding practices on my talk page? I'm not very experienced with JavaScript, so feedback would be very welcome. —CodeCat 20:59, 23 August 2014 (UTC)[reply]
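
The two indexes requested above (one keyed by code, one by name with a separate line for each alias) and the lookup by code can be sketched as follows. This is only an illustration in Python: the field names, the alias, and the module names in `ENTRIES` are hypothetical and do not reflect the layout of the real data modules or of the JavaScript mentioned here.

```python
# Illustrative sample only; field names and module names are hypothetical.
ENTRIES = [
    {"code": "en", "name": "English", "aliases": [],
     "type": "language", "module": "example/languages"},
    {"code": "fro", "name": "Old French", "aliases": [],
     "type": "language", "module": "example/languages"},
    {"code": "cu", "name": "Old Church Slavonic", "aliases": ["Old Church Slavic"],
     "type": "language", "module": "example/languages"},
    {"code": "gem", "name": "Germanic", "aliases": [],
     "type": "family", "module": "example/families"},
]


def index_by_code(entries):
    """Map each code to its full entry, for direct lookup by code."""
    return {entry["code"]: entry for entry in entries}


def index_by_name(entries):
    """One row per name or alias, sorted by name."""
    rows = []
    for entry in entries:
        for name in [entry["name"], *entry["aliases"]]:
            rows.append((name, entry["code"], entry["type"], entry["module"]))
    return sorted(rows)


by_code = index_by_code(ENTRIES)
print(by_code["fro"]["name"])
# prints Old French
for row in index_by_name(ENTRIES):
    print(row)
```

A dictionary lookup like `by_code["fro"]` only needs the data, not a rendered page, which is why an on-the-fly lookup can be quick even when the full list is slow to display.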

Because you're downloading more bytes of HTML. --Wiki Tiki 89 22:58, 24 August 2014 (UTC)[reply]

1.3 MB should not take 16+ seconds to load though. —CodeCat 23:08, 24 August 2014 (UTC)[reply]
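
As a back-of-the-envelope check of the figures quoted here, the transfer rate that would be needed for download alone to explain the delay can be computed directly. The sketch assumes 1 MB = 1000 kB and that the whole 16 seconds is spent transferring data, which is precisely the assumption being doubted; the function name is made up.

```python
def implied_rate(size_mb, seconds):
    """Return (kB/s, Mbit/s) if the whole load time were pure download time."""
    kb_per_s = size_mb * 1000 / seconds
    mbit_per_s = size_mb * 8 / seconds
    return kb_per_s, mbit_per_s


kb_per_s, mbit_per_s = implied_rate(1.3, 16)
print(f"{kb_per_s:.2f} kB/s, {mbit_per_s:.2f} Mbit/s")
```

That works out to roughly 81 kB/s, or about 0.65 Mbit/s, so on any connection faster than that, download size alone would not account for a 16-second load.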

Also the rendering takes time, due to all the formatting. I'm curious if it would load any faster if we had a page of that size with just plain text. --Wiki Tiki 89 23:32, 24 August 2014 (UTC)[reply]

Module errors on Old Church Slavonic translations

In a couple of cases (diff and diff), removing manual transliteration has caused a module error with the message: "Lua error in Module:Cyrs-Glag-translit at line 129: This module can only transliterate Old Cyrillic (Cyrs) and Glagolitic (Glag)". In both cases, these were existing entries (моуха (muxa) and орьлъ (orĭlŭ)). Is this a problem with the spelling, or with the module? Chuck Entz (talk) 19:22, 23 August 2014 (UTC)[reply]

Never mind: it was the use of "sc=Cyrl" in the template. Chuck Entz (talk) 19:27, 23 August 2014 (UTC)[reply]
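
The failure mode here (a transliteration routine refusing a script code it does not handle) can be reproduced with a toy example. This is a sketch in Python, not the code of Module:Cyrs-Glag-translit: the function name is hypothetical and the mapping table covers only the letters of the two words quoted above, using the transliterations given in this thread.

```python
ALLOWED_SCRIPTS = {"Cyrs", "Glag"}

# Toy table covering only the two example words; "оу" is a digraph.
TABLE = {
    "оу": "u", "м": "m", "х": "x", "а": "a",
    "о": "o", "р": "r", "ь": "ĭ", "л": "l", "ъ": "ŭ",
}


def transliterate(text, sc):
    """Transliterate text, rejecting any script code other than Cyrs or Glag."""
    if sc not in ALLOWED_SCRIPTS:
        raise ValueError(
            "This module can only transliterate Old Cyrillic (Cyrs) and Glagolitic (Glag)"
        )
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in TABLE:   # try the digraph first
            out.append(TABLE[pair])
            i += 2
        else:
            out.append(TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(transliterate("моуха", "Cyrs"))
# prints muxa
```

Calling `transliterate("моуха", "Cyrl")` raises the error instead, which matches what happened once the manual transliteration was removed and the module had to run with "sc=Cyrl".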

Red link headline in abbreviations

This problem was fixed for suffixes yesterday, but still exists in some abbreviations. Please see these entries: kft. (uses {{hu-noun}}) and és tsai. (uses {{head}}). --Panda10 (talk) 12:12, 25 August 2014 (UTC)[reply]

I fixed it. és tsai. now links to és and tsai., which is how it should be. If tsai. does not exist on its own, then the head= parameter should be specified to override it. --Wiki Tiki 89 14:54, 25 August 2014 (UTC)[reply]
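
The behaviour described in this reply (link each word of a multi-word headword unless head= overrides it) can be sketched as below. This is an assumption-laden illustration in Python, not the code of the headword module: the function name is hypothetical, and leaving single-word headwords unlinked is an assumption made for the example.

```python
def headword_display(pagename, head=None):
    """Return the wikitext shown as the headword line's head."""
    if head is not None:
        return head                      # explicit head= always wins
    words = pagename.split(" ")
    if len(words) == 1:
        return pagename                  # assumed: a single word is not linked to itself
    return " ".join(f"[[{word}]]" for word in words)


print(headword_display("és tsai."))
# prints [[és]] [[tsai.]]
print(headword_display("és tsai.", head="[[és]] tsai."))
# prints [[és]] tsai.
```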

Thank you. --Panda10 (talk) 15:39, 25 August 2014 (UTC)[reply]

Template:pt-adj

When making the ACCEL links, this happens, which classes a feminine plural as a feminine. I guess the ACCEL function needs to be changed somehow. Any takers? --Type56op9 (talk) 20:47, 28 August 2014 (UTC)[reply]

Template:character info adding bogus script cats

Discussion moved to Wiktionary:Grease pit/2014/September.