Wiktionary:Grease pit/2007/September

September 2007

Template {{t}} reducks

I/we had set up template t to use a parameter (ls=), added by bot code, to generate section links if and only if needed. At Connel's request, I changed it to lang= to make it clearer what it was. (this turns out to be a step in the wrong direction ;-) The idea is to keep from transcluding dozens or hundreds of language code templates to get the language names.

This turns out to fail usability and human factors: users want to add the parameter to get the desired section link, and then are annoyed when the bot removes it because it is optimizing and the section link isn't needed in that particular case. Too much (visible) magic going on. A colleague, David Robinson, explained it to me ~25 years ago as The Principle of Least Astonishment: users should not be surprised by what the software does. (Mind you, there can be some pleasant surprises, but only when the s/w also does do what it is told.)

To that end: I've added a couple of sub-templates to t, {{t-sect}} and {{t-lang}}. Using them avoids expanding references to code templates when not needed. Template t-lang avoids the need for code template references for a dozen or so common languages; the bot code will add a parameter (now xs=, intentionally a bit obscure) to suppress the references for the others when it updates the entry. The result is:

  • all uses of the template generate section links
  • when first added, only the less common languages will use the code templates
  • when the bot has updated it, there will be no use of code templates
  • any disconnect between the list of optimized languages used by the bot and the template will not affect correct generation of the section links

It is true that if someone adds a large translation table (e.g. butterfly) all at once, it may call out hundreds of code templates for that entry, but then the bot will fix that; at any given time the overall use across the wikt will be reasonable.
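A rough sketch of the lookup order described above, written in Python rather than template syntax. The table contents, the function names, and the link format are assumptions made for illustration; the real {{t-lang}} and {{t-sect}} may differ in detail.

```python
# Hypothetical data: a short "common languages" table and a stand-in
# for the per-language code templates ({{eo}}, {{te}}, ...).
COMMON_LANGUAGES = {"fr": "French", "de": "German", "es": "Spanish", "it": "Italian"}
CODE_TEMPLATES = {"eo": "Esperanto", "te": "Telugu", **COMMON_LANGUAGES}


def resolve_language(code, xs=None, stats=None):
    """Return the language name used for the section link.

    Assumed order: built-in common table, then xs=, and only as a
    last resort the (expensive) code template.
    """
    if code in COMMON_LANGUAGES:
        return COMMON_LANGUAGES[code]
    if xs:
        return xs
    if stats is not None:
        # Count how often a code template would have to be transcluded.
        stats["transclusions"] = stats.get("transclusions", 0) + 1
    return CODE_TEMPLATES[code]


def section_link(code, word, xs=None, stats=None):
    """Build an illustrative section link; the link is produced in every case."""
    return f"[[{word}#{resolve_language(code, xs, stats)}|{word}]]"
```

With or without xs=, the visible result is the same link; the parameter only decides whether the code template is consulted, which is why adding or removing it appears to change nothing.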

Users should not feel that they have to bother with the xs= parameter; if they add it or remove it it will (appear to) change nothing. Thus they are much less likely to add it; and not be disturbed if it is removed. It still is a bit too "magic"; but that probably can't be helped. Robert Ullmann 13:20, 1 September 2007 (UTC)[reply]

Assuming I'm understanding your comment correctly (and I apologize if I'm not): You're right about the principle of least astonishment, but in this case it sounds like you're reacting to it wrongly; yes, you can remove astonishment by changing users' expectations to fit the software behavior, but it's better to remove it by changing the software behavior to match users' expectations. Section links are never detrimental; the bot should add them. The name "lang" is an intuitive one for the parameter that contains the name of the language. (It seems like you ended up getting everything right, except deciding that editors are too stupid to be trusted to see that the parameter containing the language name is indeed supposed to take the language name?) —RuakhTALK 15:11, 1 September 2007 (UTC)[reply]
The 'bot doesn't "add" section links; they are always there (now). That's the source of the problem and the problem: you wanted to add lang= because it generated the section link. "xs" doesn't work that way; it just changes a code path; you get the section link anyway. (If there was some way of encoding the parameter value, like in cookies/URLs, that would be good, so people wouldn't pay attention to it at all!) Editors shouldn't need to or be bothering with xs=; to the extent that they do, all they can really accomplish is breaking something. (There is a tendency for users to try to fill in everything; there is little that can be done about that ;-) I don't think they are too stupid, but they do tend to try too hard. That's why the doc (not that anyone ever reads it...) just describes the simple case first. If someone thinks/figures out that xs= can be the language name, it is usually harmless. Robert Ullmann 15:46, 1 September 2007 (UTC)[reply]
Re: first sentence: Sorry. I actually understood that, but misspoke. (I meant something like "Section links are never detrimental; if it weren't that the template now automatically adds them, it would be the case that the bot should do so.") I do see what you're saying about not encouraging users to do things that the bot can handle with less risk of error, but I still don't like the idea of intentionally obfuscating visible wiki code in order to prevent people from messing with it. *shrug* Thanks for your changes to the template; except for my one nitpick about the name, they seem overwhelmingly positive. :-) —RuakhTALK 15:58, 1 September 2007 (UTC)[reply]
Thanks. If one really wanted to be intuitive, we should write: (e.g. at day) {{t|eo|tago|eo=Esperanto}} ... and yes, this is easy to do. (cases like tr=Turkish are problematic, we'd have to modify the other parameter names ;-). I'm not really suggesting that, but the code is simple (trivial). Robert Ullmann 16:28, 1 September 2007 (UTC)[reply]
How on Earth is that intuitive? What does eo= override in that scenario? --Connel MacKenzie 15:20, 4 September 2007 (UTC)[reply]
Hey, I just got that. Makes a lot of sense, actually. It's kind of like dropping a note. But I think additionally you should still accept something that's not obscure, like {{{L}}}, since {{t!}} doesn't actually need a language code. DAVilla 17:30, 5 September 2007 (UTC)[reply]
Please do not misrepresent what I said. My complaint is about screwy nonstandard counter-intuitive variable and template names...which I see you are trying to make worse here, not better. --Connel MacKenzie 15:20, 4 September 2007 (UTC)[reply]
Sigh. I thought I'd said this. I'll go slower ... If the parameter name is something "intuitive", editors will think it is something they are supposed to fill in. Hence we see Ruakh adding lang=Hebrew and complaining when the 'bot removes it.
If, OTOH, the name is intentionally not obvious, and the parameter apparently does nothing, and only appears on <3% of the entries (the template optimizations cover 97% of the translations in the en.wikt at present ;-), then editors will mostly ignore it. If they think it is the language name, then what it is called is not relevant. (Since they've "figured it out".) If they actually look at the documentation at {{t}}, they'll find out that they needn't bother with it.
It is human factors engineering. You put the headlight switch in an obvious place on the dashboard or the wheel. You put the hood/bonnet release where no-one will see it or try it unless it is what they are looking for. Robert Ullmann 12:36, 6 September 2007 (UTC)[reply]
Actually, no; I didn't think it was something I was "supposed" to fill in, but rather something useful. Which it was. And apparently either you agree, or you've caved on this point, because you altered the template to always do what adding the language name did previously. —RuakhTALK 15:24, 6 September 2007 (UTC)[reply]
I haven't "caved" ;-). I have been trying (and I think have succeeded...) in finding a way to make sure (1) section links are always present if needed (and always present anyway meets this ;-), (2) we don't have to have some param=(language name) in every single call but at most in a small minority overall, and (3) not transclude a code template for every call as there may be hundreds. Robert Ullmann 15:47, 6 September 2007 (UTC)[reply]
Going by the "Principle of least astonishment," it would make more sense to always and at all times avoid (if not prohibit) sub-templates. --Connel MacKenzie 15:22, 4 September 2007 (UTC)[reply]
Sub-templates have nothing to do with the user interface. And they are an optimization; would you prefer a huge server load? Robert Ullmann 12:36, 6 September 2007 (UTC)[reply]

Addition to "Special:Whatlinkshere"

Would it be possible to have a (last n) in addition to the present (previous n) and (next n)? I go through a couple of large ones every day to look for recently added Italian nouns and adjectives (to feed the bot) and it gets longer and longer. SemperBlotto 16:04, 4 September 2007 (UTC)[reply]

The code that builds this page doesn't know up front what the size of the list is, so it can't make a link to the last n. It all works off of limits and offsets, which are translated directly into the query made to the database. There would have to be a separate query done on every page load that counts the size of the entire list.
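To make the cost concrete, here is a minimal sketch using an in-memory SQLite table. The table name and schema are hypothetical and are not MediaWiki's actual ones; the point is only that a forward page needs one bounded query, while a "last n" link needs a count over the whole list first.

```python
import sqlite3


def make_db(n_links):
    """Create a stand-in links table with ids 1..n_links (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE links (from_id INTEGER PRIMARY KEY)")
    db.executemany("INSERT INTO links VALUES (?)",
                   [(i,) for i in range(1, n_links + 1)])
    return db


def fetch_page(db, from_id=0, limit=50):
    """One bounded query: fetch limit+1 rows to learn whether a next page exists."""
    rows = db.execute(
        "SELECT from_id FROM links WHERE from_id >= ? ORDER BY from_id LIMIT ?",
        (from_id, limit + 1),
    ).fetchall()
    ids = [r[0] for r in rows]
    next_from = ids[limit] if len(ids) > limit else None
    return ids[:limit], next_from


def fetch_last(db, limit=50):
    """A 'last n' page needs an extra COUNT over the entire list."""
    (total,) = db.execute("SELECT COUNT(*) FROM links").fetchone()
    offset = max(total - limit, 0)
    rows = db.execute(
        "SELECT from_id FROM links ORDER BY from_id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return [r[0] for r in rows]
```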
Your note led me to look at that page a little more closely and I realized how little sense it makes in some cases. It defaults to a page size of 50, but if there are any redirects on a page, it seems to either fetch all of them or some arbitrary number. I just looked at Special:Whatlinkshere/Wiktionary:Grease pit and it shows 50 entries in the top-level list, but 127 entries linking to WT:GP... Strange stuff. Of course, the redirect behavior doesn't matter much in the case of our main namespace, since it has so few redirects, but it really doesn't make much sense. Mike Dillon 16:59, 4 September 2007 (UTC)[reply]
Note that you can edit the "Address" bar directly, and change "&from=0" to "&from=500000" if you have some notion of the internal DB pointer number you'd like to start from. {{NUMBEROFPAGES}} is currently 649,967...with deletions and redirects and other stuff, the current newarticle number is 738,427 or somewhere around there. My recommendation is for you to put a link like this on your userpage, and update that number once a week. --Connel MacKenzie 17:27, 4 September 2007 (UTC)
Thanks Connel - that technique works fine. SemperBlotto 21:07, 4 September 2007 (UTC)[reply]
It should be possible to get something approximating this using parser functions directly without having to update each week. The ratio between the new article number and the number of pages should be fairly constant, since it is a reflection of the proportion of deleted page ids. The ratio from Connel's numbers is about 1.1361. Here are some calculations:
  • {{#expr:(1.1361 * {{NUMBEROFPAGES:R}}) round 0}}
    10661424
  • {{#expr:((1.1361 * {{NUMBEROFPAGES:R}}) round 0) - (((1.1361 * {{NUMBEROFPAGES:R}}) round 0) mod 1000)}}
    10661000
Using something like these formulas, it should be possible to come up with a reasonably consistent link for this sort of thing.
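The same arithmetic as a small Python sketch, for checking the numbers outside the wiki. The function names are made up for illustration, and the ratio is simply the one derived from Connel's two figures (738,427 / 649,967 ≈ 1.1361), which may drift over time.

```python
def estimated_from(number_of_pages, ratio=1.1361, step=1000):
    """Estimate the newest page id from the page count, rounded down to a multiple of step."""
    estimate = round(ratio * number_of_pages)
    return estimate - estimate % step


def whatlinkshere_url(title, number_of_pages):
    """Build a Whatlinkshere link that starts near the end of the list."""
    return ("//en.wiktionary.org/w/index.php?title=Special:Whatlinkshere/"
            f"{title}&from={estimated_from(number_of_pages)}")
```

For example, `estimated_from(649967)` gives 738000, just below the 738,427 quoted above, so the link lands slightly before the newest entries rather than past them.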
I just realized that a better way to do this might be to provide a reverse sort order option for this page, similar to the dir=prev option on Special:Contributions. Looking at the code in SpecialWhatlinkshere.php, it should be dead simple and perform just as well as the forward sort. Mike Dillon 18:00, 4 September 2007 (UTC)[reply]
Note: Special:Contributions doesn't actually have a reverse order, I'm just thinking that "dir=prev" is a good name for the parameter. Mike Dillon 18:06, 4 September 2007 (UTC)
Pardon my expediency. I'm pretty sure SB is not interested in getting too fancy. I don't think the 1.1361 holds from month to month though, nor week to week. Still, it is close enough. I don't feel like filling out a bugzilla: for this, but I suppose someone really bored could. So, something like this: //en.wiktionary.org/w/index.php?title=Special:Whatlinkshere/Template:it-noun&from=10661000, then? --Connel MacKenzie 18:23, 4 September 2007 (UTC)[reply]
Yeah, that's basically what I was saying; I'd assume that the ratio doesn't change drastically, but you'd know better since you're more familiar with the growth rates (both in general and in the ratio of things that need to get deleted). If you're talking about creating a ticket in bugzilla for dir=prev, I'm probably going to do it. I already worked up a patch for adding support for dir=prev. I did it in such a way that the parameter is supported by the Special:Whatlinkshere controller, but I didn't put it into the user interface; it would only work for constructed links like: //en.wiktionary.org/w/index.php?title=Special:Whatlinkshere/Template:it-noun&dir=prev. We'll see what happens. Mike Dillon 18:34, 4 September 2007 (UTC)[reply]
Actually, I was planning on being lazy and not filing a bugzilla on it.  :-)   It might make sense to run through all the special pages, adding the namespace and 20 50 100 250 500 stuff, too. Special: Allpages, Fewestrevisions, Mostcategories, Mostrevisions, BrokenRedirects, CrossNamespaceLinks, Deadendpages, Disambiguations, DoubleRedirects, Listredirects, Longpages, Mostlinked, Ancientpages, Lonelypages, Shortpages, Wantedpages all could use a "namespace" thingy. --Connel MacKenzie 22:46, 5 September 2007 (UTC)[reply]

Proper nouns & red links

Every time I try to use the doubly [brace]d/ curly [bracket]ed 'en-proper noun' tag link that shows up in the box formed by the 'Templates' pull-down menu that is shown by default at the bottom of article editing pages but can also be reached on talk pages, I find that either double open brace 'en-proper' or 'noun' double close brace pops up—this link is broken. The other links down there that have spaces work properly. Can someone pls tell me how to fix this?

I find it frustrating that when I open red links, the page I get to is already set to be edited and doesn't link to the deletion log. I think it's important to always check the deletion log before potentially creating a new page, so having to copy & paste the same URL into the address bar instead of clicking in order to get to the other page, which does have the deletion log link and much of the same information anyways wastes my time and accidental straight clicks waste a bit of my download limit. I want you gurus to tell me what I can do to fix this. :) Thecurran 09:37, 6 September 2007 (UTC)[reply]

Re: #1: Fixed; thanks for pointing it out. (Those are controlled via MediaWiki:Edittools, which only an administrator can edit, so you did exactly what you needed to to have it fixed.) Re: #2: If an entry with that title has previously been deleted, the edit-entry page (where redlinks take you to) shows you the deletion log. See, for example, ignorampant. —RuakhTALK 15:28, 6 September 2007 (UTC)[reply]

Edittools

I found your MediaWiki:Edittools neat, especially the script selector, and would have liked to import it into my home wiki, English Wikiquote. I copied your version to our own but it doesn't work ... no selector available (now blanked to avoid duplication).

Could anyone please help it out? We discuss it on our q:WQ:VP, relevant talks may be found there. Cheers, --Aphaia 12:04, 6 September 2007 (UTC)[reply]

The version you took is heavily dependent upon MediaWiki:Monobook.js - the function addCharSubsetMenu() contains the goofy version. Better (much, much better,) is my version on bs:MediaWiki:Monobook.js which I haven't imported over here yet, out of laziness.  :-)   --Connel MacKenzie 01:14, 7 September 2007 (UTC)
For those of you who may not be familiar with my sense of humor: the delay is obviously not laziness; Monobook.js is getting rolled into Common.js with a bunch of other changes all at once, with testing over on http://wiktionarydev.leuksman.com/ before upsetting people here. But that one function, yes, is worlds better, in the bs: version. --Connel MacKenzie 01:17, 7 September 2007 (UTC)

Now what is it?

http://en.wiktionary.org/w/index.php?title=Special:Whatlinkshere/Template:en&limit=500&from=0 --Connel MacKenzie 01:08, 7 September 2007 (UTC)[reply]

Seems like the same bug that's been breaking Special:Wantedpages; MW is evaluating the ParserFunctions in {{context}} even when they are not actually called. -- Visviva 01:36, 7 September 2007 (UTC)[reply]
Visviva: that isn't a bug; that's how the language works; it uses recursive transclusions, not "calls". See the Schools Brief, supra ;-) Robert Ullmann 09:25, 7 September 2007 (UTC)
Possibly via {{language}} which has been completely stripped down but for some reason is calling {{isValidPageName}}. DAVilla 02:26, 7 September 2007 (UTC)[reply]
No, that's not it. DAVilla 02:30, 7 September 2007 (UTC)[reply]
Oh, Template:en. Yes, this is it. DAVilla 02:55, 7 September 2007 (UTC)[reply]
Think it's here: {{{{#if:{{{poscat|}}}|language|context 0}}|{{lc:{{#if:{{{lang|}}}|{{{lang}}}|en}}}}}}, which evaluates to {{en}} when {{{lang}}} is not specified... of course that entire line *should* only be evaluated when {{{poscat}}} is specified, but apparently that's not how the software is working currently. -- Visviva 02:48, 7 September 2007 (UTC)[reply]
No, I put that code on my user page, and it doesn't show as being linked. {{language}} does, however. DAVilla 02:55, 7 September 2007 (UTC)[reply]
Er, well, actually it evaluates to {{language|en}}, which seems to call {{en}} in turn (via isValidPageName). So you were right. -- Visviva 03:03, 7 September 2007 (UTC)
Oh, I had put up {{lc:{{#if:{{{lang|}}}|{{{lang}}}|en}}}} which was maybe an earlier revision? Regardless... DAVilla 03:10, 7 September 2007 (UTC)[reply]
No, when poscat is blank, it evaluates to {{context 0|en}} which evaluates to nil. When poscat is not blank, and lang is blank, it evaluates to {{language|en}}, which evaluates to "English", and is used to generate the category name actually needed in that entry. Robert Ullmann 10:14, 7 September 2007 (UTC)[reply]
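The two cases can be traced with a small Python stand-in for that template line. This is only a model of the branching as described here, with a made-up name table in place of {{language}}; it says nothing about which templates the software registers as linked.

```python
# Hypothetical stand-in for what {{language|xx}} returns.
LANGUAGE_NAMES = {"en": "English", "fi": "Finnish"}


def expand_context_line(poscat="", lang=""):
    """Model of {{ {{#if:poscat|language|context 0}} | {{lc:{{#if:lang|lang|en}}}} }}.

    Returns (template_name, output_text).
    """
    # {{#if:}} treats empty or whitespace-only values as false.
    code = (lang.strip() or "en").lower()
    template = "language" if poscat.strip() else "context 0"
    if template == "context 0":
        # {{context 0|xx}} produces no output.
        return template, ""
    return template, LANGUAGE_NAMES[code]
```

So `expand_context_line()` yields `("context 0", "")`, while `expand_context_line(poscat="idioms")` yields `("language", "English")`, the name then used in the category.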
So Visviva isn't completely wrong, it does evaluate to {{language|en}} in at least some cases. DAVilla 13:09, 7 September 2007 (UTC)
I have taken code out of {{language}} that checks for these language templates, and instead relied on lang: templates in all cases. Unfortunately that means a lot of the languages will break since there are far fewer lang: templates than the normal ones. Is there any way to automate their entry? (Or is it still an option to dewikify the other ones?) DAVilla 03:10, 7 September 2007 (UTC)[reply]
I put this back; the idea is to get away from the lang: templates; not break a lot of things and force their use. What exactly is the problem here? Connel is complaining that a lot of pages link to {{en}}? So? Has he somehow got the idea that it shouldn't be referenced, or that uses of {en} should be cleaned up? Huh? Robert Ullmann 09:25, 7 September 2007 (UTC)[reply]
In rain cats and dogs for example, the entry uses {{idiomatic}}, which generates [[Category:{{en}} idioms]] just as it should. If we want to optimize the lang=en case, that isn't hard. But it isn't broken. Robert Ullmann 09:37, 7 September 2007 (UTC)
If we look at the references, we find uses of {en} by {context} exactly in cases when needed. The other references are
  • the regional context labels themselves which default to English, e.g. {{Commonwealth}} which use it to self-categorize
  • the same for POS labels, to self-categorize
  • explicit use of lang=en in {wikipedia}, e.g. at Helgoland
  • missing lang= in POS labels, e.g. 猫の手も借りたい
  • use of {{projectlinks}}, which always calls {en}, e.g. at snickerdoodle, whether the referenced language is English or not
That's what I saw skimming through it. Robert Ullmann 10:06, 7 September 2007 (UTC)[reply]
I came across occurrences of {{en}} in a newbie's Finnish entries. I was trying to find all of them...seemed like a reasonable expectation for using WhatLinksHere. If the fucking language templates for context were somewhere else, everything would be working fine, as expected. Unfortunately, we now suddenly have the horrific scene of {{context 0}} and {{context x}} which exist only to flood the fucking crap out of the "Templates used on this page" section. Well, that, and to make it impossible to trace what the hell is going on. I am not convinced this needs to be made so intractable. --Connel MacKenzie 00:05, 8 September 2007 (UTC)[reply]
The first is a problem with {{language}}, not {{context}} directly. "Templates used on this page" is an unrelated issue that should be brought up in a separate thread. DAVilla 01:34, 8 September 2007 (UTC)[reply]

Connel, if you want whatlinkshere to be clear of these, you're going to have to insist on keeping the links. Otherwise I'm going to go through and dewikify them all so that we aren't straddling the fence like this. DAVilla 13:09, 7 September 2007 (UTC)[reply]

Your comment makes no sense at all. --Connel MacKenzie 23:58, 7 September 2007 (UTC)[reply]
The two are related because of the current messy state of affairs. I had fixed the problem with Special:Whatlinkshere/Template:en but I was reverted by Robert because it is not clear what our goal is on language templates. {{language}} is currently using both sets, one wikified, the other incomplete. We're straddling the fence, and need to either dewikify the one, or complete the other.
If you want Special:Whatlinkshere/Template:en to be empty and/or if you want the language templates to remain wikified (aside from TopN), then it's a clear case for needing both sets of language templates. You've just said the first pretty explicitly above, and if you think the second is important, or if others chime in in agreement then we can revert and fix the problem again.
What I was saying is that, if we were to decide we don't care so much about either of those (seems unlikely though), I'm going to assume that a single set of language templates is better and start making preparations for that. DAVilla 01:34, 8 September 2007 (UTC)[reply]
Where we are right now is that we need lang:xx if and only if xx is wikilinked. The original purpose of wikilinking the templates so as to try to get consistent linking in Translations tables is almost obsolete; to be consistent we have to check all the existing entries anyway. There isn't any way to make people {{subst|te}} rather than writing "Telugu". So it has to be, and is, handled differently.
We have to have the plain set (all wikts do), if we wikilink some of them, then we have to have another (at least partial) set, and the maintaining them will be an endless task. The only thing that makes any sense at all is unlinking all of them, and letting the "calling" template (e.g. {wikipedia}) link as it pleases. In the meantime we have a transition path we are on, without breaking anything even momentarily as we go.
As to Connel's offensive temper tantrum, that sort of language gets one nowhere.
As to the use of {{en}} by user BiT: first of all, it isn't in a trans table, and the only policy we have is that the code templates must be subst'd in the tables. Says nothing about use anywhere else.
As to finding them; the what-links-here was completely useless before, now at least it shows where it is actually used. And it isn't the best way to find them anyway, you'll just have to look at each entry and fix it manually. I just ran replac -{{en}} +English on the wikitext and there were only 3. ("fixed", if they needed to be). Robert Ullmann 15:00, 8 September 2007 (UTC)[reply]
I assume you appreciate my frustration? We've had three separate passes on simplifying this collection of templates; it is rather horrifying to see it returning to a massively (pointlessly) complicated form again. --Connel MacKenzie 07:19, 10 September 2007 (UTC)[reply]
Can we unlink all the plain language templates? Or must we maintain all of the (remaining) complexity? Eh? Robert Ullmann 15:08, 13 September 2007 (UTC)[reply]

en.wb would like to transwiki this to you. Could some admin import it please? It will be deleted soon. Mike.lifeguard | talk 23:44, 7 September 2007 (UTC)[reply]

We only have direct (full-history) Transwikis from Wikipedia right now. I'll see if we can get Wikibooks turned on. --Connel MacKenzie 23:56, 7 September 2007 (UTC)[reply]
I will transwiki manually for you. Mike.lifeguard | talk 00:29, 8 September 2007 (UTC)[reply]
Here: European Computer Science Dictionary Mike.lifeguard | talk 00:30, 8 September 2007 (UTC)[reply]
Can someone please verify that transwiki is complete (ie. do you need further information before it's deleted from en.wb?) Mike.lifeguard | talk 00:39, 8 September 2007 (UTC)[reply]
I think it should be ok. If not, use my talk on en.wb. PS you guys need to learn how to archive - this page is ridiculously massive. Mike.lifeguard | talk 01:19, 8 September 2007 (UTC)[reply]
Moved to Appendix:European Computer Science Dictionary (but a renaming may be in order, and a better introduction is needed). Cheers! bd2412 T 01:27, 8 September 2007 (UTC)[reply]

How will we handle the fact that there's an ISO language code for see (Seneca)? There are only 175 living speakers, but it's still an issue. --EncycloPetey 17:29, 8 September 2007 (UTC)

The obvious thing to do would be to move the existing template:see to template:seealso, leaving the redirect there until all existing uses had been changed (with the Did You Mean extension likely to be turned on soon, this would be a logical time to make the change). Obviously multiple high-profile messages would be needed before the redirect was extinguished. Perhaps AF could tidy up any further uses of the deprecated {{see}} template during the transition period. Thryduulf 18:06, 8 September 2007 (UTC)[reply]
I'd suggest moving it to {{see also}} (bit better name, and Seneca would be using "see" and the "see-" prefix). It was created by a user who maybe didn't realize we had {see}? Was copied from the 'pedia. If we do that, as observed, we could add it to the regex table in AF, and it will fix new ones and old ones as found. Eventually we go do a replace. No hurry. (note that people often see AF changes on their watchlist, and learn these small things ;-) Robert Ullmann 13:49, 11 September 2007 (UTC)[reply]
Please don't call it "see also". There is already too much confusion between the section of that name and the browsing tool at the top. May I suggest something like "browse-see" or "browsing-see" or "disambiguation-see" or "disambig-see".
As a side note I've always thought using the bare ISO letters for template names was a bad design. I've always thought all templates (and all CSS classes etc) should be prefixed to show their function: iso-xxx / lang-xxx / l-xxx / browse-see / b-see etc. — Hippietrail 07:30, 24 September 2007 (UTC)[reply]

Emergency substitution needed

When I made {{fr-def-verbform}} to work in substitution, I accidentally broke its transclusion use. Can somebody use a bot to substitute the ca. 160 uses of the template? Circeus 22:53, 8 September 2007 (UTC)[reply]

New bot, same as the old bot.

There was a problem with RobotGMwikt: the pl.wikt didn't want links added to the ru.wikt. The problem was that the runner didn't know how to do that. Now we have Volkov, who is so far unable or unwilling to answer questions about the bot. I suspect (but don't know) that he doesn't understand the code either.

(strike that, he replied while I was writing this ;-) Robert Ullmann 08:44, 10 September 2007 (UTC)[reply]

Here's the thing: the interwiki code is designed for the Wikipedia, where it goes around trying to figure out how to link articles on the different language pedias by connecting one to the other: if w:en:Cat is linked to w:fr:Chat, and that is linked to w:it:Gatto, then it figures it can link en:Cat to it:Gatto. (Note that if there are two unconnected sets, including singletons, the code can't link them; this is why pedians keep trying to add iwikis here: they think the process needs "hints".)
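A minimal sketch of that linking-by-connection idea, assuming links are given as simple pairs of page names. It is not the interwiki bot's actual code, just the underlying graph traversal: pages end up in the same group only if some chain of existing links connects them.

```python
from collections import defaultdict


def interwiki_groups(links):
    """Group pages into sets connected by existing interwiki links."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)

    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from start.
        stack, group = [start], set()
        while stack:
            page = stack.pop()
            if page in group:
                continue
            group.add(page)
            stack.extend(graph[page] - group)
        seen |= group
        groups.append(group)
    return groups
```

Two sets with no link between them stay separate no matter how long the traversal runs, which is the "hints" problem mentioned above.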

This is utterly ridiculous for the wiktionaries, where we link to the same title. From the code:

Script to check language links for general pages. This works by downloading the
page, and using existing translations plus hints from the command line to
download the equivalent pages from other languages. All of such pages are
downloaded as well and checked for interwiki links recursively until there are
no more links that are encountered. A rationalization process then selects the
right interwiki links, and if this is unambiguous, the interwiki links in the
original page will be automatically updated and the modified page uploaded.

So the bot is going around, reading pages all around the wikts, trying to find links. The only thing it does differently in "-wiktionary" mode is skipping links when the title doesn't match. This is obscenely inefficient. And will permanently miss some links.

Any reasonable bot would download the list of article titles for each wikt from the download service, merge them into an index (which can be done in memory, if you have a half-G or so above where your OS idles). Then munch through each new XML dump, and add exactly what is needed, reading and writing only the page that is to be modified. This would also generate all the correct links, after each XML pass all the links in that wikt would be current (as of the dates of the other dumps). (A small optimization is to generate the article title list locally for each language while munching the XML, one can also record all of the iwikis in the entries. This is how User:Tbot works.)
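The index-based approach might look roughly like this in Python. The function names are hypothetical, the title lists are assumed to be already loaded as sets, and sorting the links alphabetically by wiki code is an assumption; this is not the code of User:Tbot or of the interwiki bot.

```python
def build_index(titles_by_wiki):
    """Merge per-wiki title lists into one map: title -> set of wikis that have it."""
    index = {}
    for wiki, titles in titles_by_wiki.items():
        for title in titles:
            index.setdefault(title, set()).add(wiki)
    return index


def wanted_links(index, wiki, title):
    """Interwiki links a page should carry: every other wiki with the same title."""
    return sorted(index.get(title, set()) - {wiki})


def update_needed(index, wiki, title, existing):
    """Return the corrected link list, or None if the page is already right.

    Comparing against the full wanted list covers missing links,
    stale links, and links that are merely out of order.
    """
    wanted = wanted_links(index, wiki, title)
    return None if wanted == list(existing) else wanted
```

Only pages for which `update_needed` returns a list have to be read and written, so the work per dump is proportional to the number of changes rather than to the number of pages crawled.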

The existing code also fails to remove links (at least in some cases), and won't sort links added out of order by users unless it is adding a link. I can only imagine the total and completely unnecessary load on the WM servers as this bot goes around reading dozens or many dozens of pages for each new link it discovers. Robert Ullmann 08:25, 10 September 2007 (UTC)[reply]

Well, this won't solve the "total and unnecessary load" part, but if Tbot added appropriate interwikis to the English Wiktionary, that would at least solve the "permanently miss some links" part, at least for cases where the English Wiktionary has an entry. —RuakhTALK 14:39, 10 September 2007 (UTC)[reply]
Those are very distinct operations, which IMHO should be kept separate. --Connel MacKenzie 17:35, 10 September 2007 (UTC) I do agree that if interwiki.py adopted a similar method, (which I think is what Robert is suggesting) it would be much more efficient. --Connel MacKenzie 17:36, 10 September 2007 (UTC)

One too many archiving complaints

I've aggressively archived this page today. Please do not restore oversized conversations here. Instead, start a new section with a link to the relevant archive section and a BRIEF summary of that discussion (as needed.) --Connel MacKenzie 17:31, 10 September 2007 (UTC)[reply]

Now, if you could also do that for RfV and RfD... ;-) bd2412 T 00:01, 13 September 2007 (UTC)[reply]
My experiments for that keep breaking, trying to diff gigantic pages (oh, the irony.) I'd say my own talk page needs archiving next (before RFV/RFD) but that would only encourage more people to post there. --Connel MacKenzie 08:18, 15 September 2007 (UTC)[reply]
Well I've gone and moved all the discussions for kept RfD's from January to their talk pages. I'd like to make an archive or two and just move all closed-with-delete discussions there. I'm just not sure how we handle archiving those pages. bd2412 T 03:04, 16 September 2007 (UTC)[reply]

Spanish noun templates

In response to Wiktionary:Grease pit archive/2007/August#Spanish noun templates (“tried putting in links [...] just like {{en-noun}} [...]. Can someone fix that, so that you can link to multiple words [....]”), I implemented the custom formatting/linking feature in {{es-noun-m}} and {{es-noun-f}}. In doing so, I noticed that when I originally created those templates, we had no established naming scheme for custom singular and plural parameters, so the templates accepted either "sg" or "singular" and either "pl" or "plural". For consistency, I dropped the "singular" and "plural" support. Temporary maintenance categories will collect entries where editors supply those deprecated parameters. When a reasonable training period has passed, we can remove those maintenance categories. Rod (A. Smith) 18:55, 10 September 2007 (UTC)

I don't understand why it is a problem to have two names for the same parameter. If someone uses "singular," it should just output the same thing as if someone uses "sg," without any cleanup necessary, right? There is no downside to making the template more flexible. Dmcdevit·t 01:09, 11 September 2007 (UTC)
If you want, we could restore the alternate parameter names, but if we do so, we should add the alternate parameter names to all the other noun templates that accept a singular or plural parameter. Should we? Rod (A. Smith) 02:05, 11 September 2007 (UTC)[reply]
I don't see any reason not to, except perhaps the negligible increase in template length/complexity. But I may be missing something. Dmcdevit·t 02:06, 11 September 2007 (UTC)[reply]
We should probably give full support, or at least uniform support. I always have a hard time remembering which Spanish templates accept only f= for the feminine form, and which accept only fem=. I think it should uniformly be m= for masculine and f= for feminine, because our gender templates are {{m}} and {{f}}. Likewise I would like to see the number parameter support s= for singular and p= for plural, since {{s}} and {{p}} are our number parameters. --EncycloPetey 02:34, 11 September 2007 (UTC)[reply]

I like both of your suggestions and now wish I hadn't used {{{sg}}} and {{{pl}}} in the first place. Since "singular" and "plural" are so common for inflections, {{{s}}} and {{{p}}} make good sense as abbreviations for {{{singular}}} and {{{plural}}}. Besides, using one-letter and four-plus-letter template and parameter names avoids confusion with ISO codes for languages. So, for a trial, I implemented support for the four suggested parameter names in {{es-noun-m}} and {{es-noun-f}}. Note that the legacy parameter names are still in place for now. If the new parameters seem right in those templates, we'd need some sort of community approval before replacing {{{sg}}} and {{{pl}}} with {{{singular|{{{s}}}}}} and {{{plural|{{{p}}}}}} in the inflection template syntax across all language templates (including English ones). We'd also need to enlist the help of a bot owner to update entries that use the deprecated parameters. I'm not sure I have the energy to do so yet, since I'm still trying to wrap up work on {{term}}. If somebody picks up this standardization effort now, I'd be much obliged. Otherwise, I will have to resume this when I get around to it. Rod (A. Smith) 07:12, 11 September 2007 (UTC)[reply]

I still don't understand what the point is in deprecating parameters that aren't hurting anything. Why not {{{singular|{{{sg|{{{s}}}}}}}}}, for example? Replacing all the "sg"s and "pl"s seems like needless work. Dmcdevit·t 07:54, 11 September 2007 (UTC)[reply]
Whatever syntax we support, we should do so across the board, i.e. in all language templates. Some contributors feel that the many templates are already overly complicated. Balancing parentheses is already a challenge for me. If each inflection template must now include several repetitions of the text {{{singular|{{{sg|{{{s|{{PAGENAME}}}}}}}}}}} and each documentation set must mention all three parameters, the templates might be viewed as overly complex. Rod (A. Smith) 08:13, 11 September 2007 (UTC)[reply]
Well, I may not be vocal about it, but I'm one of the ones that thinks our templates are too complex already. Length is not really a measure of complexity, though; strange ParserFunctions, line breaks, comments, subpages, templates, and nested functions are. In fact, my point is that allowing multiple parameter names makes a template less opaque. A new user doesn't have to read a convoluted template, or even the (hopefully) clear documentation, if their intuitive guess (the full word, like "plural," being one of those) just works. Dmcdevit·t 00:32, 13 September 2007 (UTC)[reply]
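For readers weighing the nested form {{{singular|{{{sg|{{{s|{{PAGENAME}}}}}}}}}}} discussed above, here is a rough Python sketch of the lookup order it expresses: the first supplied name wins, and the page name is the last resort. The function names are hypothetical, and the sketch assumes that a parameter supplied with an empty value still counts as supplied, which is how nested defaults appear to behave in wikitext.

```python
def resolve_param(args, names, default):
    """Return the value of the first name in `names` that the caller supplied.

    `args` stands in for the arguments passed to a template call.
    A name that was supplied with an empty value still wins (assumption).
    """
    for name in names:
        if name in args:
            return args[name]
    return default


def singular_form(args, pagename):
    # Same precedence as {{{singular|{{{sg|{{{s|{{PAGENAME}}}}}}}}}}}
    return resolve_param(args, ("singular", "sg", "s"), pagename)
```

With this ordering, singular_form({"sg": "gato"}, "gatos") gives "gato", and a call with no arguments falls back to the page name. Adding or removing an alias is a change to one tuple, which is roughly the cost being debated for each template.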

I think we need a straw poll to gather feedback regarding common template parameter names. E.g.:

Straw polling on: Guidelines for template parameter names

  • Templates that accept a parameter for an inflection type with one or more common abbreviations should accept it in multiple formats: one for each common abbreviation and one with the fully spelled name of that inflection. E.g. {{{p}}}, {{{pl}}}, and {{{plural}}} for plurals and {{{s}}}, {{{sg}}}, and {{{singular}}} for singulars.
  • Templates that accept a parameter for an inflection type with a standard English Wiktionary abbreviation should accept it both through an argument named as that abbreviation and through an argument named with the fully spelled name of that inflection. E.g. both {{{pl}}} and {{{plural}}} and both {{{sg}}} and {{{singular}}}.
  • Templates that accept a parameter for an inflection type with a standard English Wiktionary abbreviation should accept it only through an argument named as that abbreviation. E.g. {{{pl}}} and {{{sg}}}.
  • We should not bother with such a guideline. Each template can accept its own parameter name.

Please also help me determine the appropriate options and wording. Also, should such a poll be here or in WT:VOTE? Rod (A. Smith) 18:29, 12 September 2007 (UTC)[reply]

  • Didn't we go through a fair amount of pain to delete {{pl}} before? Why would we ever want to recommend using the wrong form? (Likewise, sg.) Why would we ever want to recommend (or even permit) the long forms? I'd be happier if we had a more consistent way of identifying the wikified pagename in all of our inflection templates...right now each template has a different variable name. --Connel MacKenzie 00:06, 13 September 2007 (UTC)[reply]
    I kept "pl" above because it's the abbreviation currently given for "plural" in Wiktionary:Glossary and because several inflection templates currently use that parameter name. I'd be very happy to remove that as our preferred abbreviation and parameter name, especially since it looks like the two-letter ISO 639-1 code for Polish. Shall we replace it with "p" in Wiktionary:Glossary? Rod (A. Smith) 00:19, 13 September 2007 (UTC)[reply]
    • Is three names for one parameter ("p," "pl," "plural") so much? I think it's better for none of the intuitive forms to be wrong, and there is not much of a problem, unless I'm missing anything. We could use those three consistently, after all, instead of haphazardly like we do now. Dmcdevit·t 00:32, 13 September 2007 (UTC)[reply]
      Assuming that we want all inflection templates to support identical semantics, three different options seems not only excessive but it makes it difficult for editors who learn one abbreviation to read entries written with the alternative. So, if we support "p", I wouldn't expect to support "pl" as well, nor vice versa. I could understand an argument for both a fully-spelled form and a single abbreviated form (so that the wikitext could be either very readable or very easy to type), but multiple different abbreviations across the board seems wrong. Rod (A. Smith) 03:44, 13 September 2007 (UTC)[reply]

At Wiktionary:Grease pit archive/2007/August#Template:term and Template:italbrac, I asked whether to incorporate the styles from {{italbrac}} into {{term}} for transliterations (e.g., in “λόγος (lógos)” and “λόγος (lógos)”). Receiving no response, I did so, but note the following option in WT:PREFS:

  • Hide the parenthesis in the "italic-brac" definition qualifiers

Should {{italbrac}} and its styles be reserved for definition qualifiers? If so, it should be renamed. If {{italbrac}} is the right name, the WT:PREFS option is misleading. Rod (A. Smith) 08:04, 11 September 2007 (UTC)[reply]

Template term shouldn't be using this, the parentheses should always be present. In italbrac/context itself, one can change the text from italics, to small capitals, or a different color, or whatever, showing that it is descriptive text. The transliteration is a completely different animal. Robert Ullmann 12:12, 11 September 2007 (UTC)[reply]
In that case, we should change the name of {{italbrac}} and the CSS styles to indicate their purpose. They are not for parenthesizing italic text in general. They are for displaying qualifiers (including those displayed by {{context}}, by {{a}}, and before -onyms). Only the default style for those qualifiers is parenthesized italics.
So, may I rename {{italbrac}} to "{{qualifier}}" and rename the CSS classes to "qualifier-paren", "qualifier-content", etc.? Rod (A. Smith) 22:40, 12 September 2007 (UTC)[reply]
Nobody has objected, so I will rename {{italbrac}} to {{qualifier}} to clarify its actual purpose. Similarly, I would like to rename {{italbrac-colon}} to {{sense}}. Rod (A. Smith) 21:03, 22 September 2007 (UTC)[reply]
  • Why has this change been rushed through with so little thought and apparent lack of understanding of the purpose of the template?
  • The "italbrac" template was specifically designed and named to be used anywhere, independent of meaning, wherever people have put italicized text inside parentheses. This has been done all over the place. Context labels are only one such place.
If it's essential to display as italics within parentheses, you should be using (''...'') not {{italbrac|...}}. If what you're using instead is a qualifier, which might be italicized in parentheses, then you should be using {{qualifier|...}}. Other uses cannot rely on the same CSS classes. DAVilla
The reasoning was that you and I may favour (''...'') but Stephen and others favour ''(...)''. Having both was ugly. The template specifically provided a way to make one kind of wikitext that looked how each user preferred. Qualifiers and other specific uses were left for later as it was a deeper problem. — Hippietrail 21:59, 23 September 2007 (UTC)[reply]
Unfortunately, it was assumed that the styles it used would be present only in qualifiers. So, WT:PREFS allows people to format their qualifiers without parentheses. WT:CUSTOM allows them to format their qualifiers using small caps and no parentheses. When editors use {{italbrac}} in places other than qualifiers, users who prefer such alternative styles receive improper rendering. Rod (A. Smith) 00:20, 24 September 2007 (UTC)[reply]
Yes, Connel also initially thought "italbrac" was for the context labels. Even though I convinced him what it was for, he still worded WT:PREFS in a way that made it seem like it's only for context labels. I wish he hadn't, but since WT:PREFS is in his personal space I didn't mess with it. It should really be moved to the MediaWiki namespace so we can all work to improve it. — Hippietrail 07:23, 24 September 2007 (UTC)[reply]
That doesn't match my recollection of the discussion. The wording centered around the typical use of the preference. How do you want it worded, instead? --Connel MacKenzie 19:05, 25 September 2007 (UTC)[reply]
Ah, good reason to keep the generic! DAVilla 00:42, 25 September 2007 (UTC)[reply]
The generic {{italbrac}} has been restored, per suggestions from DAVilla and Hippietrail. An AutoFormat-type bot will help us gradually update most of the common qualifying uses of {{italbrac}} to {{qualifier}}. Rod (A. Smith) 01:01, 25 September 2007 (UTC)[reply]
  • All places the template is used for other purposes will now have a misleadingly named template.
More often you will find wikitext rather than templates used. Regardless, where this is done we can create new templates to address these uses. DAVilla
When others have confronted such issues they have scanned the XML dump to come up with a percentage rather than a "more often". This is a practice we should insist upon so we know how much breakage we have to fix.
An XML dump won't tell you that many cases of {{i}} are replacements made by bots rather than the format used by the original contributor. I don't have the means to determine how far the bots have progressed, nor do I think it relevant. Regardless of what was used, it may be possible to convert with little supervision. DAVilla 00:38, 25 September 2007 (UTC)[reply]
  • What should have been done was to either a) duplicate the "any use" template for the specific use, or b) make a specific use template which in turn uses the "any use" italbrac template. — Hippietrail 09:11, 23 September 2007 (UTC)[reply]
The only objection I have to (b) is that it would rely on one set of CSS classes. Aside from that, in my opinion (b) is generally the way to go, except that there should be a split between the easily accessible any-use template and the subtemplate they all call. Particularly, {{italbrac}} and {{qualifier}} would call {{stylized bracketed italics}} so that Special:Whatlinkshere/Template:italbrac isn't flooded and can be monitored for emerging uses. DAVilla 14:00, 23 September 2007 (UTC)[reply]

{{qualifier}} is used to format a qualifier for a list item. WT:PREFS allows readers to customize how that list item qualifier displays. Some choose not to see parentheses around their list item qualifiers. I am not aware of any uses of {{italbrac}} other than for indicating a list item qualifier. If you know of one, please point it out to me. Rod (A. Smith) 17:13, 23 September 2007 (UTC)[reply]

  • I'm not arguing that "qualifier" is a bad thing - it's much needed. I see the differences are a) different name to reflect the semantics and b) different CSS classes to reflect the semantics. This is excellent. I for one have been using "italbrac" in every section of articles ever since I created the template. I'd have to scan the XML dump to show you examples which I can't do for a few hours.
  • I highly recommend keeping the new template and restoring the old one to not be a redirect to the new one. Changes to CSS files should also have the "italbrac" stuff reinstated alongside the "qualifier" stuff. — Hippietrail 21:59, 23 September 2007 (UTC)[reply]
    Note that I did not remove any styles from MediaWiki:Common.css or from {{qualifier}}. I only added styles. So far as I know, everyone should still have the exact same rendering for pages that use (the redirected) {{italbrac}}. If that's not the case, please let me know immediately so I can investigate and fix. I do understand your point of splitting {{italbrac}} from {{qualifier}}, but from my quick perusal, it seems that at least 90% of the uses of {{italbrac}} are for outputting qualifiers. If we begin a slow migration to {{qualifier}} for those 90%+ uses, we should then end up with a manageable number of scenarios where clients use {{italbrac}} for purposes other than qualifiers. From that list of scenarios, we should be able to create appropriate substitute templates and CSS flags. Again, please let me know if any formatting is actually currently broken. Rod (A. Smith) 00:20, 24 September 2007 (UTC)[reply]
  • Cool, I didn't dig deeply to see what had been changed. I only noticed that you made some changes to Stephen's custom CSS file. Anyway I can see two ways forward:
    1. Leave the new template and classes but also reinstate the old template and classes. This means no more redirect. Though a redirect to a changed old template shows immediate results it's not the right thing in the long term. Either way all the uses of "italbrac" that were at the beginning of definition lines need to be changed to "qualifier" by a bot. The results won't be immediate but they shouldn't be too slow and broken cases can be avoided. It can even be added to AutoFormat.
    2. A clever use of templates should be possible which works like the "italbrac" and "qualifier" templates but takes one or more extra parameters. With no extra parameters the current "ib-" CSS classes will be generated. But with a parameter stating the template call is for "qualifier", different CSS classes would be generated. This would then be trivial to extend for other uses of "italbrac" as they are identified!
That clever template should be {{stylized bracketed italics}} or some other name which would only be used indirectly. Inventing CSS classes is no small matter, each with its own customization preferences. DAVilla 00:46, 25 September 2007 (UTC)[reply]
Very sensible. I agree that the long-term template for showing parenthesized italic text should have a long name in order to discourage editors from using it directly in entries. Rod (A. Smith) 01:01, 25 September 2007 (UTC)[reply]
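As a rough illustration of the second option above (one indirect template that takes an extra parameter and emits different CSS classes), here is a Python sketch. The function name and the `use` parameter are hypothetical; "qualifier-paren" and "qualifier-content" are the class names proposed earlier in this thread, while the "ib-paren" and "ib-content" names are only assumed from the "ib-" prefix mentioned above.

```python
from html import escape


def bracketed_italics(text, use="ib"):
    """Render parenthesized text with CSS classes chosen by `use`.

    use="ib" stands for the generic any-use classes (assumed names),
    use="qualifier" for the proposed qualifier-specific classes.
    """
    paren_class = f"{use}-paren"
    content_class = f"{use}-content"
    body = escape(text)  # keep stray markup in the text from leaking out
    return (
        f'<span class="{paren_class}">(</span>'
        f'<span class="{content_class}">{body}</span>'
        f'<span class="{paren_class}">)</span>'
    )
```

A wrapper for qualifiers would then call bracketed_italics(text, use="qualifier"), and a further use identified later would only need a new value for `use` plus matching style rules.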
  • Anyway keep up the excellent work - I hope you take these criticisms constructively. — Hippietrail 07:23, 24 September 2007 (UTC)[reply]
    Thanks, Hippietrail. I will execute the first option you give immediately above, requesting AutoFormat-type bot assistance with the following conversions: (unindenting)
  • “^#[ \t]*\{\{italbrac|([^}]*}}” (definition qualifiers using {{italbrac}}) to be replaced with “# {{qualifier|\1}}”
  • “^#[ \t]*''\([^)]*\)''” (definition qualifiers using italic parentheses) to be replaced with “# {{qualifier|\1}}”
  • “^#[ \t]*\(''[^']*''\)” (definition qualifiers using parenthesized italics) to be replaced with “# {{qualifier|\1}}”
  • “^*[ \t]*\{\{italbrac-colon|([^}]*}}” (pronunciation and -onym sense qualifiers using {{italbrac-colon}}) to be replaced with “# {{sense|\1}}”
  • “^*[ \t]*\{\{italbrac|([^}]*}}:” (pronunciation and -onym sense qualifiers using {{italbrac}} followed by colon) to be replaced with “# {{sense|\1}}”
  • “^*[ \t]*''\([^)]*\):''” (pronunciation and -onym sense qualifiers using italic parentheses and colon) to be replaced with “# {{sense|\1}}”
  • “^*[ \t]*\(''[^']*''\):” (pronunciation and -onym sense qualifiers using parenthesized italics with colon) to be replaced with “# {{sense|\1}}”
  • “^*[ \t]*\{\{italbrac|([^}]*}}([^:])” (ad hoc pronunciation and -onym qualifiers using {{italbrac}}) to be replaced with “# {{qualifier|\1}}[[Category:Some rfc category suggesting template:sense]]\2”
  • “^*[ \t]*''\([^)]*\)([^:])''” (ad hoc pronunciation and -onym qualifiers using italic parentheses) to be replaced with “# {{qualifier|\1}}[[Category:Some rfc category suggesting template:sense]]\2”
  • “^*[ \t]*\(''[^']*''\)([^:])” (ad hoc pronunciation and -onym qualifiers using parenthesized italics) to be replaced with “# {{qualifier|\1}}[[Category:Some rfc category suggesting template:sense]]\2”

Is any bot operator able to assist with such conversion? Are regex expressions like the above the proper format for these bot-assisted formatting discussions? Rod (A. Smith) 16:59, 24 September 2007 (UTC)[reply]

Wow, you haven't loaded http://sf.net/projects/pywikipediabot yourself yet? Setup instructions are at m:Pywikipediabot. (Sorry, I thought you had been using this for quite some time, for some reason.) --Connel MacKenzie 18:33, 25 September 2007 (UTC)[reply]
You seem to be missing a closing parenthesis in all of your regex's above. Also, "|" is the regex "or" operator, and needs to be escaped as "\|". The syntax for the first, of the above, looks something like this:

$ python replace.py -ref:Template:italbrac -regex "#[ \t]*\{\{italbrac\|([^}]*)}}" "# {{qualifier|\1}}"

but I am getting errors (presumably from erroneous modifications I made at some point to replace.py.) And I don't feel like updating (and comparing differences against other customizations I may have made,) at the moment. --Connel MacKenzie 18:33, 25 September 2007 (UTC)[reply]
Note: error corrected above, added "-regex". Duh.  :-)
Note: error corrected above: removed "^" as the python regex version I'm using doesn't properly recognize line-start.
Do you want me to run these? I think they all need manual yes/no (i.e. do not say "ALL".) Or are you loading it now? --Connel MacKenzie 19:18, 25 September 2007 (UTC)[reply]
e.g. this. --Connel MacKenzie 19:25, 25 September 2007 (UTC)[reply]
Thanks for your help. I stopped using pywikipediabot when EC removed the bot status from User:Rod-MigrateCatsBot. I never reloaded it onto my new computer. (Thanks for making me relive that memory. *shudder*) I suppose it's high time I reloaded it, though, even if I don't use that specific bot account. Rod (A. Smith) 19:44, 25 September 2007 (UTC)[reply]
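Before running anything through a bot, the three definition-line rules from this thread can be tried offline with Python's standard re module. The sketch below applies the corrections noted above (the "|" escaped and the capture group closed) and uses re.MULTILINE so that "^" matches at the start of every line. It is only a test harness, not the replace.py invocation itself, and the function name is hypothetical.

```python
import re

# (pattern, replacement) pairs for qualifiers at the start of a definition line
RULES = [
    # {{italbrac|...}}
    (re.compile(r"^#[ \t]*\{\{italbrac\|([^}]*)\}\}", re.MULTILINE),
     r"# {{qualifier|\1}}"),
    # ''(...)''  (italic parentheses)
    (re.compile(r"^#[ \t]*''\(([^)]*)\)''", re.MULTILINE),
     r"# {{qualifier|\1}}"),
    # (''...'')  (parenthesized italics)
    (re.compile(r"^#[ \t]*\(''([^']*)''\)", re.MULTILINE),
     r"# {{qualifier|\1}}"),
]


def convert_definition_qualifiers(wikitext):
    """Apply each rule in turn and return the converted wikitext."""
    for pattern, replacement in RULES:
        wikitext = pattern.sub(replacement, wikitext)
    return wikitext
```

Lines that start with "*" are left alone by these three rules, so pronunciation and -onym qualifiers would need their own patterns. Note also that [^}]* stops at the first closing brace, so a qualifier containing a nested template would not be matched cleanly; such cases are a reason to keep the manual yes/no confirmation.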

(Gee, now I can load this page again! Thanks Connel!)

I've created User:Tbot (talk) to do the updates on the "t" templates. Please see the documentation and the talk page. So it isn't under a bot flag and isn't auto-patrolled yet (I added it to WT:WL). It is throttled to one edit every 70 seconds.

At present it changes only the text inside the t templates, changing the name to reflect whether the FL.wikt entry exists (if it can tell). It adds a parameter for the section links in the cases not optimized. It will be adding the script parameter for some non-Latin scripts if not present (I don't anticipate it changing a script parameter.)

Several people have asked about Tbot adding the templates; this is not possible in the general case, but is in a very large number of translations that match simple cases. (e.g. [word], [word] {f}, and so on). My thoughts are to have it add the template if and only if:

  • the format is simple, and it really can't mess up; the FL word is wikilinked, and there might be a gender marker, and/or a known script template in an expected syntax
  • the FL word exists in the en.wikt (with the correct language section)
  • the FL.wikt exists, and the word exists in the FL.wikt (but language section not checked)

So it would always be adding the {{t+}} template. I'd probably start with a small set of languages as well. Later it could add entries where the FL word isn't here and so on. Alternatively, it could work on entries that someone has tagged as wanting the templates, and do as much as possible. Robert Ullmann 14:37, 11 September 2007 (UTC)[reply]
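The three conditions above could be checked along the following lines. This is only a sketch of the decision, not Tbot's actual code: the wikitext shapes ([[word]], optionally followed by a one-letter gender template such as {{f}}) and the function names are assumptions, and the two existence checks are passed in as booleans because how Tbot looks them up is not described here.

```python
import re

# A wikilinked word, optionally followed by a single gender marker template.
SIMPLE_TRANSLATION = re.compile(r"^\[\[([^\[\]|{}]+)\]\](?: \{\{([mfnc])\}\})?$")


def parse_simple_translation(text):
    """Return (word, gender) for a simple translation, or None if not simple."""
    match = SIMPLE_TRANSLATION.match(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def should_add_template(text, in_en_wikt, in_fl_wikt):
    """All three conditions must hold: simple format, en.wikt entry, FL.wikt entry."""
    return bool(parse_simple_translation(text) is not None
                and in_en_wikt and in_fl_wikt)
```

Anything that does not match the pattern, such as two comma-separated links, is simply skipped, which keeps the bot on the side of not touching what it cannot parse.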

I ran this after the 14 September XML dump, expecting it to find a few new ones; it found 342 of them. (Some converted from previous use of {{trad}}.) Time to set up a 'bot vote, I think. Robert Ullmann 14:18, 17 September 2007 (UTC)[reply]
Is there any plan to unify the existing translations, to use {{t}}? I don't understand how that falls under a separate task, logically. (Programmatically, sure, but logically an intertwined task, right?) Looking at your three items above, shouldn't it be (one and (two or three)) instead of (one and two and three)? --Connel MacKenzie 14:33, 23 September 2007 (UTC)[reply]

Heading size

Sorry if it's just that I did something stupid on the client side, but — did text size of headings suddenly get huge? Was this a change to some CSS file somewhere? If so, what was the reason? —RuakhTALK 16:05, 11 September 2007 (UTC)[reply]

Seeing the same thing here... would like it to change back, please. -- Visviva 16:07, 11 September 2007 (UTC)[reply]
I'm seeing it too, unless my monitor just got a lot closer without me realising. Widsith 16:08, 11 September 2007 (UTC)[reply]
Wow! Yep. Seeing it here too. --Dijan 16:10, 11 September 2007 (UTC)[reply]
Yup, same oversized headings showing on Canadian computers too. -- WikiPedant 16:13, 11 September 2007 (UTC)[reply]
No changes to MW namespace in the past several days... something deeper in the software, perhaps? -- Visviva 16:19, 11 September 2007 (UTC)[reply]
The font size is coming from main.css. Looks like this revision did something. Mike Dillon 16:25, 11 September 2007 (UTC)[reply]
Thanks. I don't know why, but the font-size: 100%; line in the h1, h2, h3, h4, h5, h6 selector block (or whatever the proper term is) doesn't seem to be having any effect: it should be causing each section's [edit] link to be the same size as normal text, and therefore the same size regardless of the section's header level; but that's not the case. As a result, it seems that level-one headers are ending up at (188%)² = 353.44% of the regular font size instead of 188%, and so on. And since level-six headers are supposed to be 80% of the regular font size, I guess now they're 64% instead. Yay for mini-headings! :-) —RuakhTALK 16:49, 11 September 2007 (UTC)[reply]
Good call. What happens is, our own MediaWiki:Monobook.css overrode main.css's heading sizes. When main.css changed what selector it used, suddenly MediaWiki:Monobook.css wasn't overriding main.css, but rather working in concert with it, thereby supermagnifying things. I've now changed MediaWiki:Monobook.css to compensate for this change by using the same selector. (Now the headings seem too small, though; I'm not sure if I'd already managed to adjust to the huge headings, or if there's something else that needs to be changed as well.) —RuakhTALK 16:58, 11 September 2007 (UTC)[reply]
P.S. Note that you may need to refresh a page — any page here should do — in order to get the new CSS. —RuakhTALK 16:58, 11 September 2007 (UTC)[reply]
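The arithmetic behind the oversized (and undersized) headings can be checked with a few lines of Python. The sketch assumes, as described above, that the same percentage ended up being applied twice, once by each stylesheet rule; the function names are made up for illustration.

```python
def compounded_percent(outer_percent, inner_percent):
    """Percentage font sizes multiply when one is applied inside the other."""
    return outer_percent * inner_percent / 100


def doubled_heading_sizes(intended):
    """Map each heading level to the size seen when its rule applies twice."""
    return {level: compounded_percent(pct, pct) for level, pct in intended.items()}
```

For the two sizes quoted above, doubled_heading_sizes({"h1": 188, "h6": 80}) works out to 353.44 for h1 and 64 for h6, matching the figures in the discussion.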
I've got crappy eyesight - but it is even too big for me! SemperBlotto 16:37, 11 September 2007 (UTC)[reply]

Tim Starling just rolled back to r25705 undoing the JS-breaking r25706 and the CSS mangling r25707. Brion is trying to undo a couple other things, and re-instate the changes that came after that. So, Ctrl-F5 or Shift-R to reload pages now. Some heading sizes are still enborkened, but it is being watched closer now. --Connel MacKenzie 17:10, 11 September 2007 (UTC)[reply]

Ok, we removed the borky bits from the software, updated it, and removed the borky changes to MediaWiki:Monobook.css which made it double-size again. Should be ok if you clear cache now. :P :) --Brion 17:19, 11 September 2007 (UTC)[reply]
Yay, thanks. -- Visviva 04:49, 12 September 2007 (UTC)[reply]

Another transwiki

Another transwiki here. It's a glossary of film terms in German and English. Edit history is on the talk page. Mike.lifeguard | talk 04:41, 12 September 2007 (UTC)[reply]

Well, full history imports are now enabled from Wikibooks, Wikisource, Wikiquote, Wikiversity and Wikinews. But we don't really have a way of importing a .PDF file. Those go to commons. --Connel MacKenzie 04:13, 16 September 2007 (UTC)[reply]

This should be transwikid here as well. Would some local admin please take care of it? There are many subpages that need to be imported, and I'm going to bed. I know you don't have import from en.wb enabled; you'll have to do a copy-and-paste transwiki. Yes it's lots of work. That's why I'm going to bed instead of doing it. Good luck. Mike.lifeguard | talk 05:24, 12 September 2007 (UTC)[reply]

How many more do you have in store? Is it worth requesting Tim Starling's idea of simply allowing All-To-All transwikiing? --Connel MacKenzie 07:04, 12 September 2007 (UTC)[reply]
We're going to have a slow but steady trickle. Regardless, all-to-all import would be great. en.wb has already requested additional import options; I'd love to see all of them. I may add the rest to the request.
PS: Did it get done? We may want to delete it after it's been done. Mike.lifeguard | talk 13:29, 12 September 2007 (UTC)[reply]
Not yet...it would be here at Transwiki:English-to-Spanish Reference (or there would be a redirect there) if it had. (Generally, we only import into the Transwiki: namespace here.) And since there is a possibility of getting the full history import done via Special:Import, I'm inclined to wait to see if it pans out quickly. If a delay becomes apparent, manual fallback will be OK. --Connel MacKenzie 15:08, 12 September 2007 (UTC)[reply]
OK, the magic is done. (Thanks User:ArielGlenn!) --Connel MacKenzie 04:20, 16 September 2007 (UTC)[reply]
Whoops, you mean b:Special:Prefixindex/English-to-Spanish Reference, didn't you? (Sigh) OK, let me get the rest. --Connel MacKenzie 04:29, 16 September 2007 (UTC)[reply]
OK, they are now in Special:Prefixindex/English-to-Spanish Reference, except for X, Y and Z which seem to be missing on b:. --Connel MacKenzie 04:34, 16 September 2007 (UTC)[reply]

ISO codes in template names

We've done an admirable job of migrating templates with three-letter names to accommodate ISO 639-3, but that process is incomplete ({{see}}, {{mpl}}, {{top}}, etc.) and has caused some headache. The eventual publication of ISO 639-6 will use four-letter codes, however, so template naming complications have only just begun. Rather than further restrict our template namespace, I suggest we stop using ISO language codes as template names and as template name prefixes.

Instead, let's continue to use ISO language codes as arguments to positional parameters (e.g. as in {{t|de|Wort}}) and as arguments to the named {{{lang}}} parameter (e.g. as in {{term|mot|lang=fr}}), but for template names, let's only use ISO codes as suffixes, e.g. moving {{en}} to {{lang-en}}, {{en-noun}} to {{infl-en-noun}} (or {{noun-en}}), and {{es-conj-ar}} to {{conj-es-ar}}. Rod (A. Smith) 03:30, 13 September 2007 (UTC)[reply]

I'm not sure, but I think that ISO 639-6 is going to include a lot of things that we aren't going to want to provide separate templates and template sets for; we don't really want a "Midland dialect noun" template, etc. ... so I'm not convinced that this is worth concerning ourselves over at this juncture. -- Visviva 03:48, 13 September 2007 (UTC)[reply]
Note that all of the wikts use the ISO code templates; most of them without subst'ing in translations tables etc. There is an absolute expectation that I can go to the xx.wikt and use {yy} and get the name of yy in xx. For any wikt. I wouldn't worry about the 4 and 5 letter codes: we are going to max out at around 7000 languages that we can document, while there are 11 million 5 letter codes; the assignments are going to be very sparse. In any case, the prefixes are not a problem, if {see} was called {see-also}, so what? Is only a problem if Seneca has a POS called "Also". The prefix language codes means all the templates for that language sort together. It isn't broken. Robert Ullmann 04:44, 13 September 2007 (UTC)[reply]
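The sparseness argument above is easy to check: with lowercase letters only, there are 26^5 = 11,881,376 possible five-letter names against roughly 7000 documentable languages, whereas three-letter names number only 26^3 = 17,576. A small Python sketch, assuming codes are drawn from a-z only (the helper names are hypothetical):

```python
def code_space(length, alphabet_size=26):
    """Number of distinct codes of the given length over the alphabet."""
    return alphabet_size ** length


def occupancy(languages, length):
    """Fraction of the code space used if every language had one code."""
    return languages / code_space(length)
```

Under that assumption, 7000 languages would fill well under a tenth of a percent of the five-letter space, compared with about 40% of the three-letter space, which is why collisions with ordinary template names are mostly a three-letter concern.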
Personally I believe that the best way to accommodate all the plain language templates we could ever need is to have them ALL prefixed with lang: e.g. lang:en, lang:es, etc. As for the inflection templates, I believe they should always start with the language code. We don't want to make too much work for ourselves continually moving templates, what with all the crap we are constantly tidying up anyway. Moving the language templates to a pseudo-namespace (lang:xxx etc.) would be a big win; other templates that call the current ones for functionality can be easily modified.--Williamsayers79 12:48, 13 September 2007 (UTC)[reply]
The "straight" code templates have to exist; they are what almost all the wikts use in translation tables, and people from other wikts need them. (Whether they subst them or leave them there for AF or whatever to fix.) There are only two options:
  1. unlink all of the code templates
  2. maintain a second, duplicate set of unlinked templates (lang:)
No amount of hand-waving or hackery is going to get around the simple fact that we shouldn't have links in the code templates. If we were using them (e.g. leaving them) in the translations tables it might make sense; since we aren't, it is pointless to have some linked. (There is a third option if we can get it: an #unlink parser function.) Maintaining a second set involves continual work, on-going for the indefinite future. Years. (You volunteering? ;-)
In the meantime, we are in-between those two options, and fixing things up manually, with much template complexity. Robert Ullmann 15:00, 13 September 2007 (UTC)[reply]
Yes, I hadn't considered the factor of editors from other Wiktionaries. That alone is adequate grounds to keep the three-lower-case-letter template space reserved for ISO language codes. Suggestions stricken above. Shall I archive this topic now or leave it here for further discussion? Rod (A. Smith) 16:33, 13 September 2007 (UTC)[reply]

I have several comments:

  1. I agree the language templates should be gradually migrated to somewhere else, like Template:lang:xx.
  2. I agree the existing language templates will persist for a very long time. But I would prefer they remain orphaned.
  3. I think all our current templates should be using the lang: pseudo-namespace prefix wherever possible.
  4. I think the existing language templates should all have <noinclude>{{dontlinkhere}}</noinclude> so they can be methodically substed on a regular basis. Contributors from other wikts continue to use these when they probably should not; there is no reason not to make the corrections of those more automatic.

--Connel MacKenzie 17:03, 13 September 2007 (UTC)[reply]

You want us to maintain an entire duplicate set of 500+ templates so that you can fix the handful of entries where some editor has committed the inherently evil act of leaving a language template in the page wikitext? It is a lot easier to just skim the wikitext for the ones that AF won't correct automatically, or hasn't gotten to yet, e.g. house cricket:

tissue Angolan subtraction mus Belarusian cassation apart pax monitor pirouette Aramaic cur radix constipation gaze meiosis appendage radice xeno- crow's feet fence human rights OE. OF. stig ought xerography potentate céile Julius foreground nabla house cricket suspense arcu mæl encroach gallus peri שבת mollis accumulateur acidose one of a kind suprême groser reconvene pronomen peonie 중부 거래 en banc court of cassation verão pelaje δικάταρτο δι- νενέ Snæland παρανομιάζω cabras

Note this includes some interesting cases that aren't uses of language templates: in cur someone has subst'd a template that they shouldn't have. (This list is from the 14 Sept XML, minus things fixed 14-17th Sept) Robert Ullmann 13:03, 19 September 2007 (UTC)[reply]

This bot has been off on an undiscussed run converting entries to use {{es-verb form of}} which is new, un-discussed, and, well, go look at it. No mention here of adding it by bot to a lot of entries.

Additionally, the bot was running unsupervised; I added a note to Dmcdevit's talk page, got no reply, blocked the bot account after a while, and it started using his sysop account (an evil mis-feature in the pybot framework), I've blocked that too. (sigh) He can always unblock both of course. Robert Ullmann 15:04, 13 September 2007 (UTC)[reply]

Sorry about this. As I explained on my talk page, I had been doing just a few manual edits tinkering with it, and somehow left the "always on" feature on, and it was running in the background without my noticing it (while I was asleep). That's never happened to me before, and I'll make sure it doesn't happen again. Dmcdevit·t 19:11, 13 September 2007 (UTC)[reply]

This template and probably some others have a {{{lang}}} parameter that accepts a full language name. Should such templates be converted to accept ISO language codes and their clients be updated by bot, or has it not yet adequately been approved to use ISO language codes exclusively? Rod (A. Smith) 06:10, 14 September 2007 (UTC)[reply]

Well, the working convention would seem to be "[full language name] apocopic forms" ... so until & unless we have a full consistent set of templates that render the unwikified form of the language name, it would seem necessary to provide the language name directly. See also RU's comments here and a recent discussion that happened somewhere (BP? RFDO? not sure). -- Visviva 06:26, 14 September 2007 (UTC)[reply]
I disagree. I think the language code is much more appropriate there. --Connel MacKenzie 06:31, 14 September 2007 (UTC)[reply]
Just in the template parameter, or in such categories generally? (I don't disagree in either case; my primary interest is just to have a convention that I can understand and follow.) -- Visviva 06:57, 14 September 2007 (UTC)[reply]
Generally. With no English language equivalent category, it seems like a topical category, therefore merits the language code not the language name. But then, I too have always been a bit unclear on the rules for that. --Connel MacKenzie 03:49, 17 September 2007 (UTC)[reply]
Added: Here is some code that should always return the unwikified language name, unless there is a template missing: User:Visviva/lang-nowiki. It relies on our de facto convention of using the lang: pseudospace (only) for unwikified language names, so it will of course cease to function if that convention is abandoned. (Probably this has been done before, but I could never find it.) -- Visviva 06:57, 14 September 2007 (UTC)[reply]
It is {{language}}, and it will be maintained to do whatever we can figure out. Right now we are in the middle. Robert Ullmann 14:08, 17 September 2007 (UTC)[reply]
Wonderful! That was far too obvious for me... -- Visviva 14:19, 17 September 2007 (UTC)[reply]

Problem with Wikipedia box template

 
English Wikipedia has an article on:
Wikipedia

What happened to the WP globe icon that's supposed to appear in the Wikipedia box template? (See right for example.)

Hmm... seems to be working now. --EncycloPetey 23:23, 15 September 2007 (UTC)[reply]
Hmmmm... gone again after another edit. What gives? --EncycloPetey 23:24, 15 September 2007 (UTC)[reply]
Looks like there's something funky going on with Commons. I've used "action=purge" to clean up a couple images today, but it looks like a recurring problem. I'd assume the WMF developers know about it. I just purged the globe again. Mike Dillon 23:27, 15 September 2007 (UTC)[reply]
https://wikitech.leuksman.com/view/Server_admin_log#September_15 Mike Dillon 23:29, 15 September 2007 (UTC)[reply]
Excellent link, thank you. --Connel MacKenzie 05:01, 16 September 2007 (UTC)[reply]
This is precisely why sysop-only uploads are still enabled...so that we can have our "critical" icons local here. I guess I better get started on these. (Oooh, and I can load the correct Wikipedia logo without the vertical-pixel-stealing wording at the bottom! Yay. Now {{wikipedia}} won't glom onto an extra line, I hope.) --Connel MacKenzie 05:01, 16 September 2007 (UTC)[reply]
OK, I've started uploading ones that were pooched on Main Page...going on from there, I guess. User:Connel MacKenzie/images is where I'm going to keep track of the ones we want; please add to it liberally, for any images we have here that appear in templates or "overly-popular" pages (like WT:BP or WT:GP, WT:RFD, etc.) --Connel MacKenzie 05:32, 16 September 2007 (UTC)[reply]

A minor pain in the ass (feature request).

I have in the past been taken to task for forgetting to mark my minor edits as minor, and (after making "minor" my default) for forgetting to mark new entries as not minor. I'd like a preference setting that defaults to marking all edits as minor except new entries. Opinions? Is this worth putting in for? Cheers! bd2412 T 03:01, 16 September 2007 (UTC)[reply]

I can picture a WT:PREFS for it, that checks the [article] tab's color (if red, uncheck the "mark edit as minor" checkbox.) Maybe someone will beat me to it. --Connel MacKenzie 18:11, 25 September 2007 (UTC)[reply]

Looks-alike and Sounds-like features

Unless one knows the correct spelling of a word, it can be difficult to find it. It would be nice if Wiktionary had a feature that lets the reader make a typing mistake and then presents a list of alternatives: words that are spelt nearly the same and words that sound similar. For example, Google et al do this by asking the user Were you looking for: blah blah?.

Come to think of it, this would also be a great feature for several other WikiMedia projects. How about it? — This unsigned comment was added by 194.151.98.245 (talk).

Like this? No equivalent for Wiktionary that I know of, yet. If you type a word in wrong, you may (if it is a very common spelling error) get a stub entry that identifies the misspelling (by language.) Alternately, you may be presented with an edit page that identifies the misspelling in the deletion log comment. So far, there has been very little interest (in fact, a fair amount of opposition) in automating a misspelling redirect mechanism, as many contributors would rather enter the obscure or obsolete meanings for such spellings. --Connel MacKenzie 09:40, 17 September 2007 (UTC)[reply]
  • For "sounds like" we already have the Rhymes pages. If you are looking for a word that sounds like listen, you can go to that page, click on the Rhymes link and get a list of words that rhyme. You can then back into higher-order pages in Rhymes if that doesn't get you quite what you were looking for. --EncycloPetey 15:48, 17 September 2007 (UTC)[reply]

Hi Grease pitters - could someone add a parameter onto Template:fr-conj-er - for verbs such as éprouver, which begin in é, these pages are in the Category:French verbs after words beginning with z, e.g. zipper. By adding a parameter {{{cat=???}}}, I'd like to have e.g. éprouver categorised with words beginning with e, so at éprouver, you write {{fr-conj-er|éprouv|avoir|cat=eprouver}}, like in French Wiktionnaire. Thanks in advance — This unsigned comment was added by Almightty (talkcontribs) at 15:30, 17 September 2007 (UTC). (aka Wonderfool)[reply]

Done, blocked. --Connel MacKenzie 16:07, 17 September 2007 (UTC)[reply]

Interwiki links

Yesterday, I added an interwiki link to Wiktionary:Tea room/header. But the Vietnamese link now appears twice. Lmaltier 16:25, 18 September 2007 (UTC)[reply]

That's because Wiktionary:Discussion rooms contains an iw link to the Vietnamese Wiktionary. Should that be removed now? Rod (A. Smith) 16:37, 18 September 2007 (UTC)[reply]
It could be wrapped in a <noinclude>. Mike Dillon 17:15, 18 September 2007 (UTC)[reply]

I noticed that {{obsolete spelling of}} categorizes its clients in Category:Obsolete, a Category:Lexicons category for obsolete words. To distinguish obsolete spelling from obsolete vocabulary (especially in light of spelling reforms of regulated languages like Dutch), I'd like to create the subcategory Category:English obsolete spellings and have {{obsolete spelling of}} use that and similar categories based on the {{{lang}}} parameter. Any objections or comments? Rod (A. Smith) 21:57, 18 September 2007 (UTC)[reply]

FYI, I created Category:English obsolete spellings and its Dutch counterpart. {{obsolete spelling of}} isn't used by very many entries yet so I went ahead with the change. If we decide I'm headed in the wrong direction, we can always revert me.  :-) Rod (A. Smith) 22:44, 18 September 2007 (UTC)[reply]
Sounds good to me. Should it maybe be a subcategory of Category:English alternative spellings as well? (I've actually been using "{{obsolete}} {{alternative spelling of|[[…]]}}" for these; I hadn't noticed we had a specialized template for this. Does anyone with a copy of the last XML dump want to generate a list of pages using "{{obsolete}} {{alternative spelling of|[[…]]}}"?) —RuakhTALK 23:30, 18 September 2007 (UTC)[reply]
Switched 2 French ones to it and created the cat. Circeus 00:04, 19 September 2007 (UTC)[reply]
All of the entries in the wikt as of 14.9.7 that have definition lines containing "obsolete" and "spelling" but not {{obsolete spelling of}}:

swop abrade expence unvail benzole connoisseur maselyn sha'n't develope vender Thor witing wiver isle shid MicroSoft churchward swart As That Wage deciembre á воспитанье battle ship muger lacker asswage quillion bishoprick bomboora shende tuwel auriflamme i- Thal vinyard Avator oxi- horse-power algorism Sunis roquelaire cassimere kerseymere sulphuret whing Dynbaer Kefalovrysion Kefalovryssion Kefalovrision Kefalovrission magick renascence faetus murther chuse 밝쥐 propriedad calefie haywain heritour grype deie

Leave me a note on my talk page if you want me to rescan it (and I can flip a flag in my replac code to have it re-check the current wikt for each problem) Robert Ullmann 15:44, 19 September 2007 (UTC)[reply]
Thanks! :-) —RuakhTALK 15:50, 20 September 2007 (UTC)[reply]

Automatic wikilinking of languages in translations tables

AutoFormat has been using the list at WT:TOP40. I had added a table at the bottom of languages to always be wikilinked, but that clearly won't scale. I've changed AF to link all languages with templates + languages listed in the last table - the Top 40 languages (of which there are presently 69 ;-)

If you see it linking a language that it shouldn't be, add it to the second ("grey area") list at WT:TOP40 (and prod me, AF only reads the control files at startup). Language names that aren't recognized are left alone. (See User:Robert Ullmann/Trans languages).

This will scale a lot better; else the list of languages to be linked would be thousands.

Also: note as of 14 September 2007, we have 500 languages with templates. Some sort of milestone, that is. Robert Ullmann 15:05, 19 September 2007 (UTC)[reply]
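The linking rule described above can be sketched as a small set computation. This is only an illustration, not AutoFormat's actual code: the function names are hypothetical, and treating the "grey area" list as a second do-not-link list alongside the Top 40 is one reading of the description.

```python
def languages_to_link(with_templates, always_linked, not_linked):
    """Return the set of language names to wikilink in translations tables.

    with_templates: languages that have a code template
    always_linked:  languages from the last table at WT:TOP40
    not_linked:     the Top 40 list plus the "grey area" list (assumed reading)
    """
    return (set(with_templates) | set(always_linked)) - set(not_linked)


def format_language(name, linkable):
    """Wikilink a language name that should be linked; leave any other name alone."""
    return f"[[{name}]]" if name in linkable else name


# Made-up sample data, for illustration only.
linkable = languages_to_link(
    with_templates={"French", "Telugu", "Swahili"},
    always_linked={"Old English"},
    not_linked={"French"},
)
```

Under this rule the control files only need to list the exceptions, which is why it scales better than naming every language to be linked.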

Bit of magic with translations glosses

We presently (14.9.7) have 11,347 entries that use {{top}} for translations, without glosses. I've added a bit of magic to make it easier: if you want to convert top/mid/bottom to trans-top/etc all you need to do is add the gloss to the {top} template. (Which will add it to the AF category ;-)

This does not work for checktrans (at this time), so don't try adding "to be checked" or whatever to {top}; you should convert that table to {{checktrans-top}} (and remove the L5 header "Translations to be checked" if present.) Robert Ullmann 14:35, 20 September 2007 (UTC)[reply]

What is 14.9.7? Nice trick on {{top}}, anyhow. --Connel MacKenzie 14:48, 22 September 2007 (UTC) (14th September 2007 - nonstandard even on this side of the pond to use a single digit for year) SemperBlotto 14:58, 22 September 2007 (UTC)[reply]
Obviously it must be the 14th day of the 9th month of year 7 in the millennialists' calendar ;) --EncycloPetey 14:59, 22 September 2007 (UTC)[reply]
Erm, I don't know why that wasn't obvious. C'est la vie. We should probably stick to either American i18n date format, (9/14/07) or comparison date format (2007-09-14) if we want to avoid confusion. (Also, do we consider Africa to be the UK part of the pond?) --Connel MacKenzie 17:36, 22 September 2007 (UTC)[reply]
Sorry, just the usual order anywhere but the US; as to the single digit year, it is very common on cheques and deposit slips etc (when written by hand), and similar things. Seems more natural to a lot of people than "07". (I think it might have been 2000 that did that, writing "00" was very weird, so some people wrote "1.1.0", some "1.1.2000") Africa isn't the UK part of the pond, but East Africa used to be UK Commonwealth, and for our purposes (linguistic) still is. If writing documentation, etc, I'd write "14 September 2007". Robert Ullmann 14:19, 24 September 2007 (UTC)[reply]

Local language Customisation help at mr wiktionary requested

In the namespace and everywhere else in the related MediaWiki templates, the words Category and Categories have been replaced with the Marathi word वर्ग. The namespace shows वर्ग correctly, and [[वर्ग:name of the category]] is accepted as input. The difficulty is that when we save a page containing [[वर्ग:name of the category]], the output view shows [[Category:name of the category]] or [[Categories:name of the category]] instead of the [[वर्ग:name of the category]] that we want to appear.

This problem is occurring only at the Marathi Language Wiktionary and not at the Marathi Language Wikipedia. Following are the links to Marathi Language Wiktionary related pages.



Looking forward to some good solution and help

Thanks and regards

Mahitgar 16:45, 20 September 2007 (UTC) (This user is admin at Marathi Language Wiktionary.) User page at Marathi WiktionaryTalkpage[reply]

mr:Wiktionary:Embassy

That text is controlled by MediaWiki:Pagecategories. Mike Dillon 01:12, 22 September 2007 (UTC)[reply]
Thanks, and it's nice of you, Mike; I succeeded in getting it done with your advice.
Mahitgar 12:35, 22 September 2007 (UTC)[reply]
Cool. I'd suggest updating mr:w:प्रकल्प:MessagesMr.php and resubmitting it to the MediaWiki maintainers so that any Marathi language wikis in the future won't have to customize this themselves. It looks like कोल्हापुरी had already customized MediaWiki:Pagecategories on the Marathi Wikipedia. Mike Dillon 14:32, 22 September 2007 (UTC)[reply]
P.S. I don't know why you guys have mr:Template:All messages (with a bunch of missing messages) when mr:विशेष:Allmessages already shows all messages. Looking at the special page shows even more localizations that should be submitted back to the MediaWiki core. Mike Dillon 14:37, 22 September 2007 (UTC)[reply]

Interwiki bot

As discussed above, the present interwiki.py code, designed for the pedia, is horribly inefficient on the wikts. It goes around loading lots and lots of pages from various language wikis, trying to find "translations" of article titles. Even with the -wiktionary flag, it still does a huge number of page loads.

I've written some new code based on Tbot, that creates a merged all-wikt index from a given starting point on the fly, and loads only the pages that need iwikis modified. I'm running some tests for now, you'll see it as Tbot. Just a heads up. Feel free to check anything it does. Robert Ullmann 02:31, 24 September 2007 (UTC)[reply]

The interwiki.py code works, as I noted, by wandering around. If for example, someone adds an entry to the te.wikt (Telugu), without adding a link from one of the "big" wikts to it, the interwiki code will never find it. That's why we see people showing up here to add iwikis, they need to link to their work from en (or fr, es, etc), so that the bot(s) running interwiki will find it and add it to others.
Since my code doesn't work that way, but uses a complete merged index, those will be found. There seem to be quite a lot of them. (And then, even if the other wikis are still using the other code, they'll get them too ;-).
Also, it is sorting iwikis in a number of cases (but only when adding or removing), lots of cases of fi being under Fi rather than Suomi ...
Continuing a bit of testing, please comment? Robert Ullmann 15:19, 24 September 2007 (UTC)[reply]
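A rough sketch of the merged-index idea, for readers who want to see why it finds entries that link-following misses. It is not the bot's code: the real program merges Special:Allpages listings on the fly, whereas this version holds everything in memory, and all names here are hypothetical.

```python
from collections import defaultdict


def merged_index(titles_by_wikt):
    """Map each title to the set of wikt codes that have an entry with that exact title."""
    index = defaultdict(set)
    for code, titles in titles_by_wikt.items():
        for title in titles:
            index[title].add(code)
    return index


def missing_iwikis(index, home, existing_links):
    """For each title on the home wikt, list codes that have the title but are not linked yet."""
    todo = {}
    for title, codes in index.items():
        if home not in codes:
            continue
        missing = (codes - {home}) - existing_links.get(title, set())
        if missing:
            todo[title] = sorted(missing)
    return todo


# Made-up sample: "te" has entries that nothing links to yet.
titles = {"en": {"carro", "house"}, "it": {"carro"}, "te": {"carro", "house"}}
index = merged_index(titles)
todo = missing_iwikis(index, "en", {"carro": {"it"}})
```

Because the index is built from the title lists themselves, an entry needs no incoming link to be found, and only the pages in `todo` have to be loaded and edited.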
Seems to be working well. We haven't seen VolkovBot since the 15th (except a few edits on the 20th). What do you think we should do? Robert Ullmann 17:21, 24 September 2007 (UTC)[reply]

(sound of crickets, very pleasant where I live) I get the impression that no one cares much as long as it is done competently. (I think people like Volkov get very excited about making lots of edits and then find out it is boring? I don't know, and do not claim to know.) Running iwikt.py under UllmannBot for now, so you can see what it is doing, but won't flood RC. Robert Ullmann 23:05, 24 September 2007 (UTC)[reply]

Sorry for the deafening silence. I am very pleased that you are taking this approach. It is at least an order of magnitude better than interwiki.py. Good job! Rod (A. Smith) 23:16, 24 September 2007 (UTC)[reply]
Tx, btw: I am looking at sense/etc... the interwiki.py heuristics are fine for the 'pedias (quite interesting really!), just really not needed for the wikts ;-) Robert Ullmann 23:22, 24 September 2007 (UTC)[reply]
Re: "I get the impression that no one cares much as long as it is done competently.": Word. Please continue. :-) —RuakhTALK 00:13, 25 September 2007 (UTC)[reply]
This is excellent, actually. You're saving someone a whole lot of computing cycles. When I finally decided to vote against VolkovBot it was too late.
I'm wondering if we should start putting interwiki links in section zero. DAVilla 03:42, 25 September 2007 (UTC)[reply]
Thank you. I had to stop at "Ab" and add some code to skip the redirects seen in the XML; there are ~57000 left over from the conversion script. If one has an entry in another wikt, the code was picking it up and then skipping it. (fairly useless) The only down side is if a title was a redirect in the last XML, but is now an entry, it won't get the iwikis.
There are wikts with them in section 0, but why? Seems to me the only thing that would do is have them in the way when editing? Is there something else? (Sometimes they end up in the middle when someone adds a section, but that is harmless, and fixed whenever the entry is touched by an iwiki bot or by AF). Is there another advantage? Robert Ullmann 13:04, 25 September 2007 (UTC)[reply]
Probably because they get confused with translations and are otherwise easily chopped up when section editing the last language. In section 0 they would be run across less often. You would only see them if you edited the whole page, which means that you would see them in relation to the rest of the page. If you didn't know what they were, at least you wouldn't confuse them for something else on the page. DAVilla 16:14, 26 September 2007 (UTC)[reply]

Does anyone know where to find real documentation as to the sort order for the allpages table? Presumably the SQL parameter? The Help: on meta says it is in ASCII order, which isn't helpful. (It isn't even ASCII) It appears to use a word order, e.g. Air-cooled comes before Air Marshal. This causes minor inversions in the merge. I can fix this easily, but would rather not go from one incorrect version to another ;-) Robert Ullmann 14:21, 25 September 2007 (UTC)[reply]

Ah, the MediaWiki page.page_title is the title with spaces replaced with underscores. So it sorts in "word" order unless the second or subsequent word starts with a character > underscore. Very helpful, that; an order that is useful to no-one. Robert Ullmann 14:36, 25 September 2007 (UTC)[reply]
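The inversion is easy to reproduce. The sketch below assumes plain code-point comparison, which matches byte order for these ASCII titles; `db_sort_key` is a hypothetical helper, not a MediaWiki function.

```python
def db_sort_key(title):
    """Mimic the stored form of a title: spaces become underscores."""
    return title.replace(" ", "_")


titles = ["Air Marshal", "Air-cooled", "Airbag"]

# Space (0x20) sorts before hyphen (0x2D), so the displayed titles sort one way...
plain_order = sorted(titles)

# ...but underscore (0x5F) sorts after hyphen, so the stored titles sort another way.
db_order = sorted(titles, key=db_sort_key)
```

Sorting every title list with the same key before merging avoids the inversions.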
Fixed, running. We shall see if it puts the "ru" link back into Air Marshal presently ;-) Robert Ullmann 15:31, 25 September 2007 (UTC)[reply]
I think the deafening silence is a result of too many other activities, these days, for most people to keep up with minor announcements like this. I have no idea what happened to VolkovBot, but I strongly disagree with DAVilla. Having redundant IW methods on hand can only be seen as a Good Thing. If Tbot does it faster and more efficiently, great; the other bots will have very little, if anything to do. As long as they are only reading pages, I don't think the cluster performance is an issue at all. (At worst, only a trivial additional load.)
Hey, I wasn't going to kill VolkovBot, I was just agreeing with Robert on the bot flag. We can't be certain it's doing the right thing if it's applying transitive properties rather than spelling. They aren't only reading pages, by the way. They're also writing pages with a limited domain of information. Am I correct that one of the pair of interwiki links on either of the language Wiktionaries has to be added manually? DAVilla
Well if it is using the -wiktionary flag on interwiki.py, we can be fairly sure it is doing the right thing. It doesn't need one of the two paired (reflexive) links, but it does need some link in a set of pages. So if someone adds an entry to the Telugu wikt, it won't get found unless there is a link from one of the pages already linked together. Then it can complete the entire set. If there are two disjoint sets (directed graphs), it won't merge them ;-) Robert Ullmann 16:21, 26 September 2007 (UTC)[reply]
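The point about disjoint sets can be illustrated with a small union-find sketch. It simplifies by treating every interwiki link as undirected, and the page names and function name are made up for the example.

```python
def link_sets(links):
    """Group pages into connected sets, treating each interwiki link as undirected."""
    parent = {}

    def find(page):
        parent.setdefault(page, page)
        while parent[page] != page:
            parent[page] = parent[parent[page]]  # path halving
            page = parent[page]
        return page

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for page in list(parent):
        groups.setdefault(find(page), set()).add(page)
    return sorted(groups.values(), key=sorted)


# Made-up links: nothing connects the te/sv pair to the en/it/fr set.
links = [
    ("en:carro", "it:carro"),
    ("it:carro", "fr:carro"),
    ("te:carro", "sv:carro"),
]
sets = link_sets(links)
```

A bot that only follows links can complete each set but never joins the two; a single link between them, or an index keyed on the title, is what merges them.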
I would like it if we seriously discussed redirect interwiki links. If an interwiki link is placed on the same line, after the redirect, it will still function, albeit not very usefully. Much better would be to follow the redirect and relax the "exact spelling" restriction that Gerard was so adamant about. At any rate, I'd like to see that discussed in WT:BP. Likewise, I'd like to see the capitalization variants discussed on WT:BP. Since we are deleting so many redirects these days (in lieu of the Javascript redirection mostly working), I think it would behoove us to honor those language Wiktionaries that effectively ignore the case-sensitivity (e.g. la, if I recall correctly). Having some statistics of which Wiktionaries have extraordinarily high miss-rates due to capitalization would help with that decision, I imagine.
Interwikis on redirects are about as useless as whether #redirect itself is capitalized. Primarily only editors are going to follow it back. There's nothing keeping a foreign interwiki from linking to a redirect here, but it might as well be resolved to the correct case.
There are still Wiktionaries that capitalize the first letter, correct? Making an exception in those limited cases isn't much of an open question. Had anyone actually proposed doing something else? DAVilla 16:14, 26 September 2007 (UTC)[reply]
There's a potentially-more-than-one-to-one relationship between our entries and those of a Wiktionary that capitalizes the first letter; how should interwikis work in that case? (How does Wikipedia handle it?) —RuakhTALK 17:12, 26 September 2007 (UTC)[reply]
I think there is one skin that displays interwiki links across the top. Perhaps that is what DAVilla is referring to, when he suggests section zero? Since these are only for bot manipulation, I can't see how it would matter, either way. Offhand, it doesn't seem particularly useful to move them all.
For the avoidance of doubt, there's no need to move the interwiki links on a particular page just for the sake of moving them. It would only be done as part of an update. DAVilla 16:14, 26 September 2007 (UTC)[reply]
Those minor points aside, Fantastic work Robert Ullmann! Bravo! Please keep it up! --Connel MacKenzie 16:37, 25 September 2007 (UTC)[reply]
A sysadmin/software engineer/etc is doing a good job when everything is quiet: it means nobody is complaining. It may seem that nobody cares, or that the person's job is not needed, but just wait until something catches on fire... Cynewulf 23:36, 25 September 2007 (UTC)[reply]
Don't I know it! (I could tell you stories ... one Halloween when a manager yelled at me for coming into work at 10 AM on quarter close day, and had to choke her words back down when I told her I had been up until 5 fixing the problem, and it was fixed. ;-) Robert Ullmann 23:41, 25 September 2007 (UTC)[reply]
FYI, current run stopped at Elsa, restarted at abc; need testing in different ranges; there is a lot to do that has accumulated in the last 3 months. Please see vote on WT:BP etc Robert Ullmann 00:10, 26 September 2007 (UTC)[reply]

Present status: Run from the beginning to Elsa, from abc to about bat at the moment. There are a lot of iwikis being added, since RobotGMwikt hasn't run in 2+ months; also someone has been very busy adding to the Telugu wikt. It will take a while to catch up (it took more than a day to do the "a"s ;-). VolkovBot has also reappeared, seems to be working from new pages in the it, de, and fr wikts. (it:spaghetti was just created today!) Which is fine. I've fixed a few bugs that cause faults; one fix is catching all the conditions from putpage and just continuing; in most cases it is a timeout failing to read back the result of an edit; the edit was done. If not, it will get done the next time. It takes the program about 1/2 hour to initialize, reading the XML and the first page of Special:Allpages from all 170 wikts, so it is best if it doesn't fault ;-) Robert Ullmann 17:46, 27 September 2007 (UTC)[reply]

It finds lots of things the other bot missed; (was it always working on Special:Newpages?) carro and it:carro were created by EP on 5 March 2006, the it wiki added to our entry today. Robert Ullmann 13:08, 28 September 2007 (UTC)[reply]
More: the total number of titles in the set of wikts seems to be about 3.2 million. (I have counts from several different runs, but none all the way through ;-) So the en.wikt covers about 1/6 of the total. Wikts with capitalization "issues" have little effect on the overall stats. The number of en.wikt entries needing interwiki links added is something like 1/3 of the total, perhaps 150,000 entries; the primary thread I am running has made it to convict after running for ~4 days; I'm also running a thread started at Han Unified (not Extension A), e.g. u4E00, and at Cyrillic (u0400). The wikt with the most Cyrillic is bg; I don't understand the language well enough to know what the quality of the entries is (I can read it phonetically, but that isn't much help ;-). It is surprising how few kanji have entries in the ja.wikt. And so on ... Robert Ullmann 10:39, 29 September 2007 (UTC)[reply]
Re: "It is surprising how few kanji have entries in the ja.wikt.": Perhaps not. How many kanji are commonly used online? The way Wiktionary is set up, it's probably pretty difficult for someone to look up a kanji they encounter in a book. Print kanji dictionaries are organized in various ways that have to do with how the kanji looks. Wiktionary is organized by typing or copy-and-pasting the exact kanji into the search box. I'm sure the Japanese Wiktionary is addressing this somehow — with lots of indices, probably — but I'm also sure it must be a pain. So, Japanese Wiktionarians are probably focusing on something they can be better at with less effort; English, perhaps? —RuakhTALK 15:45, 29 September 2007 (UTC)[reply]

A lot of iwikis missed before

This is a larger task than expected; not only the backlog since RobotGMwikt stopped in July; but also that the code run by Gerald (et al) looks at what it is told to, I expect mostly newpages; and won't 'unify' a set of iwikis unless the directed graphs are already connected. So once it/they miss something, it won't reappear. (This can, and does, explain some of the manual entries of iwikis in the en.wikt, both to fix misses, and "hint" to the 'bots).

I've already added 100,000+ iwikis, and there are many more: the code has covered part of a "!" -> "Elsa", part of "abc" to "d", part of "m" and part of "s", all of Cyrillic to partway into Arabic, all of Han and Hangeul.

Some people don't like the name; tell me what you think. (More likely to create anew rather than rename.) But getting there. Robert Ullmann 23:32, 30 September 2007 (UTC)[reply]

Name shmame; this is the best single development on here in a while. Thank you! -- Visviva 23:42, 30 September 2007 (UTC)[reply]
Seconded! --EncycloPetey 23:53, 30 September 2007 (UTC) If you decide to change the name, how about BeeBot? (as in busy as a bee, flitting from flower to flower.)[reply]
Karibu sana. One good thing is that if/when the en.wikt has a title, and those bots run (RobotGMwikt, VolkovBot, Multichill), as long as they include en, they will get all of the wikts ;-). It still will take quite a while. (10-15 days?) 23:58, 30 September 2007 (UTC)
Oh, do go vote (useful as consolidation and documentation of opinion ;-), and karibu! Robert Ullmann 00:07, 1 October 2007 (UTC)[reply]
I would not mind seeing a request on WT:MV to add "Bot" to the end of the name, for consistency. I don't think the bot's operation should be hampered in any way because of the name-change request, though. --Connel MacKenzie 18:17, 4 October 2007 (UTC)[reply]
karibu? Cynewulf 18:46, 4 October 2007 (UTC)[reply]

update

It has gone through the entire wikt (various ranges done as separate runs). There are/were about 1/4 million iwikis missing (WAG, not recorded statistics ;-). Of these, about 100K were ones that should have been added since Gerald's bot stopped running in July; about 150K were ones that had been permanently missed: either they fell out of Special:Newpages before being done, or didn't have enough connections to find all of them.

I've added a bit more code so that it will always sort if needed (including when someone has added text after existing iwikis) and remove the iwikis some people have added to the translation in the FL.wikt, rather than the same word. It is busily adding some to entries new since the last XML. Robert Ullmann 14:28, 7 October 2007 (UTC)[reply]

What is it removing? The link that exists on the FL wikt, if within a translation section? Wouldn't it be better to convert those to "in-line" interwiki links? E.g. from [[en:{{{1}}}]] to [[:en:{{{1}}}]]? --Connel MacKenzie 15:03, 7 October 2007 (UTC)[reply]
No, some people have added iwiki links at the bottom of the entry to point to the translated word in the FL.wikt, not to the same title there. Those get removed (and replaced with the same title if present). And some are the result of copying and modifying entries. Look at unchallenging and supervillain. Robert Ullmann 15:19, 7 October 2007 (UTC)[reply]
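The check described here, flagging iwikis whose target is not the page's own title, might look roughly like the following. The regular expression, the function name and the sample text are assumptions for illustration; the sketch also ignores wikts that capitalize the first letter, where a differing target can be legitimate.

```python
import re

# Matches [[code:target]] but not in-line links such as [[:en:target]].
IWIKI = re.compile(r"\[\[([a-z][a-z-]*):([^\[\]|]+)\]\]")


def wrong_iwikis(title, wikitext, codes):
    """Return (code, target) pairs for interwiki links that do not point at the same title."""
    return [
        (code, target)
        for code, target in IWIKI.findall(wikitext)
        if code in codes and target != title
    ]


# Made-up page text for an entry titled "house".
text = "[[fr:house]]\n[[de:Haus]]\n[[:es:casa]]\n[[Category:Nouns]]"
bad = wrong_iwikis("house", text, {"fr", "de", "es"})
```

Each flagged link would then be removed and, if the other wikt has the same title, replaced with a link to it.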
Oh, yes, I see. Plain-vanilla errors should just be removed silently. --Connel MacKenzie 17:55, 7 October 2007 (UTC)[reply]

There are 2,664,580 unique titles in all 170 wikts combined. (Not an instant snapshot, but over two days; by the Intermediate Value Theorem, the total was exactly that at some point if we assume that there weren't bulk additions or deletions, but always one at a time, and that we didn't have the equivalent of the Fosbury Flop ;-) Robert Ullmann 10:02, 9 October 2007 (UTC)[reply]

Copyrighted dictionary as a resource

There are some rules for citing text resources. However, when there is a set of words with their translations, I think that it should be justified to use them in an open dictionary. Can the meanings of foreign words also be copyrighted? There could be a difference if those words were categorized in some way. That would be an example of what should not be copied – their special order. Could you please tell me something about this? — This unsigned comment was added by 213.29.65.108 (talk) at 15:01, 25 September 2007 (UTC).[reply]

I am not a lawyer. If you wish for genuine legal advice, please consult a lawyer/copyright attorney.
While some have seen a distinction for mass-copying, I generally don't buy it. That is, the single translation of one word in one language to another single word in another language probably (according to others) is too small to hold copyright protection. But systematically copying an entire translation list's data, is clearly quite different. If you wish to enter translations in such a manner, from such a source, it is my understanding that you do so at your own risk. If a sysop here recognizes it as a mass import of a copyright-protected source, you would very likely be blocked from editing here indefinitely. AFAIK, the WikiMedia Foundation would assist prosecution in any way that it can (such as revealing IP addresses for a court-order, when such data is available.) To my knowledge, this has not been tested in court yet, for any Wiktionary entries. Obviously, whenever possible, a public domain resource should be used instead. Relying on native speakers of a language is a better alternate, and is Wiktionary's preferred method of translation acquisition. --Connel MacKenzie 18:03, 25 September 2007 (UTC)[reply]
Wholesale copying from copyrighted dictionaries is a Bad Idea. Please don't do it. Basically, the exact point at which a set of word equivalences becomes copyrightable under US law is debatable, and will eventually be settled by US courts. However, we are not interested in being part of that fight, because Wiktionary is not really about copying other people's work; we are seeking to build an independent dictionary which stands on its own merits. -- Visviva 15:40, 26 September 2007 (UTC)[reply]
Thank you for your answers. Don't worry, I do not plan mass copying of any word lists. It only seemed to me an important question: if one translates a number of words on their own, but with the same result as if using someone else's list (there aren't many ways to translate some words), it's difficult to judge whether the work was or wasn't stolen. 213.29.65.108 18:54, 26 September 2007 (UTC)[reply]
Hi, I am an intellectual property lawyer. Although this can run into a fairly gray area, if you speak another language and use a foreign language dictionary to confirm the correctness of your personal translations, and those translations are for a limited number of words which could equally be translated anywhere, no liability should accrue. Most translation dictionaries contain many thousands of translations, and necessarily coincide with one another. If you add a few hundred translations of common words that happen to be in a particular translation dictionary, it would effectively be impossible for the owners of that dictionary to demonstrate that they were the source of the translation as opposed to other dictionaries or even your own knowledge. bd2412 T 16:00, 30 September 2007 (UTC)[reply]

Yesterday, I edited User:Connel MacKenzie/custom.js, the Javascript file used by WT:PREFS. When I did that, I inserted some preferences, which changed the numbers associated with many of the preferences. I didn't realize that preferences are saved to cookies by their numbers instead of their names, so several editors returning today will find that their preferences have somehow changed. I have just now restored the previous numbers, so anyone who already readjusted their preferences today, please visit WT:PREFS once more. Sorry for any inconvenience. Rod (A. Smith) 16:58, 25 September 2007 (UTC)[reply]

Thank you for staying on top of it, Rod. --Connel MacKenzie 17:41, 25 September 2007 (UTC)[reply]
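
For anyone wondering why inserting a preference scrambled the saved settings, here is a small sketch of the difference between saving by position and saving by name. It is written in Python purely for illustration (the real file is Javascript), and the preference names and helper functions are made up; nothing here is taken from custom.js itself.

```python
def save_by_index(prefs, enabled):
    """Store the positions of the enabled preferences (what a number-keyed cookie does)."""
    return sorted(i for i, name in enumerate(prefs) if name in enabled)


def load_by_index(prefs, saved):
    """Read positions back against the current list of preferences."""
    return {prefs[i] for i in saved if i < len(prefs)}


def save_by_name(enabled):
    """Store the names themselves, so the order of the list no longer matters."""
    return sorted(enabled)


def load_by_name(prefs, saved):
    """Read names back, ignoring any that no longer exist."""
    return {name for name in saved if name in prefs}


# Hypothetical preference lists before and after a new item is inserted.
old_prefs = ["hide-toc", "show-ipa", "big-font"]
new_prefs = ["hide-toc", "new-option", "show-ipa", "big-font"]

cookie_by_index = save_by_index(old_prefs, {"show-ipa"})
cookie_by_name = save_by_name({"show-ipa"})
```

Reading `cookie_by_index` against `new_prefs` turns on "new-option" instead of "show-ipa", which is the kind of shift editors saw; reading `cookie_by_name` against either list gives back "show-ipa". Appending new preferences at the end of the list, rather than inserting them, also avoids the shift without changing the cookie format.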

Extracting a substring from {{PAGENAME}}

I'm having trouble figuring out how to do this (and some related operations like checking for a specific suffix). StringFunctions don't seem to be installed at this moment. Is there another way to do it? --Ivan Štambuk 12:39, 26 September 2007 (UTC)[reply]

No. More's the pity. -- Visviva 15:10, 26 September 2007 (UTC)[reply]
Yes, but not with string functions. You can use any of {{NAMESPACE}}:{{BASEPAGENAME}}/{{SUBPAGENAME}} or other magic words for subject and talk space etc. For further subpages there's also a new parser function available called {{#titleparts}} which like the rest of MediaWiki is implemented like crap. It can return a fixed number of parent pagenames but not all of them. DAVilla 15:25, 26 September 2007 (UTC)[reply]
True, true.... But from the question, I think Ivan was mostly interested in extracting parts of the basepagename, like detecting if the page name ended in "-ed." Which would be great, if we could do it. -- Visviva 15:43, 26 September 2007 (UTC)[reply]
In that case there's javascript, and string functions on the eighth day of the week, Someday. DAVilla 15:47, 26 September 2007 (UTC)[reply]
That's too bad, as this will probably quadruple the number of declension templates for Bosnian/Serbian/Croatian :/ Some javascript magic on client-side could reduce the mental efforts down to the minimum though.
The only option left that I can think of would be manually encoding a lemma string as a sequence of characters separated by '/'s and passed as an argument to the template that would simulate character-by-character comparison by using {{#titleparts}} and usual {{#ifeq}} function. Converting a string from format "a/b/c" to "abc" wouldn't be much of a problem either.
But that would all be a really dirty workaround I don't want to partake in..--Ivan Štambuk 16:10, 26 September 2007 (UTC)[reply]
Just pass the lemma components as separate arguments, and the template can assemble them as it pleases. Yes, you can't just take the pagename apart, you'll have to repeat the word. But then just pass the root and the ending (or more parts, I don't know enough about those languages). Robert Ullmann 16:15, 26 September 2007 (UTC)[reply]
Yes, as Robert says. Also, if those languages undergo stem mutations with certain inflections, it may be necessary for the editor to provide more than one form of the stem. Please also keep in mind that the inflection templates, which display on the headword line, should only display a few key inflections. For more extensive inflection details, please create a separate set of conjugation/declension templates for use in a ====Conjugation==== or ====Declension==== section. Rod (A. Smith) 17:17, 26 September 2007 (UTC)[reply]
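
To make the "pass the parts, let the template assemble them" idea concrete, here is a rough model of it in Python rather than template syntax. The `inflect` function, the stem names and the English examples are only illustrative assumptions; a real template would take the stems as numbered or named parameters and hard-code the endings for its declension class.

```python
def inflect(stems, endings):
    """Build each form by joining a named stem with its ending.

    stems   -- dict mapping a stem name to the stem text supplied by the editor
    endings -- list of (label, stem_name, ending) tuples fixed by the template
    """
    return {label: stems[stem_name] + ending for label, stem_name, ending in endings}


# Endings a hypothetical template for regular English verbs might fix.
VERB_ENDINGS = [
    ("third person", "base", "s"),
    ("present participle", "altered", "ing"),
    ("past", "altered", "ed"),
]

# No stem change: the editor passes the same stem twice.
walk = inflect({"base": "walk", "altered": "walk"}, VERB_ENDINGS)

# Stem change (doubled consonant): the editor supplies a second stem form.
stop = inflect({"base": "stop", "altered": "stopp"}, VERB_ENDINGS)
```

The template never has to inspect the page name; the cost is that the editor repeats the word, and supplies an extra stem wherever the stem mutates.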

Search providers

Firefox and Internet Explorer 7 have a built-in search box that we can help readers customize. I'd like to add options to WT:CUSTOM to let the user install MediaWiki:SearchEnWiktWithMediaWiki.xml and MediaWiki:SearchEnWiktWithGoogle.xml. To do so, we need to add something like User:Rodasmith/monobook.js to some site-wide Javascript file and to add something like User:Rodasmith/Help:Customizing your monobook to WT:CUSTOM.

Comments? Rod (A. Smith) 23:49, 26 September 2007 (UTC)[reply]

I was bold and went ahead with the initiative. WT:CUSTOM now has a new section near the bottom. If you're not using monobook or your browser doesn't support Javascript, a "sorry, but you're lame" message (or something like that) appears. Otherwise, if you have a cool browser, you get some options to integrate Wiktionary search into your browser. Rod (A. Smith) 03:34, 27 September 2007 (UTC)[reply]

Biblical Hebrew

We have 177 entries with the language header: ==[[w:Biblical Hebrew language|Biblical Hebrew]]==

All of these should probably just say ==Hebrew==. If so, would someone please replace them all by 'bot? --EncycloPetey 06:20, 30 September 2007 (UTC)[reply]

All those entries are also misformatted in other ways, so whoever does that should also add some sort of RFC tag as well ({{rfc-he-biblical}} or something). —RuakhTALK 15:06, 30 September 2007 (UTC)[reply]
But you agree about the language header, yes? --EncycloPetey 15:11, 30 September 2007 (UTC)[reply]
Yes. :-) —RuakhTALK 16:04, 30 September 2007 (UTC)[reply]
Could run this, but what happens to entries with both Hebrew and the "Biblical Hebrew" heading? (adding the cleanup template may or may not be the answer.) Is it definitive that Hebrew is not going to distinguish between ancient and modern? Robert Ullmann 23:51, 30 September 2007 (UTC)[reply]
Yes. See w:Hebrew language (introductory section). --EncycloPetey 00:08, 1 October 2007 (UTC)[reply]
I'd hesitate to say it's definitive, but that seems to be the current sentiment. (They're really the same language; insofar as we can compare these things, I'd say that the difference between Biblical Hebrew and Modern Hebrew is not very much greater than that between Shakespearean English and today-ian English.) —RuakhTALK 02:33, 1 October 2007 (UTC)[reply]
Most sources seem to argue that Hebrew is the same language and that speakers of modern Israeli Hebrew can read and understand the original Old Testament whether they are religious or not. There are however some who argue strongly that "European Hebrew" is a new language. I'm no expert but I'd say at this point we should go with the former, majority position.
On a related note, the Biblical Hebrew pages are mostly entered under the spelling with nikud (vowel points) whereas most normal Hebrew pages ignore nikud in titles but try to include them in the headword. There are indeed some duplicate pages because of this and I have added {{see}} to some such cases myself. — Hippietrail 09:19, 1 October 2007 (UTC)[reply]
For information, ISO 639 has defined a specific code: hbo. Lmaltier 17:40, 8 October 2007 (UTC)[reply]

Spanish (Castilian)

We have 230 entries with the language header: ==Spanish (Castilian)==

These should just say ==Spanish==. Could someone please replace them all by 'bot? --EncycloPetey 06:23, 30 September 2007 (UTC)[reply]

Can do with replace easily; leave this for a while for comments, and I or someone will run it. Robert Ullmann 23:52, 30 September 2007 (UTC)[reply]
Might require checking that the entries are not Spain-centric. Circeus 04:07, 1 October 2007 (UTC)[reply]
They can be reviewed at some point for finer accuracy, (plus usage notes) as you say. But that is no reason to use a completely invalid heading. --Connel MacKenzie 04:24, 1 October 2007 (UTC)[reply]
I'd taken a look at them in general. I didn't notice any words specific to Spain. --EncycloPetey 04:25, 1 October 2007 (UTC)[reply]
I've periodically fixed these in bouts for a while now. A bot would be better though coz it gets boring. — Hippietrail 09:11, 1 October 2007 (UTC)[reply]
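
For whoever runs it, the header replacement itself is small. The sketch below is one way it could look, not the code of any existing bot: the `HEADER_FIXES` table and `fix_headers` function are made-up names, and it leaves an entry alone (and reports it) when the target header is already present, so that a page does not end up with two identical language sections.

```python
import re

# Invalid level-2 header -> replacement. The table could be extended with other cases.
HEADER_FIXES = {"Spanish (Castilian)": "Spanish"}

# Matches level-2 headers only: ==Name== but not ===Name===.
L2_HEADER = re.compile(r"^==([^=\n].*?)==[ \t]*$", re.M)


def fix_headers(text):
    """Return (new_text, skipped), where skipped lists headers left for manual review."""
    existing = {m.group(1).strip() for m in L2_HEADER.finditer(text)}
    skipped = []

    def repl(m):
        name = m.group(1).strip()
        new = HEADER_FIXES.get(name)
        if new is None:
            return m.group(0)  # not a header we want to change
        if new in existing:
            skipped.append(name)  # both sections present: merge by hand
            return m.group(0)
        return "==" + new + "=="

    return L2_HEADER.sub(repl, text), skipped
```

For example, `fix_headers("==Spanish (Castilian)==\n===Noun===\ncasa")` rewrites only the first line, while a page that already has a ==Spanish== section comes back unchanged with the old header listed in `skipped`.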