Wiktionary:Grease pit/2011/May

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

May 2011

Audio bot

Could somebody please check whether my bot is breaking anything while adding audio files? I am not active on en.wiktionary and don't follow the latest policy changes. Bot contributions: Special:Contributions/DerbethBot. --Derbeth talk 17:31, 1 May 2011 (UTC)

Orange links

User:Yair rand/orangelinks.js is a script that turns a link orange when the target page exists but the target language section does not, where such a broken link would otherwise appear blue. It works by requesting the full wikitext of each of the relevant pages from the API and checking whether the target language section exists on the page. Does anyone know whether, if the script were turned on by default, it would cause problems to have such large requests being made on every page load? --Yair rand 21:45, 1 May 2011 (UTC)
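
For illustration, the "does the target language section exist" check can be sketched in Python on wikitext that has already been fetched. This is a minimal sketch, not the code of orangelinks.js: the function name has_language_section is made up for the example, the API request itself is left out, and it assumes that a language section is a level-2 heading such as ==French==.

```python
import re


def has_language_section(wikitext, language):
    """Return True if the page text contains a level-2 heading for the language."""
    # Assumption: language sections are level-2 headings such as ==French==.
    # A level-3 heading such as ===Noun=== must not count as a match.
    pattern = r"^==[ \t]*" + re.escape(language) + r"[ \t]*==[ \t]*$"
    return re.search(pattern, wikitext, flags=re.MULTILINE) is not None


page = "==French==\n\n===Noun===\n# water\n\n== Old Armenian ==\n"
print(has_language_section(page, "French"))   # True
print(has_language_section(page, "English"))  # False
```

The check itself is cheap; the cost raised in the question comes from fetching the full wikitext of every linked page, which this sketch does not address.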

How to most effectively add script templates

As a bit of an 'experiment' I converted {{l|xcl}} to {{l|xcl|sc=Armn}} in all Old Armenian entries; as I said above, I think the worst that can happen in this case is that sc=Armn will appear twice, and really, there's no problem there. It's a bit ugly though, as named parameters should come after unnamed parameters, namely {{{1}}} and {{{2}}} for {{l}}, but I'm unsure of how to do that. Also, how easy would it be to check (using regexes in my case) for sc=Armn appearing twice in a template, and to remove the duplicate? Mglovesfun (talk) 12:14, 2 May 2011 (UTC)

Strictly speaking, it is theoretically impossible for a single regular expression to determine what parameters are used in a template and where a template-call ends: the ability of template calls to be nested arbitrarily is beyond a regular expression's ability to understand. That said, if we restrict consideration to template-calls whose interiors do not contain { or } (by far the most common case), then we can replace (\{\{l\|xcl\|[^\{\}]+)\}\} with $1|sc=Armn}} (adding |sc=Armn to the end of any template-call that starts with {{l|xcl| and doesn't otherwise contain { or }) and (\{\{l\|xcl\|[^\{\}]+)\|sc=Armn\|([^\{\}]+\|sc=Armn\}\}) with $1|$2 (removing |sc=Armn from the interior of any template-call that starts with {{l|xcl|, ends with |sc=Armn}}, and doesn't otherwise contain { or }). For a bit more coverage, you can sprinkle in \s* wherever whitespace is allowed. --RuakhTALK 13:01, 2 May 2011 (UTC)
That explains why I was finding it so hard. Hmm. Mglovesfun (talk) 13:08, 2 May 2011 (UTC)
I'm happy to do replacements like the one above for languages written in only one script, such as Old Armenian. The only 'problems' are really cosmetic rather than functional (unless someone can show otherwise) that script templates can appear twice, and can appear before the word itself. FWIW I was talking to msh210 about character ranges; this tbot userpage seems to be a bit of a gem: the idea is to detect the script from the character range and add a script template accordingly. I'm not gonna do anything at the moment, though, apart from discuss it. Mglovesfun (talk) 11:13, 3 May 2011 (UTC)
Re: "The only 'problems' are really cosmetic rather than functional (unless someone can show otherwise) that script templates can appear twice, and can appear before the word itself": I don't get it. Were you not satisfied with my suggestion above? --RuakhTALK 11:18, 3 May 2011 (UTC)
Right, I was fooled by "Strictly speaking, it is theoretically impossible for a single regular expression to determine what parameters are used in a template and where a template-call ends". I thought you were saying your script wouldn't actually work. I see what you mean now. Mglovesfun (talk) 12:00, 3 May 2011 (UTC)
I shall give it a go before running it 'en masse' to see if there are any problems. Mglovesfun (talk) 12:02, 3 May 2011 (UTC)
I ran it using hy instead of xcl ({{hy}} = Armenian) using the regex function of AWB and it skipped all 8000+ Armenian entries. Mglovesfun (talk) 23:05, 3 May 2011 (UTC)
I don't know much about AWB; what does that mean? I just tested my first regex above, after changing xcl to hy, and, for example, it matches two occurrences in [[ալոէ]]. (That's using JavaScript regexes, but judging from w:Wikipedia:AutoWikiBrowser/Regular expression, AWB regexes seem to be the same in all relevant respects.) Actually, the above wouldn't work perfectly at that page (in my second substitution, I didn't consider the case that \|sc=Armn had originally appeared at the end of the template-call), but I don't see why AWB would have skipped that page, since the first substitution does match. (BTW, to address the issue I just mentioned, the second substitution should replace (\{\{l\|xcl\|[^\{\}]+)\|sc=Armn(?=\|)([^\{\}]*\|sc=Armn\}\}) with $1$2.) --RuakhTALK 23:27, 3 May 2011 (UTC)
Perhaps it's human error on my part. --Mglovesfun (talk) 11:50, 4 May 2011 (UTC)
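
The two substitutions discussed in this thread can be sketched in Python as follows. This is a minimal sketch, not the AWB configuration anyone actually ran: the function name add_script_param and its lang and script parameters are made up for the example, the patterns are the ones given above (using the corrected second substitution), and repeating the second substitution until nothing changes is an addition. Like the originals, it only handles template-calls whose interiors contain no { or }.

```python
import re


def add_script_param(wikitext, lang="xcl", script="Armn"):
    """Append |sc=<script> to brace-free {{l|<lang>|...}} calls, then drop duplicates."""
    head = r"\{\{l\|" + re.escape(lang) + r"\|"
    param = r"\|sc=" + re.escape(script)

    # Step 1: add |sc=... to the end of every matching template-call.
    add = re.compile("(" + head + r"[^{}]+)\}\}")
    text = add.sub(lambda m: m.group(1) + "|sc=" + script + "}}", wikitext)

    # Step 2: remove an earlier |sc=... from any call that now ends with one.
    # The lookahead (?=\|) ensures only an interior occurrence is removed.
    dup = re.compile("(" + head + r"[^{}]+)" + param + r"(?=\|)([^{}]*" + param + r"\}\})")
    while True:
        text, count = dup.subn(r"\1\2", text)
        if count == 0:
            return text


print(add_script_param("{{l|xcl|foo}}"))          # {{l|xcl|foo|sc=Armn}}
print(add_script_param("{{l|xcl|foo|sc=Armn}}"))  # {{l|xcl|foo|sc=Armn}}
```

One limitation carries over from the patterns as written: a call in which |sc=Armn directly follows the language code, as in {{l|xcl|sc=Armn|foo}}, keeps that first occurrence and gains a second at the end, because the second pattern requires at least one character between {{l|xcl| and the |sc=Armn it removes.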

Translation cleanup script

I'm playing with User:Mglovesfun/add t.js in an attempt to write a script to add {{t}} and also move transliterations and genders inside the {{t}} templates. It seems to be beyond my capability and, don't worry, I'm not planning on running it any time soon. The current problem is that it matches the format * <languagename>: [[foo]] anywhere, so it'll add it for descendants too. Also, it will only format the first link, so * <languagename>: [[foo]], [[bar]] will only add t to foo, not bar. There is loads more I could mention; those are two obvious ones. Mglovesfun (talk) 11:48, 3 May 2011 (UTC)

You could limit it to inside translation tables by adding (?=([\s\S](?!\{\{trans\-top))+\{\{trans\-bottom\}\}) to the end of the regexp. --Yair rand 22:57, 3 May 2011 (UTC)
Could you add it to that page? Mglovesfun (talk) 23:07, 3 May 2011 (UTC)
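
To show what the suggested lookahead does, here is a minimal Python sketch. The lookahead is the one given above (with a non-capturing group); the line pattern for * <languagename>: [[foo]] and the function name translation_links are assumptions made for the example and are not taken from add t.js. The lookahead succeeds only if a {{trans-bottom}} follows before the next {{trans-top, which is the case inside a translation table and not in an earlier section such as Descendants.

```python
import re

# From the match position onward, {{trans-bottom}} must be reached
# without passing a character that is followed by {{trans-top.
INSIDE_TRANS_TABLE = r"(?=(?:[\s\S](?!\{\{trans-top))+\{\{trans-bottom\}\})"

# Hypothetical pattern for a line of the form "* Language: [[word]]".
TRANSLATION_LINE = re.compile(
    r"^\* ([^:\n]+): \[\[([^\]|\n]+)\]\]" + INSIDE_TRANS_TABLE,
    re.MULTILINE,
)


def translation_links(wikitext):
    """Return (language, word) pairs for matching lines inside translation tables."""
    return [(m.group(1), m.group(2)) for m in TRANSLATION_LINE.finditer(wikitext)]


sample = (
    "====Descendants====\n"
    "* Spanish: [[agua]]\n"
    "\n"
    "====Translations====\n"
    "{{trans-top|water}}\n"
    "* French: [[eau]]\n"
    "* German: [[Wasser]]\n"
    "{{trans-bottom}}\n"
)
print(translation_links(sample))  # [('French', 'eau'), ('German', 'Wasser')]
```

The sketch only finds the lines; it still matches just the first link on each line, which is the second problem mentioned above.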

smart quotes

Template_talk:term#Smart_quotes, any input? Mglovesfun (talk) 15:36, 3 May 2011 (UTC)

This seems like more of a beer-parlour issue, no? --RuakhTALK 17:04, 3 May 2011 (UTC)
I don't disagree. Mglovesfun (talk) 23:06, 3 May 2011 (UTC)

DEFAULTSORT in Greek entries and interwiki order

I tried a Python script we use in el.wikt to add DEFAULTSORT keys to Greek words (User:Flubot/Adding_DEFAULTSORT_key_to_Greek_words). It's not very elegant because I've made some changes to run it here, but it works. There are two points: (a) If someone had previously inserted a valid DEFAULTSORT key somewhere in the middle of the text, the script moves it to just before the interwiki links. This change is not necessary but harmless, I think. (b) It changes the order of some interwiki links; for example, it moves hu after lt, i.e. Magyar after Lietuvių (see this), and fi after ro, i.e. Suomi after Romanian (here). That seems all right to me, but I would like to have other opinions before I go on with this script. --flyax 02:23, 5 May 2011 (UTC)

I see that Interwicket used to sort interwikis by language name and not by language code (see Absinth). Luckas-bot does the same (see φόρος). So, if there is no objection, I'll resume my edits with this script in a couple of days. --flyax 11:57, 6 May 2011 (UTC)
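
The difference between the two orderings can be illustrated with a small Python sketch. It is not the Flubot script: the function names are made up, the table holds only the four codes mentioned above with their native names, and sorting by case-folded name is a simplification of whatever ordering list the bots actually use.

```python
# Illustrative table only: the four codes from the examples above.
NATIVE_NAMES = {"fi": "Suomi", "hu": "Magyar", "lt": "Lietuvių", "ro": "Română"}


def sort_by_code(codes):
    """Order interwiki links alphabetically by language code."""
    return sorted(codes)


def sort_by_name(codes, names=NATIVE_NAMES):
    """Order interwiki links alphabetically by native language name."""
    return sorted(codes, key=lambda code: names[code].casefold())


codes = ["ro", "fi", "lt", "hu"]
print(sort_by_code(codes))  # ['fi', 'hu', 'lt', 'ro']
print(sort_by_name(codes))  # ['lt', 'hu', 'ro', 'fi']
```

Sorting by name puts hu after lt and fi after ro, which is the reordering described in point (b). Names that begin with accented or non-Latin letters would need a proper collation rule rather than plain string comparison.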

Subpages for entries?

moved to the Beer parlour.

pywikipediabot - Importing existing free-licensed database into Khmer Wikt

Greetings from Khmer Wikt!

I came across a blog post somewhere about using pywikipediabot to import an existing database from a free-licensed dictionary into Wikt. I heard that the Burmese or Lao project has been doing it. So, how can my team and I make use of this bot for the Khmer Wikt project? Currently, there are one or two active users who devote their time to manually creating entries one by one in our Khmer Wikt. So, I need your help in exploring whether pywikipediabot can make life easier and save a great deal of time. Thank you! --វ័ណថារិទ្ធ (Vantharith) 16:36, 7 May 2011 (UTC)

Autoformat

For some time I haven't been finding any results from either Autoformat or a replacement on various clean-up lists. Is that because there is nothing to clean up? If not, this would make it highly likely that entropic processes will eventually overwhelm us. DCDuring TALK 11:41, 8 May 2011 (UTC)

Do we actually need a bot running 24/7 to format pages? Are cleanup lists generated from the dumps enough? Nadando 20:13, 9 May 2011 (UTC)
Not 24/7 but if it's once a day, it certainly helps! --Mglovesfun (talk) 11:12, 16 June 2011 (UTC)

Make conjugation box open immediately

Hi, can I make the conjugation box open immediately when I enter an entry? For example, if I go to parlare, the Conjugation is hidden and I need to manually press "show" in order to show it. Jobnikon 16:13, 8 May 2011 (UTC)

Thanks! Jobnikon 17:09, 8 May 2011 (UTC)
Or you could just click "Show conjugation" in the "Visibility" section of the sidebar. --Yair rand 20:49, 8 May 2011 (UTC)

This page contains a deprecated message; Special:CategoryTree was implemented long ago. Either the page should be deleted, in which case it would revert to the default message "The following {{PLURAL:$1|category contains|categories contain}} pages or media. [[Special:UnusedCategories|Unused categories]] are not shown here. Also see [[Special:WantedCategories|wanted categories]].", or the portion labeled "the [http://tools.wikimedia.de/~daniel/WikiSense/CategoryTree.php?wikilang=en&wikifam=.wiktionary.org&cat=*Topics&m=c&go=Load&userlang=en&terse= Category Tree browser] from the Toolserver" should be changed to "[[Special:CategoryTree|the CategoryTree]]" or something similar. TeleComNasSprVen 23:42, 11 May 2011 (UTC)

"Category:Latin words needing attention" needs to be changed to "Category:Latin terms needing attention", but unfortunately this template is protected. TeleComNasSprVen 05:28, 12 May 2011 (UTC)

Done. --Yair rand 05:30, 12 May 2011 (UTC)
I find it a bit irritating that such templates don't categorize when the form with macrons isn't given. That is, even when there are no macrons, you have to write out the page name to tell the template that, or else it won't categorize. --Mglovesfun (talk) 11:20, 12 May 2011 (UTC)

Anonymous feedback

I noticed while I was logged out that there is a section in the sidebar titled "feedback" with the header "Send your anonymous feedback to Wiktionary". It includes several JavaScript commands to mark an entry a particular way according to a standard scale of some sort, as well as a link directly to Wiktionary:Feedback, which is of course where we get all of our feedback in writing. However, the section disappears while I am logged in, and I'm no longer able to comment on a particular entry. Why has this feature been removed for autoconfirmed users who just wish to scroll through entries? TeleComNasSprVen 00:52, 13 May 2011 (UTC)

I think it's assumed that logged-in users would fix something themselves, or have a better idea of how to get someone to fix it. --Internoob (DiscCont) 03:25, 21 May 2011 (UTC)
Apparently it has to be anonymous; you can't do it while logged in. Mglovesfun (talk) 13:10, 22 May 2011 (UTC)
This has always been for unregistered users only. It wouldn't be hard to enable it for all users but, as Internoob pointed out, it didn't seem like something most users would want. - [The]DaveRoss 20:40, 20 June 2011 (UTC)

Making topic cat use numbered parameters

Currently, (almost) all category boiler templates have the language code as the first numbered parameter, and the name of the category as the second. {{topic cat}} is rather conspicuous because it requires parameters called lang= and current= instead. Would anyone object if I modified the template so that it allows you to call it with numbered parameters as well, like other boiler templates? --CodeCat 16:20, 16 May 2011 (UTC)

I've spotted this before; I'd say go for it. Obviously lang and current should still work, as they're so widely used. Looks like a good case where regular expressions could be used to simplify these templates using MglovesfunBot. Mglovesfun (talk) 21:12, 17 May 2011 (UTC)
I have made the change now, and a few categories now use this newer system. Would anyone object to running a bot that converts the old style to the new, or is that going overboard? --CodeCat 12:37, 22 May 2011 (UTC)
I support using a bot to do that. --Daniel. 12:44, 22 May 2011 (UTC)
Since it's thousands and thousands of edits, I think it's "going overboard" to do it without a vote. --RuakhTALK 13:05, 22 May 2011 (UTC)
I don't see how the number of edits matters when they're all minor. It wouldn't change the reader's interface at all; it would appear the same to readers (i.e. non-editors), a bit like changing {{infl|fr|adverb}} to {{fr-adv}}. Plus, the old system would still work: editors can still use it, and a bot may simply modify it later to do the same thing, just slightly simpler. Mglovesfun (talk) 13:08, 22 May 2011 (UTC)
WT:BOT says "I will make sure that the task is so innocuous that no one could possibly object". That's not good wording, as it doesn't care why someone objects. In theory, even if nobody does object, someone could hypothetically object or hypothetically have objected, so no task would ever pass under this rule. Furthermore I could object just because I'm in a bad mood or I have a grudge against someone. Anyway, in upholding what I'd like to think is the spirit of this rule, there's no reasonable reason to object to this. So under the spirit of the law, this doesn't even require consultation, never mind a vote. Mglovesfun (talk) 13:14, 22 May 2011 (UTC)
The only reason I can think of for not doing this is if we prefer {{topic cat|lang=fr|current=Something}} over {{topic cat|fr|Something}} for 'canonical' reasons... But topic cat has always been a bit of an odd one out when it comes to the category structure. --CodeCat 13:28, 22 May 2011 (UTC)
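
The conversion a bot would perform can be sketched in Python. This is a minimal sketch under stated assumptions, not an actual bot task: the function name to_numbered is made up, and the sketch deliberately leaves a call untouched unless it consists of exactly the two named parameters lang= and current= (in either order) and contains no nested braces.

```python
import re

# Only brace-free calls are considered, so nested templates are left alone.
TOPIC_CAT = re.compile(r"\{\{topic cat\|([^{}]*)\}\}")


def to_numbered(wikitext):
    """Rewrite {{topic cat|lang=X|current=Y}} as {{topic cat|X|Y}}."""

    def convert(match):
        named = {}
        for part in match.group(1).split("|"):
            key, sep, value = part.partition("=")
            if not sep:
                return match.group(0)  # has unnamed parameters already
            named[key.strip()] = value.strip()
        if set(named) != {"lang", "current"}:
            return match.group(0)  # unexpected parameters: do not guess
        if "=" in named["lang"] or "=" in named["current"]:
            return match.group(0)  # "=" in a value would break a numbered parameter
        return "{{topic cat|" + named["lang"] + "|" + named["current"] + "}}"

    return TOPIC_CAT.sub(convert, wikitext)


print(to_numbered("{{topic cat|lang=fr|current=Something}}"))
# {{topic cat|fr|Something}}
```

Being conservative about anything unexpected keeps the edits as minor as the discussion assumes, since pages the sketch does not understand are simply skipped.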

Orphaned appendix pages

Is there a way to find out which appendix pages do not have links to them? It would be useful to be able to see which of our reconstructed entries are not linked to from etymologies yet. --CodeCat 12:36, 22 May 2011 (UTC)

So currently, Template:syc-root-entry is set up to include three (and only three) letters as the basis of the Syriac root. That covers the vast majority of Syriac roots; however, there are quite a few that don't follow this pattern. Some will have only two letters, others four or sometimes five. Is it possible to set the template up in a way so that it includes all of those possibilities? I'm guessing it's an easy fix, maybe make two letters required and the rest optional with "#if:" or something. --334a 20:41, 22 May 2011 (UTC)

How's this? --Yair rand 20:54, 22 May 2011 (UTC)
I've cleaned it up a bit by adding line breaks, and I changed the nested ifs to ifs appearing side by side, because there doesn't seem to be any difference. --CodeCat 21:10, 22 May 2011 (UTC)
With the ifs nested it wouldn't have to actually run the parserfunctions after the first blank parameter, I think. --Yair rand 21:15, 22 May 2011 (UTC)
That works beautifully! Many thanks to both of you. :) --334a 21:43, 22 May 2011 (UTC)
Or should that be... {{syr-root-entry}}? See the discussion below (June 2011). --Mglovesfun (talk) 11:11, 16 June 2011 (UTC)

"Sister projects

Thought you might find this interesting: [1] Geof Bard 04:11, 24 May 2011 (UTC)

These seem to be Dawnraybot edits, many of them. How do we fix them? --Mglovesfun (talk) 11:10, 24 May 2011 (UTC)

Remove variables from templates. For example:
# {{pt-verb form of|[[abandonar]]|ar|indicative|tense=preterite|number=plural|person=first-person|dialect=brazil}}
change into
# {{pt-verb form of|[[abandonar]]|ar|indicative|preterite|plural|first|dialect=brazil}}
Ungoliant MMDCCLXIV 00:24, 8 June 2011 (UTC)
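
The change in the example above can be sketched in Python. This is only a sketch based on that single example: the function name to_positional is made up, and it assumes that tense, number and person become the fourth to sixth unnamed parameters and that "-person" is dropped from the person value. Calls that do not have exactly three unnamed parameters plus all three named ones, or that contain a piped link or nested braces, are left unchanged.

```python
import re

CALL = re.compile(r"\{\{pt-verb form of\|([^{}]*)\}\}")
NAMED = ("tense", "number", "person")


def to_positional(wikitext):
    """Turn tense=, number= and person= into unnamed parameters, in that order."""

    def convert(match):
        body = match.group(1)
        if re.search(r"\[\[[^\]]*\|", body):
            return match.group(0)  # piped link: a plain split on "|" would be wrong
        unnamed, values, others = [], {}, []
        for part in body.split("|"):
            key, sep, value = part.partition("=")
            if not sep:
                unnamed.append(part)
            elif key.strip() in NAMED:
                values[key.strip()] = value.strip()
            else:
                others.append(part)  # e.g. dialect=brazil stays named
        if len(unnamed) != 3 or set(values) != set(NAMED):
            return match.group(0)
        person = values["person"]
        if person.endswith("-person"):
            person = person[: -len("-person")]
        parts = unnamed + [values["tense"], values["number"], person] + others
        return "{{pt-verb form of|" + "|".join(parts) + "}}"

    return CALL.sub(convert, wikitext)


old = ("# {{pt-verb form of|[[abandonar]]|ar|indicative|tense=preterite"
       "|number=plural|person=first-person|dialect=brazil}}")
print(to_positional(old))
# # {{pt-verb form of|[[abandonar]]|ar|indicative|preterite|plural|first|dialect=brazil}}
```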