User:Rukhabot

I'm a bot created and controlled by Ruakh. My source code is in Perl, making use of the URI, LWP::UserAgent, and JSON modules, as well as home-grown modules. If you'd like to see my source code, ask, but don't expect me to be released under the GPL anytime soon.

My username is a romanization of a hypothetical Hebrew רוּחֲבּוֹט, which isn't perfectly grammatical Hebrew (as the "b" would be a "v" in really proper Hebrew), but is a quasi-plausible neologism meaning "a bot's wind" or "a bot's spirit". (Ruakh's username, by comparison, is a romanization of the Hebrew word for "wind" or "spirit".)

If you see me do something bad, please leave me a comment on User talk:Rukhabot; I'll notice immediately, and will refuse to make any more edits until Ruakh has seen the comment.

Interwikis

My main task, accounting for the vast majority of my edits, is the addition, removal, and sorting/organizing/formatting of main-namespace interwiki-links. Various random facts about me in my guise as an interwiki-bot:

  • I only edit the English Wiktionary.
  • I don't examine the page-creation, page-deletion, and page-move logs; rather, I operate solely based on database-dumps (enwiktionary-YYYYMMDD-pages-articles.xml.bz2 and PREFIXwiktionary-YYYYMMDD-all-titles-in-ns0.gz). This means that my information about a given Wiktionary is typically up to two weeks out of date. However, before removing an interwiki-link that I think points to a non-existent page, I'll consult the target wiki's API just to confirm that the page still doesn't exist.
  • I arrange interwiki-links in the wikitext according to the following rules:
  • My code is completely distinct from, and independent of, that of any other interwiki-bot.
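
The confirm-before-removing step described above might be sketched roughly as follows. This is an illustrative Python sketch, not my actual Perl source: the function names are hypothetical, it assumes each wiki's API lives at https://PREFIX.wiktionary.org/w/api.php, and it assumes the default JSON format of a MediaWiki action=query response, in which a non-existent page carries a "missing" key. The HTTP request itself is left out so that the logic can be read and tested offline.

```python
import json
from urllib.parse import urlencode


def build_exists_query(prefix, title):
    """Build a MediaWiki API URL asking about one title (hypothetical helper)."""
    params = urlencode({"action": "query", "titles": title, "format": "json"})
    return f"https://{prefix}.wiktionary.org/w/api.php?{params}"


def confirmed_missing(api_response_text):
    """True only if the response explicitly flags every returned page as missing.

    Anything unexpected (bad JSON, an error object, no pages) counts as
    "not confirmed", so a link is never removed on doubtful evidence.
    """
    try:
        data = json.loads(api_response_text)
    except ValueError:
        return False
    pages = data.get("query", {}).get("pages", {})
    return bool(pages) and all("missing" in page for page in pages.values())


def should_remove_link(title, dump_titles, api_response_text):
    """Remove only if the title is absent from the dump AND the live API agrees."""
    if title in dump_titles:
        return False
    return confirmed_missing(api_response_text)
```

The point of the two-stage check is that the dump may be up to two weeks stale, so a page that is absent from the dump may have been created since; the live query guards against removing a link that has become valid.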

Translation-templates

Another significant task, but not accounting for nearly as many edits as the interwiki-links task, is conversion between {{t}} and {{t+}}. Various random facts about me in my guise as a translation-template-bot:

  • I only edit the English Wiktionary.
  • I only edit pages in the main namespace (regular entries) and the Appendix namespace.
  • I don't examine the page-creation, page-deletion, and page-move logs; rather, I operate solely based on database-dumps (enwiktionary-YYYYMMDD-pages-articles.xml.bz2 and PREFIXwiktionary-YYYYMMDD-all-titles-in-ns0.gz). This means that my information about a given Wiktionary is typically up to two weeks out of date.
  • I only convert between those two templates, plus converting to them from {{t-}}, {{t0}}, and {{tø}}. If a translation does not use any of those templates, it will not be touched.
  • I choose between {{t}} and {{t+}} using the rules you'd expect ({{t+}} when the foreign-language wikt exists and has the entry; {{t}} in all other cases), with a few special cases:
    • When the translation contains an explicit link, I use {{t}} (since {{t+}} doesn't support that case).
    • I know that the language-codes nan, cmn, nb, rup, kmr, and nds-de/nds-nl/pdt correspond to zh-min-nan.wikt, zh.wikt, no.wikt, roa-rup.wikt, ku.wikt, and nds.wikt, so I use {{t+}} for them when appropriate. For example, no:yes exists, so I will convert {{t|nb|yes}} to {{t+|nb|yes}} and {{t|no|yes}} to {{t+|no|yes}}.
    • sr.wikt has a feature whereby, if a Latin-script page doesn't exist, the software will check for the corresponding Cyrillic-script page, and issue an HTTP 301 redirection if the latter exists — and the reverse if a Cyrillic-script page doesn't exist but the Latin-script page does. I'm fully aware of this feature, so I'll write {{t+|sr|...}} if sr:... either is an entry or redirects to one.
    • ku.wikt has the same sort of feature as sr.wikt, but for Latin and Arabic scripts instead of Latin and Cyrillic. I support that as well.
    • zh.wikt, kk.wikt, and iu.wikt have the same sort of feature as sr.wikt and ku.wikt, but in those cases I don't know the mapping rules yet, so for them I change {{t}} to {{t+}} only when the title is an exact match, and for now, I never change {{t+}} to {{t}}.
  • I do not change any formatting outside of the template call.
  • I don't try very hard to understand the subtle complexities of MediaWiki template syntax. I simply look for (approximately) {{t[-+ø0]?[|][a-z-]+[|][^|}=]+ followed by | or }}. So, for example, I will be fooled by {{t+|fr|asfasefasefase|2=le}}, which looks like it links to fr:asfasefasefase, but which actually links to fr:le. However, even in such pathological cases, I won't cause any serious harm — I just might select the wrong template.
  • I don't examine context at all; I'm just as happy to update a {{t}} in a ====Synonyms==== section, or inside a comment, as a properly-used {{t}} in a ====Translations==== section.
  • I have no special behavior for B/C/S/M; for example, I will convert {{t|hr|Leiter}} to {{t+|hr|Leiter}} and will leave {{t|sh|Leiter}} alone.
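
The matching and template-selection rules above might be sketched roughly as follows. This is an illustrative Python sketch, not my actual Perl source: the names T_CALL, WIKT_FOR_LANG, and retemplate are hypothetical, and the set of existing titles per Wiktionary is assumed to have been loaded from the dumps beforehand. The script-conversion redirects on sr.wikt and ku.wikt, and the more cautious handling of zh.wikt, kk.wikt, and iu.wikt, are deliberately left out.

```python
import re

# Approximation of the pattern described above: a template name t, t+, t-,
# tø, or t0, then a language code, then a first parameter that must be
# followed by "|" or "}}".
T_CALL = re.compile(r"\{\{t([-+ø0]?)\|([a-z-]+)\|([^|}=]+)(?=\||\}\})")

# Language codes whose entries live on a differently named Wiktionary.
WIKT_FOR_LANG = {
    "nan": "zh-min-nan", "cmn": "zh", "nb": "no", "rup": "roa-rup",
    "kmr": "ku", "nds-de": "nds", "nds-nl": "nds", "pdt": "nds",
}


def retemplate(wikitext, titles_by_wikt):
    """Rewrite each matched template call to {{t}} or {{t+}}.

    titles_by_wikt maps a Wiktionary prefix to the set of its entry titles
    (assumed to come from the all-titles-in-ns0 dumps).
    """
    def fix(match):
        lang, word = match.group(2), match.group(3)
        wikt = WIKT_FOR_LANG.get(lang, lang)
        has_explicit_link = "[[" in word  # {{t+}} does not support this case
        exists = word in titles_by_wikt.get(wikt, set())
        name = "t+" if exists and not has_explicit_link else "t"
        # Only the start of the call is rewritten; everything after the
        # first parameter is left exactly as it was.
        return "{{" + name + "|" + lang + "|" + word

    return T_CALL.sub(fix, wikitext)
```

For example, given titles = {"no": {"yes"}, "fr": {"asfasefasefase"}}, calling retemplate("{{t|nb|yes}}", titles) yields {{t+|nb|yes}}, and the pathological {{t|fr|asfasefasefase|2=le}} is "upgraded" to {{t+|fr|asfasefasefase|2=le}} even though it actually links to fr:le, which is the kind of harmless mistake described above.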
Last modified on 14 April 2014, at 03:08