Last modified on 9 January 2015, at 19:09

Wiktionary:Grease pit


Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and as a website.

The Grease pit is a place to discuss technical issues such as templates, CSS, JavaScript, the MediaWiki software, extensions to it, the toolserver, etc. It is also a place to think in non-technical ways about how to make the best free and open online dictionary of "all words in all languages".

It is said that while the classic beer parlour is a place for people from all walks of life to talk about politics, news, sports, and picking up chicks, the grease pit is a place for mechanics, engineers, and technicians to talk about nuts and bolts, engine overhauls, fancy paint jobs, lumpy cams, and fat exhausts. That may or may not make things clearer... Others have understood this page to explain the "how" of things, while the Beer parlour addresses the "why".

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Everyone is encouraged to expand both pages, or to come up with more such stuff. Other known pages with "tips-n-tricks" are to be listed here as well.



December 2014

Interwiki bots

Are any interwiki bots running at all - Ruakhbot or others? Also, I noticed that Min Nan wiki entries (nan) are usually not touched. They are not fancy but shouldn't be ignored. --Anatoli T. (обсудить/вклад) 04:29, 1 December 2014 (UTC)

I believe Rukhabot is still running, when User:Ruakh has the time. Not that often, but far better than nothing. Chuck Entz (talk) 04:58, 1 December 2014 (UTC)
Can't see anything in Special:Contributions/Ruakhbot. --Anatoli T. (обсудить/вклад) 06:14, 1 December 2014 (UTC)
You spelled it wrong, it's Special:Contributions/Rukhabot (the difference has to do with Hebrew grammar; there's a short explanation on its user page). Anyway, it clearly hasn't run since August. --WikiTiki89 06:22, 1 December 2014 (UTC)
Mea culpa. I will try to start a run sometime this week. (Are there no other active interwiki-bots anymore? That's unfortunate. Rukhabot is the most comprehensive interwiki-bot in mainspace, operating based on XML dumps like Interwicket used to do, but there used to be others that ran based on recent-changes and "seeding". And Rukhabot has never touched non-mainspace pages such as categories; we have always depended entirely on "traditional" interwiki-bots for that.) —RuakhTALK 01:00, 2 December 2014 (UTC)
@Ruakh: Yes, please restart, if you can.
A couple of questions: Does it link to redirects in other languages? 三輪車 is a redirect to 三轮车 on Chinese wiktionary (zh).
Are there excluded wikis, which are ignored, e.g. Min Nan (zh-min-nan) (not sure if it's excluded by your bot but I noticed that interwikis are often one-sided)? Compare e.g. [1] and 三輪車. The English entry is linked to Min Nan but it's not linked back to other wikis. --Anatoli T. (обсудить/вклад) 01:57, 2 December 2014 (UTC)
Re: "Does it link to redirects in other languages?": For redirects like zh:三輪車 (which are "redirects" in the proper HTTP sense, rather than in the sense we usually mean on enwikt), no: it does not.
Re: "Are there excluded wikis, which are ignored, e.g. Min Nan (zh-min-nan) [] ?": It excludes "closed" Wiktionaries, such as those for Tibetan, Moldovan, and Romansh; but the Min Nan Wiktionary is not closed, and therefore is not excluded. (See e.g. minnan?diff=28872250.)
Re: "I noticed that interwikis are often one-sided": As explained at User:Rukhabot#Interwikis, Rukhabot only edits en.wikt.
Note that it would be very difficult for me to change the interwiki-link behavior, because I would risk getting into accidental revert-wars with the other interwiki-bots. I have a lot more freedom with the translation-link behavior, which is why the translation-links do have a little bit of support for "redirects" like zh:三輪車.
RuakhTALK 06:54, 4 December 2014 (UTC)
@Ruakh: Thank you. I see you have no control over other wikis. Are you able (please consider) to add HTML redirects? We will probably centralise traditional and simplified Chinese entries into traditional; please see Wiktionary:Beer_parlour/2014/December#New_changes_to_Chinese_entries. --Anatoli T. (обсудить/вклад) 21:50, 4 December 2014 (UTC)
Er, I think you misunderstood me? I'm saying that I can't just start supporting HTTP redirects, because then I would get into revert-wars with the interwiki-bots that don't support them. (By the way, 'HTML' and 'HTTP' are not the same thing.) —RuakhTALK 02:23, 5 December 2014 (UTC)
@Ruakh: Sorry, I meant HTTP redirects. What other bots do you mean? I meant making it possible to link English Wiktionary's 三輪車 to zh:三轮车. Did you mean this particular link is going to cause revert-wars? Please confirm. --Anatoli T. (обсудить/вклад) 02:42, 5 December 2014 (UTC)
What Ruakh is saying is that if his bot (or anyone else) adds that link, one of the other interwiki bots will remove it. --WikiTiki89 02:50, 5 December 2014 (UTC)
OK, thanks. Sorry if my question/request sounded stupid. --Anatoli T. (обсудить/вклад) 05:16, 5 December 2014 (UTC)

{{ping}}

There is an odd behavior with {{ping}}. It's been working fine and sent notifications (a red number next to my user name) but today it did not and even when I click the list of notifications, the new one is not even there. There was a period after the template call instead of the usual space. Would that cause this issue? I wanted to test it but apparently I can't ping my own user name. --Panda10 (talk) 20:04, 1 December 2014 (UTC)

Just to note, the reason you can't ping yourself is that, to do so, you'd have to save and view the page that had just pinged you. Which of course just removes the ping right away. —CodeCat 21:05, 1 December 2014 (UTC)
@CodeCat:. Thanks. --Panda10 (talk) 21:36, 1 December 2014 (UTC)
One way to ping yourself is to open a second browser in which you're not logged on, then ping your own name as an anon. I just did that, and it worked: I got a notification. —Aɴɢʀ (talk) 21:30, 1 December 2014 (UTC)
Hi @Angr:. Ok, but you used a space after the template call. I am pinging you now with a period after it. Let me know if you get a notification. --Panda10 (talk) 21:36, 1 December 2014 (UTC)
@Panda10:. Yes, I did. Did you? —Aɴɢʀ (talk) 21:46, 1 December 2014 (UTC)
Yes. In that case, I have no idea why I did not get that one message. Thank you for testing. --Panda10 (talk) 21:48, 1 December 2014 (UTC)
The pinging function is often unreliable. It's often happened to me that I've failed to get notifications despite being pinged, without any discernible reason. —Aɴɢʀ (talk) 21:56, 1 December 2014 (UTC)
It's good to know. Thanks. --Panda10 (talk) 22:17, 1 December 2014 (UTC)
Note that if the post isn't signed properly, or if the edit adding the post removed any lines from the page, or if the ping link isn't formatted normally, the user won't get a notification. --Yair rand (talk) 01:05, 2 December 2014 (UTC)
mw:Help:Echo. Keφr 17:35, 21 January 2015 (UTC)

Scan Azeri entries for incorrect yaa

Can someone scan the dumps and make a list of all pages with an ==Azeri== L2 that contain the letter ي (U+064A, ARABIC LETTER YEH) either in the page title or in the text of the page? These should all be replaced with ی (U+06CC, ARABIC LETTER FARSI YEH). I just fixed a few, but have no idea how many more there are. --WikiTiki89 16:26, 3 December 2014 (UTC)
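The check requested above could be sketched roughly as follows. This is only an illustration in Python, not the script anyone in this thread actually ran; the function names and the heading regex are assumptions, and reading pages out of the XML dump is left out entirely.

```python
import re

ARABIC_YEH = "\u064A"  # ARABIC LETTER YEH (the letter to be replaced)
FARSI_YEH = "\u06CC"   # ARABIC LETTER FARSI YEH (the intended letter)

# Matches a level-2 heading line such as "==Azeri==" but not "===Noun===".
L2_HEADING = re.compile(r"^==[ \t]*([^=\n][^\n]*?)[ \t]*==[ \t]*$", re.MULTILINE)


def language_section(wikitext, language):
    """Return the body of the given L2 language section, or None if absent."""
    headings = list(L2_HEADING.finditer(wikitext))
    for i, match in enumerate(headings):
        if match.group(1) == language:
            # The section runs until the next L2 heading or the end of the page.
            end = headings[i + 1].start() if i + 1 < len(headings) else len(wikitext)
            return wikitext[match.end():end]
    return None


def needs_yeh_fix(title, wikitext, language="Azeri"):
    """True if the page has the language section and the Arabic yeh
    occurs in the page title or inside that section."""
    section = language_section(wikitext, language)
    if section is None:
        return False
    return ARABIC_YEH in title or ARABIC_YEH in section
```

Checking the section text as well as the title matters for Latin- and Cyrillic-script entries, where the Arabic-script form only appears in the page body.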

I only see one (پوميدور), which you've already moved. I could scan translation tables too, if you think it's worthwhile. DTLHS (talk) 17:02, 3 December 2014 (UTC)
Interesting that you didn't find Azərbaycan, which I also just fixed. Are you sure you checked the Latin- and Cyrillic-script entries? --WikiTiki89 17:06, 3 December 2014 (UTC)
My bad, I just checked the title, not the page text. DTLHS (talk) 17:08, 3 December 2014 (UTC)
Here:

bir fil din mis dünya ayı diz سو miras фил динозавр avtomobil birlik دین balıq yarasa pomidor silah Azərbaycan мирас میمون təyyarə тәјјарә imza çiçək maşın ədəbiyyat dəniz qızıl dərya gəmi heyvan minarə siçan дин kəndir әдәбијјат kərpic zeytun зејтун şirkət milçək göyərçin bibər meymun jasmin dayanacaq kəfgir qərənfil yoxsul kasıb böyrək xəzinə aclıq hinduşka həqiqət cib damcı پوميدور طیاره qarışqa qalay yasəmən مئیمون мејмун dinozavr Əs-Səlamu əleykum

DTLHS (talk) 17:23, 3 December 2014 (UTC)

Thanks! --WikiTiki89 17:30, 3 December 2014 (UTC)
Courtesy of Lo Ximiendo. --Vahag (talk) 17:58, 3 December 2014 (UTC)
See also the translations and links on Azeri. DTLHS (talk) 18:03, 3 December 2014 (UTC)
I fixed all those, but I guess it also looks like it may be worth scanning all translation tables. --WikiTiki89 18:09, 3 December 2014 (UTC)
Checking ي vs ی and ك vs ک, which look alike in certain positions, could be a regular task for patrollers. It's a common mistake for Arabic, Persian, Urdu, Azeri, etc., when entries are made using, e.g., the wrong IME. --Anatoli T. (обсудить/вклад) 22:56, 3 December 2014 (UTC)
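A patrolling aid for the two letter pairs mentioned above might look something like this. It is a minimal Python sketch, and the names LOOKALIKES, suspicious_letters and normalize are made up for illustration. It assumes the text is already known to be in a language that should use the Persian-style letters; applying it to Arabic itself would be wrong, and it does not cover ى (alif maqsura).

```python
# Arabic-keyboard letters mapped to their Persian-style lookalikes.
LOOKALIKES = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}
TABLE = str.maketrans(LOOKALIKES)


def suspicious_letters(text):
    """Return a sorted list of the Arabic-keyboard lookalikes found in text."""
    return sorted(ch for ch in set(text) if ch in LOOKALIKES)


def normalize(text):
    """Replace each Arabic-keyboard lookalike with its Persian-style letter."""
    return text.translate(TABLE)
```

Listing the hits with suspicious_letters before calling normalize leaves the final decision with a human, which seems safer given that some of these spellings may be legitimate alternatives.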
What I found when correcting the yaas was that usually the kaf was correct in the same word where the yaa was wrong. --WikiTiki89 23:30, 3 December 2014 (UTC)
Yes, yāʾ/ye errors are more common for both Arabic ي and ى (ʾalif maqṣūra) (which looks even more like Persian ی), but a while ago, there was a contributor who used a Persian keyboard for Arabic words and she used "kāf" as well. --Anatoli T. (обсудить/вклад) 23:38, 3 December 2014 (UTC)
Or a task for the script detection module (why was it disabled?) DTLHS (talk) 16:48, 4 December 2014 (UTC)
Yes, patrolling doesn't have to be manual, mixing Roman and Cyrillic scripts has been quite common, I have fixed quite a few when I spotted them. --Anatoli T. (обсудить/вклад) 05:05, 5 December 2014 (UTC)
On a related note, could someone delete کيلو and رجوليت for me, please? Thank you. ك could well be considered an 'alternative spelling' for Persian entries. Kaixinguo (talk) 13:52, 8 December 2014 (UTC)

Template:etyl edit protected request

"Anglo-Norman" (xno) should link to w:Anglo-Norman language, and not to w:Anglo-Norman, as it presently does. See e.g. beauty#Etymology. It Is Me Here t / c 17:06, 3 December 2014 (UTC)

Presumably it's actually Module:etymology language/data that needs to be changed, but I'm not sure how. —Aɴɢʀ (talk) 17:52, 3 December 2014 (UTC)
N.B. they manage to do it for Georgian at Template:etyl/documentation, if that's any help. It Is Me Here t / c 19:24, 3 December 2014 (UTC)
Yeah, practically all languages automatically have the word "language" added to the end in {{etyl}}, but Anglo-Norman doesn't, and I don't know why. —Aɴɢʀ (talk) 20:06, 3 December 2014 (UTC)
Done. For "normal" languages — ones with actual mainspace support and so on — it uses the language-name as it appears in categories, which is generally just the normal name plus "language". (There are some configured exceptions — ase category-names and etymology-Wikipedia-links use "American Sign Language" rather than "American Sign Language language" — but that's the default.) For languages defined in Module:etymology language/data, the default is just to use the first listed name, unless configured otherwise with an explicit wikipedia_article value. I've now configured xno to link to w:Anglo-Norman language. —RuakhTALK 04:31, 5 December 2014 (UTC)
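The defaulting rule described above can be illustrated with a small Python sketch. This is not the actual Lua module code: the record layout and the field names names, etymology_only and category_name are hypothetical, and only wikipedia_article is a name taken from the explanation above.

```python
def wikipedia_article(lang):
    """Pick the Wikipedia article title for a language record.

    `lang` is a hypothetical dict; the real module data may be laid out differently.
    """
    if lang.get("etymology_only"):
        # Etymology-only languages: explicit override, else the first listed name.
        return lang.get("wikipedia_article") or lang["names"][0]
    # "Normal" languages: the category name, which defaults to name + " language".
    return lang.get("category_name") or lang["names"][0] + " language"
```

Under these assumptions, a record for xno without an override yields "Anglo-Norman", and adding wikipedia_article = "Anglo-Norman language" produces the link requested in this thread.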

Huge edit, unreadable in wiki software? consoles

What's going on here at consoles? I can't even view the diff. I managed to roll it back though. [2] Equinox 19:58, 3 December 2014 (UTC)

It's just a bunch of junk JavaScript code, which comes out like even worse junk when interpreted as wikitext. --WikiTiki89 21:12, 3 December 2014 (UTC)

Howler in Template:it-noun

You can no longer specify the other gender plural in {{it-noun}}. The rationale is that the plural goes on the other gender form. So nominatrici cannot be linked to using it-noun in nominatore, but you can link to it from nominatrice, so there's no problem. However, for -ista nouns, the other gender noun is the same as the page name! For example, at alchimista you can't specify a masculine plural and a feminine plural unless both plurals are identical, which they aren't! I had to switch to {{head|it|noun}}. FWIW I'd rather we just allow linking to the other gender plural, but that's a policy issue, not a technical one. Renard Migrant (talk) 12:39, 5 December 2014 (UTC)

In cases like these we would just create two noun sections with separate definitions, just like for nominatore and nominatrice. They're still separate nouns with separate meanings, they just happen to have some forms in common. —CodeCat 14:11, 5 December 2014 (UTC)
I disagree, they're the same noun with the same meaning. CodeCat are you breaking things to avoid boredom again? Renard Migrant (talk) 15:42, 5 December 2014 (UTC)
They have different meanings. You can't call a nominatore a nominatrice. —CodeCat 17:58, 5 December 2014 (UTC)
The issue is with alchimista and similar Italian nouns ending in -ista, where one single word refers to both masculine and feminine in the singular, but that same word has two different plural forms depending on the gender. ‑‑ Eiríkr Útlendi │ Tala við mig 19:23, 5 December 2014 (UTC)
Yes, and I'm arguing that there are two nouns, alchimista m (male alchemist) and alchimista f (female alchemist). —CodeCat 19:25, 5 December 2014 (UTC)
  • That wasn't clear initially -- “You can't call a nominatore a nominatrice” made it sound like a different issue.
It seems then as though you're advocating for data redundancy, rather than a simple and elegant means of combining duplicate information. I'm not sure why, especially when we used to have just such a solution. ‑‑ Eiríkr Útlendi │ Tala við mig 20:10, 5 December 2014 (UTC)
Being accurate is still more important than not being redundant. There are many nouns in many languages that are synonyms of each other, but by and large we still have full definitions for both of them. But in this case, they aren't even synonyms. We would never define policewoman with the exact same definitions as policeman, because they clearly mean different things. The same applies here: you can't use alchimisti and alchimiste interchangeably, they refer to different things. Common sense says that words with different meanings must have different definitions, and cannot be combined into one word anymore than policewomen and policemen can. —CodeCat 20:27, 5 December 2014 (UTC)
  • CodeCat, again your comment suggests that you and I are talking past each other. The issue at hand is not about treating alchimisti and alchimiste interchangeably. It is about having one single entry for the singular form alchimista, which, in the singular, happens to be either masculine or feminine (presumably indicated by the article used), and indicating in the headline for alchimista that this term has two separate plural forms, depending on the gender. ‑‑ Eiríkr Útlendi │ Tala við mig 07:52, 6 December 2014 (UTC)
Does anyone other than CodeCat agree with CodeCat about this? --WikiTiki89 22:30, 5 December 2014 (UTC)
Not sure why this is not in Beer parlour. Is the undiscussed change still left unreverted? --Dan Polansky (talk) 22:45, 5 December 2014 (UTC)

I agree with CodeCat. Very often, such words are grouped in paper dictionaries, but it's for space reasons only. And the meaning of the feminine noun associated to the masculine noun is often unclear in these dictionaries: in French, if boulanger and boulangère are described in the same entry, the meaning wife of ... is usually omitted, very unfortunately. Lmaltier (talk) 22:54, 5 December 2014 (UTC)

But we aren't discussing boulanger and boulangère, we are discussing (using French as an example) words like chimiste that have the same masculine and feminine form. In French, the plurals are the same, but in Italian, they are different, but they are still one word in the singular. --WikiTiki89 22:59, 5 December 2014 (UTC)
In French, yes, for simplicity, it's possible to consider chimiste as a single word, either masculine (for men) or feminine (for women). But, for alchimista in Italian, different plurals clearly mean that they are different words. In Italian, nouns have a gender, either masculine or feminine. Lmaltier (talk) 06:26, 6 December 2014 (UTC)
  • I got to wondering, here we are blathering on about this in English, what do the Italians think? If any Italian speakers can chime in here, please do. I note that none of the participants so far in this thread self-report any Italian knowledge.
FWIW, The IT WT is missing any it:alchmista entry, but they do have an entry for it:specialista, with just the one listing for both masculine and feminine, and two separate entries for the plural.
Note: I did have a look at the history for {{it-noun}}, and the last change was in February 2013 when Semper removed the old template code and replaced it with an invocation of Module:it-head. That was last changed in late July this year by CodeCat, but briefly looking over the changes, I don't think those changes affected plural handling. In short, it looks like {{it-noun}} has had this deficiency for a while, possibly since it was first created.
If this IT WT entry is at all representative, and if Renard Migrant's comments above are accurate about the state of affairs in {{it-noun}}, then this template needs alteration to allow for this case (single gender-agnostic singular form, separate gender-specific plural forms). ‑‑ Eiríkr Útlendi │ Tala við mig 07:52, 6 December 2014 (UTC)
I don't think knowledge of Italian is needed; the matter is presented clearly enough for English-speaking lexicographers to be able to judge it well enough. Input from native speakers of inflected European languages featuring constructions similar or analogous to the one discussed does no harm. In Czech, most role names have gender-differentiated lemmas (učitel and učitelka for teacher), but there are cases which are not so differentiated, such as vrchní and radní, and for those it is not unequivocally obvious that we need two separate noun entries, each having a distinct sense. --Dan Polansky (talk) 11:18, 6 December 2014 (UTC)
I disagree with splitting definitions of nouns with common gender. Yeah, English nouns don’t have gender and different words must be used when referring to a specific sex, but why should this affect languages that do inflect for gender? — Ungoliant (falai) 16:42, 6 December 2014 (UTC)
How about nouns with identical masculine and feminine singulars and identical masculine and feminine plurals as well? Check out this edit to French marxiste. Isn't this just reader-alienating silliness? Renard Migrant (talk) 13:25, 7 December 2014 (UTC)
Precisely what I’m talking about. It’s confusing, inaccurate (at least for Italian and Portuguese, the masculine isn’t merely a male X, it’s also the form used when the sex is unknown or irrelevant) and it wastes the time of readers and contributors alike. — Ungoliant (falai) 16:46, 7 December 2014 (UTC)

Revisiting the issue of English UK/US spellings and entry synchroni(s|z)ation

Following from the Wiktionary:Requests_for_cleanup#anonymize thread, I wanted to bring up again the technical possibilities for maintaining a single dataset (page) for entries with multiple spellings, where the only difference in the spellings is regional (i.e. it has no bearing on meaning, etc.).

Every once in a while, curiosity or frustration bubbles up regarding English entries with different spellings, such as color and colour, and the challenges of maintaining content on both pages. I dimly recall past discussions about this, where the consensus that evolved was that the technical limitations were too great to allow for anything other than either picking one spelling as the lemma and redirecting the other, or just manually maintaining both pages.

However, the last such discussion that I can remember took place before we got Lua. Now that we have proper string processing, I started wondering if we might not be able to come up with something that would work.

As a quick-and-dirty sample, I created User:Eirikr/Template_Tests/colour and User:Eirikr/Template_Tests/color, which both just pull from User:Eirikr/Template Tests/colo(u)r. The plural forms for this word pair are quite simple, so no special processing is needed. More complicated plural forms would need another approach, but at the moment, I cannot think of any UK/US word pairs with more complicated plurals. Another sample pair to demonstrate a verb is at User:Eirikr/Template_Tests/anonymise and User:Eirikr/Template_Tests/anonymize, both of which pull from User:Eirikr/Template_Tests/anonymi(sz)e.

An alternate approach would be to pick one spelling as the "main" one that would contain the data, and just transclude that into the other spelling directly, without using a third page.

Maintaining two separate pages, where ostensibly the only differences should be in regional labels and other specifics, has proven problematic. See the discrepancies between honor and honour, for instance. Maintaining a single dataset instead and applying some simple structured authoring techniques looks like a saner and more consistent way forward.

What are your thoughts? Is this something we could implement? ‑‑ Eiríkr Útlendi │ Tala við mig 20:07, 5 December 2014 (UTC)
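To make the single-dataset idea above a little more concrete, here is a rough Python sketch of the selection step, i.e. choosing the UK or US form based on the name of the page being viewed. The {{v|UK|US}} marker syntax and the function name render are invented for this sketch; the sample pages linked above rely on comparing {{PAGENAME}} instead.

```python
import re

# Matches a marker such as "{{v|colour|color}}" (hypothetical syntax for this sketch).
VARIANT = re.compile(r"\{\{v\|([^|{}]*)\|([^|{}]*)\}\}")


def render(shared_text, page_name, uk_pages):
    """Expand each marker to its UK or US alternative depending on the page.

    uk_pages is the set of page names that should show the UK forms;
    every other page gets the US forms.
    """
    use_uk = page_name in uk_pages
    return VARIANT.sub(lambda m: m.group(1) if use_uk else m.group(2), shared_text)
```

For instance, render("# The {{v|colour|color}} of a thing.", "colour", {"colour"}) gives the UK wording, while the same call with page name "color" gives the US wording. Everything outside the markers is shared, which is the point of keeping one dataset.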

  • I oppose synchronizing entries via templates; we had this before and we quit this. I oppose the use of Grease pit for this discussion. --Dan Polansky (talk) 20:25, 5 December 2014 (UTC)
    • I oppose Dan's opposition to a solution. If you can't come up with a solution, opposing the attempts made by others is not helpful. Therefore, I declare this opposition void as it is only obstructive and not contributing to solving the problem in any way. —CodeCat 20:31, 5 December 2014 (UTC)
This is the correct place for discussing the technical aspects of whether something can be done and how we might do it. It's only the matter of whether to actually implement it that has to go to the Beer parlour. The cases where decisions to implement things were made here aren't an argument against discussing things here, but against omitting the Beer parlour step in the process. Chuck Entz (talk) 21:57, 5 December 2014 (UTC)
I disagree. Whether this kind of thing should be done should be discussed in Beer parlour. This discussion should have been started in Beer parlour, since there are no technical difficulties to be discussed, merely whether we want to use templates and thereby complicate editing. --Dan Polansky (talk) 22:05, 5 December 2014 (UTC)
  • After e/c... @Dan Polansky: The grease pit is for technical discussions. This is a technical discussion. Why do you “oppose the use of Grease pit for this discussion”?
  • Yes, indeed, we had this discussion before -- as I explicitly note in my initial post here (“I wanted to bring up again”). Much has changed in the technical capabilities of the MediaWiki platform, in ways that, I believe, alter the parameters enough to warrant discussing this again. We turned down a similar idea once before, under different conditions. Under current conditions, I am interested in what community members think.
Note that this issue affects relatively few terms: just those where multiple spellings are all regarded as full lemmata in their own rights, and where the content on all of the relevant pages is (or should be) identical except for spellings and regional tags. Increasing complexity by adding such a workaround (to what is ultimately a technical issue) also reduces complexity by simplifying page maintenance. I believe the net effect on editing complexity is actually a reduction.
I now understand that you are opposed, so thank you for making that clear. However, I do not fully understand why you are opposed. Could you articulate your position? Is your only opposition that it increases a specific kind of complexity, in a small and very specific subset of entries? ‑‑ Eiríkr Útlendi │ Tala við mig 22:24, 5 December 2014 (UTC)
(edit conflict) I understand Dan's opposition. He's not the only editor to oppose this suggestion for the reason that it makes editing more complicated. Nevertheless, if we can't have the two spellings in one heading like other dictionaries, then I support this proposal. Dbfirs 21:51, 5 December 2014 (UTC)
  • The Lua function :getContent() might be useful. See User:Wyang/anonymize - Of course more code and format standardisation of source entry are needed, but this could avoid having to update both pages. Wyang (talk) 22:02, 5 December 2014 (UTC)
    • Actually, I think we can avoid using Lua for this. There is a MediaWiki extension whose name I cannot recall designed to make partial page transclusions possible. I think we even have it installed here on Wiktionary. Keφr 18:16, 10 December 2014 (UTC)
      • mw:Extension:Labeled Section Transclusion. (Ironically, "labeled" is spelled the US way.) Keφr 18:18, 10 December 2014 (UTC)
        • Wow, that is excellent. It would be great if the function could be enabled. Wyang (talk) 20:19, 10 December 2014 (UTC)
          • We can't use {{#lsth:}} because as far as I can tell from the documentation, there is no way to specify for example which ===Noun=== section to transclude out of the many noun sections on the page. I see nothing wrong with using {{#lst:}} and marking the sections explicitly. --WikiTiki89 20:48, 10 December 2014 (UTC)
            • I made test pages at User:Wikitiki89/colour and User:Wikitiki89/color, but it seems that the feature is not enabled. --WikiTiki89 20:50, 10 December 2014 (UTC)
              • Ya, sorry I missed participating in this earlier. I'd found out about LST quite some time back and got excited by the possibilities, only to find that it didn't work here. That's part of why my sample pages (linked above) use {{ifeq: {{PAGENAME}} | UK spelling | UK spelling details | US spelling details }}.
              That said, https://en.wiktionary.org/wiki/Special:Version#mw-version-ext-LabeledSectionTransclusion shows that this is now installed on Wiktionary. I put together some rudimentary testing at User:Eirikr/Sandbox as the source, and User:Eirikr/Scratchpad as the page transcluding the source. However, given the way that LST works, I don't think it's what we need for UK/US entry synchronization, since here, we don't need to transclude specific subsections by name, and instead we need to conditionally determine which spelling to present to the reader, for one specific word. {{ifeq: {{PAGENAME}} ... }} is ugly, but without variables that can be set once and accessed from anywhere within the page, that's the only non-Lua way I can think of to do this. ‑‑ Eiríkr Útlendi │ Tala við mig 22:38, 10 December 2014 (UTC)
              The issue is not "which spelling to present to the reader", but how we can synchronize the content without moving the content to the template namespace. In other words, we want the content to be located at one of the entries and transcluded onto the other(s), and that's where LST becomes handy. --WikiTiki89 22:51, 10 December 2014 (UTC)
              • Ah, sorry, I was still working from the assumption that content would be in a third page. (I dimly recall that past similar discussions got hung up on political considerations of which spelling to use as the "main" one, so User:Eirikr/Template_Tests/colo(u)r works around that by having both UK and US spellings call the content from a third neutrally spelled page.) That said, even with LST as you describe here, which spelling to use is still a consideration (just not the consideration with regard to using LST). ‑‑ Eiríkr Útlendi │ Tala við mig 22:58, 10 December 2014 (UTC)
                • I think Lua is more promising than LST. But it might also be slower. —CodeCat 22:59, 10 December 2014 (UTC)
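For readers unfamiliar with Labeled Section Transclusion, the following Python sketch mimics what {{#lst:page|label}} does conceptually: it pulls out the text that the source page has explicitly wrapped in <section begin=label /> and <section end=label /> markers. The function name and the regex are illustrative assumptions; the extension's actual parsing rules may differ in details.

```python
import re


def labeled_section(wikitext, label):
    """Return the concatenated text of every span marked with the given label.

    Returns an empty string if the label does not occur.
    """
    name = re.escape(label)
    pattern = re.compile(
        r'<section\s+begin="?' + name + r'"?\s*/>'   # opening marker
        r'(.*?)'                                     # the marked content
        r'<section\s+end="?' + name + r'"?\s*/>',    # closing marker
        re.DOTALL,
    )
    return "".join(pattern.findall(wikitext))
```

With explicit markers, the content stays in the main-namespace entry that owns it, and the other spelling's page would only need to name the source page and the label.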

Which differences should be considered as normal between such pages? It's not always easy to tell, and differences may appear years after creation of pages. Possible differences are:

  • spelling (of course)
  • quotations (of course)
  • inflected forms: plural, etc. (of course)
  • conjugation tables (of course)
  • geographical area
  • period of use
  • pronunciation (either due to the different area or to the different period)
  • usage notes
  • meaning: it may appear that, unexpectedly, the precise meanings are slightly different, or that some meaning is particular to a spelling
  • synonyms, etc. (because of the possible difference of meanings, but also because some synonyms may be specific to a period or to a geographic area)
  • translations (because of the possible difference of meanings)
  • anagrams (of course)
  • gender (unlikely, but possible)
  • etymology (sometimes, the origin of the specific spelling may be of interest)

In other words, everything may differ. I think that differences are normal, that discrepancies are not a problem, and that users opening honor and users opening honour may have different expectations (e.g. why not a different spelling in definitions and notes in these pages?). But comparing the history of spellings may be interesting nonetheless, and should be encouraged when interesting. The most important thing is that users should never be asked to click on a link in order to get information they need about the word: both pages should be as complete as possible (it will always be the case if pages are improved with time). Also remember the KISS principle: contributors should not be discouraged by complexity. Lmaltier (talk) 22:33, 5 December 2014 (UTC)

Lmaltier, all of what you mention above might indeed have differences depending on spelling. In the case of UK/US entries, however, I would argue that all of that information still belongs in one place, and should be accessible from either spelling. ‑‑ Eiríkr Útlendi │ Tala við mig 22:38, 10 December 2014 (UTC)
Support the centralisation of US/UK, etc. spellings under one entry (whatever is chosen) in principle. Agreed to the move of the discussion to BP. --Anatoli T. (обсудить/вклад) 23:40, 7 December 2014 (UTC)
Oh? I wasn't supporting just one entry. I thought we were discussing keeping separate entries with just one set of definitions where appropriate, and avoiding "alternative spelling of" and the like. Anyway, I support the move to BP. Dbfirs 10:00, 8 December 2014 (UTC)
I Support the centralization of US/UK, etc. as long as: i) the content stays in the Main space and will not be moved to some obscure location in the Template namespace or, even worse, in the Lua-function namespace; ii) the templates/functions added to the content page don't alter the layout too much, allowing us to keep the main basic formatting scheme, i.e. editing the content by humans or bots/JavaScript should not be complicated too much by the unification. Matthias Buchmeier (talk) 05:53, 10 December 2014 (UTC)
I would support using the Oxford spelling as the main lemma (i.e. colour and analyse, but localize). --WikiTiki89 06:07, 10 December 2014 (UTC)
Support any sane way to avoid the duplication, bearing in mind that some senses (e.g. obsolete Shakespeare) may only be attestable in one spelling. Equinox 10:19, 10 December 2014 (UTC)
This discussion seems to have been diverted slightly onto a different tack. I agree with Equinox about obsolete and rare spellings because the user needs to be made aware of the rarity, but no user wishes to be told that the standard spelling of a common word in his country is an "alternative spelling" and have to click on a proscribed spelling to find the definitions. Dbfirs 16:46, 10 December 2014 (UTC)
It wouldn't be so bad if it says "American spelling of" or "British spelling of". --WikiTiki89 18:14, 10 December 2014 (UTC)
Agreed. We did get that changed on some entries, but it's still not ideal. Dbfirs 09:42, 11 December 2014 (UTC)
I don't like the idea that spellings exist within a binary American/British dichotomy. What about Canada, Ireland, Trinidad...? Equinox 13:05, 11 December 2014 (UTC)
I think Canada's the only one that really has its own spelling conventions; every other English-speaking country uses either US spellings or UK spellings consistently. And even Canada doesn't have any spellings that are uniquely its own; each word is spelled either the British way or the American way. For example, Brits might buy rubber casings for car wheels at a tyre centre, and Americans at a tire center, but Canadians would go to a tire centre. And Brits might have a paralysed neighbour and Americans a paralyzed neighbor, but Canadians would have a paralyzed neighbour. —Aɴɢʀ (talk) 13:51, 11 December 2014 (UTC)
I was only using American and British as the most significant examples. If we use Oxford spelling as the lemma, then "Canadian spelling of" would be pretty rare, in fact I don't know of any cases where they differ. What's also always confused me is why we Americans write advertise instead of advertize. --WikiTiki89 19:39, 11 December 2014 (UTC)
I gave two examples of Canadian spelling deviating from Oxford spelling above: tire and paralyze (also curb and analyze). As for advertise, it isn't etymologically advert + -ize, so we don't spell it that way (likewise televise, compromise, surprise, etc., are spelled with s). —Aɴɢʀ (talk) 21:14, 11 December 2014 (UTC)
Yogourt is a uniquely Canadian spelling, used on product packaging for French compatibility, although yogurt is probably the most common in Canada.
Canada is also more tolerant of foreign spellings (British or American), and less likely to see them as errors. In very many cases, the “foreign” spelling is also an alternative Canadian spelling. Makes it hard to mark up dictionary senses clearly. Michael Z. 2014-12-22 19:58 z
Umm... Oxford spelling also uses tire and paralyze. Don't confuse Oxford spelling with British spelling. I guess you're right about advertise. --WikiTiki89 02:29, 12 December 2014 (UTC)
Dammit you're right about paralyze. I always confuse -y(s/z)e and -i(s/z)e. --WikiTiki89 02:32, 12 December 2014 (UTC)
I think I'm right about tire too. Oxford dictionaries call that an American spelling in the sense of a rubber or plastic covering of a motor vehicle's wheel and accept only tyre in that sense. —Aɴɢʀ (talk) 14:24, 12 December 2014 (UTC)
Firstly, as you know, the ODE is not the OED. But I guess the actual Oxford spelling is disputable. The OED has entries for both "TIRE n.2 2b" and "TYRE n.5 2a", but describes the latter as a variant of the former. --WikiTiki89 14:37, 12 December 2014 (UTC)
The OED is pretty much the last place I'd go to find out what the Oxford spelling of a word is, because it's so comprehensive and so unabashedly descriptive that it lists everything ever attested. If you use an Oxford dictionary for everyday writing, you should use ODE or COED or http://www.oxforddictionaries.com/ and they all agree that tire in this sense is an American (or rather, North American) spelling. —Aɴɢʀ (talk) 15:06, 12 December 2014 (UTC)
Then I guess the question is "What is Oxford spelling?" I always thought it was defined by the spelling used by the OED. Note that for -i(s/z)e endings, the OED does not have entries at all for localise, but such spellings are found in the quotations listed at localize. It would be interesting to see which spelling of tire/tyre is used in the OED's definitions of other words. --WikiTiki89 16:32, 12 December 2014 (UTC)
To me, Oxford spelling is the spelling preferred by Oxford University Press in all its publications, for example a journal article published by OUP called “Environmental impact assessment of a scrap tyre artificial reef” or this article mentioning “five trucks equipped with four different types of tyres”. —Aɴɢʀ (talk) 17:32, 13 December 2014 (UTC)
All publications in British English use the spelling tyre except when they are deliberately using the archaic spelling tire which used to be used for the iron rim of wheels. Most educators in the UK (including Oxford University itself) now recommend the use of the ise ending where there is a choice in British English, so the so-called Oxford spelling is gradually disappearing on this side of the pond. Of course, some words (advertise, advise, apprise, chastise, comprise, compromise, despise, devise, disguise, excise, exercise, improvise, incise, prise (meaning ‘open’), promise, revise, supervise, surmise, surprise, televise etc) cannot have z in modern British English. Dbfirs 19:18, 13 December 2014 (UTC)
None of the words in that list can have z in American English either, at least not in proofread/copyedited American English. —Aɴɢʀ (talk) 19:48, 13 December 2014 (UTC)
Thanks, I knew that many of them were z-proscribed but wasn't sure about others (such as prize) in American English. Dbfirs 20:15, 13 December 2014 (UTC)
Prise in the sense of "open" has become pry in American English anyway. —Aɴɢʀ (talk) 20:52, 13 December 2014 (UTC)
Oh yes, I'd forgotten that. (I'm not fully conversant with modern American usage.) Our entries at prize (etymology 1 noun sense 7 and etymology 2 verb sense 3) obviously need some adjustment. They could be marked "proscribed" or just mis-spellings for British English, but they seem to appear in some American dictionaries. Dbfirs 21:28, 13 December 2014 (UTC)
  • After this discussion has run its course and before there is the required vote on any specific proposal, there should be a Beer Parlor discussion on this obviously non-technical matter. DCDuring TALK 17:02, 13 December 2014 (UTC)
  • Apologies for not having read all of the discussion above. The duplication and separate maintenance of so much content across, say, "color" and "colour" is crazy. I have always thought so. Once I "naively" tried to merge one of these pairs -- can't remember which one now -- assuming that duplicated entries had been created in error, only for it to be reverted. However it is achieved technically, some way of maintaining shared content once only would be highly desirable. 86.152.161.61
Yes, we all agree that one set of definitions, where appropriate, would be a good idea, but we cannot agree on the best way to achieve this. Merging to one spelling is not a good solution, though there are some editors who force this on us. Dbfirs 12:40, 28 December 2014 (UTC)

Citations pages created with empty template documentation

Not sure why users/bots are creating these, but I've seen many of them. They create a Citations: page that contains the {{documentation subpage}} stuff. An example (which I've since deleted) is at Citations:脚本. Could someone please write an abuse filter to prevent it? Equinox 03:50, 6 December 2014 (UTC)

It's because the Citations namespace has the same preload settings somewhere as the Documentation namespace. It was brought up here before, but nothing came of it. Chuck Entz (talk) 04:20, 6 December 2014 (UTC)
It was in MediaWiki:Gadget-DocTabs.js. And I actually fixed that script long ago, but apparently there are still un-fixed versions of it lingering in people's browser caches. We would have to convince people who do it to clear their cookies and localStorage. Quite hopeless, really. The best I could do is probably put mw.loader.store.clear(); into MediaWiki:Common.js and leave it there for a few weeks, but it will strain WMF servers somewhat. Keφr 18:10, 10 December 2014 (UTC)

Etymological trees

Etymologies here tend to be displayed in the same format as paper dictionaries. It's a concise format that works well for linear etymologies but in my opinion can be confusing for compound words or when there are multiple possible branches.

At the Vietnamese Wiktionary, we've been nesting vi:Bản mẫu:etym-from to create etymological trees. (For better or worse, we copied the Dutch Wiktionary's penchant for templatizing everything.) Linear etymologies appear inline, but as soon as any branch has an ancestor, the display turns into a nested list. Unfortunately, I'm literally the only person who's been using this template because the syntax is so hideous.

The other day, I came across {{etymtree}}. Though it confusingly deals with descendants rather than the contents of "Etymology" sections, it gave me the idea to implement complex etymologies as "family trees". vi:Bản mẫu:etym takes the same information as etym-from, but with a much simpler syntax based on wiki lists. vi:Mô đun:EtymologicalTree parses it into a microformatted list that is rendered as a "family tree" using CSS. See vi:bánh bao, vi:câu lạc bộ, and vi:văn thư for some simple examples.

EtymologicalTree hasn't been battle-tested yet; I'm sure the English Wikipedia would have plenty of edge cases that would break it easily. But I welcome any feedback you can give, because lately there's been renewed interest in adding etymologies to the Vietnamese Wiktionary, and the community here has a lot more experience with etymologies.

 – Minh Nguyễn 💬 09:02, 8 December 2014 (UTC)

Part of what makes {{etymtree}} (yes, it's badly named) work so well is that it doesn't try to parse the whole list. Rather it only parses the language name and the following colon; anything beyond that is shown as-is. It would be a lot more complicated if it had to try to extract the words themselves as well as any additional information that was provided next to them (qualifiers, etc.). Furthermore, even this template breaks if you specify more than one descendant on the same line, because it can't tell which descendants belong to each parent. —CodeCat 13:59, 10 December 2014 (UTC)
I like this idea. Wyang (talk) 20:34, 10 December 2014 (UTC)

Edit filter to prevent Ladin --> Latin header changes

It seems like just about every day I revert some IP's changing of a "Ladin" L2 header to "Latin". Would someone be so kind as to add a filter to stop this? Should it be a warning or a block? Or should it be a warning for autoconfirmed and a block otherwise? Chuck Entz (talk) 14:54, 11 December 2014 (UTC)

  • Yes, I have come across this as well (several times). I have always assumed it was done in good faith so have just reverted with no block. I would give a warning if the same user persisted. SemperBlotto (talk) 11:03, 13 December 2014 (UTC)
Obviously I worded this very poorly: my question was whether the edit filter should warn those who try to make the change, or disallow it. The secondary issue was whether we might make it conditional, so that non-autoconfirmed users (IPs, mostly/all?) would be treated differently: perhaps disallowing the edit for them, but only warning others (or doing nothing in their case?). The warning should say something like "Please don't try to change Ladin headers to Latin: Ladin is a modern Romance language quite different from Latin". If they were really determined, they could probably work around it by deleting the header or first changing it to something else. Of course, I'm not sure how easy it is to compare an edit against the existing version, so the whole point could be moot. Chuck Entz (talk) 23:26, 13 December 2014 (UTC)
I think a warning is sufficient; the edit shouldn't be prevented. After all, the editor might be making other good edits at the same time; also, for all we know, somewhere out there we do have a Latin section erroneously labeled "Ladin". —Aɴɢʀ (talk) 13:18, 14 December 2014 (UTC)
A warning sounds like a good idea to me. —RuakhTALK 21:48, 14 December 2014 (UTC)
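
On the question of comparing an edit against the existing version: the check itself is simple if both versions of the wikitext are available. The following is a minimal Python sketch of that comparison, not AbuseFilter syntax. The function names and the header regex are assumptions made for illustration.

```python
import re

# Matches level-2 headers such as "==Ladin==" but not "===Noun===".
L2_HEADER = re.compile(r"^==[ \t]*([^=\n]+?)[ \t]*==[ \t]*$", re.MULTILINE)


def l2_headers(wikitext):
    """Return the names of all level-2 (language) headers in the wikitext."""
    return [match.group(1) for match in L2_HEADER.finditer(wikitext)]


def looks_like_ladin_to_latin(old_text, new_text):
    """True if the edit removes a "Ladin" L2 header and adds a "Latin" one."""
    old_headers = l2_headers(old_text)
    new_headers = l2_headers(new_text)
    return (new_headers.count("Ladin") < old_headers.count("Ladin")
            and new_headers.count("Latin") > old_headers.count("Latin"))
```

As noted above, a determined editor could still get around such a check by deleting the header or changing it in two steps, so it only fits the "warn" use case.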

The process of converting classic talk pages to Flow

See also: Wiktionary:Beer parlour/2014/December#Converting classic talk pages to Flow

(BG: Flow). There are various ways to convert talk pages to Flow. Discussed at this page. I would like to know what we would like to have. Please join the discussion. Gryllida 23:57, 11 December 2014 (UTC)

Finally! I have some questions though...
  • Is this already available? If not, when will it be?
  • What about LiquidThreads?
  • What about non-talk discussions, like this page and its archives?
CodeCat 23:59, 11 December 2014 (UTC)
  • Yuck. DCDuring TALK 01:41, 12 December 2014 (UTC)
    • My thoughts exactly. Keφr 08:50, 12 December 2014 (UTC)
  • Eww. What a terrible idea Flow is. See my comments and Eiríkr's in the BP. - -sche (discuss) 05:53, 17 December 2014 (UTC)
Please let's defer this discussion until the community has decided whether or not to use Flow at all! Equinox 06:04, 17 December 2014 (UTC)

Phabricator tasks tracking

Hi all, I just created the task phab:T78531 to track bugs and feature requests related to the Wiktionary project as a whole (all languages included). Language specific bugs should be tracked by a different task, e.g. phab:T76447 for the French Wiktionary, so I encourage English contributors to create a task to track en.wiktionary bugs. Dakdada (talk) 11:04, 15 December 2014 (UTC)

Template:contraction of

This template is named like a definition-line template (compare Template:abbreviation of, Template:alternative spelling of, Template:genitive of, etc.), so I assumed it was for use in the definition lines of pages like I'm. However, it seems to actually be used in etymology sections of pages like aband. Is this desirable or should it be renamed? Shouldn't we have a template for I'm, drum, etc? - -sche (discuss) 06:00, 17 December 2014 (UTC)

It's not totally unknown for such definition-line templates to be used in etymologies. I've seen {{initialism of}} used that way. Of course, whether these templates are used this way and whether they should be are two different things. Renard Migrant (talk) 20:02, 19 December 2014 (UTC)

Idea: view a category from one's current entry

When I am viewing an entry and I click one of its (large) categories, such as "English lemmas" or "English words suffixed with -able", I find it unhelpful to be shown the first page of the category starting from the very top. Could we, and should we, change things so that we are dropped into the category at the right place, and can immediately see the alphabetically adjacent entries that surround the entry we came from? Thanks. Equinox 19:55, 19 December 2014 (UTC)

I don't think it's possible without overwriting the built-in category infrastructure (hide existing category links and create our own- probably not something we want to do). DTLHS (talk) 20:25, 19 December 2014 (UTC)
The URL could be constructed automatically by using the entry, for example to jump to the word 'ability' in the English lemmas category without paging: https://en.wiktionary.org/w/index.php?title=Category:English_lemmas&pagefrom=ABILITY%0Aability#mw-pages. --Panda10 (talk) 20:49, 19 December 2014 (UTC)
I think this would be a good idea, if we could make it work. But like DTLHS said, the links are generated by the software and that's out of our control. The best we could do is make yet another JavaScript hack... —CodeCat 21:04, 19 December 2014 (UTC)
But if the point is to see alphabetically adjacent entries, we shouldn't start the page with the word the reader is coming from, but some number (20?) of entries before it. —Aɴɢʀ (talk) 22:44, 19 December 2014 (UTC)
That would be nice, in theory, but Javascript can't be counted on to know anything about what's in the category- the entry itself is the only member of the category it knows. Also, when you get into numerical displacements, it makes a big difference whether the entry is 732nd in the category or 2nd. Chuck Entz (talk) 07:41, 20 December 2014 (UTC)

Template:mul-letter not working when entered

Under I#Letter, I just added a third entry after I i and I ı: İ i. But I could not get Template:mul-letter to work for me no matter what I did.

  1. I copied it from one of the existing entries and plugged in the appropriate character(s) from the "Latin/Roman" insertion menu, and all I got was the first character, the capital letter I.
  2. I typed it all in by hand: same old same old.
  3. Finally I gave up and manually typed the wikicode to duplicate the format I saw on the existing entries, and at last I got what I needed:
    İ upper case (lower case i)

I've used & edited Wikipedia for 9 years and I've never heard of the like, but I'm pretty new to editing Wiktionary. Somebody technical, please take a look at this issue. --Thnidu (talk) 07:18, 23 December 2014 (UTC)

{{mul-letter}} is designed to work only in the entries of the lower-case and upper-case forms. — Ungoliant (falai) 12:27, 23 December 2014 (UTC)
The documentation is misleading. It says "all Translingual letter entries", without linking to wherever we explain "Translingual", as we use it. In the absence of a precise technical explanation in terms of Unicode or something a user would default to translingual. Most letters are translingual ("existing in more than one language"), so, in the absence of a link or inline explanation, Thnidu's "error" (the actual error is in the documentation) will likely be repeated. I don't know the right way to fix the documentation to make it technically accurate and potentially complete (if users are encouraged to use good links). DCDuring TALK 17:07, 23 December 2014 (UTC)

Category:Middle French present participle forms

Could someone add this to {{poscatboiler}}? I can't, or only by risking breaking it (which I've done to other modules before). Renard Migrant (talk) 16:13, 23 December 2014 (UTC)

Should be working now. — Ungoliant (falai) 16:48, 23 December 2014 (UTC)
What's wrong with Category:Middle French participle forms? We don't need this category. —CodeCat 19:25, 23 December 2014 (UTC)

{{ja-usex}} and 々

Template_talk:ja-usex#Issue_with_handling_.E3.80.85. @Wyang, Eirikr, Haplology: --Anatoli T. (обсудить/вклад) 07:32, 24 December 2014 (UTC)

Closed. Thanks to User:Kephir. --Anatoli T. (обсудить/вклад) 23:24, 31 December 2014 (UTC)

malo

malo is failing to distinguish between malo#Etymology_2 and the Latin verb mālō (see ‘Etymology 2’ under ‘Latin’), which I can't directly link to at the moment because it isn't. Esszet (talk) 17:16, 25 December 2014 (UTC)

You’ll have to use {{anchor}}. — Ungoliant (falai) 17:24, 25 December 2014 (UTC)
In this case, it may be better to switch the etymology sections and link directly to #Latin, since Etymology 1 is an inflected form and Etymology 2 a lemma. — Ungoliant (falai) 17:28, 25 December 2014 (UTC)
Would you mind explaining exactly what I have to do? {{anchor}} doesn't have documentation, and I can't figure out from the other pages it's used on what to do. Esszet (talk) 22:12, 25 December 2014 (UTC)
Add an ID as the first parameter ([3]), then link to it using the ID (malo#la_ety_2). — Ungoliant (falai) 22:17, 25 December 2014 (UTC)
That works for purposes of linking, but if you click on ‘Etymology 2’ under ‘Latin’ in the TOC, it takes you to the etymology of the Galician word malo. Esszet (talk) 22:45, 25 December 2014 (UTC)
I don’t think that can be fixed. — Ungoliant (falai) 22:48, 25 December 2014 (UTC)
A similar problem doesn't exist on other pages; on si, for example, the several ‘Etymology 1’ sections are distinguished by numbers after the ‘1’; for example, si#Etymology_1_2 and si#Etymology_1_3. Why can't the section for mālō be marked ‘Etymology_2_2’? Esszet (talk) 23:41, 25 December 2014 (UTC)
Oh, wait, I think I see why. The section for mālō is the first ‘Etymology 2’ section, so it doesn't see the need to put an additional ‘2’ after it, and the mālō section is thus confused with the second ‘Etymology’ section, which is also marked ‘Etymology_2’. This looks like it would require a change to the Wiktionary source code to fix it. Esszet (talk) 23:49, 25 December 2014 (UTC)
Yeah, our section naming practice conflicts with the hidden link numbers added by MediaWiki. The same problem doesn’t occur in si because the section Etymology 2 comes before the second numberless Etymology section (under Dalmatian, which can’t be linked to). — Ungoliant (falai) 23:55, 25 December 2014 (UTC)
Alright, where do I go to fix this? Esszet (talk) 23:58, 25 December 2014 (UTC)
Sorry, no idea. — Ungoliant (falai) 00:07, 26 December 2014 (UTC)
I've reported the issue on MediaWiki's Support Desk, and someone opened a bug for it. Esszet (talk) 12:09, 26 December 2014 (UTC)
And within two days User:Jackmcbarn has proposed a patch. :) I won't get too excited until the patch is approved, but it's great to see someone trying so quickly to address this widespread, significant, simple problem. - -sche (discuss) 17:16, 31 December 2014 (UTC)
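
The anchor clash discussed in this thread can be modelled with a short Python sketch. This is a simplified model based only on the behaviour described above (a repeated heading gets "_2", "_3", and so on, with no check against headings that already end in a number). It is not MediaWiki's actual code, and the function name is hypothetical.

```python
from collections import Counter


def naive_anchor_ids(headings):
    """Assign an anchor id to each heading, numbering repeats naively."""
    seen = Counter()
    ids = []
    for heading in headings:
        base = heading.replace(" ", "_")
        seen[base] += 1
        # The first occurrence keeps the plain id; later ones get a counter suffix.
        ids.append(base if seen[base] == 1 else f"{base}_{seen[base]}")
    return ids


# Two numberless "Etymology" sections followed by a numbered pair.
ids = naive_anchor_ids(["Etymology", "Etymology", "Etymology 1", "Etymology 2"])
print(ids)
```

In this model the second "Etymology" and the literal "Etymology 2" both end up as "Etymology_2", and a link to that id lands on whichever comes first on the page.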

Etiquette for Module Editors

I would suggest that anyone who edits a widely-transcluded module should check Category:Pages with module errors at least once a day for a day or two after any edit. Those who regularly do so should make a habit of checking it at least once a day, every day. It just seems like common sense to check the most obvious place where problems will show up so you can catch errors or unforeseen side-effects, and not checking after edits seems like driving blindfolded.

Case in point: there are about a dozen Chinese entries that have been there for upwards of a week with an error in Module:zh-forms. I don't know who caused it, but User:Wyang edited the module about the time the errors started, so he would have spotted them if he had checked. Likewise, whoever corrected the Japanese module problem above should have noticed that a Japanese entry with a different error has shown up in the category since then.

As someone with minimal Lua skills, I'm very grateful for those who are making improvements and fixing things, and I feel bad about having to nag them- but a module error can negate all the benefits of using a module in the first place. Chuck Entz (talk) 18:00, 25 December 2014 (UTC)

Those errors are caused by incorrect formatting - see diff for an example. Wyang (talk) 22:17, 27 December 2014 (UTC)
I stand by what I said, though your case apparently isn't a good example of what I do see more than I should. Still, your explanation is sort of like saying "there's no way I could have shot that man, because I was robbing a bank at the time". Any idea when you can fix those entries? Chuck Entz (talk) 05:05, 28 December 2014 (UTC)
I'm on holiday and won't be back until after New Year. Those errors have been sitting there for years and years. If my bot hadn't edited those articles, the wrong information on those pages could be displaying forever. If you see an entry in that error category, instead of absentmindedly singling out whoever might be responsible, see what the real cause is and see if it's something as easy to fix as this. It's not at all related to faults in the modules I created. Even if someone's module edit caused articles to be placed in that category, I disagree with the statement that a module error negates all the benefits. Wyang (talk) 10:26, 28 December 2014 (UTC)
From one volunteer to another: enjoy your vacation! There'll be plenty of time to nag you when you get back... ;) Chuck Entz (talk) 00:18, 29 December 2014 (UTC)

A New Way to Handle Deprecated Templates

Would it be possible to have deprecated templates add an invisible text string that would trigger an edit filter which would give a deprecation warning? This would have the advantage of targeting the warning at those who are making edits without cluttering up the entry. Chuck Entz (talk) 19:27, 25 December 2014 (UTC)

The edit filter only checks the source of the page; templates cannot add any extra text to the source when they are transcluded. It is possible when the template is substituted ("{{subst:...}}"), though. --Z 17:13, 26 December 2014 (UTC)
Oh, well, it would have been nice if it had worked. I suppose that edit filters looking for specific templates in the source would still work, though- in moderation. Chuck Entz (talk) 18:32, 26 December 2014 (UTC)

{{was wotd}} and Images (and Presumably Other Multimedia Objects)

When {{was wotd}} is placed immediately above an image (or presumably another type of multimedia object), a gap is created in the text next to the WOTD link (see here for an example). Esszet (talk) 22:25, 30 December 2014 (UTC)

I don't see it at [[glyph]] using Firefox 34.0.5, Windows 7, fairly big screen, and my Wiktionary preferences, which include ToC right. Is your screen or window very narrow? What browser (and version) are you using? DCDuring TALK 23:42, 30 December 2014 (UTC)
No, and Safari 8.0.2 on OS X 10.10.1. Esszet (talk) 01:04, 31 December 2014 (UTC)
I definitely can't help, but that info might enable someone else to. DCDuring TALK 01:37, 31 December 2014 (UTC)
What I see at glyph is the text after the image staying below it, instead of moving up on the left. In this case, the Etymology L3 header stays below the image, so there's a gap between the English L2 header and the Etymology L3 header that's the height of the {{was wotd}} display and the image combined. This only happens when there's no text between {{was wotd}} and the graphic element below it (it has the same effect when followed immediately by {{wikipedia}}): if I add a single letter to the right of {{was wotd}} or below it, the effect goes away. Chuck Entz (talk) 02:40, 31 December 2014 (UTC)
That sounds like a bug in the browser then. —CodeCat 02:45, 31 December 2014 (UTC)
Except I happen to be using Internet Explorer 11 on a PC at the moment, and Esszet is using Safari on a Mac- that sounds like a very widespread bug! Chuck Entz (talk) 03:48, 31 December 2014 (UTC)
For me, the L3 header is at the same height as the image, and adding text to the right of or below {{was wotd}} doesn't make the gap go away; the text simply appears within it. I've looked at {{was wotd}}, and I don't see anything wrong with it, so I'm guessing the problem lies in the source code for image layout. I've just realized that there are very few types of multimedia objects on Wiktionary, and {{audio}} can't be aligned on the right of the page, so I can't see if the problem persists with other types of multimedia objects. Esszet (talk) 14:32, 31 December 2014 (UTC)
You could try something else that was right-aligned, like {{examples-right}}. DCDuring TALK 14:39, 31 December 2014 (UTC)
Same problem. Esszet (talk) 15:59, 31 December 2014 (UTC)

January 2015

Right-to-left problem

Somehow Uyghur transliteration is skewed on Happy_New_Year#Translations (note the position of brackets) but no problem here: Uyghur: يېڭى يىل مۇبارەك (yëngi yil mubarek). --Anatoli T. (обсудить/вклад) 07:19, 1 January 2015 (UTC)

There was a right to left mark (U+200F) in the template- please make sure I fixed it correctly. DTLHS (talk) 16:54, 1 January 2015 (UTC)
Now I'm wondering if there's any context in which U+200E / U+200F in page text would be appropriate- can we just bot remove all occurrences? DTLHS (talk) 17:09, 1 January 2015 (UTC)
Thanks. The occurrence would be appropriate if a translation into a RTL language had no transliteration or other Roman letters, such as gender, qualifiers, etc. E.g. see two identical translations into Persian without transliterations: دریا, فانوس. They now appear in the right order - LTR. --Anatoli T. (обсудить/вклад) 00:10, 2 January 2015 (UTC)
Wiktionary:Todo/Pages containing LTR marks, Wiktionary:Todo/Pages containing RTL marks, if anyone wants to help clean up the ones that don't need the LTR/RTL marks. - -sche (discuss) 18:20, 21 January 2015 (UTC)
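
For anyone working through those lists, here is a minimal Python sketch for locating the two marks in a piece of page text. The function name is hypothetical. It only reports positions and leaves the decision about removal to a human, since (as noted above) some occurrences are appropriate.

```python
# U+200E is LEFT-TO-RIGHT MARK, U+200F is RIGHT-TO-LEFT MARK.
DIRECTIONAL_MARKS = {"\u200e": "LRM", "\u200f": "RLM"}


def find_directional_marks(text):
    """Return (index, name) pairs for every LRM/RLM character in the text."""
    return [(index, DIRECTIONAL_MARKS[char])
            for index, char in enumerate(text)
            if char in DIRECTIONAL_MARKS]


sample = "abc\u200f (def)\u200e"
print(find_directional_marks(sample))
```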

Simplifying Catboiler Templates For Editors

I do a lot of category creation, and, though it's less arcane and complex now that User:CodeCat has luafied most of the infrastructure, there's still a lot of typing involved.

This is mostly unnecessary: we already have very strictly-enforced rigid constraints on the format of category names, and they generally contain most or all of the information needed in the name itself- so the modules that power the templates should be able to parse it from the page name using the modules that CodeCat has put in place (see Category:Pagename-based auto-fill-in templates for some templates that provide a substitutable front end for existing templates using such techniques).

Here are my ideas with regards to specific templates:

{{topic cat}}: For language-specific cats, the first part of the name is the language code, followed by a colon, followed by the topic name. For the non-language-specific parent categories, the page name is the topic name. See {{tcez}}, which was developed for me by User:Kc kennylau and User:Wyang. I found Wyang's version more useful and robust, so I modified it slightly to get {{tcez1}}

  • Implementation:
  1. If the name contains a colon:
    1. the language code is everything before the colon
    2. the topic name is everything after the colon
  2. If the name contains no colon, the language code is empty and the topic name is the page name
  • Problems:
  1. When the language code is "sms", the string "sms:" is converted to "sms:" (apparently by the Lua string-function backend), and the colon isn't recognized.
  2. Any topic name containing a colon will cause parsing of the non-language-specific parent category to fail.
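
The colon rule above, sketched in Python rather than Lua purely for illustration (the function name is hypothetical, and this is not the code behind {{tcez}}):

```python
def parse_topic_cat(page_name):
    """Split a topic category name into (language code, topic name).

    Everything before the first colon is the language code; a name
    without a colon is treated as a non-language-specific category.
    """
    lang_code, colon, topic = page_name.partition(":")
    if not colon:
        return "", page_name
    return lang_code, topic


print(parse_topic_cat("en:Animals"))
print(parse_topic_cat("Animals"))
# Problem 2 above: a (hypothetical) parent category whose topic name
# contains a colon is split in the wrong place.
print(parse_topic_cat("Star Trek: Voyager"))
```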

{{prefixcat}}: the first part of the page name is the canonical name of the language, followed by " words prefixed with ", followed by the prefix, followed by "-".

  • Implementation:
  1. the canonical language name is everything before " words prefixed with "
  2. the prefix is everything after " words prefixed with ", minus the "-" at the end

{{suffixcat}}: the first part of the page name is the canonical name of the language, followed by " words suffixed with -", followed by the suffix.

  • Implementation:
  1. the canonical language name is everything before " words suffixed with -"
  2. the suffix is everything after " words suffixed with -"

{{charactercat}}: the page name is always the canonical language name, followed by " terms spelled with " followed by the character.

  • Implementation:
  1. the canonical language name is everything before " terms spelled with "
  2. the character is everything after " terms spelled with "
  • Note: As far as I know, there's no way to parse the sort parameter from the page name, so those would still have to be entered by hand where necessary.
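
The three delimiter-based rules for {{prefixcat}}, {{suffixcat}} and {{charactercat}} share one pattern, which can be sketched as follows. This is an illustrative Python sketch with hypothetical function names, not the modules' actual Lua code.

```python
def split_on_delimiter(page_name, delimiter):
    """Split a category name at the first occurrence of the delimiter."""
    head, found, tail = page_name.partition(delimiter)
    if not found:
        raise ValueError(f"{page_name!r} does not contain {delimiter!r}")
    return head, tail


def parse_prefixcat(page_name):
    """Return (language name, prefix), dropping the trailing hyphen."""
    language, prefix = split_on_delimiter(page_name, " words prefixed with ")
    if not prefix.endswith("-"):
        raise ValueError(f"{page_name!r} does not end with a hyphen")
    return language, prefix[:-1]


def parse_suffixcat(page_name):
    """Return (language name, suffix), without the leading hyphen."""
    return split_on_delimiter(page_name, " words suffixed with -")


def parse_charactercat(page_name):
    """Return (language name, character)."""
    return split_on_delimiter(page_name, " terms spelled with ")
```

For example, `parse_suffixcat("English words suffixed with -able")` gives `("English", "able")`. The sort parameter mentioned in the note above is not recoverable this way.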

These methods can be applied to just about every template that uses Module:category tree, with one important exception (below), and quite a few others.

{{poscatboiler}}: the language-specific categories all consist of the canonical language name, followed by a space, followed by what currently goes in the template's second parameter.

This one is trickier to implement, because there's no unique delimiting text, and because of the potential for overlap between parts of language names and parts of the second parameters.

I came up with a kludgy workaround: require a single parameter consisting of the first few characters of the current second parameter. Everything before the first instance of a space + this string in the page name is the canonical language name, and the string + everything after the first instance of a space + the string in the page name is the current second parameter.

This workaround is potentially defeatable by new canonical language names that would contain a match for the string as originally entered, so it's probably best not implemented in {{poscatboiler}} itself, but in a substitutable fill-in template. I have a working proof-of-concept at {{pcbez}}, but I don't understand substitution and/or templates in general well enough to make it substitutable without a lot of clueless trial and error. Can someone do that for me?
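
To make the workaround concrete, here is a minimal Python sketch of the split (the function name is hypothetical, and this is not the code of {{pcbez}}):

```python
def parse_poscat(page_name, label_start):
    """Split a category name into (language name, label).

    label_start is the first few characters of the label; the split
    happens at the first occurrence of a space followed by that string.
    """
    marker = " " + label_start
    index = page_name.find(marker)
    if index == -1:
        raise ValueError(f"{page_name!r} does not contain {marker!r}")
    # Skip the space itself so the label starts at label_start.
    return page_name[:index], page_name[index + 1:]


print(parse_poscat("Middle French present participle forms", "pres"))
```

The weakness described above shows up as soon as a language name itself contains a space followed by the same characters: the split then happens too early.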

Thanks! Chuck Entz (talk) 21:13, 1 January 2015 (UTC)

I think it would be more workable, at least in the short term, to provide only the language code. The module can then determine that everything else must be the label. I would rather not make things too dependent on "delimiters" because my goal for the long term was to integrate {{suffixcat}} and company into {{poscatboiler}}. I believe that it's beneficial to have less templates, so that users don't have to remember which one does which. —CodeCat 21:18, 1 January 2015 (UTC)
Using the language code certainly looks like the only workable way to adapt {{poscatboiler}} itself, and may someday cause problems with converting other templates to {{poscatboiler}}, but I'm talking about the real short term here: I suspect that all the specific examples I gave here could be implemented in an hour by someone who really knows what they're doing (troubleshooting could drag that out much longer, of course). Since no one currently uses these templates without parameters, there's no problem with backwards-compatibility: you can either ignore any positional parameters, or you can use them instead of the pagename-based ones if they're present (the latter is probably better, just to be safe- see the problem with "sms:" in the {{topic cat}} section, above). Chuck Entz (talk) 22:05, 1 January 2015 (UTC)
As for the philosophical issue: it's true that the proliferation of catboiler templates was a serious problem. I'm sure someone would eventually have come up with "rfquoteoldestylecatboiler" which would render quote-request categories for earlier authors in appropriate fonts to stylistically match their era, or "trreqsundaycatboiler" for translations requested on a specific day of the week. Reducing the number of templates is a worthwhile goal, but it needs to be kept in the context of the overall demand on the editor. It's nice not to have to remember 57 different catboilers, but it's also nice not to have to look up the language code- especially for the <canonical language name> terms derived from <language name or language family name> categories, which often have redlinks to more obscure language-family categories. Chuck Entz (talk) 22:35, 1 January 2015 (UTC)
I've added this to {{poscatboiler}} now (see the edits I made to Module:category tree and Module:category tree/poscatboiler). If you leave out the label, it will try to extract it from the page name. If the category doesn't begin with the specified language, or if the autodetected label doesn't exist, it shows a somewhat nondescript error message, but at least the basic idea works. —CodeCat 14:12, 2 January 2015 (UTC)
Very nice! If you create a new category that has multiple redlinks in the breadcrumbs, it's now possible to copy the wikitext from the first unchanged into all the redlinked categories with just a few clicks and keystrokes. The error handling is definitely a problem, though. Perhaps you could compare the expanded language code with the beginning of the page name and give a message along the lines of "The language code XX is for the language YYYY, which doesn't match the category name". Chuck Entz (talk) 18:18, 2 January 2015 (UTC)
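The language-code approach, together with the more specific error message suggested above, might look roughly like this in Python. The real logic lives in Lua in Module:category tree; the `LANGUAGES` table, the `KNOWN_LABELS` set and the function name below are stand-ins invented for this sketch.

```python
# Stand-in data; on the wiki the names would come from the language modules.
LANGUAGES = {"en": "English", "ang": "Old English"}
KNOWN_LABELS = {"nouns", "verbs", "adjectives"}


def label_from_page_name(lang_code, page_name):
    """Derive the label from the category title, given only a language code."""
    language = LANGUAGES[lang_code]
    prefix = language + " "
    if not page_name.startswith(prefix):
        raise ValueError(
            "The language code %s is for the language %s, "
            "which doesn't match the category name %r"
            % (lang_code, language, page_name)
        )
    label = page_name[len(prefix):]
    if label not in KNOWN_LABELS:
        raise ValueError("Unknown label %r for %s" % (label, language))
    return label


print(label_from_page_name("ang", "Old English nouns"))
```

Because the language name is looked up from the code, no delimiter or hint string is needed, at the cost of the editor having to know the code.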

missing important and common words

What kind of software solutions exist and which are we using to ensure we know which of the most common words are missing? I was very surprised to discover that, for example, news stream is completely missing in this and all other dictionaries. So Wiktionary has the chance to be the first dictionary to record one of the most important and common and descriptive words of our time. I found several lists like User:Brian0918/Hotlist and User:Robert Ullmann/Missing and User:Visviva/Tracking, but these don't seem to make any kind of frequency analysis. I don't understand what to do with the red links at the beginning of the last list. --Espoo (talk) 11:32, 7 January 2015 (UTC)

  • Those red links were part of Visviva's semi-automated system of generating new subpages for recent newspaper editions. No longer in use, but I come back to the existing subpages from time to time and try to complete them. I delete them when they are complete. SemperBlotto (talk) 10:03, 8 January 2015 (UTC)

"hu-conjugation of" in verb form category

How do we change the template {{hu-conjugation of}} so that it isn't in Category:Hungarian verb forms but puts verb forms there? --Lo Ximiendo (talk) 15:02, 7 January 2015 (UTC)

Fixed. You have to put the category call inside the "includeonly" part and not inside the "noinclude" part. —Aɴɢʀ (talk) 15:11, 7 January 2015 (UTC)
@Angr: but you should have taken a look at the list in Category:Hungarian verb forms and seen the template there. Besides, I wish {{hu-conjugation of}} got rewritten into Lua. --Lo Ximiendo (talk) 15:16, 7 January 2015 (UTC)
Hmm, I don't know why that's happening. It would be good for the template to be luacized, but that wasn't the problem you brought up. —Aɴɢʀ (talk) 15:22, 7 January 2015 (UTC)
I disagree with making {{hu-conjugation of}} add entries to the category. The category is already added by the headword template. —CodeCat 15:24, 7 January 2015 (UTC)
{{hu-conjugation of}} was created several years ago before the current categorization direction. There is no need to recreate the template in Lua. It could be replaced by {{inflection of}} applying the parameters needed for Hungarian. For example: vadászok
{{hu-conjugation of|vadászik|1|s|indic|pres|indef}} would become
{{inflection of|vadászik||1|s|indicative|pres|indefinite}}. --Panda10 (talk) 17:39, 7 January 2015 (UTC)
I've added most of the grammar tags from {{hu-grammar tag}} to Module:form of/data. But there was a conflict in one case: sub is already used to mean subjunctive, so it can't also mean sublative. Furthermore, the following tags shouldn't be added to the module, so some solution for them should be found: ban, ben, 1s, 2s, 3s, 4s, 5s, 6s, 1p, 2p, 3p, 4p, 5p, 6p. I'm not sure what to do with pos and nonattr. It also needs to be checked if any pages use {{hu-grammar tag}} with a tag that it doesn't recognise (in which case it's shown as-is) but which {{inflection of}} does recognise. —CodeCat 18:52, 7 January 2015 (UTC)
Ok, thanks. Those parameters are used by {{hu-inflection of}}. This would be more complicated to replace (it is used in about 19,000 entries). In my above note I meant to replace only {{hu-conjugation of}} with {{inflection of}} because it is used in a little over 200 entries. --Panda10 (talk) 19:41, 7 January 2015 (UTC)
{{hu-conjugation of}} now just calls {{inflection of}}. You can replace it if you want. —CodeCat 22:34, 7 January 2015 (UTC)
Thanks. Is it feasible to semi-automate the creation of Hungarian verb forms? Similar to the noun forms. It is very convenient to create a noun form entry just by clicking the declension table cell. --Panda10 (talk) 22:38, 7 January 2015 (UTC)
The biggest limitations of WT:ACCEL are that it only works for red links, and it can only create entries for one form at a time. So when the same form is actually several distinct forms that just happen to be identical, then it doesn't work either. This is likely a problem for verbs, which have a lot more forms than nouns do and so there is more risk of one form appearing more than once in the table. In theory, the table module (it would have to be a module; it's not feasible with a template alone) could be modified to alter the acceleration tags it puts in the links, so that WT:ACCEL is told that the entry is for multiple forms. But that would make it a lot more complicated as well. —CodeCat 22:54, 7 January 2015 (UTC)
Ok, it will just stay as is. However, the changes you made in the templates created a problem in the possessive forms, e.g. ablaka - the definition line contains a category name in wikilinks. It also emptied the Category:Hungarian noun forms - possessive. Can you reverse this? I appreciate your help but I really don't want more problems, it will be just too overwhelming for me to correct them. --Panda10 (talk) 00:25, 8 January 2015 (UTC)
I just removed the category for now. Having a category for every single kind of inflected form is just overkill, as I have mentioned before in other discussions. —CodeCat 00:29, 8 January 2015 (UTC)

Fonts

Why do the Navajo, pinyin and romaji words need to be written in those ugly and different fonts? --Biolongvistul (talk) 13:11, 8 January 2015 (UTC)

For Pinyin and Romaji, it happens when the software assumes that the word is being written in Hanzi/Kanji even though it really isn't: if I write {{l|zh|Běijīng}} it shows up as Běijīng because it assumes that everything labeled "zh" is in Hanzi, so it uses a font that's better suited to Hanzi. I would have expected {{l|zh|Běijīng|sc=Latn}} to force it to show up using the default Latin font, but it doesn't; it still shows up as Běijīng, which is annoying. (Interestingly, if I specify the language as "cmn" instead of "zh", Pinyin shows up using the default Latin font, even without being explicitly labeled "sc=Latn", so {{l|cmn|Běijīng}} shows up as Běijīng.) For Navajo, I have no idea since Navajo is only written in the Latin alphabet, so the software shouldn't be assuming anything else. —Aɴɢʀ (talk) 20:59, 9 January 2015 (UTC)
If I remember correctly, that font was chosen for Navajo to accommodate its many diacritics, assuming that our standard fonts can't. Stephen G. Brown (talkcontribs) can shed some light. --Vahag (talk) 00:36, 10 January 2015 (UTC)
That is correct. Navajo uses diacritics on some letters that are spaced incorrectly in regular Roman fonts, so we use the Aboriginal Sans Serif. —Stephen (Talk) 09:57, 10 January 2015 (UTC)

Confix template

There is a problem with this template where listings for the suffix don't appear in the correct alphabetical order, when there's a root word as well as a prefix. For example: repristination. Donnanz (talk) 12:36, 9 January 2015 (UTC)

I don't see it, can you elaborate? —CodeCat 13:57, 9 January 2015 (UTC)
You can see it in the -ation listings, in between prioritization and privatisation. (https://en.wiktionary.org/wiki/Category:English_words_suffixed_with_-ation) Donnanz (talk) 14:06, 9 January 2015 (UTC)
It looks like the module is stripping the prefix from the sort key for the prefix category and using this prefix-stripped sort key for the suffix category, too, where it should be using the whole word. Chuck Entz (talk) 17:19, 9 January 2015 (UTC)
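A toy Python illustration of that diagnosis, with made-up names (the actual module is written in Lua): if the suffix category reuses a sort key from which "re" has been stripped, repristination is filed as if it were "pristination", which lands it between prioritization and privatisation.

```python
def stripped_key(word, prefix):
    """Sort key with the prefix removed (appropriate for the prefix category only)."""
    return word[len(prefix):] if word.startswith(prefix) else word


words = ["prioritization", "privatisation", "repristination", "realization"]

# Hypothetical record of which entries were built with a prefix via confix.
CONFIX_PREFIX = {"repristination": "re"}

# Buggy behaviour: the prefix-stripped key is reused in the suffix category.
buggy = sorted(words, key=lambda w: stripped_key(w, CONFIX_PREFIX.get(w, "")))

# Intended behaviour: the suffix category sorts by the whole word.
fixed = sorted(words)

print(buggy)
print(fixed)
```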
I have found a workaround for repristination by altering it to {prefix|re|pristine|lang=en} {suffix||ation|lang=en} (doubled brackets shown as single). But the problem still remains for the unwary. Donnanz (talk) 21:28, 9 January 2015 (UTC)
Does it work if you use {{affix|en|re-|pristine|-ation}}? —CodeCat 21:36, 9 January 2015 (UTC)
Yes, that works OK. Donnanz (talk) 22:02, 9 January 2015 (UTC)
Please don't apply that workaround to lots of entries! We should fix confix rather than propagating hacks. Equinox 21:38, 9 January 2015 (UTC)
Hopefully there's not a lot of entries like this. Donnanz (talk) 22:02, 9 January 2015 (UTC)
I think confix only works properly when a prefix is linked to a suffix with no word in between. Donnanz (talk) 10:03, 10 January 2015 (UTC)

Can anyone fix the tagging of the "Appendix" namespace, please?

See discussion here: http://sourceforge.net/p/kiwix/discussion/604122/thread/1eacb6d8/

Thank you.—This unsigned comment was added by 198.23.103.67 (talk) at 13:59 9 January 2015.

No comments?  :( —This unsigned comment was added at 198.23.103.67.

Could you state exactly what you are seeking and why? Generally speaking, only parts of Appendix namespace have content that meets minimum standards IMO. Separating the wheat from the chaff will take some time if contributors are willing to undertake the task. DCDuring TALK 15:17, 15 January 2015 (UTC)
We really need a separate namespace for reconstructions. —CodeCat 15:42, 15 January 2015 (UTC)
Maybe, but the link above is to a discussion about Appendix:Japanese verbs. I rather doubt that reconstructions are tops on the list of Appendix content that linkers seek. DCDuring TALK 16:01, 15 January 2015 (UTC)

Bot Request

Would someone with a bot please be so kind as to perform null edits on all the entries in Category:Pages with module errors? With over 2100 entries from a problem that was quickly fixed a day or two ago, the real module errors are very hard to spot (see błyskać for the one I know about). Thanks! Chuck Entz (talk) 19:47, 9 January 2015 (UTC)

I'm running it now. It's not actually performing null edits though, just hard purges. It's faster that way. —CodeCat 21:37, 9 January 2015 (UTC)
Thank you! It was fun watching the member count dropping rapidly every time I refreshed the page. I see that the problem with the Polish entries has been fixed (the Chinese ones seem to have been fixed earlier), and the pt-adj monstrosity has been dealt with, so we now have an empty category for the first time in what seems like a month or more. I can handle the new ones that will straggle in from the edit queue, but I gave up on these after clearing about 1,500 of them (what can I say- I'm stubborn!). Chuck Entz (talk) 22:17, 9 January 2015 (UTC)
Making a bot that does null edits is very easy. If you want, I can give you some Python code that does it? You'll need the Pywikibot package. You probably don't need a bot account as null edits aren't really edits, nothing is actually changed. —CodeCat 22:19, 9 January 2015 (UTC)
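For readers without Pywikibot, here is a dependency-free sketch of the purge variant. It is not the code offered above: it assumes the MediaWiki API's action=purge with forcelinkupdate is an acceptable substitute for a null edit, and the batch size of 50 titles per request and the User-Agent string are assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wiktionary.org/w/api.php"
BATCH_SIZE = 50  # assumed per-request title limit for ordinary accounts


def purge_batches(titles, batch_size=BATCH_SIZE):
    """Yield one dict of API parameters per batch of page titles."""
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        yield {
            "action": "purge",
            "forcelinkupdate": "1",  # also refresh links and category membership
            "titles": "|".join(batch),
            "format": "json",
        }


def send_purges(titles):
    """POST each batch to the API (needs network access; not called here)."""
    for params in purge_batches(titles):
        data = urllib.parse.urlencode(params).encode("utf-8")
        request = urllib.request.Request(
            API_URL, data=data, headers={"User-Agent": "purge-sketch/0.1"}
        )
        with urllib.request.urlopen(request) as response:
            json.load(response)
```

The list of titles would have to be fetched separately, for example from the members of Category:Pages with module errors.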

Template for variant spellings.

I would like to create a template that creates a collapsible table, replaces input strings according to a specific key and logs every single replacement. An example: The input of the template is "a, d, e". Each of these letters has two results according to the key, the alternatives being "ä, ð, 0". I would want to log the variants "ade", "aðe", "äde", "äðe", "ad", "äd", "að", "äð" in an annotated table. Basically an automated form-table like the one that can be seen at enmity.
I tried to figure it out myself through the help pages, but they all seem to be for people who already know to some extent how it all works. For example I could find no help pages on Wiktionary on how to create a table that is collapsible.
So if anyone could give me a link to a help page with the relevant formatting data and explain to me how to include a string-replacement into a template, I'd be grateful. Korn (talk) 10:08, 10 January 2015 (UTC)
ps.: I'm aware the WT-templates page has a section giving the code to make a table collapsible. But what I meant is that if you've no prior Wiki-experience, making your baby steps here is at least a bit confusing. Korn (talk) 10:31, 10 January 2015 (UTC)
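The enumeration asked for here is a Cartesian product over the alternatives for each letter. A minimal Python sketch of that idea follows; on the wiki it would more likely be a Lua module, and the `KEY` table and function name are invented for the example, with the empty string standing for the "0" alternative.

```python
from itertools import product

# Replacement key from the example: each letter maps to its possible spellings.
# The empty string stands for the "0" (letter dropped) alternative.
KEY = {"a": ["a", "ä"], "d": ["d", "ð"], "e": ["e", ""]}


def variants(letters, key=KEY):
    """Return every spelling obtainable by picking one alternative per letter."""
    choices = [key.get(letter, [letter]) for letter in letters]
    return ["".join(combo) for combo in product(*choices)]


print(variants("ade"))
```

For the input "ade" this yields the eight variants listed in the request; each one could then become a row of the table.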

If I understand you correctly you would like to create a table for variant spellings without regard to whether they were actually attestable or sourced from a reference like the OED. That would be something we wouldn't want in entries. We already have some handmade tables like that which are not useful. The table at enmity might be useful, but the plethora of redlinks suggests that we haven't collected evidence and are relying exclusively on the OED. DCDuring TALK 11:09, 10 January 2015 (UTC)
Well, I was primarily intending to use it for IPA to translate the basic structure of a word into all the narrow transcriptions for different dialects for Middle Low German. Though, in modern Low German, 99% of words in almost all but one or two varieties are indeed not attestable, because people use German for written communication. And people use a pronunciation spelling for writing, so it could be used in that area as well. Korn (talk) 13:07, 10 January 2015 (UTC)

I triggered a spam filter?

I just tried to edit my userpage with a short introduction saying who I am and that I'm an admin and CU over at en.wikibooks. It seemed to have triggered a spam filter, I think. Can someone add this to my userpage?

I am an administrator and check user at English Wikibooks. I'm not very active at English Wiktionary.

Thanks. --Xania (talk) 23:14, 12 January 2015 (UTC)

Done. — Ungoliant (falai) 23:18, 12 January 2015 (UTC)
For future reference, I believe the filter was triggered because you had few contributions and were adding an external link; because your target was another WMF site, you could have avoided the filter by using a link of the form [[:wikibooks:User:Xania|English Wikibooks]]. Cheers, - -sche (discuss) 23:58, 12 January 2015 (UTC)
Thanks. I had thought that a WMF URL would have been exempt but I'd forgotten that I could have used a shortcut instead.--Xania (talk) 00:12, 13 January 2015 (UTC)

Module:Alternative forms

In e.g. indogermanisch, the name of the "dialect" to which the alt forms belong (in this case the qualifier is not a dialect but an explanation that the alt forms are abbreviations) should be in parentheses, like it would be if the old method of formatting were used. - -sche (discuss) 01:46, 13 January 2015 (UTC)

Bump. - -sche (discuss) 20:26, 19 January 2015 (UTC)
See this discussion. --Vahag (talk) 10:42, 26 January 2015 (UTC)
Why are the forms listed in a different font face? —Aɴɢʀ (talk) 20:39, 19 January 2015 (UTC)

Template question

Hello,

I'm new to the Mediawiki markup language. I've already read several help pages about templates.

According to m:Help:Parameter_default, {{{p|q}}} outputs the value of parameter "p" if "p" is defined, and "q" otherwise.

I still don't get a particular syntax:

{{{a|{{{b|c}}}}}} gives c
{{{{{{a|b}}}|c}}} gives c - parameter b is undefined

Can someone explain to me, why the result is "c"? In both examples, "a" might be defined?

The assumption in the examples is that there are no defined parameters.
{{{a|{{{b|c}}}}}}
is interpreted as
  • is parameter a defined?
  • Yes: return the contents of parameter a.
  • No: is parameter b defined?
    • Yes: return the contents of parameter b
    • No: return the literal string "c".
{{{{{{a|b}}}|c}}}
is interpreted as
  • is parameter a defined?
  • Yes: return the contents of parameter a
  • No: return the literal string "b"
  • what was just returned will now be interpreted as a parameter name, being either what was in a, or "b". Assuming that there are no defined parameters, this will be "b", so...
  • is parameter b defined?
  • Yes: return the contents of parameter b
  • No: return the literal string "c"

I'm also struggling with an expression like this one:

{{#if: {{{A|{{{B|{{{C|}}}}}}}}} | XXX | YYY }}

If I understand the syntax correctly, the output will be "XXX" if any one of the three variables "A", "B" or "C" is defined. Otherwise (if none of these variables are defined) "YYY" will be the output. Is my assumption correct?

Almost.
So #if has three elements: the truth statement, the return value if true, and the return value if false.
If the parameter A is defined, its contents will be returned as the truth statement. If it isn't defined, then parameter B will be evaluated. If it isn't defined, C. If that isn't defined either, then the empty string "" will be returned.
The truth value is then determined. Simply: if the return value is undefined or an empty string, then it evaluates as false, otherwise as true.
So if all of the parameters are undefined, the result of the #if statement will be "YYY". It will also be "YYY" if the contents of the first defined parameter in A, B, or C, is the empty string (that is, if it is called from an invocation like A=.) So you can force it to output a negative result, even if B and C are defined.
If you don't put a default value in an expansion, and the parameter is not defined, then the result will be the string unexpanded.
So if you have
{{{A|{{{B|{{{C}}}}}}}}}
with no parameters defined, the result will be {{{C}}}: C is not defined and has no default, so the unexpanded text {{{C}}} is passed back as the result. That is a valid, non-empty string, so it evaluates as true, and the result of
{{#if: {{{A|{{{B|{{{C}}}}}}}}} | XXX | YYY }}
(that is, with no default for the C parameter) will be "XXX"
Have I confused you utterly? --Catsidhe (verba, facta) 12:09, 13 January 2015 (UTC)
Hello Catsidhe, thanks for your fast and extensive response! ;))) Thanks!
Just for clarity:
{{{a|{{{b|c}}}}}}
works the same way
{{{{{{a|b}}}|c}}}
does?
I also learned, that a defined (but empty) parameter affects the logic flow. Until now I assumed, that empty values were input errors by user, but now I rescind that statement.
I think I've been way off with my interpretation of
{{#if: {{{A|{{{B|{{{C|}}}}}}}}} | XXX | YYY }}
I lost you at "(that is, with no default for the C parameter) will be "XXX"":
First you said, that if the variables A,B are not defined, the output will depend on the value of C. If C isn't defined either, the empty string will be returned and "YYY" will be the output.
On the other hand, you said that
{{{A|{{{B|{{{C|}}}}}}}}}
will be evaluated to
{{{C}}}
and that is not an empty string, independent of whether C is defined or not. Thus "XXX" will follow.
This is the point where I am confused. Let's assume C has no default value and no value is given for C, where the template is called. Will "XXX" or "YYY" follow?
Citronas (talk) 13:27, 13 January 2015 (UTC)
Recall that {{{x|y}}} means if x is defined, return "{{{x}}}", otherwise, return "y". So look at the evaluation step-by-step:
  • "{{{A|{{{B|{{{C|}}}}}}}}}" is what we start with. Is parameter A defined?
    • Yes? Then it becomes "{{{A}}}".
    • No? Then it becomes "{{{B|{{{C|}}}}}}". Is parameter B defined?
      • Yes? Then it becomes "{{{B}}}".
      • No? Then it becomes "{{{C|}}}". Is parameter C defined?
        • Yes? Then it becomes "{{{C}}}".
        • No? Then it becomes "" (empty string).
I hope this helps. —CodeCat 13:42, 13 January 2015 (UTC)
It's more like {{{C}}} evaluates to the contents of the parameter C, if it is defined. If it is not defined, then it evaluates to the string "{{{C}}}". The {{{C|a}}} says that if C is not defined, then return the string "a". {{{C|}}} says that if C is not defined, then return the empty string. I was contrasting the behaviour of {{{C|}}} and {{{C}}}. With the alternate value of the empty string, the whole construction will return the empty string if none of the parameters are defined, and the empty string has the truth value "false". Without that pipe character, if none of the parameters are defined, it will return the string "{{{C}}}", which has the truth value "true".
{{{a|{{{b|c}}}}}} and {{{{{{a|b}}}|c}}} do not work the same. If a is defined, then the first will return the value of a. Else if b is defined it will return the value of b, else it will return the string "c". The second will first evaluate whether a is defined, and return either that value or the string "b", and then use that string (either a or "b") and see whether that is a defined parameter. If it is, it will return the contents of that parameter, otherwise it will return "c". So if the contents of a is "z", it will evaluate {{{a|b}}}, which will return "z", which then becomes {{{z|c}}}. If parameter z is defined, then that value will be returned.
--Catsidhe (verba, facta) 19:49, 13 January 2015 (UTC)
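The rules explained in this thread can be mimicked in a few lines of Python. This is a rough model of the behaviour as described above, not of the MediaWiki parser itself; the function names are invented, and `if_` assumes that #if treats an empty or whitespace-only condition as false.

```python
def param(args, name, default=None):
    """Mimic {{{name|default}}}; with default=None, mimic bare {{{name}}}."""
    if name in args:
        return args[name]
    if default is None:
        return "{{{" + name + "}}}"  # an undefined bare parameter stays as literal text
    return default


def if_(condition, then, otherwise):
    """Mimic {{#if: condition | then | otherwise }}."""
    return then if condition.strip() else otherwise


def outer_default(args):   # {{{a|{{{b|c}}}}}}
    return param(args, "a", param(args, "b", "c"))


def computed_name(args):   # {{{{{{a|b}}}|c}}}
    return param(args, param(args, "a", "b"), "c")


def with_pipe(args):       # {{#if: {{{A|{{{B|{{{C|}}}}}}}}} | XXX | YYY }}
    return if_(param(args, "A", param(args, "B", param(args, "C", ""))), "XXX", "YYY")


def without_pipe(args):    # {{#if: {{{A|{{{B|{{{C}}}}}}}}} | XXX | YYY }}
    return if_(param(args, "A", param(args, "B", param(args, "C"))), "XXX", "YYY")


print(outer_default({}), computed_name({}))            # both fall through to "c"
print(outer_default({"a": "z"}), computed_name({"a": "z"}))
print(with_pipe({}), without_pipe({}))
```

With a = "z", the first form returns "z" while the second looks up a parameter named z and falls back to "c", which is the difference between the two nestings.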
I finally got it ;) Thanks Catsidhe and CodeCat!! I wasn't aware of the difference between {{{C|}}} and {{{C}}}. I should have asked here earlier, instead of guessing for 2 weeks straight =) Citronas (talk) 10:10, 14 January 2015 (UTC)

Template:eo-head

Template:eo-head seems to be displaying noun and adjective inflections slightly differently: for adjectives, it presents them in the order "plural, accusative singular, accusative plural" (e.g., aĝa), whereas for nouns, it presents them in the order "accusative singular, plural, accusative plural" (e.g., ŝnuro). I think it is desirable to have the inflections presented in the same order for both adjectives and nouns, so I would appreciate it if someone could change Template:eo-head so that it uses the order "plural, accusative singular, accusative plural" for both parts of speech. This is the order that Template:eo-noun and Template:eo-adj both use. Thank you! —Mr. Granger (talkcontribs) 00:02, 16 January 2015 (UTC)

I think that putting the accusative singular first makes more sense, because the accusative plural is derived from the nominative plural. And what happens for nouns with no plural? With the ordering you propose, you would end up with the accusative singular changing positions because the plural before it disappears. It looks neater if the plural is just taken off the end instead. Furthermore, we already put singular cases before nominative plural in the headword lines of Russian and Slovene, and probably other languages too. —CodeCat 00:07, 16 January 2015 (UTC)
That's fine with me. I just want all three templates to use the same order for both parts of speech. —Mr. Granger (talkcontribs) 00:30, 16 January 2015 (UTC)
I think it's fixed now. —CodeCat 00:59, 16 January 2015 (UTC)
Thanks! —Mr. Granger (talkcontribs) 01:05, 16 January 2015 (UTC)

parameter id= has stopped working

The parameter id=, used in {{m}} and {{l}} for linking to {{senseid}}-generated targets, has stopped working. See for example in the etymology of भाति (bhāti). Please fix it. --Vahag (talk) 11:25, 18 January 2015 (UTC)

It works now. id= wasn't broken, it was just being ignored for Appendix pages because we don't need to link to language sections. But we do need to be able to link to ids, so I changed that now. —CodeCat 12:37, 18 January 2015 (UTC)
I see, thanks. --Vahag (talk) 15:17, 18 January 2015 (UTC)

Category:Hungarian uncountable nouns - incorrect content

The above category is suddenly collecting suffixes, proper nouns, pronouns and noun forms. Any idea what may be the cause? Thanks. --Panda10 (talk) 19:28, 19 January 2015 (UTC)

They have a noun declension template with n=sg. You need some way to distinguish between declensions of nouns and other parts of speech in the template. DTLHS (talk) 19:33, 19 January 2015 (UTC)
Thanks! --Panda10 (talk) 20:04, 19 January 2015 (UTC)
@CodeCat: I believe this problem is coming from the new Module:hu-nominals. Is there a way to correct it? --Panda10 (talk) 20:04, 19 January 2015 (UTC)
I've removed the category for now, but I'm confused what's wrong with showing proper nouns there. They are nouns after all. —CodeCat 20:44, 19 January 2015 (UTC)
Ok, thanks for the correction. The nominal inflection module is probably not the best way to do this type of categorization since we are using it for nouns, adjectives, numerals, pronouns, and even suffixes. I will add the category when needed using other methods. --Panda10 (talk) 21:01, 19 January 2015 (UTC)
I used the modules from other languages as a base when making it, so there were some remnants like that. —CodeCat 21:15, 19 January 2015 (UTC)

Tamil transliteration rules are incomplete

In அஃகம், you can see that a couple letters (namely ஃக) aren't transliterated. I presume this should be addressed. - -sche (discuss) 20:26, 19 January 2015 (UTC)

@DerekWinters, Wyang: pls help if you can. --Anatoli T. (обсудить/вклад) 01:17, 20 January 2015 (UTC)
I think it should be "aḥkam". It's visarga (ஃ) + ka (க). --Anatoli T. (обсудить/вклад) 01:22, 20 January 2015 (UTC)
My diff didn't work in Module:ta-translit. --Anatoli T. (обсудить/вклад) 01:25, 20 January 2015 (UTC)

Redlinks by language

Is it possible to acquire a list of words wanted in a given language? That is, pages with a redlink encased in a template such as {{m|xyz|word}} leading to them?

As far as I can tell this is not possible within MediaWiki software, but it sounds like information extractable from a database dump perhaps. --Tropylium (talk) 14:38, 20 January 2015 (UTC)

It would be possible, but difficult. But much easier would be to find those enclosed in {{l}}, {{m}}, {{term}} as all of these have a language parameter, position 1 for {{l}} and {{m}}, lang= for {{term}}. How are your skills with regular expressions? DCDuring TALK 15:50, 20 January 2015 (UTC)
For l and m this should be OK (in Python):
r"{{(?:l|m)(?:\|.*?=.*?)*(?:\|(LANGCODE))(?:\|.*?=.*?)*(?:\|(WORD))(?:\|.*?=.*?)*}}" #gives two groups: langcode and word. 
As for term, since its syntax can be rather complicated, I would first transform term's into l's or m's like this:
from r"{{term(.*?)(\|lang=(LANGCODE))(.*?)}}" into r"{{l|\3\1\2\4}}"
P.S. I think regex will have a hard time working on that huge file though.
--Dixtosa (talk) 18:11, 20 January 2015 (UTC)
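A simplified, runnable variant of the idea for {{l}} and {{m}} is sketched below. It departs from the expression above: instead of encoding named parameters in the regex, it captures the whole argument list and drops any argument containing "=". Templates with nested braces are deliberately ignored, and the sample wikitext is invented.

```python
import re

# Matches {{l|...}} and {{m|...}} whose arguments contain no nested braces.
LINK_TEMPLATE = re.compile(r"\{\{(?:l|m)\|([^{}]*)\}\}")


def extract_links(wikitext):
    """Return (language code, word) pairs for each {{l}}/{{m}} call found."""
    pairs = []
    for match in LINK_TEMPLATE.finditer(wikitext):
        # Named parameters such as sc=Latn or id=... are skipped.
        positional = [p.strip() for p in match.group(1).split("|") if "=" not in p]
        if len(positional) >= 2 and positional[1]:
            pairs.append((positional[0], positional[1]))
    return pairs


sample = "From {{m|la|aqua}}; compare {{l|es|agua|sc=Latn}} and {{l|en|water}}."
print(extract_links(sample))
```

Running this page by page over a dump, rather than over the whole file as one string, should keep memory use modest.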
I run a Perl script every month that extracts and counts instances of taxa enclosed in {{taxlink}} (on 11K pages). It runs in less than 30 seconds, but virtually all instances are red links, so it doesn't have to compare the list of all terms enclosed in {{l}} (on 362K pages) and {{m}} (on 42K pages) with a list of all headwords, let alone a list of all entries in a given language. In addition all terms enclosed in {{taxlink}} are Translingual lemmas.
There are also templates such as {{l/es}} (223K pages) (Compare {{l|es}} (10K pages).) that enclose words from only a single language. IOW, it would be easy to generate, for example, l|es-, m|es-, and l/es- linked words in Spanish. I think they are supposed to all be lemmas. Subtracting members of Category:Spanish lemmas shouldn't be too hard. DCDuring TALK 19:47, 20 January 2015 (UTC)
I wouldn't assume that all terms linked with {{l}} and {{m}} (and l/XX templates) are lemmas. There are all sorts of times when nonlemma forms might find themselves inside those templates. —Aɴɢʀ (talk) 20:39, 20 January 2015 (UTC)
If they mostly are, the exercise would still probably be worth it. But it would probably be worthwhile to subtract all entries in a given language, rather than just all lemmas. In any event the remaining entries would still have to be looked at one at a time for purposes of actually adding new L2 sections or new pages. DCDuring TALK 22:04, 20 January 2015 (UTC)
"Take all links with a language code, subtract all existing entries" is the obvious brute force option, sure. I'm wondering more if it is possible to speed things up a bit: first acquire a list of redlinks in the main namespace, then retrieve the referring wikicode(s) for each? --Tropylium (talk) 11:12, 21 January 2015 (UTC)
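The brute-force option amounts to a filtered count. A small Python sketch, assuming the links have already been collected as (language code, word) pairs and that existing entries are available as a set of (language code, page title) pairs; both inputs and the function name are hypothetical, and differences between link text and page title (such as stripped diacritics) are ignored.

```python
from collections import Counter


def wanted_words(links, existing_entries, lang_code):
    """Count links to `lang_code` words that have no entry yet.

    links            -- iterable of (language code, word) pairs taken from templates
    existing_entries -- set of (language code, page title) pairs that already exist
    """
    counts = Counter(
        word for code, word in links
        if code == lang_code and (code, word) not in existing_entries
    )
    return counts.most_common()  # most-wanted words first


links = [("es", "agua"), ("es", "perro"), ("es", "perro"), ("en", "dog"), ("es", "gato")]
existing = {("es", "agua")}
print(wanted_words(links, existing, "es"))
```

Sorting by count puts the most frequently linked missing words at the top of the list.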
We have Category:Terms having red links in their inflection table by language already, which works with various templates. Examples of such templates are Template:es-adj and Template:ast-noun. --Walled brick (talk) 11:25, 21 January 2015 (UTC)

Weirdness at これは何ですか

This module error wasn't here a couple of days ago, but the entry's edit history doesn't show anything since January 11, and when I look at "Templates used in this section:" for the section that has the error, none of the templates listed has any edits in the past week:

  • Template:ja-phrase last edit July 27, 2014 (my time zone)
  • Module:ja last edit December 24, 2014
  • Module:ja-headword last edit December 25, 2014
  • Module:languages last edit September 26, 2014
  • Module:languages/data2 last edit January 13, 2015

Can anyone explain where this module error came from? It looks like it materialized out of thin air. Has there been a system change that might explain this? Chuck Entz (talk) 03:47, 21 January 2015 (UTC)

Fixed. The function find_kana in Module:ja-headword tries to find from the arguments a pure kana parameter, and it fails to detect one if the fullstop "。" is included. Wyang (talk) 05:59, 21 January 2015 (UTC)
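To illustrate the kind of failure described, here is a hypothetical Python analogue. The real find_kana is Lua code in Module:ja-headword and its actual implementation is not shown in this thread; the character ranges, the function signature and the sample arguments below are assumptions made for the sketch.

```python
import re

# Hiragana and katakana blocks (U+3040 to U+30FF).
KANA_ONLY = re.compile(r"[\u3040-\u30ff]+")
# Same, but also tolerating the Japanese full stop and comma.
KANA_WITH_PUNCTUATION = re.compile(r"[\u3040-\u30ff。、]+")


def find_kana(candidates, pattern):
    """Return the first candidate made up entirely of characters allowed by `pattern`."""
    for text in candidates:
        if pattern.fullmatch(text):
            return text
    return None


args = ["これは何ですか。", "これはなんですか。"]
print(find_kana(args, KANA_ONLY))              # the full stop blocks the match
print(find_kana(args, KANA_WITH_PUNCTUATION))
```

A strict kana-only test rejects the reading because of the trailing "。", while the tolerant pattern accepts it.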
Thank you! Any idea why it waited a week from the last edit before the error showed up? Chuck Entz (talk) 07:01, 21 January 2015 (UTC)
No idea, it might be the reason these edits were needed. Wyang (talk) 08:53, 21 January 2015 (UTC)

Mismatch between L2 and language declared in etymology

Is this and this, i.e. the use of a language code as the lang= parameter of {{borrowing}} or as the second parameter of {{etyl}} that doesn't correspond to the L2 header, something a bot could check for periodically? It doesn't always need to be cleaned up to the language code that corresponds to the L2; sometimes it needs to be switched to use "-", as here. - -sche (discuss) 17:58, 21 January 2015 (UTC)
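A check of this kind could be sketched as follows in Python. It covers only {{etyl}} with two positional parameters, and the `HEADER_TO_CODE` table, the function name and the sample page are invented for the example; a real bot would take the header-to-code mapping from the language data modules.

```python
import re

# Stand-in mapping from L2 header to language code.
HEADER_TO_CODE = {"English": "en", "Spanish": "es"}

L2_HEADER = re.compile(r"^==([^=]+)==[ \t]*$", re.MULTILINE)
ETYL = re.compile(r"\{\{etyl\|([^|{}]+)\|([^|{}]+)\}\}")


def mismatched_etyl(wikitext):
    """Return (L2 header, second parameter) for each {{etyl}} whose second
    parameter is neither the section's language code nor "-"."""
    problems = []
    headers = list(L2_HEADER.finditer(wikitext))
    for index, header in enumerate(headers):
        end = headers[index + 1].start() if index + 1 < len(headers) else len(wikitext)
        section = wikitext[header.end():end]
        language = header.group(1).strip()
        expected = HEADER_TO_CODE.get(language)
        for match in ETYL.finditer(section):
            target = match.group(2).strip()
            if target not in (expected, "-"):
                problems.append((language, target))
    return problems


page = (
    "==English==\n===Etymology===\nFrom {{etyl|la|en}} {{m|la|aqua}}.\n\n"
    "==Spanish==\n===Etymology===\nFrom {{etyl|la|en}} {{m|la|aqua}}.\n"
)
print(mismatched_etyl(page))
```

Each hit still needs a human decision, since the right fix may be the L2's own code or "-".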

@-sche: User:DTLHS/bad etymology. I have excluded Chinese from the list. DTLHS (talk) 01:24, 22 January 2015 (UTC)
Some of the pages in the list use {{compound}} and related templates with nocat=. Those should really be excluded. —CodeCat 01:38, 22 January 2015 (UTC)
Excluded. DTLHS (talk) 01:42, 22 January 2015 (UTC)
Thank you! - -sche (discuss) 02:13, 22 January 2015 (UTC)
I really don’t like the way {{borrowing}} works. I think it should work the same way as {{etyl}}. — Ungoliant (falai) 02:36, 22 January 2015 (UTC)
It seems that people expect {{unk.}} to work the same way as {{etyl}}, too. - -sche (discuss) 02:38, 22 January 2015 (UTC)
I wonder if {{rfe}} should have multiple parameters, for the probable language of the etymology (if you knew it was Latin) as well as the requesting entry language. DTLHS (talk) 02:45, 22 January 2015 (UTC)
  • @DTLHS: The next time this is run {{rfelite}} should also be included, though it has only about 80 transclusions so far. DCDuring TALK 02:53, 22 January 2015 (UTC)
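
A rough idea of what such a periodic check could look like, as a Python sketch. It is not the code behind User:DTLHS/bad etymology; the header-to-code mapping, the regular expressions, and the function names are all hypothetical, and it only handles {{etyl}} and {{borrowing}} calls without nested templates.

```python
import re

# Tiny illustrative mapping; a real check would need the full language data.
L2_TO_CODE = {"English": "en", "French": "fr", "Spanish": "es"}

# Matches level-2 headers such as ==French== but not ===Etymology===.
L2_HEADER = re.compile(r"^==([^=]+)==\s*$")
# Matches {{etyl|...}} and {{borrowing|...}} without nested templates.
TEMPLATE = re.compile(r"\{\{(etyl|borrowing)\|([^{}]*)\}\}")


def declared_lang(name, params):
    """Return the entry-language code a template call declares, or None."""
    parts = params.split("|")
    if name == "etyl":
        positional = [p for p in parts if "=" not in p]
        return positional[1].strip() if len(positional) > 1 else None
    for part in parts:
        key, sep, value = part.partition("=")
        if sep and key.strip() == "lang":
            return value.strip()
    return None


def find_mismatches(wikitext):
    """List (line number, template, declared code, expected code) tuples."""
    mismatches = []
    expected = None
    for lineno, line in enumerate(wikitext.splitlines(), 1):
        header = L2_HEADER.match(line)
        if header:
            expected = L2_TO_CODE.get(header.group(1).strip())
            continue
        if expected is None:
            continue  # unknown language section, nothing to compare against
        for call in TEMPLATE.finditer(line):
            lang = declared_lang(call.group(1), call.group(2))
            # "-" is accepted because it is sometimes the right value.
            if lang is not None and lang not in (expected, "-"):
                mismatches.append((lineno, call.group(1), lang, expected))
    return mismatches
```

The output is only a list of candidates; as noted above, the right fix is sometimes the L2's code and sometimes "-", so a human still has to decide.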

Wiktionary talk:AutoWikiBrowser/CheckPage#Technical 13

Since I don't expect that page is monitored that well, I'm posting here requesting that someone take a look at my request on Wiktionary talk:AutoWikiBrowser/CheckPage#Technical 13. Thank you. Technical 13 (talk) 20:59, 21 January 2015 (UTC)

Partial string search

Is it possible to search for a partial string with regular expressions or something else? Currently, I need to find all Russian words with Cyrillic "-вств-" in them (e.g. чу́вство (čúvstvo)) to fix a pronunciation rule in Module:ru-pron. I have mistakenly defined the rule with a silent first "в", as in чу́вство (čúvstvo), здра́вствуйте (zdrávstvujte), but there are cases when it's pronounced, and I forgot what those words are! One example is де́вственница (dévstvennica).

I think the advanced search functionality would be useful in various cases, e.g. when looking for words having the same stem or suffix, etc. --Anatoli T. (обсудить/вклад) 22:56, 21 January 2015 (UTC)

AWB search returned 31 results.
безнравственность, Соединённое Королевство Великобритании и Северной Ирландии, королевство, Соединённое Королевство, чувствовать, почувствовать, чувствоваться, почувствоваться, здравствуйте, здравствуй, чувство, девственник, колдовство, предчувствие, девственница, кумовство, девственность, воровство, здравствовать, девственная плева, да здравствует, сочувствовать, сочувствие, нравственный, чувствительный, лукавство, нравственность, отцовство, рыболовство, самочувствие, чувствительность --Panda10 (talk) 00:45, 22 January 2015 (UTC)
@Panda10: Thanks a bunch! Is that a complete list? (As it turns out, silent "v" in the beginning of the cluster is less common than the pronunciation "/fstv/".) --Anatoli T. (обсудить/вклад) 02:10, 22 January 2015 (UTC)
Yes, and it is very easy if you are on a Unix-like machine (for example Linux). You just download the "List of all page titles" (it is only 55 MB) and run the command
$ grep вств enwiktionary-20150102-all-titles
but it doesn't filter by language.
Yes this is a complete list as of 2015-01-02. --Dixtosa (talk) 08:46, 22 January 2015 (UTC)
Special:Search/insource:/вств/ (warning: slow!) Keφr 09:24, 22 January 2015 (UTC)
Thank you all! --Anatoli T. (обсудить/вклад) 14:09, 22 January 2015 (UTC)
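
Since the grep approach above does not filter by language, here is a hedged Python sketch that at least restricts the hits to titles written entirely in Cyrillic. That is a script filter, not a language filter, so Ukrainian, Bulgarian, etc. titles would still come through. It assumes one title per line with underscores for spaces and an optional tab-separated namespace column; the function name and the character class are made up for this example.

```python
import re

# Cyrillic block plus combining acute, space and hyphen; an assumption
# about which characters a Cyrillic-script title may contain.
CYRILLIC_TITLE = re.compile(r"[\u0400-\u04FF\u0301 \-]+")


def titles_containing(lines, needle):
    """Yield Cyrillic-only titles from a titles dump that contain needle."""
    for line in lines:
        # Keep only the last tab-separated field in case a namespace
        # column precedes the title.
        title = line.strip().split("\t")[-1].replace("_", " ")
        if needle in title and CYRILLIC_TITLE.fullmatch(title):
            yield title


# Possible use against the dump mentioned above (file name as given there):
# with open("enwiktionary-20150102-all-titles", encoding="utf-8") as dump:
#     for title in titles_containing(dump, "вств"):
#         print(title)
```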

Alternative font for alternative forms?

Why does {{alter}} and/or Module:Alternative forms display forms in a font different from the default font? How do we fix that? —Aɴɢʀ (talk) 12:56, 25 January 2015 (UTC)

Maybe because local sc = args["sc"] or "polytonic"? Keφr 18:06, 25 January 2015 (UTC)
So what should it say? —Aɴɢʀ (talk) 18:30, 25 January 2015 (UTC)
Probably this. What is the actual point of {{alter}} anyway? {{l}} paired with {{qualifier}} works well enough for me. Keφr 18:42, 25 January 2015 (UTC)
I dunno, I've never used it myself. I just noticed that it looked funny when other people use it. —Aɴɢʀ (talk) 19:10, 25 January 2015 (UTC)
It still has the problem that it doesn't display the qualifier label in parentheses; see Wiktionary:Grease_pit/2015/January#Module:Alternative_forms. If it can't be fixed soon, I'm tempted to start restoring functional manual formatting in entries that use it. - -sche (discuss) 19:18, 25 January 2015 (UTC)

Automating removal of Category:German lemmas categories (and presumably other languages)

I've never really dabbled in the automation side of Wiktionary, so I thought I'd ask here: is it possible, using AWB or a bot, to go through the German parts-of-speech categories and remove categories like Category:German nouns/Category:German adjectives etc from pages that already have template:head or one of its descendants? The problem is that template:head automatically parses German words with special characters in order to correctly alphabetise them in dictionary order (so it puts gären between garen and garnieren). However, putting a lemma category on the page then overrides this and causes the default sort to take precedence, which puts non-ASCII characters after ASCII (which means gären gets sorted after gustieren, between gähnen and gönnen). Simply removing the category where it's unnecessary would ensure that terms including special characters get correctly sorted. I've corrected a few entries by hand (eg. [4], [5]) but it's hard to find these improperly categorised pages manually when they don't start with an umlaut.

Presumably other languages have this problem too. Category:Spanish adjectives has ñango (which is only categorised through template:es-adj) next to namibio, but ñoño (which is explicitly categorized) is sorted next to zurdo. I'm using German as an example solely because that's a language with collation rules I know fairly well. Smurrayinchester (talk) 09:39, 26 January 2015 (UTC)

I've been going through the German topic categories and fixing the ones with diacritics to use {{catlangcode|de|Blah}} instead of bare [[Category:de:Blah]] for the same reason: {{catlangcode}} uses smart sorting, and the bare Category: code doesn't. —Aɴɢʀ (talk) 20:19, 26 January 2015 (UTC)
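
To make the two orderings discussed above concrete, here is a small Python sketch. The folding table is an assumption chosen to reproduce the dictionary order described in this thread; it is not taken from the actual sorting code behind {{head}} or {{catlangcode}}, and it assumes precomposed (NFC) input.

```python
# Assumed folding for German dictionary order: umlauts sort with their
# base vowel and eszett sorts as "ss".
FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})


def sort_key(title):
    """Return a key that sorts by the folded form, then by the original."""
    lowered = title.lower()
    # The second element breaks ties so that "garen" precedes "gären".
    return (lowered.translate(FOLD), lowered)


words = ["garnieren", "gustieren", "gären", "garen", "gähnen", "gönnen"]

# Default code-point order puts the umlauted words after all ASCII ones.
default_order = sorted(words)
# -> ['garen', 'garnieren', 'gustieren', 'gähnen', 'gären', 'gönnen']

# Folded order puts gären between garen and garnieren.
dictionary_order = sorted(words, key=sort_key)
# -> ['gähnen', 'garen', 'gären', 'garnieren', 'gönnen', 'gustieren']
```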

February 2015

Help with ff-root

Need help with reviewing syntax for the new Template:ff-root. Seems to work but for the articles in the category not showing in the category. TIA.--A12n (talk) 18:45, 1 February 2015 (UTC)
Seems to be working now, but a review of syntax would still be appreciated. TIA.--A12n (talk) 19:41, 1 February 2015 (UTC)

I'm not sure what the template is supposed to do, as there is no documentation. Can you elaborate? —CodeCat 19:56, 1 February 2015 (UTC)
It's modeled after the ar-root template - but only needs to display the root on the page on which it's placed, put that page in the Fula roots category, and put itself in the Fula template category. There was a delay in populating the categories so I thought there was a problem. Will look at how to do the documentation (appears from the ar-root example to require a separate page).--A12n (talk) 20:06, 1 February 2015 (UTC)
If the purpose is only to show the page name, and add a category, then you don't need to make a new template. The standard template "head" will do: {{head|ff|root}}. —CodeCat 20:11, 1 February 2015 (UTC)
Ok, thanks. Will look into changing.--A12n (talk) 20:40, 1 February 2015 (UTC)

Option for checking past contributions...

I really think that there should be an "only show items that are not on your watchlist" option for checking one's past contributions.

Oftentimes, I remove items from my watchlist once I feel that they are no longer in any danger of being vandalised or the like. However, sometimes I wish to check on those items that I have removed from my watchlist just on the off chance that something did happen to them.

Is there any way to implement such an option for that? Tharthan (talk) 17:15, 2 February 2015 (UTC)

This script makes unwatched entries bolder on the user contributions page, but is awfully slow.--Dixtosa (talk) 23:04, 8 February 2015 (UTC)

Bug in romanization of Arabic

The automatic romanization of Arabic has a small bug in the translations list. If you look at the English word 'wolf', the Arabic translation is given as ذِئْب (I have no idea whether that'll come out correctly here.) This is correct, but the romanization is (ḏīb). That is, it is not recognizing that the middle ya is the bearer of hamza, and is treating it as a ya of prolongation, giving a long vowel. On the actual page for the word ذِئْب, the hamza is correctly coming out in the transcription (ḏiʾb). – 194.106.220.86 16:29, 4 February 2015 (UTC)

We don't have automatic romanization of Arabic. It's all manual for that language. If someone romanized it as ḏīb when it should be ḏiʾb, then they just made a mistake. —Aɴɢʀ (talk) 17:00, 4 February 2015 (UTC)
Not quite true. It's automatic but only if the transliteration module determines that the word is fully vocalised. In any case, though, manual transliterations will override automatic ones. —CodeCat 17:03, 4 February 2015 (UTC)
Sure 'nuff. I took out the manual translit and now it automatically generates ḏiʾbun. —Aɴɢʀ (talk) 17:15, 4 February 2015 (UTC)
By convention, we don't include ʾiʿrāb in the translations to Arabic, so I changed the translation to ذِئْب (ḏiʾb). ذِئْبٌ (ḏiʾbun) is the nominative singular indefinite form in the MSA or Classical Arabic. --Anatoli T. (обсудить/вклад) 22:15, 5 February 2015 (UTC)
There's an ongoing discussion about the use of ʾiʿrāb in Wiktionary. Suffice to say that nunation is not pronounced in "pausa" (end of a clause before a pause) even in standard Arabic. No dialect preserves nunation, except for some accusative forms, especially adverbials but this also usually affects unvocalised spellings (alif is written in most cases). See also Wiktionary:About_Arabic#.CA.BEi.CA.BFr.C4.81b_.28final_short_vowels_and_nunation.29. --Anatoli T. (обсудить/вклад) 22:26, 5 February 2015 (UTC)

nive#Walloon

It's showing "uncountable, plural -", which doesn't make sense. This, that and the other (talk) 10:53, 5 February 2015 (UTC)

It's fixed now. —CodeCat 14:11, 23 February 2015 (UTC)

Template:headtempboiler:letter

As mentioned at Template:headtempboiler#Letter template there's the parameter "lower2=" in Template:headtempboiler:letter. But that doesn't work anymore and seems to have been removed here. A "lower2" is e.g. needed for σ (sigma). So the template needs to be fixed. Or should {{head|LANG|letter|lowercase|LOWER2|uppercase|UPPER}} be used like in β? -Yodonothav (talk) 21:56, 5 February 2015 (UTC)

Telugu script not showing up correctly

A picture, for anyone seeking to troubleshoot this. - -sche (discuss) 08:33, 7 February 2015 (UTC)

Hi! So I noticed that there seems to be a problem with how certain aspects of the Telugu script show up within entries (i. e., not in the titles). Consonant adjuncts don't seem to be working at all; consonant clusters appear as the two base consonants next to each other, the first with a virama (the inherent vowel deleter) and the second with the appropriate vowel adjunct. While this technically produces the same sound if read out loud, it is not generally how Telugu orthography works. Secondly, many vowel adjuncts don't seem to be working either... The adjunct simply shows up next to the base consonant it's supposed to be modifying, but just hovering next to it instead of being integrated like it should be. Below is an example of an entry which features all of these problems:

అంటార్కిటికా

The word should look like it does in the title of the entry, but nowhere else in the article does it look remotely like that. Does anyone know how I could fix or help fix this issue? It's rather widespread in Telugu articles. –AxaiosRex (అక్షయ్⁠రాజ్) 00:24, 7 February 2015 (UTC)

Fixed by removing a crappy font from Common.css (Sangam Telugu is good, though). —Μετάknowledgediscuss/deeds 09:12, 7 February 2015 (UTC)

Space in Template:IPA

Can anyone figure out why {{IPA}} is no longer placing a space after the colon? Kc kennylau says he doesn't think it's because of his recent edits to Module:IPA, but I don't see any other recent edits to relevant templates or modules that could be causing it. —Aɴɢʀ (talk) 08:12, 7 February 2015 (UTC)

PTO translated with a combined mark into Hungarian

Hello there,

I wonder if it's possible to add this sign: ˙/. as a translation for PTO in its second meaning ('please turn over'). There seems to be an issue with this string. I wrote "fordíts!" as well, because that's how it's expanded in speech, but in terms of writing, this form is not used, only the combination of these three characters. Thanks in advance for your help. Adam78 (talk) 23:36, 7 February 2015 (UTC)

Wiktionary talk:Babel#Greenlandic_.28kl.29

It seems that (a) the Wikimedia #Babel system has a bug affecting Greenlandic, and (b) we're missing Template:User kl-0. See Wiktionary talk:Babel#Greenlandic_.28kl.29 for discussion. - -sche (discuss) 00:58, 8 February 2015 (UTC)

Automated flagging of missing Wiktionary entries

Hello! I am an information scientist and natural language complexity researcher at the University of Vermont, leading a project that predicts "missing" phrase-entries from a dictionary. This development only applies to dictionaries that include larger-than-word lexical objects (such as the Wiktionary). For example, I am able to generate shortlists of four-word phrases that are similar to those defined in the Wiktionary, which in fact are missing:

  • benefit of a doubt
  • keep an eye to
  • roll off the presses
  • one of a million
  • one upon a time
  • made up your mind
  • what time is new
  • down in the count
  • keep an eye for
  • ...

These lists are ordered according to how likely they are to be meaningful (in need of definition).

Notice that some are completely absent idiomatic entries, like

  • roll off the presses,

which is similar to the extant, "roll off the tongue".

Many more are variants of existing metaphoric forms, like

  • keep an eye for,

which are still without reference or redirect.

I would like to add to the requested entries list on Wiktionary:

as part of this ongoing research project, mapping out and defining the greater, English lexicon of phrases.

As this could generate large lists of requested entries, I must ask, is this reasonable within the current framework of the Wiktionary system?

If not, would it be possible to create a separate access point through which I could make these shortlists public?

I am very interested in enhancing the breadth and depth of knowledge---already enormous---on the Wiktionary.

My service and interest in this is purely academic, and I offer it freely and openly.

Looking forward to this discussion :)

Sincerely, Jake Ryland Williams

---

jake[dot]williams[at]uvm[dot]edu http://www.uvm.edu/~jrwillia/

---

Hi. Please sign up with a user name, and then you can create subpages under your user page, like (for example) User:MyName/mypage1. I don't think that a new experimental project will be quite ready to post on WT:REE yet. Equinox 17:54, 8 February 2015 (UTC)
But many of your phrases are just plain wrong:
benefit of a doubt - benefit of the doubt
keep an eye to - keep an eye out
one of a million - one in a million
one upon a time - once upon a time
made up your mind - make up one's mind
what time is new ?
down in the count - down for the count
keep an eye for - see above

SemperBlotto (talk) 18:03, 8 February 2015 (UTC)

Hello again, and thank you all very much for your responses. Thanks Equinox---I have created a user account---and DCDuring---I have transported this conversation to my user page, enhancing it to a more full description. Please visit jakerylandwilliams and feel free to contact me with any questions or suggestions. As stated, I am very interested in working with the Wiktionary, and within whatever framework is deemed productive and acceptable. Best, Jake.

Edittools no longer working

Has anyone else found that Edittools no longer works? It appears in my UI when I'm in edit mode just as expected, and I can click on any of the items, but instead of inserting the clicked text at the location of the cursor in the textbox, the UI focus just ... vanishes. The blue outline on the textbox, indicating that the textbox is the active UI element, disappears, and nothing else is highlighted. I have to click within the textbox before I can type again.

This non-functionality first arose maybe a month ago. I had made no changes to my Edittools config, and something (I forget what) led me to think that it was a browser update issue (I had been using slightly-outdated Chrome 30-something), but updating Chrome didn't fix the issue. I decided to do some testing yesterday, and found the same problem under Chromium on Ubuntu, and on Firefox on Mac, leading me to conclude that the Edittools infrastructure must have changed somehow.

Any further information would be much appreciated. ‑‑ Eiríkr Útlendi │ Tala við mig 19:48, 9 February 2015 (UTC)

This kind of error is most likely caused either by broken/outdated personal JavaScript or one or more broken/outdated gadget(s). We've been seeing this on a number of wikis recently. I suggest you try disabling non-default gadgets and commenting out user scripts until you find that Edittools works again. This, that and the other (talk) 06:19, 10 February 2015 (UTC)
I have disabled almost all gadgets, deleted my common.js, and cleaned up most checked boxes in my per-browser preferences, but I still cannot add characters from the extended character set menus. Could this be Java-version specific, ie attributable to recent updates of these? DCDuring TALK 21:22, 10 February 2015 (UTC)
Can you check the webconsole of your browser and see if there is a javascript error? I had something like that: ReferenceError: insertTags is not defined. I think that "insertTags" may have been deprecated in the latest release, and it should normally work while showing "Use of "insertTags" is deprecated. Use mw.toolbar.insertTags instead." Maybe try to purge your cache. — Dakdada 17:14, 11 February 2015 (UTC)
Well, it did change, see phab:T85787. If purging does not solve your issue, open a bug report there. — Dakdada 17:22, 11 February 2015 (UTC)
  • I've purged and still get the insertTags error, but I'm not sure if the issue is with MW -- I suspect the problem is that our infrastructure here is outdated, as I dimly recall that Edittools is based on old code from Conrad Irwin. Last I mucked about with my own personal JavaScript settings for Edittools, the best practice at the time was to copy Conrad's code. Is there some MediaWiki code that we should be copying instead, or transcluding instead? Our own WT page discussing Edittools seems to be somewhat out of date, and I'm not sure where else to look. I'll poke around phab:T85787 later when I have more time. ‑‑ Eiríkr Útlendi │ Tala við mig 20:38, 11 February 2015 (UTC)
If anyone reading this knows how to do this, please implement the required change. I poked around in MediaWiki:Gadgets-definition, but I didn't see anything related to charinsert. ‑‑ Eiríkr Útlendi │ Tala við mig 08:57, 12 February 2015 (UTC)
Thanks for doing the research. I hope it gets implemented quickly. Now I can't even do a copy and paste from the Edittools character sets. I would need to use Unicode to get the characters. DCDuring TALK 14:27, 12 February 2015 (UTC)
The charinsert is implemented in MediaWiki:Edit.js, loaded by MediaWiki:Gadget-legacy.js (the first, default gadget). — Dakdada 16:20, 12 February 2015 (UTC)

Template:alternative form of

This template starts with a capital letter, whereas all other similar form-of templates appear to begin with a lowercase. Could someone please deal with this? This, that and the other (talk) 23:47, 10 February 2015 (UTC)

"all other similar form-of templates appear to begin with a lowercase" Such as...? Look at the templates in Category:Form-of templates, all of the ones I've checked so far all begin with an uppercase letter. Some of them seem to have a parameter that allows you to render it in lowercase for whatever reason (using the template amid a definition instead of on it's own line perhaps?). Bruto (talk) 01:58, 11 February 2015 (UTC)
Our whole set of non-gloss templates is not entirely consistent on whether to start with an uppercase or lowercase letter and end with a dot or nothing. It'd be nice to standardize. Since we generally (though a few object to this) begin English sense-lines with uppercase letters and end them with dots, while beginning other languages' sense-lines with lowercase letters and ending them without dots, perhaps the templates could even be set up to capitalize and punctuate based on the lang= parameter. - -sche (discuss) 18:51, 12 February 2015 (UTC)

Help with ff-noun

Need to request help to include a parameter in Template:ff-noun that would add the entry to a category for the indicated noun class. That is, with {{ff-noun|sg-nc|plural|pl-nc}}, to have this category generated: [[Category:Fula noun in class sg-nc]]. The object is to group entries for nouns by noun class. These new categories would then be subcategories of Category:Fula nouns. TIA for any help or pointers.--A12n (talk) 04:56, 11 February 2015 (UTC)

Maybe you could do it the same way as {{sw-noun}}? Are your needs any different? —Μετάknowledgediscuss/deeds 08:28, 11 February 2015 (UTC)
Thx. Looks like that approach could be adapted. Is there a simpler way, taking the contents of the sg-nc field and putting it in the specified location in the category? (I'll need to read up on the coding, evidently.)
Well, I used a different system of categorising the noun classes, one that makes sense for Swahili but is not the numerical system standardly used by Africanist linguists. Besides other benefits, it greatly reduces what has to be typed into the template. That said, if you really want three parameters where the template itself is unable to predict anything and you must fill them all out, I can do that for you. —Μετάknowledgediscuss/deeds 17:19, 11 February 2015 (UTC)
Thinking about this. Noun class names in Fula unlike Swahili (if I'm seeing the latter correctly) also have a function - so ki for instance is also a particle functioning as a determiner and an indicative depending on whether it is after or before the noun. So the {{ff-noun}} template is set up so you type in whichever of the 22 or so singular classes is appropriate (there are 4 plural classes but I still need to generate a template for plural Fula nouns). The other two parameters - the plural and the plural class - also need to be keyed in (no way to predict the plurals that I can see - ending can vary, and some initial consonants shift). So yes, if you could help that would be most appreciated.--A12n (talk) 04:49, 13 February 2015 (UTC)
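
To illustrate the "take the first parameter and drop it into the category name" idea asked about above, here is a Python sketch of the logic only. The real thing would be template or Lua code; the function name is hypothetical, and the category wording is the one proposed at the start of this thread.

```python
def noun_class_category(template_call):
    """Return the noun-class category link for an {{ff-noun|...}} call.

    Returns None if the call is not ff-noun or has no singular class.
    """
    text = template_call.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        return None
    parts = [p.strip() for p in text[2:-2].split("|")]
    if parts[0] != "ff-noun" or len(parts) < 2 or not parts[1]:
        return None
    # parts[1] is the singular noun class (sg-nc in the request above).
    return f"[[Category:Fula noun in class {parts[1]}]]"
```

For example, noun_class_category("{{ff-noun|ki|PLURAL|PL-NC}}") gives "[[Category:Fula noun in class ki]]", where PLURAL and PL-NC are placeholders rather than real Fula data.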

Soundex search

This site demonstrates a Javascript function that generates a soundex code for a string. I assume that it is useful only within a given language. Couldn't we supplement our existing orthographic indexes (and our incomplete misspellings, IPA, and rhymes coverage) with a soundex index to enable search for terms (words?) the spelling of which is not correctly known? It would be nice if it were integrated into search, but it would first be nice to determine whether it would work and be useful at all.

Is it a good idea? What would be involved? DCDuring TALK 23:21, 11 February 2015 (UTC)

I see at w:Soundex that there are improvements over the original soundex system. DCDuring TALK 23:28, 11 February 2015 (UTC)
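
For readers who have not seen it, here is a minimal Python sketch of the classic American Soundex code. It is not the JavaScript from the site linked above, it ignores the later refinements, and it only makes sense for terms in unaccented Latin letters.

```python
# Consonant groups of classic American Soundex, numbered 1 to 6.
GROUPS = ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"]
CODES = {ch: str(digit) for digit, group in enumerate(GROUPS, 1) for ch in group}


def soundex(word):
    """Return the four-character Soundex code of word, or "" if no letters."""
    letters = [ch for ch in word.lower() if ch.isascii() and ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # vowels reset this, so repeated codes count again
    return (result + "000")[:4]
```

Here soundex("Robert") and soundex("Rupert") both give "R163", which is the kind of match by sound being asked about.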
For misspellings the w:Levenshtein_distance is actually a better approach. The search engine used by MediaWiki already supports this, you'll need to add ~ to the search term (fuzzy search). Jberkel (talk) 01:41, 19 February 2015 (UTC)
@Jberkel: Thanks a lot. It's wonderful that we have it already. Is what we have "tuned" for English? What scripts and languages does it work with? DCDuring TALK 03:28, 19 February 2015 (UTC)
The Levenshtein distance is language agnostic (in contrast to the Soundex/Metaphone group of algorithms). The implementation used in MediaWiki has full Unicode support so should work with all scripts supported by that standard. – Jberkel (talk) 14:37, 19 February 2015 (UTC)
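
As a point of reference, this is a minimal Python sketch of the Levenshtein distance itself: the textbook dynamic-programming version, not the implementation inside the search engine. It compares Unicode code points, so it treats every script the same way, which is the sense in which the measure is language agnostic.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                        # deletion
                current[j - 1] + 1,                     # insertion
                previous[j - 1] + (char_a != char_b),   # substitution
            ))
        previous = current
    return previous[-1]
```

For instance, levenshtein("kitten", "sitting") is 3, and the misspelling "чуство" is at distance 1 from "чувство".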
Well yes but Soundex is about sound, not writing or misspelling. Is it not what DCDuring asked (words for which we don't know the spelling, but an approximate pronunciation)? — Dakdada 16:17, 19 February 2015 (UTC)
True, it's not about sound. But looking at the references in the article, Soundex (and most derivatives) are optimised for English (or non-English words familiar to English speakers). It would be very hard to build a version of Soundex which works well with the majority of languages and scripts in use here. However, it would be interesting to see if the IPA data (where available) can be used to implement phonetic search. – Jberkel (talk) 17:01, 19 February 2015 (UTC)
Both sound and spelling are issues. Many misspellings, especially in English, are based on the sound. Hardly any ordinary users know IPA, so the only tool, short of asking at Info Desk or Tea Room, is to use conventional orthography as best one can. So: spelling matters, probably much more than anything else. But a Levenshtein or other distance would be more accurate if it "knew" whether the source of distance was a typo, or a scanno, or a thinko, or a pronunciation spelling (ie, a spelling intended to represent what was heard). For near-misses all of the above could be used to determine what the search engine offers the user, but a better focused list would be generated if the user could specify that pronunciation representation was the objective. A special interface to elicit better sound information from a user would be nice.
Any effort that worked for English would be a good start. For almost all searches we are likely to see the matrix language at least would be known and a secondary language could be guessed. DCDuring TALK 17:10, 19 February 2015 (UTC)
IPA can be used to search for sounds to some extent, at least as long as the user can type the sounds that he wants. I already did a tool like that for French (no fuzzy searches though), and I opted to use a virtual keyboard to type IPA symbols (see here). This approach can be found in other dictionaries like TLFi. The most difficult problem seems to be how to help the user type what he wants to find, rather than the search itself. — Dakdada 17:36, 19 February 2015 (UTC)

'#English

Neither {{head}} nor {{en-part}} works at '. Both result in this being displayed as the headword line: [[Category:English lemmas|]][[Category:English particles|]]. - -sche (discuss) 22:22, 12 February 2015 (UTC)

This is because apostrophes are stripped when making category sort keys. Of course in this case there is nothing left after that. I'm not sure what the best solution for this would be. The simplest, that I can think of, would be to skip creating a sort key altogether if the page name is only one character, but that would still break when someone creates something like ''. —CodeCat 22:28, 12 February 2015 (UTC)
sort=' solved it. — Ungoliant (falai) 22:30, 12 February 2015 (UTC)
Thanks! - -sche (discuss) 22:44, 15 February 2015 (UTC)
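
A Python sketch of one possible guard for the empty-key case described above. It is not what the module does and not the single-character rule floated in the thread: it simply falls back to the page name whenever stripping leaves nothing, which also covers a title like ''. The function name is made up.

```python
def make_sort_key(page_name):
    """Strip apostrophes for sorting, but never return an empty key."""
    key = page_name.replace("'", "")
    # If nothing is left (the title was only apostrophes), fall back to
    # the unmodified page name so the category link stays valid.
    return key if key else page_name
```

With this, make_sort_key("'") returns "'" instead of an empty string, while make_sort_key("don't") still returns "dont".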

Chinese classifier template

I'm not sure if this idea has been run by you guys before, but what do you think of the idea of having a template that generates the correct classifier(s) for each Chinese entry? (@Atitarev, CodeCat, DCDuring, Wyang: Any input?) WikiWinters (talk) 11:12, 17 February 2015 (UTC)

Did I break anything?

Hi. I've been playing with some Modules recently, which is probably not healthy for Wiktionary. Anyway, I'm trying to generate categories for missing noun forms, and later will try to do the same for other parts of speech. I've fiddled with lots of modules, but the only fiddle that worked, much to my delight, was my one on Module:ca-headword. My edits to Module:en-headword , Module:pt-headword , Module:fr-headword , Module:gl-headword and Module:ru-headword did not have the desired effect, and I'm afraid I might have broken something. Modules, by the way, are really complicated things! --Type56op9 (talk) 17:46, 16 February 2015 (UTC)

It would be useful to have a page Help:Modules to explain how to write and use the damn things, you know. --Type56op9 (talk) 17:47, 16 February 2015 (UTC)
One of the lines of Help:Modules will be like "Do not touch anything that is used by thousands of entries if you do not know what you are doing", for sure... --Dixtosa (talk) 18:25, 16 February 2015 (UTC)
I'm sure CodeCat would be happy to help. DCDuring TALK 20:18, 16 February 2015 (UTC)
If I'm to help, I'm just going to revert it all. —CodeCat 20:24, 16 February 2015 (UTC)
Why's that? Can't you assist with the objective, provided it is expressed, of course? DCDuring TALK 21:20, 16 February 2015 (UTC)
The objective is for Wonderfool to continue creating form-of entries with his bot or through some other (presumably automated) means, even though there have been complaints about the mistakes he has been making. Since it doesn't seem he wants to hold himself accountable for his edits (if he did, then why does he circumvent blocks?), I've chosen to stay far away from this topic, and want to bear no responsibility if it causes more problems. Let someone else deal with it. —CodeCat 22:08, 16 February 2015 (UTC)
I appreciate the comments, CodeCat. You are right about everything - the objective is to enrich Wiktionary with form-of entries (semi-automated, using WT:ACCEL, in fact). It's a pity that modules are so complicated, because it means less of us are able to use them. I'll follow this topic closely, and play with modules some more, until I either figure them out or I give up. --Type56op9 (talk) 10:34, 17 February 2015 (UTC)
If you work on modules you unfortunately have to spend some time to learn how to program. If you're unsure what you're doing then you should try your changes with one module first (preferably sandboxed). Once everything works as expected apply the changes to the live module. As far as I can tell you just blindly copy-pasted code snippets around. Jberkel (talk) 01:13, 19 February 2015 (UTC)
If someone could tell me how to -- or where to find the docs telling me how to -- sandbox a module, or even to create a module in userspace for testing before bringing it out into mainspace, and how to invoke it either way, I would be very much obliged. --Catsidhe (verba, facta) 01:19, 19 February 2015 (UTC)
Everyone has their own sandbox module, yours is at Module:User:Catsidhe. You can create that and use as many subpages as you like. —CodeCat 01:30, 19 February 2015 (UTC)
Catsidhe and Type56op9 raise a valid point: we don't have good documentation around modules and the development approach in general. Everything feels quite ad-hoc and every module author does things a little bit differently. Wheels get reinvented. Code gets copied. It would be good to work towards a consensus on how certain things should be done. Wiktionary:Coding_conventions#Lua is not enough. – Jberkel (talk) 02:13, 19 February 2015 (UTC)

Spanish nouns without Template:es-noun

Hi there. How would one go about generating a list of Spanish nouns not including Template:es-noun? --Type56op9 (talk) 10:43, 17 February 2015 (UTC)

  • Well, I would take the contents of the Spanish nouns category together with "what links here" of the template and sort them together. Throw away all the entries that occur twice and Robert is your parent's brother. SemperBlotto (talk) 10:48, 17 February 2015 (UTC)


Or alternatively, you can ask the author of module:head to change it so that it categorizes just like you want. Or even better option is to change es-noun by yourself (not protected yay! :D) so that it does not categorize es-nouns and then get the list of new Spanish nouns. you may need to do massive null-edits on pages though. --Dixtosa (talk) 17:11, 17 February 2015 (UTC)

I think you can use AWB to compare lists (and possibly even to generate them from categories and whatlinkshere) even without being approved to save edits with it. - -sche (discuss) 22:08, 17 February 2015 (UTC)
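
The list comparison suggested above boils down to a set difference. A minimal Python sketch, assuming the two lists (category members and "what links here" for the template) have already been exported by whatever means; the function name and sample titles are only for illustration.

```python
def missing_template(category_members, transclusions):
    """Return titles that are in the category but do not use the template."""
    return sorted(set(category_members) - set(transclusions))


# Invented sample data, not real query results.
in_category = ["casa", "perro", "gato"]
uses_es_noun = ["perro"]
candidates = missing_template(in_category, uses_es_noun)
# -> ['casa', 'gato']
```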

Urgent help please - boxing spammer

A very persistent spammer keeps adding "mywikibiz" rubbish to pages. He was using Talk:boxing until I protected it, and is now using other pages. He is a human, not a bot, and responds aggressively to people trying to stop him. He has many IPs. Can someone prevent "mywikibiz" being inserted into articles? -- that is the only way to stop him spamming his site. I tried adding it to a filter but I must have done it wrong. Thanks. Equinox 20:18, 17 February 2015 (UTC)

Done. --Yair rand (talk) 23:30, 17 February 2015 (UTC)
They still seem to be getting through, on kickboxing and martial art now. —CodeCat 22:19, 18 February 2015 (UTC)
  • The two bad edits that CodeCat fixed were both from the 208.54.32.xxx range. Equinox, could you tell us if this spammer consistently uses this range? If so, maybe we just block this range for a few days / weeks from making anon edits? ‑‑ Eiríkr Útlendi │ Tala við mig 22:38, 18 February 2015 (UTC)
IPs used by the spammer so far: 172.56.0.109 172.56.0.112 172.56.0.166 172.56.1.82 172.56.1.135 172.56.1.179 172.56.32.69 208.54.64.175 208.54.64.164 208.54.64.188 Equinox 17:05, 19 February 2015 (UTC)
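To judge whether a range block is practical, it may help to see how wide a range the listed addresses span. The following is only a sketch using Python's standard `ipaddress` module; the function name and the split into two groups are my own assumptions, and the addresses are the ones listed above:

```python
import ipaddress


def covering_network(addresses):
    """Return the smallest single IPv4 network that contains every address."""
    ints = [int(ipaddress.IPv4Address(a)) for a in addresses]
    lowest, highest = min(ints), max(ints)
    # Every bit from the first differing bit downwards has to be a host bit.
    host_bits = (lowest ^ highest).bit_length()
    prefix = 32 - host_bits
    # strict=False masks off the host bits of the lowest address.
    return ipaddress.IPv4Network((lowest, prefix), strict=False)


group_172 = ["172.56.0.109", "172.56.0.112", "172.56.0.166", "172.56.1.82",
             "172.56.1.135", "172.56.1.179", "172.56.32.69"]
group_208 = ["208.54.64.175", "208.54.64.164", "208.54.64.188"]

for group in (group_172, group_208):
    print(covering_network(group))
```

A wide covering range would also catch unrelated anonymous editors, so the result is an input to the decision rather than a recommendation to block.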

Blank page

The page share is currently totally blank. Does anyone have any idea what the problem is? — Automatik (talk) 14:39, 18 February 2015 (UTC)

It could be an ad blocker. —CodeCat 15:05, 18 February 2015 (UTC)
Exactly, thank you! AdBlock disabled for this page. — Automatik (talk) 15:21, 18 February 2015 (UTC)

Kassadbot still not running?

There are now over 12,000 entries in Category:Requests for autoformat. SemperBlotto (talk) 08:44, 21 February 2015 (UTC)

Oh is there still no replacement? Sheesh. I've been, uh... in inpatient treatment for a while (borderline personality disorder sure is fun) and I got a new PC and lost maybe half my files due to a less-than-reliable USB hard drive. I might give it a try if I can set everything up again. -- Liliana 10:36, 21 February 2015 (UTC)

Wikisaurus change

Wikisaurus has been proposed as a tool for finding synonyms, antonyms, etc. Instead of adding synonyms of synonyms, shouldn't all synonyms be linked together?

For example, if I add 'feline' as a synonym to the entry for cat, shouldn't Wikisaurus create an entry 'feline' if it doesn't exist, and add 'cat' plus all the synonyms and antonyms of cat to it? The reason is that it might be easier to manage all the synonyms together in a 'collective' space, and it might grow Wikisaurus much faster. 181.50.196.58 18:28, 21 February 2015 (UTC)

If I understand what you're proposing, I would say it's not a good idea. A big problem with Wikisaurus is that it's not always obvious when you're creating an entry whether there's already a Wikisaurus entry that covers it. If I put felid as a synonym for cat, Wikisaurus:felid would duplicate Wikisaurus:feline. Also, WS entries are often based on subtle semantic distinctions that automated methods wouldn't be able to handle. The likely result of an automated method would be lots of single-member WS entries that would just add clutter and confusion. Chuck Entz (talk) 18:51, 21 February 2015 (UTC)
Redirects would solve the problem of people creating Wikisaurus:felid because they don't know about Wikisaurus:feline. Perhaps someone could even create a gadget similar to the one used on rhymes pages, which would create redirects automatically when a new synonym was added to a Wikisaurus page (i.e. if I add foobar to Wikisaurus:feline, the gadget would create Wikisaurus:foobar as a redirect to Wikisaurus:feline). - -sche (discuss) 19:05, 21 February 2015 (UTC)
Redirects are unnecessary since (a) the user can use the search bar present at the top of each Wikisaurus entry to find whether a WS page already contains the term, and (b) the mainspace Synonyms section for each word should eventually link to the corresponding Wikisaurus pages (I have now expanded felid to link to WS:feline). --Dan Polansky (talk) 14:56, 22 February 2015 (UTC)
I agree with Chuck Entz. I add that, generally speaking, most synonyms are not 100% equivalent, and this can be addressed in Wikisaurus, but not automatically. And Wikisaurus should not address only synonyms, antonyms... but should be a true thesaurus. @-sche: redirects are a good idea, but this cannot be automatic: many words have several meanings, and might appear in several Wikisaurus pages. Lmaltier (talk) 19:11, 21 February 2015 (UTC)
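The redirect idea and the objection to it can be combined: create a redirect only when a term appears on exactly one Wikisaurus page, and leave the rest for a human. A rough Python sketch of that planning step, assuming the page contents are already available as a mapping from page title to listed terms (the function name and the sample data are invented for illustration):

```python
from collections import defaultdict


def plan_redirects(pages):
    """Plan redirects for listed terms, setting aside ambiguous ones.

    `pages` maps a Wikisaurus page title to the terms listed on that page.
    """
    targets = defaultdict(set)
    for title, terms in pages.items():
        for term in terms:
            if term != title:
                targets[term].add(title)

    redirects, ambiguous = {}, {}
    for term, titles in targets.items():
        if term in pages:
            continue  # the term already has its own Wikisaurus page
        if len(titles) == 1:
            (only,) = titles
            redirects[f"Wikisaurus:{term}"] = f"#REDIRECT [[Wikisaurus:{only}]]"
        else:
            # Listed on several pages: needs a human decision.
            ambiguous[term] = sorted(titles)
    return redirects, ambiguous


# Invented sample data.
sample_pages = {
    "feline": ["felid", "foobar"],
    "nonsense": ["foobar", "gibberish"],
}
planned, needs_review = plan_redirects(sample_pages)
print(planned)
print(needs_review)
```

This does nothing about the semantic distinctions mentioned above; it only avoids pointing a term with several meanings at a single page.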

Template:sa-verb-pres

I wonder if anyone could fix Template:sa-verb-pres? It has extra "}}". (See हन्ति for example) --KoreanQuoter (talk) 12:13, 22 February 2015 (UTC)

Never mind, I think I got it. --KoreanQuoter (talk) 12:50, 22 February 2015 (UTC)

Module:en-headword

I was fiddling with a Module again. It didn't work. Could someone check it, and correct it, please? --Type56op9 (talk) 15:16, 23 February 2015 (UTC)

software database dictionary

I have invented a word game and would like a free concise dictionary, in the form of a downloadable software database file, for inclusion within it. Is there such a file which can be used commercially? The word list I am using for the game is SCOWL and I am hoping to get a dictionary which will contain all the words that are in that word list, so that when a word (in the form of a link) is pointed to and clicked, the player will be directed to a short meaning of it.

Thanks Paul

See Help:FAQ#Downloading_Wiktionary. You'll have to run a manual comparison against SCOWL; also be aware that we are fairly inclusive of unusual and offensive words: your players might object to some of them if they are not in mainstream dictionaries etc. Equinox 22:02, 23 February 2015 (UTC)
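The comparison against SCOWL could look roughly like the sketch below. It assumes the definitions have already been extracted from a download into a mapping from word to a list of senses (that extraction step is not shown), and the function name and sample entries are made up for illustration:

```python
def build_concise_dictionary(word_list, definitions):
    """Split the word list into words with a short definition and words without.

    `definitions` maps a word to a list of definition strings.
    """
    concise = {}
    missing = []
    for word in word_list:
        senses = definitions.get(word)
        if senses:
            concise[word] = senses[0]  # keep only the first sense to stay short
        else:
            missing.append(word)
    return concise, missing


# Made-up sample data, not taken from a real download.
game_words = ["cat", "dog", "zyzzyva"]
extracted = {
    "cat": ["A domesticated feline.", "A spiteful person."],
    "dog": ["A domesticated canine."],
    "bird": ["A feathered animal."],
}
covered, not_covered = build_concise_dictionary(game_words, extracted)
print(covered)
print(not_covered)
```

Because only words from the game's own list are kept, unusual or offensive entries that are not in SCOWL never reach the players.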

Wikidata experiment with taxon hypernyms

(Pinging people who might be interested, but might not check Grease pit very often. Sorry for ping spam.)

@DCDuring: @Chuck Entz: @SemperBlotto: @I'm so meta even this acronym: @JohnC5: @Equinox:

Messed around today with Wikidata and Lua. Thought I'd share in case anyone might be able to use the output in some way, or wanted to push it along, or just wanted to see what might be possible if they ever enable Wikidata on Wiktionary.

So I was considering creating some sort of bot to generate the "hypernym" section for species and other taxon entries here (and also pondering the mess on Wikipedia which is the Taxobox template, which is a related problem), and I thought it'd be far better to have a template with a Lua script that did it all instead of running a bot. Was going to just spend an hour or two on it in the morning, but ended up spending most of the day getting it working.

Because Wiktionary is disconnected from Wikidata, the script will only run on Wikidata's internal wiki right now, but some day they might connect us to Wikidata and enable "access to arbitrary items".

It outputs something you "could" paste into Wiktionary. It takes a "Q" number of a taxon's Wikidata item, and outputs the wikitext for the hypernym section.

Here's a sample of the kind of output (so far):

Octopoda (hypernyms) {{#invoke:Wiktionary-taxon|hypernym|Q40152}}

The dodo: (hypernyms) {{#invoke:Wiktionary-taxon|hypernym|Q43502}}

(more examples)

So while the script can't be used directly on Wiktionary yet, you could copy-paste the output into Wiktionary, but you would probably want to trim it down first. Obviously it still needs some work. Mostly it needs some added heuristics to choose which ranks to ignore. But thought I'd share it so far anyway.

You can try editing/previewing this with other species/taxa here: d:User:Pengo/hypernym, or see the module here: d:Module:Wiktionary-taxon. I'll be glad if it can be used in its current state.

Happy editing. Pengo (talk) 06:21, 26 February 2015 (UTC)

The proliferation of names, both ranked and unranked, for taxonomic clades and the unsettled relationship among them makes keeping track of relationships hard. It also makes keeping up sometimes counterproductive for dictionary users, who are generally not reading works that are up-to-the-minute in this regard. The "correct" placement and circumscription of a taxon is often provisional for years or decades and is sometimes controversial, with multiple hypernymic and hyponymic relationships being in use for some time. Most of the existing sources of taxonomic information have a hard time keeping track of the information for genus and species, let alone higher and lower taxa.
Even Wikispecies and English Wikipedia often disagree, sometimes without acknowledgement in Wikipedia of controversy. Wikispecies is particularly bad at recognizing multiple placements and circumscriptions, doing so only for the "highest" taxa; Commons attempts to reconcile them. The non-WMF external sites that try to have comprehensive coverage of many ranks or clades, firstly, do not actually have comprehensive coverage, secondly, rarely present controversy, and, thirdly, lag behind specialized websites, which are numerous, but often relatively short-lived (10 years being "long" and I'm not just talking about web addresses).
Thus, the grand project of presenting the apparently straightforward data structure of taxonomy requires a huge effort simply to keep track of the twists and turns of classification, and may still miss the mark in presenting how the authors whom our readers are actually reading have used taxonomic terms.
I have no particular solutions to the problem, other than including links to as many outside sites that cover this kind of thing as possible. I wouldn't know how to usefully present multiple discrete circumscriptions (hyponyms) and placements (hypernyms) of taxa (some kind of diffs?). I don't want to discourage any work in this area, but I expect that there will be much more enthusiasm for working on the programming for the simplified snapshot of the latest taxonomy than for maintaining the data or reflecting the history and diversity of opinion. DCDuring TALK 14:32, 26 February 2015 (UTC)
Higher-level taxa tend to be less stable, since they're more abstract. Even when there's no question as to the branches, different taxonomists may represent them using different ranks: one may see a family with subfamilies, while another may see a superfamily with families, a family with tribes, or even an order with suborders. DNA and cladistic analysis don't always clear things up, since one study may focus on specific mitochondrial genes, while another may look for transposon sequences within nuclear DNA; choice, weighting and coding of features, choice of outgroup, and various other differences in methodology can lead to radically different trees from one study to the next. These will eventually get sorted out, but things are mostly in an unsettled, preliminary stage for the near future. These are exciting, but confusing times.
As for filtering algorithms: a lot of it is context within the larger structure- nodes that have sisters should be shown. Family, genus and species are always of interest, and often orders, classes, divisions/phyla and kingdoms. When there are multiple unbranched levels, omit prefixed ranks: orders, but not suborders or infraorders, families, but not superfamilies or subfamilies, etc. Subgenus is especially awkward, since it comes between the two parts of the binomial- so omit it whenever possible. I hope this helps. Chuck Entz (talk) 15:26, 26 February 2015 (UTC)
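The filtering rules above could be sketched as follows. This is only one possible reading of them in Python: the set of "main" ranks, the function names, and the `has_sisters` flag (which would have to come from somewhere) are assumptions, and the taxon names in the sample are made-up placeholders:

```python
# Ranks treated as always worth showing (an assumption based on the rules above).
MAIN_RANKS = {"kingdom", "phylum", "division", "class", "order",
              "family", "genus", "species"}


def keep_taxon(rank, has_sisters):
    """Decide whether one level of the lineage should be displayed."""
    if rank in MAIN_RANKS:
        return True
    if rank == "subgenus":
        return False  # sits awkwardly inside the binomial, so always omitted
    # Prefixed and other minor ranks are shown only where the tree branches.
    return has_sisters


def filter_lineage(lineage):
    """Keep the (rank, name) pairs worth showing from (rank, name, has_sisters)."""
    return [(rank, name) for rank, name, has_sisters in lineage
            if keep_taxon(rank, has_sisters)]


# Placeholder lineage with invented names, lowest rank first.
sample = [
    ("species", "Exemplum primum", True),
    ("subgenus", "Subexemplum", False),
    ("genus", "Exemplum", True),
    ("subfamily", "Exemplinae", False),
    ("family", "Exemplidae", True),
    ("superfamily", "Exemploidea", True),
    ("suborder", "Exemplina", False),
    ("order", "Exemplales", True),
]
print(filter_lineage(sample))
```

Whether a node has sisters is the hard part in practice, since it requires knowing the child taxa of each parent.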
@DCDuring: Ultimately the goal, if I were to spend way too much more time working on this, would be to make it resemble the existing lists, but share the maintenance with the many other WMF projects which use taxonomies.
Maintaining taxonomy data is happening separately already on Commons and Wikispecies and every wikipedia and wiktionary. I don't expect all the projects to switch to using Wikidata tomorrow or any time soon though, but ultimately it could only be less work to do so.
The issue of conflicting taxonomies always comes up when talking about Wikidata and taxonomies. The idea of using Wikidata seems to quickly get shouted down because en.wiki needs to do its taxoboxes differently from fr.wiki (I have to admit, I've never worked out what the specific disagreements/differing views are actually about, but I accept they're valid).
However the problem should be solvable. Wikidata might be centralized, but it allows multiple, "conflicting" data items, which can be tagged with their source and dates, and other such things. If multiple taxonomies were imported into Wikidata, it should be possible to have one project pick one set of preferred sources, and have another pick another, but both still use the same data source, the same code, and use the data in the areas where there isn't controversy. A simplified taxonomy should also be possible, perhaps borrowing the IUCN's red list, where the focus appears to be on large familiar groupings of species more than on accurate cladistics (e.g. it doesn't place birds under reptiles). So the projects could be much more internally consistent. Another project could leverage the conflicting viewpoints and choose to present either one or both. (Yes, the job of working out how to display it best is difficult too, but at least it might become possible to find a new way to display information and actually apply it to existing data)
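As a sketch of how two projects could share one data set yet show different taxonomies, each project could supply its own ordered list of preferred sources. The Python below is a toy illustration, not how Wikidata stores or exposes statements; the record layout, the function name and the source labels are all invented:

```python
def pick_parent(claims, preferred_sources):
    """Return the parent taxon from the most preferred source that has a claim.

    `claims` is a list of {"parent": ..., "source": ...} records for one taxon.
    Returns None when none of the preferred sources says anything.
    """
    for source in preferred_sources:
        for claim in claims:
            if claim["source"] == source:
                return claim["parent"]
    return None


# Invented records: two sources placing the same taxon differently.
claims_for_aves = [
    {"parent": "Reptilia", "source": "source_cladistic"},
    {"parent": "Chordata", "source": "source_simplified"},
]

# Two hypothetical projects reading the same data with different preferences.
print(pick_parent(claims_for_aves, ["source_cladistic", "source_simplified"]))
print(pick_parent(claims_for_aves, ["source_simplified"]))
```

Where the sources agree, both projects get the same answer from the same data with no extra work.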
Hebrew Wikipedia is currently using Wikidata for its taxoboxes, but due to the current limitations on accessing Wikidata from Wikipedia, the tree can't be recursively climbed like I've done here, and instead each taxon in Wikidata needs its own links to some limited set of higher taxa. The guy who made the module presented it on a talk page to en.wiki two years ago but if anyone was enthusiastic about it, they hid it well. The responses were about it not handling weird edge cases, and about how it spelled the end for the all important English/French taxonomy divide, so the dev just went back to he.wiki and took his templates with him. No mention of Lua was made for Taxoboxes again since (in my limited search anyway).
Anyway, these experiments here are just from the data that was already in Wikidata, and as far as I can tell, no one's actually attempted to view it from bottom to top like this before, let alone make it presentable. But it seems there's a good amount of taxonomy data already imported in Wikidata.
Some day, I imagine it might be possible for the user to change the timeline on a taxobox to choose which era's taxonomy to view, or to find some way to automatically list alternate taxonomies on Wiktionary, etc. The main thing here for me is the possibility of actually separating data and presentation.
Populating the data is certainly a huge task too, as you say, especially for anything even slightly historical. But if the various projects which use taxonomy data can work together, and only have to agree on where data has come from, and can decide separately which to display, building something that surpasses the existing systems should be achievable relatively quickly. It doesn't have to have everything, it just needs to be better than what's existing.
That said, my tests here are very simple, and largely an experiment to see what's possible. The algorithm just picks the first "parent taxon" listed and repeats, without any smarts as yet. It could be interesting to find some area where taxonomies disagree and attempt to get that disagreement stored in Wikidata and add a switch to the module to allow flipping between them, but it's all academic at this stage, especially as it can't even run properly anywhere but on Wikidata's own wiki. It was really just meant to be a brief distraction to answer a "would that work?" kind of question, but is worth thinking about for some time in the distant future. Pengo (talk) 16:58, 26 February 2015 (UTC)
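The "pick the first parent taxon and repeat" approach can be sketched outside Wikidata like this. It is a Python illustration over an in-memory mapping, not the Lua module itself; the function name, the step limit and the cycle guard are my own additions, and the sample data is the Ginkgo sequence mentioned in this thread:

```python
def climb(taxon, parents, limit=50):
    """Follow the first listed parent repeatedly and return the chain of hypernyms.

    `parents` maps a taxon name to a list of its listed parent taxa.
    """
    lineage = []
    seen = {taxon}
    current = taxon
    while parents.get(current) and len(lineage) < limit:
        current = parents[current][0]  # always take the first parent, no smarts
        if current in seen:
            break  # guard against circular data
        seen.add(current)
        lineage.append(current)
    return lineage


ginkgo_parents = {
    "Ginkgo biloba": ["Ginkgo"],
    "Ginkgo": ["Ginkgoaceae"],
    "Ginkgoaceae": ["Ginkgoales"],
    "Ginkgoales": ["Ginkgoopsida"],
    "Ginkgoopsida": ["Ginkgophyta"],
}
print(" < ".join(climb("Ginkgo biloba", ginkgo_parents)))
```

Choosing between several listed parents would replace the `[0]` with whatever selection rule a project prefers.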
@Chuck Entz: Those rules are pretty good and do help, thanks. As far as I can tell, there's no way to easily find child nodes in Lua/Wikidata right now, so the amount of branching is impossible to tell. Another good reason to go back to just writing code on my local machine where there aren't so many arbitrary limitations. :) Pengo (talk) 16:58, 26 February 2015 (UTC)
We already have some waste of time and needless confusion for our users in presenting in an entry a simple ladder of one-child taxonomic hypernyms in the same way as a branching structure (trees). But it is not always easy to tell whether a ladder will remain a ladder or become a tree, except perhaps by the length of time that it has remained a ladder.
I am already in the process of eliminating mention of subfamilies, supertribes, tribes, and subtribes from the hypernymic portion of the "definition" (in {{taxon}}) of genera and subgeneric taxa and substituting families, which tend to be more meaningful to non-specialists and somewhat more stable, notwithstanding the all-too-frequent conversion of families to subfamilies (and vice versa) that Chuck refers to. I am also substituting family for genus in subgeneric names, especially, subgenus and species.
I had also determined to limit the display of potentially long sequences of taxonomic hypernyms to one sequence leading to some taxon that has a recognizable connection to an English common name, eg, Plantae, Aves, Tetrapoda, Mammalia, Reptilia, Insecta, Crustacea, Mollusca, which hopefully is also stable. In contrast, the dodo sequence above is an example of a sequence that is probably not particularly helpful to a typical user. It conveys merely the idea that taxonomic classification is well developed, at least at the levels above Aves. I would be perfectly happy to leave to others the question of how to present taxonomic data above the rank of order, or even family, in entries above the level of genus.
  • It might be useful to rely on Wikidata for the presentation of complete (ie, like that of dodo above) taxonomic hypernyms via external links and limit ourselves to taxonomic names proximate to the headword. DCDuring TALK 18:07, 26 February 2015 (UTC)
I'd be curious how the ladder would look if I limited it to taxa which have some number of common names across languages. That would be relatively easy to do. I agree that displaying everything above "Aves" like in this example has very limited usefulness. I'll have a go at incorporating some of the suggestions some time. Thanks for the feedback. At the very least it would be nice if I could make something that could stand in for a human-edited list. Pengo (talk) 09:08, 28 February 2015 (UTC)
I apologize for the less-than-clear use I made of ladder (contrasted with tree). I was referring to the cases for which a taxon, say, a species, is the sole species for a sequence of higher taxa, eg, genus, family, order. An extreme example is Ginkgo biloba, which is the sole (known extant) member of genus Ginkgo, family Ginkgoaceae, order Ginkgoales, class Ginkgoopsida, division Ginkgophyta. This sequence is an unbranching portion of the taxonomic tree of life.

Question 3.25

This question was posed a few days ago. The question and the answer to it are as follows:

Question

"I have invented a word game and would like a free concise dictionary in the form of a downloadable software database file for inclusion within it. Is there such a file which can be used commercially? The word list I am using for the game is SCOWL and I am hoping to get a dictionary which will contain all the words that are in that word list, so that when a word ((in the form of the link ) is pointed to and clicked, the player will be directed to a short meaning of it.

Thanks Paul"

Answer

"See Help:FAQ#Downloading_Wiktionary. You'll have to run a manual comparison against SCOWL; also be aware that we are fairly inclusive of unusual and offensive words: your players might object to some of them if they are not in mainstream dictionaries etc. Equinox ◑ 22:02, 23 February 2015 (UTC)"

 I forwarded the answer to my programmer who replied as follows:

"As far as wikitionary, it's not in text format. Another thing is that it's massively big database. So, I am afraid, it's not feasible to use something like wikitionary. We need to find someone, who provides text (.txt) format of such dictionary. And that too should be concise, in size. Say maximum 5 to 8 mb in size."


Can anyone help me in finding such a dictionary? Thanks for your help thus far.

Paul
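On the format and size concern: once a word-to-definition mapping exists, turning it into a small plain-text file is straightforward. The sketch below is one possible layout (one tab-separated line per word, definitions shortened to a fixed length); the function names, the layout and the 80-character limit are assumptions, and whether the result fits in 5 to 8 MB depends on the word list:

```python
def to_compact_text(entries, max_definition_chars=80):
    """Render a word -> definition mapping as sorted tab-separated lines."""
    lines = []
    for word in sorted(entries):
        # Collapse line breaks and repeated spaces so each entry stays on one line.
        definition = " ".join(entries[word].split())
        if len(definition) > max_definition_chars:
            definition = definition[: max_definition_chars - 3].rstrip() + "..."
        lines.append(f"{word}\t{definition}")
    return "\n".join(lines) + "\n"


def size_in_megabytes(text):
    """Size of the text when stored as UTF-8, in megabytes."""
    return len(text.encode("utf-8")) / (1024 * 1024)


# Made-up sample entries.
sample_entries = {"b": "second  letter", "a": "first\nletter"}
text = to_compact_text(sample_entries)
print(text, end="")
print(f"{size_in_megabytes(text):.6f} MB")
```

The text can then be written out with an ordinary file write and loaded by the game as a plain .txt resource.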