Wiktionary:Grease pit/2013/July

Non-breaking spaces and all dashes

Is there ever an instance where a non-breaking space is not preferable to a regular space before a hyphen or dash, at least in the principal namespace or the content and discussion spaces? It seems undesirable in template or module space. If not, is there an option that we can select in the MW software that always substitutes a non-breaking space for a breaking space? DCDuring TALK 00:22, 1 July 2013 (UTC)

I'm not sure how common suspended hyphens at the beginnings of words are in English, but they're fairly common in German (e.g. Stiefväter und -mütter "stepfathers and stepmothers") and may well be found in German example sentences and quotations. There's no reason for a nonbreaking space before that hyphen. —Angr 20:03, 3 July 2013 (UTC)

And in English. I forgot about those. Perhaps we could restrict the non-breaking space insertion to em- and en-dashes, assuming that hyphens are used in the case you identified.

I wonder if any harm would be done if we bot-replaced " - ", i.e., "space, dash, space", with "&nbsp;- ", i.e., "non-breaking space, dash, space", either including all kinds of hyphens and dashes or restricting the replacement to em- and en-dash patterns. DCDuring TALK 20:33, 3 July 2013 (UTC)
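The restricted variant of the replacement discussed above (en and em dashes only, so that suspended hyphens such as "-mütter" are left alone) could be sketched roughly as follows. This is only an illustration in Python, not an existing bot: the function name `protect_dashes` is made up here, and it inserts the literal U+00A0 character rather than the `&nbsp;` entity.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# An ordinary space, then an en dash (U+2013) or em dash (U+2014),
# then another ordinary space (checked but not consumed).
DASH_PATTERN = re.compile(r" ([\u2013\u2014])(?= )")


def protect_dashes(text: str) -> str:
    """Replace the space before a spaced en/em dash with a non-breaking space.

    Hyphen-minus is deliberately not matched, so suspended hyphens
    and plain " - " sequences are left untouched.
    """
    return DASH_PATTERN.sub(lambda m: NBSP + m.group(1), text)
```

Extending the character class to other dash or hyphen characters would give the broader variant, at the cost of also touching cases like the German example.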

An Odd Pair of Entries

How is it that we have two visually-identical entry names: زنجبيل and زنجبیل? Don't Arabic and Persian use the same characters? Chuck Entz (talk) 02:22, 1 July 2013 (UTC)

Just a hunch: does one of them use a zero-width non-joiner? I vaguely recall that some of our Persian entries do, for some valid reason (having to do with ligatures? digraphs?). - -sche (discuss) 04:26, 1 July 2013 (UTC)

Visually yes, but they are not identical. The Arabic letter ي (y) (yāʾ) is not the same as the Persian ی (yeh), which becomes obvious when written individually or in the final position. --Anatoli ^{(обсудить/вклад)} 04:32, 1 July 2013 (UTC)

Also, the Arabic letter ى (ʾálif maqṣūra) (used only in the final position and pronounced "ā") is not the same as the Persian ی (yeh); they look alike when written separately but are used quite differently and have different readings. The Arabic ى (ʾálif maqṣūra) is casually used by Egyptians as ي (y) (yāʾ) in the final position, which annoys learners and other Arabs. --Anatoli ^{(обсудить/вклад)} 04:41, 1 July 2013 (UTC)
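One way to see the difference described above is to compare the two titles code point by code point. The sketch below is in Python and assumes the two entry names are spelled with the code points written out in the constants (the Arabic one with U+064A, the Persian one with U+06CC); the helper name `differing_characters` is invented for this illustration.

```python
import unicodedata

# Assumed spellings of the two entry titles, written as escapes so that
# the difference survives copy-and-paste.
ARABIC_ENTRY = "\u0632\u0646\u062c\u0628\u064a\u0644"   # ends in yeh + lam
PERSIAN_ENTRY = "\u0632\u0646\u062c\u0628\u06cc\u0644"  # ends in Farsi yeh + lam


def describe(ch: str) -> str:
    """Return a code point and its Unicode name, e.g. 'U+064A ARABIC LETTER YEH'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def differing_characters(a: str, b: str):
    """List (index, description in a, description in b) for each mismatch."""
    return [
        (i, describe(x), describe(y))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


for index, left, right in differing_characters(ARABIC_ENTRY, PERSIAN_ENTRY):
    print(index, left, "vs", right)
```

Under that assumption the only mismatch is the fifth letter, and no zero-width non-joiner is involved.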

Blocked new users from editing their own user page

There has been a big flood of spam just now, so I changed the abuse filter so that it blocks anyone who has one edit or fewer from editing their own user page. Hopefully this will not inconvenience legitimate users too much. I'm not sure how to show a more informative message so that good users know why the edit was blocked; maybe someone else can look at that? —CodeCat 13:57, 2 July 2013 (UTC)

Where is the message located? —Μετάknowledge^{discuss/deeds} 15:39, 2 July 2013 (UTC)

span class="infl-inline"?

I noticed that several English headword-line templates contain this. What does it do? —CodeCat 20:23, 2 July 2013 (UTC)

It seems to be discussed at Wiktionary:Grease_pit/Standardized_personalizable_inflection_templates. Equinox ◑ 21:15, 2 July 2013 (UTC)

Is that still used at all? —CodeCat 21:29, 2 July 2013 (UTC)

Has someone broken {{en-noun|?}}

Look at skink, where it now shows the plural as an actual question mark. {{en-noun|?}} should mean uncertain plural. Equinox ◑ 21:13, 2 July 2013 (UTC)

Yes, User:CodeCat, on the live system with incomplete testing (I would have been fired for that when I was a programmer). SemperBlotto (talk) 21:21, 2 July 2013 (UTC)
- Fixed. —CodeCat 21:28, 2 July 2013 (UTC)

It's still broken. Now see fuzz. {{en-noun|es|-}} is showing a plural of "fuzzs" instead of "fuzzes". Equinox ◑ 22:54, 2 July 2013 (UTC)

That is fixed too. —CodeCat 00:01, 3 July 2013 (UTC)

Something else weird: look at ranker. Why is the Noun header italicised? From the wikitext, it looks as though en-comparative of might be failing to shut off the italics at the end of the line. Equinox ◑ 15:54, 4 July 2013 (UTC)

Fixed too. Thank you for reporting these problems! —CodeCat 16:04, 4 July 2013 (UTC)

I don't work with templates, but I think it would be a good idea to give each one an accompanying test suite. Equinox ◑ 16:06, 4 July 2013 (UTC)

Can I also point out http://test2.wikipedia.org/wiki/Main_Page - it is where I did all my testing of the Lua forms of Italian templates. SemperBlotto (talk) 16:16, 4 July 2013 (UTC)

Bot request - Spanish

If possible, I'd like for a bot to add a call to {{R:DRAE 2001}} in a References section at the bottom of as many Spanish lemmata as possible. It's an excellent online monolingual Spanish dictionary, so it will definitely have all common words. Ideally, this would be for all words in the top 1000 list at WT:Frequency lists#Spanish, and preferably for all of the top 10000 words. The tricky part is that the bot needs to be able to distinguish lemmata from non-lemmata, but hopefully that can be done by avoiding entries that use templates like {{alternative form of}}, {{alternative spelling of}}, {{plural of}}, etc. Thanks! —Μετάknowledge^{discuss/deeds} 06:15, 3 July 2013 (UTC)
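The lemma check suggested in the request could look something like the Python sketch below. It is only a rough illustration under several assumptions: the set of non-lemma templates contains just the three named above (a real bot would need more), template names are matched exactly as written, and the whole page text is inspected rather than only its Spanish section. The function names are invented for the example.

```python
import re

# The three form-of templates named in the request; a real list would be longer.
NON_LEMMA_TEMPLATES = {"alternative form of", "alternative spelling of", "plural of"}
REFERENCE_TEMPLATE = "R:DRAE 2001"

# Captures the template name that follows "{{", up to the first "|" or "}}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?=\||\}\})")


def template_names(wikitext: str) -> set:
    """Return the names of all templates called in the wikitext."""
    return {m.group(1) for m in TEMPLATE_NAME.finditer(wikitext)}


def is_probable_lemma(wikitext: str) -> bool:
    """Treat a page as a lemma if it uses none of the form-of templates."""
    return not (template_names(wikitext) & NON_LEMMA_TEMPLATES)


def needs_reference(wikitext: str) -> bool:
    """True if the page looks like a lemma and lacks the reference template."""
    return is_probable_lemma(wikitext) and REFERENCE_TEMPLATE not in template_names(wikitext)
```

A page with both a lemma sense and a form-of sense would be skipped by this check, which errs on the side of not editing.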

Template:head generates script error when used per documentation

See Abutilon. It also fails to categorize. DCDuring TALK 21:28, 3 July 2013 (UTC)

Fixed. It was a small mistake that crept in while trying to fulfill Ivan's request at Template talk:head. —CodeCat 21:32, 3 July 2013 (UTC)

Even small mistakes are highly visible with modules, which can be a good thing. DCDuring TALK 23:14, 3 July 2013 (UTC)

Lua method of spotting wrong parameters?

Is there an easy way for Lua modules to spot that a keyword parameter has been misspelled (e.g. lag=it instead of lang=it)? It should be general rather than specific (i.e. it shouldn't look specifically for lag=). SemperBlotto (talk) 10:47, 5 July 2013 (UTC)

Well, if it shouldn't look specifically for "lag", then what should it look for? —CodeCat 12:05, 5 July 2013 (UTC)

Anything at all other than the keyword parameters needed/accepted by the particular template. SemperBlotto (talk) 14:09, 5 July 2013 (UTC)

I suppose that could be done, but I am not sure what the best way would be. One way would be to actually remove parameters as soon as they are used. Then, at the end, if any unused parameters remain, consider that an error. But that would need to be built into each module separately. —CodeCat 14:22, 5 July 2013 (UTC)

Make a table with the acceptable parameters and then compare that to args? Seems simple enough. DTLHS (talk) 14:31, 5 July 2013 (UTC)

It's not quite so simple. Args isn't a real table according to the Scribunto documentation, but a "metatable". That means you can index the values but you can't really do much else. I think you can iterate over all the values, but I'm not sure. In any case, though, acceptable parameters might still be wrong in some situations. For example, if you supply {{es-noun}} with gender "m" and then use the "m=" parameter to specify a masculine form, that would not be treated as an error. But it actually is an error, because "m=" is meant for feminine nouns. —CodeCat 15:18, 5 July 2013 (UTC)

Iterating over all the values is possible by using pairs, ipairs, or frame:argumentPairs, according to mw:Extension:Scribunto/Lua reference manual. --Yair rand (talk) 01:04, 9 July 2013 (UTC)

Ok, so it should be possible to set a parameter to nil after you're done with it. At the end, you can see if there are any remaining. —CodeCat 01:06, 9 July 2013 (UTC)
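The two approaches raised in this thread (comparing the arguments against a list of accepted names, and consuming parameters as they are used and then checking for leftovers) can be illustrated with a small sketch. It is written in Python rather than Lua purely to show the logic; a plain dictionary stands in for the template arguments, and the names `find_unknown_params` and `TrackedArgs` are invented for the example.

```python
def find_unknown_params(args, accepted):
    """Approach 1: return argument names that are not in the accepted set."""
    return sorted(set(args) - set(accepted), key=str)


class TrackedArgs:
    """Approach 2: hand out each parameter once and remember what is left."""

    def __init__(self, args):
        self._remaining = dict(args)  # copy, so the caller's data is untouched

    def take(self, name, default=None):
        """Return the value of a parameter and mark it as used."""
        return self._remaining.pop(name, default)

    def unused(self):
        """Names that were supplied but never asked for."""
        return sorted(self._remaining, key=str)


# A call with the misspelling from the question: lag=it instead of lang=it.
call_args = {1: "cane", "lag": "it"}

print(find_unknown_params(call_args, accepted={1, "lang"}))  # ['lag']

tracked = TrackedArgs(call_args)
headword = tracked.take(1)
language = tracked.take("lang", default="und")
print(tracked.unused())  # ['lag']
```

Neither approach catches the {{es-noun}} case mentioned above, where a parameter is accepted in general but wrong in combination with another one; that still needs a module-specific check.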

Context label data module, Module:labels/data

I've created a first template for this module, and added a few entries to it to demonstrate a proposal. I followed the structure of the original templates rather strictly. That means that there are four different parameters for the different types of category. While that can work, we can also try another approach, where the category names are themselves "templated" in a simple way:

categories = {"{{{langcode}}}:Physics"}
categories = {"{{{langname}}} archaic terms"}
categories = {"American {{{langname}}}"}

This allows a bit more flexibility, but it does lock the formatting of the category names into the data module, meaning we have to modify the module if we ever want to change the format. That shouldn't be hard, but it's something to be aware of. This modified format also leaves more room for errors. So, is this ok? Please give feedback (even if you have no comments).

A second part is more of a migration proposal. Because we have many label templates, it's not feasible to migrate them all in one go; that's too much work. So I thought the module could first see if the data module has an entry, and use that if it's present. If it's not present, the label template is used. That way, we can gradually create the labels in the data module, and orphan and delete the templates one by one as we migrate them, which is much easier to do over time. —CodeCat 12:41, 5 July 2013 (UTC)

By "original templates" do you mean the ones that didn't require typing "context" in every case, the ones that operated effectively for several years or the degraded version of more recent vintage? DCDuring TALK 01:53, 9 July 2013 (UTC)

No, I mean the templates like {{intransitive}}, {{archaic}} and so on, that currently contain the data for each context label. —CodeCat 01:55, 9 July 2013 (UTC)

So you mean new templates that have usurped the original names, but without the original functionality. DCDuring TALK 02:53, 9 July 2013 (UTC)
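The proposal at the top of this section has two parts: category names that contain {{{langcode}}} and {{{langname}}} placeholders, and a lookup that tries the data module first and falls back to the old label template. A rough Python sketch of that logic follows. The `LABELS` table, the `display` field, the function names and the `legacy_lookup` callback are all hypothetical stand-ins, not the actual contents of Module:labels/data.

```python
# Hypothetical stand-in for the data module, using the three patterns
# quoted in the proposal.
LABELS = {
    "physics": {"display": "physics", "categories": ["{{{langcode}}}:Physics"]},
    "archaic": {"display": "archaic", "categories": ["{{{langname}}} archaic terms"]},
    "US": {"display": "US", "categories": ["American {{{langname}}}"]},
}


def expand_category(pattern, langcode, langname):
    """Fill the two placeholders in a templated category name."""
    return (pattern
            .replace("{{{langcode}}}", langcode)
            .replace("{{{langname}}}", langname))


def resolve_label(label, langcode, langname, legacy_lookup):
    """Use the data table if the label has been migrated, else fall back."""
    entry = LABELS.get(label)
    if entry is None:
        # Not migrated yet: defer to whatever the old label template does.
        return legacy_lookup(label)
    return {
        "display": entry["display"],
        "categories": [expand_category(c, langcode, langname)
                       for c in entry["categories"]],
    }
```

With this shape, migrating a label means adding one entry to the table; once an entry exists the fallback is never consulted for it, so the corresponding template can be orphaned and deleted.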

Increase Lua script time limit?

The page Wiktionary:Frequency lists/Modern Greek/5K Wordlist is causing script errors because the scripts on the page take too long to run. This effect is apparently cumulative, so this means that pages can actually do some things without Lua that they can't do with it. That is not really a good thing, so is there a way to increase this limit? I think doubling it should be enough. —CodeCat 18:17, 8 July 2013 (UTC)

I think the question should rather be "how do I optimize the code so the scripts take less time to run?" -- Liliana • 22:02, 8 July 2013 (UTC)

That is also useful of course, but can we expect it to work in all cases? And optimisation can also come at the cost of flexibility. —CodeCat 22:05, 8 July 2013 (UTC)

Any ideas? There are now about 15 of these pages showing script errors because they time out. We never had this problem with templates. Why does such a powerful tool need to be limited in such a way? —CodeCat 21:47, 22 July 2013 (UTC)

We kind of did: they took eons to load. Pages like water and a entered infamy because of their sluggishness. Not sure we want to get back to that. -- Liliana • 21:54, 22 July 2013 (UTC)

I see that the problem has become worse since last time? What changed in the Lua code that made it even slower than it used to be? The Greek wordlist example used to stop near the very end, now it's like near the very beginning. Are we perhaps overdoing it with the Luaizing? -- Liliana • 21:56, 22 July 2013 (UTC)

So Lua allows for functions that wouldn't be possible using the template language, but makes everything slower in return. Ouch. Mglovesfun (talk) 22:00, 22 July 2013 (UTC)

In principle Lua should be faster than parser functions. Of course you can do more complex things, which may negate that. DTLHS (talk) 22:04, 22 July 2013 (UTC)

Increasing the time limit should be a last resort. Have we done everything possible to optimize Module:links? Can we run any kind of profiler on it? DTLHS (talk) 22:04, 22 July 2013 (UTC)

A 380-line module that is used hundreds of times on a page doesn't seem "optimal" to me. -- Liliana • 22:06, 22 July 2013 (UTC)

Obviously something like the Greek frequency list shouldn't need all of the capabilities of the monster module that might be useful for an entry with a lot of translations like [[water]]. Perhaps it shouldn't run on pages with lots of {{l}}-type links outside principal namespace. Perhaps we should eliminate {{l}}-type links that don't address any serious current need, e.g., English and taxonomic-name links, at least until the underlying performance problems are corrected, though someone like me can naively expect them to recur as the number of {{l}}-type links increases due to increased coverage of languages. DCDuring TALK 22:15, 22 July 2013 (UTC)

I just realised why this is happening now. Originally, {{l}} bypassed the script code templates and generated its own "simplified" version that was meant to work for any language. But of course our script code templates were too varied for that to work, so I changed it so that it called the script template instead. So it's probably that invocation of the script template that is slowing it down so much. Since then I've started to work on "merging" the script templates to use a common {{script helper}} template, which contains common base code that all script templates can use, and offloading any differences to MediaWiki:Common.css. Once that merging is complete, the modules (or any templates for that matter) will no longer need to invoke the script templates because they'd be identical. —CodeCat 10:25, 23 July 2013 (UTC)

Etymology headings problem

By our current naming scheme, if a word has at least three languages, where at least one has at least two etymologies, and at least two only have one etymology, then the Table of Contents no longer works, as the first "Etymology 2" has the ID "Etymology_2", and the second "Etymology" also has the ID "Etymology_2". This makes it impossible to navigate to the second "Etymology" heading from the ToC, and impossible to link directly to the second Etymology heading.

Example: User:Supuhstar/Etymology_headings_problem

Supuhstar (talk) 01:00, 9 July 2013 (UTC)

Numbered pronunciations cause the same problem. Not much can be done about it, though. --Yair rand (talk) 01:06, 9 July 2013 (UTC)

It would help if the software labelled the sections in a hierarchical way, so that Etymology in the English section didn't have the same label as Etymology in the German section. —CodeCat 01:09, 9 July 2013 (UTC)

@Yair we could change the naming scheme or the code that creates IDs.

@CodeCat That sounds good. The safest one I can imagine would be, for any H3, ID="H1Name_H2Name_H3Name"

— This unsigned comment was added by Supuhstar (talk • contribs) at 21:12, 9 July 2013.

Would such a naming scheme be accessible to templates, say, {{term}} and {{senseid}}? DCDuring TALK 01:50, 9 July 2013 (UTC)

Yes, we would be able to link to #German_Etymology_2 or something like that. —CodeCat 01:53, 9 July 2013 (UTC)

I'd think that this should be reported as a bug. -- Liliana • 07:59, 9 July 2013 (UTC)
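The collision described at the top of this section, and the hierarchical naming suggested as a remedy, can be reproduced with a short Python sketch. The flat scheme below is only an approximation of the behaviour as reported in this thread (spaces become underscores, repeats of a heading get _2, _3 and so on); it is not taken from the MediaWiki source. The hierarchical variant starts at the language heading, as in the #German_Etymology_2 example, and both function names are invented here.

```python
def flat_ids(headings):
    """Approximate the flat scheme: repeats of a title get _2, _3, ..."""
    seen = {}
    ids = []
    for _level, title in headings:
        base = title.replace(" ", "_")
        seen[base] = seen.get(base, 0) + 1
        ids.append(base if seen[base] == 1 else f"{base}_{seen[base]}")
    return ids


def hierarchical_ids(headings):
    """Prefix each heading with the titles of the headings it sits under."""
    path = []  # titles of the currently open headings, outermost first
    ids = []
    for level, title in headings:
        depth = level - 2          # level-2 (language) headings are the roots
        del path[depth:]           # close headings at this depth or deeper
        path.append(title)
        ids.append("_".join(path).replace(" ", "_"))
    return ids


# Three languages: one with two etymologies, two with a single etymology.
PAGE = [
    (2, "English"), (3, "Etymology 1"), (3, "Etymology 2"),
    (2, "German"), (3, "Etymology"),
    (2, "French"), (3, "Etymology"),
]

print(flat_ids(PAGE))
print(hierarchical_ids(PAGE))
```

In the flat list the English "Etymology 2" and the French "Etymology" both come out as Etymology_2, while the hierarchical list has no duplicates for this page. Two identically named headings under the same parent would still collide and need a counter.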

Audio files not playing again

In May and June, I had problems playing audio files. Now, once again, I can't play audio files. The behavior is different in that I click the play arrow, but nothing (ever) happens. No sound, no play, no dialogue box, nothing. --EncycloPetey (talk) 23:05, 9 July 2013 (UTC)

Two little things

eo-noun's forms aren't being linked to. See familio.
Many uses of {{term}} aren't being italicized. I realize it's not very important, but someone should fix that for consistency. I keep trying to edit etymology sections that look incorrectly formatted but are actually fine. Ultimateria (talk) 05:05, 11 July 2013 (UTC)

Re #2, on my computer, all instances of term now show the word in bold, instead of italics. --EncycloPetey (talk) 05:13, 11 July 2013 (UTC)

Re first issue: That is really confusing. I think it's being caused by a change on the MediaWiki side, actually. {{#ifeq:[[:Special:Whatlinkshere/a ]]|{{raw:Special:Whatlinkshere/a }}|true|false}} is now returning false, and I don't think that used to be the case. --Yair rand (talk) 05:35, 11 July 2013 (UTC)
- Does a null edit solve issue #2? If not, can you point to a specific instance of this happening? —CodeCat 12:08, 11 July 2013 (UTC)
  - Yes, it does. Do you know why only some pages are doing this? Ultimateria (talk) 16:19, 11 July 2013 (UTC)
    - It was a mistake in some edits that was corrected soon after. But by that time, the change had already been applied to some pages and it's taking a while for the software to update them a second time. I think the software is made so that if a page was recently updated after a template/module change, it avoids updating it a second time, so it becomes low priority. —CodeCat 16:44, 11 July 2013 (UTC)
      Accelerated creation of inflected forms is not working at the moment. Is that also module related? DCDuring TALK 02:32, 12 July 2013 (UTC)
      Can you give an example entry that doesn't work the way it should? —CodeCat 12:26, 12 July 2013 (UTC)
      Last night, wherever I inserted {{en-verb}} and {{en-noun}} and previewed, the form displays were red and stayed red for at least 15 seconds and the form entry assistance didn't function. This AM they initially show red and shortly go green, as is normal. So, it seems things are fixed for now. I suppose that the problem last night could have been a transitory JS script loading issue or a delay in working through whatever corrections you(?) made to the module(s). I hope that the modules are now stable and that there is some kind of test suite so some testing can be done off-line. DCDuring TALK 13:19, 12 July 2013 (UTC)
      Changes to a module would not affect JS loading. And I haven't made any JS changes since the 6th. —CodeCat 14:01, 12 July 2013 (UTC)
      Then I suppose the incredibly long time to update changes on widely used code means that complaints on changes that are not well tested offline will dribble in for days after a deployment of bad code or a change to the MW software whose effect is not anticipated. There's a lot to be said for working on entries not so dependent on these things. DCDuring TALK 16:02, 12 July 2013 (UTC)

space missing after some commas separating context labels

Template:context no longer adds a space after the comma that separates US from other context labels. See squantum, where the context declaration looks like this (US,Nantucket dialect, possibly dated), and this diff, which demonstrates that the space is missing regardless of whether the next label has a template or not, and that UK is not similarly affected. - -sche (discuss) 17:57, 11 July 2013 (UTC)

I've noticed similarly wonky behavior before in {{ja-l}}, among other places; adding a zero-width space forces the correct behavior. I'll have a look at {{context}} and see if that helps. ‑‑ Eiríkr Útlendi │ Tala við mig 18:32, 11 July 2013 (UTC)
Fixed. The software is a bit strange when it comes to spaces. It strips them in some places but not others. In this case, it did, so I replaced it with   and that fixed it. —CodeCat 18:45, 11 July 2013 (UTC)
- after e/c: I just added the zero-width space, after trying   and not seeing any change. Do changes to modules take a while to propagate? ‑‑ Eiríkr Útlendi │ Tala við mig 18:47, 11 July 2013 (UTC)
  - Yes, it's just like with templates. When you make a change, the software will add all pages that transclude it (or at least some of them) to a queue to be updated, and it will then work through the queue over time to spread the load. If the queue is already relatively long, which it probably is with all the changes to widely-used templates and modules lately, then it may take a few days for it to do all of the pages. —CodeCat 19:02, 11 July 2013 (UTC)

It's broken now. See period#Verb. There is a number code where the comma should be. Equinox ◑ 20:34, 11 July 2013 (UTC)

Oops, fixed. —CodeCat 20:42, 11 July 2013 (UTC)

outside#Noun still exhibits the problem. Equinox ◑ 21:11, 11 July 2013 (UTC)

Probably a server lag phenomenon -- it's showing up just fine for me, roughly 10 minutes later. ‑‑ Eiríkr Útlendi │ Tala við mig 21:22, 11 July 2013 (UTC)
Yes, things like these take some time. If you want to be sure the entry is up to date, do a null edit. —CodeCat 21:43, 11 July 2013 (UTC)

Template:alternative plural of

Just found this by chance. Seems like a good candidate for a subst: template. Mglovesfun (talk) 10:40, 12 July 2013 (UTC)

Can a bot undo edits?

I made some mistakes with edits and I would like to use a bot (via pywikipedia) to undo the mistaken edits. It would need to make sure that the edit it's undoing is the right one, so it has to check the user and the edit summary before making the undo. Is there a way to do that? —CodeCat 12:42, 12 July 2013 (UTC)

I don't think pywikipedia has any built-in functions for doing that. How many pages are we talking about? I would use a combination of beautifulsoup and mechanize to check the edit summary and then send the undo. DTLHS (talk) 13:05, 12 July 2013 (UTC)

About 800 pages: Category:Context with skey. —CodeCat 13:10, 12 July 2013 (UTC)

Mass undo is in progress. DTLHS (talk) 20:37, 12 July 2013 (UTC)

Thank you for your help. —CodeCat 14:29, 16 July 2013 (UTC)
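The safety check asked for at the start of this section (only undo when the latest edit has the expected user and edit summary) is independent of how the revisions are fetched or how the undo is sent. A minimal Python sketch of just that check follows; it is not the script that was actually run. It assumes each page's latest revision is available as a dictionary with "user" and "comment" keys, and the function name, example titles and summary text are made up.

```python
def pages_safe_to_undo(latest_revisions, expected_user, expected_summary):
    """Return titles whose latest revision matches both user and summary.

    latest_revisions maps a page title to a dict describing that page's
    most recent revision, with "user" and "comment" keys (an assumption
    about how the data was fetched).
    """
    return [
        title
        for title, rev in latest_revisions.items()
        if rev.get("user") == expected_user
        and rev.get("comment") == expected_summary
    ]


# Illustrative data only; the titles and summary are placeholders.
revisions = {
    "page one": {"user": "CodeCat", "comment": "add skey"},
    "page two": {"user": "SomeoneElse", "comment": "add skey"},
    "page three": {"user": "CodeCat", "comment": "fix typo"},
}
print(pages_safe_to_undo(revisions, "CodeCat", "add skey"))  # ['page one']
```

Pages where someone else has edited since the mistaken edit fail the check and are left for manual review instead of being undone blindly.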

Script error on Category:Indo-Portuguese language

There is a script error on this page, because Module:languages states its language family is "cpp" (for "Portuguese-based creoles and pidgins"), which is not present in Module:families. I could create this family, but there is more to it than that. In particular, what would its superfamily be? Technically it would be "qfa-not" like the family "crp" (general creoles and pidgins), but at the same time this family is also a subfamily of "crp", as well as of "pt" in a sense. So what should be done here? —CodeCat 14:28, 16 July 2013 (UTC)

I see two options. One is that we rethink our classification of creoles, so that they all go into lang-specific subcats that cat into crp, and the other is that we just give up and change the cpp here to crp. —Μετάknowledge^{discuss/deeds} 15:17, 16 July 2013 (UTC)

Some languages are creoles of multiple major languages (e.g. French + Spanish), so changing cpp to crp seems like the thing to do. - -sche (discuss) 00:32, 17 July 2013 (UTC)

Unless we decide that languages can belong to more than one family... —CodeCat 00:33, 17 July 2013 (UTC)

I've taken the easy route and switched "cpp" to the standard "crp". - -sche (discuss) 07:46, 22 July 2013 (UTC)

Is it possible to make cpp work? — Ungoliant ^(Falai) 07:47, 22 July 2013 (UTC)

Is it desirable? The templates/modules that sort languages into family categories would need to be modified to allow languages to be in multiple families, since creoles are by definition based on multiple languages (e.g. Louisiana Creole is French- and Spanish-based). And, as CodeCat pointed out, we'd have to figure out how to classify each of those families (crp and pt? and qfa-not?). I see no benefit resulting from such effort. Of course, that's because I've resigned myself to the fact that our family-classification structure is coarse and based on out-of-date scholarship and exists only to segregate languages into piles of semi-manageable size (see Wiktionary talk:Families). If people were interested in, rather than opposed to, making our family-classification system fine-tuned (can we have codes for all the African subfamilies we've neglected?) and up-to-date, then categories for X- and Y-based creoles would seem useful, though not without the aforementioned difficulties. - -sche (discuss) 08:44, 22 July 2013 (UTC)

Indo-Portuguese is based only on Portuguese. Any given Portuguese-based creole will be completely different from an English- or Spanish-based creole. Considering them to be the same family is a mistake.

I now noticed that other creoles use crp or qfa-und, so you did the right thing. I don’t think this is a good system, but I’m not particularly interested in creoles so I’ll let their contributors (just MK?) worry about it. — Ungoliant ^(Falai) 09:05, 22 July 2013 (UTC)

Bug in recons

At ki-#Swahili, {{recons}} is displaying correctly but linking to ki-Proto-Bantu. Can anyone fix this? —Μετάknowledge^{discuss/deeds} 15:56, 16 July 2013 (UTC)

It's linking right for me. —Angr 19:08, 16 July 2013 (UTC)

It is for me now too, but it wasn't earlier. —CodeCat 19:12, 16 July 2013 (UTC)

This afternoon my Navigation box changed (in Monobook, the Navigation box is on the upper left bar). All these years, the item immediately above "Recent changes" was something about "Requests" (I think it was two words, but I’m not sure what the other word was). It’s where all of the hundreds of request pages are to be found, such as Telugu pages needing transliteration. What happened? —Stephen ^(Talk) 23:46, 16 July 2013 (UTC)

It's at MediaWiki:Sidebar. The last change was 3 days ago, when -sche took Random page by language off because it isn't working. —Μετάknowledge^{discuss/deeds} 23:55, 16 July 2013 (UTC)

That’s odd. requestedarticles-url|requestedarticles is the one I’m missing. Instead of "Requested articles", I have "Current events". But if I click on "Current events", there is no such project page on Wiktionary. —Stephen ^(Talk) 00:03, 17 July 2013 (UTC)

I have finally located the page: Wiktionary:Requested entries. It just doesn’t appear in my Navigation sidebar any more. —Stephen ^(Talk) 00:07, 17 July 2013 (UTC)

That is odd. I wonder if something caused it to revert to the default version? All of the code seems correct. DTLHS (talk) 00:18, 17 July 2013 (UTC)

FWIW, I don't recall ever seeing a sidebar link to "requested pages"... despite, as DTLHS points out, the fact that the code in MediaWiki:Sidebar suggests there should be one. I don't see "Wiktionary preferences", either. - -sche (discuss) 00:30, 17 July 2013 (UTC)

Do a null edit and see if it fixes itself in 24 hours. DTLHS (talk) 00:39, 17 July 2013 (UTC)

Bizarrely, I see "requested entries" and "preferences" now. Any idea what caused the disappearance (for Stephen) and appearance (for me)? - -sche (discuss) 00:51, 17 July 2013 (UTC)

Back to normal for me too. See [1]- may be related to the MediaWiki update on July 11. DTLHS (talk) 00:55, 17 July 2013 (UTC)

Hebrew display problem

Discussion moved from Wiktionary:Tea_room#.D7.A9.D7.91.D7.95.D7.A8:_Nikud.

An enquiry on Talk:שבור, which I thought I'd copy here since nobody will ever look at it:
"I'm only a beginner in my knowledge of Hebrew, but it looks like the nikud on Shin are the wrong way round. Almost all of the nikud are on the left hand side (making Sin - an 's' sound) while the pronunciation given corresponds to placing the dot on the right hand side (making Shin - a 'sh' sound). You can find this difference clearly illustrated at the bottom of this page for instance: [2]. Online translators also seem to agree with this spelling of the word. -- Taohinton (talk) 17:11, 12 July 2013 (UTC)"
-- Hyarmendacil (talk) 05:51, 17 July 2013 (UTC)

Yes, it's backward. I suspect, though, that there's more to this than meets the eye. The wikitext seems to be correct, but it's getting switched somehow. Chuck Entz (talk) 07:34, 17 July 2013 (UTC)

~~Please take a look at this topic in the Tea Room.~~ The Hebrew letter shin is getting reversed in some words- but not all- so that the dot that should be on the right is showing on the left, which makes it sin instead of shin. The letters in the wikitext are correct, but end up being displayed backwards by the template. Chuck Entz (talk) 07:43, 17 July 2013 (UTC)

See Wiktionary_talk:About_Hebrew#Nikkud_order for discussion of a similar issue. - -sche (discuss) 08:59, 17 July 2013 (UTC)

This is weird. I tried retyping it, but no effect; I do note that my Hebrew keyboard only allows you to add vowels to a shin after you've dotted it, if you so choose to dot it. It still looks fine in edit view. I bet this is another result of the crappy Hebrew fonts being foisted on us by the MW devs. —Μετάknowledge^{discuss/deeds} 17:15, 17 July 2013 (UTC)

If I copy and paste the text elsewhere, then it looks right. If I change the font spec to just “sans-serif” in my browser’s web inspector, then it looks right. I think it might be the fault of one of the fonts specified in MediaWiki:Common.css. See what happens if you remove Arial Hebrew or Arial from the list. —Michael Z. 2013-07-19 01:55 z

.Hebr
{
    font-family: SBL Hebrew, David, Narkisim, Miriam, Arial Hebrew, Arial, serif;
...

Zoëga's 'A Concise Dictionary of Old Icelandic'

Perseus has an XML of Zoëga's dictionary of Old Icelandic (= dialectal Old Norse). Is this in the public domain? (Text published 1910; Geir T. Zoëga died 1928.) If so, could we bot-add the entries? Hyarmendacil (talk) 10:00, 17 July 2013 (UTC)

It's in the public domain. Is there any way to download the entire file from Perseus? DTLHS (talk) 16:12, 17 July 2013 (UTC)

I should have said, the text may be in the public domain but a transcription of the text is not necessarily. See [3]. DTLHS (talk) 17:56, 17 July 2013 (UTC)

A full XML can be downloaded from [4] (~50MB). Some of the texts are licenced under CC-BY-SA (e.g. Beowulf) so I presume we can use those freely. However Zoëga's dictionary doesn't include any licensing info or crediting, so I'm not sure what the status is. I'm happy to email them and ask if we can use it if no-one here knows. Hyarmendacil (talk)

Deprecation of language code templates

Recently, it seems that all of the Language code templates have been marked as deprecated. They now say "This template is deprecated, and is no longer used on any Wiktionary pages. Please use Module:language utilities instead." Wouldn't it be better, though, to migrate all of the language code templates to use Module:language utilities rather than asking people to use Module:language utilities directly? It's certainly easier for me to remember {{fr}} than {{subst:#invoke:language utilities|lookup_language|fr|names}}. Kaldari (talk) 03:42, 18 July 2013 (UTC)[reply]

{{langname|fr}} -> French

– Catsidhe ^{(verba, facta)} 03:53, 18 July 2013 (UTC)[reply]

That's not subst:able the way language templates are. It pukes out {{#ifeq:fr|{{lc:fr}}|{{#invoke:language utilities|lookup_language|fr|names}}|{{langname/name|fr}}}} - -sche (discuss) 05:23, 18 July 2013 (UTC)[reply]

So {{langname}} is useful for entries, but not for templates. Is there something appropriate for template use? (Other than directly invoking the module, which is direct, but hardly simple or memorable.

Is there somewhere where this is documented, or is it scattered across the Grease Pit, Tea Parlour, and Talk pages? – Catsidhe ^{(verba, facta)} 05:27, 18 July 2013 (UTC)[reply]

FWIW, I sometimes type in Template:$LANGCODE when I run across a langcode I'm not familiar with. This is a quick-and-dirty way of learning the codes. I hope modulization doesn't break this? ‑‑ Eiríkr Útlendi │ Tala við mig 08:16, 18 July 2013 (UTC)[reply]

{{langname}} isn't meant to be used in entries. It was originally a stopgap/migration template from when language codes weren't universal yet, and some templates had to handle taking either a code or a name. Now, only {{ttbc}} and {{trreq}} (should) still use it, and that's really just for convenience. As for deprecating the language codes, some time ago I proposed changing all of the language templates so that they call Module:language utilities in much the way you suggested. They would also be able to be subst:ed, and in fact be required to, so they can't be used directly anymore (why would you need to, anyway?). {{en}} already works that way and it has for a while. But that, too, was intended only as a migration/stopgap measure, and wasn't meant to be permanent. The language templates take up some very useful short names, which we probably want to use for other purposes, like as redirects to templates like {{label}}. So I do think we should delete them when possible, but we currently can't yet as some scripts like WT:ACCEL, and probably some bots, still depend on them. —CodeCa t 08:47, 18 July 2013 (UTC)[reply]

Is any of this documented anywhere? How do we find out where? If {{langname}} is intended only for very specific (and temporary) purposes, is there any point noting this in its doc page? — Catsidhe ^{(verba, facta)} 09:47, 18 July 2013 (UTC)[reply]

Yes, probably. But where should such general documentation of templates be placed? Where would people look for it? —CodeCat 10:29, 18 July 2013 (UTC)[reply]

That's the $64,000 question, isn't it. Something in or linked from Wiktionary:Index_to_templates might be a start. Just a thought. — Catsidhe ^{(verba, facta)} 10:40, 18 July 2013 (UTC)[reply]

There's also Wiktionary:Templates, which is more about writing your own, and Help:Templates which redirects to it. —CodeCat 10:56, 18 July 2013 (UTC)[reply]

"See also" or similar links could be incorporated in {{documentation}} for items of relevance to groups of templates with large numbers of members. Even template category headers could be used, though this might require disturbing the jewel-like perfection of our header templates. Or we could effectively disable the templates outside of use in approved namespaces so that successful harmless unanticipated use does not give somebody the wrong idea about using the template in a creative, but unanticipated way. DCDuring TALK 11:17, 18 July 2013 (UTC)[reply]

There is no way to do that, unfortunately. A template or module has no way to know whether it was transcluded on the main namespace directly, or via another template or page. The {{NAMESPACE}} will always be that of the current page, no matter how many layers of transclusion are in between. So if a template were to check that the namespace is not "main", thinking that it would prevent it from being used directly in entries and only via another template, it will actually end up not working on entries even if it were transcluded through another template. —CodeCa t 11:24, 18 July 2013 (UTC)[reply]

I am quite relieved to hear that. The instinct to prohibit needs limits. Permanent technical ones are good, though a general skepticism toward prohibition is even better. DCDuring TALK 11:41, 18 July 2013 (UTC)[reply]

Good Lord, this has become complicated practically overnight! I guess the person who made that suggestion must be working in the law business. Because it's the very same thing: things get made 800% more complicated for no obvious reason whatsoever until no one can see through them anymore. Why that change? Where is the actual benefit from that? I can see none. Just - as was mentioned above - people can no longer remember things by heart (nor can I), being forced to remember these insanely long lines (or look them up). I think such decisions are sometimes made to artificially create some extra work for people who feel bored. There was absolutely no reason to fiddle with those templates. If it ain't broke, don't "fix" it. -andy 77.7.105.41 21:17, 26 July 2013 (UTC)[reply]

What is so complicated about it? —CodeCat 21:26, 26 July 2013 (UTC)[reply]

Reconstructions in undetermined languages

FYI, in addition to the entries which have shown up in Category:Pages with script errors because they use {{recons}} without specifying a language, I've noticed some entries that use {{recons|foobar|lang=und}}... which I assume is undesirable... - -sche (discuss) 08:52, 22 July 2013 (UTC)[reply]

Specifying "und" as the language is ok. But it shouldn't create a link in those cases. I've changed that now. —CodeCat 11:31, 22 July 2013 (UTC)[reply]

Surely it defeats the objective of the template; how can something be a descendant of a hypothetical term in an unknown language? Mglovesfun (talk) 22:02, 22 July 2013 (UTC)[reply]

Why wouldn't we want a link? What if a new contributor has an idea about the right language? How much work should said user have to do to get to the reconstructed term entry? DCDuring TALK 22:21, 22 July 2013 (UTC)[reply]

Well, if someone knows the actual language, they can add it, can't they? —CodeCat 22:24, 22 July 2013 (UTC)[reply]

Why deny them the use of "what links here"? Why such a bias against letting users do things and go places? DCDuring TALK 00:48, 23 July 2013 (UTC)[reply]

I don’t see a problem. If a term has descendants in enough languages so that the term can be reconstructed, but the term’s language is unknown, why not have a page for that reconstruction under the Undetermined heading? Such is the case of many Iberian Romance words which descend from “pre-Roman.” Pre-Roman, of course, isn’t a language, but a blanket term for Celtiberian, Iberian, Aquitanian, Lusitanian, Gallaecian and Tartessian (and sometimes Basque.) — Ungoliant ^(Falai) 22:40, 22 July 2013 (UTC)[reply]

But no proto languages are "real" languages. If it's in an etymology and it's not an attested language it should always have a name. DTLHS (talk) 22:52, 22 July 2013 (UTC)[reply]

Generally speaking, the links that have "und" are those where the person who added the link neglected to say what language it was in, or those for which the language isn't known because it could be any of a set. Vahag for example insists on labelling Iranian reconstructions with "und" because they are not Proto-Iranian but some later undetermined language. —CodeCa t 23:06, 22 July 2013 (UTC)[reply]

In that case couldn't we make up a code such as "ira-pro-mid"? If something is a defined linguistic unit there's no reason not to name and classify it. DTLHS (talk) 23:10, 22 July 2013 (UTC)[reply]

We had a debate like that before. What if we did the same with the Germanic languages? "Middle Germanic" would cover Middle English, Middle Dutch, Middle Low German, Middle High German, Old Norse (spoken at the same time)... you really think that we can create any meaningful entries in this "language"? —CodeCa t 23:13, 22 July 2013 (UTC)[reply]

Just because there won't be any entries doesn't mean it can't be useful for etymologies. DTLHS (talk) 23:16, 22 July 2013 (UTC)[reply]

This debate is about displaying links to such entries. —CodeCat 23:17, 22 July 2013 (UTC)[reply]

I agree with DTLHS: if something is linguistically coherent enough that we're reconstructing terms in it, it should have a code and a name. If it's not coherent enough to have a code, we should not reconstruct terms in it. (At a minimum {{recons}} indeed needs to prevent links, because "Appendix:Undetermined/foo" is firstly an ugly name and secondly liable to contain two different terms, if we happen to think that foo was, e.g., unrelatedly, a word in both post-Proto-Iranian and pre-Roman.) - -sche (discuss) 08:34, 23 July 2013 (UTC)[reply]

But is Middle Iranian really that coherent? We can reconstruct terms in it, yes, but that'd be the same as reconstructing a supposed "Middle Germanic" word "dag". It's not Proto-Germanic nor is it the same in every "Middle Germanic" language (Middle English day, Old Frisian dei, Middle High German tac, Old Norse dagr). Such reconstructions are really just abstractions from multiple sources and could be compared to something like the conlangs w:Slovianski or w:Folkspraak, they're kind of like the "average" of the languages they are based on. The difference of course is that Middle Iranian doesn't have any strict rules because it's not an attested language nor a reconstructed one. Such etymologies are more like "the word was borrowed from one of the Middle Iranian languages, we don't know which, but based on the borrowed word as well as the Middle Iranian languages we know of, it looked something like this". That hardly seems coherent enough to give a language code to. —CodeCa t 10:15, 23 July 2013 (UTC)[reply]

I am butting in because this seems like a matter of how Wiktionary supports entries that are works in progress, which, because this is a wiki, should mean, in principle, all entries. I think it is self-evident that no entry can be locked down because no contribution could improve it. Accordingly, I think we need to design all of our infrastructure to facilitate making progress on entries rather than maintaining their current state.

If someone (or the scholarly community as a whole) has narrowed the range of possibilities but does not have the ability or evidence to put a reconstructed word into a particular box, should we discard or render less accessible what knowledge we have by not having loose categories which include the range of possibilities? I think not. Almost all required template slots, at least in our most foundational templates, should make provision for the contributor not knowing how to fill the slot or simply not filling it in. If we must have the slot filled in, then an obvious way of expressing ignorance, like "?", needs to be available. If "?" or "und" is appropriate for complete ignorance, then it is probably much less appropriate for many of the cases under discussion here. If the state of knowledge of a contributor or of the community at large is less precise than our categories, it seems to me that it is the categories that have to change to accommodate the reality of our state of knowledge. The more precise and all-encompassing our categories, the more the need for categories that can handle ambiguity. DCDuring TALK 13:09, 23 July 2013 (UTC)[reply]

What is wrong with specifying "und"? This code means "undetermined language" so it's exactly what is needed when the exact language is unknown. What else do you suggest we use? Also, I'm not sure how categories fit into this as {{recons}} doesn't do any categorising. —CodeCat 13:49, 23 July 2013 (UTC)[reply]

As -sche pointed out above, there's only one und code, but a multitude of different undetermined sources. How about having hybrid unds: und-<family code>? That way, whatever partial information you have is represented, and words of similar origin are grouped together. It follows a logical pattern, so we don't have to deal with different people coming up with different ad-hoc names, yet it's quite productive and should cover most cases. Chuck Entz (talk) 14:35, 23 July 2013 (UTC)[reply]

Good idea. — Ungoliant ^(Falai) 16:11, 23 July 2013 (UTC)[reply]

@CodeCat: Whether we actually create a category for a mandatory field in a template is completely beside the point. If the field is mandatory, then we have a de facto categorization. If the purported content of the field does not actually support a system of mutually exclusive, collectively exhaustive categories, then the purported content needs to be supplemented with placeholders. We are accustomed to doing this by declaring the missing fields to be clean-up items or by marking with question marks. I don't think this is adequate for intermediate cases in which something is known, but not at the level of resolution of what is effectively our categorization system. We accommodate knowledge that goes beyond the resolution of our basic system with optional regional and other "context" tags and categorization. That has been and ought to be a completely open system periodically reviewed to allow for consolidation. Something behaviorally analogous seems appropriate for the class of items under discussion. DCDuring TALK 16:21, 23 July 2013 (UTC)[reply]

I understand that that can be useful in some cases, but I still don't really see how it relates to {{recons}} specifically. That template uses the language code for three basic purposes. The first two are to select the language and script with which to tag and format the text, but that is more or less irrelevant for your proposal. The third is to put the name of the language into the link as a prefix like "Proto-Germanic/(word)". So the question really becomes, is it useful to have an appendix of entries for terms reconstructed in "Middle Iranian", even though we know in advance that there was no such language? Again, if I go back to the analogy of "Middle Germanic", what use would a hypothetical Appendix:Middle Germanic/dag be to anyone? —CodeCa t 16:36, 23 July 2013 (UTC)[reply]

If, as seems to be the case, post-Proto-Iranian / Middle Iranian is not coherent enough that it's intelligent(?—I can't think of the right word) to have even appendix entries in it, then I don't think it's intelligent to use {{recons}} to make up words in it. Simply write: "From some Middle {{etyl|ira|hy}} language; compare {{etyl|somecode|-}} {{term|some attested term|lang=foo}}, {{etyl|anothercode|-}} {{term|some other attested term|lang=whatever}}." After all, to use your example of Middle Germanic, would you ever put "from Middle Germanic foo" in an entry? I've never seen it done; I've only seen the format I just mentioned, where the language family and then various attested possible cognates and relations are mentioned. - -sche (discuss) 17:07, 23 July 2013 (UTC)[reply]

A question for experienced Lua coders

When making headword modules like Module:nl-headword, I tried to make it with just a single entry point. That means that the templates for all parts of speech invoke the same function, and then give it a parameter to decide what part of speech to show. This works, and it works well, but I'm wondering if it really is the clearest and most effective way to do it. There are also modules where every part of speech has its own function. I didn't do Module:nl-headword that way because there would be a proliferation of essentially identical functions for many parts of speech, and even then the ones that are different still, when I wrote it, shared some of the common code. But now that Module:headword has been created to take over most of the code that is common to all headword modules, there isn't much left. So should I keep doing it this way when writing new headword modules, or would a function for each PoS be better (with a function to handle "others" that don't have their own)? —CodeCa t 16:20, 22 July 2013 (UTC)[reply]

I like the idea of having only one invoke-able function. This part:

    if poscat == "nouns" then
        genders = noun_gender(args, categories)
        inflections = noun(args, categories, genders)
    elseif poscat == "proper nouns" then
        genders = noun_gender(args, categories)
    elseif poscat == "adjectives" then
        inflections = adjective(args, categories)
    elseif poscat == "adverbs" then
        inflections = adverb(args, categories)
    end

can be replaced with something like

    export[poscat](args, categories, genders, inflections)

This way, even "show" functions will become almost identical across xx-headword modules. We would have to create a function for each PoS though. --Z 00:24, 23 July 2013 (UTC)[reply]

And that last part is really the wasteful part. If no function exists, it should just do nothing, like in the original code.

But on the other hand, parts of speech that don't need any "extras" could really just use {{head}}. Today I looked through Category:English headword-line templates and found that I could replace the code inside most of them with just a call to {{head}}. Maybe we should delete the templates that are like that, so that we only use the templates and the (to be made) module for when it's actually needed because they do something that {{head}} doesn't. In the case of English, most of the headword templates appear to be redundant. —CodeCa t 00:33, 23 July 2013 (UTC)[reply]

What about

    export[poscat or frame.args.pos](args, categories, genders, inflections)

so templates without "extra" stuff will call a single common function. Regarding the English headword-line templates, I don't think we should delete them, they are easier to type, and are faster (well, they were, before your recent changes to them) --Z 00:47, 23 July 2013 (UTC)[reply]

Speed isn't a problem with headword templates. Pages don't generally include hundreds of them, at most 10-20 which is very manageable. But I think the edits were preferable because they add section links and the other nice features that {{head}} now has. I'm not sure what your code would really be useful for. Why would there need to be a single common function that does nothing? —CodeCa t 00:54, 23 July 2013 (UTC)[reply]

Never mind, I misunderstood what you meant at the first line. --Z 01:02, 23 July 2013 (UTC)[reply]
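For reference, the table-dispatch idea from this thread can be sketched roughly as follows. This is only an illustration, not the code of Module:nl-headword: the handler bodies, the `data` table and the argument names (`g`, `pl`, `comp`) are made up for the example. The handlers sit in their own table, and a part of speech without a handler simply gets no extras, which matches the behaviour of the original if/elseif chain.

```lua
-- Hypothetical per-PoS handlers, keyed by the same strings as poscat above.
local pos_functions = {}

pos_functions["nouns"] = function(args, data)
    data.genders = { args.g or "?" }
    table.insert(data.inflections, "plural: " .. (args.pl or "?"))
end

pos_functions["proper nouns"] = function(args, data)
    data.genders = { args.g or "?" }
end

pos_functions["adjectives"] = function(args, data)
    table.insert(data.inflections, "comparative: " .. (args.comp or "?"))
end

-- Single entry point: look the handler up, and do nothing if there is none.
local function show(poscat, args)
    local data = { genders = {}, inflections = {}, categories = {} }
    local handler = pos_functions[poscat]
    if handler then
        handler(args, data)
    end
    return data
end
```

With this shape, adding a part of speech means adding one table entry, and something like show("conjunctions", {}) returns the empty defaults without needing a stub function.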

Abuse filter request

Can we prevent users and IPs from creating other people's userpages? E.g. User:Foobar can create User:Foobar, but 201.34.232.23 cannot, and Foobar can't create User:OtherExample. For some reason IPs have been creating others' userpages with spam. The guy who makes all of those global user pages could be exempt though. Ultimateria (talk) 19:51, 23 July 2013 (UTC)[reply]

User:Pathoschild, right? —Μετάknowledge^{discuss/deeds} 20:06, 23 July 2013 (UTC)[reply]

Done. (Pathoschild is an admin, so he should not be affected.) -- Liliana • 20:09, 23 July 2013 (UTC)[reply]

Suppress Category:Wiktionary pages that don't exist in certain places

Liliana has pointed out that Template:only in currently adds all entries that use it to Category:Wiktionary pages that don't exist (and sometimes also a language-specific subcategory). This is inaccurate when there are multiple language sections on the page, as on abnodate. It also complicates Special:Disambiguations. Can someone change Template:only in to allow the categorisation to be suppressed in certain cases, or to un-complicate Special:Disambiguations in some other way? Alternatively, is this not a big deal? lol - -sche (discuss) 22:41, 23 July 2013 (UTC)[reply]

Rewriting the category templates

Currently, we have a confusing mess of boiler templates for our categories, which all require parameters and can be hard to keep straight even for seasoned editors, which in turn discourages categorisation. I'm too scared of the Daniel Carrero template wasteland to find out, but I'm sure there's a way to rewrite them without breaking anything but removing all the logic so we just take the information needed from the pagetitle (that's where the Lua comes in). Then most of those templates could be merged (but bot-fixing the old ones wouldn't have to be a priority). Just an idea, if someone feels like taking it on. —Μετάknowledge^{discuss/deeds} 04:59, 24 July 2013 (UTC)[reply]

I would prefer to port the current behaviour to Lua before making any changes to it. I am not sure about using the page title to extract the needed information, because page titles might be ambiguous and they were not made to be understood by a machine. On the other hand, there is no reason we couldn't merge the templates despite that, I've been thinking about that for a while now. —CodeCa t 10:02, 24 July 2013 (UTC)[reply]

Cyrillic ital

I thought we had agreed to prevent italicisation of Cyrillic in the BP, but at som#Etymology, one of the Cyrillic etyma is italic and the other isn't! Whence comes this anomaly? —Μετάknowledge^{discuss/deeds} 23:13, 24 July 2013 (UTC)[reply]

Must be because Uzbek is by default in Roman, not Cyrillic. "сўм (soʻm)" is the spelling used before the change to Roman letters, "soʻm" is the modern spelling. --Anatoli ^{(обсудить}/^вклад) 23:25, 24 July 2013 (UTC)[reply]

I'm stuck in the Cold War, I thought Uzbeks still used Cyrillic. I suppose this is because {{term}} doesn't have script detection yet? —Μετάknowledge^{discuss/deeds} 23:31, 24 July 2013 (UTC)[reply]

Yes, {{term}} has no script detection; it's still template-based. But {{recons}} uses Lua already as a trial run for {{term}} since they both use the same code. —CodeCat 23:52, 24 July 2013 (UTC)[reply]

Just for now, I added sc=Cyrl to the relevant template call at som#Etymology so things look correct in the browser. ‑‑ Eiríkr Útlendi │ Tala við mig 00:48, 25 July 2013 (UTC)[reply]

Auto-numbering for headers

One issue that's a bit of a bother is manually numbering etymology sections in entries with multiple etymologies. This particular phenomenon is quite common in Japanese, where one term as written is quite often multiple terms as read, and each reading has its own etymology. For instance, I'm about to start work on 祖父, which has at least six different readings in the resources I have to hand.

I dimly recalled that CSS can be used to expand auto-numbering beyond just ordered lists. However, using CSS for this would require something different than just ===Etymology===, as we currently have things.

I also recall seeing that other-language Wiktionaries make use of templates in headings as well, so you'll see things like {{-flex-verb-|fr}} instead of ===Forme de verbe=== on the French WT, or === {{Wortart|Präposition|Englisch}} === instead of just === Präposition === on the German WT.

Would folks here be comfortable with exploring the possibilities of using a template in the ===Etymology=== heading itself, perhaps as ==={{heading-etym}}===, in order to supply the <span> tags and class attributes that would allow for CSS auto-numbering? This would make editing and reorganizing multi-etym entries a bit less tedious. If it works well, it could perhaps be expanded to any heading that needed auto-numbering. ‑‑ Eiríkr Útlendi │ Tala við mig 07:07, 26 July 2013 (UTC)[reply]

One downside of treating the etymology numbers as styling (by adding them in CSS) rather than content (by putting them explicitly in the wikitext) is that then we can't link to them, and probably shouldn't even refer to them by number anymore. —Ruakh_TALK 04:42, 27 July 2013 (UTC)[reply]

Bot to alphabetize translation tables

Since as far as I know KassadBot (talk • contribs) is not running its autoformat code at the moment, could someone, anyone who's capable really, alphabetize the translations in translation tables? It's come up in Wiktionary:Todo/mismatched translation codes, as when you update the language name, often you have to realphabetize afterwards. It's such a massive, tedious job, it really should be done by bot. Mglovesfun (talk) 19:22, 26 July 2013 (UTC)[reply]

Obsolete only before the retreat from Moscow?

Why does {{context|1811|lang=und}} render as (archaic, slang)? Looks like a bug to me. {{context|1810|lang=und}} or {{context|1812|lang=und}} do not render in this way. Spinning Spark 19:36, 26 July 2013 (UTC)[reply]

Template:1811. Mglovesfun (talk) 19:39, 26 July 2013 (UTC)[reply]

(e/c) Template:1811 is shorthand for Template:Classic 1811 Dictionary of the Vulgar Tongue. Ideally, all entries that use it will be updated to use {{context|obsolete|slang}} explicitly, and to include {{R:1811}} — and to use modern words in their definitions! I have been going through all 1811 Dictionary entries and doing this, hence my recent spate of RFVs, but it will take a while. - -sche (discuss) 19:41, 26 July 2013 (UTC)[reply]

Template:trans-top with embedded html in parameters

Apparently something about recent edits to this template or something it uses is causing unexpected behavior when a parameter includes html-type tags, such as <sup></sup>. See picosecond and terametre for two examples. Chuck Entz (talk) 20:51, 26 July 2013 (UTC)[reply]

It looks like it was diff. Delink (which is the same as remove_links) removes wikilinks, but it does not remove HTML code. —CodeCat 20:56, 26 July 2013 (UTC)[reply]

Ugh. Anyone know if there's a module that cleans out HTML code? --Yair rand (talk) 16:10, 28 July 2013 (UTC)[reply]

We could just remove anything between < and >; I don't think that would have any unwanted side effects. DTLHS (talk) 16:28, 28 July 2013 (UTC)[reply]
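A minimal sketch of that suggestion in plain Lua, assuming a hypothetical helper name `strip_tags` (it is not an existing function in any module mentioned here):

```lua
-- Remove everything from a "<" up to the next ">", keeping the text in between tags.
local function strip_tags(text)
    -- The parentheses drop gsub's second return value (the match count).
    return (text:gsub("<[^>]*>", ""))
end
```

One caveat with so simple a pattern: it cannot tell markup from literal angle brackets, so text such as "a < b and c > d" would lose the part between the two signs. Whether that matters for translation-table titles is a judgment call.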

Big, not bold

For some scripts, {{head}} renders the lemma in larger letters rather than in boldface. Where is this encoded? I would like to add Ogam to the list of scripts to which this exception applies. If ᚔᚅᚔᚌᚓᚅᚐ appears in Ogam for you (i.e. it's not just little boxes or question marks) take a look at the headword line of ᚔᚅᚔᚌᚓᚅᚐ and you'll see what I mean. Putting it in boldface renders each group of five little lines and the group of two diagonal lines into monolithic blocks that are no longer legible (even to people who can read Ogam). —An gr 11:07, 28 July 2013 (UTC)[reply]

It's in MediaWiki:Common.css. I see the headword on that page fine though, my browser has made the whole thing wider so that it's still readable. —CodeCat 11:40, 28 July 2013 (UTC)[reply]

Nevertheless, Ogam doesn't really lend itself to boldface. It's not as if they made the carvings on the stones thicker for emphasis back in 5th-century Ireland. What do I add to change it? Can I just copy the code that's there for, say, Mymr? —Angr 13:32, 28 July 2013 (UTC)[reply]

Yes, that would work, although you'd need to adjust the sizes. Big text is normally shown at 125% of the base size, but Burmese text is already shown bigger by default (130%), so bolded Burmese is made bigger still (125% times 130% is 162.5%). Ogham text is 125% by default, so you need to multiply that by 125% again to get the final size (156.25%). —CodeCat 14:12, 28 July 2013 (UTC)[reply]

Done. Let me know if I fucked anything up. It looks better at any rate. —Angr 14:30, 28 July 2013 (UTC)[reply]
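The size arithmetic in this thread is just two percentages multiplied together. As a small worked sketch (the function name `compound_percent` is invented for the illustration; the actual values live in the stylesheet, not in Lua):

```lua
-- Combine a script's default size with the "big headword" enlargement, both in percent.
local function compound_percent(script_default, enlargement)
    return script_default * enlargement / 100
end

local burmese = compound_percent(130, 125)  -- 162.5, the figure quoted above
local ogham   = compound_percent(125, 125)  -- 156.25
```

So a rule copied from the Mymr one would need 156.25% rather than 162.5% as its font size, assuming the 125% defaults mentioned above.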

Template:R:ang:BT

This template seems to have been set up in a way to link to specific entries from a site, I think, but it doesn't seem to work on the site currently listed, or at least I haven't been able to do it. Would it be alright for me to change the link to http://bosworth.ff.cuni.cz/, instead? I think it would be easier to link to entries from this site for references. Anglom (talk) 16:15, 28 July 2013 (UTC)[reply]

Does this need discussion? I say just do it; what's the reason not to? Mglovesfun (talk) 21:55, 31 July 2013 (UTC)[reply]

Entry name and sort key generation now in Module:languages

Two new values have been added to the module, entry_name and sort_key. They both work the same way: each should be a table containing two further tables named from and to. These give values to be replaced and/or removed in order to create an entry name or a sort key. Entry names are for removing characters and diacritics that are not included in the names of entries by common practice, like macrons for Latin. The sort key is meant to remove diacritics that are not distinguished for sorting in that language, like umlauts for German. Most likely, you do not want to directly interpret these values in your module, because there are already standard functions that do this: format_categories in Module:utilities and language_link in Module:links. If you use those standard functions, this functionality will be automatically included as appropriate. This includes standard templates such as {{head}}, {{l}} and, in the near future, {{term}}, {{t}} and the form-of templates. —CodeCat 18:53, 28 July 2013 (UTC)[reply]
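To make the from/to layout concrete, here is a rough sketch of how such tables could be applied. The data below is invented for the example and is not copied from Module:languages, and the helper names (`apply_replacements`, `make_entry_name`, `make_sort_key`) are hypothetical. It uses plain string.gsub on literal UTF-8 strings, which is enough for simple one-to-one replacements; the real functions may well work differently.

```lua
-- Hypothetical language data in the shape described above.
local languages = {
    la = {
        entry_name = {
            from = { "ā", "ē", "ī", "ō", "ū" },
            to   = { "a", "e", "i", "o", "u" },
        },
    },
    de = {
        sort_key = {
            from = { "ä", "ö", "ü" },
            to   = { "a", "o", "u" },
        },
    },
}

-- Replace each "from" item with the "to" item at the same index;
-- a missing "to" item means the "from" item is removed.
local function apply_replacements(text, rules)
    if not rules then
        return text
    end
    for i, from in ipairs(rules.from) do
        text = text:gsub(from, rules.to[i] or "")
    end
    return text
end

local function make_entry_name(lang, text)
    local data = languages[lang] or {}
    return apply_replacements(text, data.entry_name)
end

local function make_sort_key(lang, text)
    local data = languages[lang] or {}
    return apply_replacements(text, data.sort_key)
end
```

Under these assumptions, a link written as amīcus would point to the entry amicus, and Mädchen would sort as Madchen.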

New templates for categorizing entries

Automatic generation of sort keys was recently added, via format_categories in Module:utilities. But so far the only way to take advantage of that was to either write a module that used that function, or to use {{head}}. Entries sometimes also need to add categories directly, and it's also more convenient if templates can take advantage of this, because not everyone is able to write modules and we also have a lot of templates that haven't been converted to Lua yet (if they ever will). So I thought it would be nice to have a template that provides this functionality to "wikispace" as well.

I created {{categorize}} for this purpose. It can take a list of any number of categories, and applies the appropriate sort key (generated automatically according to the rules of that language) to all of them. So {{categorize|fr|French nouns}} on the page été will generate [[Category:French nouns|ete]]. If the template is used on a page outside the main or Appendix namespaces, no categories are added, so it's safe to use there and you don't need to wrap it in {{#if:{{NAMESPACE}}|| like we normally do with categories.

We also already had a template that served a somewhat similar purpose, {{catlangname}}. This template would add the language name, so {{catlangname|fr|nouns}} would add "French nouns". This template uses the same module as {{categorize}}, so you can now add more than one category in one go, which the template couldn't do originally: {{catlangname|fr|nouns|uncountable nouns|invariable nouns}} on été will add: [[Category:French nouns|ete]][[Category:French uncountable nouns|ete]][[Category:French invariable nouns|ete]]. For convenience, I also created {{topics}} which works the same way but for topical categories. {{topics|fr|physics|chemistry}} on été will add: [[Category:fr:Physics|ete]][[Category:fr:Chemistry|ete]]. These latter two templates are really just for convenience, you can do the same with {{categorize}} as well, i.e. {{categorize|fr|French nouns|French uncountable nouns|French invariable nouns}} and {{categorize|fr|fr:Physics|fr:Chemistry}}. But we can use these templates in entries instead of putting the categories in directly, so that we can take advantage of the custom sorting that these templates provide. —CodeCa t 13:06, 29 July 2013 (UTC)[reply]

Apparently, there is also {{catlangcode}}, which does the same as {{topics}}. I suppose we don't need {{topics}} then, although I do think that name is a bit clearer. What do you think? —CodeCat 13:11, 29 July 2013 (UTC)[reply]
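The behaviour described for the three templates can be sketched in plain Lua as below. This is a simplified model, not the module behind {{categorize}}: the sort rules, the language-name table and all function names are assumptions made for the illustration, and the page title and namespace are passed in explicitly instead of being read from the wiki.

```lua
-- Hypothetical data for the example.
local language_names = { fr = "French" }
local sort_rules = {
    fr = { from = { "é", "è", "ê", "à", "ç" }, to = { "e", "e", "e", "a", "c" } },
}

local function sort_key(lang, title)
    local rules = sort_rules[lang]
    if rules then
        for i, from in ipairs(rules.from) do
            title = title:gsub(from, rules.to[i] or "")
        end
    end
    return title
end

-- Model of {{categorize}}: nothing is added outside the main ("") and Appendix namespaces.
local function categorize(lang, title, namespace, categories)
    if namespace ~= "" and namespace ~= "Appendix" then
        return ""
    end
    local key = sort_key(lang, title)
    local out = {}
    for _, cat in ipairs(categories) do
        out[#out + 1] = "[[Category:" .. cat .. "|" .. key .. "]]"
    end
    return table.concat(out)
end

-- Model of {{catlangname}}: prefix each category with the language name.
local function catlangname(lang, title, namespace, categories)
    local full = {}
    for i, cat in ipairs(categories) do
        full[i] = language_names[lang] .. " " .. cat
    end
    return categorize(lang, title, namespace, full)
end

-- Model of {{topics}}: prefix with the language code and capitalise (ASCII only here).
local function topics(lang, title, namespace, categories)
    local full = {}
    for i, cat in ipairs(categories) do
        full[i] = lang .. ":" .. cat:sub(1, 1):upper() .. cat:sub(2)
    end
    return categorize(lang, title, namespace, full)
end
```

Written this way, the two convenience wrappers only build category names and leave the sort key and namespace check to the shared function, which mirrors the statement above that everything they do can also be done with {{categorize}} directly.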

AWB issue

I was going to correct *: Egyptian: {{t|arz| to Egyptian Arabic. So I started with what transcludes Template:arz, but because it's been moved to Module:languages, it has no transclusions. Mglovesfun (talk) 12:50, 30 July 2013 (UTC)[reply]

This is a difficult problem, and I'm not sure how we could fix it. A temporary modification to {{t}} would work, but I don't see a workable long-term solution for problems like this. —CodeCat 14:03, 30 July 2013 (UTC)[reply]

It's not a major problem, I had to search for "wikitext: Egyptian" instead as a second best solution. Mglovesfun (talk) 14:43, 30 July 2013 (UTC)[reply]

Subst: not working

Does anyone know why diff didn't do anything? —CodeCat 16:36, 30 July 2013 (UTC)[reply]

It works if you remove the first comment:

{{subst:sl-decl-adj-table
|

Comments are really overused in English Wiktionary templates BTW. --Z 17:04, 30 July 2013 (UTC)[reply]

part#Translations

What happened here? —Μετάknowledge^{discuss/deeds} 22:55, 30 July 2013 (UTC)[reply]

Same as above. {{jump}} uses html tags which are incompatible with Lua modules. DTLHS (talk) 22:59, 30 July 2013 (UTC)[reply]

Fixed. --Yair rand (talk) 23:10, 30 July 2013 (UTC)[reply]

Script detection for templates

The case in question is a template like {{kk-noun}}. If the headword is in Cyrillic, it should use Module:kk-translit (which is indeed the current behaviour), but if the headword is in Arabic, it should use tr=, and if the headword is in the Latin script it should suppress any transliteration efforts at all. Is this already in a module somewhere where I can directly invoke it? If not, it should be. —Μετάknowledge^{discuss/deeds} 01:21, 31 July 2013 (UTC)[reply]

It's still a bit experimental. Automatic transliteration is nice, but it's turning out to be a bit too automatic at times, showing up even in places where it shouldn't. —CodeCat 19:11, 31 July 2013 (UTC)[reply]

I don't get it... We can detect scripts, and we haven't had any problems with autotranslit in headword templates. So why not combine the two to make these templates cover every sort of situation gracefully? Or will I have to make people specify script, like the horrible mess we call {{tt-pos}}? —Μετάknowledge^{discuss/deeds} 20:28, 31 July 2013 (UTC)[reply]

There's no module to detect scripts independent of a language code. DTLHS (talk) 20:31, 31 July 2013 (UTC)[reply]

There's a function called detect_script in Module:utilities which tries to guess the script based on the language and text that it's given. So it's possible to do this, but I don't know how reliable that function really is and how many false guesses it makes. I know that it looks at the first characters of the string (excluding digits, punctuation and spaces). This means that a word like CD播放機 will be detected as Latin, and α-particle will be detected as Greek. So it's a nice heuristic that works for many cases, but it's certainly not reliable enough that we can expect it to never fail. We can make it better and more accurate (by making it look further than the beginning of the string) but that comes at the cost of speed, and speed is really important for a heavily-used function like this. —CodeCa t 21:23, 31 July 2013 (UTC)[reply]

Thanks Metaknowledge (link). :) Mglovesfun (talk) 21:31, 31 July 2013 (UTC)

After my recent change, before checking for Latin letters in "CD播放機" or Greek letters at the beginning of "α-particle", the function first checks whether letters of the language's native scripts can be found anywhere in the string (not necessarily at the beginning). Since Latn is not listed in languages.cmn.scripts and Grek is not listed in languages.en.scripts, the script of those two examples should now be detected correctly. --Z 22:03, 31 July 2013 (UTC)

But what about Japanese? That does have Latin listed as one of its scripts. —CodeCat 22:10, 31 July 2013 (UTC)

Since "Jpan" is in the first field of languages.ja.scripts and "Latn" is after that, it will be detected as "Jpan". --Z 05:31, 1 August 2013 (UTC)
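The native-script check and the ordering rule described in the last few comments can be sketched like this. It is an approximation in Python under stated assumptions, not the actual detect_script code: the `LANGUAGE_SCRIPTS` table is a made-up stand-in for the per-language script lists, and the character tests rely on Unicode character names.

```python
import unicodedata

# Hypothetical stand-in for the per-language script lists; order matters.
LANGUAGE_SCRIPTS = {
    "cmn": ["Hani"],
    "en": ["Latn"],
    "ja": ["Jpan", "Latn"],
}

# Unicode character-name prefixes counted as belonging to each script (simplified).
SCRIPT_NAME_PREFIXES = {
    "Latn": ("LATIN",),
    "Hani": ("CJK",),
    "Jpan": ("CJK", "HIRAGANA", "KATAKANA"),
}

def has_script(text, script):
    """True if any character in the text, at any position, belongs to the script."""
    prefixes = SCRIPT_NAME_PREFIXES[script]
    return any(unicodedata.name(ch, "").startswith(prefixes) for ch in text)

def detect_native_script(text, lang):
    """Return the first of the language's listed scripts found in the text."""
    for script in LANGUAGE_SCRIPTS.get(lang, []):
        if has_script(text, script):
            return script
    # No native script found; a real implementation would presumably fall
    # back to a more general guess here.
    return None
```

Under these assumptions "CD播放機" with language cmn yields Hani, and "α-particle" with language en yields Latn, because the Latin letters of "particle" are found even though they are not at the start. A Japanese string mixing Latin letters and kana yields Jpan, since Jpan is tried before Latn; a purely Latin Japanese string falls through to Latn.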

@CodeCat: The false guesses you're talking about have nothing to do with Kazakh's situation, IMO, so I think it's perfectly safe to use it. How do I call detect_script directly?

@MG: Well, it was a good enough template before we got Scribunto, but now it's just silly. We have the power to make it an excellent Luacised template, and yet we're not doing it. —Μετάknowledge^{discuss/deeds} 22:31, 31 July 2013 (UTC)

I created Module:kk-headword. It works almost the same as the template did (there is no "dot" before the transliteration), but it includes script detection and it handles transliterations in the way that you said. —CodeCat 22:48, 31 July 2013 (UTC)

I appreciate that, but because of the limits to my abilities, I would rather know how to do what I am most comfortable with, which is having the string manipulation (like script detection) be the only part in a module, and the logic be in the template. That way I can rewrite {{ky-noun}} as well, et cetera ad infinitum, instead of asking you to do it or (possibly faultily) copying your prototype. Also, the dot is kind of a standard thing among a wide range of headword-line templates at en.wikt, and it links to our transliteration standards. It's not important, but I would prefer that we keep it. —Μετάknowledge^{discuss/deeds} 23:41, 31 July 2013 (UTC)

CSS

Sorry if this is the wrong place to post this, but is there any way in CSS to say "display everything in these font families unless I say otherwise"? "body {etc}" will display most things as that, but there'll still be the occasional ugly-looking word that appears as a different font, and I can't always figure out what class it belongs to. Lunaibis (talk) 18:36, 31 July 2013 (UTC)

If you add !important before the ;, that will probably do what you want. If it still doesn't work, then your browser is probably being rebellious. —CodeCat 19:09, 31 July 2013 (UTC)

So "no" is what you're telling me. XD Lunaibis (talk) 19:25, 31 July 2013 (UTC)

No, I'm saying yes? —CodeCat 19:42, 31 July 2013 (UTC)

The joke is that I was already doing that. Well, to "body" at least. Lunaibis (talk) 19:48, 31 July 2013 (UTC)