Wiktionary:Grease pit/2008/July

July 2008

Minor AF changes

User Panda10 (new sysop!) noted that AF (at tanti) did not understand

==Hungarian

which should be interpretable. I improved the header parsing a bit, and arranged to make AF save this edit.

Note that the way AF works is by reading the entry, setting the edit summary to blank, and then parsing, adding phrases to the edit summary for each non-"minor" change. At the end, if the edit summary is nonblank, it saves the edit. This keeps it from endlessly fiddling with single blank lines and such.

I've changed it to add to the edit summary, thus saving, in three additional cases: (numbers are entries in 13.6.8 XML)

  • when a header is reformatted (as above and other cases) (5145 plus some not caught in prescreen)
  • when a category link is canonicalized ("category" to "Category", spacing) (6148)
  • when multiple blank lines, enough to affect rendering, are removed (652)

None of these (except the improved header parsing for a few cases) changes what AF does in an edit; they just cause more edits to be saved. Robert Ullmann 14:53, 3 July 2008 (UTC)
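
For readers who want to see the save logic described above in code form, here is a minimal sketch in Python. It is not AF's actual source: the function name `autoformat`, the regular expressions, the summary phrases, and the blank-line threshold are all assumptions made for illustration. It shows only the idea that minor changes never trigger a save on their own, while the header, category and blank-line cases add a phrase to the edit summary.

```python
import re

# Hypothetical header pattern: 2-6 opening '=' and any number of closing '='.
HEADER = re.compile(r"^(={2,6})\s*([^=]+?)\s*(=*)\s*$")

def autoformat(text):
    """Return (new_text, summary); an empty summary means 'do not save'."""
    summary = []  # starts blank; only non-minor changes add to it

    # Minor change: trailing whitespace. Never enough to justify a save.
    lines = [line.rstrip() for line in text.split("\n")]

    # Non-minor: repair headers such as '==Hungarian' (missing closing '==').
    for i, line in enumerate(lines):
        m = HEADER.match(line)
        if m:
            fixed = m.group(1) + m.group(2) + m.group(1)
            if fixed != line:
                lines[i] = fixed
                summary.append("header reformatted")
    new = "\n".join(lines)

    # Non-minor: canonicalize category links ('category' to 'Category', spacing).
    canon = re.sub(r"\[\[\s*category\s*:\s*", "[[Category:", new,
                   flags=re.IGNORECASE)
    if canon != new:
        new = canon
        summary.append("category link canonicalized")

    # Non-minor: collapse runs of two or more blank lines (assumed threshold).
    squeezed = re.sub(r"\n{3,}", "\n\n", new)
    if squeezed != new:
        new = squeezed
        summary.append("blank lines removed")

    if not summary:
        return text, ""  # only minor changes: leave the entry alone
    return new, "; ".join(dict.fromkeys(summary))
```

Because the summary stays empty for whitespace-only differences, a loop over entries that saves only when the summary is nonblank will not keep fiddling with single blank lines.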

Wiktionary:Wikitext style

Update and write-through of Connel's doc from two years ago. Please read. Robert Ullmann 15:29, 3 July 2008 (UTC)

Subcategory listings

I see that sometime in the past 24 hours, a change has been made in the way subcategories are listed under categories. A parenthetic number after the subcategory name lists the number of sub-subcategories (a nice feature, I suppose). However, the subcategory name also is bolded. This is a problem. It is a problem because some of our Asian-language categories include CJKV characters in the category name, and the names of these categories become unreadable when bolded. Can someone undo this change for us? --EncycloPetey 03:47, 4 July 2008 (UTC)

We can add:
.CategoryTreeLabel { font-weight:normal; }

if wanted. There are a number of nice features; try mousing over the displayed number. It would be better if the displayed number was the number of entries, not sub-cats. Which cat names are a problem? There are a couple in Japanese, but other than that where? We do try to keep category names in English, right? (;-) Robert Ullmann 18:05, 4 July 2008 (UTC)

There are some in Korean as well (Category:Korean adjectives ending in 하다) and there might be some in Chinese. We do try to keep category names primarily in English, but when categories are for words derived from certain roots, or words with a particular inflectional pattern, or words containing a given suffix, etc., then it is not unusual for part of the category name to include said root, a sample word or ending, said suffix, etc. --EncycloPetey 01:53, 6 July 2008 (UTC)

850,000

Entry is taus by Jyril Robert Ullmann 17:29, 6 July 2008 (UTC)

Yeah! Thank you for keeping track of these; I've never gotten the hang of doing so. --EncycloPetey 20:29, 6 July 2008 (UTC)

name and order of scripts in Serbian translations

Serbian is written using two scripts, Cyrillic and Latin/Roman. In translations sections, the format is to have the language name "Serbian" followed by the translation in one script and then the translation in the other script, each on a separate line.

However, there appears to be no standard regarding the order the scripts appear in, although Cyrillic first is more common. Nor does there appear to be convention regarding what the non-Cyrillic script is labelled as, for example in the entry vermilion, the Serbian translation of the colour is labelled "Latin" for the noun sense and "Roman" for the adjective sense.

For the purposes of standardisation, I would like to propose that we agree on one order and label and gradually convert all entries to match it by adding to AutoFormat's existing translation table sorting code (I have not asked Robert Ullmann about this, but I presume it would be a simple addition to the code).

Regarding the order, I believe that Cyrillic should appear first both as this is already the most common, and also because it is alphabetically before either Latin or Roman.

I do not have any preference regarding whether it should be Latin or Roman, nor do I have any feel for which is the most common. I do though believe we should pick one and stick to it to avoid situations like at vermilion. Thryduulf 14:30, 8 July 2008 (UTC)

I believe we already had a conversation about naming the non-Cyrillic script, and decided on "Roman" to avoid confusion with the language Latin. All the Serbian inflection templates use "Roman" for the non-Cyrillic script, and Dijan is consistently using "Roman" these days. I expect that the uses of "Latin" for Serbian translations are remnants of an earlier time. --EncycloPetey 17:55, 8 July 2008 (UTC)
Yes, IIRC I was asking then about "Latin/Roman spelling" as a header. Roman is a better choice, even given the script is named Latin (e.g. ISO 15924 Latn). The larger question here is why Serbian is using two (three) lines for this; Japanese has three scripts, and the Chinese languages 2-3 (sim/trad/pin), and we (usually) show them on one line.
And yes, I can add some rules to AF; but might be simpler to 'bot any existing ones with 2 regex rules (one to sub/one to sort). Are we seeing new ones, e.g. new "errors"? Robert Ullmann 16:46, 9 July 2008 (UTC)
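
A rough sketch of the "two regex rules" idea (one to substitute the label, one to sort) might look like the following Python. It is only an illustration under stated assumptions: the sub-line syntax (`*:` or `**` followed by the script label) and the function name `fix_serbian` are hypothetical, and a real bot would also need page fetching and saving, which are left out.

```python
import re

# Assumed layout: '* Serbian:' followed by sub-lines like '*: Cyrillic: ...'.
LABEL = re.compile(r"^(\*[:*]\s*)Latin(\s*:)")              # rule 1: substitute
SUBLINE = re.compile(r"^\*[:*]\s*(Cyrillic|Roman)\s*:")     # rule 2: sort key
SERBIAN = re.compile(r"^\*\s*Serbian\s*:")

def fix_serbian(lines):
    """Relabel 'Latin' as 'Roman' under Serbian and put Cyrillic first."""
    out = []
    i = 0
    while i < len(lines):
        out.append(lines[i])
        if SERBIAN.match(lines[i]):
            j = i + 1
            block = []
            while j < len(lines):
                relabelled = LABEL.sub(r"\1Roman\2", lines[j])
                if not SUBLINE.match(relabelled):
                    break  # end of the Serbian sub-lines
                block.append(relabelled)
                j += 1
            # 'Cyrillic' sorts before 'Roman' alphabetically.
            block.sort(key=lambda s: SUBLINE.match(s).group(1))
            out.extend(block)
            i = j
        else:
            i += 1
    return out
```

The relabelling is applied only to sub-lines under a Serbian heading line, so a top-level line for the Latin language is left alone.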
Yes, the excessive three-line translation is the larger question. When I'm in the mood to clean up translation tables, I reformat Serbian translations to be on one line, e.g.:
* Serbian: {{t|sr|sc=Cyrl|нешто|n}}, {{t|sr|nešto|n}}
It's one of the rare situations where I include a script without a transliteration in {{t}}. It reads pretty clear, no? Rod (A. Smith) 18:44, 9 July 2008 (UTC)
But wouldn't it look silly for Serbian translation lines which list 2-3 translations of the English meaning to consecutively be spelled in alternating scripts? Perhaps modifying {infl}'s logic to support linking Serbian Latin transliterations (like you'd be writing {{t|sr|нешто|tr=[[nešto#Serbian|nešto]]}})? --Ivan Štambuk 19:47, 9 July 2008 (UTC)
But the problem is that they're not transliterations; they are actually alternative script spellings. If we put one of the two script forms into parentheses, it will look like a value judgement on our part as to which is the "standard" in Serbian. --EncycloPetey 20:42, 9 July 2008 (UTC)
Well, to me putting them all in one line looks like needless clutter and significantly cuts down the potential space of listing more than one actual translation for a particular meaning. Since there are actually lots of the world's languages that are one way or another (de facto or de iure) written in multiple scripts (usually Latin/Cyrillic in ex-USSR dominated areas, or Latin/Arabic with significant cultural Islamic influence), maybe this idea should be given a little more thought outside the confines of ideograms/syllabaries-based scripts where space is less of a practical consideration. Out of billions of pointless and purposeful List ofs on WP, no one apparently seems to have bothered to compile one for languages written in multiple scripts, but wild-guessing the relatively active alphabetic languages on WT this one-line rule would affect Azeri (Latin+Cyrillic), Aramaic (Hebrew+Syriac) and Old Church Slavonic (Early Cyrillic+Glagolitic).
OTOH, I see Japanese translation on house formatted as:
  • Japanese: 家 (いえ, ié)
Isn't this also giving preference to one script over another? --Ivan Štambuk 06:06, 10 July 2008 (UTC)
Not really, since Japanese writing is actually more complicated than that, often with a mix of the two forms of script simultaneously. Only the old words have a short kanji form, and this form is usually preferred as it takes less space, is older, etc. The expanded hiragana and katakana are used for inflectional endings, for words that are rare, and so have unfamiliar kanji to most people, for borrowed words, etc. In other words, this isn't analogous to the kind of situation we're discussing, where there are two complete writing systems, each of which can be used without the other. Japanese requires both systems, often in the same sentence. --EncycloPetey 06:44, 10 July 2008 (UTC)
I think it's just a clarity issue. Having everything on one line (thus effectively removing the script titles) could potentially lead to confusion among readers who aren't familiar with the scripts used by the particular language, not to mention sacrificing quality and clarity for a few lines of extra space. A language like Kazakh, for example, is written in three different scripts (Cyrillic, Latin, and Arabic). Taking into account that sometimes the entries include one or more synonyms, the Kazakh entry could see six or more words in the same translation box completely unseparated and unexplained other than the vague "Kazakh" at the start of the line. For (a hypothetical) example:
  • Kazakh: тілі, tili, تىلى, теңізі, teñizi, تەڭىزى; or
  • Kazakh: тілі, теңізі, tili, teñizi, تىلى, تەڭىزى
Either way, it's not a pretty picture. There are currently four words per script for the English word "bicycle" under the Serbian entry. Without the script titles, it would go from looking like this:
  • Serbian:
    Cyrillic: бицикл m, точак m, двoточак m, dvokolka f [this last one hasn't been transcribed in Cyrillic for some reason...]
    Roman: bicikl m, točak m, dvotočak m, dvokolka f
to looking like this:
You don't get the full effect of this until you see it with the rest of the translations, but it's enough to want to make you stop reading the apparently endless line of words. --334a 02:48, 12 July 2008 (UTC)
I agree that the long lists of words are off-putting, so I would go the other way to some people here and use the multi-line format for other languages that use more than one script. We're not space limited, so I don't see why we should favour concision over clarity. Thryduulf 12:01, 13 July 2008 (UTC)
It should be done like traditional/simplified Chinese:
DAVilla 10:06, 15 July 2008 (UTC)
Why? I'd say that is actually the worst method suggested here, because it makes it difficult to determine easily where one word ends and the other begins, particularly with unfamiliar scripts. When one link is red and the other blue, yes it can be distinguished, but this doesn't help when both links are the same colour, nor for people who are colour blind, or use alternate colour schemes - on my small XDA screen, it isn't always easy to distinguish between the purple visited link colour and blue existing unvisited links.
Additionally, it doesn't address the comments made by 334a above regarding apparently endless lines of words; it doesn't help the person unfamiliar with the language know which script is which, or indeed make it clear they are in different scripts (which isn't always going to be obvious), or what other difference there is.
I would also not be surprised if it made life more difficult for people using screen readers (although I don't have any evidence of this). Thryduulf 16:03, 17 July 2008 (UTC)
It does address what 334a says, because it is exactly the example s/he gave. Not duplicating information like gender avoids the cluttered look. I would counter that, in contrast, a problem with long lists exists for the split-line version in that the correlations get out of sync, not necessarily wrapping at the same points.
Putting spaces between the slashes, as suggested below, should help a lot. I also like commas except that it overloads that divisor. An example with context (such as regional use) should clearly indicate that the context applies to both forms. DAVilla 06:37, 19 July 2008 (UTC)
I'm not crazy about slashes in general, but I don't think that these should be specific problems. The slash is familiar punctuation in English and other Latin-alphabet languages, it always marks a word boundary, and it works fine with Cyrillic. I don't think locating word endings would be a problem for readers (and I have never heard that screen readers are stymied by slashes either).
Alternatives could be commas and semicolons, interpuncts, dashes, vertical bars, or spaced slashes (but the latter usually represent line breaks):
I also don't mind using brackets. We can make the Latin-alphabet version primary, which is familiar to Wiktionary readers—this would also help differentiate it from transliteration:
 Michael Z. 2008-07-17 21:00 z
The problem with brackets is that it implies the bracketed script is inferior to the non-bracketed one. While for some languages this might be the case, I understand that at least in Serbian neither Latin nor Cyrillic is primary and to say otherwise would not be NPOV.
Slashes usually represent breaks between words in English and presumably at least some other languages that use the Latin script. Is it universal? My uninformed guess is that \ would be a more likely character in right to left alphabetic scripts. Ideographic scripts are not unlikely to use something different. Thryduulf 22:00, 17 July 2008 (UTC)
I concede that brackets do imply some kind of primacy for the first term, and we should be very careful where there may be a national aspect. I would still suggest presenting the Latin first, because it is familiar to en.wiktionary readers, and because the reverse order may imply that it is transliteration rather than an independent orthography. (One of them has to come first, whether on the same line or not.)
We should stick to English punctuation in en.wiktionary, if possible. The unspaced slash represents and/or, but is usually best avoided or expanded into the more graceful A or B, or both. Many readers and writers find it awkward, so they replace it with the spaced slash, which is properly only used to represent line breaks in poetry, lyrics, etc. The backslash (\) doesn't belong in English writing or typography.
Having the two on the same line, with a single gender notation, also helps relate the corresponding spellings—so that in the example it is obvious that dvokolka stands alone. I still think that the comma-semicolon scheme is understandable, unambiguous, and unobtrusive. I'll repeat it, with the Latin first: Michael Z. 2008-07-18 02:25 z

[de-indenting] I think all of these approaches are fine, in that someone looking at these translations will be someone who knows enough of the language to make use of them. I'm not terribly concerned about the reader who can't even recognize the various scripts of the language (s)he's supposedly translating into — but if I were, I think I'd prefer one of the systems that puts the scripts on one line, so that it's obvious which script-A spelling goes with which script-B spelling. —RuakhTALK 00:49, 18 July 2008 (UTC)

Is there a conclusion to this? I must admit that I also think three lines is too much and the different scripts don't necessarily stay together. So I'd also prefer the format with a / in between. If , and ; must be used, then I'd prefer to see
* Serbian: bicikl; бицикл m, točak; точак m, dvotočak; двoточак m, dvokolka f
since the comma is already used to separate translations. But I'd rather see / used, spaced or otherwise. --Polyglot 00:08, 19 December 2008 (UTC)
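
To make the one-line proposals above easier to compare, here is a small Python sketch that pairs the two script forms position by position and states the gender only once per pair. The function name `one_line`, its parameters, and the assumption that the two lists are already aligned are all hypothetical; nothing here reflects an agreed format.

```python
def one_line(language, cyrillic, roman, sep=" / "):
    """Build a single translation line from two aligned lists of (word, gender).

    Assumes cyrillic[i] and roman[i] are spellings of the same word.
    If both spellings are identical, the word is shown only once.
    """
    if len(cyrillic) != len(roman):
        raise ValueError("the two script lists must have the same length")
    parts = []
    for (cyr, gender), (rom, _) in zip(cyrillic, roman):
        forms = cyr if cyr == rom else f"{cyr}{sep}{rom}"
        parts.append(f"{forms} {gender}")
    return f"* {language}: " + ", ".join(parts)
```

Passing `sep="; "` with the Roman list first gives the semicolon variant, so the separators under discussion can be tried side by side on the same data.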

Special:Recent changes

When you add a word that has templates within it, "Recent changes" shows an expansion of the template with all the span and other junk. Am I the only one who finds this annoying? You can't see the wood for the trees. SemperBlotto 08:52, 9 July 2008 (UTC)

Yes, just noticed this. It is annoying. (I was about to complain about you subst'ing the it-noun template when I figured it out ;-) I wish we had a much better idea (in general, not just this time) what "they" are fixing or think they're fixing. Robert Ullmann 11:28, 9 July 2008 (UTC)
Or even who "they" are- who controls things like this? Nadando 02:51, 12 July 2008 (UTC)
Seems to have been fixed. Never showed up in Bugzilla to my knowledge. Robert Ullmann 10:13, 12 July 2008 (UTC)

rename&move Special:Random page ?

Does the "random page" link only choose random entries/words? If so, then how about renaming the "Random page" text on the side bar to "Random word" (a la Random article, Random book). Also, it seems to get lost in the middle of all those links in the list too... maybe move to the bottom like in 'pedia and 'books? i hope this is the right spot to write this. cheers. 116.240.241.38 12:17, 9 July 2008 (UTC)

Seems reasonable to me to change its text if it only yields entries. (It's MediaWiki:Randompage, by the way.)—msh210 23:02, 9 July 2008 (UTC)
Done by DAVilla. —RuakhTALK 00:43, 18 July 2008 (UTC)

new colour panel template

hi, i've made a template to help standardise the 'display of colours' in articles for colours. it is Template:Colour panel. the usage instructions are described on that page, and it outputs something like:

vermilion colour:    

it basically creates the panel that was in most of the colours round the traps into a template. sorry but i have no idea where to 'let it be known' that i have done this, so feel free to move this message there, wherever it may be. i don't have time to put it into every single colour (though i did do about 4)...
...but that's why wikis are so-o-o-o good! because now everyone can help everyone else by doing it whenever they feel like it :)

cheerio, 116.240.241.38 12:56, 9 July 2008 (UTC)

Copied to WT:GP. Very nice. Good to move bits of obscure HTML out of the entries into a common template. Thanks. I moved it to {{colour panel}} keeping the redirect. Robert Ullmann 17:00, 9 July 2008 (UTC)
Redirection in place from template:color panel for the benefit of those of us who know how to spell.  ;-) msh210 17:29, 9 July 2008 (UTC)
bloody Yanks Robert Ullmann 22:36, 9 July 2008 (UTC)
hi, thanks for that (lowercase-ification). i've fixed all the existing references to the template in word entries to lower case (just because i am pedantic). p.s. wow, it doesn't take long for the "u"-seless spellers among us to spread their poison, eh?  ;-)  Wanderlust (the artist formerly known as 116.240.241.38) 05:39, 10 July 2008 (UTC)
I should point out that I am originally from Boston (Massachusetts). About "Yankee": to someone outside the US, a Yankee is someone from the US; in the US, it is someone from the North (north of Mason Dixon line); to someone in the North it is someone from New England; to someone from N.E., it is someone from Maine; to a Mainiac, it is a nuclear power plant.
The only people who call themselves Yankees are these guys. Robert Ullmann 11:08, 10 July 2008 (UTC)

default script

Can template:infl and template:term be modified to have a default script based on {{{lang}}}? E.g., in the "else" part of {{#if:{{{sc|}}}|...}} for template:infl, have something like {{#switch:{{{2|}}}|he|yi=Hebr|ru|uk=Cyrl|...}}, etc.—msh210 17:45, 9 July 2008 (UTC)

I hope so. At least for some languages, the editor's task in using these templates is redundant and invites error and inconsistency. I don't imagine there are many exceptions where we wouldn't want a language to use its default script.
Thanks. Michael Z. 2008-07-09 18:49 z
There's (currently unused) {{lang2sc}} that was thought for the purpose of providing the default sc= on lang= input, but I think that first the behaviour of {infl} should be discussed as I believe some users have been unsatisfied for un-bolding the Cyrillic headwords (which is what {infl} does with sc=Cyrl). --Ivan Štambuk 19:40, 9 July 2008 (UTC)
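
The lookup being asked for can be sketched outside of template syntax as follows. This is Python, not wikitext, and it is only a model of the intended behaviour: the function name `script_for`, the four-entry table, and the fallback to `Latn` are assumptions for illustration, not the contents of {{lang2sc}} or any existing template.

```python
# Hypothetical default-script table, mirroring the #switch cases above.
DEFAULT_SCRIPT = {
    "he": "Hebr",
    "yi": "Hebr",
    "ru": "Cyrl",
    "uk": "Cyrl",
}

def script_for(lang, sc=None):
    """Return the script code to use for a language.

    An explicit sc= always wins (the 'then' branch of the #if);
    otherwise fall back to the per-language default (the 'else' branch).
    """
    if sc:
        return sc
    return DEFAULT_SCRIPT.get(lang, "Latn")  # assumed fallback
```

Keeping the explicit parameter as an override matters for languages written in more than one script, where no single default would be right for every entry.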

favicon

Hey guys. I know this is like ringing up the TV broadcaster to turn the volume down at their end, but can we have a different favicon for Wiktionary? I use firefox with "faviconize tabs" to save space on the tab-bar, and I normally have at least four wikimedia sites open. It's not a big deal by any stretch, just thought it'd be a good idea. :)

This isn't something that can be changed on-wiki; it needs to be done by the developers. It seems like Wiktionary:Beer parlour is a better place to discuss this (and I seem to recall it coming up before). Mike Dillon 02:49, 11 July 2008 (UTC)
Ah. Sorry about that. Cheers for the info
In my humble opinion, Wikipedia should change theirs to the globe. Ours matches our logo :p. Conrad.Irwin 20:01, 15 July 2008 (UTC)

User renames

I'm clearing some of the backlog of user rename requests, many of which are requesting single unified login, but a large number are from unregistered users. Clearly it is not possible to rename a non-existent user (the wiki software rejects the rename request), so what is to be done with these? Is there a procedure here, or do we just contact the users on whatever other wiki they are registered (typically Wikipedia) and tell them just to set up the name they want? Some are usurpation requests, but those are a slightly different matter. — Paul G 10:41, 11 July 2008 (UTC)

Tell the user they are confused. If there is no existing en.wikt user they need do nothing here. Just add that to the CHU page. I suspect that there are confused users asking for usurpation in all sorts of wrong places ... Robert Ullmann 10:02, 12 July 2008 (UTC)

FYI everyone: I am in Kigali, and a bit less accessible. Will be around every day or two though. Robert Ullmann 10:02, 12 July 2008 (UTC)

en-noun

At cargo I added "|-" to add an uncountable use to {{en-noun|'''[[cargos]]''' or '''[[cargoes]]'''|-}} but it isn't showing uncountable, apparently because there are links in the first parameter. I thought this worked. Am I looking at it wrong? RJFJR 16:03, 15 July 2008 (UTC)

obscure case ... you (or someone) is trying too hard with the plural note, instead of using the options: {{en-noun|s|pl2=cargoes|-}} Robert Ullmann 16:56, 15 July 2008 (UTC)
Thank you. I just copied the existing plural into en-noun and added uncountable. I need to sit down and really learn the template options like pl2. RJFJR 17:54, 15 July 2008 (UTC)

Bot-assisted addition of Middle Chinese readings of hanzi

Hello, it's been getting a bit slow adding Middle Chinese (specifically Tang Dynasty-era) readings of hanzi (Chinese characters), and I wonder if this could be done assisted by a bot, the way a lot of the CJK data was originally done when the info was dumped into Wiktionary years ago.

The raw data of the Tang pronunciations may be found here:

http://homepages.mcs.vuw.ac.nz/~ray/Chinese/UnihanTang.htm

Two important notes:

1. I just heard via email from the CJK specialist at the Unihan Database that the asterisks that appear next to many of the readings do not (as I had believed, according to standard historical linguistics/phonology practice) indicate that the pronunciations are reconstructed--as apparently they are all reconstructed. Instead, they are used by Stimson to indicate that "a word or morpheme represented in toto or in part by the graph appears more than four times in the 700 poems [of the full Tang corpus analyzed]." The Tang pronunciations are derived from or consistent with T'ang Poetic Vocabulary by Hugh M. Stimson, Far Eastern Publications, Yale University, 1976.[1]

2. The pronunciations, unlike pinyin or Cantonese romanizations, include various IPA symbols such as ɑ (which is not the same as a) and ɛ.

Huge thanks for this; I think that now that we've found the raw data, it shouldn't be too hard for a bot to do a dump of the data into our individual hanzi entries. However, I don't have any idea how to do that.

The bot could be set to overwrite all the entries I've done by hand (a couple hundred at most, I think). I estimate that the raw data list presented above comprises about 4,000 hanzi.

24.29.238.60 19:35, 15 July 2008 (UTC)
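
As a starting point for anyone who wants to pick this up, here is a hedged Python sketch of the parsing half of such a bot. It assumes the readings have first been converted to the tab-separated layout used by the Unihan data files (`U+XXXX`, field name `kTang`, value); the format of the page linked above has not been checked, the function name `parse_tang` is hypothetical, and the sample values in the usage note are purely illustrative. Writing to entries is deliberately left out.

```python
def parse_tang(lines):
    """Map each character to a list of (reading, frequent) pairs.

    Per the note above, a leading '*' marks a reading that occurs more
    than four times in the analysed corpus; it is not part of the
    pronunciation, so it is stripped and kept as a flag instead.
    """
    readings = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split("\t")
        if len(fields) != 3 or fields[1] != "kTang":
            continue  # not a Tang reading line
        codepoint, _, value = fields
        char = chr(int(codepoint[2:], 16))  # 'U+4E00' -> the character itself
        entries = []
        for token in value.split():
            frequent = token.startswith("*")
            entries.append((token.lstrip("*"), frequent))
        readings[char] = entries
    return readings
```

For example, `parse_tang(["U+4E00\tkTang\t*qit"])` would give one entry for U+4E00 with the reading `qit` flagged as frequent. Since the values are handled as ordinary Unicode strings, IPA symbols such as ɑ and ɛ pass through unchanged.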

Large Arabic entry titles

I just saw this entry at hu:Wikt. The Arabic entry title is very large and easily readable. Is there any possibility that we could similarly enlarge the size of our Arabic entry titles? It would make them easier to read. 24.29.238.60 21:07, 15 July 2008 (UTC)

On further examination, it seems that hu:Wikt has all their entry titles large, as this one. 24.29.238.60 21:08, 15 July 2008 (UTC)

See this discussion as well. Nadando 04:49, 16 July 2008 (UTC)

Thanks; that seems to be a separate issue, not directly addressing the size of the entry title. 24.29.238.60 07:21, 16 July 2008 (UTC)

new context template

I've moved the version I've been testing and working on for a month+ into place; it fixes a number of things.

  • Does the sub-categorization of context labels as presently set up by default when possible.
  • Does not cat the templates themselves in content categories (user presentation).
  • Fixes missing skey for regional cats.
  • Adds cat= for a fixed category, not modified by language name or code prefix or region. (topcat= has been used for this, sometimes causing odd categories to appear)
  • Does not generate trailing spaces from qualifiers, or spaces before explicit commas. So {{context|frequently|lang=und}} is (frequently), and {{context|frequently|,|something|lang=und}} is (frequently, something) as it should be.
  • Categories appear in the order specified (they had been inverted by the recursion).

All of the tests in the last weeks have worked; tell me if you observe any anomalies. Robert Ullmann 06:50, 19 July 2008 (UTC)
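
The spacing rule in the list above (no trailing space after a qualifier, no space before an explicit comma) can be modelled with a short Python sketch. This is not how the template is implemented: the function name `render_context`, the small `QUALIFIERS` set, and the rule that ordinary labels are joined with an automatic comma are assumptions made only to illustrate the behaviour.

```python
# Hypothetical subset of qualifier labels; a qualifier modifies the label
# that follows it, so no comma is inserted after it automatically.
QUALIFIERS = {"frequently", "chiefly", "usually"}

def render_context(*labels):
    """Join context labels into a parenthesised string."""
    out = ""
    prev = None
    for label in labels:
        if label == ",":
            out += ","                 # attaches directly: no space before it
        elif prev is None:
            out += label               # first label: nothing to separate
        elif prev == "," or prev in QUALIFIERS:
            out += " " + label         # a single space, never a second comma
        else:
            out += ", " + label        # assumed automatic separator
        prev = label
    return f"({out})"
```

Because separators are only ever added in front of a label, a qualifier at the end of the list cannot leave a trailing space behind.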

deitalicizing nonroman scripts in wikipedia template

Whenever we add script calls such as {{Arab}}, {{fa-Arab}}, {{Thai}}, and {{Khmr}} to header templates or etymology templates, they are set up so that they are not italicized or bold. It would be great if someone could figure out how to add the same feature to the {{wikipedia}} templates ({{wikipedia|lang=ar}}, etc.). As it stands, these callouts are often illegible (see for example محمد بن عبد الله). —Stephen 20:25, 19 July 2008 (UTC)

I set it to accept a {{{sc}}} parameter, default Latn. This isn't the best approach, firstly because it doesn't remove the bolding, only the italicizing, and secondly because I'm not sure whether {{Latn}} should really be italicizing its argument; but it was probably the simplest. (In general I think our approach to script templates needs review, but this issue doesn't seem like the best leaping-off point for that.) You can see the new behavior at [[محمد بن عبد الله]]. —RuakhTALK 01:25, 20 July 2008 (UTC)
Thanks, it looks a hundred percent better. The bolding isn’t a very big problem like the italics were. —Stephen 01:29, 20 July 2008 (UTC)
Glad to hear it. :-) —RuakhTALK 02:32, 20 July 2008 (UTC)

Bot requests

Is there a section other than the Grease pit where bot requests may be submitted? Thank you, 24.29.228.33 07:21, 20 July 2008 (UTC)

Well, you could ask a specific user whom you know to be good with bots … but why? Why wouldn't you want to submit your request here? —RuakhTALK 13:55, 20 July 2008 (UTC)

Thank you, I did submit my request here, at Wiktionary:Grease_pit#Bot-assisted_addition_of_Middle_Chinese_readings_of_hanzi, with no response. 24.29.228.33 05:06, 21 July 2008 (UTC)

Oh, I see. That probably means that no one felt they had the knowledge necessary to do it. :-/ —RuakhTALK 22:15, 21 July 2008 (UTC)
I, too, have requested bots. Although I received response that the bots would be no problem, no bots were forthcoming. Amina (sack36) 12:52, 24 July 2008 (UTC)

Thank you for sharing your experience. I suppose, like Wikipedia, Wiktionary is a volunteer project and one must expect that individual editors would only do what they feel they're skilled at, or what they enjoy doing. However, at least a response and some projection of how long the task would take from the editors skilled at using bots (which are in fact used on a regular basis here) would be so greatly appreciated. I would propose a "Bot requests" page comparable to that at Wikipedia, where editors skilled in bot operations would evaluate and take care of bot requests in a timely manner, instead of letting them slip through the cracks, which seems to be the case currently. 24.29.228.33 05:28, 25 July 2008 (UTC)

If you got an account people might be more inclined to consider your requests. Nadando 05:39, 25 July 2008 (UTC)

Thank you for your kind and welcoming suggestion. However, you did not actually address the comment, which would be even more welcome. 24.29.228.33 00:43, 26 July 2008 (UTC)

The Beer Parlor would be an alternate (and perhaps better) place to start a discussion concerning bot status. Once some discussion has taken place a vote needs to happen. However, I think it unlikely that bot status will be granted to a bot without a named owner. -Atelaes λάλει ἐμοί 00:51, 26 July 2008 (UTC)

Thank you. I'm sorry, I was under the impression that some tasks have been done here at Wiktionary via bot. Was I mistaken in this? My request (asking for a bot operator to add all the Tang Dynasty pronunciations for Chinese characters, rather than doing the 4000 by hand) is just above. 24.29.228.33 03:29, 26 July 2008 (UTC)

Oh, I see. My initial impression was that you had a bot, and were asking for permission to use it. However, upon closer inspection, it appears that you are asking someone else to write such a bot. My apologies for not reading carefully enough. While I wish I could offer you a more optimistic appraisal of the situation, my guess is that it's not likely to happen soon. In my experience, bots generally only get written for things that the bot-writer is interested in doing, and are rarely completed per another's request. I hope I'm wrong on this, but I wouldn't hold my breath, if I were you. -Atelaes λάλει ἐμοί 03:39, 26 July 2008 (UTC)

Right, I noted above that Wiktionary seems to operate that way, but I also proposed a new, more efficient manner of operation where bot requests could actually be presented at a new "Bot requests" page, where they would be taken up and implemented in an efficient, expeditious manner, as with Wikipedia. I don't believe there's anything wrong with presenting an idea of how our project could be improved, as that seems to be Wikimedia's manner of operation--constant improvement. Adding all 4,000 characters by hand will take about 100 hours of work. 24.29.228.33 03:53, 26 July 2008 (UTC)

Agreed, the presentation of new ideas is always welcome. However, I think that, in this case, the suggestion is unlikely to work. Here's why: One major difference between Wiktionary and Wikipedia is that Wiktionary has rather more centralized discussions. Nearly all major discussions happen on perhaps a dozen pages of the site. We don't have discussions on the talk pages of entries, but rather bring them to the Tea Room. Nearly all policy-type discussions happen in the Beer parlor. Likewise, nearly all technical discussion happens here in the Grease pit. This allows the relatively small number of editors to keep in touch with everyone else. Unlike Wikipedia, we don't have people trying to figure out how to plug into the project, but rather the folks who do put in a significant effort are nearly always swamped simply trying to keep up with the tasks they have already committed to. I feel very confident that if such a page were created, it would simply languish from inattention. -Atelaes λάλει ἐμοί 04:04, 26 July 2008 (UTC)

"Facts"/Trivia articles

I was wondering if there was any desire to get articles created that were about some of the interesting or curious aspects of our language. There are thousands of intelligent minds collected here who have a substantial knowledge of the English language, and they could probably provide the general public with interesting facts that most people wouldn't know. English trivia is not uncommon, but is frequently inaccurate. Just the other day someone tried telling me that the only two words in the English language with all five vowels in order are facetious and abstemious, which, interesting a bit of trivia as it may be, is wrong. With all the knowledge collected here already, I feel like this could be a fun and entertaining endeavor, as well as a quicker process than it may seem. For example, if there was an article listing palindromes, and a fraction of the users here took a minute or two and added whichever ones came to their minds, the article would be quite large in no time. And with time the information will spread and with it a deeper respect for the nuances of the English language to people who may never have thought about it before. Mrdeadhead 05:55, 21 July 2008 (UTC)

I'm not sure what you mean by "articles". On Wiktionary, we are a dictionary and have entries for words. There are a few Appendices out there, but mostly glossaries or pages about pronunciation or grammar. You might create a page in your personal userspace of the kind you have in mind to show us what you mean. --EncycloPetey 08:01, 21 July 2008 (UTC)

MediaWiki:SectionWatchLinks.js

Hey all,

There's been discussion (at Wiktionary:Beer parlour#Length and elsewhere) of structuring some of our high-volume discussion pages the way we structure Wiktionary:Votes, with the main page being little besides a series of transclusions of individual-discussion subpages. There are various technical issues with such an approach, including the annoyance of watchlisting all the votes at Wiktionary:Votes — generally you have to click the section-edit link, then watchlist the page it takes you to.

I've just created MediaWiki:SectionWatchLinks.js, which partly addresses this issue, by automatically adding "watch" and "unwatch" links next to the edit-link of each transcluded section (unless it's a subsection of a section it's already added the same links to); to install it, add importScript('MediaWiki:SectionWatchLinks.js'); to Special:Mypage/monobook.js and then hard-refresh.
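As a rough illustration of the idea (not the actual contents of MediaWiki:SectionWatchLinks.js), a "watch" or "unwatch" URL could be derived from the URL of a section-edit link by reading its title parameter. The function name and the /w/index.php path are assumptions made for this sketch, as is the use of action=watch and action=unwatch as plain URL actions.

```javascript
// Hypothetical helper, not taken from the real script: given the href of a
// section-edit link, build the matching watch (or unwatch) URL for the page
// that the section was transcluded from.
function watchUrlFromEditUrl(editHref, unwatch) {
  // Everything after "?" is the query string, e.g.
  // title=Wiktionary:Votes/Example&action=edit&section=1
  const query = editHref.split("?")[1] || "";
  const params = new URLSearchParams(query);
  const title = params.get("title");
  if (!title) {
    return null; // not an index.php-style edit link; nothing to derive
  }
  const action = unwatch ? "unwatch" : "watch";
  // Assumed script path; a real wiki may be configured differently.
  return "/w/index.php?title=" + encodeURIComponent(title) + "&action=" + action;
}
```

A script along these lines would then only need to create two extra anchors next to each edit link and point them at the two URLs.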

Those of you who know JavaScript, please take a look. Feel free to edit it (if you're an admin), or to comment here or at MediaWiki talk:SectionWatchLinks.js.

Whether or not you know JavaScript, please try it out and let me know whether it works for you, how you feel about it, etc. (Once you've installed it, take a look at the [edit] link to the right of the "Install MetaKeywords Extension" header at Wiktionary:Votes; it should now be a set of three links, [watch · unwatch · edit].) I've only tested it in Monobook, and only in Firefox 2 and 3, IE 7, and Safari 3.

Possible improvements, if someone's got the itch:

  • A bit of Ajax-iness to identify which transcluded sections are already watchlisted and which aren't, and show only the relevant link. (Unfortunately there doesn't seem to be an API query to see what pages are watchlisted, so I guess the Ajax would have to go after Special:Watchlist/edit or Special:Watchlist/raw?)
  • An all-out MediaWiki:SectionWatchAjax.js that watchlisted every transcluded section of a watchlisted page, either automatically or at the click of a button.

RuakhTALK 17:48, 22 July 2008 (UTC) edited —RuakhTALK 00:45, 23 July 2008 (UTC)

Thank you very much, that alleviates one of the worries I had. It seems to work fine at the moment, but if we're feeling the need for freeping creatures, then I think the first one we need is the "watch all" button. Conrad.Irwin 20:48, 22 July 2008 (UTC)
That's a good idea! I've now stolen it. :-D —RuakhTALK 00:45, 23 July 2008 (UTC)
On second thought, this approach may not be necessary. As may currently (and temporarily) be seen at User talk:Ruakh/Template, we can simply create a new template for such transclusions: rather than putting {{Wiktionary:Beer parlour/2014-04-16/title}} in the wikicode directly, we could use something like {{transclude|Wiktionary:Beer parlour/2014-04-16/topic}}, which could then add all the links we might want in addition to transcluding the page. (It wouldn't have the locale support that MediaWiki:SectionWatchLinks.js has, and it wouldn't group the links with the edit-button in that convenient way, but who cares? It would work for all users regardless of skin, browser, JavaScript support, etc., which is much more important.) At least this was a fun project, despite its uselessness. :-)   —RuakhTALK 18:07, 24 July 2008 (UTC)

Implementing logged discussion rooms

This is currently only about discussion rooms (WT:BP, WT:GP, (WT:ID)); request pages (WT:RFD, RFT, RFV) require a bit more thought!


Naming subpages

The bot itself won't care what the subsections are called, allowing people to transclude arbitrary pages from anywhere (for example cross-posting to BP and GP). However the main "option" in naming is whether to include the date before the title of the subpage.

Wiktionary:Beer parlour/2008-06-13/Topic one
Wiktionary:Beer parlour/Topic two

Which option we select here determines which type of index (alphabetical, by date, or both, I suppose) it would be useful to have the bot create. I slightly prefer the second option because it means that if people link to a topic on the beer parlour using Wiktionary:Beer parlour#Title of topic, the archived topic can be found instantly by changing the # to a /; however, I know others prefer the idea of prefixing a date. A quick poll might be useful here, or just another opinion. Conrad.Irwin 21:16, 22 July 2008 (UTC)
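The two naming options can be sketched as a pair of small functions. This is only an illustration of the title arithmetic described above; the function names are made up for the example, and nothing here reflects how the bot itself is written.

```javascript
// Undated option: a section link such as "Wiktionary:Beer parlour#Title of topic"
// becomes the archived subpage title by changing the first "#" to "/".
function archivedSubpageTitle(sectionLink) {
  const hashAt = sectionLink.indexOf("#");
  if (hashAt === -1) {
    return sectionLink; // no section part, so there is nothing to convert
  }
  return sectionLink.slice(0, hashAt) + "/" + sectionLink.slice(hashAt + 1);
}

// Dated option: the date (YYYY-MM-DD) sits between the page and the topic.
function datedSubpageTitle(page, isoDate, topic) {
  return page + "/" + isoDate + "/" + topic;
}
```

With the dated option the section link alone is not enough to find the archive, since the date has to be known as well.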

I prefer the date-based system, since it will reduce the chances of later developing a duplicate name and the problems that would create. --EncycloPetey 00:54, 23 July 2008 (UTC)
I agree with EncycloPetey. If it's desired for the dateless subpages to exist as well for the reason that you give, then a bot can make those pages be redirects (or date-based disambiguation pages, when topics are duplicated). (I'm actually not sure how much I like an infrastructure that's so dependent on bots, but this is not the detail that worries me.) —RuakhTALK 03:56, 23 July 2008 (UTC)

Current bot behaviour

Well, I thought that a bot to do this would be trivial, but it turns out that there is a lot more to think about than I expected. This is how the current implementation will work (once I've ironed out all the bugs). If someone spots a better way to do anything, please let me know! Conrad.Irwin 21:16, 22 July 2008 (UTC)

  • Sits on irc://irc.wikimedia.org#en.wiktionary and listens for any changes to its configuration page (probably in MediaWiki space for protection).
  • For each page on its configuration page, it will start listening to changes to that page and its subpages.
  • When it hears a change on a subpage that is not on the page it will add a new section to the bottom of the page (adding a new month header if necessary?)
  • In the same edit it will remove any topics that have not been modified for a (configurable by page) length of time.
  • If configured (per page) to do so it will add this page's title to a list of (a configurable number of) "recently archived" topics which can be included at the top of the page.
  • ? It might then (also) add the page's title to a list of "all topics in year or month", or an alphabetical index of topics?
  • ? It might then slap an {{archive notice}} on the subpage?
  • To allow for permanent includes on the page, an optional argument can be given on the transclusion.
    {{/The heading | permanent }}
    {{/Notice of pending event | expires=YYYY-MM-DD}}
    However (at the moment) all sections like this will gradually migrate to the top of the page.
    It is also possible to specify a set of pages to ignore changes to on the config page (currently using a regular expression)
  • Every once in a while (probably about 24 hours) it will use the API to reload the ages of all the subpages it is watching, correcting the page as necessary (This is because it is entirely possible that one of the edit messages goes missing). Conrad.Irwin
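The pruning step in the list above can be sketched as a small decision function. This is a guess at the logic rather than the bot's code: the topic object shape, the function names, and the reading of expires= as "keep until that date, then archive" are all assumptions made for the illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed topic shape for this sketch:
// { title, lastModified (ISO 8601 string), permanent (optional boolean),
//   expires (optional "YYYY-MM-DD" string) }
function shouldKeep(topic, nowMs, maxAgeDays) {
  if (topic.permanent) {
    return true; // {{/The heading | permanent }} is never removed
  }
  if (topic.expires) {
    // Assumption: an expiring include stays until its date has passed.
    return nowMs < Date.parse(topic.expires);
  }
  // Otherwise keep the topic only while it has been edited recently enough.
  return nowMs - Date.parse(topic.lastModified) <= maxAgeDays * DAY_MS;
}

// Split the topics on a page into those that stay and those to archive.
function partitionTopics(topics, nowMs, maxAgeDays) {
  const keep = [];
  const archive = [];
  for (const topic of topics) {
    if (shouldKeep(topic, nowMs, maxAgeDays)) {
      keep.push(topic.title);
    } else {
      archive.push(topic.title);
    }
  }
  return { keep, archive };
}
```

The same function would serve both the live "listening" mode and the periodic reload, since it depends only on the stored ages and the current time.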
I think trying to "listen" for changes is trying too hard. It can just run every n hours and look to see what has changed. I'm still not quite sure what the structure is. Presumably a section added to the page (with the + tab) would be moved to a new sub-page?
One way to do the configuration is put that info in the page, including a link to some target page; the bot then looks at Special:Whatlinkshere for the target to find the pages it should be working on. Robert Ullmann 13:00, 24 July 2008 (UTC)

Mouseover

Is there a wiki command that results in a pop-up with mouseover? Wikisaurus is trying to avoid a back-and-forth effect when looking for the right nuance. If we give definitions, what's the use in a dictionary? If we make a link to Wiktionary, it causes the user to go back and forth between the two areas. A mouseover pop-up would be just right. Any info? Amina (sack36) 11:42, 24 July 2008 (UTC)

Do you mean something like this? RuakhTALK 12:38, 24 July 2008 (UTC)
Ooh! Bless you Ruakh! That's just exactly what I was looking for. Amina (sack36) 04:39, 25 July 2008 (UTC)
Hmmm. I'm afraid there's a problem with the mouseover. It only works on comment fields and I need it on a click-able word. Amina (sack36) 05:15, 25 July 2008 (UTC)
So if you use it on the alt text of a link: word eh? What doesn't work? (We can arrange to make it blue again, by taking the colours out of {comment} in a variant template ;-) Robert Ullmann 10:59, 25 July 2008 (UTC)
I've now created {{comment-link}}, modeled on {{comment}}, but I'm actually not very happy with it, because it doesn't respect user stylesheet colors, and it doesn't distinguish visited from unvisited links. Also, I think MediaWiki has an option for redlinks to show with a red question mark following rather than with the whole link red; if so, this doesn't respect that. It might be best to just remove all the specified colors (as Robert suggests), and have the underdotting be black (which looks a bit odd, but isn't too bad: baz; maz). —RuakhTALK 11:56, 25 July 2008 (UTC)
Think you are trying too hard (;-), just take the colours out. The border will be the link colour. foo qoo using the CSS colours. Robert Ullmann 12:15, 25 July 2008 (UTC)
Woah, borders inherit text colors? *is mindblown* Thanks! —RuakhTALK 13:23, 25 July 2008 (UTC)
Yup, the CSS “color” property applies to all foreground elements, including text, borders, outline. Michael Z. 2008-07-25 16:03 z
Note that it isn't inheriting from the text per se, it is that the text and box around it (to which border-bottom: 1px dotted; is applied) are both inside the anchor, within classes a, a.active, and a.new. Robert Ullmann 10:00, 28 July 2008 (UTC)

redirects between case

My automation for deleting the "Conversion script" redirects having crunched through most of the 49,000-odd it was tasked with, I've been looking at what's left. And I have a question:

Do we ever want redirects to an entry that differs only in case?

There are any number of them that are left over from moving conversion script errors back to capitals, or fixing the endlessly recurring new entries with the wrong case (either newbie error, or uncertainty about what should be done).

Should all of them be removed (if not linked-to, of course)? The MW software will take you to an entry from the wrong case from the search box. (e.g. put in "UniTed nATions") What do you think? Robert Ullmann 14:46, 25 July 2008 (UTC)

Yes, I think so, for consistency. Conrad.Irwin 18:19, 25 July 2008 (UTC)
I can't think of any reason to keep them either. --EncycloPetey 20:26, 25 July 2008 (UTC)

Trying some new filters. Will see what it comes up with for a few hours; then I'll let it finish the other run. Robert Ullmann 14:10, 26 July 2008 (UTC)

that was very interesting ... will flood RC for a little while to complete the first task run. Robert Ullmann 22:34, 26 July 2008 (UTC)
Another set of redirects which needs to go away is el/grc words with no diacritics. Basically, any word composed entirely of Greek characters, containing no diacritics (i.e. accent marks), which is a redirect can safely be deleted. -Atelaes λάλει ἐμοί 01:31, 27 July 2008 (UTC)
Are there many of these? I plan to (after the next XML dump, whenever that is ...) classify all the redirects and see where we are. There are still a number of English redirects of forms hanging out there, as well as any number of things that should be looked at to decide whether they should be misspell-of, alt-spell of, or deleted. ATM, I have no handle on how much there is. Robert Ullmann 10:06, 28 July 2008 (UTC)
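The Greek criterion could be tested mechanically along these lines. This is a sketch only: the function names are invented, the character ranges cover just the basic Greek alphabet (Α to Ω and α to ω, including final sigma), and a title counts as having a diacritic if its decomposed form contains any combining mark in U+0300 to U+036F.

```javascript
// True when the title consists only of basic Greek letters and carries no
// diacritics. Accented and polytonic letters decompose (NFD) into a base
// letter plus combining marks, so any combining mark disqualifies the title.
function isUnaccentedGreek(title) {
  if (title.length === 0) {
    return false;
  }
  const decomposed = title.normalize("NFD");
  if (/[\u0300-\u036F]/.test(decomposed)) {
    return false; // at least one diacritic present
  }
  return /^[\u0391-\u03A1\u03A3-\u03A9\u03B1-\u03C9]+$/.test(decomposed);
}

// The deletion candidate described above: a redirect whose title is
// undiacriticked Greek. Whether the page is a redirect is supplied by the caller.
function isGreekRedirectCandidate(title, isRedirect) {
  return isRedirect && isUnaccentedGreek(title);
}
```

Multi-word titles are rejected by this version because a space is not a Greek letter; whether they should count is a separate decision.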

No, redirects between cases may be needed if the target is mixed case (a capital after the first, but not all caps). The did-you-mean and search engine (go/search) handle it if the target is all caps, all lower, or has the first character "flipped". The search engine also handles Title Case and Title-Hyphen Case for the target. So some redirects are needed (we might consider creating a few as well ;-) an example is that IMDb needs (one of) the imdb and IMDB redirects, else looking up "IMDB" will not find it. Robert Ullmann 14:19, 28 July 2008 (UTC)

Did you mean message

On the page where an entry is not found, when it says "Wiktionary does not yet have an entry for ..."

Where is the code that generates the "did you mean" messages? (I have looked, and I usually know where to look ;-)

It seems to find other entries that are

  • all lower case
  • all upper case
  • given case with first capitalized

but not

  • title case
  • sentence case where given has other capitals

anyone know? Robert Ullmann 13:12, 28 July 2008 (UTC)

MediaWiki:Noarticletext → Wiktionary:Project-Noarticletext → {{alternate pages}} → {{didyoumean}}. And yeah, it just tries applying each of [lu]c(first)?:. It doesn't even try combinations of these, such as {{ucfirst:{{lc:pagename}}}}; so, [[JAPANESE]] pulls up nothing. —RuakhTALK 15:38, 28 July 2008 (UTC)
Ah, thank you. It would be good to add sentence case in there. Too bad title case can't be generated. (can't get from "united nations" to "United Nations") Robert Ullmann 16:31, 28 July 2008 (UTC)
I added Sentence case, seems to be okay. Robert Ullmann 12:14, 29 July 2008 (UTC)
Cool, thanks! —RuakhTALK 13:41, 29 July 2008 (UTC)
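For readers who want to see the variants spelled out, here is a sketch of the candidate titles discussed in this thread: the four single transformations (lc, uc, lcfirst, ucfirst) plus Sentence case, i.e. ucfirst applied to lc. It is written in JavaScript purely for illustration; the real logic lives in the templates named above, and the function names here are made up.

```javascript
function ucfirst(s) {
  return s.charAt(0).toUpperCase() + s.slice(1);
}

function lcfirst(s) {
  return s.charAt(0).toLowerCase() + s.slice(1);
}

// Candidate titles to try when the requested title has no entry.
// Duplicates and the requested title itself are dropped.
function caseCandidates(title) {
  const variants = [
    title.toLowerCase(),          // lc
    title.toUpperCase(),          // uc
    lcfirst(title),               // lcfirst
    ucfirst(title),               // ucfirst
    ucfirst(title.toLowerCase())  // Sentence case: ucfirst of lc
  ];
  return [...new Set(variants)].filter((v) => v !== title);
}
```

With the Sentence case variant included, "JAPANESE" yields "Japanese" among its candidates, while "united nations" still never yields "United Nations", since Title Case cannot be produced from these functions.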

Creating redirects

There are cases, like IMDb, where entries will not be found by either search or links to all UC or all lc, or other reasonable combinations. For example, without redirects, none of "imdb", "Imdb", or "IMDB" would find the entry. Another example is to try linking to "United nations", "united nations", "UNITED NATIONS", none of which work. In this case go/search will work, because the s/w does try Title Case.

It seems to me that we might usefully create redirects in some defined set of cases to make these entries more accessible. Robert Ullmann 12:25, 29 July 2008 (UTC)

It would be better if we got the proper DidYouMean extension working again (as I keep meaning to... ;). This would handle all such cases and more. As an interim measure I don't have a particular problem with the redirects. Conrad.Irwin 11:59, 30 July 2008 (UTC)
Yes, that would be good. I'm just tweaking what exists now (;-). In any case I have set the deletion-process filter to keep any presently useful redirects, i.e. it won't remove anything that will make an existing working link fail. If someone creates helpful redirects (e.g. united nations → United Nations), then we will keep them for now. (Wish we had TC: for Title Case in the parser functions. ;-)
I might consider creating them on purpose for some titles like van der Waals force and shipshape and Bristol fashion. Robert Ullmann 12:31, 30 July 2008 (UTC)
Well as redirects seem to be acceptable for phrases, I'd go for it. Conrad.Irwin 13:08, 30 July 2008 (UTC)
See User:Robert Ullmann/Entry cases, which has some information about where we are. (or were on 13 June...) The redirects for phrases that we explicitly allow are for variant forms, not variant capitalization. This issue is orthogonal to phrase/not phrase I think. Anyway, there are 2K+ entries that would not be found by go/search from another capitalization. Robert Ullmann 15:50, 30 July 2008 (UTC)

redirect=no

In the js for didyoumean, the new url uses redirect=no, presumably to prevent looping (in some contrived cases). Since then we've added rdfrom= to the url, and check for it, so it can't loop that way. Meanwhile, if didyoumean finds an entry that is a redirect, the no-redirect keeps the user from arriving all the way at the desired page.

I think it can be removed, and I'm going to try it. Please check this (if this whole thing with js makes any sense at all to you of course ;-) Robert Ullmann 11:55, 30 July 2008 (UTC)

It was deliberately like that to mimic the MediaWiki double redirect behaviour, but feel free to change it if you think it's better changed. Conrad.Irwin 11:57, 30 July 2008 (UTC)
(Hi Conrad!) We have cases where we want to do an auto-redirect and another "ordinary" redirect to get the user to the page. Consider a link to Van Der Waals force, which now works. I think this is okay. (The presentation of the two "redirected from" notes at the top might be tweaked a bit.) This way the dym mechanism can do almost all of what the go/search code does, as it does both steps. (difference is that dym won't find Title Case, unless we provide a redirect) Robert Ullmann 12:17, 30 July 2008 (UTC)

process

To explain the process, what the automation is doing now:

It is looking at all the redirects in mainspace and Talk: using a dump from sometime in March. (That's okay, it has plenty to look at, I'll start it again with a new dump if we ever see one.) The Talk: space really isn't the objective, but not looking at those leaves "orphaned" talk page redirects. It proceeds from longest title to shortest (>3) in alpha (UCS collating) order within each length. (A good idea from DAVilla.)
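The processing order described here (longest titles first, code point order within each length, titles of three characters or fewer skipped) can be sketched as a sort. The reading of "(>3)" as "length greater than 3" and the choice to count length in code points are assumptions, and the function names are invented for the example.

```javascript
// Compare two strings by Unicode code point rather than by UTF-16 code unit.
function compareCodePoints(a, b) {
  const ca = Array.from(a);
  const cb = Array.from(b);
  const n = Math.min(ca.length, cb.length);
  for (let i = 0; i < n; i++) {
    const diff = ca[i].codePointAt(0) - cb[i].codePointAt(0);
    if (diff !== 0) {
      return diff;
    }
  }
  return ca.length - cb.length;
}

// Longest first; ties broken by code point order; short titles dropped.
function processingOrder(titles) {
  const len = (t) => Array.from(t).length;
  return titles
    .filter((t) => len(t) > 3)
    .sort((a, b) => len(b) - len(a) || compareCodePoints(a, b));
}
```

Because capitals sort before lower case letters in code point order, "Apple" comes before "apple" within the same length.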

There are two "cases" it looks at. First: redirects that point to a different case of the same term. Here it checks that the dym (Did you mean) and search/go mechanisms will both find the target if the redirect is removed: i.e. the target is all lower, all upper, the same with first letter inverted, or in Sentence case.
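The first check can be written out as a predicate. This is a sketch of the stated condition only, not the automation's code; the function names are hypothetical, and the go/search handling of Title Case targets is deliberately left out because the paragraph above requires that both mechanisms find the target.

```javascript
// Swap the case of the first character only.
function flipFirst(s) {
  const first = s.charAt(0);
  const flipped = first === first.toLowerCase() ? first.toUpperCase() : first.toLowerCase();
  return flipped + s.slice(1);
}

// True when the target would still be found from the redirect's title after the
// redirect is removed: the two differ only in case, and the target is the
// all-lower, all-upper, first-letter-inverted, or Sentence case form.
function reachableWithoutRedirect(redirect, target) {
  const lower = redirect.toLowerCase();
  if (lower !== target.toLowerCase()) {
    return false; // differs in more than case, so this check does not apply
  }
  const sentence = lower.charAt(0).toUpperCase() + lower.slice(1);
  return (
    target === lower ||
    target === redirect.toUpperCase() ||
    target === flipFirst(redirect) ||
    target === sentence
  );
}
```

A mixed-case target such as IMDb fails this test from both "imdb" and "IMDB", which matches the earlier observation that one of those redirects has to stay.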

Second: two (or more) redirects to the same target, differing only in case. Here it checks the initial case of the target, preferring to keep a Sentence or all-upper case redirect if the target starts with UC, preferring the all-lower case redirect if the target starts with an lc letter. The same case matching is then applied, to ensure that looking for the redirect form will find the other redirect by dym or go/search.

All this is then re-checked against the current wikt: that the redirect still exists, hasn't changed (or is back to the same), that the other redirect still exists (ditto), and that there aren't any internal links that would be broken, fixing them in some cases (e.g. Transwiki: space) and ignoring others (various cleanup lists).

Thus any external link, whether from a sister project, or somewhere else in the net, that was working will continue to work.

It does 5 at a time, and waits an adapted time so that it will be < 10% of RC and not completely flood the deletion log. Also so it does only what I can reasonably look at once or twice a day for possible trouble. You will note that it sometimes turns up entries that were moved to the wrong case, and should be moved back or to a different form. Robert Ullmann 12:46, 2 August 2008 (UTC)

Wikisaurus Template Naming

Background reading: User talk:Sack36#Template:wse-shell.

EncycloPetey has pointed out that the wse prefix in wikisaurus templates such as {{wse-shell}} is probably poorly chosen, as it seems to imply a language of {{wse}}. Basically, all three letter templates should be reserved for language names, and following that, template names with three letter starts should also be reserved for language specific templates, such as {{grc-verb}}, {{grc-ipa-rows}}, etc. So, Sack36 (who, for those of you unaware, has been forging a major revamp of the wikisaurus pages) is now wondering what the templates should be named, and what the most efficient method of effecting such a change would be. So, to sum up:

Are three letter prefixes on templates like {{wse-shell}} inappropriate (bearing in mind that wse has not yet been assigned to any language, that I could find anyway, but conceivably could be in the future)?

general convention

What convention do we want to go with when dealing with this on any similar project?

Since the two and three letter with a dash is used, what if we considered project templates as a whole different critter and use different patterns? e.g.

  1. we could use sequence numbers to designate which project, thus wikisaurus-shell could be either shell1 or 1shell or even 1-shell
  2. we could capitalize instead of separate by a dash
  3. we could make it a suffix, not a prefix
  4. we could make it a four character prefix

convention applied

Poor Wikisaurus gets a new template name... again.

I'm thinking how nice it would be if it were something short and sweet. I'm partial to 01-shell. Amina (sack36) 04:03, 26 July 2008 (UTC)

If so, what would be the most appropriate naming scheme to rename these templates to?
If so, and if so, would any of our resident programmers be willing to save us a whole lot of tedious manual labour and sic one of their bots to the task of implementing these changes at the entry level?

Many thanks. -Atelaes λάλει ἐμοί 03:12, 26 July 2008 (UTC)

Some thoughts: (2) We've generally agreed on Wiktionary not to capitalize templates, so capitalizing would not be the best option. (3) A suffix would spread the templates out in a category, rather than grouping them.
So, what about a 4-char prefix like saur or wksr? --EncycloPetey 05:55, 26 July 2008 (UTC)
I think saur is a good idea. -Atelaes λάλει ἐμοί 06:01, 26 July 2008 (UTC)
Type "sa" 10 times fast and you'll see the reason I'm none too keen on saur. What of the numbered idea? I'm still fond of 1-beginlist or 01-shell. Amina (sack36) 07:27, 26 July 2008 (UTC)
Well, how often would you have to type these templates ten times fast? No, I think it better to keep the templates organized by a sustainable naming system. Rest assured that I have to deal with a few long ones myself, such as {{grc-ipa-rows}}, but it really doesn't amount to much time spent typing in practice. -Atelaes λάλει ἐμοί 07:31, 26 July 2008 (UTC)
it isn't that capitalized names are prohibited so much as that we don't want the Everything Capitalized style of the pedia ... numbers are just meaningless ... people seem to get very fixated on '-' as a separator ... you could just use "ws " as the prefix: {{ws shell}} etc Robert Ullmann 07:41, 26 July 2008 (UTC)
I think we're trying to stay away from the two and three letter prefixes. You have a point about the meaninglessness of numbers. So many of the special characters are used in the wiki language I just don't know what other special characters are open for use.
As for typing these letters fast, it turns out that I will be doing just that. Each synonym and antonym used will have a template attached to it. Since thesauri are traditionally made up of synonyms and antonyms and not a whole lot else, that's going to mount up to a huge toll on my two weakest fingers. If we end up with a finger twister, so be it. I'd just like to avoid it if we can. What about using "roget"? Amina (sack36) 14:18, 26 July 2008 (UTC)
Did you know that it's possible to customize your edittools so that you could input the entire template (including braces) with just one click? So, ease of typing shouldn't concern you, since whatever it is could be entered with a single click. --EncycloPetey 17:16, 26 July 2008 (UTC)
Thanks to you, Petey, I now know that it can be done. I don't, however, know how to do it.
We do seem to be getting a little off topic. From the above discussion it looks like there are five valid suggestions to date:
  1. "saurus"
  2. "roget"
  3. "saur"
  4. "wksr"
  5. use a different separator
Have I left anything out? Does anyone know a separator not in use elsewhere? Amina (sack36) 10:27, 27 July 2008 (UTC)
As I said, just use space as the separator, and use ws or wse or whatever as you please. (It isn't so much that 2 and 3 letter codes as prefixes are reserved, so much as we want languages to use them so that templates for languages aren't all over the name space. There is no 2-letter ws for language, and won't be, since ISO is making no new 2-letter assignments.) Oh, and not "roget". Robert Ullmann 09:55, 28 July 2008 (UTC)

Two letter as in ws shell

Confusion reigns. Wasn't the whole brouhaha about not using two and three letters? EncycloPetey began this thread because "wse-" was a language designation. What am I getting wrong? Amina (sack36) 15:20, 28 July 2008 (UTC)

Robert is saying that you can use ws_name. Don't use ws-name, because then it looks like ws is a language code (even though it isn't, and never will be); and don't use wse_name or wse-name, because wse is a language code. (I mean, I suppose wse_name should theoretically be fine, and Robert's saying it is, but that's just begging people to be confused.) —RuakhTALK 15:42, 28 July 2008 (UTC)
oh! cool! Let's go with ws_name then. Do we need to vote now? Amina (sack36) 16:08, 28 July 2008 (UTC)
Nope, ws_name is the agreed solution. Conrad.Irwin 16:10, 28 July 2008 (UTC)
Not sure where the _ came from? It is just a space (they are the same in entry titles, such as template names). Robert Ullmann 12:27, 29 July 2008 (UTC)
Yeah, they're equivalent in template names (though not, oddly enough, in template parameter names). I wrote the spaces as underscores here just because ws_name seemed clearer than ws name. (I suppose I could have written {{ws name|…}}, but it didn't occur to me. *shrug*) —RuakhTALK 13:06, 29 July 2008 (UTC)
No biggie. I understood (more or less) what was meant by it. It took a bit for it to sink in, but the job can be done either way with the same results. Amina (sack36) 11:34, 1 August 2008 (UTC)
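The space and underscore equivalence mentioned in this thread can be shown with a small comparison function. This is a simplified sketch with made-up function names: it treats underscores as spaces, collapses runs of spaces, and trims the ends, and it ignores every other kind of title normalization.

```javascript
// Reduce a page or template title to a comparable form:
// underscores become spaces, runs of spaces collapse, and the ends are trimmed.
function normalizeTitle(title) {
  return title.replace(/_/g, " ").replace(/ +/g, " ").trim();
}

// Two spellings name the same template when their normalized forms match.
function sameTemplate(a, b) {
  return normalizeTitle(a) === normalizeTitle(b);
}
```

Under this comparison "ws_shell" and "ws shell" are the same name, while "ws-shell" remains a different one.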

Help with sign language templates

Hi, all. At Wiktionary:About sign languages#Option 2: Hold-move is a description of an academic sign language transcription system and my suggestions for how to convert transcriptions using that system into ASCII strings suitable for entry pagenames. I seek help to simplify the resulting pagenames and to polish the hold-move chart templates. For examples of entry pagenames and hold-move charts using the current (draft version) transcription system, see the sample ASL entries B-In-Vplane-B-Chesthigh-Back Hold Short B-InFinger-Vplane-B-Chesthigh-Back Short B-In-Vplane-B-Chesthigh-Back Short B-InFinger-Vplane-B-Chesthigh-Back Hold (busy), 1-Sfhead-Splane-Claw5-Side-Radial Hold Claw5-MedialSfhead-Hplane-Claw5-Side-Radial RoundHplane-RoundHplane (confused), and OpenA-BackFinger-Mplane-OpenA-Center-Mplane Hold C-Ulnar-Tip-C-Center-Tip Hold 1-Sternumhigh-Vplane Hold (How are you?).

After incorporating your feedback, I hope we end up with a transcription system that is friendly to both editors and readers. If so, I'll request feedback through WT:BP with the intent to finalize and approve Wiktionary:About sign languages. Rod (A. Smith) 03:05, 29 July 2008 (UTC)

Firstly, I think this is an amazing achievement thus far. Secondly, I notice that for some of the shorter ones you have sentences describing how to make the sign, yet for the longer ones you have tables - how hard would it be to use the sentences for everything (because I find them much easier to comprehend than the tables). If we are wanting to make the titles simpler, would it be possible to exclude some of the details from it? This might result in a few collisions - as I have no idea about sign language at all, I don't know which detail is perhaps less important. Conrad.Irwin 18:05, 29 July 2008 (UTC)
From the following candidate versions of the pagenames for the three signs above, I've removed the direction both the hands are facing, the location of the weak hand, and the hold segments. Those dropped attributes are crucial phonemes for the proper production of the signs, but they are arguably less important than the ones that remain.
  • B-In-B Short B-InFinger-B Short B-In-B Short B-InFinger-B (busy)
  • 1-Sfhead-Claw5 Claw5-Sfhead-Claw5 Round-Round (confused)
  • OpenA-Finger-OpenA C-Ulnar-C 1-Center (How are you?)
It's something like writing English without spaces, with every second vowel removed, and with b, d, g, v, and z replaced by p, t, k, f, and s. It'ssmethnklikwritnkEnklshwithutspces,wthefrysecntfowlremfet,ntwithp,t,k,f,ndsreplcetpp,t,k,f,ants.Rmofnkthosattrputsmayhfethpenfitfmaknksiknsasertlocte,tspitthefctthatphnemshafpentrppet. Removing those attributes may have the benefit of making signs easier to locate, despite the fact that phonemes have been dropped.
Despite the joke above, dropping the less important phonemes may be the best way to reduce the length of the entry pagenames. In any event, after reviewing seven ASL dictionaries with native ASL signers, I've learned that their text descriptions and photographs frequently omit or obscure important details. So, while I agree that a text description and photos are important to include, I'm also pretty sure that a structured, detailed hold-move chart is important to include as well. The hold-move templates I've created take up too much space, though, to fit on a standard resolution monitor screen. I'd appreciate any suggestions or help regarding reducing the overall size of the charts, especially their heights. Rod (A. Smith) 02:14, 31 July 2008 (UTC)
And, by the way, Conrad, in case it's not clear, I appreciate your feedback above, too.  :-) Rod (A. Smith) 02:21, 31 July 2008 (UTC)

OK. I've cut down on the attributes represented in entry pagenames. The result almost reads well:

  • B@In-B Sidetoside B@InFinger-B (busy)
  • 1@Sfhead-Claw5 Claw5@Sfhead-Claw5 RoundHplane-RoundHplane (confused)
  • OpenA@BackFinger-OpenA C@Ulnar-C 1@Center (How are you?)

Any other suggestions? Rod (A. Smith) 22:12, 3 August 2008 (UTC)

Inconsistency in latin verbs

Latin verbs should be first-person-sg entries. However, some verbs, like volvo/volvere, just have it the other way round. There's also the case of eo/ire, where both have a conjugation table. Shouldn't they comply with the standard way of dictionarizing Latin verb entries? —This comment was unsigned.

  • Yes. We're working on it. But with so few Latin editors it takes a while. Can you help? SemperBlotto 17:15, 29 July 2008 (UTC)
Note: At this point most entries for first, second, and fourth conjugation Latin verbs have had this problem corrected. Most of the remaining problems are verbs of the third conjugation, like volvō. The problem may be fixed entirely within the next month or so. --EncycloPetey 16:15, 30 July 2008 (UTC)

The API is dead! Long live the API!

The old query API (query.php, as opposed to the newer, but still fairly old, api.php) is being killed at the end of August. I don't think any of our infrastructural scripts are using it, but if anyone knows of any, please say so. For more information, see <http://lists.wikimedia.org/pipermail/mediawiki-api/2008-July/000620.html>. —RuakhTALK 23:44, 30 July 2008 (UTC)

Asterisks in page titles

I was wondering if there is any community consensus regarding asterisks in page titles? My query arose upon realizing that the page f**k had been blocked from creation by Connel MacKenzie with the summary "bad entry title". Discussion with him about it doesn't seem to have gone anywhere (see User talk:Connel MacKenzie#f**k for more information), so could I get some information here? Thanks, --Teh Rote 01:48, 31 July 2008 (UTC)

I would favor complete orthographic freedom, but not at a high cost in terms of technical effort, server utilization, contributor effort, or user confusion. I would need to hear something explicit and specific about the costs to weigh them against the benefit. How many CFI-meeting entries do we believe would be added as a result of allowing asterisks? DCDuring TALK 08:10, 31 July 2008 (UTC)
There be a lot of bowdlerization what goes on, so I'd imagine that most vulgar slang can be written with asterisks in several places. I have no problem with including asterisks, and can think of no obvious technical restrictions. Conrad.Irwin 08:30, 31 July 2008 (UTC)
I agree with DCDuring and Conrad.Irwin. I think the problems that Connel MacKenzie refers to are mostly downstream problems — problems in mirrors, programs that use our site as a data source, etc. — and while those are definitely worth consideration, overall I think that they should be expected to match our behavior rather than vice versa. (By the way, we do have [[*]] itself, though it's woefully incomplete, and [[*nix]], but until just now, it had formatting problems due to the way {{en-noun}} works, which we can take as a tocsin.) —RuakhTALK 09:48, 31 July 2008 (UTC)
Oh, yeah, and we have [[Category:*Topics]], plus counterparts for many languages. —RuakhTALK 09:49, 31 July 2008 (UTC)
If I understand Connel's concerns correctly, the problem is not the presence of asterisks in page names as such, but only their occurrence in the center of "words", as this throws off searches. So, entries where an asterisk separates two words, or appears before a word, or alone, should not cause the problem. But having an asterisk in the middle of a "word" does cause search problems. --EncycloPetey 17:43, 31 July 2008 (UTC)
Asterisks are also used to indicate that a pronunciation is reconstructed. 24.29.228.33 17:53, 31 July 2008 (UTC)
I've created User:Rodasmith/test*asterisk. Mediawiki locates it as expected when I type "User:Rodasmith/test*asterisk" in the search box and click "Go". It fails to locate it when I click "Search", but so what? That hardly seems like a justification for barring an entry if it actually meets CFI. That is, if authors use f**k enough, we should have an entry on it. If some search engines fail to locate the entry, big deal. Rod (A. Smith) 02:34, 5 August 2008 (UTC)
You didn't wait long enough; the search index isn't updated in real time. It works now (with the checkbox for the User namespace ticked, of course). AFAIK there is no technical problem except, as noted above, an oddity with template parsing where initial * and # are treated as being at the start of a line. I don't think Connel's use of "bad entry title" was a reference to a technical problem (note that Connel created *nix ;-), but rather to the form with the *'s used as a bowdlerization. (A title like "Enterprise (series)" would also be a "bad entry title" here.) Robert Ullmann 14:02, 5 August 2008 (UTC)
Good, so Mediawiki seems tolerant of asterisks, and cuil and google:"f**k" at least suggest that f**k is used frequently enough to merit an entry. I don't know how many of those results are for other strings, e.g. f!=k. The results need sifting through to determine how widespread f**k really is, but unless anyone presents a solid reason not to create the entry, process allows anyone to do so. Of course, the creator or anyone else may RFV or RFD it, so it wouldn't hurt to find three decent quotations.  :-) Rod (A. Smith) 03:21, 7 August 2008 (UTC)
To Robert Ullmann: Based on the comments Connel made on his talk page, it seems he was referring to a technical problem. It may have been that he was referring to the googleability of the term, as the asterisk seems to be a "wild card" on the search engine. However, if it's common enough to be included (which seems pretty clear), then an administrator should unprotect the title from creation. Teh Rote 05:19, 7 August 2008 (UTC)
Note that in computer science, "**" is often different from "*", as "\\" differs from "\". When the entry first caused problems with my tools, I looked at it and discovered it was not only garbage, but garbage from a known-bad contributor (User:Wonderfool). The normal substitution of profanities is some selection of "!@#$%^&*()" but rarely the same character substituted for different letters - so this is an obviously contrived formation. That isn't how one bowdlerizes fuck in English. Before the vandals go creating this entry again, they need to prove that this is more common than "#&$%" and all others. I've re-deleted the entry. --Connel MacKenzie 16:45, 25 August 2008 (UTC)
Wait, wait, wait. Out of process, you knowingly removed a page that contained accurate, relevant content that is appropriate to a dictionary. Who's the vandal, again? (For the record, I agree that f**k is rare, as bowdlerizations go; but unless the tool in question is both (1) particularly important and (2) particularly difficult to fix, that's not a good reason to rule out an entire class of valid entries.) —RuakhTALK 17:04, 25 August 2008 (UTC)
Wow, I haven't been on in a while. There are a few things I would like to say...
  • First and foremost, who is this Wonderfool character? Without the userpage, it's kind of hard to tell.
  • To Connel: I've tried not to interact with you for the past while, but since you got into this discussion, I suppose it's necessary. Now, since when are there rules regarding bowdlerization in English? I'd like to see some evidence for this.
  • To Ruakh: f**k is rare? The citations seem to say otherwise. Teh Rote 20:35, 27 August 2008 (UTC)
Perhaps it's not “the normal substitution”, but I've certainly seen uses like f*ck, f**k, f***, s**t, *ss, etc. Looks like Googling "f**k" with or without quotation marks finds plenty of attestations. I don't see why it's necessary to prove that this orthographic form is more popular than some other one. And obviously it is meant for a different purpose: #&$! inserts generic swearing, while f**k seeks to convey a specific word in a less offensive way—I've seen it used in direct quotations.
Please be specific about the technical problem. If your own tools can't handle some legal characters in a Wiktionary entry, then please fix your tools instead of “fixing” the dictionary. Michael Z. 2008-08-27 21:33 z

(removing indent) I agree with Teh Rote above, "f**k" is (at least in the UK, I make no claims about elsewhere) a very common method of bowdlerising "fuck" - I'd go as far as saying it is the 1st or 2nd most common method (the other being "f*ck"). As nobody has presented any actual evidence of any technical reason why we cannot include it in the dictionary, I have to agree with Mzajac and Ruakh regarding Connel's approach to this word. Thryduulf 22:48, 27 August 2008 (UTC)
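
On the point about tools that treat the asterisk as a wildcard: the following is a minimal sketch, assuming a downstream tool matches page titles with regular expressions or shell-style globs. Nothing in this thread says how Connel's tools actually work, so the helper names `literal_title_regex` and `literal_title_glob` are hypothetical. The idea is simply to escape the title before using it as a pattern, so that a title such as f**k is matched literally.

```python
import fnmatch
import re

# Characters that shell-style globs (fnmatch) treat specially.
GLOB_SPECIAL = "*?["


def literal_title_regex(title):
    """Compile a regex that matches `title` character for character."""
    # re.escape backslash-escapes "*" so it is not read as a quantifier.
    return re.compile(re.escape(title))


def literal_title_glob(title):
    """Return a glob pattern that matches `title` literally."""
    # In a glob, "[*]" is a one-character class containing only "*".
    return "".join("[" + ch + "]" if ch in GLOB_SPECIAL else ch for ch in title)


if __name__ == "__main__":
    title = "f**k"
    # Unescaped, the title acts as a wildcard pattern and also matches "fork".
    print(fnmatch.fnmatchcase("fork", title))
    # Escaped, only the literal title matches.
    print(fnmatch.fnmatchcase("fork", literal_title_glob(title)))
    print(bool(literal_title_regex(title).fullmatch(title)))
```

Escaping at the point where a title becomes a pattern leaves the titles themselves untouched, which fits the suggestion above that the tools should adapt to the dictionary rather than the other way round.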

Requesting unblocks by blocked users

After having informed a user here the other day of their block, I wondered to myself how it was that they would request their unblock if they wanted to contest it. Over on Wikipedia, a blocked user is allowed to edit only their talk page, on which they can ask for unblock. However, their talkpage is protected if they misbehave there, or if they use the template over 3 times.

I wanted to find out if a blocked user here had the same option, so I created a new account, and I asked Conrad.Irwin to block it. Interestingly, once I was blocked on it, and tried to edit a page, I was told that if I contested the block, I should speak to an admin about it. However, when I tried to edit my talk page, it was not possible to do so. How exactly is the user supposed to contact an admin, or anyone for that matter, if they cannot edit their talk page?

Since I understand that Wiktionary doesn't have so much time and labor as Wikipedia, I acknowledge we can't afford to have that degree of leniency to allow them to request unblock 3 times, but I think we should still be at least giving them 1 time - for instance, an administrator is only human, and might think a user is vandalizing even though the user will be able to explain how they weren't, after which only then it would become clear.

Anyway, I am interested to hear responses from other users. --nwspel tork kontribz 11:44, 31 July 2008 (UTC)

When the feature allowing blocked users to edit their own talk page was introduced on en.wp I was not in favour of it. However experience has shown that it is not abused as often as I feared, and indeed abuses do not come near to outweighing legitimate uses. So I would support its introduction here. Thryduulf 14:50, 31 July 2008 (UTC)
The block message says pretty clearly that you can contact the blocking admin by email, explaining that one has to set their own email address in preferences (which can be done) before sending. In two years, I've only had two blocked accounts take the trouble, one I immediately unblocked, the other is a 14-year old in Toronto who sent me an obscene message. If they don't want to trouble to send email, why should we bother? Robert Ullmann 15:00, 31 July 2008 (UTC)
Most administrators don't have their email address written on their userpage, so most of the time, a blocked user is not able to email anyone. --nwspel tork kontribz 15:05, 31 July 2008 (UTC)
They don't have to. From a user page there is a link allowing you to send an e-mail via the WM software without even knowing the receiver's email. --EncycloPetey 17:46, 31 July 2008 (UTC)
How about the options at Wiktionary:Contact us? That gives them a mailing list and an IRC channel. Conrad.Irwin 15:12, 31 July 2008 (UTC)
No admin (or anyone else) should have their email address written on their userpage. You use the "e-mail this user" link on the left. All admins are required to have a usable email set in preferences. And blocked users are only prohibited from sending email if that box is checked when blocking, the default is not checked. Robert Ullmann 15:14, 31 July 2008 (UTC)
I will make it clear that users can do this in the blocked template. --nwspel tork kontribz 15:51, 31 July 2008 (UTC)
I've tried to clarify Mediawiki:Blockedtext - does that look good enough or do we want more explanation? Conrad.Irwin 23:32, 31 July 2008 (UTC)
Seems good. --nwspel tork kontribz 08:13, 1 August 2008 (UTC)
This is only good if we actually want unregistered users to be contributing. Given the effort involved in patrolling and the bad PR en.wikt seems to have gotten in handling new users, I'm not sure that, as a community, we really want users to contribute, especially unregistered ones. They tend to make a mess requiring lots of cleanup. They don't read our copious documentation. Perhaps we already have more than enough contributors. Should unregistered users be allowed to edit principal namespace pages or, indeed, any pages? Should registered users be required to prove themselves by making proposed changes on talk pages for approval by one or more admins? Should only whitelisted users be granted the privilege of editing principal namespace pages? Do we need votes? DCDuring TALK 18:19, 25 August 2008 (UTC)
I don't think the Wikimedia Foundation would let us make any of the changes you're suggesting — and I wouldn't want it to. There are many models of how a wiki can work, and many (most?) of the world's wikis are restricted in some form or another, and that's totally fine; but the WMF wikis aren't, and I don't think they should be. —RuakhTALK 22:02, 25 August 2008 (UTC)
I think that our current practice isn't broke and doesn't need fixing. Each edit is judged on its own merits. New users make more bad edits than experienced users and their edits tend to be deleted or undone more often than those of experienced users. Users only get blocked if they seem to be deliberately offensive, stupid, disruptive etc - and they know what they are doing so don't need to be pre- or post-warned. Badly formatted or worded edits tend to get fixed more often than rejected out of hand, but most of us haven't got the time or inclination to act as nursemaids to new users. SemperBlotto 18:59, 25 August 2008 (UTC)
I have had users request unblocks in a number of ways, Email this user, OTRS, via a sock on my talk page and on the mailing list. I think we are easy enough to get a hold of. - TheDaveRoss 19:26, 25 August 2008 (UTC)
Agreed. However, I strongly feel that the bit about emailing the blocking admin should be reinstated into the block text. Conrad.Irwin took it out recently, but I feel that that's fairly important. -Atelaes λάλει ἐμοί 19:39, 25 August 2008 (UTC)
Making it easy for users to protest blocks is what we need, despite the risk of it being an outlet for trolling. If blockers don't want to get and respond to such e-mails, then users need easy ways to escalate. I would argue that such dialogs ought to be visible for all (or all admins, all checkusers, or some other group). (There is a good argument to be made that such a group ought not to be very big.) Perhaps more than one member of the group should be authorized to respond. The human anger response (by both blocker and blockee) would sometimes make the blocker one of the worst responders to an objection to his block. DCDuring TALK 19:53, 25 August 2008 (UTC)
Sounds a bit like you are describing m:OTRS to me. I've re-added the "E-mail this user" link to the block message. I had removed it because I thought that made the message clearer, and because it doesn't work for anonymous users (who make up a substantial proportion of blockees). Conrad.Irwin 19:55, 25 August 2008 (UTC)
Though I remember looking at it and even considered volunteering for it, I would even now not know how to use it, though, if outraged enough, I would figure it out and be all the angrier for the effort. I guess I am thinking of such a system that operated within en.wikt, though in practice that may be the way OTRS works for en.wikt complaints. Once more I am in the position of offering my own ignorance/memory lapse as an illustration of the problem that many users have. I don't really know how to separate mistaken from malicious users, let alone long-term productive ones from others.
I just have a strong sense that our systems, policies and procedures need to be more accommodating of error by passive users, contributors, admins, checkusers, et al. Accommodating does not mean just shrugging one's shoulders at them but recognizing that they happen and devising mechanisms to limit the cost to all involved.
If the OTRS mechanism is effective, then perhaps we should direct users there instead of to blocking admin e-mails or provide it as an option. We should also recommend that users print out the "how to appeal" instructions so they can use them even if blocked. Veteran trolls already know all about such mechanisms. Newer users would not and would be all the more frustrated at believing that they have no avenue to pursue (even when they have read-only access to all of wiki-world). DCDuring TALK 20:36, 25 August 2008 (UTC)
Last modified on 13 April 2014, at 18:31