Wiktionary:Beer parlour/2006/October

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

User:Ncik vandalism (Again)

Based on his own POV, Ncik has been removing category items.

I fully expect User:Eclecticology to again support Ncik. I have issued a 15 minute block so he will at least stop for now.

Unlike his similar antics last year, this time he actually replied on a discussion page before engaging in vandalism. However, the lack of discussion is alarming.

Even if some basis for Ncik's arguments can be found, the removal of a category from numerous items (as opposed to renaming the category to Ncik's prescriptivist, regional POV's liking) is the worst kind of vandalism en.wiktionary.org enjoys. The various goatse vandals lack the subtlety of Ncik's vandalism, as en.wiktionary.org still has not recovered from his last round of similar vandalism (while on the other hand, image vandalism is usually reverted long before anyone views an image even once).

All sysop assistance is appreciated. The unilateral decision by Ncik to vandalize the category is unacceptable. The lack of discussion on such an obviously controversial issue is alarming. To have acted in opposition to the initial feedback is inexplicable.

--Connel MacKenzie 13:28, 1 October 2006 (UTC)

I request to desysop Connel. His banning me without even attempting to discuss the issue is not acceptable. I was merely removing items from a category which, according to long standing rules on Category:English nouns with irregular plurals, didn't belong there. Ncik 13:40, 1 October 2006 (UTC)
You want to desysop Connel because he blocked you for vandalism? I don't think so. The topic has been discussed here: http://en.wiktionary.org/wiki/Category:English_irregular_plurals. Where are these rules to which you refer?
These rules:
This category lists English nouns whose plural is formed irregularly with respect to spelling: It includes the singulars of all English nouns except those that:
  • are symbols or letters and form their plural by adding -s or -’s
  • are not proper nouns, end in a consonant + y, and form their plural by removing the -y and adding -ies
  • end in a sibilant (one of [s], [ʃ], [z], [ʒ]) and form their plural by adding -es
  • are not subject to one of the above rules and form their plural by adding -s
And I know that the topic was discussed at Category talk:English nouns with irregular plurals (I moved it to its original place) because I was involved in these discussions as you can see. Ncik 14:12, 1 October 2006 (UTC)
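The four exclusion rules quoted above amount to a spelling test, which can be sketched in Python. This is only a rough illustration, not anything used on Wiktionary: the function name, the `proper_noun` and `is_symbol` flags, and the list of spellings treated as sibilant endings are all assumptions (the rule itself is phonetic, so a spelling list can only approximate it).

```python
VOWELS = set("aeiou")
# Assumption: spellings that usually stand for a final sibilant sound.
SIBILANT_ENDINGS = ("s", "sh", "ch", "x", "z")


def is_regular_plural(singular, plural, proper_noun=False, is_symbol=False):
    """Return True if the plural follows one of the four quoted rules."""
    s, p = singular.lower(), plural.lower()
    # Rule 1: symbols and letters add -s or -'s.
    if is_symbol:
        return p in (s + "s", s + "'s", s + "’s")
    # Rule 2: common nouns in consonant + y drop the -y and add -ies.
    if (not proper_noun and len(s) >= 2 and s.endswith("y")
            and s[-2].isalpha() and s[-2] not in VOWELS):
        return p == s[:-1] + "ies"
    # Rule 3: nouns ending in a sibilant add -es.
    if s.endswith(SIBILANT_ENDINGS):
        return p == s + "es"
    # Rule 4: everything else adds -s.
    return p == s + "s"


if __name__ == "__main__":
    for pair in [("city", "cities"), ("box", "boxes"),
                 ("potato", "potatoes"), ("child", "children")]:
        print(pair, is_regular_plural(*pair))
```

Under this reading a noun such as "potato" (plural "potatoes") fails the test and so would belong in the category, which is the kind of entry the dispute above is about.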

(Ncik, I might be inclined to agree, but your request to desysop Connel discredits everything you say ...)

If the argument has merit, the proper action would be to recat the entries to Category:English plurals or Category:English plurals ending in "-es" (or whatever). Uncategorizing entries is vandalism that requires someone manually fix them all later (though in this case we have a convenient list). Robert Ullmann 15:18, 1 October 2006 (UTC)

I don't know what the merit of a category Category:English plurals ending in "-es" would be, but I don't object to having it. Uncategorising wrongly categorised items is hardly vandalism! Also, adding a new category is faster than renaming one (simply use CTRL+v). You should consider a more neutral (or critical) attitude towards Connel. You haven't been around for long, so you probably won't know that this is not the first desysop request brought forward against him. Ncik 16:29, 1 October 2006 (UTC)
Moot point - it's done. The word "irregular" is removed from the offending category names - I think we can all agree that this is an acceptable solution. Cheers! bd2412 T 16:48, 1 October 2006 (UTC)
Thank you BD2412. --Connel MacKenzie 18:58, 1 October 2006 (UTC)
Ncik, I obviously hold the opinion that you are the worst type of vandal Wiktionary encounters; that is true. But a de-sysop request for a 15 minute ban to stop active vandalism in progress, from a 'contributor' who already ignored the conversation? Indeed. --Connel MacKenzie 18:58, 1 October 2006 (UTC)
Ncik, it is irresponsible and rude to continue with an action you have been asked to stop, until some discussion has taken place. I wouldn't have characterized it as vandalism, but it certainly isn't a way to make friends. Whether or not the blocking was warranted, when another user asks you to stop and discuss the changes you are making you ought to do so; that is the spirit of community that we are aiming for. There are not many ways to deal with someone who refuses to cooperate and who refuses to discuss things, so you can resolve that by not being one of those people. - TheDaveRoss 19:17, 1 October 2006 (UTC)
This bias is outrageous. When and who by was I asked to stop removing those items from the category? By nobody until after Connel blocked me. A quick post on my talk page is sufficient to immediately draw my attention to any objection. Blocking a user to achieve this is clearly a gross abuse of admin powers. Ncik 22:47, 1 October 2006 (UTC)
  • Lastly, with regard to the vandalism of this page, I can think of no better endorsement of today's 15 minute block, than Primetime stepping up in defense. Perhaps he feels his vandal status is threatened or otherwise outclassed?
    Primetime's vandal status is completely irrelevant to Ncik's actions and yours. Arguments can only add weight, if they do anything. "Because he's a vandal, he can be ignored" is about the best truth that can be drawn. DAVilla 04:26, 7 October 2006 (UTC)
  • For those of you confused by the vandalism, Ncik and I have disputed (for over a year) the classification of various English irregulars. The inexplicable support he got for his conduct at that time caused much bad blood all around. In a poorly thought out compromise, I have not pressed this issue, either way. This certainly is not a new topic; the "obvious" recategorizing does not, in fact, satisfy all concerns. But that can wait for another day, with ample discussion beforehand. --Connel MacKenzie 22:43, 1 October 2006 (UTC)

There Are No Heroes Here

Jesus Christ, what a mess. What a stupid, puerile, unnecessary little mess.

I can't begin to disentangle the claims and counterclaims surrounding the obviously broken "Category:English irregular plurals ending in '-es'", but there is nothing about that minor little category that's so important that it's worth bans and invective and bad blood and bitter recriminations at this level. Guys: Wiktionary is just a dictionary. That category is just a category. Get a grip, will you?

If Ncik waltzed in after a long absence and started making changes against long-ago consensus (which is what Connel seems to be alleging, although I can't see it) that's wrong.

If Connel blocked someone without warning, that's wrong too. And calling it "vandalism", or comparing it to goatse, is an overreaction hyperbolic enough that I hear Mike Godwin waiting just around the corner.

Here's how crazy and nonsensical this all is: someone claiming to be Primetime popped in to point out (rather accurately) how crazy and nonsensical this all is. But of course you can't see it now, because Primetime is PNG here, and gets reverted no matter what. (But check the page history if you're curious.)

There are quite a few things which the principals in this pathetic little feud seem not to realize. They don't realize how relatively unimportant the issue at hand is, and how badly they're overreacting to it. They don't realize how childish they're being. Most importantly, they don't realize how damaging their incessant truculence and vituperation is to the project. I don't know about anybody else, but I can not deal with this nonsense. If that's the way you want to run things, fine, but it proves the place to be a looney bin that sane people will prefer to stay far away from. I've got better things to do than try to contribute to, or even make sense out of, a project where people's priorities are so bizarrely unbalanced that overblown tempests like this one can regularly erupt out of the tiniest of teapots.

I'll come back in a few days and see if you people have calmed down at all. Do try to.

scs 00:29, 2 October 2006 (UTC)

Noddy Suits

How does Wiktionary handle non-vulgar slang words and phrases? For example, I was looking for a definition of "Noddy Suit". It turns out it's a very common slang word within the British armed forces for a [NBC suit]. Should this have its own page, or is Wiktionary strictly for words similar to those I'd find in the Oxford dictionary? Renski 15:58, 3 October 2006 (UTC)

As long as it is attested, usually with print citations, and not just a neologism or protologism, it deserves its own entry. Google books has at least one citation; you should be able to find a couple of good web citations as well. Also note that "vulgar" has different senses. Just put (British military slang) on the definition line. Sounds fine to me. Robert Ullmann 16:38, 3 October 2006 (UTC)
The Oxford dictionary does include slang terms. --Ptcamn 00:47, 4 October 2006 (UTC)

Alphabet pages

It looks like Devanagari alphabet was redirected to Devanagari back in April. User:Taxman explained (in the page history) that it's not actually an alphabet. This got me to thinking: shouldn't every "X alphabet" page in the main namespace redirect elsewhere? After all, a phrase like Greek alphabet, for example, is not idiomatic: it is easily understood as meaning Greek + alphabet. Hence it shouldn't be an article in the main namespace. The logical place for such information is the Appendix: namespace, which already contains several "X script" pages. Unless anyone objects, I'll start to make such changes "soon". - dcljr 22:47, 4 October 2006 (UTC)

I very, very strongly object. Idiomacy is not the only criterion. --Connel MacKenzie 11:03, 5 October 2006 (UTC)
Absolutely not. These are set phrases that belong in the main namespace. And Taxman is correct, Devanagari is not actually an alphabet, but a syllabary. Most of the scripts that developed out of Phoenician to the west became true alphabets, while most scripts that moved eastward became syllabaries. Some people now claim that Thai has become a true alphabet, but this is debatable. Korean is an alphabet, but it did not evolve from Phoenician. Many of the North American Indian languages are written in syllabaries (e.g., Cherokee and Ojibwe). —Stephen 11:27, 5 October 2006 (UTC)
Oh, I get it! It's not an alphabet unless it doesn't make any cens. Stuff that makes sense iz inferior. :-P DAVilla 13:43, 5 October 2006 (UTC)
I don’t understand what you are trying to say. Syllabaries are not inferior to alphabets, they are simply different. For some languages, syllabaries make much better sense, and other languages are more suited to alphabets. —Stephen 15:36, 5 October 2006 (UTC)
Google print shows a number of usages of "Devanagari alphabet" even in modern works that know better, so it's clear the term is used inaccurately to get the point across to people that don't know the difference between an abugida and an alphabet. So I created the definition as such. The problem is it either obscures the more detailed information in Devanagari or must duplicate it. - Taxman 15:19, 5 October 2006 (UTC)
I think your point of including the previous contents in the Appendix namespace was overlooked, and I quite agree with it. As for the page itself, the title in the main dictionary space needs a definition even if the phrase is misconstructed, as you and Stephen claim, assuming it can be attested as such. DAVilla 14:41, 5 October 2006 (UTC)
Oh my. I agree with DAVilla; the addition of Appendix:___ alphabet would be useful. I did seem to completely misunderstand the original question, then. --Connel MacKenzie 01:51, 6 October 2006 (UTC)

Sorry, I tried to make my comment more concise at the expense of clarity, apparently. Originally I had pointed out that Taxman's edit summary was quite right and that I wasn't objecting to his redirect at all. But then I removed the remark, thinking it wasn't necessary. I was, OTOH, questioning whether Greek alphabet and similar entries should be in the main namespace. I disagree with Stephen: "Greek alphabet" is not a "set phrase". It's a regular noun phrase referring to the alphabet used to write the Greek language. This is no "Good morning" or "Long time no see". If Greek language doesn't warrant its own entry, I don't see why Greek alphabet should. <time passes> Uh, oh. I see that Greek language does have an entry, even though French language and English language do not (they're redirects, just like I was suggesting Greek alphabet should be). Connel, am I to understand you're saying Appendix:Greek alphabet is a good idea, but we should keep Greek alphabet as an article? That seems unnecessarily redundant to me... - dcljr 23:32, 17 October 2006 (UTC)


It has recently come up, via 15MinuteBlockGate(tm), that we do not have any Official policy about who to block, for what, and for how long...in fact, we have no Official policies at all. Nowhere does it say that Connel was wrong for blocking Ncik. My thoughts on what policy is, and should be, as it pertains to WMF communities such as ours, are that loose guidelines are a Good Thing(tm) and should be written down, so as to avoid incidents of offense and allow everyone to work from the same page. Currently we have some 40 policies in various stages of completion and acceptance, from rejected policies, to drafts, think tanks and semi-official ones. There is a hypothetical "Official" status of which we have none, but I think there are plenty of unwritten or unconfirmed policies which could be formalized and written down, yet would affect the day-to-day workings of Wiktionary very little, as they are practices we already adhere to. The purpose of writing them down and stamping them with our approval would be to let new people know what the long established guidelines are, and to have something to point at if we feel someone is being out of line. I propose that we start working on these policies, and promoting the ones we support, and rejecting the ones we don't, until we have something which resembles actual Wiktionary policy. Starting with WT:BLOCK, as the most recent policy it would have been handy to have.

I wrote down some duration guidelines in there; they can and should be amended to reflect a broad consensus and to be more/less specific, so as to reflect community opinion. I also changed some parts so they no longer referred to vandals (per WP:DENY). Have a go, discuss it here or there, and once we have something which most people agree with we will upgrade it to OFFICIAL and have a party...then move on to the 39 others. - TheDaveRoss 20:05, 6 October 2006 (UTC)

My initial take on WT:BLOCK is that it is complex and very technical. Much of it is completely beyond my ability to understand, especially the abbreviations. —Stephen 02:35, 7 October 2006 (UTC)
The only portions which are technical are the ones which pertain to IP (internet protocol) addresses, which all admins should have a basic handle on if they are going to block an IP address. I would say that the parts which seem too technical wouldn't apply to people who find them so. No one has to do ARIN lookups nor rangeblocks, and those of us who are comfortable with those features do understand what those things mean. The people who are most likely to do much work with such things are checkusers, who all know what is going on there. As for the non-technical side...please help us clean it and make it clear, complexity is the opposite of our goal, we want it to be as user friendly and reflective of community opinion as possible. If there are specific parts which you think need a lot of work, please indicate them and we will work on them if you would rather not. - TheDaveRoss
I have reformatted the page a bit, maybe some things are a little clearer. - TheDaveRoss 07:18, 7 October 2006 (UTC)
Good, it's not an acronym anymore! Bullies lacking ostensible criterial knowledge... to RfV anyone? :-P DAVilla 14:59, 7 October 2006 (UTC)
I half-jokingly started adding a "glossary" line to the range-block section. Much to my dismay, 10 out of 10 terms were redlinks. I'll add them shortly. --Connel MacKenzie 17:11, 7 October 2006 (UTC)
I started a Wiktionary:Range blocks page to help with the techie side of adminship. - TheDaveRoss


I'm enabling Werdnabot (archiving the grease pit and any discussion page that has it set up) on Wiktionary. Dvortygirl has temporarily flagged the bot to bypass the MediaWiki antispam captchas. It will be deflagged in four days, and I'll wait for consensus to flag it. If you want it to archive a discussion page, get consensus on that page if it's not your user talk page, and add {{User:Werdnabot/Archiver/Linkhere}} <!--Werdnabot-archive Age-x Target-y-->, where x is the maximum age in days that a dormant thread will stay on the page, and y is where to place the archived threads. If you want a section index, add <!--Werdnabot-index z-->, where z is where you want the section index placed. This bot is approved on en.wikipedia and has run without error there for a number of months. All yelling to my talk page. Werdna 05:12, 7 October 2006 (UTC)

Wow, superuseful! Thanks! DAVilla 18:38, 7 October 2006 (UTC)
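For anyone unsure how the x, y and z placeholders in the Werdnabot announcement fit together, here is a small Python sketch that fills them in. It is only an illustration of the syntax as described above: the function name and the example page titles are hypothetical, and whether the bot accepts every kind of page title as a target is not something this sketch checks.

```python
def werdnabot_directives(age_days, target, index_page=None):
    """Build the wikitext lines described in the announcement above."""
    if age_days <= 0:
        raise ValueError("age_days must be a positive number of days")
    lines = [
        "{{User:Werdnabot/Archiver/Linkhere}} "
        f"<!--Werdnabot-archive Age-{age_days} Target-{target}-->"
    ]
    # The section index is optional, as in the announcement.
    if index_page is not None:
        lines.append(f"<!--Werdnabot-index {index_page}-->")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical page titles, for illustration only.
    print(werdnabot_directives(30, "User talk:Example/Archive 1",
                               index_page="User talk:Example/Index"))
```

The output is the text to paste onto the discussion page that should be archived.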

flavor/flavour of English definitions (奶嘴)

I'm sure others have encountered this kind of thing if they have dealt with Wiktionary long enough. I entered the word 奶嘴 into Wiktionary just now. As you will note from clicking on the link, it actually goes by a different name depending on what part of the English speaking world you come from. In this case, I provided a picture in order to quickly convey the meaning. What would be the proper way to annotate this? Does it need to be annotated? For some reason, it seems as though I should put something like:

  1. (US) pacifier; (UK and Australia) dummy; (Canada and Ireland) soother

However, according to WT:ELE rules, the stuff in parentheses would imply the specific dialect or accent of the defined word (not the defining word or words). Am I obsessing too much over this or is there a proper approach for such an entry? A-cai 03:35, 9 October 2006 (UTC)

I'm pretty sure we just list them all, then on the target pages, explain what locales they are specific to. --Connel MacKenzie 06:39, 9 October 2006 (UTC)
The example you gave seems very reasonable to me, whatever ELE says. Widsith 07:57, 9 October 2006 (UTC)
But what it means is that the US usage of 奶嘴 means pacifier, and the UK usage of 奶嘴 means dummy ... I had thought (while editing something, I think widdershins ;-) that this case should be (e.g.)
  1. pacifier (US); dummy (UK and Australia); soother (Canada and Ireland)
But that isn't really that distinctive. Connel's idea is probably best. Robert Ullmann 16:46, 9 October 2006 (UTC)
Hm, I don't see why the "and"s are not italicised... what is wrong with UK and Australia or even just UK, Australia? — Paul G 09:24, 10 October 2006 (UTC)


At the moment, {{IPA}} creates an automatic link to w:IPA chart for English. But this template is used for all languages here. Maybe it should link to something a bit more generic, like w:IPA. Widsith 08:57, 10 October 2006 (UTC)

Better yet, change the link depending on the language, where unknown or unspecified languages go to the most general page. Adding a language parameter to existing entries is something a bot could do. DAVilla 08:55, 11 October 2006 (UTC)
Actually, I have been meaning to raise this issue. I think it would be really cool if the template gave readers (who don't necessarily know IPA symbols) an intuitive way to understand the sound represented by each IPA symbol. One way would be something like: [ pan˥˩kʊŋ˥˥ʂɚ˥˩ ], then make sure each symbol has an entry that contains enough information (i.e. sound files, rhyming words etc) for a reader to equate the correct sound to the symbol (The entry for ʂ isn't too bad). Can the IPA template be modified to do this (without the user having to hyperlink each and every letter)? Is there another way that the same result could be obtained? A-cai 12:07, 11 October 2006 (UTC)
I can think of a way of doing it in Javascript, but not wikitemplates. To me, this seems like a marvelously useful idea. Actually, if we had the StringFunctions extension, this could be done in wikitemplates. --Connel MacKenzie 18:04, 11 October 2006 (UTC)
I also think we should have local copies of IPA and SAMPA charts, as complete as possible. I also think we should have subpages of the chart with information about each phonetic, but that may just be me. - TheDaveRoss 18:32, 11 October 2006 (UTC)
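The two ideas in this thread (a help link that depends on the language, and a link on every individual symbol) can be sketched outside the template system. The Python below is only an illustration of the logic, not a proposal for how {{IPA}} should be implemented; the function names and the use of "en" as a language code are assumptions, while the two page names are the ones mentioned above.

```python
# Language-specific chart pages; anything not listed falls back
# to the generic page. Only the English chart is named in the thread.
IPA_CHART_PAGES = {
    "en": "w:IPA chart for English",
}
GENERIC_IPA_PAGE = "w:IPA"


def ipa_help_link(lang=None):
    """Pick the chart page for a language, or the generic one."""
    return IPA_CHART_PAGES.get(lang, GENERIC_IPA_PAGE)


def link_each_symbol(transcription):
    """Wrap every non-space character in [[...]] so each symbol links
    to its own entry. Limitation: a combining diacritic is a separate
    character and would be linked apart from its base symbol."""
    return "".join(
        ch if ch.isspace() else f"[[{ch}]]" for ch in transcription
    )


if __name__ == "__main__":
    print(ipa_help_link("en"))
    print(ipa_help_link())
    print(link_each_symbol("pan˥˩kʊŋ˥˥ʂɚ˥˩"))
```

Tone letters such as ˥ and ˩ are ordinary characters here, so each one gets its own link like any other symbol.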

Template en-noun for uncountable nouns

Template en-noun currently displays a hyphen where the plural would be for uncountable nouns. To me, this makes it look like there is something missing or something that could not be displayed by the browser. In fact, there is something missing, namely any indication that the noun is uncountable. I understand that the hyphen is meant to mean "no plural" but that is not quite the same as uncountable, which means "not used in the plural".

Could someone remedy this please, so that "uncountable" is displayed? I would if I knew how to do this.

Thanks. — Paul G 09:23, 10 October 2006 (UTC)

Several of the templates appear to be broken at the moment. See discussion in the Grease Pit. --Jeffqyzt 13:49, 10 October 2006 (UTC)
Thanks. I've just noticed the same with en-verb. — Paul G 15:09, 10 October 2006 (UTC)
I think it is a laudable victory for open communication, that this was resolved as quickly as it was, yesterday. --Connel MacKenzie 17:49, 11 October 2006 (UTC)


Do we have a "probable cause" clause for Checkuser? DAVilla 15:16, 10 October 2006 (UTC)

I think in meta yes, it depends on the situation. Do you have a few more specifics or is it the usual WF socks... -- Tawker 17:56, 10 October 2006 (UTC)
More of a general "rights" issue than anything specific, prompted by WF as far as false accusations go, and no reason that I could immediately find to even raise the question. DAVilla 18:13, 10 October 2006 (UTC)
I don't really understand why all of WF's sockpuppets are immediately blocked, since most of the time they are just making perfectly reasonable contributions to the site. Widsith 07:52, 11 October 2006 (UTC)
Probably something to do with using new socks to evade indefinite blocks on his other IDs; that is still against the rules, right? --Versageek 08:33, 11 October 2006 (UTC)
You are probably right, but I don't feel very comfortable about it. The vast majority of his edits were, and are, constructive. Widsith 09:03, 11 October 2006 (UTC)
Do you have a better solution to propose? His MO has been to submit legitimate edits long enough to gain trust (i.e. sysop) then to wreak havoc. Yes, of course he is trying the same thing again; what gain can we possibly get from encouraging that now? --Connel MacKenzie 17:45, 11 October 2006 (UTC)
Fair enough. Widsith 20:09, 11 October 2006 (UTC)
Do accounts get any special privileges after so many edits, like 200 or something? Most of the pages that are protected I think are editable by sysops only, but if there are large areas of semi-protected pages, or for the case of casting a vote or the like, I could see enforcing this to be necessary even without probable cause. DAVilla 08:51, 11 October 2006 (UTC)
Could you restate your question more specifically, please? The only edit-count based privilege I know of, is board voting. --Connel MacKenzie 17:45, 11 October 2006 (UTC)
The thing is, every time you block WF, you let him know you're onto him. Right now he probably knows how to evade all the tricks we have against vandalism, and so we're going to need new ones sooner than we should. If you waited until he started making questionable contributions to check him, or until he had 200 edits or something, then the learning curve would be stretched out that much more. DAVilla 21:46, 11 October 2006 (UTC)
Our tricks aren't supposed to be hidden, the idea is to block all vandals, block all anonymous means of editing, then have cake. - TheDaveRoss 21:50, 11 October 2006 (UTC)
For no reason should Wonderfool, the person, not the account, ever be allowed to edit again. He has shown twice that his intent is not good, regardless of the number of quality edits he makes in the mean time. - TheDaveRoss 18:29, 11 October 2006 (UTC)

The <tt> Function is for what?

Why do you use the <tt> in the pronunciation? Why do you include haUs? My userpage is on Wikipedia, user:100110100. Please reply there. Thanks. 09:02, 11 October 2006 (UTC)

<tt> changes the code to monospace. The pronunciation in {{IPA}} uses font code that's much more complicated. In addition to that template, the pronunciation for house should probably use the {{SAMPA}} template rather than <tt>. DAVilla 09:14, 11 October 2006 (UTC)
Thanks. What is monospace? Thanks. 02:25, 22 October 2006 (UTC)

Stop me if this sounds familiar

Additions by regular contributors far outweigh spotted edits by passers by, and that's not counting the revisions necessary to revert the spotted edits. Many times when fixing an edit I start to reword it completely, and it feels like I could have done a better job if I had done it myself the first time. This is apart from concerns about learning the standard format which merely makes the burden lighter on us. Now not all of the regular contributors are registered. There are a few anonymous IP's who never bothered to register (or perhaps only didn't bother to log in?) who make excellent contributions, but, and this is the point, who make them in batch.

Newcomers, not counting numerous vandals, make mistakes because they don't understand the implications of a multilingual online dictionary, as opposed to any dictionary they've seen before, be it a regular or even a translating dictionary. This single aspect has many facets. Each spelling gets its own page: we do not redirect inflections or alternate spellings. We write foreign entries in English, and we don't translate them. That funny text next to a language script you don't understand isn't the pronunciation. Even simple things like using lower-case page titles seem to trip people up.

At Wikipedia aside from two occasions I refused to revert vandalism because I did not agree with allowing completely anonymous editors. Granted that was a short-lived first experience with a wiki (I don't have much encyclopedic knowledge to contribute) but I felt the same way coming here. Now being a sysop I feel obligated if I see it. But doesn't everyone else having to patrol all these edits feel the same way? Can we actually measure how much good anonymous editing does, or even if it does any net good at all? Again, not counting formatting, and even excluding vandalism which we can't say would be diminished by how much.

I mean, goodness, it's not like registering even requires an email address or opening a new window or anything. If we can't expect people to take that basic step, then can we expect them to make any effort at all to understand what they're editing? DAVilla 20:09, 11 October 2006 (UTC)

Did I write that? Oh wait, that wasn't me. It was my rant, in someone else's words. :-)
One thing I've noticed since we've had "patrolled" edits, is that the "good" edits get filtered out very quickly. The side effect of this, is that patrolling edits (with "hide patrolled edits" turned on) becomes a sysop-burning-out vandalfest of crappy contributions. With none of the "good" edits mixed in, it looks like all anon contributions are crap. (This really isn't the case, but with the extra focus of patrolling edits, it seems that way.)
Also of note, is that according to Alexa, we are still experiencing growing pains. The lack of additional sysop nominations recently may be starting to catch up with us. --Connel MacKenzie 20:26, 11 October 2006 (UTC)
Ooops. Also see m:GAY. --Connel MacKenzie 20:28, 11 October 2006 (UTC)
Well yes indeed. The majority of anonymous contributions (not anonymous edits to existing words) are very poor. What I would like is for people to take a simple test (construct some sort of dummy entry with reasonable format) and, only if they make a decent fist of it, then get a flag set that allows them to be a contributor. Probably not feasible though. But seriously, it might be a good thing to put to the vote - we haven't had a vote for a while. SemperBlotto 21:19, 11 October 2006 (UTC)
There are a few (maybe 15) IPs which contribute prolifically and constructively, and while I would hate to lose their support, I have a feeling that they would finally register an account if their hand was forced. Let's do it. - TheDaveRoss 21:33, 11 October 2006 (UTC)
I have very good reason to seriously doubt that. I tried forcing a name on one rather prolific anon contributor, and he stopped contributing here, until I undid the change. (He's the #1 anon on WP.) For the other fourteen, you may be correct. --Connel MacKenzie 00:00, 13 October 2006 (UTC)

Interesting point about good edits being filtered quickly. Actually, I'm feeling better about the anon-editing thing because, since writing it, I've noticed a lot of foreign-language contributions by anonymous IPs.

If we can get translation sections divided sense-wise by default, that just might be enough to tip the scales. Translations are the kind of edits that I feel really good seeing because I know I couldn't add them myself. I see anonymous IPs and anonymous-to-me new contributors still marking numbers, and I feel bad because the contribution is basically nil. Some day a contributor who would be able to add the translation him/herself anyways is going to have to come by and check it.

But that doesn't always work. What happens is that a vandal can hit us in a language we don't speak, so that it's much harder to detect it. I reverted one "translation" today that I am 99% sure was vandalism, but it's hard to know when you don't know the word. This case was at least for a language I kind-of speak, so I could evaluate it somewhat, but were someone to add Telugu words for genitalia all over entries, I wouldn't be able to make such a determination. --EncycloPetey 23:12, 13 October 2006 (UTC)
You will note that my argument barely concerned vandals at all. I was very seriously suggesting that, vandals aside, the only people who contribute anything of worth are those who come back time and again. The group of contributors who could be characterized as spotted editors with good intentions do not, on the whole, contribute to the project. That was my assertion, initially. The number of legitimate translations is potentially an argument against. DAVilla 08:20, 14 October 2006 (UTC)

However, that doesn't really negate what I said before. It might be that anon IPs are registered with other Wiktionaries and don't bother to take the time and register at each one. If we ever get cross-project and/or cross-language accounts, then I would very seriously consider studying this. Despite the "friends of gays" humor, this is not Wikipedia and the differences are pronounced as I've laid them out above.

Very true - the humor was not meant to detract from your points, only to provide some levity. The differences are quite pronounced, indeed. --Connel MacKenzie 20:43, 13 October 2006 (UTC)
I'm certain I've read that long ago, and it was no less funny this time around, probably more so. DAVilla 08:20, 14 October 2006 (UTC)

In the meantime, I propose semi-protecting some of the, um, more carnal Wikisaurus pages. This does very little but it might make lighter work for a couple of people. I know this has been brought up before, and at the time in fact I was against it because I felt they were vandal magnets. But the truth is that the contributions to these pages aren't pure vandalism, they're just sketchy, and it's just as well to have people editing them whom we have a little more confidence in. DAVilla 22:16, 11 October 2006 (UTC)

I support the protection of the sketchy WS pages. But I think it should be WT:VOTEd on, as there was so much dispute about consensus, last time a major/minor change was attempted with something in the WS namespace. --Connel MacKenzie 23:47, 12 October 2006 (UTC)
I protected (auto/sysop) Wikisaurus:homosexual about a month ago, and either no one noticed, no one minded, or no one cared, because no one complained. I am in favor of protecting, purging, formatting, cleaning and making these pages more in line with something one might consider...um...valid? I am willing to do the work, but last time I hit so much resistance that I gave up. If there were a vote and consensus indicated that cleanup would be allowed (last time I cleaned for 3 hours and then had it reverted) I will clean them up again. All of them. - TheDaveRoss 23:52, 12 October 2006 (UTC)
OK, I have started such a vote at WT:VOTE. --Connel MacKenzie 20:41, 13 October 2006 (UTC)

I have an additional suggestion. For each of the problematic Wikisaurus pages, that would be semi-protected under this proposal, put a simple note at the top saying, "This page is full. Please add additional synonyms to [[/overflow]]." That way the kids can continue to have their fun thinking of new words for penis and breasts, but nobody else has to look at them. —scs 00:38, 16 October 2006 (UTC)

Oh! Lookit that. Wikisaurus:penis and Wikisaurus:breasts are already doing just that. —scs 00:40, 16 October 2006 (UTC)
I still find fault with the notion that we have to act as a repository for any swill that the twelve year olds can come up with. Why is that? - TheDaveRoss 01:48, 16 October 2006 (UTC)
It's an unexpected result which, arguably, we have to live with. The proof is not direct:
1. We are, like Wikipedia, the free dictionary anyone can edit.
2. Our success is (like Wikipedia's) due to our accessibility, to the extremely low barrier-to-entry for editing.
3. Anything which detracts from our openness (mandatory registration, page protection, etc.) has an unknowable but significant deleterious effect on our openness and must, therefore, be avoided if at all possible.
4. If we're open, we're open to everybody; we can't say "The free dictionary anyone can edit as long as they make only edits we like."
5. Demonstrably, there are ample numbers of editors who are irresistibly driven to add new synonyms for penises, breasts, sex, and masturbation.
6. If we worked too hard to prevent them (and no matter how seemingly important the short-term gain of diminishing the embarrassingly puerile content of those pages), we would conflict with point 3.
7. Also, if we worked too hard to prevent them, there would be a backlash: some fraction of the frustrated aspiring penis synonym adders would waste our time arguing about the restrictions, or would turn to vandalism out of spite or frustration.
8. Therefore, giving them some reasonably painless and low-key "out" is a low-cost compromise which, as I said, lets them "continue to have their fun" without pissing anyone off.
Now, I freely concede that any "proof" with that many steps in it may well have some errors or unsound inferences along the way. I'm not prepared to defend that "proof" to the death; I don't expect everyone to be swayed by it. But it's why I conclude that "overflow" pages like Wikisaurus:penis/more and Wikisaurus:breasts/more are a reasonable and appropriate solution to the problem.
In the argument above, point 3 is the most important. If an aspiring first-time editor comes to our supposedly open project and discovers that he has to go through some registration process first, or that too many pages are protected and can't be edited after all, he may say to himself, "eh, never mind" and wander away again. Moreover, this can happen just as well when the aspiring editor was not here to add Yet Another synonym for penis, but was rather here to make some quite useful change, perhaps the first of many, perhaps as a toe-in-the-water prelude to registering and becoming a valued long-term contributor. Ergo, we can't (for example) discourage anonymous editors too severely, even though it's anonymous editors who cause most of our annoying nuisance edits and vandalism, because there's no way a priori to distinguish between the well-meaning first-time editors and the annoying ones.
Or, in a nutshell, we have to (sometimes) act as a repository for any swill that the twelve year olds can come up with so that we can be as open as we have to be to also attract the editors who will actually write the open dictionary. The occasional pockets of swill are, unquestionably, among the prices we pay for our openness, but it does seem that the result (i.e. the rest of our non-swill content) is worth that price. —scs 02:54, 16 October 2006 (UTC)
I am more concerned about the folks who want a resource than the ones who want a project. Yes, I think that it is important that anyone can edit, but you are wrong about number 4 there, we do have plenty of restrictions on what can and can't be included, why doesn't this cover Wikisaurus? I am not willing to put effort into that portion of Wiktionary anymore because of the uselessness that I see in it, there is no reason to add valid content because no one will ever trust it as a resource of merit while the criteria for inclusion there is limited to the imagination and the ability to click the edit button. I would just as soon lose a few potential editors and do the work myself, if the result is something useful and accurate, to gaining those potential editors at the cost of the usefulness and validity of the project as a whole. It makes Wiktionary look bad to even have that portion of it, let us fix it. - TheDaveRoss 07:24, 17 October 2006 (UTC)
If it doesn't make me sound like a namby-pamby fence-sitter, I don't disagree with anything you've said. A couple of those pages are, indeed, an embarrassment. But it's tough to find just the right balance to strike between openness and control. —scs 18:27, 17 October 2006 (UTC)
Note that there's nothing preventing cleanup/verification of the spill-over pages, or from someone requesting moves of content from the spill-over into the main page, if anyone's so inclined. By the way, I like the current /more sub-page vs. /overflow, as a name, FWIW. The term overflow trivializes the content, which is undesirable, even if much of it turns out to be in fact trivial. --Jeffqyzt 12:51, 18 October 2006 (UTC)

(can I go back to the left margin?) I'd just as soon dump Wikisaurus entirely. In all the times I have ever looked at Recent changes, I have never seen an edit that wasn't to breasts, penis or some such. (Are there any other words in Wikisaurus?). I looked at the stats once and Wikisaurus:breasts (IIRC) was the second highest page, probably from Google when people look up all those slangy words. Is this really how we want to present to the world? Robert Ullmann 13:16, 18 October 2006 (UTC)

I and several others have put a fair amount of work into the legit side of Wikisaurus, including formatting and content; we got the anatomy pages down to less than half of the total number of WT:pages. User:TheDaveRoss/to_saurus/cleanup#Articles has a list of articles that existed around the time I left, divided by whether or not I considered their content valid. - TheDaveRoss 15:33, 18 October 2006 (UTC)
Indeed. I went to the vote page to support protecting the pages, and saw you were very eager to fix it. Go for it! Robert Ullmann 16:23, 18 October 2006 (UTC)

Current events

Can anyone think of meaningful content for this page, or shall we nuke it from the sidebar? - TheDaveRoss 21:36, 11 October 2006 (UTC)

It is supposed to redirect to WT:AN, right? --Connel MacKenzie 22:06, 11 October 2006 (UTC)
Now it does, unless someone can think of something better. It has also been protected. - TheDaveRoss 22:09, 11 October 2006 (UTC)

Personal matter

I'd like to direct your attention to a vote going on here. Naturally, only regular contributors could be counted towards such an important decision, or at least those who can prove their trustworthiness. Vote will close within a day of the nineteenth legitimate submission. DAVilla 05:04, 12 October 2006 (UTC)

Bot category move requests page?

Looking at Category:German idioms, I was about to move it to the correct location Category:de:Idioms, then paused. Since I'm not 100% certain, I'll let it linger, but what I really wonder, is, is there a good place for me (or anyone) to request/discuss this kind of move? Should we have a Wiktionary:Requests for category moves page? Something like a 24 hour wait period for discussion, then one of the (growing pool of) bot operators could just zap it over?

Good idea/bad idea/comments? (About the request page, not this one individual move.)

Thanks in advance, --Connel MacKenzie 23:44, 12 October 2006 (UTC)

How is it not the correct location? We put POS (and ~POS;-) cats under (language) (POS); it should be German idioms. The de: cats are for topics; de:Idioms would be German words about idioms ... Robert Ullmann 08:45, 13 October 2006 (UTC)
Apparently I lapsed into momentary stupidity there. All the more reason to have a "Requests" page, for the added sanity checks provided there. --Connel MacKenzie 16:40, 13 October 2006 (UTC)

Collapsible translations sections

As a result of a suggestion in WT:GP about hiding translations sections, I put together templates to do it. Like this:

{{trans-top|discussion space}}
*Language 1: one
*Language 2: two
*Language 3: three
*Language 4: four

With the idea that if people liked it, we might add this to {{top}} or do something like that. There are examples at get, orange and book. There has been a bit more discussion at Wiktionary talk:Translations. Aaronsama then added them to WT:ELE ... I've (at least temporarily) reverted his edit ... it isn't bad, but I think it ought to be raised here first of course. Quite a number of people have thought along these lines.

The templates also need some technical work, but of course that can always be done. Robert Ullmann 17:00, 13 October 2006 (UTC)

I am obviously enthusiastic about switching over to a system like this. I think it would be one significant step closer to making Wiktionary more usable as a translation dictionary. However, I would advise against adding it to {{top}} for three reasons:
  • The syntax for {{top}} is different than the syntax for {{trans-top}}.
  • {{top}} is likely being used for things besides just translations.
  • I think it should be very clear that the template is specifically for translations.
I know the switchover would be a significant amount of work, but doing it right is always better in the long run. In any case, I'm pleased we have a possible solution to a longstanding problem. --Aaronsama 17:27, 13 October 2006 (UTC)
{{top}} is not used for anything other than translations. Or, any place it is used that isn't a translation table, is routinely corrected to {{top2}}...but I haven't seen this particular error in quite some time now. --Connel MacKenzie 17:50, 13 October 2006 (UTC)
This question is actually irrelevant to the discussion. Because we currently put a summary on a separate line, each and every page would need to be altered anyways. DAVilla 18:42, 13 October 2006 (UTC)

I don't see any reason not to use the same framework for all sections, that is, to use the same {mid} and {bottom} for any {top}. The difference for translations is that there is a parameter to {top}, so if you want to make only translations collapsible and if you want to distinguish them in color etc. all of this can go into the {top} code, which can easily distinguish translations from other sections simply based on the existence of a pipe | and following text. In the future we could always add special parameters as well.

Can we see an example of what an uncollapsible frame might look like, that is, a {top} using code similar to WikiNews but showing a simpler box, without a visible frame or anything? DAVilla 18:36, 13 October 2006 (UTC)
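The pipe-based distinction described above can be illustrated with a minimal sketch. It is written in Python rather than template code, and the function names are hypothetical; it only shows the decision rule (a {{top}} call with a pipe and following text is treated as a translation section), not how the real template is implemented.

```python
def parse_top(call):
    """Split a template call such as '{{top|gloss}}' into (name, gloss).

    gloss is None when there is no pipe, or nothing after the pipe.
    """
    text = call.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call: %r" % call)
    # partition() splits at the first pipe only; sep is '' when no pipe exists
    name, sep, gloss = text[2:-2].partition("|")
    gloss = gloss.strip()
    return name.strip(), (gloss if sep and gloss else None)


def is_translation_section(call):
    """Assumed rule from the discussion: {{top}} plus a parameter means translations."""
    name, gloss = parse_top(call)
    return name == "top" and gloss is not None
```

Under this rule `{{top|discussion space}}` would get the collapsible treatment, while a bare `{{top}}` would be rendered as a plain box.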

I like this option, does it mean we can finally change the ugly yellow background? - TheDaveRoss 18:48, 13 October 2006 (UTC)
That's tangential of course, but your point is made. DAVilla 19:20, 13 October 2006 (UTC)
But not very clearly. Could someone please put together more coherent examples of the different flavors of what is being proposed? Clearly stating just what is and what isn't being proposed would help too. --Connel MacKenzie 20:33, 13 October 2006 (UTC)
Basically there are three completely independent discussions going on. First, how should the templates be named? Second, which code should be used where? Third, how should they look? Every possible decision for each of those questions can be accommodated with minimal impact on the others, with the exception that {{bottom}} cannot be overloaded with two different code types.
(Bottom can be overloaded that way, if we have it close two divs, and have all top variants open two divs, needed or not. (minor technical point, if anyone doesn't understand this, don't worry!) Robert Ullmann 21:47, 13 October 2006 (UTC))
Right... though that's not the same code that's there now, my point, but I guess your point is it's pretty close. The thrust is we really can think of these as independent. DAVilla 22:08, 13 October 2006 (UTC)
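The "close two divs" point can be sketched as follows. This is an illustration in Python that emits HTML strings, not the actual template code; the class names are invented for the example. The idea is that every top variant opens exactly two divs, needed or not, so a single shared bottom can always close two.

```python
def top(gloss=None):
    """Open exactly two divs, whether the section is collapsible or plain."""
    if gloss:
        # collapsible variant: outer frame, a self-contained header, inner content div
        return ('<div class="frame collapsible">'
                '<span class="header">' + gloss + '</span>'
                '<div class="content">')
    # plain variant: the inner div is not strictly needed but keeps the count at two
    return '<div class="frame"><div class="content">'


def bottom():
    """Shared closer for every top variant."""
    return '</div></div>'


def render(body, gloss=None):
    return top(gloss) + body + bottom()
```

Because both variants leave the same number of divs open, the choice of top does not affect which bottom is needed.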
For the first question, do we want something that's easy for anyone to pick up and remember, or do we want something that's specific to the use? The latter is an important direction for templates in general, such as the structural names for {{italbrac}} and {{italbrac-colon}} rather than an abstract name related to the use in {{synonyms}} for instance. In that case the abstract name is easier to remember, but in this case I think the structural name {{top}} will do.
My comments above apply to the second question. TheDaveRoss's comments apply to that idea, at least what he understood of it, but also specifically the third question, which can be altered even now without the new code. DAVilla 20:59, 13 October 2006 (UTC)
I think this should go back to the grease pit for a couple days, to address (or at least understand better) DAVilla's concerns. --Connel MacKenzie 04:42, 14 October 2006 (UTC)

Am I right in thinking that this isn't supposed to replace ALL translation sections, only to help with those pages which are very crowded/cluttered etc? Or are we planning to hide all translation sections? Widsith 06:25, 14 October 2006 (UTC)

I would be in favor of doing this for all translation sections, and having a preferences option to default them open or closed. The reasons for doing all of them are twofold: first, it is good to have consistency; second, all translation sections will hopefully grow very large. - TheDaveRoss 20:13, 14 October 2006 (UTC)
I had assumed that this was related to the WT:PREF setting for "hiding translation sections," but apparently this is very much the same thing...perhaps a pretty version, right? Perhaps simply customizing that feature to do this, would be better? The goal is to make it a user preference, is it not? --Connel MacKenzie 20:26, 18 October 2006 (UTC)

If we're going to try this for the translations sections (and I do like the idea), then we should also be thinking about doing Quotations the same way. Right now, long Quotations sections are being shunted to a separate Citations page. This makes it very difficult to coordinate definitions changes with the quotations, since it is not always obvious that the citations exist, and in some cases I've found citations pages for entries that didn't even exist yet. --EncycloPetey 22:13, 18 October 2006 (UTC)

I agree we should also be thinking about doing Quotations the same way. --Enginear 19:42, 20 October 2006 (UTC)
I suggest extending this to other (possibly all) sub-definition sections, including Synonyms, Antonyms, Derived terms, Related terms etc. Ncik 00:36, 22 October 2006 (UTC)
What we need is a very generic form of this template, something we can use to create many styles and implementations of the same idea, without creating lots of javascript and css classes for each. - [The]DaveRoss 04:54, 22 October 2006 (UTC)
Has this started? I'd think this would require some community agreement before moving beyond one or two examples. I think the parameter in {{top}} is quite clever. Hopefully, not too clever. Bot-converting these 26,000 entries might be advisable, if this gains strong support. --Connel MacKenzie 15:22, 19 October 2006 (UTC)
  • I am seeing this more and more often. The more I see it, the more it occurs to me that it works and works well. Did a vote ever start for it? I don't recall seeing a request for bot converting the translation tables to put the subsection heading inside the template. FWIW, I think the technique is cleaner than what we've used in the past. --Connel MacKenzie 19:11, 11 November 2006 (UTC)

Translation links to sister language dictionaries

From time to time people ask why our translations link to entries here in the English wiktionary, rather than to the appropriate other-language wiktionaries. We all know the answer to that, but I've been meaning to ask why we don't do what de.wiktionary.org does, namely link to both.

I just came across a page of ours that does do this, although with an odd convention of hiding the links behind an unobtrusive degree symbol, °. See bouquet#Translations. That's probably not the best or most obvious way to do it, but while we're talking about translations, do we want to pursue something like this, too? —scs 14:38, 16 October 2006 (UTC)

I've noticed that in other wikts, and it turns out to be quite useful. It works well to have a template for the translation line with the code, then each word paired with transliteration or whatever. Then it generates all the links: wikilink the word, sister-link the word in the other wikt, generate the other info, then the next word. For an example, see rw:Template:isemura which I've just done. (;-) (and see rw:mudasobwa and rw:kuwa mbere, note that I'm in the middle of adding a bunch of the language name templates) Robert Ullmann 15:01, 16 October 2006 (UTC)
Oh, one thought I may try in the Kinyarwandan wikt (that would not be good here!) is to link to the word in rw.wikt if it exists, else link to the sister wikt. Since most words in most languages (other than fr and en) will not be in the rw.wikt for a while (unless we do some massive imports or something). Not a good idea here, we want the redlinks so people can see what to add. But there, automatic links especially to fr and en would be very useful. (Those are the other two national languages in Rwanda) Robert Ullmann 16:10, 16 October 2006 (UTC)
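The link generation described in the two comments above can be sketched roughly as below. This is Python producing wikitext, not the code of any existing template; the output layout and the function names are assumptions made for illustration. The first function pairs a local wikilink with a sister-wikt link; the second shows the fallback idea (local link if the entry exists, otherwise the sister wikt).

```python
def translation_links(lang, word, gender=None, translit=None):
    """Build one translation item: local link, sister-wikt link, extra info."""
    parts = [
        "[[" + word + "]]",                              # entry on this wikt
        "[[:" + lang + ":" + word + "|(" + lang + ")]]", # same word on the sister wikt
    ]
    if translit:
        parts.append("(" + translit + ")")
    if gender:
        parts.append("{{" + gender + "}}")               # gender template, e.g. {{f}}
    return " ".join(parts)


def fallback_link(lang, word, local_titles):
    """Link locally when the entry exists here, otherwise to the sister wikt."""
    if word in local_titles:
        return "[[" + word + "]]"
    return "[[:" + lang + ":" + word + "|" + word + "]]"
```

The fallback variant hides redlinks, which is why it suits a small wikt but, as noted above, would not be a good idea here.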
It would be better to use a common symbol for these links, such as (*) or (^). The language codes are confusing...most of them are unfamiliar to most people, and many of them look like they could be definite or indefinite articles, prepositions, or abbreviations. —Stephen 04:32, 17 October 2006 (UTC)
The language codes may be confusing, but it seems to me that an obscure symbol is even worse. It's not obvious what it's for; it's easy to overlook; it's hard to click on even if you do notice it and know what it's for. —scs 17:44, 17 October 2006 (UTC)
There is some clever Monobook.js code floating around that finds all valid interwiki links on a page, and adds them to be within the translations section also (if it can.) This seems like a more elegant approach, to me, rather than filling the entries with links that may or may not work. Shall I add this? --Connel MacKenzie 07:47, 17 October 2006 (UTC)
Yes, we should try that to see how it works. —Stephen 07:56, 17 October 2006 (UTC)
We have already in quasi-use the Template Template:t, which works as {{t|fr|traduction|f}}, comes out as traduction f. I think Paul G, Polyglot and Wonderfool have all used it, some more than others. --DWarF 08:17, 17 October 2006 (UTC)
Sorry, WF, but that just clutters things. Adding something like http://bs.wiktionary.org/wiki/MedijaViki:Monobook.js#interwikiExtra has the same effect for the reader, without the clutter for the editors (and updated automatically, with each run of the interwiki bot.) --Connel MacKenzie 08:28, 17 October 2006 (UTC)
Pardon me, you are talking about two different things; scs and I (etc.) are talking about links to the foreign word in the foreign wikt. The interwiki links are to the English word in the foreign wikt. Robert Ullmann 11:43, 17 October 2006 (UTC)
Right you are. Sorry about that. Shall I just forget the other thing, then? Or should I modify that code to assume that if a translation exists on the other language Wiktionary, that the word(s) in that language also exist on that Wiktionary? --Connel MacKenzie 05:38, 18 October 2006 (UTC)

There doesn’t seem to be a consensus here. Apart from {{t}}, there is also {{trad}}, which seems to do more or less the same. It is used in the Dutch wikt, for example. I think it is very useful, and not very cluttering. See die, where I used it in Dutch translations. It should be possible to have a bot change all translations to use this template, based on the language which comes before it. Do we want to use something like this? WT:ELE dictates this should not be the case, but I think it is nicer: if somebody does not know the Wikt system very well, he will not think to click on that word and look for a link to another wikt there. Moreover, it takes one click-and-search less and also generates a link if that page does not yet exist on the other wikt. unsigned

I've merged t and trad, and made trad a redirect to t. Also checked within {{t}} for proper gender template names and called the templates (so the user customization works). And fixed the Chinese codes; Min Nan is "nan", but the wikt is at "zh-min-nan", which could be fixed with a simple DNS CNAME record, but no-one has done that yet! Likewise yue and cmn. None of which addresses whether we want to do this, but we have users from other wikts who are very persistent in demanding that we do (;-). We can always replace them later ... (Main discussion at Wiktionary talk:Translations) Robert Ullmann 13:38, 14 November 2006 (UTC)
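The code mismatch mentioned above (Min Nan is "nan", but its wikt is at "zh-min-nan") amounts to a small lookup table. The sketch below is Python, not the template's actual logic; only the "nan" entry comes from this discussion, and the names are hypothetical. The corresponding entries for yue and cmn are not filled in here because their targets are not given above.

```python
# Language code -> wikt subdomain, for the cases where the two differ.
# Only "nan" is taken from the discussion; other exceptions would be added here.
WIKT_SUBDOMAIN = {
    "nan": "zh-min-nan",
}


def wikt_host(lang_code):
    """Return the host name of the wikt for a language code."""
    sub = WIKT_SUBDOMAIN.get(lang_code, lang_code)  # default: the code itself
    return sub + ".wiktionary.org"
```

A DNS alias for the short code, as suggested above, would make the table unnecessary.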

WT:BLOCK upgraded

I have just upgraded WT:BLOCK to semi-official, please continue to look it over, and watch out for a vote to raise it to official policy status later this week. Next on the list is Wiktionary:Page deletion guidelines so have a look and make changes you think it needs. - TheDaveRoss 20:53, 16 October 2006 (UTC)

It should be pointed out that all users should look at this and comment and/or vote. It isn't some private sysop thing; it is our common policy. Robert Ullmann 11:37, 17 October 2006 (UTC)
Absolutely, the more people who look at it and comment the better we can feel about it reflecting consensus. This is one of those cases where silence indicates support. Thanks Robert. - TheDaveRoss 16:15, 17 October 2006 (UTC)
Most of what’s on that page is incomprehensible to me. What little I can figure out (if I understand it correctly) seems to have a lot of unnecessary repetition. What, for example, is the difference between "blatant vandalism" and "vandalism only accounts"? What’s the difference between "blatant vandalism" and "vandalism"? —Stephen 21:27, 18 October 2006 (UTC)
Blatant vandalism applies more to anonymous IPs, whereas a vandalism only account would be an account which is created with the sole intent of vandalizing. I am pretty sure that removing the repetition would cause it to be less clear, but it is a wiki, feel free to cut out what you find redundant. This is the time to make modifications, please, please do so. - TheDaveRoss 03:25, 19 October 2006 (UTC)
In my experience, 99.99% of cases of vandalism are done by accounts created in order to commit vandalism. About the only vandalism accounts that were created for legitimate purposes have been Primetime, Wonderfool, EddieSegoura, and some of the Jahbulon detractors. So it seems to me that a simple category of vandalism should be sufficient to cover the topic. I can’t really offer any other suggestions on the page since I don’t know what it’s talking about for the most part, especially everything in RangeBlock and BlockDuration. What’s the difference in vandalism and pure stupidity? How do you recognize "bad" sockpuppets? What’s the difference between random spam and vandalism? —Stephen 04:39, 19 October 2006 (UTC)
Ah ha! These are things I can address. The simple answer to range blocking is that if you don't understand it, don't use it. I would be more than happy to go into more depth than WT:BLOCK and Wiktionary:Range_blocks if you would like, but the easier solution is to ask someone who knows what is going on with those blocks and with ARIN checks to do the actual blocking. Yes, most vandalism is from accounts created for the purpose, or anonymous accounts; however, the idea is for the policy guidelines to be comprehensive enough to still apply when those rarer (and often more contentious) situations arise. The reason why we subdivide the page is that there are many degrees of vandalism (or so I and the folks I discussed it with feel) and one block type and duration isn't universally applicable. There should be a different block for an IP which spams "Dave is a dork" than there should be for a logged-in, long-time user who persistently reverts someone else's good faith edits but refuses to discuss the changes. The intent of the page is stated as clearly as I could state it at the top of that page. As for the difference between vandalism and stupidity, that has a lot to do with perceived intent, persistence, type of vandalism...it is tricky but mostly we know it when we see it I think. Bad sockpuppets are the kind which are used to vandalise, evade blocks, etc, while a good sockpuppet doesn't, simple as that. - TheDaveRoss 05:04, 19 October 2006 (UTC)


I've found about 30 words in various translation tables given for "Proto-Polynesian". All entries begin with an asterisk, which suggests they are hypothetical reconstructions rather than actual words. However, I can't find anything written in Translations Policy pages that spells out our criteria for "proto"-language translations. So, do we remove them all and amend Wiktionary:Translations? --EncycloPetey 22:59, 16 October 2006 (UTC)

I added them. They're interesting and useful as are words for all proto-languages. In particular they're useful for putting in the etymology sections of entries which don't yet exist for languages such as Hawaiian, Maori, Samoan, Tahitian, and Tongan. Naturally all words in all protolanguages are reconstructions and hence hypothetical but this does not reduce their status since years of research by trained experts goes into them. I cannot think of any reason you would want to remove them. Instead concentrate on removing junk words the kids made up yesterday. — Hippietrail 00:24, 17 October 2006 (UTC)

The last time this came up, they were all moved to Appendix:Proto- namespacing, with an asterisk preceding the root. --Connel MacKenzie 07:58, 17 October 2006 (UTC)

Han characters 2

Copied from above; the issue is to clean up the Han character (CJKV) entries and make sure we have no problem with copyright. Robert Ullmann 18:56, 18 October 2006 (UTC)

Ok, speaking as an IP attorney, I think this is most akin to the situation in Kregos v. Associated Press, 510 U.S. 1112 (1994). There, a baseball reporter came up with a set of nine statistics that he thought were particularly important to determine which pitchers would win the day's games. The Supreme Court held that although Kregos could receive protection in the arrangement and presentation of the statistics, this protection would be very narrow. In essence, all that could be protected was the exact presentation. Any other paper that chose to publish similar statistics could do so as long as the alternate presentation differed from that created by Kregos in "more than a trivial degree", specifically finding it unlikely that the AP's form infringed where it included only 6 of Kregos' 9 stats and included 4 additional stats that Kregos did not.
In our case, we are attempting to provide as much information as possible about every character available. The actual information we seek to present is in the public domain. It is only, therefore, the particular selection and arrangement to which another party could lay claim. Hence, if we strip all non-essential information originating with the other source, include additional information (which we are going to do anyway), and change the arrangement to suit our purposes, we should be in a position to prevail over any challenge to our use of this information.
Cheers! bd2412 T 16:35, 17 October 2006 (UTC)
Having not heard anything back from the Wikimedia General Counsel, having asked repeatedly ... sigh. Given what you say, I think we should do this:
  • Format the info at the top of the entry, which Unicode can't really claim any copyright on (compilation, derivation or whatever) into Template:Han char under a Translingual header so that the format meets our standard.
  • Delete the "Dictionary information" section; it is page and line references to the unabridged versions of large dictionaries most people don't have anyway. (One of them is ~10,000 pages.) This information was developed by Unicode, they might have some claim, but so what; we strip it.
  • Delete the "Technical information" section; this is the Unicode/IS 10646 code point in hex and decimal, ditto the Big5 code point. Unicode has no claim on this, but there is no reason to have it. We don't give the JIS codes (or ASCII, or whatever).
  • We already have additional information on a large number of entries.
  • Fix up the headers so that we don't have == Korean Hanja == and so on.
  • Cat them in Category:Han characters, sorted by radical and stroke. If and only if all this is done!
I've set this up on AWB. It can't be run automatically, there are just too many variations that people have introduced. I have matched most of the common patterns. See the entries that are in Category:Han characters, I've run a few. (Easy to roll back if there are concerns!)
Comments please? Robert Ullmann 18:56, 18 October 2006 (UTC)
My two cents, or yen, or yuan, or whatever:
  • I think it's worth keeping some of the technical information, at the very least, the Unicode code point. It's a useful "handle" and a very useful cross-reference (if for nothing else, to the rest of the Unihan information we're thinking of deleting).
  • Personally (though IANAL), I think the only information Unicode might complain about our abuse of their copyright on is the "Common meaning" phrasing. So if we're worried about copyright, I'd say we should delete all of those, or delete them if they're identical to the phrasing in Unihan.txt (older or newer versions).
  • I could go either way on the "Dictionary information" subsection. This, too, is potentially useful as a cross-reference for our readers. The original use for those references within Unihan.txt was, I suspect, mostly to validate the CJK unification work and to cross-check its coverage. Those uses don't apply to us, of course, so the question is, how often will those listings help one of our readers look up an ideograph in one of those other dictionaries?
scs 13:03, 19 October 2006 (UTC)
I would like to keep the code point information as well. I think it is useful, when dealing with so many coding schemes on the computer, to have a single place to look up this info. Might as well be wiktionary!
I strongly agree with dumping the common meanings section. There are simply too many problems associated with it. As a contributor, let me give you one of my main beefs with the common meanings/pronunciation section for individual Han characters. In Chinese, some han characters can be pronounced in several different ways. The pronunciation usually is associated with a specific meaning. For the Chinese students, we call this 多音字 (PRC) or 破音字 (Taiwan). For example, the character: can be a noun or a verb. As a noun, it is pronounced shù ([ ʂu˥˩ ]), and has a root meaning of number. As a verb, it is pronounced shǔ ([ ʂu˨˩˦ ]), and has a root meaning of to count. Now take a look at the common meanings section for this character (while you're at it, take a look at the pronunciation section as well). I can't for the life of me figure out how I would indicate the above, given the constraints imposed by the common meanings section. On top of which, you're not given the part of speech (unlike the rest of wiktionary)! Incidentally, this is the entry from one of our competitors. I think we should shoot for at least as good as, if not better than our competition.
Finally, I agree with getting rid of the "dictionary with page numbers" section. The referenced dictionaries are mainly for scholars. If a scholar can't find a character in a dictionary without first looking up the page number on wiktionary ... (you make up your own punchline!) :)

A-cai 13:34, 19 October 2006 (UTC)

Okay, I rolled back the ones I had done, and made a number of changes, then ran a few more. See .
  • I'm using two templates (as I was before), one for the info under the Han character header, one under References.
  • All the data in the page is stuffed into the templates (except alternate form, which goes into {{see}} where it belongs); this means the process isn't losing any information. (In a formal sense, it is reversible, except for white-space, and the presence or absence of headers for unused fields.)
  • The unicode code point is in the second template.
  • The common meaning is stuffed into the first template as a parameter, but not used.
  • The dictionary references are stuffed into the 2nd template, and not used.
  • The second template generates a link to the Unihan database, where all that can be found. It can also do other things with the codepoint if we like. (Remember the codepoint IS the page title, coded in hex; no copyright problems there!)
  • We can always choose to display/hide things by modifying the template(s); if and when we run into trouble with having one of the data fields in the wikitext, we can bot-strip just that field from the template call.
I think that addresses the points so far. What else should we look at? Robert Ullmann 17:42, 19 October 2006 (UTC)
Small correction to the above: the dictionary data comes from ISO JTC1/SC2/WG2 (Joint Technical Committee 1, Sub-committee 2, working group 2, i.e. the IS 10646 character set group). It isn't Unicode copyright. We can do what we want. (ISO standards are only publication copyright for the documents themselves, the data is true public domain.) Robert Ullmann 12:50, 20 October 2006 (UTC)
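The remark that the codepoint is simply the page title coded in hex can be illustrated with a minimal Python sketch. The function names are hypothetical, and the lookup URL pattern is an assumption about the Unihan database's query interface rather than something stated in this discussion.

```python
def codepoint_hex(title: str) -> str:
    """Return the Unicode code point of a single-character page title as hex."""
    if len(title) != 1:
        raise ValueError("expected a single-character page title")
    return f"{ord(title):04X}"


def unihan_link(title: str) -> str:
    """Build a link to the Unihan database entry for the page title."""
    # Assumed URL pattern for the Unihan lookup; adjust if the interface differs.
    base = "https://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint="
    return base + codepoint_hex(title)


print(codepoint_hex("字"))  # 5B57
print(unihan_link("字"))
```

Because the code point is derived from the title itself, a template could in principle compute the link without storing any Unihan data in the wikitext.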
Looks good so far. I may be preaching to the choir (hopefully), but I think we need to get rid of the hanzi/hanja headers, and replace them with something more informative. It's like saying:

===alphabet letters===

  1. dog
I think it would be better to replace hanzi with the appropriate part of speech. Most Chinese characters only come to life when used with other Chinese characters. More often than not, a Chinese character is a component of a word, not a word itself (unless we're talking archaic Chinese, which is a different story). Think about these words in English: transfer, ferry, conference. The fer comes from Latin ferre. The component fer would get its own Chinese character if fer were part of the Chinese language. I believe we call this an affix. I would say that a substantial number of individual Chinese characters are really used as affixes (prefixes and suffixes), and should be labeled as such. For the individual characters that can stand by themselves, we should label them noun, verb etc. like every other entry in Wiktionary. Anyway, I think I may be rehashing old ground. If so, please ignore my babbling :)

A-cai 13:46, 20 October 2006 (UTC)

I just sort of figured ==Chinese== ===Hanzi=== was an improvement on ==Chinese Hanzi== which isn't a language header. Of course it should be POS (or whatever particle/suffix/etc.) But the POS-etc. headings have to be done as edits to individual characters. In the meantime the hanzi/hanja headers serve as the section for the bulleted romanizations and the compounds. Robert Ullmann 14:04, 20 October 2006 (UTC)
Note that my AWB ruleset (22 rules, and that is for the first two dozen entries ;-) also fixes a few other things along the way:
  • subst {{kanji}} so the Readings header isn't inside a template
  • change all Compounds headers to L4
  • some WS cleanup
  • change top->top2, mid->mid2 since they are used with compounds, not with translations (this is very common)
  • removes "Other info:" when there isn't any.
I'm not messing with the internal syntax of the Chinese section, which should be 2-3 separate language sections anyway. Robert Ullmann 15:37, 20 October 2006 (UTC)
Also adding the correct radical and stroke index parameter for the ja-kanji template. (23 rules, will be 24 ;-) Robert Ullmann 16:24, 20 October 2006 (UTC)
I modified to show what information I think is lacking. Note that Min Nan has a meaning that is not shared with Mandarin. I don't think a bot could make the entry look like this. The downside of that is that it will take many years for a single Chinese speaker to slog through thousands of entries. The good news is that I have made it through all of the a's for HSK Beginning Mandarin :-)

A-cai 03:22, 21 October 2006 (UTC)

So where then do we put the compounds? They aren't going to be the same for Cantonese/Mandarin/Min Nan/etc. correct? All the things in the "Chinese/hanzi" section belong under the language headers. (Oh, if zh-forms is at the top, I don't think we need to repeat it? you think?) I could break the "Chinese/hanzi" section into Cantonese and Mandarin, and sort the languages correctly, but I'd have to assume the compounds went with Mandarin. (Another half-dozen rules in the ever-growing ruleset ;-). Probably better not to? Also: when we have more people, it won't take that long; the way to get more people is to make a good start! Robert Ullmann 12:14, 21 October 2006 (UTC)
Take a look at . None of the words in the derived terms section exist in Mandarin! You are correct, it's dicey to lump compounds into a section called Chinese. The implication is that Chinese = Mandarin, which makes fitting other Chinese dialects into Wiktionary an awkward task.

A-cai 12:50, 21 October 2006 (UTC)

For the moment, I've changed the templates so they display the "common meaning" and the dict indices. That way essentially all the info is presently displayed. I think the Chinese/Hanzi section has to stay until an entry is edited; the 'bot-like code can't create a proper Cantonese section, and has no way of sorting the compounds. Please see . Robert Ullmann 12:44, 21 October 2006 (UTC)
Note that I basically left the Chinese hanzi section alone. I'm not yet convinced that we are faced with an either/or situation. It may end up making sense to keep both a Chinese hanzi section and Mandarin/Min Nan/Cantonese sections as needed. It may turn out that, as we attract a wider audience, we will get feedback that will make such questions easier to resolve.

A-cai 12:57, 21 October 2006 (UTC)

Seems like a reasonable idea for now. On another point: the dictionary indices may be more useful than one would think: for example the KangXi is online: www.KangXiZiDian.com with scanned page images. (It also has an index, so one doesn't really need to know that 字 is on page 0277, but still...) Robert Ullmann 14:24, 21 October 2006 (UTC)

New vote on numeral headers

I've started a new vote on Wiktionary:Votes#Number versus Numeral, pertaining to the use of headers like ===Cardinal Numeral=== in place of the current ===Cardinal Number===. The discussion should happen on the linked page at Wiktionary_talk:Entry_layout_explained/POS_headers#Number_versus_Numeral, rather than here, as that is the location linked from the vote itself and is where the discussion began. --EncycloPetey 22:17, 18 October 2006 (UTC)

Acceptable entries?

I am unclear about the policies for acceptable entries. Where do you draw the line when dealing with lesser used and virtually unused words? For example, I have seen "obsolete" word entries, taken from old dictionaries, which seem to me not to fall under the policy of acceptable words being ones which people are apt to want to look up. Are these not supposed to be here? I see that there is a policy that says words that have limited regional use are not acceptable. Why not?

I long looked for a word we used when I was a kid, and I finally found it only in D.A.R.E., which said it is used primarily in New England. I have sometimes looked for words with specific meanings, and there turn out to be obscure words which fit. Considering the policy that there need to be published uses, sometimes the only places you can find certain words are in special dictionaries, as in the case of obsolete or regional words, or those almost exclusively used orally. Are these dictionaries acceptable as published references?

It seems to me that there should be very little limitation on acceptable entries in the Wiktionary. Exclusion is the domain of commercial printed dictionaries catering to particular markets and constrained by limited space. Here, there is virtually no limitation on space and no commercial consideration. The presence of any word is harmless. If there is a qualification concerning its status as a word, that qualification can be included in the entry, such as its limited use. The principle of inclusion is what could eventually take the Wiktionary beyond the scope of Oxford. Abstrator 06:55, 19 October 2006 (UTC)

Yes. All real words are acceptable. If they have a local use you might have some difficulty in providing citations to prove their definitions though. Basically, every word is judged on its own merits. SemperBlotto 07:37, 19 October 2006 (UTC)
Not to state the obvious, but have you had a chance to look at our Criteria for Inclusion page yet? If not, that might clear up some confusion. --Jeffqyzt 18:06, 19 October 2006 (UTC)

Yes, this is what I was commenting on. There seem to be ambiguous and contradictory criteria. The "General Rule" and "Attestation" sections would exclude words such as I described above. Abstrator 05:50, 20 October 2006 (UTC)

Ancient Greek

I've been trying to find ways to systematically categorize and present Ancient Greek terms (at this point, mainly nominal; I hope to move on to verbs soon, but they are prone to greater disparities in accentuation paradigms!); it has been a bit difficult due to the fact that not only am I new to Wiktionary, but the presentation of the language on Wiktionary is in the earliest stages of development. I thus have several concerns:

  1. What is the appropriate manner to present the orthographic Romanization of these terms? I understand that Japanese terms written in Rōmaji are often wikified separately from the same terms presented in Kana/Kanji. As keyboards do not represent the polytonic orthography of Ancient Greek well, I feel it would not be inappropriate to likewise wikify the Romanization of the Greek terms so that searching for them is simplified.
  2. Ancient Greek is highly dialectical and words often vary greatly from one to another. I would propose that each dissimilar term have its own page. (For example, in the Attic dialect, young man is νεανίας, while in Ionic, νεηνίης.) Were separate pages not utilized, would a simple header "Alternate spellings" or "Other dialects" be the best possible way to represent these terms?
  3. The nominal templates I've created, while very specific, would most likely be extremely confusing to casual scholars of the language or those new to it. Would it be proper to create a page regarding these inflection templates? Conversely, ought I simply explain each template on its respective talk page?

Thank you. Medellia 16:52, 19 October 2006 (UTC)

  1. On the Latin Wiktionary I've been giving both the Romanization and Beta code, linking to the Beta Code without the accents and the Romanization as is (e.g. ἕκτος = hectos and E(/KTOS).
  2. I should expect dialect forms to be treated the same way as dialects in any other language (I'm not sure what the current policy is; but cf. armor and armour, octante and huitante).
  3. The use of a template can be explained on the page itself, using <noinclude> tags. Example at la:Formula:grc-declinatio-adj-oxy. The meaning of a template (i.e. what all the cases and such are for) might be better on a separate page, e.g. Appendix:Greek first declension. —Muke Tever 02:46, 20 October 2006 (UTC)
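The Beta Code example above (ἕκτος = E(/KTOS) can be made concrete with a minimal Python sketch. The function names are hypothetical, the mapping covers only lowercase letters and the common diacritic marks of Beta Code (capitals marked with `*` are not handled), and it is offered as an illustration rather than a complete converter.

```python
import re
import unicodedata

# Beta Code letters (uppercase Latin) mapped to lowercase Greek base letters.
LETTERS = dict(zip("ABGDEZHQIKLMNCOPRSTUFXYW", "αβγδεζηθικλμνξοπρστυφχψω"))

# Beta Code diacritic marks mapped to Unicode combining characters.
MARKS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}


def beta_to_greek(beta: str) -> str:
    """Convert a Beta Code string to precomposed polytonic Greek."""
    text = "".join(LETTERS.get(ch, MARKS.get(ch, ch)) for ch in beta.upper())
    # Use final sigma at the end of a word.
    text = re.sub(r"σ\b", "ς", text)
    # NFC composes base letters and combining marks into single characters.
    return unicodedata.normalize("NFC", text)


def strip_marks(beta: str) -> str:
    """Drop the diacritic marks, giving the accentless form used for linking."""
    return "".join(ch for ch in beta if ch not in MARKS)


print(beta_to_greek("E(/KTOS"))  # ἕκτος
print(strip_marks("E(/KTOS"))    # EKTOS
```

The accentless form is what a plain keyboard can type, which is why it is a convenient link target alongside the Romanization.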
Note that Rōmaji is an actual written form of the Japanese language, not (just) a romanization. That is pretty much the only reason we have entries for words in rōmaji. We don't (for example) have entries for romanized versions of words in Russian; they aren't written that way.
Second point: each form that is spelled differently of course gets its own entry, and "Alternative spellings" is the way variant spellings get linked together (in general).
The documentation for the template(s) should usually be on the corresponding talk page. Robert Ullmann 12:10, 20 October 2006 (UTC)
Minor correction above. DAVilla 19:49, 23 October 2006 (UTC)

I don't think Ancient Greek should have romanised page entries, although of course it's helpful to include romanisation within the Greek page itself. Widsith 12:30, 20 October 2006 (UTC)

Thank you all for your time, patience, and aid. I will begin re-editing templates as time permits. Medellia 02:42, 23 October 2006 (UTC)


Okay, I'm getting worried. Either I don't understand uncountable or someone else doesn't. Recently I ran across the entry Wikipedia claiming that it had an uncountable sense. I objected on the talk page and someone agreed with me, so I removed all mention of "uncountable" from the entry. Then I saw an example on Template talk:en-proper noun saying America could be uncountable (see that talk page to discuss whether any proper nouns can be uncountable — maybe there are some). Finally, I looked at Special:Whatlinkshere/Template:uncountable and noticed that many language names are marked as being uncountable. Examples: "Hebrew ... (uncountable) The language of the Hebrew people" and "Japanese ... (uncountable) the main language spoken in Japan". The most egregious of these is Chinese, which lists no fewer than 5 "uncountable" senses, only one of which (#7) I agree is definitely uncountable. Just because there is, in reality, only one of something doesn't necessarily mean it's uncountable. As I point out on Template talk:en-proper noun, for example, one can talk about "two Americas". I guess I can't think of a good example of using "a Chinese" (referring to the language) or "two Chinese(s)" (again, in the language sense) in a sentence, but does that really make it uncountable (I know, according to our definition, it would seem that the answer to that is yes)? Wouldn't it be better to just call such words proper nouns, or collective nouns, or whatever they are, and leave it at that? Maybe I'm just thinking too much in the mathematical sense of uncountable vs. countable.... - dcljr 17:51, 19 October 2006 (UTC)

There are (at least) three different reasons for a sense to be uncountable:
  1. It is a truly proper noun, (i.e. not just something conventionally capitalized in English) ex. North America, John. These are generally pluralizable, and pluralizing this kind of uncountable produces the sense "things or people called X" (I know a lot of Johns; Mark Twain's America is more romantic than the other Americas of history and fable; the Chinese I learned is not a Chinese one finds spoken around here)
  2. It is a mass noun, referring to a material rather than a concrete item, ex. water, wood. These are also generally pluralizable (though perhaps not as much); pluralizing this kind of uncountable produces the sense "kinds of X" (Many different hardwoods are processed by this plant) or "instances of X" (That'll be three large waters).
  3. It is an abstract noun, which is sort of a subclass of the preceding; ex. justice, transitivity. Pluralizing this (which is often harder) generally also produces the sense of "kinds of X".
I agree it is better to be more specific, thus on la: I stopped using innumerabile, replacing it with materiale or abstractum or proprium as the case may be. —Muke Tever 03:01, 20 October 2006 (UTC)
Don't worry, it is probably the other people. There is a simple test, like for it's/its (use it's where "it is" makes sense). Just try counting: zero Japanese, one Japanese, two Japanese ... makes sense for people, doesn't for language. So Japanese (the person/people) is countable, Japanese (the language) is uncountable.
Zero Wikipedias, one Wikipedia, two Wikipedias ... yup, countable! (Sure, Wikipedia as an adjectified noun, "wikipedia editing" doesn't seem countable, but that is always the case. And it would be the edits that are counted anyway. And are.)
And then don't let poetic exceptions like "two Americas" confuse you ;-) Robert Ullmann 18:06, 19 October 2006 (UTC)

This has come up before, without any definitive resolution. I hope that Wiktionary can decide to use the common usage of the term in explaining countability to its readers. For example, the sands of Southern California may be different from the sands of Hawaii, but we should explain that sand is a mass noun (or whatever the current favorite term is) that refers to millions of individual grains of sand. Emotions should also be clearer about singular/plural usage. If a term can be used as a plural but very rarely is (and only in certain grammatical cases) then we should have a better way of indicating it. AFAIK, we do not have a consistent method of doing so today. --Connel MacKenzie 18:11, 19 October 2006 (UTC)

Here's my take on countability.
  1. As was mentioned in an earlier rendition of the discussion, it's senses that are countable or uncountable, not words.
  2. If none of a word's senses is countable, there's no need to mention a plural on the inflection line.
  3. Trying to assert what the plural would be if the word had one (i.e. if we were to discover a countable sense we're missing, or if the language were to evolve a new, countable sense somewhere down the road) is quite unnecessary. We're descriptive, not prescriptive. If/when that countable sense is discovered, the world's English speakers are perfectly capable of discovering or inventing a plural form without our help. (And then we can list it, when we've got some live usage to cite.)
  4. As Connel says, there's also little point in devising really speculative example sentences just so that a plural can be listed. (Though I'm not sure "sand" is the best counterexample, since that poetic usage "the sands of time" is so prevalent and so mellifluous. "The snows of yesteryear" gets me going, too.)
  5. To me, the biggest reason to list a word (or sense) as "uncountable" is as a flag reminding us it's okay that the word has no plural listed. Words which don't have a plural listed, but which aren't explicitly marked as uncountable, may be in need of attention. But words which are marked as uncountable can be passed over (on that score) so that one can devote one's time to the words that do need attention.
Given that it's the senses that are countable or uncountable, it's arguably somewhat wrong to say "uncountable" in the inflection line (i.e. as if it applies to the word as a whole). In the same earlier incarnation of the discussion, someone suggested having the template display something like "no countable/plural senses attested", which is a fine idea.
scs 02:44, 20 October 2006 (UTC)
...with the test of attestation of a plural being whether it meets our CFI. --Enginear 19:39, 20 October 2006 (UTC)
It seems odd to me that the presence of a plural form should be linked to countability. Many (most?) uncountable nouns do in fact have plural forms, because of the use of the plural to indicate types. Dfeuer 20:18, 6 November 2006 (UTC)
That's how people endlessly confuse themselves: the existence of a plural form doesn't have anything to do with countable/uncountable. Fish, fish, fishes. The question is, is it two fish or two fishes? Answer: two fish for the ordinary sense (uncountable), two fishes for "type of fish" (countable). One fish, two fish, red fish, blue fish! Lesson for the child: fish is uncountable, and adding a general adjective doesn't modify countability: one red fish, two red fish. (And all this time I bet you thought Dr. Seuss was just for fun!) Robert Ullmann 22:12, 6 November 2006 (UTC)
It's been a few days, and I don't think this has properly been handled yet. I think Wiktionary needs to acknowledge that uncountable nouns often have plurals, and get rid of the link in the noun template between countability and the existence of a plural --Dfeuer 23:37, 13 November 2006 (UTC)
I think many of the deeper thinkers have been distracted by other issues. Hopefully, one of the MIAs will comment soon. --Enginear 00:50, 14 November 2006 (UTC)
Fish isn't the best example to give here, or you're going to just confuse everyone even more. Since Dr. Seuss can count them, fish as well as deer and moose are in fact countable, although the plural forms are identical to the singular. Then you have to consider that they also have uncountable senses for the meats of each animal. And fishes is unusual in that it is only a plural of the type Dfeuer mentioned, whereas we would usually use cats to classify lions and tigers or even rices for different types of (uncountable) rice. Then persons and peoples just throw a wrench into the whole equation. If there were more legal documents concerning geese, we would probably also have the word gooses. Luckily the geeses of the world have yet to unite! DAVilla 21:05, 24 November 2006 (UTC)


I happened to look at this definition and noticed it was incorrect.

I posted a corrected entry of similar length. I later noticed that this entry had been expanded, but that some of the information was incorrect. I left it mostly as it was but edited it so that it was correct and consistent (i.e. I left the format much the same but corrected the facts).

Shortly afterwards it was vandalised with the incorrect facts being reinserted. I changed it back again and left the reasons for doing so in the talk page. Minutes later the vandal "SemperBlotto" had spoiled the page again.

He clearly knows nothing about the word. The edits he makes are inconsistent with:

a) Themselves b) Wikipedia c) Normal usage

Dahl is a generic name for a husked pulse. The fact that it's husked is what makes it dahl, and not simply a pulse - at least the last time he edited the page he left that change in, whether by accident or design I don't know.

A Dahl in the sense of a meal is a meal made with any dahl (in the sense of an ingredient), not just lentils to which he keeps changing the text. He's not even consistent because in the expansions he lists other types of dahl.

Finally he keeps inserting a spurious claim that "pigeon pea" is dahl, when it has no such special claim.

The vandal keeps making changes but does not leave any indication in the talk page of why he's doing so. He just arrogantly changes the page so that it displays the incorrect information. —This unsigned comment was added by (talk).

If you want to dispute a sense, there is a procedure, see WT:RFV. If you simply delete information, you will get reverted really fast. That's the way it works. If you added the word "husked" or change "lentils" to "pulses" it would probably stand, but because you are blanking information (which is vandalism, btw!) the whole entry gets reverted. Robert Ullmann 19:45, 19 October 2006 (UTC)
First of all, blanking is not vandalism if it is commented. This user did not blank any information, aside from removing one definition that was in his opinion incorrect as he stated. The user did then leave comments on the talk page as well, and reverting edits without responding to those comments is arrogance. The only mistake that I see made by this anon is not accepting a definition he has not heard or could not believe, perhaps one that is used "incorrectly" but nonetheless thus used. DAVilla 09:19, 21 October 2006 (UTC)

Hmmm, interesting.

Well, he's now added references and I can see why he's made the changes he has.

I thought it was odd that an expert on Indian foodstuffs and cookery happened to appear a few minutes after I edited the entry originally.

The problem is that the references are simply wrong (or absurdly incomplete)!

The first reference is not one I know, but its definition is laughably wrong. Lord alone knows where it came from.

The problem with the second reference, the OED2, is that its entry is so small, to cover such a large subject. It's not wrong per se, but it's based on a single example and is woefully inadequate.

I'll check the WT:RFV procedure.

Oh, also: create an account, log in, and sign your talk page contributions and you will have much more credibility. Robert Ullmann 19:48, 19 October 2006 (UTC)
As I have often been cautioned against, do not inappropriately abuse the v-word. Accusing the most prolific editor of it, won't get you very far.
Certainly the very rare (or very British?) term "pulses" to mean vegetables is less coherent than "lentils." Since that seems to be the most common one used, it makes perfect sense to describe the dish as using them.

No, a pulse is a specific type of vegetable, such as a pea or a bean. It's a sub-category of vegetable, not a British word to mean all vegetables.

Any pulse, when dried, husked, and (usually) split, is a dahl.

The problem is that you and the other guy are trying to work this out from explanations by people who have clearly got hold of the wrong end of the stick.

A lentil is a type of pulse. It is a lentil whether it has been husked and split or not.

Dahl is any pulse, but it is only dahl when it has been dried, husked and split.

A dhal in the sense of a meal is something made with any ingredient that qualifies as dahl.

Still, you guys obviously have the technology to get your own ways so I suppose I'll just have to leave you to it.

You can use the template {{unreferenced}} to request sources for "pigeon pea" (if they haven't since been added.) Use {{rfv}} as described above. (edit) 19:53, 19 October 2006 (UTC)
--Connel MacKenzie 19:50, 19 October 2006 (UTC)
The above incident prompts me to the following question: how do we create a more inviting atmosphere for newcomers while still maintaining vigilance against bona fide vandals?
What if we were to write a quick tutorial that anons would be required to read prior to making their first edit? You would not want it to be too long, but it would lay out a few key things such as:
  1. don't delete things without explanation
  2. use the {{rfv}} tag for entries with questionable info
  3. go to WT:ELE for proper format info
then it would have a checkbox saying something like, "click here to proceed." Just a thought.

A-cai 15:30, 20 October 2006 (UTC)

what happened to the edit links?

I know there are problems with having them over on the right where they used to be, but the new location seems worse. Is this a Mediawiki change, or a Wikimedia change, or a Wiktionary change? —scs 04:11, 20 October 2006 (UTC)

They haven't moved for me, since that problem we had a few days ago. Try refreshing your CSS...CTRL-Sh-R or Ctrl-R. --Connel MacKenzie 04:13, 20 October 2006 (UTC)
Aha. Thanks. That did the trick. (But remember: saying things like "Control Shift R" in this context is parochial advice, because it may only apply to people who use your browser and/or platform. For me, for example, it's actually Shift-⌘-R.) —scs 04:52, 20 October 2006 (UTC)
I'm glad my ambiguous hint was enough for you to get it resolved. --Connel MacKenzie 06:50, 20 October 2006 (UTC)
  • WARNING: Repeat the refresh procedure again (Shift-⌘-R, Ctrl-R, Ctrl-Sh-R, Apple-Sh-R, etc.) before complaining that the edit links have moved again. If you observe it, refresh your CSS cache, and try again. If that does not work, please wait five minutes and try again, then if the refresh still does not help, report the problem here. Thanks. --Connel MacKenzie 17:29, 21 October 2006 (UTC)

Non-steroidal anti-inflammatory drug

Should there be an entry for the entire phrase non-steroidal anti-inflammatory drug? RJFJR 13:18, 20 October 2006 (UTC)

Google has 385,000 hits without the hyphen and 306,000 with the hyphen. Should probably make an entry for with the hyphen that says see the other spelling. (What's the format for something like that?) RJFJR 13:32, 20 October 2006 (UTC)
We use an ===Alternative spellings=== section in one, and the {{alternative spelling of}} template in the other. SemperBlotto 13:38, 20 October 2006 (UTC)
Whereas US usage tends to append the prefix "non-" without the hyphen, UK usage conventionally uses the hyphen in all but a handful of words (such as nonage and nonentity, both of which are derived from foreign languages rather than adding "non-" directly to English words). — Paul G 07:25, 21 October 2006 (UTC)

Patrolling edits

Someone marked symphonious as "patrolled", which means that the format was judged minimally OK by an admin. However, it is quite clear on looking at the page that it does not meet minimum formatting standards. Admins, please be careful to make sure that new pages are minimally edited before marking them off as "patrolled". --EncycloPetey 22:44, 20 October 2006 (UTC)

Sorry, what? As I edited it, I marked it off as patrolled, since I was cleaning up the formatting of the entry.
Please note that more sysops need to be attacking the patrolled edits. As our daily volume continues to increase, it is more and more important not to burn out the sysops that are doing the patrolling. Anonymous contributions are patrolled reasonably well, but the registered users patrol is behind weeks now. Note also, that the farther back you go, the more subtle traps you'll find, e.g. Special:Contributions/Brya (banned for intentionally incorrect articles on Wikipedia - most of which refer back to intentionally bad entries here, apparently.)
For symphonious, the initial few books.google references supported (imprecisely) the definition given, so I looked no further. --Connel MacKenzie 07:42, 21 October 2006 (UTC)
The editor of botanical entries you name is, as far as I can tell, quite knowledgeable in her field and at most pushing a point of view. I haven't reviewed her edits and don't have the technical qualifications to do so regardless, but from her comments she seems to have a perhaps deeper grasp of her scientific discipline, the "big picture" and all, and (from their comments) is disliked by the community for her confidence, or rather the assertion thereof, more than anything. Personality is one thing. For a judgement on the technical matters, I would not rely on her any more than the Wikipedia community, just as I would not rely on the Wikipedia community any more than Wikispecies. DAVilla 09:01, 21 October 2006 (UTC)
This user has got up my nose on several occasions but, as her knowledge seemed to be better than mine on the subject, I didn't like to revert her edits. We probably need someone equally knowledgeable to review her work. Easier said than done. SemperBlotto 09:14, 21 October 2006 (UTC)

What would help us would be a feature where we could mark multiple edits as patrolled. Either where a user makes a series of edits until he finally gets it right (where we would just mark the latest edit), or the ability to mark all edits by a specified user if we were confident in them. SemperBlotto 09:35, 21 October 2006 (UTC)

What bugs me is when I see a user (typically IP) make a dozen edits trying to get it right. (Not doing anything wrong mind you, just learning!) and the entry (of course) still needs some work. I can go edit it, but then I have to go manually mark a dozen separate edits as patrolled. If the most recent edit is marked patrolled, shouldn't the prior ones be (in most cases, not all?) Part of the problem is that one (apparently?) has to open each diff, then mark as patrolled, which opens yet another page. If one could just check them off? Robert Ullmann 11:41, 21 October 2006 (UTC)
I'd love to have this feature, but so far I haven't thought of an efficient way to do it. --Connel MacKenzie 19:21, 21 October 2006 (UTC)
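One possible shape for such a feature, sketched in Python under stated assumptions: the `Revision` record and the `mark_through` function are hypothetical and do not correspond to any existing MediaWiki interface, and revision ids are assumed to increase over time. Marking one revision as reviewed also marks every earlier unpatrolled revision of the same page.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    rev_id: int          # assumed to increase with each new edit
    page: str
    user: str
    patrolled: bool = False


def mark_through(revisions, page, reviewed_rev_id):
    """Mark all unpatrolled revisions of `page` up to and including
    `reviewed_rev_id` as patrolled; return the ids that were newly marked."""
    newly_marked = []
    for rev in revisions:
        if rev.page == page and rev.rev_id <= reviewed_rev_id and not rev.patrolled:
            rev.patrolled = True
            newly_marked.append(rev.rev_id)
    return newly_marked


history = [
    Revision(101, "symphonious", "anon"),
    Revision(102, "symphonious", "anon"),
    Revision(103, "dahl", "anon"),
    Revision(104, "symphonious", "anon"),
]
print(mark_through(history, "symphonious", 102))  # [101, 102]
```

A sketch like this trades thoroughness for speed: intermediate revisions are marked without anyone having opened their diffs.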
One note about Brya: I'd like to disagree with DAVilla a bit: I checked one of the contributions to wikt and wp (Coniferae) in detail in other references, and found that Brya was simply wrong. I understand that that is anecdotal. Robert Ullmann 11:41, 21 October 2006 (UTC)
Well I said I'm not qualified to judge the technical details, but what exactly are we looking at in these changes? The rank of the taxon Coniferae? All sources seem to agree this would depend on the system used, so are you saying that it's usually a phylum rather than a class, and what do we mean by "usually"?
Are we looking at the 'circumscription' (really too technical language it would seem) of Coniferae? They're conifers of course, but are you saying they only include conifers rather than being practically synonymous with them? I like Encyclopetey's style more of course, but the accuracy I can't judge, not without your sources and not without the proper background and training. How do you know she's wrong? DAVilla 19:38, 22 October 2006 (UTC)
Coniferae is not a "taxon", nor the canonical name for a phylum or a class (or whatever rank); it is a "descriptive botanical name" with no formal place in the taxonomic system; used more or less loosely to refer to various taxons. The WP people keep fixing their entry as Brya keeps entering her POV, go look. Ours is presently wrong... Robert Ullmann 19:54, 22 October 2006 (UTC)
Okay, I'm a lot more skeptical now. As I said, I'm not too happy with her abbreviated style in the first place. Of course it makes sense for Wikipedia to follow modern practices. Still, our purposes are different, and we don't for instance equate Pinales and Coniferales even if they have come to mean the same thing. What did Coniferae mean prior to the first meeting in 1900? I don't think she's entirely wrong for not abiding by what this international group says. She's stubbornly blind to ignore them, absolutely, and I see now she probably shouldn't have her way outright, but we should always be open to dissenting opinions. If only she were more open to the majority opinion... DAVilla 20:36, 24 October 2006 (UTC)
One upside of having to look at each revision individually is the case that prompted Wikipedia to begin using the oversight feature, namely when someone comes along and adds some personal information to the entry that needs to be deleted. The only way to be sure that that sort of information isn't added and then removed is to view each revision one at a time. Now, I am not saying this has happened or is likely to happen on Wikt, but that is one case where checking each one is a good thing. I would also like to see automated patrolling of all revisions prior to one being marked as patrolled (not just on edit by a sysop; this could pose problems, e.g. someone subtly vandalises something some sysop is about to work on and the vandalism goes unseen by the sysop who isn't checking the diff) and a whitelist. - [The]DaveRoss 15:28, 23 October 2006 (UTC)
Yes, it could really only work in certain cases, where the present edit did not undo anything that was added or changed in the previous edit, for absolute certainty. But we're just talking about patrolling, which isn't supposed to be a fine filter anyways, so do we need absolute certainty? I can see your argument applying more to the idea, if it has ever been raised, of compacting such edits so they are no longer stored as separate edits in the database, not that there's necessarily any need for that. DAVilla 15:43, 23 October 2006 (UTC)
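
The "mark everything up to the patrolled revision" idea from this thread could be sketched roughly as below. This is only an illustration under stated assumptions: the Revision record, the mark_prior_patrolled helper and the whitelist argument are hypothetical names, not part of MediaWiki, and the sketch deliberately ignores the caveat raised above that an earlier revision may contain something that was later removed.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    # Hypothetical, simplified stand-in for one edit in a page history.
    rev_id: int
    user: str
    patrolled: bool = False


def mark_prior_patrolled(history, patrolled_id, whitelist=frozenset()):
    """Mark revision `patrolled_id` and every earlier revision as patrolled.

    `history` is an oldest-first list of Revision objects for one page.
    Later revisions are marked only if their user is on the (hypothetical)
    whitelist. Returns the ids that were newly marked.
    """
    ids = [rev.rev_id for rev in history]
    if patrolled_id not in ids:
        raise ValueError(f"revision {patrolled_id} is not in this history")
    cutoff = ids.index(patrolled_id)

    newly_marked = []
    for position, rev in enumerate(history):
        if rev.patrolled:
            continue  # already reviewed, nothing to do
        if position <= cutoff or rev.user in whitelist:
            rev.patrolled = True
            newly_marked.append(rev.rev_id)
    return newly_marked
```

A sysop who checks only the latest of a learner's dozen attempts would call this once with that revision's id instead of opening each diff in turn.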

Vulgar slang

In my view, "vulgar slang" is an inappropriate or misleading label, as it is too broad. It has often been used by lexicographers to refer to terms used by the lower classes, and so covers not only swear words such as "bullshit", but milder terms such as "crap" and many other slang and colloquial terms that are used in robust or informal speech.

Better, I feel, are "taboo slang" (used by some modern dictionaries) for strong swear words, such as "bullshit" and "coarse slang" for the milder ones ("crap", "dick", etc). We already have these labels on some words, as well as "sexual slang" for some others. — Paul G 07:21, 21 October 2006 (UTC)

crap is milder than bullshit? Really? Did you mean to say crud?
I think "vulgar" is only broad in the historic meaning, not the modern meaning. It is certainly a much stronger warning than "coarse" and therefore fits more situations better, I think. Likewise, "taboo." --Connel MacKenzie 07:31, 21 October 2006 (UTC)

I don't mind ‘vulgar slang’, ‘coarse slang’ or ‘taboo slang’, but I think we should only use one of them and not try to create distinctions between them which are not very obvious. Better to have a scale of (colloquial), (slang), and then (vulgar/coarse/taboo slang). Widsith 07:45, 21 October 2006 (UTC)

Many people find terms such as "gob" (meaning "mouth", as in "shut your gob!" = "be quiet!") or "knockers" (breasts) vulgar, but they are not vulgar in the way that "bullshit" or "fuck" are. "Gob" and "knockers" are crude or coarse, whereas "bullshit" and "fuck" are taboo in formal contexts. The distinction here is very clear.
Perhaps that is true in London, but outside of England, what you say simply is not true. "Bullshit" has nowhere near the severity of "fuck" in America. When someone says "I called 'bullshit'" or "I called him on his 'shit'", it is usually not even a naughty context, let alone crude, coarse or vulgar. Well, maybe coarse (out of context) but that would have to be really far out of context. --Connel MacKenzie 07:43, 22 October 2006 (UTC)
I must say, the degree of vulgarity of bullshit seems to be extremely regional. At work in Central London, I might on occasion say that someone had written bullshit, but I would think twice before saying their work was crap. Similarly, the preacher at my North London church, 9 days ago, described something as "what I'll call bull, in case some of you might be offended by the full word". He would not even have made an oblique reference to a taboo word; he, and the rest of us, felt bullshit was mild, just not sermon material. --Enginear 18:17, 24 October 2006 (UTC)
Hence one label is not sufficient to cover both of these levels of language, and "coarse slang" and "taboo slang" would be more appropriate. "Vulgar" is too subjective, in my opinion, having undertones of snobbery and hinting that such language is used by "common" people.
Incidentally, I did mean "crap", not "crud". "Crap" is indeed milder than "bullshit", while "crud" is not offensive at all, merely being slang for dirt. You'll hear "crap" used as a mild expletive in programmes such as Friends and Smallville, whereas "bullshit" is rarely heard until after the (UK) watershed. — Paul G 07:31, 22 October 2006 (UTC)
The distinction should be between (slang) and (vulgar slang / taboo slang). It is enough to say that knockers or gob are slang, since all slang words have some level of vulgarity or taboo-ness, that is what slang means. A distinction between ‘taboo slang’ and ‘coarse slang’ is not intuitively obvious, in fact it's not even obvious to me after explanation. Widsith 08:58, 22 October 2006 (UTC)
Widsith, as in our previous disagreement on its meaning, we do seem to have very different ideas of slang. I do not believe for example that "use your loaf" is in any way vulgar or taboo (unless you are using the archaic meaning of vulgar as common), but it is slang. I would argue the same for gob (just about). The usage of slang which I hear in London is exclusively that of one of the OED's definitions: "Language of a highly colloquial type, considered as below the level of standard educated speech, AND [my emphasis] consisting either of new words or of current words employed in some special sense." As you know, I therefore think it less confusing if we use vulgar, taboo, coarse, or whatever, without mentioning slang unless the words are new or used in an unusual sense. So of all the words used here, only gob, knockers, bull, loaf and the figurative meanings of bullshit, crap, fuck, etc are slang, while the literal meanings of the last three, although variously colloquial, coarse and taboo, are not slang. --Enginear 18:17, 24 October 2006 (UTC)
"Crud" is more often a euphemism for "crap" on this side of the pond. It can also mean some item of dirt, or a crumb, but more often is used euphemistically. The meaning I think you intend for 'watershed' is unheard of over here, as well. --Connel MacKenzie 07:43, 22 October 2006 (UTC)
I've therefore added (UK) to sense 3 of watershed#Noun. AFAIK it's still set at 9 pm, and I've never understood why they chose that word to describe it. --Enginear 18:17, 24 October 2006 (UTC)
Aside from the issue of whether "crap", "crud", or "bullshit" is more or less coarse in various linguistic and cultural contexts, because the word vulgar has other meanings, and perhaps more because of its use in the work "Dictionary of Vulgar Tongue", it has caused confusion from some contributors. Is there any other label that we could agree on for the category that would not invite misuse, in terms of the category? Personally, I think either taboo or offensive, or some combination of either of those two with slang, would lessen the ambiguity. This still doesn't address the issue of "degrees of offensiveness", but I'm not sure that words have fixed values across space and time in that regard, so any scale may be meaningless outside a particular context. --Jeffqyzt 16:44, 24 October 2006 (UTC)
Thank you, Jeffqyzt, you've explained my point quite succinctly - the Dictionary of Vulgar Tongue wasn't a book of swear words :)
"Offensive" would cover derogatory epithets too (although we do already use "derogatory"). Some modern print dictionaries do indeed use "offensive" for un-PC terms for various types of person.
I think "watershed" might be used because TV programmes are intended for family viewing on one side and for adults only on the other side, and so the watershed is a cut-off between the two, just as a physical watershed is the cut-off line between two catchment basins. — Paul G 15:44, 30 October 2006 (UTC)

Japanese pronunciation?

Can somebody help me by telling me the reading of the name 夏香? Is it natsukaori?


A-cai 13:53, 21 October 2006 (UTC)

There are 100-odd web pages that read it as Natsuka, none Natsukaori ... Robert Ullmann 14:13, 21 October 2006 (UTC)


The translation sections of our entries for common personal names (e.g. George, John) seem to list cognates rather than translations. Though Wiktionary currently lists Jean and Juan as the French and Spanish translations of John, respectively, the French and Spanish Wikipedias both have "John Lennon" as "John Lennon", not "Jean Lennon" or "Juan Lennon". Translation really only applies to historical figures, who are older than the languages themselves are (e.g. John the Baptist does become Jean le Baptiste and Juan el Bautista), and who probably deserve their own entries anyway. The only exception I can think of, besides transliterations into non-Latin scripts, is that writers in Latin tend to Latinize names and stick case endings on them. Otherwise names are usually the same across languages. --Ptcamn 14:02, 21 October 2006 (UTC)

Translation is also used for modern personages such as royalty and popes (check the interwiki links at Elizabeth II of the United_Kingdom or Pope Benedict XVI). Whether translation or a borrowing is used is more an issue for the person translating (well, and the person reading their translation). The issue is that the act of translation is rare, not that the translations don't exist. (And of course, the translations aren't always cognate. I understand the Irish were big on identifying native names with Latin ones by near-matches in sound.) —Muke Tever 15:47, 21 October 2006 (UTC)
I agree with Muke. In Irish, we don't call John Lennon 'Seán Ó Lionnán' - we don't even call our own James Joyce Séamus Seoige! But we do refer to an Chéad Rí Searlas (Charles I), and more recently, an Pápa Eoin Pól II (JPII). Rredwell 04:30, 31 October 2006 (UTC)

Watch out for Primetime impersonators.

On Wikipedia, w:Karmafist (Wiktionary: User:Karmafist) was found impersonating other well-known vandals like Primetime, Bobby Boulders, Zephram Stark, and WordBomb in this CheckUser request on Wikipedia (which he started to troll Wikipedia and which royally backfired on him). He is a former Wikipedia administrator who started a huge sock farm and used it to subtly vandalize Wikipedia after he got desysopped in the w:Wikipedia:Requests for arbitration/Pedophilia userbox wheel war ArbCom case. He got caught and was community banned there. I am writing this to let you know in case he or another user starts impersonating Primetime or other vandals here. By the way, he has an account here that is not blocked. Jesse Viviano 23:09, 21 October 2006 (UTC)

Well, a Karmafist sockpuppet could probably get away with subtle vandalism, which on the whole isn't that much different from the junk that swarms of people like to dump here, but he won't get very far if he impersonates any of the more infamous vandals, who are pounced upon like a meat truck in the Sahara. Anyways I'm not sure that Karmafist would be interested in vandalizing Wiktionary. I've reviewed the decisions that led to his desysopping, and though he's unpopular, in my opinion Karmafist has a very good if not excellent sense of judgement. If he made any mistakes in treating PigsOnTheWing aggressively then it's understandable given the personal attacks that POTW (or "Pigs" unless you agree that's an epithet) made on him. Karmafist's redemption is that, after being given another chance, POTW was in the end completely banned from editing for an entire year, strictly on account of behavior. Karmafist made the opposite judgement call in the case of a new user who was incorrectly banned for trolling and in the end turned out to be a bystander. The reason that Karmafist was criticized for reverting that block, while in the same decision another sysop was commended for reverting numerous blocks, is that the block was made by Jimbo Wales himself. Jimbo's block was incorrect but not improper, while Karmafist's revert was improper if correct in the end. Hence we arrive at Karmafist's problem, which is insubordination. Because that contempt of authority led to an attempt to vandalize the project, I'm not suggesting that Karmafist should not have been banned, or that he could be fully trusted here, but I do understand why he's bitter, and I say that he probably wouldn't be interested in vandalizing Wiktionary because Wiktionary isn't Jimbo Wales's pet project. Even if Jimbo has the same power here, it's Wikipedia that he's known for, Wikipedia that he cares about, and Wikipedia where he sometimes intervenes.
In contrast, Wiktionary is at the moment run by the community, which is a fundamental principle that Karmafist has always espoused. DAVilla 18:55, 22 October 2006 (UTC)

Han character phonetic header

There are a number of Chinese characters that are used as phonetic elements. Some of the more common characters include:

  • 吧 - bā
  • 斯 - sī
  • 林 - lín
  • 科 - kē
  • 罗 - luó
  • 拉 - lā
  • 多 - duō
  • 克 - kè
  • 地 - dì
  • 亚 - yà
  • 格 - gé
  • 威 - wēi
  • 治 - zhì

An example of their use would be: 格林威治 (gélínwēizhì) Greenwich. I would like to mark these as such in the individual character entries, but I'm unsure of an appropriate English label. I'm thinking of:

trad. and simpl.


  1. pattern; form


  1. phonetic representation of syllables that start with g
    lín wēi zhì píngjūn shíjiān
    Greenwich meantime

Is Phonetic a good choice, or is there another label that would be more appropriate? A-cai 02:39, 22 October 2006 (UTC)
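
As a rough illustration of how these characters work as purely phonetic building blocks, the list above can be treated as a lookup table. This is a minimal sketch: the readings are copied from the list in this thread, while the names PHONETIC_READINGS and transcribe are made up for the example, and real transcription conventions are of course far richer than a per-character lookup.

```python
# Readings taken from the list above; the table name is hypothetical.
PHONETIC_READINGS = {
    "吧": "bā",
    "斯": "sī",
    "林": "lín",
    "科": "kē",
    "罗": "luó",
    "拉": "lā",
    "多": "duō",
    "克": "kè",
    "地": "dì",
    "亚": "yà",
    "格": "gé",
    "威": "wēi",
    "治": "zhì",
}


def transcribe(word):
    """Join the pinyin reading of each character in `word`.

    Raises KeyError for any character that is not in the table, rather
    than guessing a reading.
    """
    return "".join(PHONETIC_READINGS[ch] for ch in word)
```

Calling transcribe("格林威治") yields gélínwēizhì, the reading quoted for Greenwich above; none of the four characters contributes its ordinary meaning.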

"Phonetic" to me sounds like a phonetic element that's included as part of another character, such as ma the horse having nothing to do with other words meaning mother etc. or marking a question. What you are talking about are transcriptions or transliterations into Chinese, the reverse direction of say Pinyin. In what sense is Pinyin "phonetic"? There must be a more appropriate label. DAVilla 16:19, 23 October 2006 (UTC)

I also thought of using ===Phoneme=== or ===Allophone===, but I think they may be too obscure. A-cai 22:20, 23 October 2006 (UTC)

And possibly incorrect. How about ===Phonetic transliteration=== or ===Transcription phonetic=== or is that too long? What would be the "opposite" of Romanization? Anyone else care to comment? DAVilla 06:50, 24 October 2006 (UTC)
Sorry, I can think of no appropriate heading. ===Symbol=== just doesn't quite cut it, but that's what I'd use. --Connel MacKenzie 07:00, 24 October 2006 (UTC)

How about ===Phonetic syllable===? A-cai 09:11, 24 October 2006 (UTC)

Well it's pretty close if not exactly right, and anyways easy enough to replace in the future, whereas "Phonetic" is overarching and might be more difficult to sort out at a later stage. DAVilla 15:41, 27 October 2006 (UTC)

Unified style guide

I've been repeatedly frustrated by unwritten or poorly publicised rules on Wiktionary that I've unknowingly violated. For example, I recently standardised abbreviation categories to use the more readable proper noun format ("English abbreviations") rather than the language code format ("en:Abbreviations"). The more readable format was already used for most abbreviation categories, so I assumed it safe to standardise in that direction. Connel MacKenzie pointed out on IRC that the community had decided to use the code format at some point, but couldn't think of a relevant content or talk page this was mentioned on.

I think the solution to this problem—which will get worse as we accumulate more and more poorly publicised or documented community decisions— is to create a unified style guide. This style guide would provide all relevant information on the various style conventions in a single, easily-referenced location. For example, it would link to Entry layout explained for articles and Wiktionary:Categorization for categories, and they in return would be indexed under the style guide. For a working example of this, see the English Wikisource's style guide (note the links under "Poetry and annotations").

The style guide would unify the scattered style guidelines Wiktionary does have. For example, both Wiktionary:Style guide (which would be overwritten by the proposal) and Wiktionary:Entry layout explained document article style guidelines. Both Wiktionary:Categorization and Wiktionary:Categories document category style guidelines. There is no coherent organisation that a new user can refer to before contributing.

If this sounds like a good idea, I'm willing to draft a proposal for further discussion. —[admin] Pathoschild 06:22, 22 October 2006 (UTC)

There is a good reason for using language codes in categories (such as "en:abbreviations"), and that is that "English" is ambiguous here, as it is not clear whether it refers to the words themselves or their referents: it can mean "things native to England" or "words in the English language". That distinction doesn't make sense for abbreviations, but for categories such as, say, "English birds" it is clear - "wren" is an English bird in both senses, but "emu" is only in the latter sense; "moineau" (French for "sparrow") refers to a bird of England, but is not an English word. The same applies to other languages. Hence we tend to use either "en:Birds" or just "Birds" for categories of this kind (the absence of a language code meaning the English language by default, as this is the English Wiktionary); "moineau" would come under "fr:Birds" (words for birds in the French language).
I quite agree that we need a unified language guide. Part of the problem is that our style is still evolving (Wiktionary is still a fairly young project) and policy decisions about representation of content do not always get written down where it is easy to find them.
Unfortunately that inevitably means that newer contributors receive a rash of messages on their user pages informing them about various mistakes in their contributions. This would not stop entirely if there were a single, easily accessible, easily understandable page giving our complete policy on style (isn't that what WT:ELE is supposed to be, by the way?), as newcomers will always make mistakes, but it would help cut down on any unnecessary frustration or offence caused. — Paul G 07:48, 22 October 2006 (UTC)
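
The naming convention described above (a bare topic name for English, a language-code prefix otherwise) could be written down as a tiny helper. This is only a sketch of the convention as stated in this comment, not of any actual Wiktionary tooling; the function name topical_category and its parameters are invented for illustration.

```python
def topical_category(topic, lang_code="en", explicit_english=False):
    """Build a topical category name from a topic and a language code.

    English is the default language on the English Wiktionary, so its
    code is omitted unless `explicit_english` is set, giving "Birds"
    or "en:Birds". Every other language gets its code as a prefix,
    e.g. "fr:Birds" for French words for birds.
    """
    if lang_code == "en" and not explicit_english:
        return topic
    return f"{lang_code}:{topic}"
```

Under this scheme "wren" and "emu" would both sit in topical_category("Birds"), while "moineau" would go in topical_category("Birds", "fr"), which avoids the "bird of England" reading that "English birds" invites.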
WT:ELE seems to be about definition articles only; there's really not anything there about categories, templates, policy pages, etc. Even if ELE isn't the place for all of these (for reasons of size, if nothing else) it should perhaps have pointers to the analogous style pages for other types of entries. (I see that there is a link to Wiktionary:Categorization in the ELE section on categorization, but perhaps a separate block of "Styles for non-definition articles" would be in order.)
Also, I see that a) Wiktionary:Categorization doesn't make the "English language" vs "English nation" case there, and b) we are inconsistent, see [[Category:English nouns]] (among many others.) Of course, just "Nouns" would invite the addition of all the other languages' noun categories, so that's unworkable, but what about "Words used primarily in England" ...oh, I guess there's [[Category:UK]]...sorry, I seem to be rambling a bit. In short, yes, further organization is good! :-) --Jeffqyzt 15:53, 23 October 2006 (UTC)
This is an excellent idea. At the moment the style information seems to be spread around quite widely and I doubt that any newcomers read it all before beginning. I have been working out the style by using the templates and looking at current entries, but it is too easy to be led astray by remembering older entries inserted before a style change. Religiously watching the changes made to my entries by more experienced users has probably provided me with more new information than just about any other source.
It doesn't help that there is quite a bit of inconsistency: Why does the template suggest an example for an adjective but not an adverb? Why does an adverb suggest alternative wording but an adjective not? Why is the primary parameter in italic for an adjective but not for an adverb? Knowing why these decisions were taken helps one to remember them.
A style guide, preferably as a pdf guide that can be easily printed and taken away to be read at leisure, seems a major step forward. Moglex 16:48, 6 November 2006 (UTC)

Definitionless entries

I've added hill partridge merely so I could add the derived terms. Wikipedia does not yet have an entry for this term, and the dictionaries I've consulted for its definition disagree. So:

  • Could some ornithologically aware person provide a definition?
  • What is the procedure for flagging up words as having no definition? I see that there are a template ("definitionless words") and category ("Definitionless words:English") but the latter has long since been cleared out. I know that adding entries without a definition is bad form and apologise for doing this, but I'd like to know how to mark the entry up so it can be found and rectified. Thanks. — Paul G 07:35, 22 October 2006 (UTC)
One of my dishwashing tasks is to regularly patrol stubs, and the Definitionless category. In this case I have added a definition derived from one found by Googling. Like all entries, it is subject to improvement. SemperBlotto 07:43, 22 October 2006 (UTC)
That was quick. Thanks, SemperBlotto. I found at least two different genera referred to in various dictionaries, so I wonder if there is more than one definition. — Paul G 07:52, 22 October 2006 (UTC)

Han characters + bot

See discussions supra. I'm thinking that it would be good to run this AWB ruleset under (say) "UllmannBot" with a 'bot flag. While it isn't going to run entirely automatically, I think people would appreciate not having Recent Changes flooded with 21,000 changes over the next few weeks/months?

To be very specific: "Bot" operator is moi, with AWB. Name would be UllmannBot.

Tasks -- Han character entry formatting:

  • Format the info at the top of the entry, which Unicode can't really claim any copyright on (compilation, derivation or whatever) into Template:Han char under a Translingual header so that the format meets our standard.
  • Format the "Dictionary information" section and the "Technical information" section into Template:Han ref, in References section under Han character.
  • Fix up the headers so that we don't have == Korean Hanja == and so on.
  • Cat them in Category:Han characters, sorted by radical and stroke.
  • Alternate form goes into {{see}} where it belongs
  • Graphical Significance and Origin goes into Etymology
  • The unicode code point is in the second template.
  • The common meaning is stuffed into the first template as a parameter, may or may not be displayed (is now).
  • The dictionary references are stuffed into the 2nd template.
  • The second template generates a link to the Unihan database, where all that can be found. It can also do other things with the codepoint if we like. (Remember the codepoint IS the page title, coded in hex; no copyright problems there!)
  • We can always choose to display/hide things by modifying the template(s), if and/or when we run into trouble with having one of the data fields in the wikitext, we can always bot-strip just that from the template call.
  • Radical and stroke information is added to the {{kanji}} template so that Category:Japanese kanji sorts properly.
  • {{kanji}} is subst'd so that the Readings header isn't inside a template; {{ja-kanji}} and {{ja-readings}} are left with the proper parameters.
  • Compounds changed to L4 header.
  • Use of {{top}} for compounds changed to {{top2}}. (very common)
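
Two of the points in the task list above lend themselves to a small sketch: the code point can be derived from the page title itself, and a radical-plus-stroke sort key keeps a category in dictionary order. The helper names and the zero-padded key format below are assumptions made for illustration only; they are not the actual AWB rules, nor the parameters of Template:Han char or Template:Han ref.

```python
def codepoint_hex(title):
    """Return the Unicode code point of a one-character page title in hex.

    This is the sense in which "the codepoint IS the page title":
    nothing has to be copied from an external database to obtain it.
    """
    if len(title) != 1:
        raise ValueError("expected a single-character page title")
    return format(ord(title), "X")


def radical_stroke_sort_key(radical_number, extra_strokes):
    """Build a zero-padded key so plain string sorting gives
    radical-then-stroke order (hypothetical format)."""
    return f"{radical_number:03d}-{extra_strokes:02d}"


def sort_by_radical_and_stroke(entries):
    """Sort (title, radical_number, extra_strokes) tuples in category order."""
    return sorted(entries, key=lambda e: radical_stroke_sort_key(e[1], e[2]))
```

The zero padding matters because category sort keys are compared as text: without it, radical 100 would sort before radical 75.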

Independent question: since I would be logging AWB in under this account; would it be good to do the POS headings for Japanese under the 'bot flag? (I can easily do it either way, just have to remember to switch the login.)

Support, object, other issues? Robert Ullmann 16:08, 22 October 2006 (UTC)

Overall, I support this move. But I was pretty sure the earlier consensus was to eliminate the "Common meanings" entirely, whether copyvio or not. I do think the Japanese corrections logically fit within the same cleanup task; it would be (server-)abusive not to combine them, right? --Connel MacKenzie 19:07, 22 October 2006 (UTC)
True, the common meanings are not very good, but right now are the only "definition" in most of these entries. I'm stuffing them into the template, we can turn display on or off; if we do end up with a copyvio problem we can strip them. Robert Ullmann 19:44, 22 October 2006 (UTC)
Said another way, I support removing them entirely, not masking them. 1) Potential copyvios shouldn't be here. 2) The definitions are wrong or misleading, from the examples presented here. --Connel MacKenzie 20:30, 22 October 2006 (UTC)
First, I don't want to be deleting any information on this pass. Second, the common meaning isn't wrong, just inadequate. Until we have proper definitions, it is better than nothing. A-cai's example : means count as a verb, and number as a noun, the common meanings just says "number; several, count; fate", clearly when the proper language POS sections are provided this can go away, but until then it is very useful. Third, we don't know if it is a copyvio, we are working on that; if it is we will strip it out. Robert Ullmann 21:47, 22 October 2006 (UTC)
I thought that, if there was a copyvio, then because of the histories they would have to be deleted entirely, and reconstructed from non-copyrighted sources. It may only be necessary to delete those which have not had any edits to them, since the copyright is only held on the collection, correct? (and for individual entries probably applies to the common meanings more than anything). Still, if the format is wrong, then I don't see much need for holding onto masses of these. The reason they're there is so that people can add to them, and we'd really prefer that they be adding the right kinds of information. I say transform or recreate the most common and, when we're sure we're ready, take it from there. DAVilla 16:07, 23 October 2006 (UTC)
There is no copyright on the entries themselves, or on any compiled data that Unicode held no copyright on. The history doesn't matter. It is particular fields and sources that may or may not be an issue. Meanwhile, we have already added a huge amount of other edits. (and "collection [or compilation] copyright" doesn't apply here: the "collection" is ISO standard 10646.) Go ask BD2412 or read what he wrote above. Robert Ullmann 16:19, 23 October 2006 (UTC)
Oh, and (as you say) I am doing the earliest (and probably most interesting) entries first; they have the most added information. There are 21,361 left. Robert Ullmann 16:29, 23 October 2006 (UTC)
Yeah, I was probably out in left field on some of that. Not sure why I thought histories would matter. DAVilla 16:42, 23 October 2006 (UTC)

The process is also adding in the kanji templates and categories; there are a lot of entries that are kanji, but aren't categorized, the process cats them with the correct rad+stroke index. (Rod A. Smith did some of this last May, IIRC) Robert Ullmann 16:19, 23 October 2006 (UTC)

I'm sorry, but I do recall some of the examples being discussed as being classified as "wrong" not just ambiguous (as your example above.) I think one was the character for "cunt", listed with the definition "vagina" with no indication that it is a pejorative slang term (in direct opposition to the neutral technical term.)
I do not see BD2412's comments above as advocating keeping them: quite the opposite, in fact. Am I looking at the wrong section? --Connel MacKenzie 16:37, 23 October 2006 (UTC)
Like the others, that example is not wrong, just incomplete. (a bit embarrassingly perhaps) Of course we need to add proper language entries. On the other point, BD2412 didn't say anything about any specific field. Until we know the source, it is utterly pointless to discuss IP law; and still pointless because ALL "common sense" about IP law is WRONG! We'll ask BD2412 or whomever for an expert opinion when we have the data. Until then it has been sitting there for years; I'm just leaving it in place UNTIL WE KNOW! Sorry to shout, but this is exasperating! I understand that it is hard for people to get that they don't know jack about IP law, but it really makes no common sense!
Let me try to explain one problem: if we delete one field and retain another we are taking a position that we have the rights to the latter. If we do not remove anything we are not taking a position (yet). Removing a field can create a liability where none existed before. It is like when an ISP edits some user content; they implicitly take responsibility for all of it. Now I don't know whether this exact principle applies to the instant case, but this is the kind of trap that exists. Sigh.
Can we please leave all the IP copyright stuff out of the discussion and to the lawyers. If tomorrow or next year they tell us to strip something, we will. We/I have given them (Wikimedia Foundation General Counsel et al) the information. Until then, forget it, please? Robert Ullmann 17:02, 23 October 2006 (UTC)

I want to apologize to anyone who is upset with my tone. I am very frustrated, having been trying for months to sort out how to reformat the entries and make them useful, while keeping copyright entirely out of it. Without totally separating the issues, we will/would never get anywhere. (Lawyers are SLOW. ;-). Robert Ullmann 17:18, 23 October 2006 (UTC)

Okay, I read the section above, #Han characters 2 (not sure how I missed it). You may want to go ahead with what you have in mind, but take into account Connel's main objection. Deleting the common meanings couldn't get us in any trouble so long as we only retain the information that's too basic to copyright, correct? Anyways we've apparently already stripped out some information. As I understand BD, presenting it in our own style is enough, especially if we add to it later. Personally I don't think the common meanings make sense anyway without giving concrete examples of derived terms (in various dialects) where those meanings are used. If you want to keep common meanings, maybe you could gear presentation toward that? Thoughts only... don't let me stand in your way. DAVilla 18:20, 23 October 2006 (UTC)
It comes down to something real simple: if I delete the "common meanings" field and we decide we want it back, fixing it is very painful. If I keep it and stuff it in the template, "disappearing it" is trivial (comment out a line in the template), and removing it is much simplified. So for now we keep it. Robert Ullmann 21:44, 23 October 2006 (UTC)

Hmm, the characters sort out by radical very nicely in the categories. I've been putting the Kanji grade in the right place ... but Category:Grade 1 kanji is missing one ... should have 80 ;-) Note that most of these entries have lacked any category at all. (Please don't tell EC that Category:Han characters will someday have 70,000+ entries. He will redefine conniption ...) Robert Ullmann 23:53, 23 October 2006 (UTC)

I stopped running the AWB changes because Connel had some issues. Please see my talk page. Basically about putting the "common meaning" somewhere where it/they still look like definitions to tools parsing the XML dump. Please see for what the result would be; and the example we've been using for a few months for what it would look like with proper entries for each language. Comment here or there, please? Robert Ullmann 13:43, 26 October 2006 (UTC)


I NEED HELP ON MY HOMEWORK. i forgot my book on this guy and i only know his nick name. its Guy, hes a black astronout and he died in the space shutle. I NEED HELP HELP HELP HELP!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Dale Gribble? --Versageek 22:19, 23 October 2006 (UTC)
Um, google? (Guy was the nickname of the first African-American to fly on the shuttle, but he is alive and well ;-) Go google a bit. Robert Ullmann 23:57, 23 October 2006 (UTC)
From the Wikipedia article on the Space Shuttle you can get to one on the program which has links to pages on the two disasters which have lists (and pictures) of those crews. (It looks like there's a black guy on each.) —scs 01:39, 24 October 2006 (UTC)

New alert

Akanemoto (talkcontribs) - I know ja: entries and formats have been through a few iterations to refine it, but this brand new user is now Being Bold to an inexplicable degree. Could a native, or even a fluent speaker please review these edits, and see what can be salvaged, if anything? --Connel MacKenzie 03:09, 24 October 2006 (UTC)

I see now that one ja- speaking sysop has reverted most (all?) of the offending contributions. I also see that the user is mostly unresponsive on his/her talk page. Should this move to WT:VIP? --Connel MacKenzie 16:52, 24 October 2006 (UTC)
(edit conflict ;-) Tohru has put a notice on his talk page; he started editing a little while ago without replying (not that that is itself bad); I added another note asking him not to create any more subpages. I reverted the kana entries. He is apparently trying to create indexes to short kana words by generating cross-tables of likely combinations. Both Tohru and I have pointed out that we have proper indexes of several different kinds. Robert Ullmann 16:59, 24 October 2006 (UTC)
Right. Those are pretty straight-forward to delete (although voluminous) but I was much more concerned about the main namespace (inexplicable) template replacements that you've now reverted. Thank you. I have only a shred of optimism left for this contributor...but it could just be a rough start. It is a tough call. --Connel MacKenzie 19:50, 24 October 2006 (UTC)

Finding misspelt entries

I see that Wiktionary does not make any attempt to list possible entries that a user might have intended.

Would it be a useful addition to provide such a facility?

It would need to be near the calibre of the one that Google provides.

If it is thought a good idea, I would be very happy to write the code (C++) for inclusion in the site software.

No one's implemented that feature (it would need to go to the developers, I think.) I've assumed we don't have it in part because of efficiency concerns (that and there is no developer concentrating on wiktionary needs.) The closest we have is entries that consist of the statement that this is a common misspelling of that word (see template:misspelling of). RJFJR 14:18, 25 October 2006 (UTC)
I know the feature does not exist. How does one go about volunteering to write the code for it (assuming the resource requirements are not too onerous)? Moglex 14:47, 25 October 2006 (UTC)
I don’t understand what you are proposing. What entries might one intend that he does not type? What entries do you want to list that one did not specify, and how and where would you list them? —Stephen 21:56, 25 October 2006 (UTC)
You mean to say you've never deleted a page because the title was a misspelling of an already existing entry? DAVilla 16:32, 26 October 2006 (UTC)
I think it is very reasonable to ask for details of what "corrections" are being proposed, which is obviously what Stephen is getting at. While I too am curious about the newcomer's proposal, I am highly skeptical that a more elaborate "did-you-mean" feature will gain acceptance any time soon. Lots of secondary features rely on searches failing; depending on the approach taken, such a feature could cause many more problems than it solves. --Connel MacKenzie 17:00, 26 October 2006 (UTC)
Nobody mentioned corrections. It might be considered useful by some if, when you got to the "No page with this exact title exists" point there was an option to see some words that might have been intended. Particularly useful for someone who has heard a word but never seen it written (especially if English (or whatever) isn't their first language). I wouldn't have thought it contentious, but as no-one seems to think it would be a useful feature I'll go for a pint instead. Moglex 17:43, 31 October 2006 (UTC)
It should also be noted that MediaWiki is coded in PHP, and so an extension like this would need to be also. Also, bad spellings couldn't be redirects, because what might be misspelled in English could be a correct spelling in German etc. - [The]DaveRoss 18:35, 26 October 2006 (UTC)
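For anyone wondering what "some words that might have been intended" could look like in practice, here is a minimal sketch. It is an illustration only, not MediaWiki code and not Moglex's proposed implementation: it is written in Python rather than C++ or PHP, the function names `edit_distance` and `suggest` are made up for this example, and the list of titles is a hypothetical stand-in for the real page index. It ranks existing titles by Levenshtein edit distance from what the user typed.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts,
    deletes, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def suggest(query: str, titles: list[str],
            max_distance: int = 2, limit: int = 5) -> list[str]:
    """Return existing titles within max_distance edits of query,
    closest first (ties broken alphabetically)."""
    scored = [(edit_distance(query, title), title) for title in titles]
    close = [pair for pair in scored if pair[0] <= max_distance]
    return [title for _, title in sorted(close)[:limit]]


# Hypothetical page index; a real one would come from the site database.
titles = ["necessary", "accessory", "nectar", "colour", "color"]
print(suggest("neccesary", titles))
print(suggest("colur", titles))
```

Because it only lists candidates on the "no page with this exact title" screen, a sketch like this creates no redirects, so a string that is a misspelling in English but a valid word in German would still be free to have its own entry. Comparing the query against every title is far too slow for a full index; the efficiency concern raised above is real and is not addressed here.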

Headers not working?

Why are ALL of the headers in entries showing the "This is not a standard Wiktionary heading" message when I sweep over them? bd2412 T 20:42, 26 October 2006 (UTC)

Because neither User:Hippietrail, nor I, nor anyone else ever updated his Javascript that describes valid third level headings. This should move to WT:BR or WT:GP. I'll take a look. --Connel MacKenzie 21:07, 26 October 2006 (UTC)
All better now? --Connel MacKenzie 22:00, 26 October 2006 (UTC)
Well, in the sense that I get nothing at all as opposed to any kind of message... bd2412 T 23:11, 26 October 2006 (UTC)
Hmmm. There seems to be some kind of über-caching of MediaWiki:Monobook.js. Refreshing doesn't seem to refresh it, for me, anymore. (But if I log out, it now works again. When I log back in, it is magically re-borkened.) --Connel MacKenzie 23:52, 26 October 2006 (UTC)

What to do with common meanings section for Han character entries

I have now read the discussion on Robert's talk page (see above). It occurs to me that perhaps we should rethink what we mean by "common meanings." Based on the content, I would say that common meanings means any generic definition that has survived the transition from old Chinese and middle Chinese into modern spoken CJKV languages (languages that inherited or borrowed words from old or middle Chinese). For example: , whose original meaning in old Chinese was sun, eventually came to mean day. So in the common meanings section, we write: sun, day. However, what about a character that had one meaning in middle Chinese (about the time that most Chinese words were exported to Japanese), but another meaning in modern Mandarin?

An often cited example:

middle Chinese | Mandarin | Min Nan | Japanese
行 (to go; to walk) | ɕiŋ˧˥ (In colloquial speech, to go is now , and to walk is now ) | kiã˧˥ (to go; to walk) | iku (to go)
去 (to leave) | tɕʰy˥˩ (to go. to leave is now ) | kʰi˨˩ (to go) | saru (to leave)
走 (to run) | tsou˨˩˦ (Now means to go, to walk or to leave, to run is now ) | tsau˥˧ (to run) | hashiru (to run)

Note that the meanings in Min Nan and Japanese are considerably closer to middle Chinese than in Mandarin. Mandarin has evolved away from middle Chinese as a result of being influenced by other languages and cultures (northern China, from where Mandarin originates, is more flat, which allowed for greater mobility). If these were English entries, the first column would have its own page which would be labeled as middle English or old English. Maybe that's what we should do here (over the next 30 years ;-)

P.S. Connel, with respect to the issue of using Mandarin vs. Chinese, I think we may have to table that discussion until we have more of a quorum at Wiktionary. Any debate/vote on the subject at this point would include no more than a half dozen or so individuals, only one or two of whom actually speak any dialect of Chinese with any degree of fluency :-) For the time being, I will continue to format my entries as I have been. I'm sure some bot could be written at a future date to modify the headers, should the need arise. A-cai 22:24, 26 October 2006 (UTC)

While I agree that the issue of "common meanings" probably could be hashed out a bit better, I don't see any remaining impediments to Robert starting the gigantic task. I imagine he will refine his rule-set considerably during the first thousand, and get quite sick of it during the second thousand. :-)
The "Chinese" vs. "Mandarin" issue certainly is not resolved at this point in time. This is still the English Wiktionary. Since we are writing this for English readers, I maintain that the normal, common, everyday meaning of "Chinese" is more likely to be understood than "Mandarin" and therefore the language heading "Chinese" is more appropriate. But that is completely orthogonal to the topic at hand. --Connel MacKenzie 23:31, 26 October 2006 (UTC)

So you would keep the common meanings under that definition in the translingual section? (Even though defining the character translingually is not really possible, as I was arguing with Connel... ;-) Robert Ullmann 12:49, 27 October 2006 (UTC)

I'm not suggesting putting the common meanings under the translingual section. I'm suggesting that we make CJKV entries look more like the rest of wiktionary. In the example that I mentioned on my talk page (forum), all of the modern words derive from a now extinct ancestor (Latin). I don't see how that is any different in the case of CJKV languages. Therefore, why not have a section for middle and old Chinese? Hang on, I'll do up to show you what I mean...

A-cai 13:39, 27 October 2006 (UTC)

Ok, now take a look at . This makes more sense than the way I had it before, because most of the archaic meanings that I had in the Mandarin section actually predate Mandarin. But, I had to put them somewhere, and I couldn't put them into a common meanings section, because the meanings didn't migrate to Japanese. If I had an Old Chinese section, I could put them there. Note, that I didn't put a pronunciation with the Old Chinese entry. This is because no one is precisely sure exactly what Chinese sounded like at that time. However, future linguistic nutcases (I say that affectionately :) may very well want to put approximate reconstructed pronunciations for old and middle Chinese. I actually would love to see how the sounds of the language evolved and permutated over the centuries down to modern times.
However, I'm not suggesting that you make your bot figure out how to separate the common meanings section into middle and old Chinese sections. Have your bot do what it do (as in the movie Ray). If Connel wants it in the Translingual section vs some other place, or in some specific format, that's all fine with me. All I'm saying is that eventually I want all the characters to look more like does now.

A-cai 13:53, 27 October 2006 (UTC)

I like what you've done at . To the point. DAVilla 15:43, 27 October 2006 (UTC)
By the way, shouldn't that page (along with many others) be linked from , and under what header? DAVilla 15:47, 27 October 2006 (UTC)
Just for the sake of completeness, I have added a middle Chinese section to to illustrate what could (in theory) be done. Baxter's dictionary of middle Chinese only contains readings for 5,000 or so of the most common characters, so we would not be able to do this for all 20,000+ Han characters. At any rate, this is a possibility (provided somebody is crazy enough to embark on the task :)

A-cai 00:30, 28 October 2006 (UTC)

Graphical significance and origin vs. Etymology

Just when I thought some group of people were reaching a consensus... http://en.wiktionary.org/w/index.php?title=%E5%A4%A7&diff=1727540&oldid=1727378

I'm afraid I don't have enough Mandarin knowledge to address this. The English descriptions of what is going on with this entry, certainly are not helpful. --Connel MacKenzie 06:41, 29 October 2006 (UTC)

I don't understand the edit you linked to. It seems like a minor change to me. Is a "sick person" a person or a representation thereof? Isn't the origin, the significance of a word, the etymology? Different ways of saying the same thing. DAVilla 07:20, 29 October 2006 (UTC)
I take it that the question now is: should the header be ===Etymology=== or ===Graphical significance and origin=== for individual Han characters? (The relevant Chinese idiom is: 雞蛋裡挑骨頭鸡蛋里挑骨头 :-) I guess this depends on how we look at the issue. If you regard as the written representation of the Mandarin adjective which means big (Pinyin dà), and you want to know how this word came to be, then etymology seems appropriate as a header. However, if you are wondering why the symbol came to be used to represent the concept of big, then Badagnani's header (Graphical significance and origin) might make more sense. The problem that I envision is that the content in the affected section often contains both kinds of information. To be absolutely correct about it, I'm not sure that the term etymology can be applied outside the context of a specific language (in other words, having an etymology in the translingual section may not be appropriate). I'm not a linguistics professor, but I'm thinking if we were to be absolutely correct, we would have to do something like:
===Graphical significance and origin===
  • mumbo jumbo about a big person with arms stretched out ... blah blah ;)
===Etymology 1===
  • Old Chinese -> middle Chinese (Baxter daH) big -> Mandarin (Pinyin dà) big
===Etymology 2===
  • Once used interchangeably with (Pinyin dài, to care for). In modern Mandarin, is only read as dài when in the compound 大夫 (pinyin dàifu, doctor).
Obviously, this is a silly example, but even so, a linguist would learn about the phonological evolution of the word in the etymology. In this case the tone register appears to have shifted from ˨˩ to ˥˩, which shows that the Mandarin falling tone evolved away from the middle Chinese falling tone, whereas Cantonese and Min Nan seem to have more closely preserved the original falling tone of middle Chinese. This phonological stuff obviously wouldn't be appropriate for a section called Graphical significance and origin.
Having said all of that, ===Graphical significance and origin=== is a mouthful, and it is not a standard Wiktionary header.

A-cai 08:03, 29 October 2006 (UTC)

It should be Etymology, it is our standard header for this information. It is the source of the word/term both phonologically and graphically, in any language. (Of course, in alphabetic languages they are more closely related.) Look at our own definition at etymology: "The origin and historical development of a word." Robert Ullmann 11:34, 29 October 2006 (UTC)
Hello, I was just directed here; I don't mind at all what the section is called. I mind that the section describing why hanzi are written the way they are seems to be being removed from entry after entry. This issue is of signal importance to a complete understanding of the written form of Chinese, as well as the other languages that use these characters, and I feel that this isn't being treated seriously, from the comments left in response to my suggestions. Regarding calling the explanation of the written form an "etymology," I'm not sure that seems correct. Etymology, to me, does not refer to the shape of a pictograph or ideograph but instead to how a word has developed in pronunciation and meaning over time, often from language to language, as in "boeuf" to "beef." We shouldn't try to confuse our users needlessly so whatever heading is chosen, it should be as clear as possible. "Graphical significance" seems quite apt. I agree that the description of the changes in pronunciation and meaning over time (as in the daifu example A-cai gave above) represents something different so to do these entries right, I think A-cai's "cumbersome" format proposal seems to most comprehensively treat all of these issues. Of course, this can be built upon gradually, character by character. Badagnani 06:39, 16 November 2006 (UTC)

Wiktionary logo vote

If you want a decent new Wiktionary logo you better pop over to meta and cast your vote. There are only three days left! Ncik 11:25, 28 October 2006 (UTC)

More definitions for English wiktionary

It is possible to prepare a list of English definitions/meanings missing in English wiktionary from Webster's dictionary which is now Public Domain. Editors would then only have to edit these new definitions and they could be added into English wiktionary. I have compared wiktionary with Webster's dictionary and it looks like there are about 25000 new meanings which English wiktionary does not have yet.

For example the entry:
residuary -> (a.) Consisting of residue; as, residuary matter; pertaining to the residue, or part remaining; as, the residuary advantage of an estate.
can be easily edited and added into English wiktionary. (it is easier to edit this entry than to create it from scratch)

Please let me know what you think. I can easily prepare the file with definitions for words missing in wiktionary.


I see no reason why not, as long as the pages aren't created from the Webster definitions in advance. I compile similar lists frequently, albeit without definitions (see Scrabble and Project literature, for example). —[admin] Pathoschild 17:09, 28 October 2006 (UTC)
BTW, this discussion more properly belongs on WT:BP itself (or possibly, since you're talking about script automation, in the Grease Pit), rather than the Beer Parlour's talk page; this page is for discussion of the Beer Parlour page itself, and things related to it. --Jeffqyzt 22:38, 28 October 2006 (UTC)
Moved to WT:BP. --Connel MacKenzie 02:29, 29 October 2006 (UTC)
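To make the proposal above concrete, the comparison could be sketched roughly as follows. This is a simplified illustration, not the proposer's actual method: it assumes the Webster data is already loaded as a mapping from headword to definitions and the existing Wiktionary page titles as a set (both hypothetical inputs, as is the function name `missing_entries`), and it only finds headwords with no page at all, not individual meanings missing from an existing page.

```python
def missing_entries(webster: dict[str, list[str]],
                    wiktionary_titles: set[str]) -> dict[str, list[str]]:
    """Return Webster headwords (with their definitions) that have
    no page in Wiktionary, in alphabetical order."""
    return {word: definitions
            for word, definitions in sorted(webster.items())
            if word not in wiktionary_titles}


# Hypothetical sample data standing in for the real files.
webster = {
    "residuary": ["(a.) Consisting of residue; as, residuary matter; "
                  "pertaining to the residue, or part remaining."],
    "bear": ["(n.) Any species of the genus Ursus."],
}
wiktionary_titles = {"bear", "colour"}

for word, definitions in missing_entries(webster, wiktionary_titles).items():
    for definition in definitions:
        print(f"{word} -> {definition}")
```

The output is a worklist for editors to review, in line with the point above that pages should not be created from the Webster definitions in advance.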

West Virginia's state animal

the state animal is a black bear.black bears like living near the mountains.there are alot of bears near the mountains.

That was the content of the page. (before being deleted of course)
We see things like this a few times a day (or so). I've been leaving them alone for a while. It is something that occurred to me: while we deal with seemingly endless drunk university students, this is a kid. Playing with this new thing they found on the web. A thing that they should (and will) find very very useful in learning, if they don't get their fingers burnt right away.
So I leave it alone for a while, so they can see that it really does show up on the site. It will still be un-patrolled tomorrow, and even if it ends up missed we'll find it sometime. Robert Ullmann 22:09, 29 October 2006 (UTC)
I find the {{wikify}} tag very useful for things like this. Sure, we will probably end up deleting it as "not dictionary worthy" but in the meantime, the original contributor will follow the formatting links (and about 25% of the time, will clean it up themselves.) --Connel MacKenzie 21:18, 30 October 2006 (UTC)

Template "given name"

What is the purpose of this template? At the moment it expands to (male given name) or (female given name) depending on the argument after the pipe. On the page for Patrick, it generated the following content:

  1. (male given name) A male given name.

Used as shown, it is completely pointless, as it duplicates the definition.

A much more useful form would be "A {{1}} given name", so that {{given name|male}}<nowiki> and <nowiki>{{given name|female}} could be used for the definition.

I'm going to make this change. I'll clean up any linked pages. — PaulG

Well, I've made the change, but there are dozens of linked pages. Could a bot clean these up?
And since when did four tildes not give a signature any more? — PaulG

(since you left out a / nowiki tag ;-) Robert Ullmann 17:03, 30 October 2006 (UTC)

Oh, I see. Thank you for that, Robert. — Paul G 09:11, 5 November 2006 (UTC)

Bot flag for Han character formatting

I will try this again. If you want me not to have TWENTY THOUSAND PLUS edits in Recent Changes for formatting these and the rōmaji and hiragana entries that need POS headings, don't discuss anything else in this section!

Should I create an account (UllmannBot) and run these with a 'bot flag, even though none of it is presently automatic? Yes or no. If you think this should be a formal vote, set it up and plant the link here. Robert Ullmann 17:03, 30 October 2006 (UTC)

  1.   Support granting the bot flag to UllmannBot. —scs 19:46, 30 October 2006 (UTC)

Yes, you should create the account UllmannBot, so it can be put to a normal vote. --Connel MacKenzie 21:01, 30 October 2006 (UTC)

Vote started at Wiktionary:Votes#User:UllmannBot. --Connel MacKenzie 21:14, 30 October 2006 (UTC)

I created the account; but of course can't in practice use it without the flag (patrolling the edits is more work than doing them in the first place!) We'll see ;-) Robert Ullmann 12:03, 31 October 2006 (UTC)


Would anyone support me putting to a VOTE the notion that "non-US" is POV, while the label "Commonwealth" (that we've used off and on for at least two years) is NPOV? We've had about a dozen US/UK skirmishes now, with no definitive policy set in place to reduce this sort of POV pushing. --Connel MacKenzie 20:57, 30 October 2006 (UTC)

I support the vote and I agree with you. Somehow, positive association is better than negative. --Enginear 22:33, 30 October 2006 (UTC)
Sounds good to me, I've seen a few things labeled 'non-US', that were terms which I, a native resident of the Northeastern US, was quite familiar with.. besides, US implies a political or geographic division - in this day and age, language transcends political & geographical boundaries. --Versageek 21:21, 30 October 2006 (UTC)
I don’t think "non-US" is so much POV as it is simply bad English. It doesn’t sound like something a native speaker would say. A sentence such as "bloke is a non-US word" sounds to me like a non-good construction. It would be okay, however, to say that "bloke is not American English", or that "the term bloke is not used in the U.S." —Stephen 22:58, 30 October 2006 (UTC)
Are we talking about spelling variations or pronunciation/definition? These are two separate issues in my mind. The first has to do with the spelling simplifications proposed by Noah Webster, and later promoted by Theodore Roosevelt (both thought color would be easier for people to spell than colour). These simplifications were eventually adopted in the United States, but not elsewhere.
To my mind, spelling differences are different from pronunciation and meaning differences. Pronunciation and meaning are almost always dictated by a specific region. I believe Wiktionary already has a good policy for pronunciation and meaning, which is to label it according to region/language (ex. Canadian English) in the pronunciation section and on the relevant definition line.
In the case of spelling, perhaps the answer is to call "colour" the traditional spelling, and "color" the reformed spelling or simplified spelling. Since the reformed spelling is only used in the United States, it might also make sense to call it American spelling instead of reformed spelling or simplified spelling.
Incidentally, this is not the only "reformation" of English spelling. In the 17th century, spelling was standardised as part of the king's aim to make civil service documents intelligible to all and consolidate the union of Scotland with England & Wales (that is the full extent of my knowledge -- if I have details wrong, please correct). --Enginear 10:09, 8 November 2006 (UTC)
I agree that non-US is POV, because it does imply that US usage is (or should be) the standard.

A-cai 23:04, 30 October 2006 (UTC)

Why exactly should US-English be the norm? Colour is not an older version of the word color - color is the US spelling and in the UK and other parts of the English speaking world we still use colour, therefore both terms would require marking up as such. This crusade against UK spellings in favour of their American counterparts has to stop! We need to define both as fairly as possible - this is not a US only project remember! In my opinion we need to accurately represent the English language here and not just one form of it such as US English or British (shudder!) English.--Williamsayers79 08:34, 31 October 2006 (UTC)

Whoa. You are three years late for the color/colour flamewar. THAT IS NOT THE TOPIC AT HAND. The question I raised was what to label the Commonwealth variant. Only. (I'm glad to see that everyone is in agreement that US-specific uses should be labeled as such.) --Connel MacKenzie 08:40, 31 October 2006 (UTC)
I agree with Connel. Please don't attribute dark motives where none exist :) I take it that you were objecting to the term traditional. I can understand such an objection. If I read you correctly, you feel that traditional implies obsolete or archaic. This is similar to some people in Taiwan who promote the use of the term 正體字 (lit. correct characters) over 繁體字 (lit. complex characters) when describing traditional Chinese characters. No one objects on English Wiktionary to the use of the term traditional for traditional Chinese characters (despite the fact that they are still used in Taiwan and Hong Kong) precisely because native English speakers do not have an emotional attachment to Chinese written forms. What I meant by traditional spelling is that at one time, everyone (including Americans) spelled it as colour.
Maybe I should back up, and explain why I was trying to move away from the word commonwealth. Commonwealth implies that colour is correct for all English speaking countries with the exception of the US. However, since colour was also the official spelling in the US until the 20th century, it doesn't seem correct to imply that it is a spelling that would never be seen in American literature (I direct your attention to The Prince and the Pauper/Chapter 33, by Mark Twain. Do a find on "colour"). So this is why I suggested the above wording, it was not out of some twisted desire to ensure that American English become universally adopted. As a matter of fact, I believe the last sentence of my first post shows that I was attempting to find an acceptable neutral term. Obviously, you disagree with the proposed term, so I will put on my thinking cap and try to find something else. Hopefully, you now see why I would object to saying that colour is the British spelling (it was also used in the US at one time).

A-cai 12:31, 31 October 2006 (UTC)

The distinction between US and Commonwealth English is not that great an idea. Some words are decidedly Australian or Jamaican. Some are decidedly Irish or Scottish. In my opinion there is English (shared by all) and there is English that is particular to a locale. Throwing all the other but American English together does not cut it (American English is not consistent either) GerardM 14:27, 31 October 2006 (UTC)

Logically (although, perhaps, impractically) this should be part of the etymology or usage notes, rather than in the definition. The issue that A-cai raises is interesting, in that what we consider "modern" English, as an evolution from Middle English, is still in the process of evolving (to "Post-modern English?"). So, when we assign a particular usage tag (say {{US}} or {{UK}}) that tag is not necessarily indicative of the entire life of the word in question. In many ways the current situation (for the past couple of hundred years, say) with regards to English usage is a reversal of trends, whereby formerly divergent usages are now much more exposed to each other and available for "cross-pollination". Regional dialects are probably losing importance in regards to language evolution, whereas cultural group jargon (made much more accessible to members via the internet, mass media distribution, global telecoms networks, easy air travel, etc.) is rising in importance. If we're going to accomplish it via the current tagging system, the "correct" and NPOV way would be something like (US until mid 19th century, Canada, UK, AUS, India) which is clearly unwieldy. I would say that for the sake of legibility, unless otherwise tagged as {{archaic}} or {{obsolete}} or somesuch, we should take regional/dialectal tags to indicate "current" usage, whatever "current" means in the context of a living, evolving language. Perhaps some guidance as to that would be in order? --Jeffqyzt 15:22, 31 October 2006 (UTC)

Re-reading the above comments, it seems clear that "Commonwealth" is 'better' than "non-US" but inadequate for some situations. I believe it is the best compromise we are likely to reach, with "Usage notes" filling out the more ambiguous situations. Does at least that much, have broad agreement? --Connel MacKenzie 18:07, 31 October 2006 (UTC)

It has my agreement, particularly as we're not saying they are the only possible tags. It's horses for courses, but "US" and "Commonwealth" will between them cover a good proportion of those usages which are not universal. --Enginear 20:54, 31 October 2006 (UTC)

My main beef with the term commonwealth to mean non-US is that it is so inaccurate. It implies that there are only two types of English in the world, to the exclusion of all others. Usage notes will go some way to clarifying differences, but to a degree we are creating a mistake (by using Commonwealth) and then correcting it (by using notes). I have several reasons for not using the term Commonwealth unless I know it applies in all places such as an official title.

  • It totally ignores English speaking countries like Ireland that are not part of the Comm.
  • Many countries in the Commonwealth do not use English as their first language, and indeed, in some of them the level of English spoken is very low.
  • Spellings are not consistent throughout the commonwealth. Eg, in Australia, some sections of the media insist on using (for want of a better term now) American spellings, with suffixes such as –or and –ize.
  • Some spellings and terms used in the UK do not apply in other parts of the comm.
  • Ignores words that are spelt the same in the US and Canada but different elsewhere.

This has nothing to do with a right way or a wrong way to spell, and I have removed notes that state this when I have found them. There is no simple answer, but to imply that if it is not American English then it has to be Commonwealth English (or vice versa) will never work.--Dmol 14:50, 4 November 2006 (UTC)

Agreed that there is no simple answer. The only way to handle the problem comprehensively would be to have an entire section for each word with spelling variations detailing where each is used. Even then you'd still have arguments where some areas of a country use one spelling and others another - or both.
I would suggest that perhaps the best solution is possibly also the easiest: Do not make any comment about where each spelling is used other than to mark spellings that are local to a subdivision of just one country 'regional'.
In the main I would imagine that people for whom the difference is important, i.e. people who are writing in a country where English is the first, or an important second, language will have been taught their 'version' of or/our ise/ize er/re differences. Moglex 15:15, 4 November 2006 (UTC)
I disagree with attributing a spelling form to a specific region and here's why: As we can see from the above, nobody is going to agree on which spelling is appropriate for which region, because that depends on one's subjective philosophy regardless of where you are from. When Theodore Roosevelt submitted the bill to congress to make the reformed spellings standard (color), there was a massive backlash from congress, who felt that he was attempting to impose a system of orthography by executive order. Most of the major American newspapers at the time refused to adopt the new spellings out of protest. However, cultural attitudes changed in the United States, and now we have color instead of colour. This same thing could happen in other regions (ex. Australia and Canada in the above post). The main objection to such a change would be cultural, not linguistic. For example, every single objection to the color spelling that I have read on Wiktionary is because it is perceived that the Americans are yet again attempting to force their cultural views on others. No other defense of the colour spelling (to my knowledge) has ever been raised (such as it is easier to remember, or makes more logical sense). This is why I made it a point to emphasize that there is a difference between spelling variations and regional usage of words. For example, color/colour means exactly the same thing regardless of what area of the English speaking world you come from. In contrast, a rubber can mean something very different in Britain than in the United States (UK: an eraser, US: a condom), despite the fact that it is spelled exactly the same way in both places. This is why I originally proposed a solution for the spelling problem that does not attribute a specific region. 
For spelling, we are really talking about the difference between the spelling that derives from King James and William Shakespeare or earlier and the reformed spellings that were proposed by Noah Webster, and later adopted in the United States as a standard spelling (For the time being, I am leaving out non-standard spellings such as mutha for mother, which is popular in Hip Hop circles). There were objections to my original proposed terms of traditional vs. reformed spelling. That only means that the adjectives that I chose to describe the phenomenon were objectionable to some. The objections stemmed more from a perceived bias attributed to the terms than from their imprecision. However, the phenomenon still exists. Perhaps we could label the colour as complex spelling and color as simplified spelling. Or we could label it as something else that captures the same idea of reformed vs. non-reformed that meets with everyone's sense of cultural sensitivity.

A-cai 20:30, 4 November 2006 (UTC)

I think I was unclear in what I said above. My suggestion was that no mention is made of where or what the different spellings came from or are, with the exception that if a spelling is local to a part of a country (or a very small country), it is labeled simply '(regional)' - not with a specific region.
Although I quite like your suggestion (if any suitable words could be found), it does rather beg the question: what useful information does a user get if you do not actually tell them which way they should be spelling the word for their target audience? And since you cannot do that accurately without a very complex system, why do anything at all? Provided all the spellings are present, that should be enough.
I don't follow your argument. As a descriptive dictionary, surely our job includes describing, to the best of our ability, what is considered "educated usage" in every country where English is spoken. We need to accept that we will be far from complete, but nonetheless do our best. And being a wiki, the best will slowly improve. But, to take a simple example, if we fail to tell a British 12 yr old that color is not standard British usage, and they lose marks at school for poor spelling, (or ditto at 18 and they are rejected for a job interview) we have failed them.
Also, surely this thread is about usage, as well as spelling. Knocking people up, tabling, pissed, etc, all need appropriate regional tags. --Enginear 20:06, 7 November 2006 (UTC)
I'm sorry, but this seems quite beyond the point. We have made such a distinction for a very long time. The question at hand is only whether "Non-US" is sufficiently POV to prohibit its use. At least that much, everyone seems to agree upon. --Connel MacKenzie 20:14, 7 November 2006 (UTC)
Agreed -- I'd forgotten the origin of this discussion -- perhaps a sign that a vote is overdue. I had wondered if any ex-British Empire citizens might object to using a name which, for some, may still have Empirical overtones. But no one has, so no need for any counter-proposals. Go for it. --Enginear 10:16, 8 November 2006 (UTC)

No amount of voting will change the fact that the English language can not be divided up into US and Commonwealth which is what Connel is trying to do. It ignores the exceptions I have already listed above, plus dozens of other groupings such as Caribbean, Irish-American, UK but not other commonwealth, African English, North American (both US and Canada) etc.--Dmol 11:28, 8 November 2006 (UTC)

What an extraordinary, outrageous lie! You are the main person pushing the POV that en-UK is the only valid "English" using loaded POV terms like "non-US". I have said numerous times, that each of those distinctions should/must be made. --Connel MacKenzie 19:25, 8 November 2006 (UTC)
I'm no big fan of voting here, either. And, clearly, a binary split is not perfectly adequate. The questions, it seems to me, are whether we want to try to tackle the problem at all, and if so, how thoroughly.
My opinion is that sticking with a "Commonwealth" vs. "US" split is an appropriate compromise. Trying to do less -- saying "Eh, "color" and "colour" are two alternative spellings, flip a coin or use whatever you want" -- is a useless cop-out. But trying to do more -- trying to say, for every word in the language, precisely which of its alternative spellings are used by the British, and the Americans, and the Canadians, and the Australians, and the New Zealanders, and the Indians, and the Irish, and my crazy cousin Fred -- may well be impossible, or not worth it, or impractical given the resources we have.
It's also worth remembering, before we get too prescriptively descriptive on regional spelling, that our main focus here is (in my opinion) describing what words mean. I don't see our role as capturing "official" spellings; I don't see our users coming here and trying to answer the question, "What is the totally official spelling of word X in region Y?". Trying to come up with "official" spelling lists for even a single region, let alone simultaneously for all English-speaking regions, is a thankless task and probably a fool's errand. Let's not get too bogged down in it. —scs 19:17, 8 November 2006 (UTC)
(P.S. Personally I feel that "UK" might be an appropriate, shorter and more parallel, synonym for "Commonwealth" in this context, but that's probably a more contentious point. —s)
Calling "Commonwealth English" "UK" has been rejected several times now. Apparently, the British and the non-British alike, detest such commingling. --Connel MacKenzie 19:25, 8 November 2006 (UTC)
I'm not surprised, but the {{UK}} tag still seems to be in pretty widespread use, so I wasn't sure. —scs 20:12, 8 November 2006 (UTC)

noun phrase or phrasal noun?

Hello all, just to mix things up a bit: I have noticed the terms Noun phrase, Phrase and Phrasal noun used as headers in various articles. What the hell are the guidelines (if any exist)?--Williamsayers79 08:36, 31 October 2006 (UTC)

For example I have recently edited the entry small molecule drug. Personally I would have thought this to be a Noun phrase, but should I call it a Phrasal noun or Noun phrase in the header?--Williamsayers79 08:41, 31 October 2006 (UTC)
===Noun=== and ===Phrase=== are valid en.wiktionary headings, but "Noun phrase" is not. --Connel MacKenzie 08:42, 31 October 2006 (UTC)
So we don't use Phrasal noun or Noun phrase then? --Williamsayers79 08:44, 31 October 2006 (UTC)
Those headings are routinely changed to ===Noun===, as they are found. --Connel MacKenzie 08:45, 31 October 2006 (UTC)
Thanks, I shall do so too then.--Williamsayers79 08:52, 31 October 2006 (UTC)
See WT:POS draft policy Robert Ullmann 11:59, 31 October 2006 (UTC)

You say this like it's policy, but it's not. I for one still prefer "Noun phrase", for the simple reason that a noun is usually defined as a type of word. The OED for example defines noun as "A word used as the name or designation of a person, place, or thing", or "a word that is capable of functioning as the subject and direct object in a sentence, and as the object of a preposition", definitions which identify single words and seem to exclude phrases, at least by implication. Our own definition of noun specifies it as a word, and therefore it is problematic to call a phrase simply a noun. Widsith 14:15, 31 October 2006 (UTC)

Are you suggesting that noun needs to be updated/corrected? "Noun phrase" as a heading has been soundly rejected in the past. What I said was that the non-Wiktionary headings are routinely corrected - that is a statement of fact. That is not something that I alone do. The question of what Wiktionary accepted practices are, vs. what are official policies are, can only be addressed by noting that Wiktionary has no official policies, AFAIK. --Connel MacKenzie 18:03, 31 October 2006 (UTC)
Discussion is on-going at WT:POS. No one commenting there has supported the use of Noun phrase or Verb phrase for English entries, or provided reasonable arguments there for their use. Personally, I feel I could go either way, but everyone who has been active in the discussion has preferred simply using Noun or Verb. --EncycloPetey 00:15, 1 November 2006 (UTC)
My opinion is that in most cases, calling a dictionary entry a "noun phrase" tells you almost nothing. If "hobbyhorse" and "hobby-horse" are nouns, but "hobby horse" is a noun phrase, then basically what "noun phrase" means is "has a space in it" or "is more than one word". But you can see that for yourself, if it matters to you.
In grammar, a noun phrase (to be more than a noun) might include an article and/or some adjectives ("the big red bus"), but those (usually) aren't dictionary entries.
Broadening your mind, so that a "word" can be more than one word, makes things so much simpler! :-) —scs 00:40, 1 November 2006 (UTC)
Amen to almost everything scs says, up to the part where real phrases (though they exist) are supposedly irrelevant for dictionaries. There are obvious noun phrases we'll want to include, like "a month of Sundays" or "the very idea". Knee-jerk deletion of the label "phrase" in these cases is unhelpful. There are thousands of verb phrases we'll want to include, including most idioms. Calling "hit the nail on the head" or "take the bull by the horns" just plain "verbs" is just plain icky. A policy insisting that "verb" is the only acceptable label does not add to Wiktionary's credibility. Keffy 03:47, 1 November 2006 (UTC)
No prob -- I wasn't talking about real phrases. (But I'd say in many cases they're simply "phrases", not noun phrases or verb phrases.) —scs 04:46, 1 November 2006 (UTC)

I understand these arguments, but it is not done to convey extra information to a reader; it is simply done because a group of words cannot be a noun. A noun is a single word (or at least that's the way it's normally defined). I don't feel especially strongly about it I suppose, but I just think "Noun phrase" is more exact. Widsith 11:40, 1 November 2006 (UTC)

So is Sri Lanka a proper noun or a proper noun phrase? :-) —scs 12:20, 1 November 2006 (UTC)

I have pointed this out a thousand times but here we go again: An "X phrase", where X is a part of speech, is not necessarily an X. An "X phrase" is a phrase whose headword is an X. Consider Adam and Eve: This is a noun phrase since the head of the phrase is a noun but as a whole the phrase is a verb. head over heels: This again is a noun phrase but the phrase is adverbial. The POS header should obviously contain the POS of the whole phrase, the fact that a phrase is an X phrase is of secondary importance (but could be mentioned somewhere). Ncik 12:49, 1 November 2006 (UTC)

Really? That's not how we define noun phrase. Widsith 16:55, 1 November 2006 (UTC)
More precisely, this is not how SemperBlotto defines a noun phrase. The original version agreed with me. Ncik 10:17, 2 November 2006 (UTC)
I agree with Ncik: the heading is not specific enough to a single meaning to be very useful. For individual words, the POS headings are useful, but for phrases, much less so. (I think a better example than Adam and Eve would help - Cockney rhyming slang is unheard of, outside of Europe; rare outside of England.) For phrases (especially idiomatic phrases) the part of speech sometimes helps convey which meaning is intended. Each individual definition line is what could/should indicate the part of speech. On the other hand, 'Adam and Eve' is also used as a noun, to convey figuratively any 'first parents' of something. Complexity arises when a single idiom can be used as different parts of speech (but that is infrequent enough to address in a usage note?) I still think User:Hippietrail's part of speech templates are the better way to describe these. To a native English speaker, the information is so obvious, it is laughable. To an English (as second language) student, the information is very helpful. So the abbreviated form (with the "hint" text) seems like the best middle ground. --Connel MacKenzie 16:44, 1 November 2006 (UTC)