Wiktionary:Beer parlour/2006/June

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Tea room, Beer parlour, and Grease pit added to "navigation" bar

I've been bold and added these three to the nav bar - that's on the left of your screen if you're using monobook at least. I've assumed this would be uncontroversial. If you believe I was wrong let me know here or in the Grease pit. — Hippietrail 01:22, 1 June 2006 (UTC)

I'm against it. Our users will probably not know what they are, and they can all be reached from Community portal. Also, I keep hitting the wrong one when I want Recent changes! SemperBlotto 08:47, 1 June 2006 (UTC)

Yes, I'm against modifying the Navbar for all users as well. I hope you won't mind, but I'm going to remove them again. —Vildricianus 10:26, 1 June 2006 (UTC)

Well, I liked it. Widsith 10:36, 1 June 2006 (UTC)

You can add it (and many many more things) for just yourself by customizing your monobook.js. If it seems, though, that a majority of people want it back, OK then for me. I thought it would only add to the confusion for the majority of users who don't bother for discussion anyway. —Vildricianus 11:21, 1 June 2006 (UTC)
I think it is a Good Thing to encourage more people to converse in the discussion areas. I think many of our regular contributors get lost, trying to find even the Beer parlour. --Connel MacKenzie T C 22:03, 3 June 2006 (UTC)
We'll need another room soon enough if there are only a couple more people like me, writing entire chapters of needless prose each time they comment (certainly if they reply to each and every message). OK, I'll stop here.
No, I won't. I suggested on User talk:Hippietrail to add WT:DR instead. I guess the "Discussion rooms" theme (leading to a clear page with a nice table) is also a bit less obscure than the "beer, tea, nuts and bolts" theme. However, I guess it may be perceived as "autocratic", though, to first remove Hippietrail's links and then add my own one.
Note: in order not to disrupt SB's daily rituals, it'll need to be added below the recent changes link. :-) — Vildricianus 23:02, 3 June 2006 (UTC)
Does anybody object? — Vildricianus 13:27, 6 June 2006 (UTC)
Sounds good to me. --Connel MacKenzie T C 04:26, 7 June 2006 (UTC)
Done. — Vildricianus 21:51, 7 June 2006 (UTC)

Failure in the verification process

Why was butt monkey deleted?

Also, would I have to find quotations if I recreated kindergarten rule? I didn't try to attest the page because the definitions didn't look right to me. Idiomatic I don't know about, but there is certainly a definition and asking to cite that would be a bit ridiculous. The old definition was deleted for failing the RFV process. My question is if that means it will be deleted automatically again, even with an appropriate definition, when in fact it should be RfD'd to determine if it meets CFI. Davilla 09:00, 1 June 2006 (UTC)

I restored it. Looks passable to me. --Newnoise (Shout louder) 09:53, 1 June 2006 (UTC)
  • It looks to me like it failed RFV. Looking closer at the deletion, I do not agree - but then again, the "references" were not formatted as per WT:ELE. In fact, they were ambiguous links, simply numbered. Hrm. I think all of us should pay closer attention to the quotation formatting, to avoid inadvertent mistakes like this in the future. Would you like me to format those entries, or do you want to attempt it first (to make sure you've got it) ? --Connel MacKenzie T C 05:14, 2 June 2006 (UTC)
    I actually prefer to write out the quotations myself because it means I have some influence over the format. The problem is it takes a long time to do. It's seldom easy to find good quotations, easy meaning neither complex in search terms nor arduous in selection. I really should copy the text of the quotation at least, but the other information is almost as difficult to track down, e.g. the year that a webpage was written, the real name of a blogger, or the production studio for a movie. In this case I incorrectly assumed that the term wasn't that contentious in the first place. Also failed RfV pages have then undergone the RfD process in the past, and this one would have been caught even in my absence. Davilla 14:09, 2 June 2006 (UTC)
  • I was the one who most recently purged RFV, and deleted the entry for "butt monkey." There were about 150 terms to read through (I read as much of the conversation on WT:RFV as I can, but it would take an inordinate amount of time if I were to read RFV, read the RFVed page, read the citations, verify the citation etc.). Since the person who is cleaning up after the process doesn't want to repeat the process, it needs to be made clear to them that one thing or another should happen: if there is no input on the RFV page, it gets deleted. If there are no citations or changes on the RFVed page, it gets deleted. It is a pain to clean the thing up because it gets backed up for months, and it gets backed up for months because it is a pain. The things we can do to make this an easier cleanup are: state clearly on WT:RFV that something has passed CFI once it has. If a name I know has typed out "passed RFV" under the proper heading all I do is go to the appropriate page and remove the tag, if they type out "passed RFV, removed tag" then they are my favorite person for the week :) If some entries were deleted that shouldn't have been, by all means reinstate them, I will not be offended, I know that mistakes are made, but usually it is around 2am when I get around to that kind of thing, and making it a 3 hour project isn't what I am looking for then ;) - TheDaveRoss 19:13, 2 June 2006 (UTC)
    Okay, I thought there had already been some talk on the rfv page in the form of general support, and with one quotation listed prominently didn't bother to go back. So it looks like more communication would have avoided the problem.
    By the way, except in reviewing a word as a third party I've never removed the rfv tag myself because I believed it would be a faulty procedure. Yes, I'm confident in the quotations I use, but not always in the ones other people use, so as a point of procedure I submit my work for verification. Sorry since it would be quite easy to do.
    Anyways as for this discussion I meant "failure" in two ways, the more critical issue as far as the process goes being words that fail RfV for having the wrong definition. Davilla 07:15, 3 June 2006 (UTC)
What I have recently done is that if the word has attracted little comment or no decision at rfv, but I feel it still needs attention, then I moved it to rfd, which has provoked responses. This enables me better to consider if it should be deleted. If I did not think it was a rfd type word, and could not decide what to do with it, I left it at rfv. Now that rfv is reasonably up to date, I hope to keep checking and fixing if I can. Andrew massyn 19:55, 2 June 2006 (UTC)
And thank you both for the amazing amount of purging and fixing of RFV & RFD which has gone on in the last week. Much appreciated by those of us who read them. --Enginear 01:12, 3 June 2006 (UTC)
I echo Enginear's thanks. It is appreciated! --Connel MacKenzie T C 22:04, 3 June 2006 (UTC)

Back to butt monkey. There are several serious contributors who think it should remain in place, but maybe marked for cleanup. So why does it get deleted? Whilst I too thank people for doing some dishwashing, you have to be careful not to throw the baby out with the bathwater. --Richardb 22:38, 8 June 2006 (UTC)

If a "serious contributor" (read: anyone who is reading this, and many more besides) would like to see an RFCed entry kept, all they have to do is take some time during the month that it is in WT:RFV and prove that it meets the CFI. "Butt monkey" still has not provided evidence that it belongs. The definitions are such that they really fit any disparaging term, so trying to find something which meets the CFI shouldn't be hard. Instead of clear evidence though, I was met with 4 numbered links, at least one of which doesn't even contain the term, one is an XML document (a possible RSS feed, but clearly not something which provides a decent citation of usage), the others disagree on usage and formatting (butt-monkey vs butt monkey vs Butt Monkey), as well as a seemingly random "Beavis and Butthead Do America" at the bottom there.
If someone is REALLY concerned about keeping the entry, and it is worth keeping, then 3 DECENT cites in DECENT formats can surely be dug up for it. A month is a long time, I think, and if no subsequent effort is put into a page to prove its worth, I don't put forth the effort myself during the cleanup. The notation on the RFV page itself didn't indicate one way or the other what the entry merited: "annoying and irking person. — Vildricianus 14:48, 9 April 2006 (UTC)", "Legit and worth searching for validation. Davilla 17:57, 9 April 2006 (UTC)"
Not wanting to get too much on the defensive, upon reviewing it I would still delete it, it only boasts 2 real cites:
  • "I replied and called him a "butt-monkey" and reminded him that I had truly looked everywhere online and couldn't find it." and
  • "I'm finished being everybody's butt monkey."
Which do not agree on what the definition is, so really it has one cite per definition, certainly not up to CFI standards.
  • "That’s what I want to know. Novack was the only one who published the information–why the heck isn’t he under the most pressure? But of course the answer is politics, he’s a Republican butt monkey and so won’t be punished. Too bad there’s no state laws under which he could be pursued by a Democratic AG."
From the XML feed doesn't even agree with any of the stated definitions. Believing it belongs isn't enough, we have a month to prove it, this one wasn't proven, it should be deleted. For the benefit of the doubt, I am relisting it on RFV, it gets another whole month of scrutiny, hopefully it will be confirmed or denied definitely this time around. - TheDaveRoss 08:20, 13 June 2006 (UTC)

AOL block

How long are we going to have it blocked and forced to use HTTPS? Is this a permanent measure? — Vildricianus 15:47, 2 June 2006 (UTC)

The root problem hasn't been solved, and can't be from our end without help from AOL. The other option is unblocking without solving the problem and just continue to use short blocks for the AOL IPs. - TheDaveRoss 18:40, 2 June 2006 (UTC)
Editing from the secure.wikimedia.org server is not error-free. A couple of pages can't be accessed or are linked wrong. How long has the block been up? I'm not sure how long we need to go on with it - especially if the result is not as desired. — Vildricianus 19:55, 2 June 2006 (UTC)
Is it, though? I've not gotten any complaints lately. Am I not checking the right places? --Connel MacKenzie T C 22:06, 3 June 2006 (UTC)
Perhaps it's fine this way. In that case we may want to give a full statement at Wiktionary:AOL, and make it clear that it's going to be a bit longer than temporary. IIRC, my motivation for posting on this was the annoying sitenotice most users have to put up with. — Vildricianus 22:29, 3 June 2006 (UTC)
Ah. You showed me how to use CSS to hide that (that was my first personal CSS customization, IIRC) but it never made it into WT:CUSTOM... I'd almost forgotten it was there. --Connel MacKenzie T C 04:29, 7 June 2006 (UTC)

The way in which Wiktionary defines correct plural forms

Is it entirely by reference to 'reputable sources', or are plural forms derived from various grammatical rules accepted as well? I ask because another user raised objections to my changing various words' plural forms (generally by removing the ubiquitous -s suffix), and proceeded to revert the entries I altered. I am a new user to Wiktionary, so if I have violated some rules or regulations in my unilateral alterations, then I apologise. However, it seemed very strange that the said user objected to my changing the plural of platypus from platypuses and platypi to platypodes, when it states in Wikipedia's article on the platypus:

>There is no universally agreed upon plural of "platypus". Scientists generally use "platypuses", "platypoda", or simply "platypus". Colloquially, "platypi" is also used for the plural, although this is spurious pseudo-Latin. (The true plural would be "platypodes".)<

I had always presumed that dictionaries were written in an effort to be as correct as possible. Furthermore, I had started removing -s suffix plural forms from the entries of the words found in the article that discusses irregular English plurals, because essentially the only criterion for a word being included in the list was for it to form a plural without the addition of an -s; thus, either the word would have to be removed from the list if it could be pluralised by the addition of an -s, or that plural form was incorrect. I opted for the latter.

I await clarification on this issue.

Doremítzwr 16:25, 2 June 2006 (UTC)

The resolution of the categories of irregular plural forms issue took a different approach than yours. However you are correct in listing multiple plural forms when they exist, even "incorrect" pseudo-latin forms, and could simply note as much in the Tea Room when there is a disagreement. Davilla 16:32, 2 June 2006 (UTC)
While there is some technical correctness to some of these plurals their use in English can often be seen as somewhat pedantic. A pedantic use should be supported with evidence. At the other extreme "platypi" is very common, and needs to be recorded even if it is incorrect. Anyone using that plural needs to be made aware that such use will reflect a less than completely literate quality to one's writing style. We record common incorrect forms, but they cannot be used with impunity. Eclecticology 18:10, 2 June 2006 (UTC)

My interpretation of what you two (Davilla & Eclecticology) have written thus far is that every plural form, from the most classically correct, to (as I would regard them) the most simplistic bastardisations, ought to be included, and noted as such. However, I am a proud pedant, and do not see why pedantic forms ought to be supported by evidence any more than any other form ought to be (although I am perfectly willing to provide evidence for any entry when necessary).

No, no. Only the attested classically correct forms, and attested bastardizations. --Connel MacKenzie T C 22:56, 3 June 2006 (UTC)

Concerning Davilla's point that even >"incorrect" pseudo-Latin forms< (excuse the correction, but come on, Latin is a language) ought to be listed: in the sense that they have as much validity as -s plural forms, I absolutely agree. The -s suffix is not, strictly speaking, an English morpheme; thus adding -s to a word to form a plural is no more correct than following a number of other equally valid 'English' rules for forming plurals (such as replacing the word's final -us with an -i or -um with an -a).

What I propose is that the ('pedantic') forms with which I replaced the former plural forms be reinstated alongside those which I deleted. This ought to appease everyone. Is this proposition acceptable to everyone?

Doremítzwr 22:45, 2 June 2006 (UTC)

The problem with the pedantic forms is that they are theoretical only. What is needed is evidence that they have in fact been used that way. Eclecticology 03:34, 3 June 2006 (UTC)
You're right, I flatly forgot to capitalize the L, so there's no need for the excuse or the "come on". Davilla 07:23, 3 June 2006 (UTC)
Is it sufficient that I and a few other pedants currently use them (not to mention the likely far greater number who formerly used them)? What is the harm in including them, if they are labelled as 'Etymologically Correct' (but not necessarily 'Common Usage')? Of course, most of you are unlikely to agree with the label 'Simplistic Bastardisation' for -s plural forms, so I open it up to you to invent your own (more neutral) term. If it is acceptable for everyone here, I will go about reinstating the etymologically correct plural forms, label them as such, and provide reasons where necessary. I will, however, also leave the -s plural forms, as an example of common usage. OK?
Doremítzwr 15:00, 3 June 2006 (UTC)
Please explain why you think scenarii is the only valid plural of scenario, since that is the gist of your last edit. All other dictionaries give the plural as scenarios. Jonathan Webley 17:31, 3 June 2006 (UTC)
Scenario is a word of Italian origin, whose plural is scenarii. Final o>i is neither a difficult nor a particularly rare (at least compared to the Icelandic saga>sögur) rule for forming plurals (think concerto>concerti, soprano>soprani, cello>celli, among others), thus, considering ease of use and precedent, there is a good case for the scenario>scenarii plural; that is my justification. Of course, adding an -s to pluralise scenario would be no more incorrect than applying any other seemingly applicable rule to do so (not that I can think of one at this point in time).
No, my reason for listing scenarii as the only possible plural form of scenario has more to do with the aesthetically displeasing sound of "scenarios", in contrast to the far more elegant and unobtrusive "scenarii". I realise that this is a very subjective reason, although I still believe it to be a somewhat valid one (well I would, wouldn't I?). Unfortunately, Wiktionary is intended to be more descriptive than prescriptive (as is the tendency for English dictionaries nowadays), so I expect that scenarios will soon be included as an alternative plural form to scenarii for scenario.
Doremítzwr 00:37, 4 June 2006 (UTC)
Having Wiktionary as a descriptive rather than prescriptive was a conscious decision taken a long time ago; I believe that it continues to have broad support. In some ways it could be viewed as Wiktionary's counterpart to NPOV. Eclecticology 08:14, 4 June 2006 (UTC)
In English scenarii is not a plural of scenario. If it's the plural of the Italian word scenario then create the Italian entries thus, instead of buggering up the English ones. Jonathan Webley 07:37, 4 June 2006 (UTC)
I assume we follow the same rationale for inflections etc. as we do for full entries, namely, attestation? If a plural can be attested as used that way, we include it. If not, we may perhaps note it somewhere, but not wikify it or create a separate entry for it. As scenarii seems to get plenty of Google books hits, I assume it is legitimate enough to be included. Jonathan, if it's not a plural of scenario, what is it then? — Vildricianus 11:31, 4 June 2006 (UTC)
I agree with Connel above that inflections should be attested.
scenarii, whilst getting a few google hits, is not in any dictionary. That makes it either a misspelling or a protologism. It is moot, and I could be persuaded that scenarii is a valid alternative plural.
My earlier vehemence was more a reaction to the repeated deletion of commonly accepted plurals in words such as affidavit, imprimatur, stimulus, amoeba by the same contributor, which I felt was verging on vandalism. Jonathan Webley 12:52, 4 June 2006 (UTC)
I agree with the idea of attesting them, which would mean scenarii is not protologistic, from what's been said. However, it certainly isn't the principal plural form, regardless of the new contributor's subjective opinions. Davilla 10:36, 7 June 2006 (UTC)
Doremítzwr, you miss a couple of points about English plurals:
  • Naturalised English words follow English rules for pluralisation. Hence we say "pizzas" in English, and never "pizze", the Italian plural. Only when a word is not naturalised is the plural from the originating language the only acceptable plural. (There is, however, often a middle ground for some words, in which both kinds of plural are used.)
  • Your statement that "scenarios" sounds "aesthetically displeasing" and "scenarii" sounds "far more elegant and unobtrusive" may be your view, but is utterly irrelevant here. Wiktionary records actual usage of words rather than the forms that contributors think have the best euphony. — Paul G 10:37, 14 June 2006 (UTC)

other authoritative dictionaries

[discussion moved from WT:RFD 19:01, 3 June 2006 (UTC)]

Over at WT:RFD there's been some debate about how many entries we should have for OED/Oxford English Dictionary and SOED/Shorter Oxford English Dictionary. Without opening the question of whether we want an entry for every dictionary ever (let alone every book title ever), how do people feel about Hanyu Da Zidian, Dai Kanwa Ziten (aka Morohashi), and Dae Jaweon? These are, as I understand it, pretty much analogous to the OED for Chinese, Japanese, and Korean. The question comes up because these dictionaries are regularly referenced from our CJK entries, and are therefore showing up as missing in RJFJR's "list of words used in Wiktionary that aren't defined in Wiktionary" project. So I'm very tempted to create little entries for them. –Scs 22:13, 30 May 2006 (UTC)

Referenced, but not linked, right? Davilla 16:50, 3 June 2006 (UTC)
Correct. (RJFJR's premise, which I'm in at least partial agreement with, is that as a dictionary we ought to define all the terms we use, i.e. whether or not we explicitly link to them. Or do I hear Gödel laughing off in the distance?) –scs 18:54, 3 June 2006 (UTC)
Since when is English first order logic? I don't think it is logic(al) at all! TheDaveRoss

Can we first please have a link to the infamous previous discussions? — Vildricianus 19:17, 3 June 2006 (UTC)

We can do without articles about these dictionaries; they are essentially encyclopedic. Eclecticology 08:23, 4 June 2006 (UTC)
I've argued even the OED is encyclopedic (in its full form), so you know my opinion. The idealistic argument doesn't make sense to me. I don't think we have to create entries for all of the works we quote, for instance. Anyways we define Oxford and English and dictionary so what's the worry? Maybe just starting with an Appendix: article would be less risky. Davilla 10:42, 7 June 2006 (UTC)

After reconsidering this, I'm quite sure these terms are not fit to be in here at all. They are just publications; how can we ever determine which to include and which not? People might expect to find Abraham Lincoln, fine, but who thinks of Oxford English Dictionary as a "word"? It's not a word, it's just the title of a book. This has been up for a couple of weeks now but I haven't heard a single strong argument in favour of "short entries for the most common English dictionaries". — Vildricianus 21:46, 23 June 2006 (UTC)

Templates for quotation references

A "lighter" issue to relieve the tensions here. (Which tensions?)

Eclecticology created a couple of templates a while ago for the purpose of consistent and easy referencing of quotations. Example:

*{{RQ:Shakespeare John}}
*:Quotation text.

I plan to create some more, but I would first like to discuss their naming conventions. The "RQ:" is not very consistent with the majority of our templates being lowercase, so I'd change that to "rq:". Second, how to refer to the author and work? The idea is to make them easy to remember and consistent. For instance, the reference for Shakespeare's King John is currently named {{RQ:Shakespeare John}}. Arguably, that should be {{rq:Shakespeare, King John}} or something like that. Does anyone have a suggestion? — Vildricianus 21:06, 3 June 2006 (UTC)

See {{rq:Wells, War of the worlds}} for an example. It's obviously important to keep in mind the case of the work's title. I suggest avoiding Title Case and lowercasing everything save proper nouns and the first word. — Vildricianus 21:17, 3 June 2006 (UTC)

I see no reason to change these to lower case. In naming the templates it helps to keep it short but still identifiable, unlike some of the things that have developed during the recent craze for templates. In general I have no objection with the use of sentence case in full titles. Eclecticology 08:37, 4 June 2006 (UTC)
That's all fine but I don't want to spend ten minutes looking for a template. It would help if there were some logic behind their names instead of randomness. What should War of the Worlds be: Template:RQ:Wells War? Wells Worlds? Wells War Worlds? — Vildricianus 10:54, 4 June 2006 (UTC)
Sometimes a longer title will be appropriate. Eclecticology 08:13, 7 June 2006 (UTC)
The shorter titles are preferred if you remember them, but the longer ones are necessary if you don't. So why not just redirect your favorite abbreviated title to the full title, sentence case as agreed?
Anyways, how permanent is this as a solution? I can see the need, but it's not the prettiest substitution. Davilla 15:05, 9 June 2006 (UTC)
  • First point: As I understand template use from discussions on Wikipedia, they place extra stress on the server that links do not (template redirects especially so). I therefore dislike using them except in cases such as entry line categorization, grammar, and etymology. --EncycloPetey 21:42, 15 June 2006 (UTC)
  • Simple template transclusion poses no strain. If it did, the devs would long have warned against the many WP templates that are used in thousands of instances. Bot-changing thousands of entries in order to keep things consistent is way more resource-intensive, though, so templates are a much better solution. — Vildricianus 23:02, 15 June 2006 (UTC)
  • Second point: In this particular case, I can imagine that the list of templates will grow very, very large, so that it will be very difficult to maintain a useful list. It requires looking up the template as an added extra step, and there often won't be one made yet. I don't believe the effort is worth it. I have thus far found very few repeat cases of source quotation, and believe that I have been adding more quotations than everyone else combined for the past few months. When I go looking for citations, it's easier for me to pull the information from Wikisource than to worry about specialized templates. The only possible exception I can think of might be for entries quoted from the KJV/AV Bible. --EncycloPetey 21:42, 15 June 2006 (UTC)
  • You don't seem to understand their purpose: they're for referencing where the quote comes from. It should be done properly with links to WP and WS, but as these may change from time to time, the links have to be adapted. Moreover, they would allow quick adjustment according to formatting policies. — Vildricianus 23:02, 15 June 2006 (UTC)
Actually, I do understand this. The problem is that there aren't frequent repeats of sources. In terms of possible linkage changes, the act of updating one template versus updating three pages to reflect a moved link doesn't seem like it's worth the organizational investment. It also means that there will be almost one template per one citation, so I don't see how that would make format updating any faster either. We've never had a consensus on how citations are to be formatted anyway, and there are at least two major systems I've come across in addition to the format that I use. Templates won't solve that. --EncycloPetey 01:50, 16 June 2006 (UTC)
  • Third point: For Shakespeare, the existing templates are lacking. They do not indicate which edition was used, which makes a huge difference for spelling and editorial changes. The First Folio seldom resembles modern edited texts, for example. I do not think though that having a separate template for each play and each edition is worthwhile. --EncycloPetey 21:42, 15 June 2006 (UTC)
  • That's a more specific issue. There could be ways of tackling this, like leaving the edition out of the template.
  • Now I don't mind barring the option of using them at all. It was Eclecticology's idea which I was merely expanding. I find them useful, though. — Vildricianus 23:02, 15 June 2006 (UTC)

Prolific PIE

I've noticed an increase lately of contributors wishing to PIE our etymologies. We discussed that this is not desired not too long ago. Does anyone know why they are increasing, these days? Have they been shut out of Wikipedia or something? --Connel MacKenzie T C 02:17, 4 June 2006 (UTC)

Jargon, jargon, jargon! What the "F" is PIE ?--Richardb 06:04, 4 June 2006 (UTC)
PIE is Proto-Indo-European. I had not heard that this was undesirable, but only that the hypothetical Proto-Germanic and PIE forms were not to be wikified. —Stephen 06:23, 4 June 2006 (UTC)
I can't remember it being the outcome of the latest discussions that they were undesired in etymologies. As separate entries, yes, but as it is quite natural for etymologies to contain references to them, I don't see that as undesired. — Vildricianus 10:56, 4 June 2006 (UTC)
The best a PIE form can ever hope to be is "theoretical." But even then, the circular nature of assigning the "forms" by a contingent of learned scholars is still undesirable...the premise used to construct these is flawed. These were all spoken fragments. The notion that they can be accurately reconstructed (ahem: using current language features as the only clues) has yet to stand the test of time.
Even if the PIE forms do eventually gain recognition as being somehow "valid," they remain unhelpful (if not misleading) to the majority of our readers. The best that "proto" forms can hope for in the way of attestation, is a listing in a secondary source. But the experts that are inventing these proto forms don't often seem to agree amongst themselves! These should remain Wikibooks/Wikisource projects for some time to come, I believe. Using Wiktionary to linkspam them, or otherwise add to their validity is an abuse of Wiktionary. --Connel MacKenzie T C 17:30, 24 June 2006 (UTC)
PIE reconstructed words ARE valid. This is historical linguistics at its heroic and triumphal best. This is built solidly on the science of phonology and how sound-change affects language. Some reconstructions are less certain than others, but the literature carefully labels these. They are as valid.--Allamakee Democrat 00:03, 26 June 2006 (UTC)


{{cattag}} doesn't work properly, at least, not as it is being used. Categories have an initial capital but labels do not, so, for example, as Template:Chemistry is currently defined, it generates the category "Category:Chemistry", which is correct, but the label (Chemistry), which is not (we use lower-case initial letters for labels unless a label requires an initial capital).

Could someone take a look at this and fix it, please? — Paul G 08:46, 4 June 2006 (UTC)

When did we start using lower case in the labels at the beginning of a definition line? Eclecticology 08:10, 7 June 2006 (UTC)
As far as I can see, that has always been done in the majority of these labels. — Vildricianus 08:38, 7 June 2006 (UTC)
The problem is that {{chemistry}} passes an upper-case label to cattag. As far as cattag is concerned, this could just as well be a proper noun. Regardless of whether you think it should be displayed as a capital, the lower-case should be used as a parameter. Making things upper-case is easy to do in code. Deciding whether they should be lower-case (like chemistry) or not (like uS) isn't as straight-forward. Davilla 10:07, 7 June 2006 (UTC)
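
The asymmetry Davilla describes can be illustrated with a small sketch. This is plain Python rather than template code, and the helper names `ucfirst` and `lcfirst` are assumptions chosen for illustration; nothing here claims to be how {{cattag}} is actually written.

```python
def ucfirst(label: str) -> str:
    """Capitalise only the first character, leaving the rest untouched."""
    return label[:1].upper() + label[1:]


def lcfirst(label: str) -> str:
    """Naively lower-case only the first character."""
    return label[:1].lower() + label[1:]


# Starting from the lower-case parameter, the category name is always recoverable.
print(ucfirst("chemistry"))  # Chemistry
print(ucfirst("US"))         # US

# Starting from the capitalised form, the code cannot tell a common noun
# from a label that must keep its capital.
print(lcfirst("Chemistry"))  # chemistry
print(lcfirst("US"))         # uS
```

Passing the lower-case form as the parameter keeps both the label and the category derivable; passing the capitalised form loses the information needed to display the label correctly.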


While looking through the Special:Allpages, there is a very handy "Next page" button on there, but annoyingly, there is not a "previous page" button on there. Can adjusting my monobook solve this, or is it a job for the developers? --Dangherous 10:52, 4 June 2006 (UTC)

I don't think so. Developer thing, but I doubt they'll spend time in that. — Vildricianus 11:19, 4 June 2006 (UTC)
It's definitely not possible for us to do in JavaScript because it involves a database lookup. It requires the database because the URL requires the first word on the new page. This could be done with the toolserver, though words added after the latest database dump will be omitted. You're probably better off filing a feature request on http://bugzilla.wikipedia.org — Hippietrail 00:40, 5 June 2006 (UTC)
bugzilla:4673 seems to be the request to "vote" for (if you've created an account there.) --Connel MacKenzie T C 07:15, 12 June 2006 (UTC)

Criteria for inclusion: more ideas to test multi-word entries

I've posted another idea at Wiktionary talk:Criteria for inclusion#Multi-word entries, sums of their parts and translations, regarding multi-word terms, whose existence has lately been put to the test of RFD, CFI and Pawley tests. Please comment. — Vildricianus 13:28, 4 June 2006 (UTC)

Plurals vs. Site Statistics

It has come to my attention that the count decreased by several thousand entries as a result of User:TheCheatBot's activities. Specifically: converting 'bot uploaded "subst:"ed entries to the templatized form made the entries no longer "count" towards the site statistics.

Tracing the problem led to the discovery that Wiktionary pages (unlike Wikipedia pages) do not count towards the site statistics if they don't contain even one wikilink. Oddly, templates (that do wikify downstream) somehow don't count.

Question 1: Does anyone care? If pages were counted the same way as Wikipedia, we'd have about 207,000 entries (as of the last XML dump.)

Question 2: Assuming people do care, there are two things that then need to be addressed: A) Template-heavy entries, B) linkless, templateless entries. For B), we can start a (long term?) cleanup effort. But for A), should the format of {{plural of}} simply be changed to take the wikified form as the parameter? I can do this fairly easily, contingent on #1 - if anyone actually cares.

Other possibilities exist: we could have User:Scs or User:Patrik Stridvall work some magic on Article.php, much to Brion's delight, pushing it back into the source code tree. Or we could file a bug report, saying that Wiktionary wants to be counted the way Wikipedia is (since the opposite styles are probably more appropriate anyhow, in that we expect more short entries.) Or we could file a bug report saying that template transclusions should be taken into account before searching for a wikilink.

Or, maybe, all of the above? Comments? --Connel MacKenzie T C 04:54, 7 June 2006 (UTC)
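The counting rule described in this post can be approximated as a check on the raw wikitext before templates are expanded. The sketch below assumes the rule is simply "the stored text contains a wikilink"; the real logic lives in the MediaWiki source and may differ, and the function name `counts_as_entry` is hypothetical.

```python
def counts_as_entry(raw_wikitext: str) -> bool:
    """Assumed rule: a page counts only if its stored text has a wikilink.

    Templates are not expanded first, so links that a template would
    produce downstream are invisible to this check.
    """
    return "[[" in raw_wikitext


# A templatized plural has no literal link in its stored text.
templatized = "{{plural of|cat}}"

# The proposed change passes the wikified form as the parameter instead.
wikified_param = "{{plural of|[[cat]]}}"

# The older substituted form carries the link directly.
substituted = "Plural of [[cat]]"
```

Under this assumption only the first form drops out of the count, which matches the decrease seen after the entries were converted to templates.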

  • Also of note: the syntax "* Common misspelling of foo" does count as an entry, by merit of having the correct spelling wikified. Perhaps we should go back to deleting them all?? --Connel MacKenzie T C 06:02, 7 June 2006 (UTC)
I'd go for the Article.php fiddling option. Whether an entry has wikilinks or not is not very crucial for a dictionary, more so for an encyclopedia. Wikifying wikilinkless pages can always be done via Special:Deadendpages. — Vildricianus 08:36, 7 June 2006 (UTC)
I'm always willing to support removing templates and replacing them with plain text. Eclecticology 08:01, 7 June 2006 (UTC)
Even for something like this? --Connel MacKenzie T C 14:53, 7 June 2006 (UTC)
Erm, we've just decided to do the opposite. — Vildricianus 08:36, 7 June 2006 (UTC)
Don't change our practices for the number, just change the way it's counted. Davilla 10:26, 7 June 2006 (UTC)
Well, I don't have the ability to change LocalSettings.php, nor the ability to change the Wikimedia software. But I do have the ability to run a 'bot that would make these replacements. --Connel MacKenzie T C 14:54, 11 June 2006 (UTC)
Why, there are a number of things in the running, proposals to change some settings: custom namespaces, subpages in ns:0, plus this one - why not propose them all in one complete list to the devs? — Vildricianus 19:01, 11 June 2006 (UTC)
PS: it's still not clear to me who has the ability. Bureaucrats or only the devs? — Vildricianus 19:02, 11 June 2006 (UTC)
Also, could we create a page (or pages) listing the pages that have no links? I find it odd that there would be any entries at all that had no links whatsoever. It would be nice to have a list where these can be examined and links added. --EncycloPetey 01:42, 16 June 2006 (UTC)

Misspellings format/template/etc.

  • On a related note, (perhaps a separate thread is better) shouldn't "* Common misspelling of..." be in a template, for consistency? --Connel MacKenzie T C 14:56, 7 June 2006 (UTC)
    Separate thread. I think we still need to re-discuss misspellings policy. Format? Criteria? Wouldn't it be better to make an appendix? And link to that from the article? What is the format for words that are correct spellings in other languages but misspellings in English? — Vildricianus 15:19, 7 June 2006 (UTC)
    Certainly not an appendix. If we list common misspellings, it's to make it easier for readers who unwittingly search for them that way. If they were relegated to an appendix, such readers would probably never find them. But yes, we should certainly have a template for convenience and consistency. —scs 15:14, 8 June 2006 (UTC)
    Are they always common misspellings? You claim some Commonwealth spellings are in the US while their American counterparts are in the UK. There is some question as to what constitutes a misspelling versus an alternative spelling, or even a misconstruction like copywritten. Then there's always the question of how common it is. After verifying it, I recently marked werdup as a deliberate misspelling. Davilla 16:05, 7 June 2006 (UTC)
    Anywho I've created a {{form of}} which probably generalizes way too much. Davilla 08:15, 8 June 2006 (UTC)
    Hippietrail was the last one (I think) to try creating templates for this purpose, but inexplicably voted for their deletion later. --Connel MacKenzie T C 02:41, 16 June 2006 (UTC)
Aren't we supposed to have at least a ==Language== statement? — Vildricianus 19:12, 14 June 2006 (UTC)
Well, they aren't words in English (or whatever language they are a misspelling in.) So I'd hope not. --Connel MacKenzie T C 02:41, 16 June 2006 (UTC)
If they aren't words they aren't fit for inclusion here, right? — Vildricianus 09:35, 16 June 2006 (UTC)

Created {{misspelling of}} for this purpose. "Common" not needed as all misspellings we include are "common" (right?). — Vildricianus 15:00, 16 June 2006 (UTC)

Block letters

Comments/corrections on Wiktionary:Blocking policy#2006/06/08 are appreciated (I have not sent it yet.) --Connel MacKenzie T C 15:48, 8 June 2006 (UTC)

We should (later) take them out of the Blocking policy page, though. Looks a bit weird. — Vildricianus 16:28, 8 June 2006 (UTC)
WT:ABUSE/Wiktionary:Abuse reports? --Connel MacKenzie T C 17:54, 8 June 2006 (UTC)
Yup. Keeping things consistent across the projects is a Good Thing. — Vildricianus 18:06, 8 June 2006 (UTC)
I'd like to send this in 1/2hr from now. I've seen corrections to the typos, and wildly positive feedback from a Wikipedia admin. I don't see good reason to significantly delay this. --Connel MacKenzie T C 19:31, 8 June 2006 (UTC)

I have only just seen this for some reason. I appreciate the work and energy Connel has put into chasing up these copyvios, which is definitely a good thing. However with all respect to him, I'm not convinced he's the best person to be writing them; I would personally prefer that they were drafted by someone a little more diplomatic, like Vildricianus or one of our bureaucrats. Or at least they could be toned down a little? As Richard commented, it's about providing information rather than reprimands. Widsith 15:13, 15 June 2006 (UTC)

<redacted>--Connel MacKenzie T C 05:26, 16 June 2006 (UTC)

I started a boilerplate letter for future letters of this nature here (moved here), please amend, comment upon, or spit at it if you wish to see these letters be different in tone/form/style/usage in the future. - TheDaveRoss 05:07, 16 June 2006 (UTC)

The point of sending an abuse report is, at its very essence, a request for assistance. The ISPs and/or Uni networks that get them have the ability to actually do something to a disruptive user. AOL can, in theory, sue one of its customers for getting all of AOL blocked (if the block resulted in lost revenue for AOL, that is. And that, they would only know if everyone dropping AOL said they did so because of the E*icornt guy.)
A University, on the other hand, has much more leeway in what they can do about an abusive user. In this case, since it was thousands upon thousands of mendacious copyright violations, I may have taken too much liberty in trying to express the proper sentiment.
Certainly, in my opinion, it seems like the university itself is promoting plagiarism by not providing their 5th year student (as he claimed to be earlier this year) with any knowledge that stealing other people's work is not only wrong, but against the law. On numerous occasions, he claimed that he created entries from his own term paper(s) (which were later also proven to be copyvios.) So, is the university encouraging copyright violation by not doing the slightest thing to prevent it, even in its student's papers? Well, my opinion may not matter. But that is how I arrived at that controversial wording, anyhow. But presumably those copyvio term papers assisted the person in getting their degree last month - who knows. Perhaps the university could revoke the conferred degree since it was based on fraud. Would that help Wiktionary? I don't know, but I like to think that it will.
Anyway, I'd like to see the boilerplate give a couple (well-worded) options depending on the network-type, and/or the abuse type.
--Connel MacKenzie T C 05:52, 16 June 2006 (UTC)
Yeah, I agree with all of that – sorry if I gave the impression otherwise. This sort of correspondence is definitely important if Wiktionary is going to take itself seriously. Widsith 19:56, 16 June 2006 (UTC)


Is Wiktionary interested in setting theirs, or do they like having the same one as Wikipedia? :P

The Wiktionary logo wouldn't work very well as a favicon, I think. But you could use something different (Wikipedia's favicon is not their logo, after all; maybe you could force them to change ;))

For that matter, is there any reason Open Content is capitalised in the logo? It looks kinda weird... --en.{wp,wb}|commons:user:pfctdayelise 13:04, 8 June 2006 (UTC)

Subpages for main namespace

Already mentioned before, but I thought I'd post this more clearly. What do other people think of having the ability to have subpages in the main namespace? It would be highly appropriate for Wiktionary, given the amount of pseudo-subpages for citations, to give one example. I see no reason why we wouldn't allow this ability. — Vildricianus 20:16, 8 June 2006 (UTC)

Note: as this requires technical fiddling with settings and stuff, this may go with the entire namespace topic. I thought I'd throw this up while that one's still active, to allow a one-off change to LocalSettings.php. — Vildricianus 18:27, 9 June 2006 (UTC)

Can I file this at bugzilla? I guess nobody minds. — Vildricianus 19:43, 18 June 2006 (UTC)
Could you please explain the difference of making "official" sub-pages in NS:0 again? I don't get it. The /Citations pages work, don't they? --Connel MacKenzie T C 16:15, 28 June 2006 (UTC)
They do work, but they aren't subpages. They're just entries with a slash in their title. Compare this real subpage and this false subpage. Subpages should be real, shouldn't they? They allow making use of some magic words and such, and at least display a "back to" link atop of the page. — Vildricianus 16:20, 28 June 2006 (UTC)
So making NS:0 allow sub-pages simply adds the link to the parent page? It certainly sounds like a bug, that this has not been enabled all along. Here are the entries (not including /Citations) that have "/" in the headword:

w/ w/o GNU/Linux n/a and/or P/E ratio 24/7 w/e w/i 9/11 look-down/shoot-down look down/shoot down East/West engine east/west engine licări/licuri TCP/IP c/o km/h client/server by/citations And/or By/citations C/o Client/server Km/h Licări/licuri Look-down/shoot-down Look down/shoot down N/a W/ W/e W/i W/o tCP/IP w/c b/c n/c n/s o/a o/c y/o m/o w/h a/c w/off 7/7 CSMA/CD Farsi/Experiment Persian/Experiment N/A AC/DC haben/conjugation s/he S/he snitch/verification J/psi particle Internet forum/ bouncebackability/citations colour/experiment color/experiment sa/vol へ/compare に embarrass/etymology I/O -ize/Derived terms if/ja theater/homonyms Rodasmith/Context-US б/г б/м бн/о о/. п/г п/м п/пр. с/г с/м с/ч т/г т/м о/ гвт/ч см/сек км/час ж/д

--Connel MacKenzie T C 16:29, 28 June 2006 (UTC)
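To see why a list like the one above matters, here is a rough sketch of what treating a slash as a subpage separator would mean for ordinary headwords. It assumes the parent is simply whatever precedes the last slash; that is an assumption for illustration, not a description of the MediaWiki implementation, and `parent_title` is a hypothetical name.

```python
from typing import Optional


def parent_title(title: str) -> Optional[str]:
    """Return the assumed parent page of a title, or None if it has none."""
    head, sep, _tail = title.rpartition("/")
    if not sep or not head:
        # No slash at all, or nothing in front of it: not a subpage.
        return None
    return head


# A pseudo-subpage would get the parent that was intended all along.
citations_parent = parent_title("bouncebackability/citations")

# Ordinary headwords containing a slash would get a parent as well.
accidental_parents = {t: parent_title(t) for t in ("w/o", "TCP/IP", "24/7")}
```

With this reading, "w/o" would be shown as a subpage of "w" and "24/7" as a subpage of "24", so any switch to real subpages would have to account for such entries.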

Ahem. bugzilla:6476. --Connel MacKenzie T C 18:14, 28 June 2006 (UTC)

Lists of synonyms vs. Definitions

I have come across several pages recently which display an unfortunate trend: rather than offering definitions they give lists of synonyms in the place of a definition (e.g. torpid - unmoving, dormant or hibernating). This leads to little loops of words which all may mean the same thing, but never state what they actually mean. We need to strive to actually define words, not just offer relationships; that is the *saurus' job ;). The adjectives are the biggest offenders as far as I can tell, but I am not sure of a good way to hunt down such entries and fix them. Thoughts and suggestions are appreciated. - TheDaveRoss 15:57, 9 June 2006 (UTC)

Quite a relevant remark, I've been wondering as well. There are some instances where I think having synonyms within the definitions is appropriate: 1/ When the word is a variant, a rare, dated or obsolete term that is properly defined by a more widespread or current word. Listing the current term as a synonym for the obsolete one may be more effective than repeating the definition, if relevant of course. 2/ As additional information, to clarify the definition. Emphasis here is that there is already a definition, but that synonyms prove to add essential clarity to it, so that it is more appropriate to list them in the defs rather than in the =Synonyms= section.
In general, I usually try to add a couple of synonyms to a definition, which may either be retained or moved to their own sections later on. Most of the time I don't bother to create a separate section, as the purpose is to list them at Wikisaurus.
Personally, I think torpid deserves way better definitions than that. In this case, synonyms-only is not very appropriate if I may say so. — Vildricianus 18:44, 9 June 2006 (UTC)
I agree. I don't know about "torpid", but many words that have this treatment have come from Webster 1913. The practice of "synonyms as definitions" comes from print dictionaries wanting to save money on paper and ink (and effort, for that matter: give a list of synonyms and the reader is bound to know what one of them means). So yes, we should certainly strive to avoid this. Wiktionary is not paper, and defining A as B, B as C and C as A is of no help to anyone.
I'll put a "rfc" tag on the page. — Paul G 09:21, 10 June 2006 (UTC)

Sometimes I've thought it would be fun to write a tool to create a tree of what words are defined in terms of what other words, to (a) detect isolated clusters and loops like these, but more interestingly (b) discover what the truly fundamental concepts are (analogous to postulates in Geometry and other formal systems), that you can't define except by example or by discovery or by knowing them instinctively, or whatever. —scs 22:26, 10 June 2006 (UTC)

That would be an interesting tool, WordNet probably has the most potential for showing the relations in such a way. - TheDaveRoss 21:35, 12 June 2006 (UTC)
This would be more than just fun or interesting. I see it as important, if not essential, to weed out loops in the graph (rather than tree). Of course, some will have to stay, as all words of a language are defined using words of that language rather than a metalanguage, but basic loops of the kind I described could then be tracked down and eliminated. — Paul G 10:42, 14 June 2006 (UTC)
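The loop-hunting tool discussed in this thread could start from something like the sketch below: a depth-first search over a "defined in terms of" graph that reports the first loop it finds. The input format (a dictionary from each word to the words used in its definition) and the name `find_loop` are assumptions; extracting that graph from real entries is the hard part and is not shown.

```python
from typing import Dict, List, Optional


def find_loop(defined_by: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one loop of words (first word repeated at the end), or None."""
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(word: str) -> Optional[List[str]]:
        state[word] = ACTIVE
        path.append(word)
        for nxt in defined_by.get(word, []):
            seen = state.get(nxt, UNSEEN)
            if seen == ACTIVE:
                # nxt is already on the current path, so we closed a loop.
                return path[path.index(nxt):] + [nxt]
            if seen == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[word] = DONE
        return None

    for start in defined_by:
        if state.get(start, UNSEEN) == UNSEEN:
            found = visit(start)
            if found:
                return found
    return None


# The "A as B, B as C and C as A" case mentioned earlier in the thread.
loop = find_loop({"A": ["B"], "B": ["C"], "C": ["A"]})
```

Words that are used in definitions but never appear on any loop and have no outgoing edges would be candidates for the "fundamental concepts" mentioned above, though that analysis is a separate step.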

Changing user name

Two users have posted to my talk page asking me to change their user name.

I can't remember if this is possible or not, but one claims to have done so already on Wikipedia.

If it is possible, can someone remind me how to do it, please, and advise whether we allow this.

Thanks. — Paul G 09:13, 10 June 2006 (UTC)

Bureaucrats have the Special:Renameuser page IIRC. If you've got specific questions about the how-to, I think the crats at Wikipedia can help you. They also have a page for such requests, and it's quite busy over there. I can't see why we wouldn't allow this. — Vildricianus 09:23, 10 June 2006 (UTC)
Ah, now I am a 'crat, and looking at Special:Special pages, there is a link to do this. Thanks, Vildricianus.
The only reason I thought that we might disallow this is if it meant that a user's contributions before the change could no longer be traced back to them (so, for example, allowing a vandal to escape being held accountable). — Paul G 09:26, 10 June 2006 (UTC)
It's discussed at w:Wikipedia:Changing username SemperBlotto 09:30, 10 June 2006 (UTC)
I have now moved the users. I seem to remember once someone asking if their contributions could be accredited to their new name, which I don't think is possible. Maybe that is why I thought that this was not possible or not allowed. Thanks for the help. — Paul G 09:33, 10 June 2006 (UTC)
(2 edit conflicts!) No, that's not the case. The username in the database changes, but not the user ID, which is a number, not a name. Contributions will remain linked to the same ID, whatever its name is. Histories of pages and diffs will all display the new name, so if you were to change your name, even your earliest contribs will be under that new name.
Things that won't change are userpages. They have to be moved to the new location. Also, hardcoded references, like signatures on talk pages, won't change of course. People who want to efface every trace they've left behind have to change all that manually. — Vildricianus 09:34, 10 June 2006 (UTC)
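A toy model of the explanation above, in which edits point at a numeric user ID and the name is looked up when a history is displayed. This is a deliberately simplified illustration and not the actual MediaWiki schema; all names here are hypothetical.

```python
from typing import Dict, List, Tuple


def rename_user(users: Dict[int, str], user_id: int, new_name: str) -> None:
    """Change the name attached to an ID; the ID itself never changes."""
    users[user_id] = new_name


def page_history(edits: List[Tuple[str, int]],
                 users: Dict[int, str]) -> List[Tuple[str, str]]:
    """Resolve each edit's user ID to whatever name it has right now."""
    return [(page, users[user_id]) for page, user_id in edits]


users = {7: "OldName"}
edits = [("torpid", 7)]
# A signature is plain text stored in the page, not a reference to the ID.
talk_page_text = "Nice entry. OldName 09:00, 10 June 2006 (UTC)"

rename_user(users, 7, "NewName")
history_after = page_history(edits, users)
```

In this model the history shows the new name for old edits, while the hard-coded signature still reads "OldName" until someone edits the talk page by hand.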
Cool stuff. Can I get my username changed? --Newnoise (Shout louder) 21:30, 11 June 2006 (UTC)
Like said on Paul's talk page: you can, but it's not a game. It's a significant load for the database to rework all changes. — Vildricianus 21:49, 11 June 2006 (UTC)

Lowercase userpages

For instance User:davilla, or User:wietsezuyderwijk. It's really odd, as lowercase usernames are not allowed in the database, so these userpages don't correspond to the usernames. Should we disallow this? It's not something everyone should start doing actually. — Vildricianus 09:46, 10 June 2006 (UTC)

But there are no users of these names - only a userpage of those titles. You can see that on the "user"pages themselves, as there are no links to "user contributions", "e-mail this user" and so on. \Mike 10:35, 10 June 2006 (UTC)
I don't think allowing lowercase usernames/userpages is a good idea. --Connel MacKenzie T C 14:50, 11 June 2006 (UTC)
Lowercase user pages I will abstain from. Lowercase usernames somehow I doubt would open the door to abuse. ∂ανίΠα 18:07, 11 June 2006 (UTC)
AFAIK, it's simply impossible to have lowercase usernames. Also, it's probably even worse to have User:davilla but User talk:Davilla. This is an aberration in our lowercase system and should not be done. — Vildricianus 18:41, 11 June 2006 (UTC)

Moved to meta:New logo for Wiktionary, please continue discussion there. —Nightstallion (?) 16:43, 11 June 2006 (UTC)

wheels on wheels

This word is in my watchlist. I noticed Connel had battled a spate of like vandalism yesterday and wondered if there might be a real entry under the history. Davilla 08:07, 11 June 2006 (UTC)

Right now that entry has View or restore 1 deleted edit? displaying for me. WoW is a rather famous page-move vandal - I would not be surprised if this Beer parlour page had been moved to that exact location at some time in the past. (Move "reverts" actually remove the redirect that would have been left behind, but only if the sysop is careful and reverts the page move from the move log's 'revert' link.) My theory is that that 'revert' special action does not clear out users' watchlists, if they were affected. --Connel MacKenzie T C 14:48, 11 June 2006 (UTC)

Nasalised vowels

I think it's been asked before, but I can't find it here or in Wikipedia... how do you enter nasalised phonemes in IPA and SAMPA?

I think SAMPA is done by preceding the vowel by a tilde (for example, ~O for the vowel in French "bon") but how do you get the IPA characters? (Windows Character Map has a, e, n and o only.)

(Note to self: fix pronunciation of "Port-au-Prince" once this question has been answered.)

Thanks. — Paul G 09:13, 11 June 2006 (UTC)

You can see those characters here: API and SAMPA for fr: (fr: Wiktionary). API / X-SAMPA are: on = ɔ̃ / O~ ; an = ã / A~ ; in = ɛ̃ / E~ ; un = œ̃ / 9~. As you can see, the X-SAMPA is done by preceding the tilde by the vowel. The character «   ̃ » should be placed in the edittool. - Dakdada 12:37, 11 June 2006 (UTC)
Oh! (Slaps forehead.) That was the first thing I did, look at fr:bon, but they didn't have an IPA pronunciation listed, only some API thing I'd never heard of. But, of course, API is French for IPA.
Paul: was your question "What's the IPA nasalization character?", or, "How do I enter it in Windows?" If the latter, I'd say, either cut and paste (from fr:bon if necessary), or wait for someone to take Dakdada's suggestion and add it to the edittool, or use the alt numlock 771 thing, however that works. —scs 16:04, 11 June 2006 (UTC)
Or, buy a Mac. Widsith 22:08, 11 June 2006 (UTC)
My question was really the former, Scs. I'd have cut and pasted it from somewhere if I had found it. But thank you for the useful feedback. I know this sort of thing tends to be easier in Apple's world than Microsoft's, but I'll pass on your suggestion, Widsith :) — Paul G 09:05, 12 June 2006 (UTC)
Speaking of which, I do use a Mac, and I don't know how to enter the ̃ character (which of course is U+0303 Combining Tilde) from the keyboard, either. So, Widsith, what's the secret? —scs 09:50, 12 June 2006 (UTC)
Well I just have the Character Palette up on screen the whole time, and you can add the combining tilde to your "Favorites" section so it's there for you any time. Widsith 10:40, 12 June 2006 (UTC)
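For anyone building these characters programmatically rather than from a keyboard or palette, the combining tilde mentioned above (U+0303) can simply be appended to the base vowel. A small sketch using Python's standard `unicodedata` module; the helper name `nasalise` is hypothetical.

```python
import unicodedata

COMBINING_TILDE = "\u0303"  # U+0303 COMBINING TILDE


def nasalise(vowel: str) -> str:
    """Append the combining tilde to a base vowel."""
    return vowel + COMBINING_TILDE


# Open o (U+0254) has no precomposed nasal form, so it stays two code points.
nasal_open_o = nasalise("\u0254")

# Plain "a" does have one, so NFC normalisation folds it into U+00E3.
nasal_a = unicodedata.normalize("NFC", nasalise("a"))
```

This may also explain why a character map limited to precomposed letters offers only a handful of tilde vowels: the IPA-specific ones exist only as base letter plus combining mark.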

Wikimedia Commons - Legal help?

On commons:Commons:Village Pump#Mailing list discussion - appoint a legal adviser? there is a general request for assistance in US copyright law. If any of our lawyer-contributors could take a few minutes to look in on the issues there, it would probably be very beneficial to the entire Commons project. --Connel MacKenzie T C 14:38, 11 June 2006 (UTC)

Temp buttons

I've a small request to make to any sysops out there. It probably won't happen, but is there any chance I could borrow a sysop's account for a few minutes? There's this page that got deleted, which I really want to see the history of. However, I'd rather not say what the page is, for fear of embarrassment. So if any sysops fancy lending me their account for a really short time, then email me thru the obvious channels. I won't go admin-rouge again, cross my heart. ;). --Newnoise (Shout louder) 21:40, 11 June 2006 (UTC)

That's a ridiculous request. I ain't giving my password to you :-). You can e-mail someone (not me now, I'm off to bed) and ask the undelete info in private if you mind the embarrassment. Perhaps someone is weird enough to do it, but it's not me. — Vildricianus 21:51, 11 June 2006 (UTC)
I can't imagine any sysop denying an e-mail request for information about a deleted entry. If it is not a state secret of some sort, you'll probably get the deleted entry's contents as well. Your request is, as Vild says, ridiculous. --Connel MacKenzie T C 07:20, 12 June 2006 (UTC)
Certainly, and here are my credit card details and the keys to my house as well ;) Just tell us what the page is and let one of us sort it out. No need to feel embarrassed - I'm sure no one will laugh at you. — Paul G 09:17, 12 June 2006 (UTC)
Problem sorted. Thanks to one over-trusting admin. --Newnoise (Shout louder) 09:19, 12 June 2006 (UTC)

Page moves autoconfirmed

Discussion moved to Wiktionary:Votes/2006-06/Page moves autoconfirmed.

Vote called for - Spelling variants in entry names

Connel has asserted "The notion that entering a redirect is OK, is completely unacceptable!", and persistently bases his arguments on that assertion. I would similarly assert that this is just something Connel believes, and there has never been a solid decision on this. This needs to be resolved before we can finish developing this policy.

So I call for a vote. Please go to Wiktionary talk:Spelling variants in entry names#Vote called for. Need to resolve one point of dispute. —This unsigned comment was added by Richardb (talkcontribs) 10:56, 13 June 2006.

Like said there, I'm against such votes. — Vildricianus 11:32, 13 June 2006 (UTC)


Just canvassing opinion here, but....does anyone else feel that we should standardise our Pronunciation sections by using only IPA? SAMPA and that other one just look very dated to me, and in my personal opinion they don't make us look good. Apart from aesthetics, the practical value is that the section would be much less cluttered, which can currently be a problem when we are distinguishing between US and UK pronunciations, plus adding homophones etc. Widsith 15:14, 13 June 2006 (UTC)

  • Support, but I'm not sure what other people think about this. I've always had troubles with SAMPA and AHD, but thought that suggesting something like this would sound too POV. — Vildricianus 15:23, 13 June 2006 (UTC)
  • Support. IPA is complex because, like Wiktionary and unlike AHD or WEAE, it is designed to represent "all words in all languages" (with the noted exception of sign languages). SAMPA is just a 7-bit transcoding of IPA, so if we want, we can choose some day to write a MediaWiki extension to convert IPA to SAMPA (though I doubt we'd ever want to do so). Two other simplifications of IPA are phonemic representation and, for non-Latin scripts, raw transliteration. Phonemic representation doesn't help non-natives pronounce words accurately, so its value here is questionable. IPA seems to be the clear winner for standardization. Rod (A. Smith) 17:20, 13 June 2006 (UTC)
    IPA to SAMPA
    Almost correct. SAMPA is an ASCII transcoding of the IPA used in a particular language, and they vary (and conflict) between English and other languages. X-SAMPA is the one that's universal.
    Considering the difficulty of entering IPA characters, I would prefer to go the other way. Bot convert all IPA: [aɪpʰiˈeɪ] to a template that results in IPA: <ipa>"Eks{mp_h@</ipa> X-SAMPA: ["Eks{mp_h@]. Of course the reverse would be convenient within {{subst:}}. ∂ανίΠα 18:36, 13 June 2006 (UTC)
  • We shouldn't be using AHD anyway, that is the AHD's proprietary pronunciation scheme. I don't care whether we use SAMPA or IPA (I don't have particular objections to either one) but I think it would be best if we created our own scheme. We have several very gifted linguists around here, if we could tailor something which looked a little less cryptic than IPA/SAMPA but was still as complete and useful. Seeing as I am not qualified to create this, I could only tag along, but I think that would be the optimal solution. - TheDaveRoss 16:03, 13 June 2006 (UTC)
Well, I can't agree with that as I think there are too many already, that's the whole problem. IPA is the most widely-applicable and internationally-recognised. Widsith 17:10, 13 June 2006 (UTC)
I agree with Widsith. Besides, there's no reason to think we'd be better than the people who created IPA and attempting to do so would be original research. By the way, though, what does everyone think about removing the "/" symbols from the {{IPA}} parameter and having the template enclose the pronunciation in "/" characters? Rod (A. Smith) 17:45, 13 June 2006 (UTC)
If you have a bot ready to clean up afterwards, that's fine for me. Also, if there's agreement on that point, someone should bot-remove all {{AHD}} instances and their contents (only 387). — Vildricianus 18:17, 13 June 2006 (UTC)
[somewhat stale text after multiple edit conflicts]
There is a specific difference between enclosing IPA transcriptions in // or []. Slashes are for phonemic transcriptions, while brackets are for phonetic ones. It's a subtle but pretty significant distinction, and we wouldn't want to have the template always using slashes unless we took steps to make sure that editors were always entering phonemic transcriptions.
See w:International Phonetic Alphabet#Types of transcription for more information on this.
As that page notes, unless you really know what you're doing (and I certainly don't), you should probably stick to phonetic transcriptions and use brackets. And, given that we tend to list distinct UK and US pronunciations, that suggests we're using phonetic transcriptions already.
Come to think of it, to make this distinction more apparent, and to make it more likely that editors will enter IPA pronunciations appropriately, while freeing them of the burden of remembering which transcriptions are supposed to use slashes and which are supposed to use brackets, perhaps we should have two separate templates, {IPA-phonemic} and {IPA-phonetic}. The former would add slashes and the latter would add brackets. (Though in doing so we'd basically be trading off the hard-to-remember distinction between // and [] for the almost as hard to remember distinction between the words "phonemic" and "phonetic".) —scs 18:49, 13 June 2006 (UTC)
AFAIK, that's not how it's done here. If we were to phonetically transcribe English words, we'd end up with things like [b̥ɛdz] instead of /bedz/. Look in the WT:BPA for the discussion about the /r/ thing. — Vildricianus 19:12, 13 June 2006 (UTC)
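The two-template idea floated above amounts to letting the wrapper, not the editor, supply the delimiters. A minimal sketch in Python of what {IPA-phonemic} and {IPA-phonetic} would each do; the function names are hypothetical stand-ins for those proposed templates, and nothing here checks that a transcription really is phonemic or phonetic.

```python
def ipa_phonemic(transcription: str) -> str:
    """Wrap a phonemic transcription in slashes."""
    return "/" + transcription + "/"


def ipa_phonetic(transcription: str) -> str:
    """Wrap a phonetic transcription in square brackets."""
    return "[" + transcription + "]"


# The two transcriptions of the same word quoted in the reply above.
broad = ipa_phonemic("bedz")
narrow = ipa_phonetic("b\u0325\u025bdz")  # b + ring below, open e, d, z
```

The editor then stores only the bare transcription, so the choice between slashes and brackets is made once, by picking the template.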

There is one reason (the only one I think) to keep (X-)SAMPA: it is written with ASCII characters, and therefore can be read by anyone; that's not the case for the IPA (so a lot of people have problems seeing some characters). - Dakdada 18:27, 13 June 2006 (UTC)

That's true, of course, but the question is, how many people is it actually useful to? On the one hand, more and more browsers do support IPA, in all its funky typographical glory. On the other hand, many readers are unfamiliar with IPA and do not find it useful -- some even complain about it. So how many readers are there who (a) do value IPA but (b) don't have a browser that can display it? (But no, I certainly don't know the answer.) —scs 18:58, 13 June 2006 (UTC)
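If both notations were kept, the ASCII form could in principle be generated from the IPA rather than typed twice. The sketch below covers only the French nasal vowels quoted earlier on this page (O~, E~, 9~) and passes every other character through unchanged; a real converter would need the full X-SAMPA table, and `to_xsampa` is a hypothetical name.

```python
import unicodedata

# Deliberately tiny table: only symbols quoted in this discussion.
IPA_TO_XSAMPA = {
    "\u0254": "O",   # open o
    "\u025b": "E",   # open e
    "\u0153": "9",   # oe ligature
    "\u0303": "~",   # combining tilde marks nasalisation
}


def to_xsampa(ipa: str) -> str:
    """Convert the covered IPA symbols to X-SAMPA, leaving the rest alone."""
    # NFD splits precomposed letters so the tilde is seen on its own.
    decomposed = unicodedata.normalize("NFD", ipa)
    return "".join(IPA_TO_XSAMPA.get(ch, ch) for ch in decomposed)


bon = to_xsampa("b\u0254\u0303")  # the vowel of French "bon" after b
```

Because the tilde follows the vowel in both notations, a character-by-character mapping is enough for these cases; symbols that map to multi-character X-SAMPA sequences would work the same way with a larger table.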
  • This is a much trickier issue than many of you might be aware. European dictionaries, if they include pronunciation at all, have been using IPA for a long time. Most Europeans are used to IPA at some level. American dictionaries almost never use IPA. Most Americans are not used to IPA at all. Each person signing would help by including their country, primary language, or which variety of English they feel themselves to use.
  • Every dictionary which uses IPA uses it differently - even dictionaries made by the same publisher such as the OED and the SOED.
  • Our "AHD" scheme is not AHD's proprietary scheme and I've argued many times to no effect that we shouldn't call it "AHD". It's actually an invented scheme which uses elements of several schemes used by various American dictionaries plus a few of our own - but it is probably closer to AHD's scheme than to any other.
  • Due to a bug with how templates work, they always include a linefeed character at the end. When the template is expanded, any whitespace including linefeeds is rendered as a single space. This means that putting "/" outside the templates annoyingly creates a space between the pronunciation and the final "/".
  • SAMPA is not a 1:1 mapping to IPA. SAMPA actually treats only a certain number of languages and may use a different symbol for the same sound in 2 languages. X-SAMPA is a closer mapping to IPA.
  • IPA does not include symbols for all sounds in all languages. In grammars of minority languages there are regularly sounds which the linguist had to invent a new symbol for. Even in well-known European languages there are sounds lacking symbols. For instance there are no symbols to distinguish Spanish "r" and "rr" and there is no symbol for the Czech "ř".
Eh? Surely /ɾ/ versus /r/. Widsith 20:05, 13 June 2006 (UTC)
  • "Two other simplifications of IPA are phonemic representation..." above makes no sense. IPA is specifically a "phonemic representation" though it can also be used for lower levels of representation.
    (FWIW, as stated elsewhere in this thread, IPA encodes phonemic or phonetic (or even morphological) properties of words. The sentence that you say makes no sense did make sense an hour ago, when I thought most IPA editors here were using IPA to encode phonetic information. Rod (A. Smith) 19:52, 13 June 2006 (UTC))
  • IPA has very poor support for tonal languages such as Chinese, Thai, and Vietnamese, and also for other languages in which tone can distinguish two words, such as Japanese - which is not considered a tonal language.
  • In my opinion, the best way to handle pronunciations would be if people contributed them from specific dictionaries using exactly the characters used in them. This is the only true "no original research" way to go. It avoids the issues with different dictionaries using IPA in different ways. It avoids the issues with American dictionaries not using IPA at all. But the big question is - would it violate copyright? — Hippietrail 18:37, 13 June 2006 (UTC)
    • It's been said before that that's not really possible. People aren't supposed to contribute from dictionaries (no secondary sources). I guess one can't make a dictionary without doing original research, unless public domain sources are heavily relied on. — Vildricianus 19:12, 13 June 2006 (UTC)
That would clearly be unacceptable, I think. It feels very much like copyright violation to me -- or, if not strictly speaking a copyright violation in the legal sense, certainly an improper lifting of the work of others in the ethical sense. Some could probably come up with arguments justifying such borrowings (perhaps under "fair use"), but enough others would assert the contrary opinion -- i.e. that copyright was being violated -- that such a policy would never fly.
I concede that inventing our own pronunciations feels very much like Original Research. I don't know how to reconcile the two goals here. (I suspect -- though it'll probably open up a can of worms if I say this -- that we've already got huge amounts of original research going on here, that originally-composed pronunciations are just one example of. How often do you compose a new Wiktionary definition after consulting N other references, versus composing it off the top of your head based on your certain knowledge, as a native English speaker, of what a familiar word means?) —scs 19:09, 13 June 2006 (UTC)
  • Another argument for using AHD or whatever English standard we choose is that it represents the phonology of English, whereas IPA, even excluding the diacritics that designate different varieties of the phones, distinguishes between sounds in a phoneme that English speakers (and especially those who speak only English) consider to be equivalent allophones and therefore could not, and indeed must be trained to, hear. ∂ανίΠα 18:54, 13 June 2006 (UTC)
Even after a couple years here now, I still don't understand the majority of the IPA symbols. Perhaps if every single IPA character in Wiktionary pronunciation sections were an audio link to the sound file for that IPA character's audio pronunciation, it might become something useful. The characters, on their own, still look like unhelpful gibberish, to me. --Connel MacKenzie T C 18:46, 13 June 2006 (UTC)
That's weird. I learnt them in a couple of minutes. Come on, make an effort dude. — Vildricianus 19:12, 13 June 2006 (UTC)
At the very least, audio pronunciations of examples in the pronunciation key would be helpful. I also think the chart should be organized by allophone (i.e. AHD symbol) rather than by distinct phones (i.e. IPA symbol). Different IPA symbols that correspond to the same AHD symbol would be listed in different columns, especially as they apply to regional variations. ∂ανίΠα 18:54, 13 June 2006 (UTC)
As it happens, just last night I started writing Yet Another "nice, simple" introduction to IPA. I've got the consonants all done; now I've just got to do the vowels. (Of course, the vowels are the much harder part.) Expect results in a day or two. In the meantime, there is a useful chart on Wikipedia at IPA chart for English. —scs 19:00, 13 June 2006 (UTC)

Couple more points:

On the one hand it would clearly be nice to reduce duplication and standardize on just one universal pronunciation scheme. But another argument which has been made in the past and which I think still applies is that some pronunciation is better than no pronunciation. So if we've got editors who are only comfortable entering pronunciations in "AHD" notation (or, gasp, using ad-hoc fo-NET-ick schemes), we shouldn't turn them away, since they're incrementally helping.

One related request: if we do retain multiple pronunciations, the explicitly-tagged form on several lines is clearly preferable, i.e.

AHD: /kăt/
IPA: /kæt/
SAMPA: /k{t/

I've seen some pages that string everything together on one line, separated by slashes, untagged:

kăt / kæt / k{t

and this to me is a pretty unreadable mess. —scs 19:23, 13 June 2006 (UTC)

I've been formatting them into this:
Vildricianus 19:30, 13 June 2006 (UTC)

  • I think it would be nice if we made use of our HTML and CSS abilities and made the pronunciation something that meant more than gibberish above a definition to the average user. We can make it look nice, use things like color to show the primary and secondary stresses (as opposed to ", % and ' which unless you are familiar with the scheme could mean anything). I am knocking up an example of some possibilities, but remembering we don't have to do it the way that all the paper dictionaries do it is a good thing I think. - TheDaveRoss 19:37, 13 June 2006 (UTC)
    OTOH, remembering we don't have to do it differently just because the possibilities are there is an equally good thing. — Vildricianus 20:28, 13 June 2006 (UTC)
Yet another possibility, of course, would be to encode just one canonical pronunciation in the database (probably using IPA), and then to autogenerate the others (AHD, SAMPA, etc.) on the fly, perhaps based on user preference. I'm aware that these schemes are not all one-to-one, meaning that automatic transliteration is not a drop-kick easy solution, but it might -- just might -- be possible. It's something I started doing some background work on when this exact question came up on Wikipedia a couple of months ago, and I'm going to keep working on it as time permits. —scs 20:01, 13 June 2006 (UTC)
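As a rough illustration of the autogeneration idea, here is a minimal Python sketch that converts an IPA string to X-SAMPA one symbol at a time. The function name and the table are illustrative assumptions, not anything that exists on the site; the table covers only a handful of symbols used in English, and the mappings shown are the standard X-SAMPA equivalents.

```python
# Partial IPA -> X-SAMPA table (illustrative subset, not a complete mapping).
IPA_TO_XSAMPA = {
    "æ": "{", "ə": "@", "ʌ": "V", "ɛ": "E", "ɪ": "I", "ʊ": "U",
    "ɑ": "A", "ɔ": "O", "ʃ": "S", "ʒ": "Z", "θ": "T", "ð": "D",
    "ŋ": "N", "ɹ": "r\\", "ː": ":", "ˈ": '"', "ˌ": "%",
}


def ipa_to_xsampa(ipa: str) -> str:
    """Convert symbol by symbol; characters not in the table pass through unchanged."""
    return "".join(IPA_TO_XSAMPA.get(ch, ch) for ch in ipa)


print(ipa_to_xsampa("kæt"))  # k{t
```

This direction is the easy one, because each IPA symbol here is a single character. Going from X-SAMPA back to IPA, or from IPA to an "AHD"-style scheme, would need real tokenizing and per-dialect rules, which is where the schemes stop being one-to-one.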
w:Stress (linguistics) —Yuk! ∂ανίΠα 20:04, 13 June 2006 (UTC)
  • Ok, here is what I put together, it is certainly not comprehensive, just some thoughts:
United States
IPA dɛm ʌn stre(ɪ) ʃʌn
  • Further, use all of the characters as links to an anchored explanation page e.g.:

Phonetic description: 'ɛ' represents a full vowel, a short frontal mid monophthong.
Audio: link
Examples: bed, tread, creche
Compare: ɪ, ʊ, (etc...)

Expanded to include whatever we feel is useful and appealing. I very much like the idea scs has, for the English language at least IPA/SAMPA are pretty much 1:1, other languages not so much. If it is feasible, I would be in favor of it. - TheDaveRoss 20:06, 13 June 2006 (UTC)

What do the parentheses mean in the above pronunciation, and how do you indicate varying final sounds based on drops or liaisons? ∂ανίΠα 20:31, 13 June 2006 (UTC)
  • I think a solution would be to invent a system per-language and per-variety in some cases such as US v UK English. We can provide info on how to convert an IPA or American pronunciation found elsewhere into our system. We then have some JS, which I wouldn't mind writing myself after I get home in a month or two, which converts from this into all kinds of flavours to match various dictionaries.
  • There are as far as I know, 2 free sets of pronunciation data available on the Internet that we could use. They both use ugly systems of their own devising and they are both only for American English. One is Moby and the other is CMU. Info and links for both can be found here. Hippietrail 20:11, 13 June 2006 (UTC)

Aargh no! Why are we making this so complicated? I would rather just leave the system as it is than start inventing our own colour-coded schemes! IPA does everything we need it to, and it is familiar to the most people. It already indicates stress perfectly well. It also distinguishes adequately between UK and US English – we are not concerned with going into detailed transcription, just with giving a broad phonetic value for each word. I do agree we need a better page to explain it all, but one page is enough, we don't need to link to each symbol individually. Occam's Razor etc. Widsith 20:17, 13 June 2006 (UTC)

(edit confl.) Right. Are we going to tackle all this right now, overhauling yet another long-standing practice? I guess personally, I'm going to make sure all other things are done first before spending time on this. It'd be nice to have some of the content settled before it's reformed again. Pronunciation that's meant to be accurate will always look like gibberish to those who don't bother to spend some minutes figuring out how it works. They can always check some of the excellent audio around here, which will always be better and clearer than any notation we come up with. Dave, that's a nice idea, but are you sure it'll remain simple enough? — Vildricianus 20:25, 13 June 2006 (UTC)

Potentially it can be as simple as changing {{IPA|xxx}} to {{IPA|x|x|x|3}}. There are precious few non-regular contributors who add pronunciations, so learning the schema wouldn't pose a problem. As for the color coding...I wasn't aware that changing an apostrophe to a bgcolor would cause you so much dismay, by all means leave it the way it is. - TheDaveRoss 20:38, 13 June 2006 (UTC)
My thoughts...
I have never been a fan of so-called "AHD", for want of a better name. I only ever use it for US pronunciations. It does not always match IPA and does not, by definition, match the phonemes used in UK English (RP). Further, it does not contain short forms /u/ and /i/ of the long vowels /uː/ and /iː/. I often feel I am forcing the system to fit when using it for pronunciations. I would not mourn its passing, should we remove it.
On the other hand, IPA is much more precise and lends itself to the representation of almost all sounds in almost all languages. Agreed, it takes a little while to learn, but then so does any pronunciation scheme. There are very many symbols, but only a few (40 or so) are used to represent English phonemes. Like Widsith says, IPA does the job and we need nothing more (except for (X-)SAMPA to help those whose browsers do not display IPA - for example, I don't see IPA characters displayed correctly in Linux but do in Windows, despite using Firefox in both OSs).
Davilla's proposal for colour-coding is unhelpful and has no point, in my opinion, as it makes the pronunciation hard to read. (The pronunciation you give is wrong, incidentally: /ʌ/ is the sound in a British pronunciation of "cup"; you want /ə/, which is the sound of the first and last "a"s in "banana". I realise that some Americans pronounce these in the same way, but those who do definitely use /ə/ and not /ʌ/. Further, the dot is used to separate syllables and is not used when there is a stress mark, which does this job.)
A good example is London: . — Vildricianus 12:22, 14 June 2006 (UTC)
You mean [ˈləndɪn]? :-P ∂ανίΠα 18:11, 14 June 2006 (UTC)
Is that the new fashion? — Vildricianus 18:46, 14 June 2006 (UTC)
Does the Midwest define fashion? Spoken there irregardless. :-P ∂ανίΠα 19:18, 14 June 2006 (UTC)
I'm afraid I cannot take credit for such an excellent idea as color coding the stress, although personally I would have used it in combination with the standard stress markers. Regardless it is a style issue that can be deferred for much later.
There is an excellent point though raised in your correction of TheDaveRoss. The reason for using AHD or some American standard is that it identifies phonemes rather than individual phones, so the pronunciation of AHD /kəp/ translates to IPA [kʰʌp] in British English and, ŭ being equivalent to ə in American English, [kʰəp]. This is the reason I recently suggested ordering the pronunciation key by phoneme rather than IPA symbol. Seeing the same words given as examples of ŭ and ə in the American English column, TheDaveRoss would have immediately realized that he does not have the ability to discern between the two, just as I do not have the ability to discern ŏ from ä and even ô.
If we were to use solely IPA in pronunciations then we would be obligated to provide, in some cases, an RP, GA, Scottish, Irish, Australian, etc. pronunciation, all of which could differ. In fact I would consider the opinions weighed here to be entirely skewed, deciding not on whether to use IPA so much as a highly bastardized version of it, and not just because of the whitewashing of other dialects, on which point IPA differs even if AHD does not, but also the /r/ issue and for omitting other markers that distinguish allophones, especially phones that are not considered allophones in many other languages, in particular aspiration. This version of IPA is not acceptable given the multilingual aspect of this dictionary. With the correct IPA only, a complete listing of dialects for every pronunciation provided is first of all beyond our capabilities, and secondly difficult to verify in existence given the regionality of many words. Yet these can be the only alternatives to AHD or some variant thereof. ∂ανίΠα 17:56, 14 June 2006 (UTC)
Yes, Davilla, you've got it. We should already be providing RP, US, Canadian, Australian, NZ, SA, etc, pronunciations, and a lot of pages already do this. Rather than this being a liability imposed by IPA, it is a necessity for an Internet-based dictionary. Providing dialects is not acceptable, but I certainly think we should have standardised pronunciations for each country in which English is spoken as a first language (or a major second language). — Paul G 08:57, 15 June 2006 (UTC)
I'm not sure you understood me. The point is that, using IPA only, these aren't optional. All of these pronunciations are required because IPA cannot represent a general pronunciation. The vowels especially in different dialects of English have some consistent variation, so for most words, any provided IPA transcription must be labeled regionally. AHD avoids that problem, in theory at least. I would prefer to work out any problems with its inclusion of RP etc. than to abandon it. 21:18, 15 June 2006 (UTC)
That's not the way I understand IPA. The "/.../" syntax represents phonemic information, so the regional vowel or consonant pronunciation differences need not be shown. I.e. "/kɑɹ/" should be perfectly acceptable for a region-neutral transcription of "car", as "/kɑɹ/" says nothing about how rhotic or non-rhotic dialects pronounce the phoneme /ɹ/ when it occurs at the end of a syllable. Rod (A. Smith) 22:17, 15 June 2006 (UTC)
Hmm, I'm not sure I'm correct above, because using that transcription system, I'm not sure how to represent the sound appended by non-rhotic English to "idea". Rod (A. Smith) 22:31, 15 June 2006 (UTC)
This has some implications for the rhymes pages, too, which currently use all three schemes. These could of course be converted by a bot, and, fortunately, the page names themselves all already use IPA.
So that makes me a Support for IPA but with the retention of (X-)SAMPA. — Paul G 11:12, 14 June 2006 (UTC)
support use of IPA, though not to the exclusion of other systems. For one thing, AHD might be useful for English pronunciations, but it won't work for pronunciations of the myriad other languages we're using here, nor can a non-English speaker be expected to understand the pronunciation of the various symbols of AHD as used in English. I like IPA as my personal standard, though I am aware of frequent US/UK mainstream differences (I grew up watching BBC programmes in addition to US programs). AHD does not capture differences in pronunciation that occur across different regions of the British Isles or in different parts of the US to the degree that would be needed. I suspect it fails even worse in trying to bridge Australian, South African, and Indian pronunciation. Some regions of India speak an almost unintelligible form of English if you're not prepared for the cadence that accompanies it. However, I don't think we're at a point yet where we need to worry too much about burgeoning pronunciation sections. --EncycloPetey 21:56, 15 June 2006 (UTC)
The pronunciation scheme we call "AHD" should certainly not be used to represent pronunciations in varieties of English other than American (that is, US) English, IMO. When I enter UK pronunciations, I omit AHD, for the reasons EncycloPetey gives. It was designed for American English alone and just doesn't work for any other type of English. — Paul G 09:36, 16 June 2006 (UTC)
Then why even bother to differentiate between ŭ and ə, ä and ŏ, hw and w? Certainly not for the sake of this poor American speaker! Now I understand why you're so opposed to AHD. You've missed the point entirely... though you're not to blame, since it apparently doesn't completely capture British differences. Still, I would rather work with it until it does bridge that, and Australian, and even Indian, by and large. As a starting point, which symbols do you find do not match up? Davilla 17:56, 16 June 2006 (UTC)
So here's a question: Would we like to (are we trying to, do we want to) list separate and distinct UK and US pronunciations everywhere? Or, if it were possible (though this is a fairly big "if"), would we like to, when possible, list one pronunciation that worked for both? That is, do we want to list phonetic or phonemic transcriptions? (Again, see w:International Phonetic Alphabet#Types of transcription for more information on this distinction.)
I could go either way on this question, and I suspect it's another we'd have a hard time reaching consensus on, but if we don't at least raise the issue and consider it, we're likely to confuse ourselves further while dancing around it... —scs 18:19, 16 June 2006 (UTC)
Why should we disallow either? Phonetic [] transcriptions are narrow transcriptions, and they should use regional labels or however the accent is named. In America the most common is GA and across the pond it's the RP. Phonemic // transcriptions are broad transcriptions, and they should be as inclusive as possible. Although the bastardized IPA that everyone here seems to like is an option for these, a better option is to use AHD or a variant for phonemic transcriptions, and IPA for phonetic. Davilla 18:06, 19 June 2006 (UTC)
I feel very strongly that we should be showing phonemic rather than phonetic transcription. This means that, yes, sometimes the same transcription applies to both US and UK pronunciations. Of course, there is not universal agreement on which symbols are which – our system at the moment uses /r/ for both UK [ɹ] and US [ɻ], whereas I feel those symbols should be considered phonemic. That argument has been and gone though. Widsith 19:49, 16 June 2006 (UTC)


Does anybody else think that we need some sort of policy on things like this? I just checked on my old machine and can't read Davilla's new name with it. — Vildricianus 15:04, 14 June 2006 (UTC)

Yes. All Wiki sites are supposed to oppose inappropriate usernames. As I also administer Chinese Wiktionary, I have seen bad usernames insulting Zhou Ji, the Minister of Education of the People's Republic of China. Admins there have sentenced all bad usernames to "life imprisonment". (We cannot "kill" or "execute" inappropriate usernames.) Let us make a policy against inappropriate usernames.--Jusjih 23:15, 14 June 2006 (UTC)
Yes. We should implement Wikipedia's policy here as well. I'm not too bothered about the use of Cyrillic or Greek characters though. SemperBlotto 14:40, 15 June 2006 (UTC)
I know it doesn't sound too good for a multilingual dictionary, but usernames in Latin script only would be easier, clearer, and bug-free, while names in non-Latin are troublesome. Here are some arguments:
  1. each time I want to enter such name in a field I'll need to copy/paste it;
  2. the url bar is full of %E6%F4%A9 and other such gibberish;
  3. it's unreadable on ill-configured machines (??????? or □□□□□□□);
  4. some (many?) people can't read Greek or Cyrillic.
Now I don't mind that much (I can read Greek and Cyrillic), but other people will certainly. Such names may be fun and distinctive, but we're not on the Greek or Russian Wiktionary here. — Vildricianus 15:40, 15 June 2006 (UTC)
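To make point 2 concrete, here is a small Python sketch of how a non-Latin username ends up in the URL bar. It assumes the name is encoded as UTF-8 and percent-encoded, with spaces written as underscores as in wiki page URLs; the helper name is made up for illustration.

```python
from urllib.parse import quote, unquote


def username_url_form(username: str) -> str:
    """Percent-encode a username the way it would appear in a URL (UTF-8 assumed)."""
    return quote(username.replace(" ", "_"))


print(username_url_form("Jon Harald Søby"))  # Jon_Harald_S%C3%B8by
print(username_url_form("Παρατηρητής"))      # every Greek letter becomes two %XX escapes
```

A Latin name with one special letter stays mostly readable, while a name written entirely in Greek or Cyrillic turns into escapes from start to finish. `unquote` reverses the encoding.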
I certainly don't find the special characters convenient, not having buttons for those on my keyboard. That said, who would like to be the first to tell Jon Harald Søby how he ought to spell his name? --Dvortygirl 04:55, 16 June 2006 (UTC)
Me! :-) He should spell it exactly as he does on IRC! :-) --Connel MacKenzie T C 05:22, 16 June 2006 (UTC)
Exactly. Don't take this as an ad hominem statement or so, but I've always found it difficult to track him down, always needing to go to WT:A and clicking through. One or two instances are fine, and if it's their original name they have the most (only?) valid argument, but it would be interesting to have as few of these usernames as possible. — Vildricianus 09:41, 16 June 2006 (UTC)
Ha! It's all Greek to me. Παρατηρητής 14:58, 16 June 2006 (UTC)
Ideas. First of all, there should be a single user name across all Wikimedia projects, and I believe there was an initiative to that end, though I hope it hasn't died out seeing as it hasn't produced any results so far. The username policy in that case would be centrally coordinated. The discussion here applies to Latin scripts, which might be an acceptable restriction for English Wiktionary at the moment, although eventually we would have to accept usernames for those based on other language Wiktionary projects and/or other English language Wikimedia projects anyways. Better for the other language projects to determine if names in other scripts are vulgar, have celebrity status, etc. than us.
The rules we use for top-level see-also's (the language independent ones), slightly altered, would apply here, based on the similarity of glyphs. The existence of a user Jon Harald Søby would deny Jon Harald Soby, Jon Harald søby, Jon Harald-Søby, Jon Hara1d S0by, Jon. Harald Søby, Jonn Harald Soby, etc. as new user names, all of which would by default resolve to jonharaldsoby and transparently redirect to Jon Harald Søby. This would first of all make imitations that much more difficult, and secondly make identifying such a user easier, as per the above discussion.
And for goodness sake, why not an initial lower-case? Davilla 17:38, 16 June 2006 (UTC)
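The resolution rule described above could be sketched roughly as follows in Python. Everything here is an assumption made for illustration: the function name, the tiny lookalike table, and the choice to collapse doubled letters so that "Jonn" matches "Jon". It does not describe how MediaWiki handles usernames.

```python
import unicodedata

# Hypothetical lookalike table; a real policy would need a far larger one.
LOOKALIKES = str.maketrans({"ø": "o", "1": "l", "0": "o"})


def skeleton(username: str) -> str:
    """Reduce a username to a comparison key such as 'jonharaldsoby'."""
    # Lowercase, then split accented letters into base letter + combining mark.
    text = unicodedata.normalize("NFKD", username.casefold())
    text = text.translate(LOOKALIKES)
    out = []
    for ch in text:
        if not ch.isalnum():       # drops spaces, punctuation and combining marks
            continue
        if out and out[-1] == ch:  # collapse doubled letters ("jonn" -> "jon")
            continue
        out.append(ch)
    return "".join(out)


variants = ["Jon Harald Soby", "Jon Harald søby", "Jon Harald-Søby",
            "Jon Hara1d S0by", "Jon. Harald Søby", "Jonn Harald Soby"]
print({skeleton(v) for v in variants})  # {'jonharaldsoby'}
```

A new registration would be refused when its key matches the key of an existing account. Note that "ø" has no Unicode decomposition, so it needs an explicit table entry, and that under this sketch "John" and "Jon" still produce different keys.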
  1. Yes, single login is still being worked on. Expect it somewhere in the future. [1]; [2]; [3].
  2. The idea to technically defeat impersonation is great. I'm not sure how feasible, though. There should be a list of exceptions to be checked against, or something like that. User:John should still be possible if there is already a User:Jon.
    Use the same approach as with language-independent see-also's at the top. See also John and See also Jon are not appropriate on the other page. Davilla 14:40, 18 June 2006 (UTC)
  3. The lowercase usernames are just technically not possible with the current database layout I guess. I don't think it would be any priority for the devs, but if you feel it's needed you can file a request at bugzilla:. — Vildricianus 18:11, 16 June 2006 (UTC)

Personally I think people should be able to use whatever Username they like in whatever script they like. Widsith 19:45, 16 June 2006 (UTC)

But would you agree that the script should be consistent? There are many variations to the capital A, for instance. Davilla 14:40, 18 June 2006 (UTC)
Personally, I think so too. But I have a personal opinion that's widely different from what would be useful or appropriate for other people. What is not useful for many users is a username in a script they can't read or write. — Vildricianus 15:51, 18 June 2006 (UTC)

Various abbreviations for genders

The page for sandwich has m4 against the Irish and en against the Swedish. Can someone please add these, along with any others and their meanings on the page that explains the abbreviations we use for genders. Alternatively, please change these, if appropriate, to abbreviations we already use. Thanks. — Paul G 14:30, 15 June 2006 (UTC)

Where is this page? All I could find was the policy page in development. --EncycloPetey 01:37, 16 June 2006 (UTC)
They are referred to in WT:ELE#Translations, which would also need to be updated (or, preferably, changed to a cross-reference to the appropriate page) but I'm sure they were given somewhere else too. Can anyone find the page? — Paul G 09:28, 16 June 2006 (UTC)
en is the indefinite article used for words of common gender in Swedish. Any such should be changed into {{c}}, I guess... Alas, I've seen den as well. (That's the definite article, btw). \Mike 12:23, 20 June 2006 (UTC)

"color-colour" doesn't work

Back to the old chestnut of colour/color, I'm afraid.

The system of having a template for the translations of "colo(u)r" that can be used on the pages for both spellings (colour and color), using {{color-colour (noun)}} and {{color-colour (verb)}}, is a great idea, but has an unfortunate flaw. I have just modified "colour", adding a new noun sense. The translations for the noun are all for just one of these senses, and the translations for the verb are for three senses, when there are currently five. I've no idea how many noun and verb senses are defined on the page for color as I haven't looked.

So the system very quickly gets out of synch. One solution would be for {{color-colour (noun)}} to include everything relating to the noun (definitions, synonyms, antonyms, translations, etc) so that is less likely to happen.

What do people think of this idea? — Paul G 15:53, 15 June 2006 (UTC)

Oops, I forgot about UK/US-specific senses... but then, shouldn't these be included on both pages anyway? — Paul G 15:58, 15 June 2006 (UTC)
Which is why I once said that these entries have absolute priority... Once all senses are established, it should work fine. The trouble is that few people, if any at all, seem to be willing to focus on it. — Vildricianus 16:12, 15 June 2006 (UTC)
Both templates look fine to me. As I also administer Chinese Wiktionary, similar problems can also occur there with regard to compounds in various Chinese-like characters used in Chinese, Japanese, and Korean, like China as 中國 (traditional Chinese, Japanese kyujitai, and Korean hanja) and 中国 (simplified Chinese and Japanese shinjitai) and Taiwan as 臺灣 (traditional Chinese, Japanese kyujitai, and Korean hanja), 台灣 (traditional Chinese), and 台湾 (simplified Chinese and Japanese shinjitai). I have tried a template for 中国 (China) there based on "color" templates here.--Jusjih 06:28, 16 June 2006 (UTC)
But they are not fine, for the reasons I have given. I'm convinced that the whole of the noun section needs to go into the "noun" template, and the whole of the verb section into the "verb" template, which I will do if no one has any valid objections. — Paul G 09:25, 16 June 2006 (UTC)
That is very bizarre. Months of effort have been put toward making this NPOV for you, and you suggest throwing the baby out with the bathwater, based on invalid logic stated above?
The translations have the glossed/named sections to address the issue you are suggesting. If a more granular solution is needed, then perhaps the specific translation glosses should each have their own shared template sections...but that would defeat the current flexibility that allows for the TTBC subsection. Trying to go the opposite direction is to re-create and re-ignite the POV issue that started all of this in the first place! The entries color and colour still haven't recovered from the last round of POV merging yet. --Connel MacKenzie T C 15:45, 16 June 2006 (UTC)
Agree. Paul, please come up with a better solution then. Until you or anyone else can, the template:color-colour thing is not going to change I'm afraid. — Vildricianus 15:58, 16 June 2006 (UTC)
No, not at all, Connel. (No one informed me that this was being done for my personal benefit, incidentally.) The point is that the definitions are in the template, but the synonyms, translations, etc, are not, so there can be completely different sections on each page and fall out of synch. How do these templates work anyway? If someone tells me, then I can work on fixing the page, which is currently broken. — Paul G 16:44, 16 June 2006 (UTC)
Well I agree that it's less than clear. All activity around the issue stopped when Richard moved the discussions. I should really start on making a basic description, some kind of page where we can work out the bugs. Instead of going to and fro all the time we can gradually develop the idea then and see if it's really the solution we all hope it is. Expect something soon. — Vildricianus 17:28, 16 June 2006 (UTC)
Well, color/colour wasn't for you, but I came up with the translations approach with you specifically in mind. The Synonyms, Antonyms and other sections are not the same and should not be. That would be POV (still!) The Translations sections need to be integrated only because we can't possibly expect translators to understand the subtlety of the different words, which to them, translate the same either way.
The mechanism is that any template that contains a heading level, will include that heading on the desired pages. The [Edit] link to the right of that heading will edit the template's section (the section that is common to both entries.) The "noinclude" section above the heading provides navigation links back to the common entries...and perhaps could have a notice to include sections for lemmas from both entries. Maybe it should have a tiny table of which glosses/lemmas go with which entry?
--Connel MacKenzie T C 18:38, 16 June 2006 (UTC)
By the way, I strongly object to your assertion that these entries are currently "broken." The synonyms, antonyms and related terms are not and should not ever be synchronized. The problem, from the very, very start, was someone incorrectly asserting that they are the same, triggering years of bitter dispute. --Connel MacKenzie T C 18:43, 16 June 2006 (UTC)
I'd added an experimental list of glosses at the top. But this shows that the sporadic re-merging of the entries has caused considerable problems in the past...resulting in this section never properly being identified as needing subdivision. --Connel MacKenzie T C 18:55, 16 June 2006 (UTC)

Wiktionary:Changing username

I've re-discovered this page and revamped it, reflecting what's possible and what's not. Purpose is actually to discourage it but still allow it for valid requests like User:BD2412's. Please comment, improve or alter as you see fit. — Vildricianus 11:24, 16 June 2006 (UTC)

Category:Form of templates

I take it is now agreed on that these templates become the standard? If so, then I can already enjoy myself replacing the old stuff manually as I see it. — Vildricianus 14:09, 16 June 2006 (UTC)

Excuse my ignorance, but why are these templates needed? What value do I get from using
# {{plural of| }}
instead of the simpler and shorter:
# plural of 
If there is a benefit, then it should be given in the talk page of the template. Jonathan Webley 14:25, 16 June 2006 (UTC)
Um; please go back here. Looks like you missed the discussion. The benefit is that templates are the only way to bring an end to the eternal discussion of deciding on whether we use plural of word, plural of word, plural of word, plural of "word" and so forth ad nauseam. — Vildricianus 14:50, 16 June 2006 (UTC)
Ah! I wasn't following that discussion since I thought it was about a bot. Does this affect ELE? Jonathan Webley 15:44, 16 June 2006 (UTC)
If my question gets some replies, then probably yes. — Vildricianus 15:59, 16 June 2006 (UTC)
(to Vild's original question above) Although the "form of" templates have not been declared an official standard, they at least are approaching de facto status. I myself support them, but that's probably because I'm a fanatic about use-mention distinction. :-) Rod (A. Smith) 23:50, 16 June 2006 (UTC)
I'm in two minds. I'm fond of them as they allow such flexibility and later adjustment. But at times I get the impression that it's becoming ever more complicated to create a simple Wiktionary entry. If we can manage to keep it all well-documented and transparent, it might just work well, but we should really draw a limit somewhere, to keep our enthusiastic selves in check. — Vildricianus 09:55, 17 June 2006 (UTC)
We all have our own motivations, our own specialties, our own compulsions. My attitude is, for any particular bit of fussiness or compulsion that I can't be bothered with, I just compose entries without them, knowing that the editors who are fussy and compulsive in the aspects I'm not will be along, eventually, to fix them. (I'm doing them a favor! I'm giving them something to do! :-) ) —scs 16:03, 17 June 2006 (UTC)

New: image undeletion

Note for sysops and other weirdos: images can now be undeleted. Only from now on, though. Those deleted before Brion installed the function are still lost. But new deletions are no longer perpetual. — Vildricianus 22:28, 16 June 2006 (UTC)

Excellent! This is a very important feature while I administer seven Wiki sites.--Jusjih 04:46, 17 June 2006 (UTC)


Copied from Wiktionary talk:Spelling variants in entry names

I hereby give up trying to formulate a policy for this, as so many people (wilfully) can't even read a policy, or read what vote is being called for.

It is quite clear from the policy that REDIRECTS will be used in many places. Yet so many people came to this vote and voted because they thought the policy said REDIRECTS were to be preferred, and that was what was being VOTED on. That is a wilful misreading, or, more probably, a not bothering to read. It is futile to try to resolve contentious issues by developing a policy when so many people can't even bother to properly read the policy, or what the vote is specifically about, before they vote.

I had actually picked this area as one that would be easy to formulate a policy just by researching the archives and checking what current practice is, and documenting that, as is the normal best practice of writing policies. Where I had made a slight error, and it was pointed out, I adjusted the policy. But, ... what can I say, I just give up on you lot. Too many egos who know what is right and don't really want a policy.

I don't think you'll be seeing much more of me at Wiktionary. My view is now stronger than ever that it is headed down a blind alley - too many egos involved. And, as I've said before, I find the content for basic words bloody hopeless. Several other online dictionaries are far more reliable. I might contribute a few words and phrases (in full expectation that Connel will delete them as quickly as possible), and might come back in a few months to see if you idiots have sorted yourselves out a bit more.--Richardb 08:33, 17 June 2006 (UTC)

    • While I didn't involve myself in the formulation of the above policy, I would like to take this opportunity to reflect on Richardb's comments. I agree that in the case of some Wiktionarians, enthusiasm and passion far outstrip actual knowledge. I also think that form is often taking precedence over substance. However, I am not yet ready to give up on Wiktionary. The premise behind Wiktionary is simply too powerful to ignore. I think of Wiktionary as a long term project (unless more people start contributing, maybe 10 to 20 years for the bulk of the work), not something that will be immediately competitive against the current crop of dictionaries used by professionals.

As I have stated before, we need to get a lot more language experts involved in contributing to Wiktionary. Many of these people have limited computer knowledge, and will have little patience for reading page after page of policy notes and usage guidelines. I believe that Wiktionary needs a user interface that allows the user to concentrate on the input of words and their definitions. I spend far too much time trying to correctly format each of my entries so that I don't offend the sensibilities of fellow Wiktionarians. Ideally, a computer should be worrying about how to format the page. In other words, come up with a default look and feel for all major types of entries, program all that into an intuitive web-based interface, then offer that in place of or in addition to this thing I'm using to post this message right now. A-cai 13:22, 22 June 2006 (UTC)

I think the CSS advances from Hippietrail, Connel and others are starting to get us to exactly that point. Widsith 13:37, 22 June 2006 (UTC)
What we need is a well-balanced mixture of technically adept contributors and linguistically knowledgeable ones. But then, what is "linguistically knowledgeable"? Professionals are not likely to spend a lot of time here. And what's most important is that we remain open to non-professionals, which is the core of any Wikimedia project.
Dictionary writing is an esoteric field, and wiki-dictionary writing adhering to Wikimedia policies is even more esoteric. Many aspects are to be taken into account, which I think Wikipedia in its current state reflects very well. Professionals are likely to have a POV, which is not welcome here.
As for the software limits: WiktionaryZ is being worked on. Patience please. — Vildricianus 14:03, 22 June 2006 (UTC)

A "menu" appendix

I am thinking of creating an appendix of things you would find on restaurant menus in different languages etc - but I can't make up my mind as to a decent format.

Option 1 - just a table of the different courses

  • English - starters - soup - first course - second course - vegetables - side orders - dessert - drinks - etc
  • Italian - antipasti - zuppe - primi - secondi - verdure - contorni - dolci - bibite - ecc

(Properly formatted of course)

Option 2 - much more detailed with a section for each language

* calamari - deep fried squid rings
* tricolore - tomato, mozzarella and basil
* minestrone - vegetable soup
* stracciatella - broth with egg and cheese
* pizza Margherita - tomato, mozzarella and basil
* pizza quattro stagione - "four seasons" pizza
and so on, as detailed as you like

What do you think? I favour the second, and am willing to make a start on the Italian entry (and eventually fill in all the red links!) SemperBlotto 09:42, 17 June 2006 (UTC)

Nice idea; I suggest you just start with what you have, to see how big it's going to be for each language. I think the second layout is much clearer. If it's really going to be big, perhaps each language needs a subpage. — Vildricianus 09:48, 17 June 2006 (UTC)
Starting out in this fashion is a good idea, but I would consider this to be a Wiktionary Appendix at first, that is, a list of terms that are missing. Eventually the Appendix:Menu will have links to categories. There are simply too many terms that fit under this description, and categories are a faster and more direct way to grow it. But I'm thinking kind of futuristically and there's no reason not to leave that for when the time comes. Davilla 15:50, 18 June 2006 (UTC)
Nice idea, but this Italian-speaker would ask you to make sure you spell things correctly... it's "quattro stagioni", ("four seasons"), not "quattro stagione" ("four season"). The -e form is quite commonly seen in English, but is clearly a mistake in Italian.
Also it's "insalata tricolore" ("three-colour salad") rather than plain "tricolore", as far as I am aware. — Paul G 10:06, 19 June 2006 (UTC)
Yes. The real entry won't be written at speed. (Insalata tricolore is more usual, but I have seen it without the insalata - possibly to highlight the fact that there is no lettuce etc) And it is often avocado instead of basil. SemperBlotto 11:27, 19 June 2006 (UTC)
First stab at it generated at Appendix:Menus - Italian subpage created (Paul please check). I shall have a go at English Indian restaurants soonish. SemperBlotto 10:40, 20 June 2006 (UTC)

Google Book Search - date warning

I was looking for an early citation for automated teller machine and found one dated 1879! This did not seem to be reasonable, especially as the work also mentioned point-of-sale terminals. Further research shows that the work was actually published in 1994. The work is one of a yearly series that began back in 1879! I am now worried about my citation to apartheid - I'm sure that the date must be wrong. SemperBlotto 14:25, 18 June 2006 (UTC)

Yes, Google books is fine for finding the quotes themselves but I often try to double-check on the dates. They also often provide the date for a republished version of a work, which makes it a worthless one. — Vildricianus 14:42, 18 June 2006 (UTC)
Occasionally the whole book will be mis-identified. I found this when researching muke—the link to the dictionnaire futunien actually identifies it as "A Greek-English Lexicon of the Septuagint". —Muke Tever 22:26, 18 June 2006 (UTC)


Does anyone have an idea of how to keep things accessible for AOL without permanently displaying the links on the sitenotice? There's going to be a fundraiser soon, so it'll have to move anyway. — Vildricianus 15:55, 18 June 2006 (UTC)

Somewhere at the top of the Main Page? — Vildricianus 12:54, 20 June 2006 (UTC)
Done. — Vildricianus 12:58, 20 June 2006 (UTC)
Well, there is no reason not to have the fundraiser and the AOL notice in sitenotice. On the other hand, I've hated seeing the AOL notice since it was first added. --Connel MacKenzie T C 17:04, 24 June 2006 (UTC)
The effect of the sitenotice is lost if it's permanently used (as on Wikinews). It should be blanked from time to time. — Vildricianus 17:09, 24 June 2006 (UTC)

New Feature Request: Video of ASL and Other Signing Languages

I would like to request a new feature for the Wiktionary community: support for uploading video of ASL (but not limited to ASL) signs for particular words.

Feature outline:

  • Policies are created regarding video format and size.
  • Policies are created regarding dress code and background.
  • Templates are created to denote the language being signed and the region where the signer is located (signs may vary slightly from community to community).
  • Embedded video support would be nice.

Also, signing video can be used in wikibooks as a tool to teach sign language with examples.

What do you guys think?

--Zoohouse 16:36, 18 June 2006 (UTC)

Multimedia integration is scheduled for the (near?) future. — Vildricianus 13:00, 20 June 2006 (UTC)
Ogg video has been supported for a long time, hasn't it? Our audio help links (residing on Wikipedia) describe the format, last time I checked. Dress code? I think we can assume good faith. Standard size will evolve after we've seen a few experiments. --Connel MacKenzie T C 17:03, 24 June 2006 (UTC)

Is it time to use Estuary English instead of RP for UK Pronunciations?

Hardly anyone in Britain uses RP for real -- not even the Queen's grandchildren, and the Queen herself has toned down the wilder aspects of her use of it. It is no longer used on the BBC (or at least, used only by a very few). It is no longer used by well-educated people, even at Oxbridge (except for a few, mainly old). It is rarely used by politicians, because the public prejudice against it would make them almost unelectable if they used it. It is fossilised speech from at least 50 years ago. Like most people at my (academically prestigious) school, I consciously un-learned it 40 years ago, because it was uncool. It is now even less cool. People who use it are either

  • prats
  • thought to be prats because they sound like them
  • foreigners who have been taught by teachers who were last in Britain many years ago.

Yes that was tongue in cheek, but at least it wasn't plums in mouth :-) Enginear 00:29, 20 June 2006 (UTC)

The Wikipedia articles w:English English, w:Received pronunciation and w:Estuary English are relatively NPOV, but IMO wrong (or at least out-of-date) to suggest that EE is less internationally intelligible. I suggest the opposite:

  • RP leaves out the a of dictionary, which surely no live language does, and is therefore less intelligible. How many Wiktionary editors actually pronounce it that way, rather than just see it in dictionaries? (I am a professional engineer in London, and of perhaps 100 friends and colleagues, I can only think of one who would.)
  • RP makes homophones of land/lend, sand/send, banned/bend, spanned/spend, etc. How can that help intelligibility?
Well, I agree that it does, but "land" and "lend" are distinguished in pronunciations as /æ/ and /ɛ/. We don't merge these vowels in the UK pronunciations here. — Paul G 10:20, 19 June 2006 (UTC)
  • BBC predominant usage of estuary English (approx 70% of BBC people work in London) means that its international acceptability is growing
  • There are strong connections between Cockney and Australian English (many ?most early Aus settlers were Cockneys, and my Aussie mother was sometimes mistaken for a Cockney in London) so the influence of Cockney on EE makes it more intelligible to Aussies than RP.

The UK has a wonderful profusion of strong accents, so that it is sometimes easy (even for non-locals) to distinguish accents from towns 20 - 30 miles apart (eg Birmingham/Wolverhampton, Liverpool/Manchester) and locals can often place people to within 10 miles. I would like eventually to see some recognition of these many accents (and dialects). But meanwhile, let's at least use a real accent. The use of RP in a 21st century electronic wiki-based dictionary is as ridiculous as the use of Latin in church services in the 20th century. It makes us less useful to our users by reducing intelligibility. Enginear 02:16, 19 June 2006 (UTC)

Rather than go from one weird dialect to another, I think it would be better to label our British pronunciations as just (UK) rather than (RP), and to think of them as being ‘standard UK’ rather than ‘received pronunciation’. As you say, no one really uses RP anymore, but there is still a ‘standard’ as used by most newsreaders etc. Of course, there is no reason not to have information for various dialectal pronunciations as well. Widsith 06:33, 19 June 2006 (UTC)
Errr... I don't say the "a" in Dictionary or Wiktionary - so do most of the people I know (I live in w:Stroud, Gloucestershire). See Wiktionary (one of the audios is my pronunciation). Regards, — Celestianpower háblame 09:22, 19 June 2006 (UTC)
Me neither! :-) — Vildricianus 09:24, 19 June 2006 (UTC)
I obviously need to get out (of London) more :-) Enginear 00:29, 20 June 2006 (UTC)
I agree here with Widsith. I have been thinking for some time that it would be better to use "UK" rather than "RP", as many of the pronunciations I have entered are not really RP. If we use "UK", we should point out somewhere that this means "pronunciation of British English as it is spoken by most people in the south of England".
To be honest, so do I. I agree Estuary English is not quite what we should be looking for, but the difficulty is finding a label which is more accurate. I suppose BBC English (2006) is verifiable and is really what I meant. As spoken by most people in the south of England is less useful, due to the difficulty of defining south (and actually I think you mean the home counties -- I think fewer in the south west speak like that, though CP might disagree). Indeed, even in the HCs, to say "most people" may be excessive; but though it may be a minority, it's still many times the number that speak RP. Enginear 00:29, 20 June 2006 (UTC)
Estuary English is not yet well enough established to take over a standard pronunciation, in my view. Adopting it as our standard here would mean changing almost every pronunciation (for example, /aɪ/ (as the Queen says "I") would become /ʌɪ/ (as Tony Blair says it) and /str/ would become /ʃtr/).
I think we are OK with the pronunciations as they currently are, because a reader knowing that /aɪ/ is pronounced like "I" will use their own dialect to pronounce words containing this diphthong, rather than thinking that they have to be pronounced like the Queen might say them.
I think we would be making a rod for our own backs by listing pronunciations in various dialects of the UK, interesting and useful as that might be. I think that we should, as print dictionaries do, stick to one "standard UK" pronunciation.
However, what I think would be useful, and involve minimal work, would be to give a table somewhere that showed how to map these into other dialects of the UK. The same could of course be done for other countries (particularly the US) where there is wide variation in pronunciations. — Paul G 10:18, 19 June 2006 (UTC)
That's a great idea. I thought I knew the Plymouth [UK] accent well, but became a laughing-stock when I tried it in front of some Plymouth friends, because I got one vowel wrong (and of course I was a poncey grockel). The table would be quick to rough out, but require a lot of collaboration to perfect -- good wiki material. Enginear 00:29, 20 June 2006 (UTC)
I'm not very opinionated on the issue of which pronunciation is better. I am however very much opposed to the idea of labeling either as the standard for the UK, or for any other country for that matter. The "standard" is a temporal phenomenon that will not survive the test of time. What will somebody think when they come here 100 years from now and find "standard UK" listed as the pronunciation? Will they need to check the edit history of the pronunciation to determine which standard is being referred to? The pronunciations must be labeled descriptively according to the linguistically accepted name of the accent. Anything else would be POV. And if Enginear wants to add the other, that's MORE THAN okay. Davilla 17:20, 19 June 2006 (UTC)
I didn't say we should call it ‘standard UK’, just that that's a shorthand way of thinking about it. The label would be simply (UK). As a standard it may be a temporal phenomenon, but it's no less useful for that. Widsith 17:39, 19 June 2006 (UTC)
Indeed, the BBC now has a policy of internet-accessible archiving for a significant proportion of their output, and will no doubt migrate it to whatever follows the internet, so BBC English (2006) should be easy to hear in 5, 50 or maybe 500 years. Enginear 00:29, 20 June 2006 (UTC)
  • What we should do is to avoid original research - a general policy on Wikis. This means we should look at what dictionaries list as the pronunciation of a word rather than trying to figure it out ourselves. British English generally has one set of phonemes though some merge and others separate in various dialects as they get further from the ones dictionaries use as a "standard". The same for American English. This is a separate issue from what symbols we use to indicate the phonemes. The problem with IPA is too many people think the symbols are supposed to always be exact representations of the phones. They should be thought of as representing phonemes which as they are realized in various accents will appear close to various phones, generally in predictable ways. Some dialects make unpredictable sound changes for some words and these are more significant. The best way to show a pronunciation graphically is with a standard set of symbols which can be interpreted into the widest range of dialects possible. The best way to include a specific pronunciation in a specific dialect is to include a sound sample. — Hippietrail 02:50, 20 June 2006 (UTC)
    • "No original research" doesn't work for a dictionary. It's either that or copyvios, the only alternative being PD import. I've no idea how copyrightable pronunciations are, though, and I indeed use paper dictionaries as a source when I add them. But in general, we've always been doing a fair bit of original research. The entire quotations system, unless they're from Webster or the like, is original research. In my opinion, "no original research" and "descriptive dictionary" don't match at all. — Vildricianus 10:21, 20 June 2006 (UTC)

(Translation or Multilingual) Word of the day

I'm not a wordy sort of person and this obscurity of the moment stuff was never something I was interested in, but I have to say that seeing it on the front page even with an audio file and all makes the whole site feel ten times more professional. My question is if there's a way to show off the multilingual aspect of this dictionary, if not by putting up an equally impressive foreign word every once in a while just yet, by at least noting the language of the word of the day, namely English as it has been. Davilla 19:46, 19 June 2006 (UTC)

Except for the trickiness of the layout itself (that Vild recently put a lot of time into), I have no conceptual objection to listing the TOW on the front page. Making it a daily thing I'd expect would be too hard to keep up. (Note that WOTD has a veritable army of supporters, but still has had a few mishaps.) --Connel MacKenzie T C 00:27, 20 June 2006 (UTC)
We could do with another set of eyes for the WOTD. I usually fail to look at it until it's UTC evening, so I hope others are keeping an eye on it as well. The TOW is not ready for Main Page listing, for the same reasons. I usually miss out on updating it in time, and apart from Widsith, no one seems to be watching it.
Generally, though, let's keep the current scheme for the WOTD for a while, let's say that we try to keep it up for at least half a year and see how popular we get. You should read what I wrote here; other projects are beginning to notice it. — Vildricianus 10:31, 20 June 2006 (UTC)

CFI: Fictional characters

Most recent RFD nomination demands Parlouring. Item in question here is Category:Star Wars, more specifically its contents. The question itself is: Inclusion of fictional characters that don't have any idiomatic meaning? Both off-topic discussion and Star Wars puns prohibited. — Vildricianus 19:46, 19 June 2006 (UTC)

The two arguments I see for keeping them are: 1) If they are (or might be) used attributively, 2) If Wiktionary prefers internal self-referential Wiktionary links over external Wikipedia links. The main reason I see for not keeping them is that we have no measure of what is notable or not. Normally, we leave the argument of notability to Wikipedia... so that we can focus only on attestation. --Connel MacKenzie T C 00:34, 20 June 2006 (UTC)
What I also want to add is that Wiktionary users are usually aware of Wikipedia (it's of course not the other way round). If they want to find out about Grey Jedis or whatever, they're much better off looking there than here. We're not an encyclopedia, and most people don't expect it to be. As a result, I don't think all that many people will actually look for those terms here - they'll do so in WP. — Vildricianus 10:35, 20 June 2006 (UTC)
While that is true now, I'm not sure it always will be. I think people will try searching one or the other based only on their own preconceptions of what a dictionary or an encyclopedia is, not based on our criteria for inclusion. --Connel MacKenzie T C 16:57, 20 June 2006 (UTC)
Though I could be wrong on that, I guess people don't have the preconception that such stuff is in a dictionary. — Vildricianus 12:53, 21 June 2006 (UTC)
I sure wish people added content based on what a dictionary or an encyclopedia is, because they sure aren't doing it based on our criteria for inclusion. Davilla 17:27, 21 June 2006 (UTC)
My comments here. Davilla 08:22, 21 June 2006 (UTC)

What is the right way to cite a usenet post?, an RFC?

See: host the verb, and subdomain Thanks. --kop 23:15, 19 June 2006 (UTC)

Altered (fixed?) Davilla 03:52, 20 June 2006 (UTC)
Gah! Oh no. The only time the url-hiding syntax is used, is when the source is clearly specified...otherwise the complete URL should be displayed, not "[1]".
Hello, please don't fault me for what others have done. All I did on that quotation was change the wiki format. It wasn't a usenet post so I didn't bother filling it out. Davilla
To my eye, that looks like URL-hiding...something only to be done in certain situations. But when the web site itself is purporting to be a reputable publishing house, with an extensive readership or a long history of publishing, the URL itself is very relevant. Is it for a magazine that has been published for years? Or is it a website that will be gone this time next week? --Connel MacKenzie T C 08:07, 21 June 2006 (UTC)
For USENET, the Message ID is the only reliable link to the message. Where'd it go?
I deleted it because I never use it. Nice of you to bring it up now. Where would you like it to go? Davilla 07:41, 21 June 2006 (UTC)
Perhaps we should nudge a comment from Hippietrail, as he did extensive /Citations pages primarily from Usenet. --Connel MacKenzie T C 08:07, 21 June 2006 (UTC)
The RFC reference looks OK, but it would be nicer to link www.rfc-archive.org or something. --Connel MacKenzie T C 16:50, 20 June 2006 (UTC)
It's probably best to link RFCs to www.rfc-editor.org as it seems to be the official site of the IETF's RFC publishing body. --kop 19:42, 20 June 2006 (UTC)
Or, http://www.ietf.org/rfc/rfcNNNN.txt seems better, still. --Connel MacKenzie T C 08:07, 21 June 2006 (UTC)
The ietf.org domain has a certain cachet, but http://www.ietf.org/rfc says "When in doubt, the RFC Editor Web Page is the authoritative source page." --kop 10:36, 21 June 2006 (UTC)

Speaking of RFCs and STDs, which should be cited? STDs are more authoritative but are usually a revised RFC. That's the case for subdomain. RFC 1024, which is cited and which is also the standard, updates RFC 882. All three contain the referenced text. Is there a way to cite both the earliest usage, RFC 882, and the current authority, STD 13, as the source of the referenced text (with different dates) or what's the way to go here? --kop 19:42, 20 June 2006 (UTC)

The authority isn't so important. The quotation is for the language and not the technical aspects. Try earliest use. Davilla 07:41, 21 June 2006 (UTC)
Next most important after earliest usage is to show time span of usage, so a cite (perhaps from STD 13 if it is used for real rather than defined) showing most recent usage becomes 2nd priority. Then there's the recommendation that 3 cites implies adequate usage for our CFI, so one in between (say from RFC 1024) would be good too. Enginear 12:57, 21 June 2006 (UTC)
Um, but those aren't independent. How many times should the same quotation be listed? Davilla 15:05, 21 June 2006 (UTC)
Right. The earliest RFC should be referenced. Since this isn't a discussion about RFV, this seems the most appropriate secondary source to be using...just as the wikimagic link does. (So: ISBNs, RFCs, what else?) --Connel MacKenzie T C 15:26, 21 June 2006 (UTC)
I see a problem with "for real" vs. defined when it comes to technical vocabulary, particularly computer words. I expect that frequently (for some value of frequently) the very first usage of a technical word will be a definition. You make up a new word to go with your new concept and publish. Voilà! A definition is the first thing anybody sees. Seems to me that that kind of case would be a legitimate citation. (As an example, it's hard to imagine that a word like thunk didn't first show up in a paper by McCarthy, although that's just a guess.) --kop 05:22, 22 June 2006 (UTC)

rhyme lists

As part of some work I'm doing with Wiktionary's rhyme lists, I generated a composite list of every English rhyme we've currently got. In case the data is useful to others, I'm posting it (in 26 parts) at scs/allrhymes. Besides making it easy to quickly find which rhyme page a word is listed on, these lists can also help with other pronunciation tasks, such as checking the pronunciations of words, or constructing IPA pronunciations for new words (by starting with one of that word's rhymes). Enjoy! —scs 02:45, 20 June 2006 (UTC)

P.S. to Paul G: apologies for the duplication of effort if these lists are, as I suspect, essentially the same as master lists you've already got.
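
For readers wondering how a composite list makes it quick to find which rhyme page a word is on, here is a minimal Python sketch of the idea. It is not the script used to build the lists above; the function name build_word_index and the sample rhyme data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_word_index(rhyme_pages):
    """Invert a {rhyme page: [words]} mapping into {word: [rhyme pages]}.

    `rhyme_pages` is a hypothetical in-memory stand-in for the rhyme lists.
    """
    index = defaultdict(list)
    for page, words in rhyme_pages.items():
        for word in words:
            index[word].append(page)
    return dict(index)


# Illustrative sample data only (UK pronunciations assumed).
rhyme_pages = {
    "-ɒɡ": ["bog", "dog", "fog", "log"],
    "-æt": ["bat", "cat", "hat"],
}

index = build_word_index(rhyme_pages)
print(index["fog"])  # the rhyme page(s) that list "fog"
```

Looking up a known rhyme of a new word this way gives the ending of its IPA transcription as a starting point.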

On the contrary, no apologies are necessary. I think this is excellent. A very small percentage of pages have a link to the relevant rhymes page, and so this could be a great way of providing access to rhymes without having to update thousands of pages. I think this project should be adopted by Wiktionary, and updated periodically. — Paul G 09:59, 21 June 2006 (UTC)
Thanks for the good word. As for the "update to thousands of pages", I've got a bot ready to do them all, but due to recent circumstances (my own, not Wiktionary's) I may not be able to get the bot approved right away, so this project may have to wait for a few weeks. —scs 12:14, 21 June 2006 (UTC)
Cool! Note that there is a discussion near the bottom of this sub-page (location updated by Enginear 01:34, 24 June 2006 (UTC)) about dropping the so-called "AHD" pronunciation scheme. This would affect all the rhymes pages, but I don't think any decision has been made yet. Just so you know and can modify your bot if need be. — Paul G 09:06, 23 June 2006 (UTC)

New RFD page

Announcement: pages that are not in the main namespace (i.e. entries) are now to be nominated for deletion on the new Wiktionary:Requests for deletion/Others page (unless someone comes up with a better name). Shortcut is WT:RFDO. Reasons include the different procedures required to get rid of templates and categories (namely, deprecation), and to relieve the main RFD page from nominations the majority of users are not likely to be interested in (technical discussion). Comments and suggestions are welcome in the Grease pit, here. — Vildricianus 18:18, 20 June 2006 (UTC)

Derived and related terms

I've been giving the Beer parlour a miss for a couple of days as there just seemed to be too much screaming going on and not enough action. That's just my view.

Anyway, I feel I do need to comment on the following.

Lists of derived terms and related terms are given in alphabetical order. It is my view (see also the end of my posting) that alphabetical order in this case should ignore spaces and punctuation.

So, for example, I have reordered the derived terms for fog, which were previously ordered with regard to the spaces as:

to the following, which disregards the spaces:

There is a very good reason for doing this. Originally, many compounds, such as "foghorn", were two words. Over time, many of these become hyphenated or fused into single words (fog horn -> fog-horn -> foghorn). Not only that, but US English tends to do this more quickly (or even immediately), while UK English is more conservative, tending to keep multi-word phrases and hyphenated forms for longer (or even permanently).

So ignoring spaces and hyphenation when listing compounds has various advantages:

  1. Updating Wiktionary to show multi-word phrases that become hyphenated or fused is a simple matter of adding (or changing) a single line rather than needing to reshuffle the order of words in the list.
  2. A person looking for "foghorn" in the list of terms derived from "fog" and not knowing whether it is "fog horn", "fog-horn" or "foghorn" need only look in one position, rather than having to scan the list of words with spaces and words without. This is not such a big deal with "fog", but for entries such as iron or black, which have (or will have) dozens of related terms, finding words can be hard work if they have to look in more than one place.
  3. A person looking in the words of the form "fog" + <space> + <word> and not seeing "fog horn" there might add it without realising that "foghorn" is listed further down. This gives us redundant duplicates in the list. (If spaces and punctuation are disregarded in the alphabetical order, the user would not only find "foghorn" but could, if need be, add "fog horn" alongside as an alternative spelling. Such duplication is acceptable since it clearly indicates that alternative spellings are being given.)

Incidentally, this is not just my view, but the standard used by many print dictionaries and word lists, for the reasons given above. I propose that we use this policy when alphabetising all lists of words in Wiktionary, as far as is possible. — Paul G 10:15, 21 June 2006 (UTC)
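
For anyone who wants to try the proposed ordering, here is a minimal Python sketch of a sort that disregards spaces and punctuation. The function names and the sample terms are illustrative assumptions, not an existing Wiktionary tool or the actual list at fog.

```python
def collation_key(term: str) -> str:
    """Reduce a term to its letters and digits, ignoring case, spaces and punctuation."""
    return "".join(ch for ch in term.casefold() if ch.isalnum())


def sort_terms(terms):
    """Sort by the reduced key; the raw spelling only breaks ties between variants."""
    return sorted(terms, key=lambda term: (collation_key(term), term))


# Hypothetical sample of terms derived from "fog".
derived = ["fog lamp", "foghorn", "fog bank", "fogbound", "fog-bow", "foggy", "fog signal"]

print(sorted(derived))      # plain ordering: spaced and hyphenated forms cluster first
print(sort_terms(derived))  # proposed ordering: one alphabetical run
```

Because the raw spelling is used only as a tie-breaker, variants such as "fog horn" and "foghorn" end up next to each other, which is the point made in items 2 and 3 above.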

I agree. Enginear 12:49, 21 June 2006 (UTC)
Looks like a lot of words for a simple fact. But you're right of course. Also, I'd like to add that in these or any sections other than translations, columns should be formed by the templates {{top2}} and {{mid2}}, not {{top}} and {{mid}}, which are reserved for translations only. Another thing is that both sections come before the translations. Derived terms should always be at the same level as translations, namely at level 4 (level 5 with multiple etymologies), and split up per part of speech. Related terms can also be at level 3 if they are shared by multiple parts of speech. — Vildricianus 13:02, 21 June 2006 (UTC)
The 4-level has always been confusing for me, precisely because of exceptions as you mention. Davilla 15:02, 21 June 2006 (UTC)
Listing derived terms at the 4th level follows the same reasoning for listing translations at the 4th level. — Vildricianus 15:23, 21 June 2006 (UTC)
Yes, indeed, Vildricianus, a lot of words for something very simple :) I wanted to make my point clearly and back it up with sound reasons. Thanks for the reminder about the formatting rules (templates for derived terms/translations and ordering of these sections), which are broken in some older entries and sometimes in new postings by people who are unaware of them. — Paul G 09:04, 23 June 2006 (UTC)
Those who are attentive will have noticed the self-criticism in my "lot of words" remark :-) — Vildricianus 13:41, 23 June 2006 (UTC)
Yup, that's the way I've done it. Davilla 15:02, 21 June 2006 (UTC)


Would a non-English word such as 乒乓球 be categorized under Category:Chinese nouns & Category:Games or just Category:Chinese nouns? Please help. --Yorktown1776 17:26, 21 June 2006 (UTC)

Please use Category:zh:Games. Rod (A. Smith) 20:10, 21 June 2006 (UTC)
Wikipedia has the convention that no article should be in a category and that category's parent category. With very few exceptions, we try to follow that convention. --Connel MacKenzie T C 20:26, 21 June 2006 (UTC)
Yes, but Category:Chinese nouns is not a parent category of Category:zh:Games. (I think I must be missing your point.) Rod (A. Smith) 20:43, 21 June 2006 (UTC)
It's not? Isn't it supposed to be? --Connel MacKenzie T C 16:23, 24 June 2006 (UTC)
I thought the POS categorization tree was purposely separate from the topics categorization tree. Thus, it would be reasonable to categorize the verb steal (as in "to steal a base") under Category:Baseball (a subcategory of Category:Sports, a subcategory of Category:Recreation, a subcategory of Category:*Topics) and under Category:English transitive verbs. I could easily be in the minority here, though. Thoughts? Rod (A. Smith) 00:24, 25 June 2006 (UTC)
Thanks. It was categorized under Category:zh:Games. --Yorktown1776 

Trivia links to podcast site

From Talk:omelet:

I assume that stuff like this does not warrant a place in the main article:

omelet as word of the day at podictionary the podcast for word lovers

--Dangherous 10:11, 11 June 2006 (UTC)

Actually I want a ruling on that. Here's the track record: A similar entry at "hyena" was removed by administrator Connel MacKenzie and I asked why in the most constructive and polite terms, as seen at Talk:hyena. Connel agreed that inclusion was not unreasonable and reversed the removal, saying perhaps it should be decided at the beer parlour. I still await some more feedback on it. —This unsigned comment was added by Charles Hodgson (talk • contribs).
I restored it only because you were a newcomer and I thought I had been too hasty. Since you added more links without discussion anyway...well, I'm disappointed. --Connel MacKenzie T C 22:15, 22 June 2006 (UTC)
I am very much against the addition of this type of external link. I have explained why at User talk:Charles Hodgson. Kappa 12:28, 22 June 2006 (UTC)
I personally like the podcast site, but I (and probably all other admins and most other editors) agree that such external links are undesirable at Wiktionary. Rod (A. Smith) 18:57, 22 June 2006 (UTC)
Agree. Kappa explained very well why. — Vildricianus 19:02, 22 June 2006 (UTC)
You'll get no argument from me on that score. Dictionaries do not contain "trivia" sections, nor do these links particularly attest anything other than that the word was a word of the day. bd2412 T 21:57, 22 June 2006 (UTC)
Hippietrail, Ncik and others have put forth several "Trivia" section proposals for just this sort of thing...along with anagrams, Gutenberg ranking, etc. It was felt that a trivia section would be much better than having the "Usage notes" overrun by well intentioned contributors. --Connel MacKenzie T C 22:15, 22 June 2006 (UTC)
Clarification: This sort of thing, without the linkspam. --Connel MacKenzie T C 22:18, 22 June 2006 (UTC)
[edit conflict] I wouldn't necessarily object to the occasional well-intentioned, minimal trivia section (though as Wikipedia has discovered, these things have an overwhelming propensity to get completely out of hand). However, I would still object to the "word of the day at podictionary the podcast for word lovers" links, because they're not "trivia", they're distinctly promotional. —scs 22:29, 22 June 2006 (UTC)
I'll back off a step and say "trivia" is not universally bad - but this is too trivial. bd2412 T 22:59, 22 June 2006 (UTC)
Thanks for the ruling which of course I accept. Connel, I didn't mean to disappoint; my reading of your message was that you were going to bring it up here, I kept checking. As to "too trivial" I note that at least one Wiktionary user, after listening to one of my pieces, came back and edited a usage note. Since my pieces are heavily etymology centred I would have anticipated similar listener contribution in those areas over time. I suppose I should have done something to bring my listeners to Wiktionary, but am guilty of not having done so (I get 5000+ audio file downloads a day). That's the way the cookie crumbles, thanks for hearing me out. Charles Hodgson
I suggested in an email to Mr. Hodgson that he could put his links in a different namespace, like a user subpage, or even on the article's talk page. --Dangherous 14:01, 24 June 2006 (UTC)

List of words used in but not defined in WT

I've updated my list of words that are most used without being defined at User:RJFJR/WTconcord. I've made my program case-insensitive and remove possible targets less than 5 letters long (too many of them were fragments from non-English areas). RJFJR 04:15, 23 June 2006 (UTC)
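
As a rough illustration of how such a list could be built, here is a Python sketch. It is not RJFJR's program; it assumes "used" means "appears as a wikilink target", and the function name, the `pages` mapping and the sample text are all hypothetical.

```python
import re
from collections import Counter

# Captures the target part of [[target]], [[target|label]] or [[target#section]].
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)")


def undefined_link_counts(pages, min_length=5):
    """Count link targets that have no entry of their own.

    `pages` maps entry titles to wikitext. Matching is case-insensitive and
    targets shorter than `min_length` letters are dropped, as described above.
    """
    defined = {title.casefold() for title in pages}
    counts = Counter()
    for text in pages.values():
        for target in LINK_RE.findall(text):
            word = target.strip().casefold()
            if len(word) >= min_length and word not in defined:
                counts[word] += 1
    return counts.most_common()


# Hypothetical two-entry "dump" for demonstration only.
sample_pages = {
    "fog": "A [[cloud]] near the [[ground]]; see [[Foghorn]] and [[mist]].",
    "cloud": "Visible [[vapour|vapor]] in the sky; compare [[foghorn]].",
}
print(undefined_link_counts(sample_pages))
```

This sketch does not treat interwiki or namespace prefixes specially, so a real run over a database dump would need extra filtering.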

Very interesting list. --Brandnewuser 00:01, 24 June 2006 (UTC)

Do we insist on an entry for the full form of any entered abbreviation?

I've been told that "our CFI" states that we do, though it isn't stated on the actual CFI page. But I don't feel it is always appropriate.

I would expect to look up any abbreviation I didn't know in a dictionary, so I would argue that any citable abbreviation should be included. However, the full form of an unknown abbreviation may for example be more appropriate for an encyclopaedia, or just so plain obvious that it would never be looked up.

For example, many people still do not know what lol means but, to take one of its meanings, do we really need an entry for laugh out loud? Indeed, can anyone find a cite for laugh out loud, other than in a definition of lol, in any context similar to those where lol is used? (Perhaps that is why we don't have an entry for it.)

Similarly, while we have TLA we don't have, and don't need, Three letter acronym.

The general feeling appears to be that we should not include Shorter Oxford English Dictionary and certainly if I wanted to find more about it I would look in an encyclopaedia. But, though I've known of the Shorter... for about 40 years, I only recently came across the abbreviation SOED, and I would never think to look up an abbreviation in an encyclopaedia. I suggest we should merely define SOED as an abbreviation of w:Shorter Oxford English Dictionary and leave the full form to be explained in Wikipedia.

There is another class where part of the full form of the abbreviation may be worthy of an entry (in wt or wp), but not the complete form. For example, take the qualification MIEE, ie Member of the Institution of Electrical Engineers. I would suggest that should be defined as Member of the w:Institution of Electrical Engineers with no attempt to look for the full phrase anywhere.

(Obviously, we would hide the w:, but I have shown it above for clarity of argument.) Enginear 02:37, 24 June 2006 (UTC)

Is a so obvious, it will never be looked up? Why do we have that entry then? Is the so obvious, it will never be looked up? Isn't supercalifragilisticexpialidocious obvious, too?
The debate is (forever) ongoing about multi-word proper nouns. I'm not at all surprised that one side or the other might have taken the initiative to modify either ELE or CFI (or both) to match their POV.
As for links to Wikipedia, I honestly forget now, what the argument was for having them first be expanded stub-entries here. Probably something along the lines that nothing should point outside of Wiktionary, except the one link that matches the name exactly. For example, Wiktionary can have thousands of internal links to William Shakespeare but only one link to w:William Shakespeare...for the page with the identical headword. If it is time for that topic to be discussed again, then I'd like to indicate my support of that concept. --Connel MacKenzie T C 03:57, 24 June 2006 (UTC)

I think Enginear makes some very good points. Me, I would have no problem with the definition of LOL being just "Laughing Out Loud", or perhaps "Laughing Out Loud" (that is, "[[laughing|Laughing]] [[out|Out]] [[loud|Loud]]"). There's certainly no need to have the expansion laughing out loud or Laughing Out Loud as a multi-word headword in every case. —scs 13:37, 24 June 2006 (UTC)

I daresay this is a good example of how it should be. — Vildricianus 16:25, 24 June 2006 (UTC)

Bot approval request: Scsbot/wikised

So I've got this new bot, User:Scsbot, that's ready for y'all to look at if you're interested and, maybe, approve. The first applications will be:

  1. adding Rhyme: links to the Pronunciation section of pages that don't have them
  2. adding "breadcrumb" backlinks to the English:Rhymes pages (see for example Rhymes:English:-ɛmɪt and Rhymes:English:-eɪzɪŋ)

but it's a general-purpose tool that will be amenable to all sorts of other bulk-editing tasks.

Now, as I see it, there are three possible outcomes here:

  1. those that care about such things reach consensus that this bot (and its owner) can be trusted to do automated bulk editing without breaking anything, and that the "bot" bit should be set to hide its edits so they don't clog up Special:Recentchanges
  2. those that care reach consensus that this bot can be trusted, but that the "bot" bit should not be set, because these particular changes are for whatever reason interesting enough that seeing them in Special:Recentchanges would be preferable, even though there will be a lot of them
  3. the consensus is that this bot can not be trusted and should not be left to run unattended.

Personally, I have no preference between #1 and #2. As to #3, it's anyone's prerogative not to trust such a tool, but if so, I would like to hear your specific concerns, so that I can attempt to address them.

I've prepared some fairly complete documentation on this bot and placed it on its user page (User:Scsbot) and several subpages. There should be plenty of information there for anyone who wants to do a sort of code review. Theoretically, everything necessary is there for someone who wants to try downloading and running their own copy of this bot, although I haven't gone through the exercise of rebuilding the bot and all its supporting tools from scratch based on what's there, so there could easily be some ancillary pieces (not to mention installation instructions and related lower-level documentation) which are missing. If anyone is interested in going through this exercise, feel free to ask me questions as needed.

If I don't hear concerns or formal "bot" approval, I'll probably be bold and start running the thing anyway, but at a slow rate so as not to swamp Special:Recentchanges too badly. (But don't worry, I'm not going to do that right away. I figure I'll give y'all at least a week to find and read this note and look at the bot and think about if you're interested, but by then I'll be on vacation for a week, so it'll be a couple of weeks before this thing starts running in earnest no matter what.)

scs 15:52, 24 June 2006 (UTC)

  1. Vildricianus 15:56, 24 June 2006 (UTC)
  1. Separate bot requests for each separate task. (Dude, I'd love to have just one single 'bot account!) --Connel MacKenzie T C 16:18, 24 June 2006 (UTC)

  • Moved from above:
    Why are you opposing then? — Vildricianus 16:23, 24 June 2006 (UTC)
    Sorry if that wasn't clear: This request seems to be for a generic all purpose 'bot account...with a completely open-ended set of task assignments. Or did I read that incorrectly? --Connel MacKenzie T C 16:32, 24 June 2006 (UTC)
    And the awful Wiktionary custom of having bots only for specific tasks is something you opposed, IIRC. Or did you agree that TheCheatBot had to be split up per task in a dozen different bots? — Vildricianus 16:38, 24 June 2006 (UTC)
    Correct. My personal opposition to the policy is just that: my personal opinion only. Yes, if we vote to abolish that silly practice, this would certainly have my support. But this seems a roundabout way to try to effect a policy change. I'd rather it were a formal decision. --Connel MacKenzie T C 16:47, 24 June 2006 (UTC)
    (To be clear: I was not aware of this "policy" and was not attempting a roundabout change. —scs 16:59, 24 June 2006 (UTC))
    Do we even have a "policy" on it? It's just a custom. Perhaps, if we really consider it as a policy, someone should state it at Wiktionary:Bots. I don't mind, I don't operate any bot, but I guess others might benefit from a change. — Vildricianus 16:54, 24 June 2006 (UTC)
    PS: In case it's not clear, I see this as a good moment to change that policy/habit. Look at Scsbot: all it's doing is small fixes like header changes and the like. Don't tell me he has to get approval for each header he wants to change. Of course completely different things that might disconcert people can't be done without discussion first, but that hardly has anything to do with having the bot flag or not. — Vildricianus 20:56, 24 June 2006 (UTC)
  • So this is gonna get real confusing real fast, because the debate over the alleged "separate bots for separate tasks" policy is going to completely overwhelm the question of whether to approve this particular one and what it's approved to do. (But I'm not blaming anyone for engendering the confusion, because I've just stupidly gone and added to the confusion, by testing the bot this afternoon on a third and unrelated task, as Vild alluded to just above.)
    Question 1: Do we have any written bot policy? Vild linked to Wiktionary:Bots, but it's red.
    Question 2: Written policy or not, how would we decide whether two tasks are different enough to require two different bot accounts and two different approvals? Are my rhyme-adding and breadcrumb-adding tasks different or not?
    Question 3: What are we really approving when we approve a bot? The reliability of the bot and the trustworthiness of its owner, or a specific task it's intended to do? It could be argued (and I think this was my assumption, based on Wikipedia's bot policy) that it's mostly the former. Under this interpretation, there are two reasons bots need to be approved:
    1. to make sure the code is reliable, and does not run amok making changes not intended by the bot owner, or randomly garbling or blanking pages
    2. to make sure that the bot owner is responsible, and does not invoke the bot on inappropriate tasks
    But (again, under this former interpretation) the mechanism that ensures that the bot is not used on tasks which the community consensus would not agree with is not the initial bot approval process. Rather, it is the normal communication between the bot owner and the rest of the community. If a bot owner goofs and makes bot-assisted edits that are objected to, people can discuss it with the bot owner just like any other contested edit. (Obviously it's harder for an objector to revert, but a responsible bot owner -- which of course is one thing the approval policy is intended to ensure -- will, if convinced that a set of edits was not wanted, promptly redeploy the bot to reverse them.)

    Vild and Connel, thanks much for your input, but without (I hope) seeming to put you off, I'm going to suggest that we wait to hear from some others before we spend a lot more time debating this among ourselves. —scs 21:40, 24 June 2006 (UTC)
I'll answer your questions:
  • 1: No. It was my intention to display a nice red link.
  • 2: Hard to tell. The current policy just causes users to use their normal account to do some bot tasks because no bot can do it for them because they weren't approved for it... Reason enough to get rid of this attitude (that's it, it's an attitude, not a policy), because the tasks get done anyway - rather by using a non-bot account than going through the entire approval process here again.
  • 3: Here on en:wikt, the latter. Ask Connel how many bots he runs - or plans to run - and how many accounts he has. Perhaps I shouldn't have started this discussion here, but then I'd just have to oppose your request, which isn't solving anything.
That said, I'd like to hear the proponents of this weird attitude, too. — Vildricianus 22:56, 24 June 2006 (UTC)
Firstly, some changes, eg changing a non-standard spelling/format, to a standard one, are impossible to reverse without knowing exactly where each change was made (and may be difficult even then). Particularly if edits are "hidden" from Recent Changes, I believe this info must be kept (certainly for a minimum of 6 months) in case of problems. My preference would be to store the info within the project, eg as a sub-page of the bot user page.
Secondly, we can all make mistakes, and bot mistakes can be extensive. I do think that all proposed bot tasks should be pre-warned, eg by a statement in GP, perhaps three days in advance, preferably with a description of how the task is to be achieved. I don't think there should need to be approval -- eg lack of positive replies should not be a bar to running it. Rather, confirmation should be given that any concerns raised will be addressed before running. Obviously, any contentious tasks should be agreed by BP first. --Enginear 01:54, 29 June 2006 (UTC)
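
To make the first point concrete, here is a small Python sketch of a bot run that keeps a per-page record of what it changed so the run can be reversed later. It is only an assumption about how such a log might look; `EditRecord`, `apply_replacement` and `revert` are hypothetical names and the pages are held in a plain dictionary rather than fetched from the wiki.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditRecord:
    title: str   # page that was changed
    before: str  # full text before the bot edit
    after: str   # full text after the bot edit


def apply_replacement(pages, old, new, log):
    """Replace `old` with `new` on every page and log each page actually changed."""
    for title, text in list(pages.items()):
        if old in text:
            updated = text.replace(old, new)
            log.append(EditRecord(title, text, updated))
            pages[title] = updated


def revert(pages, log):
    """Undo logged edits, newest first; return titles skipped because they changed since."""
    skipped = []
    for record in reversed(log):
        if pages.get(record.title) == record.after:
            pages[record.title] = record.before
        else:
            skipped.append(record.title)
    return skipped


# Hypothetical pages: after the run both read "color", so only the log
# can tell which one the bot touched.
pages = {"entry A": "colour", "entry B": "color"}
log = []
apply_replacement(pages, "colour", "color", log)
```

Without the log, the two pages are indistinguishable after the run, which is the situation described above where a change to a standard spelling cannot be reversed.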

Sign up images: off?

What do others think about turning off the CAPTCHA™ images displayed at signup? They're useful for smaller wikis, but this one is active enough to have it switched off. Its primary target is automated registrations, which are done by vandalbots, spambots and the like. With almost 40 sysops, of whom 15-20 are pretty active, massive attacks will be spotted quickly, and as a matter of fact, any decent spammer knows how to break them anyway. The top 10 Wikipedias all have them off, for the good reason that they're cumbersome. Not only do they take some time to generate (up to half a minute), they also prevent people with disabilities from registering. Any thoughts? I'd like to have them off ASAP. — Vildricianus 20:26, 24 June 2006 (UTC)

  • I thought I replied to this before. I think we should keep the CAPTCHA images. --Connel MacKenzie T C 22:10, 7 July 2006 (UTC)

Allowing blocked users to post on their talk page?

This is permitted at Wikipedia, but I'm not sure which is best. Disallowing it encourages sockpuppetry and eliminates possible rectification of the situation. Allowing it may leave user talk pages open to further nonsense or vandalism. Any thoughts? — Vildricianus 11:49, 25 June 2006 (UTC)

"Encourages sockpuppetry"? That is a bit contrived, even in the best of situations. --Connel MacKenzie T C 12:13, 25 June 2006 (UTC)
As a matter of fact, people who have been blocked unjustly don't have a chance of contacting the blocking admin or the community, unless they know they have to specify an e-mail address, confirm, etc. Given the fact that there are even admins out here who have problems doing so, don't expect simple users to get it. If they're an IP, they don't have any chance at all. That's why I have an e-mail address displayed at the top of my userpage. Perhaps all admins should do so (a spam address, of course, not any important or personal address). — Vildricianus 12:21, 25 June 2006 (UTC)

Failed RFV

Discussion moved to Wiktionary:Votes/2006-06/Failed RFV.

Templates for English adjectives

Moved from the original templates section, which disappears in the archives right now. — Vildricianus 17:43, 25 June 2006 (UTC)

FYI, continuing with the above theme, the obvious English adjective template is implemented at {{en-adj}}. The default is the "more" form. "{{en-adj|er}}" and "{{en-adj|-}}" do the obvious. Irregulars are given as "{{en-adj|better|best}}", stem changes can be indicated as "{{en-adj|pretti|er}}", and it supports custom wiki text for multiple comparatives or other detailed notes. See the documentation at Template talk:en-adj for details. Rod (A. Smith) 04:10, 23 June 2006 (UTC)
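
For readers who find the parameter forms easier to follow as code, here is a Python sketch that mirrors the behaviour described above. It is not the template's source; the function name and the exact output strings are assumptions, and the custom wiki text case is left out.

```python
def en_adj_forms(headword, *params):
    """Return (comparative, superlative) for an adjective, or None if not comparable.

    Mirrors the described parameter forms:
      no parameter        -> "more X" / "most X"
      "er"                -> X + "er" / X + "est"
      "-"                 -> not comparable
      stem, "er"          -> stem + "er" / stem + "est"
      comparative, superl -> irregular forms given in full
    """
    if not params:
        return (f"more {headword}", f"most {headword}")
    if params == ("-",):
        return None
    if params == ("er",):
        return (headword + "er", headword + "est")
    if len(params) == 2 and params[1] == "er":
        stem = params[0]
        return (stem + "er", stem + "est")
    if len(params) == 2:
        return (params[0], params[1])
    raise ValueError(f"unsupported parameters: {params!r}")


print(en_adj_forms("beautiful"))              # default "more" form
print(en_adj_forms("pretty", "pretti", "er")) # stem change
```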

medical dictionary

There is a proposal to start a distinct medical Wiktionary. I have no problems with this, but wonder why it is not made part of this effort.

I have noticed a lack of terms, e.g., gastrocnemius, Rolandic fissure. I see you have ACL but not anterior cruciate ligament. I've added a few entries, but have likely also made some formatting errors. --Allamakee Democrat 03:40, 26 June 2006 (UTC)

PS I've not been able to find a template along the line of {{surgery}} or {{medicine}}. --Allamakee Democrat 04:57, 26 June 2006 (UTC)
I've used {{medicine}}, recently, even. {{surgery}} is a good idea, as well. --Connel MacKenzie T C 05:06, 26 June 2006 (UTC)
Or {{anatomy}} or {{pathology}}. The proposal goes much much further than being a dictionary, but is intending to include information on cures and treatment. Jonathan Webley 11:44, 26 June 2006 (UTC)
One of the big non-intuitive clumsies in wiki software is finding templates. And as wiktionary differentiates between cap and lower case, it's even clumsier. In times past, I've edited a few templates, but many of them leave me totally baffled as to what is going on, even tho' I see what they produce. I have no idea how {{surgery}} would be constructed. --Allamakee Democrat 15:30, 26 June 2006 (UTC)
Actually you no longer need a surgery template. There's already a Category:Surgery, and if you add {{cattag|surgery}} to an article, then the cattag template adds (surgery) to the article and includes the link to the category. abdominal is an example using {{cattag|zoology}}. I just added the surgery category to appendectomy using this method. Jonathan Webley 15:43, 26 June 2006 (UTC)
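
In case it helps to see what that amounts to, here is a tiny Python sketch of the effect described above. It is not the template's actual code, and the exact wikitext it emits is an assumption based only on the description: a parenthesised label plus a link to the capitalised category.

```python
def cattag(topic: str) -> str:
    """Build the label and category link that {{cattag|topic}} is described as adding."""
    label = f"({topic})"
    category = f"[[Category:{topic[:1].upper()}{topic[1:]}]]"
    return f"{label} {category}"


print(cattag("surgery"))
```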

Using sub-pages to improve Wiktionary's usefulness as a translating dictionary

Right now, Wiktionary is next to useless as a translating dictionary, as it generally only gives one- or two-word "equivalents", rather than exploring the translation issues between two words as most published dictionaries do. For any given pair of languages (e.g. English and Japanese), there should be six dictionary entries available for a given semantic concept:

  1. English word defined in English
  2. Japanese word defined in Japanese
  3. Translations of the Japanese word into English, written in English
  4. Translations of the English word into Japanese, written in Japanese
  5. Translations of the English word into Japanese, written in English
  6. Translations of the Japanese word into English, written in Japanese

The current structure of Wiktionary does a good job with 1 and 2, an arguably good job with 3 and 4, and a terrible job with 5 and 6. "Outgoing" translations, in other words, translations intended for a learner of a foreign language to make the right pick when translating from her native tongue, are limited to one-to-one, or at least one-to-few translations, with little or no room for explanation as to which pick is correct. Words such as "if", which require a significant amount of explanation to translate properly into a foreign language like Japanese, do not stand a chance. Another example is translating "brother" into a language such as Japanese. Brother could be translated as:

  1. kyōdai: 1. sibling 2. literal brother in the generic sense 3. figurative brother
  2. oniisan: older brother (formal form)
  3. ani: older brother (familiar form)
  4. aniki: older brother (more familiar than ani)
  5. otōto-san: younger brother (formal form)
  6. otōto: younger brother (familiar form)

The current "brother" page gives most of these forms, and a few others, but it looks rather cluttered and does not give enough information to allow someone to use the word properly. And if the word is not rendered properly, it ruins the meaning of the sentence in Japanese. So what's the solution? I propose that the language listed for each multilingual translation of words on the entry page link to a sub-page for complicated translations. In other words, Japanese: kyōdai (sibling) would be Japanese: kyōdai (sibling), where the word "Japanese" links to en.wiktionary.org/wiki/Brother/ja, which is a full page devoted to translating "brother" from English into Japanese. Feedback? --Aaronsama 13:39, 27 June 2006 (UTC)

The question What is the scope of Wiktionary springs to mind - but I think you raise a very valid point nevertheless. (Wiki* is not paper.) I'd appreciate seeing your example filled out, as an experiment or as an example for discussion, so I can see (better) what you mean. What you suggest might eventually be an answer to the controversy about the German entry Hand. I'd appreciate Paul's, Ncik's and Ec's comments on the topic...but having the sample page ready will help that discussion. --Connel MacKenzie T C 16:04, 27 June 2006 (UTC)
  • This is a problem I've been thinking about for several months now. Real bilingual dictionaries are organized in a certain way. Wiktionary is not a real bilingual dictionary. So far we have been concerned with getting the basics in order. We've been including bits of everything which hopefully means nothing is overlooked when it comes time to expand. By this I mean we have bits of translations, bits of rhymes, bits of pronunciations, bits of synonyms and antonyms. So we are aware of all of these things but as yet we have not stopped to think about how to take most of these aspects to the next level of professionalism. So yes we have many translations but they are mostly glosses and they are crammed together into Translations sections. What we really need to be a real bilingual dictionary is to move away from these "squished" translation sections toward proper bilingual dictionary sections.
  • But how to do this?! Currently Wiktionary's back end is totally ignorant of dictionary concerns. This means we cannot have a database with a general dictionary section for each word as well as as many bilingual dictionary entries as we need, and then create pages depending on each users' language needs. Because we cannot do this we have to include everything about each word on a single page. If we started to have proper bilingual dictionary entries for several languages in each word, you can bet complaints will start pouring in from people who don't want to see large amounts of extra stuff that doesn't pertain to them.
  • How can we give the bilingual dictionary users what they need without disgruntling the monolingual dictionary users? We can hope WiktionaryZ will solve all problems and start using that instead. Or we can ask the nerds in the Grease pit how we can act like we have a dictionary-aware back end with the current software. If you prefer the first answer then go and play with WiktionaryZ and if it solves your problem let us know. If not, we can do many things with templates and JavaScript but we already have quite a few users who dislike these solutions more and more.
  • How can we improve Wiktionary with JavaScript and templates with so many people disliking JavaScript and templates? Hehe well talk about it here and let's see what people think! — Hippietrail 22:35, 27 June 2006 (UTC)
    • I took a stab at creating a sample in Chinese of what I think Aaronsama is talking about. I modified the entry for brother so that it links to brother/zh. Chinese adds another dimension to this because it is actually a family of languages. I put everything in zh, but it might be cleaner to do brother/zh-cn (Mandarin written in Simplified script), brother/zh-tw (Mandarin written in Traditional script), brother/nan-cn (Min Nan written in Simplified script), brother/nan-tw (Min Nan written in Traditional script), brother/yue-cn (Cantonese written in Simplified Script), brother/yue-hk (Cantonese written in Traditional script) etc. Wow, this really gets complicated!

A-cai 23:06, 27 June 2006 (UTC)

I like the way that this is progressing. The translation section of each term always struck me as woefully inadequate and nigh-useless for non-Western-European languages. However, I'd like to take a moment to point out that the kind of resource being described is usually best served with example sentences covering the range of terms in another language that are likely to be used in translating the term being covered. In the case of brother, many of the translations are served by a simple explanation of the general distinctions between the terms (older brother, younger brother, etc.--though I'll point out now that it's going to be hard to come up with a way of dealing with the formality levels that is both useful and elegant), but for other terms, particularly common ones, the distinctions aren't going to come across easily without some explanation and example sentences (go is a particularly tough example, which would require several dozen example sentences before we could even begin to pretend some kind of comprehensiveness. light, crime, power, etc., all strike me as difficult cases as well). Anyways, it will be quite impossible to serve this need effectively without each language pairing having its own page en->ja, en->de, etc. If I find 60 minutes to staple together, maybe I'll take a stab at formatting an en->ja page of a fairly low-difficulty level.
There is one thing that should be noted (and it has before). As we build out these sorts of features, we do run into the situation where the amount of work to be done multiplies, and the number of people for whom it will be useful divides. I'm not convinced that this feature's usefulness can overcome its vastness to the point where user contribution will come close to making it comprehensive. Jun-Dai 23:35, 27 June 2006 (UTC)
Jun-Dai? Jun-Dai! Welcome back! --Connel MacKenzie T C 09:11, 28 June 2006 (UTC)
What happened to the translation tables, or has brother never had any? Is this proposed scheme intended to work with translation tables (I hope so) or to replace them (which I think would be a step backward)? — Paul G 09:25, 28 June 2006 (UTC)
I like this idea a lot. Subpages have a lot of potential, for quotations as we have already seen, but also for translations and perhaps a host of other things. The first thing to do, then, is to allow subpages in the main namespace, right? (See above.)
Secondly, let's not think about WiktionaryZ right here. It's a promising project but it will not solve our problems anywhere in the near future.
Given the positive reactions here, I assume that we should give this proposal a try, so the question is, to what extent will we bring translations to subpages? One thing I really like here is the idea that the main entry pages wouldn't be burdened by the translations anymore, which is often what translations do. A large number of entries have minimal definitions but a host of translations, which makes it difficult to change the definitions. Bringing them to subpages could solve quite a few issues, and would perhaps create a better balance, allowing both the monolingual English side of entries to expand more easily, and the translations to take up as much space as they want to on subpages. — Vildricianus 09:44, 28 June 2006 (UTC)
I added a little more pizazz to the brother/zh example (which hopefully makes it look better).

A-cai 14:24, 28 June 2006 (UTC)

So, brother now has down-links in the translations section? And no actual translations? Is that the proposal now? --Connel MacKenzie T C 16:21, 28 June 2006 (UTC)

Notice: Imports are back / Direct transwiki

[4]. Meaning that Transwikis now can happen directly. If we're interested (which I reckon we are), we should apply for this, allowing direct imports from transwikible Wikipedia stuff. — Vildricianus 18:05, 28 June 2006 (UTC)

Yup, I already transwikified some pages on fr:, and it's working well. - Dakdada 20:47, 28 June 2006 (UTC)
But that doesn't seem to work here?
Import pages
No transwiki import sources have been defined and direct history uploads are disabled.
--Connel MacKenzie T C 06:27, 30 June 2006 (UTC)
Yes, you have to ask to enable it ([5]). - Dakdada 10:23, 30 June 2006 (UTC)

Works. First import was Worth one's salt (as a test). — Vildricianus 21:23, 30 June 2006 (UTC)

  • Update. Namespace can now be selected. In short, articles from en.wikipedia can directly be imported into our Transwiki namespace (by our sysops). — Vildricianus 18:38, 1 July 2006 (UTC)
    • I'm not sure exactly how to proceed here; I've been doing the bulk of the transwikiing to Wiktionary lately. I'm not a sysop here and can't use the import function, though, so I'll continue to copy+paste merge with edit history, unless this is a big problem -- just let me know on my talk page. Thanks! TheProject 21:29, 1 July 2006 (UTC)

See new topic in WT:GP#Transwiki - new procedure? for hammering out a system. — Vildricianus 17:05, 2 July 2006 (UTC)

Announcing new namespaces

In case you didn't notice yet... we have 14 new namespaces:

  • Appendix
  • Appendix talk
  • Concordance
  • Concordance talk
  • Index
  • Index talk
  • Rhymes
  • Rhymes talk
  • Transwiki
  • Transwiki talk
  • Wikisaurus
  • Wikisaurus talk
  • WT
  • WT talk

Numbered from 100 to 113. I still argued for my own at Vild: but didn't get it.

Most entries that were in the wrong place (most notably those under rhymes: instead of Rhymes:) are now harmoniously together in the real namespace.

Any entries you still spot in the wrong place, for instance Talk:WikiSaurus:blabla, please move them to Wikisaurus talk:blabla. — Vildricianus 21:49, 28 June 2006 (UTC)
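
For anyone who wants to script this clean-up rather than move pages by hand, here is a minimal sketch of the title rewrite described above. It is only an illustration: the function name `fix_title` and the table `NAMESPACE_IDS` are made up for this example, and the assignment of numbers 100 to 113 in the listed order (subject namespace even, talk namespace odd) is an assumption based on the list, not a confirmed configuration. It only computes the corrected title; the actual page move would still have to be done separately.

```python
# Subject namespaces from the announcement; each has a matching "<name> talk".
NEW_NAMESPACES = ["Appendix", "Concordance", "Index", "Rhymes",
                  "Transwiki", "Wikisaurus", "WT"]

# Assumption: IDs 100-113 are handed out in the listed order,
# even for the subject namespace and odd for its talk namespace.
NAMESPACE_IDS = {}
for i, name in enumerate(NEW_NAMESPACES):
    NAMESPACE_IDS[name] = 100 + 2 * i
    NAMESPACE_IDS[name + " talk"] = 101 + 2 * i

# Lower-cased lookup so that "rhymes:" and "WikiSaurus:" are recognised.
_CANONICAL = {name.lower(): name for name in NEW_NAMESPACES}


def fix_title(title):
    """Return the title in its real namespace, or unchanged if nothing matches."""
    is_talk = False
    rest = title
    # A misplaced talk page looks like "Talk:<pseudo-namespace>:<page>".
    if rest.lower().startswith("talk:"):
        is_talk = True
        rest = rest[len("talk:"):]
    prefix, sep, page = rest.partition(":")
    canonical = _CANONICAL.get(prefix.strip().lower())
    if not sep or canonical is None or not page:
        return title  # not one of the new namespaces; leave it alone
    namespace = canonical + " talk" if is_talk else canonical
    return namespace + ":" + page


print(fix_title("Talk:WikiSaurus:blabla"))
print(fix_title("rhymes:English:-eɪt"))
```

Titles that are already in the right place, or that do not start with one of the new namespace names, come back unchanged, so the function can be run over a whole list of page titles without special-casing.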

For instance - over 300 Talk:Transwiki:crap in Double Redirects. SemperBlotto 21:57, 29 June 2006 (UTC)
We still have User:DblRedirBot of course. — Vildricianus 22:05, 29 June 2006 (UTC)
The lag on DoubleRedirects stems from the maintenance scripts that must run to get that special page updated...then when I notice it, I run DblRedirBot. Should be caught up now...leave a message on my talk page if you see it out of hand again. --Connel MacKenzie T C 05:52, 30 June 2006 (UTC)
0 right now. — Vildricianus 22:16, 30 June 2006 (UTC)
See Wiktionary:Statistics for a breakdown of pages per namespace, including the new ones. — Vildricianus 11:43, 29 June 2006 (UTC)
Here's a nice toy: Special:Random/Rhymes. — Vildricianus 21:49, 29 June 2006 (UTC)

Possible project: drug names.

Medline is copyright free. If you click and follow links, you get access to an exhaustingly exhaustive list of drugs, with generic names and brand names, and if you dig deep enough, chemical names. It is easily mined. I'm in no hurry, but potentially many thousands (even tens of thousands) of new entries are available. I'm thinking along the lines of a list of requested articles after some public-minded person sucks the site dry of main entries and cross-references the embedded URLs. Given a few months, even I could make a start.

Unless I am mistaken, this site lets you list every registered trade name® in the drug industry as dictionary entries, with impunity. --Allamakee Democrat 04:54, 29 June 2006 (UTC)

† The MedMaster™ Patient Drug Information database provides information copyrighted by the American Society of Health-System Pharmacists, Inc., Bethesda, Maryland Copyright© 2005. All Rights Reserved.
‡ Copyright © 2006 Thomson Healthcare. All rights reserved. USP DI ® and Advice for the Patient ® are registered trademarks of USP used under license to Thomson MICROMEDEX. Information is for End User's use only and may not be sold, redistributed or otherwise used for commercial purposes.
  • Natural Standard Monograph Copyright© 2005 Natural Standard. Commercial distribution or reproduction prohibited.
and also http://www.nlm.nih.gov/medlineplus/copyright.html says pretty clearly that "When using NLM Web sites, you may encounter documents or other information resources contributed or licensed by private individuals, companies, or organizations that may be protected by U.S. and foreign copyright laws." So I think this is not a "safe" resource for us to be using.
On the other hand, we can create entries, and refer to MEDLINE in the references section (or in the external links section.) And we probably should, for that matter. But to blindly pull content from there would be very problematic.
Also, your last comment seems to be a non sequitur. A web site can't allow you to violate copyright with impunity. --Connel MacKenzie T C 17:03, 1 July 2006 (UTC)


Discussion moved to Wiktionary:Votes/bt-2006-06/KillRedirectsBot (User:DblRedirBot).