Wiktionary:Beer parlour/2006/April

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

What the #@!%$@ is going on???

Why does the database keep locking up tonight? Why do I have to "Save"/"Enter" each word three or more times before it shows up in the database? Why does this have to be so frustrating tonight? --EncycloPetey 10:44, 1 April 2006 (UTC) (who would rather relax with a nice cuppa, despite being American)

I don't know, but it seems to be acting up again now. Between database locks and edit conflicts, I haven't been able to edit on the first attempt even once. --Connel MacKenzie T C 04:07, 5 April 2006 (UTC)

words easily confused

Should words that are easily confused with other words have a section for these other words? If so, what should the heading be? Has this been policy? JillianE 13:45, 1 April 2006 (UTC)

I always put them in ===See also===; or in ===Homophones=== if they sound the same. SemperBlotto 13:50, 1 April 2006 (UTC)
In some cases they may also warrant ====Usage notes==== Davilla 17:01, 1 April 2006 (UTC)
I think ====Usage notes==== (note: level-4 heading) is the appropriate place for these; for example:
==English==
===Adjective===
populous
whatever goes here
====Usage notes====
  • Do not confuse populous, which is an adjective, with populace, which is a noun.
(I quote this as an example because you often see "populous" used where "populace" is intended.) I have added this sort of usage note to various words (see phosphorous for another example).
They should certainly go under "Homophones" too if they are homophones (as is the case with "phosphorous"), but I think they need more treatment than just "See also", hence my suggestion that they should be included under "Usage notes". — Paul G 09:44, 3 April 2006 (UTC)
The term "Usage notes" denotes notes about usage. Confusable terms have little to do with usage I think; therefore I'd say =Related terms=, which is often the case, else =See also= as SemperBlotto suggested. — Vildricianus 12:12, 3 April 2006 (UTC)
It is about usage really - it's saying "Don't use Y when you mean X; use X". — Paul G 13:43, 3 April 2006 (UTC)
Hmm, yes perhaps. Not sure. — Vildricianus 14:00, 3 April 2006 (UTC)
I also use ===Usage notes=== for this sort of thing. --Connel MacKenzie T C 16:42, 3 April 2006 (UTC)
I suppose if it's easily confused, as stated, then it really does belong as a usage note, though there must be at least a few only somewhat confusing cases where see also will suffice. Davilla 17:42, 3 April 2006 (UTC)
Certainly, and I think this is probably the only time a usage note should be used to indicate confusables. I have seen "phosphorous" instead of "phosphorus" and "populous" for "populace" many times. "Affect" and "effect" might be another case; "somewhat confusables" can probably safely go under "See also". — Paul G 09:59, 4 April 2006 (UTC)

According to the OED and according to the citations supplied therein, as well as evidence in the form of citations I have supplied from the Mayflower Compact (1620) and from Dickens' A Tale of Two Cities (1859), the phrase Anno Domini should be capitalized. As such, I moved it to be capitalized. Stephen G. Brown has moved it back repeatedly without offering citations or evidence. What does the community think? --EncycloPetey 14:43, 1 April 2006 (UTC)[reply]

First, your examples are British English, not Latin. The American Random House Dictionary of the English Language disagrees with you, showing as entries *A.D. < L annō Domini, and *annō Domini, and *A.H. < L annō Hejirae, etc. The American Heritage Dictionary of the English Language (fourth edition, 2000), also says annō Domini. Your claim that I have moved it back without offering evidence is patently false, because I countered your OED with our Random House and American Heritage. In any case, annō Domini is correct in American usage, so there are no grounds for moving it at this late date to the British spelling. —Stephen 14:56, 1 April 2006 (UTC)[reply]
Remember, our criteria for inclusion of information specifically exclude dictionaries. We can cite them as reputable references advocating "anno Domini" in a ===usage note===, but without citations of the use of "anno Domini" there's nothing to justify it standing at that title, while the number of citations so far proffered for Anno Domini certainly justifies it standing at that title unredirected (even if anno Domini may later get its own, separate page, when evidence is accrued). —Muke Tever 16:00, 1 April 2006 (UTC)
This is different. The phrase clearly merits inclusion. The question is about orthography, and virtually all professional American writers and typographers have always turned to the AHD, Random House, Webster's, or a house manual of style as the authority on matters of orthography (and those who write the style guides themselves turn to the AHD, Random House, or Webster's). The argument that numbers should be the deciding factor in anything involving quality or beauty or anything of that sort is silly. Using that logic, most people are the best writers of English, and H.W. Fowler writes poorly...a ridiculous notion indeed. Using that logic, "why r u" is better English and higher style than "wherefore art thou." Numbers and cites prove that something exists, but they are no indication at all of best practice. When Americans write "Anno Domini," it’s because they are simply guessing how it should be because they are too busy to look it up or they just don’t care. —Stephen 14:14, 3 April 2006 (UTC)
I didn't say the amount of the numeration was relevant. I said that it takes three citations for us to have an entry on a word. Three citations were already given for Anno Domini so there is no reason in the universe why it shouldn't have an entry. If you want to decorate it with admonitions not to use it culled from reputable sources, that's fine. But it's a failure of NPOV to tell people that a phrase shouldn't exist because you or anybody doesn't like it, for whatever reason. —Muke Tever 22:13, 3 April 2006 (UTC)[reply]
I never said anything of the sort (concerning "shouldn’t have an entry" and "admonition not to use it"). The page anno Domini already existed, and EncycloPetey wanted to move it to the newly created Anno Domini. Nobody has suggested that Anno Domini is not used or does not deserve an entry; it’s the original anno Domini that he feels doesn’t deserve an entry. We have pages such as analyze and analyse where each mentions the alternative spelling, and color and colour. This has never been about the existence of Anno Domini at all, but about anno Domini. EncycloPetey kept insisting on having only his version and reducing the old page to a redirect, and has been insisting that there is no evidence that the old page is either correct or used, and I have been insisting that the old page is not only correct but is also used by reputable writers. As Eclecticology points out, the spelling anno domini is also in use, and I don’t mind if it has an entry of its own or is mentioned as an alternative spelling. I only insist that the original page anno Domini is correct and that there are no grounds for deleting it. —Stephen 19:39, 6 April 2006 (UTC)[reply]
All right; I had read in the original comment "Stephen G. Brown has moved it back repeatedly without offering citations or evidence." and my mind entirely discarded the previous statement relating the same as what you just restated. —Muke Tever 12:57, 7 April 2006 (UTC)[reply]
Stephen has discussed this on his talk page before. I get where he is coming from but I must admit all my sources too, like EP's, show it with a capital A. Ultimately I think we have to be led by citations on this kind of issue. Widsith 14:54, 1 April 2006 (UTC)[reply]
Currently, Anno Domini has only a Latin entry. This entry contains English quotations!?! Ncik 15:00, 1 April 2006 (UTC)[reply]
This is easily fixed (and now done) by adding an English header. BTW, one of the quotations I supplied on the Talk Page is in Latin. --EncycloPetey 15:03, 1 April 2006 (UTC)
Yes, it’s in Latin ... written by Shakespeare. If I recall, he was a British subject by birth. —Stephen 15:20, 1 April 2006 (UTC)[reply]
You have casually dismissed each and every instance of capitalization as an "exception" to your view, and yet every source and WikiSource and each quote in the OED capitalizes Anno Domini. You have yet to provide a single documented example to support your position. It seems that capitalization is the rule, and your viewpoint is an unsupported minority view at best. I will be examining medieval Latin texts from Hungary and Croatia when I get home, to see what light they shed on the situation. I may even have a few medieval Italian and German texts in Latin to examine. --EncycloPetey 15:26, 1 April 2006 (UTC)[reply]
I trust the AHD and Random House far more than I trust Florida lawmakers, ancient mariners, or any of your other examples. I consider the AHD and Random House to be authoritative in matters of American orthography. —Stephen 15:34, 1 April 2006 (UTC)[reply]
It's a blind faith you have -- one that lacks evidence and relies solely on someone's saying so. Believe as you will, the totality of citations is against you. --EncycloPetey 15:39, 1 April 2006 (UTC)[reply]
Certainly no blinder than your faith in your sea captains, old poets and state law-makers. Your citations are examples of bad usage and poor education. Nothing you’ve said can stand against the American authorities that I have already named. —Stephen 15:47, 1 April 2006 (UTC)[reply]
use-mention distinction —Muke Tever 16:00, 1 April 2006 (UTC)
It's not blind faith when you can see the evidence. Q.E.D. As I understand your argument, the evidence is "bad" (as you put it) because it contradicts your sources' unsupported assertion. There is a very important distinction between attested and asserted that you are missing here. --EncycloPetey 15:53, 1 April 2006 (UTC)[reply]
It certainly is blind faith if you believe in the evidence that you have given, in opposition to what I have named. Your examples are bad usage because they are wrong from an American perspective, and they are wrong because they are not correct American usage. Besides the examples I’ve already mentioned, the AHD and Random House also have entries for annō Hebraico and annō Hejirae. The entry at the AHD goes on to say:
annō Domini
SYLLABICATION: an·no Dom·i·ni
ADVERB: abbr. A.D. or a.d. In a specified year of the Christian era.
ETYMOLOGY: Medieval Latin annō Domini: Latin annō, ablative of annus, year + Latin Domini, genitive of Dominus, Lord.
Other American sources include:
  • The Columbia Guide to Standard American English. 1993...anno Domini
  • The Columbia Guide to Standard American English. 1993...A.D. is an abbreviation for anno Domini.
  • The New Dictionary of Cultural Literacy, Third Edition. 2002...It stands for anno Domini, a Latin phrase meaning in the year of our Lord.
  • Sir Richard Hawkins, The Cambridge History of English and American Literature: An Encyclopedia in Eighteen Volumes. 1907–21...entitled The Observations of Sir Richard Hawkins, Knight, in his voiage into the South Sea; anno Domini, 1593 (printed 1622).
  • The Encyclopedia of World History...A.D., or anno Domini, in the year of our Lord.
  • Roget’s International Thesaurus of English Words and Phrases...anno Domini [L.], A.D.; ante Christum [L.], A.C.; before Christ, B.C.
  • The Cambridge History of English and American Literature...adds more explicitly that this was in anno millesimo xiiii ab incarnatione Domini nostri Jesus Christi.
—Stephen 16:23, 1 April 2006 (UTC)[reply]

I'd like to approach this from a slightly different angle. What is the correct capitalization in Latin? That page should exist with a Latin header. It appears from this conversation that the borrowed term in English should be listed under capital A. It's not clear to me if the lowercase should be considered an alternate English spelling. Davilla 17:18, 1 April 2006 (UTC)[reply]

Never mind. It appears the Latin is correctly placed at anno Domini. The diacritic on ō is ignored for the entry name according to Wiktionary policy. But shouldn't annō Domini redirect? Davilla 19:25, 1 April 2006 (UTC)[reply]
A definition of anno Domini wouldn't count, but use of "anno Domini" in the definition of A.D. would, whether in a dictionary or otherwise. This is sufficient evidence IMO. Davilla 20:52, 1 April 2006 (UTC)[reply]
The problem with this is that the best American usage is lowercase ‘a,’ and I consider the capital to be sloppy style. I have already suggested that a note be added to the effect that British usage prefers the capitalization, just as we do with the other words that we spell differently. As for the capitalization that was followed in Latin one or two thousand years ago, it was all handwritten or chiseled in stone and caps and lowercase usually didn’t have the same effect or degree of standardization that today is the case. —Stephen 17:32, 1 April 2006 (UTC)[reply]
More proof:
I can find thousands of these anno Domini with the click of a mouse. —Stephen 18:28, 1 April 2006 (UTC)[reply]
What a brilliant demonstration of why we need "The Information Desk" so our newbies are not faced with flame wars such as this. Which, I must say, seems an almost deliberately concocted flame war. Just agree to disagree, pleeeease ! Put in a redirect and a note about the usage being different in US and UK, and leave it at that. We've got better things to do, surely ! And please, take this petty war out of Beer Parlour and put it in the talk page of the article(s), where it belongs. --Richardb 00:56, 2 April 2006 (UTC)[reply]
This is not a flame war. This discussion is useful and informative. Widsith 09:25, 2 April 2006 (UTC)[reply]
To a newbie looking for help, this looks like, smells like a flame war (Certainly no blinder than your faith in your sea captains, old poets and state law-makers. ). I didn't say it wasn't useful or informative, it's just inappropriate for here. This Beer Parlour is just becoming so overcrowded and noisy with everyone bringing their petty squabbles here. Just put an "ad" here, and take your discussion off to the appropriate place. Before Beer Parlour becomes too big even if you have got a broadband connection !--Richardb 13:53, 4 April 2006 (UTC)[reply]
Exactly, still, it belongs in the Tea room. I'm inclined to move this entire bit there. — Vildricianus 12:07, 3 April 2006 (UTC)[reply]
The Random House Handbook and the Globe and Mail Style Book both use anno domini. Eclecticology 00:56, 3 April 2006 (UTC)[reply]

The large chunk of discussion on this topic has now been moved to Talk:anno Domini, or rather the attached page Talk:anno Domini/BP April 2006.
--Richardb 13:11, 15 April 2006 (UTC)[reply]

Idea - most popular lookups not found

Does anyone know if/how we can run a report on the most popular GO and SEARCH terms not found? That would be another list we could work on, and it might be illuminating for many of our discussions about WikiSaurus and Protologism. Arguably, if a word is searched for a lot, it is a real word! --Richardb 05:39, 2 April 2006 (UTC)

For some reason, tuttler springs to mind. The MediaWiki software has the capability to generate statistics sort of like that, but after one or two days last year, the feature was turned off for performance reasons. The performance impact was quite noticeable. --Connel MacKenzie T C 17:33, 2 April 2006 (UTC)
As an ex-IT person, I can't imagine it having much impact to just log what words are GO/SEARCHed for, for later reporting. That would consume minuscule time/energy compared with the actual SEARCH function. Weigh that against the usefulness of finding out what our "customers" are actually looking for. --Richardb 10:20, 3 April 2006 (UTC)
It depends on how it is done. If the database is replicated over several machines and every go/search generates a write instead of only reads, it might be expensive. Of course, since there is no real need to have every go/search immediately accessible, you could just dump it in a file on each machine and process it in a batch every day. --Patrik Stridvall 15:01, 3 April 2006 (UTC)
Did the software try to compile all input terms or only those not found? Did it dump and post-process?
You could generate some meaningful statistics even if it were only turned on for short periods of time. I'd be interested in the proportion of misspellings as well. Davilla 17:29, 3 April 2006 (UTC)[reply]
I'm sorry if I worded it poorly above: that was not my software, nor my decision. IIRC, I did not have any influence on the decision, but found out about it afterwards. I think the link was at http://www2.knams.wikimedia.org/?site=en.wiktionary.org but that link is not functioning right now. --Connel MacKenzie T C 17:42, 3 April 2006 (UTC)
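
For illustration only, the "dump it in a file on each machine and do it in a batch every day" idea suggested above could look roughly like the sketch below. It is not how MediaWiki implements or implemented this feature; the function names, the one-term-per-line log format, and the log file names are all hypothetical, and terms are lowercased here purely as a simplifying assumption.

```python
from collections import Counter
from pathlib import Path
import tempfile


def log_missed_term(log_path: Path, term: str) -> None:
    """Append one not-found GO/SEARCH term to a local log file, one term per line.

    Hypothetical helper: each server appends to its own file, so a search
    costs one cheap local write and never touches the shared database.
    """
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(term.strip().lower() + "\n")


def top_missed_terms(log_paths, n=10):
    """Daily batch step: merge the per-machine logs and count the terms.

    Returns a list of (term, count) pairs, most frequent first.
    """
    counts = Counter()
    for path in log_paths:
        with Path(path).open(encoding="utf-8") as fh:
            counts.update(line.strip() for line in fh if line.strip())
    return counts.most_common(n)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Two made-up per-server log files.
        logs = [Path(tmp) / "server1.log", Path(tmp) / "server2.log"]
        for term in ["tuttler", "Tuttler", "populus"]:
            log_missed_term(logs[0], term)
        for term in ["tuttler", "populus", "annno"]:
            log_missed_term(logs[1], term)
        print(top_missed_terms(logs, n=2))
```

Whether misspellings should be folded together, or whether all terms rather than only the not-found ones should be logged, are the open questions raised in the thread; the sketch takes no position on either.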

advanced/extended/hypothetical etymologies - Proto-Indo-European

Is it appropriate to add advanced etymological information to words? By this I mean tracing words back to their hypothesized Proto-Indo-European roots, where possible. Thus, joke comes from the Latin iocus, which in turn is the offspring of the Proto-Indo-European iek. I would like to see this, along with a standard format for adding PIE roots and, eventually, structured categories of root words, so that there might be a "Category:PIE_iek" and "Category:LA_iocus" (is "Latin" LA?) and "Category:DE_ger" (the root word for jest is strictly Germanic, if I recall correctly) or something similar for words that likely descended from whatever language roots they may be from. I think this could be an incredibly helpful linguistic tool, and I'd like to help build it slowly and manually in my spare time using what references I have at my disposal. Thoughts, please. Jxn 20:42, 2 April 2006 (UTC)

The problem with all proto languages is that they are highly speculative. Worse, the speculations are largely unverifiable without doing original research, which is outside the scope of Wiktionary. In fact, doing original research is quite hard in itself, since there are virtually no sources to do any research on. Not that it really matters to us, since anything put here must be possible for anybody to verify without doing any original research.
Secondly, I really can't see any reasonable use for it. Looking at List of Indo-European roots just makes me go "OK, and knowing that two seemingly totally different words are related is useful in what way?". Having a category for any words that are cognates because they trace their etymology back to, say, a specific Latin word is one thing. I don't think it is that useful, but at least it is a way to find missing etymologies by looking at the category and realizing that some words are missing. But then you can click on "What links here" on the page for, say, some Latin word if you are really interested.
I think it's great that somebody is interested in adding etymologies, but including proto languages is taking it too far. It would possibly be useful to have "reverse etymologies" on a page of a word or in the appendix to show how words "fan out" from a common "ancestor", but since words inflect or combine with other words to form new ones, this would very likely look too horrible to be of any real use. Let's start worrying about adding normal etymologies first. BTW there is a page Wiktionary:Etymology that badly needs updating. Feel free to update it or discuss any updates on the talk page. --Patrik Stridvall 22:22, 2 April 2006 (UTC)

You can put a reconstructed proto-form in the etymology. (If you do, please spell it according to the source and cite the source, so people don't quibble with it later.) Yes, Latin is la. There are already a few categories like this; equus is in Category:Latin root equ (there are more categories like this on the Latin Wiktionary; try la:Categoria:Radices, specifically la:Categoria:Radices Latinae). A large problem with putting reconstructed forms in page (or category) titles is their instability—the form varies from source to source and theory to theory. It would be best to put it under the attested forms of the root (and then possibly link them together with an infobox style template, similar to the one at la:Categoria:Radice equ). —Muke Tever 22:42, 2 April 2006 (UTC)[reply]

We don't have a policy against including roots as Patrik's comment suggests, and there are quite a few etymologies, e.g. on bark, which include PIE roots. Ncik 22:58, 2 April 2006 (UTC)[reply]

I didn't say that we have a policy against it. But as Muke says "the form varies from source to source and theory to theory", so I think it would be inappropriate to have such things on the pages for English words or, for that matter, on the pages of any living language. Speculation about the further origin when you reach the "end of the line" is OK, so in the case of bark the speculation can go on the page of the Old Norse word with references. So while we don't have a policy right now, I think we should. And that policy should be "Nothing that is too speculative on the pages of living languages". --Patrik Stridvall 09:21, 3 April 2006 (UTC)
What I find more interesting is adding "cognates" in the etymology section, mentioning related words in related languages. I've done this in a couple of instances, experimentally; see eten or accoil. — Vildricianus 12:00, 3 April 2006 (UTC)
Yes, that might be useful. Many, if not almost all, English words of Old Norse origin exist in Swedish and, I suspect, Danish, Norwegian and Icelandic as well. What worries me is that some words of Latin or Greek origin exist in basically all Romance and Germanic languages and possibly also Slavic languages. In any case we probably should try to agree on some appropriate formatting for the cognates. --Patrik Stridvall 14:18, 3 April 2006 (UTC)
That "the form varies from source to source and theory to theory" is not to me a problem. The form of "colorise"/"colorize"/"colourise"/"colourize" also varies from source to source and theory to theory but we have a way to handle it: Alternative spellings. We can handle it even better for these since the sources and theories have names. — 69.19.14.15 19:04, 3 April 2006 (UTC)[reply]
For "colorise"/"colorize"/"colourise"/"colourize" we can and eventually will dig up quotes to support each form. For reconstructed proto-forms this is for obvious reasons not possible. The credibility of the proto-forms depends entirely on the credibility of whoever reconstructed them and can't be verified without doing original research, which is out of the scope of Wiktionary. Actually it is not really possible to verify them at all per any reasonable definition of the word "verify".
You're thinking inside the box. Naturally it is impossible to find quotes for proto-forms in the way that it is for written languages, but it is certainly possible to find citations. A citation for such a form would be to state what dictionary or other scholarly work it is discussed in. Your disclaimer about the credibility of proto-forms is justified and I would recommend a page somewhere to explain this, but original research is not needed, even though Wiktionary, unlike Wikipedia already has plenty of minor original research. — Hippietrail 00:21, 5 April 2006 (UTC)[reply]
Note that I have not argued against having the proto-form on separate pages starting with "*" and I have no problem with pages for dead languages mentioning them as well as linking to them. See also below. -Patrik Stridvall 08:46, 5 April 2006 (UTC)
Filling the pages of dead languages with references to existing theories is one thing, but doing it on living languages is taking it too far. Anybody who has any use for them in research will not trust us as a source, and anybody else will have very little, if any, use for them. Cognates are another matter. It is much easier to remember foreign words that have cognates in your native language than other words. I'm trying to improve my French, and words that have cognates are much easier to learn. Sure, most of the time I can guess myself, but knowing for sure would be even better. I'm not trying to tell anybody how to use his or her time, but please let's spend time on adding the verifiable etymologies first. --Patrik Stridvall 20:01, 3 April 2006 (UTC)
But this is your opinion which you are entitled to. None of your arguments hold up though since they could all be used to deny entry of all kinds of things Wiktionary already includes. And as we all know, contributors spend their time here just exactly how they want to, and they surely always will. — Hippietrail 00:21, 5 April 2006 (UTC)[reply]
It is not about denying entry, it is about keeping the pages of living languages free from things that only are useful to a very small minority. But as I have said, I have no problem with having it on the pages of dead languages. As for policy in general, just because we allow entry for something doesn't mean that we should allow anybody to put it anywhere. All prioritizing regarding presentation is to some extent POV but that is unavoidable. In this case, I think my proposal is a reasonable compromise. --Patrik Stridvall 08:46, 5 April 2006 (UTC)[reply]

I have done a lot of work on etymologies myself, especially on the Old English entries. I like to see a full Etymology section, including on living languages. However, I note that pages are now being created in ‘Proto-Indo-European’. Personally, when I use PIE forms I do not link them; I just put them in italics. They are in my opinion too conjectural and vary too much from authority to authority. So, although I don't really mind if someone wants to create all these pages, I hope care will be taken that existing etymologies will not have their PIE forms changed just to fit in with the ‘standards’ now being created. Widsith 06:12, 4 April 2006 (UTC)

Thanks, all, for the input/information. I would like to add that my preference is not to have the *proto language roots included with the basic definitions of words, but to have possible proto language roots (perhaps even conflicting hypothesized roots from different sources) included in a separate "extended etymology" section or similar at the bottom of the page. I think adding them above the definition would detract from the main purpose of Wiktionary--word meanings, but I think that well-sourced proto roots should be included, if only to help spur thought on the interrelationships between words. I think this could make Wiktionary a much more powerful source of information than most standard dictionaries (and, if people are active and watchful enough, allow it to offer high-quality etymological data that not even the OED includes). What are others' thoughts on a standardized separate section for this type of etymological data, if sourced properly? (With the caveat, of course, that users be somehow informed--perhaps by key words such as "possible" or "theorized"--that these proto roots are not necessarily correct or confirmed.) Jxn 01:59, 8 April 2006 (UTC)

Serbo-Croatian

I noticed that there are a lot of Serbian translations. I was wondering whether it wouldn't make more sense to call them Serbo-Croatian. Is there a policy on this? Ncik 23:34, 2 April 2006 (UTC)[reply]

For political reasons if nothing else, Serbocroatian, Serbian, and Croatian are now distinct languages, with the first being defunct. We certainly welcome Serbocroatian entries but please research them as they will not always be the same as the Serbian entries. — Hippietrail 23:37, 2 April 2006 (UTC)
I am a native speaker of this language and am offering my thoughts on this subject. The Serbo-Croatian language was divided (for political reasons) into Bosnian, Croatian, and Serbian. Entries for Serbo-Croatian do not need to be entered as that language (by law) no longer exists. Each entry should be entered separately (even if it is the same, which is true ~90% of the time) as Bosnian, Croatian, or Serbian respectively. Listing entries as Serbo-Croatian is politically incorrect. Linguistically, it makes sense to group the entries as such; however, I believe that it will cause more harm than good. Entering entries as Serbo-Croatian only complicates things more and would possibly incite NPOV-specific vandalism. A similar situation holds for Hindi and Urdu, which can be linguistically grouped as Hindustani. But we won't go into that. --Dijan 00:58, 3 April 2006 (UTC)
Could it be possible (perhaps for you Dijan?) to develop a page Wiktionary:About Bosnian, Croatian, Serbian and Serbo-Croatian, which clearly states our stance towards this matter? — Vildricianus 11:52, 3 April 2006 (UTC)[reply]
I will try to create such a page as soon as I find some time for it. Thanks for the suggestion. On a related note, I should also include on that page the decision (discussed earlier) to not use unnecessary ligatures in these languages. --Dijan 22:12, 3 April 2006 (UTC)[reply]
Can you advise me which law applies to my country, Australia and which applies to the country I am currently in, Honduras? I'm very surprised these countries had the inclination to make such a law. Should we also remove all entries for Old English since it is also no longer a language by some definition. Since old print dictionaries of Serbo-Croatian have not ceased to exist and our policy and motto is still "all words in all languages", we should support also this defunct language. I'm not forcing anybody to add Serbo-Croatian words of course but I'll be pretty upset if somebody starts deleting any that are already here. — 69.19.14.15 18:58, 3 April 2006 (UTC)[reply]
I am not saying that these entries should be deleted, but rather salvaged and separated into the three appropriate languages. If you add Serbo-Croatian, then what is the point of adding Serbian, Bosnian, and Croatian as separate entries? It creates much unnecessary repetition and creates high probability of NPOV vandalism (this can be witnessed on the Serbo-Croatian Wikipedia which is subject to this type of vandalism almost everyday). --Dijan 22:10, 3 April 2006 (UTC)[reply]
According to the Wikipedia article, there are different views on what Serbocroatian was and whether it exists now. The views depend mostly on whether the person holding them was Bosnian, Croatian, or Serbian. The point of adding S, B, and C as separate entries is that each is now (and presumably before Yugoslavia) its own language with its own standard whereas Serbocroatian when it existed had its own artificial standard which was not a superset of these three all merged together. Where possible, Serbocroatian entries should accord with the standard as it existed during Yugoslavia. If you find any Serbocroatian entries which need to be altered in any way at all to become the 3 separate modern entries, that should be proof enough that it had its differences to the others and we shouldn't actively try to sweep it under the carpet just because history has moved on. — Hippietrail 18:44, 6 April 2006 (UTC)[reply]
Entries for Serbo-Croatian do not need to be entered as that language (by law) no longer exists. This describes sh as a dead language, which Wiktionary has nothing in its policy to discourage. ISO 639-3, by the way, has Serbo-Croatian as a "macrolanguage", a kind of grouping for languages that are often considered to be somehow "the same" but comprise multiple language codes (other examples are Chinese, Arabic, and Nahuatl). —Muke Tever 22:08, 3 April 2006 (UTC)[reply]
I did not mean for it to sound as if it is dead.  :) Of course it is not dead. It has simply, let's say, assumed new identity. But, again for sake of anti-vandalism and needless repetition, it does not need to be included. If you argue that Serbo-Croatian should be included, then why not include Hindustani as well (rather than separating it into Hindi and Urdu...they are the same language after all)? --Dijan 22:17, 3 April 2006 (UTC)[reply]
Because Hindustani doesn't have an ISO 639-3 code? :x) —Muke Tever 22:49, 4 April 2006 (UTC)[reply]

Chinese

On a related note, why are Mandarin, Cantonese, Min, etc. all considered Chinese when they are not so linguistically? Wikipedia notes:

The diversity of Chinese variants is comparable to the Romance languages, and greater than the North Germanic languages. However, owing to China's sociopolitical and cultural situation, whether these variants should be known as "languages" or "dialects" is a subject of ongoing debate.

Are we bowing to political pressure? Continuing:

From a purely descriptive point of view, "languages" and "dialects" are simply arbitrary groups of similar idiolects, and the distinction is irrelevant to linguists who are only concerned with describing regional speeches scientifically.

The NPOV approach would therefore be to list not languages or dialects but the broadest heading to which a word could be said to belong. We have already discovered the utility of a translingual header, and thinking of examples like quotation marks and Chinese symbols shared in Korean and Japanese, I expect the emerging development of other broad headers. On the other hand, it would not be convenient to look up a common English word only to find it under "Germanic Family" or "Midwestern American". It is unfortunate that for practicality the list of languages/dialects needs to be standardized for consistency. Only for the latter reason could I accept the distinction of Serbian and Croatian. However, I have great reservations in applying different standards based solely on politics. The dissimilarity of the flavors of Chinese is clear. If we split some very similar language/dialect classifications then we should split them all. Davilla 19:41, 3 April 2006 (UTC)[reply]

  • I regularly add Mandarin and Min Nan words to Wiktionary so I understand your sentiment. I agree that calling something a dialect vs. a language seems arbitrary at times. Mandarin is as different from Min Nan as French is to Spanish, so why not call Spanish, French, Italian and Romanian dialects of Latin? I wish I had a solution for you, but I don't think a neat solution exists. Language is not like mathematics. There is no one right answer. Language is based on consensus. What is "right" is often what the most number of people can agree upon. In the case of the Chinese section, we simply don't have enough regular contributors to form such a consensus (I'm the only regular contributor that is fluent in one or more Chinese dialects). I hope to attract more regular contributors by adding more words. My strategy has been to focus on words that are familiar to native speakers, but not commonly found in other dictionaries. I believe this is the quickest way to get the attention of the experts.

A-cai 01:44, 5 April 2006 (UTC)[reply]

Far and above any suggestions I have to offer, I appreciate your contributions in these languages. Since it is a political issue I would not make the effort to distinguish languages and dialects and simply drop the hierarchical listing in every case. Even if you consider them to be dialects, they could still be separately listed as "Mandarin" or "Mandarin Chinese", "Cantonese", "Min Nan", etc. just as closely related languages are listed separately. But the voice of regular contributors such as yourself has the greatest weight in the matter. Davilla 20:51, 16 April 2006 (UTC)[reply]

Write a review of Wiktionary

On Wiktionary's Alexa.com webtraffic page, there's a bit at the bottom that says "I am familiar with this website and want to review it on Amazon.com". There are currently no user reviews on Amazon.com for this site, and I was wondering if anyone fancies writing a small piece about it? This link should take you to the sign-in page if you've got an amazon account. I look forward to seeing something up there soon. --Dangherous 09:33, 3 April 2006 (UTC)[reply]


Straw poll: definition-use distinction

This is a straw poll to determine how many people here are opposed in principle to the idea of distinguishing definitions of synonymy from definitions of use. For example, if the definition is "a swear word", then an example of the former would be the phrase four-letter word, and of the latter the interjection Christ. Synonymy maintains that four-letter words and swear words are one and the same. Most definitions are of this type. As for the definition of use, it is the expression Christ, not a Christ itself, that is an expletive. This parallels the use-mention distinction. Other examples include the listed forms of a word such as plurals.

This is a straw poll because no standardized formatting is being attached. In particular this is not a commitment to italicization, which has already gotten a bad rap. What's more interesting to me is the distinction, and I'd like to know if that's worth pursuing. It is unnecessary to comment that it depends on the style because that's a given. Even if the community thinks this is worth pursuing, it may ultimately fail because of lack of consensus on formatting. Davilla 21:48, 3 April 2006 (UTC)[reply]

In favor in principle

  1. Will try to suggest some styles. Personally willing to consider nearly any. Davilla 21:48, 3 April 2006 (UTC)[reply]
  2. Ncik 02:54, 4 April 2006 (UTC)[reply]
  3. --Richardb 13:34, 4 April 2006 (UTC) Odd, I thought we were already clearly differentiating "synonyms" (four-letter word) and "examples of use" (motherfucker), by putting synonyms under a separate heading. Personally I'd argue that synonyms should be put directly under the definition line (just below any usage example). That way it is very easy to distinguish synonyms for the different meanings/definitions. Putting them under some separate heading is damn confusing.[reply]
    And, of course, where there is a big list of synonyms, replace that list with a simple see [[WikiSaurus:xxx]], and put all the synonyms there.
    Hmmm... This isn't what I meant at all. Davilla 21:40, 16 April 2006 (UTC)[reply]

Against in principle

Comments

  • Interesting subject. I think that a dictionary definition ranges somewhere between two extremes, the one being a clear description of what the word itself is, e.g. plural of word (pardon my italics), the other a synonym. Is this what you're talking about or am I wrong? — Vildricianus 22:01, 3 April 2006 (UTC)[reply]
    Yes, I think, but what is the in-between? Usage notes tagged to a definition? Davilla 21:46, 16 April 2006 (UTC)[reply]
Replace "Christ" by "motherfucker", and the example works. Ncik 02:54, 4 April 2006 (UTC)[reply]
The relation of "motherfucker!" to "a swear word" is not one of synonymy but one of hyponymy, and this is already covered in our entry layout explained page. The relation of motherfucker! to a motherfucker is a matter of part of speech—one is an interjection, the other a noun. It's entirely unclear what you're proposing. —Muke Tever 22:42, 4 April 2006 (UTC)[reply]
"Shut up, motherfucker!" is a sentence featuring "motherfucker" as the vocative of the noun "motherfucker" used as a swear word. The definition for this noun sense should not read "somebody who fucks his mother" as this is not the case and usually not even suggested. The definition should be "A swear word." I hope now you understand why italicising the definition would be a good idea. Ncik 23:40, 4 April 2006 (UTC)[reply]
No, suggesting the meaning of the word is exactly why the word is being used. The same goes for calling someone moron or idiot. Not all uses of words are intended to be taken as literal truth, and the most I would expect is a usage note describing it is used as a swear word. —Muke Tever 22:52, 5 April 2006 (UTC)[reply]

Conclusion

It is clear from the above discussion that this is too complex to pursue. Davilla 21:40, 16 April 2006 (UTC)[reply]

The Devil's Dictionary by Ambrose Bierce

This may be somewhat of an off-kilter proposal, but since Ambrose Bierce's Devil's Dictionary is in the public domain, why don't we add his definitions to the articles that exist for words he defines? They are historically significant, and would enliven the place. They could be added in a set-off box to let people know that it's not the "conventional" dictionary definition. Cheers! bd2412 T 04:11, 4 April 2006 (UTC)[reply]

Another solution would be to upload it to wikisource, and link it from Wiktionary. Kipmaster 09:04, 4 April 2006 (UTC)[reply]
Reasonable - if so, the link to WikiSource would have to specify something to the effect that:
 
Wikisource
Wikisource has a humorous definition for this term from The Devil's Dictionary by Ambrose Bierce: see Beer parlour/2006/April

Upon further review, I see that WikiSource has the Devil's Dictionary splayed out by letter of the alphabet, not by individual terms defined (and they're done through the J's). Some of the definitions are best read serially anyhow, vis:

ACADEME, n. An ancient school where morality and philosophy were taught.

ACADEMY, n. [from ACADEME] A modern school where football is taught.

Can we make a template that puts the first letter of the page where {{PAGENAME}} would normally go in this one? bd2412 T 13:08, 4 April 2006 (UTC)[reply]

Sounds interesting, sounds fun. But have you thought of the consequences? Next we'll be wanting links to the Urban Dictionary, and then to all other manner of witty tomes with definitions, which are annually produced by the thousand just before Christmas. So, my vote is "Nice idea, but No". --Richardb 13:23, 4 April 2006 (UTC)[reply]
I respect your opinion, but I never accept slippery slope arguments, as they present the first order of logical fallacy. WikiSource is part of the WikiMedia project, and we already have plenty of articles linked to WikiSource, WikiQuote, and Wikipedia. Why would additional links to WikiSource lead to links to Urban Dictionary? This is not a gateway drug situation; we as a community have the ability to draw lines and abide by them. bd2412 T 14:30, 4 April 2006 (UTC)[reply]
I'd vote no for this, as the joke entries would end up under the ==English== heading, further confounding reuse of Wiktionary data. Additionally, I don't see how these entries would meet our criteria for inclusion (see also the RFV template {{nosecondary}}.) --Connel MacKenzie T C 17:36, 4 April 2006 (UTC)[reply]
Having considered the consequences, I agree that it would be improper to import those definitions here - but what about a link to the Wikisource page? bd2412 T 17:49, 4 April 2006 (UTC)[reply]
I don't know. At first blush that seems OK. Inter-project links are a good thing. --Connel MacKenzie T C 18:29, 4 April 2006 (UTC)[reply]
On looking at the Devil's Dictionary, I'd definitely vote NO. It is pretty pathetic, and no more deserving of links than the many other definition joke books that no doubt are, and will be, available either in WikiSource or elsewhere on-line. --Richardb 14:17, 5 April 2006 (UTC)[reply]
I'm beginning to suspect that you're not 100% behind this idea. bd2412 T 01:28, 11 April 2006 (UTC)[reply]

The Easter Competition 2006 is now open. Anyone may participate. SemperBlotto 09:52, 4 April 2006 (UTC)[reply]

Pages for reconstructed words?

Over the past few months, I've noticed a lot of broken links, and never a working one, for the many reconstructed words that linguists so adore. So a few hours ago I tried making a few such pages, and I think it worked pretty well. These are the first ones on Wiktionary that I'm aware of, and both pages have two Proto-Indo-European roots each: *gerə-, *od-. I also made some categories that seemed to be lacking, including Category:Roots, Category:Proto-Indo-European language, and Category:Reconstructions. Before I spend too much more time making more pages like this (which I'll gladly do if there's any interest), I decided to come here and see if anyone has any suggestions, criticism, comments, etc. regarding these changes and additions. Thoughts? -Silence 10:02, 4 April 2006 (UTC)[reply]

Reconstructed words are not supposed to be linked in etymologies. They (with perhaps a very few exceptions) specifically fail our criteria for headword inclusion. —Muke Tever 22:39, 4 April 2006 (UTC)[reply]
Is there a language code for this language? Does this language have a name (perhaps Proto-Indo-European)? If so, then just put the words in under that language heading. Forget the asterisks. And you'd have to successfully argue (or assert) a change in the Criteria for Inclusion.
If it doesn't have an official language code, then arguably it doesn't belong here, but rather in a WikiBooks text book on Proto-Indo-European roots. IMHO. --Richardb 13:09, 4 April 2006 (UTC)[reply]
Reconstructed languages are specifically excluded from the scope of ISO 639 language codes. —Muke Tever 22:39, 4 April 2006 (UTC)[reply]
Information. There is an Appendix that already deals with this. Maybe it should just stay as an appendix Wiktionary Appendix:Proto-Indo-European roots--Richardb 13:12, 4 April 2006 (UTC)[reply]
"Is there a language code for this language ?" - Not that I know of, but the code "ine" (for "Other Indo-European language") could perhaps work for it. Additionally, just about any linguist in the world will be aware of what language you're talking about if you use the abbreviation PIE, so I don't see any reason we couldn't simply use "PIE" for these purposes, since that doesn't seem to be taken (and probably never will be by anything else, considering the confusion it'd cause) on any of the language code lists I know of. I'm sure that the only reason it doesn't have a code is because it's a reconstructed language, and language codes are generally for attested languages (i.e. ones directly found in speech or writing, not derived from systematic commonalities between existing languages, even when those derivations are extremely sound and near-universally accepted, which is often the case for PIE); it's certainly noteworthy enough, and indeed, increasingly often, many major dictionaries (such as the American Heritage Dictionary) put a great deal of focus on PIE roots (in fact, I bet the main reason most don't is because of space concerns, which isn't an issue for us!), despite AHD being an English-specific, and not general Indo-European, dictionary, so this stuff can be highly valuable even to someone only interested in their own language, and not in comparative linguistics or what-have-you.
"Does this language have a name (perhaps Proto-Indo-European?" - The name that was originally used for it by its native speakers is lost, but Proto-Indo-European is indeed the name that is almost invariably used for this language, yes. (The main other name I've seen used is "Pre-Indo-European".)
"Forget the asterisks." - I'd considered this, but after thinking about the matter, it is my opinion that the asterisks are vitally important, not just for PIE, but for all languages where a form is not directly found in any text, but is just theorized (sometimes very strongly, almost to the point of certainty) to have existed based on etymological evidence. Without the asterisk, which is a very common linguistic convention, that extremely necessary aspect of these words could be lost to casual readers. For example, the Latin verb inodiare is not attested in any source, but we're almost certain that it existed because in odio is attested and later verbs in early French (which later developed into words like "annoy" and "ennui") are very similar both in form and meaning to this, suggesting, based on comparison to similar developments where there are attested "middle stages" for word evolution, that inodiare was probably a Vulgar/Late Latin innovation. But it's still a reconstruction, so we should have a page on it at *inodiare, not inodiare (and since there are no other words spelled the same, a redirect from the latter to the former would probably be merited here), to distinguish it from the numerous Latin verbs that are attested, like amare. The same applies to every language; the convention, already very common on Wiktionary (but not formalized, apparently), of using * before words like Proto-Germanic reconstructions (which I feel should also have their own pages where noteworthy and widely-accepted, for the same reason as PIE), is an important indicator of their nature.
"If it doesn't have an offical language code, then arguably it doesn't belong here, but rather in a WikiBooks text book on Proto-Indo-European roots". IMHO." - I disagree. Anything that belongs in a comprehensive dictionary belongs in Wiktionary, and there's no argument one could make that the most successful ancestor language in the history of mankind (that there is any strong indication existed), which hundreds of thousands of words in dozens of languages have clear links to, isn't a valuable part of any comprehensive dictionary. Delegating PIE to WikiBooks would be rendering it completely useless to Wiktionary, as we couldn't then have separate pages for individual roots and thus couldn't effectively discuss or cite which form to use for each, when forms are disputed and when they're widely-accepted, etc. -Silence 15:13, 4 April 2006 (UTC)[reply]
As I commented somewhere above, I don't see the need for these forms to exist as pages of their own. But if they do go in they should retain the asterisks, which are important markers of their conjectural nature. Widsith 13:15, 4 April 2006 (UTC)[reply]
If not as pages of their own, then as what? I'm certainly willing to help put together a list, but surely you realize that a list has a ridiculous number of limitations, such as being extremely difficult to search through, to link to specific entries in, and to provide detailed information regarding each root. Imagine if instead of having individual entries for English, we just provided an alphabetical list of every English word on a single page, and expected people to track down the word they wanted by scrolling through it. Well, for PIE this would be even worse, since most people will only be familiar with the forms by means of their derivations, and thus an alphabetical list will be nearly useless to all but the most die-hard of linguists. Moreover, many of the phonemes in PIE, such as laryngeals, don't fit in the Latin alphabet and would need to be placed somewhat arbitrarily in such a list. I'm not saying that I oppose using a list, but I oppose only using a list, because it's much less valuable to our readers. -Silence 15:13, 4 April 2006 (UTC)[reply]
I do sympathise with what you're saying, and I have a great interest in PIE studies myself. But one reason why, as you say, ‘most people will only be familiar with the forms by means of their derivations’ is because there is little consensus over what the PIE forms should look like. How will you arbitrate between different authorities on the subject? Don't get me wrong – by all means go for it if you are willing to try and deal with this minefield. I am just not sure how useful it can be while there is so much hypothesis and disagreement involved. Widsith 15:35, 4 April 2006 (UTC)[reply]
  • I wrote a compelling and masterful eight-paragraph response to all the points on this page, elaborating a large number of my earlier comments, explaining the flaws in all the problems and alternatives (such as the list and category) to having individual pages, and overall demonstrating perfectly why including entries for reconstructed roots would be an absolutely fantastic addition to this dictionary that would open up a whole world of new etymological depth and value to readers. But that was all deleted, so, screw that. Can I just say that I'm right and somehow convince you just with that? =_= I do so love discussion and would love more feedback on the specific pages I've tried out above, but right now I want to stab my eyes with glass and nails. I hate the world.
  • To summarize the last paragraph: People aren't familiar with PIE for the same reason they're not familiar with most etymology: because linguistics isn't exactly a casual, everyday interest for most people. How much consensus there is regarding the forms has nothing to do with how well-known they are, and indeed, if everyone already knew all PIE forms, then there'd be no point listing it because it wouldn't be providing anyone new information. There's really less controversy over many PIE forms than you suggest, but we will arbitrate between different authorities in the same way Wikipedia deals with different POVs, and in the same way we already deal with variant forms for things like color and colour: we provide all widely-attested, noteworthy forms of words, and explain why and how they differ. Just look at *gerə- for an example of this (and of the types of references we can use in general). As for "I am just not sure how useful it can be while there is so much hypothesis and disagreement involved."—about as useful as etymology in general is. If absolutes are what you're looking for, you'll probably be disappointed; a form that's 95% likely, rather than 100%, still merits mentioning, as there are so few things in this world that are truly certain. -Silence 17:23, 4 April 2006 (UTC)[reply]

Oh well, I can't see why these don't deserve inclusion. But they'll need to be flagged with something in order to separate them from "standard" entries, they'll need plenty of references and so forth. They should also keep the asterisk. That's what I think, at least. — Vildricianus 09:02, 5 April 2006 (UTC)[reply]

I also think we should have entries for PIE roots as long as those are carefully researched and extensively referenced. Ncik 13:33, 5 April 2006 (UTC)[reply]

  • My understanding was that we do allow these entries, in the Wiktionary Appendix: pseudo-namespace only. (Richardb stated this above, right?) This obviously means they shouldn't be linked from main namespace entries. --Connel MacKenzie T C 13:56, 5 April 2006 (UTC)[reply]
    • I'm willing to negotiate on whether the PIE roots should be in the main encyclopedia-space (which I mainly thought was a good idea because the asterisk already takes care of distinguishing them from normal entries and because it's a lot easier and faster to type than some 20-letter phrase like "Appendix:Proto-Indo-European root" at the start of each page!) or in the Appendix; I can certainly see some advantages to keeping reconstructed forms in a separate namespace. However, regardless of whether they're in the main namespace or the Appendix namespace, it would not be acceptable to not link to those pages! Cross-namespace links are 100% appropriate in this context, as they are directly content-relevant; in fact, hundreds of Wiktionary articles already link to Appendix entries, as if they didn't, how on earth would anyone find the Appendix page they need?! If we were a paper dictionary, we'd direct users to the appropriate Appendix entry when a relevant PIE root was mentioned in an etymology; as an online dictionary, the exact equivalent is to provide a hyperlink leading directly to the appropriate root. I see absolutely no benefit to not linking fully to PIE roots in dictionary entries, where they are directly relevant.
    • Also, one thing that still hasn't been addressed is what to do with reconstructed words that are not PIE, and belong to a language where not all the words are reconstructed: like my example, inodiare (a reconstructed Latin word), above. If the PIE roots go in the appendix, presumably these words will also go in the appendix (where they are noteworthy enough for their own page), but if that's the case, what will the format be for naming them? I can't think of any clear, non-awkward way to consistently name, except for simply "Appendix:*inodiare" (which is pretty convenient, though having the * immediately after the : doesn't look good and could cause the asterisk to be missed). I'm willing to continue this work in Appendix entries, even though I feel it's rather more bureaucracy and convolution than is needed in this case, but only if (1) we decide on a consistent naming scheme for all reconstructed words and roots, not just PIE ones, and (2) we still link to those individual pages from non-Appendix dictionary entries.
    • Also, I agree with Vildricianus. References and the asterisk are both essential. -Silence 14:12, 5 April 2006 (UTC)[reply]
  • The way you have done it in *gerə- seems fine to me, as far as it goes. Perhaps it could have a very simple banner somewhere (by template) that states what Proto-Indo-European is about, and a request for translations to NOT be added, and any other applicable restrictions / instructions.--Richardb 15:33, 5 April 2006 (UTC)[reply]
    • A banner sounds fine to me, as a good way (that doesn't require excessively long page titles like "Wiktionary Appendix:Proto-Indo-European root *seH₂wel-" (vs. simply "*seH₂wel-" or "*sāwel-")) to make it clear to readers that this is about a reconstructed proto-language root, not an attested form; it could be used to explain the "*" terminology, and perhaps a link to an Appendix page that gives general information on PIE phonetics.
    • I understand concerns with including non-attested forms in the dictionary (though they are attested in the sense of being recorded by numerous reputable and noteworthy publications; they just aren't directly attested in their original language), but I just feel that purely as a matter of practicality and user-friendliness, "*" is just as good as "Wiktionary Appendix: Proto-Indo-European root *" for telling readers that a certain page is about a reconstructed root, and * has the advantage of being infinitely more concise (and thus both easy to link to and to search for on Wiktionary, requiring fewer redirects and piped elaborate links overall) while still carrying the data "reconstructed form" in its title in the form of the simple * (since, as far as I know, we don't use "*" for anything else in titles except *). However, if there's more support for keeping reconstructions in the Wiktionary Appendix: namespace, I'll go along with that, as long as we can decide on how to name (and organize) pages for all reconstructed roots, including ones that aren't PIE. Really, everything gets a lot more complicated if we decide to reserve reconstructed forms for the dictionary's appendix, since we also have to decide on what to change in the layout (i.e. if we have the name of the language in the page title, as is the case with "Wiktionary Appendix:Proto-Indo-European root *X-", we presumably won't also use a section header with the language's name within the page; also, what's to be done when there's more than one reconstructed root with the same form, which is the case with quite a few PIE roots, including the two prototypes I created above?) and deal with all sorts of novel issues, when we already have a well-established, working system at hand in the form of ordinary dictionary-style entries.
    • The only thing I disagree with is "a request for translations to NOT be added"; I don't see why a banner on PIE roots (or reconstructed roots in general) would need such a disclaimer. If a significant translation is lacking, it should certainly be added (preferably with a citation for verification), as is the case with all information on all pages. -Silence 16:55, 5 April 2006 (UTC)[reply]
I also don't see what's wrong with translations. For the time being we should allow all sorts of stuff to be added, and then, if problems arise, we can discuss again. But imposing restrictive policies before having any practical experience seems unreasonable. Ncik 02:41, 9 April 2006 (UTC)[reply]
  • The asterisk character is used and does have special meaning. It is unacceptable (for this) in the main namespace. Categories, in particular, are hosed by these entries by default. And not having an ISO 639 code means they do not meet our criteria for inclusion. Therefore, having them with the prefix "Wiktionary Appendix:Proto-Indo-European root *" is appropriate. Having them in the main namespace without that prefix is not appropriate.
I'm absolutely in favour of having these entries in the main namespace. If prefixing hypothesised words with an asterisk causes technical problems, we should consider dropping it, though. The criteria for inclusion need to be modified. Ncik 02:41, 9 April 2006 (UTC)[reply]
Hypothesised words should of course be linked. That's common sense and we've always done so. Ncik 02:41, 9 April 2006 (UTC)[reply]
Ncik, that is simply not true. We include all words in all languages. But to be included here, it has to be a word. A theory about a word is not a word. --Connel MacKenzie T C 02:49, 9 April 2006 (UTC)[reply]
As I said, the criteria for inclusion might need to be changed. Arguably, a hypothesised word is a word. Ncik 03:09, 9 April 2006 (UTC)[reply]
Well then, go convince Eclecticology. These are certainly more questionable than say Klingon, or Quenya. But even then, it is only a theory being attested, not a word. --Connel MacKenzie T C 06:58, 9 April 2006 (UTC) (edit) 07:03, 9 April 2006 (UTC)[reply]
I haven't really stepped in on this yet, but I principally agree with Connel. These PIE hypotheses have no place in the main namespace. Even as an appendix I have doubts. At the very least any such entry for a PIE form should be properly verifiable to ensure that these spaces are not being used for someone's original research on his pet theories of linguistic roots. Eclecticology 18:44, 10 April 2006 (UTC)[reply]
  • Since this is obviously an issue that will require a lot of discussion before any sort of consensus can be reached, and since I don't want to go too far (even though I'm eager to waste hours of time creating such pages and linking to them from their derivative pages as soon as possible) with working on PIE roots before there's any agreement on how (or even if) we're to use them, I've created a new page to centralize discussion on this issue so we can eventually hammer out an agreement on this. It's at Wiktionary:Reconstructed terms, which is currently just a place to discuss, but may eventually become a Wiktionary guideline if we can work out an agreement on how Wiktionary should handle unattested, reconstructed words, roots, and phrases.

Pages for entries with no English equivalent

Many English words and phrases have no equivalent in other languages, and we can provide translations of these using a gloss or explanation in those foreign languages where a translation is lacking.

In the reverse case, where a foreign phrase has no English equivalent, how can we create English pages for these?

For example, the French words "tutoyer" and "vouvoyer" mean "to address someone using the 'tu' form" and "...using the 'vous' form" respectively, and have no English translation because there are (no longer) informal and formal forms of "you". Equivalents do exist in languages where there are an informal and a formal form of "you"; for example, Italian has "dare del tu" and "dare del Lei" respectively (see the "Related verbs, nouns and pronouns" section of the Wikipedia page on the T-V distinction, where examples are given for various other languages as well).

As there are equivalents in various languages, in order to collect this information into a single page, I think we could have a page that gives translations and possibly an English explanation rather than a definition, but the issue would be how to title it.

title? Something like "address someone using the informal form of 'you'"

Explanation...

Translations

How do others think this could be done? I think it is worth doing because of the existence of equivalents in many other languages. We don't have translations from non-English languages to non-English languages, and, while it would be convenient, it would be POV to select a particular language and give all the translations there. — Paul G 10:33, 4 April 2006 (UTC)[reply]

Well, in this particular case I was wondering if there isn't a verb to thou. There is a Yorkshire saying (that tries to explain when to use thou) - "Tha' thous they that thous thee" i.e. You are only allowed to address someone as thou, if they do the same to you first. But in general, some sort of placeholder entry seems reasonable. SemperBlotto 10:43, 4 April 2006 (UTC)[reply]

Shakespeare also uses ‘thou’ as a verb. Widsith 13:13, 4 April 2006 (UTC)[reply]
I also use "thou" as a verb, but I'm not published yet. — Vildricianus 08:56, 5 April 2006 (UTC)[reply]
I’ve certainly heard "to thee and thou someone," as in "the King pushed his familiarity with the Grand Duke so far as to thee and thou him." I can’t think of an equivalent expression for "to you someone" however. Of course, tutoyer doesn’t really mean the same as "to thee and thou," and rarely would it be translated that way. —Stephen 18:48, 6 April 2006 (UTC)[reply]
Why not just put the word in as "tutoyer", put French as the language, and then put a definition in English ? Is that just too obvious ?
Heck, the Eskimos have something like 15 words for snow, but we don't need them indexed to us in English, because we don't need to look them up in English. But, if someone knows those words they are free to enter them in Wiktionary under their native language, and put in a definition of the word, in English. So why would we need to have an English word or phrase invented just so we can find it? I'm sure the same goes for desert people, coral island peoples etc. We don't have gigantic waves in England, so we adopted the word tsunami from Japanese. Should we have an entry for "giant wave", just so we can find the Japanese translation "tsunami"? Or do we just learn to use the word tsunami? Same goes for avalanche, siesta, verandah etc.
Now, if you are French, my understanding is the Academie believes there should be a French word for everything. Won't allow words like "jeans" or "camping" or "le weekend" to be used in French, so they invent words. But English is not like that. I think you are trying to use a French Academie concept to have us invent words in English. It's just not necessary, in English. We adopt and adapt instead. If the concept "tutoyer" is going to be useful to any English speaker, then he'll probably use the word "tutoyer".
Anyway, if you do invent some word or phrase to do the job, I'd love to see how that fares against the good old Criteria for Inclusion!

As to collecting the translations of the same non-English concept into one place, do it in the Wiktionary of one of those languages. Anyone who is interested enough is surely going to be using one of those languages. It's enough that we have the definition in the English Wiktionary, in English, under the word in one of the other languages.--Richardb 12:54, 4 April 2006 (UTC)[reply]

I support this proposal, or something along these lines, perhaps in a different namespace. I should be able to find dare del tu from tutoyer without going through any other language version of wiktionary. It needs to be clear that the title is not actually a word, so maybe something like appendix:address someone using the informal form of 'you'. Kappa 18:00, 4 April 2006 (UTC)[reply]
I don't agree with you, Kappa. IMHO the other wiktionaries are there for exactly this reason: that the translations between fr: and it: (for example) should not be included in the English Wiktionary, as they wouldn't be of any special importance for anyone who speaks neither French nor Italian. And why do you think it is disadvantageous to have to refer to another wiktionary? \Mike 09:38, 5 April 2006 (UTC)[reply]
Other wiktionaries are not in English, and I think their main reason for existing is to provide a dictionary in a language their users can understand. You are asking me to navigate a website when I may only know a single word of the language. What if there is more than one definition of the word I'm looking up? Kappa 01:59, 6 April 2006 (UTC)[reply]

I don't see what Paul's problem is either. Just provide a definition for the phrase in English. Or are you afraid this might not be in accordance with "our" (this excludes at least me) "policy" (idiotic rule) of not having definitions, only translations, for non-English entries? Ncik 23:58, 4 April 2006 (UTC)[reply]

Ncik, what makes you say we have a policy of not having definitions for non-English entries ? These are standard, part of our reason for being. Where did you get the idea they are prohibited. 'Cos if there is something to that effect anywhere, it needs to be corrected.--Richardb 15:13, 5 April 2006 (UTC)[reply]
Actually it's a pretty common idea (not mine either) that for foreign-language entries, a definition should not be given, merely the English equivalent. So for Wein not "a beverage produced from fermented grapes" or even "wine; a beverage produced from fermented grapes" (IMHO the most useful form) but only "wine". —Muke Tever 22:49, 5 April 2006 (UTC)[reply]
I agree that "wine; a beverage produced from fermented grapes", or something similar, is indeed often the most useful form for such cases. It also seems to be a relatively common practice; just yesterday I stumbled upon gula, which is translated/defined as "yolk, the yellow part of an egg". At the very least, though, wine should obviously be hyperlinked, so even if we don't provide the full definition we make it easy to track it down. -Silence 23:14, 5 April 2006 (UTC)[reply]
Silence, you may well have encountered such a page, but the fact that there are hundreds of other pages that do not do this means that it is the exception rather than the rule.
Postscript: I see what has happened here. Someone created many Swedish entries with definitions where translations (perhaps with glosses) would have been sufficient, and contrary to what is being done for other non-English languages.
My understanding of the policy is that we give translations rather than definitions for non-English words. The reason for this is to avoid duplication and inconsistency (as is the reason for many other Wiktionary policies). For example, the English word "taxi" is used by many other languages. It makes no sense to give the full definition for every language on the page for taxi when it is there in the English section (we aim to avoid duplication). If someone makes a change to the English definition, you can bet your sweet bippy that they won't update the definitions for the non-English languages (we aim to avoid inconsistency).
When a simple translation is insufficient to indicate which sense of the word is intended, a gloss (a very brief explanation) can be added in italics to clarify the sense; for example: "wine (drink)", if there are other senses of "wine". Using a gloss is what I prefer to do (and is what print dictionaries do). I don't think it is official policy, but it makes sense. Again, a gloss rather than a fully-fledged definition avoids duplication and potential inconsistency.
Finally, translations certainly should be wikified, unless they are phrases that are non-idiomatic, in which case the individual words should be wikified. — Paul G 09:25, 6 April 2006 (UTC)[reply]
A gloss is certainly better than just the word alone, even if there are not (yet) any additional senses of a word to confuse. But the full definition should not by any means be discouraged. The original question concerned translations that don't exist. But if they do exist, translations can't be exact in every case. Machines can already do at least 90% and everyone knows they're total rubbish. Even 99% accuracy is not good enough. Allowing the entire definition avoids errors and leads to the discovery of differences many would not have suspected. We usually think of a wine as coming from grapes, but alcohol derived from other fruits can also be called wines in English. And that's a pretty standard drink. What about foods that aren't the same everywhere in the world? You really can't go too far with this. I have to insist when a new class of Chinese students infallibly claims that you can't eat soup, but who would know better from reading the AHD definitions? Davilla 17:46, 6 April 2006 (UTC)[reply]
The idea is that the gloss, definition or whatever should enable the user to determine which of the definitions at the entry for the translation is referred to. So "drink" is probably not sufficient for "wine", but I'm sure you see the point. The translation effectively becomes readable as "wine, in the sense of the alcoholic drink, as opposed to any other sense of the word". — Paul G 11:22, 7 April 2006 (UTC)[reply]
How quickly we come around! When I arrived not so long ago I was arguing that we shouldn't have foreign language entries at all. Well then it shouldn't be too hard for me to make a compromise toward the accepted style.
I can see wanting to keep the definition of foreign words short, but rather than saying a gloss should be used when necessary, glosses should be encouraged. It doesn't hurt to be too careful, as most words can easily have several meanings. Otherwise we end up with the current situation, where they're rarely included. Also when a gloss really is necessary, we should encourage using more than one word to describe it. Not a full definition, per se, just enough to be sure. On the other hand, I still contend that full definitions are necessary in some cases. Davilla 18:12, 23 April 2006 (UTC)[reply]

Back to the point at hand.

It is not appropriate to pick a foreign language and put all the translations there as this shows bias.

The comment about "tsunami", "verandah", etc, is irrelevant as these are now naturalised words in English. They were borrowed/transliterated/translated from other languages because there was a need for them in English. "Tutoyer" does not work this way - the word is not used in English, and there is no single word in English that has the same meaning ("thou" being used, perhaps, by Shakespeare and Yorkshire folk, but not by contemporary linguists as an equivalent of "tutoyer"). Instead, a phrase such as "to address someone using 'tu'" or "to address someone using the informal form" is used. There is no English linguistic equivalent because there is no longer any English cultural equivalent.

The Wikipedia page I referred to shows that several languages have this concept, so it is appropriate to collect them together on a single Wiktionary page, just as Wikipedia does under "T-V distinction". The question, to my mind, is not whether we should do this, but how. It isn't sufficient to say "oh well, someone will look up tutoyer if they are interested, and we should put the translations here" because that page will only give a translation into English, and what if they think to look up, say, the Italian page rather than the French page?

Ncik, I think you're missing the point: we can certainly translate/define "tutoyer" on the page for that word, but where should the English page be that gives the translations "tutoyer", "dare del tu", etc?

I like Kappa's idea of using an appendix page (perhaps in the Wiktionary namespace rather than the Wiktionary Appendix one) that all of the translations would cross-refer to. Someone looking up "tutoyer" could then find equivalents in other languages.

(By the way, it is somewhat of a myth that the Inuits have n words for snow (and no one seems to agree just what number n is anyhow); they have words for different kinds of snow, just as there are phrases for different types of snow in English ("powdery snow", etc - ask a skier).) — Paul G 09:41, 6 April 2006 (UTC)[reply]

It is good to know that Paul G and Connel (with whom I fought edit wars on ELE about this) get isolated with their simplistic views on senses of non-English words. I would also like to note that most of the examples that came up above were nouns, which are the easiest (yet still too complex for not needing to be equipped with proper definitions) cases. Adjectives, verbs, prepositions, etc. are considerably more difficult to describe. Ncik 00:39, 7 April 2006 (UTC)[reply]
Ncik, putting aside your sarcasm (I'm sure you don't really think it is good), I don't understand what you mean by your comment. Please could you explain:
  • How are my views and Connel's on non-English words simplistic?
Assuming that any sense a non-English word can take is a sense some English word has, and hence that a simple translation will do, is blatantly nonsense. Virtually any non-English word I can think of requires its own, complete definition. "Simplistic" is the appropriate word for your views I'd say. Ncik 03:04, 9 April 2006 (UTC)[reply]
  • What are Connel and I overlooking?
Apparently all the arguments other people and I have brought up in previous discussions about this. Ncik 03:04, 9 April 2006 (UTC)[reply]
  • What is wrong with considering what we are discussing here?
Nothing. Considering things is always good. Ncik 03:04, 9 April 2006 (UTC)[reply]
  • How do you propose we handle what we are discussing here?
For the initial topic of the thread, see my remark from the 4th of April. For foreign language definitions, my comments right above apply. Ncik 03:04, 9 April 2006 (UTC)[reply]
  • Do you think there is no need to discuss what we are discussing here?
There probably is need for discussion. Ncik 03:04, 9 April 2006 (UTC)[reply]
I want to understand so that we can make progress on this issue. Thanks. — Paul G 11:18, 7 April 2006 (UTC)[reply]

To paraphrase, the problem is this: foreign word X has no direct equivalent in English, but it does have equivalents in other foreign languages; should we collate these equivalents on some page for English-speakers? I think there are 4 possible answers.

  1. No, we shouldn't. Let people look on Wiktionnaire etc. if they're interested.
  2. Yes, we should use an archaic or dialectal equivalent if available (thou in the case of tutoyer).
    • I doubt the specific word "thou" is the best solution in this case, but in general, when there may be no equivalent whatsoever, there could still be a WikiSaurus entry, just as there will certainly be a WiktionaryZ entry. Picking a single title is a more general problem for WikiSaurus guidelines. Davilla 21:15, 16 April 2006 (UTC)[reply]
  3. Yes, we should in this instance break convention and add translations to a foreign term (so tutoyer would be followed by a translation table for Italian and others).
  4. Yes, we should do this using some kind of Appendix or =Equivalent terms= heading, the details of which can be hammered out here.

In my opinion, option 2 should happen as a matter of course. Option 3 is probably also helpful. Option 4 I think is unnecessary, but for that matter it isn't likely to upset anyone either. Widsith 14:29, 9 April 2006 (UTC)[reply]

  • Ncik's attacks I think are uncalled for. He (still/again) is ignoring the relevant subtlety: the English Wiktionary is for English readers. The nonsensical definition for wine given above, while technically accurate, is pointlessly verbose; providing the "simple" gloss conveys much more information. A comprehensible gloss is far more powerful than a long drawn-out technical description (while finessing redundancy and synchronization issues.)
The English Wiktionary is for English readers. But it is not for misinforming English readers on non-English words. Ncik 17:20, 9 April 2006 (UTC)[reply]
The solution for "tutoyer" is to give a secondary description (see #Layout and #Straw poll: defintion-use distinction for what I mean by this) of the word, and the best possible translations in the "Translations" section (in the case of English it's probably thou). Ncik 17:20, 9 April 2006 (UTC)[reply]
  • There's a lot to read through and I'm short of time so I'll just say what to me makes sense and what I've been doing all along. Print bilingual dictionaries try to define terms so that the user will understand them. Sometimes one gloss (one word in the other language) is enough. Sometimes the gloss will be accompanied by a hint in parentheses to disambiguate from several possible senses in the other language. Sometimes several glosses are better, especially when the two languages have quite different semantic range for the terms involved. Other times there is simply no gloss and the term is defined. Anything more simplistic than this obvious approach is, well, simplistic. In the case of tutear and the likes, I've come across this quite a few times reading world literature in English translation and I've seen it handled in various ways - never have I seen it translated as to thou - that would be absurd! — Hippietrail 17:52, 9 April 2006 (UTC)[reply]
    • The discussion of how to treat translations is valid and interesting but belongs in another thread. I'll repeat my view: that a gloss should be given where the English word has more than one sense and need not give any more information than is necessary to distinguish which sense is intended. I see no point in giving a full definition. Note that current practice doesn't even go this far: most pages for foreign-language words just give a translation with no gloss or definition, which is often inadequate, IMO.
Does the following language encourage the current practice? "A translation into English should normally be given instead of a definition." This description in WT:EL could probably be amended. Edit war starts now! Davilla 05:27, 18 April 2006 (UTC)[reply]
I can only repeat myself: The point of giving a full definition is that different words have different senses. Assuming that English is the mother of all languages and that any other language has simply taken meanings of English words and redistributed them among its own words (i.e. sequences of letters), and hence that a simple reference to an existing English sense will suffice as a definition for any non-English word, is plainly wrong. Ncik 11:24, 10 April 2006 (UTC)[reply]
I see your point, and this is entirely appropriate when this occurs. However, in very many cases (possibly the majority), the word in another language has exactly the same referent as the English word, so a gloss to distinguish which of the English senses is intended is then sufficient. "Wine" is an example - French "vin", Italian "vino" and Portuguese "vinho" all refer to exactly the same concept, so a gloss to give the equivalent English sense is sufficient here. We don't expand any further than that because we are the English Wiktionary and definitions in other languages are given by other wiktionaries.
    • It is certainly appropriate to give "tutoyer", etc, as translations of "to thou", but not vice versa. Suggesting that the phrase "On se tutoie?" (used by a French person to ask if he or she may address the person being spoken to using the informal form of "you") can always be translated into English as something like "Can we thou each other?" is wrong, as this is unidiomatic and unlikely to have been used by Shakespeare or be used in Yorkshire.
Nobody suggests any translation will be appropriate in all possible contexts. They are the best available approximants, just like synonyms. If nothing reasonable can be found, the definition has to suffice, and one doesn't give translations (again, exactly as with synonyms). Whether it makes sense to list "to thou" as a translation of "tutoyer" is not a general issue, but a specific one that can be discussed in the Tea Room. Ncik 11:24, 10 April 2006 (UTC)[reply]
I agree, but I was simply taking that as an example. The point is that where no English equivalent exists, we can translate from the foreign language(s) into English by using a wordy explanation, but we can't currently provide translations in the opposite direction because it is unclear where these would go. My original question was how we can go about providing a page that lists translations from the English explanation back into the foreign languages, taking "tutoyer" as an example. — Paul G 13:56, 10 April 2006 (UTC)[reply]
    • Perhaps the solution is to list the translations on the page for the verb "to thou" and to give links to the foreign-language pages. The entries for the foreign-language words in the English Wiktionary could include cross-references to "thou". How does this sound as a solution to the problem? — Paul G 09:14, 10 April 2006 (UTC)[reply]
  • Am I missing something? When did English ever have a formal vs. informal second person singular pronoun? As far as I am aware the distinction between "you" vs. "thou" was formerly plural vs. singular and is now standard vs. archaic. In no way would "to thou" be an equivalent of "tutear" unless there are senses that have eluded me. — Hippietrail 15:59, 10 April 2006 (UTC)[reply]
  • Yeah - you're missing something. It's true that you was originally plural and thou singular (from OE ēow vs. þu), but from early in the Middle English period the plural was used (in imitation of French custom) as a polite singular. That is the way it remained until a couple of hundred years ago or so. This discussion seems doomed, by the way... Widsith 20:57, 10 April 2006 (UTC)[reply]

Having just read through all the above I'm not convinced that we need any special listings for foreign words that do not have a one word translation. A translation can just as easily be a phrase as a word. At other times a full description may be necessary. Translating tutoyer as to thou would be technically correct; the same tone of impoliteness is there when it is used by Shakespeare's Sir Toby Belch. The result just sounds weird to the modern English speaker. The need to translate the concept from English to French is not there if the concept is not a normal part of English. A person translating the English you into French or some other language still needs to be sensitive to practice in the target language. Our entries are for words that exist rather than ones that don't exist. Eclecticology 23:37, 10 April 2006 (UTC)[reply]

  • Here's an example of how one real dictionary handles the Spanish tutear:
    tutear(se) vt, vp be on first-name terms (with sb)
    Note that that does make a reasonably close cultural equivalent but personally I feel it's not really a direct translation. — 216.72.195.201 00:35, 11 April 2006 (UTC)[reply]
Indeed, my Collins French-English dictionary gives this as an approximate translation of "tutoyer". — Paul G 15:26, 20 April 2006 (UTC)[reply]

1910 Black's Law Dictionary

The 1910 edition of Black's Law Dictionary is in the public domain, and is available on CD-ROM - I'm waiting to see if I can get one on eBay right now [1], but if someone else can track a free one down, that would be spiffy. bd2412 T 22:19, 4 April 2006 (UTC)[reply]

The seller actually has a bunch of them posted. I sent him this note:
Greetings! Would you consider donating this CD to Wikipedia? Since Black's Law 2d is in the public domain, we are going to post the entire thing on the internet and let the world have it for free. Cheers!
Hope he gets the hint that his product is soon to become obsolete anyway. bd2412 T 04:51, 6 April 2006 (UTC)[reply]
Wait a second, isn't only the 1891 PD right now? I thought the time period was 100 years. --Rory096 00:45, 7 April 2006 (UTC)[reply]
No sir, because of the timing of the passage of various copyright laws, "the U. S. copyright in any work published or copyrighted prior to January 1, 1923, has expired by operation of law, and the work has permanently fallen into the public domain in the United States." [2] Enjoy your public domain freedom! bd2412 T 01:18, 7 April 2006 (UTC)[reply]

Suggestion for improved user navigation within a MediaWiki page.

Copy of a post I made in the Wikitech newsgroup. (Is this the right place to make such suggestions ?) Any comments, advice, info ?

Hi, I'm a newbie to Wikitech, so I apologise if I
a) don't get things right first time
b) am being bold enough to make a suggestion for improvement before I really know much about Wikitech.

Background

  • been contributing to Wiktionary for a year or two. Administrator. Mostly interested in improving the organisation, the process, policies etc, rather than the content
  • 28 years in IT, mostly as business process analyst, interfacing between "dumb" users and "nerdy" techos.

Observed Problem

The great majority of pages on Wikipedia and Wiktionary can be very long, with lots you want to skip over (probably more so in Wiktionary), and there is very little help in navigating through a long page.

At best there is a Table of Contents at the top. Which, as soon as you click on it, disappears as you drop down the page. Great !

Suggested Improvement(s)

1. Make the article collapsible by levels.

  • Make some extra tabs along the top that show -All-, -1-, -2-, -3-, -4-, maybe more
  • If you click on -1-, then the article will collapse to show only the Level 1 headings, with a little + (plus) sign to the left hand side, and a ++ sign
  • If you click on -2-, then it collapses to level 1 and level 2 headings, etc. If you click on -All- then the article is shown in full, with no collapsing of levels.
  • If you click on the + sign against a heading, it will expand the content below that heading to the next level of headings.
  • If you click on the ++ sign against a heading, it will expand all levels and content below that heading.
  • When a level has been expanded, then a little - (minus) sign will replace the + sign. If you click on this the content will collapse to just that heading.
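A rough sketch of how the level buttons and the + / ++ signs in suggestion 1 could be computed from the headings alone. This is only an illustration and not MediaWiki code: the names `Heading`, `parse_headings`, `collapse_to_level`, `expand_one` and `expand_all` are made up for the example, and the heading parser is a simplification that only looks at lines of the form `==Title==`.

```python
import re
from dataclasses import dataclass


@dataclass
class Heading:
    level: int  # 1 for "=Title=", 2 for "==Title==", and so on
    title: str


# Simplified heading pattern: equals signs, a title, equals signs.
HEADING_RE = re.compile(r"^(=+)\s*(.+?)\s*(=+)\s*$")


def parse_headings(wikitext: str) -> list[Heading]:
    """Collect the headings of a page in document order."""
    headings = []
    for line in wikitext.splitlines():
        match = HEADING_RE.match(line)
        if match:
            level = min(len(match.group(1)), len(match.group(3)))
            headings.append(Heading(level, match.group(2)))
    return headings


def collapse_to_level(headings: list[Heading], max_level: int) -> list[Heading]:
    """The -1-, -2-, ... buttons: keep only headings down to max_level."""
    return [h for h in headings if h.level <= max_level]


def _section(headings: list[Heading], index: int) -> list[Heading]:
    """Headings below headings[index], up to the next heading of the same or a higher rank."""
    parent = headings[index]
    below = []
    for h in headings[index + 1:]:
        if h.level <= parent.level:
            break
        below.append(h)
    return below


def expand_one(headings: list[Heading], index: int) -> list[Heading]:
    """The + sign: show only the next level of headings below this one."""
    parent = headings[index]
    return [h for h in _section(headings, index) if h.level == parent.level + 1]


def expand_all(headings: list[Heading], index: int) -> list[Heading]:
    """The ++ sign: show every heading below this one."""
    return _section(headings, index)
```

For a page with `==English==`, `===Noun===`, `====Usage notes====`, `===Verb===` and then `==French==`, clicking -2- would leave only English and French; + on English would reveal Noun and Verb; ++ on English would also reveal Usage notes. The point is that nothing beyond the existing heading levels is needed as input.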

2. Find a way to keep the Table of Contents visible on the screen at all times

  • New Window - Maybe a pop-up window, or an option to click on a spot at the top of TOC to show the TOC in a new window.
  • Frame - Maybe put the TOC in a small frame in the left-hand side-bar (use a couple of frames to keep all that sidebar info constantly visible, within a scrolling frame).

3. Both of the above

4. Both the above, plus have the Table of Contents also collapsible in the same way.

Benefits

  • A user can drill down in an article, expanding and collapsing at will.
rather than - going to the top of the page, clicking on a TOC entry, viewing that, not finding what they want, going back to the TOC, at the top, trying another TOC line.
  • Or, with the TOC in a frame in the left hand border, the TOC is always visible (and better if collapsible), thus making the second type of navigation far less tiresome.

How to implement

1. Firstly, and most importantly, users do not have to write their articles differently.

The whole thing is driven by the currently used heading levels.
Indeed, some of the complexities that are gone into to put some additional navigation aids on a page would become unnecessary, making it less complicated to write big pages, because the within-the-page navigation is automated from the normal content.

2. Maybe users could choose, by profile, to start either with articles fully expanded, or fully collapsed.

3. The default for low-tech users would be fully expanded, so no different from now (except for the + and - signs and the -All-, -1-, -2- tabs being there if they want to click on them).

4. The general concept of little + and - signs against headings is already widely used, and would need no introduction for many, even most users to be able to make use of it straightaway without any notification or education effort. For those who are very unsavvy, then they would just continue as now.

5. The collapsing / expanding by clicking on + and - signs is everywhere. I'm sure even early versions of Front Page had it, Microsoft has it within its web site, it's used in Excel, used within the document map of Word etc etc etc. Must be pretty easy to find out how it was done for these.

6. It impacts only the display of a page. There is nothing different going on elsewhere in the background, in the infrastructure of the off screen processes. I'm not sure how this impacts style sheets, since I have never really bothered to grasp those.

OK. Hope this suggestion is food for thought in improving the user experience of the fantastic MediaWiki software.

Richardb of Wiktionary. --Richardb 14:12, 5 April 2006 (UTC)[reply]

I've created this page to have a common ground to discuss improvements to WikiSaurus. Some activity is happening. Hope you are interested.--Richardb 09:19, 2 April 2006 (UTC)[reply]

There's been a bit of discussion, and I've put up a compromise proposal. See Wiktionary:Project_-_WikiSaurus_improvement_1#Compromise_Proposal_RB_2006_April_5th --Richardb 15:04, 5 April 2006 (UTC)[reply]

AOL Notice at the top of Wiktionary

Hi, I just followed the link to IRC #wiktionary and it was hard to find more information about why AOL is on HTTPS. #wikipedia helped me piece together an answer. I imagine I'm not the only one who wonders. Can't this notice be changed to link to a technical explanation of how HTTPS bypasses AOL's proxy and thus lets AOL users be banned/tracked by IP like the rest of the Internet? Also, how come Wikipedia isn't doing this? Is there an internal discussion about this policy? Also, why can't the cert be one that my browser automatically recognizes? I'd like to know more!! 67.182.158.151 04:55, 6 April 2006 (UTC)[reply]

Essentially, if you are using non-HTTPS we don't see what your true IP address is; we see the IP of an AOL proxy server. Because of repeat vandalism from AOL, we took the slightly drastic step (it was that bad) of blocking the AOL proxy IP addresses. Why Wikipedia is not doing this, well, mostly because it would require a lot more servers and the problem isn't quite as bad there; we only have a few admins here and often only one RC patroller. As for the certificate, if you go to CACert.org you can download their root key so it won't prompt you. To use a "recognized" key would cost us around $250 and it's $250 we don't have -- Tawker 06:30, 6 April 2006 (UTC)[reply]
Thanks for filling me in. Now can we change the template at the top of the page from "For more info, visit IRC." to "For more info, visit Wiktionary:Beer parlour#AOL Notice at the top of Wiktionary" or, better, move this convo to a new page and link to that? I will gladly edit that page until it is a well-balanced explanation that does Wiktionary proud. Hell, I might even register an account. --67.182.158.151 21:30, 6 April 2006 (UTC)[reply]
It would be best if everyone registered an account; it only takes about 30 seconds (if you are slow coming up with a name), and it makes it easier for everyone. - TheDaveRoss 21:52, 6 April 2006 (UTC)[reply]


Bouncebackability, and the Collins living dictionary.

I was thinking about the amount of time and effort and enthusiasm we waste arguing over protologisms, and self-promotional words like bouncebackability. Maybe we could learn something from the Collins dictionary, which has created a sort of secondary dictionary, the Living Dictionary, which allows words like bouncebackability.

Perhaps we should have some similar split within Wiktionary -

The Main Wiktionary - as comprehensive and learned as we can make it, with a tough CFI
The "Up-to-Date" Wiktionary (or some such name), which is far less learned, far more populist, far less tough on the CFI.

I can see some benefits. It lets us be tougher on the Main Wiktionary, while being more encouraging for people just getting into this sort of thing.

I don't have any proposal of how to do it, but was just musing over whether we should consider it ?--Richardb 12:09, 7 April 2006 (UTC)[reply]

I worry that the "Up-to-Date" Wiktionary will become another "Urban Dictionary", which will hurt the credibility of the project as a whole. bd2412 T 13:16, 7 April 2006 (UTC)[reply]
Has the Living Dictionary hurt the credibility of Collins? By keeping our version of the Living Dictionary/Urban Dictionary separate, we can encourage participation, while isolating the dross. We are not exactly over-run with participants to help us build this Wiktionary. Perhaps we should be more friendly to possible future converts. --Richardb 13:43, 14 April 2006 (UTC)[reply]
  • Just create a new category system that lists words by their earliest verifiably-attested date, and institute the categories especially for neologisms from the last few years. That'll distinguish new words in a non-controversial (as opposed to trying to objectively judge what is or isn't "Living Dictionary" cruft) and long-term-useful way. -Silence 14:55, 7 April 2006 (UTC)[reply]

The Times Digital Archive

The Times newspaper is fully searchable for free (just for this month) from [3] for the years 1785 to 1985. A good source of attestations. Be careful to change the radio button to "entire article content". SemperBlotto 17:01, 7 April 2006 (UTC)[reply]

Onelook

Since when does Wiktionary appear in search results at www.onelook.com ?!? — Vildricianus 09:49, 10 April 2006 (UTC)[reply]

Interesting, since yesterday apparently [4]. — Vildricianus 10:24, 10 April 2006 (UTC)[reply]
See yesterday's announcement. — Paul G
Muh, I'll have to watch those announcements. — Vildricianus 08:51, 11 April 2006 (UTC)[reply]
Seemed all it took was letting them know we existed: here is the communication OneLook. Then Connel made that list and voila! - TheDaveRoss 21:46, 11 April 2006 (UTC)[reply]
That deserves a big wikithanks to both of you then. — Vildricianus 07:33, 12 April 2006 (UTC)[reply]

Somalians

There isn't enough about us online. How do you say hello in Somalian?

someone find out

As a Somali you are in the best position to add that information. Eclecticology 23:53, 10 April 2006 (UTC)[reply]

"Word of the day" is wrong

Today's word of the day has an incorrect definition, suggesting that onomatopoeia is an adjective. A more appropriate definition would be "The property of a word of sounding like what it represents." and I have changed onomatopoeia accordingly. However, I haven't been able to edit the entry on the front page (I went through a maze of templates that took me back to where I had started).

Could someone edit the entry on the main page ASAP, please, and could whoever updates the word of the day ensure it is correct before it is posted on the front page for the whole world to see? We'll look like a laughing stock otherwise. Thanks. — Paul G 13:39, 10 April 2006 (UTC)[reply]

That was quick. Thanks.
Yes, it occurred to me too that the "Word of the day" section of the main page is unprotected. Perhaps we shouldn't have an edit link to it from the main page, or could find some other way of protecting it. — Paul G 14:09, 10 April 2006 (UTC)[reply]
When the WOTD thing was proposed, it was suggested that a significant number of entries be provided in advance. (I think 90 days worth was suggested.) I think instead, having a single month of entries that are front-page linked and semi-protected would be better. Then our volunteers could be responsible for moving the "next month's" entries into the "main page month" and archiving entries before overwriting them. That way, if no one stays on top of it, the previous month's entries are simply/automatically recycled back onto the Main Page. When we have a full year's worth of entries, we could consider expanding the cycle to cover 366 entries? (I thought the WOTD effort was going to wait until there was a much larger lead time. Oh well.)
I very strongly agree with the notion that each WOTD should have an edit link to itself. --Connel MacKenzie T C 16:42, 10 April 2006 (UTC)[reply]
It becomes tricky to do that when some of the entries are for future dates. In any case, I have been working to keep at least two or three weeks' worth of words in the queue for some time now. The selection of onomatopoeia as WOTD was made on 25 March -- more than two weeks before it appeared on the main page. If there was a problem with the definition, there was plenty of time to make changes before it went up on the main page. If people want to examine upcoming WOTD selections and make corrections, the queue is (and has been) linked from the WOTD page. It is up to the initiative of the individual to do so. --EncycloPetey 05:26, 11 April 2006 (UTC)[reply]
I think you have done a fine job with WOTD to date. Certainly, you are doing much better than your predecessors. I'm sorry if I seem unfavorable to the overall concept. You've done great stuff with this so far; really, all I hope to do with my above suggestions is make the process a little more fault tolerant (or absence tolerant?) I'll try to keep anything that might sound like criticism in check. --Connel MacKenzie T C 17:04, 11 April 2006 (UTC)[reply]
If I gave the impression that I somehow took offense at something you said, then I certainly did not mean to. I was only attempting to (pointedly) note that the original concern upon which this discussion began would never have been an issue if more people took an active interest in the WOTD project. This was more of a firm nudge to get additional people to help out with editing and WOTD advance preparation than anything else. --EncycloPetey 21:39, 11 April 2006 (UTC) (SEE PLEA FOR PARTICIPATION BELOW)[reply]
You didn't give that impression, but re-reading what I wrote, it wasn't giving you your due credit. You've made a heroic effort with WOTD and made great progress, where previous attempts have failed - Bravo! Some lumps in the road? Sure. --Connel MacKenzie T C 02:39, 12 April 2006 (UTC)[reply]
Thanks. I certainly appreciate the efforts made to make WOTD more user-friendly and accessible. If the efforts of the admins and sys-ops do that, then I am all for any such changes. --EncycloPetey 06:16, 13 April 2006 (UTC)[reply]

Software bug?

I was trying to edit prose. When I clicked the Edit link of the Related Terms section, the edit page of the Italian section (the section below the intended one) opened. When I clicked the Edit link of any other section on that page, the edit page of the section below the intended one opened. When I clicked the bottommost section, a blank edit page opened. Is it a software bug? __చదువరి 18:02, 10 April 2006 (UTC)[reply]

I've encountered this as well. Try reloading the page, that might help. — Vildricianus 18:04, 10 April 2006 (UTC)[reply]
I could locate the problem. On the first section of the page immediately after the section heading there was this sentence written.. <!-- dmh (why can't Johnny log in?) -->. After removing this, it worked fine. __చదువరి 18:10, 10 April 2006 (UTC)[reply]
The complete line was "==English==<!-- dmh (why can't Johnny log in?) -->". Nothing is supposed to be on heading lines after the heading. Thanks for fixing it, Chaduvari. --Connel MacKenzie T C 18:20, 10 April 2006 (UTC)[reply]
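The kind of line that caused this can be looked for mechanically. The following is only a minimal sketch, assuming a heading line is well-formed when nothing but whitespace follows the closing equals signs; the function name `find_bad_heading_lines` is hypothetical and is not part of MediaWiki.

```python
import re

# Assumption: a heading is 2-6 '=' signs, a title, then the same run of '=' signs.
# Group 3 captures whatever follows the closing '=' signs on the same line.
HEADING = re.compile(r"^(={2,6})([^=].*?)\1(.*)$")

def find_bad_heading_lines(wikitext):
    """Return (line_number, trailing_text) for headings followed by extra text."""
    problems = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        match = HEADING.match(line)
        # Trailing whitespace is harmless; anything else (such as a comment) is reported.
        if match and match.group(3).strip():
            problems.append((number, match.group(3).strip()))
    return problems
```

Run over the wikitext of an entry, this would report the first line of the page discussed here, since the HTML comment sits after the closing `==`.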

The history behind a word

Is Wiktionary interested in this kind of thing?

(unsigned comment 19:31, April 10, 2006 218.68.245.150)
There have recently been a couple of proponents of experimenting with word histories (much more in depth than, and not the same as, etymologies). Currently we don't enter word-history information. I think one suggestion was to try something similar to the "/Citations" sub-page concept, but I haven't seen any experiments with it, nor in-depth discussion of it, yet. --Connel MacKenzie T C 01:39, 11 April 2006 (UTC)[reply]

Getting involved in Word of the Day

I think it would be great to see many more people looking ahead to future WOTDs and contributing to those entries that will soon be featured.

Wiktionary is a great place. Contributors here have all manner and sort of backgrounds, and they can thus contribute in many different ways. Some people here are particularly talented at writing clear and precise definitions. Others have a knowledge of etymologies or cognates. Still others are skilled at pronunciation, translation, and the various other components of a good Wiktionary entry. I'd like to encourage everyone, then, to participate in WOTD in whatever way you can -- whether your contribution would be large or small.

While a "word of the day" has traditionally been a little sheet of paper on a cheap throw-away calendar, here on Wiktionary it has the opportunity to be more than that. The WOTD displays prominently on the Main Page, and so I'd like to see WOTD become a real showcase for what Wiktionary can do. To that end, I've been working to add quotations from English and American literature to upcoming words, and making other little edits. I know also that Dvortygirl has been adding soundfiles to upcoming words, and so she has been contributing significantly.

Any excuse to have older entries brought out, expanded, and cleaned up is a good one, but this plea goes beyond that. Even if you only drop in and look around once every week or two, a little attention from the talented regulars would help a lot. Attention from occasional users who know another language (and can thus add translations) would also be a great service. Instead of simply being a nifty word, the WOTD could be a real showcase of the talented work being done here. --EncycloPetey 10:05, 11 April 2006 (UTC)[reply]

AWB

AWB is a great idea in principle, provided the person who uses it knows what they are doing.

Unfortunately User:That Guy, From That Show! has used it to enter a load of Greek pages that use neither the standard format nor the correct pronunciation schemas (those used being in Greek, something ad hoc and something else). I've asked the user to sort this out, but we might need to do this ourselves. I've also posted this on Requests for cleanup. — Paul G 10:57, 11 April 2006 (UTC)[reply]

I've also answered on RFC. — Vildricianus 13:50, 11 April 2006 (UTC)[reply]
TGFTS asked specifically what sorts of things needed cleanup. I identified {{gstr}} as a starting point as it is one of the thorniest problems outstanding on my cleanup list.
Please compare the pages before and after, before accusing User:That Guy, From That Show! of adding crap to Wiktionary. He is part of the cleanup process, not part of the problem! --Connel MacKenzie T C 14:07, 11 April 2006 (UTC)[reply]

See also at top

For see also's at the top of a page, it's pretty clear when an entry title is a variant of another in the languages of the Latin script, which include English. The rule technically could be if they reduce to the same non-empty unspaced alphanumeric string in lower ASCII, that is, numbers and all uppercase (or equivalently all lowercase) letters A-Z without diacritics or ligatures. So that's not the question here. The question is:

When do variants need to be placed in the See also section at the top of a page? I can think of a number of answers, from the broadest to the narrowest in terms of (a) redirects:

  1. When an entry for the variant of the word exists or should exist.
  2. When an entry for the variant of the word exists or should exist but doesn't exist as a redirect to the same page.
  3. When an entry for the variant of the word exists or could exist as a full entry to which the same page would not redirect.
  4. When an entry for the variant of the word exists or should exist but couldn't exist as a redirect to the contents of the same page.

and perpendicularly in terms of (b) page contents:

  1. Regardless.
  2. And the variant isn't an alternative spelling (in that section or as a POS headword).
  3. And the variant doesn't appear anywhere within the page.

I'm worried that current practice, if it can be characterized at all, might be a2-b2 or a4-b1, which would be language dependent in the first case or require hypothetical reasoning in the second. Other more sound combinations would conveniently lend themselves to maintenance by a bot. Davilla 19:29, 3 April 2006 (UTC)[reply]
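The reduction rule described at the start of this thread can be sketched in code. This is only an illustration under stated assumptions: the helper names `see_also_key` and `are_variants` are hypothetical, and the small ligature table is an assumption of this sketch, needed because characters such as "æ" do not decompose under Unicode normalization.

```python
import unicodedata

# Assumption: a hand-made table for ligatures that NFKD leaves untouched.
LIGATURES = {"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE", "ß": "ss"}

def see_also_key(title):
    """Reduce a title to an unspaced lowercase ASCII alphanumeric string."""
    expanded = "".join(LIGATURES.get(ch, ch) for ch in title)
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", expanded)
    # Keep only ASCII letters and digits; this drops marks, spaces and punctuation.
    kept = [ch for ch in decomposed if ch.isascii() and ch.isalnum()]
    return "".join(kept).lower()

def are_variants(first, second):
    """True when both titles reduce to the same non-empty key."""
    key = see_also_key(first)
    return key != "" and key == see_also_key(second)
```

Under this rule, titles differing only in capitalisation, diacritics, spacing, hyphens or apostrophes would be cross-linked, while titles differing in an actual letter would not. A bot could group page titles by `see_also_key` to find candidate sets.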

My understanding is: (a) 1, (b) 1.
Regarding part (a): this practice was adopted as a result of the de-capitalization. The primary purpose is as a navigation aid. That is, it is working around a MediaWiki software limitation, not for lexical reasons, but for lookup reasons. The practice was later expanded to all variations.
Regarding part (b): 1. The navigation aid has nothing to do with the lexical contents of the entry. One link appears at the top of the page (where a newcomer would conceivably expect to find links to the word they want) while the other link appears in the section of the entry specific to a particular language.
--Connel MacKenzie T C 20:05, 3 April 2006 (UTC)[reply]
See Also's at the top are navigation aids, to help people find the word they are really looking for, before they waste time on the one they got by accident or mistake. In effect, they are a form of micro-disambiguation page. They generally do not have a ===See Also=== type heading.
===See Also=== at the bottom is to list related words and topics, for someone to read further on, after they have read this page about the word.--Richardb 13:44, 4 April 2006 (UTC)[reply]
  • I call these "disambiguation see also". I would also include forms such as "faex" / "fæx". I have also just started adding words which are in the same language but differ only by presence or absence of a double letter, such as "pero" and "perro" in Spanish. Another type is "foo-bar", "foo bar", "foobar". I also crosslink terms which differ only by presence or absence of an apostrophe in any position: "ill" and "i'll". — Hippietrail 00:33, 5 April 2006 (UTC)[reply]
    • I don't know a thing about Spanish, but why is disambiguation necessary for single/double consonant instances? Is there actually anything spelt out on the see also matter? Not in WT:ELE at least. — Vildricianus 09:17, 29 April 2006 (UTC)[reply]
  • You could possibly argue that none of these are strictly "necessary", but it's extremely common at least among English speakers to confuse Spanish words which differ only by a single or double letter. — Hippietrail 00:52, 30 April 2006 (UTC)[reply]

This issue is addressed in part in the new Draft Policy Wiktionary:Spelling Variants in Entry Names - Draft Policy--Richardb 12:50, 15 April 2006 (UTC)[reply]

Placing of categories

TGFTS's use of the AWB, and in particular Connel's response, drew my attention to this not-so-important formatting fact. To summarize: I find it useful to have things like [[Category:Dutch nouns]] in the Dutch section, not at the bottom of the page. This is quite helpful in multiple-language entries like auto. The AWB automatically puts them at the bottom of the page ([5]), which would force me to edit the entire page or the last section to change, add or remove any categories that actually apply only to one particular section. Any thoughts? — Vildricianus 14:04, 11 April 2006 (UTC)[reply]

Well, it is what WT:ELE#Category links says to do. --Connel MacKenzie T C 14:09, 11 April 2006 (UTC)[reply]
That's why I'm suggesting a change... — Vildricianus 14:18, 11 April 2006 (UTC)[reply]
Fair enough. I'm missing something though: how often does one edit the categories on a page? Edit them, that is, as opposed to adding them. I know in cases where a category is changed, it usually means the category was named incorrectly in the first place...and the easiest/better way to fix those is with the bot category.py. Tagged categories (e.g. {{computing}}) are within their respective templates, so I don't think you mean them. Could you expound on what sorts of problems you've encountered with this? Also, even if you did need to edit other categories, you'd only need to edit the last section of the page, not the entire page. --Connel MacKenzie T C 19:33, 11 April 2006 (UTC)[reply]
The thing is, for example, I want to add {{nl-noun}} or some other inflection template to a Dutch noun (which of course happens to be a Swedish and Danish one as well), so I section-edit; and that template contains the category "Dutch nouns", of which I then want to remove the duplicate. I then have to re-edit the full page or the last section in order to do this, which is 2 edits instead of 1. In both a practical and a theoretical sense, such a category belongs in the appropriate section. — Vildricianus 19:44, 11 April 2006 (UTC)[reply]
I agree as well with Vildricianus. When the category is specific to a language other than English, the category would be best placed immediately following the language header to make it easy to find and edit. This also reduces confusion for those entries which are words in several dozen languages. Some such pages now have a list of categories at the bottom that runs like the closing credits from Star Wars. It is much, much easier to find and edit the desired category tag if it is placed with the language in question. That said, for all English-based categories, Translinguals, Abbreviations, Symbols, etc, I agree that Categories should be grouped at the end of the page. --EncycloPetey 21:34, 11 April 2006 (UTC)[reply]


This is why I am testing here. Issues that are Wiktionary-specific will have to be discovered and addressed before editors use AWB here regularly. I will be noting exactly what I am testing on my user page and then listing the articles that are changed for review. That will make it easier for editors to identify possible problems.

I did 2 tests today. One was identifying misspelled words and another looked for duplicated word use like "that that". The information about articles fixed during each test is here.

BTW: When new talk is added to the user page that belongs to the AWB user, AWB displays a warning to go visit talk. This is very useful to stop AWB editing in progress.

All feedback is welcome.

--That Guy, From That Show! 06:03, 13 April 2006 (UTC)[reply]
  • I think that the proposed option of keeping categories at the bottom of a language section has just as many problems as our current practice. If the goal is to be able to "correct" categories while section editing, it would only then make sense to have the category in the applicable sub-section, not the language section. There are several reasons against that practice. I believe it is much more consistent to simply follow the Wikipedia convention of adding categories at the bottom of the page, to correspond to where they appear when viewed with the default Monobook skin. A very large majority of our entries are formatted following this convention: changing the convention would not help people editing sections, as most of the time they'd have to hunt the category down anyway. Trying to convert all existing entries to that format would be a massive manual effort, with minuscule (if any) gain. --Connel MacKenzie T C 16:53, 20 April 2006 (UTC)[reply]
Upon further reflection, I'd say you're right. Both options have a downside. It's best then, to stick to the old way. — Vildricianus 17:01, 20 April 2006 (UTC)[reply]
This issue seems resolved as not appropriate for policy, so this point may be irrelevant, but note that some templates add categories (e.g. {{janoun}}), so restricting categories to the bottom of the entry page may not even be feasible. Rodasmith 04:22, 5 May 2006 (UTC)[reply]
Rod, I thought we were talking about categories on the page. You raise an interesting point though; if the category you are looking for doesn't appear at the bottom of the page, a link to the likely template will appear just below the edit box. --Connel MacKenzie T C 04:57, 5 May 2006 (UTC)[reply]
I was referring to the implied goal of letting editors find categories quickly as a reason to move literal category inclusions to the bottom of a page or language section, but I forgot about the edit mode template link, so never mind. Rodasmith 05:40, 5 May 2006 (UTC)[reply]

Hieroglyphs

Are there any plans to offer Egyptian hieroglyph translations of words/phrases? I ask because Wiktionary's mission statement includes such an option; I just am not knowledgeable in hieroglyphs. — This unsigned comment was added by Sewnmouthsecret (talk • contribs) at 2006-04-11 23:44:22.

Sure. It's difficult though because the hieroglyphs aren't in Unicode yet, and our hieroglyph extension is only really suited for 'pretty-picture' display, not much usefulness. One might do something like:
  • Egyptian, Ancient:
E23
Z1
though that doesn't really give a place to link to. (la: uses titles like E23:Z1; I don't know if en: would be amenable to such things.) —Muke Tever 22:55, 12 April 2006 (UTC)[reply]
I don't think en.wikt: is keen on that syntax, but who knows? I'd just like to say that that is one cute picture of a lion. --Connel MacKenzie T C 17:48, 9 May 2006 (UTC)[reply]

Transwiki

I'm getting stuck in to the transwiki stuff, and I'm not sure where we stand on the Wikipedia edit histories (See [[Talk:Transwiki:Hot Stove League]] for an example). I can't think of any good reason why we should want to keep them at all. Shall I just delete them after completing the transwikiing? Or let them stay for nostalgic reasons? --Dangherous 23:43, 12 April 2006 (UTC)[reply]

Delete would be my vote, I see no point in keeping rejected transwikis, though if someone has a better argument I might sway -- Tawker 01:15, 13 April 2006 (UTC)[reply]
  • We don't keep talk pages if the main entry does not exist. So if the TW page is deleted, the talk page must be deleted with it. If the TW is converted to a redirect to a main namespace entry, none of the TW information is being used, so the talk page can be deleted then also. If the transwiki is moved to the main namespace, the talk page goes with it (by default.) The only thorny scenario is when the TW entry is used to expand an existing main namespace entry...the safe thing to do then is copy the TW history to the main namespace entry's talk page. --Connel MacKenzie T C 08:06, 13 April 2006 (UTC)[reply]
  • The GFDL requires giving credit to submitters. So if any of the submission to WP is kept we need to document who submitted it. (Someone can probably do a much better job of stating that. Is there a pronoun in the house?) JillianE 13:23, 13 April 2006 (UTC)[reply]
    • If you are going to adopt a word that has been transwikied, the best you can do is completely rewrite the article rather than move the Transwiki page. Much of what appears in the history of these pages relates to material that some considered encyclopedic when it was in Wikipedia, or worse, the back and forth of edit wars. The links about who did what are often no longer available to be checked since they were deleted with the article in Wikipedia. Eclecticology 06:37, 18 April 2006 (UTC)[reply]
  • meta:Transwiki policy was developed with the GFDL credit requirement in mind. There are basically three things we can do with a transwikied article: delete it, move it into the main article space, or merge it into an existing article. If we delete it, the talk page can be deleted as well. (Some projects have a policy to keep talk pages with deletion discussions that aren't preserved elsewhere, but that doesn't appear to apply to the transwikied article in question.) If we move the page to hot stove league, selecting the "move talk page" option, the talk page and its credits are automatically moved into their correct place. If we merge it with another article, the credits alone should be copied into the new article's talk page. In any of these cases, there should be an entry in the transwiki log to say exactly what was done. Some editors in some projects ignore or delete such entries, but besides being policy, it is a courtesy to editors from the original projects who might wonder what happened to their pet articles, allowing them to discover their ultimate fate. ~ Jeff Q 02:27, 20 April 2006 (UTC)[reply]

Attention please, mainly linguists and etymologists.

Hi to all. I have been a contributor in Wiktionary since late Feb. 2006, and since then I have added or corrected more than 270 etymology sections. I am an etymologist of Greek, with a good knowledge of Latin and English, so my contributions primarily concern the etymology of words of Greek origin in the English language.

Unfortunately, as many etymologists already know, most of the dictionaries (online or published, apart from a few exceptions and of course the ones specialized in the etymology of Latin and Greek) are lacking in their etymology sections that involve Greek roots, most of the time tracing the word only as far back as Latin, even if the word originates from Greek and despite the well known relation between Greek and Latin languages as a “mother and daughter” relationship. Even worse, they sometimes by-pass the Greek (or any) etymon and replace it with an unattested, hypothetical, IE “word”! (For those who don’t know, the IE “words” are marked with a * symbol, exactly because that symbol in linguistics declares a hypothetical, unattested, guessed, non-existent word.) Now that’s quite acceptable in theoretical linguistics, but not in a good etymology section, since these * “words” are not and shouldn’t be used as etymons.

Latin is no daughter of Greek. It is a sister language. —Muke Tever 16:42, 15 April 2006 (UTC)[reply]

Just to make a brief summary for non-linguists or non-etymologists to understand: etymology, even if it is a part of linguistics, is also a separate science, dealing with the origin and historical development of words. Furthermore, the obligation of etymology is, by definition, to trace the word back to its etymon, hence the oldest attested form of the word; and the hypothetical IE * words are not etymons! (Personally I can accept hypothetical words as etymons only if no etymon has ever been recorded for the word, and even then only to a reasonable extent.) Consequently, if there is an etymon of the word, an etymology section has to use it as the actual, true sense of the word, because otherwise it wouldn’t be etymology, it would be "theoreticology"! The etymologists here know what I am talking about; no need to analyse it any further.

Therefore I thought that it would be a good idea to detail the etymologies of words of Greek origin, tracing the words back to their etymons, giving the readers the pleasure of seeing and knowing some attested etymological knowledge that can only be found in specialized etymological dictionaries.

However, for some reason that is not acceptable to some contributors in Wiktionary: they reverted some of my edits, in which I had traced words back to their true etymons, and replaced them with some unattested IE words!! (I know that this is a common tactic among many dictionaries but it is not correct.) Even worse, at the same time they are accusing me of being “erroneous”, a “vandal” and the like, claiming that “eventually what I am doing will be taken as vandalism and dealt with accordingly”! I am hoping that these contributors are merely acting out of ignorance of the issue and not out of bias. (The only acceptable point in what they say is probably the argument about my using the term “Latinised Greek” instead of “Latin” in words that actually are Latinised Greek. But I do that only to specify the difference between a “Latinised Greek” word and a word of actual Latin origin. For example, “guberno” is the Latinized form of the Greek “kuberno”, while “monstrum” is Latin; and "μαξιλάρι" (pillow) is the Hellenized form of Latin "maxillaris". Although there isn’t anything misleading or false in how I use the term “Latinised Greek”, I will stop using it; it’s not so important anyway.)

To conclude, the reason I write all this is because, after the readers have taken a look at my edits, I would like to give them the choice to decide whether I should continue to contribute to Wiktionary or not. So please, if it’s not a lot of hassle, take a look at my edits and come back here to put a “leave” or “stay” comment. I think that’s very democratic and therefore I will respect any result, so if you decide that I should leave, then I will. But if you decide that I should continue contributing, I will do that in the scientific way: hence, trace the word back to its attested, true etymon. Because the bad thing is not to not know something but to not want to know… Kassios 11:10, 15 April 2006 (UTC)[reply]

To make it easy, here is the standard etymology of father:
From Middle English fader < Old English fæder < Proto-Germanic *fader < Proto-Indo-European *pə₂ter; cognates include Mycenaean Greek pa-te (pater), Greek πατήρ (patḗr), Latin pater, Spanish padre, French père, and German Vater.
And here is YOUR etymology of father:
From Middle English fader < Old English fæder < Latinized {{Gr.}} pater < Ancient Greek πατήρ (patḗr), cognates include Spanish padre, French père, and German Vater. Many linguists claim that the word derives from Proto-Germanic *fader < Proto-Indo-European *pə₂ter although the above are unattested. The fact is that the oldest attested record for father is the Mycenaean Greek pa-te (pater).
Here is the standard etymology of kiss:
Old English cyssan, from Proto-Germanic *kussijanan, from Proto-Indo-European *kuss- (probably imitative). Cognates include Dutch kussen, German küssen, Swedish kyssa, Old Norse kyssa.
And here is what you wrote for kiss:
From Ancient Greek κύσσω (kysso) poet. form of κύσω (κύσο) "to kiss" (Homer, Odyssey, 16.15: kusse de min kephalen... "he kissed his forehead"; Aristophanes, Clouds, 56.81: kuson me... "kiss me...", etc.), from κυνέω (kyneo) "to kiss".
For more on this, please see User talk:Widsith#On etymology. For an example of Kassios's ideas on etymology, see e.g. daughter, which he derives from Greek. In my opinion it is nonsense. Other relevant discussions can be seen on his own talk page, and on Stephen's too. Kassios can make valuable contributions here to Greek entries and genuine Greek borrowings; I just wish he would drop his unsupported theories that IE didn't exist and that Germanic languages are derived from ancient Greek. Widsith 11:18, 15 April 2006 (UTC)[reply]
You will have to bear in mind that many of the words are Latin and were merely translated to Greek when the Empire moved its seat of influence to Byzantium.
Short History Lesson.
  1. In the third century (212), the Emperor Caracalla opened Roman citizenship to non-Romans. This meant that foreign-language speakers suddenly became eligible for high office.
  2. Diocletian 284-305 (who was Dalmatian and therefore spoke Greek) was the first Emperor to move from Rome to the East - the start of the Byzantine Empire.
  3. Constantine 306-337 founded Constantinople on Byzantium in 325. Byzantium was Greek speaking. Constantinople was a modern city which encouraged scholars. The arts were on the rise in the East.
  4. The western Roman empire was languishing for various reasons, namely pestilence, the threat of the barbarians, and civil war. The arts were in decay in the West.
  5. The imperial language of the East changed from Latin to Greek, and many words were merely translated.

Thus, to establish a Greek etymology, one would have to ensure a progression from Greek to Latin in documents prior to circa 300. Andrew massyn 15:40, 15 April 2006 (UTC)[reply]

Please realise that etymon means the first attested word, so all the Greek etymons in my sections already existed for hundreds of years in the Greek language before they were ever used in Latin. Kassios 16:05, 15 April 2006 (UTC)[reply]
Don't you claim to know Greek? ἔτυμον means the true, original sense of a word. (The etymon is ἔτυμος, 'true'. The Latins had a literal translation of ἐτυμολογία, 'veriloquium', with the same meaning.) In linguistics it comes to mean the original form from which the meaning derives. Temporal relation is only incidental. —Muke Tever 16:51, 15 April 2006 (UTC)[reply]
That’s not the meaning of etymon at all. The etymon is the origin of a word, but more often than not when dealing with Germanic, the oldest attested forms are mere cognates, because Ancient Greek was written while Proto-Germanic was not. The words that you keep calling etymons are just cognates. The etymons may only be reconstructed. —Stephen 17:51, 15 April 2006 (UTC)[reply]
Please, Stephen, from here on post your comments at the bottom of the page and not in the middle of my original post, so others can follow my post easily. I'll make a start:
"Guessed" shows that you do not understand anything about proto-language reconstruction. The * indicates a reconstructed form, and the reason for reconstruction is because the language was not written. Just because a form is marked as reconstructed, that does not mean that the language itself is theoretical or that it’s only guesswork that a modern word evolved from a word in the proto-language. —Stephen 14:55, 15 April 2006 (UTC)[reply]
Stephen, reconstructed from what? They are just hypotheses, hence, not etymons! Kassios 16:14, 15 April 2006 (UTC)[reply]
The etymon is the furthest attested word in the chain. The proto-etyma, yes, are hypothetical, but are, in general, based in sound science, and they can be cited. Do you have citations for your assertions? —Muke Tever 16:42, 15 April 2006 (UTC)[reply]
Kassios, this branch of linguistics is complex and takes years to learn. Let me try to put it in a nutshell. There are numerous tools and laws that apply, such as Grimm’s law. Grimm’s law explains sound shifts such as k > h, t > th, p > f, etc. If you assemble all the known cognates for a certain word, such as tree, in every possible Indo-European language, living and dead, and apply the principles of Grimm’s law to each, many or most of the attested words will point back to a single protoform. We can also do the reverse and predict later forms from the reconstructed protoforms. For example, if I don’t know the Portuguese word for "thread", I can take the Spanish word "hilo", apply the known principles of sound changes to obtain Latin "filum", and then go forward again to get Portuguese "fio." Looking in a Portuguese dictionary, I see that this is correct, which is evidence that the reconstructed Latin form is also correct. That’s the short version. Super short. —Stephen 17:51, 15 April 2006 (UTC)[reply]
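The hilo / filum / fio example can be sketched as a toy program. This is only an illustration under loud assumptions: the three rules below (Latin initial f- becoming Spanish h-, Latin final -um becoming -o, and loss of l between vowels in Portuguese) are a hand-picked subset chosen to fit this one word, and the function names are hypothetical. Real reconstruction uses many more rules, with conditions and exceptions.

```python
import re

VOWELS = "aeiou"


def spanish_to_latin(word: str) -> str:
    """Run two Spanish developments backwards (toy rules, assumed for this example)."""
    # Latin initial f- often became h- in Spanish, so undo that.
    if word.startswith("h"):
        word = "f" + word[1:]
    # Latin final -um became -o, so restore it.
    if word.endswith("o"):
        word = word[:-1] + "um"
    return word


def latin_to_portuguese(word: str) -> str:
    """Run two Portuguese developments forwards (toy rules, assumed for this example)."""
    # Latin final -um became -o.
    if word.endswith("um"):
        word = word[:-2] + "o"
    # An l standing between two vowels was lost.
    return re.sub(rf"(?<=[{VOWELS}])l(?=[{VOWELS}])", "", word)


latin = spanish_to_latin("hilo")
portuguese = latin_to_portuguese(latin)
print(latin, portuguese)
```

The point is only that the rules are mechanical: the same small rule set is applied backwards from one language and forwards into another, and the prediction can then be checked against a dictionary.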
The "etymons" that you are claiming are NOT etymons at all, they are just cognates. The Ancient Greek πατήρ, while it was definitely used, is NOT the etymon of English father. You are trying to demonstrate that English and the Germanic languages are actually descended from Ancient Greek, and that the proto-language that linguists and etymologists call Proto-Germanic was actually just Ancient Greek. That’s ridiculous: the Germanic languages did not develop out of any dialect of Greek; and the Greek and Germanic languages did not occupy the same geographical areas, or even adjacent areas, and therefore Germanic words did not come from anything in Greek, regardless of how real the Greek word may have been. Greek words entered English, with extremely rare exceptions, either via French from Latin, or recently adopted directly from Greek as scientific, technical or learned terms. —Stephen 14:55, 15 April 2006 (UTC)[reply]
They are etymons because they are the oldest attested words. And where on earth did I ever claim that Germanic languages came from Greek??? All I am saying is that some Germanic words came from Greek, I don't know how but they did. Kassios 16:14, 15 April 2006 (UTC)[reply]
They are etymons because they are the oldest attested words. A pure misunderstanding of etymology, and a logical fallacy as well. Look up Post hoc ergo propter hoc sometime. Etymology doesn't seek to find "the oldest word", but to trace the actual familial relationship. In the case of Greek and many of these words, the relationship is a cousin or sibling relationship, not a parental one. —Muke Tever 16:42, 15 April 2006 (UTC)[reply]
Kassios, the problem now becomes clear. You do not know the meaning of etymon. The oldest attested words in the Indo-European languages are mostly Greek, because Greek developed writing millennia ago. The Germanic languages went unwritten until rather recently. The etymons of Germanic words are unattested Proto-Germanic words; the fully attested Greek words are not etymons, but cognates. The etymons are the most important items in an etymology, even if unattested. The cognates are of passing interest, regardless of how old or how well documented. As I said before, the main value of Mycenaean Greek is in the reconstruction of the proto-language. Ancient cognates carry more weight than more recent ones. —Stephen 17:51, 15 April 2006 (UTC)[reply]
That’s all well and good, but in most etymologies, the Greek is only a COGNATE, not an etymon. —Stephen 14:55, 15 April 2006 (UTC)[reply]
That's because they are not good and detailed etymologies, since they don't go as far back as the Greek etymons. Kassios 16:14, 15 April 2006 (UTC)[reply]
The projected age of the reconstructed etyma are older than the Greek ones (for what it's worth). —Muke Tever 16:52, 15 April 2006 (UTC)[reply]
Kassios, the most ancient cognates pale in comparison to even unattested etymons. An etymology doesn’t even have to list any cognates; they are window dressing. The etymons, most of which are unattested in the Germanic languages, are the thing that is important. Similar work is being carried out even now on proto-Algonquian, proto-Karen, and many other languages whose ancestors were not written. It makes up the biggest part of modern linguistics. In fact, there is a major project afoot for reconstruction of Proto-Indo-European, since Pokorny’s amazing work is woefully out of date. —Stephen 17:51, 15 April 2006 (UTC)[reply]
Come on guys, you are not telling me something that I don’t already know… the difference is that I take things into consideration, rather than simply dismissing or fully accepting them. Anyway, what more can I say? I feel like I am trying to make a priest think about the possibility of god’s non-existence!
Unfortunately there weren’t a substantial number of linguists and etymologists taking part in the dispute, like I would have wished… However, since none of you four have put a “leave” comment, I take it that you accept my etymologies (at least the ones ‘officially’ accepted by the majority), so I am staying. For the rest, time will tell. Kassios 17:33, 23 April 2006 (UTC)[reply]

Features needing wiki extensions or developers

Now that Wiktionary is growing up (it's been around a couple of years and it's getting big and juicy), there are things we really need that just can't be done without making changes to the wiki software. Generally, writing an extension would probably get the job done. Is it time we start looking for somebody able to do this? Could we hound the existing devs? Is there a contributor who is capable of writing the code? Could we do a donation drive and pay somebody to do it?

The one that seems fairly easy and would be very useful that I've been thinking about for a long time is this:

  • Random page in a given language

This would vastly improve the possibilities of contributors who know one or two languages well but don't feel like writing articles from scratch. They could randomly check articles relevant to them and make incremental improvements.

I'd like to hear other people's thoughts generally and specifically. Are we ready for such improvements yet? Will the devs allow us to install our own extensions just for en.wiktionary? — Hippietrail 18:52, 15 April 2006 (UTC)[reply]

Would be nice. --Expurgator t(c) 00:13, 17 April 2006 (UTC)[reply]
  • You mean like having a tool on http://tools.wikimedia.de/ that does it, with sidebar link(s) here to make it transparent? Hippietrail, I'd like to see you request an account there, as you seem to have a better idea of how it might work than I do. With a USB harddrive to store PuTTY.exe and your ssh key, you could even access it from internet cafes. --Connel MacKenzie T C 17:03, 20 April 2006 (UTC)[reply]
    Thanks for pointing me to tools.wikimedia.de - it seems to be the way forward. Sadly, software development is outside my internet cafe budget while I'm traveling so I can't presently make use of it myself. — Hippietrail 03:05, 23 April 2006 (UTC)[reply]
  • I like the idea of random-word-in-language, as that could apply to all Wiktionaries, using the referring URL to determine the default language. --Connel MacKenzie T C 17:05, 20 April 2006 (UTC)[reply]
    Well, it would apply to all Wiktionaries, but as each is its own country with its own government and traditions, it likely wouldn't work the same way everywhere. At the moment I'm only talking about en.wiktionary.org, the country we live in and know. — Hippietrail 03:05, 23 April 2006 (UTC)[reply]
  • How would this work? Would something scan for the correct level two headers? — Vildricianus 17:11, 20 April 2006 (UTC)[reply]
    Yes, I can think of a couple of ways. Every time a new database dump is available, a piece of software scans all the level-2 headers in the standard namespace. It could keep this info in memory, dump an XML file, or output it to a new database field that we add somehow. I just don't know what's possible with wiki extensions, or what the people running us will allow us to customize.
    Of course we need a way to choose a random one to create a link in the sidebar. There are surely several ways to do this too, depending on what "the guys" allow us to do. — Hippietrail 03:05, 23 April 2006 (UTC)[reply]
TIMTOWTDI. I'd guess building a separate table by language would do the trick, or (as I often do) using the XML dumps as checkpoints. It would be better not to use the XML dumps, but I haven't figured out what I'm doing on tools.wikimedia.de yet. There are probably several more efficient approaches as well. --Connel MacKenzie T C 17:17, 20 April 2006 (UTC)[reply]
Sadly, I don't grok databases very well. I concentrated on all other areas of computer science because I thought db's were boring. Now I've found some interesting uses for them I feel a huge lack in my tech knowledge. But why not collect some funds and pay somebody? Why not start consulting with the devs to see what might and might not be possible first? Maybe we'll find we actually can do it ourselves, or at least we'll have some solid info to give somebody we might recruit. — Hippietrail 03:05, 23 April 2006 (UTC)[reply]
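A minimal sketch of the "scan the level-2 headers, keep a table by language, pick a random entry" idea discussed above. It assumes the page titles and wikitext have already been extracted from a dump into a plain dictionary; the function names and the sample data are hypothetical, and this is not how any existing tool on tools.wikimedia.de is known to work.

```python
import random
import re
from collections import defaultdict

# A level-2 header is a line of the form ==Language== (exactly two equals signs).
LEVEL2_HEADER = re.compile(r"^==([^=].*?)==[ \t]*$", re.MULTILINE)


def languages_in(wikitext):
    """Return the language names found in level-2 headers of one page."""
    return [m.group(1).strip() for m in LEVEL2_HEADER.finditer(wikitext)]


def build_language_index(pages):
    """Map each language name to the list of page titles that have a section for it."""
    index = defaultdict(list)
    for title, wikitext in pages.items():
        for language in languages_in(wikitext):
            index[language].append(title)
    return index


def random_title(index, language, rng=random):
    """Pick one random page title that has a section for the given language."""
    return rng.choice(index[language])


# Hypothetical sample data standing in for a parsed dump.
pages = {
    "lard": "==English==\n===Noun===\nfat\n==French==\n===Noun===\n",
    "chien": "==French==\n===Noun===\n",
}
index = build_language_index(pages)
print(random_title(index, "French"))
```

The index would have to be rebuilt whenever a new dump arrives, so entries created since the last dump would never be picked. Whether the index lives in memory, in a file, or in a database table is a separate choice.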

I myself know a fair amount of PHP and MySQL, so I wouldn't mind playing around, but only in a group setting. I have very little time right now, so I couldn't try and be a leader, but I think it would be fun if we picked a project and edited right here in Wiktionary until we were happy then gave it to the devs. Starting simple and finding people are both pluses. - TheDaveRoss 03:09, 23 April 2006 (UTC)[reply]

The problem is it can't be done just with CSS/Javascript/bot. It requires server support. That means at least we have to set up a local server for ourselves at home (which I don't have right now). Maybe we can find somebody to provide us a server just for this experiment, and we can give logins to the contributors here who are interested and capable. It would be a mirror of Wiktionary running off a real Wiktionary dumped database, but we would be able to hack it. If need be we could also block everybody else to limit bandwidth. — Hippietrail 03:18, 23 April 2006 (UTC)[reply]
That's exactly what tools.wikimedia.de is. Without the bandwidth limiting. If we have a php guru lurking around, perhaps they could use my onelook feed to cull the entries from (onelook wanted English only, for some reason.) This is from the latest XML dump only. --Connel MacKenzie T C 03:36, 23 April 2006 (UTC)[reply]
I consider myself a very good PHP and MySQL programmer, I do it for living (and I haven't died yet :-), but like TheDaveRoss I don't really want to be a leader either. But if somebody describes in more detail what needs to be done I can help write the code. Perhaps we should setup a project page to describe and discuss what needs to be done. --Patrik Stridvall 14:58, 23 April 2006 (UTC)[reply]
If nobody else feels up to it, I'm willing to "lead", but be aware of my limitations on time and actual coding/hacking/twiddling I can do before possibly July. If somebody would like to set up a page and start thinking about the possibilities I've already outlined, and read up how to create wiki extensions, that's what is needed next. — Hippietrail 15:42, 23 April 2006 (UTC)[reply]
I already know how to write a MediaWiki extension if that is needed. I submitted a simple patch more than 2 months ago. However, nobody seems to care and I haven't felt like pushing it. "Nogomatch" just looks a little ugly without it, so it is not a big deal.
Writing code is often the small part of the work; the big part is usually the bureaucracy, including finding out what needs to be done, convincing the relevant person to deploy it, and that sort of thing. I do that at work almost every day, so I'm not that hot on doing that. I like programming.
Connel and I and some others can handle the code. The other main reason I brought this to the Beer parlour is that we may need support from as many of you non-coders as possible to help deal with the bureaucracy and the developers. None of us know what's possible and how hard we may have to push - with luck it will be easy, but let's hedge our bets. — Hippietrail 00:45, 26 April 2006 (UTC)[reply]
As to this concrete problem, we could of course write some MediaWiki extension, since the database is obviously available on this site. But since we need new tables with indexed data and that sort of thing, and it is running in a production environment, that will likely require both a lot of work and some convincing to get it deployed.
Deploying a MySQL database containing the dump without the history on tools.wikimedia.de should be no problem. There is a script for that (see m:Data dumps) and it worked fine on my own Linux box. Writing a few PHP scripts that parse the dump and generate new tables with indexes shouldn't be that hard, but that depends very much on what exactly needs to be extracted. Writing a search page which presents the search results is quite easy though. --Patrik Stridvall 18:10, 23 April 2006 (UTC)[reply]
There's more than one way to skin a cat. As Connel's proof of concept shows, we can achieve this without touching the database structure. Connel's solution generates redirects to random pages from another server. I have custom javascript to insert it into the nav-bar so that it looks and works like any other link there. — Hippietrail 00:45, 26 April 2006 (UTC)[reply]
AFAIK, tools.wikimedia.de has that data already, only much more up to date than the XML dumps. As I said before, I haven't yet figured out what I'm doing there, nor how to access it directly, but as the "Editcount" proves, it is possible. I have a tentative experiment on my page there but it is based off the XML dump and has unicode issues/bugs. Right now, the proof of concept is a long way off from being truly usable. --Connel MacKenzie T C 08:09, 24 April 2006 (UTC)[reply]
OK. I have applied for an account. If/when I get it I will see what I can do. In the meantime, anybody who has anything they think would be useful, please describe it in more detail. Simple things first... --Patrik Stridvall 17:11, 25 April 2006 (UTC)[reply]
What I did was build a separate cross reference of terms that include "==English==" in the XML dump. On my experiment, I pick a random number, and redirect to that numbered term in my list, to http://en.wiktionary.org/wiki/<random page name wiki-url-encoded.>. If you don't look quick, you can mistake it for an actual wikimedia.org link. Still experimental, today. --Connel MacKenzie T C 03:03, 26 April 2006 (UTC)[reply]
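The approach just described (pick a random number, look up that numbered term, redirect to its URL-encoded page) might look roughly like this. It is a sketch under assumptions, not the actual script: the term list, the function names, and the choice to turn spaces into underscores before percent-encoding are all illustrative.

```python
import random
from urllib.parse import quote

BASE_URL = "http://en.wiktionary.org/wiki/"


def page_url(title):
    """Build the page URL: spaces become underscores, the rest is percent-encoded as UTF-8."""
    return BASE_URL + quote(title.replace(" ", "_"))


def random_redirect_url(titles, rng=random):
    """Pick a random position in the list and return the URL to redirect to."""
    position = rng.randrange(len(titles))
    return page_url(titles[position])


# Hypothetical list of terms whose entries contain "==English==".
english_terms = ["lard", "hot dog", "café"]
print(random_redirect_url(english_terms))
```

Encoding the title as UTF-8 before percent-encoding is what `quote` does by default, which is one place where unicode bugs in a redirect script tend to show up.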

Including gender

Hello, all. I have been working on Wiktionary's translations to be checked and have noticed great inconsistency regarding the inclusion of gender and plural variants with the dictionary entry. There was some discussion about creating a guideline/policy for this, but it has been inactive since 2004 with no resolution. Many word translation sections are cluttered with regular gender and plural variants. In addition, some translations give the regular gender variants while other translations give only what is considered the basic form. For example, see the synonymous Spanish translations for the words mad and angry. All the gender variants are listed under mad: enfadado m, enfadada f, enojado m, enojada f. However, under angry, only enojado and enfadado are listed, not the feminine forms. Many other listings are inconsistent in this way. Would someone be willing to open up a discussion page so that a possible policy can be talked over? --El aprendelenguas 22:18, 15 April 2006 (UTC)[reply]

I set up Wiktionary:Languages with more than one grammatical gender and linked to it from Wiktionary:Language considerations. Ncik 23:30, 16 April 2006 (UTC)[reply]
Thanks, Ncik. :) I invite those who can make an opinion on the terms for this policy to review the proposal I have posted through the link above and to leave their comments and suggestions on the discussion page.--El aprendelenguas 19:53, 17 April 2006 (UTC)[reply]
Simpler: Feel free to add the gender variations where you feel they are needed. No-one who first adds a translation is under any obligation to show all the genders. Eclecticology 06:46, 18 April 2006 (UTC)[reply]
This would also work, but what is happening is translators notice that other people have written obvious, regular gender and plural forms of a word, and these translators assume that is Wiktionary's policy. If we write down a set of rules, then it will be clear how we want translators to organize their contributions, and there will be a consistency among Wiktionary entries.--El aprendelenguas 01:44, 19 April 2006 (UTC)[reply]

I needa "citation needed" tag

do y'all have one here? see w:template:citation needed. if I wann drop a note that something should be cited, is that doable? LingLangLung 05:45, 16 April 2006 (UTC)[reply]

We usually use the Request for Verification {{rfv}} tag for that. --EncycloPetey 05:46, 16 April 2006 (UTC)[reply]
{{rfv}} if you want citations for a word's existence. {{rfv-sense}} if you want citations for an individual sense's existence. {{unreferenced}} for other sections wanting citations, like etymology or usage notes. —Muke Tever 23:22, 16 April 2006 (UTC)[reply]
Any chance of getting a shorter, more memorable name than {{unreferenced}}? --Connel MacKenzie T C 03:38, 23 April 2006 (UTC)[reply]

Quote sources.

Certain of the quotes I use come from writers who are reasonably old (and usually reasonably dead), e.g. Rider Haggard, Lewis Carroll, Percy Fitzpatrick, etc. The books have been republished and I may be quoting from the fifteenth edition, or a newly copyrighted source or something. Consequently the date cited in my reference may be from 1970, or 1995, or a date which has nothing to do with the original publication date. This looks odd if I have marked the word as (dated) and show the modern publication date without reference to the original publication date; see e.g. koodoo. I could amplify the source, but this becomes unwieldy. The full citation for my edition of "Jock" is: "Longman Group Ltd (Formerly Longmans, Green & Co Ltd) First Published 1907, 19th impression 1943, New edition reset 1948, fifteenth impression 1976 ISBN xxx"

What to do? Andrew massyn 18:29, 16 April 2006 (UTC)[reply]
What I have been doing is putting one or the other date (it's been awhile, I forget which, but check old RFVs), in brackets or parentheses. Thus:
[1907] 1976, Sir Percy Fitzpatrick, Jock of the Bushveld
or
1907 (1976), Sir Percy Fitzpatrick, Jock of the Bushveld
If that's unclear you might add something explanatory like "1907 (1976 ed.)" —Muke Tever 23:28, 16 April 2006 (UTC)[reply]

The format I prefer is:

  • 1956: ‘This is an illustrative quote,’ chuckled Holmes. — Arthur Conan Doyle, Study in Scarlet (Norton 1983, p.45).

Widsith 08:02, 17 April 2006 (UTC)[reply]

Since we are giving illustrations of a word's history the most important year is when the author first used the word. The publication year for the edition we are using should be after the title. The year for A Study in Scarlet should be when Doyle first published the story in The Strand or other magazine of his day. Eclecticology 06:56, 18 April 2006 (UTC)[reply]
And of course the problem could be circumvented if the first edition is available for quoting. But that doesn't address your question. Davilla 14:23, 23 April 2006 (UTC)[reply]

As long as the issue is already under discussion, I have a related problem in quoting from Shakespeare. The problem is more quickly illustrated than explained. Here is the quotation I used on the page for attested:

  • 1599 Shakespeare, Twelfth Night v 1 (First Folio edition)
    A Contract of eternall bond of loue,
    Confirm'd by mutuall ioynder of your hands,
    Attested by the holy close of lippes,

But in most cases, I rely on the Shakespeare text provided in WikiSource, which would look like this:

  • 1599 Shakespeare, Twelfth Night v 1
    A contract of eternal bond of love,
    Confirm'd by mutual joinder of your hands,
    Attested by the holy close of lips,
  1. Notice there are significant changes to spelling and capitalization. All the Shakespeare texts in WikiSource have been edited to modern conventions, as have nearly all of the available print editions. I do have a facsimile copy of the First Folio edition, but it is large, unwieldy, and does not include all the plays or texts of Shakespeare, and some speeches are significantly different from modern editions which make use of other period sources.
  2. What date do we use? WikiSource gives the best guess as to the date of each play's composition, often based on good historical evidence. If we instead choose to use the first publication date, that would usually be for the First Folio edition, but if we're using modern edited text, then how should we indicate that?
    • For our purposes I would think that the date of publication is more important—see comments below. The other date is kind of important though, so if you want to list it as well, maybe you could try to see what format makes sense. Maybe parenthesized after the author's name? Bolding not necessary. Davilla 12:29, 22 April 2006 (UTC)[reply]

This is all a bit of a headache for me. --EncycloPetey 07:09, 18 April 2006 (UTC)[reply]

  • As I've been digging up more cites the past year or so I've found there is more and more to think about. Many later editions make slight changes over earlier editions, UK and US editions often change spelling, and sometimes change vocabulary, cites may be found in footnotes, prologs, etc added at a later time than the original publication date and by the same or a different author. Some interesting examples:
    • Prescott's History of the Conquest of Mexico was published in the USA not so long after the publication of Webster's dictionary. So spellings found in this book may depend on whether you have a US or UK edition, and whether the publisher of that edition took it upon themselves to "correct" spelling. Then again, since it's often stated that American spellings such as color and center were introduced by Webster, maybe Prescott was still using traditional spelling, which was the same as British spelling, and later US publishers took it upon themselves to "correct" it. Without giving the publisher and date of the edition you cite, how will we know what's what?
      So in this case it would be necessary to list the publisher, no? Davilla
      Just this case? I only noticed it because I wondered and months later happened to find a different edition to mine. It might be best to give publisher and year always. — Hippietrail 02:46, 23 April 2006 (UTC)[reply]
    • Murakami's The Wind-up Bird Chronicle was originally written in Japanese and subsequently translated into American English. When it was published in Britain, not only spelling was changed but terminology, up to and including boom box becoming music machine. In this case it's technically wrong not only to cite these as Murakami's usage, but also wrong to cite them as the translator's usage, but they may still be worth citing because they are in print.
      The author of the work is of course the original author, but what's more important to us is who decided on the specific language shown. Except through a translator, how a foreign language author phrases his ideas has no bearing on attested English use. It may be more correct to list the author first, but just as important in my opinion to list the translator. Davilla
      I think for our purposes it's even "more correct". I believe the OED only lists the translator in such cases. For a book of quotes or somesuch the original author must be primary. For works concentrating on the use of a particular word in a particular language, the translator made the ultimate decision of word. I believe the English word id in psychiatry is due to a translator. Of course the translator is bound by his job of representing the original and wouldn't have published the use we're interested in if the original author hadn't initiated the sequence of events. — Hippietrail 02:46, 23 April 2006 (UTC)[reply]
  • Also what do the dates mean, date of authorship? date of publication? date of publication in the original language? date of this translation? date of this exact edition?
    • An interesting case I can think of here is the posthumous publication of the works of John Kennedy Toole. While Confederacy of Dunces was published only a few years after his death, The Neon Bible was published at least 20 years after it was written.
      Date means the date that the specific language shown was decided upon. Since posthumous works may still be edited, and since the publisher could have easily made minor alterations—from misinterpreted handwriting to obvious misspellings (e.g. the name of a political figure) to preferences in style even without intention—it is the date of publication. If you have a dusty manuscript and therefore better knowledge of what the author really intended, then you're welcome to make the corrections. Davilla 12:21, 22 April 2006 (UTC)[reply]
      Of course all publishers are capable of making edits and are almost never credited. There is almost never a way to know for certain which words are due only to the author and which to subsequent people. But this doesn't mean a citation doesn't simply list the author as stated. Why not list both the date a work was written and the date it was first published? This allows the reader to think "well the editor might've updated it" without us stating she did or she didn't. — Hippietrail 02:46, 23 April 2006 (UTC)[reply]
      I have no problem listing both dates. The question is which should be bolded and listed first for the purpose of sorting quotations. Credit the author, of course, but use the date of publication. Davilla 14:23, 23 April 2006 (UTC)[reply]

There's a lot to think about! — Hippietrail 18:56, 20 April 2006 (UTC)[reply]

  • Another case I thought of after my original post is uncredited translators. I own at least one such work, a translation of Siddhartha by Hermann Hesse into Spanish. It could be due to lazy publishers just leaving it out, it could be their decision because the translation is out of copyright, or it could be that they got the translation cheaply and told the translator she wouldn't be credited. What to do in such cases? — Hippietrail 02:46, 23 April 2006 (UTC)[reply]
    It would be necessary to note that it's a translation, at minimum. In my opinion it would be a good idea to say the translator is unknown. Davilla 14:23, 23 April 2006 (UTC)[reply]

AOL sitenotice

As noted above, it is rather ridiculous to refer people to IRC for a simple explanation, especially AOL users. I have created Wiktionary:AOL for the purpose of changing the IRC link in Mediawiki:Sitenotice to point there. -- Beland 18:48, 16 April 2006 (UTC)[reply]

And I've updated the site notice to say as such, thanks for the good work :) -- Tawker 05:35, 17 April 2006 (UTC)[reply]

Vote on format of entries to be created by Connel's inflection bots

A vote on the format of entries to be created by Connel's inflection bots is being held here. Ncik 23:13, 16 April 2006 (UTC)[reply]

Restarted again at #Vote for User:TheCheatBot format. --Connel MacKenzie T C 16:16, 12 May 2006 (UTC)[reply]
Moved to Wiktionary:Tea room#lard. Vildricianus 10:03, 17 April 2006 (UTC)[reply]

Revamping Beer parlour

A meta-consideration of mine on this overly long page. Probably an emotionally difficult matter; however, it's becoming more and more of a problem for people with a slow connection to load this page, and its maintainability is low. Does anybody have ideas to tackle this? For the "new room" proposal, which has garnered relatively little response, I looked into the way Wikipedia handles its Village pump, and some aspects there are attractive (most are not).

Now, does anyone think it'd be beneficial to have our main talk page split up into a number of sections in order to separate "heavy talk" (such as bot or policy proposals) from lighter palaver? That way, people can choose more easily which to follow and which not. There would then be one master page that contains all, like it is done at the Village pump, for quick viewing of all posts in all sections. Note that each section remains as editable as ever, on either page.

I'd say that there would be a unified archive then, in order not to complicate things, but perhaps separated by month from then on?

See also Wiktionary talk:Information desk#"Include" Wiktionary:Information desk in Beer Parlour ?.

Too futuristic? Too ambitious? — Vildricianus 10:39, 17 April 2006 (UTC)[reply]

Does your page need revamping? Fear not; for Dangherous is a veteran of page revamps (my C.V. includes redesigns of Main Page, WT:CP, Wiktionary:Things to do and WT:WOTD). I'll be willing (and maybe expected ;)) to help make it look prettier. As for optimistic and futuristic, not at all. Although, if my past revamps are anything to go by, it will take a few weeks for the redesign to go "live", and patience and imagination will be necessary - well, either that, or just copy the framework of another page. --Dangherous 11:47, 17 April 2006 (UTC)[reply]
It shouldn't just be prettier, but mainly more useful and smaller. — Vildricianus 14:15, 18 April 2006 (UTC)[reply]
Not futuristic enough. All talk should be in discussion threads rather than wiki-editable, and besides the response hierarchy, every message would be categorized as pertinent to a certain page, e.g. the word "lard", a user's bot page set up as a proposal, the AOL sitenotice itself, etc. The tea room then is simply a forum, the automatic root of all discussions concerning any word in the main corpus. The same message could be found by clicking discussion on lard as could be viewed from the tea room. The difference is that the tea room is so much bigger that the discussion would have to be recent to show at the top. RfV and RfD's, besides being on the topic of a particular word, would also belong to the WT:RFV and WT:RFD subcategories of WT:TR. Past discussion wouldn't disappear when an entry is deleted. The beer parlour could be further categorized into topics just as most forums are. Sending a message is accomplished by categorizing with the user page. Moving a thread from a user's talk page to the beer parlour is simply a matter of adding a category. If a discussion drifts from one topic to another, the response hierarchy is preserved, but somewhere along the line the messages could be recategorized. Essentially wiki is great for a collaborative effort, but there are better ways to accomplish talk. This proposal links the standard discussion forum with the structure of Wiktionary. Davilla 14:43, 17 April 2006 (UTC)[reply]
Do you think you can implement all this prose of yours? :-) — Vildricianus 14:15, 18 April 2006 (UTC)[reply]
No. Anyways it would be easier to do at Wikipedia first. A forum in place of the discussion page is much more straightforward. There aren't complications with loose messages flying around needing categories to tie them to pages. There is also a lot more interest and involvement at that project. They just don't need it as much as we do. Davilla 19:49, 18 April 2006 (UTC)[reply]
  • I'd say unless a better solution comes up, just do it like the Village pump does. It works pretty well and made a totally overloaded page useful again. Split into a few basic topics that get the most traffic and then break it down further if a given topic starts overloading again. Currently, this page is a serious problem to load on slow connections. - Taxman 20:17, 18 April 2006 (UTC)[reply]

Since this discussion is about WT:BP, doesn't it belong on Wiktionary talk:Beer parlour? Rodasmith 20:07, 19 April 2006 (UTC)[reply]

It does, but a major change like this should be viewed by all, I think. — Vildricianus 16:59, 20 April 2006 (UTC)[reply]

pasted in from User talk:Vildricianus:

Wiktionary talk:Beer parlour is on the watchlist of everyone who watches Wiktionary:Beer parlour, so posts to each location have the same audience, right? (Posting here since I loathe the length of WT:BP. Let me know if you would rather I post on WT:BP.) Rodasmith 17:10, 20 April 2006 (UTC)[reply]
I don't have the Beer parlour in my watch list. Some people don't have anything in their watchlist here, and merely read the Beer parlour. I still think this should be kept there for a while to draw attention. Once more specific things are to be discussed, it can be moved (which is, according to me, how the majority of BP-discussions should be treated - alas, this only happens in rare instances). — Vildricianus 17:15, 20 April 2006 (UTC)[reply]
OK. BTW, the length and disorganization of BP and the resulting tendency of editors not to keep it in their watch list somewhat explains the low participation in the use-mention orthography vote. Peace. Rodasmith 17:59, 20 April 2006 (UTC)[reply]
Yes, I know. Which is, obviously, why I want to change it! Cheers. — Vildricianus 18:16, 20 April 2006 (UTC)[reply]
  • I don't use my watchlist for anything anymore. After a couple thousand entries, it isn't usable. I do know that comments made on that talk page of discussion pages here usually go unnoticed for months. I think talk:WT:BP, talk:WT:TR, talk:WT:RFV, etc. should redirect to the main respective discussion page to avoid that problem. --Connel MacKenzie T C 18:24, 20 April 2006 (UTC)[reply]
    • Perhaps we can make them usable again by including all of them on some "main talk page" - a single page to read all comments on (similar to the proposed "main Beer parlour page"). — Vildricianus 18:29, 20 April 2006 (UTC)[reply]

Now "everyone" knows about it, this thing is, as suggested, up for further discussion at Wiktionary talk:Beer parlour#Revamp. — Vildricianus 09:18, 21 April 2006 (UTC)[reply]

Citations

I've come across a category on Wiktionary called Citations.

Wouldn't citations be better categorized in Wikiquote?

Feedback requested please.

Thanks

68.148.165.213 04:26, 18 April 2006 (UTC)[reply]

We use citations on Wiktionary to show subtle ways a term is used, and when. Note that the citations are generally in much greater depth and breadth than the on-page quotations. I'm sure Wikiquote has a much different goal. There is plenty of overlap between projects as it is; they are mainly symbiotic entities. Category: Citations simply collects the ones that have already been done; from a technical perspective, all terms here should eventually have such citations. --Connel MacKenzie T C 05:33, 18 April 2006 (UTC)[reply]

"Sum of its parts" argument in RFD

(I have already posted something similar at WT:RFD, but here is a more appropriate place.)

At some point, although I don't remember reading about it, we seem to have decided that an entry being the sum of its parts is not sufficient grounds for it to be deleted. This hasn't stopped WT:RFD being littered with pairs of postings of the form "Delete - sum of its parts — 'sum of its parts' is not a valid reason". Why is this? I don't understand.

If someone creates an entry big red bus, everyone else agrees that it should be deleted, and that the reason it should be deleted is that it is the sum of its parts, or unidiomatic. Or is it that "unidiomatic" and "sum of its parts" are being understood to have different meanings? The phrase little green man is idiomatic but is more than the sum of its parts (it is an extraterrestrial and not simply a short viridian adult male, and indeed need not even be little or green, and is arguably not a man). Is there some distinction between the two terms "unidiomatic" and "sum of its parts" I am not aware of?

So if I go ahead and create big red bus, which I won't, what reason should be given for it to be deleted? It seems to me that some unidiomatic terms are getting through because those who RFD them as "sum of its parts" are being slapped down.

It would be helpful if someone could clarify why "sum of its parts" is disallowed and what the appropriate reason for deletion would be for expressions such as "big red bus". Thanks. — Paul G 09:58, 18 April 2006 (UTC)[reply]

I don't know if this is being sneaky and reinventive or if it's what we've intended all along. Sum of parts means that one mentally sums the parts to derive the meaning. A set phrase like risk management could be a literal summation of its parts, a combination of some sense of each word. However, the phrase taken altogether recalls ideas that are not associated with the parts individually. Risk management is, in fact, a sum of its parts, but the sum of parts argument cannot be used because one who is familiar with the term does not sum the parts when interpreting the phrase. Davilla 15:01, 18 April 2006 (UTC)[reply]
Apparently the reason is because we are going by the Pawley list. In section 3, it states: "Customary status: Does the use of the phrase imply certain behavior patterns, values, or sequences of activities that are known by society at large? They represent conventionalized knowledge. For example, expected behavior at the front door is different from at the back door (besides their participation in idioms), indicating that these function as cultural units (lexemes) that are more significant than the sum of the parts. Consider go to the mosque, get off work, take a vacation."
So it looks like we should probably be saying "does not have customary status" instead of "sum of its parts". As this is really another way of saying "unidiomatic", "unidiomatic" is probably the right term to use. — Paul G 15:50, 20 April 2006 (UTC)[reply]
Since it was my "slap down" that instigated this, I feel I ought to reply. Being idiomatic is only one of the very many reasons a phrase can or should be included. Using the Pawley list as a starting point, I'd think any RFD nomination would need to fail all twenty three rules. To say something is "unidiomatic" means that it fails only one of the twenty-three possible reasons. Unfortunately, the Pawley list is often forgotten here. It also never did get voted on, to make each of those criteria official rules here. I do not think any of them are unreasonable. --Connel MacKenzie T C 19:10, 20 April 2006 (UTC)[reply]
#7 is unreasonable. The father of Jacob, additive identity element, and murder of any person or people by a government do not deserve entries. Davilla 21:34, 20 April 2006 (UTC)[reply]
"murder of any person or people by a government", apparently the definition of democide, is not a set enough phrase in English to meet our CFI on attestation grounds. "additive identity element" (zero?) is something I think that should be added, as the term as a whole is quite opaque to a layman and on its own terms (additive + identity element) does not build on any ordinary use of the word "additive". If "father of Jacob" is frequent enough as an element to merit inclusion, it should certainly be added, along with any other cultural-reference collocation that will be opaque to everyone outside it: sons of Britain, father of America, etc., especially if the supporting cites demonstrate that the author does not bother to explain the phrase elsewhere in the work. —Muke Tever 12:39, 21 April 2006 (UTC)[reply]
The Pawley list does not say it has to be a set phrase. If #7 were "a definition that's also a set phrase" then I would agree to it. Now, there can't be too many lay words in English whose definitions are precise. The point is that technical definitions more commonly can be. If I gave you one of these definitions, presumably in as short a form as possible, and it had a synonym of one word, Pawley would include the entire definition as an entry. undecidable <-> neither provable nor unprovable, yield <-> coupon rate divided by market price, phoneme <-> set of cognitively equivalent phones. That would be just crazy. And on top of it all I don't even like the example he used, for the reason that, although suitable as a synonym, the only one of its kind is a poor definition of unique in that it implies no comparative or superlative form. I'm ready to burn it and bury it. Davilla 19:16, 23 April 2006 (UTC)[reply]
I agree with Davilla on this one. Pawley rule 7 is anathema to Wiktionary and we should not adopt it. The other rules are OK, IMO. — Paul G 15:17, 24 April 2006 (UTC)[reply]
I'm not so sure. If the phrase in question meets the attestation aspects of our CFI, then rule #7 would be the appropriate counterpart. That is, does "set of cognitively equivalent phones" occur in running text (no secondary sources) often enough to meet CFI? It does not. In fact, (with quotes) it doesn't get any books.google.com hits. Nor does yield's definition. So obviously, they aren't set phrases.
Davilla's examples in this case are perhaps extreme. But, in my opinion, the point of referring to the Pawley list is to avoid inappropriate RFDing of terms. Any phrase still needs to otherwise survive the RFV process, if genuinely questioned. I think he gave poor examples, in that they most certainly would not survive the RFV process as it exists now. --Connel MacKenzie T C 16:14, 12 May 2006 (UTC)[reply]

Well anyway, if the Pawley list is considered a good start for making WT:CFI a bit less abstract (and personally I think it is), we should turn some more attention to it, improve it, agree on it and make it into an official policy (as an integral part of CFI then), right?

I noticed CFI isn't even an official policy. Isn't it about time that we get at least one official policy? If it isn't, then perhaps we should do away with the categorization of policies into degrees of endorsement. —Vildricianus | t | 18:16, 12 May 2006 (UTC)[reply]

Football, Soccer, whatever...

With the World Cup coming, it might be a good time to consider having a visible spot where related words are displayed for the purposes of acquiring translations into more languages. --EncycloPetey 10:16, 18 April 2006 (UTC)[reply]

It would be nice to have periodical special sections like this on the Main Page eventually, rather as Wikipedia do with their News bit. Widsith 10:26, 18 April 2006 (UTC)[reply]
Perhaps we can direct our Word of the day into that? — Vildricianus 13:50, 18 April 2006 (UTC)[reply]
That's not really what WOTD is about. It's closer to what Project:Beacon is doing. --EncycloPetey 08:48, 19 April 2006 (UTC)[reply]
The notion of a "Category of the week" (i.e. Category:Football (Soccer)) is clever. I'd like to see what an experimental box would look like. --Connel MacKenzie T C 19:14, 20 April 2006 (UTC)[reply]

WOTD: New system

Has anyone thought about a better system yet? Before being bold with that, I'll propose here what I'd do:

Thoughts? — Vildricianus 09:58, 17 April 2006 (UTC) (moved down to get some attention) — Vildricianus 14:22, 18 April 2006 (UTC) [reply]

There has been a proposal on the Talk page for WOTD made by Connel M., but I don't know that people have commented much on it. His proposal looks OK to me, if I understand it correctly, but I don't know enough about the mechanics of WIKI code to be sure of the suggestion. --EncycloPetey 08:50, 19 April 2006 (UTC)[reply]
Ah yes, Wiktionary talk:Word of the day was the page I was looking for. I seem to have forgotten about it. — Vildricianus 16:18, 19 April 2006 (UTC)[reply]

Bot request, is this feasible?

Discussion moved to User_talk:Taxman since I initiated the request and am receiving assistance in implementing it. Additional help would not be unwelcome though. - Taxman 15:27, 22 April 2006 (UTC)[reply]

Template documentation

There has been a constant demand for more, better, and more clearly organized template documentation, but Wiktionary has no policy regarding where to put such documentation. Two common methods follow:

  1. Document templates inside <noinclude>...</noinclude> sections of templates.
    Supporting argument: It is the most common method currently in use at Wiktionary and the location is inconsequential. (If it proves to be a load on the servers, then we'll simply have to move them to talk pages.)
  2. Document templates on the template talk pages.
    Supporting argument: Doing so is consistent with Wikipedia (e.g.: w:Template talk:cite web) and reduces server load.

Your input on how to document templates is welcome at Template talk:en-infl-reg-vowel-e/Template documentation. Rodasmith 19:52, 19 April 2006 (UTC)[reply]

Now at Wiktionary talk:Template documentation. — Vildricianus 11:37, 21 April 2006 (UTC)[reply]

Category documentation

On a "related" tangent, there is also the overlooked need for category documentation. Just yesterday I found that Category:Phrasebook had no link to the project, although the project is the only support of its existence. This is also common with linguistic categories that don't suggest related concepts. Davilla 04:53, 20 April 2006 (UTC)[reply]

cacodaemonical and related general question.

cacodaemoniacal : note to myself: find quotes. Further note to myself. There are 14 uses of the word on Google, and none in written form. For cacodaemonic there are over 640 uses on Google and 3 in print including Salman Rushdie. Cacodaemonic is defined in my dictionaries, the other is not. Is cacodaemonical a variant or bad usage?

This could apply to any number of words. What is Wiktionary's policy regarding this sort of thing? My own view is not to encourage it, even if there is usage out there. I am not a proponent of rigid English by any means; however, do we encourage bad downrite aful inglish or not? Andrew massyn 20:55, 19 April 2006 (UTC)[reply]

There is nothing inherently bad about it. English allows adjectives to end in -ic or -ical and cacodaemonical seems like a valid word to me. Widsith 21:26, 19 April 2006 (UTC)[reply]

I appreciate that the example may not be the best, but what is the general view on the question. I have the same difficulty with electrocutioner where I have just provided a genuine quote (the quote being a footnote to the case, which was a deposition by one of the witnesses to the attempted execution). Andrew massyn 21:37, 19 April 2006 (UTC)[reply]

In the first example, cacodaemoniacal isn't considered a valid adjective, because the English language already has an adjective with that meaning (cacodaemonic), right? I'm glad to see that our general guidelines and CFI would probably not allow the "-iacal" form to pass RFV.
The second example is not as strong, in my opinion. "Electrocutioner" is a play on "executioner," certainly. Or rather, a modern variant of it. But I don't see how that complaint is the same. --Connel MacKenzie T C - edit 00:17, 20 April 2006 (UTC)[reply]
The existence of one word shouldn't have any negation on the existence of similar or synonymous words. demoniac and demoniacal both exist, for example, don't they? —Muke Tever 00:50, 20 April 2006 (UTC)[reply]
Indeed. See also comic/comical, tragic/tragical, electric/electrical, et. hundreds of al. It is very common in English and quite legitimate (although often one form has taken on different connotations, or is more archaic (I note from the OED that archaical has also been recorded!)). Widsith 09:46, 20 April 2006 (UTC)[reply]
For what it's worth cacodaemon appears in H. P. Lovecraft's (bad) poem Astrophobos. Have you tried googling for the American spellings cacodemonic / cacodemonical? --EncycloPetey 09:54, 20 April 2006 (UTC)[reply]
I'd like to note that I did not think the spelling (on the first line of text in this section) ending with "-iacal" was a typo. Obviously, that spelling would have a rather different pronunciation than it would have if spelt "-ical". As Muke pointed out, other examples exist in English. I have a vague notion that there is a prescriptivist reason against the "-iacal" (and "-ical") spellings when a shorter synonym ("-ic") already exists. But that seems to just be my POV; I've not found useful references to back that notion up. --Connel MacKenzie T C 16:26, 20 April 2006 (UTC)[reply]

Indeed you are right! I took the word from the "requested words" and then in my note to myself, got the spelling wrong (bad BAD editor, Andrew!). Still, it seems to strengthen my question. What do we do with bad English which is perpetuated in the ether, but not in print? Andrew massyn 20:56, 20 April 2006 (UTC)[reply]

If it's common enough to meet CFI, you mark it as nonstandard, preferably also citing a reputable source that deprecates it. —Muke Tever 22:23, 20 April 2006 (UTC)[reply]
I submitted the word, having seen it in published works by Lovecraft, including The Hound, among others. I assume it is a derived term, and see no reason not to put it in Wiktionary.
Quote- "There were nauseous musical instruments, stringed, brass, and wood-wind, on which St John and I sometimes produced dissonances of exquisite morbidity and cacodaemoniacal ghastliness; whilst in a multitude of inlaid ebony cabinets reposed the most incredible and unimaginable variety of tomb-loot ever assembled by human madness and perversity."

sewnmouthsecret

Indeed. I redact my earlier comment about cacodaemoniacal not meeting CFI. I wonder which of the variants (cacodemonic, cacodaemonic, cacodæmonic, cacodemoniac, cacodaemoniac, cacodæmoniac, cacodemonical, cacodaemonical, cacodæmonical, cacodemoniacal, cacodaemoniacal, cacodæmoniacal) should exist in Wiktionary. I'm still unclear on what usage notes (if any) should go with them. --Connel MacKenzie T C 21:23, 21 April 2006 (UTC)[reply]
I think that one lot (without the 'ae') is American usage, and the other, with the 'ae' seems to be English usage. I intend to put in the lot of them, and find the necessary citations. As for the one with the 'aical' at the end, I intend to leave it out. It offends my soul too much (no pun intended). Regards Andrew massyn 23:28, 21 April 2006 (UTC)[reply]
Not to cause a huff, but would anyone object to putting in cacodaemoniacal (not -aical)? It does appear in print, after all. Also, Wiktionary's stated goal is "A term should be included if it's likely that someone would run across it and want to know what it means." If one were to read this, and come here to find it, they would not find an entry. Also, not to upset people, but simply because a word form offends someone doesn't mean it warrants exclusion. There are MANY "words" listed that are naught but popular culture vernacular that offend me to be in here. This word is legit, as it is in print. SewnMouthSecret
User:Andrew massyn was saying that he refused to enter the term. You are welcome to enter it, certainly. It should have links to the other forms and perhaps a usage note saying which forms are obsolete (if known.) --Connel MacKenzie T C 15:49, 12 May 2006 (UTC)[reply]

Words that are composed of words

I think I saw a format for words, like landholder, that are composed of other words, for linking to those component words. I just can't find it now.

If I just link both words without a space between them landholder ([[land]][[holder]]) then it looks like one link. But if I put a space between land holder then it looks like it should be written as two words. Do I just ignore linking or do I put a dot in between or what? JillianE 14:07, 20 April 2006 (UTC)[reply]

I'm not sure what you mean - this links to both words: landholder? Jonathan Webley 15:32, 20 April 2006 (UTC)[reply]
But it looks like just one link even though land and holder are linked to different words. You can't see that it is separate words. JillianE 15:38, 20 April 2006 (UTC)[reply]
But it IS only one word. You should not try to separate the headword. You could use the Etymology section to point out that it is a combination of two words. SemperBlotto 15:42, 20 April 2006 (UTC)[reply]
For example, in the etymology section, use From land + holder. Jonathan Webley 15:44, 20 April 2006 (UTC)[reply]
OK, Etymology looks good. That is probably where I saw it used but couldn't find again. Thank you. JillianE 15:50, 20 April 2006 (UTC)[reply]
On a sidenote, I've only been italicizing terms in etymologies that are from another language, terms in the same language as the entry I leave normal - but I always wikify in either case. I would like to ask our main etymology experts and especially Stephen who is also a typesetting expert, what is best practice here? — Hippietrail 17:04, 20 April 2006 (UTC)[reply]
I'm neither a typographer nor an etymologist, but I prefer to follow the practice of many print dictionaries and use italics for words in other languages and bold for English words, wikifying both kinds. The OED and some other dictionaries use small caps for English words. — Paul G 09:32, 21 April 2006 (UTC)[reply]
I've been a typographer (briefly). My practice has been not to italicize words in etymologies. Part of the reason for this is that some scripts and some fonts used by web-browsers make italics nearly unreadable. Cyrillic in particular becomes very difficult to read in italics, and Hebrew and ancient Greek suffer as well. Since I prefer to be consistent in what I do, I have not used italics in etymologies, and recommend against doing so. --EncycloPetey 09:55, 21 April 2006 (UTC)[reply]
Personally I agree with EP on this. I think the wikification is enough visual distinction. Widsith 13:58, 21 April 2006 (UTC)[reply]
Well, I agree with EncycloPetey that words in non-Latin scripts look better when not italicised, but maintain that words in languages that use the Latin alphabet should be italicised to indicate that they are being quoted rather than used.
For example: "Italian radio" in plain type looks like it means "a radio that is Italian", but "Italian radio" with "radio" italicised shows we are talking about the Italian word "radio". Wikification alone is not sufficient to make this distinction, especially if the language name is also wikified: "Languagename word".
In languages that don't use the Latin alphabet, the fact that a word is in a script other than Latin is sufficient to show it is being quoted rather than used; for example: "Greek έχω". — Paul G 14:26, 21 April 2006 (UTC)[reply]
This is a carry-over from the debate on formatting
# Plural of [[...]].
The point is that English words are never written in another script, so any time the standard calls for italicizing or bolding words for emphasis, these preferences can be safely ignored in any other script. The fact that Latin characters aren't used is enough emphasis in and of itself. Davilla 16:02, 21 April 2006 (UTC)[reply]
Agreed. Sentences can only use words of their own script, so they can safely mention words of other scripts (but only such words) without italics. Rodasmith 19:34, 21 April 2006 (UTC)[reply]
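The conventions suggested in this thread can be sketched in a few lines of Python. This is only an illustration, not an agreed policy: it assumes Jonathan Webley's "From land + holder" pattern for same-language compounds and Paul G's rule that mentioned words are italicised only when written in the Latin alphabet. The function names are hypothetical.

```python
import unicodedata


def is_latin_script(word: str) -> bool:
    """Return True if every alphabetic character in `word` is a Latin letter."""
    letters = [ch for ch in word if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def format_mention(word: str) -> str:
    """Wikify a mentioned word; italicise it only when it is in Latin script."""
    link = f"[[{word}]]"
    return f"''{link}''" if is_latin_script(word) else link


def compound_etymology(*parts: str) -> str:
    """Build an etymology line for a same-language compound (links, no italics)."""
    return "From " + " + ".join(f"[[{part}]]" for part in parts) + "."


print(compound_etymology("land", "holder"))
print("Italian", format_mention("radio"))
print("Greek", format_mention("έχω"))
```

Under EncycloPetey's alternative (no italics at all), `format_mention` would simply return the bare link in every case.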

create test template

I'd like to experiment with templates. In particular, I'd like to create a template of my own. I don't want to pollute the Template namespace, though. Is it possible to achieve this goal? If so, how?

Yes, you can make a user subpage like User:yourname/Subpage, which you can include elsewhere like this: {{User:yourname/Subpage}}. — Vildricianus 15:56, 20 April 2006 (UTC)[reply]
Wait, you can do this!? And it even has the little edit links! Wow, I thought there was something special about the template: namespace. Well this is awesome news.
Colonel had asked a while ago, can't recall where, about doing language-specific searches. If pages were divided into one language/dialect each, e.g. en:burrito English:burrito and es:burrito Spanish:burrito, as separate name spaces, then a language-specific search would be akin to a namespace search, which is already possible. And it wouldn't look any different than it does now. The page burrito would consist of:
See also [[Burrito]]

{{English:burrito}}
----
{{Spanish:burrito}}
All you would need, minimally, is a bot to maintain things, e.g. new entries properly divided into the correct pages. So this is already feasible with the existing technology! Davilla 20:47, 20 April 2006 (UTC)[reply]
We could use the English names of languages within the English Wiktionary, or some other proposed namespace strategy. Davilla 15:55, 21 April 2006 (UTC)[reply]
Why does es:burrito disappear when wikified? Davilla 20:51, 20 April 2006 (UTC)[reply]
Because [[es:burrito]] produces an interwiki link to the page "burrito" on the Spanish wiktionary. I don't believe the current interwiki table would allow a pagename like es:burrito, it would always send people to es:. —Muke Tever 22:21, 20 April 2006 (UTC)[reply]
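As a rough sketch of the maintenance bot Davilla describes, the Python below splits a page's wikitext at its level-2 language headings and builds the wrapper page that transcludes one subpage per language. It is a hypothetical illustration only: the `English:` and `Spanish:` namespaces do not exist, the function names are invented here, and a real bot would also need to read and write pages through the wiki.

```python
import re

# Matches a level-2 heading such as "==English==" but not "===Noun===".
LANGUAGE_HEADING = re.compile(
    r"^==[ \t]*([^=\n]+?)[ \t]*==[ \t]*$", re.MULTILINE
)


def split_by_language(wikitext: str) -> dict[str, str]:
    """Map each language name to the text of its section."""
    sections = {}
    matches = list(LANGUAGE_HEADING.finditer(wikitext))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        body = wikitext[match.start():end].strip()
        # Drop the trailing "----" separator between language sections.
        body = re.sub(r"\n-{4,}\s*$", "", body).strip()
        sections[match.group(1)] = body
    return sections


def wrapper_page(title: str, languages) -> str:
    """Build the page that transcludes one per-language subpage each."""
    return "\n----\n".join("{{%s:%s}}" % (lang, title) for lang in languages)


page = (
    "==English==\n===Noun===\n# A Mexican dish.\n"
    "----\n"
    "==Spanish==\n===Noun===\n# A small donkey.\n"
)
sections = split_by_language(page)
print(wrapper_page("burrito", sections))
```

Each value in `sections` is what the bot would save to the corresponding per-language page, so a namespace-restricted search would then behave as a language-restricted search.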

Wiktionarians hierarchy

A) There are different titles a Wiktionarian may assume, e.g. a simple user, a Bureaucrat, an Administrator.

Question: please write down all possible titles.

B) Statement: "A Bureaucrat has more privileges than a simple user; an Administrator has all the privileges of a Bureaucrat and then some. More generally, all titles may be written in a list according to the privileges that come with each of them."

Questions: 1. Is the statement correct? 2. If so, please order the list of part (A) accordingly (least privileged first). 3. If not so, please describe the hierarchical relations between the titles.

C) Please describe concisely each title's privileges (and duties, if any).

Difference between "Appendix" and "Wiktionary Appendix"

From Category talk:Appendices:

Will someone please explain the difference between these two? Looking at the list, I can't see why some are listed as one instead of the other. There doesn't seem to be any rhyme or reason behind listing them. --日本穣 03:04, 3 March 2006 (UTC)[reply]

We haven't yet made a policy deciding which one is preferred. --Expurgator t(c) 21:19, 10 April 2006 (UTC)[reply]
Yes, don't we have to decide on which to use? — Vildricianus 10:16, 21 April 2006 (UTC)[reply]
Theoretically would the first, Appendix:, be part of the dictionary, something that would be included in a print version if there ever were one? Declensions are an example, I would think, including the Latin. The second is for the operation of Wiktionary, if there are any pages that merit that. What is the purpose of, say, Wiktionary Appendix:Forms and shapes? I couldn't see having any lists in the "final product". That's what categories are for! Davilla 15:51, 21 April 2006 (UTC)[reply]
Last year, we had a tiny BP vote for the pseudo namespaces Appendix: and Index:. The Wiktionary Appendix: and Wiktionary Index: namespaces are old and should be cleared out to the new pseudo-namespaces at some point. I don't think bureaucrats can effect the change of making a pseudo-namespace into a real namespace, can they? Lists (especially for individual languages) go in the Index:, while "appendix-y" stuff goes in Appendix:. --Connel MacKenzie T C 05:21, 5 May 2006 (UTC)[reply]

Assistance for following discussions

There is now a template {{active discussion}} (or {{ad}}) and an accompanying Category:Active discussions. Perhaps useful. — Vildricianus 10:30, 21 April 2006 (UTC)[reply]

It is suggested that just below the ==Language== heading is an appropriate place to put any disambiguation information, for disambiguation between words that look very similar (especially to non-native speakers) but have very different concepts. eg: cobbler/cobblers.

Carry the discussion on in Wiktionary talk:Disambiguation in layout - Policy Think Tank--Richardb 10:55, 21 April 2006 (UTC)[reply]

commonly used templates?

Is there a list of commonly used templates? RJFJR 13:31, 21 April 2006 (UTC)[reply]

This should be at Wiktionary:Index to templates. However, I'm rewriting this, and perhaps you find something useful at User:Vildricianus/Page9. If you're looking for inflection templates, Wiktionary:Inflection templates is the page to be. — Vildricianus 13:44, 21 April 2006 (UTC)[reply]
Thank you. RJFJR 14:42, 21 April 2006 (UTC)[reply]

Working on categorizing all templates to make them more accessible. Any help is welcome. Category:Templates. — Vildricianus 18:50, 25 April 2006 (UTC)[reply]

Should many of these apply to the No go match "advanced" blank page outline at template:new en? Davilla 16:38, 26 April 2006 (UTC)[reply]
On a related note, I thought there was a special page listing all templates. I wanted to look for something there but there is no such page. Is it somewhere else? — Hippietrail 16:56, 26 April 2006 (UTC)[reply]
WT:I2T used to be that index, but it has been broken into about a quarter of a million hard-to-find sub-pages now. --Connel MacKenzie T C 09:53, 27 April 2006 (UTC)[reply]

enable browser to display IPA

The characters of the IPA alphabet are not displayed properly in my browser. Instead of the IPA characters I see small rectangles containing digits. What do I need to do to enable my browsers to display IPA? More specifically, 1. What fonts should I download? 2. Where are these fonts available on the net? 3. How do I install them? 4. How do I configure my browsers to use these fonts?

My system: OS: Debian linux (sarge) Browsers: FireFox, Konqueror, Mozilla

Thanks, Itay 13:43, 21 April 2006 (UTC)[reply]

At the risk of sounding like I'm just saying RTFM, maybe you might find something in the help documentation of your web browsers about this. Try looking for stuff on displaying foreign languages and Unicode. — Paul G 14:29, 21 April 2006 (UTC)[reply]
Glad you didn't actually RTFM him seeing as most users of this dictionary are going to be web surfers who know nothing about Wiktionary or probably even Wikipedia and aren't going to RTFM. Tell me if you're tired of hearing this because I feel like I'm repeating myself. This is a common problem and frankly I've never been able to resolve it on my own machine. What's needed is an <X-SAMPA> tag similar to the <MATH> tag but much simpler. Then the fonts wouldn't need to be installed on any machine. I know this isn't easy to implement, but could folks please at least agree that it's needed? Davilla 15:23, 21 April 2006 (UTC)[reply]
I agree this is needed. Even the "correct" client-configuration solutions are quite imperfect and inconsistent. Relying on the eventual compliance of all browsers seems short-sighted...for example, WinXP/IE out of the box has serious problems with this web site. Perhaps some searching for the relevant bug reports on bugzilla would help? --Connel MacKenzie T C 15:41, 12 May 2006 (UTC)[reply]
I've found a satisfactory solution to the problem.
Method: I searched for any Debian package whose name or description contains "IPA" or "phonetic". I picked from the lists those packages that seemed to fit my needs, and installed them. I didn't configure my browsers specially.
Effects: With character encoding set to UTF-8, I checked in two places: 1) the character bar at the bottom of the edit tab, 2) Category:IPA_symbols. Mozilla and FireFox display correctly all the characters in (1) and almost all those in (2). The single exception is the character just below the exclamation mark (displayed as a rectangle with numbers). Konqueror, on the other hand, fails in both tests. In (1) it displays empty rectangles. In (2) it doesn't display anything at all.
If anyone knows how to fix any of the remaining deficiencies, please let me know. Thanks, Itay 23:17, 21 April 2006 (UTC)[reply]
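When a character shows up as a rectangle, it can help to know exactly which code point is missing, so that a font covering it can be found. The following is a small Python sketch for that purpose; it assumes only the standard `unicodedata` module, and the function names are made up for this example. It does not install or configure any fonts.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def report(text: str) -> list[str]:
    """Describe every non-space character in a pasted string."""
    return [describe(ch) for ch in text if not ch.isspace()]


# Paste the characters that fail to display between the quotes.
for line in report("ʃ ə"):
    print(line)
```

The code point printed for a failing character can then be compared against the coverage listed for each installed font.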

Personally, I've never found the pronunciation stuff remotely usable. Nowhere nearly as usable as the stuff in my paper dictionaries. Let's at least have one simple version, something like that in FreeDictionary.com. Usable without having to become technical gurus.--Richardb 04:39, 14 May 2006 (UTC)[reply]

By using IPA (or SAMPA) consistently, we create a foundation where {{IPA}} (and {{SAMPA}}) can show alternate pronunciation formats based on user preferences. (m:Wiki is not paper) Rod (A. Smith) 17:24, 14 May 2006 (UTC)[reply]
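To illustrate the idea of storing one notation and showing another according to user preference, here is a minimal Python sketch that converts a small subset of X-SAMPA to IPA. It is not how {{IPA}} or {{SAMPA}} work; the mapping table is deliberately incomplete, and the function names and the `preference` values are assumptions made for this example.

```python
# A small, incomplete subset of the X-SAMPA to IPA correspondence.
XSAMPA_TO_IPA = {
    "r\\": "ɹ",  # two-character X-SAMPA symbol: r followed by a backslash
    "@": "ə", "{": "æ", "S": "ʃ", "Z": "ʒ", "T": "θ", "D": "ð", "N": "ŋ",
    "I": "ɪ", "U": "ʊ", "E": "ɛ", "O": "ɔ", "A": "ɑ", "V": "ʌ", "Q": "ɒ",
    '"': "ˈ", "%": "ˌ", ":": "ː",
}

_LONGEST = max(len(key) for key in XSAMPA_TO_IPA)


def xsampa_to_ipa(text: str) -> str:
    """Convert X-SAMPA to IPA, trying the longest symbol first."""
    out = []
    i = 0
    while i < len(text):
        for size in range(_LONGEST, 0, -1):
            chunk = text[i:i + size]
            if chunk in XSAMPA_TO_IPA:
                out.append(XSAMPA_TO_IPA[chunk])
                i += len(chunk)
                break
        else:
            # Symbols not in the table (e.g. plain "l", "n", "d") pass through.
            out.append(text[i])
            i += 1
    return "".join(out)


def render_pronunciation(xsampa: str, preference: str = "ipa") -> str:
    """Show a stored X-SAMPA string in the reader's preferred notation."""
    return xsampa_to_ipa(xsampa) if preference == "ipa" else xsampa


print(render_pronunciation('"l{nd', "ipa"))
print(render_pronunciation('"l{nd', "sampa"))
```

Because the stored form is ASCII, a reader without IPA fonts could choose the `sampa` preference and still see something legible.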

Hear ye, this new project. (some advertising which some comment high up in this page will obviously fail to do). — Vildricianus 14:40, 21 April 2006 (UTC)[reply]

The user Drago has been entering words from Hungarian, Finnish, Basque, German, Dutch, and Latin. While I would be willing to grant that someone familiar with Hungarian might be able to work in Finnish and German, his etymologies for Dutch have consisted of "Same as [German word]" erroneously, and his Latin entries have been dubious or wrong. My guess has been that he (she?) has been entering information from commercially published dictionaries and not from personal knowledge. Besides the errors of information, this presents a possible copyright violation. --EncycloPetey 09:37, 22 April 2006 (UTC)[reply]

At least some of it apparently comes from other Internet sites. I noticed a nonsense translation that he had given for a Hungarian word (I don’t remember what word it was) and searched for it on the Internet. It took me to a minimalist online Hungarian-English glossary that had the same bad English. For Swahili, one should always include the noun class when listing any noun (it’s like indicating gender in French or German), but he never does it. For Czech, Croatian, Catalan, etc., he never bothers to indicate gender (if he even knows it), and the translations are all undifferentiated semantically. For instance, for Catalan mos he wrote "a bit" ... no clue as to whether it’s a small amount or a drill bit or what. This complete lack of the standard grammatical and semantic information found in any decent dictionary makes his entries not very useful. —Stephen 10:00, 22 April 2006 (UTC)[reply]
I try to have a look at all his work on words that end in a vowel, as he mangles Italian from time to time (and never ever indicates gender, or supplies a plural) SemperBlotto 11:29, 22 April 2006 (UTC)[reply]

He is now entering ‘Germanic’ words. E.g. daudhaz. Yikes. Widsith 15:34, 22 April 2006 (UTC)[reply]

I think this one needs to be renamed with an * on the front (whatever we call them words). SemperBlotto 15:38, 22 April 2006 (UTC)[reply]
There was some discussion about this on BP recently, but I don't think a consensus has yet been reached on how to deal with reconstructed words. Not like this, anyway. I've left a note on his talk page. Widsith 15:40, 22 April 2006 (UTC)[reply]

He's still actively at it -- creating nonsense entries for languages he does not speak. Look for instance at his edits to the Spanish word mala, to which he gives the meaning "suitcase" "from a Germanic language." He's now entering Maori words as well. I had called him on his entry for --EncycloPetey 04:24, 25 April 2006 (UTC)[reply]

Here is a conversation I had with him after he created the entry for Latin securus:

Reminder: please do not enter words from languages you do not know, unless you have extremely strong evidence that you are entering the correct information. My guess is that you do not know Latin. --EncycloPetey 09:57, 21 April 2006 (UTC)[reply]
What makes you think that?
Your definition of securus was wrong. You defined securitas by mistake. You seem to be entering words from a wide array of unrelated languages, which implies that you are working from dictionaries rather than personal knowledge of the languages. --EncycloPetey 10:08, 21 April 2006 (UTC)[reply]
True: I just copied the translation of German "sicher" without indicating the original meaning of the Latin word.

In other words, he knowingly fabricated an entry for a Latin word he did not know, based solely on the definition of a German word derived from that Latin word. I have seen him continue to do this in languages I know, and would be surprised if he weren't doing this in other languages as well. In short, we may need a bot to tag all of Drago's work with RfV. --EncycloPetey 06:20, 25 April 2006 (UTC)[reply]

a list of useful links

Hello all,


I've drafted a table of contents to an imaginary "The Wiktionary Book". The TOC is populated with useful links. The list is on my user page. Feel free to have a look.

Itay 10:49, 22 April 2006 (UTC)[reply]


Looks good. Keep at it, find a good place to plug it into the documentation, and, if appropriate, RFD any other documentation that becomes unnecessary.--Richardb 09:56, 24 April 2006 (UTC)[reply]
Perhaps to merge with WT:UT. — Vildricianus 16:27, 24 April 2006 (UTC)[reply]

Template WikiSaurus-link

I've created a useful little template, with a visual link to WikiSaurus.

For example, {{WikiSaurus-link|money}} gives:

Find this word in Wikisaurus

And I'm hoping someone can improve on the logo!
I've updated "Creating a WikiSaurus entry". I guess WT:ELE also needs updating.--Richardb 11:38, 23 April 2006 (UTC)[reply]

But the Wikisaurus entry only relates to one sense of the word. Are you going to allow multiple Wikisaurus templates, potentially one for every sense? Davilla 14:01, 23 April 2006 (UTC)[reply]
This is Wiki - no one allows or disallows anything

If a word requires more than one {{WikiSaurus-link}}, then use more than one. Like everything else about WikiSaurus, it's still suck it and see, trial and error, the proof of the pudding is in the eating.--Richardb 09:54, 24 April 2006 (UTC)[reply]

Has anyone gone through to make sure relevant Wiktionary entries have a link to existing Wikisaurus pages? I note, as an example, that talk includes no link to WikiSaurus:talk, but bloviate does. --EncycloPetey 09:21, 29 April 2006 (UTC)[reply]
Not so far. Sounds like maybe a job for a bot. But, as always, WikiSaurus is still a tad experimental, still developing. Is this logo a good idea ?--Richardb 12:01, 6 May 2006 (UTC)[reply]

Missing Images

Has anyone else noticed problems with missing images? In the past few weeks, I've removed links to images that do not display (two or three.) But today, looking at trivet, the image loaded only after I followed the commons' link to track it down. Since then, it appears fine? Has anyone else noticed images not appearing, randomly? --Connel MacKenzie T C 15:15, 23 April 2006 (UTC)[reply]

Perhaps a problem with Commons; they do seem to take a long time to load. — Vildricianus 16:23, 24 April 2006 (UTC)[reply]
I was talking about minutes of elapsed time, with retries. --Connel MacKenzie T C 09:55, 27 April 2006 (UTC)[reply]

Chinese Traditional and Simplified

For years I've been using a format in the translation tables that I put a lot of thought into for Chinese entries:

  • Chinese: [traditional], [simplified] (pinyin)

Lately I've noticed quite a few are now in this format:

  • Chinese (Simplified): [simplified] (pinyin)
  • Chinese (Traditional): [traditional] (pinyin)

I don't consider this an improvement for several reasons:

  1. It promotes the misunderstanding that Simplified and Traditional Chinese are separate languages
  2. It puts simplified characters first even though they can be ambiguous, presumably because they are official in the largest Chinese-using country
  3. It duplicates the pinyin for no gain
  4. My format is closer to those used for other languages with multiple writing systems or romanization:
    • Russian: [cyrillic] (romanization)
    • Korean: [hangul] ([hanja], romanization)
  5. My system is simpler for cases where the traditional and simplified spellings are identical, in which case we only need:
    • Chinese: [characters] (pinyin)
  6. My system is simpler if people want to add the older romanizations or romanizations for Chinese languages/dialects other than Mandarin:
    • Chinese: [trad], [simp] (pinyin, system2: foo, system3: bar)
    • Chinese: [trad], [simp] (pinyin, Cantonese: foo, Hokkien: bar)
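
The one-line format described above can be sketched in a few lines of Python. This is only an illustration, not an existing Wiktionary template or tool: the function name `format_chinese_translation` and the `extra_romanizations` parameter are made up for the example, and the sample words are ordinary dictionary forms chosen to show both cases.

```python
def format_chinese_translation(traditional, simplified, pinyin,
                               extra_romanizations=None):
    """Build one translation-table line in the single-line format.

    Hypothetical helper for illustration only.
    """
    # When both scripts share the same spelling, list the characters once.
    if traditional == simplified:
        characters = traditional
    else:
        # Traditional first, then simplified, as proposed in the thread.
        characters = f"{traditional}, {simplified}"

    romanizations = [pinyin]
    # Optional extras such as {"Cantonese": "foo"}; the labels are up to the caller.
    for label, value in (extra_romanizations or {}).items():
        romanizations.append(f"{label}: {value}")

    return f"Chinese: {characters} ({', '.join(romanizations)})"


print(format_chinese_translation("劍", "剑", "jiàn"))
print(format_chinese_translation("山", "山", "shān"))
```

The pinyin appears once per line however many scripts are listed, which is the duplication point made in item 3.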

My rationale for putting traditional first is that contrary to popular belief it is still in use in all Chinese-language-using countries, it is older and traditional, and mainly because it is never ambiguous - certain simplified characters are equivalent to 2 or more traditional characters.

My rationale for not labeling is that the order is common sense and it's always easy to tell because simplified characters are simpler.

People who look up a lot of Chinese in a dictionary know how Chinese works, people who are not accustomed to Chinese would need to read the "how to use this dictionary" section and are the same kind of people that will need to read about other unlabeled things such as gender and romanizations of other languages such as Russian or Arabic.

I can only assume this is being done based on the new Serbian entries which are split over two lines. I've always felt these also would work better on one line without bothering to label Latin and Cyrillic since only users unfamiliar with both of those scripts would have a problem - and they won't be able to use an English Wiktionary anyway.

So given all this, how do others feel? — Hippietrail 16:21, 23 April 2006 (UTC)[reply]

I support you on the Chinese 100%. I presume you're clustering labels such as Chinese Characters, Mandarin, etc. as having essentially the same problem. I don't know anything about Cyrillic but it seems like an easy enough distinction for a one-line translation. The problem might be legibility in having to list several translations for one English meaning, such as an equivalent case of the Swedish for nibling in one of these multiple-script languages. But there's no ambiguity. Davilla 16:57, 23 April 2006 (UTC)[reply]
I agree completely - can the needless duplication and put all Chinese script variations in a single line. bd2412 T 17:15, 23 April 2006 (UTC)[reply]
I support this too, apart from the excellent reasons mentioned above it, it also means less typing for me :) Kappa 19:03, 23 April 2006 (UTC)[reply]

Yes, I've never understood what the CJKV Characters were supposed to be, or why they were on a separate line. A different thing is though, that they perhaps deserve a different separation mark than a comma, which merely separates synonyms. I don't know how that works for Chinese, but I guess there are also several translations possible for one English term. (Is that what Davilla said?) — Vildricianus 16:22, 24 April 2006 (UTC)[reply]

The "CJKV Characters" are a different thing entirely. Even though both Chinese and Japanese rarely use single-character words, the vast majority of characters have "meanings" for historical reasons - and these are found in all "character dictionaries". These are almost totally language-independent and are valid for Chinese, Japanese, Korean, and Vietnamese. In cases where one of these languages does have a single-character word it may seem the CJKV Chars section is duplication, but this is not usual. Note that there are no pronunciations or romanizations in this section either since these are language-dependent.
As for commas and semicolons in these sections, all characters separated by commas are variants and have the same meaning, those separated by semicolons are different characters which can be looked at as being synonyms. — Hippietrail 01:13, 26 April 2006 (UTC)[reply]
Ah, the CJKV Characters, because there are no pronunciations, cause some ambiguity, which is why semi-colons are needed. But how important are the simplified characters in this case? Vietnam must be politically close to China. I don't know how Japanese and South Koreans use them, but isn't it the traditional character which is shared by the languages, not the simplified? Anyways it needs to be apparent how the distinctions are made. Being somewhat of a technical person, I've noticed that what sometimes seems obvious many times is not. Are you absolutely sure that the system of commas and semi-colons accomplishes this clarity? Davilla 16:34, 26 April 2006 (UTC)[reply]
Not exactly. They don't cause ambiguity, some simplified ones just are ambiguous. They have no inherent pronunciation. At the time of each character's invention it may have had one set pronunciation in the language and dialect of the era and place in which it was born. But most characters are very very old and it would only be possible to reconstruct their pronunciations. Ok it's all very complicated so most people needing these ought to learn about them first. I will try to summarize but I'll inevitably leave things out. The vast majority of CJKV characters are very old and come from China. The script has gone through several stages with changes to characters at and between each stage, as well as variants which came to live alongside their original versions. When other countries started using them, they each introduced a very small number of characters of their own. Vietnamese stopped using them altogether some time ago and never went through a simplification process. Japan implemented a simplification just after WW2, introducing new variants to be used instead of the old traditional characters. China was next to simplify and theirs was bigger than the Japanese process. Some Chinese-speaking places such as Hong Kong still mainly use the traditional characters whereas others followed mainland China in using mainly the new simplified characters. Even so the traditional forms have not died out in mainland China, and are still in use, though most people would need to be further educated to use them fluently. North Korea no longer uses characters at all, while in South Korea they are mostly unused and young people know very few characters but it is still not difficult to find some even in newspapers. Korea never underwent a simplification process so when characters are used they are the traditional ones.
Basically I try to order synonymous characters in the history of simplification, which is also the historical order, and also by putting standard forms before variants - I'm not an expert so I don't always get it right.
I think the best current example is on sword - some characters I took a special interest in after seeing a historical Chinese movie at the time I was learning that Japanese also had simplified characters.
I hope this helps. I know the section won't be super transparent for everybody - but neither will the Georgian or the Arabic sections. It could do with documentation as could many things on Wiktionary. Documentation is not something I am good at but I am more than willing to explain things in my way to help others write documentation. — Hippietrail 17:30, 26 April 2006 (UTC)[reply]
Cool, I didn't know Japan had also undergone a simplification, earlier if less ambitious than the Chinese. Your ordering strategy makes perfect sense to me.
By the way, the translation of chayote is both traditional and simplified. Is it correct to leave out any indication of such? Davilla 19:16, 15 May 2006 (UTC)[reply]

Is this a policy problem or is it a technical problem?

  • I understand the approach with respect to Simplified and Traditional Chinese characters, and I try to abide by it. However, it seems to me that there must be a more elegant technical solution to the variation problem that afflicts many of the Wiktionary languages. Perhaps Chinese is the most dramatic example because its writing system is the most complex and the least unified of all the major languages. I like the solution on the Chinese language version of wiktionary and wikipedia. A php script was created so that both Simplified and Traditional get a separate tab at the top of the page. That way, if you prefer to see Simplified, you click on the simplified tab. If you prefer to view traditional, you click on the traditional tab. This concept works reasonably well on the Chinese language pages, because variation tabs are only created for the Chinese written script. I would ideally like to see an efficient way to "toggle" between major written forms for each language (that needs it) in wiktionary (eventually). For example, in the case of Chinese, perhaps there would be a way to toggle between Simplified, Traditional, Pinyin etc. For Japanese, you would have a way to toggle between kanji, hiragana, katakana and romaji. For Russian, you would toggle between cyrillic and romanization. For English, you would toggle between British spelling and American spelling. So here are my questions:
  1. Does my idea make sense?
  2. Would it be difficult to implement?
  3. Is there anyone out there that has the permission and know-how to implement it?
  • If the above does not hold water, perhaps the second option would be to improve the existing Wiktionary policy for the format of Chinese entries. If anyone strayed from said policy, at least we could cite the policy to the offending wiktionarians.

A-cai 11:59, 30 April 2006 (UTC)[reply]

The issue you describe is separate. AFAIK it's possible to map directly from traditional Chinese, to simplified, to God-awful pinyin or any other romanization or phonetic spelling, so the tabs at top would be completely mechanically generated, provided all the contributors were able to write in traditional, and even if they weren't it would be better to have a single page of mixed traditional and simplified that could be later rewritten as traditional, owing to the elegance of a purely mechanical solution. Japanese I would imagine is the same, although someone once told me two different phonetic systems are used depending on whether it's Japanese or foreign words, so I'm really not sure.
Your solution for English isn't as clear to me because of problems like programme versus program where the meaning has to be known. If there are no similar situations the other way around, it might be possible to write all pages in--please don't execute me for the clarity--British English and localize (translate) to American. You'd have to be careful though that headwords, quotations, etc. were not localized. Davilla 19:16, 15 May 2006 (UTC)[reply]
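
The asymmetry discussed in this thread (traditional to simplified is mechanical, the reverse is not always) can be shown with a very small sketch. It assumes a naive character-by-character lookup over a hand-picked sample of five characters; the table `TRAD_TO_SIMP` and both function names are hypothetical, and real conversion needs a far larger table and sometimes word-level context.

```python
# Tiny hand-picked sample; a real table would cover thousands of characters.
TRAD_TO_SIMP = {
    "發": "发",  # to emit, to send out
    "髮": "发",  # hair
    "後": "后",  # after, behind
    "后": "后",  # empress (unchanged by simplification)
    "劍": "剑",  # sword
}


def to_simplified(text):
    """Map each traditional character to its simplified form.

    Characters missing from the sample table are passed through unchanged.
    """
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def traditional_candidates(simplified_char):
    """List every traditional character in the sample that maps to this one."""
    return sorted(t for t, s in TRAD_TO_SIMP.items() if s == simplified_char)


print(to_simplified("發髮"))          # both characters collapse to the same form
print(traditional_candidates("发"))   # two candidates, so the reverse is ambiguous
print(traditional_candidates("剑"))   # one candidate, so the reverse is safe here
```

Going forward each character has exactly one result, so tabs could in principle be generated from a traditional source page; going backward a simplified character with several candidates needs a human or a word list to choose.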

Categories of the form <language>:<part of speech>

Did we come to a decision whether or not to have these? The last I read, some time ago, we were going to ditch them because they would get impossibly long. I see that people are still adding items to the categories for Danish nouns, Portuguese adjectives, etc.

I went ahead and removed the [[Category:English nouns]] from the templates I created for nouns and plurals ({{en-noun-reg}} and the like) so that we wouldn't end up with huge categories that would take for ever to download.

So should we continue to use these or not? I think we need to decide whether they are practical or even useful to have. — Paul G 13:20, 24 April 2006 (UTC)[reply]

Certainly the non-English POS categories are useful I think. An additional question is, are they supposed to take the form of Category:nl:Prepositions or of Category:Dutch prepositions. — Vildricianus 16:13, 24 April 2006 (UTC)[reply]
For those languages that have few entries, I would advocate having such categories on a temporary basis. I would also think this sort of thing would be fine for collecting prepositions, interjections, articles, and other short lists. However, for those categories that would include more than 2000 words, the category becomes less useful, as you've noted. I'd hate to see nouns and adjectives go completely uncategorized, so maybe a sorting program is needed, much as they have for sorting categories (and for stub articles) on Wikipedia? --EncycloPetey 04:32, 25 April 2006 (UTC)[reply]

On a related note, if we do decide to stop using these categories for certain languages, someone should tell Drago, since he seems to be on a mission from God to add these to as many pages as possible -- even if the page is already categorized for the language in question. --EncycloPetey 09:27, 27 April 2006 (UTC)[reply]

OED now available online

The Oxford English Dictionary and some other dictionaries are now available online in UK libraries. See here for full details. — Paul G 15:07, 24 April 2006 (UTC)[reply]

Yes, around the time Primetime showed up, it also became clear that the OED (some versions, anyway) seem to be available on-line through almost all libraries in North America as well. If you have your library card and know your library's web-site, look into it; you may be pleasantly surprised. --Connel MacKenzie T C 15:30, 12 May 2006 (UTC)[reply]

Rankings rankle

This is just getting silly. Now there are more and more articles with both a Gutenberg ranking and a TV/movie script ranking. I'm sure they're fun to make but does anybody care about this stuff? Each entry on my screen is 3/4 of an inch high and about 6 inches wide. They push the actual article farther and farther down the page and are thus taking over. Please move the bloody things to the bottom, to the Talk page or to some new .../Rankings page. They're not important! The definitions are and I can't see them! — Hippietrail 20:14, 24 April 2006 (UTC)[reply]

Agree. We can leave Gutenberg, but TV ones should be gone, as they're a little unnecessary --Dangherous 13:46, 25 April 2006 (UTC)[reply]
Also agree. It's not like they're from an established linguistic corpus. Gutenberg is ok, though not truly important, and I see nothing wrong with moving it to the bottom. The TV one has got to go. - Taxman 20:40, 27 April 2006 (UTC)[reply]
I've got rid of the TV rankings, but am not going to touch the Gutenberg ones - there's a thousand of them! --Dangherous 20:48, 27 April 2006 (UTC)[reply]
  • For anyone who is annoyed by the rankings, I've just modified the template so that you can hide them using CSS. Currently I've given them just a CSS id of "rank", but if we did want to go for multiple rankings I could instead give a CSS class of "rank" and id for "gutenberg" and others. This is the code to add to your custom CSS file to hide the Gutenberg ranking box:

#rank { display: none }

If you make any new templates please consider giving them CSS classes or id's. If anybody would like to make some documentation on customizing Wiktionary with CSS, please ask me any questions. I plan to add more of these type of customization features so please also suggest any you think of. — Hippietrail 18:31, 30 April 2006 (UTC)[reply]

My view. Keep Gutenberg ones, but demote them down the page a lot! Make them optionally hidden by CSS (though 99% of users will use the default CSS). I guess the TV/Movie scripts analysis is really more useful for current spoken English. So we should keep them too, but down the bottom, and optionally hidden using CSS. (But I wouldn't cry if both the Gutenberg and TV/Movie stuff went missing!)--Richardb 04:33, 14 May 2006 (UTC)[reply]

rfdproto template?

Wiktionary:Page deletion guidelines says to use {{rfdproto}} but it doesn't seem to exist.

The deletion log tells me it was deleted per RFD consensus. I have rewritten that bit on the above page. — Vildricianus 20:55, 25 April 2006 (UTC)[reply]
Thank you. RJFJR 13:49, 26 April 2006 (UTC)[reply]

Fair use question

Some of you may have noticed my Dictionary notes sections to compare our articles with what other dictionaries do. Recently I gave such treatment to the entry for hound dog since I thought it surprising that it's absent from many dictionaries. User:Stephen G. Brown kindly added to it to show one dictionary that does cover it. However, he also included some text verbatim from a dictionary still under copyright. IANAL so I'm a bit worried that it might not come under fair use. You can see Stephen's and my conversation about it here. Opinions appreciated. — Hippietrail 00:25, 26 April 2006 (UTC)[reply]

I'm quite sure it falls under fair use. First of all it is very doubtful that the sentence has any copyright protection at all because of the Merger doctrine. Secondly hound dog doesn't really copy, it quotes which is explicitly allowed. --Patrik Stridvall 22:09, 26 April 2006 (UTC)[reply]
Hi. As an intellectual property attorney, I can assure you that this single, properly attributed reference in quotation marks will not run us afoul of the copyright laws. Note however, that if we were to do this for every entry (as opposed to a handful of illustrative ones), we would essentially be replacing the quoted dictionary in the marketplace, and would potentially face liability. Cheers! bd2412 T 22:31, 26 April 2006 (UTC)[reply]

Disambiguation "See also"

There is a policy think tank page - Wiktionary talk:Disambiguation in layout - Policy Think Tank. Maybe the discussion could go there ?



It seems there are a few ways different people like to format this little gadget. Let's discuss which is best and why.

I've been doing it like this:

See also zzib, zzob, zzub

Others have been doing it like this:

See also zzib, zzob and zzub

The second is used in the "see" template and its cousins but there are still many around done the long way, including variations with and without the indent, with a colon after "See also", without "and", and one "See also" per line which I think is butt-ugly.

Personally I don't feel the indent fits, and I also find the bolds and italics visually jarring, but your opinions are sure to differ.

Let's hear what people think, choose the most popular, and then I intend to add CSS to the templates to allow each person to customise how it looks for them. I was also taught to put a comma before the and - CSS can also make this user-selectable. I think we can use templates and CSS together in lots of ways but this will be an early proof-of-concept. — Hippietrail 00:59, 26 April 2006 (UTC)[reply]
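
To make the options under discussion concrete, here is a minimal sketch of the three list styles (all commas, "and" before the last item, and "and" with a serial comma). It is written in Python purely for illustration and is not the template-plus-CSS mechanism proposed above; the function name `format_see_also` and its two flags are hypothetical.

```python
def format_see_also(words, use_and=False, serial_comma=False):
    """Format a disambiguation line in one of the styles being compared.

    Hypothetical helper: use_and puts "and" before the last item,
    serial_comma adds a comma before that "and" (three or more items only).
    """
    words = list(words)
    # All-commas style, or too few items for "and" to make a difference.
    if not use_and or len(words) < 2:
        return "See also " + ", ".join(words)
    # Two items never take a comma before "and".
    if len(words) == 2:
        return f"See also {words[0]} and {words[1]}"
    separator = ", and " if serial_comma else " and "
    return "See also " + ", ".join(words[:-1]) + separator + words[-1]


print(format_see_also(["zzib", "zzob", "zzub"]))
print(format_see_also(["zzib", "zzob", "zzub"], use_and=True))
print(format_see_also(["zzib", "zzob", "zzub"], use_and=True, serial_comma=True))
```

Indentation, bold and italics are left out because they are purely presentational and are the part CSS can handle per user.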

Well, I'm guessing most people would prefer the bold because it is more visible. In that case the "and" looks awkward, but it isn't really necessary in the first place. Use all commas instead, avoiding the serial comma issue, or putting it off to CSS for those 0.1% of users who bother modifying the look of just one website. So what's left? Whether to indent? I'd vote no, weakly. The colon? I'd just as well leave it out. I never know whether it should be italicized. Could someone confirm yes? But no objections to having it if the last question is clear. Davilla 16:13, 26 April 2006 (UTC)[reply]

Apart from the format, the "see also" at the top of pages is sometimes also used to refer to things that should be referred to in the ===See also=== section, e.g. in cannon. Am I right in saying that the top "see also" is only appropriate for diacritical and capitalization issues? — Vildricianus 20:54, 27 April 2006 (UTC)[reply]

There's some discussion back up the Beer parlour a bit... — Hippietrail 02:29, 28 April 2006 (UTC)[reply]
  • Would anybody besides Davilla like to offer an opinion? Please do so or I shall go ahead and make changes based on only Davilla's and my feelings, assuming that nobody else cares either way. (Also try to restrict comments to just your own preferences and leave out assumptions and theories of what other people's preferences may be - they can comment for themselves) — Hippietrail 17:11, 30 April 2006 (UTC)[reply]

See also word, Word, wórd.
is what I prefer. No indent. The bolding is not important to me since my CSS boldens all wikilinks. I've no idea how many "manual" disambigs there are. If there are many, perhaps a bot could templatize them. —Vildricianus 17:30, 30 April 2006 (UTC)[reply]

Since all of us so far don't want the indent or the and, I'll remove those now. I'll also add the CSS ID "disambig-see-also" and leave the italics for the text and bold for the links, which people can customize on their CSS page. This does not mean the conversation is over so please continue to comment. — Hippietrail 17:16, 3 May 2006 (UTC)[reply]
OK, I'll chime in (now that I've found this section.) I prefer these indented (to stand apart from the TOC). The psycho-serial comma is no good; instead the word and should appear before the last item in a list like this. I do not think they should be bolded, as they are not inflected forms of the headword. --Connel MacKenzie T C 05:44, 5 May 2006 (UTC)[reply]
Indentation is easy. This perfectly matches the indent of the colon for me on Firefox:
.disambig-see-also { text-indent: 2em }
We can make that default if more people agree with you.
The combination of the POV-Connel-comma and the and both being optional will make somewhat ugly CSS but I'll experiment. Making just one or just the other optional is much easier.
I also don't like the bold so I've got this in my CSS:
.disambig-see-also b { font-style: italic; font-weight: normal }
Again, any of these can be made defaults based on what preferences people post here. The current defaults are not my preferences but the current most popular preferences. — Hippietrail 21:33, 6 May 2006 (UTC)[reply]
  • Excuuuuse me? The only POV comma around here is yours - that crazy (incorrect) "serial comma."  :-) I think I now object to giving Wiktionary the ability to display those incorrect "serial commas" under any circumstance. In English, they are wrong!  :-) --Connel MacKenzie T C 23:39, 6 May 2006 (UTC)[reply]

Unnecessary inclusion of the indefinite article in translations

When entering English translations of non-English nouns, please do not include the indefinite article if that is not part of the translation.

For example, the French word "Arabe" means "Arab", not "an Arab", which is "un Arabe" or "une Arabe" in French. I have corrected Arabe accordingly.

This is important for two reasons:

  1. A user seeing "Arabe" translated as "an Arab" could well think that "I spoke to an Arab" should be translated as *"J'ai parlé à Arabe" instead of the correct "J'ai parlé à un Arabe".
  2. Some languages inflect words to indicate the presence or absence of an article (whether definite or indefinite), and those forms that effectively include an article should include the article in their translation. Including the article in languages that do not do this obscures whether or not the article is part of the translation for those languages that do.

If it is necessary to distinguish the meaning (which is almost always the case), add a gloss in parentheses after the translation. — Paul G 09:14, 26 April 2006 (UTC)[reply]

Does this also apply to non-English translations of English nouns? One of the frequent translators here had commented that, for some languages, using the indefinite article would in most cases be clearer than the gender/number/whatever tags. Davilla 16:01, 26 April 2006 (UTC)[reply]
Likewise, should that rule apply to translations into English from languages that have no definite or indefinite articles? Rodasmith 16:47, 26 April 2006 (UTC)[reply]

I don't think such formatting is necessary, in fact I think the fuller a definition, the better, and that usually means including articles for the sake of clarity. You could use exactly the same argument for English definitions: dog is currently defined as ‘a member of the genus Canis’; should we take out the article there? Definitions are not about providing mathematical substitutions for the headwords, even in foreign languages. Of course there are translation problems with such basic elements of language as particles, but they won't be solved by just removing articles in the translation/definition. Then of course there are languages like Arabic, which doesn't have indefinite articles but does have definite ones...at some point we need to assume that a user will have some familiarity with the basics of a language before using a dictionary. Otherwise we will need to include a lot more grammatical information here than we do (most of it has been outsourced to Wikibooks I think). Yes, it's just conceivable that someone could end up with J'ai parlé à Arabe, but if they are really attempting this kind of word for word translation they are doomed to failure from the start and there's no way we can help them. Widsith 17:08, 26 April 2006 (UTC)[reply]

Widsith, to answer several of your points:
  1. I'm talking about translations, not definitions, so the argument does not apply to English entries.
  2. Translations are just that, not definitions: "dog" is not a definition of French chien, it is a translation of it.
  3. We might not be able to help the person translating word for word, but we should not be making it more likely that they will get it wrong by including words in translations that are not part of the translation. "A dog" is not a translation of "chien".
  4. I'm in favour of including more grammatical information where the user might otherwise make mistakes. See for example what I have done to the French and Italian translations of be able to. For example, I think that country names that include the article in translation (such as into French and Italian) should include usage notes: "Spain" is "l'Espagne" in French, not merely "Espagne", but "in Spain" is "en Espagne", not "en l'Espagne"; this rule changes for some other countries: "in the United States" is "aux États-Unis", not "en les États-Unis", and "in Canada" is "au Canada", not "en Canada" or "en le Canada". I think we should certainly be providing this sort of information to help the user.
Paul G 09:12, 27 April 2006 (UTC)[reply]

Yes, I take all of your points and I understand the problem very well. I suppose that what I am trying to express is my immediate visceral reaction to seeing foreign entries which just have one or two-word English translations next to them. Wiktionary could be more than just a translation dictionary. Also, it should be noted that most words, perhaps all words, do not have exact equivalents in other languages, and the more information we can give about specific connotations or shades of meaning, the better. Widsith 09:42, 27 April 2006 (UTC)[reply]

I agree with Paul about not listing the articles, and also with Widsith in that we need to give ample grammatical and lexical information. — Vildricianus 09:03, 29 April 2006 (UTC)[reply]

I've just made a template with CSS to allow optional serial commas, Oxford commas, or Harvard commas. In case you don't know what that means, it's the comma I just used before the "or" in the list in the previous sentence.

Serial commas are recommended by many style guides, and recommended against by many other style guides. So some people hate 'em, some love 'em. Now we can choose!

This example doesn't use a serial comma:

one, two and three

This one does:

one, two, and three

This one uses the new template so most of you won't see it:

one, two, and three

This is the code to use it:

{{serial comma}}

I've made it hidden by default in the global monobook CSS. If you love serial commas as much as me, put this in your CSS file:

.serial-comma { display: inline }
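
A minimal sketch of how the two rules might fit together, assuming the template wraps its comma in an element carrying the serial-comma class (the template's actual markup is not shown here, so that part is an assumption):

```css
/* Site-wide stylesheet: hide the optional comma by default. */
.serial-comma { display: none }

/* Personal stylesheet: opt back in and show the comma.
   Assumes this rule is loaded after the site-wide one,
   so that it takes precedence at equal specificity. */
.serial-comma { display: inline }
```

Under that assumption, a list written as one, two{{serial comma}} and three would show the extra comma only to readers who have added the second rule.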

I've also included it in the new templates for the "disambiguation see also"s.

Feedback appreciated. One thing I'm wondering is whether {{,}} might be a better place for it... — Hippietrail 16:00, 26 April 2006 (UTC)[reply]

I'd think so, for legibility, even though this isn't such a big issue to me. Redirected to Template:, Davilla 18:29, 1 May 2006 (UTC)[reply]
So what does it mean if I can already see it? Widsith 16:02, 26 April 2006 (UTC)[reply]
Oh yes, sorry. It's just that any time a change is made to the global CSS or JS files, users won't see them until their caches are refreshed, which happens naturally from time to time. But you can force it any time on most browsers by hitting CTRL F5 and waiting a few seconds. — Hippietrail 16:28, 26 April 2006 (UTC)[reply]

My view is that some of you are going nuts! When someone first starts using Wiktionary, it is almost impossible at times to figure out what is going on because of the over-use of templates. My view is that this is just playing for playing's sake (there are less polite terms), and adding unnecessary complication to what is supposed to be a simple-to-maintain system. For !*! sake, can you not get so diverted. Let's have some real work done on the things people really use. I would almost describe the introduction of this kind of stuff as "vandalism". --Richardb 04:22, 14 May 2006 (UTC)[reply]

Templates promote consistency, especially if they are well documented. Contribute at Wiktionary talk:Template documentation! Rod (A. Smith) 17:24, 14 May 2006 (UTC)[reply]

There is currently a transwiki discussion on Wikipedia about moving all Swadesh lists to Wiktionary. I would undertake the transwikiing myself (it's one of "my areas"), but it also involves foreign languages (less so "my area"). Can anyone have a look and compare the Appendix:Swadesh list with the pages in Wikipedia to see if we can use these Swadesh lists here, and if there's any overlap? It would be quite a large task, but we'd be stronger as a result. --Dangherous 18:14, 26 April 2006 (UTC)[reply]

Well I don't think there's been any opposition to the idea that they need to be here and not there. I think there are already language family entries here for most of the languages that have lists at Wikipedia. If not, create them from the template. The idea would be to merge the individual lists from Wikipedia into the relevant language family list. If you're willing to do the work, go for it, and see if you can't get native speakers from as many of the languages as possible to check your work. - Taxman 20:46, 27 April 2006 (UTC)[reply]

Shall we make a new category for translators?

I'm curious if anyone would find it helpful if we had a category for Wiktionary users who are translators, similar to what we have over at Wikipedia:Category:Wikipedian translators. Including oneself in this category could be facilitated by bringing over Wikipedia:Template:User translator. I see a number of requests for translations, for instance, and such a category might help. Cheers, Eiríkr Útlendi | Tala við mig 20:27, 26 April 2006 (UTC)[reply]

I assume that all users with a Babel language level of 3 or more could somehow function as translators. — Vildricianus 20:31, 26 April 2006 (UTC)[reply]
Oo, forgive me for saying so, but I fear that that is a dangerous assumption, albeit quite widely held by those not in the field. I work in translation myself, and I've seen plenty of results from this arrangement, and let me simply say it ain't pretty. One person's Level 3 might be another's Level 2, on the one hand, and simply speaking / reading / writing another language does not make someone a translator on the other -- they are decidedly different (though overlapping) skillsets. Besides which, not everyone with such language skills might want to do any translation. Having a category where those already active in translation (or at least interested and confident enough that they can do it) can be duly indicated would be a bit sharper than the language categories alone. Cheers, Eiríkr Útlendi | Tala við mig 21:52, 26 April 2006 (UTC)[reply]
At first I would have assumed, like Vildricianus, that Babel would tell us who is or could be translators, but as Eirikr says, this is far from being the case. When the ttbc template was introduced recently, I went through the complete set of Babel entries at the time and contacted everyone with Babel 3 or above to ask if they would like to contribute to checking translations. There were quite a lot of replies, but it was by no means true that everyone wanted (or even was able) to be involved. Some acknowledged that their linguistic ability in their Babel 3 or 4 languages was not suitable for checking translations (for example, they spoke the language but did not read it).
So I think this is a good idea. What's more, it might attract people who do not have Babel entries who are familiar with the lesser-known languages, which we desperately need. Of all the languages listed in Babel, only a small fraction actually have users associated with them. There are many translations to be checked in obscure languages which, at the moment, we have no way of verifying. Introducing a "translators" category or list might help with this task. — Paul G 09:22, 27 April 2006 (UTC)[reply]
I think such a category could be helpful. Please be bold. --Connel MacKenzie T C 09:56, 27 April 2006 (UTC)[reply]
I'd say that people should only list babel levels according to what they want to do here, not to what their personal abilities are. — Vildricianus 15:58, 27 April 2006 (UTC)[reply]
That's great, but I don't think that viewpoint is all that commonly held, with the result that Babel levels on user pages are probably not a great indicator of who can or wants to translate. For that matter, if the Babel levels are really supposed to be an indication of what people want to do here (i.e. to flag themselves as possible linguistic resources for the purpose of other people more easily finding them), then the lower Babel levels really shouldn't exist. After all, what can someone of only Level 1 really do? This is part of what led me in the first place to think that the Babel levels were less for advertising one's services than simply saying a little something about yourself. But then that's just my take on things.  :) Cheers, Eiríkr Útlendi | Tala við mig 17:39, 27 April 2006 (UTC)[reply]
Following Connel's advice, I've copied over (and slightly modified) Template:User translator from the Wikipedia version, and created Category:Wiktionarian translators. If you're of a mind to, please look these over and make any suggestions / edits. Thanks, Eiríkr Útlendi | Tala við mig 17:57, 27 April 2006 (UTC)[reply]
Calling someone a translator because he or she can do occasional translations is wrong. Proper translating is more than changing words from one language to another. When done properly, information will be localised.
While it will be great to have translators collaborate with us, you have to appreciate that the needs of translators are quite particular. Translators are typically interested in translation lists, not in definitions, etymology, or lexicology. In the line of their work they may create translation glossaries. That is the traditional format that makes sense to them.
What I have learned is that the Wiktionary community is great because of its diversity of people. Translators are a welcome part of our community but so are language teachers and other professionals. I would indicate the level 5: professional user. When a translator wants to indicate his or her profession, he or she is welcome to, and yes, we should have a category for them. But making the distinction in the Babel templates is not that great an idea imho. Thanks, GerardM 07:47, 30 April 2006 (UTC)[reply]
I don't think any harm is done by adding a translator category. However, I'm not sure how useful it will end up being. First of all, Wiktionary is anonymous. Anybody can say anything they want to about their own language ability or translation prowess. That's why I appreciate the User contributions link. For the most part, the only way to evaluate a Wiktionarian is to look at their track record. A-cai 12:26, 30 April 2006 (UTC)[reply]

For anyone who cares to help turn this collection of terms into articles... Cheers! bd2412 T 08:20, 27 April 2006 (UTC)[reply]

Computing languages

Do we want to include reserved words from computing languages?

xpage was recently posted to WT:RFD (summarised below) with the objection that this is a Java class and, by another user, that this does not belong under the heading "English".

Perhaps we could include these, with the computing language as the title. Our remit is "all words in all languages", after all - as this stands, it doesn't restrict us to natural languages only. So should we include C's "void", Visual Basic's "Dim" and the like?

Note that there is precedent for this: I believe the OED includes PEEK and POKE (which are used in BASIC). That said, I also believe it gives these as verbs (as in "what value do you get if you PEEK memory location 12345?"), so they might be "English" words after all. — Paul G 09:34, 27 April 2006 (UTC)[reply]

I'm broadly in favour of including such things. Though we must be careful not to get too encyclopedic - e.g. all the formats of the if (or IF) statement in FORTRAN, COBOL, BASIC, C++ etc SemperBlotto 09:38, 27 April 2006 (UTC)[reply]
I would prefer restricting ourselves to natural language. peek, poke, and I'm sure many other words like goto are okay since they're used as verbs, that is, quite integrally within a sentence. They can be cited within English or other natural language texts, and are probably best placed under a broader header. Not necessarily all-caps, but I'd think all major languages accept reserved words that way, so it would be just as well to place them there. Any more than that would be opening a can of worms. I mean, do we want to start quoting lines of code? And anyways, what constitutes a legitimate language? Think of all the variants of Basic. Think of all the languages that have been around since the dawn of man, er... machine. There's so much of no import and too much gray area. Assembly shorthand, Unix commands like echo, operating system calls like gestalt, standard files like config, MUD commands like emit... where do you draw the line? Davilla 20:37, 27 April 2006 (UTC)[reply]
I'll hold off with IBM System 360 machine-code mnemonics for a while then! SemperBlotto 21:14, 27 April 2006 (UTC)[reply]
I would much prefer that Wiktionary allowed terms from all computing languages, but sadly, it does not. Certainly tracking syntax changes (by programming language version) would be very useful, but the English Wiktionary seems to have a hard enough time with spoken languages at this point. --Connel MacKenzie T C 22:00, 27 April 2006 (UTC)[reply]
I'm all for creating a separate Wiktionary for programming language statements, and even for all the standard library routines on all computer platforms ever. Throw in all the tags in all markup languages ever, the official names of all the Unicode characters, and all the standard Unix commands from /bin - but that stuff has nothing to do with a "normal" dictionary of a spoken language. The fact that the term "language" has other senses shouldn't throw us off course. So put in the request for a new wiktionary and you can count on my 100% support. — Hippietrail 02:20, 28 April 2006 (UTC)[reply]
In the meantime, it might be worthwhile to begin amassing a list of computer syntax terms as an appendix page (without links). That way, there would be physical evidence of (1) the extensive list of terms, (2) cross-language use, and (3) demonstration of someone's willingness to work on the project. --EncycloPetey 05:51, 28 April 2006 (UTC)[reply]
Perhaps we could start something like that, initially containing only the Wiki-relevant protocols, languages and jargon? Off the top of my head: .css, PHP, python, Perl, WikiSyntax, JavaScript, Solaris (toolserver), RedHat/Fedora Core 3 (cluster), HTML, XML, XHTML, bash, DOS .bat files, M. Perhaps only the top 200 most relevant keywords for each, for a start? --Connel MacKenzie T C 15:26, 12 May 2006 (UTC)[reply]

What about "citing" a keyword in computer languages rather than computer programs? For instance, goto is a keyword in at least every version of basic that has ever existed. Same criteria: three citations in "independent" languages, e.g. different implementations of {Java} by Sun or IBM or what have you. They could go under a ==Programming== language header or something. Languages must be Turing complete. Reserved words only is probably too restrictive, but it's safer to start out strict on this one, IMO. 59.112.38.180 21:07, 19 May 2006 (UTC)[reply]

Censorship

This is America's Sweetheart. Steven G. Brown has been going through all of my contributions and reverting them. I added quotes to two entries that I got from the OED2. A new entry I created he RFVd. I asked for an explanation on his talk page, but he gave none, just reverted. When I started restoring my quotes, he protected the pages. Is this OK? I got upset and RFVd one of his entries, and he just removed the tag and protected the page. He then said every single one of my contributions needs to be verified on the RFV page and blocked me. Is it OK to block someone you're in a content dispute with and revert all of their changes without explanation? —This unsigned comment was added by 129.82.42.77 (talkcontribs) 17:56, 27 April 2006 (UTC).[reply]

Your message to me entitled "Primitive Censorship attempts" ended in the words "Now go play outside." You clearly were not looking for an explanation. All of your contributions (Special:Contributions/America's Sweetheart) focus on sexual perversion and racial denigration and add no deeper understanding or anything of any value whatsoever. Nevertheless, to be fair, I allowed one of them to stand (an extraordinarily filthy article named sperm burper) and rfv’d it so that others could have a look and give their opinions. Rather than wait to see what the community had to say, you began tagging my recent Russian pages with rfv. You began to create such a disturbance that I blocked you for one day. Frankly, I think you should be blocked for a much longer time. —Stephen 18:58, 27 April 2006 (UTC)[reply]
So, the OED2 was wrong to include those quotations in their dictionary? If tagging an entry because you think it's "filthy" is fair, I'd hate to see what you think is unfair. I tagged one of your entries, and I still think it needs verification. I also spent a lot of time formatting those quotations and looking up the authors for attribution, but I guess stopping you from reverting them all is vandalism. I personally think your censorship attempts are vandalism and that you should be de-sysopped and banned permanently from editing Wiktionary.
Unsigned comment 13:10, April 27, 2006 62.7.244.103
I won't try and defend America's Sweetheart. I don't want to defend America's Sweetheart. I mean, this user was in an edit war with a sysop. And I have to believe that the sysops need to have the power to block users who act that way. But I can't defend all of Stephen's actions either. He deleted citations from Hemingway and Twain. He deleted a quotation from Gone with the Wind. I have to agree, that sounds like censorship to me. Maybe in his opinion they don't add anything of value. And if it comes down to deleting trash then sure, call it censorship or what have you, but it's got to go. This, on the other hand, seems pretty legitimate. I don't want Wiktionary to be that strongly, um... regulated. Davilla 19:51, 27 April 2006 (UTC)[reply]
Since you are not a sysop, you can't see the nonsense that was added, but other sysops can. The entries that potentially could be entries that meet our criteria can by all means be reentered properly. While less restraint exists for entries that have been previously deleted, a genuine contribution (of correctly entering a term) is not at all likely to be redeleted. That is not censorship. No one is preventing that user from making their point on their own website. No one is trying to prevent that person from discussing their opinion in public. What Stephen did was quash a vandal, who judging from their IP hopping, trolling and other activities, has no intention of being a helpful contributor here. Taking the vandal's bizarre selection of (unhelpful) citations as a whole, I strongly agree with Stephen's actions. Except that I think he should have blocked longer. --Connel MacKenzie T C 20:04, 27 April 2006 (UTC)[reply]
Where are you getting all of this? Anyone can look at what I wrote by looking in the history of the pages ([6], [7], and[8].) None of the definitions were deleted. They were reverted and one was tagged (he left my uncontroversial definition, lighterman, alone [tellingly].) I also didn't know that you could read my mind and tell that I intended to stop contributing suddenly ("based on his IP hopping, blah blah blah") judging how I just got here . . . ok . . . Also, what is all this about not censoring me in public or on another website? You're censoring my contributions right now on Wiktionary.
Unsigned comment 14:56, April 27, 2006 159.61.240.143
Nice troll. Deleting your cruft here does not prevent you from presenting your particular point of view elsewhere, in an appropriate place. --Connel MacKenzie T C 21:28, 27 April 2006 (UTC)[reply]
I'd have to agree. Cable can do what it wishes, but the TV networks are still considered censored. I'm not talking about the world wide web, I'm talking about censorship as the meaning applies right here on Wiktionary. And it's not necessarily wrong. Omitting some constructed languages, requiring a certain degree of attestation, not allowing certain images: these are all forms of censorship, in a way, but they are standards decided by the community. The sysops are free to pursue vandals, and I have full faith that they use their best judgement when doing so. But I am just as thankful that contributors are allowed to voice their opinions here. Davilla 21:40, 27 April 2006 (UTC)[reply]
So do I understand this right? When pages and their histories are deleted, they also disappear from a user's contributions. You're saying there was a lot more than what I can see. Then I'd have to agree, America's Sweetheart should have been blocked sooner, instead of being allowed to make later, more legitimate contributions whose reversions make the sysops look guilty. Davilla 21:40, 27 April 2006 (UTC)[reply]
None of my entries were deleted! Look in the deletion log: [9] None of them are mine!
I've been looking at some of the pages in question... why are the edits being reverted? One would think that a revert war would be explained on the talk page, but apparently the sysop(s) in question didn't feel like taking this responsibility —Muke Tever 00:34, 28 April 2006 (UTC)[reply]
This is a general comment. I have not looked closely at the history of this incident. Wiktionary gets a great deal of traffic and those volunteers who undertake to make some sense of it and keep order must sometimes make quick judgments with regard to what is appropriate content. Please believe me when I say that the vast majority of edits that look like junk, are. That said, admins do make the occasional mistake. I don't believe that any of us seek to censor the content here.
Admins do not outrank the other contributors. We're really more like the janitors around here. The check is community trust in electing admins in the first place and discussions like this one. --Dvortygirl 06:47, 28 April 2006 (UTC)[reply]
I agree with Dvortygirl. If Stephen was hasty it's because he and other admins are fed up with dealing with the quantities of junk etc. which are put on to Wiktionary. But there is a right way and a wrong way to respond to that, and the snide comments and RfVing from America's Sweetheart were not conducive to resolving the problem. Hopefully once his block has expired he will be able to make a useful contribution here. Widsith 07:03, 28 April 2006 (UTC)[reply]
Perhaps someone could restore my work, and, in return I could apologize to Steven?

A few elements and factors are indispensable in the process of becoming "trusted" and less easily reverted. One of them is not applying the "eye for an eye" principle and rfv'ing a perfect entry. Another one is spelling somebody's name correctly (mine is an exception, everyone is allowed to misspell it). Other things that may help is de-redlinking one's userpage, adding a Babel or something like that, and in general, not restricting one's contributions to editing likely targets for vandalism, like fag or nigger. Certainly, this sounds hypocritical, as it is one's contributions that count, but that's the way a wiki works. If all sysops were to consider and think ten minutes before reverting (I do so at times, usually to no avail), we would need a thousand of them. — Vildricianus 10:40, 28 April 2006 (UTC)[reply]

  • It is very clear that User:America's Sweetheart is, of course, a simplistic vandal. Stephen's intuition was right on the spot. The retaliatory vandalism from blocking this user (who then resorted to several other dynamic IP addresses) clearly demonstrates this. The childish page removals are even more childish than the initial questionable entries. --Connel MacKenzie T C 01:43, 29 April 2006 (UTC)[reply]
In making up my mind about America’s Sweetheart’s intentions, the biggest point is the fact that all but one of his contributions are about the most vile and dehumanizing of terms, and he seems intent on making them even more horrible. The definitions already listed under nigger, for instance, are already pretty complete, and the quotes that already appear there seem to me to be sufficient for a dictionary. I can’t see how adding "*A similar error has turned Othello..into a rank woolly-pated, thick-lipped nigger.--Hartley Coleridge Essays and marginalia (1851) I" adds anything of value to a dictionary page. It may be fit for an encyclopedia, but I don’t think we have to list every nasty cite in a dictionary. —Stephen 09:33, 29 April 2006 (UTC)[reply]
By the way, these racist and perverse entries are continuations of the ones such as nigger baby that we had to deal with in early March. —Stephen 09:54, 29 April 2006 (UTC)[reply]
This is a dictionary. We gather quotations, to illustrate the use of a word: both its use in grammatical and pragmatic contexts, and its use over time. Nasty words have nasty cites; that's just a fact of life. —Muke Tever 14:22, 29 April 2006 (UTC)[reply]
  • BTW, MacKenzie has deleted "sperm burper" after I added three quotes to it. He didn't care that it was in an RFV or that it was proven to exist. He also reverted changes to "angel" after I proved that sense existed with three quotes. 62.118.249.75 21:13, 30 April 2006 (UTC)[reply]
You didn't prove the sense existed. Two of your quotes were from the Bible and Shakespeare, neither of which used angel in any sense but the primary; and your third citation didn't even use the word. Widsith 21:19, 30 April 2006 (UTC)[reply]
The quotes make very clear that they're talking about an angel in a homosexual sense. The word is a variant of ingle which is defined in the OED in the same sense. I got two of the quotes from this book which has the same interpretation of the quotes as me. 21:51, 30 April 2006 (UTC)
You've misunderstood, and horribly, the source I brought up :( The sense 'homosexual' is only one of the senses that entry mentions—not all the cites it offers are for the same sense. (It also speaks about the sense of ingle we already have, i.e. a fire[place]; and the Bible quote appears to be intended as an explanation for the source, or at least the long-standingness, of the association of homosexuals with angels.) In any case, a pun or allusion is only a suggestion of usage, and not a usage itself. —Muke Tever 23:17, 30 April 2006 (UTC)[reply]
  • By the way, MacKenzie has just deleted "shit stabber", "shit on a raft", and "shit hunter", all of which were attested. It's impossible that the confluence of events is a mere coincidence. 85.31.186.86 21:55, 30 April 2006 (UTC)[reply]
Of course all User:Primetime sockpuppet submissions are deleted. Got any more IP addresses for me to block? --Connel MacKenzie T C 05:55, 1 May 2006 (UTC)[reply]

Look, I'm a hasty guy too. But I try my best not to delete things! (Just move them). Some of our "censors" argue that some words should not be in WikiSaurus because they don't have page entries. Then someone goes to the trouble of defining sperm burper and finding citations. This term has 36,100 Ghits. But it is deleted! I'm afraid I already have some of the "Censors" down in my book as bowdlerisers. I think this is uncalled for Censorship. I am going to put sperm burper back.--Richardb 03:54, 14 May 2006 (UTC)[reply]

Richardb, please feel free to enter a legitimate entry for sperm burper (of course, with three print citations.) Do remember not to restore previous iterations from the known copyright violation source User:America's Sweetheart, confirmed sockpuppet of User:Primetime; I'm pretty sure that doing so would be a direct violation of WMF policy. Please note discussions about User:Primetime here and on w:WP:AN#User:Primetime regarding the much harsher treatment his entries are receiving on Wikipedia, which is also a WMF project. --Connel MacKenzie T C 20:00, 14 May 2006 (UTC)[reply]
To me it was a valid entry. In the history it had another contributor too, with some slightly bowdlerised version. So, a valid entry could have been constructed by merging the entries. But, I'm not going to bother entering a sysop war over delete/reinstate/delete/reinstate.--Richardb 12:53, 16 May 2006 (UTC)[reply]
What you are suggesting would be a "derivative work," therefore also a copyvio. --Connel MacKenzie T C 07:15, 18 May 2006 (UTC)[reply]

Redirects from lower to uppercase

Relevant Policy

We've had a draft policy since 15 Apr 2006, Wiktionary:Spelling Variants in Entry Names - Draft Policy that covers this !--Richardb 03:37, 14 May 2006 (UTC)[reply]

The "draft pol" discussion there, of course, is now overcome by events. Read on here, and find out why this "draft" needs massive revision, before a vote on accepting it will ever be remotely conceivable. Furthermore, that "draft" starts with a horrific error - since it was created on the sly, it does not yet reflect that it is "conflicting directly with existing practices." The comments I've made on that topic are mysteriously absent now, with the very first "rule" it suggests being the most outrageous. Any Latin-alphabet spelling difference is not acceptable as a redirect here on the English Wiktionary.
As current technical developments evolve (as discussed below) this will need further discussion and revision. Some improvements have already been made; more are imminent. This "draft" (which apparently reflects only one individual's POV,) does not acknowledge the former, nor recognize the latter. --Connel MacKenzie T C 21:07, 14 May 2006 (UTC)[reply]


Is it safe to delete entries that redirect from, say, paris to Paris? — Vildricianus 18:50, 27 April 2006 (UTC)[reply]

No. --Connel MacKenzie T C 19:05, 27 April 2006 (UTC)[reply]
It's been quite a long time now. There are quite a few wrong redirects (especially but not only those that involve German). There also seem to be people who are either actively making new such redirects or actively wikifying uppercase words in articles when they should be pipelinking so that we can eventually clean these up. So when will "eventually" be? I think we should decide on a time, put up a warning notice between now and that time that they'll be deleted, and then get rid of them. — Hippietrail 21:37, 27 April 2006 (UTC)[reply]
As time has passed, I've noticed greater variety in how mirrors and other Wikis/Wikts link here, not less. Until the MediaWiki software is fixed to correctly handle external links, we should be adding more redirects, not removing them.
I am shocked at the notion that deleting these navigation aids is somehow considered helpful, by anyone anywhere. Deleting a redirect increases the db size, while breaking links. How is that helpful? --Connel MacKenzie T C 22:05, 27 April 2006 (UTC)[reply]
That's all nice, but this here involves the other direction of redirects (read title), whose purpose I find doubtful. Who'll be ever linking to paris (ok, Kipmaster destroyed my example, say then, london)? — Vildricianus 22:12, 27 April 2006 (UTC)[reply]
Well, yes. Some mirrors recognize that en.wiktionary is now case sensitive, and therefore convert all links to lowercase first character (from their site to ours!) Others, like onelook, assume we still do it the same as Wikipedia, and link directly to uppercase first characters. The variety is just as bad within Wikimedia projects, it seems. --Connel MacKenzie T C 22:31, 27 April 2006 (UTC)[reply]
Mmm, that's even worse then. Bah. — Vildricianus 22:34, 27 April 2006 (UTC)[reply]
I don't see any reason to be shocked. Just tell us why you think it's a bad idea. If it's because other sites are lazy about how they link to us then I think my suggestion of putting a warning at the top of every page is a good one - or do we really want to encourage sites to continue ignorantly linking to us indefinitely? Doesn't that in a way make us a slave to outside sites? — Hippietrail 22:43, 27 April 2006 (UTC)[reply]
But you know very well that such links used to work correctly. --Connel MacKenzie T C 00:45, 28 April 2006 (UTC)[reply]
Yes, but that has nothing at all to do with trying to fix things. I really can't understand why you are fighting to keep things as they are instead of trying to move forward in a nice way which will lead us to a better overall situation. Well what do other people think? Especially long-time contributors - should we keep this upper to lower and lower to upper, the good ones and the broken ones, indefinitely? Forever? Or should we say - in the place we usually ask for donations and such - that we're deprecating this practice and sites linking to us have 2 months, 6 months, 12 months, whatever to comply? And if we don't want to move forward with this, why keep only half a system instead of making some bot which makes redirects for all the words which currently only have lowercase entry? I for one am quite annoyed each time it looks like a German noun or a proper noun already has an entry but the blue link turns out to be misleading due to one of these stale links. — Hippietrail 01:02, 28 April 2006 (UTC)[reply]
What I am saying has everything to do with fixing things! Wikipedia (AKA Big Sister) does not break external links in this manner; they do the opposite. Any site linking here reasonably expects us to follow that convention (apparently.) [Note: all mirrors are supposed to link back to us - the only ones that are having problems are the ones correctly linking back!!!] --Connel MacKenzie T C 05:37, 28 April 2006 (UTC)[reply]
I'm sure there's a CSS hack to make redirects have a different color. I have that on Special:Allpages now, and am still looking for something to make it work everywhere. It also works by setting the stub threshold infinitely high, so everything looks brown except redirects. — Vildricianus 10:17, 28 April 2006 (UTC)[reply]

What about redirects from upper to lowercase? Those seem even less necessary. (But if it's still policy to retain them, does that mean we should be adding one for each new entry created?) —Scs 14:23, 3 May 2006 (UTC)[reply]

Ec opposed doing this for new entries; I am still unclear why. --Connel MacKenzie T C 06:12, 5 May 2006 (UTC)[reply]
I don't know about Ec, but my reason would be: creating all these redirects is clearly a waste of time and database space. If users should have easy access to a differently-capitalized version of the word they were seeking (as of course they should), that's clearly a function which ought to be (and in fact for the most part already is) handled automatically by the mediawiki software, not handled manually by exhaustively creating a redirect for every single word in the dictionary.
There are three circumstances I can think of where this matters:
  1. the search box
  2. internal links (using [[]] wikilink syntax)
  3. external links (especially from mirrors)
The search box already works perfectly, as you can verify by experimenting with words like "Tamper" and "cypriot" (neither of which currently have redirects).
Internal links with the wrong case (e.g. Tamper and cypriot) don't automatically work, but it's not clear that they should. It might be nice if they could (as they do in the "caseless" Wikipedia), but of course editors can always use the pipe syntax.
Finally, and maybe I'm being too callous, I'm really not worried about redirects back to Wiktionary from mirrors. Those mirrors are mostly freeloaders who are mooching off our hard work already; why should we do even more work just to make their lives easier?
Scs 17:51, 6 May 2006 (UTC)[reply]
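The search-box behaviour described above, where "Tamper" finds tamper and "cypriot" finds Cypriot without any redirect page existing, can be pictured as a fallback lookup over case variants. The following is only a rough sketch in Python, not the MediaWiki code: the function names, the order in which variants are tried, and the sample title set are all assumptions made for illustration.

```python
def case_variants(title):
    """Yield the title as typed, then a few case variants, without repeats."""
    seen = set()
    candidates = [
        title,                          # exactly as typed
        title[:1].lower() + title[1:],  # first letter lowered
        title[:1].upper() + title[1:],  # first letter raised
        title.lower(),                  # all lowercase
        title.upper(),                  # all uppercase
    ]
    for candidate in candidates:
        if candidate and candidate not in seen:
            seen.add(candidate)
            yield candidate


def find_near_match(title, existing_titles):
    """Return the first existing title among the case variants, or None."""
    for candidate in case_variants(title):
        if candidate in existing_titles:
            return candidate
    return None


# Hypothetical set of entry titles; no redirect pages are involved.
existing = {"tamper", "Cypriot", "smith", "Smith"}
print(find_near_match("Tamper", existing))
print(find_near_match("cypriot", existing))
```

Because the exact title is always tried first, smith and Smith can still coexist as separate entries; the fallback only matters when the requested capitalization has no page of its own.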
Internal links have to treat upper- and lowercase forms differently - otherwise, once we had added smith there would be no (easy) way to add Smith. SemperBlotto 17:56, 6 May 2006 (UTC)[reply]
  • Connel, considering your emotional responses you may not be aware that we don't know what you mean. Please tell us why this breaks valid mirrors and doesn't break invalid mirrors, and why there is no long-term fix or plan to create one. — Hippietrail 21:12, 6 May 2006 (UTC)[reply]

I am chagrined at the notion of someone flippantly saying they don't care what work derives from Wiktionary. The point of it being GFDL is to encourage derivatives. If that isn't why you are here, then go work for one of the "closed" dictionaries and get paid for your work.

That said, there are derivatives out there that honor the GFDL, and there are derivatives that do not. The ones that do honor the GFDL have to link back here. They must, as a GFDL compliance requirement. Following the meta: directions, they will consistently link back here wrong. The mirrors that were valid in the past, linking back correctly to how we used to work, no longer work. The mirrors that "correct" entries to lower-case will randomly get the wrong result.

Just as much confusion exists within our sister projects, as with external valid mirrors. The variety of when to choose upper case vs. lower case is astounding. The only project that has even tried to rectify their redirects is the English Wikipedia; I think User:Uncle G may have abandoned that effort as well. English Wikisource, English Wikinews, English Wikibooks, English Wikispecies, English Wikiquote and Wikicommons are each much worse off than English Wikipedia. Other language Wikimedia projects I can't even guess at, but I do know that many use the "visible extended interwiki" style references, (e.g. fr:bon) in their translation sections. With the recent decapitalization of all Wiktionaries, many have changed the rules they follow for such redirects - some assuming lowercase, some assuming upper case, some assuming a link to [Search], etc.

With the software changes that have been made to decapitalize Wiktionary (decapitate, as I like to say,) the external links no longer work. Using redirects is the only viable work-around that I know of, to date. At this point I don't think the WM developers even acknowledge the problem. Apparently, many here don't quite get it either.

--Connel MacKenzie T C 23:10, 6 May 2006 (UTC)[reply]

There are two different questions here: how bad is the problem, and what's the right fix?
  1. How bad is the problem? I'm sorry you thought I was being flippant. You're right, derived projects are important. The ones I was dismissing (and don't feel like helping) are just those that slurp a wiki's contents and redisplay it with no value added other than the negative value of a bunch of ads.

    With that said, I'm still puzzled what the problem is. Can someone give some specific examples? (The only one mentioned so far has been http://onelook.com/, and it seems to work perfectly.) The obvious way for a site mirroring our definition of "foo" to link back is to link back to "foo" -- and, similarly, for "Foo" to link back to "Foo", "fOo" to link back to "fOo", etc. Are there really sites that mirror an entry titled "foo" and say to themselves, "Since the wiki software treats initial capitalization as nonsignificant, I should link back to 'Foo'"? That takes work, quite unnecessary work; why would a site go out of its way to do something unnecessary which would only cause problems (just like this) later? (And what's such a site doing mirroring an entry titled "foo" in the first place? Shouldn't it think that the title is "Foo"? Or is that the problem, that it was once titled "Foo", and along the way, when we decapitalized, we changed it to "foo"?)
  2. What's the right fix? To my mind, the right fix is clearly not manual duplication of every entry. That's a viable workaround only if the code is inviolate. But a software fix for the broken external link problem would be trivial. I'll work on it myself (it's a perfect excuse to dive into the Mediawiki code, which I've been meaning to do) and report back later. —Scs 13:58, 7 May 2006 (UTC)[reply]
I have to agree. From a software design standpoint, fixing the problem at one point is a million times simpler than creating all of these redirects. My apologies for the understatement. Consider that the software fix is a one-time solution, and the latter requires continual management and updating every time a page is created, indefinitely into the future. The second is not a viable solution and should not influence our reasoning on this topic. But the problem is the more important matter. If mirror sites may not function properly as a result, and as claimed many do not now, I would consider this to be a PRIORITY. Davilla 17:52, 8 May 2006 (UTC)[reply]
  • If you are adept enough at PHP to effect the right change, then please do! BTW, doing a lookup on www.onelook.com for "dog" links one to DOG (which I just added {{see}} to.) Perhaps "dictionary" is a better example, as that links to Dictionary which redirects to dictionary. Why do they do that? I assume it is to be consistent with their links to en.wikipedia.org. But my point is that if such a massively public, well-known site as onelook gets it wrong, how can we expect 500+ other mirrors to get it right? If you really need me to, I'll go to Alexa's list of whatSitesLinkThere and find more examples...but I'd rather not. --Connel MacKenzie T C 19:33, 7 May 2006 (UTC)[reply]
I'm not sure what onelook is doing, but they may not be as broken as you might think. If you search for "cypriot" there, they link to our Cypriot, and if you search for "Tamper", they link to our tamper. ("Tamper" and "cypriot" are two examples of words we don't currently have case-redirecting entries for.) Now, I see that for words we have both entries for, they do seem to always link to the capitalized one. They may be linking to the first spelling (irrespective of case) we ever added, or they may be linking to the one that's the first alphabetically. But it does appear they're taking care to link only to words we do have entries for. So they may be linking to our capitalized words not because they're stupid, but simply because we have them -- and our having them might therefore be, not a necessary fix, but rather enabling behavior!
As a test, it would be interesting to delete one of our redirects for some relatively unimportant word (say, Decile), wait a few weeks for their mirror to catch up, and see what they do.
Speaking of mirrors, we've been talking about "proper mirrors that do link back", but another important thing a proper mirror does is update itself regularly. (That is, after all, what "mirror" in this sense means; anything else is just a "snapshot", and those have way more problems than case mapping.) If we were to delete all our upper-case redirects, any proper mirror ought to catch up and fix themselves automatically on their next scan. (But it's true, sister projects with manually-composed links are another story.)
If you know of any other specific mirrors, please mention them. Besides onelook, so far the only one I've found is http://open-dictionary.com/, which seems to be only half-broken: "cypriot" finds their "Cypriot" which is a copy of ours, but "Tamper" doesn't work. (And there seems to be something broken about their links back to us in general.)
Scs 23:48, 7 May 2006 (UTC)[reply]
Well, that is very interesting. Onelook seems to be using the wrong index, as they link back to us for proteger (even though they had me generate an English only list, so they would only get English terms.) Apparently, there are more problems than I first suspected.
Alexa seems to be reporting a tremendous number of false-positives. open-dictionary.com looks pretty broken (and seems to auto-convert the first character to uppercase.) thefreedictionary.com seems to link when there is no "better" dictionary definition elsewhere, but links "correctly." Perhaps I should check the list of mirrors on meta: instead. --Connel MacKenzie T C 02:23, 8 May 2006 (UTC)[reply]
  • This Nicaraguan internet cafe is not as good as the one I was at last night so I'll be brief and maybe I even missed something above.
    • If the information on how to mirror on Meta is wrong, isn't it our duty to edit it and correct the mistakes? Isn't Meta editable by everybody just like any other Wiki?
    • If the Wiki software contains a bug or lacks a feature that means we are putting in lots of work to get mirrors working - and still with many imperfections, isn't it our duty to report that bug on http://bugzilla.wikipedia.org ?
    • Wiktionary is part of a community and we should act the part by being responsible and reporting problems to the other parts of the wider Wiki community and seeing that they are looked at. There is no need for us to be so passive and waste a lot of effort in workarounds when we can be proactive and help improve the Wiki experience for everybody. — Hippietrail 18:49, 8 May 2006 (UTC)[reply]
These are excellent points HT. Perhaps I'm not seeing the forest for the trees. --Connel MacKenzie T C 19:18, 9 May 2006 (UTC)[reply]

AFAIK, there is no such list at Meta. There is this and this, but the real "mirrors and forks" list is at Wikipedia. I haven't looked very thoroughly, but IIRC, there is no list of Wiktionary-only mirrors. Anyone? —Vildricianus | t | 19:21, 8 May 2006 (UTC)[reply]

Some time back, a visiting Wikipedian referred to the meta mirrors list as their starting point for finding a non-compliant mirror. I assumed that meant there was such a list. Perhaps they were just checking for Wiktionary content at each of the Wikipedia mirrors? --Connel MacKenzie T C 19:18, 9 May 2006 (UTC)[reply]

Having actually thought about this a little more, my attention is drawn to MediaWiki:Newarticletext. I've been bold and added a search link that may mitigate the problem to a certain degree. Clicking on that "search" link invokes the full [Go] functionality (which includes the search logic.)

If there were some way of telling what the referring URL was, we could possibly:

  1. add an identifying id= or name= somewhere in MediaWiki:Newarticletext,
  2. detect that tag in MediaWiki:Monobook.js,
  3. add logic to check the refurl to make sure we don't repeat searching in a loop,
  4. add logic to automatically invoke the [Go] link when those conditions are met.

Should I pursue this experiment? If this works, I would have no objection to deleting all "de-capitalization" redirects. This would have the secondary benefit of offering the preload templates to internal red links. --Connel MacKenzie T C 19:18, 9 May 2006 (UTC)[reply]
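The four steps above amount to a small decision: is the marker present on the page, and did the visitor arrive from somewhere other than the search page? A rough sketch of that logic follows. It is written in Python purely for illustration (the real experiment would live in Monobook.js); the marker id, the function names, and the way a search referrer is recognised are all assumptions, not anything taken from the actual interface messages.

```python
from urllib.parse import urlparse, parse_qs, unquote

# Hypothetical marker that step 1 would place in the interface message.
MARKER = 'id="noarticle-marker"'


def came_from_search(referrer):
    """Step 3: guess whether the referring URL was already the search page."""
    if not referrer:
        return False
    parts = urlparse(referrer)
    if "Special:Search" in unquote(parts.path):
        return True
    query = parse_qs(parts.query)
    if "search" in query:
        return True
    return any("Special:Search" in value for value in query.get("title", []))


def should_auto_go(page_html, referrer):
    """Steps 2 and 4: auto-invoke [Go] only if the marker is present
    and we did not just come from a search (to avoid looping)."""
    return MARKER in page_html and not came_from_search(referrer)


missing_page = '<div id="noarticle-marker">No such entry.</div>'
print(should_auto_go(missing_page, "http://onelook.com/?w=dictionary"))
print(should_auto_go(missing_page, "http://en.wiktionary.org/wiki/Special:Search/Dictionary"))
```

The loop guard is the important part: if the search itself finds nothing and lands back on a page with the marker, the referrer check stops a second automatic search.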

Hey, Connel, what city you in? We should be sitting in the same room, so we can compare notes and work on this side by side. :-)
I just tried your Wiktionary "E-mail this user" link but you don't seem to have a confirmed e-mail address. So instead of sending you my contact info, I guess we'll have to use Vulcan Mind Meld or something.  :-) --Connel MacKenzie T C 19:53, 9 May 2006 (UTC)[reply]
I'd been approaching the problem from a slightly different direction. I've got some new PHP code that's trying to do the right thing (and I got it working, too, not five minutes ago, in my own home wiki here), but the problem is that it can't inject its output quite where it wants, precisely because 'newarticletext' and 'noarticletext' are templates, not dynamically-generated.
(Did you mean MediaWiki:Newarticletext, or MediaWiki:Noarticletext? It was the page that comes up as MediaWiki:Noarticletext that I was trying to augment with a link to the other-case article, to fix broken links coming in from the outside.)
It might be possible to do some fancy programming right there in the templates (using mediawiki code, not PHP code, more or less as you suggest), but I haven't explored this yet.
I'm now going to try to ask the real mediawiki developers for a little advice. There's probably a right way and a wrong way to proceed, and they'll have a better feel for that than we will.
Scs 19:42, 9 May 2006 (UTC)[reply]
Your last point is the best...I agree completely.
Actually, it's even better than that. I posted a message to the Wikitech-l list, and one of the regulars there has already provided what I thought couldn't be done: the "fancy programming right there in the templates, using mediawiki code". See here. I haven't tested this yet, because I don't have the expression evaluation parser loaded into my home wiki. —Scs 04:12, 10 May 2006 (UTC)[reply]
Yes, here on en.wikt: we've had the inelegant {{~if}} and {{if}} for a while, but those are shunned for a variety of reasons. In the last week or two, now that the "#IF" magic-word exists, I've been thinking about correcting links to our templates. Unfortunately, that doesn't do an HTTP 302 nor even a javascript redirect. And since we can't determine the referring URL, we won't know if that was intended or not (e.g. an internal link should always get the edit page with an option to redirect, while external links should just be redirected immediately.) --Connel MacKenzie T C 05:02, 10 May 2006 (UTC)[reply]
As far as "New" vs. "No", I think I may have confused myself there. "New" is the result of an internal link, which is what I was testing a few minutes ago. --Connel MacKenzie T C 19:53, 9 May 2006 (UTC)[reply]

technical discussion

New section for edit link...

OK, it seems clear to me now that my basic assumption was wrong. Anyone arriving at a page containing MediaWiki:Noarticletext should be redirected (if alt page exists.) Anyone arriving at a page containing MediaWiki:Newarticletext should be warned if alt pages exist.

On the Noar page, I'll add an identifier within the conditional template(s). Within Monobook.js (my own, for now) I'll trigger a redirect. --Connel MacKenzie T C 05:17, 10 May 2006 (UTC)[reply]

For those still with us -- I got the {{#ifeq:}} hack in MediaWiki:Noarticletext working as its author intended, so we shouldn't need to muck around with monobook.js or CSS after all. See cypriot, (cypriot/Cypriot) Tamper, (tamper/Tamper) and nopageatall for a demonstration. —Scs 05:37, 11 May 2006 (UTC) added regular wikilinks. --Connel MacKenzie T C 06:23, 11 May 2006 (UTC)[reply]
Another example: milf/MILF/milf. Oh crap, does this mean I have to make good on my DeleteRedirectsBot idiotic promise? --Connel MacKenzie T C 06:23, 11 May 2006 (UTC)[reply]
By the way, I would like to see Hippietrail's auto-redirect experiment as well... --Connel MacKenzie T C 06:25, 11 May 2006 (UTC)[reply]
Now that this is "solved" for NS:0, perhaps we should let this conversation get archived, and restart a fresh discussion about policy regarding #REDIRECTs? --Connel MacKenzie T C 15:16, 12 May 2006 (UTC)[reply]
Not yet, I'd say. Let's see for a while how well this works. —Vildricianus | t | 17:53, 12 May 2006 (UTC)[reply]
  • Oh. I feel kind of silly now. Whether Noarticletext or Newarticletext finds a match, the redirect for both will be the same: to Special:Search/{{PAGENAME}} (or Special:Search/{{NAMESPACE}}:{{PAGENAME}}). There is no looping danger, in that case, right? If the page does not exist, the search page will open up, not a target page. If the page does exist in any capitalization form, the target will be reached without this nonsense. Forest for the trees, I tell ya. I must be ill, to have spun my wheels this much. --Connel MacKenzie T C 05:13, 13 May 2006 (UTC)[reply]
    • Actually, the choice to not do it for "Newarticletext" is so that we are able to enter upper and lower case entries for any given term. So we'll actually be "nicer" to external links than internal. Unless of course, we go berserk and add some kind of back-link (via cookie?) similar for "regular" redirect pages that when clicked on will go back to the internal link with something extra in the url (like redirect=no.)

Anyway, I have it working in my User:Connel MacKenzie/monobook.js (the top-most function in the file.) For external links, when you get to buenos Aires it auto-redirects to the search page, which zaps you directly to Buenos Aires. --Connel MacKenzie T C 07:19, 13 May 2006 (UTC)[reply]
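For readers following along, the redirect target discussed here, Special:Search/{{PAGENAME}} or Special:Search/{{NAMESPACE}}:{{PAGENAME}}, can be built as shown below. This is a sketch only, in Python rather than the JavaScript actually used in monobook.js; the base URL and the function name are assumptions made for the example.

```python
from urllib.parse import quote

# Assumed base URL for article paths on this wiki.
BASE = "http://en.wiktionary.org/wiki/"


def search_target(pagename, namespace=""):
    """Build the Special:Search URL for a missing page.

    The main namespace has an empty name, so only other namespaces get a
    'Namespace:' prefix. Spaces become underscores, as in wiki page URLs.
    """
    full_name = f"{namespace}:{pagename}" if namespace else pagename
    return BASE + "Special:Search/" + quote(full_name.replace(" ", "_"), safe=":/")


print(search_target("buenos Aires"))
print(search_target("Criteria for inclusion", namespace="Wiktionary"))
```

The search page then does the case-insensitive matching itself, so the script never needs to know which capitalization actually exists.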


Proper nouns/place names

I've been adding some capital cities lately, and I was wondering to what extent we're going with place names. We've got most countries and capitals now, and a couple of other cities and towns, but are all place names on Earth considered to be the part of the all words of all languages statement? I think they are, but I'm not certain everyone agrees on this.

If they are, then, how are we going about categorizing them? I now see that the Category:Capital cities may not have been an excellent choice, for I think it may involve politically loaded inclusions/exclusions and therefrom resulting discussions/edit wars etc. that are better left in Wikipedia. Any thoughts? — Vildricianus 21:21, 27 April 2006 (UTC)[reply]

I'm curious too, what would ideally go in such an entry? Just a short one-line listing, "a village / town / city in XXX country", with a "See also" pointing to Wikipedia? I think I'll go look at a couple Wiktionary entries and see if I can answer my own question.  :) Cheers, Eiríkr Útlendi | Tala við mig 21:28, 27 April 2006 (UTC)[reply]
I see no good reason for deleting any place name that is entered, even ones as small as, say, Butetown or Denigomodu. Otherwise we'd have to draft some kind of policy saying "only towns with x number of people in are allowed for inclusion", and nobody likes making policies, do they ;). As far as adding them goes, it should be very low down on our "priority list". Category:Capital cities is a good enough category in my eyes. When, in 5 years or so down the line, we've run out of non-proper nouns to add, we'll end up creating them anyway, lol. --Dangherous 21:36, 27 April 2006 (UTC)[reply]
Well, if Dunabökény can stay in here, then anything will. Take that entry as a test to the system...whatever system it may be. --Dangherous 21:46, 27 April 2006 (UTC)[reply]
We might want to consider (though not necessarily right away) whether we want a single all-inclusive Category:Capital cities, or some kind of regional breakdown. I can think of several ways to do this, but then the category isn't overly large right now. --EncycloPetey 05:45, 28 April 2006 (UTC)[reply]

Etymologies and translations are two good reasons to have them, though I confess I still feel in two minds about it myself. Widsith 07:05, 28 April 2006 (UTC)[reply]

That's what I thought as well, but there's not much to say in either section for less notable places, like, for instance, Big Lake, Texas. — Vildricianus 10:07, 28 April 2006 (UTC)[reply]
Does anyone have a sense for what criteria are used for determining inclusions in published geographical dictionaries like Webster's? --EncycloPetey 09:18, 29 April 2006 (UTC)[reply]
My gut feeling is that they have an idea of how big the book should be, and what price they can sell it for (and to whom) and include places in reverse order of size and importance until the book is "full". They probably include smaller places in the USA than in China if that is where they plan on marketing it. But our Wiki can be as big as it likes, is free, and we market to the world! SemperBlotto 10:14, 29 April 2006 (UTC)[reply]

These must be treated in a dual nature, just like given and family names. I really don't care which historical figures had the name David, and I don't really care which states in the U.S. have cities named Athens. The first is a common given name and the second is a place name. However, the Biblical figure and the city in Greece each deserve an entry. By what criteria though? The CFI currently says that names must be attributive. I've suggested before not including a place name (as a specific city or what have you) unless it has a common or non-literal translation on the other side of the world, which would indicate its importance. I'm sure "Big Lake" has a translation into Chinese, but would any Chinese person know anything about the city aside from the presumed big lake nearby? Taipei, on the other hand, isn't the most well received transliteration of the Chinese word, but it is the universally standard one. Davilla 13:36, 30 April 2006 (UTC)[reply]

My view is that place names should only be included if:-

  • they have a different name in a different language. We need the name in order to show the translation.
  • they are necessarily referenced from another entry :Athenian => Athens.
    Though I worry about Leodensian => Leeds. But, I suppose if Leodensian was written in a novel I'd want to know what that meant.

Mostly that then limits us to having entries for significant places.

But WT:CFI already says something - A name should be included if it is used attributively, with a widely-understood meaning. Perhaps it could do with a bit of updating to reflect the above though.
--Richardb 03:28, 14 May 2006 (UTC)[reply]

Wikisaurus cleanup

There is a project space established for this sort of discussion: Wiktionary talk:Project - Improving WikiSaurus. Please try to be a bit disciplined and conduct the discussions there. I will try to move this discussion to that place, and apologise in advance if I don't do it perfectly. (And Connel, please don't kneecap me or something if I make any mistakes). It sure would help if you guys had the discipline to use such discussion places in the first place. We would have one central place to carry on the discussion, and to look back on discussions. It would also help keep Beer Parlour more manageable.--Richardb 00:59, 14 May 2006 (UTC)[reply]

What is the status of clearing the WikiSaurus namespace of the plethora of nonsense that doesn't meet CFI? It is having secondary effects, no longer just a honeypot for vulgar minded contributions. The nonsense in WikiSaurus is not worthy of any "reference" such as a slang dictionary, nor this dictionary, nor this thesaurus.

It is no longer just an embarrassment.

--Connel MacKenzie T C 01:52, 29 April 2006 (UTC)[reply]

I'm not terribly involved with WikiSaurus, but I have given some thought to its remarkable underdevelopment. Perhaps we need an "entry of the week" (or something) where members help to brainstorm a variety of synonyms and related words on some topic. As far as I know most WikiSaurus development comes from the efforts of Davilla and Richardb, so some format for increasing involvement seems warranted. It would at least lead to a greater diversity of entries that would help swamp out the embarrassing ones. I note for example that there don't seem to be any entries pertaining to measures of time (e.g. soon, quickly, immediately, ASAP, ...). A broader array of meaningful entries can serve as a nice carpet to cover the naughty bits. --EncycloPetey 09:17, 29 April 2006 (UTC)[reply]
FYI I'm not a contributor. Davilla 13:17, 30 April 2006 (UTC)[reply]
I think he meant TheDaveRoss. —Vildricianus 13:24, 30 April 2006 (UTC)[reply]
I'm sorry, but several initiatives have started and died out, with the intent of doing just that. Each has failed. Confer WikiSaurus:penis. Of the 1,215 "synonyms" listed, I'd say perhaps five are valid.
I never quite know how seriously to take you Connel. But you sure must lead a sheltered life if you think there are only five valid words for penis! Richardb
Um, let's not entertain personal attacks. (OK, I should heed my own words!) I suggest that none of this nonsense belongs in a respectable thesaurus. --Connel MacKenzie T C 18:01, 6 May 2006 (UTC)[reply]

To call this host of nonsense "slang" is far too generous. How would you conversationally use flaming staff of vengeance which conquers the sniveling flowers of pink town with medieval war bombs of gooey hysteria as a synonym for penis?

One of the factors preventing development of WikiSaurus is the incredible preponderance of the vulgar entries.
If you look at the WikiSaurus:Category, I'd say the non-vulgar entries are building, slowly diluting the proportion of vulgar pages. They are only a preponderance if that is what you look for! Richardb
I vehemently disagree. --Connel MacKenzie T C 18:01, 6 May 2006 (UTC)[reply]

It is simply too difficult to comprehend creating valid entries, when so overwhelmed with this useless nonsense, no matter how amusing it may seem to our visiting vandals. We've tolerated WikiSaurus as a vandal honeypot for far too long. I think it should be scrapped and started fresh. --Connel MacKenzie T C 16:15, 29 April 2006 (UTC)[reply]

Where's the big picture of WikiSaurus? How can we scrap items prior to having something better to replace them? Are there enough people preparing to volunteer keeping the Saurus clean? What is clean? Won't it require more effort and manpower than is currently available? Do we have agreement on any of the many different possible approaches towards WS? And if we don't, will it help deleting all the current dubious content? — Vildricianus 19:06, 29 April 2006 (UTC)[reply]
Given just a bit of manpower, the rule to enforce is pretty simple. No red links on Saurus. If you want to add a term and it's red linked, you'd better create the page. If the page doesn't belong in the dictionary corpus, then the link doesn't belong in the thesaurus. The harder rule is to require an existing sense of the word. If the sense isn't there, you'd better add it, and hope it doesn't fail RfV. In fact, would it be too much to require reciprocated links? staff => See Wikisaurus penis if and only if staff is listed on Wikisaurus:penis. Davilla 14:51, 4 May 2006 (UTC)[reply]
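The two rules proposed above (no red links, and reciprocated links) are mechanical enough that a script could flag violations. Here is a rough Python sketch of such a check. The data structures, function name, and sample words are invented for illustration; a real check would have to read the pages themselves, which is not shown.

```python
def check_thesaurus_page(headword, listed_terms, entries):
    """Report rule violations for one thesaurus page.

    listed_terms: set of terms listed on the thesaurus page for `headword`.
    entries: dict mapping each existing dictionary entry to the set of
             thesaurus headwords that the entry links to.
    """
    problems = []
    for term in sorted(listed_terms):
        if term not in entries:
            # Rule 1: every listed term must already have a dictionary entry.
            problems.append((term, "red link: no dictionary entry"))
        elif headword not in entries[term]:
            # Rule 2, one direction: the entry must link back.
            problems.append((term, "entry does not link back"))
    for term, links in sorted(entries.items()):
        # Rule 2, other direction: an entry linking here must be listed.
        if headword in links and term not in listed_terms:
            problems.append((term, "entry links here but is not listed"))
    return problems


# Hypothetical data for a thesaurus page headed "happy".
entries = {"glad": {"happy"}, "cheerful": set(), "merry": {"happy"}}
listed = {"glad", "cheerful", "joyous"}
for term, problem in check_thesaurus_page("happy", listed, entries):
    print(term, "-", problem)
```

The harder rule, that the relevant sense must exist in the entry, is not something this sketch attempts; that still needs a human (and possibly RfV).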
That sounds very reasonable. Shall we start a vote on it, so that Richardb doesn't arrive after consensus is reached, to proclaim "that isn't the way it's done" again? --Connel MacKenzie T C 06:17, 5 May 2006 (UTC)[reply]
Connel, it might have been nice to at least leave a message for me. Richardb
Point taken. My impropriety was borne out of frustration. But I should have left you a note. --Connel MacKenzie T C 18:01, 6 May 2006 (UTC)[reply]

Another thing: would it help importing a public domain thesaurus? — Vildricianus 19:07, 29 April 2006 (UTC)[reply]

My previous suggestion to clean house was met with enough resistance that I didn't. Whether this is a good thing or not is up for debate, but there was definitely not a consensus to remove the unconventional material from the anatomy and sex related articles. For the most part folks took the stance "if we don't look at it it isn't there" or "add enough and maybe people won't notice", which I find fault with, but I am not the supreme overseer of wikt and will bow to the lack of consensus. That being said, over the past month I think Saurus has more than doubled in substantive entries, and there have been new formats suggested and tried; it certainly isn't broken beyond repair. Alas, I had to slow down in my efforts there because of recent non-wiktionary concerns, but those are hopefully clearing up to a point where I can get back at it. - TheDaveRoss 16:45, 30 April 2006 (UTC)[reply]
I knocked back TDR's cleanup the first time around, not on any principle of "NO DELETION", but principally on the fact that a lot of very useful stuff got chucked out along with the complete dross.Richardb
Two more things: here are some things that are being worked on, Request WikiSaurus entries, the page I have been using to stage a cleanup effort, and a list Dvortygirl made of related words. Also, the new layout is far from complete, so any work on the header or overall layout is a plus. I have tried things like WikiSaurus:annoy but they have proven difficult to edit, which is a bad thing.
Whether or not we should import a PD thesaurus...I am skeptical, but if a good one could be found and integrated well, I am certainly open to it. - TheDaveRoss 16:53, 30 April 2006 (UTC)[reply]
In terms of a PD thesaurus: you probably all know this, but Project Gutenberg has a copy of Roget's 1911 Thesaurus. —Scs 05:17, 4 May 2006 (UTC)[reply]

So here's a question: is there a convention for linking to Wikisaurus categories from the Synonyms and Antonyms sections of regular Wiktionary entries? It seems there ought to be. —Scs 21:13, 2 May 2006 (UTC)[reply]

I have created a template, Template:WikiSaurus link, to provide a neat little link to the WikiSaurus entry. Richardb
Yeah, that's really where Wikisaurus should draw from. I wasn't sure how to do it when I linked porta-potty and a bunch of others to portable toilet, and I guess whoever recently wrote portapotty didn't look for other synonyms in any of the right places. Davilla 22:03, 2 May 2006 (UTC)
Scs: another user has created a nifty, pretty little link, similar to {{wikipedia}}, for links to WikiSaurus; {{WikiSaurus-link|headword}} should do it for you. I would like to add an option for those people who are not too pleased with the current state of things, but would like to see WikiSaurus progress a little anyway: just give me content to format and start entries on. Dvortygirl has done a great job of this already, but if you have a scrap of paper and are thinking about which synonym to use one day, scribble down everything you think of ("fire: terminate, layoff, something about a pink slip...retire, quit and outsource are related...etc") and post that here. It is actually a lot of fun making these: you get to go a bit wild connecting words to other words and mulling over your whole vocabulary to get as much as you can. Give it a shot, and even if you don't want to delve into category:WikiSaurus, you can still contribute. - TheDaveRoss 14:50, 3 May 2006 (UTC)

What's the right place to discuss the WikiSaurus effort? I have a couple of questions/comments, which may not belong here, but I'll go ahead and toss them out anyway:

  1. Is there a new (undocumented?) consensus on format? The format described at Help:Creating a WikiSaurus entry is quite different than the format evidenced by e.g. WikiSaurus:annoy and WikiSaurus:happy.
    This is a new format that I have been toying with. It is implemented on about 10 entries so far, and we are deciding now whether people like it or not.
  2. I can see that the format evidenced by e.g. WikiSaurus:annoy and WikiSaurus:happy might be more useful to users, but it looks to be much more work to maintain, involving as it does quite a bit of duplication of effort (all those transcribed definitions). Also, its "WikiSaurus" column looks redundant and suggests an overabundance of eventual synonym pages. For example, at WikiSaurus:happy, in the Antonyms section, both "sad" and "unhappy" are listed with (red) links to separate WikiSaurus:sad and WikiSaurus:unhappy entries -- but it seems to me we'd want one page containing both those words and their other synonyms.
    I have attempted to make it as easy to maintain as possible, and future revisions are VERY easy to convert to. The reasoning behind the WikiSaurus column is that I think we will want a WikiSaurus page for every word with a synonym, and I frequently look up a word that I find in a Thesaurus to see what else it might connote or relate to before using. If this isn't useful it is very easy to remove across the board by editing only one template :).
  3. What exactly are we trying to accomplish? In the conventional world of dictionaries and thesauruses, dictionary entries typically list at most one or two very close synonyms, the occasional "Usage Note" lists slightly-less-closely-related words and gives differentiating definitions and other guidance, and thesauruses list both closely- and loosely-connected synonyms, but typically as raw lists. We don't have to copy any of those formats exactly, of course, but we should have an idea what we're trying to do -- at the moment we've sort of got mixtures of close synonyms, loose synonyms, and usage guidance.
    There is no one formal declaration of the end goal. We are still feeling it out somewhat.
  4. I would suggest that lowering the workload (in particular by reducing duplication of effort) is an important goal: setting the bar too high in terms of "usefulness for the reader" ends up being less useful for the reader if it requires so much work that it never gets done. (So, without wanting to cast asparagus at all of TheDaveRoss's hard work, I'd vote for simpler, unannotated lists.)
    Unannotated is easier, but we can certainly leave the option open. I could add a "simple" template which doesn't ask for a second param; it is all very fluid at this point.
  5. What's the right thing to do for different parts of speech surrounding the same concept? Separate entries or consolidated? Real thesauruses, of course, typically consolidate. Some of our entries -- for example, WikiSaurus:sleep -- do, but the instructions at Help:Creating_a_WikiSaurus_entry don't cover this. I just created WikiSaurus:agony, WikiSaurus:agonizing, and WikiSaurus:agonized, and almost created them all on one page (which still seems sensible on some fronts), but chickened out at the last minute and separated them.
    As I said, this is still all up in the air. The best place for discussing it is the "Thesaurus Considerations" page, but that is a discussion spanning 3 years.

Scs 01:16, 4 May 2006 (UTC)

- TheDaveRoss 02:19, 4 May 2006 (UTC)

Thanks for the answers and the link to Wiktionary:Thesaurus considerations. I'll probably pursue the rest of my musings there. One quick followup question, though: you said, "I think we will want a WikiSaurus page for every word with a synonym." Did you really mean that? If we want separate lists for "synonyms of sad" versus "synonyms of unhappy", it seems to me the place for them is on the sad and unhappy pages -- we don't need separate thesaurus pages any more, if they're one-to-one with headwords. (It also seems like a maintenance nightmare, though perhaps fixable with bots.) —Scs 03:42, 4 May 2006 (UTC)

It is VERY rare that two words mean exactly the same things in all contexts and are completely interchangeable.
Uh... who ever said they had to mean the same thing in all contexts? Davilla
Take sad and unhappy, for example: a synonym for "sad" might be "pathetic", which wouldn't be a good choice as a synonym for "unhappy".
Sure. Perhaps I shouldn't have said "'synonyms of sad' versus 'synonyms of unhappy'". But this is precisely why thesauruses don't list quite the same kind of strict synonyms that dictionaries do. They deliberately shave off some of the nuances and glom similar-meaning words together, precisely so that you can broaden your search from the one not-quite-right word you can think of to the perfectly-fitting word you wanted. (In other words, to use your example, one might very much want to go from unhappy to sad to pathetic.) —Scs 05:32, 4 May 2006 (UTC)
Actually the solution is much more basic than that. "Pathetic" is a different meaning of sad from "unhappy", so the synonym section of sad would say, for this meaning see Wikisaurus:PATHETIC, and for that meaning see Wikisaurus:SAD (as in "unhappy"). Davilla
Therefore, a page for every word. This is less contentious, I think, than including a page for every form of a word in the main namespace (plurals, verb conjugations and such); we would potentially lose content by only having particular headwords. - TheDaveRoss 04:39, 4 May 2006 (UTC)
But again, if it's a page for every word, why have a separate page? Why not list the synonyms for sad directly at sad, and the synonyms for unhappy at unhappy? —Scs 05:32, 4 May 2006 (UTC)
Yeah, but anyways that's kinda nuts to even think of it that way. (Not in the clinical sense, but not in the squirrely sense either.) Wikipedia doesn't go about making different pages for every term just because the same idea has two different names. Wikisaurus would similarly require cleanup tags to merge and split. Davilla 14:35, 4 May 2006 (UTC)
That is the primary difference between Wikipedia and Wiktionary though...pedia is interested in the ideas, while we are interested in the names :) - TheDaveRoss
Perhaps in WikiSaurus we are interested in the ideas, since we want to group words under similar concepts, similar ideas? Richardb
Personally, now that WikiSaurus has some serious content growing, perhaps we could start doing some limited cleanup of the "dirty" stuff, but not on such tough criteria as are used for entry pages. Maybe 1,000 Google hits. My guess is that would be enough to cut out stuff like flaming staff of vengeance which conquers the sniveling flowers of pink town with medieval war bombs of gooey hysteria (209 Google hits), without reducing us to Connel's preferred five entries! --Richardb 15:59, 6 May 2006 (UTC) (Bet medusetl isn't challenged, even though that gets only 3 Google hits.)
I strongly disagree. I think Wiktionary's criteria are a much better measure of a term's validity. --Connel MacKenzie T C 18:01, 6 May 2006 (UTC)

But I think the harder thing is to (bloody keyboard were te "H" is not working) see how WikiSaurus works for someting like WikiSaurus:new. Richardb

Connel called WikiSaurus a "vandal honeypot". Is this really the case, or is it more of a "sacrificial anode"? It takes the hits while the main pages are left alone. Unless we bowdlerise Wiktionary, we are always going to have juvenile minds drawn to the "dirty" stuff. --Richardb 16:22, 6 May 2006 (UTC)

Perhaps we are using different terms for the same concept. The problem I see is that with all the nonsense terms wikified in WS entries, those same juvenile contributors are adding entries to the main namespace. This creates plenty of ill will. And it is a significant amount of "extra" sysop cleanup activity. All for no reason, in my opinion. --Connel MacKenzie T C 18:01, 6 May 2006 (UTC)

WikiSaurus - compromise proposed (/more)

A possible compromise between the "tough criteria for WikiSaurus" and the "don't lose even the least valuable synonyms" positions: introduce, in WikiSaurus, a xxx/more subpage for the problem pages. Cull the trash from the main page (by whatever criteria), but don't just delete it; put it in the /more page. In the main page, indicate that new entries not meeting the tough criteria have to be put in the /more page, where they can be researched for verifiability and perhaps later promoted to the main page. With this I would then suggest we might even protect the main WikiSaurus page. Admins would then be responsible for checking the /more pages every so often to see if there are any terms that could be promoted to the main page because they meet the criteria. Thus we would meet two purposes. The main WikiSaurus page would be kept up to our "standard" (which I have to point out is very subjectively applied), whilst the /more page would capture every possible synonym, and would in effect be a specific protologism page. --Richardb 23:26, 10 May 2006 (UTC)

I've made a start with WikiSaurus:penis and WikiSaurus:penis/more. Davilla knocked out a load of words of less than 1,000 Google hits. I've put them into the /more page. But there are a lot more words in the main page that do not meet the 1,000 Google hits criterion. I've moved just 1, to show some willing. No doubt others will be wanting to rip into cleaning up this and other similarly polluted entries in this way. I can see no risk in just going for it with gusto, and rationale, as there is no deletion, just moving stuff to another entry. Nothing is lost. If we change our mind, it can easily be restored. --Richardb 00:46, 11 May 2006 (UTC)

Again, I'm not really involved with this project. In fact I don't think I've ever made a contribution that sizable! Davilla 16:30, 12 May 2006 (UTC)
/more seems like a fine compromise. But I don't think we'll need to get too heavy-handed with warnings on the main page, let alone protection. Cruft will always tend to accumulate on the main page, but this "safety valve" of /more gives a later cleaner-upper an easier excuse to be bold and sweep out the cruft, since it won't be deleted forever. —Scs 06:18, 11 May 2006 (UTC)
Fine, good idea. However, can we find something other than 1000+ Google hits as the criterion for a long-term solution? —Vildricianus | t | 08:24, 11 May 2006 (UTC)
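
For anyone who wants to script the /more split proposed above, here is a minimal sketch. It assumes the hit counts have already been collected by hand; the function name `split_entry`, the constant `HIT_THRESHOLD`, and the "example common term" count are hypothetical illustrations, and the 1,000 figure is only the number floated in this thread, not settled policy.

```python
from typing import Dict, List, Tuple

# Threshold floated in this discussion; an assumption, not agreed policy.
HIT_THRESHOLD = 1000


def split_entry(
    hit_counts: Dict[str, int], threshold: int = HIT_THRESHOLD
) -> Tuple[List[str], List[str]]:
    """Partition terms into (main page, /more page) lists.

    Terms with at least `threshold` hits stay on the main page;
    everything else is moved to /more rather than deleted.
    """
    main_page = sorted(t for t, hits in hit_counts.items() if hits >= threshold)
    more_page = sorted(t for t, hits in hit_counts.items() if hits < threshold)
    return main_page, more_page


if __name__ == "__main__":
    # The counts for "medusetl" (3) are quoted in this thread;
    # the other entry is a made-up placeholder.
    counts = {"medusetl": 3, "example common term": 50000}
    main_page, more_page = split_entry(counts)
    print("main page:", main_page)
    print("/more page:", more_page)
```

Because nothing is dropped, every term ends up in exactly one of the two lists, which matches the "nothing is lost" point: swapping in a different criterion later only means changing how the threshold test is computed.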

Wiktionary:WikiSaurus criteria created as Policy Think-Tank

I've created this "Policy" page. Can we now, please, shift discussion about this to that page and its discussion page? --Richardb 01:28, 11 May 2006 (UTC)

First thing to do is to centralize all discussion and information we have about WikiSaurus on one page, Wiktionary:WikiSaurus, because it's a bit all over the place now. One thing we can do is, instead of using the Beer parlour to discuss new items, start using Wiktionary talk:WikiSaurus for such things, and bringing only major feats here. Perhaps we can also list all users who are interested in actively working on the WikiSaurus revamp. —Vildricianus | t | 12:04, 12 May 2006 (UTC)