Last modified on 30 July 2014, at 10:41

Wiktionary:Beer parlour/2009/April

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

April 2009

Translations of taxonomic names

I wanted to enter somewhere that the Norwegian name for Primulaceae is nøkleblomfamilien. How is this possible with no English entry, only translingual? Do I create a translations section inside the translingual entry? I checked a dozen or so other members of Category:Taxonomic names and none had translations sections. __meco 08:17, 1 April 2009 (UTC)

We've had this conversation before, though I'm not sure where or when. At least some people contended, and I think this may have been the resolution, that the language-specific name is the translation of the English word representing that taxon, such as (in this case) primrose (one sense), and not of the translingual taxon name.—msh210 15:59, 1 April 2009 (UTC)
There is also a mechanism in place at Wikispecies to enter translations for taxon names. --EncycloPetey 03:21, 2 April 2009 (UTC)

Wiktionary:Writing templates

Hi everyone. A document like this has long been needed, and my meagre start merely scratches the scab off the itch. Could people please modify it where I am wrong, and improve it. Feel free to be slightly prescriptive. It'd be better to have "ideals" than "current practice". Thanks Conrad.Irwin 22:56, 1 April 2009 (UTC)

Hi, on the topic of floating, you write: "Conjugation templates: If it's small, float right and don't hide." Recently, I have made a move in the other direction, with German conjugation templates, turning some floating templates into non-floating ones. Is it really preferred that conjugation templates float in the right? From what I have seen in English Wiktionary, most conjugation templates do not float in the right, and I for one prefer to avoid floating conjugation templates. What do other people think? --Dan Polansky 09:45, 2 April 2009 (UTC)
I agree on the non-floating. There's also a timely example similar to this at WT:RFDO#Template:romance_cognates. --Bequw¢τ 05:36, 3 April 2009 (UTC)
Non-floating templates are better. They can be placed into a section without extending into other sections. --EncycloPetey 13:43, 3 April 2009 (UTC)

Move Category:Chinglish > Category:Chinese English

According to w:Chinglish, “The term "Chinglish" is mostly used in popular contexts and may have pejorative or derogatory connotations. The terms "Chinese English" and "China English" are also used, mostly in the academic community, to refer to Chinese varieties of English”

Wikipedia's rule for titling articles is “most common name”, but Wiktionary's a different animal. Any objection to moving the category? Michael Z. 2009-04-02 17:13 z

Do we even need the category at present? It currently has no entries in it, and the examples of Chinglish I've seen are a result of mistranslation or grammar problems rather than deliberate intent to develop a Chinese variety of English. — Carolina wren discussió 21:46, 2 April 2009 (UTC)
I have no idea. My uninformed impression is that Chinglish is considered a “sub-standard” speech, but we are a documentary dictionary, so there's no reason to exclude it. Someone foresaw a need by creating the category, and there are 300–500 million ESL speakers in China, so sooner or later they will start speaking to each other and someone will document it. I won't object to leaving it around until it's needed, nor to deleting it until it's needed.
I'm not even positive that it should be moved; I'm just going by the WP article's intro, and OED labels it “colloq. (freq. depreciative)”. It seems that educators and translators are interested in the subject.[1]Michael Z. 2009-04-03 00:04 z
A concurrent discussion about dialect labels makes me think it's better to define a dialect template and category before it's needed than to have editors start applying it ad hoc. After writing the last sentence, I found that it's actually already used in the entry bo jook, but there doesn't happen to be a template. So I'll create the template now, and let's keep the category. Michael Z. 2009-04-03 00:31 z

Words in the News (defunct?)

I have stopped adding words to Wiktionary:Words in the News. I don't think many people look at it, and it is getting difficult to find anything interesting to add. Unless anyone else wants to take it over, I propose that we archive the current content, remove what links to it there are, and just forget about it. If any really interesting word turns up in the news, we can always slot it into Word of the Day. SemperBlotto 15:10, 3 April 2009 (UTC)

We should add some links from there to Visviva's News wordlists, and any of the other such things; but yes, Words in the News does seem dead right now. JesseW 19:41, 3 April 2009 (UTC)
Archived. Delinked. SemperBlotto 14:53, 6 April 2009 (UTC)
I have wanted to add words to this list a few times but didn't due to the WikiNews restriction. Now that it's unmaintained I've added some words including ones we need definitions for from wherever I read news headlines today. — hippietrail 23:43, 24 June 2009 (UTC)

Taking a fortnight off.

I'm taking a break for a few weeks to attend to family and academic obligations. I would really appreciate if the rest of you would buckle down and finish the Wiktionary by the time I get back. Cheers! bd2412 T 23:33, 3 April 2009 (UTC)

Sure, we'll leave the last word for you to do in ceremony on your return! Conrad.Irwin 00:05, 4 April 2009 (UTC)
I'm distressed to have returned and found that the collection of "all words in all languages" is not finished. Quit lollygagging people, and let's wrap this up already! bd2412 T 21:56, 5 June 2009 (UTC)
We had almost finished, but the New York Times keeps making more words. Stop them! Equinox 23:53, 5 June 2009 (UTC)
Always with the excuses... bd2412 T 00:45, 6 June 2009 (UTC)

Uploading the remaining language templates.

I have started my bot slowly uploading the remaining ISO 639-3 templates as it seems a bit silly not to just have them all. The script will also tell me which templates differ from the standard's names (but won't 'correct' them). In the first few it has only come across {{aab}} which reads Arum-Tesu not Alumu-Tesu. Conrad.Irwin 22:17, 5 April 2009 (UTC)

We have been doing this by hand as needed for a reason. A number of reasons. Are you only getting the I (individual languages)? Are you checking that the ISO/SIL name is the name of the language, without things like a qualifier in parens (you are not), checking that it isn't a -1 language we use the 2 letter code for? Checking that it isn't an artificial language we prohibit? All in all, a recipe for a real mess. Since someone is going to have to check (and that means looking at the language itself, in WP or whatever) every single one, are you going to do it? I don't think this is really a good idea. Robert Ullmann 10:09, 6 April 2009 (UTC)
From what I've seen, it indeed only creates individual language codes (no collective codes), so no problem there. And I think we should have three-letter codes even if a two-letter code exists (it's already done on Wiktionary). As for artificial languages, the only actually forbidden artificial languages with ISO codes are Quenya and Klingon, I believe. All others just don't have any consensus. -- Prince Kassad 10:55, 6 April 2009 (UTC)
I have also replied on my talk page, including a list of discrepancies found so far. Conrad.Irwin 11:21, 6 April 2009 (UTC)
Duplicate language codes are a problem for categories (we don't want both Category:es:Cardinal numbers and Category:spa:Cardinal numbers) and technically "collective codes" are only in ISO 639-2 (639-3 has "macro languages"). I'm sure Conrad is able to filter out the ones that duplicate 639-1 codes, have parens or commas in the title, and are artificial (there's only maybe 8 or so). I don't think the concern with macro languages should stop Conrad. Best to get it out in the open and deal with it early. We should set up Wiktionary:Language codes with, among other things, a section on macro language codes and list which undergo fusion and fission. We can delete and protect the codes that we don't use here. I say go for it, as it may be easier for everyone if we can deal with the majority of these issues in batch-style work (I know there's duplicate codes around still). --Bequw¢τ 02:31, 7 April 2009 (UTC)
I think it should be noted that we do not, as of yet, have a standard format for how to deal with languages with parenthetical qualifications, and there are quite a few of them. Robert proposed replacing the parentheses with a simple dash, which I think is reasonable, but we never really got consensus on that. Also, and I don't know how feasible this is for a bot to do, but it'd be nice to have a list of macros that we're treating as individual languages (such as Albanian). I don't think we should instantly put the kabash on them just yet (as for most of them, we have no good method/people for transitioning to true languages), but it'd be nice to have a list somewhere so we at least know they need fixing. Anyway, while Robert mentions some valid concerns, I think this is an excellent idea, and my thanks go out to Conrad for it. -Atelaes λάλει ἐμοί 05:01, 7 April 2009 (UTC)
What does kabash mean? We do not have an entry for this word. The uſer hight Bogorm converſation 06:00, 7 April 2009 (UTC)
See kibosh. Nadando 06:09, 7 April 2009 (UTC)
This subject is mature enough for a dedicated page, and one that sets the standard if we agree to it. I don't think it should just be a Wiktionary: page though. It should be in Appendix: so that everyone can see what languages are included in the dictionary. DAVilla 02:29, 8 April 2009 (UTC)
A first attempt at Wiktionary:Language codes is up. It includes a table that lists the ISO 639-3 macrolanguages so that we can note how we currently treat them (if we split them up or keep them whole). It has info on the language code decisions that I'm aware of. Please add to it:) (I wasn't sure about which prefix to use, Wiktionary for now)--Bequw¢τ 07:58, 14 April 2009 (UTC)
That looks useful, I like the macrolanguages table a lot, though I'd question whether it should be "policy", it's more of an informational page. If I get some time later, I'll filter the list of language codes and begin uploading again. Conrad.Irwin 08:53, 14 April 2009 (UTC)
It's now uploading again, but ignoring macro-languages, constructed languages, ISO-1 coded languages and languages with brackets. For the full list of exclusions see User:Conrad.Bot/bad_iso. Conrad.Irwin 09:44, 14 April 2009 (UTC)
Now complete, we have all the templates except for the ones on User:Conrad.Bot/bad_iso that did not exist before the filtered run. Conrad.Irwin 16:27, 16 April 2009 (UTC)
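
The exclusion rules described in this thread (skip macrolanguages, constructed languages, languages that already have an ISO 639-1 code, and names with brackets) can be sketched as a simple filter. This is only an illustration, not the code Conrad.Bot actually ran: the `LangCode` record, its field names, the `should_upload` helper and the sample rows are all hypothetical, and the comma check follows Bequw's suggestion above rather than the bot's stated exclusions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LangCode:
    """Hypothetical record for one ISO 639-3 table row (field names are assumptions)."""
    code3: str   # three-letter ISO 639-3 code
    code1: str   # two-letter ISO 639-1 code, or "" if none exists
    scope: str   # "I" = individual language, "M" = macrolanguage
    kind: str    # "L" = living, "C" = constructed, etc.
    name: str    # reference name of the language


def should_upload(entry: LangCode) -> bool:
    """Return True if a template should be created for this code."""
    if entry.scope != "I":
        return False          # skip macrolanguages and anything not individual
    if entry.kind == "C":
        return False          # skip constructed (artificial) languages
    if entry.code1:
        return False          # the two-letter code is used instead
    if re.search(r"[()\[\],]", entry.name):
        return False          # qualified names need a human decision
    return True


# Illustrative rows only; the last one is invented to show the bracket rule.
SAMPLE = [
    LangCode("aab", "", "I", "L", "Alumu-Tesu"),
    LangCode("spa", "es", "I", "L", "Spanish"),
    LangCode("sqi", "sq", "M", "L", "Albanian"),
    LangCode("tlh", "", "I", "C", "Klingon"),
    LangCode("qaa", "", "I", "L", "Example (Hypothetical)"),
]

uploadable = [entry.code3 for entry in SAMPLE if should_upload(entry)]
```

Anything the filter rejects would go to a review list (as with User:Conrad.Bot/bad_iso) rather than being silently dropped.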

Why does the pronunciation not play automatically?

Why can't the pronunciation play automatically on Wiktionary, as on www.answers.com, when we click the sound (blowhorn) icon? Why is it in ogg format and why does it ask us to save the file first... (saving is ok but should be optional)? Am I asking this at the right place? 69.14.222.205 02:23, 6 April 2009 (UTC)

This isn't a Wiktionary issue; it depends on your browser configuration and how it deals with OGG files. Look at your browser help file. Equinox 02:26, 6 April 2009 (UTC)
Mediawiki uses OGG files because they're widely used and the format is free to use, not owned by someone who might demand money or recognition or such. See w:Ogg. --EncycloPetey 04:48, 9 April 2009 (UTC)

The prepositions/adpositions must be given with each verb or noun

I want to get the adposition used with verbs, e.g. attentive to their needs, separately from its neighbors, and as used with nouns, e.g. a thirst for revenge, an amendment to the constitution. But many verb definitions and noun definitions do not include this information. This is very important for people with English as a foreign/second language. Is there a page on wiki projects which has this information? 69.14.222.205 02:23, 6 April 2009 (UTC)

Where the combination of a verb or noun with a preposition means more than a basic sum of parts, it should have an entry of its own, linked from the verb or noun's entry in the Derived terms section. Indicating a preference for a particular preposition to be used would otherwise go in the Usage notes section. We don't have too many projects here as they do on Wikipedia. The closest analogue we have is generally the various subcategories of Category:Requests by language. (As a side note, in both your examples with to, I'd find of to be equally valid.) — Carolina wren discussió 03:41, 6 April 2009 (UTC)
Actually, giving the relevant prepositions is a key issue which we should definitely do more of, and we have discussed it in the past without ever really working out the best format. See for example die, Verb sense 1, where I used subsenses for the job. Ƿidsiþ 12:49, 6 April 2009 (UTC)
of could be equally valid, but the point is: how is a foreigner supposed to know what to use with these nouns and verbs? Just telling the definition of a word does not allow the reader to be able to use them. A dictionary is the perfect place to tell this smaller word (adposition). What I mean here is that if the word is worthy and the definition is Having merit, or value; useful or valuable then a person will not know how to use it in: 'he is worthy __ the prize'. The dictionary might as well tell that the adposition (also specify what type of adposition: preposition, postposition, or circumposition) is of. Otherwise the user might use worthy for, worthy with etc. which are incorrect 207.148.219.146 12:45, 6 April 2009 (UTC)
207.148.219.146, I concur with you in general, but while worthy is a piece of cake, because the adposition is the same in the other two greatest West Europæan languages (digne de, etw. (Gen.) wert, Gen.->of), a more tricky example would be angry with (furieux contre, wütend auf). Howbeit, the addition of adpositions for all major (Indo-Germanic? I am not sure about the non-Indo-Germanic languages, in Japanese this is determined by particles) languages is no doubt exigent. The uſer hight Bogorm converſation 16:36, 6 April 2009 (UTC)
What about verbs? Do we document that I write in ink (but presumably not for The Times, on paper, with pencil as they are used the same way throughout)? Conrad.Irwin 16:50, 6 April 2009 (UTC)
Common ways of writing could be offered as example quotations in write, and common phrases might qualify as entries in themselves. The meaning of in ink, for example, is not sum-of-parts. But we have to draw the line at offering language tutorials or lessons. Michael Z. 2009-04-06 18:12 z
My approach for a similar situation in Hungarian was a template that displays the case endings/postpositions and puts the entry in a category where words requiring the same case endings/postpositions are collected, see an example at aggódik (worry). --Panda10 23:16, 6 April 2009 (UTC)
Definitely to be encouraged! Probably the most widely known example is different than (US) vs. different to (UK) and the recommended usage different from. Unfortunately the links are and should be red because this is done through examples and usage notes at different. If you have a suggestion for a more standard way to present this information, I'm all eyes and ears. DAVilla 02:01, 8 April 2009 (UTC)
The best way to do this, in my opinion, is to collect citations that demonstrate use of the verb with various prepositions. --EncycloPetey 04:46, 9 April 2009 (UTC)

Is there a robot for this?

Is there a robot which can collect noun definitions where the plural of a word is not mentioned? I hate entries of nouns where the plural is not given. 69.14.222.205 02:23, 6 April 2009 (UTC)

What do you mean, "collect noun definitions"? Can you give examples of entries that you "hate"? Equinox 02:27, 6 April 2009 (UTC)
I mean that the robot can tell a person (admin, editors etc) that these words are nouns/can be used as nouns but the plural form of the word is not mentioned in its definition. I am not able to locate any such word now, but in the past I have seen some words defined incompletely (maybe they were newly entered and the robot did not reach them yet). The question is, 'Is there such a robot?' 207.148.219.146 12:50, 6 April 2009 (UTC)
No. Though it'd be fairly easy to scan the XML dump if you are interested. Conrad.Irwin 12:59, 6 April 2009 (UTC)
Hm. How is the robot supposed to distinguish countable from uncountable entries? It would be reasonable just to detect the absence of any tag for countability and to tag it. The uſer hight Bogorm converſation 16:38, 6 April 2009 (UTC)
I've got a program that scanned the XML for nouns that don't use the {{en-noun}} template, results are at User:RJFJR/nounscan. If it used the template it should either list the plural or be marked as uncountable with a '-' so fixing the entries to use the template should help. RJFJR 16:46, 6 April 2009 (UTC)
Another good list would be those that are marked {{en-noun|?}} so that they could be fixed up as well. --Bequw¢τ 02:08, 7 April 2009 (UTC)
My previous comment struck because it applied to {{en-noun|!}}
In either case an automatic category would be nice. DAVilla 21:25, 7 April 2009 (UTC)
Those marked {{en-noun|?}} are listed in Category:English nouns with unknown or uncertain plurals, and those marked with {{en-noun|!}} are categorized in Category:English nouns with unattested plurals. Perhaps the first could be changed to something like "English noun entries with no plural given", and then used as a general cleanup cat? -- Visviva 02:58, 8 April 2009 (UTC)
It would be pretty simple to generate a list of English nouns that use either simple boldfacing or {{infl|en|noun}} in the inflection line, if there's any interest. Of course there are probably some cases where {{infl}} is still the best option. -- Visviva 03:02, 8 April 2009 (UTC)
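
For anyone who wants to try the kind of XML scan mentioned above, here is a rough sketch. It is not RJFJR's program: the function name and the sample data are made up, and the sample is a simplified stand-in for a real dump (real MediaWiki exports declare an XML namespace on the root element, so the tag lookups would need adjusting). It reports pages whose English section has a Noun heading with no {{en-noun}} underneath.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for a dump; real dumps are namespaced.
SAMPLE_DUMP = """<mediawiki>
  <page><title>cat</title><revision><text>==English==
===Noun===
{{en-noun}}
# A feline.</text></revision></page>
  <page><title>furniture</title><revision><text>==English==
===Noun===
'''furniture'''
# Chairs, tables, etc.</text></revision></page>
  <page><title>chat</title><revision><text>==French==
===Noun===
{{fr-noun|m}}
# cat</text></revision></page>
</mediawiki>"""

# The English language section runs until the next level-2 heading.
ENGLISH_SECTION = re.compile(r"^==English==\s*$(.*?)(?=^==[^=]|\Z)", re.M | re.S)
# A Noun section (level 3 or deeper) runs until the next heading of any level.
NOUN_SECTION = re.compile(r"^===+\s*Noun\s*===+\s*$(.*?)(?=^==|\Z)", re.M | re.S)


def nouns_missing_en_noun(xml_text: str) -> list[str]:
    """List titles with an English Noun section that lacks {{en-noun}}."""
    missing = []
    for page in ET.fromstring(xml_text).iter("page"):
        title = page.findtext("title")
        text = page.findtext("revision/text") or ""
        english = ENGLISH_SECTION.search(text)
        if english is None:
            continue  # no English section on this page
        for noun in NOUN_SECTION.finditer(english.group(1)):
            if "{{en-noun" not in noun.group(1):
                missing.append(title)
                break  # one report per page is enough
    return missing


missing_titles = nouns_missing_en_noun(SAMPLE_DUMP)
```

The same loop could instead look for {{en-noun|?}} or {{infl|en|noun}} to build the other lists suggested in this thread.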

French categories

Pharamp and I were having a discussion about the {{fr-noun}} template and the topic of categories came up. The issue was that Category:French_masculine_nouns and Category:French_feminine_nouns either have misleading category names or are not being used properly. They seem to be used mainly for singular nouns but {{fr-noun|m/f|singular word|type=plural}} places plurals in anyway. Pharamp and I were looking for other Wiktionarians' input. Should we recommend use of {{fr-noun}} over {{infl|fr|plural}}? Should we change {{fr-noun}} to make it not include these categories? Should we make different categories? I would favour the first option. —Internoob (Talk|Cont.) 22:44, 6 April 2009 (UTC)

The front page needs a revamp

I've come to realise that our front page is, really, not very good (it seems to be a definite issue with many Wiktionary projects, actually) at doing its job of actually drawing people into the site. So I'm throwing a buttload of thoughts out there for mulling.

The first issue is waste of screenspace, and the single worst offender is undeniably the wall of text on the left side. Although it admittedly helps focusing attention on the left side where the WOTD is (a longstanding rule of webdesign 101 is that the most important content—besides navigation and that jazz—goes on the top left), it is ultimately mere blather and generates only tl;dr in the viewer who came to use the site, not be told the site's grand scheme. I had actually never done more than glance at it until examining the page in details for this analysis.

Second more global issue is that the page does not actually help draw the readers into the content itself. The only unambiguous link to an entry is the WOTD. ALL other links are either "hidden" (i.e. in the wall o' text) or ultimately useless for that purpose ("Wiktionary, the free dictionary"). All other links are to further index pages, some of which are not even that useful, for example, the index.

An inconvenient truth: the MediaWiki indexes are of little use for actual browsing, and unlikely to be used. Why? People who are looking for a specific word are not very likely to actually use the index (whatever its form); they'll go for one of the search boxes! And if you actually wanted to use the MediaWiki index, you'd end up swamped in the "form of"s, a phenomenon that is bound to get worse as more and more forms of other languages are added (I'm thinking particularly of cases and conjugations, but a mere noun with a feminine represents at least 4 forms in most languages). This is a separate problem, but it would probably be a good thing if we could avoid making visitors get turned off by it.

The top bar, for all its catchiness, not only does not accomplish much (even the search box is arguably very dispensable), but is actively taking a LOT of precious screen real estate, well over 1½ vertical inches that are pretty much wasted. At least the Wikipedia bar is discreet and stable for all usual screen sizes!

Now this is what we are doing wrong. What are we not doing that we could? A few ideas:

  • Feature some foreign-language material
    I know "this is the English wiktionary" and all that jazz, but actually showing that we do include original material that goes beyond just a bilingual dictionary would be a good idea (e.g. by showing words that do not have an equivalent, or more quirky stuff than in the WOTD). Another avenue to explore is the nice little right-hand bar on ru: which lists category links for a number of languages.
  • Feature some non-definitory material
    There is almost constant off-hand commentary that the Wikisaurus, appendixes, rhymebooks and suches are in a sad state. It's not exactly surprising seeing that there is little, if any incentive to actually improve them! Throw links to actual pages on the front page and people will be more interested in improving them.
  • Use the Word in the News feature
    This thing has been managed by editors mostly as an amusing aside, but is otherwise of no real impact, but by broadening it a bit beyond Wikinews report, it could become of definite interest to draw in people.

Circeus 23:47, 7 April 2009 (UTC)

Measuring elements of a Web page in inches seems like a mistake. Equinox 00:18, 8 April 2009 (UTC)
That's beyond the point. I could give the measures as 4.0299315154 E-17 light years and the fact would still be it's a BIG waste of space for something that does not do much to draw people into the site and pushes the useful content down. Circeus 00:32, 8 April 2009 (UTC)
You're right; I just took non-screen units as a bad sign from a would-be Web designer. How about making a prototype for an alternate front page in your user space? There are definitely improvements that could be made, but the best way to show them might be to come up with a new idea and put it together yourself for discussion as a starting-point. Equinox 00:49, 8 April 2009 (UTC)
I'll see what initial offering I can throw up. I'm familiar with the general principles of webdesign, but am not by any means a professional (or even experienced in designing for large-scale websites). Circeus 00:58, 8 April 2009 (UTC)

The letter index links are indeed useless, because MediaWiki can't sort its way out of a wet paper bag (COO, cooperate, co-operate and coöperate never appear anywhere near each other). Let's strike all links to Special:PrefixIndex or Special:AllPages. Instead, link to Index:English and its allies. I'd be glad to rebuild the letter-link indexes, if everyone thinks this is a good idea. Michael Z. 2009-04-08 01:48 z
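
To illustrate the sorting complaint: a dictionary-style index usually sorts on a folded key that ignores case, hyphens and diacritics, which keeps spelling variants next to each other. The sketch below is only an illustration of that idea, not how MediaWiki or Index:English works; `dictionary_key` is a made-up name and the word list is just the example from this comment plus two fillers.

```python
import unicodedata


def dictionary_key(word: str) -> tuple[str, str]:
    """Sort key that ignores case, punctuation and diacritics.

    The raw word is kept as a tie-breaker so the order is deterministic.
    """
    # NFD splits "ö" into "o" plus a combining mark, which isalnum() rejects.
    decomposed = unicodedata.normalize("NFD", word)
    letters = "".join(ch for ch in decomposed if ch.isalnum())
    return (letters.casefold(), word)


words = ["co\u00f6perate", "zebra", "COO", "co-operate", "cooperate", "cop"]
indexed = sorted(words, key=dictionary_key)
```

With this key the three spellings of cooperate share the folded form "cooperate" and end up adjacent, directly after COO, instead of being scattered by code-point order.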

As for the indices, theoretically they won't have any 'form-of' entries in them (I know the ones that are generated with a bot are filtered first). Nadando 02:35, 8 April 2009 (UTC)

I'm referring to allpages/prefixindex, which is what the Main Page links to. Even if they were links to the hand-made indexes, the point about people being unlikely to use them would still stand. Circeus 04:38, 8 April 2009 (UTC)
The manual index for English actually feels a bit more like flipping through the pages of a dictionary, and less like undergoing bum surgery (although the very long pages drag my browser down a bit). It could also be useful. Linking to better stuff would improve the home page. Michael Z. 2009-04-08 05:35 z
The last time the main page was re-vamped was end of 2007, if you want to make small tweaks, feel free to do it on the front page - if we are going to be making a larger change let's create Wiktionary:Main Page/2009 redesign. I think linking to the real indexes is a good idea, now that some of them are being updated by User:Conrad.Bot (I linked to the AllPages because the real indexes were far worse at the time). Ideas for what to put on the front page are needed, perhaps a history of WOTD? there's really not much that "enticing" we have. Conrad.Irwin 13:43, 8 April 2009 (UTC)
I like the idea of exemplifying our content on the main page. If we're going to link directly to entries, then Wikipedia is a great example of how to do that. EncycloPetey should definitely be commended for his consistent efforts on Word of the Day. A foreign language WOTD has been suggested several times in the past. The problem with this and other dynamic content is that it takes a lot of work to keep it up. Wikipedia for instance has an army to evaluate articles. French Wiktionary has featured articles, but not on a daily basis. Considering how much effort we put into citing neologisms, that would probably be a good source of material. I'm just not sure that we want to tout ourselves as being so "urban". Another good source is Words in the News. We might actually include the quotation, for instance when the media made a hubbub about Barack Obama's down payment speech. Ultimately we would want to highlight all areas: thesaurus, rhyme guide, phrase book, appendices, etc. DAVilla 17:26, 8 April 2009 (UTC)
The version of the demonstration design as of typing has a variable "Interesting stuff" box modelled roughly on WOTD that would be able to display pretty much whatever type of content we want it to with a bit of ingenuity applied to the template. Hit the "refresh" to see one of eight different examples from foreign word to phrasebook to appendix. Circeus 06:08, 10 April 2009 (UTC)

Re-coding started

I've started work on a version at Wiktionary:Main Page/2009 redesign, as Conrad.Irwin suggested. For the time being, it's mostly fiddling with and simplifying the code (the image overlap is implemented in a very complex fashion). I intend to leave the bottom tier (i.e. scripts and other wikts.) and the basic design elements (blue boxes+icons) mostly intact. Ideas (on the talk page plz?) are welcome and even demanded :P Circeus 19:51, 8 April 2009 (UTC)

From memory, much of the image code was to make it not look too atrocious on IE6, but I've never really been a CSS guru so there's probably a better way. I really like the "Newly discovered" box, but think there should not be only one style of container on that page (it begins to hearken back to Wiktionary:Main Page/Old 2007). Conrad.Irwin 09:48, 9 April 2009 (UTC)
The two boxes under WOTD were added by DAVilla. I like the look, and wanted to keep it globally consistent (the only reason it didn't look like the previous page is that the current version has fewer boxes), though I wanted each box to have its own icon. I just checked and my system (it uses position:relative to allow the images to be absolutely positioned in regard to their container box rather than "sliding" the containers under the pictures) does work in IE. What's more, it allows for a constant margin-top between the boxes, which would have been much harder to achieve in the previous system. I shouldn't take too much credit for it, though. I came across the solution entirely by accident while looking for something else. Circeus 15:45, 9 April 2009 (UTC)

Out-of-process removal of a verification requæst for a sense of cruz gamada

Regrettably, I bring this issue to this forum, as it has not yet been resolved.

Note this RfV-sense discussion for the “swastika” sense of the Portuguese term cruz gamada, which I started. Stephen G. Brown removed the {{rfv-sense}} without providing the necessary reference and quotations to support that sense (as required by our CFI). The entry’s revision history, alongside our discussion on his talk page shows the developments since then. The entry is now edit-protected, with the {{rfv-sense}} tag removed. As I state on his talk page, I consider Stephen’s edit-protecting of the entry to be “an abuse of [his] admin. privileges”. Since we are æqually intransigent in our positions, I ask that others from the editing community intervene in this matter to resolve the issue. Whilst it is unfortunate that it could not have been resolved before now, I wish to commend Stephen for his civility in our discussion thus far.  (u):Raifʻhār (t):Doremítzwr﴿ 02:06, 8 April 2009 (UTC)

Challenged senses of FL terms are subject to the same burden of proof as English terms. A quotation where the meaning of the head-term can be determined by knowing the meanings of the other (in this case Portuguese) words is required (technically three). It might be a bit more difficult to verify if the sense in question is "standard" or not, but still feasible. The RFV process should not be stopped early as personal knowledge is not attestation. --Bequw¢τ 03:59, 8 April 2009 (UTC)
Indeed; pretty much my points.  (u):Raifʻhār (t):Doremítzwr﴿ 04:05, 14 April 2009 (UTC)
Yes, that's quite a silly dispute. Of course, as long as it's being discussed at RfV, the tag should remain, and it does no harm anyway. I agree that there is no way that reverting a non-admin in a good-faith dispute and then protecting your version is an appropriate use of that tool. I have removed the protection at least, but that doesn't mean that you should go revert him again. It would have been better from the start if you two could have just let it sit and let others act. A little tag is not the end of the world. Dominic·t 13:09, 8 April 2009 (UTC)
I have not reverted him since the entry’s de-protection, since the RfV process still applies without it. Another editor may re-add it if it is felt necessary.  (u):Raifʻhār (t):Doremítzwr﴿ 04:05, 14 April 2009 (UTC)
There is good reason not to apply CFI fully in foreign language cases. For one, we are the only Wiktionary so far as I know to have an RFV process and a Citations namespace. If the major Wiktionaries are going to include every term in every language, then it is not reasonable to expect each of them to cite terms like confuzzle. Likewise, citing English terms is more than enough work for us. It is unfortunate that coordination and cooperation has not been established at any level. For now it is precisely Stephen to whom we have deferred many of these decisions.
If you like we can revise or put through (all or parts of) this vote on attestation criteria which I had left on the back burner. The section on attestation in other languages would allow us to defer the question of cruz gamada to the Portuguese Wiktionary. This isn't an easy or short-term solution, seeing as Portuguese Wiktionary is way behind, lacking even a page for the same term. The entry is still going to have to be cited somewhere. However, in the long term initiating verification on other Wiktionaries is going to be a much more sound approach than having all work initiated here. DAVilla 16:45, 8 April 2009 (UTC)
“[I]nitiating verification on other Wiktionaries is going to be a much more sound approach than having all work initiated here.” — Well, indeed, but why can’t we copy their (GFDL-licensed) quotations to this Wiktionary if they have some, and vice versa? I personally think that such duplication would be valuable. This also means that the CFI need not necessarily be altered.  (u):Raifʻhār (t):Doremítzwr﴿ 04:05, 14 April 2009 (UTC)
While DAVilla's scenario sounds nice, and would be the prudent approach, all else being equal, I wonder if it will actually pan out. Whatever the article count might imply, we're leaps and bounds beyond every other Wiktionary. I think it very likely that we will often be the ones whom other Wiktionaries take stuff from, even in their own languages. Because of this, I don't think that we should modify our CFI for foreign languages; they should be subject to the same restrictions as English. I think that we will get to the point where we have teams of people working on every language, as we currently have on English, and I think we'll get much better at efficiently (probably automatedly) acquiring cites. That being said, in the meantime, I think we have to exercise a bit of caution, and treat those rules as something to be tempered by good sense. Stephen Brown is easily one of our best contributors, and he has proved his expansive knowledge of languages in a multitude of ways during his stay here. If Doremítzwr were a native Portuguese speaker, and the word sounded odd to him, then he might be justified in his actions. As it stands, wasting so much of Stephen's time over this is nothing short of ridiculous. In my opinion, the rfv should be wiped, completely out of process, and completely against the official rules, because common sense dictates that it's the best approach to the situation. -Atelaes λάλει ἐμοί 04:26, 14 April 2009 (UTC)
Forgive my curtness, but it would’ve wasted far less of his and my time for him to have learnt how to quote and cite and to have done so. Very few of his contributions get challenged, but when they are, I see no reason why he should have the prerogative of closing RfVs out-of-process. Bear in mind, as well, that he could’ve left the {{rfv-sense}} tag where it was for someone else to address the request as, it turns out, someone already has.  (u):Raifʻhār (t):Doremítzwr﴿ 04:44, 14 April 2009 (UTC)
I agree that Stephen's reaction was not appropriate. However, let's not act as though he's solely at fault here. To begin with, you admitted from the get-go that, "I don’t doubt that this term is used thus." Your only qualm is the "correctness" of such usage. Vagahn produces a dictionary which backs up the sense, which you don't dispute, and yet you persist in your demand of the three cites. This is taking the letter of the law beyond its spirit. This is taking something which we have already verified within our current resources and utilizing a rule the wrong way. Yes, I will admit that diplomacy is not Stephen's strong suit. That does not nullify the fact that you are wasting our time with this. Let it go. -Atelaes λάλει ἐμοί 06:30, 14 April 2009 (UTC)
Very well; the dictionary does address my original concerns. I shall not object to this RfV being closed without satisfying the letter of the law.  (u):Raifʻhār (t):Doremítzwr﴿ 14:29, 14 April 2009 (UTC)
It's Vahagn :( -Vahagn Petrosyan 14:43, 14 April 2009 (UTC)
Gah....sorry. -Atelaes λάλει ἐμοί 18:44, 14 April 2009 (UTC)

Do we include non-native language?

A discussion has arisen in Wiktionary:Requests for deletion#vacuüm, and I believe it has far-reaching enough consequences that I thought it worth bringing up here. Do we include non-native language? vacuüm was requested for deletion by Hamaryns on the grounds that all the quotes were from non-native speakers (Dutch, specifically). Much to my surprise, no one batted an eye at this, opting simply to note a Netherlands specific contag. Of course the "All words in all languages" was quoted. I think that this is a very, very bad idea. Should I start entering how I pronounce French words? It's not very similar to how French speakers do (any of them). Additionally, how do we define when a non-native speaker is interspersing their native language into their English? One could well argue that the Dutch speakers in the aforementioned example are simply using the Dutch spelling of vacuum. Now, this is not an attempt to try and promote some "higher standard," I think that the English of a poor farmer in Georgia is every bit as valid as that of the queen of England. However, I think that we should limit our descriptive approach to native speakers. Now, it is certainly possible to describe how second-language speakers speak every language (or rather....how they write it), but I think this is a huge can of worms that we really don't want to open. Finally, I will personally murder anyone who attempts to add my pronunciation of Ancient Greek into the Ancient Greek entries. -Atelaes λάλει ἐμοί 08:47, 8 April 2009 (UTC)

I agree with the above. I don't think we serve any good purpose by including such variants except to provide a living embodiment of a Borgesian (w:Library of Babel) dictionary of all possible attestable utterances and orthographies. DCDuring TALK 10:10, 8 April 2009 (UTC)
I would certainly draw the line at nonnative pronunciations, not least because they would almost certainly be unverifiable. But, as the attestations show, "vacuüm" is a word that appears in published writing in English, and therefore potentially a word that a reader might encounter and want to look up. We do list misspellings at Wiktionary, and I don't see any reason to exclude misspellings that are encountered in published work simply because they are only ever committed by nonnative speakers. Angr 11:46, 8 April 2009 (UTC)
Yes, we include misspellings, but they are under different standards than other words (exactly what those standards are is fairly vague, as far as I can tell). They are required to meet rather more demands than simple CFI. -Atelaes λάλει ἐμοί 18:17, 8 April 2009 (UTC)
I look forward to the day when downloadable audio will be deemed usable (mandatory?) to cite pronunciations. I wouldn't count on current technological/economic limitations to keep us from being overwhelmed by new "opportunities" and never achieving excellence at our current mission. DCDuring TALK 14:39, 8 April 2009 (UTC)
No, don't add your pronunciation under a French language header, but it would be right to add pronunciations for regions where French is relied upon to communicate, for instance many countries in Africa where the R is rolled (as in Spanish), even if it is not the native language there. The French terms to which you can add English pronunciations are borrowed terms like parlez vous.
In my opinion we should include all language transfers that can be documented. The reason is that the lines between proper and improper use of language, and in many cases even between one language and another, are artificial. I promise you that the Parisian French francophiles hold in such high esteem is not spoken near the border of Spain, but their speech is still comprehensible to a Frenchman and incomprehensible to a Spaniard. At one time in history there would have been an entire range between the languages. The reason languages are so uniform today is political, primarily the result of education and the choices at state level in how to achieve it.
The question of which language is more or less judged by the surrounding text. If the sentence minus the term is in a single language, then the entire sentence is in that language, tag-switching aside. The good thing about switching is that it rarely occurs in isolation, so it's easy to spot scenarios where this is the most likely explanation. That is why personally I don't think the immediately surrounding text is enough for citation purposes. After all, a person could be trying their hand at another language and failing miserably. However, in nearly every case, a work will be presented in one almost entirely predominant language. For instance, I'm sure you've seen Latin words sprinkled in English books, but there's no question that the book is written in English. That's why so many Latin terms are said to be borrowed.
If the works cited in this case are written in English, then in my opinion even if they are written by foreigners, the quotations I've seen count toward attestation. Given how easily misspellings and nonstandard use are accepted, note how weak a claim this is. DAVilla 16:11, 8 April 2009 (UTC)
First of all, let's please set aside any talk of "proper" and "improper", as that only confuses the issue. The distinction I'm speaking of is native and non-native. Now, this is not always an easy distinction, as there are children who grow up in bilingual households, but it's a much more possible distinction than proper vs improper. Your point about the different varieties of French is moot, as I have already been quite clear that any native speaker has equally valid speech, whether they grew up learning Parisian French or the French of some outlying French area, as well as Quebec French or whatever other type of French there is. The statement that if everything but one word is from a language, then the whole thing is in that language, is not true. When I try to communicate with Spanish speakers, I often drop English words in, simply because my Spanish is not very strong. That doesn't make it Spanish. -Atelaes λάλει ἐμοί 18:17, 8 April 2009 (UTC)
Right, and my roommate with his Dutch, and even if your Spanish or his Dutch were strong, you may still do it. It's called tag-switching, as noted above. If it happened often enough it would develop into a creole, let's call it Nederlish. When that happens I have a hard time with the notion that several hundred words suddenly pop into existence without ever having been documented before. We try to avoid handling the higher-level structure directly, but you do have to keep in mind that not only languages but their boundaries are in flux. Tens of thousands of descendants like vacuüm would already be part of the pre-existing languages, but what about the new Nederlish terms that bridge the gap? What about the purely English words with their English meanings and their Dutch misspellings (maybe bruüm instead of broom), or the Dutch words with their Dutch spellings and their English misinterpretations (maybe Nederlands for the country instead of the language)? You would refuse to document these because e.g. bruüm does not appear in a Dutch context, only an English context by non-native speakers, at least until Nederlish is sanctioned as a language. How does that political decision suddenly make all our citations valid? What I'm saying is, to me "all words in all languages" does not mean all words provided it's sanctioned as being some language. Rather, if it's clear that it's natural language, then it's a word, and the problem becomes identification of the language. How sure do you have to be that it's not Dutch? Saying native vs. non-native is just as prescriptive as proper vs. improper. Their English vocabulary may not have developed fully in elementary school, only later in high school or university. I guarantee yours did not either. Yes, I would want to be certain that the citations were entirely in the target language, which is why I already did address tag-switching, the problem you mentioned.
A foreigner not fluent in English would be obvious by dropping in larger Dutch words, the translations of which he does not know. It would happen more than once. I said above and repeat that I would not include such citations. But if the writer or speaker is confident enough to communicate entirely in English and makes a legitimate mistake, and if he's not the only one to make that mistake, then it's attested in the most nonstandard, half-bred, incorrect and hideous form, but attested nonetheless. DAVilla 03:41, 9 April 2009 (UTC)
You mean code-switching. Tag switching is a method for routing data packets (I invented it, and Tony Li of Cisco tried to patent it, somehow forgetting that he had gotten it from my IETF plenary presentation on next-generation IP ;-). You should hear Swahili and English here, where people fluent in both insert English words where there is no Swahili, and people more comfortable with Swahili insert words when speaking English. The TV stations have news in Swahili at 7pm, and English at 9pm, but unless you understand both, you are likely to miss a lot: interviews are neither subtitled nor translated. On topic: I've been careful to only tag things as "East African English" when they have been clearly borrowed into English: murram, godown. Robert Ullmann 12:15, 9 April 2009 (UTC)
I note, as an aside, that one implication of limiting all entries to native speakers is that we won't have any Neo-Latin terms. --EncycloPetey 04:43, 9 April 2009 (UTC)

No one is proposing adding Atelaes's pronunciation of ancient Greek to the dictionary. Let's curb the hyperbole.

So do we remove and ban Category:East African English, Category:Hong Kong English, and Category:Indian English? Delete the corresponding templates? Do we put Category:Quebec English on probation until we find solid proof that more English-first Quebeckers speak English than French-first Quebeckers? Should we scan the Wiktionary for quotations[2] from w:Joseph Conrad, because he was not a native English speaker? Strike Salman Rushdie quotes, in case English wasn't his first language. I guess quotations of translations from Plato, Cicero, Marx, Derrida, and Abba should be stricken too. Let's start with an audit of the personal histories of cited authors, and compile a blacklist of the ones who didn't learn English first.

The proposed restriction is not followed by any other dictionary. It's arbitrary and completely impractical. Michael Z. 2009-04-09 05:17 z

Wow, that was such a well-reasoned retort I've almost forgotten why I proposed this in the first place. Point conceded. -Atelaes λάλει ἐμοί 05:39, 9 April 2009 (UTC)
Well, I think Netherlands English vacuüm is weird, too – it's way out of the ballpark for any paper dictionary. We don't have the same extreme constraints they do, so we may end up with niche regional categories from every country in the world, especially for the lingua anglia. It may require much more work to find the boundaries of regional usage than merely to confirm attestation. Michael Z. 2009-04-09 14:39 z

New categories

We're having some international decisions to take on categories in the French parlour :

JackPotte's votes:
Category:Synonyms (and antonyms): for
Category:Homonyms: for
Category:Homophones: for
Category:Heteronyms: for
Category:Paronyms: against
Category:Hyponyms: against
Category:Hyperonyms: against
Category:Meronyms: against
Category:Holonyms: against
Category:Transitive verbs (and intransitive): for
Category:Ergative verbs (and inergative): for

At the risk of repeating myself, here goes. If you have an automatic category for synonyms, what does that do? What's happened on the French Wiktionary is that every article which has the {{-syn-}} template on it now adds that word to the "French synonyms" category. But it doesn't say within the category what the synonyms are, just that the said word has at least one synonym. What percentage of words have at least one synonym? More than 90% surely. A category like "English synonyms" could easily have a million articles in it by the end of the year. Do you see what I'm saying?

PS Jack, nothing against you personally, it's just an admin issue, that's all. Mglovesfun 18:25, 8 April 2009 (UTC)

Even if such a list would be huge, it would be interesting to have it, both to be sure of finding words with synonyms when needed and to know its number of entries: which language do you think offers the most synonyms? JackPotte 18:42, 8 April 2009 (UTC)
Yeah but when you want a synonym of a word, it's a specific one. An alphabetical list of every word that has a synonym (more than a million) is of little use. Mglovesfun 18:58, 8 April 2009 (UTC)
Well, I'll leave how the French Wiktionary is run well enough alone, but I would prefer not to have most of those categories here. I like having "language" "POS" categories, etymological categories, topical categories, and a few others, but the rest I'm suspicious of. I just don't see any utility from most of what you're proposing. -Atelaes λάλει ἐμοί 19:03, 8 April 2009 (UTC)

The only one of these *-nyms that looks even remotely worthwhile in a non-trivial way for categorization is heteronym and perhaps one for homographs. — Carolina wren discussió 21:03, 8 April 2009 (UTC)

I guess you must not think some of our existing categories to be even remotely worthwhile. We already have Category:Ergative verbs by language. I agree that synonyms isn't such a useful category. Still, I don't think all of these should be brushed off, though my interpretation is a little different. If there's a definition line that includes synecdoche or the like then that could be automatically categorized. It could easily be done with a context label. DAVilla 02:35, 9 April 2009 (UTC)
The various verb ones I do consider useful, but they aren't what I was referring to by *-nyms. Synecdoche I'm ambivalent about, but since there probably ought to be a usage label for it, anyway, I wouldn't have a problem with an associated category. — Carolina wren discussió 03:34, 9 April 2009 (UTC)


The 5 other language versions of this debate are being updated with the previous ideas. Clearly I'm "for" because we need to find all synonym words and compute them, even without any synonym example in mind. Synchronise this dynamic alphabetic ordered list with external ones. JackPotte 12:07, 9 April 2009 (UTC)

A word is a synonym of another word; there is no point in classifying a word as a synonym without reference to another particular word. Thus, a category for synonyms is dubious, just like categories for other semantic relations including antonyms, meronyms, holonyms and coordinated terms. In general, categories are poorly suited for capturing role types in binary relationships, which synonymy, antonymy, meronymy and holonymy are. To clarify a further bit, you can classify people into friends and non-friends with reference to a particular person, but if the person whose friends are sought is unknown or unspecified, classifying people as friends and non-friends becomes pointless. --Dan Polansky 09:52, 10 April 2009 (UTC)
1) For instance, one of the recognized longest English words, "pseudopseudohypoparathyroidism" (30 letters), seems to have no synonym (only some relatively long paraphrases), thus the synonym category might still be useful for the reasons I described yesterday. Moreover, somebody who needs to check the "replaceability" of each word would be interested in this synonym list.
Apart from that another proposition has been suggested today : Category:Words by number of letters (you can see here an example of result with the French acronyms).
2) I'm not an academic, but we are far enough ahead in linguistic research that we could propose a new computational definition of paronyms (words that differ from each other by at most 2 letters or sounds); false friends would then be international paronyms and would be classified the same way in every language. JackPotte 20:59, 10 April 2009 (UTC)
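One possible reading of the "at most 2 different letters" idea is an edit-distance threshold. The following is a minimal sketch under that assumption, not a definition anyone in this discussion agreed on: the function names and the default threshold of 2 are hypothetical, it compares letters only (not sounds), and it uses the standard Levenshtein distance.

```javascript
// Levenshtein distance: the minimum number of single-letter insertions,
// deletions, or substitutions needed to turn word a into word b.
function editDistance(a, b) {
  const s = Array.from(a);
  const t = Array.from(b);
  // prev[j] holds the distance between the first i-1 letters of s
  // and the first j letters of t.
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a letter from s
        curr[j - 1] + 1,    // insert a letter into s
        prev[j - 1] + cost  // substitute (or keep) a letter
      );
    }
    prev = curr;
  }
  return prev[t.length];
}

// Hypothetical test: two distinct words are paronym candidates when they
// differ by at most maxDiff letters (assumed default: 2).
function areParonymCandidates(a, b, maxDiff = 2) {
  const distance = editDistance(a, b);
  return distance > 0 && distance <= maxDiff;
}
```

A threshold this loose would also pair many unrelated short words (any two three-letter words are at most 3 edits apart), so a real category would presumably need further constraints.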
It might make more sense to have a category for words without synonyms, although that's a dubious claim. False friends might be interesting although we don't even list those in entries. I wonder if a category could contain English terms likely to be misunderstood in Spanish, Italian terms likely to be misunderstood in French, etc. DAVilla 23:15, 10 April 2009 (UTC)
I asked about false friends at some point not so long ago. I think the consensus was that false friends should be discussed only with regard to English, and notes in the "Usage notes" were the best way to go about it. Circeus 23:58, 10 April 2009 (UTC)
Maybe some people would build it by being inspired by the French and Italian ones. JackPotte 00:25, 11 April 2009 (UTC)
Categories decisions:
French version
German version
Spanish version
Italian version
Portuguese version
A category for words having non-etymologically-related homographs (excluding inflected forms) and a category for words having non-etymologically-related homophones (excluding inflected forms) might be useful, provided that they are short enough, but a category for words having synonyms is not useful at all (it's very obvious). Lmaltier 06:20, 11 April 2009 (UTC)

Update CFI

I started a discussion several weeks ago about updating Wiktionary:Criteria for inclusion, specifically changing or removing the Proverbs section because current practice is inconsistent with the policy. Since nobody replied, I made the change, but it has been reverted because there had been no vote regarding this change. I've been asked to make the request here, hence this message. Should the change I had proposed be implemented, ignored, or should a different result be applied? Mindmatrix 22:57, 8 April 2009 (UTC)

I guess you'd have to wait until the outcome of this vote before making changes without going through the formal process. I would implement the change myself except there's a vote out on exactly this point. DAVilla 02:15, 9 April 2009 (UTC)

Wikimania 2009: Scholarships

Wikimania 2009, this year's global event devoted to Wikimedia projects around the globe, is now accepting applications for scholarships to the conference. This year's conference will be held from August 26-28 in Buenos Aires, Argentina. The scholarship can be used to help offset the costs of travel and registration. For more information, check the official information page. Please remember that the Call for Participation is still open; please submit your papers! Without submissions, Wikimania would not be nearly as fun! - Rjd0060 02:10, 9 April 2009 (UTC)

Editing without Wikitext? Introducing User:Conrad.Irwin/editor.js

I've made a start on trying to improve the editing interface of Wiktionary. At the moment it only supports adding translations, but I thought I'd see if people are interested in this kind of thing before writing lots of features.

So, if you'd like to be able to edit a page without looking at Wikisyntax, try enabling the WT:PREFS option "Add input boxes to pages to assist with adding translations.", or, if you're not a WT:PREFS person, add the following line to your personal JavaScript.

importScript('User:Conrad.Irwin/editor.js');

More information will be found at User talk:Conrad.Irwin/editor.js, bug reports and small feature requests are welcome - here or there. Conrad.Irwin 19:03, 9 April 2009 (UTC)

That's awesome. I'd really like to see this expanded and become part of the standard javascript (as the users likely to use WT:PREFS are basically the only people that don't need this. :-)). This could really make the process more inviting for a lot of users. -Atelaes λάλει ἐμοί 19:15, 9 April 2009 (UTC)
I certainly think it's worth enabling it for a trial period to see whether it encourages people to add translations or nonsense. Some kind of function to create a minimal foreign language entry from a redlink in a translation table would also be cool, but I think that should come under acceleration (which is another thing we might want to enable by default - maybe only for logged in editors though?). What do people feel about enabling editor.js for everyone for a couple of days after Easter? Conrad.Irwin 16:00, 10 April 2009 (UTC)

Okay, what am I looking for, and where? I've tried this in Safari 4 beta and Firefox 3.0.8 on the Mac. The only difference I see is that Translations sections are expanded in Safari (only). Michael Z. 2009-04-10 16:41 z

You should see an input form like at http://jelzo.com/Screenshot.png . I'm using Firefox, and have tested in Safari 4. It sounds like you are getting a Javascript error, could you tell me what it is? (Ctrl+Shift+J in Firefox may give you a list of recent errors) Otherwise, which WT:PREFS do you have enabled, there may be a conflict? Conrad.Irwin 16:48, 10 April 2009 (UTC)
Ach, sorry. It relies on a function (newNode) that I thought I had added to the sitewide Javascript but hadn't (it's used by the feedback stuff and some of my other stuff); as I had feedback.js enabled, there was no problem. Should now be fixed. Conrad.Irwin 17:02, 10 April 2009 (UTC)
Works now. Very cool way to enter structured info.
I almost gave up trying to figure out how to save, before I spotted the buttons at the top-left corner. I think it's more natural to look for action buttons at the bottom-right of the display, so maybe a 1-line strip across the bottom of the window would be more natural.
It also needs a more definite confirmation – a clear “done.” Have you tried reloading the page after entry?
And while I'm at it, I think I would be making the same language transliterations repeatedly. It would be nice if it would remember its expanded state, and auto-enter the language code and script.
But no complaints at all. This makes it much easier to just go in and enter translations, and I can see myself entering more with the streamlined process. Michael Z. 2009-04-10 22:06 z
Glad it's working, and thanks for the feedback. Yes, remembering code is a good idea; I'm not too fussed about where it appears on the screen, I might have an experiment. I thought about reloading the page, but with caching being what it is, it seems that you don't always see your changes (unless you do an ?action=purge which won't work nicely for anonymous users and is quite slow anyway.) On save, it should definitely remove the green highlighting, maybe that would help? Also, would it be worth trying to guess the script from the language code? There is a mapping at {{lang2sc}} I could steal - though I'd still provide an override. Conrad.Irwin 22:35, 10 April 2009 (UTC)
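The idea of guessing the script from the language code while still allowing an override could be sketched as below. This is only an illustration: the mapping is a tiny hand-written sample rather than the contents of {{lang2sc}}, the function name `chooseScript` is hypothetical, and falling back to "Latn" for unknown languages is an assumption.

```javascript
// Hypothetical, hand-written sample of a language-code -> ISO 15924
// script-code mapping. The real {{lang2sc}} data is not reproduced here.
const LANG_TO_SCRIPT = new Map([
  ["ru", "Cyrl"],
  ["el", "Grek"],
  ["he", "Hebr"],
  ["ar", "Arab"],
  ["ja", "Jpan"],
]);

// Returns the script code to use for a translation:
// whatever the user typed, if anything; otherwise the guess from the
// language code; otherwise "Latn" (an assumed default).
function chooseScript(langCode, override) {
  const typed = (override || "").trim();
  if (typed !== "") {
    return typed;
  }
  return LANG_TO_SCRIPT.get(langCode) || "Latn";
}
```

With this shape, leaving the script box blank means the guess is used, and anything typed into the box wins over the guess.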
Flash a background colour once for the changed spans, and remove the green outlines – sort of a burning-in effect. That would be slick, and better than a reload. Michael Z. 2009-04-10 23:04 z
I've made these changes, with the exception that I prefer the save button to be top-left. Clear your cache (Ctrl+Shift+F5). Conrad.Irwin 22:54, 11 April 2009 (UTC)

Works great. However, if the trans table does not have a gloss, I get an error: "Could not find translation table, please improve glosses.". Is this how it is supposed to work? Tested in Firefox v3. --Panda10 19:32, 11 April 2009 (UTC)

Yes. In order that it won't make a mistake when editing the wikitext (it could guess a little more, but it's very hard to spot if it makes a mistake, so it plays safe.) It will also fail if the trans-mid template is absent, and in some cases where the glosses are too similar. Conrad.Irwin 20:25, 11 April 2009 (UTC)
I was scratching my head over that message. How about “Error: this Transliteration section doesn't indicate which sense it applies to. Please add a gloss to {trans-top|...}.”? But I've already chosen a translation table when I started entering text, so why refuse to accept my judgment? Michael Z. 2009-04-13 17:43 z
The issue is that counting the number of {{trans-top}} in the source code does not work. (That was what I first attempted, but there are some crazy cases where you find <!-- {{trans-top}} or similar; the code tries to ignore these but I found it still couldn't count accurately enough.) Thus the only way to match the trans table you select in the HTML with the trans table it finds in the wikitext is to match glosses. I could try adding some more complicated logic to match them all up in order and interpolate for those it cannot match - but I'd prefer to do that as part of a separate "and translation gloss" module. Maybe a temporary improvement would be to try and guess which HTML tables it will not be able to find, and not provide the editing form there. Conrad.Irwin 18:02, 13 April 2009 (UTC)
How about adding an “add a title” button to untitled translation blocks, instead of the form for adding a translation? We should have a work bee to empty out Category:Translation table header lacks gloss. Michael Z. 2009-04-13 18:49 z
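Matching by gloss rather than by position might look roughly like the sketch below. It is not the code in editor.js: the function names are hypothetical, the regular expressions assume the simple form {{trans-top|gloss}}, and "too similar" is reduced here to "exactly the same gloss appears more than once".

```javascript
// Remove HTML comments so that a commented-out {{trans-top}} is ignored.
function stripComments(wikitext) {
  return wikitext.replace(/<!--[\s\S]*?-->/g, "");
}

// Collect the gloss of every {{trans-top|gloss}} header.
// Headers without a gloss ({{trans-top}}) do not match and are skipped.
function findGlosses(wikitext) {
  const text = stripComments(wikitext);
  const pattern = /\{\{trans-top\|([^{}|]+)\}\}/g;
  const glosses = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    glosses.push(match[1].trim());
  }
  return glosses;
}

// Play safe: succeed only when exactly one table carries the wanted gloss.
function locateTable(wikitext, gloss) {
  const wanted = gloss.trim();
  const hits = findGlosses(wikitext).filter((g) => g === wanted);
  if (wanted === "" || hits.length !== 1) {
    return { ok: false, reason: "Could not find translation table, please improve glosses." };
  }
  return { ok: true, gloss: wanted };
}
```

Refusing on zero or multiple matches mirrors the "plays safe" behaviour described above: an edit to the wrong table would be hard to spot afterwards.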
With time... Conrad.Irwin 20:22, 13 April 2009 (UTC)

I love this.

The “guessing Xxx” is nice, too. Maybe make the label “Script code”, and link it to w:List of ISO 15924 codes. How about “guessing Xxxx (<Script name>)”? Why not enter the code into the field and make the text beside it “<Script name>”, which updates if you enter a script code. If you're worried that the guess is wrong, then add a checkbox to activate the field with a click instead of having to retype the guess. Alternatively, make “Xxxx” a link or button which enters the value in a click.

Actually it will use the guess if you leave the box blank - maybe I should change "guessing" to "using"? Conrad.Irwin 18:02, 13 April 2009 (UTC)
In that case, just enter the code as the field's default text, and then “using” is self-evident. The fact that the script was guessed from the language can go into the docs page, but needn't clutter the interface. Michael Z. 2009-04-13 18:49 z
I could do this given some time, but it's a bit fiddly to work out when I can just replace the value in the text box without irritating the user. I hope the recent interface improvements make it clearer. Conrad.Irwin 20:22, 13 April 2009 (UTC)

Gender could have (m f n c p) links with tooltips, buttons, or a pop-up menu, to enter with a single operation. Or just a pop-up menu if these are absolutely the only choices. Should that say “Gender or number”, since plural is not a gender? Michael Z. 2009-04-13 17:43 z

I was under the impression (for some language) that you can have words that are masculine and feminine, I was considering putting these into a standard drop-down, though that could get long but maybe a set of tick boxes would be better - but that takes up a lot of space and is slower to use. Conrad.Irwin 18:02, 13 April 2009 (UTC)
Now changed to checkboxes. Conrad.Irwin 20:22, 13 April 2009 (UTC)

This keeps getting better. Please don't be annoyed by all the piecemeal suggestions.

Instead of having the action controls dissociated in the top-left corner, can you make each translation block's form independent, having its own "Save" button at the bottom? This would be a good place to put the help button: (?) (Save). Instead of an “undo” stack, each translation span could have an (x) button for deletion. Michael Z. 2009-04-13 21:25 z

In short, yes this would be nice, but no it doesn't work. In long:
I agree this would look nicer, and it was how the early prototypes worked; however, I don't think it is practical. If each form has a save button and I add a translation into two different boxes, then click save on one box, it would only save the changes in that one box. I then have to wait until that has finished saving before allowing the user to click save on the other form, in order that there is no possibility of the edit being rejected as an edit-conflict by the software - this would be slow and, if there are lots of translations tables, that's a lot of save buttons to wait for. If the save button in each form saved the changes for all forms, it would be very unintuitive and probably lead to people saving edits they didn't intend to save. Adding a (x) to each translation is tempting and perfect from a UI point of view at the moment; however from a coding point of view it is a nightmare. I'm not prepared to hand-analyze all the dependencies that could be made by any combination of edits. What do I do if someone adds a translation, moves it to the other column, and then clicks (x)? For example, if I add 'a' and 'b' as Spanish translations, the first edit adds "Spanish: a", and the second adds ", b". To delete 'a' as a translation I have to delete "a," which is not simply undoing an edit made so far. It would be possible to (internally) treat this as four edits and have the "Spanish:" and the "," as separate edits, but then I need to keep track (somehow) of under which circumstances to remove what. I know this seems silly, because it's pretty obvious to a human what to do, and indeed if the editor.js actually worked by parsing the page into an internal format, and then modifying that, and then writing it back again, it would be trivial - but the effort of writing an accurate parser and symmetric rewriter for Wiktionary is enormous. Conrad.Irwin 01:09, 14 April 2009 (UTC)
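The contrast between text-level edits and a parsed internal format can be shown with a toy model. The sketch below assumes an invented, drastically simplified line format ("* Spanish: a, b") and hypothetical function names; real translation lines contain templates and other markup that this does not handle, which is exactly the parsing effort described above.

```javascript
// Parse a simplified translation line such as "* Spanish: a, b"
// into { language, terms }. Real entries are far messier than this.
function parseLine(line) {
  const match = /^\*\s*([^:]+):\s*(.*)$/.exec(line);
  if (match === null) {
    throw new Error("Not a translation line: " + line);
  }
  const terms = match[2]
    .split(",")
    .map((term) => term.trim())
    .filter((term) => term !== "");
  return { language: match[1].trim(), terms: terms };
}

// Write the model back out; separators are regenerated, never edited.
function serializeLine(entry) {
  return "* " + entry.language + ": " + entry.terms.join(", ");
}

// Adding and removing operate on the list of terms, so deleting 'a'
// from "a, b" needs no special handling of the comma.
function addTerm(entry, term) {
  return { language: entry.language, terms: entry.terms.concat([term]) };
}

function removeTerm(entry, term) {
  return {
    language: entry.language,
    terms: entry.terms.filter((existing) => existing !== term),
  };
}
```

For example, parsing "* Spanish: a", adding 'b', and then removing 'a' serializes to "* Spanish: b", with no bookkeeping about which edit introduced the comma.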
Understood. I guess you have to consider what this will look like if it's developed more fully.
I wouldn't mind if saving made the other boxes' forms go greyed out until saving was finished (the server will get faster someday, as all computer things do). This could scale to an object-oriented model of the entry, where you change a component and the change is made immediately with a short edit summary: ten changes = ten edit summaries, fewer edit conflicts. This could be integrated with the existing edit history's rollback and undo facilities for fine-grained restores.
The alternative means making a dozen different edits, hitting save once, and writing a long edit summary. This scenario gives the editor a new interface layer, emulating a window-based desktop GUI (Edit > Undo, File > Save). Close the window without saving and lose your work, but unlike MS Word, most browsers don't warn you. There is one undo stack on the page, and a separate one in the edit history. After saving, the history can undo sessions, not individual operations. Michael Z. 2009-04-14 19:01 z

Trial Run?

Would there be any objection to turning this on for everyone for a few days? It would be worth seeing whether it encourages people to add good translations, bad translations or just rubbish. Conrad.Irwin 21:22, 12 April 2009 (UTC)

Okay by me. DAVilla 16:33, 13 April 2009 (UTC)
I love it, so yeah. As long as I get to keep it if anons add bullshit :D — [ ric ] opiaterein — 17:56, 13 April 2009 (UTC)
Ok, as I'm going to be around tomorrow, I'll try to enable it some time late morning (UK time). I'll disable it as soon as I notice problems, please feel free to disable it for me. Conrad.Irwin 01:09, 14 April 2009 (UTC)

Foreign WOTD

Circeus has started a redesign of the main page which I think has some definite merit. One of the additions to the page is a "Foreign Word of the Day" section, which I find very promising. Regardless of whether the redesign is accepted or not, I'd like to propose that we include such a section. However, such a thing really needs some infrastructure supporting it. Most importantly, it needs a point of contact, someone who is in charge of the thing. Now, with WOTD, we've basically got a dictatorship going, with EncycloPetey having absolute and unrestrained authority over it. Anyone is free to disagree with me on this, but I think that it's worked out fantastically so far, and I recommend that we adopt a similar policy, voting in a FWOTD dictator, who will act responsibly with such unlimited power. My first choice would be Stephen, who seems to be fluent in every language known to man. However, I have my doubts as to whether he'd take on the job. Anyway, whomever we pick, we'd want to set it up so that it requires minimal work from the person in charge. So....along the lines of the current WOTD selection process, we could simply have a nomination page, where people show off the best their language has to offer. The person in charge would probably give preference to words which are more impressive, so it would offer incentive to offer up choices which are already in good shape. Thoughts? -Atelaes λάλει ἐμοί 21:30, 9 April 2009 (UTC)

FWIW I had not decided what exactly was going to go in that space (though a FWOTD was one of several suggestions I made in the discussion re: the revamp). It was added by DAVilla. I was going to let it to a separate discussion. Circeus 21:45, 9 April 2009 (UTC)
Hey I just slapped it up there as a brainstorming idea. Atelaes is right, there's absolutely no infrastructure behind it at all. But then, we're not committed to this. It depends on if people want to have it or not. There is no question that if it goes up at all that it will look much different than it does now, with a different icon and at least showing the language of the definition! Another possibility is that if there are many languages then those could be listed, not showing the definitions. Those details we can hash out together, but I like the idea of a czar running the whole thing. Likewise for newly discovered terms, which interests me more than anything else. There are a couple of other things I haven't thrown up there yet, just as ideas. Hopefully one of these will grab someone's interest. 63.95.64.254 01:39, 10 April 2009 (UTC)
I was just outlining that I wasn't endorsing these specific features at the moment. I didn't intend to put down the proposal, quite the contrary. Circeus 02:08, 10 April 2009 (UTC)
Understood. That was really a response to both of you, and of course for anyone who's watching. Apologies for hijacking your redesign page. It may be called for to splinter off these changes. DAVilla 02:53, 10 April 2009 (UTC)
S'alright. As I mention above, my concern is really with the design itself. I'll leave open the question whether there should even be new content there. After all, the idea of not sliding a second box under WOTD and publicizing non-word content in a page-wide box under WOTD and discussion room links is tempting. Circeus 04:13, 10 April 2009 (UTC)
Some questions about FWOTD that ought to be resolved:
  1. How often are each of the various languages to be represented? That is, how often will we feature a French word, a Dutch word, a Hungarian word, an Ewe word, etc.? Will representation be according to the percent of total entries on Wiktionary? If so, would it be according to number of lemmata or total entries? If not, then how will features be equitably apportioned to various languages?
  2. Will non-Latin languages be featured? For many users, some languages will show up on the Main Page as a series of little empty boxes or as identical indecipherable squiggles. By the latter, I mean that some languages are interpreted by some browsers/systems with a language-specific default character that is repeated for each different character. On my Mac, Malayalam looks like a series of identical little icons, as does Lao, but the little icons differ for Malayalam and Lao. How will unenlightened users react to this apparent computer garbage on our Main Page? Transcriptions may help a little, but won't eliminate the possible "garbage" symbols.
  3. How will words be selected? For the WOTD we currently have, words are neither too common nor too bizarre, and are chosen more often because an average person could conceivably use them or come across them during a typical day, but they are a bit out of the average person's experience. If we feature words from many different languages, then how are these standards to be applied? Most people communicate in just one or two languages, and have little opportunity or reason to spout a word in another language. I have almost no potential for using Albanian, Maori, or Norwegian in my everyday activities (though my job has permitted me to drop in Chinese, Japanese, Latin, Spanish, and a few other languages from time to time). So what qualifies an entry for featuring in FWOTD?
There are other concerns I could name (e.g. audio files), but I think the items listed above are the big three. If a satisfactory plan can be made to cover these three points, then I imagine any other issues could be dealt with as well. I have raised these questions before, but no usable replies have been made to them. --EncycloPetey 17:24, 11 April 2009 (UTC)
Some thoughts in response:
  1. This is one reason to have a czar, someone who we trust to be fair. In my opinion it would be best to highlight good entries, more like a featured article, but if quality does not correspond with the quantity then yes, the czar could make some adjustments.
  2. Good point. I agree, but I couldn't imagine leaving out other scripts. I guess just the Romanization is a possibility, though in some cases that would have to be stripped down further, leaving out the stranger diacritics. Better in my opinion would be to use an image linked to the page (or if necessary linking to transliteration to the scripted word), which hopefully isn't too much extra work.
  3. The point is not that any of us would use these words. (The phrasebook is a lot better for that.) It just highlights the fact that this isn't your grandmother's dictionary. I think the coolest terms are those that are difficult to translate into English, for instance if the primary definition includes what are to us different concepts. I also like to see words with several definition lines, very different concepts like English bat (mammal/baseball bat). In a foreign language these can be very strange to us sometimes.
I'll state the obvious on audio files: there may not always be one. Should there be? At the very least it should be encouraged. DAVilla 19:18, 11 April 2009 (UTC)
Re: Audio files: The draft for the new Main Page incorporates FWOTD into the same template as WOTD, which I think is a mistake for many reasons. One of these reasons is that the WOTD template requires an audio file and provides a red-link if the audio file does not exist. If you intend to use a WOTD template, then the FWOTD selection must have an audio file or we will have perpetual red-links on the Main Page. This invites trouble. --EncycloPetey 19:28, 11 April 2009 (UTC)
Good point, we don't want to invite trouble. One idea is to leave the audio icon and link out entirely if a file doesn't already exist. Otherwise this might be considered a restriction on which words can be used, namely those that have already been recorded only. That I've seen there are many foreign words with recordings on the respective Wiktionary, many of those missing here, so it won't be too much to ask of nominators. However it would severely restrict the language options. Originally I had put the FL WOTD as a separate box. I think it looks nicer now, but we have to design around the content. DAVilla 20:02, 11 April 2009 (UTC)
Re: Separate box: I also think the one box looks better than two, but cosmetic improvement isn't the only concern. I see more logistical problems with the combined box, such as coordinating nominations and archives; coordinating when WOTD gets updated daily but perhaps the FWOTD doesn't; and the possibility of user confusion in having two headwords inside the single box. Having a single box with two words but only one audio link would confuse/perturb some users, I'm sure. --EncycloPetey 20:23, 11 April 2009 (UTC)
It should be possible to run the FL WOTD completely separately even though they share the same space. That did occur to me while trying out the look. In my opinion it would only be worthwhile if there were a new foreign word every day, so I hadn't considered unsynchronized updates, nor do I think we should. If it's not possible to update every day, then leave it for Interesting Stuff. Otherwise, wouldn't it be possible to use a single template and have the duties divided between yourself for WOTD and another person for the FL part?
To avoid the confusion you mention and issues about red audio links etc. I think it makes a lot of sense to simply require that there always be audio for the word. This will restrict us in some ways, but it's equally a bonus to have narrowed the field of selection, (edited:) so as to avoid accusation of unfair selection by the czar. It just might encourage the addition of such information, in many cases just adding a link to content that already exists elsewhere. Also it's a strong selling point to say that we have audio recordings of foreign words. DAVilla 23:53, 11 April 2009 (UTC)
If we want to consider restricting FWOTD entries to those that have audio, then we ought to do some reconnaissance to determine how many entries in various languages have audio available. A ballpark estimate would be sufficient, but we ought to consider that common and uninteresting words are more likely to have the audio recorded than "interesting" words. I almost always have to record audio for WOTD selections before they go up, although I do find a couple each month that Dvortygirl has already recorded.
An idea: we could consider starting FWOTD as a part of "Interesting Stuff" (at least initially) and see where it takes us. --EncycloPetey 23:42, 11 April 2009 (UTC)
Trial run is a good idea, absolutely. I am going ahead to ask for nominations at this time. We will need someone to take this up though. Atelaes's original point is unanswered. If it looks like it's possible, I say do it! Interesting stuff is plan B. And by the way, we probably ought to critique the interesting stuff as thoroughly. DAVilla 00:05, 12 April 2009 (UTC)

Now accepting nominations! Wiktionary:Word du jour/Nominations DAVilla 02:20, 12 April 2009 (UTC)

Why Wiktionary:Word du jour/Nominations in lieu of Wiktionary:Mot du jour/Nominations? The uſer hight Bogorm converſation 20:49, 17 April 2009 (UTC)
Because (1) mot isn't understood by English speakers and (2) this isn't the French word of the day. DAVilla 05:52, 18 April 2009 (UTC)

Other Boxes in Redesign

I've realised that only a minimal amount of alteration to the WOTD template (basically add a few <code>{{#switch}}</code> to account for the differences and make the title change according to the content, and possibly drop off the audio) was necessary to allow a feature where the content is not composed solely of English word entries. I've made 8 sample entries that can be roughly cycled through by purging/refreshing the page (the code allowing this random switching is borrowed from w:Portal:Featured content). Circeus 05:49, 10 April 2009 (UTC)

I was wondering about featuring other pages like this but wasn't quite sure how to go about it. We don't have enough Wikisaurus content to show off, for instance. Using a round robin sort of approach is keen.
Some other ideas would be possible with a dedicated contributor to support it. The news quote didn't go on the page because it looks like a lot of work to me, but I'm hoping to inspire someone. On the other hand, I honestly think we have enough content to list a newly discovered word every day, and that's something I would be excited about personally. Can we bring that back? DAVilla 07:35, 10 April 2009 (UTC)
Probably not. SemperBlotto was running the thing singlehandedly for ages, and has grown tired of it. No one else ever stepped up to help out for very long (and I say that as someone who used to dabble from time to time). If it's not being maintained actively, it shouldn't be on the Main Page. --EncycloPetey 19:33, 11 April 2009 (UTC)
What SB was running was words in the news, which was one proposal but different from the attested neologisms I called "Newly Discovered". This was incorporated into an Interesting Stuff box, but I'd like to put a dedicated section back into the revision. DAVilla 20:02, 11 April 2009 (UTC)
For all these things I think it would be better to not specify the time period on the page, and then update it as often as we have the ability/enthusiasm. It may end up being "of the day", but if we don't specify that then it won't look so bad when it doesn't change for a week because someone is on holiday. Conrad.Irwin 08:22, 10 April 2009 (UTC)
Alternatively, it should be possible to start with a 31 advance log that turns on itself in a fashion similar to WOTD until we get enough for a yearly cycle. Circeus 16:28, 10 April 2009 (UTC)
Now see Wiktionary:Newly discovered. DAVilla 06:42, 11 April 2009 (UTC)
I listed vulgar language but I'm pretty sure no one would want to see e.g. pussy pounding on the front page. Can I assume the same about jill (to masturbate)? assward? What about a quotation like "I bet you’d just lurve to have baby oil smoothed all over your little nappy bits"? If that's objectionable, it could easily be swapped for another. DAVilla 18:20, 11 April 2009 (UTC)
Vulgar language on the Main Page would limit access to our entire website. Most schools use a filter that prevents access to pages containing certain words. We don't want access to the Main Page restricted from a significant fraction of users. --EncycloPetey 19:31, 11 April 2009 (UTC)
Thanks, that's what I thought. I'll strike all of those out. Only had them listed because I was going through rather methodically. Words like lurve aren't so vile with a decent quotation. I still have questions about where to draw the line though. What about other potentially offensive words like shotacon that don't rely on vulgar terms for definition but could be problematic nonetheless? What about quotations like "Bi people tend to develop polyamorous identities and poly people tend to develop bisexual identities." Better to leave them out too, I guess? DAVilla 02:44, 12 April 2009 (UTC)
Okay, sorry for being so dense. I answered my own question. I was just on a “Wiktionary doesn't censor” spacewalk, and it took a little while for reality to sink in. DAVilla 20:57, 12 April 2009 (UTC)

I've got IPA, and here are my demands

Peter Isotalo recently started a BP thread on changing the link in {{IPA}}. Follow the link to see the rousing conversation he found. He has (reasonably, in my opinion) become a bit impatient about the whole issue, and has asked me to make the changes. I've set up {{grc-test}} as I intend to make {{IPA}}. Basically, if there is a lang parameter entered, the template links to "Wiktionary:language pronunciation" if it exists and "w:language phonology" if it doesn't. Otherwise, it simply links to our old standby w:IPA chart for English dialects. You can see the results at User:Atelaes/Sandbox. Unless someone gives me a damned good reason not to (or pays me a large sum in unmarked bills), I intend to institute these changes tomorrow. -Atelaes λάλει ἐμοί 22:25, 11 April 2009 (UTC)
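The link selection described above can be sketched as a small function. This is only an illustration in JavaScript, not the template code of {{IPA}} or {{grc-test}}: the function name is hypothetical, it assumes the language name (not the code) is passed in, and `pageExists` is a hypothetical lookup supplied by the caller.

```javascript
// Hypothetical illustration of the proposed fallback order for the IPA link.
// languageName: e.g. "Ancient Greek", or undefined when no lang parameter is given.
// pageExists: caller-supplied function that reports whether a page title exists.
function ipaLinkTarget(languageName, pageExists) {
  if (!languageName) {
    // No lang parameter: keep the old standby.
    return 'w:IPA chart for English dialects';
  }
  const localPage = 'Wiktionary:' + languageName + ' pronunciation';
  if (pageExists(localPage)) {
    return localPage;
  }
  // No local page yet: fall back to the Wikipedia phonology article.
  return 'w:' + languageName + ' phonology';
}
```

For example, with a lookup that knows only "Wiktionary:Ancient Greek pronunciation", a call for "Ancient Greek" returns the local page, while a call for "Hungarian" returns "w:Hungarian phonology".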

Sounds reasonable. --EncycloPetey 22:31, 11 April 2009 (UTC)
Should we maybe use the "Appendix" namespace — e.g., [[Appendix:IPA chart for language]] — if we want it to be part of our content? My understanding is that the project ("Wiktionary") namespace is for Wiktionary policies, guidelines, discussions, and so on. —RuakhTALK 02:39, 12 April 2009 (UTC)
Yes, I think you're right. It should be "Appendix:language pronunciation". -Atelaes λάλει ἐμοί 03:03, 12 April 2009 (UTC)
Do eeeet — [ ric ] opiaterein — 17:23, 12 April 2009 (UTC)

Roman numerals

For a Catalan usage note template for Catalan ordinal numbers, I was planning on writing some code to generate Roman numerals given a numeric argument. Any interest in my making it a template of its own and any preference to the name? (I was thinking {{Roman numeral}} and if there already is one by some other name, I couldn't find it.) I expect to make it with a range of 1 to 3999 (I to MMMCMXCIX) as more than that runs into problems as using CSS for the overline for thousands means that I'd be using CSS to convey content, which is bad practice and using the combining overline (U+305) has very spotty font support. Besides, for my purposes, 1000 is sufficient. — Carolina wren discussió 02:41, 13 April 2009 (UTC)
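For reference, the conversion over the stated range of 1 to 3999 can be sketched as follows. This is a JavaScript illustration of the usual subtractive-notation approach, not the wiki template code itself; the function name `toRoman` is hypothetical.

```javascript
// Convert an integer from 1 to 3999 into a Roman numeral (I to MMMCMXCIX).
// Values outside that range are rejected, since larger numbers would need
// an overline for thousands.
function toRoman(n) {
  if (!Number.isInteger(n) || n < 1 || n > 3999) {
    throw new RangeError('toRoman expects an integer from 1 to 3999');
  }
  // Largest values first, including the subtractive pairs (CM, CD, XC, ...).
  const table = [
    [1000, 'M'], [900, 'CM'], [500, 'D'], [400, 'CD'],
    [100, 'C'], [90, 'XC'], [50, 'L'], [40, 'XL'],
    [10, 'X'], [9, 'IX'], [5, 'V'], [4, 'IV'], [1, 'I'],
  ];
  let remaining = n;
  let result = '';
  for (const [value, symbol] of table) {
    while (remaining >= value) {
      result += symbol;
      remaining -= value;
    }
  }
  return result;
}
```

Here `toRoman(3999)` gives "MMMCMXCIX" and `toRoman(1)` gives "I", matching the range quoted above.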

Could you use the template in the way you've intended, so that we can see what it's to be used for? --EncycloPetey 18:21, 15 April 2009 (UTC)
It's been incorporated into {{ca-num-ord-note}} (as {{Roman-num}} as I was worried about it being mistaken for a context template), and you can see it in use in primer. Since I was already passing the number into {{ca-num-ord-note}} for other purposes, it's not requiring changes to every entry to make use of it. — Carolina wren discussió 01:39, 17 April 2009 (UTC)
Do 160 entries really need the identical 100-word note about Catalan numbering? Perhaps it should be in a sidebar or call-out box with a title like “Catalan ordinals,” so that its nature is clear to readers. Or does it belong in an appendix or Wikipedia article?
Entry content (like the specific notes for 1, 2, 3, 4, 10, 100, 144, 1000, etc) should be placed in the respective entries where it can be edited, rather than buried away in a template's #switch statement. Michael Z. 2009-04-17 02:37 z
I'm trying to provide uniformity to the usage notes for a finite, but large class of entries. The lemma entries for the ordinals all exist, and I would be extremely surprised if any others would merit inclusion. Indeed, I added 144 solely because of the existence of grossa. The specific notes given by the switch exist as alternatives to the default note. If the switch is what is mainly bothering you, I suppose I could rework it so that instead of a switch, the entire alternative to those parts would be passed as a parameter by those entries that need it to assuage your concern. — Carolina wren discussió 03:41, 17 April 2009 (UTC)
Yes, I think a parameter would be preferable for text which is unique to an entry. Can't this be rewritten so that the entry text could just be text in the entry?
I'm concerned that although they do cite the entry term as an example, the second and third paragraphs are general discourse on Catalan ordinals, and not a proper part of the entry quart, for example. Michael Z. 2009-04-19 01:53 z
What I was asking for was a sample entry showing what the final product of such a template might look like or be used for. --EncycloPetey 04:17, 19 April 2009 (UTC)

Assisted editing a success?

It's still early days, but so far the following seems to be the case:

  • More translations are being added (not sure exactly how many more, but as I type 21/80 recent changes are made with this, compared to a meagre 3/80 accelerated).
  • Anons are using it to add translations, they seem to be mainly correct - but are sometimes slightly substandard, missing a gender - or formatted slightly wrong because they've tried to put too much detail into the box.
  • I've so far noticed two blatantly incorrect uses of it; both times it was to "unbalance" the translations using the ← and → buttons (though both times, only slightly).
  • Unwhitelisted and whitelisted registered users seem to be using it well.

The decision thus seems to be whether to keep it enabled for all anonymous users, or to limit it to logged in users. Conrad.Irwin 14:08, 14 April 2009 (UTC)

Users might be unbalancing the translations because of the edit box itself, which makes a balanced table look unbalanced. Maybe you could use a strip across the bottom instead of just the right column? DAVilla 01:19, 17 April 2009 (UTC)
Not knowing the effect on the pattern of anonymous contribution, I can only speak about my editing experience: this is a nice tool, thanks! --Dan Polansky 14:46, 14 April 2009 (UTC)
On design: What about making the gender checkboxes a part of the "less" variant instead of the "more" variant? Many foreign languages feature gender in their translations, while few feature transliteration, display form, and override script, AFAICT anyway. --Dan Polansky 15:04, 14 April 2009 (UTC)
I had given that some thought before, so it is now done if you clear your cache (ctrl+shift+F5). Conrad.Irwin 15:26, 14 April 2009 (UTC)
I can't get this to work. Whatever I type in the first box (e.g. cy), I get the message "Please use a language code. (en, fr, aaa)". What am I doing wrong? 82.18.22.160 21:54, 14 April 2009 (UTC)
I haven't yet worked it out. You are using IE6? Conrad.Irwin 22:18, 14 April 2009 (UTC)
The same happens for me in IE7. It works fine using Google Chrome. Using Firefox I get "Loading . . ." but nothing happens. SemperBlotto 12:29, 15 April 2009 (UTC)
The same in IE7 by me. I get the error message: "Please use a language code. (en, fr, aaa)". In Firefox, which I normally use, everything works fine. --Dan Polansky 17:56, 15 April 2009 (UTC)
Would be okay for anons just as edit is okay, but take it offline if it's causing errors. Per below, you may also consider an "off" button that would store as a cookie and could be reset in prefs or by clearing cookies. DAVilla
Yes, now done using the module that already remembers whether you have the box open or shut and which language you last used. It's at the top of User_talk:Conrad.Irwin/creation.js and can be put anywhere. Conrad.Irwin 01:52, 16 April 2009 (UTC)
I also wonder why we don't have more standard A to I and J to Z assignments for columns. If I'm looking for Japanese I'd like to find it in about the same place every time. Nearly balanced is good enough, and perfection is impossible given that the lines may or may not run over depending on fonts and page width. DAVilla 20:16, 15 April 2009 (UTC)
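The fixed split suggested above amounts to a one-line rule. A minimal JavaScript sketch, with a hypothetical function name, and assuming language names are sorted by their first Latin letter (anything outside A to I lands in the second column):

```javascript
// Assign a language to a column by its first letter: A-I left, everything else right.
// This ignores balance entirely, which is the trade-off being proposed.
function columnFor(languageName) {
  const first = languageName.trim().charAt(0).toUpperCase();
  return first >= 'A' && first <= 'I' ? 'left' : 'right';
}
```

Under this rule Japanese always lands in the right column and French in the left, whatever else is in the table.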
This is something that would need to be taken up more widely if it were to be changed. The other thing that concerns me is the grouping of languages in translation sections as that makes things that much harder. Conrad.Irwin 01:52, 16 April 2009 (UTC)
You mean like when people put everything that is or ever was spoken anywhere within China or the territories it claims under "Chinese"? You should consider those as incorrect. I'm going through a few entries and adding *Chinese: See Mandarin and hoping to get some feedback. DAVilla 01:19, 17 April 2009 (UTC)
IE6 works okay as far as language code goes, but the box to save changes is hidden behind the logo and so the changes can't be saved.
On a similar note I was wondering if it would be possible to position that box relative to the window, rather than at the top of the page. DAVilla 01:19, 17 April 2009 (UTC)
Now fixed (both problems) in IE6, it should already have been right in other browsers. Conrad.Irwin 11:12, 17 April 2009 (UTC)

I'd say keep it enabled! A very nice tool, so thanks a lot! --Eivind (t) 16:32, 15 April 2009 (UTC)

Could there be a way to turn it off? I'm not likely to use it myself, and unless one is intending to edit a translation box with it, it's an ugly intrusion. By the way, a way to edit an existing translation to add gender or other things to existing entries would be useful. — Carolina wren discussió 17:51, 15 April 2009 (UTC)
You could put the following into Special:MyPage/monobook.js, though I appreciate that's not an ideal solution.
window.editorLoaded = true;

Editing existing entries is more tricky as it has to do more detailed analysis of the page, but it's certainly somewhere on the todo list. Conrad.Irwin 17:55, 15 April 2009 (UTC)

You can now disable it at User talk:Conrad.Irwin/editor.js after you hard refresh. Conrad.Irwin 01:52, 16 April 2009 (UTC)
The feature is almost useless to me. Although it seems to work in Safari on a Mac, I need special characters for Latin that can't be typed from my keyboard easily. I do note one confusing point, and that is the complete lack of visible prompting of what goes into each of the little text windows. The ISO-code window needs at least something to prompt more visibly for an ISO code. Is it possible for the boxes to have "dummy" text visible in them when they first appear, and which disappears once someone begins typing information into the window? --EncycloPetey 18:15, 15 April 2009 (UTC)
It's possible, and I'll give it a go at some point. Conrad.Irwin 01:52, 16 April 2009 (UTC)
One other point: The gender checkboxes say "male" and "female", which are not grammatical genders. They need to say "masculine" and "feminine", or else use an abbreviated form of those words. The grammatical gender does not always match the actual gender of the referent. Consider that the German Mädchen (young woman) is a female word, but is grammatically neuter. The grammatical gender of animal names in many languages has similar problems. --EncycloPetey 18:32, 15 April 2009 (UTC)
I've fixed this. Conrad.Irwin 01:52, 16 April 2009 (UTC)

I find this tool absolutely indispensable. Can't imagine going back to manually adding translations. Please, keep it developing. Particularly, I'm interested in being able to modify already existing stuff, like adding sc= parameter or a transliteration to an existing translation. --Vahagn Petrosyan 18:44, 16 April 2009 (UTC)

It's a great tool and a time-saver, Irwin! Sometimes it gives an error on the gloss format being incorrect (just added another and then the next translation fails), even if there is no problem with the gloss. It doesn't happen too often, though. Unfortunately, I can't use it for Chinese translations. First of all, it doesn't allow both traditional and simplified entries. Also, I don't agree with using "Mandarin". It's more common to use "Chinese" in translations here and add dialects on the next line. Just repeating here what I said in your discussion page. Please let me know if you have questions. I would like to add many more Chinese translations. (Please don't explain the Chinese language family situation to me, I am well aware of it.) Anatoli 01:49, 20 April 2009 (UTC)

I made some fixes about the gloss and the formatting last night, so hopefully they'll be better - feel free to list pages that you think are right and that it raises an error on, if I get time I'll fix it. The reason it says Mandarin is because it just uses '{{subst:zh}}'. Conrad.Irwin 09:49, 20 April 2009 (UTC)
Conrad.Irwin, it should show "Mandarin" for cmn or zh-cmn, not for zh (zh stands for Zhōngwén (中文) - Chinese), '{{subst:cmn}}' produces Mandarin as well, '{{subst:yue}}' produces Cantonese, etc. Anyway, I'll get to the bottom of the Chinese templates usage. More importantly, IMHO, the traditional/simplified separation should be accommodated, e.g. trad. 中國, simpl. 中国 (pinyin: Zhōngguó); otherwise, I can't use the assisted editing for Chinese (Mandarin) at all. When traditional and simplified are identical, it should provide just one script, e.g. 北京 (Běijīng).
I get errors occasionally on other languages. However, thanks for your efforts, in any case. Anatoli 12:14, 24 April 2009 (UTC)
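The display format requested above for Mandarin could be sketched like this. It is only an illustration of the two cases given (different scripts versus identical scripts); the function name is hypothetical and this is not part of editor.js.

```javascript
// Format a Mandarin translation in the two shapes described above:
// both scripts when they differ, a single form when they are identical.
function formatMandarin(traditional, simplified, pinyin) {
  if (traditional === simplified) {
    return traditional + ' (' + pinyin + ')';
  }
  return 'trad. ' + traditional + ', simpl. ' + simplified + ' (pinyin: ' + pinyin + ')';
}
```

Called with 中國, 中国 and Zhōngguó it produces "trad. 中國, simpl. 中国 (pinyin: Zhōngguó)"; called with 北京 twice and Běijīng it produces "北京 (Běijīng)".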
The remembering of the "less"/"more" state does not work any more for me. That is, when I choose to see less on one page, and open another page, I see all the fields in the newly opened page. Browser: Firefox 3.0.9; OS: Windows Vista. --Dan Polansky 11:14, 22 April 2009 (UTC)
It won't remember anything until you hit "Preview", I should maybe change that. Conrad.Irwin 08:54, 24 April 2009 (UTC)
So that is the trick that does it. Works for me. --Dan Polansky 11:41, 24 April 2009 (UTC)

Nested translations etc.

Is there a list of which languages should get nested under which headings? Are there other templates, apart from the Chinese ones that people want to be able to (ab)use in place of {{t}}? Are these deviations from a standard a good idea? Conrad.Irwin 09:49, 20 April 2009 (UTC)

"Norwegian Bokmål" (nb) and "Norwegian Nynorsk" (nn) should be nested under "Norwegian" (no). Most often we put translations that are correct in both languages under "Norwegian", and specific translations under nb and nn. --Eivind (t) 10:03, 20 April 2009 (UTC)

For Bokmål and Nynorsk, I believe both are subsumed under one Norwegian Wiktionary, i.e., {{no}}. Upper and Lower Sorbian should nest under Sorbian (often ignored, unfortunately). Jicarilla, Chiricahua and Western Apache should nest under Apache. Under Chinese go Mandarin, Yue, Xiang, Min Nan, Min Dong, Gan, Wu, and Hakka. Modern and Ancient Greek under Greek. Eastern and Western Mari under Mari. The Arabic dialects under Arabic. Brazilian and European Portuguese under Portuguese. I think there are others that are differentiated only by Northern and Southern, or Upper and Lower, that should nest, but I don’t recall off the top of my head. —Stephen 13:31, 22 April 2009 (UTC)
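One way to make such a list usable by a script is a lookup table from heading to nested languages, plus a sort key that files each nested language under its heading. The sketch below is an assumption-laden illustration, not an agreed list or existing code: the table holds only groups named in this thread, the exact language names are guesses at how they appear in entries, and the function names are hypothetical.

```javascript
// Partial, illustrative nesting table built from the groups named above.
// The Arabic dialects and Portuguese varieties are omitted because no
// concrete entry names were given for them.
const NESTING = {
  'Norwegian': ['Norwegian Bokmål', 'Norwegian Nynorsk'],
  'Sorbian': ['Upper Sorbian', 'Lower Sorbian'],
  'Apache': ['Jicarilla', 'Chiricahua', 'Western Apache'],
  'Chinese': ['Mandarin', 'Yue', 'Xiang', 'Min Nan', 'Min Dong', 'Gan', 'Wu', 'Hakka'],
  'Greek': ['Modern Greek', 'Ancient Greek'],
  'Mari': ['Eastern Mari', 'Western Mari'],
};

// Reverse lookup: nested language -> heading it belongs under.
const PARENT_OF = {};
for (const [parent, children] of Object.entries(NESTING)) {
  for (const child of children) {
    PARENT_OF[child] = parent;
  }
}

// Nested languages sort as "Heading\u0000Name", so they follow their heading
// and precede the next top-level language.
function sortKey(language) {
  const parent = PARENT_OF[language];
  return parent ? parent + '\u0000' + language : language;
}

// Return a new array ordered alphabetically with nested languages grouped.
function orderLanguages(languages) {
  return [...languages].sort((a, b) => {
    const ka = sortKey(a);
    const kb = sortKey(b);
    return ka < kb ? -1 : ka > kb ? 1 : 0;
  });
}
```

With this key, "Ancient Greek" is placed where "Greek" would go (after German, before Hungarian) rather than near the top of the list under "An".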
What about various Old and Middle languages? Old Irish under Irish? Middle Welsh under Welsh? And if so, where do we put Old English and Modern English, since translation boxes don't have a line for English? Is this nesting something AutoFormatBot could do? Because when I've added Ancient Greek using the handy new quick-entry form, it gets automatically alphabetized between "Am" and "Ao" rather than nested under Greek, and if I have to go in and fix that manually it defeats the purpose of the convenient quick-entry form. Angr 13:46, 22 April 2009 (UTC)
Yes, Old and Middle go with the Modern language, except in the case of Old English, since Modern English does not get a line in the translation section. —Stephen 14:02, 22 April 2009 (UTC)
So Old English and Middle English get alphabetized under "O" and "M" respectively? That will separate them from each other as well as confuse people who are accustomed to looking for "Old/Middle Foobar" under "Foobar". Maybe there should be an empty "English" line with "Old English" and "Middle English" indented under it, the way "Serbian" is usually empty, with "Cyrillic" and "Latin" indented under it. Angr 14:44, 22 April 2009 (UTC)
I fear that that will encourage well-meaners to add (modern) English synonyms or summat.—msh210 17:56, 22 April 2009 (UTC)
We could call it "English (earlier stages):" or summat of that. Angr 10:44, 24 April 2009 (UTC)
No, please, keep Old English and Middle English away from Modern English, they have an entirely different vocabulary (no Gallicisms, Latin loanwords) and grammar (ge- for forming the past participle, as in German). Old High German and Middle High German do not have the same problem, but the spelling is too different (lack of noun capitalisation, unlike German). The current format is the best one - English, Old English, Middle English; French, Old French; Latin, Late Latin and so on. I support nesting Bokmål and Nynorsk under Norwegian, it seems reasonable. The uſer hight Bogorm converſation 18:09, 24 April 2009 (UTC)
I thought we didn't separate Middle English from English. There is ==Old English== but no ==Middle English==, only # {{obsolete}}. Likewise translations for * Old English but not * Middle English, only * English: (obsolete). DAVilla 15:49, 30 April 2009 (UTC)
Apparently I'm mistaken. See Category:Middle English language. DAVilla 20:32, 3 May 2009 (UTC)
We've been through this before. Yes, alphabetizing the languages separates them, but there is no way around this. Not every language descends from a similarly named language. The choices are (1) alphabetical order or (2) complete language family tree. There is no middle. DAVilla 15:49, 30 April 2009 (UTC)
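Editorial sketch: the nesting proposed above and the alphabetization problem Angr describes can be illustrated with a small sort routine. This is only a sketch under stated assumptions, not how AutoFormat or editor.js actually work. The `NEST_UNDER` table is hypothetical and holds just a few of the groupings named in this thread, and the `*` / `**` line format follows DAVilla's notation rather than any confirmed template output.

```python
# Hypothetical nesting table: child language -> parent heading.
# Only a few of the groupings suggested in this thread are listed.
NEST_UNDER = {
    "Norwegian Bokmål": "Norwegian",
    "Norwegian Nynorsk": "Norwegian",
    "Upper Sorbian": "Sorbian",
    "Lower Sorbian": "Sorbian",
    "Ancient Greek": "Greek",
    "Mandarin": "Chinese",
    "Min Nan": "Chinese",
    "Old Irish": "Irish",
}


def sort_key(language):
    """Sort by parent heading first, the parent line itself before its children."""
    parent = NEST_UNDER.get(language, language)
    return (parent, language != parent, language)


def render(languages):
    """Return translation-table lines with nested languages indented."""
    lines = []
    seen_parents = set()
    for lang in sorted(languages, key=sort_key):
        parent = NEST_UNDER.get(lang)
        if parent is None:
            lines.append(f"* {lang}:")
            seen_parents.add(lang)
        else:
            # Emit an empty parent line if the parent has no translation itself.
            if parent not in seen_parents:
                lines.append(f"* {parent}:")
                seen_parents.add(parent)
            lines.append(f"** {lang}:")
    return lines


for line in render(["Armenian", "Ancient Greek", "Aragonese", "Greek", "Old English"]):
    print(line)
```

With this table, Ancient Greek is listed under Greek instead of among the "A" languages, while Old English, which has no entry in the table, stays under "O" as described above.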
I've been doing entries for the Valencian standard of català as indents. (See eight for an example.) It's a peculiar situation. Valencian and Catalan share the same ISO 639 code, but both are recognized in ISO 639-3 as names for the language and both have bodies to prescribe their standard. Fortunately the differences between the two are largely a matter of pronunciation or preference between equally acceptable forms (and thus wouldn't concern the translations section) but as seen with huit/vuit the difference is sometimes orthographic. It's rare enough that manually adding the Valencian form is a viable option so long as editor.js doesn't disturb existing entries.— Carolina wren discussió 04:03, 24 April 2009 (UTC)
Irwin, thanks for fixing the '{{subst:zh}}'. I can now add assisted Chinese translations, if jiantizi/fantizi match. Is traditional/simplified on your to-do list, at least? It would be great if you could add this. Even if you could add two entries separated by commas, this could benefit some other translations where you can have variants in spelling. Here are two examples:

What do you say? If you had an additional (optional) textbox that would also work.

Anatoli 01:02, 30 April 2009 (UTC)
We don't do translations into "Chinese". Nothing was fixed by substituting that for Mandarin, which is the correct language header. DAVilla 01:15, 30 April 2009 (UTC)
DAVilla, there is an existing '{{subst:cmn}}' template for Mandarin (Chinese Mandarin). '{{subst:zh}}' is for Chinese (Zhōngwén - Chinese (language)) and zh links to the Chinese Wiktionary, which happens to be in Mandarin but I see no difference, since standard written Chinese is normally in Mandarin. As I described before, the translations could be nested further by having Mandarin, Cantonese, etc, then Chinese needs to be the header for all Chinese languages/dialects. It would be a waste of space and typing time, in my opinion. Anatoli 03:14, 30 April 2009 (UTC)
I'm not an expert, but I believe that "Traditional"/"Simplified" can be toggled by setting the appropriate script template. (Click on "more", and use Hans or Hant in the "Script" box - you can then use the "Qualifier" box to put (simplified) and (traditional) in front if you're feeling cunning). I did not fix Template:zh, but it was my suggestion that it be fixed - there are a lot of users who seem to want to add translations under Chinese. If this isn't desirable then the "fix" needs reverting and it needs to be explained somewhere why this is the case. As for supporting nesting, it will hopefully happen, eventually. Conrad.Irwin 09:35, 30 April 2009 (UTC)
(For what it's worth, we currently have about 8000 Chinese translations). Conrad.Irwin 09:41, 30 April 2009 (UTC)
And they're mostly wrong. Per Wiktionary:About Chinese, particularly A-cai's comments on the talk page, cases where only * Chinese is listed need to be changed to * Chinese ** Mandarin or, my preference, just * Mandarin. The idea of using * Chinese ''See Mandarin'' is very new and potentially controversial, but the languages listed in the translations section have always been meant to match the L2 headers. DAVilla 15:39, 30 April 2009 (UTC)
As stated on that template talk page, "Wiktionary breaks down Chinese dialects with special codes such as cmn for Mandarin". Your point about Mandarin-language Wiktionary makes my point entirely. Regardless of which code they use, the words that concern them are in Mandarin, not from the entire Chinese language. The reason that {{zh}} says "Mandarin" is for use in linking to the Mandarin-language Wiktionary, which should say "Mandarin" not "Chinese". In fact I believe that is its only use here. DAVilla 15:39, 30 April 2009 (UTC)

Sorry, DAVilla, but I don't see your point, especially as I don't see you being active in creating Chinese translations. Mandarin = Standard Chinese, they are synonyms; besides, it's the only official dialect in China and in Taiwan. We have the template cmn, which can be used for "Mandarin". Please leave zh for Chinese - zh stands for Chinese, not for Mandarin. They can be used when there is a difference between dialects. The original edit of the template wasn't mine. Your demands to remove the word "Chinese" from translations are not helpful; I have to do more manual editing. If the word Mandarin were used more often in translations, then I would stick to it, but Chinese is more common than Mandarin when referring to the language of China and most translations use Chinese, not Mandarin, and I prefer to continue. The linked translations may have Mandarin, Cantonese, etc. with the appropriate pronunciation.

Bear in mind that adding a translation under Chinese, will add a link to [[page#Chinese]]. As we only have 15 pages with this heading, compared with 21075 using [[page#Mandarin]], DAVilla may well be talking sense. Presumably the problem would be somewhat lessened if I added support for "nesting" all the languages on Template_talk:zh under a "Chinese" heading. (Something I hope to be able to do at some point soonish). Conrad.Irwin 23:55, 3 May 2009 (UTC)
No, they don't. The link just searches for any occasion of the word, even if it's in Japanese because there is no [[page#Chinese]] but simply [[page]]. Nesting is OK but I see some problems with your tool, plus there is an extra unused line with just Chinese: on it, etc. Just in case you don't know, there is only one written Chinese standard (with 2 scripts), even in Hong Kong, where standard documents are written in "Mandarin", although they may read them out loud in Cantonese. Mandarin refers to the standard spoken language. That's why I don't see any issue about the Chinese wiktionary being in Mandarin - it's the normal way to write in Chinese. Anatoli 00:50, 4 May 2009 (UTC)

Irwin, your suggestion would require separate entries for simplified/traditional. They are regarded as the same word. 經驗 and 经验 (jīngyàn) are the same word (experience), only written in 2 forms. I prefer to see them together, followed by the pronunciation, like this: 經驗, 经验 (jīngyàn) Anatoli 22:39, 3 May 2009 (UTC)

You can achieve this by first adding 經驗, then adding 经验 with the transliteration (jīngyàn). [If you can't see the transliteration box, it is under More]. Which gives [3] (I should maybe have used cmn not zh which would get rid of the (zh), but it was just an example). Conrad.Irwin 23:55, 3 May 2009 (UTC)
This sounds like a good workaround, perhaps the first then should be sc=Hant, and the 2nd sc=Hans. Only some may assume they are separate words. The template: {{zh-tsp|經驗|经验|jīngyàn}} or simply: {{zh-ts|經驗|经验}} (jīngyàn) makes it clearer but it could be overkill in terms of how much info it provides. Anatoli 00:50, 4 May 2009 (UTC)
You can of course also edit the wikitext if you want to do something funky. While adding support for other templates is not impossible, it requires someone to design the interface for them and test that the wikitext generated is always correct (I think the actual edits can be done with the editing functions already used for {{t}}). If you wanted to give designing the interface a go then I'm sure that your changes could be added to editor.js (while it does require a knowledge of javascript, the fix should be manageable if time-consuming). Conrad.Irwin 22:38, 4 May 2009 (UTC)

Word of the Day

The Word of the Day appears on the home page to be stuck on the 12 April entry. When "refresh" is clicked in the Word of the Day box, one is taken to a page containing only today's proper word (not a home page with the proper word). Just FYI. —This unsigned comment was added by 76.202.234.114 (talk) at 15:40, 14 April 2009 (UTC).

Thanks for the heads-up.  (u):Raifʻhār (t):Doremítzwr﴿ 15:46, 14 April 2009 (UTC)
That's a result of the interaction with your browser, although it may be caused partly by MW as well. I know that the problem did not exist at this time last year, but has been particularly bad of late. I am having refresh problems on more than one Wiktionary page (including the Main Page and Recent Changes), and the problem is not limited to one browser or OS. --EncycloPetey 18:09, 15 April 2009 (UTC)
Hmm... It could also be partly caused by the fact that the Main Page dynamically calls content from the current WOTD according to the date. Is there any way to force a refresh on that? --EncycloPetey 19:06, 15 April 2009 (UTC)
Not MW at the moment: it displays as April 15 for me. Does it still look wrong to you? Besides local caching issues, there may be internet caching issues, potentially. DAVilla 20:07, 15 April 2009 (UTC)
It looks fine to me on my Mac right now, but I have problems when I use my Back Button (Mac), or when I first visit Wiktionary on any given day (Mac or PC). The latter applies even if I didn't log in. --EncycloPetey 20:21, 15 April 2009 (UTC)
Word of the day: word n, 1. Please leave a note in the Beer parlour to tell us that there is no word of the day. 78.49.0.96 09:34, 25 May 2009 (UTC)
Seems to be "chagrin". Conrad.Irwin 09:49, 25 May 2009 (UTC)

More direct link to sister project search results

Would it not be nice if one could specify a parameter so sister project templates, especially {{commonslite}}, would link directly to a search results page in the sister project? By default for the article title, optionally the display, or perhaps a third parameter. The default result now at best takes another click and may discourage clickthrough altogether. Implementation is a GP matter, but does anyone else see the advantage? Are there drawbacks? I see more benefit on Commons than on other projects. DCDuring TALK 16:59, 16 April 2009 (UTC)

Commons is the only site that I would even consider that for, and all in all it's probably still a bad idea. We don't want to link to searches on e.g. Wikipedia because it's expensive. We already have enough shit ugly Wikipedia boxes plastered onto every damn page as if someone isn't going to have enough sense how to go to Wikipedia and type in that exact term, when what we really need are Wikipedia links to relevant articles and maybe an occasional box linking to the disambiguation page. DAVilla 02:15, 17 April 2009 (UTC)
Yea, to hell with them. DCDuring TALK 10:32, 17 April 2009 (UTC)

Russian noun stuff

I got tired of using {{infl}} for Russian nouns, given that it's such a widely spoken language and all those parameters, including sc= just got to be too much. So, I moved the old {{ru-noun}} to {{ru-decl-noun}} (all entries linking to the old former have been changed over, don't worry) so that we can now use {{ru-noun}} for inflection lines. The usage isn't too complicated, have a look at the talk page for more on how to use it. — [ ric ] opiaterein — 17:01, 16 April 2009 (UTC)

I wonder whether there is any need for the animate/inanimate option. Animate nouns are people or animals, living, moving beings. Inanimate nouns are plants, rocks, elements, dust, ideas, feelings, dimensions, things that are not living. It’s pretty cut-and-dried. Also, since we show the plural in the declension table, the plural parameter in the heading seems like overkill. —Stephen 20:25, 17 April 2009 (UTC)
It's certainly useful to learners of the language, who don't always remember that there is a distinction between the two. I have always found the "in/animate" notes in my Polish dictionaries to be very helpful. And I don't know about Russian, but it also has an impact in Polish on the construction of place names from masculine nouns. --EncycloPetey 20:56, 17 April 2009 (UTC)
Agree 110%. — V-ball 03:10, 17 November 2010 (UTC)
In Russian, animate/inanimate does not impact the grammar outside of the declension of the word itself. Inanimate nouns have accusative like the nominative, while animate nouns have it like the genitive. But we are giving the appropriate accusative for each word, so it is not important to know about animacy. Only in a few words, such as prick, which can be an inanimate bodily organ or an animate irritating person, the accusative can have both forms depending on animacy. Polish is a more difficult case, because there are different pronouns and such. In Russian, it’s only the accusative case, which we show for every noun. —Stephen 22:26, 17 April 2009 (UTC)
I too think info about animate/inanimateness is not interesting. What I'd like to see is an ability to add the wording indeclinable in the inflection line, when necessary. Also, does anyone else think a feature for showing feminine counterparts to words like гражданин, армянин is desirable? I do. --Vahagn Petrosyan 19:44, 18 April 2009 (UTC)
I think showing feminine counterparts is useful. {{he-noun}} does it, e.g.—msh210 15:42, 21 April 2009 (UTC)
Yes, indeclinable is a useful parameter, since it’s a significant noun class in Russian, and there should be a way to show feminine counterparts on gender-specific Russian nouns such as American, brother, and Mr. —Stephen 13:42, 22 April 2009 (UTC)
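Editorial sketch: the accusative rule Stephen states above can be written as a one-line choice between two forms already present in a declension table. The function name and signature are hypothetical and not part of {{ru-noun}} or {{ru-decl-noun}}. The rule as stated applies to masculine singular nouns and to plurals; feminine singular nouns in -а have their own accusative ending, which this sketch does not cover.

```python
def accusative(nominative, genitive, animate):
    """Pick the accusative form under the rule described in the thread.

    Animate nouns take the genitive form, inanimate nouns the nominative form.
    Assumes a masculine singular noun or a plural form (see note above).
    """
    return genitive if animate else nominative


# стол "table" is inanimate, брат "brother" is animate.
print(accusative("стол", "стола", animate=False))
print(accusative("брат", "брата", animate=True))
```

A noun with both an animate and an inanimate sense, such as the one Stephen mentions, would simply be passed through the function once per sense.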

Arabic Romanization

Hi, Arabic romanization guidelines have been published on Wiktionary:About_Arabic, after having been discussed on the corresponding talk page. They are based on the qalam system, which has been chosen because it is very easy to type on any Latin keyboard and uses transliterations that are already well known to most. Thanks for any comments and suggestions. --Beru7 20:25, 16 April 2009 (UTC)

I'm not crazy about the mixed use of lowercase and capital letters. I don't think ease of typing on any Latin-alphabet keyboard is really the best way to choose a transliteration system, but I don't do a lot of editing for Arabic. — [ ric ] opiaterein — 15:14, 17 April 2009 (UTC)
I do not like the idea at all, especially since it includes the use of numerals, which is highly confusing. We're already using a transliteration system for Arabic which appears to be widely accepted. -- Prince Kassad 15:37, 17 April 2009 (UTC)
Currently there is no system in use. There are several systems, mixed together and used inconsistently throughout the wiki. I don't think anything could be worse.
Now, many people do not like the numeral 3 for ع (the only numeral used), which is understandable. Remember it is already widely used on the internet, however, and that the most commonly used transliteration for ع, the backquote `, is not good at all on computer screens, as it is not very distinguishable from ' which is used to transliterate hamzas. Also, in Arabic, ع is a real consonant with no specific rule. It should have a symbol that has the same size as other consonants, not just a quote. ʔ would be about the only alternative.
Concerning the use of lower-case and upper-case: it is a common practice for transliterating Arabic. Karin C. Ryding's "Reference Grammar of Standard Arabic" (Cambridge University Press) uses such a system, for example, as do many other recent grammar books written in English. Very serious books. If they use it, so can we. --Beru7 16:30, 17 April 2009 (UTC)
Personally I'm more concerned about the use of t-h and s-h, which is too easily misinterpreted IMHO (also, do other consonant+h sequences occur?). I'd be more in favor of adopting a set of existing diacritics for sh/th and the emphatic letters, but I don't edit the area, so I'll defer to those who do. Circeus 17:56, 17 April 2009 (UTC)
The ALA-LC transliteration system uses ' where we use -. Only t, s and k are concerned. But using th, kh and sh makes the transliterations much more readable for English speakers. Cases where - will have to be used are rare.
By the way, there is already a consensus among those of us who are actively editing the Arabic area. --Beru7 18:34, 17 April 2009 (UTC)
Mixed-case is a common way to transliterate Arabic and I believe it’s familiar to anyone who studies Arabic. In this new system, there is only one numeral being used, and it is likewise a common way to represent that letter. I don’t like the hyphenated s-h usage either, but it makes clear to anyone that the letters are not a digraph, and if we don’t use diacritics or other IPA symbols, I don’t think of any better solution. It’s true that I’ve used our earlier system consistently, but some recent users have been insisting on doing it their own way, so that it’s quickly become a hodge-podge. We need to publish a strict standard, so now is the time to decide. Another thing about Beru7’s new system is that it is easy to type without EditTools, and now that we are using User:Conrad.Irwin’s js editor, the EditTools are not available. —Stephen 20:20, 17 April 2009 (UTC)
However, most readers of Wiktionary do not study Arabic, but are just casual users of Arabic wanting to look up some word. These will not understand the numeral (or the hyphenated letters), and will fare off better with the current pseudo-scientific system. -- Prince Kassad 20:59, 17 April 2009 (UTC)
I'm not familiar with these, but it seems to me that Arabic is not English, and has non-English sounds, so an anglophone will need some basic familiarization with any system. Making it “easy to read” is just making it easy to ignore the differences. The important factors are compatibility and standardization, particularly with other lexicographical and linguistic references.
Do the five modifications come from the usage in Ryding and the other serious books, or are they another example of Wiktionary refusing to work with the real world because we are so much smarter? Michael Z. 2009-04-17 21:18 z
Kassad, Michael is right: Arabic has 28 consonants, and nothing can be done about that. So any transliteration will have to use unusual signs or usage of letters.
Concerning the second part of your post, Michael, Ryding does not address the "sh" problem to my knowledge. In fact, almost every book I own that presents transliterated text uses a different system. It's not because each author is trying to outsmart the other, it is because each has his own requirements. Just like we have ours. --Beru7 22:01, 17 April 2009 (UTC)
That's fair, but can't we be compatible with one source, rather than having to defend our own innovations? Michael Z. 2009-04-18 23:22 z
I would have liked to, believe me. The first thing I tried was to look at all the existing systems. Most were designed for print, and the resulting confusion between hamza and 3ayn was unacceptable... Others are very systematic but do not take into account that they should be as easy as possible to read for most (IPA or the Buckwalter system, which uses * for dhel, v for th). That's how I ended up with the modified qalam, which also happens to be compatible with many online transliteration tools (yamli, yoolki, eiktub). I have written my own as well for those who are interested. --Beru7 12:14, 19 April 2009 (UTC)
Support. To me the hardest thing to accept was 3 for `ayn (ﻉ). I suggested using ` (backquote). However, 3 is used by online Arabic editors (Yamli, Google) and is the "standard" method for the letter input in chat, so it's well-known to Arabs. This is the only number Beru7 had in his proposal. Capital letters are used in Qalam, which is one of the standard Arabic transliteration methods. It also uses ` for ﻉ. I think this proposal is worth considering. Anatoli 07:58, 18 April 2009 (UTC)
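Editorial sketch: the point about hyphenated s-h versus the digraph sh can be made concrete with a small tokenizer. This is a minimal sketch under assumptions, not the published guideline. It only knows the three digraphs named in this thread (th, kh, sh), treats every hyphen purely as a separator, and the input strings are made-up letter sequences rather than real Arabic words.

```python
# Digraphs named in the discussion; the full proposed table is not reproduced here.
DIGRAPHS = ("th", "kh", "sh")


def split_letters(romanized):
    """Split a romanized string into the units that each stand for one letter.

    A hyphen marks that the neighbouring characters are separate letters,
    so "s-h" yields "s" and "h" while "sh" yields the single unit "sh".
    """
    letters = []
    i = 0
    while i < len(romanized):
        if romanized[i] == "-":
            i += 1  # the hyphen is only a separator
            continue
        pair = romanized[i:i + 2]
        if pair in DIGRAPHS:
            letters.append(pair)
            i += 2
        else:
            letters.append(romanized[i])
            i += 1
    return letters


print(split_letters("asha"))   # made-up string with the digraph
print(split_letters("as-ha"))  # same characters, hyphen forces two letters
```

Because the numeral 3 is an ordinary single character here, it needs no special handling, which is part of the typing convenience argued for above.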

Wiktionary:Russian transliteration

Would someone please have a look at Anatoli's recent edits and comment on the talk page? We can't seem to agree on the basic concepts. Michael Z. 2009-04-18 23:37 z

Plural of multi-word nouns

Currently, en-noun generates a brand new entry in multi-word nouns (e.g. book award - book awards). I've been wondering if the components should be separated in the plural form, e.g. book awards or at least a new option added to en-noun to display separate words if it makes sense? We are adding a lot of new plural entries when the components would be sufficient. Panda10 13:37, 19 April 2009 (UTC)

Using Special:PrefixIndex to automatically list prefix derived terms

I just made an edit to self- which got me thinking about whether it would be a good idea to implement that as a standard: we could just transclude Special:PrefixIndex into the list of derived terms for prefixes, instead of building that list manually. Additional red-link entries could be added by hand (as I left self-belief). In fact we could create a template such as {{prefix derived terms}} (or whatever name you prefer) with {{Special:PrefixIndex/{{PAGENAME}}}} and then add it to pages. This would make those lists more complete and even self-updating! What do you think? --Waldir 15:42, 20 April 2009 (UTC)

  • No - you will get words from every language mixed together (you just got lucky with "self-"). SemperBlotto 15:45, 20 April 2009 (UTC)
Oh, you're right. I wonder if subst:Special:PrefixIndex will work there... otherwise I'll copy and paste the list. --Waldir 15:57, 20 April 2009 (UTC)
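Editorial sketch: SemperBlotto's objection follows from the fact that a prefix listing matches page titles only and knows nothing about the language sections on those pages. The snippet below imitates that with a plain string-prefix filter. The `SAMPLE_PAGES` data is a small hand-made sample, and the function names are hypothetical; none of this calls MediaWiki.

```python
# Hand-made sample: page title -> language sections assumed to be on that page.
SAMPLE_PAGES = {
    "abbey": {"English"},
    "abbiamo": {"Italian"},
    "aber": {"German"},
    "self-aware": {"English"},
    "self-made": {"English"},
}


def prefix_index(pages, prefix):
    """Return the titles starting with the prefix, as a title-only listing would."""
    return sorted(title for title in pages if title.startswith(prefix))


def languages_matched(pages, prefix):
    """Collect every language that appears on the matched pages."""
    languages = set()
    for title in prefix_index(pages, prefix):
        languages |= pages[title]
    return languages


print(prefix_index(SAMPLE_PAGES, "self-"), languages_matched(SAMPLE_PAGES, "self-"))
print(prefix_index(SAMPLE_PAGES, "ab"), languages_matched(SAMPLE_PAGES, "ab"))
```

In this sample "self-" happens to match only English pages, while "ab" pulls in three languages at once, which is the mixing described above.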

on using the Wikisaurus

I've been working on compiling a list of names for birds (i.e. cuckoo, duck, wren, nightjar etc.), and decided that most likely Wikisaurus:bird was the best place to foist this on instead of making a duplicate Appendix:Names of birds.

However, I am noticing that there are MANY MANY more names than I had first thought. Using w:List of birds to avoid forgetting anything, I've just finished cuculiformes and am already at ~120 names, a number that is bound to rise fast (and that's not counting 50+ names moved to Wikisaurus:hummingbird). I'm not sure yet whether to move out more stuff (i.e. Wikisaurus:fowl and Wikisaurus:bird of prey, almost certainly Wikisaurus:songbird) or reduce the list to the well-known/most generic names and move the rest (e.g. brush-turkey, go-away-bird, tragopan) to an appendix.

It's worth noting here that there is some duplication between Wikisaurus and the entries themselves in detailing these semantic networks: hyponyms, hypernyms and meronyms have a tendency to be listed in "related terms" sections (i.e. marteau, shark). Circeus 19:47, 20 April 2009 (UTC)

To me, keeping some bird names in more specific entries such as Wikisaurus:hummingbird instead of in Wikisaurus:bird seems to be a useful way to prevent having a large, difficult to overview list of exotic bird names in Wikisaurus:bird.
A partial duplication of the mainspace content in Wikisaurus looks okay to me, presenting no problem. --Dan Polansky 10:11, 21 April 2009 (UTC)
What you're describing here sounds more like categories to me. Do hyponyms make sense in Wikisaurus? I wouldn't go to a thesaurus to find different birds of prey. Equivalently, I wouldn't expect to see eagle and vulture in the same thesaurus entry. If some thesauruses do that, I'm not sure it's a strategy that can be completely unfurled. Wouldn't it create a gigantic mess, or am I being too pessimistic? Maybe it could be done, but it would mix a sort of picture dictionary hierarchy into something for which it wasn't intended. That can be changed of course, the intention bit, as long as you realize that you're deliberately mixing the two. DAVilla 07:35, 28 April 2009 (UTC)
Well, look at Wikisaurus:abode, to which I ultimately added very little. Circeus 12:46, 28 April 2009 (UTC)
Actually, this warrants a proper discussion. I'll keep the Wikisaurus:abode link to note that although I did add a few, it was not in its original form, very different from the original "bird" entry. Compare also Wikisaurus:building or Wikisaurus:creature.
Ultimately, my editing is basically going down to the logical conclusion with regard to inclusion of hyponyms: it is ultimately not fitting to list only a few (though I would not be entirely opposed to keeping only the broadest ones and referring to an appendix for the rest).
When you say "What you're describing here sounds more like categories to me.", you're harkening back to the classic list v. category debate. Many arguments from that page are appropriate here, two that immediately come to mind are that categories are much harder to establish and maintain (you have to edit dozens of articles), and they cannot contain redlinks. In our case, Wikisaurus allows better subdivisions (i.e. a sub-splitting for hummingbirds, passerine, birds of prey and fowl) that would probably be considered inappropriate in category space, whereas the categories would still have the entire, unstructured list of names (i.e. treating waterfowl and ostrich equally). In reverse, as is obvious here, I am kinda running into a problem with having too much information on the page, although the splitting away of some groups I believe ultimately makes for a better thesaurus. 23:36, 28 April 2009 (UTC)
When this happened with derived terms, we decided not to list derived terms of derived terms, e.g. unenthusiastic and enthusiastically but not unenthusiastically. Ultimately this did very little: all of the longest pages on Wiktionary have a laundry list of derived terms, and there's probably no way around that. In your case, however, it's the ideal solution. Don't list hyponyms of hyponyms. Is this what you ultimately settled upon anyway?
Yeah, I did mean categories in the Wiktionary sense, but you could also take it to be the regular English meaning, as in hierarchies of lists or even an outright picture dictionary. DAVilla 15:20, 30 April 2009 (UTC)
On whether hyponyms belong to a thesaurus: I understand one of the purposes of a thesaurus to be to help people find words they can't recall. If one recalls a hypernym of the word one was looking for, a thesaurus that has hyponyms helps him while one that has only synonyms does not. Like, "How only was that rodent called?" In a similar way, if one recalls a coordinate term, more likely scenario, a thesaurus with hyponyms helps: coordinate term of, say, "apple" is a hyponym of one of the hypernyms of "apple". Like, "How only was that thing called, similar to apple but not apple? I see, pear, found in the hypernym of apple--Wikisaurus:fruit" or even "..Wikisaurus:edible fruit." Admittedly, categories do that job. However, there is no clearly defined relationship between a category and its members, unlike "hyponymy" in Wikisaurus; while Category:Fish contains fish, Category:Physics does not contain items of the class "physic". Also, there are further semantic relations such as meronymy that don't fit well into categories. As Circeus mentioned, contrasted to categories, Wikisaurus enables much finer hyponymy documentation, one that, in the category namespace, would probably be considered an overcategorization.
The guideline "don't include hyponyms of hyponyms" is a useful heuristic, but should IMHO be taken as that: a heuristic. Like, I have included hypernyms of hypernyms in Wikisaurus:fish by including "vertebrate" and "animal", as it did no harm. The point is to build helpful pages that help people recall words and browse the network of words and concepts by their semantic relations, without violating the semantic relations, but with fluid rules for the depth of unfolding of the relations, by which I mean which degree of A of A of A of ... we include under the relationship heads, where A stands for hyponym, meronymy, hypernym and holonym. --Dan Polansky 16:31, 30 April 2009 (UTC)
An afterthought: hyponymy is included in many entries of Roget's 1911, though not fully. Consider:
“Animal” in Roget's Thesaurus, T. Y. Crowell Co., 1911.
which has "[major divisions of animals] mammal, bird, reptile, amphibian, fish, crustacean, shellfish, mollusk, worm, insect, arthropod, microbe" and further selected names of species. Of course, Roget did not have to set up an inclusion policy for other people to follow: he formed a possibly unarticulated policy in his mind while compiling his thesaurus. --Dan Polansky 16:54, 30 April 2009 (UTC)

language names we don't use

I've started Wiktionary:Language names as a list of language names we don't use. Such a list is useful in my opinion, but: (a) I'm not sure whether it will be overly long; if not, then (b) am not sure it needs its own page (rather than being part of Language considerations or something); and if it is to be its own page, then (c) that's not a great title. Edits and opinions are hereby requested.—msh210 21:55, 20 April 2009 (UTC)

What we need is a master list of level 2 "language" names that are included in Wiktionary, which someone had suggested not long ago, and also its completion with a list of recognized dialect names for each language. The latter could probably not be well maintained in a centralized location. Rather, it should be information included on each About: page for a language. In the case of widely used languages like English and Spanish, the list of dialect names would probably be so long as to constitute a subpage.
The list you created would ultimately include a compilation of all that, far too long to be of any use. Removing all the dialectical information leaves something much more interesting, synonyms that refer to the same language as well as broader or ambiguous names that languages are sometimes called. Terms like "New Latin", "Modern Hebrew", and "American English" are better left out, along with basically anything that's a subset of a level 2. They would simply grow the list to an unreasonable size. DAVilla 07:16, 28 April 2009 (UTC)
Good idea.—msh210 16:11, 5 May 2009 (UTC)
If someone wants the current list of level 2 headers, it is at User:Conrad.Irwin/languages. Auto-Format also has a list that includes language codes. User:AutoFormat/Languages. Conrad.Irwin 16:22, 5 May 2009 (UTC)

Implications of WT:RFV#remis

Hi all. Please note this RfV-sense discussion; it is for the sense “[t]he generally accepted (mis)spelling of the term ‘penis’ when input on a mobile device with T9 text prediction” of the letter-combination remis. Ya gotta love those deluded, bowdlerising T9 lexicographers, since remis is indeed what I got when I checked to see what I would get when I tried to key penis into my phone. This is a peculiar class of misspelling, and it seems to me beyond pointless to give them any kind of recognition herein. Apart from the absurdity that there can be such a thing as a “generally accepted misspelling”, if we accept remis, then we must also accept shiv (shit), dial (dick), dual (fuck), collock (bollock(s)), captap (bastard), aunt (cunt), yank (wank), and so on. Of course, they’re just the insults and vulgarisms — think of the proliferation of “text-os” we’d get from trying to key into our phones the various words herein marked {{rare}}, {{archaic}}, {{obsolete}}, and the like; my recent contributions would give us aimsiz (chorizo), whichrneter (whichsoever), syno (syon), synt (syoun), photo (sioun), fyi (ezh), and diabol (diabologue). Such a list of accidental contranyms, anagrams, truncations, and garbled forms are surely not desirable. Shall we delete these with extreme præjudice?  (u):Raifʻhār (t):Doremítzwr﴿ 19:40, 21 April 2009 (UTC)

I see value in keeping them iff, for example, such a word has become {{Internet slang}} for "penis", in use on Usenet or even in print, with etymology from T9 but with use even where T9 is not employed. Aside from that, RFV should serve to weed them out, though I certainly wouldn't complain if someone speedily deleted such a word after checking for use.—msh210 20:18, 21 April 2009 (UTC)
Hmm. I suppose. The first page of hits yielded by Google Groups Search throws up uses in the chess sense, as a sort of taxi, and as a misspelling of remix. I’d expect that remis would be far more common as a misspelling of remix, remiss, &c. than as a text-o for penis. (u):Raifʻhār (t):Doremítzwr﴿ 20:33, 21 April 2009 (UTC)
I don't know if this is labeled properly, but I think we can cross that bridge when we get there. At the moment I have absolutely no expectation that the term will be cited in durably archived media. If anyone thinks this is worth having even when it cannot be cited, which might be a bit controversial, then I would suggest a heading similar to anagrams. DAVilla 06:32, 28 April 2009 (UTC)
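The T9 collisions discussed above can be checked mechanically. The following is a minimal sketch, assuming the standard phone keypad letter groups (2 = abc through 9 = wxyz); the function names `t9_keys` and `collides` are made up for illustration and are not part of any Wiktionary tool. Two words are "text-os" of each other when they map to the same digit sequence.

```python
# Standard phone keypad letter groups, as used by T9-style predictive input.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
# Invert the keypad: each letter maps to the digit that types it.
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def t9_keys(word: str) -> str:
    """Return the digit sequence keyed to enter `word` (plain a-z letters only)."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def collides(a: str, b: str) -> bool:
    """True when both words are entered with exactly the same key presses."""
    return t9_keys(a) == t9_keys(b)


# Pairs taken from the discussion above.
for typed, intended in [("remis", "penis"), ("syno", "syon"), ("fyi", "ezh")]:
    print(typed, intended, t9_keys(typed), collides(typed, intended))
```

Truncated forms such as aimsiz (chorizo) do not collide outright; their key sequence is a prefix of the intended word's sequence. Words containing characters outside a-z would raise a `KeyError` in this sketch.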

Usage notes and verbs

I know this sounds weird. I'm not a native English speaker and I often look up words to fully understand their meanings. The "Usage notes" sections are really helpful. I know Wiktionary is a dictionary rather than an English language textbook, but most people consult it not only to understand a word, but also to learn how to use that word in a practical context.

This is why usage notes, like those in who or which, are important. You can actually learn how to use the word in the right way. What I found confusing is the use of prepositions after a verb. This is not something made clear on Wiktionary. I don't know if it was your intention (as a community) to do so, but it would be really nice to have some explanation about that. We know that "talk with" and "talk to" are different. The difference might not be widely understood, or maybe there is a common form that is used instead of the grammatically correct form (such as "if I was" vs. "if I were").

Other examples are the verbs to speak, which can be used with the prepositions "with", "of" or "to" (speak for is considered a phrasal verb, as I've just read), to connect ("with" or "to"), to think ("of" or "about"), etc.

Since it's a bit confusing and the examples don't really help, can we readers have a little more attention on this topic? Could the prepositions used with a verb be made a bit clearer? It would be really nice. I want to discuss this topic and find a solution with you, rather than complain or tell you what to focus on. I don't really have any ideas about it, and I think a section just for prepositional usage is too much. Thanks for your time! Exe --125.24.188.33 20:27, 22 April 2009 (UTC)

I agree that the usage notes section of verbs should indicate what prepositions the verb takes (or that it doesn't take any, like eat). It should also indicate, if there are two objects, what each one is (e.g., what "I gave my son a cat" means).—msh210 22:13, 22 April 2009 (UTC)
Unfortunately, there are few generalizations about verb/preposition combinations that hold. For example, you said that "eat" takes no preposition, but there are instances where it does, with the KJV translation of Genesis being an oft-quoted example ("eat of the tree"; "thou shalt not eat of it"). Admittedly, such constructions are rare in modern English, and sound archaic, but they are still used. And there are adverbial prepositional phrases that can be used with eat, such as "eat in the kitchen", "eat on the deck", "eat off the floor", etc. --EncycloPetey 21:05, 26 April 2009 (UTC)
This is also in part what examples/quotes are for. Circeus 02:47, 23 April 2009 (UTC)
I wonder if examples/quotes can feasibly do a thorough job of describing the various nuances of words. In my opinion, this need is rendered very difficult by one of the primary failings of our current format, the inability to add lots of information without bogging down the entry. What we really, really need is a robust, powerful sense template, where, in addition to a definition, all sorts of other sense specific information can be added, such as detailed usage notes, as well as lots of stuff we currently have unintuitively detached, like synonyms and translations. Ideally, all the sense specific stuff would be hidden from view initially, with a few floating tabs for opening it up, if the user wanted to dig further. This would allow all information for a specific sense to be clustered in one spot. It would also allow the initial page to be far more intelligible to the average user, while allowing us a lot more space for adding information for the in-depth user. Anyone with some JS skills feel like giving this a shot? -Atelaes λάλει ἐμοί 03:16, 23 April 2009 (UTC)
Old news :p User:Conrad.Irwin/parser.js (also on WT:PREFS) has been doing this (albeit very crudely and buggily) since 2007. I can't promise to be able to improve it at any time in the near future. Conrad.Irwin 09:15, 23 April 2009 (UTC)
Ah, yes. Come to think of it, I think I've tried this before, and forgot about it. The layout needs some work, but that's exactly what we need to be doing. This is the future of Wiktionary. -Atelaes λάλει ἐμοί 09:26, 23 April 2009 (UTC)
This information could, and should, be put under "=== Usage notes===". Either in prose, or if someone comes up with a clear format, using a template. Conrad.Irwin 09:15, 23 April 2009 (UTC)
speak
of to with
It's true, it should be under "Usage notes", but that means you will have that section on several pages about verbs. A little template (like the example) and more attention to the examples given should be fine. Exe --125.24.217.199 11:38, 23 April 2009 (UTC)
Prepositions are difficult to pin down. I am always thinking about how best we could deal with them, but there is no one clear solution that I can see so far. Taking your example. Speak + prep. The prepositions in these examples have their own meanings, and they can transfer those meanings to other verbs equally well eg with "speak", "talk", "throw", "give", "walk" etc. + "to", "to" has the meaning of "directed towards". "with" means "together", and so is unlikely to collocate with "throw" and "give", but it does with "speak", "talk" and "walk". If we were to fill the prepositional possibilities of "speak", it would be a huge entry, with most of it being nothing more than a repetition of the meaning of each preposition. (to, with, against, at, over, about, around, into, etc.) I think prepositional collocations are only really useful when a verb consistently uses just one or two particular prepositions in the majority of written examples, and yet is not a phrasal verb. "Return" for example collocates with "to" and "from" in nearly all cases, but does not form a phrasal verb with either. -- ALGRIF talk 18:11, 23 April 2009 (UTC)
  • I would like to present an example to see what would be the best way to deal with this problem. I'm looking at smash verb. "smash into" and "smash through" are not considered to be phrasal verbs, but they are very common collocates. So what do people think would be a clear and consistent way to put the ideas The car smashed into the wall. The police smashed into the room. and The builders smashed through the wall. into the entry at "smash"? -- ALGRIF talk 15:22, 24 April 2009 (UTC)

I had a look at speak directly followed by a preposition. I found 83 possibilities in the Corpus of Contemporary American English (I didn't look at the British National Corpus). Of those, I looked carefully at the first 20 or so.

word      count   PMW     Object of the preposition
to        17218   44.72   listener, topic
of        9433    24.50   topic, nothing to speak of
with      7814    20.30   listener
for       4484    11.65   source
in        4431    11.51   tongues, terms of
about     3185    8.27    topic
on        1996    5.18    topic
up        1878    4.88    (intransitive)
at        1794    4.66    listener
through   985     2.56    (intransitive)
from      933     2.42    Not complement
by        483     1.25    Not complement
into      435     1.13    Not complement
like      297     0.77    Not complement
as        278     0.72    to + topic
before    243     0.63    Not complement
against   209     0.54    topic, source
without   190     0.49    Not complement
out       120     0.31    (intransitive)
during    101     0.26    Not complement
over      93      0.24    noise
after     90      0.23    topic
back      46      0.12    (intransitive)

This may have omissions. I distinguished between complements and adjuncts by trying to front them. So, for example, *Of the problems I'm having I spoke is not acceptable, so [of + topic] is a complement and therefore worth noting. But From the stage, I spoke of the problems is fine, so I take [from + object] to be an adjunct and not worth noting in the dictionary.

This kind of analysis is very important, but quite difficult.--Brett 14:59, 28 April 2009 (UTC)
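For readers unfamiliar with the PMW column, here is a minimal sketch of how a per-million-words figure is derived from a raw count. The corpus size is not stated in the post; the value of roughly 385 million words used below is an assumption inferred from the table's own counts and PMW values, and the name `per_million` is made up for illustration.

```python
def per_million(count: int, corpus_words: int) -> float:
    """Normalize a raw hit count to occurrences per million words (PMW)."""
    return round(count / corpus_words * 1_000_000, 2)


# Assumption: corpus size inferred from the table (e.g. 17218 / 44.72 is about 385 million).
ASSUMED_CORPUS_WORDS = 385_000_000

# Raw counts for "speak" + preposition, copied from the first rows of the table.
counts = {"to": 17218, "of": 9433, "with": 7814, "for": 4484}

for prep, n in counts.items():
    print(f"speak {prep}: {n} hits, {per_million(n, ASSUMED_CORPUS_WORDS)} PMW")
```

Normalizing this way is what would make the figures comparable with a corpus of a different size, such as the British National Corpus mentioned above.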

Brett: Superficially (all I'm capable of), it doesn't seem so much difficult as time-consuming, at least to get the "easy" 80%. What made it difficult? I wonder how our users could see the interest and value in it.
Algrif: Multiple usage examples and/or citations? We already have people complaining about our long entries, but also about insufficient usage examples. More or wordier definitions seem definitely the wrong way to go. Perhaps WT:CFI needs to be amended to explicitly allow for the inclusion of some verb-preposition collocations that would be too debatable under current criteria. Perhaps such entries would appear under their own "rel" bar at the verb. Our longer verb entries often seem rather hard to use and hiving off some material to separate entries might help keep the situation from getting worse. DCDuring TALK 16:33, 28 April 2009 (UTC)
It's both difficult and time consuming. It wouldn't take much to train people to come up with the lists of prepositions, but looking through the results and recognizing the various possibilities, and then thinking through carefully whether they are adjuncts or complements, takes a level of language awareness that, frankly speaking, most of the population simply doesn't have. But if we're serious about this, I see no alternative. I'm not aware of any available source for the data. But then there is the prohibition about own research.--Brett 00:09, 29 April 2009 (UTC)
This seems like a kind of attestation/WT:CFI "research" that could lead to vastly better entries for common verbs, some of which seem stuck in 1913. I would expect big improvements of all the senses, whether or not used with prepositions. If our methods are documented and objective and confined to validation/attestation, where is the problem? DCDuring TALK 02:13, 29 April 2009 (UTC)
Brett, Wiktionary is not Wikipedia and does not have a NOR policy, only the Criteria for inclusion. Writing a wholly new, multilingual dictionary without original research would be impossible, as it would be almost impossible to write definitions of new terms (cf. splog or link spam). Circeus 04:57, 29 April 2009 (UTC)
I'm glad to learn that there's no NOR policy here. I'd had it thrown at me and not taken the time to check.
Of course DCDuring is right. Complements go beyond prepositions and objects. Just for English verbs, the possibilities include: to-infinitive (e.g., want to go), bare infinitive (e.g., make him go), various kinds of content clause (e.g., wondered what the problem was, said (that) it was difficult, it worries me that they're late, understand what a great chance it is), present participles (e.g., keep making progress), and locative complements (e.g., put it here).--Brett 13:05, 29 April 2009 (UTC)
Longman's DCE is exemplary in this regard. I was just thinking that any focused attention by our better editors on those core entries will likely lead to lots of improvement for all aspects of these entries, especially the rarely improved definitions. This is the first programmatic effort I'm aware of that would get at these definitions. I don't know how often users use Wikt for these words, though. DCDuring TALK 14:10, 29 April 2009 (UTC)
My first thought is to suggest we try this for the 20 or 50 most common verbs in English, but then we run across the problem of coordinating prepositional usage with definitional senses. Brett's analysis is a good start, but it doesn't take into consideration (yet) which prepositions are used with which senses. That issue adds another very difficult, but firmly relevant and important, level to be done in an analysis like this. --EncycloPetey 20:48, 3 May 2009 (UTC)

The Centre for Corpus Research at the University of Birmingham has made available online the book Grammar Patterns 1: Verbs, originally published in 1996 by Collins Cobuild and now out of print. It can be found here--Brett 14:07, 15 May 2009 (UTC)

Sign language entry links

Currently, most links to our sign language entries show just the name of the target page:

* [[American Sign Language]]: [[OpenB@Chin-PalmBack-OpenB@CenterChesthigh-PalmUp OpenB@Palm-PalmUp-OpenB@CenterChesthigh-PalmUp]]

I just discovered that our Mediawiki instance allows image links. Using such syntax, we can link to sign language entries from images.

* [[American Sign Language]]: [[Image:ASL OpenB@Palm-PalmUp-OpenB@CenterChesthigh-PalmUp.jpg|35px|link=OpenB@Chin-PalmBack-OpenB@CenterChesthigh-PalmUp OpenB@Palm-PalmUp-OpenB@CenterChesthigh-PalmUp]]

I think that's a more reader-friendly format, but I'm not sure whether images in translation tables will make for a layout that's too jagged or otherwise jarring. Comments? —Rod (A. Smith) 16:59, 23 April 2009 (UTC)

I'm for including pictures. (On a computer that won't load the BP, I can't see the layout of the format Rod proposed. But I like the format at two#Translations.)—msh210 17:09, 23 April 2009 (UTC)
{{t-image}} allows images for languages that aren't yet in unicode already, so there'd be no problem with including images. It would rely on the images already being present - though I suppose you could always fall back to the previous system if necessary. Conrad.Irwin 17:16, 23 April 2009 (UTC)
Using {{t-image}} results in the following:
* [[American Sign Language]]: {{t-image|ase|ASL OpenB@Palm-PalmUp-OpenB@CenterChesthigh-PalmUp.jpg|35px|OpenB@Chin-PalmBack-OpenB@CenterChesthigh-PalmUp OpenB@Palm-PalmUp-OpenB@CenterChesthigh-PalmUp}}
The transliteration appears after the image. Is that desirable? If so, I think the transliteration should also link to the target entry. If the community agrees, I'll edit {{t-image}} to do so. Also, I'm not sure how big to make the image. 35px (as above) seems about the smallest readable size. Is the large vertical space surrounding the image OK? —Rod (A. Smith) 17:53, 23 April 2009 (UTC)

capitalization of proverbs' pagetitles

The CFI say, when discussing pagetitles:

===Proverbs===
Proverbs that are whole sentences should begin with a capital letter. For example: You can't judge a book by its cover.

The problem with this is that not only that particular proverb but actually all English proverbs are lowercase-initial-letter. Clearly, current practice does not match the CFI. Two people recently tried to edit the CFI because of this inconsistency (diff, diff) and were reverted for different reasons (diff, diff). But something should be done. I propose just that the above text be removed from the CFI. Any opposition?—msh210 17:49, 23 April 2009 (UTC)

I never liked that rule myself, so I support your proposal. We already have plenty of sentence-like utterances that aren't proverbs and thus aren't capitalised (e.g. what's cooking); I don't see the point of a distinction. Equinox 22:03, 23 April 2009 (UTC)
Can the guideline suggest that they have a small initial then? No point in having random initial caps in these entries, or near-duplicates differing by sentence case. Michael Z. 2009-04-23 22:37 z
No objection from me.—msh210 23:33, 23 April 2009 (UTC)
I support this; it makes sense to me to align the guideline with the current practice. --Dan Polansky 07:24, 24 April 2009 (UTC)
I agree too. Even when proverbs are sentences by themselves, they may be used inside sentences (e.g. after because). Lmaltier 11:14, 24 April 2009 (UTC)
You have my agreement, too. I never liked, nor understood the rationale of this CFI. It can lead to double entries, as you say. -- ALGRIF talk 12:40, 24 April 2009 (UTC)
Support for changing CFI to say "lowercase letter" instead of "capital letter". No point in contradicting ourselves. --Jackofclubs 06:01, 25 April 2009 (UTC)
Support as well, as these aren't going to be followed by a period, and can come as fragments of a larger sentence (e.g. "Even though you can't judge a book by its cover, the gaudy cover of this book made me apprehensive.") bd2412 T 06:07, 25 April 2009 (UTC)
Support, although I always feel better if we hash out exactly what the change will say before making the change. This is certainly one case where we haven't explicitly articulated our norms properly. --EncycloPetey 20:59, 26 April 2009 (UTC)
How about Proverbs should begin with a lowercase letter. For example: you can't judge a book by its cover.? --Jackofclubs 08:48, 27 April 2009 (UTC)
What about God helps those who help themselves? I would prefer : Usual Wiktionary capitalization rules also apply to proverbs (proverbs can come as fragments of a larger sentence). For example: you can't judge a book by its cover. Lmaltier 15:30, 27 April 2009 (UTC)

Okay, lemme restart this conversation, then: The proposal is changing

===Proverbs===
Proverbs that are whole sentences should begin with a capital letter. For example: [[You can't judge a book by its cover]].

in the CFI to

===Proverbs===
Even proverbs that are whole sentences should begin with a lowercase letter. For example: [[you can't judge a book by its cover]]. (Exception: A proverb like [[God helps those who help themselves]] starts with a proper noun, so is capitalized.)

. Any objection?—msh210 15:49, 27 April 2009 (UTC)

  • Yuck, no offence but that sounds rather awkward. I prefer Lmaltier's "Usual Wiktionary capitalization rules also apply to proverbs (proverbs can come as fragments of a larger sentence). For example: you can't judge a book by its cover." Ideally there would be a page Wiktionary:Proverbs too, to tell users what a proverb actually is, and what is not a proverb, with style guide. --Jackofclubs 16:04, 27 April 2009 (UTC)

How about “Don't capitalize proverbs as sentences”? There's no need to mention every reason that you would capitalize a word as exceptions. Michael Z. 2009-04-27 18:34 z

What about:

===Proverbs===
Proverb entries should begin with a lowercase letter, regardless of whether they are whole sentences. An example: you can't judge a book by its cover.

A benefit over the "don't" proposal is that it heeds Strunk's "specify positively". The proposal only states the rule, not the reasoning behind it. It also explicitly discards one item that someone could see as speaking against the rule, namely that proverbs usually are whole sentences, making it clear that the rule was created with the awareness of the item. Other proposals are okay with me, though. The semantics is clear from all the proposals, even if style varies. --Dan Polansky 19:08, 27 April 2009 (UTC)

I agree that these all work, but I don't mind shaking out the best wording.
Would Strunk apply positiveness this way to a prohibition? The problem is that not all proverbs will begin with lowercase letters (e.g. “God...,” above). We're not prohibiting editors from capitalizing the first word of a proverb – we're only advising them not to capitalize it as the start of a sentence. Michael Z. 2009-04-27 20:09 z
In general, I'd think Strunk would go for a positive expression even with a prohibition, by choosing "avoid" in preference to "don't". But to the specific point: I don't know about Strunk, but, to me, "write in lowercase" seems to do just as well as "don't write in uppercase". However, I see your point that has led you to choose "Don't capitalize proverbs as sentences": your statement is more accurate, as it caters for such cases as "God...". I can fix my proposal by adding to the first sentence ", unless the first word of the proverb is capitalized on its own", or something to that effect. That makes my proposed statement much less charming, though. In any case, an example should be added of Rome wasn't built in a day or another proverb that should start with a capital letter. Hence my second take:
===Proverbs===
Proverb entries should begin with a lowercase letter, regardless of whether they are whole sentences, unless the first word of the proverb is capitalized on its own. Examples: you can't judge a book by its cover, Rome wasn't built in a day.
--Dan Polansky 21:02, 27 April 2009 (UTC)
Strunkist: “avoid capitalizing proverbs as sentences.” Michael Z. 2009-04-27 21:29 z
===Proverbs===
Proverb entries should begin with a lowercase letter, regardless of whether they are whole sentences, unless the first word of the proverb is capitalized on its own. Examples: you can't judge a book by its cover, Rome wasn't built in a day.

(Sigh.) I was hoping this could be uncontroversial, to avoid voting. Can we agree on

===Proverbs===
Proverb entries should begin with a lowercase letter, regardless of whether they are whole sentences, unless the first word of the proverb is capitalized on its own. Examples: [[you can't judge a book by its cover]], [[Rome wasn't built in a day]].

then?—msh210 20:31, 28 April 2009 (UTC)

My copyedit:

A proverb entry's title begins with a lowercase letter, whether it is a full sentence or not. The first word may still be capitalized on its own:

 Michael Z. 2009-04-29 16:45 z

Msh210, I support this proposal, of course ;). --Dan Polansky 10:06, 30 April 2009 (UTC)
This looks good to me. Equinox 21:42, 1 May 2009 (UTC)
Six of one, half a dozen of the other. DAVilla 20:12, 3 May 2009 (UTC)

Thank you, folks. Done.—msh210 16:10, 5 May 2009 (UTC)
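The agreed rule can be sketched as a small function. This is only an illustration of the wording above, not an existing Wiktionary tool: the name `proverb_title` is invented, and the caller must supply the set of words that are "capitalized on their own", since the rule itself gives no way to detect them. Dropping a trailing period follows the remark earlier in the thread that these titles are not followed by one.

```python
def proverb_title(proverb: str, capitalized_words: frozenset) -> str:
    """Return an entry title for `proverb` under the agreed capitalization rule.

    The first letter is lowercased unless the first word is one that is
    capitalized on its own (a proper noun such as "Rome" or "God").
    """
    text = proverb.strip().rstrip(".")  # entry titles carry no final period
    if not text:
        return text
    first_word = text.split()[0]
    if first_word in capitalized_words:
        return text  # keep the capital: it belongs to the word, not the sentence
    return text[0].lower() + text[1:]


# Hypothetical list of words that keep their capital; a real one would be much longer.
ALWAYS_CAPITALIZED = frozenset({"God", "Rome"})

for sentence in [
    "You can't judge a book by its cover.",
    "Rome wasn't built in a day.",
    "God helps those who help themselves.",
]:
    print(proverb_title(sentence, ALWAYS_CAPITALIZED))
```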

Logged-in users editing their own user pages

Since this is always permissible, could we automate the removal of the "red exclamation mark" designating unpatrolledness? Equinox 21:55, 25 April 2009 (UTC)

I'd say no, 'cause as an admin (primarily at no.wiki) I've often seen users abusing their user pages, adding offensive content or blatant vandalism. We have a policy at Wiktionary:Usernames and user pages, and we should make sure it is not violated. --Eivind (t) 20:48, 26 April 2009 (UTC)
We already have this. Just enable the WT:PREF "Patrol in enhanced mode" or w/e. If there's someone with that pref logged in, it will be patrolled by javascript. Conrad.Irwin 21:07, 26 April 2009 (UTC)
Then we should disable that. We should not give carte blanche to all users to make patrolled edits to their own user pages, for the reasons EivindJ has given. The edits should be patrolled if the user in question isn't "whitelisted". --EncycloPetey 21:12, 26 April 2009 (UTC)
Concur with EJ and EP. Given the fact that user pages will be rarely examined otherwise if the user makes no other edits, it is imperative that user pages not be given a pass. Indeed, it might be worthwhile to disable autopatrol of sub pages in userspace even for whitelisted users if that be technically doable. — Carolina wren discussió 04:34, 27 April 2009 (UTC)
Given that we already have a massive (a couple of hundred every day) backlog of unpatrolled edits, I'd rather we actually did some more guessing of edits that are "likely" to be constructive and, if not auto-patrol them too, give them a different colour exclamation mark so that we can patrol more easily the edits that need looking at. Yes people might be making a mess on their userpage, but (in my experience anyway) the most common thing is a link to Wikipedia, followed perhaps by a random spiel about oneself, then you get the people who create adverts (which SemperBlotto then goes through and deletes in batches). Yes we might miss a few bad edits, but we already miss a few anyway. I did recently, as an experiment, try to keep a day with no unpatrolled edits. I was able to do this (with others still patrolling as much as usual, I assume), but it took maybe two or three hours of extra time throughout the day. Conrad.Irwin 09:24, 27 April 2009 (UTC)
As Conrad notes, SB has a script that marks for deletion userpages of users whose only edit is their userpage (or something like that, anyway). So I see no harm in patrolling these. Moreover, doing so lessens the number of unpatrolled edits that patrollers need to wade through. Is there anyone who actually patrols (by hand, not just by JavaScript) who thinks these should not be marked patrolled?—msh210 15:44, 27 April 2009 (UTC)
Yes, I do. --EncycloPetey 03:05, 28 April 2009 (UTC)

Spanish

Two issues I have off the top of my head. First one is simple:

Why? I've come up with some pretty good reasons in the past, but I've forgotten a lot of them. The main reason is that Spanish nouns have, at the very most, 3 non-lemma forms - and that's only because some nouns that describe humans have masculine and feminine forms. To compare that to languages with heavy inflection is silly. Look at Armenian եղեռն, Hungarian vizsla, Lithuanian brolis, Russian, Finnish... These heavily-inflected languages obviously need the broad 'x noun forms' categories, while Spanish, which really only has "nominative plural" forms, definitely doesn't.

Spanish adjective forms makes more sense, because Spanish adjectives pretty consistently have specific m/f forms, which the majority of Spanish nouns do not. Even 'Category:Spanish plurals' was better for Spanish nouns than this noun form business. — [ R I C ] opiaterein — 15:38, 27 April 2009 (UTC)

What's a femmie? --Jackofclubs 16:09, 27 April 2009 (UTC)
Opposite of a butchie. — [ R I C ] opiaterein — 16:28, 27 April 2009 (UTC)

Is the category for Spanish terms of Spain or of Europe? If the adjective is too awkward, then use the attributive noun: Category:Spain Spanish, but don't say something else altogether. Michael Z. 2009-04-27 18:31 z

Yeah, let's name all of our multi-country language categories like that :) French French/France French, England English/English English, Dutch Dutch. There's a certain level of bias that can be read into that. The most common being "real" French, or "real" English, or whatever. The category is for Spanish terms used in Spain, which is European Spanish. You could call it Castilian Spanish, which not everyone would recognize. That'd be like the Portuguese category being called Lusitanian Portuguese. — [ R I C ] opiaterein — 21:52, 27 April 2009 (UTC)

A plea to our sysops

EVERY time that I log on to Wiktionary, I go to Recent Changes and patrol for vandalism and stupidity back from the time that I last logged out. If I am away for an extended period (two days - busted cable modem) this can take a very long time. It would help me greatly if other sysops did the same thing (rather than just patrolling while you are logged on). Cheers. SemperBlotto 16:00, 27 April 2009 (UTC)

Even when I go through RC, I mark fewer edits patrolled than some others do, as I am wary of marking an edit patrolled when I have absolutely no idea whether it was made in good faith. The prime example, and a very common one, is an edit that adds a translation into a language I do not know at all, or that adds several such translations into various languages for a single word. Another example — less common, but still common enough — is an edit that adds a ==language== section (or a new page) in a language that I do not know. If our policy is that such edits can be patrolled, I will be glad to do so. Is it?—msh210 17:16, 27 April 2009 (UTC)
I've tended to mark as patrolled things that are formatted perfectly (though this is now harder for translations with creation.js) on the assumption that if they have taken the time to learn the format, they must believe what they are adding is right (even if it isn't - but hey, even the autopatrollers will make mistakes, they get corrected eventually). [Assuming of course that there are no highly devious anti-Wiktionary agents who are stealthily creating a huge practical joke]. For a lot of the Romance language translations, I often find myself "guessing" whether they are right or not - I tend to lend higher credence to words that look similar (though I'm often slightly wary that someone might be "guessing" as well). I'd also be more inclined to patrol a group of similar, good looking contributions by the same user (again on the assumption that if they are spending some time on Wiktionary, they must be under the impression they are doing some good). If in strong doubt, I occasionally ask one of our native speakers to verify, or will just not patrol it. 131.111.220.6 23:07, 27 April 2009 (UTC)
In the past, there was an eight-hour window of time I tried to patrol through for every day. If I didn't log in one day, then I'd do that eight-hour period for all previous days the next time I did log on. Unfortunately, my current job takes away more of my Wiktionary time than my previous job did, so I can no longer do this most days. However, the idea might work if a few sysops selected particular time slots to patrol through (not necessarily at the same time you habitually log in), or thoroughly patrolled one hour's worth of edits each day (though not necessarily the same UTC hour each time). --EncycloPetey 03:50, 28 April 2009 (UTC)
To be brutally honest, I don't usually feel I have time for this, especially when there are always so many new words to be added and when so many unpatrolled edits are the non-English ones. (Even with e.g. Spanish, unless I immediately recognise a cognate to French, I can't think about approving it.) Particularly egregious vandalisms are usually picked off at once because it's rare for no admin to be logged on. I do, though, appreciate your efforts, SB, and I will continue to zap as much vandalism as I can while I happen to spot it. Equinox 21:44, 1 May 2009 (UTC)
Patrolling is not approving of. You can mark entries as patrolled even if you aren't sure they're correct, as long as there's a good chance they could be correct. For new translations, having the right script is usually enough for me. DAVilla 20:08, 3 May 2009 (UTC)
Patrolling is about checking that the edit (1) isn't vandalism, (2) isn't spam or propaganda, and (3) is formatted correctly. Patrolling isn't about the veracity of the information, although many patrollers do check that simultaneously when they notice a problem. --EncycloPetey 20:43, 3 May 2009 (UTC)
Ah, but in the cases I describe above (that is, where the edits are in a language I don't know), I have no way of recognizing vandalism. I've brought up this question twice before IIRC, and each time someone says "you can patrol those" and someone else says "I don't patrol those" and someone else says "patrolling means checking that it's not vandalism" (which doesn't help me really).—msh210 19:15, 4 May 2009 (UTC)
Equinox (and others), I was told when I was nominated for sysop that a sysop has no required jobs: the only requirement is that he keep active (at least one edit a year). Obviously, patrolling is necessary (in the sense that someone's got to do it) and therefore a Good Thing To Do™, but.—msh210 19:15, 4 May 2009 (UTC)
Sadly, I can't really use my browser well enough at this point in time to go back and patrol when I'm not on, but I often do leave VandalFighter running through days at a time and I will patrol through that (also, I tend to catch a lot more vandalism than I do to just patrol other edits). I'm also often gone for weeks at a time, which can make the idea of going through RC a bit daunting. But I'll try.  :) --Neskaya kanetsv 22:48, 24 May 2009 (UTC)

Category:Filmology, Template:filmology

filmology is something else. If there's no objection, I'll rename these to Category:Film or Category:Cinema, and Template:film. Michael Z. 2009-04-27 18:00 z

Or Category:Cinematography (used in OED), or Category:Filmmaking. Michael Z. 2009-04-27 22:40 z
And Random House would use Category:Movies. But I'll keep it simple, and choose film. Last chance to complain. Michael Z. 2009-04-28 02:51 z

Huh? What happened to all the previous discussion on this issue? --EncycloPetey 03:45, 28 April 2009 (UTC)

I've never seen it. The category has a tag placed August 2008 pointing to RfC, but there's nothing on RfC. Michael Z. 2009-04-28 14:12 z
Found it: Wiktionary:Tea_room/2007/November#Category:Filmology. Inconclusive. Points to add:
  • cinematography is wider in meaning than mentioned there. It is the artistic and technical activity of making and reproducing films, per OED, M–W,[4] AHD, and RH,[5] a close synonym for filmmaking. OED uses the subject label Cinematogr. in entries like (film) trailer. (It appears that our cinematography definition is not quite right, because it is derived verbatim from the related but subtly distinct meaning of cinematographer)
  • filmology means something altogether different.
  • our “Category:Filmology” could include film distribution (arguably within the sphere of filmmaking), criticism (arguably not), etc.
I still think the general field film might be best, though cinema, movies, cinematography, or filmmaking would be fine. Michael Z. 2009-04-28 16:39 z
I placed the RFC tag on Category:Filmology in August 2008 but, by mistake, did not also add an RFC entry at Wiktionary:Requests for cleanup. I hope to know better now. --Dan Polansky 07:47, 2 May 2009 (UTC)

Found some more at Wiktionary:Beer parlour/2009/January#Category:Filmology, also inconclusive. Perhaps cinema is preferable. Anyway, if there is any more hiding out there, or anyone has some current input, please speak now.

See also Wiktionary:Requests_for_deletion/Others#Template:Filmology Michael Z. 2009-04-30 17:42 z

  • I definitely prefer ‘film’ to ‘cinema’. I work in television, most of these terms are ones I use every day, and while we often talk about ‘film’ or ‘filmmaking’, ‘cinema’ is obviously inappropriate. Ƿidsiþ 18:30, 30 April 2009 (UTC)
  • Since many of the "filming" terms apply to videography and television as well as to cinema, should we split out a sub-category for videography/cinematography? --EncycloPetey 20:41, 3 May 2009 (UTC)
    There is Category:Television. I believe film and video production have more in common than ever, technically. Can cinematography or film be used broadly to include video production? Or we could have a combined Category:Film and video. Michael Z. 2009-05-04 03:56 z

Organization of Index:American Sign Language

User:Positivesigner is working on reorganizing Index:American Sign Language. For anyone interested, please give feedback at Index talk:American Sign Language#Organization. —Rod (A. Smith) 21:25, 27 April 2009 (UTC)

Word of the Day - help needed

It looks as though Word of the Day needs volunteer help. Because of personal problems with stress on Commons, and the inaction of that community, I will no longer be uploading audio or other media to Commons. This means that Word of the Day needs someone willing and able to record / upload .ogg file recordings for WOtD. Otherwise, we have two options: (1) no longer include an audio link, (2) allow red links on the Main Page for the audio. --EncycloPetey 03:03, 28 April 2009 (UTC)

If nobody volunteers, then (3) upload locally would also be a (temporary) option. -- Prince Kassad 20:23, 28 April 2009 (UTC)
I've recorded the five missing files for this month but Commons capitalizes the first letter, e.g. File:En-us-resile.ogg, and I'm not sure how to get around that. Someone please revise Help:Audio pronunciations. DAVilla 19:01, 3 May 2009 (UTC)
The capitalization of the first letter of Commons file names has no effect on linking from here. Like Wikipedia, they are case-insensitive for the initial letter. --EncycloPetey 20:38, 3 May 2009 (UTC)

Protected titles

(this page has gotten way too big to load again; I'm adding this section by using the "section=new" URL, so I won't be able to reply here or edit this section again ...)

I re-wrote Wiktionary:Protected titles to reflect the "new" (year-old ;-) built-in mechanism in the MW s/w. It is underused, and I'd suggest we use it much more.

Please look, and please use the talk page there, rather than here ... Robert Ullmann 15:25, 28 April 2009 (UTC)

Well, I just archived February and March, bringing the Parlor under 200 kb... there's not that much from the current month that's more than 2 weeks old... Hope it helps! (BTW, it'd be nice if some bot herder took it upon themselves to update WT:BP/headings page.) 75.214.50.157 (really, User:JesseW/not logged in) 08:07, 30 April 2009 (UTC)
Connel, who used to do it, has been afk for a while now. Would it not be better to move to using subpages for topics, and transclusion to allow everyone to concentrate on topics they are interested in without concerns about the length of the main page? (Much like WT:ES save that the pages transcluded would be at Wiktionary:Beer Parlour/) Conrad.Irwin 09:28, 30 April 2009 (UTC)
Many of those titles are not set to expire, and some are set to expire in the distant future, including one ten-letter word in 2018! I'm very much glad to see something said about that. DAVilla 14:47, 30 April 2009 (UTC)

Request for extra input on the Main Page redesign

Discussion for the proposed new design has stalled out. I'm relatively happy with the layout as proposed (though there are tweaks and kinks to work out before it is formally submitted for replacing the current one), so if you want to propose things, come and say so! Circeus 23:50, 28 April 2009 (UTC)

Help:Interacting with humans

Per Visviva's suggestion above, I've had a go at writing a document to discuss how people on Wiktionary are likely to behave. I don't think the current pages at Wiktionary:Assume good faith and Wiktionary:No personal attacks are particularly useful in describing how things actually work. Conrad.Irwin 00:24, 29 April 2009 (UTC)

Looks pretty good. Michael Z. 2009-04-29 00:56 z
Looks good. How would a user get to it? DCDuring TALK 02:01, 29 April 2009 (UTC)
I've gone on a spamming spree, Special:Whatlinkshere/Help:Interacting with humans. I wasn't sure if it should be in {{welcome}} or not - but the last thing that needs is yet another link. Conrad.Irwin 09:45, 29 April 2009 (UTC)
Those all seem good. What screens do people see immediately at the moment they are most likely to be upset (deletion of entry, being blocked, etc)? Perhaps some would read something there. DCDuring TALK 10:36, 29 April 2009 (UTC)
MediaWiki:Blockiptext? —RuakhTALK 20:53, 30 April 2009 (UTC)
I've noticed a number of grammar and punctuation errors. I'm signing off now, but will correct them later if someone else doesn't catch them first. There also ought to be something explaining to newcomers where most discussion on Wiktionary takes place. Wikipedia users often believe that user pages and entry talk pages are the appropriate place for most discussions, as that is where they tend to happen on Wikipedia. On Wiktionary, most discussion happens in one of the five discussion fora. --EncycloPetey 12:10, 30 April 2009 (UTC)
Couldn't this be Help:Interacting with others or Help:Interacting with other/fellow editors? DAVilla 15:03, 30 April 2009 (UTC)
Wherever, this was the title that was suggested - I do quite like the subtle reminder :D. Conrad.Irwin 19:15, 30 April 2009 (UTC)
Maybe you could have some pointed humor in a subtitle. The page title and/or pipe should not drive people away!!! DCDuring TALK 19:48, 30 April 2009 (UTC)

Prepositions and verbs (redux)

What would be a good way to add "to" to the entry for tell, used in Irish English to mean tell? In AmE, it is ungrammatical (in the dialects I'm familiar with.) Wakablogger 00:16, 30 April 2009 (UTC)

If they're using to in place of tell, wouldn't that be under the entry for to then? Or maybe I'm not understanding. DAVilla 14:59, 30 April 2009 (UTC)
Yeah can you explain a bit more please. Mglovesfun 15:09, 30 April 2009 (UTC)
I think Wakablogger's talking about a verbal construction tell to. Circeus 05:17, 7 May 2009 (UTC)
Yes, the song "Seven Drunken Nights," for example, has the repeated phrase "Will you kindly tell to me." [[6]] What is the best way to include this along with all the other prepositions "tell" takes? Wakablogger 20:50, 9 May 2009 (UTC)

Languages without written forms

This message is inspired by a debate about French sign language on fr.wikt. The sections just linked to an external site with no other formatting, so we deleted them. My question is: does it say anywhere in the Wiktionary policies that we only accept languages with a written form? If not, it either should, or we need to come up with something that allows sign languages to be represented here. I'm against the latter just because it's a massive challenge and I can't see how it would work. Mglovesfun 15:09, 30 April 2009 (UTC)

I've only just seen the message above. I should say that I'm not against sign languages, it's just a question of how you 'write down' a sign language. What you've come up with looks interesting. For instance on the French Wikt, they had stuff like fr:bleu as a French sign language word - surely if you write it down, it's written French and therefore not 'sign' language. Anyone care to disagree? Mglovesfun 16:42, 30 April 2009 (UTC)
I believe the system we use to transcribe sign language (i.e. write it down in the manner it is signed, rather than use the English equivalent) is home-grown (as existing systems were not (yet?) usable for entry titles), and that the people who are organising the system on en.wikt are uploading photos to make it easier to learn. I assume that the system they have built could quite easily be modified for other sign languages, though it does have a possible problem in that it uses fragments of English to describe the signs, which might need to be changed if it were to be adopted by another Wiktionary. I would certainly encourage the inclusion of such entries. Some translations for the word water are in scripts that Unicode does not yet support; the way we solve this problem is to use an image for the translation, and link it to a page with the transliteration as the title. To exclude languages just because no-one writes them seems a bit cruel, and a system of transliteration can presumably always be found/created. Conrad.Irwin 19:12, 30 April 2009 (UTC)
I'm not really against the principle; it just needs to be either done well, or not at all. I'm more in favour of putting external links or "YouTube"-style clips. For example, instead of an audio file such as "Vancouver (Canada): listen to water", you could have "American sign language - water". Reactions? Mglovesfun 20:45, 30 April 2009 (UTC)
Sure, if we can find volunteers to record them, it would be really neat. Conrad.Irwin 20:53, 30 April 2009 (UTC)

Our current system is the result of extensive consideration of all possibilities that anyone could think of. For some (not all) of the discussion involved, see the archive page Wiktionary talk:About sign languages/Archive 1. For the current system, see WT:ASL. To see it in action, see the entries listed in the subcategories of Category:American Sign Language. And to propose specific emendations to it, you can do so here, but a better place would probably be Wiktionary talk:About sign languages.—msh210 21:37, 30 April 2009 (UTC)