Wiktionary:Beer parlour/2008/June

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Auto-categorization based on suffix

I would like to use a template to add entries to categories based on suffix. The suffix is mentioned in the etymology (e.g. in kertész -> kert + -ész). Would this be something other languages would use? For Hungarian words, it would be useful to build categories for words ending in the same suffix. --Panda10 12:25, 1 June 2008 (UTC)[reply]

We've tended to frown on those in the past, for several reasons. Some languages are so highly inflected that it becomes difficult to know when there is a suffix and when it is just an inflectional ending. It has also been troublesome to settle on appropriate category names, since a "-" must necessarily appear in the middle of the category name. What we've tended to do instead is to create lists as appendices/indices. --EncycloPetey 16:17, 1 June 2008 (UTC)[reply]

An appendix would work for this purpose, and the categories I created so far are not that large, I can rework the entries. Thanks. --Panda10 16:51, 1 June 2008 (UTC)[reply]

I don't see why we can't have categories. After all, we have Category:English nouns ending in "-ism", Category:German words ending in -nf. Nadando 17:43, 1 June 2008 (UTC)[reply]

Neither of those are suffix-based categories. The first is a list of religious / philosophical nouns that happen to end in "-ism", but this isn't always the result of an added suffix. The name went through considerable discussion, and this was the best compromise we came up with. --EncycloPetey 01:35, 3 June 2008 (UTC)[reply]

I created both an appendix and a template: Appendix:Hungarian words ending in -ász -ész and {{hu-suffix-ász}}. This template puts the word in the right category and allows easy standardization/changes in the future if needed. --Panda10 21:54, 1 June 2008 (UTC)[reply]

Excellent, the template-based approach is by far the best solution, though it still faces some resistance. The category thus constituted basically maintains itself (as long as the entries themselves are maintained). In contrast, walk away from the appendix for a year and see if it is still up to date. -- Visviva 08:56, 4 June 2008 (UTC)[reply]

Actually, I've just discovered that there is already a {{suffix}} template which could be updated to place the entry into a category. An additional (optional) parameter would be required to indicate the categorization. The category name could be standard: Category:<Language> words suffixed with -xxx (no double quotes). This was already brought up in the suffix template discussion page. I created a new {{hu-suffix}} template to do just that, but updating the {{suffix}} template would be a much better solution. --Panda10 10:45, 4 June 2008 (UTC)[reply]

Seems useful to me. DCDuring TALK 11:47, 4 June 2008 (UTC)[reply]
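The category naming scheme proposed above (Category:<Language> words suffixed with -xxx, no double quotes) can be illustrated with a small Python sketch. It is not how {{suffix}} or {{hu-suffix}} is implemented; the function name and the assumption that an etymology is written as "stem + -suffix" are hypothetical, taken only from the kertész example in this thread.

```python
import re

# Assumption: the etymology is written as "stem + -suffix", as in the
# kertész example from the thread. Real entries are far less uniform.
SUFFIX_PATTERN = re.compile(r"\+\s*(-\S+)\s*$")


def suffix_category(language, etymology):
    """Return the proposed category name, or None if no suffix is found."""
    match = SUFFIX_PATTERN.search(etymology)
    if match is None:
        return None
    # Naming scheme from the thread: no quotes around the suffix.
    return f"Category:{language} words suffixed with {match.group(1)}"


print(suffix_category("Hungarian", "kert + -ész"))
```

Deriving the category from the etymology in one place is what lets the category "maintain itself": renaming the scheme later would mean changing one format string rather than every entry.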

Audio in word lists

I've been thinking about this idea and would like your feedback. Is there a way to add audio to the word lists we have in Wiktionary (indexes, categories)? Visually, this would be a speaker icon in front of the words without any additional text that we normally have in the audio template. Only the icon and the word. If the audio does not exist, the icon would be red, otherwise blue. This would also make missing audio visible - which can be good or bad depending on how you look at it. --Panda10 12:48, 1 June 2008 (UTC)[reply]

I don't think we can do that in categories. The sound (1) may or may not exist, (2) is specific to one language, (3) may have more than one audio file dependent upon part of speech, stress, etc. This would require a lot of work. Now, adding sound files to an index is potentially easier, but still has the problems of making sure the correct audio file(s) end up in the index, and making sure that each word is properly represented. Consider, for example, that some English words are pronounced differently as a noun or verb. Consider that some English words are pronounced differently in different countries. --EncycloPetey 00:45, 3 June 2008 (UTC)[reply]

The index entry could contain the same number of audio icons as the actual entry. When you hover above the speaker icon with the cursor, an alternate text appears clearly showing the file name, telling which country's pronunciation it is and which part of speech it belongs to. So I don't see this as a problem. If the audio file does not exist in the entry, then there is a choice of either displaying a red speaker icon next to the index entry, or not displaying at all. Anyway, this was just an idea. I think Wiktionary has many interesting functionalities that the other online dictionaries do not have and making those aspects stronger could be one of the ways to stand out. --Panda10 01:44, 3 June 2008 (UTC)[reply]

Citations/Quotations

Urmm, I'm confused! What's the difference (on Wiktionary) between a citation and a quotation, and should we be using {{cite-book}} or {{quote-book}} in the Citations namespace? I'm fairly sure I understand the difference between a reference and a citation/quotation. Am I suffering from (yet another) foolish misunderstanding or is there a wider confusion as to the difference? I am aware that lots of the visiting 'pedians seem to equate citations with references, not quotations. Conrad.Irwin 21:54, 1 June 2008 (UTC)[reply]

Oh, I also forgot to mention {{cite book}} which seems to have been transwiki'd for references. Conrad.Irwin 21:57, 1 June 2008 (UTC)[reply]

"Citation" and "quotation" are synonymous for our purposes; the latter name is perhaps less misleading, but the former name became the namespace name, so *shrug*. {{cite-book}} and {{quote-book}} are both fine in the Citations: namespace; we don't yet have a template that handles this stuff perfectly, so use whichever you prefer or fits a given quote better, or use neither if you'd rather format by hand or if you have a quote that requires handling that neither of them offers. {{cite book}} is for books used as references (i.e. it's used to cite a book in the ordinary, non-Wiktionary sense); you can use it inside <ref> elements, or inside bullet points in ===References=== sections, or you can pretend it doesn't exist and format such references by hand. —Ruakh_TALK 22:10, 1 June 2008 (UTC)[reply]

See also Category:Citation templates for templates for other special cases such as video, Usenet, US patents, journal articles, and newspapers. -- Visviva 11:02, 2 June 2008 (UTC)[reply]

Also see Special:PrefixIndex/Template:cite - it looks like they are both redundant to each other. Conrad.Irwin 16:04, 2 June 2008 (UTC)[reply]

AFAICT, {{cite-book}} is equivalent to {{quote-book|i1=*}} for the basic case, but without support for book URLs, editors, original authors, etc.; similarly {{quote-book}} lacks support at this writing for page URLs and genres. I'm not sure if cite-book was created because of problems with {{quote-book}}, or just because that template has not been sufficiently advertised. Either way, hopefully we can get a consensus treatment going forward.

In terms of the name, I opted not to use "cite" in the names of the quote-X templates to avoid confusion with the ex-Wikipedia templates mentioned by Ruakh, which serve a very different function. Perhaps a future consensus series of templates could be at Template:citation-foo? -- Visviva 11:02, 2 June 2008 (UTC)[reply]

As far as I can tell, {{cite-book}} was created for use in the Citations namespace and {{quote-book}} for use on the main entry. The reason there is a difference is that quotations on the main entry must begin with a "#" to put them under the appropriate definition sense. In the Citations namespace, this isn't the case. However, neither template seems to work very well, and the way they package content differs, which makes it frustrating to move an item from the entry page to the Citations namespace or vice versa. --EncycloPetey 14:25, 2 June 2008 (UTC)[reply]

(Yup. That difference is also present between the definition lines and the Quotations section, if the latter is still relevant.) It would clearly be nicer to have only one template for each need, and I'm sure the differences between cite-book and quote-book can be hammered out. For any of these templates, it would be better, IMO, not to include the quotation as a template parameter, which would resolve the problem of indentation. But that could be "solved" as well with even more parameters like {{{indent}}} or other patchwork. 75.54.80.198 06:18, 4 June 2008 (UTC)[reply]

If the quotation is not associated with the template, then there is no prospect of CSS-based autohiding, which is surely something we will want to consider as entries become better-cited. Having the quotation be optional was something I tried with {{quote-book}}, but was borked by the changes to the preprocessor and in any case was probably a dumb idea.

It's fairly easy to match {{cite-book}}'s behavior using {{quote-book}}; the former template could, I think, be replaced with the following code:

{{quote-book
|i1=*
|year={{{year|{{{1|}}}}}}
|author={{{author|{{{2|}}}}}}
|title={{{title|{{{3|}}}}}}
|isbn={{{ISBN|{{{isbn|}}}}}}
|url={{{book-link|{{{url|}}}}}}
|pageurl={{{link|{{{pageurl|}}}}}}
|page={{{page|}}}
|passage={{{text|{{{passage|}}}}}}
}}

... or code to that effect. The only difference would be in the URL display (but we don't really have a standard for that AFAIK). The reverse operation (replacing quote-book with an invocation of cite-book) is not currently possible.

I think it's good to have separate low-hassle templates for the * and #* cases, but they should use the same basic architecture, whatever that may be. -- Visviva 11:35, 4 June 2008 (UTC)[reply]
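The nested defaults in the wrapper above, such as {{{year|{{{1|}}}}}}, mean "use the named parameter if it was passed, otherwise the fallback, otherwise nothing". A rough Python model of that lookup is sketched below. The function names are hypothetical and the dictionary of arguments merely stands in for a template call; positional arguments are keyed as "1", "2", "3".

```python
def first_defined(params, *names):
    """Mimic {{{a|{{{b|}}}}}}: the first parameter that was passed wins."""
    for name in names:
        # A parameter passed as empty still counts as passed, as in MediaWiki.
        if name in params:
            return params[name]
    return ""


def cite_to_quote(params):
    """Translate cite-book style arguments into quote-book arguments."""
    return {
        "i1": "*",
        "year": first_defined(params, "year", "1"),
        "author": first_defined(params, "author", "2"),
        "title": first_defined(params, "title", "3"),
        "isbn": first_defined(params, "ISBN", "isbn"),
        "url": first_defined(params, "book-link", "url"),
        "pageurl": first_defined(params, "link", "pageurl"),
        "page": first_defined(params, "page"),
        "passage": first_defined(params, "text", "passage"),
    }


# Illustrative call only: positional year, title and text, no author.
example = cite_to_quote({"1": "1851", "3": "Moby-Dick", "text": "Call me Ishmael."})
print(example["year"], example["title"], example["passage"])
```

The mapping only runs in one direction, which matches the remark that replacing quote-book with an invocation of cite-book is not currently possible: the target side has parameters the source side never supplies.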

Flag edits for specific people

This edit got me thinking: I don't have any knowledge of Ancient Greek or how it is pronounced, but what if I could flag the edit and have it show up on some kind of watch list (or just the normal watch list) for people who had knowledge of the language? I.e., this edit would show up on Atelaes's watchlist. It could be tied in with the babel boxes somehow. Maybe there isn't enough need for this but it would be cool. Nadando 06:59, 2 June 2008 (UTC)[reply]

No, I wholeheartedly agree with you. Every time I see an edit to an Italian, Mandarin, Russian, etc. page, I'm never quite sure what to do. It looks reasonable, should I mark it? It would be absolutely fantastic if we could sort RC by language or something. I have no idea if such a thing is possible. And yes, those edits were bunk, as far as I can tell (admittedly, my sources aren't fantastic for esoteric placenames). -Atelaes λάλει ἐμοί 07:20, 2 June 2008 (UTC)[reply]

Well, we've got {{attention}} for stuff that needs a check, or WT:BABEL to find someone to ask to check it, but for small things like that I tend to assume they know what they're on about. Conrad.Irwin 16:02, 2 June 2008 (UTC)[reply]

I noted that we have a lang= parameter for rfc. Would it make sense to have a context= parameter for domain specialties? I would think it would be a good way to get some context-specific entries cleaned up efficiently. DCDuring TALK 16:28, 2 June 2008 (UTC)[reply]

That sort of functionality is not likely to come to MediaWiki anytime soon. The closest thing would be if someone could write a bot that would periodically (on a daily/weekly basis) dump the list of unpatrolled edits sorted by the name of the language section in which each edit occurs, so that the interested editors could check them, but I don't know if that's feasible at all. --Ivan Štambuk 18:47, 3 June 2008 (UTC)[reply]
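The core of the bot idea above, working out which language section an edit falls in, might look something like the following Python sketch. It is only an outline under stated assumptions: the function names and the (title, wikitext, changed_line) tuple are hypothetical, the changed line is assumed to come from some diff step that is not shown, and no MediaWiki API is called.

```python
import re
from collections import defaultdict

# Level-2 headings such as "==Ancient Greek==" name the language section;
# deeper headings such as "===Noun===" do not match this pattern.
LANGUAGE_HEADING = re.compile(r"^==\s*([^=].*?)\s*==\s*$")


def language_section_at(wikitext, line_number):
    """Return the language heading enclosing a 0-based line, or None."""
    lines = wikitext.splitlines()
    # Walk backwards from the changed line to the nearest level-2 heading.
    for line in reversed(lines[: line_number + 1]):
        match = LANGUAGE_HEADING.match(line)
        if match:
            return match.group(1)
    return None


def group_by_language(edits):
    """Group (title, wikitext, changed_line) tuples by language section."""
    groups = defaultdict(list)
    for title, wikitext, changed_line in edits:
        language = language_section_at(wikitext, changed_line) or "(unknown)"
        groups[language].append(title)
    return dict(groups)


page = (
    "==English==\n===Noun===\n# a definition\n"
    "==Ancient Greek==\n===Pronunciation===\n* IPA line"
)
print(group_by_language([("foo", page, 5), ("bar", page, 2)]))
```

A periodic dump built this way would give each editor a per-language list to check, which is the sorting of recent changes asked for earlier in the thread.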

Plurals of proper nouns

So what is the policy now? Are plural forms of proper nouns to be included in WT or not? Specifically, is it to be specified in the article Alexander that the plural to be used is Alexanders when referring to more than one person bearing that name?
Background 1: RFD discussion on Jesuses, which seemed to converge to the opinion that plurals of proper nouns may be acceptable for Wiktionary purposes in exceptional cases when the use has been verified by a suitable number of citations.
Background 2: Dispute on the inclusion of plural forms in the declension tables for Polish male given names between User:Maro and me (see my discussion page and his edits on several articles I created). The problem is exactly the same: the situation to refer to more than one Marek is in Polish no more and no less common than the situation in English to refer to more than one Alexander; so the fact that this is LOTE does not justify a different policy.
Please discuss. -- Gauss 01:05, 3 June 2008 (UTC)[reply]

For inflected languages, the situation is perhaps more easily resolved. What has been happening in Latin and Ancient Greek is that the declension tables for proper nouns do not include plural forms (except in cases where the name is inherently plural). However, on the inflection line of all Latin proper nouns is a link to the appropriate declension pattern appendix, where the examples include plural form patterns. So, someone wanting to explore the plural inflections has the option of examining a full declension table. But attestation is also a bit more cut-and-dried in Classical languages; either the plural is known in the corpus of surviving literature or it isn't. --EncycloPetey 01:30, 3 June 2008 (UTC)[reply]

I disagree with your (Gauss's) assessment of Wiktionary:Requests for deletion#Jesuses; I don't believe it converged to any opinion. It looks to me like a slight majority of editors prefer to keep what you're calling "plural forms of proper nouns", exceptional case or no. I think the topic should be brought to a vote. —Ruakh_TALK 01:34, 3 June 2008 (UTC)[reply]

I've now created Wiktionary:Votes/pl-2008-06/Plurals of proper nouns; I don't like the name, since proper nouns don't have plurals, but couldn't think of a better one. :-P I've decided to go with a simple up-down vote, but if people prefer, I'd be happy to split it into sub-votes (e.g. first voting on whether to include such plurals, then voting on details for how to present them). Please feel free to edit. :-) —Ruakh_TALK 01:54, 3 June 2008 (UTC)[reply]

Well, you could call it "Plurals from proper nouns", so that the nature of the plurals is left nicely ambiguous... --EncycloPetey 02:08, 3 June 2008 (UTC)[reply]

Good idea, thanks, done. :-) —Ruakh_TALK 02:13, 3 June 2008 (UTC)[reply]

~~Comment below probably mooted by~~ EP's deft finesse. DCDuring TALK 02:44, 3 June 2008 (UTC)[reply]

Edit conflict:

Would a vote lead to any conclusion, given the slim majority? Is there any issue on which there might be a resolution? I'm still struck by the fact that there are numerous occurrences of plurals like "Henrys" or "Smiths", but that we wouldn't have a place for them. It wouldn't have to be a prominent place or a separate entry, just something findable from the search box and accessible to someone looking at the singular form. It doesn't have to be called a Proper noun, though it seems as if we could address the issue by linking the PoS header to an explanation of "true" proper nouns. Making all of the proper nouns automagically sprout Noun PoS sections for the normally occurring common noun uses of the words spelled the same as the proper noun seems like overkill, though that would be the only course that addresses the problem posed by the commonly accepted technical definitions of Proper noun.

The MW3 definition is an example of what I see as defective definitions: "a noun that designates a particular being or thing, does not take a limiting modifier, and is usually capitalized in English". By "limiting modifier" they seem to mean "determiner", citing "this" as an example. "This Barack Obama that everyone is talking about" is a phrase that does not include a proper noun. Nor does "the eight King Henrys". "Proper noun" seems to indicate a very particular kind of use that doesn't exactly fit our entries which are for the atoms that might make up true proper names or that might be proper names in particular circumstances.

In other words "proper noun" doesn't seem to be a linguistic category that is stable enough to be useful for a dictionary that specifically excludes proper nouns that actually refer to particular individuals. DCDuring TALK 02:44, 3 June 2008 (UTC)[reply]

MW3 seems to be using a narrower definition akin to that of the CGEL. We use a broader definition of "proper noun" that includes the concept of proper name. The instability is not inherent in the category, but in the scope of its definition as compared between sources that lump together proper nouns and names versus those that carefully distinguish between them. Also, keep in mind that I've started an appendix that should help (I hope) to clarify our stance and definition here. There will still be some borderline cases, but having the appendix should make those cases explicit and clearly explain why they are problematic. I should have time next week to resume work on said Appendix. --EncycloPetey 02:54, 3 June 2008 (UTC)[reply]

MW3 shows "proper noun" as a synonym for "proper name", though not vice versa. I don't know where to look to find our definition of proper noun. Are you referring to the one in the Glossary? DCDuring TALK 05:06, 3 June 2008 (UTC)[reply]

It will probably influence how people vote. We clearly have to have a broader definition, presumably in our Glossary, so that we can have proper noun entries in Wiktionary at all. But, given that our entries are often of proper noun components, not proper nouns themselves, why do we have to follow rules that specifically apply (however imperfectly) to the kind of proper nouns that we mostly exclude? DCDuring TALK 03:04, 3 June 2008 (UTC)[reply]

Which proper nouns and which rules do you mean? Your comment is too vague for me to understand what you mean. Could you clarify? --EncycloPetey 03:28, 3 June 2008 (UTC)[reply]

The proper nouns I am referring to are the ones that are the subject of the vote. The rules include "Proper nouns do not have plurals". I would argue that "the Smiths", referring to a particular nuclear family living at a particular address, is a proper noun in the most important regard, designating a specific unitary entity. It is as much a proper noun as "Smith". The vote proposes that the common noun singular PoS of the words we have display under the proper noun PoS be officially discouraged and denies proper noun status to the plural proper nouns.

I am really looking forward to the clarification of "proper noun" as used by 1., linguists, 2., our naive users, 3., us as we now use it as a PoS header, and 4., as we propose to use it. I find it hard to understand how we can have a vote on this in the absence of a great deal more precision than currently characterizes our Glossary definition of proper noun or the definition in principal namespace. DCDuring TALK 05:06, 3 June 2008 (UTC)[reply]

Ah, well "the Smiths" belongs to that odd form that linguists seem to waffle over. I would argue that these are collective proper nouns that merely look plural. Consider that United States is normally considered to be a singular proper noun, as is the Virgin Islands. But you are correct that the issue needs clarification. This vote may not clarify that particular issue, however, since we still have very vague policy about the inclusion of personal name elements. --EncycloPetey 13:50, 3 June 2008 (UTC)[reply]

Perhaps the vote is premature until we have a little greater clarity on the issues.

Musing about this: Given that we have a strong, fundamental stricture against entries that are for individual entities, i.e., that are the purest case of proper nouns, I wonder whether "proper noun" is the most useful PoS label for words whose principal use is the construction of and shortening of the type of proper nouns that we generally prohibit. Taxonomic names are a great prototype for proper names. There is a clear structure with definite rules and relatively little colloquial usage that we feel compelled to include. In principle, the various taxa each refer to exactly one group of individuals at a given point in time. The species and subspecies modifiers are not in themselves proper nouns, as you pointed out. That is in some ways parallel to "given names".

We could make a case that given names should not be called proper nouns. "Proper name" carries slightly less linguistic baggage and is intelligible to all users. Perhaps the same reasoning could be extended to surnames as well. I certainly haven't thought through what all the implications would be. I airily dismiss the difficulty of conversion by saying a bot could do many of them (assuming that they had the given name and surname templates). Perhaps the existence of the appendices on names would aid the conversion as well. Are there deal-killing drawbacks to such a PoS header? DCDuring TALK 15:52, 3 June 2008 (UTC)[reply]

But saying that given names are not proper nouns runs contrary to all the literature on the subject that I've ever seen. I'd want some solid basis, preferably in academic publications, before making that kind of colossal change. And "proper name" does have some real problems as well; most especially in that it is a kind of nominal phrase and not strictly the name of a part of speech. In linguistic terms, the category of proper name includes: "the Crimea", "the Earl of Sandwich", "Mother", "my Jennifer", and "the Mary that you met yesterday". If you have access to a copy of the Cambridge Grammar of the English Language, I recommend reading the pages on proper nouns and proper names. --EncycloPetey 23:58, 3 June 2008 (UTC)[reply]

We already have PoS-level headings that are not parts of speech, to wit, the three abbreviation headers, Symbol, Phrase, Proverb, among those not deprecated, as best I recollect. I think that "my Jennifer" and "Jennifer" are very much equivalent. "Jennifer" does not actually uniquely specify a single person except in some kind of context. "My Jennifer" seems to be a case where the "my" provides some of the context, the balance being provided by the context that says who "my" refers to. Fortunately for preventing endless proliferation of entries, "my Jennifer" would seem to be SoP.

I would actually find all of the given name and surname entries to be in good company with any of these that are not SoP. Any term of address would seem appropriate.

It does not seem hard to come up with rules to exclude any types of entries that would be substantially duplicative.

CGEL is not conveniently available, but I will get to the library that has it at some point. Is the CGEL's use of terms consistent with the other modern English grammars? DCDuring TALK 02:33, 4 June 2008 (UTC)[reply]

Sometimes yes, but other times, no. Their terminology is sometimes current, but sometimes idiosyncratic. It is usually possible to tell the idiosyncratic uses from the excessive explanation and over-precise splitting from other terms. I recommend it primarily because the authors raise many issues and cases often glossed over in other general works. It therefore makes a good source for opening the mind to what the language is actually doing.

I disagree with you about "Jennifer" and other such given names. They almost always refer to a specific unique individual. Yes, additional context is often needed to tell which of several possible referents is meant, but the same is true of Congo, Guyana, and Macedonia. The fact that an identical label is applied to many different individuals does not make it a common noun. A proper noun distinguishes an individual, whereas a common noun groups it. An apple is named to place it in a category of items which possess shared qualities common to all items bearing the same label of "apple", and which would be expected by someone hearing the word apple. But Jennifer predicts no properties of the item bearing the label (except gender in this particular case, but not all given names do so), and so there is no description possible of what a "Jennifer" is. Any such description would have to be made of a specific individual bearing the name, rather than a description of shared characteristics. This is why we have no true definition of Jennifer, but instead identify it as a label used as a given name. Thus, Jennifer is a proper noun and not a common noun. --EncycloPetey 03:59, 4 June 2008 (UTC)[reply]

What you are saying seems true of some elements of our current naming system, but wouldn't have applied to occupational surnames like Sawyer; patronymics, Leif Ericson; names reflecting personal characteristics, Karl der Grosse, Preacher Roe, Wild Bill Hickok; or place name related names: Vito Corleone, Friedrich August von Hayek. Titles (arguably part of Proper Names and Nouns) also convey information. All given names reflect information about the attitudes or aspirations of parents toward their children (patron saints, names like "Theodora", favorite relatives, sports heroes, celebrities, etc.).

I take a less Platonic view of these matters. If the concepts don't communicate to users and require a few dozen more doctoral dissertations to get the kinks worked out, then they don't yet belong in a dictionary. It would be much easier for us if we could model what other dictionaries do with regards to proper nouns and modify it to suit using the evolutionary incrementalism that characterizes public wikis. But, for the most part, general dictionaries exclude personal names, leaving the field to more encyclopedic works or to specialized glossaries. DCDuring TALK 06:21, 4 June 2008 (UTC)[reply]

Jennifer can be a name (uncountable, not existing in determined form), but also a specific person who bears the name "Jennifer". In "one Jennifer", Jennifer can be any person who bears the name "Jennifer". In this meaning, two persons who bear the name would be called Jennifers. "My Jennifer" is also a specific person, but extracting "Jennifer" from that phrase, "Jennifer" becomes any person who bears that name, thus it would be possible to say "My Jennifers" (for example, if you have two daughters both named "Jennifer"). In Swedish I think we don't even have terms to distinguish these different meanings, so it's interesting to read this discussion (and the one about Jesuses). However, even if a very small number of persons can differentiate these meanings and the proper PoS name for each, the vast majority probably cannot. Having each proper noun page carry different headings would probably lead to a lot more confusion than clarity, and lots of work would be needed to accomplish this change; the gain would be minimal, if any. Proper nouns (like Jennifer etc.) are also something that normal dictionaries don't even include, so they seem rather peripheral to the goal of Wiktionary. Why put so much effort into this instead of making the main pages of normal words better? Plurals (Jennifers) of the "person bearing the name" meaning could of course be created by a bot, but if these pages direct to the main page (Jennifer), the main page needs to include this meaning too, so all proper noun pages need manual attention as well. Who's supposed to do that, and also to maintain this for all pages created in the future, and for all languages? It's not that doing this is illogical, but to me it feels way too peripheral to put a lot of effort into.
What is reasonable, though, is to add some kind of inflection table explaining both the possessive spelling and this other "person bearing the name" meaning, along with the plural form of this meaning, and maybe also a link to the page explaining more about proper nouns in the English language that I saw you were working on, EncycloPetey. This could be added with much less effort, and would also complete the pages with the information that would best satisfy most users. I don't know enough English grammar to say exactly how to formulate it, but I am confident you are knowledgeable enough to figure out something good. ~ Dodde 07:53, 7 June 2008 (UTC)[reply]

Thanks for the fresh, thoughtful, reasonable perspective. Treatment of proper nouns is a hard area because of the absence of suitable models in other general dictionaries, but it also is a way to attract users to Wiktionary. I don't believe that proper noun plurals deserve a separate entry (but most common noun plurals don't either, IMHO). I am mostly interested in simply, 1., displaying the plural at the proper noun entry and, 2., allowing search for plurals of "proper" nouns. In the most obvious implementations, this seems to clash with the linguist's definition and criteria as to whether something is a proper noun. I am saying so much the worse for the definition, but others reasonably disagree. Language consists of both words and rules. The definition of a dictionary focuses on words, not rules. We risk wasting users' time, attention, memory, and patience when we have full entries for things that are better accommodated by rules. Almost all of the uses of proper nouns in common noun ways would be fully covered by rules (as would many standard derived meanings of nouns, like "instance or type of" for abstract nouns like -isms; morphological alterations of stems, like those using -ly, -ness; and regular inflections, like plurals and verb forms). Perhaps it could all be accommodated by clickable black (or, better, unobtrusively colored) links to PoS appendices (which could exist for all headers and glossary terms). DCDuring TALK 12:45, 7 June 2008 (UTC)[reply]

We should not forget that proper noun is interpreted slightly differently in different languages (the basic meaning being the same), as this may have some impact. In French, it's usual to consider that proper nouns have no plural, but plurals are used in some cases, nonetheless. And I really think that adding the meaning plural of France to the page Frances could help some readers. Lmaltier 15:42, 7 June 2008 (UTC)[reply]

I must say I am completely astonished by the first supporting votes on Wiktionary:Votes/pl-2008-06/Plurals of proper nouns. Don't people realize how much work supporting this vote means in reality, if it's meant to be done consistently on all proper noun pages, and how it will completely downgrade the affected pages' readability? I really hope people become aware of this before it's too late. ~ Dodde 14:06, 13 June 2008 (UTC)[reply]

Combining forms

I seem to remember we had a discussion about combining forms at one point. Does the entry include the hyphen or not?

For example, I've just created -eyed and moved derived terms from eye to that page. But should the entry properly be "eyed"? I don't think so, as this is the past tense and past participle of "to eye", and the hyphen is an essential part of the combination (except when the compound is a single word, as is common in US English). So how should the entry be titled? — Paul G 10:11, 3 June 2008 (UTC)[reply]

I think that -entries tend to be labelled as suffixes, and the form without the hyphen as adjectives. See handed and -handed as an example. But I'm not at all sure that this is the best way of doing things. SemperBlotto 10:38, 3 June 2008 (UTC)[reply]

-head is an example of a more restrained differentiation of cases that I think is appropriate for combining forms. One etymology is a pure suffix not connected at all to the ordinary meanings of "head". The other etymology shows a particular combining form application whose meaning, though clearly connected, does not exactly correspond to any one meaning of "head". There is no sense line at -head that corresponds to the use of -head in forehead, ahead, flathead, gearhead, and behead, or even shithead, mophead or pinhead. I think that is as it should be, though it may be that the use of -head in invective and milder forms of name-calling might warrant a sense. DCDuring TALK 16:20, 3 June 2008 (UTC)[reply]

For the sake of consistency, since suffixes are marked with a hyphen and this suffix always occurs with a real hyphen, this should technically be at --eyed. But as this example shows, consistency is no virtue. -- Visviva 08:51, 4 June 2008 (UTC)[reply]

Are (present/past) participles really verbs or adjectives?

While working with Dutch inflections I noticed that all the pages and templates list past participles as verbs. This strikes me as odd, because although they are formed from verbs, they are not verbs themselves but grammatically behave as adjectives, including adjectival inflection and also forming of comparatives and superlatives in some cases. And this applies not just to Dutch but to many other languages including English, German and French, and probably a lot of other Indo-European languages too (including Latin I believe). So why are past participles of these languages listed as being verbs rather than adjectives? Shouldn't this be changed? --CodeCat 22:20, 4 June 2008 (UTC)[reply]

English and French past participles are verb forms (consider "have said", "ont dit"), though it's very common to form adjectives from them, and they're sometimes described (a bit misleadingly) as "verbal adjectives". Certainly they're verbs and have to be listed as such; the question is, should they be listed as adjectives as well? For English and French we've decided that, except in cases that have really produced all-out adjectives (such as "surprised", "relieved", "agacé", "ennuyé", etc.), there's no point in doing so, since it's a regular feature of the language that past participles can be used much like adjectives in some cases, and it's more misleading to list them as adjectives than not to do so. Dutch, German, and Latin may be different, however; I wouldn't know. —Ruakh_TALK 23:01, 4 June 2008 (UTC)[reply]

CodeCat, the answer to this question varies between languages. In Latin and Ancient Greek, participles are typically treated as their own part of speech. Part of the reason for this is that, although they have gender and decline like adjectives, they also have tense like verbs. In English, where adjectives do not have gender, this is no longer an issue. In Spanish, the participle only has gender when it functions in the place of an adjective, but when it combines to form a compound verb, it does not. So, there will not be one answer to this question that will apply equally to all Indo-European languages. What we're doing for Classical languages is recognizing the Participle as its own part of speech, the way Classical grammars do. The etymology shows its origin from a verb, but the inflection table shows how it declines like an adjective. The participle itself gets its own definition, since this will not always match the meaning of the verb and will not always have a simple English equivalent. --EncycloPetey 02:28, 5 June 2008 (UTC)[reply]

No, participles themselves don't inflect for tense, verbal roots they're formed from do ^_^ Once the participle is formed, it represents a particular mood-tense-voice modification of verbal root sense, which can be translated to English using auxiliary/modal verbs, and some helpful semantic modifiers (because lots of those moods/tenses are not directly translatable in English)

AFAICS, what you are doing with entries like e.g. fractus is giving purely adjectival English translations with adjectival Latin declension, but formatting them as ===Participle===, and moving the "xxx participle of" stub to etymology. I like that approach. I don't like the approach used for e.g. Ancient Greek κατακεχυμένος or Lithuanian bučiuotinas, where the stub-like "xxx participle of" suddenly becomes a "lemma" by itself, with other stubs linking to it. Such an approach could yield literally hundreds of stubs-linking-to-stubs entries for highly inflected languages like Sanskrit and Ancient Greek, where there are a dozen+ participles for every verb each able to decline in 5-8 cases 3 genders and 3 numbers, with several orthography variants, and that would be no good. --Ivan Štambuk 13:50, 6 June 2008 (UTC)[reply]

I dislike EP's approach regarding definitions. First, because the definitions are adjectival. I don't know exactly how Latin treats participles, but that wouldn't work for Ancient Greek, as participles can be adjectives, nouns, or verbs (with verbs probably being the most common). A participle can take on any of the meanings of its parent verb, and so to list them all over again would be silly. I am considering masculine singular participles to be sub-lemma pages. Thus, their forms will link to them, but also to the parent verb (Feminine genitive dual of xxxx, present active participle of xxx), so that the user can go straight to definitions. Rest assured that all entries will link directly to a true lemma page, even if they also link to a few sub-lemma pages as well. -Atelaes λάλει ἐμοί 18:11, 6 June 2008 (UTC)[reply]

Latin has complications that don't exist in Ancient Greek or Sanskrit. Firstly, Latin continued as a language of international communication into the modern era, so it has a longer history of active use beyond the Classical period. With Greek, the modern and medieval languages have a different language code, so historical breaks can be made. A Latin entry, by contrast, must cover all three periods if it is to be thorough. Secondly, Latin has given rise to several major modern languages in which the descendants of those Latin participles are considered strictly adjectives. So, we have Participle as a verb/adjective hybrid POS in the Classical period, but as Latin gradually became French, Spanish, Italian, etc., the verb aspects of the participle dwindled in favor of the adjective function. We therefore have many, many non-Latin words whose etymologies will point to a Latin participle root. Combine this with its description as a separate part of speech, and an independent, thorough definition seems more helpful than a grammatical description. I'm not entirely happy with this approach, but it seems to me that it is more useful to our users, at least for Latin entries. --EncycloPetey 21:27, 10 June 2008 (UTC)[reply]

Ruakh, can you point to the discussion about those criteria? AFAICS, both surprised and relieved are derivatives of a corresponding verbal sense.

I think that it would be most regretful not to allow adjectival use of English past participles and nominal use of present participles (the act of doing <verbal sense>) to have their own ===Adjective=== and ===Noun=== sections, just because they're derived by regular, predictable morphology. We don't have the same constraints as paper dictionaries, and should not be thinking in terms "what do we gain in savings" but "what do we lose". And in this case, the real non-stub parts of speech, and in lots of cases non-obvious translations to foreign languages.

My mother tongue's grammars never use the terms "participle" but use "verbal adjective" and "verbal adverb" instead, and I do plan to format them as ===Adjective=== and ===Adverb=== one day. This would for it, and for all the other languages that choose to do the same, incur the unnecessary inconsistency of translating one FL POS with another English-one (which wouldn't even be a full-blown entry by itself, but a "xxx participle of" stub). So the "misleadings" saved by omitting the adjectival sense are not eliminated, just projected onto another domain. --Ivan Štambuk 13:34, 6 June 2008 (UTC)[reply]

If regular formation of a participle from its corresponding verb is an argument for listing it as a verb, then why are English adverbs in -ly not considered adjectives, and why are adjectives in -able not considered verbs? It makes sense to me to consider formation of a participle to be simply derivational morphology, since you are deriving one word from another by means of morphological changes. The fact that the original word was a verb doesn't matter, because the end result behaves grammatically as an adjective.

In any case they are not verb forms, because they can never take the place of another verb form except for another participle. However a present participle in English can take the place of any adjective, both predicatively (the man is walking) and attributively (the walking man). English past participles can also be used in this way provided that the verb was transitive (the painted fence and the fence is painted). The only thing that distinguishes an English past participle from an adjective is the use of the auxiliary 'to have' (I have walked).

The situation is the same in Dutch and many other Germanic languages (and French too). In Dutch, present participles can be used predicatively (considered archaic) and attributively. Past participles of transitive verbs can be used like this as well. Furthermore, in both French and Dutch, participles are inflected as adjectives based on gender and number. E.g. Dutch ik heb gekookt (I have cooked), but de gekookte ham (the cooked ham). Or French il a mangé (he has eaten), le pain mangé (the eaten bread), but elle a mangée (she has eaten), la pomme mangée (the eaten apple). --CodeCat 18:02, 6 June 2008 (UTC)[reply]

About French: in elle a mangé, mangé is a verb form, not an adjective at all. In la pomme mangée, mangée is also usually considered as a verb form (it's a short form of la pomme qui a été mangée). Actually, I think that it would be very difficult to find a sentence where mangé is used as an adjective. In other words, mangé is not an adjective. But many participles can become adjectives, nonetheless. They are considered adjectives when the meaning of the sentence does not include the meaning of the verb. In Je voudrais un café sucré, sucré is not a verb form, it's an adjective. But, in le café qu'il a sucré, sucré is a verb form, not an adjective. In some sentences, a word may be understood either as a verb form or as an adjective, with slightly different meanings. There has been a long discussion about this point in the Beer parlour, not so long ago. Note that, when you state that they can never take the place of another verb form except for another participle, it's also true of other verb forms (e.g. an imperative can take the place of another imperative, not of a past form). Lmaltier 19:38, 6 June 2008 (UTC)[reply]

Also note that, in French, the noun participe always refers to a verb form (when used as an adjective or as a noun, it is called an adjective or a noun, not a participle). Lmaltier 12:36, 7 June 2008 (UTC)[reply]

I think you need to look at their syntax. Do they, for example licence objects? Do they appear attributively or predicatively? Do they allow modifications by adverbs that typically modify adjectives (e.g., very in English). etc.--Brett 17:05, 7 June 2008 (UTC)[reply]

You are right. In French très (very) works well with adjectives and adverbs, and beaucoup (much) works well with verbs (including participles). I think this trick works as well in English and in many other languages. Lmaltier 20:02, 10 June 2008 (UTC)[reply]

Dutch, from verwennen: Hij is erg verwend. (He is very spoiled.) Hij is veel verwend. (He has been spoiled much.) Hij heeft veel verwend. (He has spoiled much.) --CodeCat 11:37, 11 June 2008 (UTC)[reply]

Full entries or soft redirects for Swiss Standard German spellings

Swiss Standard German has a consistent orthographic difference from Standard German, converting Standard German ß into Swiss Standard German ss. Should we include full entries or soft redirects for Swiss Standard German spellings? For example, should we include a full entry at the Swiss Standard German spelling geniessen (which currently has a hard redirect) or just make it a soft redirect to the Standard German spelling genießen? My instinct is to make it a soft redirect, but I ask here because my affinity for soft redirects is not always in sync with everyone else’s preferences. Rod (A. Smith) 22:42, 4 June 2008 (UTC)[reply]

A soft redirect with a template modelled on {{alternative spelling of}} sounds like a good solution to me, as long as the main entry also gets an Alternative spellings section listing the Swiss form. --EncycloPetey 02:30, 5 June 2008 (UTC)[reply]

I agree with EP, assuming Swiss Standard German is a dialect of German, not a ==Language==.—msh210℠ 16:21, 5 June 2008 (UTC)[reply]

I think we're referring to actual alternative spellings here, sort of like US/UK, whereas Swiss German proper is closer to the Standard French/Quebec Joual part of the Language/Dialect spectrum (abstracting social factors). Circeus 17:03, 5 June 2008 (UTC)[reply]

Note that Swiss German (ISO gsw) is different from Swiss Standard German (ISO de-CH). The former is an almost exclusively spoken (as opposed to written) language, and is nearly impossible for a speaker of only Standard German to understand. The latter is much closer to Standard German, easier for Germans to understand, with entirely predictable changes in pronunciation and spelling, and few differences in vocabulary. Rod (A. Smith) 18:46, 5 June 2008 (UTC)[reply]

I know that ;) Circeus 19:27, 5 June 2008 (UTC)[reply]

If you consult the entry at w:wiktionary:de:geniessen, you will note that the last edit was in January 2008 and that they were already using a soft redirect with a template long before this question was ever asked, as well as a link from the general spelling. If you want to know what to do in another language, you should check how it is done in that language Mike Hayes 06:53, 26 July 2008 (UTC)[reply]
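The ß to ss correspondence described in this thread is regular enough to sketch in code. The following Python is only an illustration under stated assumptions: the function names are hypothetical, and the way {{alternative spelling of}} is given its argument is assumed rather than taken from the template's documentation.

```python
def swiss_spelling(standard: str) -> str:
    """Return the Swiss Standard German spelling by replacing every ß with ss."""
    return standard.replace("ß", "ss")


def soft_redirect(standard: str) -> tuple[str, str]:
    """Return (page title, page body) for a soft-redirect entry.

    The template name comes from the discussion; passing the Standard German
    spelling as its first parameter is an assumption made for this sketch.
    """
    title = swiss_spelling(standard)
    body = "{{alternative spelling of|" + standard + "}}"
    return title, body


title, body = soft_redirect("genießen")
print(title)  # geniessen
print(body)   # {{alternative spelling of|genießen}}
```

The sketch only covers the spelling change; it does not cover the vocabulary differences mentioned above, and it does not add the Alternative spellings section to the main entry.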

Phrasal verb SoP tests

I would like to take some time to hammer out some guidelines to apply to phrasal verbs with a view to creating a reasonably streamlined decision making process for those recurring borderline cases where it can be difficult to decide if the entry really is a phrasal verb, or is SoP.
My interest stems from the fact that the number of classified phrasal verbs has grown from 22 to 880 in about 1½ years. And there are well over 3,000 in common use, meaning that there are still a large number of possibly long-winded discussions on the horizon which would benefit from a set of basic guidelines, in the same way we often decide set phrases by using one of the standard tests. For those who don't know, the list can be found at Category:English phrasal verbs

To start the ball rolling, I think we need to agree on some basic principles:-

A phrasal verb is composed of a word that is normally used as a verb, plus a particle which in any other construction would either be a preposition or an adverb. There can be one or two particles, but never three.
The plane takes off at 15.00. - Could you look after my baby for half an hour, please? - John ran off with Jane last week.
It is often, but not always, idiomatic. It can have both an idiomatic and a literal meaning.
Look out for the broken glass. -- Look out the window and tell me what you can see.
It can be either transitive or intransitive, or both.
The plane takes off at 15.00. - Take off your shoes before entering, please.
A phrasal verb often has more than one meaning.
The meaning of the verb in a phrasal verb often, but not always, changes
The particle has a strong meaning content.
In throw away, run away, and most (but not all) other away phrasals, away strongly indicates separation or disappearance.
(Please add more to this list)

Typical particles found in phrasal verbs:-

aback, about, across, against, along, apart, around, aside, at, away, back, by, down, for, forth, forward, in, into, off, on, onto, out, over, past, round, through, to, together, up, upon, with, without, and any others you might think of, please add.
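As a rough illustration of the shape described above (one word followed by one or two particles, never three), here is a minimal Python sketch. The names PARTICLES and is_candidate are hypothetical, the particle set is copied from the list above, and the check is purely about form: it does not confirm that the first word is a verb, it says nothing about idiomaticity, and it ignores separated uses such as "pick it up".

```python
# Particle list copied from the discussion; it is explicitly open-ended there.
PARTICLES = {
    "aback", "about", "across", "against", "along", "apart", "around",
    "aside", "at", "away", "back", "by", "down", "for", "forth", "forward",
    "in", "into", "off", "on", "onto", "out", "over", "past", "round",
    "through", "to", "together", "up", "upon", "with", "without",
}


def is_candidate(phrase: str) -> bool:
    """True if the phrase is one word followed by one or two listed particles."""
    words = phrase.lower().split()
    particles = words[1:]  # everything after the presumed verb
    return 1 <= len(particles) <= 2 and all(p in PARTICLES for p in particles)


for phrase in ["take off", "run off with", "look forward to", "take the laces off"]:
    print(phrase, is_candidate(phrase))
```

A filter like this could only narrow down which titles are worth discussing; the SoP question itself still needs the semantic and syntactic tests debated in this thread.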

Areas of uncertainty and conflicting opinion:-

Possible phrasal verbs that appear to have only a literal meaning, but occur in grammatical constructions where one would expect to see a phrasal verb.
Common collocations.
(Please add more to this list)

Thanks in advance for your input. -- Algrif 17:51, 5 June 2008 (UTC)[reply]

Is it possible that there is no good intrinsic test of whether a phrasal verb merits an entry in Wiktionary? It seems that phrasal verbs are a rather amorphous category. Is it possible that the merit of having an entry for one depends on the number of possible combinations of definitions of the verb and the particle? I note that both the verbs and the particles often have a very large number of definitions. Although it is, in principle, possible for a user to test each possible combination of verb meaning and particle meaning to find the meaning of a phrasal verb, it would seem very difficult to do so in practice. Thus an extrinsic test would be warranted, based on the product of the number of verb definitions and the number of particle definitions. The number would not have to be dramatically large in order to warrant including the phrasal verb as an entry. Perhaps both the verb and the particle should be required to have at least two (three?, more?) definitions in general use (excluding special contexts). DCDuring TALK 18:39, 5 June 2008 (UTC)[reply]
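The extrinsic test proposed here can be written down as a small Python sketch. The function name and both thresholds are hypothetical: the comment suggests a minimum of two (or more) senses each and a product that need not be "dramatically large", but gives no actual cut-off, so min_product below is an assumed placeholder.

```python
def warrants_entry(verb_senses: int, particle_senses: int,
                   min_each: int = 2, min_product: int = 6) -> bool:
    """Extrinsic test: enough senses on each side, and enough combinations overall.

    min_each reflects the 'at least two' suggestion in the comment above;
    min_product is an assumed value, not one proposed in the discussion.
    """
    enough_each = verb_senses >= min_each and particle_senses >= min_each
    combinations = verb_senses * particle_senses  # meanings a reader would have to try
    return enough_each and combinations >= min_product


print(warrants_entry(10, 8))  # True: 80 combinations to search through
print(warrants_entry(1, 20))  # False: the verb has only one sense
```

Sense counts would have to exclude special contexts, as the proposal says, which a simple count of definition lines does not do by itself.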

It is possible, I suppose. However, preparing a description of the characteristics of phrasal verbs will allow us to (1) write an appendix for users who want to know more, and (2) allow us to weed out obvious cases without repeating the same discussions over and over.

As far as #2 in the list above, I would say that the first take off is a phrasal verb, but the second example (the "literal" one) is not phrasal at all. It is an example of the verb look with a prepositional phrase "out the window" that answers the question "where". So one characteristic of transitive phrasal verbs is that the object is an object of the verbal phrase, and not the object of the preposition. Example: In the sentence "Take off your shoes," what is being taken off? the shoes. Whereas in the sentence "Take the laces off your shoes," it is laces that are being taken, and the prepositional phrase "off your shoes" answers an adverbial question of where they are being taken. So, a phrasal verb parses with a noun or pronoun object answering what in relation to the verb, but if the verb is not phrasal then it parses with the particle as part of a larger prepositional phrase answering where or how. Intransitive situations are harder to judge, because they don't have an object and so do not exhibit this behavior. --EncycloPetey 19:50, 5 June 2008 (UTC)[reply]

Likewise "with Jane".

For intransitive wouldn't it more or less be a question of whether it's idiomatic? DAVilla 06:07, 6 June 2008 (UTC)[reply]

The CGEL argues that there's no such thing as a phrasal verb, though I don't recall the arguments used.--Brett 17:17, 7 June 2008 (UTC)[reply]

Well "phrasal verb" is definable and worth publishing books about. There are dictionaries of phrasal verbs. All grammatical categories are mere creations of students of language anyway. To me the main issue is whether the entries are useful and, secondarily, whether the category is useful. The entries make it much easier to find the meaning of a particular verb-particle construction, especially because the verbs and particles involved are both among the most polysemous of English words. The categories are useful to maintain and review the ~~categories~~entries. I am unclear as to what value the categories have for our users, except to suggest that looking for verb-particle combinations might be fruitful in subsequent searches. DCDuring TALK 18:07, 7 June 2008 (UTC)[reply]

So, are you suggesting that we not include them? Phrasal verbs may not be syntactic units, but we are a dictionary, not a grammar-book. We should strive for accuracy in describing the syntax of our headwords (e.g. labeling determiners as such, attributive nouns as such, etc.), but that shouldn't bleed over into omitting headwords that are syntax-decomposable but not semantics-decomposable. (Or are you simply saying that we can't rely on syntactic arguments in deciding which to include?) —Ruakh_TALK 18:56, 7 June 2008 (UTC)[reply]

By all means, let's add phrases that have idiomatic senses. I'm saying that I suspect that we won't succeed in finding consistent tests for "phrasal verbs". I'll go back and read what Huddleston & Pullum have to say about it when I get a chance though. I think they generally argue that they're simply verbs with prepositional complements.

With the example of run off with, for example, we have go off with, wander off with, stroll off with, etc. You could also front the with if you were so inclined (i.e., with whom did she run off?)

There are, however, certain oddities, such as pronouns being disallowed in certain constructions where other nouns are fine (e.g., pick up the paper vs. *pick up it.)--Brett 00:37, 8 June 2008 (UTC)[reply]

I for one have even suggested that the heading Phrasal verb should be allowed as a PoS, but I understand from previous discussions, and some comments here, that this would be difficult, or even impossible to pursue. However, I think the category is very useful, particularly for English L2 who struggle to learn the phrasal verbs, and find this list to be a great aid. (I have even had emails thanking me for pushing it!) So I agree with EP that "preparing a description of the characteristics of phrasal verbs will allow us to (1) write an appendix for users who want to know more, and (2) allow us to weed out obvious cases without repeating the same discussions over and over.".
There are many respectable phrasal verb dictionaries, and I believe that Wikt can be better than most simply by attempting to formulate a few "tests" for inclusion (a la "fried egg" and "Egyptian pyramid"). One example that I see in this discussion is run off with. Why is this different from wander off with? Or maybe they are not so different after all? Certainly run off with 1. has more than one meaning, 2. has the idiomatic meaning of steal, (not forgetting the to get married example above of John and Jane), 3. changes the basic meaning of run, and 4. demonstrates what I call inseparability of meaning, that is to say, remove either particle, and the sense changes. Wander off with, if it can be demonstrated to also mean steal, which is a possibility, would also qualify as phrasal verb for the same test.
Grammatical construction tests I think would be good, as per pick it up vs pick up it. EP makes a similar point in his parsing analysis of take off. If examples can be shown to parse as an inseparable unit, eg. Take off your shoes before you enter. then there is a case for there being a phrasal verb entry, even though take + off could appear to be SoP in other examples, eg. Take the laces off the shoes.
I also think that a definitive list of possible particles would help in the weeding out process. -- Algrif 16:19, 8 June 2008 (UTC)[reply]

It appears that I had somewhat misremembered. The CGEL argues against phrasal verbs mainly at a terminological level where, "The view taken here, however, is that the underlined expressions in the [a] example in [3], despite their idiomatic interpretations, do not form syntactic constituents, any more than the underlined word sequences in the [b] examples form constituents." The forms they contrast are:

  Kim referred to your book        vs.    He flew to the capital
  He put in his application        vs.    He carried in the chairs
  I look forward to seeing you     vs.    I ran forward to the desk
  He paid tribute to his parents   vs.    He sent money to his parents

The CGEL does, however, recognize "prepositional verbs". These are those verbs that select a prepositional phrase complement containing a specified preposition along with its own complement. Notice that under this definition, referred itself is a prepositional verb, but referred to is not. Those verbs that select a specified PP complement are of two kinds: those with mobile PPs and those with fixed PPs. Verbs taking non-specified PP complements and those with specified but mobile PPs can be distinguished from those with fixed PPs, according to the CGEL, in that the fixed ones do not allow:

fronting of the preposition in relative, open interrogative, and it-cleft constructions.
repetition of the preposition in coordinated complements (e.g., *I came across some pictures and across some letters.)
insertion of an adjunct between the verb and the preposition.

Another test in which the prepositional verbs can generally be distinguished from others is

Passives are usually allowed in the case of prepositional verbs, but not always. In contrast, they are usually unacceptable in non-prepositional verbs.

The CGEL notes 6 possible constructions

  verb [prep + O]
  verb O [prep + O]
  verb [prep + O] [prep + O]
  verb [prep + C]
  verb O [prep + C]
  verb [prep + O] [prep + C]

--Brett 14:41, 10 June 2008 (UTC)[reply]

On top of the category of prepositional verbs, the CGEL also recognizes a verb-particle-object construction (e.g., take down the poster, let go her hand, make clear the intent), verbal idioms containing intransitive prepositions (e.g., He gave in), verbal idioms containing NP + transitive prepositions (e.g., we lost sight of our goal), and other types of verbal idiom (e.g., make sure, given to understand, make do with, have in mind to change his hair).--Brett 15:04, 10 June 2008 (UTC)[reply]

Hidden categories

I don't know if this MediaWiki feature has been enabled here, but if so, polyethylene shows one good example for where it would be useful. __meco 13:13, 6 June 2008 (UTC)[reply]

Why? I'm more likely to spot a problem because of the category list than by seeing it in one of the various little categories. So, for me, the visible categories make it more likely I'll do something to improve the article. I expect this is true for others as well. If the categories were hidden, then that benefit would go away. --EncycloPetey 13:28, 6 June 2008 (UTC)[reply]

I, being a Norwegian native speaker, have a different perspective on this. I check the Norwegian category and then clean it out on a not too regular basis. __meco 13:54, 6 June 2008 (UTC)[reply]

Registered users have the option of having the hidden categories appear. To me the question is the effect of hiding a category on unregistered users and occasional users who are not familiar with the choices afforded them by WT:PREFS and even my preferences.

I can see the problem with polyethylene. The non-English language-specific maintenance categories would appear to be prime candidates for hiding. DCDuring TALK 14:24, 6 June 2008 (UTC)[reply]

It seems to me that having these categories on display is useful, if people want to remove them from the entries they should fix the cause of the problem instead of just hiding the categories. Conrad.Irwin 14:30, 6 June 2008 (UTC)[reply]

That doesn't quite work. No matter how much I (or Meco) want to get rid of them, we can't; we don't have the information. I certainly don't know any of the genders in question (;-). It isn't like a syntax error or a bad header or some such. Robert Ullmann 14:39, 6 June 2008 (UTC) And as noted, editors can show them, users won't see them (and would probably only be confused by them) Robert Ullmann 14:47, 6 June 2008 (UTC)[reply]

Hiding them (maintenance cats that tend to flock together like these) is a very good idea: we should much more often think about the presentation to users, not (just) editors. That is the idea isn't it? That most people looking at a page are using the project as a reference, not involved in edits? I put the magic word for the ttbc categories in the boiler-plate template {{ttbccatboiler}}, so that we could show them again with one edit if desired, it seems to be satisfactory. In this case, a {{gendercatboiler}} (with a couple of parameters for the genders in the language as well) could do the same, and generate the text in most cases. I do think we should consider carefully for each class of cats whether they ought to be hidden, and not go overboard. This class seems to me to be something to be hidden by default. Robert Ullmann 14:36, 6 June 2008 (UTC)[reply]

It would be almost ideal if registered users could display only hidden categories that were for languages in their Babel listings (above a certain level!!!) or specify which language or other class of category to display. DCDuring TALK 15:15, 6 June 2008 (UTC)[reply]

But how much of our obscure language cleanup/improvement is done by anon editors? How many people here first found out about these cleanup categories because they saw them on a page and decided to do the cleanup? Why is it that we want to show lots of redlinks, but we want to hide all the categories? This seems like we're shooting ourselves in the foot. --EncycloPetey 17:53, 6 June 2008 (UTC)[reply]

Actually, I think you shot yourself in the foot with that initial question EP. In my experience, the answer is basically none. Anons don't do the tedious cleanup like categorizing, formatting, etc. They do add translations, but that's about the extent of it. I am also in support of hidden categories, as I think the masses of clumped cats would make our site look ugly, and those who want to see them can. -Atelaes λάλει ἐμοί 18:01, 6 June 2008 (UTC)[reply]

If we hide the missing gender categories, there should be some alternate display to alert users/editors that the translations are missing gender information. Perhaps do this by adjusting the {{g}} template so that it displays an asterisk, with a corresponding note added to the bottom of the table explaining that those translations marked with asterisks are missing gender information. Perhaps add it to {{trans-bottom}}, and have its display (and categorisation in a supercat of the language-specific ones) triggered by the presence of a {{g}} template (I don't know if this can be done, if it can't then editors/AutoFormat could add a {{translations missing gender}} template above {{trans-bottom}} where required).

As a way of cleaning up some of these, would there be a way for a bot to read any gender information from our entry for that word and/or the foreign language Wiktionary? If so, this would seem a natural fit with what tbot does.

Tbot and/or AutoFormat could add {{g}} templates to translations that have no gender template. This would obviously need to work with either a list of languages that have gender or a list of those that don't. Thryduulf 20:59, 6 June 2008 (UTC)[reply]
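The selection step of that bot idea might look like the following Python sketch. Everything here is an assumption made for illustration: the Translation record, the GENDERED set and the function name are hypothetical, translations are modelled as plain records rather than parsed wikitext, and nothing is implied about how Tbot or AutoFormat actually work.

```python
from typing import NamedTuple, Optional


class Translation(NamedTuple):
    language: str
    word: str
    gender: Optional[str]  # e.g. "m", "f", "n"; None when no gender is recorded


# Hypothetical list of languages that have grammatical gender.
GENDERED = {"French", "German", "Norwegian"}


def missing_gender(translations: list[Translation]) -> list[Translation]:
    """Return the translations that would need a {{g}} marker added."""
    return [t for t in translations
            if t.language in GENDERED and t.gender is None]


table = [
    Translation("Finnish", "polyeteeni", None),   # language without gender: skipped
    Translation("French", "polyéthylène", "m"),   # gender already present: skipped
    Translation("German", "Polyethylen", None),   # gendered language, gender missing
]
print([t.word for t in missing_gender(table)])  # ['Polyethylen']
```

As the comment above notes, the same result could be reached from a list of languages that lack gender instead; the sketch simply picks one of the two options.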

Is there any need to have Category:Requests for autoformat show? Can we hide it?—msh210℠ 18:48, 11 June 2008 (UTC)[reply]

Proposed change of wording to {{PL:pedia}}

Copied from Template_talk:PL:pedia to allow for a wider audience.

As this template is for External links, or See also sections, the wording is out of place. The fact that Wikipedia has an article is not the same as saying that more information can be found by clicking on the link.

I would like to change this template to one of the following:

(The French) Wikipedia's cat article.
Cat on (the Spanish) Wikipedia.

instead of

(The German) Wikipedia has an article on “Cat”.

So that the flow of meaning is more consistent. I have intentionally removed the link from Wikipedia, I don't think that is necessary anymore. I have also intentionally removed the quote marks, as the emboldening suffices to remove the literal meaning of the word. In both of the examples the words in brackets are intended to be removed for links to the English Wikipedia. Conrad.Irwin 14:57, 23 May 2008 (UTC)[reply]

That makes sense to me; and, likewise with all the other {{PL:*}} templates. —Ruakh_TALK 18:35, 23 May 2008 (UTC)[reply]

Oh, but if we're not going to be linkifying the project names, maybe we should include their tag-lines, like:

Cat on (the Spanish) Wikipedia, the free (Spanish) encyclopedia.

? —Ruakh_TALK 18:42, 23 May 2008 (UTC)[reply]

To be honest I'm not a big fan of taglines they just (as I suppose they are intended to) sound like spam. Conrad.Irwin 00:01, 7 June 2008 (UTC)[reply]

I sometimes accidentally click on the blue-linked "Wikipedia" instead of the subject, so I like the recommended change. It would seem to be the kind of mistake many would make. How about a small-target link for those who want information about the Wiki, not the subject? Or perhaps we could keep the full-size blue-link for an "initial" period. Species, for example, might warrant it. DCDuring TALK 16:56, 7 June 2008 (UTC)[reply]

Yes, for Wikispecies some explanation would be useful, though whether it is better to link to the Wikipedia article or to the project's main page I'm not sure. I set up {{PL:pedia2}}, so if people want to experiment then they can do so. As this seems unopposed, I'll make the change to {{PL:pedia}} in the next few days. Conrad.Irwin 10:29, 8 June 2008 (UTC)[reply]

Variant forms of Chinese Characters?

How should we deal with variant forms of Chinese characters? (I’ll archive this discussion at Wiktionary:About Chinese characters afterwards.)

Currently they are listed without much indication of which form they are, and variant forms are only listed under the vague “see also” hatnote.

Distinguish two issues:

character itself (the topic here)
use in particular languages (I’ll address this in a second thread)

For the character itself, there are two issues:

what form is this character?
what are the variants?

Concretely:

I looked at these characters 1:歩 2:步 and became confused.
I suggest adding new fields to {{Han char}}:
- one called f= for which form, and
- separate fields to list all variants.

As far as I can tell, all characters fall into one of the following 9 categories; I list suggested abbreviations below.

In outline, there are:

Traditional forms (the core)
Simplified: Simplified Chinese, shinjitai (Japanese simplifications)
Country-specific: Japanese, Korean, Vietnamese
…and a few other Japanese ones (Japanese-specific traditional forms, idiosyncratic simplifications, and errors)

In detail:

Chinese
- t = traditional
- s = simplified
Japanese (see: Category:Japanese-only CJKV Characters)
- ky = kyūjitai（旧字体）(generally the same as traditional)
- sh = shinjitai（新字体）(often the same as simplified)
- kj = kokuji（国字）: Japanese characters (coined in Japan)
- as = Asahi characters : used only by the newspaper Asahi Shimbun
- gk = ghost kanji (ja)（幽霊字）: mistakes in JIS

(Note that Jōyō kanji （常用漢字）and Jinmeiyō kanji （人名用漢字）are written in shinjitai, while Hyōgaiji （表外字）are written in kyūjitai.)

Korean (see: Category:Korean-only CJKV Characters)
- hj = hanja (ko) Korean-coined; few

Vietnamese (see: Category:Vietnamese-only CJKV Characters)
- cn = chữ Nôm (historical interest)

Characters should be classed as Traditional or Simplified if possible; for instance, only class a character as kyūjitai if that form differs from Traditional (but list both the traditional form and the kyūjitai form as variants on the simplified character page). Non-Traditional/Simplified should be classed into the relevant “(Country)-only CJKV Characters” category.

Returning to my original problem, 1:歩 is the shinjitai form, used in Japanese, and 2:步 is the Traditional form, used in Chinese, and kyūjitai form (this is clear in my browser setup because I have different fonts for Japanese and Chinese coverage): they should be marked as such and refer directly to each other as variant forms.

How does this sound?

Nbarth (email) (talk) 00:52, 8 June 2008 (UTC)[reply]
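The abbreviations proposed above could be held as a simple lookup table. The sketch below is only an illustration: the record layout (`form`, `variants`) and the function name are assumptions, not the actual parameters of {{Han char}}.

```python
# Abbreviations proposed in the post above, mapped to the form they stand for.
FORM_CODES = {
    "t": "traditional",
    "s": "simplified",
    "ky": "kyūjitai",
    "sh": "shinjitai",
    "kj": "kokuji",
    "as": "Asahi characters",
    "gk": "ghost kanji",
    "hj": "hanja (Korean-coined)",
    "cn": "chữ Nôm",
}

# Hypothetical records for the two characters discussed in the post.
# 步 is classed as traditional because the proposal prefers
# Traditional/Simplified over kyūjitai when the forms coincide.
CHARACTERS = {
    "歩": {"form": "sh", "variants": ["步"]},
    "步": {"form": "t", "variants": ["歩"]},
}


def describe(char: str) -> str:
    """Return a one-line summary of a character's form and its variants."""
    entry = CHARACTERS[char]
    variants = ", ".join(entry["variants"])
    return f"{char}: {FORM_CODES[entry['form']]} (variants: {variants})"
```

With the form and the variants stored as separate fields, each variant page can point directly at the other instead of relying on a "see also" hatnote.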

The template {{zh-forms}} is similar to what I’m talking about, but is specific to Chinese, and deals more with phrases/multiple characters than with individual characters.

Nbarth (email) (talk) 01:27, 8 June 2008 (UTC)[reply]

I also created a {{ja-forms}} template, which might suit your needs. See 図書館 for example. -- A-cai 06:38, 8 June 2008 (UTC)[reply]

Hi A-cai,

Thanks, yes, {{zh-forms}} and {{ja-forms}} address a form of my second question “show the variants”, though I think variant forms should be listed in the {{Han char}} entry itself, for instance, so they can easily be displayed in-line or extracted programmatically; displaying the {{zh-forms}}/{{ja-forms}} box would ideally be selected by user preferences.

Thinking on this more, I think what I’m suggesting is:

Extend {{Han char}} to also reflect the form of the character, as in {{cmn-noun}} and outlined above.
In fact, rather than a f= named parameter, have an (optional for now, mandatory once forms have been fixed for existing entries) positional parameter, as in {{cmn-noun}}, which states what form a character is. This should have t, s, ts, tsh, tssh for various combinations of traditional, simplified, shinjitai.

Nbarth (email) (talk) 15:09, 8 June 2008 (UTC)[reply]
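One way the combined codes (t, s, ts, tsh, tssh) might be read is as a concatenation of the single codes t, s and sh. That reading is an assumption on my part, as is the function name; the sketch below splits a code on that basis, trying "sh" before "s" so that "tsh" is read as traditional plus shinjitai.

```python
# Single codes that may be combined; "sh" is listed first so it wins over "s".
COMBINABLE = ["sh", "t", "s"]
NAMES = {"t": "traditional", "s": "simplified", "sh": "shinjitai"}


def split_form_code(code: str) -> list[str]:
    """Split a combined form code such as 'tssh' into the forms it names."""
    forms = []
    i = 0
    while i < len(code):
        for candidate in COMBINABLE:
            if code.startswith(candidate, i):
                forms.append(NAMES[candidate])
                i += len(candidate)
                break
        else:
            # No known code starts at this position.
            raise ValueError(f"unknown form code: {code!r}")
    return forms
```

Under this reading, "tssh" marks a character whose traditional, simplified and shinjitai forms all coincide, while "tsh" marks one where only the simplified form differs.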

As this seems not to have raised objections, I’ve put a form of the above classification and suggestions (without modifying {{Han char}}) at Wiktionary:About Chinese characters#Categories of characters – if it’s not ok, please change.

Nbarth (email) (talk) 23:15, 12 June 2008 (UTC)[reply]

Use of Chinese Character Forms in specific languages

This addresses the second point in the above thread

I suggest:

For the main (Translingual) character section:

if a meaning (but not the form) is country-specific, it should be at the end, and flagged as such

The only examples I know of are kokkun（国訓）: Japanese meanings, for which we could add a template: {{ja-kokkun}}

For the language-specific sections:

if a character form is the standard form, continue as present
if a character form is not used in a given language, such as a Simplified Chinese character that does not agree with the shinjitai form, like 门, then that language should not be listed on the page
if a character form is non-standard but used, like kyūjitai that are tolerated in names, it should be flagged as such (for this example, {{ja-kyu-name}}, unless someone knows the official name for such characters), and only permitted uses included. Notably, kyūjitai forms of Japanese words are ok to include (for historical reference), but should be clearly flagged as such.

Nbarth (email) (talk) 00:58, 8 June 2008 (UTC)[reply]

On “forms not used in a language”: examining ja:歩 shows that while it does list the other languages, it is only to say “use the other form”. (Or whatever “步参照” means.)

Nbarth (email) (talk) 00:27, 9 June 2008 (UTC)[reply]

Permissions: Add Template:yue-hanzi to correct category

Could someone so-permissioned add:

<noinclude>[[Category:Chinese templates]]</noinclude>

to Template:yue-hanzi, as it is an (important) Chinese template? (Following Template:cmn-hanzi.)

Thanks!

Nbarth (email) (talk) 16:28, 8 June 2008 (UTC)[reply]

Specific Universal Changes in Wikisaurus

You said a broad hand, but let's just say I don't want to overstay my welcome. I see several changes that I think would be beneficial to the wikisaurus project.

The header is kinda clunky and could stand a cleanup. I've created a proposed header here. My reasons for this change are to remove some of the extraneous white-space within the page in order to bring the user's main interest in immediate view, and to give a better aesthetic to the page.
Many of the categories suggested in the template page are actually duplication of effort from the wiktionary pages. I feel we should remove the duplication of effort and therefore simplify the project. See altered page.
Much of the page will be duplicated in the actual wikisaurus process. Couldn't we just let the process be our workhorse? Amina (sack36) 13:24, 10 June 2008 (UTC)[reply]

The header/logo seems to take up too much precious vertical space on the first screen. If it could appear in the upper right, it could be the same size (or even larger). If you feel it needs to be on the left, it should be narrower. (It could be wider.) DCDuring TALK 14:11, 10 June 2008 (UTC)[reply]

The header/logo actually took up a great deal more space on the original pass. This is quite a bit shortened. Amina (sack36) 17:14, 10 June 2008 (UTC)[reply]

Space above the fold is precious. Every extra keystroke it takes for someone to get what they want is a problem, whether it's paging down or clicking on a link. The existing wikisaurus pages seem extravagant in their use of space, both horizontally and vertically. Whitespace certainly has its uses, but not for squeezing important content off the first screen. Some of the efforts to improve Wiktionary have focused on getting the table of contents for an entry onto the otherwise underutilized space on the right side of the screen; hiding long lists of translations and related and derived terms; horizontalizing lists such as of synonyms; and even pushing sister project links toward the bottom of the entry. Long etymology and pronunciation sections also squander space above the fold. In-line citations can also, but usually farther down the page. DCDuring TALK 18:44, 10 June 2008 (UTC)[reply]

One thing, on Wikisaurus:obese, all the links are prefixed with ws: (which doesn't work, but looks like it should go to wikisource), shouldn't they just link straight to the dictionary entries? Conrad.Irwin 18:48, 10 June 2008 (UTC)[reply]

What is the purpose of having all that extra stuff on the wikisaurus pages? Etymology and pronunciation belong in wiktionary, as do translations. As for related and derived terms, that's the whole point of a thesaurus. They should act as synonyms, with the closest match being listed first and the least likely, the derived and the archaic at the end. The pages should be simple with the logo and the selected word at the top, the synonyms next and the antonyms last, each with their own header, of course. That's it. No extra stuff that's already being taken care of in wiktionary. Amina (sack36) 20:58, 11 June 2008 (UTC)[reply]

REDIRECTION IN WIKISAURUS -- I noticed when doing the wikisaurus that the word "fat" had been redirected to "obese". I see two problems with that.

The word "fat" has several meanings. By redirecting to obese, you don't allow the "fat" meaning "lard" or the computer term (although that may be spelled "phat" these days).
Redirection end-runs the ease of the simple model. It makes the entire creation of wikisaurus orders of magnitude harder. Now we have to determine through programming which sense is meant. If we leave it open and put all the different meanings of a word on the same wikisaurus page, we don't have to make that determination; the user can. Amina (sack36) 09:21, 12 June 2008 (UTC)[reply]

The redirection is from a specific sense of fat: "in the sense of obese" I would have thought that is precisely what we want. I would also think that we would want to substitute the wikisaurus link for part of the list of synonyms. It would be particularly nice if the total synonyms line for a sense would only be one line long on a full screen device. Can the wikisaurus link be on the same line as a short list of synonyms? DCDuring TALK 11:22, 12 June 2008 (UTC)[reply]

I'm not sure I understand what you're saying, DC. Are you saying we want the people to be redirected? Wouldn't that defeat the purpose of simplicity? Also, why would we want to limit the synonyms of a given word? Isn't that counter to the establishment of the wiki in the first place? Amina (sack36) 16:05, 12 June 2008 (UTC)[reply]

He's saying that Wikisaurus:fat (obese) is a redirect to Wikisaurus:obese, but that Wikisaurus:fat is no such thing. —Ruakh_TALK 19:21, 12 June 2008 (UTC)[reply]

I must have misunderstood which application of a Wikisaurus link we were talking about. I'm not sure how folks first come to Wikisaurus and how people use Wikisaurus once they are aware of its existence. I focus on the links from Wiktionary entries, which, I assume, are always under the synonyms header. The synonyms headers are likely to be "redirecting" and, in a proper entry for a multi-definition word, will be doing so from a specific sense. In my analysis, the user would already know for what sense of a word he was seeking synonyms.

I am simply unfamiliar with other means of using wikisaurus, which seem to be what Sack36 was talking about.

I justify my focus on the links from Wiktionary because that seems to be the most likely way Wikisaurus will capture users. Once 'Saurus handles that class of usage well for a large number of high-volume synonym classes, other interfaces might be developed or improved.

Is there any good information about what types of entries get visits and contributions? I assume that sex, invective, oaths, and insults are high on the list. Those classes of entries would at least provide sufficient traffic to generate tests of entry designs. DCDuring TALK 19:47, 12 June 2008 (UTC)[reply]

Wow, DC, you are so ahead of me! By the way, the redirect on fat was not labeled fat(obese). It was just "fat". Now, about the 'saurus. I'm going to summarize where we think we are at this time.

- The overall look to date is still too complicated
  - Concatenate the heading to allow visibility to the meat of the page
  - Remove all but synonyms and antonyms at this time
    - once a significant body of work is in place, more complexity can be added
- The portal into wikisaurus will be through the definition of each word in Wiktionary
  - once in wikisaurus, the pattern of finding things will change to a multi-dimensional link system.
  - wiktionary will maintain the bulk of the information, leaving 'saurus to do word linkage.

Does that outline seem right? Any additions/subtractions/alterations?

Well, actually I have one or two already. By using body parts as our introductory foray into the wiktionary we are inviting a great deal more of the perv group to join us in creation of this project. What say we hold off on the body parts until quite a bit later? In wiktionary there is no need for connections between pages. Each definition is autonomous. Wikisaurus is just the opposite. We'll be using links for almost everything. It's like a huge database where wiktionary is the main base while wikisaurus is the key. It uses one-to-many and many-to-many connections. Amina (sack36) 23:52, 13 June 2008 (UTC)[reply]
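The one-to-many and many-to-many linkage described above might be sketched as follows. The sample words, the `(headword, sense)` key, and the function names are hypothetical illustrations, not Wikisaurus's actual data or page structure.

```python
from collections import defaultdict

# Hypothetical sample data: one headword has many senses (one-to-many),
# and each sense links to many synonyms.
THESAURUS = {
    ("fat", "obese"): ["obese", "corpulent", "overweight"],
    ("fat", "lard"): ["lard", "grease"],
}


def senses_of(thesaurus, headword):
    """List the senses recorded for a headword, in alphabetical order."""
    return sorted(sense for (word, sense) in thesaurus if word == headword)


def build_reverse_index(thesaurus):
    """Map each synonym back to every (headword, sense) that lists it."""
    index = defaultdict(set)
    for (headword, sense), synonyms in thesaurus.items():
        for word in synonyms:
            index[word].add((headword, sense))
    return index
```

Keeping every sense of "fat" under one headword leaves the choice of meaning to the reader, while the reverse index gives the many-to-many route from any synonym back to the pages that list it.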

I'd have to spend more time looking at 'Saurus to be of any help. I've only been here during a period when it's been mostly neglected. Any illustrations of a good entry? Any illustrations of interesting features, bad features, common user problems (pervs, etc)?

With the multi-definition words I assume that the links are from the synonyms section, which may have multiple senses, more than one (but not all) of which will have corresponding 'Saurus entries.

When you say "Remove", I assume you mean "comment out" the content beyond synonyms and antonyms for now.

Can you say more about the "multi-dimensional link system" ? DCDuring TALK 00:12, 14 June 2008 (UTC)[reply]

Finalization of format - really!

Well it was an interesting week trolling the back roads of Wiktionary and Wikisaurus to find the different styles of wikisaurus pages and throwing away the duplicates and insane. I think I may have some bug bites left over from the body part section. If people have done this before me I don't want to know. Humor me. The different types seemed to break down into three noticeably different schemes. I've added a fourth that is of course perfect! I have the salient points of each and a URL so you can actually see them. I have indeed removed parts from a couple of them because they were being addressed in wiktionary where they belonged. Etymology is not a heading for Wikisaurus. If I disagreed with something being part of wikisaurus but I couldn't find it in wiktionary, it stayed in the pages it was found. We all get a blame in this is what I say. So here goes:

Style 1
1. the header (header is the same on all) is inline with the Table of Contents.
2. The differentiation of language is included on the line with part of speech.
3. All the links go to Wiktionary.
4. Pseudo-synonyms are included in a separate part of synonyms. (what's a pseudo-synonym?)
5. each part of speech is represented on a different page
Style 2
1. Table of Contents in line with header and text
2. Each meaning of the headword is labeled in the Table of Contents
3. Pseudo-Synonyms, Idioms, and Slang are included as separate sections
4. the synonyms etc. are presented vertically in a table
5. the table provides a synonym with a link to wiktionary, the same one with a link to its own page on wikisaurus, and a definition of that word.
6. each group of synonyms includes a definition as well.
Style 3
1. Table of Contents in line with header and text
2. the Headword provides a link to wiktionary
3. each synonym is represented in the same grid as described above
4. Language is the first breakout rather than meaning or part of speech
5. near synonyms, Colloquialisms and Slang are broken out to be on their own
6. near synonyms, Colloquialisms and Slang are presented offset from synonyms
7. wikisaurus links is listed, though I have no idea what function it performs.
8. Roget's Thesaurus classification for this headword is given.
Style 4
1. Table of Contents is along right hand border
2. the Headword provides the only link to wiktionary
3. a synonym is used in lieu of a definition to differentiate separate overall meanings
4. all parts of speech are kept on the same page
5. synonyms and antonyms are presented horizontally
6. only synonyms and antonyms are differentiated since Idioms, Colloquialisms and Slang will be defined as such in wiktionary.
7. all words will be headwords
8. table of contents shows the division of word meanings but synonym and antonym are assumed

Obviously the last one listed is mine. I'm hoping by seeing the page in action y'all will get a better understanding of how we can use the nature of the internet to do our work. The one change I'd like to make across the board is to change the size of the head word to be larger. It would really help people to know which word is the headword. Any comments? Questions? Complaints? Ice Cream?! Amina (sack36) 01:35, 15 June 2008 (UTC)[reply]

`{{wjargon}}` !vote

The outcome of the highly debated VOTE seems to be in favour of not including jargon in the main Wiktionary namespace. This leaves us with the question of what exactly to do with it. I would like to propose the following.

For terms that are only WMF jargon (!vote): Replace the page with {{only in|{{in glossary}}}}

For terms that are also used elsewhere (RFC): Add {{xsee}} or perhaps a more flexibly worded {{also in}} to the top of the page.

Anyone else have any thoughts? Conrad.Irwin 18:32, 10 June 2008 (UTC)[reply]

I'm not sure that I understand the implications of what you suggest. But the question of making Wiktionary jargon usable to new users quickly is vital. Terms used elsewhere in Wikiworld are of secondary importance.

I recollect that six months ago I was constantly coming across terms that I could not make sense of. I naturally hoped that I could find their meaning with one click from the screen on which they arose. Embedded blue-linked terms are great in that regard. I was, of course, unaware of the option of double-clicking on a word to open up the associated entry. I would try Help in the navigation pane. Then I would put the term in the search window, often not finding it, except after fooling with advanced search, making guesses as to where such terms might be. The things I would be looking for were: 1., pages that people referred to, 2., terms that had the look (to me) of insider Wiktionary jargon; and, 3., linguistic terms, especially those used in a somewhat different way in Wiktionary than in most of the rest of the world. All of these need to be addressed. They do not all fit into the category that was voted on. I would argue that we need a single readily accessible main Glossary that encompasses page shortcuts, Wiktionary jargon, and linguistic jargon that Wiktionary uses in a way different from prevailing meanings.

Each of the three approaches I tried needs to be covered, as well as any others that a new user might think of. Blue links are likely to work only for fixed text, not for discussions. Double clicking is the same as search with regard to the destination, but requires no typing, a big plus. The navigation screen could have a link to the Glossary. The Help page could have a link to the Glossary. The devices Conrad is suggesting only address the use of the search window. If I understand the proposal and how search works, they seem to be close to the best we can do without significant changes in how search works. Putting the Glossary in the navigation pane and making it more inclusive might help also. DCDuring TALK 19:20, 10 June 2008 (UTC)[reply]

Well we have Wiktionary:Glossary, this was mainly intended as a discussion of how to point people there, but we might need to improve it to incorporate more information. Conrad.Irwin 19:35, 10 June 2008 (UTC)[reply]

We have two glossaries, one for wiktionary jargon, one for terms used in entries. There is some overlap. Moreover, for new users, the distinction between the two is not likely to be obvious. What you propose eliminates the problem by directing the user to the appropriate glossary (presumably the jargony one). I am interested in helping new users find the terms now in both. User expectations formed in the process of finding one set of non-entry terms are likely to govern search for other types of non-entry terms. DCDuring TALK 20:25, 10 June 2008 (UTC)[reply]

New functionality of Template:rfscript

Per the apparent consensus of #Language specific help templates, I have added an additional function to {{rfscript}}. The template now takes a {{{lang}}} parameter, with the input being the ISO code of the language, similar to many existing templates. If a language parameter is entered, the template categorizes the word into [[:Category:{{{lang}}} articles which need {{{1}}} script]] instead of [[:Category:Articles which need {{{1}}} script]]. It seems to me that these categories should be double-categorized under their general script request category and the general language attention category, so I have placed Category:Sanskrit articles which need Devanagari script into both Category:Articles which need Devanagari script and Category:Sanskrit words needing attention. Just wanted to let people know and open up the floor to any conflicting opinions on the matter. -Atelaes λάλει ἐμοί 07:36, 11 June 2008 (UTC)[reply]
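The categorization rule described above can be sketched roughly as follows. The code-to-name table and the function names are assumptions for illustration; the post does not show how the template turns an ISO code into a language name.

```python
# Hypothetical ISO-code-to-name table; the template's real lookup is not shown.
LANGUAGE_NAMES = {"sa": "Sanskrit", "pa": "Punjabi"}


def rfscript_category(script, lang=None):
    """Return the category an entry is placed in by {{rfscript}}."""
    if lang is None:
        # No language given: use the general script request category.
        return f"Category:Articles which need {script} script"
    name = LANGUAGE_NAMES[lang]
    return f"Category:{name} articles which need {script} script"


def parent_categories(script, lang):
    """Return the two parents of a language-specific request category."""
    name = LANGUAGE_NAMES[lang]
    return [
        f"Category:Articles which need {script} script",
        f"Category:{name} words needing attention",
    ]
```

For example, `rfscript_category("Devanagari", lang="sa")` gives the Sanskrit-specific category, and `parent_categories("Devanagari", "sa")` gives the two categories it was placed into.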

PS, I think that in cases where a script is only used by one language (i.e. Avestan), I think it redundant to use the lang parameter. However, I think this the exception, not the rule. -Atelaes λάλει ἐμοί 07:40, 11 June 2008 (UTC)[reply]

I tried it with Punjabi but it did not seem to work: {{rfscript|pa}}. —Stephen 18:23, 11 June 2008 (UTC)[reply]

It still requires a script name, so: {{rfscript|Shahmukhi|lang=pa}} (or would it be {{rfscript|Arabic|lang=pa}}?). Then again, if the former, you shouldn't need to clarify the language, as Punjabi seems to be the only language using Shahmukhi. -Atelaes λάλει ἐμοί 18:35, 11 June 2008 (UTC)[reply]

ditransitive

This seems like a wonderfully obscurantist name for a common linguistic phenomenon. See {{ditransitive}}. If we are to keep it, it would seem to need to have a link to the principal namespace entry or to one of the two Glossaries, presumably the one for terms used in entries. Learners' dictionaries like Longman's DCE do not depend on this term, having a particular abbreviated notation for constructions taking "double objects" without prepositions. I suggest that it has no place among the terms that we use in making entries for our users. It of course deserves to be an entry. The phenomenon it describes deserves to be intelligibly noted in all of the relevant verb entries. It might be an appropriate name for the template, but "ditransitive" is yet another term we let discourage new users. DCDuring TALK 18:32, 12 June 2008 (UTC)[reply]

I tend to disagree. Wikipedia is not censored for nudity and Wiktionary is not censored for stupidity. Five minutes ago I had no idea what the word meant, but I found myself on an online dictionary and now do. While we shouldn't use overly technical terms just to look smart, if there is a term which describes a phenomenon, and it's the best term for the job, we should use it. I have no problem linking the term, like we do for {{archaic}}, and if there's a simpler and easier to understand word which means the same thing, then let's use it. Otherwise, let's stick with the correct term, and if people have the motivation to find out what it means, perhaps they can find a dictionary as quickly as I did. -Atelaes λάλει ἐμοί 18:50, 12 June 2008 (UTC)[reply]

Wiktionary is not just for us. We have some obligation to attempt to communicate to the larger population. The technical terms serve our needs in doing the work, but they fail in converting most new visitors to repeat users. We have 1.5% the usage of WP, 20% the usage of MW online, and 7% the usage of Answers.com. That doesn't seem like success to me. I would argue that we are aiming at an audience that consists of US. We don't even seem to be doing all that well at attracting more contributors, who would be people most like us. Will such an approach ever get donors interested? Will we ever be able to get technical resources for improvements without donor support?

There is nothing especially "correct" about technical terms. There are more convenient labels for phenomena for those who use them a lot. Web usability research suggests that a clickable link is not something that we can depend on for communication because users simply often don't click through, especially if they don't believe that they will get intelligible, useful information. Take a look at any successful site and you will see that clickable terms are almost always simple ones, especially on pages that a user encounters at the beginning of a visit to a site. Subsequent pages begin to reveal complexities and, yes, technical vocabulary. DCDuring TALK 19:25, 12 June 2008 (UTC)[reply]

We do not make lexical decisions based on economics or uncited research. I also disagree with your comment about the clickable terms on "successful sites". One of the most successful sites on the Internet during its first decade is UCMP, the University of California Museum of Paleontology. The clickable terms on their site during that time were almost entirely all technical terms. Thus, your premise is not a legitimate one, and valid conclusions cannot follow from your argument. --EncycloPetey 06:01, 13 June 2008 (UTC)[reply]

I have long had the feeling that this might be an ivory tower. Decisions appear to be made on unstated preferences, values, and standards as if this was some kind of academic institution. I would welcome a Vote to make explicit all such preferences, values, and standards.

What uncited research are you referring to? If it is the Site Analytics data about our low usage, I had previously offered the link for discussion [1]. If it is web usability research, I would welcome some discussion of the issue by my betters. As it is now we are making decisions based on utter ignorance of user behavior reinforced by apparent indifference to user needs.

I would be interested to determine what number of visits the UCMP site had or what constituted its success.

I am not at all opposed to including technical terms as entries, quite to the contrary. I am simply opposed to adding unnecessary technical terms that constitute a barrier to wider usage. DCDuring TALK 16:33, 13 June 2008 (UTC)[reply]

Like I said earlier, if there is a simpler route which conveys the same information, I am open to using it. Also, I wonder how much the use of ditransitive detracts from the experience of the average user. Most probably see it, perceive technical jargon, and gloss over (in much the same way as I imagine most average users gloss over "transitive"). It doesn't really take up that much space. However, those who are interested in grammatical understanding will either know what it means or take the time to look it up. The understanding of "ditransitive" is not necessary to learn the meaning of the word. When our software becomes a bit more sophisticated, perhaps we could have a setting to allow users to hide grammatical info if they're not interested in it. Until then, I am unwilling to let this information simply be prohibited. If we can have practical, useful information coexisting with esoteric word nerd information, why not do it? -Atelaes λάλει ἐμοί 16:47, 13 June 2008 (UTC)[reply]

{{ditransitive}} is used in five entries, four of them English. It doesn't seem to have taken our community by storm. There is a vast amount of usage information (especially about verbs) that we don't convey (use with prepositions [consistency and completeness are the issues with phrasal verbs], use with gerunds and infinitives, as well as double-object constructions), but it needs to be conveyed without technical vocabulary.

I favor hiding our use of technical terms by default. Templates referring to grammatical terms could be made to behave differently for anons and registered users. Anonymous users should not be confronted with an unfamiliar term like "ditransitive", but with something that might communicate the basics without requiring a click, but offering a click-through for further explanation. If registered users were to have the option of switching from the anon version to one using technical grammatical vocabulary, that would be great.

I am concerned with building up our usage from its amazingly modest levels. I would think this would be an objective worth working toward. If technical terms, obsolete definitions, complicated definition wording, and complex layout scare users away, that hurts us. I certainly don't care nearly as much how we refer to the underlying phenomenon anywhere outside principal namespace. "Transitive" and "intransitive" are terms that some non-negligible percentage of the population have been exposed to, as with the traditional names of parts of speech. DCDuring TALK 21:05, 13 June 2008 (UTC)[reply]
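The idea of showing different label text to different classes of reader could be sketched like this. The plain-language gloss and the preference flag are purely hypothetical; the thread does not settle on a lay replacement for "ditransitive", and this is not how any existing template behaves.

```python
# Hypothetical plain-language glosses; no wording was agreed in the thread.
PLAIN_GLOSSES = {"ditransitive": "takes two objects"}


def render_label(term, wants_technical=False):
    """Render a grammar label for a reader.

    wants_technical stands in for a registered user's preference;
    anonymous readers get the plain gloss when one exists.
    """
    if wants_technical:
        return f"({term})"
    return f"({PLAIN_GLOSSES.get(term, term)})"
```

A term with no plain gloss on record, such as "transitive" here, falls back to the technical label for everyone.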

Our model for grammar tags is to use the simplest tags that clearly express the grammar of the term, and to let readers click through to the glossary for the less familiar terms. Just as "transitive", "intransitive", and "reflexive" are useful to many interested readers, some readers may find the now clickable "ditransitive" tags useful. Of course, if you can think of a layman replacement for "ditransitive", please suggest it. Rod (A. Smith) 21:48, 13 June 2008 (UTC)[reply]

By the way, I created {{label}} back in October to allow us to create labels that display differently for linguists and laymen. Robert Ullmann disapproved of it, though, on the grounds that it would ~~confuse editors~~ break external automated consumers of our data. Perhaps somebody can suggest a better approach. Rod (A. Smith) 21:57, 13 June 2008 (UTC)[reply]

Thanks for the clickability. I believe our choice of vocabulary and conventions should be subject to usability evaluation, even testing. I am going to study Longman's DCE and other learner's dictionaries that have more usage information than conventional print dictionaries to find models. I doubt if I could successfully address the problem one term at a time, however. It would probably require some conventional notation or something.

What have the automated consumers of our data done for us lately? A facetious-sounding, but also serious question.

We have already successfully deterred many casual contributors by varied means. The survivors would seem capable of handling complications like differences between what is displayed for different classes of users. If we are going to open up more to new contributors, that would be different. DCDuring TALK 23:28, 13 June 2008 (UTC)[reply]

WT:ELE inconsistency

Just a quick question: I noticed that in Wiktionary: Entry layout explained the list of headers given right after #Order_of_headings has a different order from that of the sections (one put "translations" at the end, the other between "-nyms" and other terms). Which one is supposed to be current (as I was just looking to check that)? Circeus 01:36, 13 June 2008 (UTC)[reply]

I'm not sure what you're asking, but it sounds as though you are concerned that the headers are not discussed in the same sequence that they are to appear in an entry. Is this correct? If so, then this is partly historical, partly because the Translations section is more important than the Related or Derived terms sections, partly because the Related terms and Derived terms do not always appear nested as a L4 header, and possibly as well for other reasons I'm not aware of. In any case, the order given in the Order of heading section is correct as of the vote we took on the matter. In any event, it is not necessary that the sections be discussed in their L4 sequence because, as noted above, some of these headers do not always appear under a POS, and so it is convenient to discuss those sections together following the ones that do always appear under a POS header. --EncycloPetey 05:53, 13 June 2008 (UTC)[reply]

So as far as my actual question is concerned, that appears to boil down to "yes, when they are at the same level, translations are supposed to be last". Thanks. Circeus 18:53, 13 June 2008 (UTC)[reply]

That isn't what you asked. And, no, Translations are not always last; there are a few sections that follow them as noted in WT:ELE. --EncycloPetey 20:12, 17 June 2008 (UTC)[reply]

Non-gloss definitions

Most of the dictionaries I have distinguish between (a) gloss definitions, wherein the definition has the same part of speech and meaning as the defined term; and (b) non-gloss definitions, used for the relatively few terms for which it is difficult or impossible to provide a gloss. Such a distinction seems important to me, so I created {{non-gloss definition}} and applied it to definitions for of and hear, hear.

I don't know what the default style should be, but the Category:Form of templates are used to create non-gloss definitions, so I reused the 'use-with-mention' class from there. Questions, comments, observations, and points of refutation are welcome. Rod (A. Smith) 19:18, 13 June 2008 (UTC)[reply]

I like the idea, and while I'm not too happy with the way of implementing it (in particular, there's no way I can remember that template's name) I can't think of anything better. Conrad.Bot 13:55, 15 June 2008 (UTC)[reply]

I like it! :-) The next step is to decide how to word such definitions; for example, at of you're using a subjectless finite verb phrase of which the headword is the implicit subject, but at hear, hear you're using a determinerless noun phrase indicating what the headword is. Elsewhere, I've sometimes used determiner phrases (such as "The definite article") and sometimes adjective-y/non-finite modifier clauses ("Used to […]"). Each of these approaches makes sense, but it's probably best to aim for consistency. —Ruakh_TALK 15:35, 15 June 2008 (UTC)[reply]

Yeah, consistency would be good. I'm not certain what wording style to use, and as you say, all three styles have their advantages. Regardless, applying {{non-gloss definition}} (or whatever better-named template anyone might suggest) to such definitions will give us a convenient list of entries to review if we make up or later change our collective mind about the wording style. Rod (A. Smith) 20:57, 16 June 2008 (UTC)[reply]

True. :-) —part of our collective mind_THINK 22:55, 16 June 2008 (UTC)[reply]

I'm not convinced that this is a useful approach. In particular, this applies primarily to parts of speech that cannot be defined in a way that limits the part of speech. For example, no prepositions can be defined with definitions that are themselves prepositions. This applies also to pronouns, interjections, conjunctions, and the like. It seems rather silly to say that these parts of speech will receive definitions formatted in one way, but those parts of speech will use a special template to format all their definitions in a different way. The gloss / non-gloss distinction is relevant in many cases, but I think the way this template has been planned for use is a mistake that will lead to confusion rather than clarification. --EncycloPetey 20:08, 17 June 2008 (UTC)[reply]

I hadn't planned to limit {{non-gloss definition}} to any particular part of speech. Many senses of many prepositions can be defined using words that function in the same grammatical role as a preposition (of, over, with, etc.). So, yes, it would be rather silly to say that certain parts of speech will receive definitions formatted in one way, but other parts of speech will use a special template. Fortunately, nobody is saying that. Rather, this follows the sensible convention of nearly all of the respectable dictionaries I've seen. That is, definitions that are worded as glosses get one style. The few definitions that cannot be expressed well with a gloss get a different style. Do you know of a good dictionary that doesn't make that distinction? Rod (A. Smith) 20:34, 17 June 2008 (UTC)[reply]

You mean like MW3 ("of" — "used as a function word to..." gets no special formatting), AHD ("of" — "used to indicate an appositive" gets no special formatting), Oxford Advanced Learner's Dictionary ("of" — "used to show the position of something" gets no special formatting), and likewise for the Compact OED? In fact the only dictionary I own that makes a formatting distinction is the Random House Dictionary, 2nd ed. In which dictionaries have you seen this distinction made?

I realize you hadn't planned to limit this template to certain parts of speech. My point is that, intentionally or not, the explanation of how this template is to be used means that it will apply almost universally to certain parts of speech while almost never appearing in others. As you have noted, it will also provide a push for people to define prepositions in certain convoluted or unenlightening ways to avoid using the template, which I feel is another undesirable outcome of the template's use. --EncycloPetey 21:08, 17 June 2008 (UTC)[reply]

But MW3 does use special formatting. It begins each non-gloss definition with an m-dash, whereas gloss definitions get no such treatment. In addition to MW3, Britannica/Funk & Wagnalls Standard Dictionary of the English Language (a 1960 edition, which begins special definitions with a clear mention of the headword within a sentence that describes it) and Webster's Encyclopedic Unabridged (which encloses special definitions in parentheses) all distinguish gloss from non-gloss. If you're mainly concerned with overuse, would it assuage your concerns if we add some warning to the template documentation that we prefer glosses when practical? Even if editors begin to overuse the template, though, the existence of such a template gives us a convenient list of entries from which to cull unnecessarily convoluted definitions. That's a good thing, right? Rod (A. Smith) 22:33, 17 June 2008 (UTC)[reply]

OK, now that you point out the m-dash I notice it, but it isn't especially obvious, and when definitions include a gloss and non-gloss, the intervening m-dash does not register visually at all.

Overuse? It's not overuse that bothers me. Please go back and read what I wrote; I never complained about it being used too much. I'm concerned that we're setting up a double standard where we use one kind of definition format for most parts of speech, but a different format primarily for the other "lesser" parts of speech. --EncycloPetey 22:58, 17 June 2008 (UTC)[reply]

I've read and re-read your original reply above, but if you're not concerned with overuse, I fear I'm no closer to understanding you. Rather than frustrate you with further questions, I'll wait for somebody else to shed some light on your concerns. Rod (A. Smith) 23:17, 17 June 2008 (UTC)[reply]

Longman's DCE puts such definitions in parentheses, FWIW. DCDuring TALK 21:29, 17 June 2008 (UTC)[reply]

That's also what Random House does. --EncycloPetey 22:03, 17 June 2008 (UTC)[reply]

`{{hu-suffix}}`

Could someone please help me out with an addition to this template? Currently it displays par1 and par2 in italics and puts the PAGENAME into a default category. I'd like to add two optional parameters. One would be called "link" and if link=xx is provided, it would put it between parameter 1 and 2 (par1 + link + par2), indicating a linking vowel between the lemma and the suffix. If link=xx is not provided, it would just display par1 + par2 as before. The other would be called "cat" and if cat=yy is provided, it would put the PAGENAME into the category given in the parameter. If cat=yy is not provided, it would use the current default category. Thanks. --Panda10 11:59, 14 June 2008 (UTC)[reply]

That work for you? (In future requests like this should probably be at WT:GP.) Conrad.Bot 13:53, 15 June 2008 (UTC)[reply]

It works great. Thanks for your help. --Panda10 16:09, 15 June 2008 (UTC)[reply]
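
For anyone reading the archive later, the behaviour requested above can be sketched outside of template syntax. This is only an illustration in Python, not the template's actual code: the function name, the wikitext it returns, and the default category name are all assumptions made for the example (the thread does not say what the real default category is).

```python
from typing import Optional

# Hypothetical default; the thread does not name the template's real default category.
DEFAULT_CATEGORY = "Hungarian suffixed words"


def render_hu_suffix(par1: str, par2: str,
                     link: Optional[str] = None,
                     cat: Optional[str] = None) -> str:
    """Return illustrative wikitext: italicised parts joined by ' + ',
    followed by a category link that files the page under `cat` or the default."""
    # Insert the linking vowel between lemma and suffix only when it is supplied.
    parts = [par1, link, par2] if link else [par1, par2]
    display = " + ".join(f"''{part}''" for part in parts)
    # Fall back to the default category when cat= is not provided.
    category = cat if cat else DEFAULT_CATEGORY
    return f"{display}[[Category:{category}]]"


print(render_hu_suffix("kert", "-ész"))
print(render_hu_suffix("a", "-b", link="-o-", cat="X"))
```

Both optional parameters fall back to the earlier behaviour when omitted, which is the property the request asked for.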

Protection of supercalafragalisticexpialadocious?

It's been protected since Oct 2007 with the reason: "vandal target - leave for a few days then delete with "misspelling of" comment". Seems like time to reevaluate its protection... 75.212.217.187 (really, w:en:User:JesseW/not logged in) 75.212.217.187 06:45, 16 June 2008 (UTC)[reply]

We only include common misspellings. Since supercalifragilisticexpialidocious isn't so common itself, it would be a very difficult case to argue. DAVilla 04:19, 17 June 2008 (UTC)[reply]

What makes us sure of the right spelling? Amina (sack36) 05:49, 18 June 2008 (UTC)[reply]

I would imagine most people who wanted to know the spelling would go to the source, so you could ask Buena Vista Pictures that. Of course this isn't to say that there couldn't be a more popular spelling, but you'd have to make that case. DAVilla 20:40, 18 June 2008 (UTC)[reply]

Proto-Indo-European (PIE)

Is there possibly a place on Wiktionary that might be dedicated to Proto-Indo-European roots (for example, *albho-) with accounts of all the words from various languages that arise from those roots? Or is that an inappropriate role for Wiktionary? I know that as a language student, studying everything from Spanish to Sanskrit, I often have difficulty looking for cognates across languages. I feel that there should be a forum for etymologies and cognate words somewhere within the various Wiki programs; however, I don't know whether Wiktionary is the proper place for this, or how such an operation would be handled. Thank you for your consideration (and feedback).

Have you tried Appendix:List of Proto-Indo-European roots? We try to include etymologies on words, but it's fairly specialist knowledge, so we need as much help as we can get. Conrad.Irwin 18:04, 16 June 2008 (UTC)[reply]

Yes, that list, and everything inside Category:Proto-Indo-European language, where there are more complete cognate lists for individual reconstructions, without as many space restrictions (everything from Hittite and Old Persian to modern languages, all written in the original orthography ^_^). Note however that the list mentioned by Conrad was originally compiled by folks on WP and transwikied here, and was mainly based on Pokorny's dictionary which, by today's standards, contains some "cognates" that are either false or too far-fetched to be considered reasonably tenable. --Ivan Štambuk 02:17, 17 June 2008 (UTC)[reply]

Googleability.

[[xenization]] is one of the first several hits for google:xenization; but it's not a hit at all for google:xenization definition, nor for google:xenization dictionary. And even though we're one of the first several hits for google:xenization, the title is simply “xenization - Wiktionary”, which is only meaningful to those who either know what we are or can guess without any context.

Surely this is something we need to remedy?

—Ruakh_TALK 12:59, 17 June 2008 (UTC)[reply]

Yes, but we're the best; we don't have to be popular. As long as the right people (like the ones looking for "MILF", not necessarily in a dictionary) come here, we're fine. If we make it too easy for us to be found, we'll just have more newbie contributors and we'd have to block them - or train them. DCDuring TALK 13:29, 17 June 2008 (UTC)[reply]

It doesn't matter if we appear low down, the problem is that we don't appear at all if the word "definition" is included in the search string. As people are very likely to use a word like definition to find definitions, we should help them find us by letting the search engine know what our pages contain. Conrad.Irwin 13:44, 17 June 2008 (UTC)[reply]

Obviously you're being either facetious or sarcastic, but I can't tell which. If the latter, then I apologize for whatever I said that upset you. :-/ —Ruakh_TALK 16:39, 17 June 2008 (UTC)[reply]

Sorry, I wasn't intending to be. Looking back I was just not reading what you were saying. Conrad.Irwin 08:57, 18 June 2008 (UTC)[reply]

msh210 had a go at fixing it (after some IRC discussion) by using Mediawiki:Tagline. This does seem to work for "dictionary", google:ablute and google:ablute dictionary, however not for google:ablute definition. As this change was not very long ago, it is possible that Google hasn't yet recached the xenization page to pick up the tagline. As the Tagline is hidden to most users we could change it to read "A definition from Wiktionary, a free dictionary" which would get definition (singular). The other solution would be to install User:Conrad.Irwin/MetaKeywords.php which was designed so that we can include these words in our meta description tag so that the search engines know that we have definitions. Conrad.Irwin 13:44, 17 June 2008 (UTC)[reply]

IMO we ought to get more hits than WP (not 1.5% of WP) eventually because our brief help is needed more often than an encyclopedia article. Google and dictionary look-up tools in text-editors seem like the drivers. It would be great if we appeared higher on Google searches so that more folks could see what we are doing. What are the currently relevant barriers to testing these google-placement-improving steps, evaluating results, and implementing what seems to be working? If we can't evaluate, then why not implement what does no harm and might improve things? DCDuring TALK 15:51, 17 June 2008 (UTC)[reply]

I'm not sure that we can make that kind of comparison meaningfully. What do people go online to search for more often? The meaning of an obscure word or encyclopedic information about TV, celebrities, science topics, historical events, etc.? I think the reason WP gets far more hits is the kind of content they supply and the greater likelihood that people will go looking for that information. If more people go to the movies each day than to the library, does that mean the library is doing something wrong? No. The cinema and the library are providing different resources. In much the same way, and given the huge difference between the kind of content WP and WT supply, a hit percentage comparison is not statistically meaningful. --EncycloPetey 20:02, 17 June 2008 (UTC)[reply]

We are also getting about 21% of the hits that MW Online gets and 40% of what Dictionary.com gets. I exclude answers.com (6-7%), which has favorable placement at Google and broader coverage. On the bright side we are ahead of Bartleby (130-40%). DCDuring TALK 21:24, 17 June 2008 (UTC)[reply]

OK, those are comparisons we can work with. Do we know how users end up at those sites? For example, is there anything packaged with Windows or set as a default in Explorer that would favor those sites over us? How much of the value is from returning users (versus new ones)? We can't assume that the hits all result from random internet searches or from search engines, unless we have data to support that notion. --EncycloPetey 22:02, 17 June 2008 (UTC)[reply]

I do know that our traffic numbers are inflated by the links from one of the most popular sites on the Web. Our numbers seem to include all the Wiktionaries under wiktionary.org. I don't think that MWOnline, Bartleby and Dictionary.com have non-English dictionaries. I also know that MW Online had a huge increase (300%) in its volume over the last year, whereas Wikt has gotten about a 43% increase in the same period. After these teaser facts, everything else would cost money or require research I haven't done yet and may not be able to get the facts for. Has anyone else been paying attention to what our competitors have been doing? Whatever it is that MW has done, it seems to have borne fruit starting in January '08 and has been a big setback to Bartleby and Dictionary.com. DCDuring TALK 02:40, 18 June 2008 (UTC)[reply]

The site dictionary.com probably is the main reason- it packages many dictionary sites together on one page. Maybe we could get that site to include our definitions too? That is probably where most people get dictionary information, as opposed to individual sites. Nadando 03:31, 18 June 2008 (UTC)[reply]

Dictionary.com has one-third the volume of MWOnline and has lost 15% during the period that MWOnline has gained 300%. That doesn't seem likely to account for MWOnline's sudden surge since January 1, 2008. DCDuring TALK 11:35, 18 June 2008 (UTC)[reply]

The surge in M-WOnline.com is largely due to shifting a great deal of the volume from M-W.com to MWOnline.com. DCDuring TALK 18:51, 20 June 2008 (UTC)[reply]

Oh, cool. :-) I'm not sure that we should accept a suboptimal tagline just because Monobook doesn't display it; if we're going to take that approach, I think it would be better to wrap whatever text we want in an explicit <span style="display:none">[…]</span>. (By the way, does the tagline have access to the current page name, using {{PAGENAME}} and whatnot? If so, then we might want it to mention the entry title, and perhaps to differ slightly between entries/appendices/etc. and templates/project-pages/etc.) —Ruakh_TALK 16:39, 17 June 2008 (UTC)[reply]

On the topic of web visibility of Wiktionary: Isn't the use of keywords in a meta tag in the header of HTML the standard way of informing search engines about the topics covered in a web page? A meta tag, unlike a tagline, does not get printed, so it can contain a list of relevant keywords as one sees fit, without forcing the words to form an artificial sentence or phrase, including such words as "definition", "dictionary", and "define".

Also, did you know of the keyword "define:" by Google, exemplified by define:apple? --Daniel Polansky 08:53, 18 June 2008 (UTC)[reply]

Daniel, yes. See User:Conrad.Irwin/MetaKeywords.php which was written the last time this discussion arose (which would allow us to define relevant meta-tags only in the main namespace). It would be nice if they included Wiktionary in the define: statements, but sadly they don't (I filed a feedback about it a long while back, but it got ignored ;). If we want to do this (which it seems like we do) shall I start a WT:VOTE on it so that we can demonstrate consensus to the developers?

Ruakh, we can get the page name into the tag line, however we can't differ between namespaces with it (though we could with the MetaKeywords extension). The Tagline is already wrapped in a display:none through CSS, and I'm not sure how clever google are, but I'm fairly sure they try and ignore things that are explicitly not shown (as the potential for abuse is large). Conrad.Irwin 09:17, 18 June 2008 (UTC)[reply]

The tagline is display:none in Monobook, but not in, say, Classic. Does Google respect meta-tags? (And did I totally miss the previous discussion, or do I have the memory span of, like, a flea or something?) —Ruakh_TALK 12:39, 18 June 2008 (UTC)[reply]

This is one of my pet-hates with our beloved MediaWiki software, why the hell can't they use the same class names for the same bits of the skin! The previous discussion was on the grease pit (Wiktionary:Grease_pit_archive/2008/January#Aiming_for_Google_keyword_define) Conrad.Irwin 19:01, 18 June 2008 (UTC)[reply]

I had a hunch this must have been discussed before. Thanks. I for one am supportive of the idea of keywords in meta tags, and see no drawbacks, except from the increase of the attention that the project could get from a broad user base. But then the issue at stake would be not what technical solution is preferable but rather whether attention is wanted. --Daniel Polansky 10:44, 18 June 2008 (UTC)[reply]

We have no idea whether returning a result (likely to fall below m-w and other online dictionaries) on google will increase the number of editors, or, if it does, by how much. The only way to find out is to try this out, it can always be reverted if it is found we are unable to cope with the levels of vandalism. Conrad.Irwin 10:58, 18 June 2008 (UTC)[reply]

Google's Listing Tactics

It turns out Google doesn't take its cue from Meta tags alone. Their approach is more complex than that.

The information included as the header is their first go-to, but not their primary judgement call. They couple this with more info.
Number of times a word is used on the page in question. In other words, if the word is "tree" and in our definition page we have the word "tree" written five times and dictionary.com only has it listed three times, we win the upper position--all other things being equal. This is a fairly sophisticated calculation that isn't easily fooled by tactics created specifically to fool the spider.
Meta tags are taken into account; they are just not the only thing taken into account.
Number of times a link is made to that page. Here's where they generally place their heaviest consideration. It's just really hard to see why anyone would link back to wiktionary. I see it all the time with Wikipedia when people want to cite a reference. Is there a way we can improve this statistic? Amina (sack36) 12:56, 20 June 2008 (UTC)[reply]
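
As a rough illustration of the word-count factor described above, the tally of "tree" on a page can be sketched as below. This is a plain whole-word count written for the example; it is an assumption about what "number of times a word is used" means and is not a description of how Google actually scores pages.

```python
import re


def count_term(page_text: str, term: str) -> int:
    """Count case-insensitive whole-word occurrences of `term` in `page_text`."""
    # \b keeps "trees" or "treetop" from being counted as "tree".
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, page_text, flags=re.IGNORECASE))


sample = "A tree is a woody plant. Every tree has a trunk; trees form forests."
print(count_term(sample, "tree"))
```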

Do you know whether links from WP count? Many is the article there that has terms used where a Wiktionary link would be much more of a help than a WP link. The useful links would be in-line links, not boxes. DCDuring TALK 14:33, 20 June 2008 (UTC)[reply]

Links from WP do count for Wiktionary (as we are a sister project MediaWiki doesn't add the nofollow attribute to links to us).

Re: Sack36, you are right on all counts, but we are not (as far as I know) aiming for a high Google position, just any place at all if certain words are searched for (compare google:ablute and google:ablute definition site:en.wiktionary.org; it is interesting to note that, thanks to the recent tagline change, it does now find "google:ablution definition" on page 6 of the results for me). Backlinks will happen gradually as people find Wiktionary useful, so we can leave that to take care of itself. Getting the repetition exactly right in a wiki is (I think) too hard, and as it isn't greatly important, we may as well leave it. Conrad.Irwin 17:46, 20 June 2008 (UTC)[reply]

If we're not aiming for first page, we may as well not "aim" at all. People don't look past the first page. Let's face it, the only reason we're looking to google is to get people to our site. It has to be first page or nothing. Amina (sack36) 07:06, 23 June 2008 (UTC)[reply]

Are you sure that Google looks at meta tags at all? It was my impression that meta (keyword and description) tags are so often abused that most search engines completely disregard them. As far as I know, only directories like Yahoo and directory.google which have human editors confirm their content look at the meta tags. And for this purpose, it would be useful to add something like <meta name="description" content="English-language definition of ‘exposé’, a word in English and French." />

Search engines also disregard text hidden using CSS or other methods. I believe hidden text may also lower the perceived trustworthiness of a page. Why not just show “Definition from Wiktionary, a free dictionary”? The title tag, which shows up in your browser's window title, is also significant. Instead of just “ablute - Wiktionary,” it should be something like “ablute – Wiktionary, an open-content dictionary.”

See Google's Webmaster Help Center. —Michael Z. 2008-06-20 20:47 z
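
The description and title suggested above could be generated per entry along the following lines. This is a minimal sketch in Python rather than anything MediaWiki does today: the function name and its inputs are hypothetical, and the wording of the strings simply follows the examples given in this post.

```python
from html import escape
from typing import List


def head_tags(term: str, languages: List[str]) -> str:
    """Build an illustrative <title> and meta description for one entry."""
    langs = " and ".join(languages)
    description = f"Definition of ‘{term}’, a word in {langs}."
    title = f"{term} – Wiktionary, an open-content dictionary"
    # escape() protects against &, <, > and quotes inside the entry title.
    return (
        f"<title>{escape(title)}</title>\n"
        f'<meta name="description" content="{escape(description, quote=True)}" />'
    )


print(head_tags("exposé", ["English", "French"]))
```

Escaping matters here because entry titles can contain characters such as & that would otherwise break the markup.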

Important: don't try to game the search engines—they are designed to respond badly to that. The easier it is for human readers to read and understand the content of a page, the better it will be understood by Google, etc.

The number one thing we can do to increase Google rank is improve the quality of the dictionary, causing more sites to link to it.

We already have a good head start in links from Wikipedia articles and 404 pages, and it may be helpful to start a systematic campaign to add a Wiktionary link to every single eligible Wikipedia article. It would also be nice to explicitly note the presence of a Wiktionary definition on a 404 page like w:Ablute, rather than just rely on the default sister link. —Michael Z. 2008-06-20 20:56 z

And don't forget Wikiquote! bd2412 T 21:02, 20 June 2008 (UTC)[reply]

What people don't seem to understand is that we are not trying to game the search engines. We are trying to make the site describe itself correctly, which will aid the search engines. Yes, if we want a "higher google ranking", there is no substitute for improving our definitions and letting eventuality take hold. However this is not about getting higher google rankings. This is about ensuring that people that specify they are looking for dictionary definitions find dictionary definitions. I don't care how far down the listing Wiktionary is, but it really ought to be there somewhere. Although most search engines no longer ascribe high importance to meta-tags they do read them - though they (Google at least) are very harsh on sites they catch spamming irrelevant keywords. Conrad.Irwin 22:12, 20 June 2008 (UTC)[reply]

We're a "dictionary", we're "free", we "define" "word"s, we offer "definition"s, "translation"s, "synonym"s, "etymology", "pronunciation", "usage". It would hardly be gaming the system if we made that much clear. Arguably we also offer "answers" and various other things, but I'd be happy if those terms coupled with a word brought up Wiktionary some of the time. Could we have an easy-to-use "cite us" link in the toolbox? DCDuring TALK 22:53, 20 June 2008 (UTC)[reply]

But not every entry has all those items. Hmm... Is there a way to set things up so that meta tags appear according to the section headers that exist on the page? That feature could be useful and accomplish some of what's being discussed. That is, if an entry has a synonyms section, then "synonym" will appear in a meta tag for the page. How difficult would that be to do? --EncycloPetey 17:58, 21 June 2008 (UTC)[reply]
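
One way to picture the idea above is a pass over the entry's wikitext that collects section headings and maps them to keywords. The sketch below is in Python purely for illustration; the heading-to-keyword table and the two always-present keywords are assumptions, and it says nothing about how the MetaKeywords extension itself is written.

```python
import re
from typing import List

# Hypothetical mapping from section headings to meta keywords.
HEADING_KEYWORDS = {
    "Etymology": "etymology",
    "Pronunciation": "pronunciation",
    "Synonyms": "synonym",
    "Translations": "translation",
    "Usage notes": "usage",
}

# Matches wikitext headings such as ==English== or ====Synonyms====.
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def keywords_for(wikitext: str) -> List[str]:
    """Return keywords for the sections that actually exist in `wikitext`."""
    found: List[str] = []
    for match in HEADING_RE.finditer(wikitext):
        # Treat numbered headings like "Etymology 1" as "Etymology".
        base = re.sub(r"\s+\d+$", "", match.group(2))
        keyword = HEADING_KEYWORDS.get(base)
        if keyword and keyword not in found:
            found.append(keyword)
    # Assumed to apply to every entry regardless of its sections.
    return ["definition", "dictionary"] + found


entry = (
    "==English==\n===Etymology 1===\nfoo\n===Noun===\n# def\n"
    "====Synonyms====\n* bar\n====Translations====\n"
)
print(", ".join(keywords_for(entry)))
```

An entry without a Synonyms section would simply not get "synonym" in its list, which is the behaviour asked about.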

There's a question of fact here. Does Google ignore our section headers? DCDuring TALK 18:06, 21 June 2008 (UTC)[reply]

Ignore? probably not. But, even if the section headers aren't ignored, does Google give them any meaningful weight for searches, or would meta inclusion benefit users? --EncycloPetey 18:17, 21 June 2008 (UTC)[reply]

Meta keywords count for less than the page headers. Anyway, there would be no point in duplicating the text of the page in them: the page structure already contains more information than repeating those heading names would add. Needlessly repeating information that's already there is trying to game the search engines, and only waters down the meaningful content. —Michael Z. 2008-06-21 21:03 z

MZ: Do you know how Google treats our section headers? If they are included, then the only issue is whether the words that accurately characterize us and do not appear in an entry naturally are worth including in some way that Google would take seriously. I don't know what to say if Google excludes our section headers, because most of those terms are important descriptors of our content. It could be that they have made a business decision that we aren't indispensable enough to be treated on a par with commercial sites that are in a position to do business with them. WP is still indispensable to users, but Google is trying to create a commercial competitor anyway. It may be that we have to work harder at linking to and from sister projects to enhance our value. But our non-commercial nature limits how we can deal with Google. DCDuring TALK 00:05, 22 June 2008 (UTC)[reply]

I don't have any behind-the-scenes insight, but it is my understanding that Google and other search engines look primarily at the visible content of the page (because text hidden from readers is usually trying to game search engines), and do pay at least some attention to page structure (so the window title, headings, and subheadings are significant, and even parts of the URL). Also very important is having high-quality links to a page and to a site.

I doubt that Google treats us differently from commercial sites, or that they mistreat competitors—stuff like that would ruin their good reputation. I think the simple fact is that everyone in the world knows about Wikipedia and links to it. As we become more useful, we will gain high-quality links and a better reputation, and so show up higher up in the search results. Keep improving the dictionary, and have patience. —Michael Z. 2008-06-22 04:34 z

Everyone on that all important first page is doing everything we'll do with the pages we have. The only thing that will set us off from the others is the amount of back links we can get. If we could get some kind of widget that could be put on other people's sites that would allow them to do lookup of meanings at their site, we'd have the back links we need. Of course that would require that we have the kind of definitions they're looking for. Medical sites require medical terms, scientific sites require scientific terms, etc. Amina (sack36) 07:06, 23 June 2008 (UTC)[reply]

I think search engine users who seek a definition of a word often include "definition" in their queries. Unfortunately for the English Wiktionary, its entry pages don't typically contain the word "definition", so search engines exclude our entries from the search results. We should consider adding the word "definition" to the visible text of our entries. Rod (A. Smith) 19:34, 24 June 2008 (UTC)[reply]

Extracted from Merriam-Webster.com:

"Definition of tipple from the Merriam-Webster Online Dictionary with audio pronunciations, thesaurus, Word of the Day, and word games."

"Keywords" content="tipple, definition, define, meaning, dictionary, glossary, free, online, english, language, word, words, webster, websters, merriam-webster"

tipple - Definition from the Merriam-Webster Online Dictionary

Merriam Webster is the leading English dictionary site without a special deal with Google. They feel compelled to include some of the very meta keywords we are discussing. They seem to get better placement than we do. There are several possible reasons why. The easiest of those to address is the absence of meta keywords. Why wouldn't we add the keywords? We won't stop doing things like improving quality, layout, breadth, depth, special features. DCDuring TALK 01:05, 25 June 2008 (UTC)[reply]

I've initiated a vote at Wiktionary:Votes/2008-06/Install_MetaKeywords_Extension. Conrad.Irwin 12:13, 25 June 2008 (UTC)[reply]

Cool. I've commented at the talk page there. DCDuring TALK 14:12, 25 June 2008 (UTC)[reply]

The Vote, which we need to show the developers before they will install anything, is now live at Wiktionary:Votes/2008-06/Install_MetaKeywords_Extension. Yours Conrad.Irwin 16:20, 1 July 2008 (UTC)[reply]

Template:vi-attention

"This template adds Category:Vietnamese words needing attention to bring the entry to the attention of our Japanese experts. It does not change the appearance of the page."

Why do Vietnamese words need the attention of Japanese folks? I think there are Vietnamese members here, for example, me. --Cumeo89 14:52, 18 June 2008 (UTC)[reply]

That was probably a cut-and-paste error. Robert has now corrected the template. Thanks for pointing this out. --EncycloPetey 16:39, 18 June 2008 (UTC)[reply]

User behavior

Because we have no direct information on the particular behavior of our own users, we have been relying on ourselves as models of their behavior and do not even have quantitative information about ourselves. I came across some interesting statements in Prioritizing Web Usability (2006) by Nielsen and Loranger. Their results are based on a study of 69 users (no teens or seniors), presumably conducted c. 2004.

As a baseline, web users successfully achieve their objectives 66% of the time, compared to 40% in the 1990s.

Users spend less than 2 minutes (1:50) on a site before abandoning it if they are not achieving their objectives.

Users spend only 31 seconds on a home page on their first visit and less and less thereafter. They scroll little on the first visit (23%) and less thereafter. Users read much more on interior pages than on home pages.

c. 45% of links users click on are from the interior content area of a page, 10% from the footer, 15% each from left, right, and top.

For search results, 53% of users only look at what appears above the fold (first screen), a further 40% make it below that but not past the first page, 7% make it to the second page, less than 1% beyond that.

On a search page #1 gets 51% of clicks; #2, 16%; #3, 6%; #4, 6%; #5, 5%; #6, 4%; #7, 2%; #8, 1%, #9, 1%; #10, 2%; #11+ (2nd page+), 5%

Users scrolled below the fold only 42% of the time on content pages that had "below-the-fold" material.

The above are a large percentage of the user behavior facts in the book. The conclusions that the authors reach about site design are based more on their rational economic behavioral model of search behavior and their own clinical experience. I doubt if one could say they were tested, though they are probably more accurate than our anecdotal impressions. DCDuring TALK 18:37, 18 June 2008 (UTC)[reply]
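The per-position click figures quoted above can be added up to see how concentrated the clicking is near the top of a results page. A quick Python sketch, using only the percentages as quoted; the dictionary and function names are made up for illustration, and position 11 stands in for "#11+ (2nd page+)":

```python
# Click share by search-result position, as quoted above (percent).
# Key 11 is a stand-in for "#11+ (2nd page+)".
click_share = {1: 51, 2: 16, 3: 6, 4: 6, 5: 5, 6: 4, 7: 2, 8: 1, 9: 1, 10: 2, 11: 5}


def cumulative_share(shares, top_n):
    """Sum the click share of positions 1..top_n."""
    return sum(pct for pos, pct in shares.items() if pos <= top_n)


top_three = cumulative_share(click_share, 3)    # positions 1-3
first_page = cumulative_share(click_share, 10)  # positions 1-10
total = sum(click_share.values())               # the quoted figures are rounded
```

On these figures the top three results take 73% of clicks and the first ten take 94%. The quoted numbers add up to 99% rather than 100%, presumably because of rounding.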

These are great statistics, and I don't mean to argue against them in any way, but I do think they're skewed toward the average site. We are reference sites (Wikipedia, Wiktionary, Wikisaurus) and the data may be different for us. If a person is looking for a given piece of information, I imagine the time spent would be directly proportional to how important the information is to the viewer; how specialized the information is; and how predisposed toward our sites the individual is. Wikipedia may have the most far reaching reputation, but it's not always considered valid info. Wiktionary is more favorable in the eyes of those who know of it, but it's not as well known. Wikisaurus has a bad rep or no rep depending. Amina (sack36) 06:51, 20 June 2008 (UTC)[reply]

Of course, we're unique, just like everybody else. I eagerly await the data that shows that. In the meantime, what would be your guess as to how different we are? Wiktionary is not a monopoly. How important is a dictionary entry on Wiktionary as opposed to:

rereading the troublesome passage, using a different word.
doing without
using Answer.com (via Google)
using MWOnline
asking someone nearby or reachable
using a print dictionary?

I am not at all certain that one piece of reference information is more valuable to a user than, say, finding exactly the right model of water shoe at a good price while avoiding trips to the two nearest shopping malls.

I eagerly await the data on user perception of the relative reliability of reference sites and on the recognition of "Wiktionary". It may be, erm, some time in coming. We don't even seem to know where users are coming from when they come to Wiktionary. We do know that "MILF" has been the most common search term to find us, more common than "Wiktionary".

Also if a user doesn't find what's wanted at a site quickly the first time, does it make the user more or less likely to click on the site the next time information is needed? What matters most to us in terms of attracting new users is ease of getting what they are looking for the first time they hit the site, the first time they come back after an initial disappointment, etc. Also we are competing with other WMF sites for the time of contributors. I would expect that well laid-out pages can't hurt in recruiting them, especially casual contributors and those with special-context knowledge, as opposed to linguists. DCDuring TALK 11:41, 20 June 2008 (UTC)[reply]

If I may, I'd like to ask a question unrelated to the discussion. Several times y'all have used abbreviations that I don't understand. Specifically: MWOnline, MILF, WM. Could you define them really quickly? Amina (sack36) 13:07, 20 June 2008 (UTC)[reply]

Sorry. Of course you may. Some of these may or should be in either Wiktionary:Glossary for "insider" terms used in Discussion rooms like Grease Pit or Beer Parlor, or pages like RfVerification, or RfDeletion, etc. or Appendix:Glossary for terms used in entries. MILF is a notorious entry. MWOnline is Merriam-Webster Online, a mostly free dictionary site. WMF is Wikimedia Foundation, the umbrella for Wikipedia (WP), Wikt, WikiSpecies, WikiCommons, WikiSource, Wikiversity and other projects. I have argued that the Glossaries should be links right under Help in the navigation box on the upper left. DCDuring TALK 14:16, 20 June 2008 (UTC)[reply]

Shortcuts are WT:GL for the "insider" jargon; Appendix:Glossary for what every user is supposed to know to understand our entries. DCDuring TALK 14:27, 20 June 2008 (UTC)[reply]

Thank you. I understand now. BTW, why isn't Dictionary.com in that list? It's gotta be the easiest to access. It is severely limited but most people don't want any more than that. Those that do, will head for the OED pages or beat their brains out with the mouse.

I think the thing that would make people access us above the rest is a two-tiered approach like (and don't boo and hiss) Apple uses. A person getting a Macintosh can shove it on a desk, plug it in, turn it on and as soon as the welcome routine gets finished can pretty much do all the basics. No fuss, no extra work, no wading through incomprehensible jargon and unwanted explanation. That's tier one. However, if you are uber-geek they also supply a Unix window where you can really screw up the machine; a help function that can bore you to death and several other high end, arcane things. But all that is transparent to the mom and pop terrified-of-computer types!

We need to be that kind of flexible. The stuff at the top should be 6th grade understanding. Nothing about parts of speech or declensions or widgets. Then, as you scroll down the page it should get increasingly more complex until you reach the PhD in (name the language). If you look at Wikipedia, the better pages do just that. Amina (sack36) 00:01, 21 June 2008 (UTC)[reply]

It seems obvious, doesn't it? Most casual users (non-contributing, often anon) probably just want definitions (and spelling). Even if they want more, they need those things to make sure that the rest is relevant. A first screen (above the fold) that doesn't have definitions had better make it clear that the definitions and other wanted content are just one click away. We can use up to an inch for Etymology and two inches for Pronunciation when we only have six inches in total. I'm not sure whether anons still have the left-side ToC to contend with, but that could force all content off the first screen.

I don't think that you can manage presenting words without mentioning the part of speech. Even MWOnline can offer a daunting list of parts of speech (even multiple links for the same pos but different etymology) for a complex word like set before offering any definitions.

Many entries are short enough to fit on the first screen in their entirety. Some entries with multiple etymologies and parts of speech and many definitions for some of the parts of speech would be unavoidably difficult to present. Some entries derive their apparent complexity from the existence of many language headers, which in turn follows from the "all words in all languages" part of the Wiktionary creed. Each of the classes of causes can be dealt with to improve the effectiveness of the first screen for anonymous first users. But the goal would first have to be accepted, which it is not, certainly not with any enthusiasm.

What registered users might want to see on their first screen is somewhat customizable anyway, but some of the customization never worked for me with Internet Explorer, though it is fine with Firefox. DCDuring TALK 00:53, 21 June 2008 (UTC)[reply]

brainstorming

Can we come up with a brainstorming technique suitable for the Wiktionary team to help our discussions? Not just this, but any. Many times I find that good ideas are buried in long conversations and after a few months when the subject comes up again, editors may remember that this was already discussed but newcomers will not know about it and it's hard to search archived talk pages to find the pieces. In classic brainstorming, the ideas are listed and judgment is suspended, then the ideas are analyzed, combined, improved, etc. to come up with the best solution. If you don't see this as feasible, please recommend other methods to keep comments/ideas/issues related to one subject on one summary page, so people can refer to it and return to it.

Perhaps brainstorming needs to be conducted off the main community pages, but with a link from these pages as long as the discussion is active. Trying the classic brainstorming method first seems like a good idea. We could then consider revising our method based on results. Would the right topic be "first entry screen for anons" or something else? DCDuring TALK 13:47, 21 June 2008 (UTC)[reply]

"First entry screen" is the high-level topic. But we might want to break it down to subtopics, a list of things that could be looked at for improving the first entry screen. --Panda10 14:13, 21 June 2008 (UTC)[reply]

OK, I'm confused. I thought this was off the main community pages. Do you remember how hard this is to locate? Also, I really like the idea of the brainstorming. These pages are bloody hard to keep up with! Amina (sack36) 07:25, 23 June 2008 (UTC)[reply]

TOC issues

Regarding long TOCs that occupy precious space: Is there any way we can implement a horizontal TOC where only the two-character language codes are displayed (en - de - fi - ru) as a link pointing to the FL section of that page? If you say the code is not intuitive for users, an alternate text could display the English and FL name of the language (for de it would be German - Deutsch) when the cursor is above the link. Only those codes would be listed that are on the page. If a new FL section is added, AutoFormat could add the new link when the page is saved. I know we have many more items in the current TOC but maybe we can come up with a new way to incorporate them into a horizontal TOC. --Panda10 12:44, 21 June 2008 (UTC)[reply]
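To make the proposal concrete, here is a minimal Python sketch of how such a one-line TOC might be generated from the language codes present on a page. It is only an illustration: the lookup table, the function name and the use of a `title` attribute for the hover text are all assumptions, not anything AutoFormat or MediaWiki currently does.

```python
from html import escape

# Hypothetical lookup: code -> (English name, native name). Only a few
# entries are shown; a real table would need every language on the wiki.
LANGUAGE_NAMES = {
    "en": ("English", "English"),
    "de": ("German", "Deutsch"),
    "fi": ("Finnish", "suomi"),
    "ru": ("Russian", "русский"),
}


def horizontal_toc(codes_on_page):
    """Build a one-line TOC from the language codes present on a page."""
    links = []
    for code in codes_on_page:
        english, native = LANGUAGE_NAMES[code]
        tooltip = escape(f"{english} - {native}", quote=True)
        # [[#Section|label]] is a same-page section link in wikitext.
        links.append(f'<span title="{tooltip}">[[#{english}|{code}]]</span>')
    return " - ".join(links)
```

Calling `horizontal_toc(["en", "de", "fi", "ru"])` would give the "en - de - fi - ru" row described above, with each code linking to its language section and showing the English and native names on hover.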

To clarify, are you suggesting this as the default for anons? DCDuring TALK 13:47, 21 June 2008 (UTC)[reply]

Sorry, I don't understand your question. Do anons see a different layout than registered users? --Panda10 14:13, 21 June 2008 (UTC)[reply]

Registered users can set their own preferences at "my preferences". Those really in the know can use WT:PREFS for further customization. One of the preferences allows the table of contents to be on the right hand side with the entry beginning at top left. The top-right placement of ToC already accomplishes some of what we seek for users who know how to set it. The option to do so was considered a test at the time I opted for it, I think. I don't know what its status is now. DCDuring TALK 17:34, 21 June 2008 (UTC)[reply]

I've been wondering about this whole thing with left hand TOC. What reason do we have for wasting that space with TOC? Why isn't it on the right? Amina (sack36) 07:17, 23 June 2008 (UTC)[reply]

(1) It's the default for the MediaWiki projects to have it on the left. (2) There are often images, templates, and other items displayed at the top right of an entry, which would interfere with a TOC there. (3) Some people want it there, and don't see that as a "waste". --EncycloPetey 02:15, 24 June 2008 (UTC)[reply]

We already depart from some MediaWiki defaults.
There certainly are issues with how the right-side ToC interacts with images and sister-project link boxes. We can push the link boxes down to "See also", but there would be a lot of entry modification required. The images might require more complex work. An entry with multiple images (See screw, which has just 2) can be a mess and may need a gallery (See head). Gallery is not even an approved header, so that the images may not be noticed by users interested in the top 15 senses, unless they use the non-standard intra-entry links that have been developed on an experimental basis or we develop another approach to their display.
It is hard to see the value of the vast amounts of white space above the fold on the right side. I'd love to see Etymology and Pronunciation under the show/hide bars and shorter fonts for the headers. Also ToC length control by suppressing everything below PoS by default.

Registered users could be given configuration choices to tailor how they see things. DCDuring TALK 02:54, 24 June 2008 (UTC)[reply]

I've been using the WT:PREF to put the TOC on the right, and I've not found any page on which it looks unacceptably messy. I still strongly support enabling that WT:PREF by default, with the option to turn it off in WT:PREFS. Yes, it may move a few images around on the few pages we have images, but it makes nearly all pages easier to read which is an insurmountable benefit. Conrad.Irwin 10:59, 26 June 2008 (UTC)[reply]

I've started the page Wiktionary:Layout woes where future discussion on entry improvement can take place without disturbing the beer parlour too much. Conrad.Irwin 11:24, 26 June 2008 (UTC)[reply]

botflag for CarsracBot

I am asking for a bot flag for my bot, because in a small test on the first 60 articles in the main space, 7 needed adjustment. For more information see my userpage. CarsracBot 19:08, 18 June 2008 (UTC)[reply]

Please read the policies concerning bots. You will not be given permission until you have allowed experienced users to check your bot code and have explained exactly what the bot is supposed to do. You have not fulfilled either of these requirements. --EncycloPetey 19:24, 18 June 2008 (UTC)[reply]

My userpage explains that I would only do interwiki work (adding, removing and changing interwiktionary links), and that I use the standard pywikipedia software, which is constantly checked and updated by experienced users. Your home interwiki bot only adds interwiki links. Carsrac 12:45, 23 June 2008 (UTC)[reply]

We have a much more efficient interwiki bot (User:Interwicket) which can do all of the interwikis for the whole en.wikt in a couple of days after every XML dump, and the last several requests to run an interwiki.py based bot have been denied (so I doubt you'll be granted permission). There has been an interesting bot request on WT:GP, I don't know whether you're interested enough to try that? Conrad.Irwin 17:38, 19 June 2008 (UTC)[reply]

If it is something that can be done with the standard set of pywikipedia scripts, it would not be a problem. Please give a good link to the interesting request.

To come back to my request: as I indicated at the start, your home interwiki bot is overworked. At the moment it has a turnaround time of 10 days, give or take an hour, so please don't talk about a couple of days. I'm not a skilled software engineer, but I work with public scripts and work together with other users. Questions about the home bot have gone unanswered by the owner for half a year.

BTW I have no doubts about the skills of the programmer and owner of the home interwiki bot. But I think that a man with his knowledge should be one who improves the pywikipedia scripts and uploads those improvements. Carsrac 12:45, 23 June 2008 (UTC)[reply]

Vote proposal to modify WT:ELE (help) page

The WT:ELE help page does not include a reference to context labels, yet there are hundreds in use and they are a valuable guide to definitions in paper dictionaries as well as Wiktionary.

I would like to propose that a section be built at the WT:ELE#The_part_of_speech_or_other_descriptor section of the ELE page to introduce people to context labels, found at Category:Context_labels.

This suggestion is first-stage only to see if others agree with the placement and Context_labels page as the appropriate reference for labels. If so, an actual explanation must be drafted. Wakablogger 23:31, 19 June 2008 (UTC)Wakablogger[reply]

There is some ongoing work on context labels, but whatever the outcome, ELE would need to accommodate it. ELE is long. The discussion of context labels is potentially long. Perhaps it would be better to have a heading, two or three lines, and a link in ELE, the link being to the body of the discussion of context labels in a subpage. DCDuring TALK 00:05, 20 June 2008 (UTC)[reply]

See Template talk:context. DCDuring TALK 00:26, 20 June 2008 (UTC)[reply]

It might be best to start this as a whole new page Wiktionary:Context that explains the labels and their use in detail, then create a summary for inclusion on ELE. That way, we don't have to have a vote until we want to add the summary. (New pages don't need votes to start or edit.) --EncycloPetey 03:01, 20 June 2008 (UTC)[reply]

Absolutely, yes.

But these should be addressed from a content point of view rather than a technical one. “Context labels” serve several diverse functions—indicating grammatical aspect, qualifying register, dating, or usage, indicating geographic or topical context, or specifying regional language. That they happen to be indicated using similar “templates” is incidental. Each of these should be discussed in the appropriate place to reduce conflating them.

See Category:Context labels for a breakdown. —Michael Z. 2008-06-20 03:33 z

As long as they're all easily accessible from one location AND easily accessible from the basic help pages, that's fine. On the few items I've done, I spend more time on labels than anything else (more recently, I gave up on labels). Wakablogger 04:58, 20 June 2008 (UTC)Wakablogger[reply]

Portuguese spelling

There are a huge number of alternative spellings between European Portuguese (which is also used in Africa) and Brazilian Portuguese, since the rules for diacritics, digraphs and other features are fundamentally different. (ato and acto, amamos and amámos, gol and golo, sistema operacional and sistema operativo...) Only a few entries in Wiktionary include this distinction, and their counterparts simply have not yet been written. So I created two templates (Template:pt-Brazilian spelling and Template:pt-European spelling) that could be used on every page of the kind explained above. However, I have not found any similar templates for English words (such as center and centre). Instead, we are using Template:qualifier for them, to provide a link to the specified country followed by the alternative spellings, or by a simple description. In this case, the templates Template:pt-Brazilian spelling and Template:pt-European spelling would be a better choice, to provide a complete explanation where it is necessary. Daniel. 18:09, 20 June 2008 (UTC)[reply]

Wouldn't it be more usual, for a word used in Brazil only, to use {{context|Brazilian}} and, sub ===Alternative forms===, * [[foo]] {{qualifier|Portugal}}?—msh210℠ 17:40, 23 June 2008 (UTC)[reply]

Bot flag request for User:EivindBot

I, EivindJ, hereby request a bot flag for my bot, EivindBot. It is a bot run on the python/pywikipedia framework and it'll do interwikis based on no.wikt (a growing wiktionary). Thanks in advance, and please tell me when I can do test edits. --EivindJ 18:13, 20 June 2008 (UTC)[reply]

See the previous bot request on this page. We have a very efficient Interwiki bot (whose code User:Robert Ullmann would (I think) be happy to share, if you want to spread this good news to the other wikts) User:Interwicket, that can update the entire site in a day or two after every xml dump. All recent interwiki.py bot requests have been denied, so it's unlikely you'll get approval to use it at all. Conrad.Irwin 22:16, 20 June 2008 (UTC)[reply]

I have seen the code and it is a modified version of pywikipedia. He is more than welcome to share his code with the rest of the pywikipedia project. But at the moment it is not public code, and it can't be reviewed by an experienced editor. Carsrac 11:57, 23 June 2008 (UTC)[reply]

For everyone else's edification: it is not "modified" from interwiki.py; it is purpose written for the wikts, which can and should use an entirely different algorithm from the pedias; the source is public and published at User:Interwicket/code and can be reviewed by anyone. (I do in fact run it on a modified version of the framework, but it will run on the standard one; I run it on a modified version because the standard one is extremely fragile when faced with network problems; it tends to crash if the net has the slightest glitch. On the net here, a glitch can be anything from transients occurring many times an hour to 24 hour outages, and the process must recover, if it kept restarting it would take forever ...). Robert Ullmann 14:19, 23 June 2008 (UTC)[reply]
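For readers wondering what "the process must recover" can look like in practice, one common approach is to retry a failed network call with a growing delay instead of letting the whole run crash. The Python sketch below is a generic illustration only: it is not taken from User:Interwicket/code or from the pywikipedia framework, and the function and parameter names are hypothetical.

```python
import time


def run_with_retries(operation, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(); on a network-style error, wait and try again.

    The delay doubles after each failure (exponential backoff). The last
    failure is re-raised so the caller can decide what to do next.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OSError:  # ConnectionError, TimeoutError etc. are subclasses
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(delay)
            delay *= 2
```

A long-running bot would wrap each page fetch or save in such a call, so that a brief transient costs a few seconds rather than a restart. Surviving a 24 hour outage would need a much larger attempt limit or a capped delay, which this sketch leaves out.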

I see, that's ok – and I would be more than happy to get hold of that code :) --EivindJ 22:31, 20 June 2008 (UTC)[reply]

See above; but do note that even though it is partly set up to be run on any wikt (e.g. the variable "home" is set to "en"), it is also "non-portable" in several ways: it assumes sort order is set up in a putfirst list; apparently the current framework doesn't provide that if the order is just code-alphabetic, etc. To make it really usable elsewhere, I should do a bit of work and testing. Also note the current published version is not necessarily the version running on en.wikt at any given moment; I may need to be prodded to update it (which I will do now ;-). Robert Ullmann 14:19, 23 June 2008 (UTC)[reply]

Formatting conjugation, inflection and declension

Is there a particular reason to write some conjugations, inflections and declensions in italics when almost every other doesn't follow that rule? Should the first letter be uppercase for all languages? Do we need emphasis on the most common form (infinitive of verbs, singular of countables, etc.)? And what about ending with a dot? Here are some examples...

ama#Spanish (first letter uppercase, a bold link and ending with a dot)
botones#Spanish (first letter lowercase, a bold link and no dot)
consigo#Spanish (first character is a numeric abbreviation, there is an italic link and two italic translations)
creamos#Spanish (first letter lowercase, a normal link and no dot)
discere#Latin (first letter lowercase, a bold link and ending with a dot)
éramos#Spanish (first letter uppercase, a bold link and ending with a dot)
ler#Norwegian (first letter lowercase, conjugation in italics followed by a non-italic infinitive)
séria#Portuguese (first letter lowercase, a bold link and no dot)
servos#English (first letter uppercase, a bold link and ending with a dot)
servos#Latin (first letter lowercase, a bold link and no dot)
somos#Portuguese (first letter lowercase, in italics and in parenthesis, followed by a translation)
sou#Portuguese (first letter lowercase, in italics and no dot)

As with other definitions, I think that all those forms should start with uppercase, and ending with a dot. The most common form should appear, emphasized by bold formatting. And examples (or even quotations!) should be done in this way, not this way, that is, not inside the definition. No abbreviations should be done ("first person" would be used instead of "1st person") and no translations are necessary inside the definition. Every definition should be separated, and italics are not necessary. Daniel. 19:40, 20 June 2008 (UTC)[reply]

There is actually a standard formatting for Spanish conjugated forms- éramos#Spanish is the only one in your list that uses it, I believe. Most of the conjugated Spanish forms were added with a bot. It should be possible to do the same thing with Portuguese forms- take a look at the subcategories of Category:Spanish_verb_forms (the verb forms in that category all need formatting, by the way.) User:TheDaveRoss originally ran the bot to add verb forms, and User:DCDuring ran the more recent one. Nadando 19:53, 20 June 2008 (UTC)[reply]

I did not knowingly run any bots. I hope it's hard to run them unknowingly. DCDuring TALK 22:38, 20 June 2008 (UTC)[reply]

My bad, that would be User:Dmcdevit :) Nadando 13:56, 21 June 2008 (UTC)[reply]

Actually, italics are standard with non-gloss definitions here. That is, any time a word is "defined" as "first-person singular past tense" (etc.), we use a particular formatting style that by default renders as italics. We voted on this. Now, if you don't want to see the text in italics, you may set your personal preferences so that this does not happen. However, the style tags should still be used because there are others who do want such "definitions" displayed this way. The issue of capitalization and periods is not settled for situations like this, and is left to personal choice in most cases. For Latin, I use lower case and no period because it makes coding the templates for conjugated verbs much easier to do. I would agree, though, that any given definition should either (a) be capitalized and end in a period, or (b) not be capitalized and have no period. That is, no definition should start with a capital letter and not have a period, or start with a lower case letter and end with a period. --EncycloPetey 17:54, 21 June 2008 (UTC)[reply]
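The (a)/(b) rule above is simple enough to check mechanically. A minimal Python sketch follows; the function name is made up, and treating a definition that starts with a digit (such as "1st person ...") as lower case is an assumption of the sketch, not something settled in this discussion:

```python
def style_is_consistent(definition):
    """True if capitalization and the final period agree.

    Accepts either "Capitalized and ends in a period." or
    "not capitalized and no period"; rejects the two mixed forms.
    """
    text = definition.strip()
    if not text:
        return False
    starts_upper = text[0].isupper()  # digits and lower case both give False
    ends_with_period = text.endswith(".")
    return starts_upper == ends_with_period
```

A bot or a cleanup list could use a check like this to flag mixed-style definition lines without taking a side on which of the two styles is preferred.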

{{form of}} and variations. Circeus 21:15, 3 August 2008 (UTC)[reply]

Introducing… {{autological}}

I’ve just created a category Category:Autological words and written a template {{autological}} to ease categorization.

Presumably this is ok – enjoy!

Nbarth (email) (talk) 17:28, 22 June 2008 (UTC)[reply]

Seems fine, the words that are in Category:Autological words should (I think) be in Category:English autological words. Conrad.Irwin 17:32, 22 June 2008 (UTC)[reply]

Yup, thanks – fixed this (took some purging) – I hadn’t realized it at first.

Nbarth (email) (talk) 18:08, 22 June 2008 (UTC)[reply]

The definition of autological implies that only adjectives can be autological - the category includes nouns such as "letters"? Jonathan Webley 20:29, 22 June 2008 (UTC)[reply]
- For loanword to be autological, would it not have to actually be a loanword? bd2412 T 02:40, 23 June 2008 (UTC)[reply]

For loanword to be autological, it would have to be loanword itself, which it isn't AFAICS. --Ivan Štambuk 02:39, 25 June 2008 (UTC)[reply]

As I share understanding of "loanword" and "autological" with both of you, I have removed the tag "autological" from loanword.--Daniel Polansky 20:11, 25 June 2008 (UTC)[reply]

Agreed – “loanword” is a loan translation of a German word, not a loanword itself (loanwords are borrowings w/o translation). I’ve made a note at Talk:loanword.

Nbarth (email) (talk) 22:28, 27 June 2008 (UTC)[reply]

- Definitely applies to nouns and phrases as well (noun is an autological noun). SemperBlotto 10:39, 25 June 2008 (UTC)[reply]
  See also verb#Verb.—msh210℠ 16:23, 25 June 2008 (UTC)[reply]

What particular sense of the verb to verb is the verb autological to? It's not autologism, unless senses upon which autologisms are defined are allowed to operate on cross-lexeme boundary, on all lexemes that appear to just share the spelling, regardless of PoS, etymology etc. (which strikes me as a very dirty trick ^_^). --Ivan Štambuk 17:55, 26 June 2008 (UTC)[reply]

I think it would be an autologism- as the word "verb" was "verbed". XD Teh Rote 17:59, 26 June 2008 (UTC)[reply]

Certainly one can verb the noun verb, but can one verb the verb verb ? --Ivan Štambuk 06:21, 27 June 2008 (UTC)[reply]

This is subtle: the word “verb” can be used as a verb, hence the verb “verb” is a “verb” (in the noun sense). Thus as a word, it can apply to itself, but only by changing parts of speech. Also, for fun: the verb “verb” is verbed from the noun “verb”.

Nbarth (email) (talk) 22:28, 27 June 2008 (UTC)[reply]

Question: Why use this template? Why not simply add the category in the usual way? The template merely adds complexity for the bots who format pages and sort category links to the bottom of the appropriate language section (where they belong). --EncycloPetey 21:14, 25 June 2008 (UTC)[reply]

Yes. If it is supposed to be a visible context template (probably not), then it should use {context}, but as used it is a formatting oddity, the cat itself would be better. I have a question too: is "autological" an autological word? (Seems to me it is if it is, and isn't if it isn't, and either way is self-consistent ;-) Robert Ullmann 15:32, 26 June 2008 (UTC)[reply]

I agree with EncycloPetey and Robert. When I went to add the template to some autological words, I assumed it was a context tag, but then saw that it has no visible output and that other entries use it as a simple category. WT:RFDO#Template:autological. (By the way, Robert, that very question was posed by Douglas R. Hofstadter. I think it was in w:Gödel, Escher, Bach.) Rod (A. Smith) 16:27, 26 June 2008 (UTC)[reply]

Robert, “autological” can logically be chosen to be autological or heterological.

This is mentioned at Appendix:Autological words, and explained at Grelling–Nelson paradox.

Just as “heterological” cannot logically be chosen to be autological or heterological (it’s a contradiction), “autological” can be chosen to be either (it’s a tautology) – see details as linked.

Nbarth (email) (talk) 22:28, 27 June 2008 (UTC)[reply]

I made a template because I copied it from {{back-form}}, and knew not any other way of dealing with “one category per language” in an elegant way. (Been working with {{term}} and {{t}} too much, not with categories.)

Would the usual/proper way be to literally include the code:

[[Category:English autological words]]

…or perhaps:

[[Category:{{en}} autological words]]

?

Also, is this documented anywhere?

Thanks!

Nbarth (email) (talk) 22:28, 27 June 2008 (UTC)[reply]

(code edited at: 00:14, 28 June 2008 (UTC) b/c I was confused re: ISO 639 expansion)

Renaming glossaries

I am about to rename, technically move, some glossaries from Category:Glossaries so that they start with "Glossary of", and optionally end in "terms", and mandatorily do not end in "terminology", to create uniform naming. Most of the glossaries already keep that naming scheme. Aligned with this scheme, the name "List of" should be reserved for lists without definitions.

Please, tell me if this action is unwanted. --Daniel Polansky 07:58, 26 June 2008 (UTC)[reply]
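The naming scheme as stated can be written down as a small check, which might help in finding the pages that still need moving. This is a Python sketch under assumptions: the function name is hypothetical, and stripping a namespace prefix such as "Appendix:" is a guess about how the titles are stored.

```python
def follows_glossary_scheme(title):
    """Check a page title against the naming scheme described above.

    The name must start with "Glossary of" and must not end in
    "terminology"; ending in "terms" is allowed but not required.
    """
    # Drop a namespace prefix such as "Appendix:" if one is present.
    name = title.split(":", 1)[-1].strip()
    starts_ok = name.startswith("Glossary of ")
    ends_ok = not name.lower().endswith("terminology")
    return starts_ok and ends_ok
```

The check only reports whether a title fits. It deliberately does not try to work out the new name, since that needs a human decision for each glossary.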

I tend to think they should all be "Glossary of", since lists often drift into becoming glossaries, and should have the words linked anyway. --EncycloPetey 17:27, 26 June 2008 (UTC)[reply]

Also, I would like to remove the Category:Appendices from the glossaries, so that they are only listed in Category:Glossaries. --Daniel Polansky 08:05, 26 June 2008 (UTC)[reply]

Since all glossaries are appendices, that would be fine provided that each glossary is categorized in Category:Glossaries, which is in turn listed in Category:Appendices. --EncycloPetey 17:27, 26 June 2008 (UTC)[reply]

And what about me creating Category:Word lists as a subcategory of Category:Appendices and moving the appendices that are lists of words without definitions, either plain or hierarchically organized, there? Is "Lists of words" a better name for the category? --Daniel Polansky 08:13, 26 June 2008 (UTC)[reply]

I don't like that idea unless we can devise a suitable name. "Word lists" could easily be confused with content in the Concordance: namespace. --EncycloPetey 17:27, 26 June 2008 (UTC)[reply]

All sound like good ideas to me. "Lists of words" is fine. —Michael Z. 2008-06-26 20:54 z

Thanks for the feedback. I will rename the glossaries proper, and remove them from the Category:Appendices. I will refrain from doing anything with word lists. --Daniel Polansky 08:25, 27 June 2008 (UTC)[reply]

pronunciation guides?

A while back I was trying to figure out how to pronounce a place name. My first thought was to look in Wiktionary, but we don't have place names. I looked in WP and fortunately they included the pronunciation (which they don't always do). Should we add, to our extremely long list of things to do, the creation of pronunciation guides for words we do not include? These could also be used for people's names (such as how do you pronounce Feynman or Dalai). RJFJR 21:03, 26 June 2008 (UTC)[reply]

Do you mean in an Appendix: list?—msh210℠ 21:05, 26 June 2008 (UTC)[reply]

Something like that. Just a long list of words and pronunciations. (Probably broken up by starting letter or something) RJFJR 21:22, 26 June 2008 (UTC)[reply]

Yes, please. This would be very useful when generating rhymes, as very many placenames do not have pronunciations in Wikipedia or in online gazetteers.

However, last time I looked (policy might have changed since then) I thought we were planning on including all place names in Wiktionary anyway, which would therefore cover including their pronunciations too? — Paul G 13:01, 18 July 2008 (UTC)[reply]

If you can find that decision, we probably should include it in CFI. I was left with yet another vague impression that someone thought all placenames should be included only in Appendices, but that no one had proposed a method for actually setting this up. --EncycloPetey 19:51, 18 July 2008 (UTC)[reply]

WikiLook, Firefox Wiktionary add-on

Hey guys. I made WikiLook, a Firefox browser add-on that looks up any word and shows its definition in a small and sexy frame :) For example, it lets you check whether articles exist, word by word, without clicking while you browse the web. Or look up translations of foreign words. Or look up an idiom. Etc. It can be downloaded from the direct download link or from Mozilla (registration required). Check the Mozilla link or this page for how to use it (very simple). I'm looking for any feedback here, on my Wikipedia user talk page, on my talk page here, or by email (you can find it on the Mozilla add-on page). And I really need some published reviews on the Mozilla page (all you need is to register with Mozilla); that will help it win the Mozilla "public nomination" faster. I really hope you guys love it! TestPilot^{talk to me!} 03:37, 27 June 2008 (UTC)[reply]

The only major objection I have is that it gives the definition lines of the first language section only. It would be great if a preference (or an order of preference) could be set for a specific language (or languages), defaulting to the first one available. --Ivan Štambuk 06:10, 27 June 2008 (UTC)[reply]

That one is on the to-do list already! Furthermore, the current idea is to extend the project so that it can parse other language editions of Wiktionary. That would bring some nice opportunities, like easily comparing, on the fly, the articles for the same word and then going to improve one edition. Or looking for a word's definition in your mother tongue first and, if no article is found, defining it using your second/third language of choice. As for the immediate future, tonight I'll try to implement language name parsing and show the name in the frame. That would remove any ambiguity as to which language was used to define a particular word. The next step would be "smart" language search, so that it correctly goes for definitions of words like Wind, or, that is an awful example - After.

The whole thing takes time and a lot of effort, but I'm on it :) TestPilot^{talk to me!} 22:32, 27 June 2008 (UTC)[reply]

Version 1.2.3 is out. Smart language lookup: if there is an English definition of the word, it will find it. If you want it, Ivan, it will take me about 5 minutes to make a custom version that goes for another language. Soon I will add user-selectable options; I just need to read some docs. This version is also the first one that should be able to auto-update itself. In theory. Plus, the first non-English Wiktionary WikiLook. Works like a charm. But not many words are defined in that Wiktionary, so it is not very usable. Still, it is a sort of proof of concept. TestPilot^{talk to me!} 09:35, 30 June 2008 (UTC)[reply]

If you can just make an add-on to Lupin's pop-ups that will list a chosen language section (they only list the intro, because they were intended for Wikipedia), I'd be in love with you. Circeus 21:14, 3 August 2008 (UTC)[reply]

Request for Bot approval

I have made a bot using my Notepad++; the bot is named SinBot and shall be used against vandalism. I have used JavaScript and Python to create the bot. I haven't uploaded the source code yet because I am waiting for approval. The bot's script notifies me when a page has had any content removed and shows me what was removed. It will then ask whether I wish to redo/undo the edit that constituted vandalism. The bot's other task will be to let me know when a page is blanked and ask whether I would like to undo the vandalism. I have checked and rechecked the scripts, and everything is fine. The bot was made using JavaScript only because I have no other source (except Python, which I used a little). It has been run on my computer and works fine. I only ask now, on bended knee, that you accept this request and allow me to aid in fighting vandalism with SinBot. Thank you for your time, The7DeadlySins 04:32, 29 June 2008 (UTC)[reply]

Having looked at the source code, I can say it does not look like JavaScript; it looks like Bash script with mistakes. The code also strongly resembles ScsBot's; are you User:Scs under a new name, or did you copy it from User:Scsbot/wikised? Additionally, the code for the bot does not seem to do what you claim, though I didn't really look closely. Conrad.Irwin 19:36, 29 June 2008 (UTC)[reply]

Yes. I did copy the code off of the wikised, but I completely revamped it after spending a whole night researching Script, Batch script, and JavaScript. I reordered it to do these given tasks:

Notify me when there is vandalism (i.e., a page has been deleted or content removed)
Revert the vandalism if I say so.

Also, I ordered it to log on whenever it wants. However, I am having trouble getting the bot to start up; how do I activate it? Furthermore, no. You didn't look closely. I made the bot to do EXACTLY those functions, and there are no loopholes. Trust me, I'm a computer programmer at heart. I triple-checked. Also, I am working on a more advanced version in my Notepad++ in case this one fails. Cheers, and please let me know how to activate the bot, The7DeadlySins 19:46, 29 June 2008 (UTC)[reply]

It is not batch script; it is w:Bash script. It does not work under Windows by default; you have to install bash (probably by using Cygwin). I have blocked you for one day so you can research all of this, and also because above you say "It has been run on my computer and works fine" and now you admit "i am having trouble getting the bot to start up". You clearly have no idea about running a bot, and this request is a waste of my and everyone else's time. Additionally, copying the code from User:Scsbot is illegal: he released the code under the GFDL, which means that you "must" give him attribution in your code (i.e. say where you got the code from) or you are breaking the law. Conrad.Irwin 19:59, 29 June 2008 (UTC)[reply]

We have no use for a bot with this function. It would mean that sysops who patrol recent changes would have to mark two separate edits as patrolled instead of doing a simple rollback. I would never trust this user to run a bot - I can't remember another new user with such a ludicrously low signal-to-noise ratio (lots of "look at me" pages, duplicate emails, and less than a handful of useful edits, most of which would have been done by some of our existing bots). SemperBlotto 07:24, 30 June 2008 (UTC)[reply]

lulz

Would anyone object to moving User:Ptcamn/lulz to lulz (currently protected)? --Ptcamn 04:44, 30 June 2008 (UTC)[reply]

Are quotes from livejournal and other free blog accounts considered valid attestations? I don't believe they fit the description “permanently recorded media”. —Michael Z. 2008-06-30 05:13 z

Done, rfv'd. It's now up to the courts to decide. -Atelaes λάλει ἐμοί 05:23, 30 June 2008 (UTC)[reply]

From what I understand such quotes don't count for RFV, but are fine to illustrate usage. Conrad.Irwin 13:31, 30 June 2008 (UTC)[reply]

Unicode

Mutante and I were playing around with some lists of Unicode characters, and we decided that it would be both possible and useful to create entries for those Unicode characters about which we have no information. Such entries would be created from a list such as this one and would contain the Unicode character name, which is a human-readable description of the character (e.g. "Latin Capital Letter Y with macron"), thus making it possible for people to obtain basic information about the symbol. They would (as a bonus) include the Unicode character block ("Latin Extended-B"), the Unicode code point ("U+0232"), and an external link to the relevant Unicode Consortium page. An example entry can be seen at Template:new unicode char. Before we embark on creating the more relevant of these entries (probably only the Extended Latin character blocks), I would like to first check whether people have any suggestions for making these entries more useful. This is the kind of thing that can be done by a bot or "enhanced" human editing once we've worked out how to do it. Conrad.Irwin 14:11, 30 June 2008 (UTC)[reply]
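
For illustration only, the three fields proposed above (character name, block, code point) could be gathered roughly as follows. This is a sketch in Python using the standard unicodedata module, not the code actually used for these entries; the BLOCKS table and the describe_char helper are hypothetical names introduced here, and the table covers only a few Latin blocks rather than the full Unicode block list.

```python
import unicodedata

# Hypothetical, deliberately incomplete block table: (first, last, block name).
# A real run would load the complete list from the Unicode Character Database.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
]


def describe_char(ch):
    """Return the basic facts a stub entry for one character would carry."""
    cp = ord(ch)
    # First block whose range contains the code point, or None if not in the table.
    block = next((name for first, last, name in BLOCKS if first <= cp <= last), None)
    return {
        "name": unicodedata.name(ch, None),  # None if the character has no name
        "block": block,
        "code_point": f"U+{cp:04X}",
    }


print(describe_char("\u0232"))
```

The Unicode name comes back in capitals, so a bot would still have to decide how to case it for display, and anything beyond these three fields would need human input.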

Two concerns.

1. I think the focus should be on semantic symbols or letters, rather than Unicode code points. Although Unicode tries to divide up writing into units of meaning, and define a best way to express something, there is often more than one acceptable expression (e.g. the dumb apostrophe “ ' ” can represent an apostrophe, a single quotation mark, a prime, etc.), and there are code points which overlap in meaning.

2. Is this really a dictionary entry? This information is not a definition nor an etymology, and isn't in the scope of WT:ELE. It is an attribute of the Unicode code position rather than of the represented letter. If we provide a reference to a letter's Unicode attributes, why not also its ISO-8859 code point, etc? Are we also going to describe its Unicode spacing, combining, collation, case characteristics, etc? This is threatening to become encyclopedic. (Cf. w:Y with stroke.) —Michael Z. 2008-06-30 15:40 z

On point 1 I fully agree; on point 2 I'd suggest just describing the character itself, i.e. "the letter y with a stroke through its upper stems", or ⏢ "a white, usually symmetric trapezium". Of course this can't be done via bot and has to be done manually. -- Prince Kassad 16:08, 30 June 2008 (UTC)[reply]

We could mention that a letter is defined as a y with a stroke, but we don't normally describe the visual appearance of a letter or other symbol, just as we don't describe a, Y, 5, =, ж, or μ. For one thing, the symbol is right there on the page to see. —Michael Z. 2008-06-30 16:52 z

When there are upper-case and lower-case variations, it's good to show them on the "headword/inflection" line, as is done in a and similar entries. For characters that are only used in one or two languages, it would probably be better to show the actual language as the H2 header rather than "Translingual". For readers without the necessary fonts, it would be helpful to include an image of the character. It may also be interesting to include a brief "etymology" so readers can understand why a stroke was added for example. Finally, some example expressions that include the character would be nice. Of course, most of the preceding will have to be entered by a human rather than a bot, but they will make the entries more useful and more similar to entries for actual words. Rod (A. Smith) 17:25, 30 June 2008 (UTC)[reply]

Sounds like a good approach. —Michael Z. 2008-06-30 17:52 z

+1 to this initiative and to all of Rod's suggestions. Also, this is a minor thing, but since LATIN SMALL LETTER Y WITH STROKE is in the category Ll (lowercase letters), shouldn't its POS header be ===Letter=== rather than ===Symbol===? —Ruakh_TALK 01:44, 1 July 2008 (UTC)[reply]

It is only a letter in the Lubuagan Kalinga language of the Philippines. If we call it a "letter", then it is only used in that language and should have the appropriate L2 header. If we are treating the Unicode symbol, then it is Translingual. --EncycloPetey 01:53, 1 July 2008 (UTC)[reply]

Unicode classifies this character as a letter, not as a symbol; AFAIK it's not a symbol for anything; and if it is a symbol for something, our entry doesn't say that it is one, much less indicate what it might be a symbol for. If this letter is indeed only used in a single language's writing system, then you are right that the appropriate L2 header is preferable to ==Translingual== (though I dunno where a bot would get that information); but I don't see how the use of ==Translingual== is an argument for the use of ===Symbol===. —Ruakh_TALK 03:03, 1 July 2008 (UTC)[reply]
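
As a rough illustration of the Letter versus Symbol question, a bot could read the Unicode general category and pick the POS header from it. The sketch below is in Python with the standard unicodedata module; the suggest_pos_header helper and its rule (any category starting with "L" gives Letter, anything else gives Symbol) are assumptions made for this example, not an agreed convention. It does not settle the L2 header, which depends on knowing which languages actually use the character.

```python
import unicodedata


def suggest_pos_header(ch):
    """Guess a POS header from the Unicode general category (an assumed rule)."""
    category = unicodedata.category(ch)  # e.g. "Ll" for a lowercase letter
    # Assumption: every "L*" category maps to Letter, everything else to Symbol.
    return "Letter" if category.startswith("L") else "Symbol"


print(suggest_pos_header("\u024f"))  # U+024F LATIN SMALL LETTER Y WITH STROKE
```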
