Wiktionary:Beer parlour/2012/December

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.


German adjectival superlative

When used in "häufig", the template {{de-adj}} says that "am häufigsten" is the superlative. It seems to me that "am häufigsten" is an adverbial phrase rather than an adjectival one ("most frequently" and "most often" rather than "most frequent"), so "am häufigsten" should be changed to "häufigste" or the like to make it an adjectival superlative. However, de:häufig also shows "am häufigsten" in a table to the right, yet it has such adjectival examples as "Der häufigste Lernfehler".

Even more conspicuous is "am grünsten" in grün; check google:"am grünsten".

What do you think? What references could clarify this? Are there dictionaries that we could use as a model? --Dan Polansky (talk) 11:58, 1 December 2012 (UTC)

(after edit conflict) Full inflection:

There, the thing with "am " is called "Prädikativ". I still think it plays an adverbial role in a sentence. I do not see why this "Prädikativ" should be considered an adjectival superlative, or, if so, why it should be considered the most representative form of all adjectival superlatives. --Dan Polansky (talk) 12:05, 1 December 2012 (UTC)

It has nothing to do with adjectival vs. adverbial but with predicative vs. attributive. In the sentence Dieser Baum ist am grünsten ("This tree is the greenest one") the phrase "am grünsten" is clearly not adverbial but adjectival. The reason we (and others) treat the "am"-phrase as the adjectival superlative is simply because in German it's common to compare adjectives like this: "groß -- größer -- am größten", "klein -- kleiner -- am kleinsten", and so on. I agree it's a weird construction and we could perhaps get rid of the "am", but it has nothing to do with adverbial. Longtrend (talk) 14:02, 1 December 2012 (UTC)
Re: 'the phrase "am grünsten" is clearly not adverbial but adjectival': I don't think that is clear at all. "am grünsten" is a prepositional phrase, a shortening of "an dem grünsten", perhaps "an dem grünsten Fall" or the like. "the greenest one" is not an adjectival phrase, either. If "am grünsten" is the only phrase used predicatively to serve as a superlative, it is still unclear why the headword lines show the rarest of the superlative forms: Google Ngram View for German and am grünsten,grünste,grünsten,grünster. --Dan Polansky (talk) 15:10, 1 December 2012 (UTC)
Ad "'am grünsten' is a prepositional phrase, a shortening of 'an dem grünsten', perhaps 'an dem grünsten Fall' or the like": Sorry, but that's quite a weird claim. It might well be the case that the construction "am [superlative]-en" was derived etymologically from "an dem [superlative]-en X" as you suggest (I don't know), but in any case this has nothing to do with how it's used nowadays. You can't "un-shorten" the phrase "am [superlative]-en" anymore; "an dem [superlative]-en" is simply ungrammatical. So I wouldn't be so sure to claim it's a "prepositional phrase". The fact that in my English translation "the greenest one" is not a predicatively used adjective doesn't matter either, since the grammars of English and German differ. I think it's quite unequivocal that the superlative in "Dieser Baum ist am grünsten" is a predicatively used adjective (phrase?), analogous to "Dieser Baum ist grün" and "Dieser Baum ist grüner". I'm not sure what you intend to show by your Ngram link. The forms in headword lines are not based on frequency. They are simply the uninflected lemma forms. The problem with the German superlative is that there is no real uninflected form. The "am [superlative]-en" construction is used instead, see (that link also shows examples for adjectivally and adverbially used superlatives all in predicative position). Longtrend (talk) 18:58, 1 December 2012 (UTC)
To make things more complicated, some adjectives form adverbs ending in -st for the superlative (häufigst, freundlichst), while others don't (grünst, as the superlative, probably can't be attested). I would guess that since X-st, while a valid construction, isn't always attested, German dictionary tradition is to use "am X-sten", which is much more common; and if that's true, it would probably be best to use grün, grüner, grünst. 19:37, 1 December 2012 (UTC)
Well, those bare "superlative" forms that you mention are rather "Elative" (they express a very high degree). As you said, they are only used adverbially, so I don't think it makes much sense to use them as "standard" superlative forms in head lines. Longtrend (talk) 20:10, 1 December 2012 (UTC)
I must say I disagree that bare superlatives are always "Elativ"; that's certainly not true when they're used in compounds such as nächstliegend, nearest (not "very near"), schönstblühend, frühestmöglich, Schwerstverbrecher. My questionable sprachgefühl says that superlative adjectives have uninflected forms like all other adjectives, with the one exception of predicative usage which substitutes "am Xsten". Compounds and adverbs use the bare form, and their meaning is often, but not always, elative—but the same is true of all superlatives.
That said, superlatives are regular enough that any one choice will let you derive all inflected forms, so it doesn't matter too much. 21:42, 1 December 2012 (UTC)
But you can't conclude from the fact they appear in compounds that they have the status of words, otherwise we should have entries for "Him" and "Brom" :) Of course all superlatives can in principle appear without the definite article (mit größtem Vergnügen, auf grünsten Wiesen, that might be what your sprachgefühl says?) but not all can appear uninflected, and all that can are used as adverbs. Longtrend (talk) 21:55, 1 December 2012 (UTC)
All grades of German adjectives—positive, comparative and superlative—are regularly homographic with adverbs and/or able to function as adverbs; Igor Trost's Das deutsche Adjektiv gives the examples
  • Eva schreibt schön.
  • Eva schreibt schöner als Erika.
  • Eva schreibt am schönsten von allen.
This may be one reason "am häufigsten" seems adverbial—it can be adverb-like. But it is also (indeed, principally) adjectival. "Am _sten" is routinely considered the lemma form of the superlative... I cannot immediately find a reference work which treats another form as the lemma of the superlative. The German Wiktionary, too, uses "am _sten" as the lemma. - -sche (discuss) 21:40, 1 December 2012 (UTC)
@-sche and @Longtrend several paragraphs above: Thank you for your clarification. My mistake in seeing "am grünsten" exclusively as an adverbial phrase. After having thought it over, I find it hard to see "am grünsten" in "Dieser Baum ist am grünsten" as an adverbial phrase, while I am able to see it as a phrase that serves as a predicative adjectival superlative. I would not choose this phrase as a lemma for an adjectival superlative, but if this is the tradition of German dictionaries, let it be. --Dan Polansky (talk) 18:50, 2 December 2012 (UTC)

Category:en:Ordinal numbers

Good evening,

I have just seen this request. I don't really understand: since "The members of the category act as adjectives. What makes them ordinal numbers is their semantics", why don't we move Category:English numerals to Category:en:Numerals? Indeed, in English, I don't see the difference between a "numeral" and an adjective or a noun, but I may be mistaken.

Anyway, in French, trois is considered a "numeral adjective" and a noun. --Fsojic (talk) 18:46, 1 December 2012 (UTC)

Your proposal begins with a comment about ordinals, and then suddenly shifts to cover all numerical words. In fact, the cardinal numerals do have a different grammar in English (and in many languages). It's only the other forms of numerical words (ordinals, distributives, adverbials, etc.) that do not always behave as a separate part of speech from their usual counterparts. --EncycloPetey (talk) 05:28, 7 December 2012 (UTC)

Skills listings

Sometimes I come across a mystery chemistry word and take it to SemperBlotto, and once in a blue moon he has taken a recent comp sci word to me. I'm also aware of people on Wiktionary who specialise in biology, France, etc. I wonder whether it would be a good idea for us to set up some sort of simple skills listing indicating who knows about what. This might help to reduce some of the Requested Entries backlog (e.g. I research "polycritical" and think "that's obviously physics, but beyond me"). Dunno whether this would best be done by categories on user pages, lists of users on category talk pages, or whatever else. Wikipedia has some such scheme, right? Equinox 02:40, 3 December 2012 (UTC)

Lists of users on category-talk pages would make it hard to find a user when you want one (which category do you look at?). Perhaps a sortable table with one column for user and another for subject matter (à la WT:TA), which people add themselves to (à la WT:DW)?​—msh210 (talk) 03:05, 3 December 2012 (UTC)
I like the idea of a list where we can sort wanted entries, but I think it's hard to group specialties. I know a little about a lot, but I can't guarantee that a geologic or Egyptological entry, for example, would be within my limits of knowledge, even though I have studied those fields (amateurly). Mostly, it's abstract math and physics that are causing backlog and wholesale copying of Wikipedia's definitions. —Μετάknowledgediscuss/deeds 04:35, 3 December 2012 (UTC)
If you see any abstract-math terms that need attention, you can try my talkpage. I'm a mathematician. (Naturally that doesn't mean I'll be able to help. But I may be.)​—msh210 (talk) 07:16, 3 December 2012 (UTC)
Yes, this is something that we need. Maybe some sort of extension to the babel system would be the way to go. Probably best to keep the subjects fairly broad - e.g. maths, physics, philosophy, music etc rather than statistics, astrophysics, epistemiology, rock'n'roll etc. SemperBlotto (talk) 08:22, 3 December 2012 (UTC)
Would epistemiology be the study of how knowledge is transmitted through a population? :-)   —RuakhTALK 18:20, 3 December 2012 (UTC)
An extension to the babel system... sounds like user boxes to me (gasp). DTLHS (talk) 18:29, 3 December 2012 (UTC)
They would still be restricted to broad areas of interest, i.e. not Pokémon, Enya, or "this user likes toffee apples". Equinox 12:20, 4 December 2012 (UTC)

Han Tu as L3 header

An IP (2003:51:4F07:D100:21F:3CFF:FE52:86EE (talk)) has been changing ===Han character=== to ===Han Tu===. This has to be wrong just from the capitalization, but do we have anything to point to showing the correct header, so we can give them an alternative? We don't seem to have an About Vietnamese yet. Chuck Entz (talk) 06:20, 3 December 2012 (UTC)

Rolled back all his/her edits. We don't have an About Vietnamese page but we don't have Han Tu heading either. --Anatoli (обсудить/вклад) 06:34, 3 December 2012 (UTC)

NY Times op-ed article spills the beans about lexicography

FLASH! See this article from the 3 December 2012 NY Times for shocking revelations on press coverage of dictionary goings-on. DCDuring TALK 14:10, 3 December 2012 (UTC)

I looked at our entry for marriage after reading this. It’s pretty good and non-POV eh? — Ungoliant (Falai) 20:12, 3 December 2012 (UTC)
In definition 2, the separation of marriage into the two definitions of "one man and one woman" and "any two people" is written on the basis of how laws work in local jurisdictions. But while pretending to follow those laws, the definitions ignore the facts that such laws sometimes prohibit the marriage of first cousins (sometimes not), and usually (or perhaps always) have age prohibitions as well. Also, the main meaning of definition 2 allows more than two people, but the sub-definitions do not, which seems to be a contradiction. --BB12 (talk) 20:47, 3 December 2012 (UTC)
@BB: the sub-senses are (and are labelled as) more specific/narrow uses of the general sense. How is that contradictory? - -sche (discuss) 21:42, 3 December 2012 (UTC)
@-sche: Because the reader is left wondering what happened to three or more people in the sub-senses. If the meanings of the sub-senses depend on the jurisdictions, the reader can only scratch their head, wondering why three or more people is also not dependent on the jurisdiction. --BB12 (talk) 23:31, 3 December 2012 (UTC)
It seems we just have different "senses of logic" (for lack of a better way of putting it), then, since I think this is how subsenses normally work: they narrow the general sense. Compare "war", where the general sense "a conflict, or anything resembling a conflict" narrows to "a campaign", or "god", where "a deity" narrows to "a male deity". - -sche (discuss) 23:50, 3 December 2012 (UTC)
The war and god sub-senses seem fine to me in that they illustrate the main sense. But with marriage, I'm left wondering what happened to multiple people and the "usually to the exclusion of all others" parts. --BB12 (talk) 00:01, 4 December 2012 (UTC)
And you aren't left wondering what happened to female deities when you read the "male deity" subsense of "god"? - -sche (discuss) 01:03, 4 December 2012 (UTC)
No. Female deities are in sub-senses 1 and 3. Sub-sense 2 simply points out that there is a super-specific meaning of male deity. In the case of marriage, though, the reader is left clueless on marriages with multiple people and the "usually to the exclusion of all others" part. --BB12 (talk) 06:47, 4 December 2012 (UTC)
Left clueless? No, citations which use "marriage" to refer to the union of multiple people or to the union between people who agree to have an open relationship, etc, are covered by the broad sense. Many people also use the term only with one of two "super-specific meaning"s, to use your phrase, so those are also spelt out in the entry. All citations of the narrow senses are by definition also covered by the broad sense, though. Or do you disagree that "unions of two people" are a subset of "unions of two or more people"? (As I commented below, the "one man, one woman" sense should be a sub-sub-sense of the "any two people" sense if we want to be really logical, but the current state of things is OK, like the current state of "god".) Or is it that you disagree with the idea that if one sense refers to a subset of what another sense refers to, the one should be formatted as a subsense of the other?
Compare "gook" and "tupelo" (which I also formatted): "Koreans" are a subset of "Asians", "trees of the species Nyssa multiflora" are a subset of "trees of the genus Nyssa". - -sche (discuss) 07:38, 4 December 2012 (UTC)
In my opinion, the union of man and woman should be the primary sense with the union of any two people being a subsense with the {{by extension}}. Subsenses don't necessarily have to have an "is a" relationship with the parent sense. Also, what do you mean by more than two people? If you are talking about polygamy, polygamy is covered by the "union of two people" sense because it is just a set of unions-of-two-people where one person in each union-of-two-people happens to be the same person, as evidenced by the fact that there is usually a separate wedding for each union-of-two-people. Also, I think the simplest and best solution is to just define it as "union of two people, usually a man and a woman". --WikiTiki89 09:15, 4 December 2012 (UTC)
Re "by extension": That would be inaccurate, because the broad sense is not an extension of a narrow sense. Re "set of unions-of-two-people": some polygamous relationships (especially those involving deception, with one person not telling their second partner about their ongoing first marriage, often because polygamy is not allowed in the society they are in) are sets of two-person unions. But history has plenty of examples of "marriage" being, and the word "marriage" being used for, the direct union of (3|4|etc) people. - -sche (discuss) 16:57, 4 December 2012 (UTC)
Re "by extension": How is a "union of any two people" not an extension of a "union of a man and a woman"? Re polygamy: Take a classic example of polygamy: A man has four wives. In this case, each wife has a union with the man, however the wives have no union with each other, they just happen to each have a union with the same man. If you find an example of marriage being used in the sense that the man and his wives are all in one union, then I would consider that a separate sense, or possibly a subsense (which would be an extension of the primary sense of "union of a man and a woman"). --WikiTiki89 20:01, 4 December 2012 (UTC)
──────────────────────────────────────────────────────────────────────────────────────────────────── The Germanic languages, including the predecessors of modern English, used "marriage" to refer to the union of multiple people, including the union of one man and one woman. From its inception (traditionally the year 1500), modern English continued to use the word this way; examples can be found in texts describing polygamous Biblical figures. The narrowing of the sense to refer only to the union of one man and one woman is precisely that: a narrowing. Calling the broad sense an "extension" is historically and linguistically counterfactual, directly the opposite of the actual development of the senses. The fact that, after Christianization, marriages in Germanic and English society were typically only concluded between one man and one woman never stopped the term "marriage" from being used of other marriages, such as polygamous marriages. - -sche (discuss) 20:38, 4 December 2012 (UTC)
1. The Germanic languages didn't have the word "marriage" as "marriage" is derived from Latin through French. 2. Can you give some examples of this usage, regardless whether the word is "marriage" or a Germanic equivalent? --WikiTiki89 21:06, 4 December 2012 (UTC)
Perhaps if I could understand the broader definition, it would not be a problem. The two most frustrating parts of the broad definition to me are: what does "usually to the exclusion of all others" mean when the "usually" does not apply, and what happens when there are more than two and the usually of "usually to the exclusion of all others" does not apply? I simply don't understand these implications and expect the sub-senses to lay the issues out. --BB12 (talk) 09:18, 4 December 2012 (UTC)
Re "when the "usually" does not apply": The omission of "usually" from the narrow senses means that an open marriage of one man and one woman, or of two men or of two women, is only covered by the broad sense that also covers three-person marriage, etc. The entry is laid out that way because when I overhauled the senses, formatting them into a broad sense and a narrow sense, I retained the "to the exclusion of all others" aspect of the narrow senses, and an IP then expanded only the broad sense. That didn't seem like a problem to me at the time, but you've persuaded me that it should be changed: if the current layout of entries is kept (which it may not be, given this discussion), would you find it clearer to add "usually" to the narrow senses, or to drop "to the exclusion of all others" from the narrow senses?
Re "when there are more than two": Consider a marriage of one man to two women (multi-person marriage), in which the spouses agree that they can take other partners (open marriage). The spouses do not induct their other partners into the marriage, yet they have other partners. Thus, there is a marriage of more than two people, which is also not exclusive. - -sche (discuss) 16:57, 4 December 2012 (UTC)
I think you're now talking about sexual relations, something that is not at all clear from the definition. I honestly thought "to the exclusion" referred to the marriage, not their sexual relations. If that is what you are talking about, there is yet another can of worms. Infidelity is often legal grounds for a divorce, though I don't know if it always is. --BB12 (talk) 18:56, 4 December 2012 (UTC)
Our entry actually seems to me to take a pretty liberal POV, and implies (by invoking "jurisdictions") that the specifically-one-man-one-woman use is a legal definition. Which I don't think is a huge deal; I can't imagine anyone looking up this term and being led astray by a POV definition, as long as it's not over-the-top. But I wouldn't describe our entry as "non-POV". A more fully NPOV definition might be something like "Any of various similar social practices found in many cultures, whereby two or more people enter into a personal union", followed by a listing of typical defining features (usually it's just two people; usually one is a man and one is a woman; usually both are considered adults, either as a prerequisite or as a consequence of the marriage; usually it's a long-term union, intended or even required to continue until death of one or both members; usually there is a legal and/or religious component; usually there is an economic component; frequently there is an expectation of having children; frequently the social position of one or the other member, and/or of children, is defined by the marriage (cf. "widow", "bastard", etc.); etc., etc., etc.). —RuakhTALK 20:52, 3 December 2012 (UTC)
If you want to strip out the reference to "jurisdictions", be my guest; I incorporated it not only because it is accurate, but also so that "the union of one man and one woman" and "the union of two people" would be on the same level, which seemed to me less likely to provoke objections than listing the senses by the increasing specificity of their {{context|specifically}}:
  1. The union of two (or sometimes more) people, usually to the exclusion of all others. [from 14th c.]
    1. The union of any two people, to the exclusion of all others.
      1. The union of one man and one woman, to the exclusion of all others.
- -sche (discuss) 21:39, 3 December 2012 (UTC)
I think that removing "jurisdictions", without making other major changes, would be a non-improvement. More generally, I think the entry is more or less O.K. the way it is, even though it's less-than-ideal from an NPOV perspective. This is a very difficult entry to write in an NPOV fashion, for a lot of reasons. For example, interracial marriage was illegal in many U.S. states until 1967; does that mean that in some jurisdictions the term "marriage" implied "same-race", or does it mean that in some jurisdictions there were some marriages that were legal and some that (while still "marriage") were illegal, or what? Is there an NPOV way to tackle that question? I think that most people in the U.S. (and probably elsewhere in the Anglosphere) actually would use the word "marriage" in almost exactly the same way, despite great differences in political views on the subject, if not that the political debate itself distorts usage. That same debate makes it hard to describe the usage in a way that everyone would accept. —RuakhTALK 22:23, 3 December 2012 (UTC)
I still think it is non-POV, but your suggestion is better anyway, Ruakh. But I’d add that it is an officially recognised personal union (otherwise dating would fit the description). I’d also only keep the following “usuallies”:
  • usually it's a long-term union, intended or even required to continue until death of one or both members
  • usually there is a legal and/or religious component
And add something along the lines of “in some jurisdictions, only a union between one man and one woman is considered a marriage”. — Ungoliant (Falai) 22:13, 3 December 2012 (UTC)
I think "officially recognized" is getting much closer. How about "officially or socially recognized"? I think that's what marriage is really about. That, by the way, takes care of the strangeness of including which sexes qualify for marriage but not the number or age of the people married. --BB12 (talk) 23:31, 3 December 2012 (UTC)
Dating is a "socially recognised" union. I still think "officially or socially recognised union" is an improvement over bare "union", and I think including "or socially" is better than leaving it out, I'm just pointing that out. Perhaps Ruakh's suggestion of "usually [having] a legal and/or religious component" is better for that reason. I don't have a preference. But we should pick one or the other; I think "an officially or socially recognised union, usually having a legal and/or religious component" would be a lot of semi-redundant verbiage. I also support adding "usually intended to be long term". - -sche (discuss) 17:27, 4 December 2012 (UTC)
How about this: "A union of two or more people that creates a family tie and carries legal and/or social rights and responsibilities."
If the sub-senses have same-sex marriage, opposite-sex marriage, polygamous marriage, polyandrous marriage and common law marriage, so much the better. --BB12 (talk) 20:46, 4 December 2012 (UTC)
If by "the sub-senses have same-sex marriage" you mean "one of the subsenses includes the union of any two people, including those of the same sex": yes, that should continue to be the case (as it is the case now). But if you mean one of the definitions of "marriage" should be "same-sex marriage, the union of two people of the same sex", I oppose that for reasons I gave on Talk:marriage some time ago, namely that there is no jurisdiction or body of literature I am aware of in which unions of two people of the same sex are the only unions which are considered to be appropriately described as "marriages".
I'm not sure about "creates a family tie", I'll wait for others to comment on it. I like the rest as a broad sense. Actually, I would really like Ruakh's wording ("Any of various similar social practices found in many cultures, whereby two or more people enter into a personal union...") as an intro except that it doesn't match up grammatically: I feel the definition needs to start "a union...", not "a practice whereby people enter into a union...", because if "Edna's marriage is disintegrating", it isn't her practice of entering into a union that is falling apart, it's the union she actually entered into and is now in. - -sche (discuss) 21:29, 4 December 2012 (UTC)
I agree that the sub-senses should not have one meaning exclusively same-sex marriage. AFAIK, all marriages in all cultures create a family tie, and I suspect that is the true essence of marriage. --BB12 (talk) 22:27, 4 December 2012 (UTC)
Speaking of things we should check the N/POV-ness of: [[traditional marriage]] (should perhaps be re-RFDed), [[companionate marriage]] (also needs formatting), [[mixed marriage]] and [[intermarriage]] (also need to be edited to reflect that a "mixed marriage" is a union, but "intermarriage" refers to the act/state of being married across group lines), [[gay marriage]] and [[same-sex marriage]], [[marriage lite]] and [[remarriage]] (for which it's not a question of POV so much as a question of whether the definition is accurate or could be improved). Discussion should probably take place on the talk pages of the entries, to avoid having a massive multi-facet discussion here. - -sche (discuss) 23:50, 3 December 2012 (UTC)
Mostly we need to keep a focus on having citations and be accurate about the context/usage of each definition, preferably based on citations and corpus evidence. DCDuring TALK 00:55, 5 December 2012 (UTC)

Is there a "cabal" of "semi-divine" lexicographers? The line "It’s exciting to think of dictionaries in more dramatic terms: as battlegrounds where the fate of the language is decided, or as shadowy enterprises with secret, back-room meetings over what does and does not count as a word." made me think of WT:RFD and WT:RFDO, which seem to fit those descriptions perfectly :) —Μετάknowledgediscuss/deeds 05:50, 4 December 2012 (UTC)

MK shhhhh! We can’t let other people know there is a cabal! That’s the point of that article. Now everyone, repeat after me: there is no cabal, there is no cabal, there is no cabal... ♕ The Wiktionary Cabal ♕ 14:53, 4 December 2012 (UTC)
Cabal, what cabal? The article clearly shows there is no cabal. Now move along to the article talk pages. Nothing to see here. Certainly no cabal. Move along. DCDuring TALK 00:55, 5 December 2012 (UTC)
At least we're more honest about it than Wikipedia. Compare the following: w:WP:TINC and WT:TINC. —Μετάknowledgediscuss/deeds 04:14, 5 December 2012 (UTC)
I have two objections:
  1. There IS no cabal.
  2. Did I miss the vote appointing him? DCDuring TALK 12:39, 5 December 2012 (UTC)
Of course you did. That's the point. MWAHAHAHA!!!! Chuck Entz (talk) 14:36, 5 December 2012 (UTC)

Moving more discussion pages to monthly subpages

What are the general thoughts after several months of subpages at the grease pit? If it's considered a success can we apply this to more pages, particularly this one? DTLHS (talk) 18:41, 3 December 2012 (UTC)

Support ASAP. --WikiTiki89 18:43, 3 December 2012 (UTC)
Looks good.​—msh210 (talk) 19:31, 3 December 2012 (UTC)
I was just about to suggest this myself; the GP experiment seems to be going smoothly. I think this should only be applied to discussion rooms (where the discussions are supposed to live on in centralised archives), though, not to requests pages (from where requests, after they've been resolved, are moved to individual talk pages). - -sche (discuss) 19:40, 3 December 2012 (UTC)
Support too. Though it would be nice if we didn't have to keep moving the page back and forth every month, just so that it shows up on everyone's watch pages. It seems like an easy thing to forget, but until an admin does it (only they can), the discussion pages are effectively disabled. That's less than optimal obviously. —CodeCat 19:48, 3 December 2012 (UTC)
Does that really work, moving the page back and forth? --WikiTiki89 20:07, 3 December 2012 (UTC)
So far, it has. Do you recall ever watching one of the GP subpages? :) —CodeCat 20:25, 3 December 2012 (UTC)
Oh so that’s how GP pages have been magically getting into my watchlist lol. Support. — Ungoliant (Falai) 21:56, 3 December 2012 (UTC)
Support for BP and TR. The ES is still weird in its own ineffective way, and I think we should keep the ID simple for noobs from other wikis to feel comfortable. Request pages are another story altogether. —Μετάknowledgediscuss/deeds 05:39, 4 December 2012 (UTC)
Support per Metaknowledge.​—msh210 (talk) 06:19, 4 December 2012 (UTC)
The ID shouldn't actually need to work any differently from how it does now. If users click on the + button, it will appear to them as if they are just adding a new section as usual. Except that the section gets added to the subpage instead. —CodeCat 14:14, 4 December 2012 (UTC)
Re: "users click on the + button": [citation needed] ;-)   —RuakhTALK 14:25, 4 December 2012 (UTC)
Well, there will be no more edit button, so I suppose they'll have to? —CodeCat 14:44, 4 December 2012 (UTC)
The BP is getting rather long right now, and edits are taking longer and longer, causing more edit conflicts. Would anyone object if I made the split right now, at least for this page? —CodeCat 00:15, 5 December 2012 (UTC)
Re ID: I think you may have missed my point, in that it doesn't matter to me how it works so much as how it feels to noobs. I think the solution there is just to archive more frequently, but I'm not going to tell anyone else how often to archive things.
Re BP: With this clear consensus, why not? Thank you! —Μετάknowledgediscuss/deeds 02:45, 5 December 2012 (UTC)
Done! CodeCat 03:50, 5 December 2012 (UTC)
Thank you so much. Can TR be next? --WikiTiki89 07:57, 5 December 2012 (UTC)


Discussion moved to WT:RFC#walirang.

Removing phrasebook

For those who dislike phrasebook, I have created a vote to start in 14 days, on 18 December 2012:

Feel free to discuss the proposal. Feel free to postpone the vote should the discussion last until the beginning of the vote. --Dan Polansky (talk) 18:38, 4 December 2012 (UTC)

I have changed my mind and let the vote start in 7 days. Again, if the discussion lasts until then, the vote can be postponed. --Dan Polansky (talk) 18:39, 4 December 2012 (UTC)

A related vote from 2010:

--Dan Polansky (talk) 18:46, 4 December 2012 (UTC)

The vote is something of a straw man. After bd2412 prompted everyone to clarify their votes, several users including me clarified that the phrasebook should be moved to the Appendix namespace, not completely deleted. - -sche (discuss) 20:43, 4 December 2012 (UTC)
People will be able to clarify their stance as part of their votes. In any case, removing the phrasebook inclusion option from CFI does not preclude having the phrasebook in the appendix.
What is the benefit of having a phrasebook entry such as I love you with the translation table in the appendix, as opposed to mainspace? --Dan Polansky (talk) 20:50, 4 December 2012 (UTC)
I think instead of voting straightaway to remove it, we should first see if we can change the way it works, such as putting it in the Appendix. --WikiTiki89 21:01, 4 December 2012 (UTC)
Again, removing the phrasebook provision from CFI does not preclude moving it to appendix.
What is the benefit of appendix rather than mainspace for the user of the phrasebook? --Dan Polansky (talk) 21:07, 4 December 2012 (UTC)
I have added a question to the vote asking the voters to indicate whether they want to move the phrasebook to an appendix. --Dan Polansky (talk) 21:11, 4 December 2012 (UTC)
Thanks. To answer your previous question: It would be all in one place, rather than spread out into separate articles. And it would allow phrasebookness and SOPness to be judged independently of each other so that non-SOP words can still be in the phrasebook. It would also get rid of the need to define SOP phrasebook entries, which usually have ridiculous definitions that just restate the obvious, since all we will need is translations. We recently had a discussion about how an Appendix phrasebook could be implemented, but I cannot remember what page that was on. --WikiTiki89 21:17, 4 December 2012 (UTC)
Can you create an example page at User:Wikitiki89/Phrasebook sample or the like that shows the format of the phrasebook appendix that you envision, based on mainspace phrasebook entries, including the translation tables? --Dan Polansky (talk) 21:23, 4 December 2012 (UTC)
Sure, I'll work on that in the next couple of days. Also, I can't remember whose idea this was, but I like the idea of possibly separating phrasebook pages for each foreign language, with all of them containing the English. --WikiTiki89 21:30, 4 December 2012 (UTC)
Re: "It would be all in one place": Is Category:English_phrasebook/Communication and its likes not one-place-ish enough? --Dan Polansky (talk) 21:45, 4 December 2012 (UTC)
Well, that's just a list of phrases. You have to actually click on each one in order to get translations. When I'm done with the example page, we can discuss it further and in more detail. --WikiTiki89 21:55, 4 December 2012 (UTC)
Thanks Polansky. This vote will be helpful.
Personally I think the phrasebook’s layout should be like that of our Swadesh lists. Instead of having an entry for every phrase, we could have a table consisting of {phrase, translation (not necessarily 1 ↔ 1), pronunciation, audio file, notes}, and entries would be topical, like Appendix:Phrasebook/Medical emergencies/lij. — Ungoliant (Falai) 21:21, 4 December 2012 (UTC)
Pages could get very big with that approach. Translation tables are already unwieldy without having grammar, pronunciation etc in them. And you want to put possibly dozens of such translations, into dozens of languages each, onto one page? I'd prefer water to that, thank you! :) —CodeCat 21:54, 4 December 2012 (UTC)
One page per language. — Ungoliant (Falai) 21:59, 4 December 2012 (UTC)
It is unlikely that a user will want to see the translations of a particular phrase into every language at once. Generally phrasebooks are useful for helping the user translate multiple phrases into one particular language that they are either learning or found themselves suddenly surrounded by. --WikiTiki89 22:07, 4 December 2012 (UTC)
That (your second phrase) is what I’m suggesting. — Ungoliant (Falai) 22:18, 4 December 2012 (UTC)
There are two options that I favour. The first is moving all phrasebook entries to a "Phrasebook" namespace. The second, which is not incompatible with such a move, is to drastically prune it, keeping all the phrases you would expect but chucking away all the dross. I would also like to get rid of the so-called "definitions" in these entries. Then it could be a useful part of the project. SemperBlotto (talk) 22:30, 4 December 2012 (UTC)
The definitions should really use Template:translation only, in my opinion. --Yair rand (talk) 22:50, 4 December 2012 (UTC)
FYI, I have recreated the simple support-oppose-abstain structure of the vote. The vote makes a simple and clear proposal that some people seem to support ("just nuke the entire phrasebook"; "If I'm asked, delete the whole g-mn phrasebook"); let us see how many they are. --Dan Polansky (talk) 19:06, 5 December 2012 (UTC)

Why English has to be special with regards to SoP-ness

The point made by Thrissel at WT:RFD#I don't need a fag is a very good one. If a term is SoP in English, that doesn't mean it's SoP in every other language, or that the translation is necessarily straightforward. I showed in WT:BP#WT:COALMINE that other, paper translation dictionaries may (often) include a wealth of SoP terms for the sake of translation. If we categorically exclude such entries from Wiktionary, we can never hope to achieve the same usefulness that those dictionaries achieve. We can't just look at English alone if we want to be a multilingual dictionary. Our English entries, by virtue of having translations, cannot ever be assumed to cater only to monolingual English users. So the fact that we offer translations from English into other languages necessarily means that, either certain SoP entries will have to exist to fill up idiomatic gaps in the translation, or we will have to settle on being and remaining incomplete as far as idiomatic English-to-foreign translations are concerned. My point, therefore, is this: when considering idiomaticity, a special exception must be made for English, so that it accounts for both idiomaticity of the English term itself, and idiomaticity of the translation of that term into another language. —CodeCat 22:08, 4 December 2012 (UTC)

Or we could improve the search facility so that searching for some English term X would find any foreign terms that translate to X, without requiring X to have a useless entry with no possible definition other than reiterating the "headword". Equinox 22:19, 4 December 2012 (UTC)
I don't think even Google is that good. And do we really have the capability to improve the search facility? —CodeCat 23:00, 4 December 2012 (UTC)
Obviously the target translation phrases would have to be marked up in some way (enclosed in <transtarget> or something?); whatever kind of search spider we use would then recognise items thus tagged. Just an idea. Equinox 23:08, 4 December 2012 (UTC)
Oppose. We should not have entries for if only you had not been able to make him take it all out from under me again for them#English just because Ubykh has a single word for that. (What's more, no-one will think to look for if only you had not been able to make him take it all out from under me again for them#English.) As an alternative, I would suggest allowing foreign-language entries to host translations tables, like the German Wiktionary does. The English translations/glosses on the sense-lines of those entries would let users find the entries via the search function. That way, if more than one language has a term for something English lacks a term for, the foreign-language terms can be linked without creating unidiomatic entries in any language. - -sche (discuss) 22:20, 4 December 2012 (UTC)
That's a straw man argument (since you are opposing something that isn't even a proposal, nor does it capture the spirit of what I said) and a slippery slope fallacy. I am only arguing that we should be at least as useful as existing dictionaries with regard to the translations we offer. Do you disagree with that? My point necessarily means having nonidiomatic English entries for the sake of their translations, but the translations themselves must be idiomatic and CFI-compliant. Or as an alternative, it would involve putting translations in an entry when they don't actually translate the entry, such as putting the translation table for coal mine on coal and on mine. But that would bring problems like those on color/colour. Putting translations on non-English entries would obviously have the same effect, except even worse! —CodeCat 22:29, 4 December 2012 (UTC)
The Ubykh word which means "if only you had not been able to make him take it all out from under me again for them" meets CFI, because it is a single word. Would you exclude if only you had not been able to make him take it all out from under me again for them#English in spite of the fact that it could host CFI-compliant translations? Some languages have specific, basic, uninflected lexemes for X-year-old animals of particular species, would you allow two-year old deer#English? Do you think anyone would look it up? Would you draw an arbitrary line somewhere and decide which SOP phrases with idiomatic translations to include and which to bar? As Equinox says, anyone looking for translations of "brick office wall" or whatever phrase you wish to use as an example should be able to type that phrase into the "search" bar and find the entries; thus, we are already as useful as other dictionaries in that regard, or we should improve our searchability. - -sche (discuss) 23:11, 4 December 2012 (UTC)
I agree with you, though, that allowing non-English languages to host translations would open quite a can of worms. Still, we already do something similar/equivalent, in that some entries host "see also" sections with the same effect. - -sche (discuss) 23:15, 4 December 2012 (UTC)
I don't think the question of whether someone will look it up is relevant. I'm sure we have plenty of obscure entries that nobody has ever looked at since they were created years ago. That said, I would include two-year old deer but not that big Ubykh word. I can't really say why or where to draw the line. Just that it makes more intuitive sense to find the former in a translation dictionary. On the other hand I doubt that we'd find the Ubykh word in even a monolingual Ubykh dictionary. I am willing to bet that despite being a word, it's not idiomatic, but that Ubykh has a different standard for idiomaticity (which is common for agglutinative/polysynthetic languages). So I would argue for deleting it, and that automatically discounts any translation entry for it. Can you tell me what the word is? —CodeCat 23:25, 4 December 2012 (UTC)
Would you expect to find "two-year-old deer" as an entry, under the letter T, in any professional published translation dictionary? I'd be very surprised, personally. I also wouldn't want it here, because (in English) it's so SoPpy; this just shows how "drawing the line" is going to be very variable. Of course I have no objection to having "two-year-old deer" as a searchable translation target that doesn't appear as a headword in the English index of entries. Equinox 01:02, 5 December 2012 (UTC)
I would be surprised if a language had a word meaning just that, and a dictionary from English into that language did not contain a translation into that term somewhere. Just like I'd expect an English-to-Dutch dictionary to contain, say, end-user licence agreement > gebruikersovereenkomst. —CodeCat 01:36, 5 December 2012 (UTC)
So in an English-Finnish dictionary, you would expect to see English entries for my pet, your pet, and their pet, so as to translate lemmikkini, lemmikkisi, and lemmikkinsä? Equinox 01:40, 5 December 2012 (UTC)
No, because they are not idiomatic, nor are they lemmas. On the other hand, Zulu has separate words for "my mother", "your mother" and "his/her mother" that are separate lemmas, etymologically unrelated to each other. But in that case I'd expect them to be listed under mother with a qualifier. However, there is no clear place for a translation into gebruikersovereenkomst to be; I'd expect to find it at end user, licence and agreement as sub-headings. I suppose we could argue about this all day but the problem is that the definition of idiomaticity we currently maintain is English-centric, and doesn't work for many other languages, causing many false negatives (for languages like Finnish) or false positives (for Chinese languages). I agree that we have to draw the line somewhere, but I think that the line is currently drawn in the wrong place, and we need to work out something that is more useful. —CodeCat 01:56, 5 December 2012 (UTC)
These are inflected forms, or could be considered as such; the same can be said about Arabic and Persian, which have enclitic pronouns. They could be included still (since users are likely to look up the derived forms), if not for the amount of entries that would require. Wiktionary contains inflected forms and other non-lemma entries, so, theoretically, even these forms could be included but I don't think we should do it, even inflected forms. --Anatoli (обсудить/вклад) 02:01, 5 December 2012 (UTC)
Maybe I misunderstood your post, but don’t we do this already in Category:English non-idiomatic translation targets? — Ungoliant (Falai) 23:10, 4 December 2012 (UTC)
Yes, we do. But many times, when I argued for keeping an entry for that purpose, my argument was shot down as irrelevant or invalid because the deletion discussion concerned an English entry. I made the statement above to demonstrate why I believe it is a valid argument. The consequence of accepting that argument, of course, would be to enact a policy to give the entries in that category some form of formal allowance. —CodeCat 23:25, 4 December 2012 (UTC)
I completely agree with Codecat. Matthias Buchmeier (talk) 10:48, 6 December 2012 (UTC)
Oppose, not needed. Would satisfy certain of our editors, but be of little or no use to our readers. Mglovesfun (talk) 11:24, 6 December 2012 (UTC)
Isn't gebruikersovereenkomst SOP other than it being protected by our single word policy? Anyway, I oppose per -sche. Also, don't forget that this is the English Wiktionary so we are allowed to be English-centric. If someone wants to see translations for an idiomatic Dutch word, why would they look it up in English? If you really want to add these translations, add them to nl:gebruikersovereenkomst. The Dutch Wiktionary could sure use some help. --WikiTiki89 11:27, 6 December 2012 (UTC)
I was about to post on the subject of gebruikersovereenkomst. I wouldn't look up user agreement thinking "I wonder if any languages have a single word term for this". This is what I mean when I say it would satisfy a couple of our editors (Matthias Buchmeier, CodeCat) but would be of approximately zero use to our audience as a whole. Mglovesfun (talk) 11:29, 6 December 2012 (UTC)
I would rather say that a couple of our contributors are promoting SOP deletion, while the rest don't feel like interfering. Matthias Buchmeier (talk) 16:26, 6 December 2012 (UTC)
I agree with this statement of Matthias Buchmeier. -- Gauss (talk) 16:46, 6 December 2012 (UTC)
To Mglovesfun: citation needed. If you really think the readers wouldn't appreciate such entries, ask them. Until then I'm not convinced. I believe that it is useful for a dictionary to offer ways to translate idiomatically, such as user agreement. After all, how else is someone supposed to figure out how to say that in another language? A knowledge of Dutch grammar is only partially useful here, since the literal translation would be *gebruikerovereenkomst which is close but not quite right, *gebruikerafspraak which would sound very strange to a Dutch speaker, or even *gebruikerovereenstemming which would be incomprehensible altogether. Hence, as long as this translation is not included, Wiktionary is incomplete as it does not provide users with a means to figure out how to say it in Dutch. —CodeCat 16:36, 6 December 2012 (UTC)
I propose a challenge. Try to figure out how to say it in Finnish. If you use anything except Wiktionary, you lose. —CodeCat 16:40, 6 December 2012 (UTC)
Such a challenge could be proposed for full sentences, too. That doesn't make them suitable dictionary entries. Equinox 16:41, 6 December 2012 (UTC)
Did I say it did? I'm just talking about user agreement, nothing more. Personally I see no way to figure it out using Wiktionary, I'd go to Wikipedia instead and use the interwiki links there to find out. Kind of sad really when Wikipedia is better at being a dictionary than Wiktionary is. —CodeCat 16:44, 6 December 2012 (UTC)
You introduced this nearly metaphysical topic. You can hardly be surprised that folks revert to the subject of the heading rather than follow you to the new grounds of your choosing.
With all the pathetic inadequacies of coverage of our translation tables, I am simply astounded that anyone can seriously claim we need to find more English terms not to have adequate translation tables for. DCDuring TALK 19:29, 6 December 2012 (UTC)
It's a wiki. People will work on the things they see fit to. Telling people they can't work on something because you think they should be doing something else goes totally against the spirit of voluntary collaboration. I mean, I don't tell you to improve coverage of Mongolian first before doing any work on English. —CodeCat 21:29, 6 December 2012 (UTC)
While I oppose outright allowing an entry for any term which has a doubtlessly-a-real-word translation, I totally support the idea of translation targets as long as there is a significant amount of translations (personally I’d want at least some 15 non-SOP translations for a translation target to be necessary). Maybe we should have a vote to formalise it in the CFI. — Ungoliant (Falai) 20:53, 6 December 2012 (UTC)
That principle is hard to integrate with the piece-by-piece incremental nature of wiki editing. We can't expect every translation-only entry to have at least 15 translations right from the start, that would just lead people to argue for deletion rather than improve it. Furthermore, an entry with only 14 translations isn't any less useful to anyone who uses Wiktionary for one of those 14, than an entry with 15 translations would be. —CodeCat 21:29, 6 December 2012 (UTC)
I think the best solution here would be to improve our searching capabilities rather than adding SOP entries just for translation tables. --WikiTiki89 21:36, 6 December 2012 (UTC)
If our search were good enough to find non-English entries by their English definitions, then under your view we'd not need translation tables at all anymore. —CodeCat 21:41, 6 December 2012 (UTC)
You're absolutely right. Let's scrap translation tables completely. --WikiTiki89 21:48, 6 December 2012 (UTC)
It's the nature of languages that the same thing is expressed with a different number of words: the same, fewer or more. The main idea of CodeCat's and Ungoliant's posts, as I understand it, is that combinations we consider words, rather than loose combinations, should be included. The reason why we have both "coalmine" and "coal mine", "furcoat" and "fur coat" is that both synonyms are considered just words, not loose combinations, even if the latter has a space in between. Translation target entries are just another method to include words otherwise threatened with exclusion by deletionists, like "last night", "day before yesterday", etc. Many technical, computer, medical terms (diseases, organs), etc. are considered terms and words by many, and are included in specialised dictionaries (paper and online), not just concepts (I have no problem in having concepts included as well, as long as they are in their shortest form). end-user licence agreement is an important technical term, which is formed differently in various languages, e.g. in Russian it's expressed more briefly, only "пользовательское соглашение" (pólʹzovatelʹskoje soglašénije) (usually it's the other way around) - made of two words, literally: "user's agreement". Knowing end, user, licence and agreement doesn't help to arrive at "пользовательское соглашение". Besides, which agreement? Согласие or соглашение? Maybe it should be контракт. We have disagreements about definitions of words; some prefer the lowest-level granular words, but users don't care: they want to find terms, and they will use other dictionaries (economical or technical dictionaries to find end-user licence agreement) if we don't give them this information. As for the number of translations for "translation target" entries, one or two will be enough, IMHO. --Anatoli (обсудить/вклад) 00:44, 7 December 2012 (UTC)
Who's the "we" who considers these to be words? Also, how is "пользовательское соглашение" shorter than "end-user license agreement", other than having fewer spaces? --WikiTiki89 00:56, 7 December 2012 (UTC)
That's excluding "we", advocates of keeping entries like end-user licence agreement. You're right about spaces, the length of component words doesn't matter. --Anatoli (обсудить/вклад) 01:25, 7 December 2012 (UTC)
Fifteen might be too many, but one or two will cause the problem -sche is worried about (entries like if only you had not been able to make him take it all out from under me again for them). — Ungoliant (Falai) 01:09, 7 December 2012 (UTC)
well, I said, IMHO. It depends on how well-known that term is and the target language used. Let me clarify. No one tries to delete "gas station", even if one can argue that you can theoretically figure out what it means by knowing "gas" and "station". More complex terms get into trouble by being less frequent and less known or may be confused with "kitchen window" type of loose combinations. European languages are better understood by English wiktionary users and editors. Dutch, German, Finnish terms can be more disputable because of how words are joined together, even if they are not fixed terms or expressions, like German "Stadtbibliothek" - "city library", not a term we need to keep. A fine line is hard to define, no doubt, that's why entries should continue to be discussed case by case. No easy solution. --Anatoli (обсудить/вклад) 01:25, 7 December 2012 (UTC)
The tricky part with Dutch compounds is that while they can be understood by their parts, they cannot be predictably formed from their parts because Dutch has several slightly different but productive ways of forming compounds. Compounds can be formed either by just attaching the two nouns together, by inserting -e- between them, by inserting -en-, by inserting -s- and in a rare case even by inserting -er-. Native speakers usually have some kind of intuitive sense for which kind to use, but being a native speaker myself it's very difficult to explain how this works or why certain words or types of word prefer one kind of compounding. Historically these all derive from the various noun declension classes of Germanic, as well as genitive singular and plural formations. But this system has completely collapsed in Dutch and there is no longer any transparency, nor is the modern form always the historically expected form (since the Dutch spelling reform of 1995 which changed many -e- compounds to -en- compounds, counter to popular usage). And it doesn't explain why zonnescherm (sun screen) is an -e-infix compound while zonkracht (solar strength) is a zero-infix compound. So from this point of view, all Dutch compounds are somewhat idiomatic as there is more than one way to form them, but only one (occasionally two) way each one actually is formed. —CodeCat 03:06, 7 December 2012 (UTC)
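To make the point about linking elements concrete, here is a minimal Python sketch that lists the candidate forms a learner would have to choose between. The function name `candidate_compounds` and the `LINKERS` list are hypothetical names introduced for illustration, and the sketch is deliberately naive: it only concatenates strings and ignores Dutch spelling adjustments (so it would not produce zonnescherm from zon and scherm).

```python
# Linking elements named in the comment above: zero infix, -e-, -en-, -s-, -er-.
# LINKERS and candidate_compounds are illustrative names, not an existing tool.
LINKERS = ["", "e", "en", "s", "er"]


def candidate_compounds(first: str, second: str) -> list[str]:
    """Return one naive candidate compound per linking element.

    Plain string concatenation only; spelling changes such as consonant
    doubling are not modelled.
    """
    return [first + linker + second for linker in LINKERS]


if __name__ == "__main__":
    # Five candidates come out, and nothing in the parts tells a learner
    # which one is actually used.
    for form in candidate_compounds("gebruiker", "overeenkomst"):
        print(form)
```

The list contains both gebruikerovereenkomst and gebruikersovereenkomst, and picking the attested one requires information that is not in the component words.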
It's similar in German, the linking elements are very predictable, as far as I remember - -ø- (zero-infix), -s-, -(e)n- work as linking elements. Stadtzentrum, Staatssicherheit, Straßenbahn, etc. --Anatoli (обсудить/вклад) 11:19, 7 December 2012 (UTC)
It is curious though that the Dutch cognate of Stadtzentrum is stadscentrum. There is no cognate of Straßenbahn, but if I were to coin it, I'd go for either *stratenbaan or just *straatbaan; *straatsbaan doesn't sound right somehow. Curiously, I find hits for both straatbaan and stratenbaan on Google, but judging by usage neither of them refers to a Straßenbahn. That kind of demonstrates how unpredictable and arbitrary such compounds can be sometimes. —CodeCat 14:30, 7 December 2012 (UTC)
And that's why we have gebruikersovereenkomst as an entry. So people will know it is not gebruikerovereenkomst. It's not like it's impossible to find just because there is no translation table on some English entry pointing to it. --WikiTiki89 09:08, 7 December 2012 (UTC)
That's why I proposed the challenge. To see how hard it would be to find a Finnish translation. That's the only way to know for sure if it's easy enough for users to find - try yourself. —CodeCat 14:30, 7 December 2012 (UTC)
  • Strong support: SOP is out of control, and is arbitrarily limiting this project. For crying out loud, television show was deleted! Anybody who responds to my comment with a "what-about-this-oblique-redlink" I will not be pleased with. Purplebackpack89 (Notes Taken) (Locker) 01:52, 7 December 2012 (UTC)
CodeCat, I think it would be silly to try and cite something to show that nobody thinks it. Do I have to come up with zero citations of someone looking up user agreement to see if there are any single word translations of it? In which case, since I have zero citations, I've already succeeded. Mglovesfun (talk) 14:57, 7 December 2012 (UTC)
  • If you look at bilingual dictionaries, they don't have translation target entries. Instead they have lists of phrases that use a word under the entry for the word, with translations there. Monolingual dictionaries have the same thing, but without the translations. What we need to have is a Phrases tab (to keep the size of the entry page manageable), with separate entry lines for phrases using the word, and translation tables for those phrases. Phrases that are entries could consist of a "See ..." cross reference. Having such a tab would help prevent SOP entries, since it would be a way for the common phrases that people look for to show up in searches. In fact, I would go so far as to say that most of our problem with SOP stems from this gap in our coverage compared with that of other dictionaries. Chuck Entz (talk) 18:09, 7 December 2012 (UTC)
    Support Chuck Entz's suggestion. --WikiTiki89 18:20, 7 December 2012 (UTC)
    I like the idea of a separate namespace for collocations of a term. I don't like the idea of all the translations for all the collocations being on that page (though experience says that there won't be much effort put into the translations for quite some time). Would it work to have set the table appear on both Collocations_of:set and Collocations_of:table and translations just at Phrases:set the table? DCDuring TALK 19:55, 7 December 2012 (UTC)

Top-level domain entries

We have traditionally had entries for all Internet top-level domains: this includes the better-known generic TLDs, such as .com and .net; country-specific TLDs, such as .uk and .mv; and the other oddities like .int, .coop, and .museum. (I remember suggesting once before that the dot is misleading, as it only serves to separate the components of an address: the real TLDs are com, net, uk, etc.) In any case: after the stupid decision to let anyone create a TLD if they are willing to pay for it, we are now looking at any number of corporate-sponsored "TLDs" becoming available — "On 13 June 2012 ICANN has revealed nearly 2,000 applications for new top-level domains, which are expected to go live throughout 2013"[1] — so the current policy of including all TLDs (dotted or not) is going to become unsustainable. What are people's thoughts about this, and about what to do with the existing entries? Equinox 00:11, 5 December 2012 (UTC)

We have always given national, ethnic, governmental, and even partisan political entities exemption from rules that we apply strictly to private entities, especially for-profit ones. We can continue that unquestioned tradition into this realm as well. We could arguably apply WT:BRAND to the private TLDs. DCDuring TALK 01:12, 5 December 2012 (UTC)
So what do you really think? Equinox 01:14, 5 December 2012 (UTC)
Apply BRAND to them. Or let's VOTE to get rid of BRAND and allow brand entries that are attestable outside of advertising. Ie, apply BRAND to them. DCDuring TALK 19:58, 7 December 2012 (UTC)

Christmas Competition 2012 - Countdown

Yes, Countdown has finally arrived on Wiktionary!

This year's Christmas competition is based on the ever-popular British TV show Countdown. In the main part of this gameshow, contestants have to find the longest word from nine given letters - and that's what you have to do as well. Every day I shall choose nine letters (a mixture of consonants and up to four vowels) - I shall use a secret process based on Python code. You have to find the longest word you can using any of those letters once only. Your score is simply the number of letters in your word - but you get eighteen points for a nine-letter word. Now comes the tricky bit: multiple people are allowed to choose the same word in this competition. In the TV show the words are only declared after the time is up. Here, with contributors spread around the world, in different timezones, we need a method that will not disadvantage anyone, and lets no other contestant see your word until the time is up. I have chosen to use email for this purpose. I have created User:Countdown2012, and each day that the competition is run, you can send a Wiktionary email to this user. At the close of each day (some time after 8 a.m. (UK time)) I shall read the emails, allocate points, and display the points here. The overall winner will be the person who has accumulated the most points at the end of the competition (when we all get bored with it).
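The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not the organiser's code: the letter-choosing process is stated to be secret and is not reproduced here, the function names `can_form` and `score` are made up for the example, and awarding zero points for a word that cannot be built from the letters is an assumption. The sketch also does not check that the submission is a real word.

```python
from collections import Counter


def can_form(word: str, letters: str) -> bool:
    """True if every letter of `word` is available in `letters`,
    with each of the day's letters used at most once."""
    needed = Counter(word.lower())
    available = Counter(letters.lower())
    return all(available[ch] >= count for ch, count in needed.items())


def score(word: str, letters: str) -> int:
    """Points for `word`: its length, or eighteen for a nine-letter word.

    Returning 0 for an empty or unbuildable word is an assumption;
    the announcement does not say how invalid entries are handled.
    """
    if not word or not can_form(word, letters):
        return 0
    return 18 if len(word) == 9 else len(word)


if __name__ == "__main__":
    todays_letters = "REACTIONS"  # example letters, not an actual round
    for entry in ("creations", "reaction", "actress"):
        print(entry, score(entry, todays_letters))
```

With the example letters, a full nine-letter anagram earns the doubled score, an eight-letter word earns eight, and a word needing a letter twice when it appears only once earns nothing.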

Anyone (who has email enabled) can take part.

The prize is the same as in previous years.

The first nine letters will be posted here tomorrow - enjoy. SemperBlotto (talk) 22:53, 5 December 2012 (UTC)

Sounds fun :-) Can it be any language? Can I create the word’s entry? — Ungoliant (Falai) 23:00, 5 December 2012 (UTC)
Is there any restriction on, say, the use of Perl scripts and database dumps? (Such a restriction would rely on the honor system, of course.) —RuakhTALK 02:44, 6 December 2012 (UTC)
Thank you so much! You have saved us from the risk of enduring something cobbled together by yours truly. I really appreciate it.
A suggestion: maybe it would be better to put points and posted words on a subpage or project page somewhere, instead of clogging up the BP? Oh, and a question: does standard transliteration of languages in foreign scripts count? —Μετάknowledgediscuss/deeds 05:23, 6 December 2012 (UTC)

As soon as I have had my breakfast I shall create Wiktionary:Christmas Competition 2012 in which all your questions (and others that you haven't thought of yet) will be answered. SemperBlotto (talk) 08:46, 6 December 2012 (UTC)

I was a semi-finalist on the show in 2003. It's ultimately how I ended up contributing to Wiktionary. Mglovesfun (talk) 11:31, 6 December 2012 (UTC)
Oh no! We're all dead, subject to the ferocity of your mad skills! ;) —Μετάknowledgediscuss/deeds 20:56, 6 December 2012 (UTC)

veni, vidi, vici

I don't see why this article is here; what is the lexicological value of it? --Fsojic (talk) 14:54, 6 December 2012 (UTC)

Because some quotes and expressions acquire a certain level of proverbiality. See also, amongst many examples, something is rotten in the state of Denmark. Circeus (talk) 16:50, 6 December 2012 (UTC)
Yes, but what would be the "proverbial" signification here? --Fsojic (talk) 17:04, 6 December 2012 (UTC)
fr:veni, vidi, vici provides quotes, for French and for Latin. Lmaltier (talk) 20:44, 6 December 2012 (UTC)
My view: It is SOP in Latin and so it should be converted to an English entry. --WikiTiki89 20:50, 6 December 2012 (UTC)
Might well be. Did you check out fr.wikt's quotes? Looks like they both refer to Caesar anyway. The first one even mentions (briefly) the background of the quote. —Μετάknowledgediscuss/deeds 20:55, 6 December 2012 (UTC)
Is there any use of this quote in Latin as a kind of cultural reference to Caesar, in the same way we might reference, say, Winston Churchill? —CodeCat 03:13, 7 December 2012 (UTC)
None that I know of in Latin. However, in English contexts this is often quoted, and has the form of a set phrase, such that I've seen T-shirts that say "veni, vidi, visa (I came, I saw, I shopped)"; it is also the name of at least two restaurants, several companies, songs, and groups in the US, and it appears in the Philip Morris logo. This may still qualify only as a quote, but the fact that the Latin is quoted so often in English contexts leads me to believe it may have some value as an entry here. --EncycloPetey (talk) 04:28, 7 December 2012 (UTC)
I would say that if English speakers are generally expected to know this phrase (and therefore, cultural references to it like the one you showed), then it has entered the language as an idiomatic phrase. —CodeCat 14:35, 7 December 2012 (UTC)
Yes good spot, it seems to be literal only. Not really any different to Churchill's never in the field of human conflict was so much owed by so many to so few, is it? Mglovesfun (talk) 15:00, 7 December 2012 (UTC)
If we are going to have allusive phrases, then we should really get rid of WT:BRAND because so much of culture is commercial these days. I find it hard to understand why we discriminate lexically against commercial culture. Once something is attestable outside of paid advertising and use that could have been planted by commercial interests, it has entered culture at least as much as political slogans, fairy tales, ethnic and national myths, religion, "well-known works", etc.
Or is this just another manifestation of some kind of "merit"-ocratic crypto-prescriptivism rather than our oft-stated descriptive approach to language? DCDuring TALK 20:11, 7 December 2012 (UTC)

WOTDs for Christmas and New Year's Day

Any suggestions for the Words of the Day for Christmas and New Year's Day? Astral (talk) 04:20, 8 December 2012 (UTC)

How about wassail? It's true that it's more strictly associated with Christmas Eve, but one might be able to extend it to the rest of the season. The etymology (which will need to be added) is interesting because it's actually a fossilized Old Norse sentence meaning "be well!" It started out as the kind of thing that one would say as a toast, then became the term for the toast itself, then the term for what you drank when you made the toast, and finally also became a verb for going from house to house singing carols (presumably in expectation of being served wassail). A very cheery holiday term. Chuck Entz (talk) 06:12, 8 December 2012 (UTC)
I would like de novo for New Year's, a phrase similarly fossilized from Latin... but I'm rather biased that way, so maybe a more Englishy word is in order. —Μετάknowledgediscuss/deeds 06:46, 8 December 2012 (UTC)
By the way, do we have anything special planned for the end of B'ak'tun 13 on December 21? Given all the 2012 hype, I think it would be appropriate to run something boring like eschatology. Chuck Entz (talk) 09:04, 8 December 2012 (UTC)
I set "b'ak'tun" for the 20, and "solstitial" for the 21, and "doomsayer" and some other things in the days leading up to it. :b - -sche (discuss) 19:14, 8 December 2012 (UTC)

Oppose wassail, lacks etymology and too British. Solendma (talk) 09:14, 8 December 2012 (UTC)

I like wassail and de novo. Both seem "exotically useful" — that is, words with which the average reader is likely to be unfamiliar, but not so obscure or specialist as to prevent them from making useful additions to an average reader's vocabulary. It looks like the carolling sense still needs to be added to wassail though. Astral (talk) 00:29, 10 December 2012 (UTC)
Never mind, de novo was enWOTD back in Feb 2011. So much for that. —Μετάknowledgediscuss/deeds 02:37, 10 December 2012 (UTC)
I like wassail as well. There's a carol that begins "Here we come a-wassailing..." and I always wondered as a child what that was supposed to mean, but I didn't know the spelling or have the internet, and no one I asked knew either. --EncycloPetey (talk) 05:55, 10 December 2012 (UTC)
Set wassail for December 25 and inception for January 1. Astral (talk) 12:30, 22 December 2012 (UTC)
And maybe we could have for Dec 26th - I know most Americans are unfamiliar with the term. SemperBlotto (talk) 12:33, 22 December 2012 (UTC)

A question about category naming

Several languages have some words that are both masculine and feminine, but the categories that list them are differently named, e.g. Category:Russian nouns with multiple genders and Category:Greek nouns of common gender. I'll need a similar category for Latvian, but I'm not sure what to name it. Shouldn't there be a consistent name? My personal opinion: I think "multiple" sounds like overkill, and "common" is more appropriate for cases like Dutch, in which you have a "common" vs. a "neuter" gender; my preference would be to use "ambigenous" instead. Unless there already is some policy on this issue. Is there? --Pereru (talk) 10:25, 8 December 2012 (UTC)

A gender called "common" and multiple genders, it's not the same thing. In French, too, many nouns have multiple genders, e.g. Russe may be masculine or feminine. But the only possible genders are masculine and feminine, there is no "common" gender. Lmaltier (talk) 11:03, 8 December 2012 (UTC)
Indeed. (And I see there's no category for such words in French, right?) I'm thinking about proposing "ambigenous" as the general word and then a general poscatboiler category for ambigenous words (say, Category:LANGUAGE_NAME ambigenous words). Wouldn't that be a good idea? --Pereru (talk) 16:16, 8 December 2012 (UTC)
A category exists: fr:Catégorie:Noms multigenres en français. But it seems that it does not include obvious cases such as Russe. Lmaltier (talk) 19:09, 8 December 2012 (UTC)
But this is a different category -- it lists words that have different senses when masculine than when feminine (e.g., physique, both physics and physique). As the category description says, these are mostly cases of homography/homophony. The Latvian (and Russian, etc.) cases are something else: words that actually can be used (with the same sense) as masculine or as feminine, depending on whether they refer to males or females (e.g., pļāpa (chatty person, chatterbox) is masculine and takes masculine adjective agreement when it refers to males, and feminine, taking feminine adjective agreement, when it refers to females). --Pereru (talk) 00:15, 9 December 2012 (UTC)
According to the category-description that Lmaltier links to, those are called epicène in French (English epicene). —RuakhTALK 00:46, 9 December 2012 (UTC)
Actually, it's épicène in French. Lmaltier (talk) 18:06, 10 December 2012 (UTC)
I went ahead and created Category:Latvian ambigenous nouns. Should a basic Category:Ambigenous nouns (or maybe Category:Epicene nouns if you prefer) be created? --Pereru (talk) 00:55, 15 December 2012 (UTC)

Category:Latin inflection-table templates

Good afternoon,

I am wondering why Latin is treated differently from, say, Russian or German, where there are three categories : one for verbs, of course, but one for adjectives and one for nouns, not a single one for both. I just created Category:Latin noun inflection-table templates, but will it be useful ? --Fsojic (talk) 16:13, 8 December 2012 (UTC)

Because the distinction between noun and adjective in Latin is not as clear-cut. Many nouns are used as adjectives, and many adjectives are used as nouns, so the inflection tables overlap. It isn't possible to separate them without much unnecessary duplication. There are also other parts of speech in Latin that use some of these same tables, including participles and certain verb forms. --EncycloPetey (talk) 16:16, 8 December 2012 (UTC)
Finnish and Estonian are languages where it doesn't make sense to split them, because the same tables are used for both. However, I don't think this really applies to Latin. I actually did split them at one point (showing that it could work) but then someone decided to undo all of it. Even so, every template is used for either nouns or adjectives, rarely for both. —CodeCat 16:37, 8 December 2012 (UTC)
Is it ever used for both, actually? --Fsojic (talk) 16:42, 8 December 2012 (UTC)
Yes. --EncycloPetey (talk) 17:27, 8 December 2012 (UTC)
Where? —CodeCat 19:24, 8 December 2012 (UTC)
Examples of Latin words that violate the claim that "every template is used for either nouns or adjectives": noster (our, ours) - which we list as an adjective because of declension, but many texts list as a pronoun because of its meaning - ego - a strict pronoun - humans - a present participle (one of thousands) - lectus - a perfect participle (one of thousands) - amandum - a gerund - complures - a word that is primarily an adjective, but also has a noun sense. These were the quickest and easiest to come up with. Listing more would take research I don't currently have time for, but this should serve to show why the category can't be simply divided into "nouns" and "adjectives"; there are additional parts of speech here, some of which overlap on the templates with nouns or adjectives, and some adjectives that have noun (substantive) senses. --EncycloPetey (talk) 00:59, 9 December 2012 (UTC)
Pronouns are neither nouns nor adjectives. Participles are not nouns because they have several gendered forms, so they are more like adjectives. You use adjectives as nouns but that doesn't change that they are still adjectives. Amandum, as a gerund, is a noun/substantive. Complures is an adjective too. So really, I am not convinced at all by those examples. Each of them is clearly either noun-like (one gender) or adjective-like (multiple genders). —CodeCat 01:21, 9 December 2012 (UTC)
EncycloPetey is right. Latin just doesn't work like that. Learn even basic Latin grammar and you'll see that to be true. Our templates even acknowledge this. Take a look at for example - the abl. sing. changes based on how it's used grammatically - vs . And if you're hung up on genders, take a look at . When used nominally, I translate it as "friend"; when adjectivally, as "friendly". But what if the friend is female, or the noun that is friendly is feminine? Then both noun and adjective switch to . I honestly believe that native Latin speakers considered the different parts of the speech under the same lemma in a situation like that to be the same word. —Μετάknowledgediscuss/deeds 05:29, 9 December 2012 (UTC)
But on Wiktionary we treat them separately, like amicus shows. I don't understand why we can't make a clear separation between noun-like templates (with one gender) and adjective-like templates (with multiple genders). From what I can see, this split is already very clear in the templates we use, and the only objections are grammatical technicalities that don't matter for the broad picture. —CodeCat 13:40, 9 December 2012 (UTC)
That's not true. We only treat them separately when the substantive form of the adjective has a fixed gender, like senex ("old" / "old man"). In situations where the substantive form of the adjective has variable gender, we list a (substantive) definition line rather than creating multiple separate noun sections, simply for the convenience and for the fact that (as pointed out) native Latin speakers did not distinguish those parts of speech. There are situations where this has not been correctly handled, as has been pointed out, but that doesn't retroactively change the Latin language. A division between single-gender templates and multiple-gender templates is only one way to divide these templates up, and it is not a very satisfactory way of doing it, because "noun-like" and "adjective-like" don't adequately describe what you're saying. Why not by number, as there are some templates with just singular, just plural, or both? Why not by declension (1st, 2nd, etc.)? There are multiple variables in the way all of these templates are set up, and not just gender. And what would be the advantage of saying: here is a category for tables of words with only a single gender, versus: here is a category for tables useful for words with multiple genders? How confused will a person be who is looking for the template for a noun that can be either of two genders? Hence, the templates are divided up as they currently are into "conjugation-type" templates and "declension-type" templates. --EncycloPetey (talk) 17:54, 9 December 2012 (UTC)

Taana (ތާނަ) Transliteration

Hello, everyone. I am considering making a policy for transliterating Taana, the script in which the Dhivehi language is written. My reason for using the spelling Taana is that the spelling Thaana can imply that its initial consonant, th, is aspirated. Therefore, can we make a policy that uses the letter h for the letter ހ (ހާ, hā) only? It could also use a dot for letters ޅ (ḷ), ޓ (ṭ), ޑ (ḍ), and ޱ (ṇ). For ޗ, we could use just the letter c. How is it? --Lo Ximiendo (talk) 14:33, 10 December 2012 (UTC)

Here's a table, which anyone can modify:

Characters of the Thaana script
(vowels are displayed with an alifu carrier)
Grapheme Name Nasiri Romanization IPA value Proposed Transliteration
haa h [h] h
shaviyani sh retroflex [ʃ] sh note
noonu n [n̪] n
raa r [ɾ] r
baa b [b] b
lha viyani lh [ɭ] note
kaafu k [k] k
alifu varies '/vowel carrier note
vaavu v [ʋ] v
meemu m [m] m
faafu f [f] f
dhaalu dh [d̪] d
thaa th [t̪] t
laamu l [l] l
gaafu g [ɡ] g
gnaviyani gn [ɲ] ñ
seenu s [s̺] s
daviyani d [ɖ]
zaviyani z [z̺] z
taviyani t [ʈ]
yaa y [j] y
paviyani p [p] p
javiyani j [dʒ] j
chaviyani ch [tʃ] c
ttaa Arabic-to-Maldivian
hhaa note
khaa x
zaa English-to-Maldivian transliteration [ʒ] ž note
sheenu Arabic-to-Maldivian š note
saadhu note
daadhu note
to note
zo note
aïnu ʿ note
ghaïnu ġ note
qaafu q note
waavu w
aba fili a [ə] a
aabaa fili aa [əː] ā
ibi fili i [i] i
eebee fili ee [iː] ī
ubu fili u [u] u
ooboo fili oo [uː] ū
ebe fili e [e] e
eybey fili ey [eː] ē
obo fili o [ɔ] o
oaboa fili oa [ɔː] ō
sukun varies '
Naviyani [ɳ] note

Any questions? --Lo Ximiendo (talk) 15:13, 10 December 2012 (UTC)

it looks fine for now -- Luceatlux (talk) 15:33, 10 December 2012 (UTC)

Using an Arabic character in a Latin transliteration kind of defeats the whole point. If you need to represent that sound, I'd recommend the half-ring character ʿ. -- Liliana 16:13, 10 December 2012 (UTC)
Good point. Liliana, could we school Luceatlux (talkcontribs) on how we make headword templates? --Lo Ximiendo (talk) 16:15, 10 December 2012 (UTC)
Do you mean an example entry? I've picked a random entry - އިރު(iru) and picked the most essential part. The transliteration is optional, only if known, so is the plural form.

{{head|dv|noun|sc=Thaa|tr=iru}}, {{Thaa|[[އިރުތައް]]}} (iruthak) {{p}}

# [[sun]]

The word "noun" can be replaced with other parts of speech. --Anatoli (обсудить/вклад) 02:53, 11 December 2012 (UTC)
I think we may want to create Wiktionary:Dhivehi transliteration, {{dv-noun}}, {{dv-verb}}, {{dv-adj}}, etc., transliteration needed. (The plural can be optional in case we come across uncountable nouns. Dhivehi nouns can also have indefinite article forms. Ask Luceatlux (talkcontribs) if you want to know more.) --Lo Ximiendo (talk) 03:07, 11 December 2012 (UTC)
re ށ, I think this sentence from w:Maldivian phonology is referring to it: "/r/, a voiceless alveolar flap or trill, is peculiar to Maldivian among the Indo-Aryan languages. But some people pronounce it as [ʂ] a retroflex grooved fricative." The first sound could be represented by either an hr or an rh digraph. The second is basically Sanskrit or Mandarin sh.
ށ with a sukun is pronounced as a glottal stop. But with all other vowels it is just "sh".--Luceatlux (talk) 19:10, 12 December 2012 (UTC)
re ޏ, this sound is also gn in French which is where Dhivehi got it. In Portuguese it's nh, and in Catalan ny. I like ny for this, because that's how it's written in English loanwords from Spanish such as canyon. Chuck Entz (talk) 06:27, 11 December 2012 (UTC)
It's also spelled ñ and n, as in jalapeño and jalapeno. I'm happy with ñ, as that's least ambiguous.--Prosfilaes (talk) 05:13, 12 December 2012 (UTC)
I'm ok with this. Using ñ for ޏ--Luceatlux (talk) 19:16, 12 December 2012 (UTC)
I noted the ones I am unsure about. I'll show them to some other people.--Luceatlux (talk) 21:59, 12 December 2012 (UTC)
Luceatlux (talkcontribs) can create Wiktionary:Dhivehi transliteration if he wants; so if he starts, we finish, savvy? --Lo Ximiendo (talk) 23:23, 13 December 2012 (UTC)
I created the page. I left the things I was unsure of. Feel free to edit and add to it.--Luceatlux (talk) 01:19, 14 December 2012 (UTC)
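The letter-level proposals made in this thread (h, dotted ḷ/ṭ/ḍ/ṇ, c, and ñ) can be sketched as a simple lookup in Python. This is only an illustration under stated assumptions: the names `PROPOSED` and `transliterate_letters` are hypothetical, the Unicode code points are the ones assumed here for the letters named in the thread, and vowel signs, sukun, and every other letter are passed through unchanged rather than handled.

```python
# Proposals taken from this thread only; anything not listed is left untouched.
PROPOSED = {
    "\u0780": "h",  # haa
    "\u0785": "ḷ",  # lhaviyani
    "\u0793": "ṭ",  # taviyani
    "\u0791": "ḍ",  # daviyani
    "\u07b1": "ṇ",  # naviyani
    "\u0797": "c",  # chaviyani
    "\u078f": "ñ",  # gnaviyani
}


def transliterate_letters(text: str) -> str:
    """Replace each covered Thaana letter; pass every other character through."""
    return "".join(PROPOSED.get(ch, ch) for ch in text)
```

A full scheme would also need the vowel signs and the alifu and sukun rules from the table, which are still marked as unsettled above.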

Є/є in Old Church Slavonic

In modern Cyrillic this is called w:Ukrainian Ye, and has a distinct usage and meaning. However, in Old Church Slavonic, it was just a graphical variant of Е/е and was completely interchangeable with it at first instance. We have some entries that use Е but most seem to use Є. Presumably this should be harmonized if there really is no difference between them. But should all uses be changed to Е or to Є? —CodeCat 18:38, 11 December 2012 (UTC)

Move to Е. --WikiTiki89 18:46, 11 December 2012 (UTC)
I would agree too, but... OCS has pairs of regular and iotated vowels, with the latter written as a ligature of I and that vowel. The iotated version of Е and Є is Ѥ, based on the letter form of the Є. So for the sake of visual consistency, Є might be preferable. —CodeCat 18:52, 11 December 2012 (UTC)
And of course since most entries already use Є, it would be less work to keep it. —CodeCat 18:54, 11 December 2012 (UTC)
Yet another reason I hate Unicode. Theoretically, an OCS font would solve that problem (and many others relating to graphical variants). But even if we can't get an OCS-specific font, I still think E is better, although I can't think of any solid arguments other than that Є was created for Ukrainian specifically as an iotated version of E and therefore it makes less sense to use it as a non-iotated letter. --WikiTiki89 19:01, 11 December 2012 (UTC)
It's similar in a lot of ways to the distinction between i and j and between u and v in Latin. They weren't distinguished in Latin itself, but modern usage has assigned different phonetic values to them. Just as Ukrainian has done with these letter forms. —CodeCat 19:07, 11 December 2012 (UTC)
I guess so. But also, etymologically the sound represented as E/Є in OCS is represented almost exclusively by E in modern Slavic languages (unless the sound changed to /o/ or /i/, in which case it is represented by another letter entirely). Ukrainian, which is the only language that distinguishes (or ever distinguished) E from Є, uses E for OCS E/Є and Є for OCS Ѥ. So, it would make more sense to use E. --WikiTiki89 19:16, 11 December 2012 (UTC)
I'm happy to move all CU entries with є to е, leaving a redirect behind. There were rules about when є could be used (at the beginning of a word, in some endings, e.g. "къ отцємъ", to distinguish some homophones, and as the number "5"(!)), but all words (except in the numerical usage) can alternatively be written with е. --Anatoli (обсудить/вклад) 00:14, 12 December 2012 (UTC)
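The proposed move can be pictured as a plain character substitution on entry titles. The Python sketch below is only an illustration: the names `YEST_MAP` and `normalize_yest` are hypothetical, it knows nothing about the numerical usage, and it deliberately leaves the iotated Ѥ alone.

```python
# Map the "Ukrainian ie" code points (U+0454, U+0404) to plain Cyrillic ie
# (U+0435, U+0415); no other character is changed.
YEST_MAP = str.maketrans({"\u0454": "\u0435", "\u0404": "\u0415"})


def normalize_yest(title: str) -> str:
    """Return the title with є/Є replaced by е/Е."""
    return title.translate(YEST_MAP)
```

The old title would then become the redirect and the normalized one the entry.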

Sorry to show up late.

I believe Unicode intends е to represent the normal/narrow variant of yest, and є to represent the broad/anchor form. This usage is compatible with the Unicode fonts that I have, and with modern Church Slavonic font usage. I can’t find proof of this in Unicode’s documentation, but this assumption is clear in some important documents, like some of Michael Everson’s proposals[2][3] which were incorporated into the Unicode standard. See the notes on “combining Cyrillic letter ie” and “combining Cyrillic letter Ukrainian ie,” and the associated figures.

So yes, let’s change most uses of є to е. As much as possible, let’s retain the є in any terms where the broad yest is the attested usage, provide alternate-form entries where it is an alternate usage, and, of course, in citations that follow the source. I will start Wiktionary:About Old Church Slavonic. Michael Z. 2013-02-11 21:06 z

What exactly are the broad and narrow variants? —CodeCat 21:30, 11 February 2013 (UTC)
Different stylistic forms. They are equivalent, but writing conventions determine where they are used, especially in modern Church Slavonic, and both can appear in the same text. This is why they are separate code points. There is also a rarely-seen reversed iest. Michael Z. 2013-02-11 21:58 z
But what are they? —CodeCat 22:00, 11 February 2013 (UTC)
If I understand Michael Z. correctly, the "broad" variant is the variant found as the first letter of the second and fourth words in this image, and the "narrow" variant is the variant found everywhere else. Thus we should transcribe the second word of the image as [[єже]]. Is that what you mean, Michael? —Angr 22:20, 11 February 2013 (UTC)
I think that is being overly precise and pedantic. The two varieties didn't actually have any difference in meaning in OCS, and we regularly normalise such varieties for other languages too. —CodeCat 22:25, 11 February 2013 (UTC)
I agree (with CodeCat); consider that we don't allow entries with long s. (But are we correctly understanding what you're proposing, Michael?) - -sche (discuss) 00:13, 12 February 2013 (UTC)
Yes, those are the narrow and wide variants of the letter iest/jest/yest in Angr’s image, sorry. If you have a good Cyrs font installed: е and є.
I don’t know enough about old Cyrillic writing to judge whether the broad є is just used in certain positions in a sentence, or properly belongs to particular words. Old Cyrillic writing has a lot of orthographic variation, which probably belongs under an “Alternative forms” heading. Michael Z. 2013-02-12 00:29 z

Entries for reflexive Spanish verbs

I am a beginning Spanish speaker and use Wiktionary to look up unknown words. Today I had quite a bit of trouble defining "me visto". I first found no entry for it though eventually found an entry for "vestirse" but only once I realized that my "visto" was derived from "vestir" via Google--is there a policy against separate entries for conjugations of reflexive verbs? If not, I will volunteer to add at least this one. Second, I found that "visto" derived from "vestir" was not listed under the entry for "visto" derived from "ver"--I'm really new to this but can I create an 'Etymology 2' entry defining "visto" per "vestir"? —This unsigned comment was added by (talk).

I added the form of vestir at visto. — Ungoliant (Falai) 23:40, 11 December 2012 (UTC)
Thanks!! —This unsigned comment was added by (talk).

"✒ There are new messages for you" (MediaWiki:Lqt-new-messages).

Previous discussion: Thread:User talk:George Animal/Eine Bitte.

[[MediaWiki:Lqt-new-messages]], which is the message that LiquidThreads puts at the top of your watchlist whenever anyone posts a message in a thread you've contributed to, or in a thread (new or pre-existing) on a page you're watching, or in a thread (new or pre-existing) on your talk-page, defaults to this:

✒ There are new messages for you.

which can be pretty confusing if you're watching someone's talk-page and someone leaves them a message. Does anyone object to my changing it to:

✒ There are new messages in a thread you are watching.


RuakhTALK 15:14, 12 December 2012 (UTC)

But that's not quite true either (as you noted). True would be "There are new messages on a page that also has a thread you are watching".​—msh210 (talk) 16:17, 12 December 2012 (UTC)
Are you sure? My understanding was that each thread is watched separately; so you should only get notifications if you're actually watching that specific thread. (Granted, one way to be watching a specific thread is to watch the page that contains it; but watching a thread doesn't cause you to watch its page, so doesn't cause you to watch all other threads on the same page.) —RuakhTALK 16:30, 12 December 2012 (UTC)
Yes. Even if you watch only the parent page, all messages posted there trigger this message. I have a lot of old talk pages on my watchlist from pre-LQT times, and all of them give me this message. -- Liliana 18:48, 12 December 2012 (UTC)
. . . as I said. —RuakhTALK 23:12, 12 December 2012 (UTC)
You may be right: I don't doubt it. I only ever tried watching a standard user-talk page, and was notified of any LQT changes thereto. I never tried watching a specific thread, never having (AFAIR) posted in one. (See e.g. diff.)​—msh210 (talk) 20:10, 13 December 2012 (UTC)
Threads go in their own namespace, so you can filter your contributions-page to show only that namespace, and thereby see your posts in threads: Special:Contributions/Msh210?namespace=90. As it happens, it's been more than a year since you posted in one, but you're presumably still watching those, if that helps you figure anything out. —RuakhTALK 02:29, 14 December 2012 (UTC)
No wonder LQT hasn't won everyone over. Can it be fixed to make more sense? Is there a bug about the less-than-fully-helpful nature of the notifications? DCDuring TALK 16:26, 12 December 2012 (UTC)

Nobody has raised a valid complaint, AFAICT, and I think we all seem to agree that Ruakh's emendation is an improvement. So, can we change it now? —Μετάknowledgediscuss/deeds 23:57, 18 December 2012 (UTC)

Okay with me.​—msh210 (talk) 17:49, 19 December 2012 (UTC)

Users Fête (talkcontribs), Phung Wilson (talkcontribs) and his IPs


The French Wiktionary would like to warn the English Wiktionary that users Fête (talkcontribs) and Phung Wilson (talkcontribs) are, first, the same person, and are also about to be blocked indefinitely on the French Wiktionary for their contributions. Please watch them or, if possible, block them.

Fête's (Phung Wilson's) contributions concern the pronunciation of French words in the province of Quebec. He harasses people, like me, on the French Wiktionary, asking for pronunciations (the ones I mean in this message). This user also caused one user to leave the Wiktionnaire. Fête has not mastered Quebec French pronunciation, and keeps doing this even after warnings.

He has even been blocked for one month, and used other accounts to continue his contributions while he was blocked. Please, again, watch these accounts, or block them, but you have the last word because we don't have any authority here. Ĉiuĵaŭde (talk), Wiktionnaire Francophone (French Wiktionnary); 23:02, 14 December 2012 (UTC)

Dankon. He basically gave away his sockpuppetry himself. I'm still unclear on exactly what his offense was at Wiktionnaire, however. How is asking for pronunciations harassment? —Μετάknowledgediscuss/deeds 00:50, 15 December 2012 (UTC)
How is asking for pronunciations harassment?--Luceatlux (talk) 03:42, 15 December 2012 (UTC)
"Answer me please!" — Actarus (Prince d'Euphor) 06:27, 15 December 2012 (UTC)
By the way, Fête was also blocked indefinitely on the French Wikipedia a month ago. — Actarus (Prince d'Euphor) 06:51, 15 December 2012 (UTC)
I know. Unfortunately, the Frenchies did not deign to leave a pointer to an edit that displays his "harcèlement de contributeurs". I guess I'll look about the place. —Μετάknowledgediscuss/deeds 06:57, 15 December 2012 (UTC)
That's odd. Suddenly his talkpage became viewable to me again. I found it a bit annoying to navigate fr.wikt talkpages (too much like a Wikipedia!), but I understood most of it. I'll look through his edits and see if he's doing the same stuff here. —Μετάknowledgediscuss/deeds 07:05, 15 December 2012 (UTC)
I wouldn't exactly call it harassment. It's just a bit annoying that he keeps asking the same sorts of questions and can't seem to figure out the patterns. What he does wrongly is to change English pronunciations even when I tell him they are wrong, to add pronunciation to entries not in his native language, and to "update" all the audio samples for all of the IPA characters on Wikipedia, when he is clearly not an expert on the IPA characters. --WikiTiki89 07:07, 15 December 2012 (UTC)
Right. Blockworthy, though? I'm on the fence, because I trust the French, but OTOH he does seem to be causing rather minor damage which you are kindly cleaning up after. —Μετάknowledgediscuss/deeds 07:10, 15 December 2012 (UTC)
Well I'm not going through every edit he does, only if it happens to be on my watchlist. I think he needs a formal warning. It's too early to block him. --WikiTiki89 07:17, 15 December 2012 (UTC)
I hate to say this... but could you please give the warning? —Μετάknowledgediscuss/deeds 07:18, 15 December 2012 (UTC)
But I'm not even an admin... --WikiTiki89 07:41, 15 December 2012 (UTC)
So? You can make threats you can't carry out, because if they're reasonable and based on community discussion, then a notification at WT:VIP should be sufficient to get an admin to come over and make a block. I just don't think that I'm knowledgeable enough about his actions and good enough with handling people to do the job. —Μετάknowledgediscuss/deeds 16:02, 15 December 2012 (UTC)
If you find him doing something which you think he should not be doing, let him know on his user talk page. Otherwise, he will not know there is an issue. You do not need to give any warning of the form "stop, or you will be blocked". What you need to do is to demonstrate on his talk page that he has been notified of problems with his editing. If you do not want to receive questions from him, let him know on his user talk page. --Dan Polansky (talk) 17:07, 15 December 2012 (UTC)
Dan, we warned him enough on the French Wiktionary and he continues as if nothing had happened. Ĉiuĵaŭde (talk), Wiktionnaire Francophone (French Wiktionnary); 17:17, 15 December 2012 (UTC)
Maybe that'll give him impetus to listen here. But I don't see this as a blockable offence, especially if the editors he's talked to here don't see it as one. - -sche (discuss) 17:20, 15 December 2012 (UTC)
I'll give him a chance, because he's not on our territory, but you can talk to me or to the admins of the French Wiktionary if there's a problem, even if it hasn't happened before. Ĉiuĵaŭde (talk), Wiktionnaire Francophone (French Wiktionnary); 17:25, 15 December 2012 (UTC)
He's still at it with pronunciations in languages he doesn't know well [4]. What I find most of concern is that the audio file he added is labeled in Commons as contributed by a native speaker, but his Babel box says he's level 1. Given that we don't have many who can verify the pronunciations in some of the languages he does, we need to be able to trust him- and that doesn't help. Chuck Entz (talk) 18:20, 21 December 2012 (UTC)

Good discussion. The problem with him is that he asks the same people the same sort of questions (about pronunciations) a million times. When nobody answers, he asks (or orders) them to answer. He succeeded in making one editor leave Fr Wikt, so I'd take his case seriously, because his contributions here are similar to the ones on Fr Wikt. Ĉiuĵaŭde (talk), Wiktionnaire Francophone (French Wiktionnary); 15:24, 15 December 2012 (UTC)

He has been asking me stuff over at en: and it _is_ getting irritating to have these kinds of questions repeatedly asked by a clearly nonnative speaker. Circeus (talk) 21:31, 23 December 2012 (UTC)

Block him. Ĉiuĵaŭde (talk) 17:50, 25 December 2012 (UTC)

{{obsolete name of}}

I created this template some time ago, and have used it with Latvian terms that had been replaced by more recent ones (like laika vārds (verb), replaced by darbības vārds). Now, CodeCat pointed out at the time that this led to non-standard formatting, and others also criticized the resulting format (e.g., -sche in the template talk page). It was suggested that I should simply use {{obsolete}}, and mention the replacing word elsewhere (e.g., under ===Synonyms===). That is in principle OK with me, but I had one problem: {{obsolete}} categorizes words into Category:English terms with obsolete senses rather than into Category:English obsolete terms (note that, e.g., {{archaic}} doesn't do that: it categorizes into Category:English archaic terms, not into the (non-existing) Category:English terms with archaic senses).

I personally think that fully obsolete or archaic words should be directly categorized into a "... obsolete terms" or "... archaic terms" category, not a "... terms with obsolete/archaic senses" category. Since {{archaic}} does that, I had no problems formatting archaic entries (see, e.g., tallerķis). But since {{obsolete}} does not behave like {{archaic}}, I thought I should ask people here whether these two templates could be made to do the same before changing the Latvian obsolete term entries. (Could we, for instance, have two templates for each label, one for "words with archaic/obsolete senses" which also have current, non-obsolete senses, and another one for fully obsolete/archaic, no longer used words? Say, {{archaic}} alongside {{archaic sense}}, and likewise for {{obsolete}} and {{obsolete sense}}?) --Pereru (talk) 01:11, 15 December 2012 (UTC)

I support making separate templates for specific senses of a word like that. I also don't think they should categorise. Category:English archaic terms is ok, but I'm not sure what the use of Category:English terms with archaic senses would be. —CodeCat 01:14, 15 December 2012 (UTC)
Category:English terms with archaic senses is more accurate than "Category:English archaic terms", because words like "absorb" are not archaic(!)... hence RFMs are open to move such categories (WT:RFM#Category:English_dated_terms, WT:RFM#Category:English_archaic_terms). I could support having two templates (one for archaic senses and one for entire words), and even categorising archaic/obsolete/etc forms of terms into a third class of category... though I worry such a three-way division of categories will be too complex for anyone to follow for long, and will cease to be followed. - -sche (discuss) 02:09, 15 December 2012 (UTC)
That's why I suggested getting rid of the "obsolete senses" category and keeping only "obsolete terms". We can use that for both obsolete lemmas and obsolete forms of lemmas. —CodeCat 02:56, 15 December 2012 (UTC)
A large share of English headwords have multiple etymologies, PoSes, and senses. The senses can be dated, obsolete, archaic, etc. To place English headwords in categories with silly names that wrongly imply that the headword is "obsolete" etc when that is often not the case is simply unacceptable. If those of a uniformitarian impulse would confine their standardizing designs to languages other than English I would be much happier.
I trust the judgment of those familiar with each of the languages we actively support to come up with naming schemes for categories for each language. But the uniformitarians do not. Perhaps the uniformitarian name scheme should best be considered a default one for a given language to be replaced as soon as its strictures chafed. DCDuring TALK 03:39, 15 December 2012 (UTC)
I totally don't understand what you're saying. —CodeCat 03:50, 15 December 2012 (UTC)
@CodeCat: if we're to have only one category, it should be Category:English terms with archaic senses, which is always accurate: even if a word has no non-archaic senses, if it has archaic senses, it has archaic senses (tautologically). In contrast, Category:English archaic terms is often wrong: "absorb" is not archaic. Those who've voted at RFM so far have agreed with that rationale. - -sche (discuss) 03:56, 15 December 2012 (UTC)
Then you misunderstood what I am suggesting. I am suggesting to have no category at all for words that only have some senses that are obsolete. In effect, we would categorise only entire words that are obsolete, not words that are still used in at least some senses. —CodeCat 04:12, 15 December 2012 (UTC)
Aha, I did misunderstand, I'm sorry. Hm.. yeah, even though I think having 2-3 categories and templates would be ideal, I think either your suggestion of not categorising partially-archaic terms or my suggestion of putting everything with any (including 100%) archaic senses in the "archaic senses" cat, is more likely to actually work. - -sche (discuss) 04:45, 15 December 2012 (UTC)
Nor, obviously did I. Conceptually, it is appealing. How would you suggest your scheme be implemented and, most importantly, maintained? DCDuring TALK 14:18, 15 December 2012 (UTC)
How about having {{archaic}} categorize into Category:English archaic terms, and then having another template {{archaic sense}} that doesn't categorize into any category (or into Category:English terms with archaic senses if we want to keep two categories)? (Personally, I like CodeCat's suggestion. Sounds very commonsensical to me.) --Pereru (talk) 16:27, 15 December 2012 (UTC)
A further note. Even though a category like Category:English terms with archaic senses containing fully, not only partially, archaic words is OK in purely logical terms, the casual user, not versed in the intricacies of formal logic, would find this misleading. S/He would probably either assume that ALL words in Category:English terms with archaic senses are fully archaic, or else that NONE of them is fully archaic. Either way, s/he wouldn't find what s/he was looking for. --Pereru (talk) 16:31, 15 December 2012 (UTC)
That would seem to mean that we need to complexify {{archaic}}, {{obsolete}}, etc so that there is a switch which selects the category: "X terms" or "X terms with obsolete senses". The default would have to be the "senses" category or no category. Wouldn't we need a bot to check to make sure that the categorization corresponded to the facts of the entry, ie, that there were no non-X senses? Do we have the capability and willingness to do that? Or would we rely on what was detected by the observant contributor? DCDuring TALK 16:48, 15 December 2012 (UTC)
How would the switch work with {{context}}? DCDuring TALK 16:49, 15 December 2012 (UTC)

Here is one possibility: make {{obsolete}}, {{archaic}} (et alii) all categorize into Category:English terms with obsolete senses, etc. (which means {{context}} also will, if these "obsolete", "archaic" etc. are given as labels). Then create a new {{obsolete term}} or {{obsolete word}} or whatever, and have a bot go through all the words in Category:English terms with obsolete senses to detect all entries that have exclusively obsolete senses; the bot will delete {{obsolete}} from all senses and add {{obsolete term}} to the inflection line instead. Do the same for the other languages. Wouldn't that work? --Pereru (talk) 17:31, 15 December 2012 (UTC)
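The bot pass proposed here can be illustrated with a minimal Python sketch. It assumes that an entry's wikitext has already been split into lines, that definitions are lines beginning with `#` (but not `#:` or `#*`), and that the sense label appears literally as `{{obsolete}}`; the replacement template name `{{obsolete term}}` is the hypothetical one from the proposal, and the helper names are invented for illustration. It does not handle `{{context|obsolete}}` or entries with several etymology or part-of-speech sections.

```python
import re

# Template names as used in the proposal; {{obsolete term}} is hypothetical.
SENSE_LABEL = "{{obsolete}}"
TERM_LABEL = "{{obsolete term}}"


def is_definition(line):
    """True for definition lines ('#', '##'), False for examples/quotations ('#:', '#*')."""
    return re.match(r"#+(?![:*#])", line) is not None


def relabel_fully_obsolete(lines, headword_index):
    """Move the label from the senses to the headword line if every sense carries it.

    `lines` is one section's wikitext split into lines; `headword_index` is the
    index of the inflection (headword) line. Entries with at least one current
    sense are returned unchanged.
    """
    senses = [i for i, line in enumerate(lines) if is_definition(line)]
    if not senses or not all(SENSE_LABEL in lines[i] for i in senses):
        return list(lines)

    out = list(lines)
    pattern = r"\s*" + re.escape(SENSE_LABEL) + r"\s*"
    for i in senses:
        # Drop the per-sense label, keeping a single space after the '#'.
        out[i] = re.sub(pattern, " ", out[i], count=1).rstrip()
    out[headword_index] = out[headword_index] + " " + TERM_LABEL
    return out
```

An entry whose only sense is labelled obsolete gets the label moved to the headword line, while an entry such as sad, which still has a current sense, is left alone. Whether the result matches the facts of the entry would still need human review.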

That sounds more workable than the earlier/opposite suggestion of having {{archaic}} assume a whole word was archaic. A bot could also convert instances of "{{archaic}} {{alternative form of|foo}}" (some of which are my fault) to {{archaic form of|foo}}. - -sche (discuss) 17:37, 15 December 2012 (UTC)
I am a little bothered by putting obsolete at the sense/definition level because it introduces a kind of hidden polysemy into the term "obsolete" as we use it. In some ways it would be better at the inflection line.
Unfortunately, it is quite possible for one homonym (under either an Etymology or PoS header) to be an obsolete form while others with identical spelling are not.
Do we mean the labels "obsolete" and "archaic" to apply to headword (within a language, of course), etymology, or PoS in the cases where they are not limited to application at the sense level? DCDuring TALK 17:47, 15 December 2012 (UTC)
It seems to me all (or almost all) words have "obsolete senses" in that, if you look at their reconstructible history, they used to mean something else at some point. (Lots of common words have "obsolete senses" in the OED.) Which is why I prefer the labels (at least the categorizing ones) to apply to the whole word. I suppose we could in general simply use {{qualifier|obsolete sense}} or something like that to mark obsolete senses of a word, and keep {{obsolete}} only for the case in which the whole word and all its senses are obsolete (or archaic, dated, etc.).
Going on a related question: some words in Latvian became "obsolete" quite recently, in the sense that a new word was proposed and became standard, say, in the early 20th century -- e.g., zinātne (science). Is "obsolete" the right label for them, if it's so recent? Also, the "obsolete" variants -- zinība, zinātnība, etc. -- are not exactly "alternative forms of" zinātne, but rather similar words with slightly different structures/suffixes... should I label them differently? --Pereru (talk) 20:53, 15 December 2012 (UTC)
Re "early 20th century": I usually label anything that was used in living memory as "dated" (or at worst "archaic"), since even if no or only very few Latvians from the 1910s are still alive, people are still alive who heard their grandparents use the term. But I can see how scientific jargon (e.g. unnilpentium) could go obsolete even within living memory. - -sche (discuss) 22:00, 15 December 2012 (UTC)

What is happening with dialectal word categories?

First there was Category:lv:Dialectal, then Category:Latvian dialect terms, then Category:Latvian dialectal terms. And when I tried to create the latter with {{lexiconcatboiler}}, I only got the error box. Is there something going on? --Pereru (talk) 16:34, 15 December 2012 (UTC)

I was the one who changed dialect terms to dialectal terms, per dialectal. Mglovesfun (talk) 18:08, 15 December 2012 (UTC)
Fixed. Should {{lexiconcatboiler/dialect terms}} be deleted? — Ungoliant (Falai) 18:15, 15 December 2012 (UTC)
Everything has now been moved from "dialect" to "dialectal". Ultimateria (talk) 19:43, 15 December 2012 (UTC)
Great! No problem then, I just happened to be looking as Mglovesfun started doing the changes. Everything's fine now. --Pereru (talk) 20:43, 15 December 2012 (UTC)

CFI for pronunciations

I think we should come up with some sort of CFI for pronunciations (at the very least for English). This may be difficult to formulate, but we can neither allow obscure idiolectal pronunciations nor deleting pronunciations just because no one here happens to have heard of it. There has to be some way of verifying them other than referencing other dictionaries. --WikiTiki89 12:07, 17 December 2012 (UTC)

Idiolectal, no, but obscure dialects (properly tagged), yes, right?​—msh210 (talk) 15:26, 17 December 2012 (UTC)
Yeah, but the trickier part of the CFI would be really, how do we verify the pronunciations? --WikiTiki89 15:33, 17 December 2012 (UTC)
Presumably we'd want the pronunciation to be durably archived in the form of sound. However, we can't really verify anything relating to the phoneme/allophone distinction; all that a sound can archive is the phonetics, which usually don't reflect the underlying structure of the word in that language (only a study of the language as a whole can do that). It would also be rather hard to find pronunciations for many languages, even if the orthography allows us to derive its phonemes accurately (like in the case of Finnish). Not all attested pronunciations should be considered equal either, as everyone speaks somewhat differently. Consider what would happen if we used Jonathan Woss for English! So I believe that we should set up a system of references so that we can add them where possible, but that they should not be part of CFI because it would lead to removing most of the (valid and useful) pronunciations we already have. —CodeCat 15:39, 17 December 2012 (UTC)
Or we can design the CFI so that we wouldn't have to remove most of the valid and useful ones. It should be done language by language so don't worry about the small languages (like I said, we should start with English). I obviously don't suggest that we require audio citations for each pronunciation. Essentially, what I think we need is some sort of RFV process for someone to dispute a given pronunciation and then have some set of guidelines on how to verify it. --WikiTiki89 17:34, 17 December 2012 (UTC)
I don't know how we can not require audio citations for each pronunciation for CFI. That's the only way to prove that it exists.--Prosfilaes (talk) 08:54, 18 December 2012 (UTC)
We can't because audio citations are not so easy to come by. Anyway, it's not the only way to prove existence: transcription (e.g. in linguistics articles) are also evidence, as is analysis (e.g. comparing to similar words).​—msh210 (talk) 21:23, 18 December 2012 (UTC)
I've never had much of a problem finding audio citations for English words like wyrd or fajita on Youtube, albeit ones that weren't durably archived. Analysis is not proof, and I don't really see how asserting that "this pronunciation looks right" would be meaningfully part of CFI.--Prosfilaes (talk) 10:48, 20 December 2012 (UTC)
Not only are audio citations difficult to come by, they don't actually prove anything by themselves, because without a phonological analysis to interpret them by, they're not very useful. Record a Kiwi saying Auckland and play it for a linguistically naive American. He'll probably tell you it sounds like "Oakland". —Angr 20:40, 21 December 2012 (UTC)
That beats the heck out of what a linguistically naive American would do to English blackletter. I would argue this says more that it takes some degree of linguistic awareness to work on a dictionary than anything else. To a certain extent, I think it's overstating what's being desired by most of Wiktionary's users; I suspect most of our users are looking for words in their dialect or a dialect they're familiar with, to use a pronunciation their audience will find acceptable. Cross-dialect uses will be in the minority.--Prosfilaes (talk) 00:03, 23 December 2012 (UTC)
We should start a discussion room (Pronunciation Auditorium?) for discussing pronunciation, like the Etymology Scriptorium, but unlike the Request for Foo rooms in that pronunciations wouldn’t fail or pass. For the record, there has been some recent edit warring in the entry -ing. — Ungoliant (Falai) 20:35, 21 December 2012 (UTC)

Wiktionary thing should have its own favicon

Wiktionary should have a unique favicon (favorites icon--the little icon in browser tabs). Currently Wiktionary uses Wikipedia's 'W' favicon. This causes confusion and the often needed dictionary gets lost among the encyclopedia pages. Wiktionary is not Wikipedia. Our favicon should be distinct from Wikipedia's favicon so that users (like me) can easily see which browser tab contains a Wiktionary page and not a Wikipedia or other page. We must stand out and be identifiable as a dictionary--that all important book everyone needs and wants to lay a finger on easily without combing through seven Wikipedia browser tabs to find it. I was about to start a vote on this but perhaps discussion over beer would be an easier start? Cheers! Proposals for what a uniquely identifiable Wiktionary favicon could look like are welcome. Yea or nay regarding your support for this wacky notion that a dictionary is not an encyclopedia and that anyone would ever want to find the Wiktionary tab in their browser are also welcome. Me, I say Wiktionary should stand out with its own bold favicon. But I'm just a humble reader who doesn't know what some words mean and comes to Wiktionary to find out, and then loses the dang browser tab amid all those Wikipedia tabs I always have open and gets frustrated and thinks gee, maybe that Wiktionary thing should have its own favicon. Please bang beers together over this with me. Cheers! --Rogerhc (talk) 19:25, 17 December 2012 (UTC)

I strongly agree. Why don't we use the favicon that basically every other Wiktionary uses, with the tiles that represent their superior logo? For example, see the favicon that comes up when you go to Swahili Wiktionary. —Μετάknowledgediscuss/deeds 21:47, 17 December 2012 (UTC)
These types of changes are usually contentious. Maybe this is less controversial. DTLHS (talk) 21:59, 17 December 2012 (UTC)
We might be more likely to reach a consensus one way or another now that we allow only editors to vote. - -sche (discuss) 22:11, 17 December 2012 (UTC)
I don't think it is something we (sysops for instance) can change ourselves. I believe that it is built into the definition of the wiki (and uses the old /favicon.ico method). Perhaps a developer can change it. SemperBlotto (talk) 22:00, 17 December 2012 (UTC)
To be specific, it's in the site's PHP, as the value of $wgFavicon (a path to the icon file) in LocalSettings.php (or something like that, it's been a while since I made a wiki). If we get clear consensus (i.e. a vote that passes or a unanimous discussion), we can open a bug report on Bugzilla asking the devs to change it. —Μετάknowledgediscuss/deeds 22:09, 17 December 2012 (UTC)
<facetious>Or we could ask Wikipedia to change its favicon.</facetious> - -sche (discuss) 22:12, 17 December 2012 (UTC)
Ugh, but then we'd get Wikipedians and WMF employees coming by and asking us why we're still using the old deprecated favicon instead of migrating to the shiny new one. —RuakhTALK 18:45, 25 December 2012 (UTC)
I would support changing it. Equinox 01:21, 18 December 2012 (UTC)
I also support changing it, but not to one based on the tile logo. I like -sche’s suggestion even better, just to piss them off lol. — Ungoliant (Falai) 01:31, 18 December 2012 (UTC)
What would it look like? The closest thing we have to an accepted symbol is the pile of text on the top-left, and the only thing there that's faviconizable is our own "W", which doesn't look all that different from Wikipedia's. --Yair rand (talk) 18:18, 25 December 2012 (UTC)
Instead of bickering more, I have created a vote :) Take a look at Wiktionary:Votes/2012-12/New favicon and see what you think. —Μετάknowledgediscuss/deeds 06:04, 29 December 2012 (UTC)
So an angled "W" is enough to differentiate us from Wikipedia's and MediaWiki's straight "W". (Who else uses it?) Is there any need to differentiate us from other Wiktionaries? DCDuring TALK 18:05, 1 January 2013 (UTC)
Well, I often have two Wiktionary tabs open. At the moment they have different favicons, but I always have English on the left so it's no big deal. SemperBlotto (talk) 18:13, 1 January 2013 (UTC)


There's a new wikiproject in town, Wikivoyage. It can be linked-to using the code "voy:" in the same way Wikipedia can be linked-to using "w:". We should probably make a {{wikipedia}}- and/or {{slim-wikipedia}}/{{pedialite}}-style template to link to it, for use on articles like [[London]], [[Berlin]], etc. (And we should hope the ISO never assigns "voy" as a language code...) - -sche (discuss) 08:46, 18 December 2012 (UTC)

Sorry to be a pedant but "Wikivoyage" is the correct capitalisation. This, that and the other (talk) 10:35, 18 December 2012 (UTC)
Fixed, thanks. - -sche (discuss) 17:31, 18 December 2012 (UTC)
It seems a bit like a rip-off of Wikitravel... —CodeCat 14:02, 18 December 2012 (UTC)
Or the other way round: I think a lot of the Wikitravel contributors got tired of being exploited for ads etc. "The project was started as a fork of the German version of Wikitravel in September 2006 with the specific intent of avoiding commercial interests after Wikitravel was aquired by the for profit company Internet Brands." Equinox 14:22, 18 December 2012 (UTC)
Oh, so it actually is Wikitravel. That's good! —CodeCat 14:25, 18 December 2012 (UTC)
(And we should hope the ISO never assigns "voy" as a language code...) Someone tried proposing a Wikipedia in Wawa language. It was rejected because Wawa's code is "www", and - well, I don't think I need to tell you the rest. -- Liliana 16:08, 18 December 2012 (UTC)
LOL! I hope they assign an exceptional code at some point (maybe "wawa:")... it'd be unfair to deny a language wikis just because of that! - -sche (discuss) 17:31, 18 December 2012 (UTC)
  • This could solve our phrasebook dilemma - we could ship it off to Wikivoyage. However, it seems that there's already some phrasebooks. --Wikt Twitterer (talk) 14:37, 21 December 2012 (UTC)

Positions of obsolete senses

I would like to see obsolete senses below non-obsolete senses. Thus, I do not want to see sad#English start with three obsolete senses; these should better be moved below non-obsolete ones, IMHO. What do you think? --Dan Polansky (talk) 19:45, 19 December 2012 (UTC)

I very much prefer the senses to be sorted in chronological order, even if it means having three obsolete senses at the beginning. -- Liliana 20:01, 19 December 2012 (UTC)
I also prefer chronological order of definitions. — Ungoliant (Falai) 21:16, 19 December 2012 (UTC)
I prefer chronological order personally. Chronological order, with subsenses, can make Etymology sections readable.
What about some gadgets for allowing users to suppress obsolete, archaic, dated, obscene, and rare senses if they wish? We could have some of them suppressed by default to make it easier for users to find common current senses. 21:43, 19 December 2012 (UTC)
I prefer chronological ordering but I also prefer to put obsolete senses at the bottom. So the first sense should be the oldest still current sense. —CodeCat 22:13, 19 December 2012 (UTC)
Not this argument again... I think senses should be arranged in (chrono)logical order. If that means that some obsolete senses should be put on top, then so be it. Maybe obsolete senses should be grayed out a little or something? --WikiTiki89 22:19, 19 December 2012 (UTC)
But what advantage is there to the average dictionary user to find obsolete senses first? We can be fairly sure that those senses will be the ones users will look for the least likely. —CodeCat 22:22, 19 December 2012 (UTC)
It doesn't slow the user down very much, they are fairly easy to skip over if need be. But on the other hand, I think that older senses (whether they are obsolete or still in use) help the user to understand the meaning of the later senses, even if they aren't what the user is looking for. --WikiTiki89 22:27, 19 December 2012 (UTC)
Re: "... older senses (...) help the user to understand the meaning of the later senses, ...". Nonsense. --Dan Polansky (talk) 22:31, 19 December 2012 (UTC)
I agree with Wikitiki’s comment on older senses helping understand other senses. — Ungoliant (Falai) 22:48, 19 December 2012 (UTC)
What argument again? I am trying to figure out whether people would support moving obsolete senses below. Your "(chrono)logical" phrase is ridiculous; chronology has nothing to do with logic AKA study of correct inference; chronological order is definitely not logical order. IMHO, putting the least sought and used senses first is silly and user-unfriendly, but people obviously differ. --Dan Polansky (talk) 22:30, 19 December 2012 (UTC)
The argument about what order to put definitions in. Actually I was thinking of just saying "logical" but decided to add "(chrono)" before it afterwards. The reason it is a logical order is that it puts the more original senses before the senses that are derived from them. I think this is important because (as I just said above) they "help the user to understand the meaning of the later senses". And that is not "nonsense". It makes no difference for denotations, but for connotations every piece of information you get adds to the meaning. --WikiTiki89 22:39, 19 December 2012 (UTC)
I rest my case; it is nonsense. Furthermore, the other definitions would not be hidden, merely relegated to a less prominent place. I am confident that, as regards the sad entry, the obsolete definitions "Sated, having had one's fill; satisfied, weary", "Steadfast, valiant" and "Dignified, serious, grave" do not add anything that helps understand "Feeling sorrow; sorrowful, mournful", nor do they present a "connotation" or any other continental crap. --Dan Polansky (talk) 22:47, 19 December 2012 (UTC)
Maybe in the sad case they don't but in many other cases they do. --WikiTiki89 22:49, 19 December 2012 (UTC)
I claim that they don't. You have not stated a single example of where they do. The single example of "sad" does not prove my case, but illustrates the point. --Dan Polansky (talk) 22:55, 19 December 2012 (UTC)
It would be helpful in the entry wold, if the obsolete sense came first. — Ungoliant (Falai) 23:04, 19 December 2012 (UTC)
Are you saying that the reader cannot understand "An unforested or deforested plain, a grassland, a moor." without first reading "(obsolete) A wood or forest, especially a wooded upland"? --Dan Polansky (talk) 23:05, 19 December 2012 (UTC)
No. — Ungoliant (Falai) 23:07, 19 December 2012 (UTC)
Then you are not responding to what I said to Wikitiki. --Dan Polansky (talk) 23:09, 19 December 2012 (UTC)
I am. What you said to Wikitiki was that it wouldn’t help understand the meaning (and I think it would, if the obsolete definition of wold came first); what you asked me was whether the reader would not understand the meaning, which is, of course, not the case. In my opinion if a user reads that the word wold used to mean “A wood or forest”, but now means “unforested or deforested plain”, it will help him remember the modern meaning. — Ungoliant (Falai) 23:43, 19 December 2012 (UTC)
@Wikitiki: You do not really believe that the readers of Merriam-Webster Online are at a loss about what their definitions say due to the lack of obsolete definitions, do you? --Dan Polansky (talk) 23:09, 19 December 2012 (UTC)
It doesn't help you understand the definition, but it helps you understand the word beyond the definition. Really, it's about as useful as the Etymology section. Some people just don't care about these kinds of things, and for those people it is very easy to just skip over this sort of information. For the people that would care about it, they don't always know it's there. Putting the older senses first gives those people a chance to see it before they find the information they were looking for and close the page. If it turns out they don't care it's not a big a deal, but if they do care, it will give them some significant insight into the word. --WikiTiki89 23:28, 19 December 2012 (UTC)
──────────────────────────────────────────────────────────────────────────────────────────────────── I thought we were discussing this: "... older senses (...) help the user to understand the meaning of the later senses, ...". I said it was nonsense. You have now changed the subject; your last paragraph does not in any way support the discussed statement. Have I heard "I stand corrected" from you? No. Have you provided a single example of an entry in which the obsolete senses (which neither Merriam-Webster nor many other dictionaries have) critically enhance the reader's understanding of the modern senses? Neither. --Dan Polansky (talk) 23:41, 19 December 2012 (UTC)
I did not change the subject, that is precisely what I was talking about. Also, I never used the word "critically". What I'm saying is they are not necessary, but they can help. And if they're put at the end, they won't help anyone because no one would care to look. Also, Ungoliant gave an example already and if you want another one: rook#Etymology 2. --WikiTiki89 23:45, 19 December 2012 (UTC)
As regards Ungoliant's wold, are you saying that the reader can understand "An unforested or deforested plain, a grassland, a moor" better after reading "(obsolete) A wood or forest, especially a wooded upland"?
As regards rook, I find no obsolete senses in rook#Etymology 2. --Dan Polansky (talk) 23:53, 19 December 2012 (UTC)
Re: "And if they're put at the end, they won't help anyone because no one would care to look." Upon strict reading of "no one", that is very unlikely. But what you are really saying is that you want to push obsolete senses to the attention of people who do not care to look at them such as me. --Dan Polansky (talk) 23:55, 19 December 2012 (UTC)
  1. No. People can understand "An unforested or deforested plain, a grassland, a moor" just fine without the obsolete sense. It is "wold" that the obsolete sense adds meaning to and not its definition. As I said before, this "meaning" that it adds is beyond the definition. It is not necessary, but it can be useful.
  2. Rare or obsolete, it makes no difference.
  3. Basically yes, but if you read what I said, I maintain that this is helpful to some and no more than a minor nuisance to others.
--WikiTiki89 00:19, 20 December 2012 (UTC)
By the way, check sad at OneLook Dictionary Search dictionaries and compare them to our pitiable sad entry. --Dan Polansky (talk) 22:47, 19 December 2012 (UTC)
That the entry is inadequate has little to do with the ideal entry we might want as a goal.
To try to shortcut the question of ideal entry by a "bold" stroke of this kind seems like giving up on more constructive, if repetitive, discussion of goals. Two suggestions have been made about how to make it easier for users to find the senses that you claim (without support) that they want. What do you think about them?
It is really indefensible for us to simply assume that we represent the typical worthy user or that we understand actual usage of our entries. We do not have any facts that would enable us to characterize our typical user and we don't have any current facts about our current usage, even page hits, let alone what users really want. What facts we have about usage suggest that we should always place definitions about sex and drinking, those constituting insults, unusual definitions used in popular culture or in political rhetoric before other senses.
I could even imagine that a case could be made for inverse chronological order, which might put the most recently developed senses at the top, which might be supportive of our competitive advantage in accumulating such topical senses.
Until we do have better knowledge of usage, we are probably better off staying with what can, in principle, be documented with objective evidence: dates of first usage, Google N-grams etc. DCDuring TALK 23:32, 19 December 2012 (UTC)
Re: "... that you claim (without support) that they want." (a) Many modern dictionaries online including Merriam-Webster do not have obsolete senses at all, so I find it very unlikely that their readers are hopelessly craving for obsolete senses. (b) I have found not a single dictionary that has obsolete senses and puts them before modern senses. (c) A modern user of language does not need to know the obsolete senses; they are a historical curiosity. (d) Yes, I do not have strict empirical evidence about what Wiktionary readers want, but (a), (b), and (c) are fairly indicative of the wants and needs of dictionary users. --Dan Polansky (talk) 23:48, 19 December 2012 (UTC)
(b) As I recall the OED follows chronological order.
(a) At least RHU and Macmillan have an indication of obsolescence/archaicism for words they include.
(a) MWOnline has senses that I consider obsolete, but I haven't found any that they mark as such.
(c) Some modern users of dictionaries may be attempting to understand literature from any of the several centuries of modern English.
(d) The questionable status of the premises makes the conclusion even more questionable. DCDuring TALK 01:26, 20 December 2012 (UTC)

My earlier highlighting suggestion was rather quickly shot down, but perhaps it will have a more receptive audience here. I bring your attention to WT:BP#Highlighting current senses for highly polysemic terms with many rare/obsolete senses, above; q.v. for all relevant information. I'm so meta even this acronym (talk) 11:53, 21 December 2012 (UTC)

Some kind of further visual differentiation seems appropriate. I doubt that color is enough. I like the idea of by default suppressing the display of obsolete, archaic, rare, dated, etc senses, using CSS, I suppose, and allowing users to display them using controls in the "toolbox". But the value of historical ordering with subsenses is substantial.
I strongly oppose "logical" ordering, because that seems to me to be essentially a disguise for personal whim. DCDuring TALK 13:30, 21 December 2012 (UTC)
I haven't read this thread, but I prefer having the most common senses first. Mglovesfun (talk) 13:31, 21 December 2012 (UTC)
Do we use any particular facts for the determination of what is common or is it a matter of whim idiolect? Idiolectic vote? DCDuring TALK 14:38, 21 December 2012 (UTC)
I advocate some kind of marking to affect the relative prominence of senses, but not so much as to render rare or obsolete senses invisible. Perhaps enlarging current senses is an unpopular proposal, but what about shrinking rare and obsolete senses? We could use code like this: <font size=1>{{{1}}}</font> for the shrinking. I'm so meta even this acronym (talk) 15:51, 21 December 2012 (UTC)
  • Entries that are not in chronological order make no sense. Why would mean a small ball of water and also a prayer? Historical dictionaries only work if you show the order these senses emerged in (prayer – object on rosary representing a prayer – any small round object). Anyway, how on earth will you determine the ‘most frequent’ use of , or ? Many dictionaries order chronologically but I don't know of any that claim to order by frequency of usage. Ƿidsiþ 13:21, 22 December 2012 (UTC)

Straw poll

Support ordering senses oldest first, even if obsolete

The first sense will be the oldest attestable sense.

  1.   Support DCDuring TALK 15:32, 21 December 2012 (UTC)
  2.   I support this option, though I also advocate some kind of marking to affect the relative prominence of senses, either diminishing rare/obsolete ones' or enhancing common/current ones'. I'm so meta even this acronym (talk) 15:51, 21 December 2012 (UTC)
  3.   Support. — Ungoliant (Falai) 18:15, 21 December 2012 (UTC)
  4.   Support, as in, I think this is one fine way to do it — but not the only fine way to do it. Even if we do it this way in general, we'll probably want to allow for exceptions in cases where a bit of re-ordering groups the senses more logically (though sometimes that can be handled by using subsenses). And, of course, sometimes "oldest" is completely meaningless. If two senses are attested since Middle English, then does the first post–Modern-English–cutoff attestation decide which sense is "older"? —RuakhTALK 00:22, 22 December 2012 (UTC)
  5.   Support - -sche (discuss) 04:59, 22 December 2012 (UTC)
      Oppose --Dan Polansky (talk) 08:37, 22 December 2012 (UTC)
  6.   Support Ƿidsiþ 13:13, 22 December 2012 (UTC)
  7.   Support Pingkudimmi 13:29, 22 December 2012 (UTC)
  8.   Support --WikiTiki89 17:51, 23 December 2012 (UTC)
      Oppose EncycloPetey (talk) 16:23, 25 December 2012 (UTC) Ordering this way prevents us from grouping related senses, and will often put the least useful information up front, which does not serve our readership and will confuse them. --EncycloPetey (talk) 16:23, 25 December 2012 (UTC)
    Only if implemented in a particularly mindless way. All of the strict ordering methods should be considered in their interaction with grouping of related senses. In fact, it would be a very good exercise for all the talent here to consider how related senses are grouped: grammar first (eg complementation, uncountability)? Degree of abstraction from the concrete first? I would argue that this is more pressing than the treatment of obsolete senses, not that importance much influences our agenda. DCDuring TALK

Support ordering senses oldest first, but obsolete senses at the bottom

The first sense will be the oldest still-current sense.

  1.   Support CodeCat 15:03, 21 December 2012 (UTC)
  2.   Support This order makes perfect sense when seeing or adding translations. I'd say, not just the oldest but the most common sense should come first. This will reduce the number of wrong translations, as some contributors don't seem to read the sense definitions properly all the time, and it will be less confusing to people who just want to look up a common word. --Anatoli (обсудить/вклад) 00:44, 22 December 2012 (UTC)
  3.   Support I support moving obsolete senses to the bottom, regardless of what order is chosen for modern senses. --Dan Polansky (talk) 08:37, 22 December 2012 (UTC)
      Oppose — This, I think, is the worst option; an arrangement that fails to impart information about either historical development or usage frequency. I'm so meta even this acronym (talk) 21:30, 22 December 2012 (UTC)
    Why are we trying to impart information with the ordering? It would be better to add date information to the sense lines. We should use order to make the whole more useable. Chuck Entz (talk) 23:12, 25 December 2012 (UTC)
      Oppose DCDuring TALK 02:38, 23 December 2012 (UTC) Makes historical reading of an entry disjointed.
    Is that bad? —CodeCat 20:59, 25 December 2012 (UTC)

Support ordering senses newest first

The first sense will be the newest attestable sense.

Support ordering senses by usage frequency, most-used first

The first sense will be the most commonly used sense (however that is determined).

  1.   Support. Not a rigorous, highly-specified ordering, but one that allows a few exceptionally common senses to come before the others so people can find what they want without too much trouble.
    Other than that, each term has its own universe of senses and subsenses that are related in idiosyncratic ways generally incompatible with one-size-fits-all prescriptive rules. The reason this is such a perennial topic of debate is that any given scheme fits some entries very well and is really bad for others. We don't have the tools to determine what's best for the body of entries as a whole, so we never get beyond the "dueling examples" stage.
    I'm more for grouping semantically similar senses together most of the time, though there are some cases where chronological order tells an interesting and useful story. Chuck Entz (talk) 17:02, 21 December 2012 (UTC)
    But what makes you think that most users usually want the most common sense for polysemous words? Isn't that the one that users are most likely to know?
    I wish we had more subsense grouping of definitions, though the principles for such grouping are not obvious, especially since we don't have a canonical structure for most definitions, taxonomic names being a not-very-glorious exception. DCDuring TALK 17:23, 21 December 2012 (UTC)
    Because that's what they're most likely to run into. Here again, it varies. There are some basic terms where most English speakers might be expected to already know the most common definition or two. In those cases, most-common-first would be a mistake. That's why I said "allows" rather than "requires". There are plenty of other terms where an English speaker may have seen the term before, but isn't sure of the definition. In those cases, listing obscure senses first can be a little misleading, since they don't match what one is most likely to find in use. And then there are the things that can be found in well-known-but-archaic texts such as Shakespeare and the KJV of the Bible, where the unfamiliarity of the old senses might be assumed to prompt a much higher percentage of queries, and thus skew the numbers. Chuck Entz (talk) 17:55, 21 December 2012 (UTC)
  2.   Support. Matthias Buchmeier (talk) 22:02, 21 December 2012 (UTC)
  3.   Support —Μετάknowledgediscuss/deeds 22:36, 21 December 2012 (UTC)
    I'm looking forward to hearing where we are going to get the information to support this bold move. I have the feeling that this means ordering senses by vote. DCDuring TALK 00:05, 22 December 2012 (UTC)
  4.   Support It's similar to the above vote, hopefully they don't clash much but I support this one more than the above vote. --Anatoli (обсудить/вклад) 00:48, 22 December 2012 (UTC)
  5.   Support I support moving obsolete senses to the bottom, regardless of what order is chosen for modern senses. I admit that determining frequency can be hard. What is not hard is to spot obsolete senses and move them down. --Dan Polansky (talk) 08:37, 22 December 2012 (UTC)
  6.   Support. Mglovesfun (talk) 11:34, 23 December 2012 (UTC)
      Oppose - -sche (discuss) 20:03, 22 December 2012 (UTC)
      Oppose Impossible to properly execute. Prove me wrong by producing information sources that make it feasible to actually execute this program. DCDuring TALK 02:39, 23 December 2012 (UTC)
    I don't really agree with that line of reasoning. We can have the goal of listing senses in approximate order of frequency — most common senses first, least common senses last — without having any means of proving objectively that one sense really is more common than another. I mean, we always have to make these sorts of judgments. Not all of our goals can be perfectly achieved; for example, one of our goals is for our definitions to be as clear as possible, but we don't have any way of quantifying that. We just have to rely on our own judgment and common sense. (Now, that's not to say that we should have the goal of listing senses in approximate order of frequency. But if we think that's a worthy goal, then I don't think we should be deterred by the fact that perfection is unattainable.) —RuakhTALK 03:09, 23 December 2012 (UTC)
    "Perfection" is a straw man.
    We don't have a factual basis to make even a reasonable approximation. This amounts to the whims of whoever decides to spend their time on this, perhaps occasionally going through the sausage factory of a vote (at the Tea Room?). I've never been impressed with the average quality of the outcome of such a process, compared to RfV.
    I would like to see some instances of exemplary attempts to apply this to an entry before it is seriously considered as a policy. DCDuring TALK 10:17, 23 December 2012 (UTC)
    Hmm. I guess this is the same disagreement we've always had: I'd prefer to do what's best, and if it can be operationalized, so much the better; you prefer to do what can be operationalized, and if it can be made good, so much the better. :-P   (Hence our longstanding disagreements on what discussions belong at WT:RFV.) So we're probably not going to end up agreeing in this case, either. —RuakhTALK 18:08, 24 December 2012 (UTC)
    This matter seems to be discussed solely on principle. It's like a revolution in which royalists, anarchists, social democrats, republicans, and communists are all fighting about ideals. In the meantime, the trains don't run, the power gets shut off, no exports or imports, houses get destroyed and not rebuilt, and - in the end - none of the ideal systems get implemented. At least here we can't - or at least don't - have much of a dictatorship. And we also have projects that start with flourish and fanfare and get abandoned after some expenditure of effort, followed by eventual removal (Shorthand?). I would really like to see someone try their hand at working a few entries, especially ones with subsenses, along these lines. DCDuring TALK 18:46, 24 December 2012 (UTC)
    You realize that this is only a straw poll, right? We're just expressing our preferences, not enacting actual proposals. In practice, no matter how this comes out, things will be pretty much the same. If a proposal comes out of this it will have to be presented and its pros and cons discussed. Right now, complaining about how a non-existent change is not going to be implemented properly is kind of a waste of space. Chuck Entz (talk) 23:24, 25 December 2012 (UTC)
    If it really is impossible to properly execute, then why has nearly every print dictionary attempted this approach? --EncycloPetey (talk) 16:26, 25 December 2012 (UTC)
  7.   Support. IMO, usefulness is too important a criterion to hobble it with consistently useless obsolete senses at the top. This information is why we have an etymology section anyway. Circeus (talk) 21:39, 23 December 2012 (UTC)
  8.   Support EncycloPetey (talk) 18:36, 24 December 2012 (UTC) This is what most users expect from a dictionary, and not having this has led to much confusion among our readers. Translations between languages typically pull from our lead definition, because that's where other dictionaries would put the primary sense. It therefore makes sense for us to put the most useful information up front. --EncycloPetey (talk) 18:36, 24 December 2012 (UTC)

Something else/abstain

JS and/or CSS defaulting to not display definitions with obsolete, rare, archaic and perhaps other tags (eg, dated, obscene) with toolbox option to display and gadget to alter default for registered users.

  1.   Support DCDuring TALK 15:35, 21 December 2012 (UTC)
    None of these options is feasible because we don't have enough information on the ages of meanings or the frequency of use of meanings to be able to calculate any of the proposed orderings.
  2.   Support Angr 23:40, 21 December 2012 (UTC)
    We have a chance with dates for English by checking using the OED and Google Books. There is often a confluence of degree of physicality/literalness and age. Etymologies also provide evidence. As to the frequency of meaning, it would require rather intense effort using a corpus to do it objectively. My experience says the required level of effort to do a creditable job will not be forthcoming from the advocates. I doubt that anyone would even mine other dictionaries systematically for the purpose. I would like to see a couple of polysemous English words with, say, five senses in a single Etymology/PoW actually done this way, with some supporting evidence to justify the ordering. DCDuring TALK 00:14, 22 December 2012 (UTC)
    I seriously doubt Google Books will be of any use to us, as its results are predominantly from books written in the past dozen years or so. And anyway, we're not just a dictionary of English. Where will we get the age or frequency data for Lower Sorbian or Sesotho or Tongan? —Angr 00:20, 22 December 2012 (UTC)
    Are you also saying that it is not feasible to move obsolete senses down regardless of the order of modern senses? --Dan Polansky (talk) 09:01, 22 December 2012 (UTC)
    No, but I'm undecided whether I want us to do that. —Angr 09:39, 22 December 2012 (UTC)
  3.   Support - I don't think it's such an important question; I suspect we should attempt to order senses roughly from "most basic/central" to "most peripheral/metaphorical", but with a wide latitude for problems in determining which senses are more or less basic. I don't think that One Best Ordering is feasible in all cases. I would rather place obsolete senses at the end; to begin with obsolete senses suggests to me more of a specialized dictionary (or an all-encompassing one like the OED), not one friendly to casual users. --Pereru (talk) 01:38, 22 December 2012 (UTC)
    If we were really trying to reach casual users in English we would be doing many things differently: hiding etymology and pronunciation sections; eliminating most quotations from before 1900 or later; simplifying our context tags to eliminate terms unlikely to be recognized by casual users; using a restricted vocabulary for all non-technical terms; eliminating semantic relations headers other than synonyms and antonyms; renaming, combining, or eliminating Derived terms and Related terms etc. DCDuring TALK 15:24, 22 December 2012 (UTC)
  4. I don't think definitions should be in age order. (The Etymology section, or perhaps those defdate remarks, can indicate that.) I think current common senses should be most prominent, with others (dated, obsolete, used by chemists, regional, used only by teens) thereafter, but not necessarily any strict order beyond that (and even the order I indicated can have exceptions if there's some reason).​—msh210 (talk) 05:59, 23 December 2012 (UTC)

Script names as section headings for characters

Our current practice is to list characters as ==Translingual==. But many characters aren't really translingual, they're more specific than that. We never categorise such characters in the Translingual categories; compare which is categorised as a "Han character", with , which is categorised as "Translingual punctuation mark". Presumably it is desirable to have our categories and section headings reflect similar treatment, so I think it may be better if we treat scripts as "languages" and allow their names as section headers. So the translingual entry at would become ==Han character== or ==Han script== or something similar. What do you think? —CodeCat 01:40, 20 December 2012 (UTC)

Not sure, but a technical question first: what, then, about things found in multiple scripts, like ,?​—msh210 (talk) 06:32, 20 December 2012 (UTC)
They should probably stay as they are. This proposal only affects characters that belong to one script and are rarely used outside it. I wouldn't count things like citations of foreign terms, or scientific usage of the Latin and Greek alphabets, as real usage (treating A as Chinese or Δ as Latin seems a bit... off). I think that for now, we can look at the clear-cut cases first and look at more ambiguous ones later. —CodeCat 14:29, 20 December 2012 (UTC)
Yeah, same with Canadian Aboriginal Syllabics and Devanagari. I like this idea, better to split up Translingual along scriptal divisions. —Μετάknowledgediscuss/deeds 19:58, 20 December 2012 (UTC)
Weak oppose. This will create a mess of level 2 headings. — Ungoliant (Falai) 20:18, 20 December 2012 (UTC)
But aren't there more distinct languages than distinct scripts? Like, by a factor of at least twenty? —RuakhTALK 20:42, 20 December 2012 (UTC)
Yes, but I think it’s better if we keep the standard that every level 2 heading is a language (or multiple languages, in the case of {{mul}}) instead of mixing it up with scripts. — Ungoliant (Falai) 21:02, 20 December 2012 (UTC)
It is a good standard for sure, but we shouldn't be too dogmatic about it. If it makes more sense to change it, we should. —CodeCat 21:26, 20 December 2012 (UTC)
This wouldn't work with our entry structure though, as many things expect there to be a valid language code for every entry, but there is no language code for "Han character" and the like... -- Liliana 20:40, 20 December 2012 (UTC)
What kinds of things expect that, and what is preventing them from being changed? —CodeCat 20:41, 20 December 2012 (UTC)
It starts with {{infl}} and ends with the AutoFormat bot. -- Liliana 20:50, 20 December 2012 (UTC)
{{head}}/infl is easy to change, if it even needs changing at all. Don't we already have templates for most of the scripts? Autoformat... I don't know. —CodeCat 21:24, 20 December 2012 (UTC)
That's not a bad idea, but I wonder if it would really apply to every case. Would you want abbreviations of worldwide organizations, spelling alphabets, taxonomic names, chemical elements, math functions, currency and other ISO standards to all fall under a ==Latin script== header? Numbers like 100 to all fall under an ==Arabic script== header? I'm not saying it would be undesirable, I just want to make sure we know what we'd be getting into. DAVilla 01:37, 21 December 2012 (UTC)
No this is only about the... um... I suppose you could say this is only for terms that are related to the script itself, not terms that just use the script to make something else. I don't know how numbers would be treated... maybe they can stay translingual because they are not tied to any particular script. —CodeCat 01:52, 21 December 2012 (UTC)
But some numerical systems are tied to just one script, while others are associated with multiple scripts. And although you can find "Arabic numbers" in use with everything from English to Japanese, you find a completely different set of "Arabic numbers" in some languages like Urdu that use an Arabic script. --EncycloPetey (talk) 02:12, 21 December 2012 (UTC)
Yes, and Chinese has its own set too I believe. I suppose we can consider them as part of their own scripts. The tricky part with numbers I think is that they are often treated as part of the script (translating into Arabic would include writing the numbers in the way appropriate for Arabic script) but the mapping isn't one-to-one. The "Latinised" Hindu-Arabic numerals are used in many scripts, but often scripts have their own individual numeral system as well. The Latin script has Roman numerals as its "own" numeral system, and Greek, Cyrillic, Gothic and Hebrew also have their own (similar) numeral systems, although the languages that use those scripts have mostly moved on to the Hindu-Arabic system. Chinese has characters for numbers but also uses "international" numerals. And I believe Arabic also uses "Latinised" numbers, although in its case it's just a typographical difference. —CodeCat 02:24, 21 December 2012 (UTC)
While I do like our current system of "L2s are always language names", I can see the utility of treating Arabic letters as ==Arabic script==, etc. I don't have strong feelings about the current system or the proposed change. Since scripts always have codes (and script codes are distinct from languages' codes by being four letters rather than two, three, or six), it should be possible to adapt templates, bots, etc to handle the presence of scripts as L2s. - -sche (discuss) 03:54, 21 December 2012 (UTC)
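For illustration only, the length-based distinction described in the comment above could be sketched roughly as follows. This is a minimal sketch in Python, not how any existing template or bot works: the function name `classify_code` and the sample codes are assumptions made for the example, and the rule itself (four letters for scripts; two, three, or six letters for languages, not counting the hyphen) is simply taken from that comment.

```python
def classify_code(code: str) -> str:
    """Guess whether a code names a script or a language, by letter count.

    Hypothetical helper: the rule is the one stated in the discussion,
    not a check against any real list of codes.
    """
    letters = code.replace("-", "")
    if not letters.isalpha():
        return "unknown"
    # Script codes: exactly four letters and no hyphen (e.g. "Hani").
    if len(letters) == 4 and "-" not in code:
        return "script"
    # Language codes: two, three, or six letters (hyphen not counted).
    if len(letters) in (2, 3, 6):
        return "language"
    return "unknown"
```

Under these assumptions, `classify_code("Hani")` would be treated as a script and `classify_code("gmw-jdt")` as a language; anything else falls through to "unknown" and would need handling by hand.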

I think there is enough support for this, although many people don't really have strong opinions either way. So I propose to make this change by allowing the name of a script as a language heading. The name in the header would be followed by the word "script" in most instances when it needs to be made clear that it's not a language, and also to make sure it doesn't conflict with a language name. Exceptions that would not require "script" would be:

  • Canadian syllabics ("syllabics" implies a kind of script)
  • Egyptian hieroglyphics ("hieroglyphics" are also known to be a form of script)

This proposal specifically does not intend to specify which characters belong under which headings, only that the headings themselves be allowed. If there is any ambiguity, they can be discussed on an individual basis, similar to how we often discuss changes in the way we define languages and which languages words belong to (Serbo-Croatian, Low German etc). It also does not specify any technical modifications that would be required to support it. I believe a vote would be needed to codify this change, but I am not sure what such a vote should actually modify in terms of policy as I don't know where the policies on language headers are actually written down currently. —CodeCat 22:31, 27 December 2012 (UTC)

{{obsolete term}}

Following up on the above discussion on Category:English obsolete terms vs. Category:English terms with obsolete senses and other similar categories for archaic, dated etc. words, I went ahead and created the above template, categorizing into Category:LANGUAGE obsolete terms (while {{obsolete}} continues to categorize into Category:LANGUAGE terms with obsolete senses) for use with words that are fully obsolete. I intend to use it with Latvian obsolete terms. If you all agree, I will create similar templates for archaic, dated, etc. What do y'all think? --Pereru (talk) 02:26, 22 December 2012 (UTC)

I missed that above discussion (and don't see it now) but we don't have separate categories for US-only terms and for those with US-only senses, nor separate categories for math-specific terms and for those with math-specific senses. Why should obsolete, archaic, and dated be different?​—msh210 (talk) 06:10, 23 December 2012 (UTC)

Wiktionary talk:About Portuguese

Hi. I want to turn WT:APT into a useful page, so, as much as I would love to be the dictator of Wiktionary and just change it to my liking, I started a ton of discussions in its talk page, regarding the formalisation and change of some practices. I invite all those who are interested in Romance languages, especially those who know Portuguese, to read and comment. Because some suggestions ask for changes in widely used templates, I also invite those who are interested in template writing to take a look. — Ungoliant (Falai) 03:42, 22 December 2012 (UTC)

WT:APT is a draft proposal, not established policy, so you can be bold and just change it to your liking, as long as you're willing to talk about it if other people later object to something. —Angr 09:42, 22 December 2012 (UTC)
Some are major changes though. Especially the banning of first-person imperative, which will require the deletion of thousands of definitions. And some are requests for ideas (regarding entries for enclitics and mesoclitics). — Ungoliant (Falai) 15:49, 22 December 2012 (UTC)
Try to separate all the changes into separate edits so a change can be undone without undoing all the others. Use clear edit summaries, and then leave it to us to check it. Mglovesfun (talk) 18:09, 22 December 2012 (UTC)

Jersey Dutch

Jersey Dutch was (is?) a creolised variety of Dutch, with English and Lenape influences, spoken in New Jersey for several centuries. There were 2-3 subvarieties; the lect(s) as spoken by Europeans and Native Americans were called "Lag Duits" or "Leeg Duits" (literally Low Dutch), the lect as spoken by Africans was called "Negerhollands" or "Negerduits" (not to be confused with {{dcr}}, "Negerhollands", which was spoken in the Caribbean). Jersey Dutch does not, AFAICT, have a language code. So... should we give it one? If so, what should it be called: {{gmw-ldt}}, {{gmw-nll}}, {{gmw-jnl}}, something else? - -sche (discuss) 19:36, 23 December 2012 (UTC)

I would suggest {{gmw-jdt}} as our made-up codes should be based on the English name. I don't think we should use anything derived from "nl". --WikiTiki89 19:41, 23 December 2012 (UTC)
{{gmw-jdt}} it is. - -sche (discuss) 05:45, 31 December 2012 (UTC)

Wiktionary:Votes/2012-12/Unified Malay

I hereby announce the latest vote in the saga of language treatment, to unify Malay, Indonesian, and a couple other questionable lects under a single header. Comments welcome. —Μετάknowledgediscuss/deeds 20:05, 23 December 2012 (UTC)

I know purely based on Wiktionary that they are astonishingly similar. Mglovesfun (talk) 00:31, 30 December 2012 (UTC)

Using ĭ and ŭ instead of ь and ъ in Proto-Slavic

Previous discussion: Wiktionary:Beer parlour archive/2012/April#Proto-Slavic:_Why_are_.D1.8C_and_.D1.8A_used_for_.C4.AD_and_.C5.AD.3F

I'm well aware that most traditional scholarship has used the Cyrillic letters ь and ъ to denote the two short high vowels in Proto-Slavic and also occasionally in Old Church Slavonic. However, ĭ and ŭ are also occasionally encountered, and we systematically transliterate those two Cyrillic letters that way in OCS words. The history of these letters is a bit of a kludge, really. When linguists first encountered them in OCS texts they didn't really know what they were or how to interpret them. They didn't know how to assign a sensible phonetic symbol to them, so they kept them as ь and ъ to stand for some kind of mysterious "reduced vowels" that disappeared sometimes but changed to other vowels in other occasions. Modern Slavic research now knows that these were just regular short vowels, but since the symbols <i> and <u> traditionally stand for what are now considered long vowels in Proto-Slavic, the letters ĭ and ŭ have come into use to denote these sounds. I would like to propose that we switch to using ĭ and ŭ to represent these two vowels instead of ь and ъ. It's clearer to non-experts, since even those who can read Cyrillic will think of ь and ъ as a soft sign and hard sign rather than as vowels. And of course ь and ъ look even stranger to someone who doesn't know Cyrillic and sees these two b-like letters mixed in with Latin text, and it doesn't help that they look so similar. —CodeCat 14:10, 26 December 2012 (UTC)

I completely agree, with only one selfish objection: I personally find Proto-Slavic words easier to read with ь and ъ than with ĭ and ŭ, mostly because the latter two are more difficult for me to instinctively connect to their Russian equivalents. --WikiTiki89 14:22, 26 December 2012 (UTC)
I suppose that is true but it has a downside as well. I have learned to mentally "skip" these two letters as (more or less) irrelevant, because that is how they are treated in Russian. So when I read a Proto-Slavic word with ь and ъ in it like *četvьrtъkъ my mind skips the yers and "sees" something that resembles četv*rt*k*. I suppose that is what makes it easier to connect to modern Slavic words. But of course for Proto-Slavic these letters were anything but irrelevant, and I realised I found OCS transliterations much easier to grasp intuitively than Proto-Slavic because of this difference. Somehow, četvĭrtŭkŭ gets the structure of a word across to me much clearer. —CodeCat 14:34, 26 December 2012 (UTC)
Yeah, for me it makes reading OCS transliterations a little confusing (personally I think that transliterations should always be secondary to the original script and I despise linguistic literature that uses only transliterations). But regardless, I think switching to ĭ and ŭ is ultimately the right thing to do. --WikiTiki89 14:41, 26 December 2012 (UTC)
Sounds reasonable. — Ungoliant (Falai) 16:22, 26 December 2012 (UTC)
Sounds like a worthwhile change, because I've been treating them like Russian all these years too! I don't think I ever knew how to pronounce them correctly before. —Μετάknowledgediscuss/deeds 17:12, 26 December 2012 (UTC)
To summarise last April's discussion: Angr said "ь and ъ are more common, especially in more modern sources [] we provide both Cyrillic and Latin for OCS, [so] it would be redundant to use ь/ъ in both, but for Proto-Slavic we only provide Latin, so it isn't redundant." Stephen said "the precise phonetic values of ь and ъ are not known for certain, and representing them as ĭ and ŭ might be incorrect". Ivan Štambuk said: "Jers in Proto-Slavic reconstructions are usually not transliterated into Latin. Cyrillic characters are used simply because it's the most common practice in the books/papers. It's quite common for OCS too, but as Angr explained it doesn't really make sense for us to use it." - -sche (discuss) 17:17, 26 December 2012 (UTC)
My views haven't changed since that discussion. I do disagree with Stephen's statement, though, because ĭ and ŭ are no less abstract than ь and ъ. Whichever pair of symbols we choose, we're using them like mathematical variables to stand for two phonemes of Proto-Slavic and OCS. The symbols make no claims about the precise phonetic values of these phonemes. (Which is just as well, since ŭ is almost certainly an ultrashort y (i.e. ꙑ), not an ultrashort u.) —Angr 17:34, 26 December 2012 (UTC)
Well actually, the phonetics are known to some degree. It's known that ĭ was front and high (probably [ɪ]) and contrasted with both *i and *e and ŭ was back and high (probably [ɪ̈] ~ [ɯ] ~ [ʊ] depending on dialect) and contrasted with *u, *y and *o. ŭ was non-distinctively rounded as there was no rounded equivalent to contrast with, like there was with *y. So the symbol ĭ is certainly "correct" (whatever that means for an abstract phonemic symbol), and what we call ŭ could be equally called y̆. Also the name "ultrashort" is not really correct; it was only the "weak" yers that were actually ultrashort. The "strong" yers never shortened but instead underwent lowering: depending on dialect [ɪ] > [e] or [ɪ] > [ɘ] and [ɪ̈]/[ɯ] > [ɵ]/[ɤ] and [ʊ] > [o]. Horace Lunt describes the yers as "lax" vowels rather than ultrashort. The length distinction was probably fairly minimal by the end of the Proto-Slavic period, somewhat similar to how vowels developed in Vulgar Latin. —CodeCat 18:04, 26 December 2012 (UTC)
Since they are basically just variables, as Angr said, I think we are better off being consistent in using the Latin alphabet. --WikiTiki89 17:56, 26 December 2012 (UTC)
Strong oppose. ь and ъ are more common, I've never seen ĭ and ŭ in my sources. And I agree with Ivan and Stephen. Maro 22:04, 26 December 2012 (UTC)
Being common alone isn't a good reason. How many of those sources are easy to understand for a general audience as opposed to specialists? Also, I think that it makes no sense to transliterate these characters in OCS but still use them in Proto-Slavic entries. We might as well not transliterate them in OCS either then. Which I strongly oppose. —CodeCat 23:39, 26 December 2012 (UTC)
We should use ь and ъ, even though they are used very differently in modern Cyrillic-based Slavic and other languages. In Bulgarian ъ is still a vowel, even if it's a special "hard sign" symbol in Russian. Symbols ĭ and ŭ (Roman, not Cyrillic symbols) should only be reserved for transliteration or redirects if there are occurrences of use in OCS. --Anatoli (обсудить/вклад) 05:04, 29 December 2012 (UTC)

I find it perplexing that editors of a descriptive dictionary would go against “the most common practice,” or even consider that writing differently from everyone else could possibly be “clearer,” or “consistent,” and not look “stranger.” I can’t find any recent sources that use the Latin letters, although I see evidence that they were widely used in the early and mid 20th century. If I’m wrong, then would someone please cite a few of the sources that use the Latin letters?

(If consistency in romanizing different languages is so important, then we have some much bigger problems with our transliteration practices than this.)

Oppose. Michael Z. 2013-02-22 01:43 z

redirects from macronic Latin forms

Already posted initial discussions here and here but if this is the place where the admins and most common editors work through these things, I can summarize my main points:

A reverted edit has brought it to my attention that Wiktionary policy currently holds that

  1. Latin in running text should (in principle) always include macrons over its long vowels
  2. Latin entry names should never include macrons over their long vowels
  3. The current redirect policy (in fact does not* but) is considered to preclude redirecting from the macronic forms to the macronless ones

In practice, what this means is that (e.g.) every etymology including -anus should be written -ānus but (since the entry itself is at -anus) the etymology will redlink unless the entry is written [[-anus|-ānus]]. This duplicate linking is to be employed in every instance of every Latin word across the entire dictionary.
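A minimal sketch of how a template or bot could derive the macronless link target from displayed Latin text, so that -ānus links to -anus without being typed twice. The helper names strip_macrons and latin_link are hypothetical and not part of any existing Wiktionary template. Only the combining macron is removed; other diacritics are left alone.

```python
import unicodedata

MACRON = "\u0304"  # combining macron, the mark over long vowels


def strip_macrons(text: str) -> str:
    """Return text with macrons removed (e.g. ā -> a); other diacritics are kept."""
    decomposed = unicodedata.normalize("NFD", text)
    without_macrons = decomposed.replace(MACRON, "")
    return unicodedata.normalize("NFC", without_macrons)


def latin_link(displayed: str) -> str:
    """Build a wikilink that shows the macronic form but targets the macronless entry."""
    target = strip_macrons(displayed)
    if target == displayed:
        # Nothing to strip, so a plain link is enough.
        return f"[[{displayed}]]"
    return f"[[{target}|{displayed}]]"
```

With this, latin_link("-ānus") produces the piped form [[-anus|-ānus]] described above, while text without macrons gets a plain link.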

This is nonsensical and breaks any template where the displayed Latin text (requires macrons) automatically links (where macronic titles are forbidden).

The obvious solution is to redirect, but that requires some editors to speak up and change the (currently understood) existing consensus. LlywelynII (talk) 23:14, 27 December 2012 (UTC)

* The actual policy forbids creating redirects from accentless names like etre to the accented namespaces (e.g., être). The policy is sensible since the computer automatically redirects on its own when it sees issues like that. The current Latin situation doesn't apply here at all: the computer does not automatically fix the links and the running text (per policy) should not be edited to make the initial links work (or requires writing every single entry twice). LlywelynII (talk) 23:14, 27 December 2012 (UTC)

To avoid splitting up the discussion now that it has already started, please comment at Wiktionary talk:About Latin#macron redirects. —Μετάknowledgediscuss/deeds 23:18, 27 December 2012 (UTC)
(ec) There's no automatic redirection from accentless forms to accented forms. nuachtan is a red link and clicking it takes you to the Search page; it doesn't automatically take you to nuachtán. —Angr 23:18, 27 December 2012 (UTC)

Proposal: redirects from Unicode dashes to ASCII dash forms

I'd like to suggest that Wiktionary allow the creation of "–" and "—" typographic forms as redirects to the format used on Wiktionary, the "-" dash form. I don't think this alternative form requires the creation of a separate entry, but as these forms exist as the recommended forms at Wikipedia (w:en:WP:DASH; though not everyone agrees there to use those forms as the article names (w:en:WP:Hyphen luddites) ) as being the more encyclopedic form, it would be useful to have these as redirects here on Wiktionary to Wiktionary's preferred form, the ASCII dash form. (and also from "−" to "-")

This would mean that an entry which contains an ASCII dash on Wiktionary, but for which some style guides recommend using a more differentiated dash form, would have a redirect to that entry at the version using the Unicode dash equivalent title. (This does not mean that the Unicode encoded form appears in the Alternative forms section)
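As a rough illustration of the proposal, the mapping from a title with Unicode dashes to the ASCII form could look like the sketch below. The function names and the sample title are hypothetical; the sketch only covers the three characters mentioned above (en dash, em dash, minus sign).

```python
# Unicode dash-like characters mapped to the ASCII hyphen-minus.
DASH_TO_ASCII = {
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2212": "-",  # minus sign
}

_DASH_TABLE = str.maketrans(DASH_TO_ASCII)


def ascii_dash_title(title: str) -> str:
    """Return the title with Unicode dashes replaced by the ASCII dash."""
    return title.translate(_DASH_TABLE)


def redirect_pair(title: str):
    """Return (redirect_title, target_title), or None if no redirect is needed."""
    target = ascii_dash_title(title)
    if target == title:
        return None
    return (title, target)
```

A title that already uses only the ASCII dash yields None, so no redirect would be created for it.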

-- 05:15, 28 December 2012 (UTC)

User:Yair rand/uncategorized language sections

Could we regenerate this? Also perhaps make it a subpage of WT:TODO to make it easier to find (WT:TODO uses {{subpages}}), though the regeneration is the important bit, not what it's called. Mglovesfun (talk) 11:40, 28 December 2012 (UTC)

I've repopulated it, except that I omitted the 23,190 Translingual entries on the grounds that it would have been counterproductive to include them. Also, I'm not sure how some of them are supposed to be fixed; for example, what's our current policy on categorization of numerals? —RuakhTALK 16:46, 28 December 2012 (UTC)
Also, feel free to move them wherever you want, and I'll follow along. I'm not a bot, I won't get confused by pages having been moved. —RuakhTALK 16:48, 28 December 2012 (UTC)
Are any of the uncategorized Translingual sections possibly taxonomic names? I assume they are mostly CJKV characters. I'd be willing to work on any that were in Latin characters. DCDuring TALK 17:37, 28 December 2012 (UTC)
I proposed to allow categories such as Category:Translingual Han characters and Category:Mandarin Han characters and so on. An IP said this was 'obviously inappropriate' as it would lead to excluding Middle Chinese. And if you understand what that means, good for you. I don't have a clue. Mglovesfun (talk) 17:50, 28 December 2012 (UTC)
@DCDuring: See User:Yair rand/uncategorized language sections/Translingual/ASCII. A few of them are taxonomic names, yes, but not very many. —RuakhTALK 18:10, 28 December 2012 (UTC)
Thanks. It's a handier list than others I've had to work with and most are almost ready for prime time. DCDuring TALK 18:16, 28 December 2012 (UTC)
I expect 200-400 taxa. There are many, many more (>1,000) abbreviations of botanists' (and zoologists' ?) names, which might merit a standardized templating. There are also astronomical name components, and some chemical formulas. DCDuring TALK 19:42, 28 December 2012 (UTC)
I've set {{ordinal}} and {{cardinal}} to categorize in [[Category:<langname> numerals]]. That will solve a heck of a lot of the cases on these lists, maybe as much as half of them. Mglovesfun (talk) 19:37, 29 December 2012 (UTC)
I've undone the change because it will break several cases as well. Ordinal number words are grammatically adjectives and not numerals in most European languages. Cardinal numbers often are numerals but not always: the Slavic languages in particular treat the numbers 5-10 as feminine nouns. The Germanic and Indo-European word for 100 is a neuter noun, and remains so in many of their descendants. —CodeCat 19:44, 29 December 2012 (UTC)
Numeral isn't a grammatical category anyway, is it? We just use it to avoid making difficult decisions, a bit like we do with abbreviation and initialism. Mglovesfun (talk) 19:48, 29 December 2012 (UTC)
It is treated as a grammatical category on Wiktionary, though. As a subset of determiners to be exact. Wikipedia also treats them that way: w:English determiners. —CodeCat 19:53, 29 December 2012 (UTC)

Linkified L2 headers.

I seem to recall a decision, at some point in the past, that L2 headers should never be linkified; so, for example, where [[me'ẽ]] currently has


, it should instead have


. Do I recall correctly? If so, does anyone object to my going through the 757 or so entries that have linkified L2 headers, and de-linkifying them?
RuakhTALK 16:08, 28 December 2012 (UTC)
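A minimal sketch of the de-linkifying pass, assuming a linkified L2 header has the shape ==[[Language]]== (optionally piped) on a line of its own. The function name is hypothetical and this is not the code of KassadBot or any other existing bot.

```python
import re

# A level-2 header whose whole content is one wikilink, e.g. "==[[Foo]]=="
# or "==[[w:Foo|Foo]]==". Level-3 and deeper headers are not matched.
LINKED_L2 = re.compile(
    r"^==[ \t]*\[\[(?:[^\]|]*\|)?([^\]|]+)\]\][ \t]*==[ \t]*$",
    re.MULTILINE,
)


def delinkify_l2(wikitext: str) -> str:
    """Replace linkified L2 headers with plain ones, keeping the displayed text."""
    return LINKED_L2.sub(r"==\1==", wikitext)
```

Headers that are already plain, and linked headers below level 2, pass through unchanged.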

I don't think there is a 'policy' on this, so much as it's common practice. User:KassadBot unlinks them, though its predecessor AutoFormat did not. I don't object, no. Mglovesfun (talk) 16:50, 28 December 2012 (UTC)
Whatever the policy, this looks like a good idea, both for consistency (I'm sure there are languages that are both linkified and unlinkified, depending on the entry) and just good practice- the less clutter we have to complicate bot and template coding, the better. Chuck Entz (talk) 18:25, 28 December 2012 (UTC)
I can't find the discussion, but I reckon it exists somewhere. I support this as well. —Μετάknowledgediscuss/deeds 18:48, 28 December 2012 (UTC)
No objections. — Ungoliant (Falai) 20:17, 28 December 2012 (UTC)

Category:French idioms and Category:French expressions

Good evening,

what is the difference between Category:French expressions and Category:French idioms? And why does the latter request cleanup? --Fsojic (talk) 01:32, 29 December 2012 (UTC)

Re: why the latter requests cleanup: When Mglovesfun (talkcontribs) tagged it, he added a section to WT:RFC, with this text:
This needs splitting into its subcategories [[Category:French expressions]] and [[Category:French similes]]. But I'm not entirely sure how. Mglovesfun (talk) 10:52, 24 October 2009 (UTC)
That section seems to have been removed, I don't know why. (And I don't know whether anyone else contributed to it before then.)
RuakhTALK 02:29, 29 December 2012 (UTC)
I moved that section and everything else from 2009 to the "ancient unresolved requests" page because it had been ignored for three years. I've revived it for you now: WT:RFC#Category:French_idioms. - -sche (discuss) 02:50, 29 December 2012 (UTC)
Oh, I see now: Wiktionary:Requests for cleanup#Unresolved requests from before November 2011. Thanks. —RuakhTALK 04:14, 29 December 2012 (UTC)

CFI for Navajo and other languages

The RFV discussion of jádí dághaaʼígíí and the extraordinary difficulty of citing terms in languages which are documented in works which are only fleetingly accessible (and which use ununified orthographies) prompts me to ask: What change to our CFI would make it possible for us to have entries in such languages? I hope we all agree that some criterion is necessary and we shouldn't accept just anything any visitor to our site creates a ==Navajo== entry for, even if it's *dájíí jágaʼídí or *someotherplausiblecombinationofletters whichisneverthelessnotaword. Is there a coherent criterion by which we can judge Navajo (and other languages), to have confidence that proposed terms are real, without setting too insurmountable a hurdle before them? Or does our appropriate insistence on verifying terms and being reliable necessarily mean that we cannot cover most languages in any depth?

We currently hold that mention in a durably-archived dictionary verifies a term in any LDLanguage. Wikimedia Commons copies appropriately-licensed pictures from Flickr, and is able to keep them even if a Flickr user subsequently changes a picture's license, because Commons has a record which notes that the picture had a valid license at the time it was copied. Could we allow selected (language-community-approved) non-durable dictionaries like [5] to be used as references if two admins signed on an entry's talk page that they had found the entry in the dictionary? If the dictionary later went offline or removed the entry, we'd have the record that it had nonetheless had the entry at the time it had been used as a reference. Would that help, or would the drawbacks of that (straying even further from actual attestation) be too great? - -sche (discuss) 03:23, 29 December 2012 (UTC)

Support. This would be very helpful for attesting Hunsrik terms. Some questions and suggestions:
  • Does it apply to content other than dictionaries?
  • Does the entry creator count as one of the two who can verify an entry?
  • I suggest that if the dictionary removes an entry (as opposed to it going offline), the verification becomes invalid, because it might have been incorrect.
  • I suggest that a forum be created and all entries that need signing be listed there. Otherwise people will just ask their friends to verify every entry they create.
Ungoliant (Falai) 04:50, 29 December 2012 (UTC)
I actually don't think this addresses entries like jádí dághaaʼígíí at all. You won't find it in any dictionary, because it is essentially a neologism whose parts describe the animal or thing in question. If we want to keep entries like this we'd have to add criteria like "added by a user experienced with the language", "would be understood by native speakers", "entries that are composed of parts in a predictable way but which are not SOP in that language". DTLHS (talk) 05:23, 29 December 2012 (UTC)
We might need this. But it's straying toward a really tiny amount of attestation. Already, we accept a self-published wordlist or a single usenet post for a language like Navajo. At this point, we'd be basically allowing Wiktionary derivatives to be cites for Wiktionary entries. I think it could work, but it would need a lot more parameters to be thought out first, to prevent infinite loops, LDL protologisms, and other unfortunate excesses. —Μετάknowledgediscuss/deeds 05:25, 29 December 2012 (UTC)
Some responses and additional thoughts:
  • Should only dictionaries be covered by this? Or should any work which would be acceptable under current CFI if it were durable be covered?
  • I'd be happy to let entry creators double as verifiers, or not, whatever others think is best. Barring creators from serving as verifiers is equivalent to just raising the number of verifiers required, anyway.
  • I neglected to say: I also think that if another admin comments that they don’t see a term in a work, either before the requisite number of admins say they do see it or within 2 months of the last admin saying they do see it, the verification is invalidated, or needs more admins to sign off on it. (failsafe)
  • I suggest 2 months as the period because I get the impression from tending RFV that most RFVs are either resolved in <1 month or sit for 4 months, but clicking a link should be easier than citing the nuances of the legal use of the term "abduction" from three books.
    A site might go offline in 2 months, but I envision this as a hedge against dictionaries going offline years from now and not as a way of catching things that flicker on and then flicker off again in the space of a few months, anyway.
  • I do think only admins should be able to serve as verifiers, to prevent the scenario of non-admins listing and verifying too large a number of terms for anyone else to want to wade through and catch errors in before the time limit for disputation has passed. Admins can be trusted to have actually seen the entries they say they saw, and can go through any queue of terms to be verified at about the same leisurely pace at which we resolve RFVs (because, as I say, I envision this as a hedge against dictionaries going offline years from now, not next month).
I still hold out hope that the gnu-killing book Angr and Chuck mention will verify it, but DTLHS may be right that jádí dághaaʼígíí is just a neologism that no measure can help. I don't mind. Some users have been unhappy that their 'correctly formed' protologisms fail CFI, but I don't mind if we continue to fail to include protologisms. A lot of basic Navajo vocabulary could be helped by this, though, and a lot of Hunsrik vocab (as Ungoliant says), and a lot of vocab from other languages for which tribes have put up websites but not printed books. This Abenaki dictionary, for example, contains the contents of several books; so far, I've been careful to enter only words which are in the books I have, but if that site could be used as a reference, I could enter a lot more.
Wiki-mirrors and Wiki-derivatives have always been excluded, even when they appear in durably-archived print, because they're not independent of us. If we task communities of language editors with determining which dictionaries are reliable and which are not, those editors are unlikely to approve Wiki-mirrors, anyway.
It is ironic to be discussing ways of recording sites, isn't it, given that the WebCite proposal to allow an actual archive didn't pass?
- -sche (discuss) 06:38, 29 December 2012 (UTC)
Do you think the WebCite vote would pass if it was LDL-only? —Μετάknowledgediscuss/deeds 06:48, 29 December 2012 (UTC)
Possibly. Some people, including me, were wary of the previous vote because its effect on major languages like English, which are already very well covered, would have been to allow any un-copyedited website full of uncommon, nonstandard (mis)spellings. But websites are among the only accessible sources for many LDLs—and LDL websites (dictionaries, .edu sites, Bible translations) are ironically more likely to use a standard orthography than random English websites are... even if each website uses a different standard orthography, lol. - -sche (discuss) 08:20, 29 December 2012 (UTC)
Works other than dictionaries should be allowed. The vast majority of written content in Hunsrik are unpublished Bible portions, which have been floating around the Internet for a couple of years. I suspect the translation project has been abandoned. — Ungoliant (Falai) 07:43, 29 December 2012 (UTC)


Good afternoon,

I am wondering if a category could be created, that would contain adjectives ending in -ed pronounced /ɪd/, such as crooked, naked, wretched, ragged, etc.?

By the way, could someone read the little text about the queen on the same page, and explain to me where the ambiguity is? --Fsojic (talk) 16:45, 30 December 2012 (UTC)

I don't know about anyone else, but Google doesn't give me a preview of this page. As for the category, they seem to be an odd assortment of words from Old English and Old Norse with different morphological origins. Another example might be blessed, which is pronounced as two syllables only in poetic and religious (i.e. archaic) usage (in poetry it's also spelled blessèd). That one is from Old French. Chuck Entz (talk) 03:27, 31 December 2012 (UTC)
After skimming through the adjective section, it looks a bit more complex: terms with t or d before the -ed have -ed regularly pronounced as /ɪd/, so you probably want to eliminate those to concentrate just on the ones that are different from normal patterns. Also, the language originally pronounced all the -ed endings or their predecessors with vowels, but most of the vowels have since been lost. There are several terms that tend to retain the vowel in archaic registers such as poetry, religion and (sometimes) legal terminology, or in specific senses: aged/agèd, alleged, beloved/belovèd, hallowed, learned, and supposed are the ones I've noticed so far. Then there are a few terms ending with -gged that seem to be collectively resisting the trend: dogged, jagged, rugged. Some terms ending in -legged seem to be included, though it might also be the influence of -footed. One could also add wicked and sacred to the main list. In summary: It looks like this would be a valuable category for learners of English, but naming and delimiting it may pose problems. Chuck Entz (talk) 09:51, 31 December 2012 (UTC)
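The first filtering step suggested here (dropping words where t or d precedes -ed, since /ɪd/ is regular there) can be sketched as below. The function name is hypothetical, and the result is only a list of candidates: spelling alone cannot tell naked from walked, so pronunciation still has to be checked separately.

```python
def irregular_ed_candidates(words):
    """Keep words ending in -ed whose stem does not end in t or d."""
    candidates = []
    for word in words:
        lower = word.lower()
        # lower[-3] is the letter immediately before the final "ed".
        if len(lower) > 2 and lower.endswith("ed") and lower[-3] not in "td":
            candidates.append(word)
    return candidates
```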

Please, block Fête (talkcontribs)

I can't let it continue. He continues to harass people on this wiki and he has been blocked and banned on 3 wikis (en wp, fr wikt, fr wp) and he even continued on the French Wikiversity and on the Esperanto Wikipedia. We can't give him a chance; he will continue. He will say ok, I stop, but will continue. We'll warn him, but he will ignore warnings. Block him, or he'll maybe attack other wikis. Ĉiuĵaŭde (talk) 03:51, 31 December 2012 (UTC)

Re: "He continues to harass people on this wiki": Has anyone on this wiki indicated that they feel harassed? Harassment is very much a culture-dependent thing; behavior that people from one culture find acceptable, people from another culture might find intolerable (and vice versa).   Re: "Block him, or he'll maybe attack other wikis": I rather suspect the reverse: time that he spends here is perforce time he is not spending on other wikis.   —RuakhTALK 04:02, 31 December 2012 (UTC)
And these 3 wikis where he's blocked? Ĉiuĵaŭde (talk) 04:03, 31 December 2012 (UTC)
. . . were presumably wikis where people did feel harassed. What has that got to do with us? —RuakhTALK 04:06, 31 December 2012 (UTC)
The en.WP block discussion itself sounds like "I think this person is harassing other people" rather than "this person is harassing me".
It has proven necessary to check all of Fête's edits to entries, and to correct or tweak many of them; in that way they seem comparable to Wonderfool, and one can ask the same question one could ask when considering whether or not to block the day's WF sock: which weighs more, the amount of good information that is added or the burden of checking all edits and correcting the errors that will probably always be present in the contributions of a user who edits pronunciations in a language they speak only middlingly? - -sche (discuss) 08:02, 31 December 2012 (UTC)
Gtroy/LW or Drago seem closer, due to inability to really learn from correction by others. WF has stretches of decent edits, then either gets bored and starts to add garbage or gets lazy and falls back into old, bad habits (it varies). Fête's edits are consistently colored by questionable judgment. Very few of his edits are in languages he's qualified to work with, and pronunciation is an area that's much harder to piece together from reference material if you're not fluent. I don't trust myself to do much of it- and I've taken classes in the subject at the university level. Chuck Entz (talk) 09:23, 31 December 2012 (UTC)

You can give him a chance, but the best thing is to block our dear Cantonese friend. Quentinv57 (talkcontribs) is (or will be) maybe discussing with other stewards for a probable global block, following his talk page.

Now, there's nothing wrong here for you (so you don't care), but we'll see that soon.

Here are the block reasons of user Fête:

  • EN WP : 72 hours : Cross-wiki disruption from French Wikipedia and Commons
  • FR WP : Forever : récidive dans le harcèlement de contributeurs (recidivism of user harassment)
  • FR WIKT : Forever : Voir WT:BA (see WT:BA)
  • ZH-YUE WP : Until January 2nd, 2013, at 1:39pm : Frequently inserting non-Cantonese words when editing (e.g. 唔噏"bye" in 講拜拜) (vandalism)

If it's not enough, here are two links for you : FR Wikiversity and FR WIKT (with steward warning).

We don't need something that proves that they feel harassed (because they can stay quiet), we need just something that proves that they were harassed. I'm doing a job that a simple user would not do. Ĉiuĵaŭde (talk) 14:40, 31 December 2012 (UTC)

You've completely misunderstood. It's not "not enough" it's completely irrelevant. I find your input into this debate irritating, but I'm not going to block you for it. Mglovesfun (talk) 15:00, 31 December 2012 (UTC)
Harassment also isn't something that exists outside of a person's opinion. You've got this totally wrong. If you have some relevant information, go ahead, but please don't talk just for talking's sake, or else I might feel 'harassed'. Mglovesfun (talk) 15:02, 31 December 2012 (UTC)
Ok, no problem. I'll let you leisurely. The problem is not there, right? So you don't care and I understand you totally. Happy New Year. Ĉiuĵaŭde (talk) 15:06, 31 December 2012 (UTC)