Wiktionary:Beer parlour/2012/April

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

April 2012

I have a question. I have seen entries like this one before, with templates for phrases as entries, so I gave it a try with one I saw was missing. Have I done it right? Lucifer (talk) 22:43, 13 April 2012 (UTC)[reply]

No, these have failed RFV/RFD in the past, since they are not used in language this way (with the letter X). Examples were X one's Y off and I'll see you X and raise you Y. Equinox 17:25, 14 April 2012 (UTC)[reply]

w:Singlish has an entry saying it's a creole language with no ISO 639 code. Is it really a language, and if it is, what code do we use for it? If it's English-based, it might be {{gmw-sin}} ({{etyl:gmw}} is for West Germanic, and English is West Germanic). Mglovesfun (talk) 15:48, 3 April 2012 (UTC)[reply]

Maybe we could enter it as English with a Singapore gloss. It seems to read more like English than anything else. Equinox 15:51, 3 April 2012 (UTC)[reply]
It's not a creole. WP editors (and linguistically uninformed people generally) have an unfortunate tendency to call any language that shows any sort of influence from an unrelated language a "creole" or a "mixed language" without understanding what those terms actually mean. It's not a separate language either. If we have any "Singlish" forms here, just list them under English and tag them {{Singapore}}. —Angr 22:30, 3 April 2012 (UTC)[reply]
That's right, Singlish is an example of code-switching and mixing languages, uneducated English in general. It's similar to Chinglish, Engrish/Japlish, Runglish, etc. The spread and importance of Singlish is also exaggerated. Yes, Chinese people say "okey lah" (see ) and make grammatical errors but it's not exclusive to Singapore, it's just the influence of people's mother tongue on whatever language they speak. I also don't think it's something that stays unchanged for a significant period. --Anatoli (обсудить) 23:20, 3 April 2012 (UTC)[reply]

out of context in an attributive sense

Wiktionary:Criteria_for_inclusion/Fictional_universes:

With respect to names of persons or places from fictional universes, they shall not be included unless they are used out of context in an attributive sense, for example:

  • 2004, Robert Whiting, The Meaning of Ichiro: The New Wave from Japan and the Transformation of Our National Pastime, p. 130:
    Irabu had hired Nomura, a man with whom he obviously had a great deal in common, and, who, as we have seen, was rapidly becoming the Darth Vader of Japanese baseball.
  • 1998, Harriet Goldhor Lerner, The Mother Dance: How Children Change Your Life, p. 159:
    Steve and I explained the new program to our children, who looked at us as if we had just announced that we were from the planet Vulcan.

Wiktionary:Criteria_for_inclusion#Fictional_universes and Wiktionary:Criteria_for_inclusion/Fictional_universes require fictional names to be cited “out of context in an attributive sense.” But the terms are not used attributively in the examples. How are we to interpret this guideline? Are the examples incorrect? Michael Z. 2012-04-01 17:23 z

I have myself been under the impression that others had superior knowledge of the meaning of attributive so that the uses above would be included. But I can find no evidence that there is such a generally accepted meaning in any applicable linguistic sense. Quite to the contrary, the contrasting term predicative seems directly applicable to the 2004 example. Accordingly, one or more contributors must have been mistaken. DCDuring TALK 18:42, 1 April 2012 (UTC)[reply]
Right, the linguistic interpretation of attributive does not apply. So it must be the other meaning, "pertaining to or having the character of attribution or an attribute". DAVilla 04:19, 3 April 2012 (UTC)[reply]
I'm skeptical. In the Vulcan quote, there is no “attributive sense” to the term. The children's disbelief is conveyed by a metaphor stemming from the sentence's syntax. And such a generalized sense of “attributive” isn't even given in most dictionaries, so this should be rewritten unambiguously, no matter which way we interpret it. Finally, Vulcan is not a supporting example, because the “Star Trek planet” sense is absent from Wiktionary. Michael Z. 2012-04-03 16:10 z
If CFI is somewhat ambiguous, maybe we do need to have it clarified. Isn't it somewhat disturbing your statement that the example in CFI does not meet CFI? I maintain that the common meaning of attributive is applicable, and I challenge you to find any dictionary that does not have this meaning. Plus, you should have your eyes checked. We've had the Star Trek planet sense since the very day the page was created. DAVilla 18:11, 4 April 2012 (UTC)[reply]
Related: Wiktionary:Votes/pl-2010-05/Names of specific entities. —RuakhTALK 17:52, 2 April 2012 (UTC)[reply]
I think with this policy, something like "Joker laugh", you probably won't understand what this means unless you know who Joker from the Batman franchise is. I'm ok with it so far. But if I hear a song on the radio and someone says "this has a 90s sound to it", we can't really add to 90s a definition to cover what music was like in the 90s. At some point, popular culture information has to go on Wikipedia. Mglovesfun (talk) 15:53, 3 April 2012 (UTC)[reply]
Yes. The policy should help unwind the difficult exercise of excluding specific references, but including terms with inherent meaning (which may, confusingly, originate in specific references). The current unclear writing just muddies the water. As a result, we don't even know what the policy is, and interpret it in contrary ways. I hate that. Michael Z. 2012-04-03 16:21 z

...is ongoing.​—msh210 (talk) 21:02, 1 April 2012 (UTC)[reply]

Yeah, it ends in approximately 24 hours. --Daniel 00:14, 2 April 2012 (UTC)[reply]

Other ongoing votes:

And these are going to start:

--Daniel 18:15, 2 April 2012 (UTC)[reply]

We have two votes about changing the CFI language on brand names? Two votes which are simply different proposals, without respect to each other? Well, it's a good thing we're solving these issues with votes, which are conducive to compromise and change, as opposed to something silly like discussions. Also, if anyone would like to change the status quo on the necessity to have a vote over changing a comma into a semi-colon on ELE, there's a vote for that too. -Atelaes λάλει ἐμοί 22:43, 2 April 2012 (UTC)[reply]
The two votes on brand names were quickly started, and have not created much overhead or endless discussion. Those who wanted to discuss could do so on the talk pages of the votes, and they did, indeed. I think Liliana and I should be applauded for creating these two votes. We are getting things done. I now even think that Liliana was right to let the other vote start a mere week after the start of the first vote. It's a pity that there are so many vote-haters in Wiktionary. Furthermore, the extension of the regulation of brand names to non-physical products has been discussed countless times in WT:RFV, so a lot of discussion actually already took place long before the votes were created. --Dan Polansky (talk) 15:58, 3 April 2012 (UTC)[reply]
To be clear, I don't hate votes, but I guess my philosophy on their purpose differs from others. I've always thought that votes were meant to be a documentation of previously established consensus. Essentially, that votes shouldn't be started until it's patently obvious which way they'll go. They shouldn't require further discussion because the discussion has been more or less fully resolved. I'll take your word for it that there was a fair amount of discussion on the topics beforehand. However, there were clearly subtopics which weren't discussed, or weren't discussed properly beforehand, as evidenced by the talk pages of those votes. In any case, I'm not disparaging the vote creators, whom I will concede are getting things done. Rather, I'm disparaging the current methodology of getting things done, which I view as extremely inefficient. -Atelaes λάλει ἐμοί 23:11, 3 April 2012 (UTC)[reply]

First-person Singular Imperative of Portuguese Verbs

An anon complained at Talk:cantar that there’s no 1st person singular imperative in Portuguese, and as far as I’m aware he is correct. If you look at fazer, our conjugation includes the 1st person singular imperative (affirmative and negative). Note the following links:

  • (pt:wp) “No imperativo, não existe a primeira pessoa do singular (eu)” (For the imperative, there is no first person singular (I)).
  • (pt:wt) No 1st person singular imperative (open the dropdown under the header Conjugação. 1st person singular is the first column, imperatives are the 10th and 11th rows).
  • (Online dictionary 1) ditto (under the header Imperativo, notice how it has 5 items, while the other moods have 6)
  • (Online dictionary 2) ditto (1st person singular imperatives marked with a dash).

It is my opinion that the 1st person singular imperative should be removed from the Portuguese conjugation table ({{pt-conj/theTable}}). Is anyone opposed to this? Ungoliant MMDCCLXIV 12:12, 3 April 2012 (UTC)[reply]

The 1st person singular imperatives are traditionally absent from conjugation tables, yet they are largely attestable. Don't you agree?
"Bendito seja eu por tudo o que não sei" (Fernando Pessoa)
See also Prof. Evanildo Bechara's grammar book Moderna Gramática Portuguesa, where they do appear in conjugation tables. --Daniel 12:36, 3 April 2012 (UTC)[reply]
Isn’t that just referring to oneself in the third-person? Ungoliant MMDCCLXIV 12:42, 3 April 2012 (UTC)[reply]
I don't think so. But, how do you think that would happen? If you have an explanation, please let me know. I didn't check the sources you listed.
The thesis that it is just referring to oneself in the third-person does not seem to explain how the verb "saber" agrees with the personal pronoun without further adaptations. (if the sentence was, hypothetically, "Bendito seja eu por tudo o que não sabe.", the thesis would be easily understandable; but, the truth is...)
  • Bendito seja eu por tudo o que não sei.
  • Bendito sejas tu por tudo o que não sabes.
  • Bendito seja ele por tudo o que não sabe.
--Daniel 13:01, 3 April 2012 (UTC)[reply]
In these examples, both persons are the same, but that's not necessarily the case.
  • Bendito seja eu por tudo o que (eu) não sei.
  • Bendito seja eu por tudo o que (tu) não sabes.
  • Bendito seja eu por tudo o que (vocês) não sabem.
Ungoliant MMDCCLXIV 13:07, 3 April 2012 (UTC)[reply]
In "Bendito seja eu por tudo o que não sei.", there is no second written personal pronoun. Assuming that the parentheses represent unwritten words, then your examples would be written like this:
  • Bendito seja eu por tudo o que não sei.
  • Bendito seja eu por tudo o que não sabes.
  • Bendito seja eu por tudo o que não sabem.
Is that correct? --Daniel 13:12, 3 April 2012 (UTC)[reply]
Correct. By the way, I just discovered that my university’s library has prof. Bechara’s “Moderna Gramática Portuguesa”. However, I will only be able to go there later at night. Ungoliant MMDCCLXIV 14:52, 3 April 2012 (UTC)[reply]
Pardon my cluelessness — I don't speak Portuguese — but in the Pessoa quotation, isn't seja the subjunctive, rather than the imperative? The speaker is not commanding himself to be blessed — it's not as though he could comply with the commandment by being blessed — but rather, he's calling down a blessing on himself. Note that in the second person it's generally "bendito sejas", not "bendito sê". —RuakhTALK 13:18, 3 April 2012 (UTC)[reply]
If it was subjunctive it would take a que somewhere, I think. As in “que eu seja bendito por tudo que não sei”. Ungoliant MMDCCLXIV 14:52, 3 April 2012 (UTC)[reply]
Well, again, I don't speak Portuguese; but in both French and Spanish, this is exactly the sort of situation where the subjunctive can be used without "que". For example, in French you can say either « béni sois-je » or « que je sois béni », and in Spanish either "bendito sea yo" or "que yo sea bendito". (Note that in both languages' "que"-less version, the subject follows the verb.) —RuakhTALK 15:27, 3 April 2012 (UTC)[reply]

Back from the library. I found the book mentioned by Daniel, and it confirms my theory (at least in the edition I had access to).

    • 1977, Evanildo Bechara, Moderna Gramática Portuguesa, 22nd edition, Companhia Editora Nacional, page 116:
      O imperativo em português só tem formas apenas para as segundas pessoas; as pessoas que faltam são supridas pelos correspondentes do presente subjuntivo. Não se usa o imperativo de 1.ª pessoa do singular. As terceiras pessoas do imperativo se referem a você, vocês e não a eles. Também não se usa o imperativo nas orações negativas; neste caso empregam-se as formas correspondentes do presente do subjuntivo.
      The imperative in Portuguese only has forms for the second persons; the missing persons are supplied by the corresponding forms of the present subjunctive. The 1st person singular of the imperative is not used. The third persons of the imperative refer to você, vocês and not to eles. The imperative is also not used in negative clauses; in this case the corresponding forms of the present subjunctive are employed.

Following this paragraph, there is a table with the conjugation of the affirmative and negative imperatives of the verb cantar, and in both of them the first person singular is marked with a dash. I also skimmed quickly through some pages to find conjugation tables; in every one I found (pages 129, 133, 136, 144), 1st person singular imperative was missing.
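As a rough illustration, the rule quoted from Bechara can be sketched in Python. This is only a sketch under stated assumptions: the function name and data layout are hypothetical and have nothing to do with the actual {{pt-conj}} templates, and the step that derives the second-person affirmative forms by dropping the final -s of the present indicative is an assumption that holds for regular verbs such as cantar, not something stated in the quotation.

```python
# Hypothetical helper for illustration; not part of any Wiktionary template.
PERSONS = ["eu", "tu", "você", "nós", "vós", "vocês"]


def imperative_tables(present_indicative, present_subjunctive):
    """Build affirmative and negative imperative rows from two six-form paradigms.

    None marks a cell with no form (shown as a dash in printed tables).
    """
    ind = dict(zip(PERSONS, present_indicative))
    sub = dict(zip(PERSONS, present_subjunctive))
    affirmative, negative = {}, {}
    for person in PERSONS:
        if person == "eu":
            # No 1st person singular imperative, affirmative or negative.
            affirmative[person] = None
            negative[person] = None
            continue
        if person in ("tu", "vós"):
            # Assumption (regular verbs only): the true imperative equals the
            # present indicative without its final -s.
            affirmative[person] = ind[person][:-1]
        else:
            # Missing persons are supplied by the present subjunctive.
            affirmative[person] = sub[person]
        # Negative clauses always use the present subjunctive.
        negative[person] = "não " + sub[person]
    return affirmative, negative


affirmative, negative = imperative_tables(
    ["canto", "cantas", "canta", "cantamos", "cantais", "cantam"],
    ["cante", "cantes", "cante", "cantemos", "canteis", "cantem"],
)
for person in PERSONS:
    print(person, affirmative[person] or "-", negative[person] or "-")
```

Irregular verbs (ser, for example) would need their second-person forms supplied explicitly rather than derived.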

Maybe this changed in subsequent editions, but if it didn’t I’d say this is evidence enough for the removal of the 1st person singular imperative from the Portuguese conjugation table.

Also, hats off to Ruakh, who was right about it being subjunctive. Ungoliant MMDCCLXIV 23:41, 3 April 2012 (UTC)[reply]

If you do want to remove it from conjugation tables (which sounds reasonable to me), then I think you should do that by modifying {{pt-conj/doWork}} or {{pt-conj/doWork/add}} rather than {{pt-conj/theTable}}: the latter happily lets the first-person-singular-imperatives have some sort of indication of non-existence (such as an em dash or a blank space), as long as parameters #69 and #75 are set properly. —RuakhTALK 00:29, 4 April 2012 (UTC)[reply]
I’ve only recently started attempting to write templates (and it’s not working very well yet :-( ). Whichever way it’s removed, it should be done so that in case it ever turns out we were wrong, it should be easy to fix. But it’s better to wait and see what Daniel and other contributors have to say, before changing such a widely used template. Ungoliant MMDCCLXIV 00:50, 4 April 2012 (UTC)[reply]

{{look}}

Requesting input for extinct and other sparsely documented languages

Thanks to the great feedback provided, the proposal for endangered languages CFI needs to be reworded and is essentially stopped. (Do I need to somehow terminate it?)

1. Although I have asked twice for clarification of the CFI for extinct languages, I haven't gotten any feedback on that. It currently reads: "For terms in extinct languages: usage in at least one contemporaneous source."

One of the criticisms of my proposed vote was the word "usage" instead of "mention." But the CFI for extinct languages also uses the word "usage." Also, the word "contemporaneous" is used. I'm not sure why it's worded like that.

My current thought is to propose that all sparsely documented languages (including extinct and most endangered languages) have a criterion such as this: "Contextually appropriate usage or mention in at least one source."

That will allow for quoting scholars who are not contemporaneous with the language when it was alive.

2. I think that the wording in 1 will provide room for abuse, however. With only one usage or mention, somebody can just upload to Usenet a hastily typed document with lots of errors and proclaim it to be a valid source, forcing inappropriate words onto Wiktionary. I therefore want to provide a mechanism to provide balance and curb abuse of that criterion. My line of thinking is something like this: "Each sparsely documented language will maintain a page that provides space for discussing whether sources are appropriate."

That page can be the About page or a completely separate page; additionally, people can bring it to the Wiktionary community at large and call a vote, though hopefully that will not be necessary.

Any comments and feedback are greatly appreciated. BenjaminBarrett12 (talk) 23:38, 3 April 2012 (UTC)[reply]

Reading the comments, I suspect that the final product should probably treat the various groups (extinct, endangered, poorly documented) separately, as a number of the oppose/abstain votes took issue with lumping them together. The notion of allowing each language to essentially define its own criteria on its 'About' page is a very robust approach, which, in an ideal world would probably be best. However, I suspect that many editors will be uneasy with the inherent ambiguity of such an approach. Personally, I'm not entirely certain how I feel about it. Regarding use/mention, it should probably be noted that Ancient Greek is already allowing mentions. θεπτάνων (theptánōn) is a mention only term. It is never used in an ancient context, but is merely mentioned and defined by the ancient dictionary written by w:Hesychius. If we restricted ourselves to use, there may well be justification for that, but we should realize the consequences of doing so, namely that we will never have important, and almost certainly real, words in our dictionary that others will have. Of course, a compromise approach would be 'use only' with notable exceptions when appropriate, which would solve the problem for θεπτάνων (theptánōn) at least. -Atelaes λάλει ἐμοί 00:39, 4 April 2012 (UTC)[reply]
Thank you for the feedback. The issue of doing extinct languages separately is one of the reasons I'm asking about the current policy: I just don't see why extinct languages should be restricted to contemporaneous usage only when the other groups would not be, or what makes extinct languages so unique. (Also, there is a lot of fuzzy overlap between endangered and extinct languages that makes it difficult to separate them.) BenjaminBarrett12 (talk) 02:50, 4 April 2012 (UTC)[reply]
Can we also consider languages that are NOT endangered but for which English-language resources are limited, multiple transliteration methods exist, and our content is very scarce? For example, Sinhalese resources in English are almost non-existent, si wiki is nearly dead, and very few here would be able to say which transliteration methods would be right. Yes, having "just one source" for endangered languages sounds right and the source may not be on the web but from a book. I occasionally find the same situation for Lao or Burmese, where a word exists in a dictionary but there's hardly any occurrence on the web. --Anatoli (обсудить) 03:10, 4 April 2012 (UTC)[reply]
Thank you for the input. Yes, those are part of the proposal. Although I named the first one "endangered languages," it actually covers "sparsely documented languages." For the next proposal, I will use the term "sparsely documented."
BTW, transliteration methods are a major issue. I see Wiktionary has a guide at Wiktionary:Sinhalese_transliteration and a list of fewer than 100 words at Category:Sinhalese_language. Do you know if those Sinhalese terms have three attestations from books or Usenet? Also, Wikipedia has a portal at w:si:මුල්_පිටුව. BenjaminBarrett12 (talk) 03:31, 4 April 2012 (UTC)[reply]
I started Wiktionary:Sinhalese_transliteration but it's incomplete. Those Sinhalese words can probably be attested by a simple Google search but most likely not from archived resources or Google Books. Note also that entries in rare languages are often created by enthusiasts, not native speakers. If there are not enough enthusiasts to maintain si wiki, then it's even harder to find those willing to contribute here. --Anatoli (обсудить) 04:35, 4 April 2012 (UTC)[reply]

Welcome message

Yes, ours (Template:Welcome) is fine, but the template that Wikinews uses is considerably better (see it here). It is more attractive, has tabs (oooh, fancy!), and is honestly more welcoming. Is anyone interested in using this? If so, I'll move the content of our template into theirs and see how it looks. --Μετάknowledgediscuss/deeds 01:26, 4 April 2012 (UTC)[reply]

It does seem nicer... you can always try. —CodeCat 12:42, 4 April 2012 (UTC)[reply]
I doubt that just copying the template will suffice; I'm almost certain that the tabs require support from custom JavaScript that we'd have to copy as well. —RuakhTALK 17:03, 4 April 2012 (UTC)[reply]
It gives the impression of an automated welcome. Ungoliant MMDCCLXIV 17:21, 4 April 2012 (UTC)[reply]
The welcome already is an automated welcome, just added by hand. It's not as if anyone would have a chat with you about dos and don'ts and what your motivations are. But I would definitely prefer a more graphical approach to a text block. ᚲᛟᚱᚾ (talk) 18:14, 4 April 2012 (UTC)[reply]
How about this: I'm willing to 'Wiktionarize' it (including making sure that it relies only on other templates that we have) if someone else checks the JS (if you treat it as a language, my proficiency level is js-1...). I wouldn't even know where to look. --Μετάknowledgediscuss/deeds 03:59, 5 April 2012 (UTC)[reply]

"Idiom" header

Where has it been decided that the "Idiom" header is deprecated?

I believe that decision happened somewhere, but I couldn't find it. --Daniel 00:09, 5 April 2012 (UTC)[reply]

WT:POS says that it's not deprecated. (Of course, that could just mean that WT:POS wasn't updated after the decision was made.) —RuakhTALK 00:19, 5 April 2012 (UTC)[reply]
I think it should be deprecated if it's not already. —CodeCat 00:22, 5 April 2012 (UTC)[reply]
I think we just attempted to convey grammatical information in the L3 PoS header, leaving the sense-level {{idiom}} or category membership to carry the water for the idiom concept. There was nothing that prevented that and no one seemed to object. One could call that consensus or common law. I am not aware of the status of implementation outside of the English language. The idiom header was still used in other languages when last I monitored it, so perhaps it is just a question that has been resolved by English-language contributors for English-language entries, without prejudice for other language-contributor communities. DCDuring TALK 00:33, 5 April 2012 (UTC)[reply]
Inasmuch as I'm aware of policy and practice, the Idiom header is still in use at L4 in Japanese entries, but that's for listing idioms that use the headword. At L3 for Japanese, I've seen Phrase, Idiom, and Proverb. From what I've seen of English entries that have idioms as the head, they're being categorized as idioms and placed under an L3 Phrase header. FWIW. -- Eiríkr ÚtlendiTala við mig 01:12, 5 April 2012 (UTC)[reply]

watchlist all language templates

Given the discussion of Krio, I've reduced the cascading protection I applied (after this discussion) to all language templates and their script and family subpages using a, b and c, so now everyone except new users can edit those pages. If you'd like to add all seven-thousand-odd language-templates and their family pages and script subpages to your watchlists so that you can spot vandalism of them, you can click here, click 'edit', copy the contents of the page, and paste them into your watchlist; then do the same with this and this. (Warning: the pages are massive.) - -sche (discuss) 04:47, 5 April 2012 (UTC)[reply]

PS, if you think the cascading protection of those pages should not have been lowered, or should not exist at all, consider this also a general thread for discussing that. - -sche (discuss) 04:49, 5 April 2012 (UTC)[reply]
Oops, I should have tried to lower it before posting... turns out, protection can only cascade at the admin-only level(???). Well, admin-only protection it is. - -sche (discuss) 04:53, 5 April 2012 (UTC)[reply]
Yes, this is by design, because otherwise any user could protect a page without admin rights. -- Liliana 15:33, 5 April 2012 (UTC)[reply]
Ah, sorry, I was unclear; I mean: when protecting a single page (using my admin rights), I can leave the protection level at "allow all users", I can set it to "block new and unregistered users" or I can set it so "administrators only" can edit the page. If I set it so "administrators only" can edit the page, I can make that protection cascade and affect every page the original page transcludes... but I can't make the "block new and unregistered users" cascade(?). - -sche (discuss) 17:11, 5 April 2012 (UTC)[reply]

The discussion at Category talk:Wine drew attention to the possible confusion between Category:Wine & Category:Wines for the people who look after them.

Consequently I think that it would be better to adopt the same term as fr:Catégorie:Œnologie & nl:Categorie:Oenologie. JackPotte (talk) 07:41, 6 April 2012 (UTC)[reply]

That would be more differentiated, but still difficult for us on this side of the pond (I almost solely see enology). Is there an alternate word we could use? --Μετάknowledgediscuss/deeds 16:50, 6 April 2012 (UTC)[reply]
Category:Winemaking? In a general sense, this could be considered to include growing vines, winetasting, etc. Conversely, why not Category:Wine varieties? Michael Z. 2012-04-17 20:32 z

Egyptian

AFAICT, there is no policy on Egyptian. As a consequence, there are entries with transliterated titles as well as with hieroglyphic titles (which look like boxes to me, since I don't have that font). Should we make all Egyptian entries have titles with standard transliteration (which would be a lot more helpful) or do a mixed-script thing, like Serbo-Croatian (Cyrillic-Latin) or Japanese (Katakana-Hiragana-Romaji)? --Μετάknowledgediscuss/deeds 18:04, 6 April 2012 (UTC)[reply]

This is a consequence of Egyptian hieroglyphs being unavailable for thread titles until recently. I should've moved all of them to hieroglyphic titles long ago - but my motivation is a bit lacking. -- Liliana 18:08, 6 April 2012 (UTC)[reply]
Japanese is a bit too complex (I didn't even mention kanji above, which needs more considerations than the others). However, would you be open to giving the hieroglyphic titles equal status with the translit like sh? --Μετάknowledgediscuss/deeds 17:28, 7 April 2012 (UTC)[reply]
sh? You mean maybe got? Mglovesfun (talk) 22:48, 7 April 2012 (UTC)[reply]
Japanese isn't allowed transliteration entries because it's complicated. It's allowed transliteration entries because they're actually used by Japanese speakers. Egyptian should all be in hieroglyphic titles (with transliterations in the entry, of course). -Atelaes λάλει ἐμοί 22:56, 7 April 2012 (UTC)[reply]
Of course, you can take the way Gothic took and start a vote if you're interested in having Egyptian transliterations. -- Liliana 22:57, 7 April 2012 (UTC)[reply]
I couldn't think of any equivalent situations - thank you for reminding me of Gothic. That would be perfect. Due to widespread use among Egyptologists, I think translit titles are great. Does anybody have major objections to raise before I go ahead and try writing something up? --Μετάknowledgediscuss/deeds 17:08, 8 April 2012 (UTC)[reply]
Transliteration is actually used by virtually all Egyptian speakers. I don't see why the fact that they're non-native speakers should mean we shouldn't serve their needs.--Prosfilaes (talk) 23:50, 8 April 2012 (UTC)[reply]
I fully agree, but which transliteration system should we use? Currently we are using a hopeless mixture of the Traditional system and the Computer system. Personally, I find the European system (similar to Traditional) to be the best, but the Computer system doesn't use diacritics, so it would be much easier for me to input transliterations. Liliana, others, do you have an opinion? --Μετάknowledgediscuss/deeds 01:47, 9 April 2012 (UTC)[reply]

Numbered translation glosses

Do we want to have numbered translation glosses, such as those introduced in this edit of "break" from 23:35, 23 January 2012, curiously summarised as "checked fi"? I think I prefer the current practice of having no such numbers in translation glosses. Furthermore, I think "transitive" should not be removed from the glosses. Thoughts? --Dan Polansky (talk) 19:03, 6 April 2012 (UTC)[reply]

Current practice seems fine, both because many would have no clue to which sense a given translation belonged if the order or number of senses should change and because a nearby gloss should facilitate translation. DCDuring TALK 19:36, 6 April 2012 (UTC)[reply]
No, they should not be there. If someone adds a sense somewhere, it will turn everything into a total chaos. -- Liliana 22:59, 6 April 2012 (UTC)[reply]
I agree! Mglovesfun (talk) 23:02, 6 April 2012 (UTC)[reply]
I think that the current system is chaotic as well. People editing the English definitions seem to pay no attention whatsoever to the translations. I have many times encountered translation tables for senses that were deleted or merged a long time ago, or others in which the translation gloss just vaguely resembles the corresponding definition. However, my original plan was not to start a new practice. My intention was to use the numbers during the editing phase to keep track of which translation refers to which definition, but obviously I forgot to remove the numbers. Sorry for that. --Hekaheka (talk) 20:08, 7 April 2012 (UTC)[reply]
Are there cleanup approaches that would get at the problem? The problem is, after all, limited to polysemic PoSes in English entries. One thing might be to match the count of trans-tables to the count of senses (excluding &lit senses) for polysemic English PoSes. Another might be to list each trans-gloss that does not consist of (meaningful) words that are in a sense including any context-type labels. Polysemic PoSes can also have problems with synonyms. This is in principle doable with dump processing. Once we have cleaned up the backlog, it could be left to a bot to track changes. Perhaps the closers of RfDs and RfVs and other contributors could leave a special template if the sense content changes and they do not make corresponding changes in the glosses. DCDuring TALK 21:50, 7 April 2012 (UTC)[reply]
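The first check suggested above (matching the count of translation tables to the count of senses) could be sketched roughly as follows. This is a minimal sketch, not an existing bot or tool: the function name is hypothetical, it assumes it is handed the wikitext of a single part-of-speech section, it assumes top-level senses are lines starting with a single "#", and it assumes each translation table opens with {{trans-top}} and that &lit senses are marked with {{&lit}}.

```python
import re


def count_senses_and_tables(wikitext):
    """Return (senses, translation_tables) for one part-of-speech section.

    Hypothetical helper: counts top-level definition lines ("# ...") while
    skipping example lines ("#:"), quotations ("#*"), subsenses ("##") and
    &lit senses, and counts lines that open a translation table.
    """
    senses = 0
    tables = 0
    for line in wikitext.splitlines():
        stripped = line.strip()
        if re.match(r"#(?![#:*])", stripped):
            if "{{&lit" not in stripped:
                senses += 1
        elif stripped.startswith("{{trans-top"):
            tables += 1
    return senses, tables


sample = """# To separate into pieces.
#: He broke the vase.
# {{&lit|break}}
# To interrupt.
{{trans-top|to separate into pieces}}
{{trans-bottom}}
"""
senses, tables = count_senses_and_tables(sample)
if senses != tables:
    print(f"mismatch: {senses} senses, {tables} translation tables")
```

Run over a database dump, a mismatch would only flag an entry for human review; it cannot tell which gloss belongs to which sense.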
I'm also annoyed with people ignoring translations when changing definitions at times. Please understand, it takes time and effort to add translations. Don't just change definitions and trans glosses lightly. People who made valid translations may not come back. I'm also frustrated when translations are converted into "to be checked" sections. I know it's hard but perhaps this could be avoided if definitions and translations are in synch and when the most common or intuitive definition of a term comes first (before any additional senses), the most common translation will also come into the first gloss. Endless splitting may also be counterproductive, like with pass#Verb. There are so many definitions but I struggle to find the sense that applies to "pass the time" (or maybe it's missing?). --Anatoli (обсудить) 02:27, 11 April 2012 (UTC)[reply]
Having numbers in translation tables might not be so bad after all, especially for entries which have a large number of definitions. Take break as an example. We currently have 27 definitions and 31 translation tables. Why? Because for definition #1 there are four translation tables and for definition #23 there are two. There is potential for another two tables, as definition #23 has three subdefinitions and somebody might want to create a table for the sense that is common to all subsenses. This is really time-consuming for someone who wants to either find or edit translations. Also, when doing the actual editing, it is not easy to locate the correct one among 31 options in a set of 27. I admit that the numbers might sometimes refer to a wrong definition, but if there were both the number and the gloss, the numbering would at least make it easier to detect an eventual mess and fix it. Either way, the current model of separating definitions and translations in separate lists is apt to cause confusion. What if we had a slightly different format for translation tables and they would immediately follow a definition in the same way the quotations do? --Hekaheka (talk) 07:11, 19 April 2012 (UTC)[reply]
As I was reading the first part of your comment, I was thinking the same thing you suggest in the end: what if we put translations next to senses? The obvious downside is that the edit window becomes less navigable, for someone trying to change e.g. the sixth of twelve senses. (Imagine [[water]] or [[iron]] sorted that way). But there are always trade-offs, and this would have the benefit of reducing the tendency of entries to accrue translations sections that are out of sync. - -sche (discuss) 07:21, 19 April 2012 (UTC)[reply]
Would it be complicated to make each sense individually editable? --Hekaheka (talk) 08:45, 19 April 2012 (UTC)[reply]
Another reason to go for numbering or to put the translations next to senses is some editors' habit to use subsenses. See for example gay and touch. --Hekaheka (talk) 07:48, 20 April 2012 (UTC)[reply]
It has been tried, and in my opinion was a great success. Ruakh came up with a system for it. I forget why it wasn't implemented; I think someone (Connel?) opposed it rather strenuously. We should definitely bring this back, it would solve a multitude of problems; right now there's an irritating need to duplicate definitions in the translation tables. Ƿidsiþ 08:05, 20 April 2012 (UTC)[reply]

Normalised spellings of ancient languages

There are some ancient languages that are commonly written in normalised spelling. This means that the spelling is brought into a common form, which may not be the form that is actually attested in writing. One such language is Old Norse. A word such as kvelja might have actually been spelled <qvelia> in the original document, and ek is more usually written <ec>. Similar normalisations are also commonly applied to other Germanic languages. It seems to me that such normalised forms are definitely useful (even more so than the spellings of the original document), but they technically don't meet CFI because they are not actually attested. I am wondering what kind of consensus or policy exists on this practice so far. Personally I think they should be allowed for any language with no consistent spelling system, provided the normalisation scheme is explained somewhere. And of course I'm not suggesting that the spellings of the manuscripts themselves cannot be added, too (maybe as alternative forms pointing to the normalised spellings). —CodeCat 19:32, 6 April 2012 (UTC)[reply]

I would consider any publication of a work valid for attestation; I see no reason that you can't cite Oxford's Anthology of Old Norse, e.g., for a spelling. I would consider them far more useful than the manuscript spellings, as there are probably hundreds of copies of such anthologies for each copy in the original spelling.--Prosfilaes (talk) 22:32, 6 April 2012 (UTC)[reply]
See also Wiktionary talk:About Old French#Wiktionary:Tea room. Mglovesfun (talk) 22:34, 6 April 2012 (UTC)[reply]
If it's relevant, minuscules (lowercase letters) and accents were all invented well after nearly all the important Ancient Greek works, yet all of our Ancient Greek words make use of them. It's a scholarly standard. The point is, Ancient Greek on Wiktionary has fairly specific orthography standards (with a few grey areas), and as it happens, every other Wiktionary seems to be following identicalish standards, as evidenced by the existence of interwikis. -Atelaes λάλει ἐμοί 23:00, 6 April 2012 (UTC)[reply]
That does clear up some things but I do still have questions. When it comes to normalisation of ancient Germanic languages, there are different standards (if you can call them that). Those standards don't usually conflict, it's more a matter of how much normalisation they apply. For example, one source might normalise i to j where appropriate, another might normalise uu to w and u to v where appropriate, another might normalise c to k, yet another might also normalise qu to kw, some might also apply morphological standardisation, and different sources might apply different combinations of these normalisations. It would be hard for us to show all combinations, and it still leaves open the question of which scheme Wiktionary itself standardises on (for consistency and clarity if nothing else). —CodeCat 00:00, 7 April 2012 (UTC)[reply]
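To make the point about combinable schemes concrete, here is a minimal Python sketch. It assumes that each normalisation can be modelled as a simple text substitution, which is a simplification: the rule names, the patterns, and the crude "i before a vowel" test for "where appropriate" are all hypothetical and not taken from any published scheme.

```python
import re

# Hypothetical rule table; names and patterns are illustrative only.
# Rules are applied in this fixed order so results are reproducible.
RULES = [
    ("qu_to_kw", re.compile(r"qu"), "kw"),
    ("uu_to_w", re.compile(r"uu"), "w"),
    ("c_to_k", re.compile(r"c"), "k"),
    # Crude stand-in for "where appropriate": i directly before a vowel.
    ("i_to_j", re.compile(r"i(?=[aeiou])"), "j"),
]


def normalise(word, enabled):
    """Apply only the enabled rules to word, in the order given by RULES."""
    for name, pattern, replacement in RULES:
        if name in enabled:
            word = pattern.sub(replacement, word)
    return word


# The same manuscript form under two different rule combinations.
print(normalise("ec", {"c_to_k"}))
print(normalise("ec", {"uu_to_w"}))
```

With only c-to-k enabled, the manuscript form ec comes out as ek; with only uu-to-w enabled it is left unchanged. Each subset of rules is in effect a different scheme, which is why listing every combination quickly becomes impractical.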
rambling thoughts on the matter: When a portion of a work (not just one word) is printed in two (or one, or three, etc) editions using different normalised spellings, I consider both printings to contain CFI-satisfying uses of whatever spellings they contain, in the same way one printing of the Bible using [[vnto]] can be used to cite [[vnto]], and another using [[unto]] can be used to cite [[unto]] (even though they're not independent, etc etc), so I'd allow all attested normalisations. I'd also consider original manuscripts and facsimiles thereof to contain CFI-satisfying uses of words, so I believe we should always allow manuscript spellings to have entries (which can soft-redirect to the normalised spellings). I would still normalise upper- vs lower-case, i.e. have an entry at [[en]] even if manuscripts have [[En]] or [[eN]], because our search function and see-{{also}}s handle case differences but not spelling differences, and because it is our longstanding, fundamental policy to have entries for different spellings ([[colour]], [[color]]) but not different capitalisations ([[COLOUR]], [[Color]]). As for which normalisation to standardise on: I suppose the editors of each language can decide that amongst themselves, just like the editors of various languages can decide on systems of romanisation. - -sche (discuss) 03:45, 7 April 2012 (UTC)[reply]
I've elaborated on normalisation some on WT:AOSX, WT:AODT, WT:AGOH and WT:ANON. Is this ok? —CodeCat 16:57, 7 April 2012 (UTC)[reply]
See also my note at Wiktionary talk:About Middle French. Mglovesfun (talk) 17:39, 7 April 2012 (UTC)[reply]
I like what you've written at WT:AOSX and WT:AGOH, though I have two questions about the specific normalisation schemes: I presume the note that 'u, uu'='w' applies only when 'u' is consonantal, so 'ubar' is not 'wber' ;) — and why normalise 'u, uu' to 'w', but 'kw' to 'qu'? Shouldn't it be either 'u + qu' or 'w + kw'? I'm looking into WT:ANON now.
PS: I wonder if we should have a dedicated template for manuscript spellings, like {{manuscript spelling of}}, displaying "Manuscript spelling of _" or "Alternative spelling of_, used in [some manuscripts]" (the last part could even allow specific manuscripts(s) to be named as parameter(s)). - -sche (discuss) 18:03, 7 April 2012 (UTC)[reply]
I'm not sure about normalising qu to kw. My reasoning is mostly that while modern German has normalised c and uu, it still retains qu in its modern spelling (although Dutch does not). Furthermore, Middle Dutch is commonly cited with qu intact as well. So it seemed more consistent to leave it like that in the Old languages as well. And yes I do think a special template would be nice. But I would suggest {{unnormalized spelling of}}, so that it's immediately clear what the relationship is (as well as the fact that normalisation has been applied in the first place). However, this may be confusing if both the unnormalised and normalised spellings occur in the actual documents, such as Old Saxon terms spelled with either v or ƀ. —CodeCat 18:09, 7 April 2012 (UTC)[reply]
I'm strongly in favour of a template which denominates unnormalised spelling. I also think that normalisation should only include native phonemes. Thus, I would eradicate qu from it because, in Germanic languages, neither do kw and qu contrast, nor is there a /q/ phoneme, nor is /u/ part of <qu>. Also, for Old Saxon specifically, I'd like to rediscuss <v>, which indicates /v/, where [β] was used. Which makes me ask: Are you intending these to be guidelines which are followed out of politeness, or would it lead to votes turning them into policies? Korn (talk) 18:28, 7 April 2012 (UTC)[reply]
It seems to me that that goes further than the intended purpose of normalisation. Normalisation is not the same as completely respelling the words to be phonemic. It's meant only to standardise on spelling variations for ease of use, and to introduce modern distinctions between letters that were not known to ancient writers (mostly concerning I and U). A 'common denominator' spelling if you will. I'm not sure about the exact pronunciation of /v/ in Old Saxon. In Germanic it was indeed [β], and presumably it remained bilabial until after Old Saxon and Old High German split, because OHG has [b]. But in Old Saxon texts, in words with Germanic *b, some writers use ƀ while others use v. Old Dutch and Old Frisian texts use v exclusively, while OHG texts write mostly b, and Old Norse and Old English write only f. On the other hand, in later Old Dutch and also High and Low German, the letter v starts to be used to represent *f as well, showing initial and medial voicing of voiceless fricatives.
And no I'm not intending this to be a formal policy, I'm only hoping to establish some kind of common practice, and to have it in writing that normalisations are... well, the norm on Wiktionary. —CodeCat 18:44, 7 April 2012 (UTC)[reply]
How about {{historical/attested spelling of}} and {{normalised spelling}} alongside each other?Korn (talk) 20:32, 7 April 2012 (UTC)[reply]
Do you intend {{normalised spelling}} as a context template? (Main entries need no form-of templates.) If so, what do we do when the normalised spelling is also attested, include {{attested spelling of}} on one line and {{normalised spelling}} on the next, or use {{context|attested and normalised spelling}}? I would prefer we leave normalised entries unmarked: we needn't mark them as attested when they are also attested, just as we needn't mark "bird" as referring to a feathered-wing-having, egg-laying animal in the New York dialect, because it refers to the same thing in most other dialects of English... and we needn't mark them as normalised, lest we wrongly imply they aren't attested when they are. (But I understand the value of noting which spellings are attested and which are normalised, so I might could be persuaded to support such templates and context tags as an obvious way of presenting such information, though usage notes might be better.)
Another idea for the manuscript spellings: (e.g. for [[ec]]:) # {{context|in the Codex Regius}} {{alternative spelling of|ek}}. Any dedicated template would also work in place of {{alternative spelling of}} in that example... now I'm just not sure if including the manuscript's name as a context or after the "alt spelling of _" bit looks better. - -sche (discuss) 21:43, 7 April 2012 (UTC)[reply]
I mean {{normalised}} as in {{slang}} or {{archaic}}, what are those called? I'm thinking about languages where some words' normalised spellings were never used by native speakers or during that period while others were. So that any entry could be either having-been-used-historically or normalised or both and hence would need separate markers to be informative. So I would propose two tags {{normalised (spelling)}} and {{historic (spelling)}} (which would not collide with archaic/obsolete since those are reserved for living languages). Historic would, when not accompanied by normalised, then be followed by something like "see [[normalised form]]". I'm not fond of alternative spelling because historical spellings are no alternative to normalised spellings in a modern academic context and normalised spellings were not applied by native writers. Furthermore: Couldn't we extend the normalisation to modern languages lacking codification/official rules and thus end a good deal of the problems within Low German? Korn (talk) 22:39, 7 April 2012 (UTC)[reply]
We should be recording the forms that are used. Unlike historical languages, the fundamental issue with languages like Low German is that there is no agreement on what spelling to use, and I don't think we should be sticking ourselves into the issue by choosing one.--Prosfilaes (talk) 09:11, 8 April 2012 (UTC)[reply]
Sometimes what the original spelling is can be debatable. Manuscripts, as the name suggests, are hand-written, so handwriting can be an issue; also, sometimes what one reader interprets as a diacritic, another reader interprets as a smudge or accidental pen stroke. Not sure if this adds much value to this thread or not. Mglovesfun (talk) 23:30, 7 April 2012 (UTC)[reply]
Well, it brings up the question about handwriting again. Or rather what to do with it. (See the Old French ō=on etc.)Korn (talk) 23:47, 7 April 2012 (UTC)[reply]

Why do section links not work?

When I link to Wiktionary:Beer_parlour#Requesting_input_for_extinct_and_other_sparsely_documented_languages and click the link or even paste http://en.wiktionary.org/wiki/Wiktionary:Beer_parlour#Requesting_input_for_extinct_and_other_sparsely_documented_languages into my browser window, my browser first goes to that section and then goes to the bottom of the page. I've noticed that again and again. Am I doing something wrong? Is it my browser? I'm using Chrome on a Mac. BenjaminBarrett12 (talk) 23:16, 7 April 2012 (UTC)[reply]

I'm not entirely certain about this, but I have a suspicion that it's because of #Counting number of articles in a given language in any given Wiktionary. There's a very large collapsing list in that thread, and I wonder whether the problems you're experiencing arise when the browser scrolls down to the appropriate thread before it collapses the list. If the scroll-down happens before the collapse, the page becomes significantly shorter while you're already down a ways, and you end up at the bottom. I've not experienced it on my desktop, but I have experienced it on my Android phone. It's probably prudent to wait until some of the more technically savvy folks chime in before coming to a firm conclusion. -Atelaes λάλει ἐμοί 23:25, 7 April 2012 (UTC)[reply]
I've noticed similar behavior on pages with collapsing lists, even with short collapsing lists. It does seem to depend on how busy the servers are or how slow my connection is. But I'm not tech savvy. DCDuring TALK 23:45, 7 April 2012 (UTC)[reply]
I suspect you're both right (Atelaes, DCDuring); I notice it on pages with lists as well. I don't have Chrome, but in Firefox I can wait until a page has loaded completely (and finished jumping around), click in the URL bar, and press 'enter', at which point it brings up the specified section/anchor. (If I scroll up or down, and then click in the URL bar and press 'enter' again, it brings it back to the specified section then, too. But clicking 'refresh' reloads the page.) This makes me almost certain that the collapsing lists are the cause of the jumping-around. - -sche (discuss) 07:08, 8 April 2012 (UTC)[reply]
Thank you all for the feedback. I notice now that they are working correctly, so perhaps it has to do with the servers or with the length of the list. BenjaminBarrett12 (talk) 09:31, 14 April 2012 (UTC)[reply]

A separate category tree for forms

A few weeks ago, someone suggested at WT:ID#Category of all forms to create a separate category tree for non-lemmas. I do see some merit in the suggestion, as it would make it a bit easier for editors and users alike to keep lemmas and non-lemmas apart. On the other hand, it isn't always clear what is a lemma and what isn't. Participles are a notorious example. —CodeCat 17:17, 8 April 2012 (UTC)[reply]

But "non-lemma" is not a coherent concept, is it? I mean, what does it say about a word that it's not a lemma? —RuakhTALK 20:34, 8 April 2012 (UTC)[reply]
Supposedly, lemmas may be created through derivation, whereas non-lemmas are created through inflection. If something is not a lemma on Wiktionary it means we don't have a self-sufficient definition for the term, but rather link to another term which has the proper definition. Conjugated verb forms and declined forms of nominals would be examples. Participles, verbal nouns and degrees of comparison are a bit of a grey area, as they may have definitions in some languages (such as in Latin) but not in others (English), and often have secondary senses not readily derivable from their status as inflected forms, so that they can be considered both lemmas and non-lemmas at the same time. —CodeCat 21:16, 8 April 2012 (UTC)[reply]
That distinction doesn't hold up. The Romance language verb forms (non-lemmata) derive from Latin verb forms. The choice of the lemma for a word is actually arbitrary, as the plurals and gendered forms of words usually derive from a corresponding form in the parent language. Wiktionary, like other dictionaries, simply chooses one of the various forms to be place-holder for the word in all forms, and that choice, although conventionally consistent, is nonetheless arbitrary. --EncycloPetey (talk) 22:01, 8 April 2012 (UTC)[reply]
Of course it is arbitrary, but since categories on Wiktionary are also arbitrary, there is no harm in allowing our categories to reflect our own internal treatment of terms. —CodeCat 22:11, 8 April 2012 (UTC)[reply]
To what benefit? Non-lemmata may or may not have definitions, and may or may not have translations, and should have quotations and pronunciations. So what difference warrants a separate category structure beyond the purely arbitrary designation of some entries as non-lemmata? --EncycloPetey (talk) 02:57, 9 April 2012 (UTC)[reply]
I've been putting all the non-lemmata in 'non-lemmata' categories for Ancient Greek. My reasoning is that, once we actually get a substantial number of inflected form entries, keeping noun forms in Category:Ancient Greek noun forms helps keep Category:Ancient Greek nouns useable. However, as I look at it, I see that Category:Ancient Greek noun forms is in Category:Noun forms by language, so I guess I'm not entirely certain what's being proposed here. There is some grey area as to what is and is not a lemma entry. We have some "lemma" entries for the comparative forms of adjectives (e.g. πρεσβύτερος), and I've seen them for plurals too, but I don't know if that derails the whole notion of making a distinction between the two. -Atelaes λάλει ἐμοί 11:35, 9 April 2012 (UTC)[reply]
This proposal doesn't intend to solve or change the definition of what we consider lemmas or not. All it is, is taking the categories like Category:Dutch verb forms and moving them from having Category:Dutch verbs as their parent category, to a new to be created category tree. The ambiguity is only in deciding what to do with categories like Category:Latin participles and Category:Latin adjective comparative forms, since it's not clear in which of the two category trees they belong. —CodeCat 12:28, 9 April 2012 (UTC)[reply]

Scottish slang and jargon

See Appendix talk:Glossary of Scottish slang and jargon. I think we're in that annoying middle ground where some Scots terms as used in English are slang or dialect, but in Scots they are just normal words. What can/should we do with this? Equinox 22:16, 8 April 2012 (UTC)[reply]

I think the page name should be changed to afford Scots speech more 'dignity'. The debate as to whether it is a collection of dialects or a language can be contentious, but to relegate it to the level of slang or jargon would seem to debase its status. 'Scottish words and phrases' might be neutral enough... — This unsigned comment was added by 94.193.240.11 (talk).
I think that's not the point Equinox is trying to make. He says that such words are slang or jargon when used in English, but when used in Scots they are just everyday words. It's similar to how words of Spanish origin might become slang in US English - that doesn't make Spanish itself slang! —CodeCat 00:09, 9 April 2012 (UTC)[reply]
Fine, but the Wiktionary article purports to be about Scots terms per se, i.e. as used in Scotland, not about Scots words (of which there are very few) that are used in English outwith Scotland. As you say, in those terms, the words and phrases listed are not slang or jargon in Scottish terms. Their status outwith Scotland is irrelevant. An article about Spanish words and expressions would not describe them as slang simply because some of them are used as such in US English. — This unsigned comment was added by 94.193.240.11 (talk).
I think you have Scots and Scottish English confused. Terms may be acceptable in Scots, but be considered slang or jargon in Scottish English. An article about Spanish words that are used as American slang would describe them as slang, but in the English section only.--Μετάknowledgediscuss/deeds 04:58, 9 April 2012 (UTC)[reply]
I find the whole thing very confusing. When someone says Scottish when relating to languages, I'd usually assume Gaelic. But this is not, so, as I cannot distinguish Scots and Scottish English (if there is a difference), I do not know which this list gives me examples of. Korn (talk) 12:37, 9 April 2012 (UTC)[reply]
I think this has always been the case here. Merging Scots into English entirely has to be a possibility. In the same way we have Category:Serbo-Croatian language and Category:Croatian Serbo-Croatian. How do we decide if these two are separate languages or not? PS recently Filipino got merged into Tagalog, and Moldavian got merged into Romanian, so there are three recent precedents for it. Mglovesfun (talk) 12:42, 9 April 2012 (UTC)[reply]
Moldavian and Romanian never had many differences (the biggest one was that they used different scripts for roughly a century, before they went back to using the same script), and the same is true of Serbian and Croatian. In contrast, Scots has been distinct from English since at least the 16th century. Scots certainly looks similar enough to English that we could probably shoehorn it into English, but it would be as linguistically incorrect as shoehorning Low German into High German. - -sche (discuss) 18:50, 9 April 2012 (UTC)[reply]
I have never heard a proper explanation, as for WHAT makes Scots different from English. Which would be easier for Low German: Other grammar, other syntax. So...maybe if there was a Scotsman here, he could shed some light on this.Korn (talk) 23:46, 9 April 2012 (UTC)[reply]
I'm no Scotchman, but I know the vocabulary and pronunciation varies more than most dialects of English. However, the clincher is different grammar: w: Scots language#Grammar. It's enough to make it count as a language to me, and I think unification with English on Wiktionary would be a major mistake. --Μετάknowledgediscuss/deeds 00:25, 10 April 2012 (UTC)[reply]
Scots is different grammatically, syntactically and in terms of vocab. An exchange like ‘Gonnae no dae that!’ — ‘How no?’ — ‘Jist gonnae no!’ doesn't make sense in English but it's very common in Scotland. The problem is that several generations of Scots have been brought up to believe that the language is ‘slang’ (if my wife said ay or heid at school she was always ‘corrected’ to yes or head) and therefore it has a weird status in its home country where educated people are often embarrassed by it. When the Scottish Parliament released a version of their website in Scots a few years ago I remember it being passed round in forwarded emails among Scottish friends of mine as though it was the funniest thing ever – like a UK Parliament site in Cockney. This is changing now though, as Scottish schools have to include Scots classes and many publishers are bringing out more serious books in the language. For Wiktionary to treat it like the language it is seems no less than we should expect from the site. Ƿidsiþ 08:01, 20 April 2012 (UTC)[reply]

LT Straw Poll

A quick straw poll about using LiquidThreads on a forum like the BP. As far as I can tell, there has never been a definitive community decision on actually using it. Thanks --Μετάknowledgediscuss/deeds 04:54, 9 April 2012 (UTC)[reply]

LT Straw Poll — Support

  1.   Support but I do think it will need some fine-tuning before it can be used. I haven't had any problems with using it on my own talk page but I don't know how it would work on our main discussion pages. Maybe it could be added to a less-frequently used page like WT:ES as a trial? That way we can assess more easily what the biggest problems are and we could ask its creators if they can address them. —CodeCat 12:17, 9 April 2012 (UTC)[reply]

LT Straw Poll — Oppose

  1.   Oppose -Atelaes λάλει ἐμοί 11:27, 9 April 2012 (UTC) Inasmuch as I think the Beer Parlour desperately needs a new format, and inasmuch as I was one of those who initially championed liquid threads.....I have to admit it's gotten on my nerves. Some of its problems include: Having a separate watchlist, in addition to my watchlist. Not having the ability to see one line descriptions of the change(s) at a glance (like my watchlist). Having to check things off or else that little number keeps increasing and berating me for not keeping up with things. -Atelaes λάλει ἐμοί 11:27, 9 April 2012 (UTC)[reply]
  2.   Oppose SemperBlotto (talk) 11:31, 9 April 2012 (UTC) nasty, nasty, nasty![reply]
  3.   Oppose Mglovesfun (talk) 12:43, 9 April 2012 (UTC)[reply]
  4.   Oppose RuakhTALK 13:07, 9 April 2012 (UTC)[reply]
  5.   Oppose --Daniel 13:30, 9 April 2012 (UTC)[reply]
  6.   Oppose EncycloPetey (talk) 13:55, 9 April 2012 (UTC)[reply]
  7.   Oppose DCDuring TALK 13:57, 9 April 2012 (UTC)[reply]
  8.   Oppose. Not being a Luddite but it's horrible. Equinox 14:08, 9 April 2012 (UTC)[reply]
  9.   Oppose. -- Eiríkr ÚtlendiTala við mig 15:25, 9 April 2012 (UTC)[reply]
  10.   Oppose. See my comments at the WT:Information Desk. - -sche (discuss) 18:58, 9 April 2012 (UTC)[reply]
  11.   Oppose Dan Polansky (talk) 19:49, 9 April 2012 (UTC)[reply]
  12.   Oppose JamesjiaoTC 23:04, 14 May 2012 (UTC) I would really like to see a better and more polished forum system for WT. Neither LT nor the current system is good enough in my opinion.[reply]

LT Straw Poll — Comment

How would the use of liquid threads affect javascriptless browsers, such as Lynx? Ungoliant MMDCCLXIV 13:05, 9 April 2012 (UTC)[reply]
In whatever way it would affect those people, it would presumably also affect people who use e.g. NoScript. - -sche (discuss) 18:58, 9 April 2012 (UTC)[reply]

As Atelaes said: BP and other high-volume pages do need a new format. But LT doesn't seem to have many friends here.

Therefore, I suggest sub-pages like the deletion log of the Italian WP. Example: w:it:Wikipedia:Pagine da cancellare/Log/2008 giugno 13. One could create a sub-page for each new discussion, or a sub-page for each new month. These sub-pages could be added to your normal watchlist without flooding it with the high-volume Beer Parlour (or similar pages). After a certain time, one could unlink the sub-pages and link them to some archive index page. The editing and the discussions would no longer take place in the BP or Tea Room themselves, but in their sub-pages; BP and TR would be just a list of included sub-pages.

This would also solve the problem of retrieving old discussions. Currently, moving discussions to an archive page kills all links and you'll have to search the archives.

What do you think? --MaEr (talk) 15:17, 9 April 2012 (UTC)[reply]

I still think we should have some sort of "Wiktionary forum". -- Liliana 16:44, 9 April 2012 (UTC)[reply]
I would absolutely support subpages. All we'd need is something that made it really, really easy to make a new topic, and really easy to convert a regular thread that was incorrectly made into a topic. -Atelaes λάλει ἐμοί 22:25, 9 April 2012 (UTC)[reply]
Wow, I didn't realize how much enmity there is towards LT. I'm really unclear on how the subpage idea would work, but it sounds better than the current mess. --Μετάknowledgediscuss/deeds 00:19, 10 April 2012 (UTC)[reply]
As I imagine it, it would work similar to WT:VOTE, except, if at all possible, easier, and slightly more automated. That way, discussions can survive until resolution, whereas right now, they tend to survive until there's a few engaging discussions beneath them. -Atelaes λάλει ἐμοί 01:19, 10 April 2012 (UTC)[reply]
I'm not sure that's good for the discussion rooms, but a modified version would work well for RFV et al.--Μετάknowledgediscuss/deeds 23:39, 10 April 2012 (UTC)[reply]

Indeed, what I'm thinking about is similar to WT:VOTE (which I have never tried). In this solution, the user should be offered a link to click on, which starts some kind of Javascript, maybe similar to the New Entry Creator by Yair rand. It should prompt the user for a discussion title and the discussion text, like when you add a new section to a discussion page. The discussion title should be the base of the sub-page name. The Javascript adds the name of the parent page to the title and could prepend the current date, so sub-pages don't conflict with each other. For example: "Beer Parlour/2012-04-11 Newbie question", "Tea Room/2012-04-12 Another question" etc. The sub-pages should contain their own edit-links (like in WT:VOTE). Users shouldn't be able to edit the parent page (Beer Parlour, Tea Room, Ety Scriptorium etc) in a direct way; it would be the Javascript that links or inserts the newly created sub-page into the parent page.
Comments? Criticism? --MaEr (talk) 18:32, 11 April 2012 (UTC)[reply]
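The naming step of this proposal can be sketched in a few lines. The proposal itself calls for Javascript running in the browser; the version below is Python and only illustrates how a title of the form "parent page/date topic" might be assembled. The function name and its parameters are hypothetical, and nothing here touches the wiki.

```python
from datetime import date


def subpage_title(parent, topic, day):
    """Build '<parent>/<YYYY-MM-DD> <topic>' for a new discussion sub-page."""
    topic = " ".join(topic.split())  # collapse stray whitespace in the title
    if not topic:
        raise ValueError("a discussion title is required")
    # Prefixing the ISO date keeps same-named topics from colliding
    # as long as they are started on different days.
    return f"{parent}/{day.isoformat()} {topic}"


print(subpage_title("Beer Parlour", "Newbie question", date(2012, 4, 11)))
# Beer Parlour/2012-04-11 Newbie question
```

Two topics with the same title started on the same day would still collide under this scheme, so a real script would presumably need to check whether the page already exists before creating it.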

Sounds intriguing. As somebody has probably mentioned before, why don't we try it out on WT:ES, which gets very little traffic as it is? We can get a feel for how it works on a real discussion page. --Μετάknowledgediscuss/deeds 00:10, 13 April 2012 (UTC)[reply]
Note that WT:ES has an odd (and thus frequently-circumvented/ignored) format already. Also, note that a set-up like WT:VOTE's would require subpages to have unique names, which need to be more than just the word in question: the same string of letters might be sent to WT:RFV or WT:RFD twice. A solution, for those pages, is to include dates in the subpage URL, preferably the way the archives already do (so, WT:RFV/2012-04/word or WT:RFV/2012/04/word).
Indeed, the Etymology Scriptorium has an odd and complicated format. We should try new things there.
How do we proceed?
We need some help from someone with technical knowledge, probably Javascript. Where do we ask? A broadcast in the Grease Pit?
When and where and whom do we ask, before we try this new concept in the Etymology Scriptorium?
--MaEr (talk) 17:10, 13 April 2012 (UTC)[reply]
Yeah, asking for technical help in the Grease Pit is the best next step; then a heads-up about the impending change to WT:ES can be posted in the BP. - -sche (discuss) 19:30, 13 April 2012 (UTC)[reply]
OK, I've created a new discussion: Wiktionary:Grease pit#Sub-pages for high volume discussion pages. --MaEr (talk) 12:13, 15 April 2012 (UTC)[reply]

Moving 'pronunciation' down in ELE

I think it would be nice to make pronunciation one of the last sections of an entry, thereby promoting definitions. An example of an entry with a long pronunciation section (much longer than its etymology section) is contract. Demoting (moving down) pronunciation should be much easier than demoting etymologies, as there are fairly many entries with multiple etymologies, and Wiktionary's entry structure makes part-of-speech headings depend on etymology headings. I know that some entries have several pronunciations, which might cause a problem, yet the contract entry shows how several pronunciations, differentiated by part of speech, can be entered into one pronunciation section. Whatever the case, I think this proposal should be given some serious thought. If you know of a past discussion on the subject, please post a link to it. --Dan Polansky (talk) 10:40, 10 April 2012 (UTC)[reply]

No. Every printed dictionary I have has the pronunciation given as the first thing, before the definition, and it is arguably what most people will be looking for. -- Liliana 11:20, 10 April 2012 (UTC)[reply]
I agree with Liliana. People expect pronunciation info at the top of an entry. Also, homographs are very often also homophones (especially in languages with better regulated spelling systems than English), so in some cases a single pronunciation can be given for a single spelling, even when multiple etymologies are listed. (See for example fundo#Portuguese.) —Angr 11:45, 10 April 2012 (UTC)[reply]
I don’t see the necessity of promoting definitions. I was a Wiktionary user for a few years before I became an editor, and I searched words for their pronunciation and/or etymology as often as (if not more often than) I searched them for their definitions. Ungoliant MMDCCLXIV 14:33, 10 April 2012 (UTC)[reply]
In general, I think we should not have entries with pronunciation sections laid out like contract (pronunciations split by POS but all in one section); I think in such cases we should have entries split by ===Pronunciation 1===, or in this case, by ===Etymology 1===. PS, why is the {{etyl}} template freaking out about {{fro}}'s script? {{fro}}'s script is set, as Latn. - -sche (discuss) 18:15, 10 April 2012 (UTC)[reply]
The code was used wrong; someone had used ofr when it should be fro. —CodeCat 21:17, 10 April 2012 (UTC)[reply]
For the record, I have renamed the section heading of this thread to make it clearer.
I've always thought definitions are the most important content in a dictionary. While dictionaries often do list pronunciation first, their pronunciation information takes one line rather than several; see how compact the pronunciation information is at http://education.yahoo.com/reference/dictionary/entry/cat or at http://www.macmillandictionary.com/dictionary/american/cat. Furthermore, many dictionaries do not show pronunciation at all (including Merriam-Webster online), which suggests pronunciation is less important than definitions, especially to native speakers. Some online dictionaries have etymology below definitions, including Merriam-Webster, Collins and AHD. Thus, I am surprised by the responses above. I admit that I have no data on what content users of Wiktionary search most often, whether definitions, pronunciation or etymologies. What I do know is that Wiktionary has 31,445 entries with an "etymology" section (found using AWB, searching for "==\s*Etymology"), and that it has 30,875 entries with a "pronunciation" section (found using AWB, searching for "==\s*Pronunciation"). Thus, whatever people are searching for on Wiktionary, what they actually find are above all definitions rather than pronunciation and etymologies. On the sample of one consisting of me, definitions are content number one. The interface to Wiktionary ninjawords only shows definitions, suggesting some other people deem definitions the most important thing. I recall DCDuring wanting to promote definitions. --Dan Polansky (talk) 06:50, 13 April 2012 (UTC)[reply]
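For anyone who wants to reproduce that kind of count without AWB, here is a minimal sketch in Python. It reuses the two search patterns quoted above; the `sample_entries` dictionary and the `count_entries_with` helper are made up for illustration and are not part of AWB or of any Wiktionary tool. A real run would presumably read entry text from a database dump instead.

```python
import re

# The same two patterns quoted in the comment above.
ETYMOLOGY = re.compile(r"==\s*Etymology")
PRONUNCIATION = re.compile(r"==\s*Pronunciation")


def count_entries_with(pattern, entries):
    """Count how many entries contain at least one match of `pattern`."""
    return sum(1 for text in entries.values() if pattern.search(text))


# Hypothetical wikitext, invented for this example only.
sample_entries = {
    "cat": "==English==\n===Etymology===\n...\n===Pronunciation===\n...\n===Noun===\n...",
    "contract": "==English==\n=== Etymology ===\n...\n===Pronunciation===\n...",
    "paratransit": "==English==\n===Noun===\n...",
    "mole": "==English==\n===Etymology 1===\n...\n===Etymology 2===\n...",
}

etym_count = count_entries_with(ETYMOLOGY, sample_entries)
pron_count = count_entries_with(PRONUNCIATION, sample_entries)
print(etym_count, pron_count)
```

Note that the pattern counts an entry once however many numbered etymology sections it has, and that `\s*` also catches headers written with a space after the equals signs.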
I've tried (what I think is) the suggested format on a number of entries: kitten, unionized, iron. It is very counter-intuitive to me; it's hard to say if that's just because I've had years to get used to Wiktionary's pron-at-the-top format. I also tried moving etymologies down; that turned out badly (as expected, because of the way Wiktionary uses etymologies to sort homographs): kitten, unionized, iron. (I am aware that the subject of this discussion is pronunciations.) - -sche (discuss) 07:19, 13 April 2012 (UTC)[reply]
I've long thought that there is too much screen space on the initial landing page for longer entries devoted to everything above the inflection line: L2 and L3 headings, lhs ToC, alt forms lists, etymology, and pronunciation. A program for resolving this that is not likely to have too many unforeseen bad consequences would be:
  1. Smaller typefaces for headers
  2. Rhs ToC
  3. Horizontal lists of alternative forms
  4. Fewer cognates or putting cognates and long etymologies under {{rel-top}}
  5. Putting all or long pronunciation sections under {{rel-top}}
This does not involve changing header order or making the relationship between the user-visible format and the edit window context obscure (in the way that the category-page templates do). DCDuring TALK 14:29, 13 April 2012 (UTC)[reply]
I don't think we really need the pronunciation section to take up nearly as much room as it does. The word "Audio" doesn't add much; the "Play" button could really be by itself. Likewise, the "Pronunciation" header itself isn't really useful. Are there situations where someone would think that the section is about something else? The label "IPA:" might be useful, but usually it probably isn't, and chances are it doesn't need to be that large. We don't need to break everything up into many lines. The "Rhymes: -xxx" could be replaced by a small (Rhymes) link or something. At its current size, moving the pronunciation section lower might be an overall improvement, but it doesn't need to be that large in the first place, so... --Yair rand (talk) 19:57, 15 April 2012 (UTC)[reply]
If pronunciation is going to be in a section by itself, it needs a header. What else would we call it but "Pronunciation". The IPA link is useful as it links either to our page on how to interpret the IPA symbols for the specific language or to Wikipedia's article on that language's phonology. —Angr 20:30, 15 April 2012 (UTC)[reply]
Structurally, the data section needs a header; removing the section header is a Bad Idea™. Moving the Pronunciation section down creates logical problems in the data structure as well. And no, we need the (Audio) text for people browsing without images on. There are also instances where "Audio" is replaced by something more specific, such as when there are audio files for different regional dialects or from different historical periods. No one has yet proposed a workable solution that would permit placing Pronunciation in any location other than the current one. --EncycloPetey (talk) 20:34, 15 April 2012 (UTC)[reply]
Labels indicating which dialect a pronunciation is from are necessary, though we often duplicate the same label multiple times, which I don't think is necessary. Why would users browsing without images need the "Audio" text? --Yair rand (talk) 21:01, 17 April 2012 (UTC)[reply]
Also, we don't really need the "(file)" link. The "About this file" link in the More menu works well enough for attribution purposes. The different forms of displaying pronunciation (IPA, SAMPA, enPR) don't really all need to be displayed at once, there could just be an option to change which is displayed by default. (Atelaes made a script to do something like that a while ago (with automatic conversion between forms), if I recall correctly.) --Yair rand (talk) 21:33, 17 April 2012 (UTC)[reply]
The pronunciation section can be made smaller, but I don't know how it can be made much smaller without a loss of information (some dialects, for example), except by adopting DCDuring's collapsible-table idea. - -sche (discuss) 21:23, 17 April 2012 (UTC)[reply]
A new contributor of pronunciations has been experimenting with condensed pron sections on paratransit. Take a look. :) - -sche (discuss) 00:29, 25 April 2012 (UTC)[reply]

Moving 'alternative forms' down in ELE

Wiktionary:Votes/pl-2012-03/Moving "Coordinate terms" up in ELE proposes this ordering of terms:

  • Alternative forms

...

  • Synonyms
  • Antonyms
  • Other allowable -nyms
  • Coordinate terms
  • Derived terms
  • Related terms
  • Translations
  • Descendants

But we still list 'alternative forms' at the top of entries. To me it makes more sense to list it along with other terms that are related in some way to the current term. So I would like to propose modifying it to this:

  • Alternative forms
  • Synonyms
  • Antonyms
  • Other allowable -nyms
  • Coordinate terms
  • Derived terms
  • Related terms
  • Translations
  • Descendants

Seeing as the terms are listed in a general order from 'most closely related semantically' to 'least closely related semantically', the alternative forms section seems 'closer' even than synonyms, so I've placed it above it. —CodeCat 12:22, 10 April 2012 (UTC)[reply]

  1.   Support --Daniel 23:28, 10 April 2012 (UTC)[reply]
    • Generally, yes, I support.
    • Especially when it's only one POS section. But, I'm not sure if I want multiple Alternative forms sections with repeated information, otherwise.
    • Case in point: this revision of present, with multiple etymologies and POS sections. --Daniel 23:28, 10 April 2012 (UTC)[reply]
  2.   Support strongly. The alternative forms are a much less relevant part of the entry than the definition and linguistic information. Mostly the alternative forms either differ by a space/hyphen or are (very) archaic; the AE/BE dualism is the only significant exception from this general rule. (Use {{also}} there?) Obviously, the most important parts of the entry should always be near its beginning, not halfway down the page (as in the case of small screens and multiple alternative forms). -- Gauss (talk) 22:55, 13 April 2012 (UTC)[reply]
    Actually that gave me an idea. Why dedicate a whole section to alternative forms when {{also}} would do, much more compactly? Even if the consensus is opposed, this is worth considering. —CodeCat 23:14, 13 April 2012 (UTC)[reply]
  3.   Support. The definition is the most important thing and it should be as close as possible to the top. There is a somewhat good reason for the etymology to precede the definition, but alternative terms are of low importance and usually not of much interest to the user. —Stephen (Talk) 15:55, 16 April 2012 (UTC)[reply]
I quite like it where it is, at the top. These are different forms of the current word, i.e. basically alternative headwords, and some dictionaries would even place them together before the main content: "color, colour: a thing in a rainbow... blabla..." Equinox 23:32, 10 April 2012 (UTC)[reply]
  Oppose. Because it’s important to know right away whether there is an alternative pondian form. If we had a different header for obsolete and informal forms, I’d support moving it down. Ungoliant MMDCCLXIV 17:03, 11 April 2012 (UTC)[reply]
Yeah, for the record, I   Oppose moving the alt forms. - -sche (discuss) 00:34, 13 April 2012 (UTC)[reply]
I like the idea of moving the section down, but for the US-UK variation. But US-UK (and Canadian etc., yes) variations could be listed on the headword line, I think, like this in color:
color (plural colors) (American)
colour (plural colours) (British, Canadian)
Notice the hyperlink in "colour", which leads the reader to the alternative form entry.
The obsolete forms such as those at knowledge (first 5 of the 30 ones listed: cnaulage, cnoulech, knauleche, knaulege, knaulach) are IMHO not worthy of the prominent place at the top of the language section.
Alternatively, a section "Obsolete forms" could be created, placed somewhere at the bottom of the entry; "alternative forms" would be kept and restricted to current forms. --Dan Polansky (talk) 07:05, 13 April 2012 (UTC)[reply]
I like your suggestion of splitting them between current alternative forms and obsolete alternative forms. And the double-headword arrangement looks quite nice too. But what about extinct languages, whose terms are all obsolete by definition? —CodeCat 12:13, 13 April 2012 (UTC)[reply]

Uh... no vote is needed. We already had a vote that said that when Alternative forms is used as an L4 header, it appears in that location. We generally list the alternative forms first, but sometimes they apply only under one etymology or to one part of speech, in which case they become an L4 header. --EncycloPetey (talk) 20:28, 15 April 2012 (UTC)[reply]

Where is that vote that we already had? --Daniel 13:37, 16 April 2012 (UTC)[reply]

Renaming “context labels”

I'd like to reform the vocabulary surrounding our “context labels.”

{{context}} is used for two different things: grammatical labels and usage labels (also called restricted-usage labels).

The use of the term context is incorrect and misleading. Newb editors often think the label is supposed to represent the context of the referent (labelling cow with “animals”), or even that it's just something vaguely placed in the context of the entry's definition line. But the term context, in corpus-based lexicography, properly refers to something different: a term's context of usage, shown in a quotation. The generic sense of context can be applied to usage labels, but in several different ways, mostly incorrect. And context has nothing to do with our so-called “grammatical context labels.”

I'd like to rename the mechanics of these templates from context to label, and sort them out functionally into grammatical labels and usage labels. This relates directly to the terminology of professional lexicography (for example, s.v. “label” in Hartmann, Dictionary of Lexicography).

Things that would be updated:

Any suggestions, objections, etc.?  Michael Z. 2012-04-11 00:51 z

I don't object, but 'label' is a bit too vague. And also... are those the only two kinds of context labels we have? —CodeCat 01:02, 11 April 2012 (UTC)[reply]
Those are the two categories of labels in dictionaries. (Restricted) usage includes usage restricted to a region, medium, technical subject, period, frequency, social group, formal, slang, etc. See Category:Context labels for the subcategorized list, and Hartmann. Michael Z. 2012-04-11 01:13 z
Which of those would include sense-labels like {{figuratively}}? Figurative usages are not restricted to "figurative contexts" IMHO, but their figurativity is obviously not grammatical information, either. Likewise sense-labels like {{of a|person}}. (But I'm on board with the vocabulary change from "context" to "label".) —RuakhTALK 02:29, 11 April 2012 (UTC)[reply]
Well, figurative and literal are kinds of usage, which can be labelled as such. These have some relation to formal and colloquial or perhaps even folksy speech. I am happy to drop the word context here altogether. Of a person is a restricted usage – the sense only makes sense when used in this restricted way.
Hartmann classifies various axes of usage labels, although he warns that there are no clear boundary lines: period (e.g., archaic/in vogue), attitude (appreciative/derogatory), frequency (basic/rare), contact (borrowing/vernacular), channel (written/spoken), standard (correct/incorrect), register (elevated/intimate), social status (high/demotic), subject (Botany), genre (poetic/conversational), dialect (American). I have seen other classifications, and no one can really define slang.
Incidentally, Hartmann on figurative meaning: “such meanings are sometimes marked with special usage labels ironically termed ‘fig. leaves’ by critics, because they can be said to hide the underlying basic sense.” Michael Z. 2012-04-11 06:04 z
I see. That makes sense. I find "restricted-usage labels" (which you mentioned in a parenthetical note above) to be misleading, but "usage labels" sounds sufficiently broad to me. —RuakhTALK 15:45, 11 April 2012 (UTC)[reply]
I see no advantage, so I'd rather things stayed as they are, but only to avoid the hassle associated with change as opposed to no change. Mglovesfun (talk) 22:10, 16 April 2012 (UTC)[reply]
I would try to take the hassle upon myself, while checking for consensus on substantial changes, esp. for guideline changes. Michael Z. 2012-04-17 20:27 z
I don't object to the basic idea, except that it seems like a lot of upheaval over a relatively minor semantic quibble. Yet more editing conventions we'll all have to relearn... Ƿidsiþ 10:31, 22 April 2012 (UTC)[reply]
If Wiktionary lasts 100 years, I'd rather correct blatantly wrong terminology now than in 90 years. This “upheaval” will be nothing compared to the misunderstandings and wasted effort this situation would continue to cause. Michael Z. 2012-04-22 20:15 z
And how do you feel about doing it twice, once now and again in ninety years? There's no shortage of "relatively minor semantic quibble[s]" to justify future upheavals. :-P   —RuakhTALK 20:44, 22 April 2012 (UTC)[reply]
If the results of the proposed changes are deemed unsatisfactory by the community, I hereby commit to changing everything back, 90 years from today. Michael Z. 2012-04-24 19:08 z

Sarang (talkcontribs) has created {{Commonsrad}}, and would like me to run a bot that will add it to all entries and indices for Chinese radicals (e.g. and Index:Chinese radical/一). If you have an opinion on the subject, or would like to read others' opinions on the subject, please go to Wiktionary talk:About Sinitic languages#{{Commonsrad}}. —RuakhTALK 15:41, 11 April 2012 (UTC)[reply]

(Note: please discuss there, not here! —RuakhTALK 16:59, 11 April 2012 (UTC))[reply]

A new vote for languages with limited documentation

Building on the recent failed vote for endangered languages, I have opened a new vote on languages with limited documentation. The talk page has quite a bit of background information and summaries of some of the issues involved.

The proposal also expands the criteria for inclusion for extinct languages to include usage.

To counteract potential abuse of the single attestation proposal, a provision for each language to maintain a list of excluded sources is included.

I hope this proposal is acceptable as a way to welcome endangered languages and other languages without a strong written tradition. BenjaminBarrett12 (talk) 21:35, 11 April 2012 (UTC)[reply]

Discussion is underway on the talk page about whether this vote should propose a list (which might be long) of specific languages with sparse documentation that would be allowed with fewer citations, or whether this vote should merely allow "those languages listed at [[WT:CFI/some-subpage]]", with separate votes populating that subpage. Your input is solicited here. - -sche (discuss) 03:57, 13 April 2012 (UTC)[reply]

Belatedly pursuant to Wiktionary:Beer parlour archive/2011/October#Trademarks, I've set up the concise page WT:TM, mostly using the language bd suggested at the end of that old BP discussion. Comments, critiques? Should we ultimately vote to make it a policy? Also, should we have a new vote to codify our actual practice, which appears to have come to be different from what Wiktionary:Votes/pl-2007-02/Trademark designations suggested? - -sche (discuss) 04:56, 13 April 2012 (UTC)[reply]

Is a bot auto-creating users?

I have noticed that someone is creating a lot of users with a first name, a surname, and three random letters, e.g. User:WesleyBridgesejm, User:JamesNortonsfo, User:MauriceMclaughlinymo. Equinox 14:25, 13 April 2012 (UTC)[reply]

Still going non-stop. I just blocked one of them (User:ThomasBrucekee) by IP address, which I suppose should put a halt to this (assuming all created from same IP, which I don't have permissions to see). Still has ability to edit talk page. Equinox 18:31, 14 April 2012 (UTC)[reply]
And if that IP-possessor has a virus? we are potentially going to LOSE the user!!!!!! just kidding :D, but we need a captcha security though--Dixtosa 18:41, 14 April 2012 (UTC)
Hasn't worked. Still seeing new ones created. Boo. Equinox 19:11, 14 April 2012 (UTC)[reply]
It's not just here. User accounts following this pattern are being made at Multilingual Wikisource and English Wikisource as well. I didn't see any at English Wikipedia, and I didn't check any other projects. —Angr 19:22, 14 April 2012 (UTC)[reply]
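For anyone wanting to sift the new-user log for these accounts, the pattern described at the top of this thread (first name, surname, three random letters) can be roughly approximated with a regular expression. This is only a sketch: the `SPAM_NAME` pattern and the `looks_auto_created` helper are made up here, and since three trailing lowercase letters cannot be told apart from the end of a real surname, the check will also flag legitimate names made of two capitalised words. It can only produce candidates for a human to look at.

```python
import re

# Capitalised word + capitalised word + at least three further lowercase letters.
# Heuristic only: the "random" suffix is indistinguishable from surname letters.
SPAM_NAME = re.compile(r"^[A-Z][a-z]+[A-Z][a-z]+[a-z]{3}$")


def looks_auto_created(username):
    """Return True if `username` fits the first-name/surname/three-letter shape."""
    return SPAM_NAME.match(username) is not None


# The four account names quoted in this thread.
reported = [
    "WesleyBridgesejm",
    "JamesNortonsfo",
    "MauriceMclaughlinymo",
    "ThomasBrucekee",
]
flagged = [name for name in reported if looks_auto_created(name)]
print(flagged)
```

Names with a single capital, such as Equinox or Mglovesfun, are not flagged; a hypothetical legitimate name like AnnaSmith would be, which is why the output should be treated as a shortlist and not as a block list.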

I nominated this template for deletion WT:RFDO#Template:languagex. But the reason why it exists is because some of our language templates have prefixes and this template is designed to handle those. It has always been a bit of a strange system and slightly misguided in my opinion, so I'd like to discuss removing those prefixes. —CodeCat 12:22, 15 April 2012 (UTC)[reply]

Entries that are translation targets, but violate SoP restriction

It seems you don't have a rule regarding it. So, do we keep them?--Dixtosa 18:20, 15 April 2012 (UTC)

Some we keep, like day after tomorrow, others we don't. There is no rule behind it, and it is decided on a case-by-case basis. -- Liliana 18:22, 15 April 2012 (UTC)[reply]
No, no. I meant foreign words such as წიგნის მაღაზია which violates SoP, but used in bookshop.--Dixtosa 18:26, 15 April 2012 (UTC)
Oh that. No, we never keep these. Just use {{t|ka|წიგნის}} {{t|ka|მაღაზია}} in translation tables. -- Liliana 18:29, 15 April 2012 (UTC)[reply]
OK. just saw a discrepancy between the admins. --Dixtosa 19:08, 15 April 2012 (UTC)
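As a small illustration of the per-word linking Liliana describes, here is a sketch that builds the translation-table wikitext from a sum-of-parts phrase. The `sop_translation` helper is hypothetical (it is not an existing gadget or bot function), and it simply splits on whitespace, which will not be right for every language.

```python
def sop_translation(lang_code, phrase):
    """Wrap each whitespace-separated word of `phrase` in its own {{t}} template."""
    return " ".join("{{t|%s|%s}}" % (lang_code, word) for word in phrase.split())


# The Georgian example from this thread.
wikitext = sop_translation("ka", "წიგნის მაღაზია")
print(wikitext)
```

For the example above this yields the same wikitext as in Liliana's reply, with each word linked separately instead of the whole phrase.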

Regarding Georgian verbs

Well, because I received so many replies saying "Oh yeah, of course", "OMG why didn't we discuss that before?", etc., I have decided to move the discussion here to talk in depth :D. OK, if I get no replies here either, I will:

  1. establish, I understand how strange it may sound, 8 forms of third-person present (or sometimes future) and verbal nouns as lemmas, this means we will have to define at most 9 forms.
GED has the same rule (in fact I'm talking about GED's rules :D)
one of the main reasons why we should not use only a verbal noun as a lemma is that not all verbs have a verbal noun. another is that verbal nouns don't express any features of Georgian verbs except aspect and the meaning :D
similarly, I use future tense because not all verbs have a present form.
Choosing a verb form as a lemma makes things easier in many respects; for example, it is sometimes too artificial to transform a common Georgian proverb (the verbal part of which is a verb form) into a proverb with a verbal noun in it.
  2. create ka-verbal _noun
  3. create ka-verb_link and link all possible lemmas from a verbal noun article.
  4. make use of ka-verb for lemmas, but as opposed to la-verb, ka-verb will also add //category:verb form//

In a nutshell, I'll copy the contents from GED first :D (preserving even the way of defining). While copying (:D) I/we may encounter a problem (less probable though) or a better way may strike on our heads, but that wouldn't be a problem (using our lovely bots :D).

legend: GED -> Explanatory dictionary of the Georgian language -< The most comprehensive dictionary ever made--Dixtosa 19:08, 15 April 2012 (UTC)

Buryat

Per this back-and-forth, should we consider all Buryats to be {{bua}}, or distinguish them? Current policy, as noted at WT:RFM#Template:bxr, is to consider all Buryats to be {{bua}}, but I don't know what sort of discussion preceded that policy. - -sche (discuss) 19:46, 15 April 2012 (UTC)[reply]

Note that Ethnologue splits up Buryat into {{bxr}}, {{bxm}} and {{bxu}}. This is no different than British English vs. American English (even less actually, as there is no difference at all apart from loanwords), and I see no need to differentiate. -- Liliana 04:37, 16 April 2012 (UTC)[reply]
They supposedly have different written standards and are sometimes written in different scripts, though we could certainly combine them and still distinguish standards (as we do UK vs US in English) and scripts (as we do Syriac and Hebrew in Aramaic, etc). - -sche (discuss) 18:52, 16 April 2012 (UTC)[reply]
I have yet to see Buryat written in anything other than Cyrillic. -- Liliana 20:32, 17 April 2012 (UTC)[reply]
It appears that this page has a photo proving that there is at least some usage in other scripts: [1]. --Μετάknowledgediscuss/deeds 04:56, 21 April 2012 (UTC)[reply]

emo ety

Our ety says that emo was a contraction of emocore. Can anyone confirm or deny this? I am aware of all Internet traditions the term "emotional hardcore", which "emocore" abbreviates, but I'm not sure about the order in which they arose. However, I am used to any given genre X getting an Xcore later in its life, e.g. funkcore, rapcore, skacore, so it makes me suspicious. Equinox 23:50, 15 April 2012 (UTC)[reply]

I always thought it was from emotional, where like you say emocore is a later coinage. I can't back that up with evidence, though. Mglovesfun (talk) 11:51, 19 April 2012 (UTC)[reply]

Deleting "his" in CFI

Along the lines of Wiktionary:CFI#Pronouns, it seems that the sentence:

A person defending a disputed spelling should be prepared to support his view with references

in the WT:CFI should be reworded something like:

A person defending a disputed spelling should be prepared to provide references for support

It looks like a vote would be required. Is there anything controversial to this change? --BenjaminBarrett12 (talk) 18:04, 16 April 2012 (UTC)[reply]

Gender-neutrality for the win! I don't think this is controversial enough to merit a vote, but our vote on not voting hasn't closed yet... I suppose that means we do have to vote on this, or wait until that vote closes. - -sche (discuss) 18:11, 16 April 2012 (UTC)[reply]
Support, obviously. —CodeCat 18:32, 16 April 2012 (UTC)[reply]
That looks better to me too. Equinox 18:56, 16 April 2012 (UTC)[reply]
It would be a strange world if I didn't give my utmost support on this. -- Liliana 19:37, 16 April 2012 (UTC)[reply]
I didn't understand the exact intention of that vote before. But if it passes, and it looks like it will, this can perhaps be the first application :) --BenjaminBarrett12 (talk) 21:29, 16 April 2012 (UTC)[reply]
Considering I've never noticed that before, I guess I didn't read CFI as carefully as I thought I did! Anyway, it'll be a surprise if any of the closet sexists speak up.--Μετάknowledgediscuss/deeds 14:22, 17 April 2012 (UTC)[reply]
If people are willing to describe themselves as "grammar Nazis" — and they are — then why not grammar sexists? —RuakhTALK 20:39, 17 April 2012 (UTC)[reply]
...because sexists aren't known for wearing cool leather boots and crisp uniforms, or for talking in silly accents. --EncycloPetey (talk) 06:23, 18 April 2012 (UTC)[reply]
"Nazi" is just a metaphor for "extreme pedant". What would "sexist" be a metaphor for? Equinox 21:17, 18 April 2012 (UTC)[reply]
Extreme traditionalist? (Though I'm not sure I agree that "Nazi" means "extreme pedant", and if it does, I sure don't see why. I mean, were the Nazis noted for their pedantry?) —RuakhTALK 21:25, 18 April 2012 (UTC)[reply]
Well, they both aggressively stifle dissent. A guy at work who constantly picks on the smallest misplaced comma etc. in documents and CVs came and talked to me about "a criteria" and I wanted to slap his hypocritical face. But he's leaving in two weeks so allowances must be made. Equinox 21:31, 18 April 2012 (UTC)[reply]

Done. (but, to be exact, that sentence was found at WT:CFI#Spellings, not WT:CFI#Pronouns) --Daniel 12:07, 28 April 2012 (UTC)[reply]

Great, thank you! The reference to the pronouns section was that the CFI itself says that "his" should not be used. That's what struck me as particularly odd about the CFI using "his." --BenjaminBarrett12 (talk) 15:44, 28 April 2012 (UTC)[reply]

Featured entries

As I wrote in Talk:háček#Featured entry?:
With ninety-five supporting quotations for its various forms, háček is very probably this project's best-attested lexeme. I've never seen an entry with pronunciatory transcriptions for as many accents and/or speech standards, as many attested synonyms, or as many supporting references. It has a full etymology, going back to Proto forms, and includes parallel formations and cognates; moreover, it has fourteen attested variant spellings, an illustrative image, three derived terms, two lists of coördinate terms, translations into twenty-one languages, two external links, and a fair few exegetic notes. Unlike Wikipedia, we don't have "featured articles" (though our equivalents here would be "featured entries"), but I think we should, because this would allow us to draw attention to our lexicographically best entries, which would in turn function as beaux idéals to inspire more entries of such calibre.
So, what do y'all say? Should we have featured entries? and does háček make the grade? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:33, 17 April 2012 (UTC)[reply]

I've been thinking about it before, but I am not sure how it would differ from the Word of the day, which has comparable requirements. -- Liliana 22:43, 17 April 2012 (UTC)[reply]
The main criterion for a given word's selection as a word of the day is "exotic usefulness". Very often, a word will be selected to be a word of the day whose entry is pretty barebones — featuring only the minimum of proper formatting — it matters only that the definition(s) be interesting. By contrast, the criterion I have in mind for an entry's selection as a featured entry is "completeness". Besides more translations, I can't really think of anything left that could be added to our English entry for háček; that's why I think it should be a featured entry. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:45, 23 April 2012 (UTC)[reply]
I've been detailing the many pronunciations of [[pecan]], Liliana has filled out the translations of [[water]] and Beobach filled up the semantic relations of [[iron]] (the entry also has full trans tables) and worked on [[line]] (sorry if I've missed anyone else's greatly-improved entries)... and [[mole]] has the potential to be turned into a model multiple-etymology entry... the problem in having a Wikipedia-like featuring of entries is that we would run out of such massively-detailed entries, because we have only a few. We could and should, of course, highlight them as models — perhaps in the welcome message? on the pages that deal with translations, pronunciations, etc? - -sche (discuss) 23:13, 17 April 2012 (UTC)[reply]
Yes, you get the idea. (It's great to see such detailed entries — very enheartening!) I don't think we need Wikipedia's bumph — just a category would be fine, linked to by a little bronze star just under the language name. Wikipedia gives featured status to ~1‰ of its articles; I think we should be more exacting, and aim for something closer to 1‱. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:45, 23 April 2012 (UTC)[reply]
The French Wiktionary tried tagging "entries of quality", but the project ground rapidly to a halt. We don't have the kind of group efforts going into entries that Wikipedia has, in part because of our extremely small regularly contributing staff, each member of which has a particular sphere of knowledge that often does not overlap with others. We also run into the issue of whether non-English entries could qualify, and then it becomes a swamping effort in a few languages, with no qualified editors to review the nominations. What I did try to start was a "model entries" effort, but that was designed to include a very small number of words to act as models, which meant that they needed to meet certain a priori criteria, such as few PoS sections, breadth of translatability, simplicity and universality of the senses defined, etc. Many excellent entries on Wiktionary would not be useful as models because of the entry's overwhelming complexity. We've discussed the idea of "featured" entries before, and I still haven't heard a viable proposal that makes sense to bother trying. Unlike Wikipedia, we can't feature an "interesting" topic; we'd be featuring thorough dictionary entries, which do not lend themselves to casual reading. --EncycloPetey (talk) 06:21, 18 April 2012 (UTC)[reply]
How about the criterion of "completeness" that I mention above (contemporaneous with this post, in response to Liliana)? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:45, 23 April 2012 (UTC)[reply]
I'm not sure encouraging massively detailed entries is a plus for Wiktionary. It would be much more valuable to have the systematic coverage of, say, Webster's Unabridged 3ed, than to have a small number of massively detailed entries.--Prosfilaes (talk) 23:39, 18 April 2012 (UTC)[reply]
I agree that having comprehensive lexica is a greater priority than accomplishing the exhaustive analysis of a few words, but there's no reason that exhibiting such detailed entries should lead to a diminution of efforts to include stubby entries. Some people prefer adding the kind of "barebones" entry I describe above (comprising a language header, a POS header, a headword line, and a definition or small group of definitions), whilst others prefer to flesh out entries until they're really meaty (carnal metaphor aside, you get my meaning). Let a thousand flowers bloom and all that. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:45, 23 April 2012 (UTC)[reply]

Discoverability for cites/quotes

As noted at Wiktionary:Feedback#verisimilitude, it's not always obvious to WT users where to find sample uses. When citations exist for an entry, should we have some way of transcluding them right into the entry itself, probably under an L3/L4 header? Or if that's too technically messy, perhaps we could add an L3/L4 header for quotations, link to the Citations page there, and explain to users that they should click through to see quotations? -- Eiríkr ÚtlendiTala við mig 06:43, 18 April 2012 (UTC)[reply]

Many articles do this with a L4 Quotations header followed by {{seecites}} or {{seemorecites}}. ~ Robin (talk) 08:11, 18 April 2012 (UTC)[reply]
Aha! Thank you, Robin. Aaand, clearly I can be an idiot, as the verisimilitude entry already uses that. <facepalm/> Thank you nonetheless, I now know what to use in any entries I myself edit, and this will stick in my mind.  :) -- Cheers, Eiríkr ÚtlendiTala við mig 14:31, 18 April 2012 (UTC)[reply]

This part of this page should be very much expanded. Meanwhile, this part can be entirely (and at the moment is already partially) handled by WT:Languages#Exceptional_codes_for_varieties.2C_for_etymologies. - -sche (discuss) 03:22, 19 April 2012 (UTC)[reply]

There was some discussion to split WT:LANGCODE and WT:LANGNAME into WT:Languages, WT:Families and WT:Dialects. (However, I don't know whether "Dialects" is a 100% accurate name; would "Language varieties" be better?)
I think the status quo is better. WT:Languages already has too much to see. It is taller than CFI or ELE. I'd like to do the exact opposite of what you said: I'd like to move WT:Languages#Exceptional_codes_for_varieties.2C_for_etymologies to Wiktionary:Dialects#Dialects_in_etymologies. --Daniel 11:45, 19 April 2012 (UTC)[reply]
Alright, I've merged the WT:Languages bit into WT:Dialects. (Btw, when I typed WT:LANG, I noticed that we have an out-of-date List of languages there... which lacks many languages which are listed in [[water]].) As for what to call WT:Dialects: it depends; do we want to handle this sort of thing on that page, or on a separate page? - -sche (discuss) 20:04, 19 April 2012 (UTC)[reply]

Deletion review

Had someone come in and wipe an extant (albeit infrequent) entry. There's nothing simple here on the left or in a WP:# namespace for me to check, so what's the procedure for deletion review over here? or the location of the appropriate admin-issues FAQ? Kindly reply to my talk page. LlywelynII (talk) 15:29, 19 April 2012 (UTC)[reply]

Someone responded there; for the benefit of future readers, I'll note here also that deletion review is at [[WT:RFD]] (main-namespace entries) and [[WT:RFDO]] (other things).​—msh210 (talk) 20:29, 19 April 2012 (UTC)[reply]

Requiring entries to use e.g. {{context|medicine}} instead of e.g. {{medicine}}

Discussions at Wiktionary:Grease pit have led some editors to suggest that we require entries to use e.g. {{context|medicine}} rather than e.g. {{medicine}}.

Some reasons for this suggestion:

  • Some editors feel that part of the complexity of {{context}}'s implementation is due to support for entries' using {{medicine}} directly. (I don't really agree with this view, personally, but it seems to have been the original motivation for the suggestion, hence my mentioning it.)
  • This would allow sense labels to be searchable. Right now a Wiktionary search for "slang" + "forgetful" won't find [[spacer]], because that entry uses {{slang}} directly; but it would find it if the entry used {{context|slang}} instead.
  • Currently, we have some inconsistency in that {{context|medicine}} and {{medicine}} are equivalent but {{context|law}} and {{law}} are not (because the latter is a language template rather than a context template), and {{context|obstetrics}} and {{obstetrics}} are not (because the latter doesn't exist).
  • There are some downsides to the fact that {{context|medicine}} invokes {{medicine}} rather than (say) {{context/medicine}}; for example, the existence of {{law}} as a language template means that {{context|law}} can't be used for legal entries (we've had to require that {{context|legal}} be used instead; {{law}} can't even be a redirect).
    • Technically this is a somewhat separate issue, in that forbidding {{medicine}} from being called directly is not equivalent to having {{context}} call {{context/medicine}}; and indeed, the downsides that I mentioned could be addressed by having {{context|medicine}} invoke {{context/medicine}} while keeping {{medicine}} as syntactic sugar for {{context|medicine}}; but some editors seem to be opposed to the idea of having two templates for every sense label.
  • This would make it easier for consumers of our content (mirrors, applications, and whatnot) to recognize sense labels.
  • This would make it easier for bots to recognize sense labels. (For example, a bot that recognizes {{context|...}} in non-English sections and adds lang=xx would no longer have to search for and recognize every single context template.)
  • This would arguably be more sensible when multiple unrelated sense labels appear together; for example, if you think about it, it's rather bizarre to combine {{rare}} and {{medicine}} as {{rare|medicine}} (where {{rare}} is the template and medicine is an argument to it), whereas {{context|rare|medicine}} (where both are arguments to {{context}}) makes more sense.
    • This could, of course, be treated more narrowly: we could allow {{context|rare}} to be written as {{rare}} when that's the only label, while forbidding something like {{rare|medicine}}.
  • Currently, every context template more or less needs to support every parameter to {{context}}; for example, the fact that {{medicine}} doesn't support script= means that {{medicine|literary}} doesn't, either, even though {{context|literary}} does.
    • Granted, merely requiring {{context|medicine}} wouldn't solve this problem, because currently, {{context|medicine|literary}} also depends on {{medicine}} to pass parameters to {{literary}}; that, however, can be fixed, whereas this cannot.
    • And of course, this too would be addressed if we forbade {{medicine|literary}} but allowed bare {{medicine}}.
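
The point above about bots and content consumers recognizing sense labels can be sketched roughly as follows. This is only an illustration, assuming every label is written as an argument to {{context}}; the function name `sense_labels` and the sample wikitext are hypothetical, and the pattern does not handle nested templates.

```python
import re

# Matches a single, non-nested {{context|...}} call and captures its arguments.
CONTEXT_RE = re.compile(r"\{\{context\|([^{}]*)\}\}")

def sense_labels(wikitext):
    """Collect positional arguments of every {{context|...}} call.

    Named parameters such as lang=fr are skipped, since they are
    not labels. (Hypothetical helper, for illustration only.)
    """
    labels = []
    for match in CONTEXT_RE.finditer(wikitext):
        for arg in match.group(1).split("|"):
            arg = arg.strip()
            if arg and "=" not in arg:
                labels.append(arg)
    return labels

print(sense_labels("# {{context|rare|medicine|lang=fr}} some definition"))
print(sense_labels("# {{slang}} a forgetful person"))
```

With a single wrapper template, one pattern is enough; a bare {{slang}} is invisible to it, which is the searchability gap described in the list.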

If we do make this change, we could do it pretty gradually; it wouldn't have to happen overnight. For example, we could use these steps:

  1. Create a one-time bot to convert existing instances of {{medicine}} to {{context|medicine}}.
  2. Create a long-running bot, à la AutoFormat, that would convert new instances, until people get used to the idea.
  3. Eventually modify {{medicine}} to call attention to itself on preview, and start sending polite notices to people who keep using it.
  4. At some point during this process, modify {{context}} to use e.g. {{context/medicine}} rather than {{medicine}}.
  5. Eventually modify {{medicine}} to call attention to itself even not on preview, e.g. adding a cleanup category rather than a sense label.
  6. Eventually delete {{medicine}}.

RuakhTALK 20:45, 20 April 2012 (UTC)[reply]
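
As a rough illustration of step 1 in the list above, a one-time conversion pass might look something like this. It is a minimal sketch, not an actual bot: `CONTEXT_LABELS` is a hypothetical sample of label names (a real run would need the full list of context templates), and the pattern only handles non-nested template calls.

```python
import re

# Hypothetical sample of label template names; a real bot would need the full list.
CONTEXT_LABELS = {"medicine", "slang", "rare", "literary", "legal"}

# A non-nested template call: name, then optional |-separated arguments.
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)((?:\|[^{}]*)?)\}\}")

def convert_labels(wikitext, labels=CONTEXT_LABELS):
    """Rewrite {{label|args}} as {{context|label|args}} for known labels."""
    def repl(match):
        name = match.group(1).strip()
        if name not in labels:
            # Not a known label (e.g. {{law}}, or {{context}} itself): leave as-is.
            return match.group(0)
        return "{{context|" + name + match.group(2) + "}}"
    return TEMPLATE_RE.sub(repl, wikitext)

print(convert_labels("# {{medicine}} A drug."))
print(convert_labels("# {{rare|medicine}} A drug."))
```

Templates outside the label set are left untouched, so a language template such as {{law}} would not be converted by mistake as long as it is kept out of the list.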

I for one absolutely support putting all context labels in {{context}}, for all the reasons stated above. I also support having a bot convert all existing context labels and maintaining the new format. -Atelaes λάλει ἐμοί 21:34, 20 April 2012 (UTC)[reply]
  Support. However, I am profoundly confused about the historical background for why {{context|law}} doesn't work. The first arg to {{context}} is apparently checked to see if there's a template by that name. This is based on the frighteningly spaghettified cross-calling that this discussion ostensibly seeks to simplify. If we scrap this difficult-to-understand and difficult-to-maintain cross-calling and only look at {{context}} alone (ideally, from my perspective, even going so far as to expressly disallow and remove specific labeling templates such as {{medicine}}), there would be no need for this template check, and {{context|law}} (and any other context labels that also happen to be language labels or the names of any other templates) would then work as expected. (As an aside: I suspect we'll go through this again once we have Lua functionality -- much of this will be infinitely simpler once we have a proper programming language to play with.) -- Eiríkr ÚtlendiTala við mig 22:16, 20 April 2012 (UTC)[reply]
No, sorry, you're being a bit too optimistic. {{context}} will still have to check to see if its arguments have corresponding templates, because those templates are the only way, short of some sort of terrible #switch:, to treat each argument in its own way (adding entry to the right grammatical category or the right topical category or no category, linking to the right entry or the right glossary entry or nothing, etc.). In this respect, the problem with the current approach is not that it requires a template for each more-than-just-text label, and not that it checks for that template, but rather, that it uses a conflict-prone naming convention (namely: none), and that it checks for conflicts by doing horrible things. —RuakhTALK 22:41, 20 April 2012 (UTC)[reply]
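
The naming-convention point can be illustrated with a toy lookup. This is a sketch under stated assumptions: the `TEMPLATES` dictionary is a hypothetical stand-in for the set of existing template pages, and the "context/" prefix follows the {{context/medicine}} naming floated earlier in this thread.

```python
# Hypothetical stand-in for existing template pages and what each one is.
TEMPLATES = {
    "law": "language template",
    "context/law": "sense label: law",
    "context/medicine": "sense label: medicine",
}

def resolve_label(label, templates=TEMPLATES):
    """Look a label up under the 'context/' prefix; fall back to plain text.

    Because only prefixed names are consulted, an unrelated template
    that happens to share the bare name (here 'law') cannot collide.
    """
    key = "context/" + label
    if key in templates:
        return templates[key]
    return "plain text: " + label

print(resolve_label("law"))
print(resolve_label("obstetrics"))
```

The per-label template check is still there, as noted above; only the names being checked change.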
Aha, thank you Ruakh.
(Extraneous content deleted, will repost later to the relevant WT:GP thread, sorry for the confusion. -- Eiríkr ÚtlendiTala við mig 00:14, 21 April 2012 (UTC))[reply]
Support.​—msh210 (talk) 06:50, 22 April 2012 (UTC)[reply]
Generally it seems quite a sensible thing for a dictionary to avoid {{medicine}} in favor of {{context/medicine}}. I myself was confused, when I first encountered one of those, as to what its purpose was. I don't remember which one it was. But now that I am aware of them, I am not that sure that it's the best idea to completely disband all of these special templates for sense labels like {{legal}} is. For example, I think that {{legal|medicine}} or {{medicine|law}} could convey its purpose through its name and usage rather easily as a label for medical jurisprudence sense. And then there's this laziness factor when typing the templates manually. That said, I don't object to the proposal apart from thinking that the last 2 steps are maybe unnecessary. --BiblbroX дискашн 13:39, 22 April 2012 (UTC)[reply]
Wow, thanks, Ruakh. A great list of reasons to simplify this. I would add to them the simple improvement in understandability by editors of a template that would work like every other template. Removing a layer of unnecessary magic (i.e. opaqueness) is good in itself. Michael Z. 2012-04-22 20:46 z
  • I've already voiced my support, but I do want to point out that Michael's point here is very salient -- as contributors to a dictionary site, we strive to write clear and meaningful entries. Our templates should likewise be clear and meaningful, ideally even from a coding perspective.  :) -- Eiríkr ÚtlendiTala við mig 05:38, 23 April 2012 (UTC)[reply]

Collocations section in entries?

Because of our idiomaticity requirement, we do not have entries for common collocations that have meanings that can be derived from their parts. However, it would be incredibly useful to language learners to be able to find some common collocations and phrases, even if they are not idiomatic. A common question for learners is 'how do I say (phrase) in (language)?' So I would like to propose adding such a section to WT:ELE. I'm not sure whether it would be useful to English entries, but for people who are learning English or have a non-native command, it would still be nice to list them even if we don't give definitions. —CodeCat 23:54, 20 April 2012 (UTC)[reply]

That makes sense to me. I sometimes give such collocations in example sentences (or as example "sentences", if I'm feeling lazy), but giving them their own section would have a number of benefits over that: (1) it would make it less problematic to linkify them if they're reasonably idiomatic, and perhaps to linkify their other component words if not; (2) it would make it less problematic to tag them with qualifiers such as (UK); (3) it would make it less problematic to list multiple variants. I also sometimes give them in usage notes, but that's unwieldy in a different way. —RuakhTALK 00:22, 21 April 2012 (UTC)[reply]
Should it be a section, a use of citation space, another namespace? It seems to me that it could become quite voluminous. DCDuring TALK 01:58, 21 April 2012 (UTC)[reply]
Whichever is chosen, can we use a name most have heard before, like Combinations, to lower the eye-glazing quotient? Chuck Entz (talk) 02:33, 21 April 2012 (UTC)[reply]
It could always be made collapsible, like we do with 'Derived terms' sometimes already. —CodeCat 11:50, 21 April 2012 (UTC)[reply]
My point is that fewer people will take the trouble to figure out what it is (and thus be able to use it) if they're put off by an incomprehensible name. Chuck Entz (talk) 15:38, 21 April 2012 (UTC)[reply]
Sorry, my message was in response to DCDuring. Another name is ok with me but to me 'collocations' is the clearest. —CodeCat 16:14, 21 April 2012 (UTC)[reply]
Personally, I'd be more baffled by "Combinations" (which could mean anything or nothing in this context) than by "Collocations" (which has a clear, specific, familiar meaning). —Angr 19:06, 21 April 2012 (UTC)[reply]
None of the suggested names seem to me to convey to a normal person what we intend. I think we would have to consider using one of "Derived terms", "Related terms", "Idioms", or "Phrases" (possibly modified by a well-known adjective) to avoid the heading being incomprehensible to normal users. That means we would have to redefine one or more of those terms in our own habitual thinking. Which would seem the most natural? I fear that "Phrases" and "Idioms" are too easily confused with the L3 PoS headers (though "Idiom" is not used in English any more). Why not just include the material under "Derived terms", possibly under a subheading? DCDuring TALK 21:54, 21 April 2012 (UTC)[reply]
Aren't these often sense specific? It makes more sense for me to add them under the sense lines as if they were example sentences. — This unsigned comment was added by Nadando (talkcontribs) at 18:34, 22 April 2012 (UTC).[reply]
They're generally sense-specific, yes, but so are synonyms, antonyms, translations, and so on. I'm on board with listing all of those under sense lines (except, perhaps, for translations), but "as if they were example sentences" is not ideal. —RuakhTALK 18:50, 22 April 2012 (UTC)[reply]
How about "common expressions"? --BenjaminBarrett12 (talk) 04:05, 24 April 2012 (UTC)[reply]
Maybe, or "common phrases"? —CodeCat 20:10, 24 April 2012 (UTC)[reply]

Engineering terms

Would these be suitable to add?

Or at the very least used to support claims about engineering use? Sfan00 IMG (talk) 18:14, 22 April 2012 (UTC)[reply]

Under the CFI, only words in permanently archived media can be included in Wiktionary, and at least three such citations are required. "Permanently archived media" is interpreted as basically meaning printed materials, Google Books and Usenet. I think those lists would be great as a resource to check for the words in permanently archived media and then added to Wiktionary. --BenjaminBarrett12 (talk) 03:55, 25 April 2012 (UTC)[reply]
Google Books is really a way of accessing printed materials, it's not a separate medium per se. Mglovesfun (talk) 09:59, 25 April 2012 (UTC)[reply]
Terms that are colloquial/slang and included in such glossaries merit inclusion in an appendix of such terms, possibly together with similar terms that are included, IMO. But we don't want to simply copy copyrighted material, of course. Determining that a term was included in more than one glossary would be a way of adding value to such a copyrighted list. DCDuring TALK 11:42, 25 April 2012 (UTC)[reply]

Middle English cutoff

Currently, WT:AEN (and the ISO) makes 1500 the cutoff before which texts are Middle English and after which they're modern English. This is also the date I've always used, and the one Prosfilaes favored at RFV#tyme; on the other hand, Raifʻhār suggested 1470 in the same RFV and Leasnam suggested 1470 on AnWulf's talk page. Νικα suggested 1475 in the only old discussion I can find. Those thirty years make a difference, as some terms fail RFV because their latest quotations are pre-1500 (and perhaps others pass with quotations from 1475-1499). Has there been more discussion of this cutoff that I'm not finding? Are we content with 1500 (in which case, let this be an announcement that that is our current policy), or would someone like to propose pushing the cutoff back (in which case, do so)? - -sche (discuss) 23:36, 23 April 2012 (UTC)[reply]

Presumably it wouldn't be very smart to treat this too rigidly, whatever year we decide. If a term is in common usage throughout the 1300's and 1400's, and then has a single quote in 1490, it's clearly Middle English. In order for it to be treated as English, it'd have to have some significant usage past the era border, going into at least 1550 or so. — This unsigned comment was added by Atelaes (talkcontribs) at 23:54, 23 April 2012 (UTC).[reply]
I'm a big fan of bright lines; if there's three quotes ten years past the line, then it may be Middle English, but it's clearly also (Early) Modern English. (One quote doesn't meet CFI for Modern English.) I don't see substantial gains from a fuzzy border that offset the loss in clarity and consistency.
The precise year is arbitrary. 1500 has the advantage of making that fact more or less obvious.--Prosfilaes (talk) 06:24, 24 April 2012 (UTC)[reply]
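
The "bright line" reading can be put as a small sketch. It assumes, purely for illustration, that a quotation dated exactly 1500 counts as modern (the discussion does not settle that boundary case), and it takes the three-citation requirement from the CFI as described elsewhere on this page; the function names are hypothetical.

```python
CUTOFF_YEAR = 1500      # cutoff discussed above (WT:AEN, ISO enm)
REQUIRED_QUOTES = 3     # CFI: at least three citations for a living language

def classify(year, cutoff=CUTOFF_YEAR):
    """Assign a quotation to a language section by its year alone."""
    return "Middle English" if year < cutoff else "English"

def attested_as_modern(years, cutoff=CUTOFF_YEAR, required=REQUIRED_QUOTES):
    """True if enough quotations fall on or after the cutoff."""
    return sum(1 for y in years if y >= cutoff) >= required

print(classify(1490))
print(attested_as_modern([1490, 1505, 1508, 1510]))
```

Moving the cutoff to 1470 or 1476 only changes one constant, which is why the choice of year matters for terms whose last quotations fall in that window.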
Cutoffs always have negative side-effects, but 1500 is a date devoid of intrinsic meaning. An oft-cited date is 1476, when the first book was printed in Britain (by w:William Caxton). The diffusion of printing presses quickly set a standard, namely the London dialect, which overtook most local dialects and led to spelling standardization. I doubt that anyone can find an entry that exists only on the merit of those 24 years. --Μετάknowledgediscuss/deeds 23:59, 24 April 2012 (UTC)[reply]
On the other hand, ISO 639-2 enm ends in 1500. That should be our default, and changing from that an overt act. 1476 looks like a major line; 1500 looks like just what it is, an arbitrary line putting the 15th century in Middle English and the 16th century in Modern English. (And why shouldn't it be 1473, when Caxton published the first book in English?)--Prosfilaes (talk) 01:56, 25 April 2012 (UTC)[reply]
I am not going to argue this. Until it makes a tangible difference on this site, the ISO standard (which I was unaware of) is good enough for me. --Μετάknowledgediscuss/deeds 02:07, 25 April 2012 (UTC)[reply]

quotations of Middle English in English entries

Closely related to the question of a cutoff date is the question: should we quote Middle English texts, especially in Middle English form, in ==English== sections? This has come up at Talk:warish.
Raifʻhār and I opined in the RFV of undeadliness (Talk:undeadliness) that books like those quoted in support of [[undeadliness]] constitute ‘translations’ of Middle English texts into English, and can be cited as English uses of terms: but the pre-1500 editions are clearly Middle English, and it seems to me no less inappropriate to quote them in ==English== sections than to quote late Latin texts in Italian entries. I would favor creating ==Middle English== sections to house the Middle English quotations, but we should come to a decision as a community about what to do. - -sche (discuss) 06:22, 24 April 2012 (UTC)[reply]

It's all very well to say that 1500 is a "cutoff", but what's it based on? It's pure convention. Look at the texts from around then, you will see there is a perfect continuity of language across this so-called division. In some cases you will have the ridiculous situation of having volumes I and II of a work under Middle English and volumes III and IV under modern English. Look at any citation-based dictionary and you will see citations stretching back at least to the Middle English period: it's a crucial way of showing, under a given headword, how the language has evolved over time. I have no objections to Middle English entries (though I'm not sure who is working on them), but I do object strongly to excluding these citations from modern English entries as well. Ƿidsiþ 06:29, 24 April 2012 (UTC)[reply]
It's pure convention, but it is what it is. If you have a Middle English cite, put it in a Middle English entry. Since it's an extinct language, you have the one cite needed to support the entry. We don't need copies of a citation under multiple languages. If you can name another multilingual citation-based dictionary, I'd be interested in seeing it.--Prosfilaes (talk) 06:38, 24 April 2012 (UTC)[reply]
The OED? Websters? They all include Middle English. It's normal. Ƿidsiþ 06:58, 24 April 2012 (UTC)[reply]
(1) No, those dictionaries list pre-1500 quotations under English headwords precisely because they don't include Middle English, or Spanish etc... they're monolingual dictionaries of English. Wiktionary, in contrast, includes Spanish and Middle English words, in their own sections.
(2) We don't quote Old English works in ==English== sections (nor in ==Middle English== sections), even though Old English works could illustrate that period in the history of words: we keep that information in ==Old English== sections. We thus already don't show the full history of words in whatever most recent section they're attested in. - -sche (discuss) 07:13, 24 April 2012 (UTC)[reply]
The Merriam-Webster's Unabridged, 3rd was abridged from the second edition by removing all words obsolete in English by 1700, with exceptions for Shakespeare and a few other authors. It's explicitly Modern English only. The OED includes Middle English and Scots, but all under the one banner of English. For these purposes, they're still monolingual.--Prosfilaes (talk) 08:58, 24 April 2012 (UTC)[reply]
Again, it's not because they don't use Middle English headers that they see the need to include Middle English citations under English headwords. It's because this is a crucial part of illustrating a word's history. The OED has links with the Middle English Dictionary and all their entries are linked to that where appropriate. It would be easy for them to defer such citations to that site, but they still keep Middle English under English headwords. Why? Because otherwise words and senses would appear to pop into existence from nowhere. We need to show the history of a word's use. It's the basic requirement of citation-based lexicography. I don't know what the point of a Middle English section is, maybe someone interested in just that period wants to work on it. I am not interested in that, I am interested in the history of English words, and like any good dictionary I want Wiktionary to illustrate that. The situation is nothing like Old English, which had grammatical gender and a case system and was a vastly different language from what came after. The change from Old to Middle English is also marked by a gap in the records, so there is a clean break. None of this is true for Middle English. Perhaps you only work with modern sources, or perhaps you aren't interested in citations at all, I don't know. But I work a lot with texts from the 15th, 16th and 17th centuries and I'm telling you the distinction makes no sense when it comes to citations. English words did not pop into existence in 1500; by 1500, they had already been evolving in certain ways which we need to be able to demonstrate. I am not looking to include all Middle English words routinely under an English header; all I'm saying is that where a modern English word goes back to Middle English, that should be illustrated in the citation evidence. Ƿidsiþ 08:16, 24 April 2012 (UTC)[reply]
Where does this rule apply? Should we cite Livius Andronicus in Italian, French and Romanian entries? When a Modern English word goes back to Middle English, we note that in the etymology, and if you want citations for the ancestor of the Modern English word, you go to the Middle English entry. Yes, it's artificial, but in a dictionary that cites Old English, Middle English, Modern English, and Scots all separately, along with many other languages, that's the consistent way to do it.--Prosfilaes (talk) 08:58, 24 April 2012 (UTC)[reply]
You "note it in the etymology"? I'm sorry, but you come across as someone who has never tried to actually do what you're advocating. What about a word with 50 senses? Do you add 50 separate notes in the etymology to explain which of them were present in Middle English or not? As for which languages this rule applies to, you are exaggerating the difference between Middle and Modern English. Middle English is better thought of as a period of English rather than a separate language, there is a smooth continuum between them. By contrast the transformation from Latin to modern Romance languages is very poorly attested in documents, that is why we can clearly say that there are two separate languages. Ƿidsiþ 09:08, 24 April 2012 (UTC)[reply]
Get a vote to treat Middle English and Modern English as one language. Otherwise, for Wiktionary's purposes, they're two separate languages.--Prosfilaes (talk) 09:52, 24 April 2012 (UTC)[reply]
This has to be arbitrary or else we base what's English or Middle English on personal preference. Anything with Middle English citations only should be Middle English. But I don't per se object to Middle English citations in English entries as long as the term has citations in English too, or it's clearly citable. The thing with shend as a specific example, it contains one Middle English example which contains the head word shende not shend. So it's the wrong language, and not the same spelling. We do have shende which is probably valid in English too, but I don't know that as a fact. Mglovesfun (talk) 10:34, 24 April 2012 (UTC)[reply]
I actually agree with that, ME should only be acceptable when there is also modE citation evidence. I think the problem with many examples is that there aren't enough modern citations, which makes older ones look out of place rather than on a developmental curve. Ƿidsiþ 10:49, 24 April 2012 (UTC)[reply]
I agree with Mglovesfun and Ƿidsiþ: a cutoff is a decent way to decide whether a word counts as "English" as well as "Middle English", but once a word is accepted as English, it makes the most sense for citations to go as far back as the word is attested. (By comparison: entries frequently include citations that are mentions, or that are not durably archived, even though such citations do not justify the existence of an entry. The RFV process is based on citations, but that's not the only thing citations are good for.) —RuakhTALK 13:18, 24 April 2012 (UTC)[reply]
Is this true for all languages? Do we include Latin citations in modern Romance languages?--Prosfilaes (talk) 13:50, 24 April 2012 (UTC)[reply]
I think this is a good use for the citations namespace, no? I see what Prosfilaes is saying, but Latin and Modern Romance languages are clearly distinct, so you've picked a bad example. A better example might be the w:Oaths of Strasbourg, where it's not universally agreed what language this is; Old French, Old Provençal or "Gallo-Romance". It doesn't mean such citations can't be useful anywhere on this wiki. Mglovesfun (talk) 13:54, 24 April 2012 (UTC)[reply]
Are you actually aware of any words that are continuously attested from ancient origins into (say) Modern French, or is this just hypothetical? —RuakhTALK 14:13, 24 April 2012 (UTC)[reply]
To take a different approach... what about modern Danish as compared to the Proto-Norse of the w:Golden Horns of Gallehus? In this case there is an unbroken writing tradition... first in runes, then in Latin writing. —CodeCat 14:23, 24 April 2012 (UTC)[reply]
If someone took the time to track down fifteen centuries of citations for a Modern Danish word, wouldn't that be wonderful? I'm really not seeing the problem here. —RuakhTALK 14:40, 24 April 2012 (UTC)[reply]
Icelandic vs Old Norse is an even better example. The differences between those two languages/eras are not even nearly as marked as the difference between Middle and modern English. - -sche (discuss) 19:40, 24 April 2012 (UTC)[reply]
Inspired by these comments, I've set up User:-sche/ek. Having been accused of POINTing before, I note — one could say, I POINT out — that I created this not in the main namespace, and moreover following an evolution of thought as detailed below in my comment of 19:40, 24 April 2012. - -sche (discuss) 08:57, 25 April 2012 (UTC)[reply]
Wow! :) Pretty awesome. I think you have obviously gone out of your way to make a point here, but in all seriousness, if Icelandic editors find it useful, why not. There is obviously a continuity of usage with this word, and you've shown that rather well, I'd say. Ƿidsiþ 09:44, 25 April 2012 (UTC)[reply]
Awesome indeed! :-)   —RuakhTALK 17:04, 25 April 2012 (UTC)[reply]

We have entries with Middle English headers?

Seems clear that quotations in the entry are to demonstrate usage in the language of the header, while the full list of quotations on [[Citations:]] pages shows the word's history.

We should clarify our inclusion of quotations. I think they should all be listed on the Citations pages, and select ones could also appear in entries. (According to the w:DRY principle of information systems, each should be a page/template of its own, to be transcluded into appropriate places, since many are suitable as examples for more than one entry.) Michael Z. 2012-04-24 15:34 z

I thought of something similar to what Ruakh suggests in his comment of 13:18 24 April 2012 after I signed off yesterday. So, what if we voted/agreed to do that — to say that if a word is attested in modern English, its Middle English history is included in the English section? How much Middle English do we want to include as English, though? I mean: do we want to do away with Middle English sections entirely in those cases, and/or include in the English section any grammatical information (especially when it comes to pronouns), any pronunciation info, etc, or do we want to include only the quotations? Do we want to duplicate the quotations (have them also in the ==Middle English== sections, even when they are in the ==English== sections)? Do we want to modernise the spelling of them in the ==English== sections? And do we want to apply this to other similarly-similar languages (Middle High German vs German, Old Norse vs Icelandic), or only to English? - -sche (discuss) 19:40, 24 April 2012 (UTC)[reply]

Q about redirects

There is a certain class of Japanese nouns that can also be used as verbs. In specific contexts, these can be verbs as-is, whereas in other contexts, they require the auxiliary verb する to impart various kinds of conjugational information. This する is a full verb in its own right, equating more or less to the English do, and thus [noun] + する is essentially an SOP entry.

To avoid SOP-ness, and since these [noun] terms can also act as verbs on their own, Haplology and I have been adding the verb senses to the bare term entries themselves. This raises the question of how to ensure that a Wiktionary user who might not be fluent in Japanese would find these entries if they enter [noun]する into the URL or search bar.

In a thread in the Grease pit (Wiktionary:GREASE#How_the_search_feature_works_.28and_doesn.27t_work.29), Ruakh clued me in to the possibility of using a do-nothing template param that would hold a string intended to generate a search hit. This does work to some extent, but only when a user uses the search feature. Typing this string into the URL fails out, whereas a redirect would work.

Since the [noun]する combination is perforce specific to Japanese, and since a redirect would thus not affect any other language, would other editors be opposed to using redirects from the SOP (but common in English-language teaching materials) [noun]する forms to the [noun] entries? -- Eiríkr ÚtlendiTala við mig 06:25, 24 April 2012 (UTC)[reply]

I support using redirects. We usually don't use redirects, because they're usually inappropriate, but this is a perfect example of when to use redirects. To a limited extent, we already use similar redirects from phrases that are SOP in English to the idiomatic parts thereof (win-win situation → win-win). - -sche (discuss) 06:33, 24 April 2012 (UTC)[reply]
Sounds good to me. —RuakhTALK 13:20, 24 April 2012 (UTC)[reply]

Current votes

--Daniel 18:23, 24 April 2012 (UTC)[reply]

Modern Latin

Due to Latin being an extinct language, one citation is sufficient for each term. However, the posts above about Middle English led me to this question: are modern usages of Latin acceptable? What if a term is only used in medieval texts? Based on how often I see "New Latin", I'd guess there are a fair few of them. I also think we ought to accept them, when tagged as not being Classical.

The issue is really about words that are only found in 20th and 21st century works. I am currently reading Winnie ille Pu and I plan sometime soon to read Harrius Potter et Philosophi Lapis. There are some words in these that are clearly legitimate (like hamaxostichus), but describe things that did not exist before the modern age (in this case, trains). Am I justified in adding them with a citation from such a work? --Μετάknowledgediscuss/deeds 00:15, 25 April 2012 (UTC)[reply]

Speaking of modern Latin... Dux Oppositionis (talkcontribs) has been adding a lot of it, like Lesothum and Cuvaitum, which may need to be RFVed. - -sche (discuss) 00:17, 25 April 2012 (UTC)[reply]
Modern citations for Latin are valid, just like modern citations for (say) English, but only ancient cites (cites from before Latin became extinct) would qualify for the one-cite rule. —RuakhTALK 00:40, 25 April 2012 (UTC)[reply]
That sounds reasonable, but it unfortunately drives us right into the arbitrary date problem that we have with Middle English above. When did Latin truly become extinct? --Μετάknowledgediscuss/deeds 00:45, 25 April 2012 (UTC)[reply]
636, anno Domini. -- Liliana 07:09, 25 April 2012 (UTC)[reply]
I'm almost afraid to ask, but why? The death of Isidore of Seville? —Angr 07:54, 25 April 2012 (UTC)[reply]
Actually, it's based on the Islamic Expansion, which many scholars consider the end of the Ancient Era. More important, however, is that the last contemporary Latin authors lived around the 6th and the beginning of the 7th century, with the language evolving into the precursors of the modern Romance languages after that. -- Liliana 08:02, 25 April 2012 (UTC)[reply]
Sure, but 636 is such a precise date. Why 636 as opposed to "ca. 650" or "ca. 700"? —Angr 08:20, 25 April 2012 (UTC)[reply]
Quoting Ruakh "but only ancient cites [] would qualify for the one-cite rule." Um, I don't think that's the case. It would be an interesting revision though. See Wiktionary talk:Votes/pl-2011-05/Attestation of extinct languages 2 where this issue was raised with no solution. Also 636 seems surprisingly early, were there no native Latin speakers after that date? Mglovesfun (talk) 09:57, 25 April 2012 (UTC)[reply]
Almost as many as there are native Esperanto speakers now. SemperBlotto (talk) 10:01, 25 April 2012 (UTC)[reply]
Latin turned dead soon after the Roman empire was exterminated in the 5th century. Insofar the date of 636 is almost generous. -- Liliana 11:01, 25 April 2012 (UTC)[reply]
I think intentionally or unintentionally, you side stepped my question. Mglovesfun (talk) 11:44, 25 April 2012 (UTC)[reply]
@Mglovesfun: Re: "I don't think that's the case": The criterion is "For terms in extinct languages: usage in at least one contemporaneous source." I take "contemporaneous" to mean "from when the language was not extinct". Do you interpret it differently? —RuakhTALK 12:43, 25 April 2012 (UTC)[reply]
I think we have a problem with definitions here: post-extinction usage in an extinct language is a contradiction similar to "a bachelor's wife". One might think of it as a conlang based on Latin rather than Latin itself, or one might think of it as a resurrected language, like Modern Hebrew, but either way, simple logic says it's not usage in an extinct language for our purposes. Chuck Entz (talk) 13:15, 25 April 2012 (UTC)[reply]
An extinct language is commonly held to be one with no native speakers, not one that's unused. Michael Z. 2012-04-25 13:52 z
Then post-extinction usage of an extinct language isn't a contradiction at all, as that's exactly the situation Latin was in from the 7th to the 18th century or so; Hebrew and Sanskrit have also been in that situation for many centuries of their histories. All three languages (and probably several others as well, but these are the three I can think of off the top of my head) went through long periods where they had a large body of highly skilled users who were actively creating new literature in them, but they served as no one's "please pass the salt" language. In other words, they had no native speakers, but they did have many highly fluent (maybe even "near-native") users. —Angr 14:15, 25 April 2012 (UTC)[reply]
Would Montaigne (1533–1592) count as a native speaker, owing to the fact that Latin was his first language? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:01, 25 April 2012 (UTC)[reply]
Maybe, but it wouldn't change anything in the current discussion since he was very much an isolated case. —Angr 21:54, 25 April 2012 (UTC)[reply]
Would it maybe be useful to treat native and non-native Latin as separate languages? —CodeCat 14:30, 25 April 2012 (UTC)[reply]
By no means. It might be useful to treat Classical-ish Latin (up to ca. 650/700) and Medieval/Modern Latin as separate languages, but even that's pushing it, as the differences between the two are so slight. Certainly as long as ISO doesn't provide separate codes for them we shouldn't try to separate them ourselves. (ISO does provide separate codes for Biblical Hebrew and Modern Hebrew, but for some reason we ignore that. I wouldn't know where to put the centuries of Hebrew literature in between Biblical and Modern anyway.) —Angr 21:54, 25 April 2012 (UTC)[reply]
@Ruakh yes I do interpret it differently, as extinct languages can still be used, even by fluent speakers, so long as they are not native speakers. Hebrew is itself an interesting case as a language that was dead, but according to Wikipedia, is no longer considered dead. Would citations post Middle Ages but pre-revival not count as 'contemporaneous' then? Mglovesfun (talk) 15:36, 25 April 2012 (UTC)[reply]
Re: Your first sentence: Sorry, I don't see what you're getting at. Obviously extinct languages can still be used — this whole discussion is about how we treat modern uses of Latin — and if they couldn't, then there would be no need to specify "contemporaneous". No? You say that you interpret "contemporaneous" differently, but you don't say how you interpret it. I don't think this discussion can proceed any further until you explain that.
Re: Hebrew: we've decided to treat all forms of Hebrew as a single language, so it's not extinct. If we decided instead to treat Ancient Hebrew and Modern Hebrew as separate languages (as Ethnologue does), then we'd have to figure out how we distinguish them, and terms like "extinct" and "contemporaneous" would presumably fall out naturally from that.
RuakhTALK 15:51, 25 April 2012 (UTC)[reply]
Essentially my argument is that your interpretation of contemporaneous is just that - your personal interpretation without the vote giving any indication that that's what it means. Quite simply I'm not interpreting it in the same way. There really is nothing to explain. Contemporaneous says "Existing or created in the same period of time" which you interpret in this context as referring only to living language. Imagine it not only referring to living languages, and you're there. Mglovesfun (talk) 20:42, 25 April 2012 (UTC)[reply]
Ah, so you interpret "usage in at least one contemporaneous source" as meaning the same as "usage in at least one source". In that case, I think you are wrong to say that the vote gives no indication: on the contrary, I think the fact that the vote includes the word "contemporaneous" is, in itself, an indication that "contemporaneous" is to be read as having some relevant meaning. —RuakhTALK 21:16, 25 April 2012 (UTC)[reply]
(More generally — whenever any halfway-intelligent person espouses any interpretation about anything, it's because (s)he thinks that that interpretation is indicated by that thing. That's the whole point of interpretation. It's not "make up something related", it's "figure out what something means in its context". In general, to rebut an interpretation, you need to provide either a reason to think that it's wrong, or else a reason to think that some other interpretation is as valid or more so. Describing an interpretation as a "personal interpretation", without offering any alternative, is pretty useless, except that sometimes it can be a decent way to infuriate someone. And hopefully infuriation is not your goal, because this is not one of those times. I'm about to go on vacation for a week and a half, so am nigh uninfuriable at the moment. :-)   —RuakhTALK 21:28, 25 April 2012 (UTC))[reply]
No, you're trying to complicate something simple. The way you're interpreting contemporaneous in this context is without any support from the vote or our definition of contemporaneous. Simply don't do it, and you're there. All the debating in the world won't change that. Mglovesfun (talk) 21:31, 25 April 2012 (UTC)[reply]
You're up to your old trick of interpreting votes in the way that most suits you instead of the intention of the original vote. In some cases, like this one, going beyond what's even possible from the wording. What's happened is you've mentally added something to the vote, and clearly, the rest of us can't see it because it's in your mind. If you remove that, then you end up with the same version of the vote as everyone else. That's why I don't need to read your arguments, all the talking in the world won't change the text of the vote which is the only way your arguments can have basis. So can you just act in good faith and interpret the vote as it was meant, please? Play nice, please? Mglovesfun (talk) 21:43, 25 April 2012 (UTC)[reply]
I'm sorry, I don't see how Ruakh's interpretation of "contemporaneous" differs from the dictionary's definition or the usual common interpretation of the word. —Angr 21:54, 25 April 2012 (UTC)[reply]
I'm with Ruakh and Angr on this one. Contemporaneous has to mean something, otherwise it would not have been placed so prominently in the vote's wording. I.e. it has to be interpreted, somehow. Ruakh's interpretation matches up with mine. If you have an alternative interpretation, then by all means present it. It's possible that people were supporting different things, and didn't realize it. However, simply not interpreting the word is not possible. -Atelaes λάλει ἐμοί 22:21, 25 April 2012 (UTC)[reply]
We have to interpret votes according to current consensus. It's not really fair to claim that something particular was intended, because it may be that each writer's and each voter's intent was somewhat different. I would also claim with caution that each word carries intent, because I've seen many poorly-worded votes and guidelines that say something contrary to the intent of particular authors who wrote them. In the end we apply votes and guidelines as the community which is working on the dictionary today, and hopefully we clarify or improve our guidelines as we go. Michael Z. 2012-04-25 23:01 z

Having waxed philosophical, I have done some actual reading, and would like to point out that the “contemporaneous” wording was inherited from another proposal for inclusion of dead languages. If you haven't already, please have a look at the proposal and talk page for WT:Votes/pl-2011-05/Attestation of extinct languages. Michael Z. 2012-04-26 01:24 z

Indeed, and that proposal made this explicit: “The restriction to contemporaneous sources is meant to exclude reconstructed terms listed in modern dictionaries. It also serves to exclude modern printed texts written in ancient languages.” —RuakhTALK 18:06, 3 June 2012 (UTC)[reply]

Well, here is (probably) the first application of this rule: hamaxostichus, fully cited with 20th-century cites. As for the leader of the opposition (User:Dux Oppositionis), most of what they've been adding is impossible to cite, but it's all real material and in good faith, so I don't feel like pursuing it. --Μετάknowledgediscuss/deeds 05:29, 28 April 2012 (UTC)[reply]

I've done a bit of what I consider to be tidying to the entry. --EncycloPetey (talk) 19:02, 6 May 2012 (UTC)[reply]

AWB Access for Pronunciations

I would like to use AWB for two main purposes:

  1. To generate lists of words that do not have pronunciations
  2. To convert pages that use both Template:audio and Template:IPA to use only Template:audio-IPA

I generally won't be making many edits using AWB, but I can make a bot account if desired. Could an admin add me to the check page, please?

--Gabriel Sjöberg (talk) 15:29, 25 April 2012 (UTC)[reply]
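As a rough illustration of the two tasks listed above, the page checks could be sketched as follows. This is plain Python rather than AWB itself, the function names are hypothetical, and it assumes only that entries mark pronunciation with a "Pronunciation" heading and invoke the templates as `{{audio|...}}` and `{{IPA|...}}`; the actual conversion to Template:audio-IPA is deliberately left out because its parameters are not described here.

```python
import re

# Assumption: a pronunciation section is introduced by a level-3 (or deeper)
# heading whose text is exactly "Pronunciation".
PRONUNCIATION_HEADING = re.compile(
    r"^={3,}\s*Pronunciation\s*={3,}\s*$", re.MULTILINE
)


def lacks_pronunciation(wikitext: str) -> bool:
    """Task 1: True if the page has no Pronunciation heading at all."""
    return PRONUNCIATION_HEADING.search(wikitext) is None


def uses_audio_and_ipa(wikitext: str) -> bool:
    """Task 2: True if the page calls both {{audio|...}} and {{IPA|...}}.

    Matching on "{{audio|" avoids counting pages already using {{audio-IPA|...}}.
    """
    return "{{audio|" in wikitext and "{{IPA|" in wikitext
```

Pages flagged by the second check would still need a human look before conversion, since one transcription may correspond to several recordings.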

Well the first one wouldn't require any edits, would you need approval for that? Can't you log in but not edit? Second one, not so sure, I'm not a big fan of {{audio-IPA}}, not for any reason just because we don't use it much. I'd like some sort of input from other Wiktionary editors. Mglovesfun (talk) 15:52, 25 April 2012 (UTC)[reply]
What's the benefit of using it? Can it handle multiple recordings matching a single transcription? —msh210 (talk) 16:16, 25 April 2012 (UTC)[reply]
AWB doesn't let me fetch a list without logging in, so I'll need it even for item 1 (though DPL can get me most of the way there). As for the templates, I like that Template:audio-IPA attaches the audio directly to the transcription, which makes attaching audio to words with multiple pronunciations clearer to the reader. Cf. paratransit. I'm actively working to make the presentation even cleaner and add a few features, but I think Template:audio-IPA is already a big improvement over Template:audio in many circumstances. --Gabriel Sjöberg (talk) 18:33, 25 April 2012 (UTC)[reply]
I think msh210 is asking you to say why it's a "big improvement". I'm not against it per se, I just have no reason to support it. Mglovesfun (talk) 21:45, 25 April 2012 (UTC)[reply]
Here are the advantages I see right now:
  1. The biggest advantage is that Template:audio-IPA connects the IPA transcription to the actual audio file. This can be really handy for people who don't know how to read IPA and can't correctly associate a list of IPA transcriptions to the audio files.
  2. The new template also adds hidden categories based on optional, named parameters. This metadata can indicate the language, dialect, and sex of the speaker. This isn't information that anyone is using now, but it could come in handy at some point in the future. Additionally, the hidden categories can be used to determine which terms do not have audio with certain characteristics (e.g., you could make a DPL that gives all pages that do not have a British audio pronunciation).
In the future, I'd like to come up with a way to connect more than one recording to a transcription, but I just don't have a really clear way of presenting that on the page right now.
--Gabriel Sjöberg (talk) 00:09, 26 April 2012 (UTC)[reply]
Connecting the audio file to the IPA is not always desirable. The IPA is sometimes for UK, sometimes for US, sometimes for both, and sometimes for another region altogether. The audio files are almost always for US English. Additionally, there may be multiple IPA representations given, and there can be multiple audio files. Linking these correctly would be a very complicated job requiring a good ear for English phonemes and regional variation in English. --EncycloPetey (talk) 18:56, 6 May 2012 (UTC)[reply]

Implied nouns

I recently expanded τέταρτος (tétartos, fourth), and I came upon some difficulty in conveying some of the information in proper Wiktionary fashion. In its primary sense, the definition is fairly straightforward, it's the ordinal version of τέσσαρες (téssares, four). However, some of its other senses, while fairly intuitive to understand, are somewhat difficult to rigorously explain. For example, definition 3.2 means quart, as in a liquid measure. That definition is essentially when τέταρτος (tétartos) is attached to μοῖρα (moîra, part, portion). The word μοῖρα (moîra, part, portion) doesn't actually have to be in the clause, or the paragraph, or even the work for that matter. It can be implied by the context, as it is in the Herodotus work cited for it (follow the link, if you don't believe me). I'm almost positive English can do similar things, but I'm at a loss as to think of any examples. My reference (the LSJ 8th edition) uses the syntax "(sub. μοῖρα)" to explain the grammar, with sub. being listed in the list of abbreviations as "subaudi", which is absolute jibberish to me, but I was already aware of the phenomenon, and so understood anyway. My solution is {{grc-sub.}}, which simply makes a {{context}} like parenthetical note "with x". Obviously, this should eventually get an appendix, but I don't have the gumption to write one up on the spot. In any case, does anyone have any thoughts on how to more clearly explicate this in a definition list? -Atelaes λάλει ἐμοί 02:38, 26 April 2012 (UTC)[reply]

Perhaps definition 4 of the noun fifth is of use. Can τέταρτος be defined as "quart" and then τέταρτος μοῖρα also defined as "quart"?--BenjaminBarrett12 (talk) 02:57, 26 April 2012 (UTC)[reply]
I don't think that's a terribly elegant solution. τέταρτος μοῖρα (tétartos moîra) is really just sum of parts, and should not be given an entry. τέταρτος (tétartos) could reasonably just have a definition "quart" (though I suspect it's not actually equal to a quart), but there's an implied word in there which really should be explicated somehow. -Atelaes λάλει ἐμοί 11:30, 26 April 2012 (UTC)[reply]
It sounds like it means "fourth (part); quart, quarter". My question is whether it functions as a noun in such cases? Or is it a "substantive adjective"? Either way it looks like you are explaining it just fine to be honest. Good citations (some with and some without "part") will make the phenomenon fairly clear in any case. Ƿidsiþ 05:40, 26 April 2012 (UTC)[reply]
My experience, in Ancient Greek at least, is that nouns and substantive adjectives function identically: they both function as grammatical nouns, take the definite article, can be modified by adjectives, etc. Some adjectives strongly prefer a certain gender, and usually function as substantives, and it can be only determined on scant evidence that they even are adjectives (θεός (theós) is an example that comes to mind). -Atelaes λάλει ἐμοί 11:30, 26 April 2012 (UTC)[reply]
I must admit "subaudi" was incomprehensible to me, too, but after looking through a few dictionaries (both ones that used the word, and ones that defined it), I've tried to put together an entry. I'm not sure how to format the note that it, like "read" in "it was an interesting [read: disastrous] affair", doesn't inflect. - -sche (discuss) 06:50, 26 April 2012 (UTC)[reply]
Brilliant -sche! Thank you. If you think it's solid enough to survive an rfv, I'll just put that in (linked, of course) instead. The example sentence in particular is well-crafted. I think that will make the situation much clearer to our readers, while retaining the snooty Latin term which we need to keep our self-respect as a dictionary. :-) -Atelaes λάλει ἐμοί 11:30, 26 April 2012 (UTC)[reply]
I'd never heard it either. If I saw "sub." used as described above, I'd think it meant "substantivized adjective" or something. I've only ever seen "sc." used to mean "to be supplied mentally". —Angr 14:03, 26 April 2012 (UTC)[reply]
Wouldn't (elliptically for τέταρτος μοῖρα (moîra)) cover it? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 23:33, 26 April 2012 (UTC)[reply]
Yes, I think it would, but (subaudi μοῖρα (moîra)) is a bit more precise and concise. -Atelaes λάλει ἐμοί 00:21, 27 April 2012 (UTC)[reply]
I dunno, I would strongly prefer an intelligible clarification (like "elliptically for...") to the one so obscure that we dictionary-editors didn't even know what it meant! - -sche (discuss) 00:29, 27 April 2012 (UTC)[reply]
*sigh* Yeah, you're probably right. I guess I got so excited about the possibility of using the original wording that I sort of forgot how remarkably esoteric it is. I've implemented your suggestion, Doremítzwr, at τέταρτος (tétartos). It's a bit longer than I'd prefer, but it really does explain what's happening better than anything else. -Atelaes λάλει ἐμοί 00:45, 27 April 2012 (UTC)[reply]
I see it's being used now at both τέταρτος and τοξικός, but it raises a more general question about how to treat substantivized adjectives: shouldn't these forms be listed under the genders where they actually occur, and then under a ==Noun== header? After all, what we have here synchronically is a feminine noun τετάρτη that means both "quart" and "the fourth day"; the fact that it's elliptical for τετάρτη μοῖρα is really just part of its etymology. Likewise there's a neuter noun τέταρτον that means "a fourth, a quarter", a feminine noun τοξική that means both "archery" and "a shothole", a masculine noun τοξικός that means "a bowman" (attested only in the plural), and a neuter noun τοξικόν that means "poison for smearing arrows with". I don't think all these noun meanings should be grouped together under the adjective just because that's how Liddell and Scott do it. They're paper and need to save space; we aren't and don't. —Angr 08:22, 27 April 2012 (UTC)[reply]
I apologize in advance for what I'm sure is going to be a thoroughly unsatisfying response. I really can't support this suspicion, but I don't think that the LSJ put them all together to save space. I think they put them together because they're, in some meaningful way, still all part of the same word. I feel like separating them to their own entries would be, if not inaccurate per se, an organizational error. I'll try and do some further research and mental stewing and see if I can't give you something beyond idle speculation. -Atelaes λάλει ἐμοί 13:15, 27 April 2012 (UTC)[reply]
Even if there are semantic rather than spatial reasons to keep them together, that's not the way Wiktionary works. Unlike any other dictionary, we have separate entries for dog and dogs; for rojo, roja, rojos, and rojas; and for τέταρτος, τετάρτη, and τέταρτον as adjectives quite apart from their substantivized meanings. We are already organized differently from LSJ, so we should make full use of the way we're organized rather than trying to follow the way they're organized. —Angr 13:47, 27 April 2012 (UTC)[reply]
We do have separate entries for dogs and dog, and messages and message, but I was quite unhappy that we had some senses at messages that we didn't even point to from message. I'm happy with entries like bang and scissor, and the current revision of message, which I edited so that it does point out the additional definitions at the plural form. (Line similarly has a sense which should technically be the line.) If we move these elliptical senses to specifically gendered inflected forms, we'll need to point them out in usage notes in the lemma entries (like messages), or how will anyone ever find them? Contributors who know none of the language will copy-and-paste the term and get the right page, but contributors who know enough of the language to search for the lemma, aware that our inflected forms almost never contain information beyond "Foo form of bar", won't find the senses. - -sche (discuss) 19:20, 27 April 2012 (UTC)[reply]
I disagree with the current state of message and messages, since now the meaning "groceries, shopping" is not listed in a definition line anywhere. It's in a usage note at message and hidden on the Citations page at messages. As for the Greek forms, I would actually list the noun meanings under Derived forms of the adjective's lemma. Readers who know enough Ancient Greek to be looking for nouns like τετάρτη will know enough to look for it under its nominative singular τετάρτη (as opposed to one of the other cases, or the plural), but if they only encounter it as a noun they will probably not look for it under the masculine adjective form τέταρτος--you can't tell from looking at τετάρτη that it's "basically" an adjective rather than a noun. In a paper dictionary that's not such a big deal, because if you look up τετάρτη you will immediately see the adjective τέταρτος and will look there instead. But here, if I encounter a noun τετάρτη in my Greek reading and come to Wiktionary and look for it there, and all that page tells me is that it's an inflected form of the adjective τέταρτος, I'll be confused because what I have on my page is a noun ending in -η, not an adjective ending in -ος. —Angr 19:57, 27 April 2012 (UTC)[reply]
FWIW, when this sort of thing has cropped up for Latin adjectives, I've used (by extension). --EncycloPetey (talk) 18:52, 6 May 2012 (UTC)[reply]

This category failed RFD a while ago, and it is now almost empty, with most of its contents having been moved to Dutch. What should we do with its code, {{vls}}? Wikipedia uses that code specifically for West Flemish (West Flemish Wikipedia), and I think that would make sense because there is a stronger consensus within the linguistic community that it is a language, at least compared to Flemish/Belgian Dutch in general. It's not recognised politically as a language, but still. —CodeCat 20:37, 26 April 2012 (UTC)[reply]

$wgPFEnableStringFunctions

The ParserFunctions extension now includes string functions, but these need to be enabled separately. As of right now this is disabled, so the string functions don't work. I'd like to vote to enable these. Of all the Wikipedia wikis I can think of, Wiktionary would probably be able to make the most use of string functions, because we work with words so much. These functions would allow us to write templates that automatically adjust endings added to words based on the page name. So for example, {{en-noun}} would add -s for the plural most of the time, but it could automatically add -es or replace -y with -ies when appropriate, and without needing an additional parameter. Similarly, it would allow a template like {{fr-conj-er}} and many other inflection templates to drop the 'stem' parameter, because for a word like chanter the string functions could automatically extract the 'chant-' from the page name. This makes templates much less error-prone because of forgotten or mistaken parameters and such. There are of course many other possibilities. In any case I think these would be incredibly useful and I don't really see any problems at all. —CodeCat 14:15, 27 April 2012 (UTC)[reply]
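To make the idea concrete, here is a minimal sketch of the kind of logic such templates could apply to the page name. It is written in Python purely for illustration, not in ParserFunctions syntax; the function names are hypothetical, and the spelling rules are deliberately simplified, so irregular words would still need an explicit override parameter.

```python
VOWELS = "aeiou"


def english_plural(page_name: str) -> str:
    """Guess a regular English plural from the page name (simplified rules)."""
    # Sibilant endings take -es: box -> boxes, church -> churches.
    if page_name.endswith(("s", "x", "z", "ch", "sh")):
        return page_name + "es"
    # Consonant + y becomes -ies: city -> cities (but day -> days).
    if (
        len(page_name) > 1
        and page_name.endswith("y")
        and page_name[-2] not in VOWELS
    ):
        return page_name[:-1] + "ies"
    return page_name + "s"


def french_er_stem(page_name: str) -> str:
    """Extract the stem of a regular French -er verb: chanter -> chant."""
    if not page_name.endswith("er"):
        raise ValueError(f"{page_name!r} is not an -er verb")
    return page_name[:-2]
```

The point of the sketch is only that the ending or stem is derived from the page name itself, so the editor no longer has to supply it as a parameter.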

The developers already said they won't do this. And there's no alternative to it in sight anywhere, because the planned Lua extension won't support Unicode. -- Liliana 14:16, 27 April 2012 (UTC)[reply]
Have they explained why they won't? And if Lua doesn't support unicode, why are they even adding it? It's not 1990 anymore... —CodeCat 14:18, 27 April 2012 (UTC)[reply]
Whoa whoa whoa, what's this about Lua not supporting Unicode? That's completely unacceptable. It doesn't even make sense -- MW projects are used globally in way too many different scripts for that...
  • A quick search of the Lua website does find mention of a UTF-8 <-> Unicode (presumably UTF-16) converter written in Lua on this page, suggesting that the language itself can handle Unicode.
  • This page also explains that Lua is not inherently incapable of using Unicode or UTF-8 strings.
Liliana, have you run across some MediaWiki dev list post stating that the Lua functionality added to the backend will not be Unicode- and or UTF-8-compatible? -- Eiríkr ÚtlendiTala við mig 15:44, 27 April 2012 (UTC)[reply]
Old discussion is at [[Wiktionary:Grease pit archive/2011/January#Idea: constants in templates]] and [[Wiktionary:Grease pit archive/2011/July#String functions (again :)]] (among other places). —msh210 (talk) 19:54, 2 May 2012 (UTC)[reply]

Proto-Slavic: Why are ь and ъ used for ĭ and ŭ?

I've been wondering this for a while now. All other sources I came across about Slavic so far use the Latin letters ĭ and ŭ and not the Cyrillic equivalents. Even our own transliteration of Old Church Slavonic uses those letters. So why not in Proto-Slavic? —CodeCat 22:47, 27 April 2012 (UTC)[reply]

I've seen both used, but I think ь and ъ are more common, especially in more modern sources. The same is true of transliterated OCS, but only when the Cyrillic isn't also provided. Since we provide both Cyrillic and Latin for OCS, it would be redundant to use ь/ъ in both, but for Proto-Slavic we only provide Latin, so it isn't redundant. —Angr 00:11, 28 April 2012 (UTC)[reply]
I believe the reason for this is that the precise phonetic values of ь and ъ are not known for certain, and representing them as ĭ and ŭ might be incorrect. In the case of OCS, that language is still spoken (for liturgical purposes) and we know its phonology. —Stephen (Talk) 05:03, 28 April 2012 (UTC)[reply]
Jers in Proto-Slavic reconstructions are usually not transliterated into Latin. Cyrillic characters are used simply because it's the most common practice in the books/papers. It's quite common for OCS too, but as Angr explained it doesn't really make sense for us to use it. --Ivan Štambuk (talk) 20:13, 6 May 2012 (UTC)[reply]

To my surprise, we have a Category:Cajun French language and a {{frc}}. It is useful to distinguish Cajun French from general French in etymologies and when the usage of a word is {{restricted}}; we do this for Category:Quebec French. I think we should do the same for Cajun French: but I think we should rename Category:Cajun French language to Category:Cajun French, and not use {{frc}} (perhaps rename it {{etyl:frc}}) except in etymologies. Cajun French and Quebec French have very limited syntactic, pronunciatory and lexical differences from the French spoken in Europe and around the world, just as Southern US English and Canadian English have limited syntactic, pronunciatory and lexical differences from the English spoken in Europe and around the world; I do not think it is appropriate to treat either as a separate language, and as Quebec French is already not treated as separate, I think it is especially odd to treat Cajun French as separate. (Cajun French, {{frc}}, is not to be confused with Louisiana French Creole, {{lou}}.) - -sche (discuss) 18:35, 28 April 2012 (UTC)[reply]

Obviously, we shouldn't treat Cajun French as different from French to the extent that we, say, give them different language sections, but what's the point of banning {{frc}} in favour of {{etyl:frc}}? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:08, 28 April 2012 (UTC)[reply]
In our current naming system, as I understand it, only languages which have L2 sections have unprefixed codes; regional varieties like {{etyl:Viennese German}}, temporal varieties like {{etyl:LL.}} and families like {{etyl:smi}}, which are for use in etymologies, have codes prefixed with etyl:. (And appendix-only languages have codes prefixed with conl:, etc.) This makes it unmistakable which things are L2 languages and which aren't. - -sche (discuss) 20:00, 28 April 2012 (UTC)[reply]
My understanding was that the etyl:-prefixed language-code templates are the ones that aren't ISO codes. Consider that we have {{gkm}} for Byzantine Greek, which we treat as part of Ancient Greek, {{grc}}. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:44, 28 April 2012 (UTC)[reply]
Fun fact: some guys proposed a Wikipedia in Cajun French, and after a few pages, it really became apparent that it is no different from Standard French. So yeah. -- Liliana 20:18, 28 April 2012 (UTC)[reply]
If anyone's interested in seeing it, it's at incubator:Wp/frc/Page Principale, and a sample article is at incubator:Wp/frc/Louisiane. Apart from a few idiosyncrasies like using monde to mean "people", it's just French. —Angr 20:27, 28 April 2012 (UTC)[reply]
Not necessarily ones without ISO codes. {{mo}} and {{fil}} have been deleted, but it was discussed whether to move them to {{etyl:mo}} and {{etyl:fil}}. The difference is, {{frc}} is used in two etymologies: bayou and lagniappe. mo and fil were used in none. Mglovesfun (talk) 11:03, 29 April 2012 (UTC)[reply]
Support merge, per Angr. Mglovesfun (talk) 11:09, 29 April 2012 (UTC)[reply]
Self-correction; {{etyl:mo}} does exist, but isn't used. Mglovesfun (talk) 16:49, 29 April 2012 (UTC)[reply]
As a result of the fact that "Cajun French" is slightly narrower than "Louisiana French", I've created Category:Louisiana French (rather than Cajun French, as I initially proposed). {{Cajun}}, {{Cajun French}} and {{Louisiana French}} all put entries in the category. - -sche (discuss) 02:40, 30 April 2012 (UTC)[reply]

Bot request, Simplified Chinese and Traditional Chinese

Request to rename by bot all the categories starting with Simplified Chinese and Traditional Chinese to Mandarin ... in simplified script and Mandarin ... in traditional script. Concrete example: Category:Simplified Chinese terms derived from English to Category:Mandarin terms derived from English in simplified script. Reason: Simplified Chinese and Traditional Chinese aren't languages, but linguistic norms. Mglovesfun (talk) 15:25, 29 April 2012 (UTC)[reply]

Nomenclature like Category:Mandarin terms in simplified script derived from English would, IMO, be better; nomenclature like Category:Mandarin terms derived from English in simplified script suggests that the terms derive from etyma that are themselves written in some unspecified simplified script. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:25, 29 April 2012 (UTC)[reply]
I agree with Raif'har. - -sche (discuss) 01:55, 30 April 2012 (UTC)[reply]
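For anyone scripting this, the two naming schemes discussed above could be sketched roughly as follows. This is only an illustration, not an existing bot: the function name rename_category, the script_first flag, and the assumption that the first word after the prefix is the head noun (e.g. "terms") are all hypothetical.

```python
# Hypothetical sketch of the category renaming proposed above.
# Assumption: every affected title starts with one of these two prefixes.
PREFIXES = {
    "Category:Simplified Chinese ": "simplified",
    "Category:Traditional Chinese ": "traditional",
}


def rename_category(title, script_first=True):
    """Return the proposed new title, or the title unchanged if no prefix matches.

    script_first=True  -> "Mandarin terms in simplified script derived from English"
    script_first=False -> "Mandarin terms derived from English in simplified script"
    """
    for prefix, script in PREFIXES.items():
        if title.startswith(prefix):
            rest = title[len(prefix):]            # e.g. "terms derived from English"
            # Assumption: the first word is the head noun ("terms", "nouns", ...).
            head, _, tail = rest.partition(" ")
            if script_first and tail:
                return f"Category:Mandarin {head} in {script} script {tail}"
            return f"Category:Mandarin {rest} in {script} script"
    return title


if __name__ == "__main__":
    old = "Category:Simplified Chinese terms derived from English"
    print(rename_category(old))
    print(rename_category(old, script_first=False))
```

With script_first=True the script qualifier stays next to "terms", which avoids the reading that the English etyma are themselves in a simplified script.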

cascade protection over all etyl templates

I have created Wiktionary:Index to templates/languages/protection/etyl. Cascading protection could be applied to it, to protect all etyl: templates against vandalism. This would have the same drawback as the protection of Wiktionary:Index to templates/languages/protection had: helpful new editors will be unable to change the content of the templates without admin assistance. As an alternative or supplement to cascading protection, admins can watchlist all of the etyl: templates at once by following the directions here. Do we want cascading protection over Wiktionary:Index to templates/languages/protection/etyl; do we think the benefit of stopping vandalism to our "backend" is worth the drawback of non-admins being unable to change the templates on their own? - -sche (discuss) 02:02, 30 April 2012 (UTC)[reply]
