Wiktionary:Beer parlour/2012/August

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Wiktionary:Votes/2012-08/Foreign Word of the Day

Please note that Wiktionary:Votes/2012-08/Foreign Word of the Day is going to run soon, and your feedback is welcome. Thanks! --Μετάknowledgediscuss/deeds 21:24, 2 August 2012 (UTC)

If someone could check the code we want to add to the main page to see if it’s bugged, that would be nice. — Ungoliant (Falai) 05:22, 4 August 2012 (UTC)


A message just came across the Unicode list announcing the Faroese Parsed Historical Corpus. It's pretty minimal, but it's version 0.1.--Prosfilaes (talk) 22:06, 3 August 2012 (UTC)

What is it exactly? The link you gave doesn't really explain a lot... —CodeCat 22:08, 3 August 2012 (UTC)
It's a parsed corpus; it's got a selection of texts, and annotations breaking them down into parts of speech. It's a corpus, meaning it should probably be our first look for older Faroese words, and the fact it's parsed means we can find a word only when it's used as a noun or verb, at least in theory.--Prosfilaes (talk) 22:57, 3 August 2012 (UTC)

Friulian wiktionary

How do I go about requesting that a Friulian wiktionary be made? Dude2288 (talk) 22:18, 3 August 2012 (UTC)

See here. Good luck! — Ungoliant (Falai) 22:23, 3 August 2012 (UTC)

CFI for extinct languages

As promised in the Wiktionary:Votes/2012-06/Well Documented Languages proposal, I would like to proceed with a proposal to modify the way extinct languages are handled in the WT:CFI.

The Well Documented Languages proposal, which recently passed, expanded the scope of languages for which one use or mention is adequate for inclusion on Wiktionary, along the lines of its predecessor, Wiktionary:Votes/2012-04/Languages_with_limited_documentation, which also passed.

Currently, words in extinct languages may be allowed with one use, but not one mention. This next proposal is to permit one mention as well, subject to the same requirements as other one-mention inclusions:

  • the community of editors for that language should maintain a list of materials deemed appropriate as the sole source for entries based on a single mention,
  • each entry should have its source(s) listed on the entry or citation page, and
  • a box explaining that a low number of citations were used should be included on the entry page (such as by using the {{LDL}} template).

The actual proposal has been drafted by User:Metaknowledge and edited by me, but this is the essence of the new proposal.

Any thoughts, criticisms, etc., are welcomed before it is officially introduced as a proposal. --BB12 (talk) 01:02, 4 August 2012 (UTC)

Amending Wiktionary:Votes/2012-06/Well Documented Languages

On WT:RFV#Haifa, there's been a question of whether one mention is enough for any Tatar term, and furthermore whether the mention has to be in a Tatar text or not. I'd like to amend Wiktionary:Votes/2012-06/Well Documented Languages to exclude mentions for this reason. For example, an English Usenet citation of "The Tatar word for Haifa is Haifa" would be enough to pass Haifa as a Tatar word. I oppose this as it doesn't convey meaning (place names admittedly aren't a good example of words where context is required to convey meaning; I'm choosing it as a current RFV candidate rather than an ideal example). The only thing we'd need to change is

For all other spoken languages, only one use or mention is adequate, subject to the following requirements:

to

For all other spoken languages, only one use is adequate, subject to the following requirements:

Removing only the words "or mention", with no other changes of any kind. Mglovesfun (talk) 09:51, 4 August 2012 (UTC)

Have you read any further? Probably not. It says:
  • subject to the following requirements: the community of editors for that language should maintain a list of materials deemed appropriate as the sole source for entries based on a single mention.
So, unless the community approves, Usenet won't be acceptable as a source of mentions. -- Liliana 10:19, 4 August 2012 (UTC)
I actually had read that; it's a good get-out clause, I suppose. Essentially, any citation we don't like, we can get rid of by considering it a material not deemed appropriate. That's kinda reassuring, but I still think removing the mention bit solves the problem altogether, and very quickly too. Mglovesfun (talk) 10:34, 4 August 2012 (UTC)
There are some languages that are so badly attested that the only decent source is in the form of mentions. Vandalic is an example, where the only 'sentence' in the language that I am aware of is mentioned within a Latin text. I wouldn't want them to be deleted because CFI no longer allows them... —CodeCat 11:35, 4 August 2012 (UTC)
Yes, and Dacian is also known only through mentions. I suppose Usenet could be ruled out as a source for Tatar, but there are going to be cases where it's not so simple. I believe the solution lies at Wiktionary:Beer_parlour#Durability_and_online_archives. --BB12 (talk) 18:30, 4 August 2012 (UTC)
That's a difference then, as I'd be OK with languages that are only used in mentions being de facto excluded to avoid this problem altogether. Mglovesfun (talk) 10:13, 6 August 2012 (UTC)
I wouldn't want to say only in mentions. Say a language has 50 mentions in various sources. Would we want to discount all of those once there is a single use? This kind of thing could happen for real if, for example, a single 5-word inscription written in Vandalic or Dacian were discovered in the future. —CodeCat 14:49, 6 August 2012 (UTC)
For one thing, I dispute that the Vandalic citation is a mention. All the terms are used in a context so they're all uses, and all valid. Mglovesfun (talk) 08:01, 11 August 2012 (UTC)
Right; the point is that because there would suddenly be those five uses, all the more numerous mentions would be invalidated (if we said that only mention-only languages were allowed mentions). - -sche (discuss) 20:10, 28 August 2012 (UTC)

Entry structure

Hello, everyone! It seems that this is my first edit as a registered user in the English-language Wiktionary. I'm surprised, given that I'm very active on several Wikipedias and other projects.

I'm writing here because I'm extremely confused by the entry structure used in this and other Wiktionaries. This is the structure of the entry sea:

  • English
    • Etymology
    • Pronunciation
    • Noun
      • Synonyms
      • Derived terms
      • Translations
    • See also
    • References
    • Statistics
    • Anagrams
  • Irish
    • Etymology
    • Contraction
      • Usage notes
      • Antonyms
  • Old Irish
    • Determiner
  • Spanish
    • Verb
    • See also

Mixing definitions in different languages in a single entry is really messy. I think that we should have a totally different entry structure. Each entry should cover only one language, and the language should be denoted by a prefix.

So EN:sea, ES:sea and other entries would have structures like this:

  • Etymology
  • Pronunciation
  • Noun
    • Synonyms
    • Derived terms
    • Translations
  • See also
  • References
  • Statistics
  • Anagrams

This of course would be a major change in all Wiktionaries. What do you think about it? --NaBUru38 (talk) 17:24, 6 August 2012 (UTC)

I completely agree, but unfortunately not enough people here want to do it, because it's too much work, too difficult, too different from what we're accustomed to etc. etc. —CodeCat 18:14, 6 August 2012 (UTC)
We had this discussion recently at WT:ID. Basically, it would create massively more work for me, reduce my future efficiency, and garner no benefit. I could be contributing at Latin Wiktionary (and they sorely need somebody), but they use that kind of structure and it completely turns me off. If you want neatness, I recommend that you enable Tabbed Languages at WT:PREFS. --Μετάknowledgediscuss/deeds 18:27, 6 August 2012 (UTC)
We tried them as a default for admins. The trial ended. Now what? —msh210 (talk) 18:43, 6 August 2012 (UTC)
They can still be selected via your Preferences. As for NaBUru38's suggestion, are you saying that English Wiktionary should define only English words, Spanish Wiktionary should define only Spanish words and so on? If so, that won't work, because there are languages too small to support an editing community, not to mention extinct languages that don't qualify for a Wiktionary of their own. Or are you saying that instead of having a single entry sea that has different headers for different languages, we should have en/sea for the English word, es/sea for the Spanish word, ga/sea for the Irish word, sga/sea for the Old Irish word, and so on? I suppose that would be doable, but it would be an enormous amount of work getting there, and I don't see much benefit over how things currently are. —Angr 21:01, 6 August 2012 (UTC)
The default could be the English word, so if you look up "sea," you'll see the English entry. That page could have links to the pages for "sea" in all the other languages. The advantage would be a much cleaner look. Of course, if someone wants to look up "sapere," there should be some sort of redirect system in place to decide whether to send the user to the Italian page or the Latin page. --BB12 (talk) 22:55, 6 August 2012 (UTC)
Even better, the English "sea" page could have a link to a disambiguation page, and that could be the default if there is no English entry for a word. --BB12 (talk) 22:56, 6 August 2012 (UTC)
"Mixing definitions in different languages in a single entry is really messy". Well, they're not mixed; they're in succession in alphabetical order by language name. I like the current format: with the tables of contents (TOCs) you can easily go directly to any section you like. Mglovesfun (talk) 08:35, 7 August 2012 (UTC)
@Msh210: I think TabbedLangs should become default for anons.
@Angr: I fully agree.
@BB: That would be awful. I often work on multiple closely related languages at the same time, and that makes me more efficient. I can add etymologies from Proto-Polynesian to every Polynesian reflex by making just a few edits - for example, see all the Polynesian languages at vaka ("canoe"). I oppose any sort of disambiguation page, which is what Latin Victionarium uses, and again that makes me less likely to contribute. --Μετάknowledgediscuss/deeds 14:06, 7 August 2012 (UTC)
Don't worry. The current layout isn't going to change (well, only over several people's dead bodies). SemperBlotto (talk) 14:10, 7 August 2012 (UTC)
I think putting ease of editing over ease of viewing is a bad idea, but unfortunately it's a common problem with wikis. The main thing we should focus on is how easy it is for users to find the information they want. And for that, we have to know what they want. Probably most users by far are looking for English entries. Those that look up non-English terms are usually interested in what that term means, and are not interested in any of the other languages. Only a very small group, probably the more linguistically inclined (who predictably are more numerous among editors than among users!), is interested in comparing terms across languages. So to cater to a very small group (linguists, editors) as opposed to a much larger group that doesn't want or need such features and is only confused by them, is IMO completely backwards. —CodeCat 14:30, 7 August 2012 (UTC)
The readers are important - but they're not generating content! Last I checked, anons were producing less than 5% of edits, and around 20% of those edits needed to be reverted or rewritten. The whole point of content is to be read by anons. Therefore, I believe that it's more important to expand content than to emphasize existing content, assuming that both old and new are of equal quality. --Μετάknowledgediscuss/deeds 15:01, 7 August 2012 (UTC)
I think the notion that any format like sea/en would be easier for readers is simply wrong. Given that on WT:FEED it's clear that many people can't tell uppercase from lowercase, I think it's unreasonable to assume readers would figure out to type sea/en or sea/sga to look up an English vs an Old Irish word. And (again from WT:FEED) it's clear that many people don't click links from bearing to bear, etc — I don't think a disambig page would be any less confusing. Also, readers might not know what language a word is in; in that case, our current format is helpful; alternatively, readers might think they know what language a word is in, but be mistaken. - -sche (discuss) 22:36, 7 August 2012 (UTC)
What if we had subpages for each language (e.g. sea/en, sea/es, etc.) and then put a template on the page itself (sea) that automatically includes all the subpages. This would leave the layout exactly the same for readers but would create separate pages for editors.
We wouldn't have to change every page right away. Like any other change like this it will occur gradually and new pages will always use the new system. This also seems like a change a bot could easily make.
Of course this would mean updating all templates that rely on page names (such as {{en-noun}}).
This is just a thought I had. I don't know if it's worth actually doing this.
--WikiTiki89 (talk) 17:43, 7 August 2012 (UTC)
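
As a rough illustration of the subpage idea above: the sketch below builds the reader-facing page for a title by collecting its per-language subpages and joining them under language headers. It is only a toy model in Python, not how MediaWiki or any existing template works; the SUBPAGES data, the assemble_entry function, and the "title/code" naming are all hypothetical, and the alphabetical ordering by language name simply follows what is described elsewhere in this thread.

```python
# Hypothetical store: subpage title -> (language name, section body).
# The bodies are placeholders, not real entry content.
SUBPAGES = {
    "sea/es": ("Spanish", "===Verb===\n(placeholder)"),
    "sea/en": ("English", "===Noun===\n(placeholder)"),
    "sea/sga": ("Old Irish", "===Determiner===\n(placeholder)"),
    "sea/ga": ("Irish", "===Contraction===\n(placeholder)"),
    "seat/en": ("English", "===Noun===\nA chair."),
}


def assemble_entry(title, subpages):
    """Return the combined page text for `title`.

    Only subpages named exactly `title/<code>` are included, so
    "seat/en" is not pulled into the page for "sea".
    """
    prefix = title + "/"
    sections = [
        (language, body)
        for name, (language, body) in subpages.items()
        if name.startswith(prefix)
    ]
    # Order the sections alphabetically by language name.
    sections.sort(key=lambda section: section[0])
    return "\n\n".join(
        f"=={language}==\n{body}" for language, body in sections
    )


if __name__ == "__main__":
    print(assemble_entry("sea", SUBPAGES))
```

Readers would still see one page with every language section, while each section would live on its own subpage for editing.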
That's exactly what some other user said at WT:ID. That proposal doesn't pacify the OP's concern (emphasize existing content) or my concern (create more content). So it looks to me like a lose-lose. --Μετάknowledgediscuss/deeds 18:01, 7 August 2012 (UTC)
If we split the pages, then we could allow a setting in the preferences to select whether (and which!) languages are shown on the main 'languageless' page. So, users would be able to control to which degree they want the old system to appear to them. Preferably, this setting would be implemented on the server side, so that it's independent of JavaScript. Reliance on JavaScript (which can be disabled or just not functional) is the main shortcoming of tabbed languages as it is now, even though it is very good and ought to be made the default I think. —CodeCat 18:54, 7 August 2012 (UTC)
@MK: FWIW, I'm not advocating doing this. I was simply addressing how it could be done. --BB12 (talk) 19:20, 7 August 2012 (UTC)

I completely agree that mixing languages on the same page is very confusing. It has misled me more than once when looking up a word. A setting in preferences is no good at all: the majority of users of Wiktionary do not have an account and in any case would not realise that the default could be changed. English meanings should be the default display. Btw, "Tabbed browsing of language sections" does not seem to work (Monobook, Firefox 13.0), although "Tabbed browsing of language sections with tabs on the side" does. SpinningSpark 21:23, 7 August 2012 (UTC)


I am trying to resolve an editorial disagreement at the entry for niggerling. My attempts at dialogue with the other editor are being ignored, the latest response included a suggestion that I suck some cock, and then he disabled his talk page. How can I best proceed? The definition is simply inaccurate without mentioning the term nigger as well. Lucifer (talk) 00:58, 8 August 2012 (UTC)

I suggest you just give up and suck the cock, since most of your entries relate to doing so. Hope this helps! Equinox 00:59, 8 August 2012 (UTC)
The facts: Lucifer was involved in edit warring at niggerling, even after reverts by three admins (Ruakh, me, and Equinox). He was trying to include the word nigger in a definition, which we all found to be inappropriate. Equinox responded by giving him a two-week block without specific warning (although he has been warned before), and has since deleted and protected his own talkpage to be admin-only edited, given Lucifer verbal abuse, and deleted the main page (restored by Jamesjiao). --Μετάknowledgediscuss/deeds 01:12, 8 August 2012 (UTC)
NB: Neither of the involved parties is currently blocked. --Μετάknowledgediscuss/deeds 01:14, 8 August 2012 (UTC)
If he deleted the main page, I don't think he should be an admin, just sayin'. I mean, it's not like Wonderfool was left off scot free when he did it. User: PalkiaX50 talk to meh 01:19, 8 August 2012 (UTC)
Ruakh has de-sysopped him with the summary "deleted the main page -- I assume that means he's WF?" FWIW, I don't think he's WF, unless WF has been playing a very deep game. (I'm also curious what the space where the main page is looks like when the main page isn't there.) - -sche (discuss) 02:13, 8 August 2012 (UTC)
Yeah, I wasn't confident enough in the claim to go ahead and permablock him (though I wouldn't look askance at another admin doing so). Even if he's not WF, he deleted the main page! —RuakhTALK 02:54, 8 August 2012 (UTC)
He's definitely not WF: WF has always treated LW as an ally or even a protégé. Besides, Equinox is a better editor at his worst than WF is at his best. I agree that de-sysopping was necessary; I just hope he realizes that he's hurting himself more than LW, and takes a step back to think things through. Chuck Entz (talk) 04:42, 8 August 2012 (UTC)
{e/c} @Palkia: That's not what Wonderfool was permablocked for, ever. It was always for "abusing multiple accounts".
@Sche: I assume it just gives the deleted-page-recreation text and a record of all the WF deletions to date.
@Ruakh: I would definitely look askance. If any admin permablocks Equinox without due process of some sort (i.e., some semblance of consensus), I will unblock him. I do think that your actions have been defensible for the time being. (You've made a great 'crat!)--Μετάknowledgediscuss/deeds 04:46, 8 August 2012 (UTC)
You haven't read back far enough. WF was an admin, and went on a vandalism spree that included deleting the main page and blocking all the other admins. If you look at the block logs, you'll see that before then he blocked himself for a short time to test whether he could unblock himself- so it was obviously premeditated. He was permablocked, and the "abusing multiple accounts" is from his repeated efforts to evade that original block ever since. He also did the same at Wikipedia after one of his sock-puppets got admin rights there. He can indeed be charming and helpful and friendly, but I suspect that the main reason he hasn't been up to such tricks lately is because people are keeping an eye on him and stopping him before he gets the chance to be voted in as an admin. Chuck Entz (talk) 06:31, 8 August 2012 (UTC)
Re: "He was permablocked, and the 'abusing multiple accounts' is from his repeated efforts to evade that original block ever since": That may once have been true, but it's not the case at present. The current de facto policy has been to let him use one account at a time, blocking them only when he's got multiple socks going at once. (SemperBlotto handles this.) —RuakhTALK 06:37, 8 August 2012 (UTC)
I wasn't addressing current practice, per se, but responding to the statement "That's not what Wonderfool was permablocked for, ever." I'm aware of current practice, and make a point of checking with SemperBlotto any time I recognize a WF sock, unless it looks like SemperBlotto is aware of him and choosing not to act. Actually, what usually happens is that I look at the recent changes or the new editor contributions, say to myself "that's got to be WF", check the contribs, and find that SemperBlotto has already permablocked him. Chuck Entz (talk) 07:14, 8 August 2012 (UTC)
@Chuck: If, in the beginning, WF had one account at a time, then I'm wrong. I just wasn't aware that such a situation ever existed. --Μετάknowledgediscuss/deeds 17:35, 8 August 2012 (UTC)
Reasonable people could disagree about whether "young black person" adequately encompasses the meaning of niggerling. After the first or second revert, Lucifer probably should have made his case for “young person described as a nigger” over “young black person” on niggerling’s Talk page, rather than Equinox’s. In any case, removing “black” from the definition in a fit of pique, even assuming he anticipated its prompt reversal, seems like vandalism to me. I don't see a deletion of the main page in the logs, however. ~ Röbin Liönheart (talk) 08:10, 8 August 2012 (UTC)
Equinox not Wonderfool. Mglovesfun (talk) 08:18, 8 August 2012 (UTC)
You just haven't looked far enough back. I did, and I can assure you- it's all there. Chuck Entz (talk) 09:28, 8 August 2012 (UTC)
@Metaknowledge re: The facts. LW has since been blocked, but not because of this. He's been playing games with his cites. My favorite was a term he labeled as African-American Vernacular English, for which his cite (from a novel) quoted dialogue spoken by a Klansman in a southern lynch mob railing against the protagonist's associating with "niggahs". Chuck Entz (talk) 10:00, 8 August 2012 (UTC)
  • Equinox is not Wonderfool, I am. LW is not WF either. I am. Now Equinox is out, I retire too. For ever. Enjoy! --Tacones (talk) 13:27, 8 August 2012 (UTC)
One question here from me, who or what in the flying fuck is "LW"...? User: PalkiaX50 talk to meh 14:43, 8 August 2012 (UTC)
User:Luciferwildcat SemperBlotto (talk) 14:46, 8 August 2012 (UTC)
I could not exclude the possibility that Equinox is Wonderfool, mostly because of the edit summary on his main-page deletion. (But perhaps it is like Saddam Hussein pretending to have WMDs.) But, even if that were the case, the enormous number of new entries added should make him welcome here. Perhaps we should review them (a sample anyway) to confirm the quality and consider whether to risk restoration of powers beyond those of ordinary registered user.
I have certainly thought that other admins, not excluding myself, have occasionally had lapses of judgment, brought on by substance abuse, fatigue, mood, pain, malfunctioning neurotransmitters, personal problems etc. Such lapses need not be treated as indications that the admin could never again be trusted with the tools. DCDuring TALK 16:39, 8 August 2012 (UTC)
Very true. —RuakhTALK 16:59, 8 August 2012 (UTC)

@Chuck: I know. My facts came out before Ruakh blocked LW. I also know about the cites (which is not something he has ever done before, to the best of my knowledge). --Μετάknowledgediscuss/deeds 17:35, 8 August 2012 (UTC)

The thing with Wonderfool is that he is capable of good edits, but he is characterized by a lot of edits in a short space of time, often with lots of careless errors, like spelling mistakes and missing genders in {{fr-noun}}. Then for some reason he insists on getting blocked, and will become more and more disruptive until someone blocks him. I really don't understand why that is. Mglovesfun (talk) 17:43, 9 August 2012 (UTC)

Announcing vote proposal: Extinct Languages - Criteria for Inclusion

I have just put up a new proposal to provide CFI rules for extinct languages just like those for other languages with limited documentation: one mention is permitted as a basis for inclusion, provided the source is made explicit and a disclaimer is included. You can see the vote at Extinct Languages - Criteria for Inclusion. --BB12 (talk) 01:20, 11 August 2012 (UTC)

A question on redirects

If, as User:Metaknowledge has recently told me (and as can be read on WT:REDIRECT), the policy here is to avoid redirects in the main namespace at all costs, why is it that there are many language name redirects to the corresponding single adjective pages? Shouldn't they be deleted? Or should this be extended to other language names as well? (The ones I've found are: Spanish language, Portuguese language, French language, Italian language, Hebrew language, Arabic language, German language, Dutch language, Danish language, Russian language, Polish language, Greek language, Urdu language, Turkish language, Chinese language and Japanese language.) --Pereru (talk) 09:28, 12 August 2012 (UTC)

Many of these passed WT:RFD, but probably shouldn't have. Mglovesfun (talk) 09:31, 12 August 2012 (UTC)
  • Re: "the policy here is to avoid redirects in the main namespace at all costs": That is not the policy. We certainly do not avoid redirects at all costs; people sometimes vote "keep as redirect" in WT:RFD. Wiktionary:Redirections is a draft proposal rather than a voted policy. --Dan Polansky (talk) 10:58, 12 August 2012 (UTC)
    • There are certainly times when redirects are sensible here, but they should be more sparingly used here than at Wikipedia. Good uses of redirects here include spellings using typographic ligatures like ﬁ and ﬂ in English, Ĳ and ĳ in Dutch, and װ and ױ in Yiddish, and spellings using curly apostrophes, e.g. [[don’t]] which is (and should remain) a redirect to [[don't]]. —Angr 14:22, 12 August 2012 (UTC)
      • The basic principle I follow in this is to avoid a redirect unless one can be reasonably certain that the redirected spelling isn't used for any term not covered by the destination entry in any language in the world. Things like ligatures are very specific to very few languages, so it's safe. "Don't" is a spelling that probably wouldn't be found in other languages with either apostrophe (let's hope so!), though no doubt the confusion it avoids in English is worth any small risk of a conflict with other languages. As for the original question: the fact that these are English phrases makes conflict with any other language highly unlikely (they would have to have terms with the exact same spelling as each of the individual words, used in a phrase in the same order). Chuck Entz (talk) 15:09, 12 August 2012 (UTC)
        • What you all are saying suggests to me that those language name redirects should be deleted. They are synonyms of the simple adjective taken as name of the language ('Chinese language' = 'Chinese'), but that is, from what you say, not enough to warrant them redirect status. They are not in the same class as [[don’t]] or ligatures. How come some of them passed RFDs? --Pereru (talk) 16:17, 12 August 2012 (UTC) Or, in other words: should I add the RFD template to those redirects? --Pereru (talk) 16:17, 12 August 2012 (UTC) (I further add that, to me, the main point is treating all cases the same way. I tried to create [[Latvian language]] as a redirect to [[Latvian]], but this was immediately deleted, for reasons similar to the ones you all mention. So either we should delete these redirects or else create lots of new [[XXX language]] redirects to [[XXX]]. After all, why should some [[XXX language]] redirects be deleted, while others are kept? Don't you think?) --Pereru (talk) 16:21, 12 August 2012 (UTC)
          • Talk:Spanish language established a precedent, even if a weak one. Until that is changed, it should be possible to create Latvian language without having it deleted, IMHO. Some people are okay with some sum-of-parts phrases being kept as redirect to ease searching, but oppose their having a fuller entry. --Dan Polansky (talk) 17:02, 12 August 2012 (UTC)
            • Well, User:Metaknowledge deleted the Latvian language redirect within ten minutes of my having created it, with an edit summary that made clear he thought such redirects in the main namespace were pointless. Apparently not all admins are in agreement about that. What should one do in this case? (Besides, should one create all the other obvious [[XXX language]] redirects? Logic and uniformity would dictate that we do so, if the Spanish language precedent is taken seriously. The same argument mentioned there several times, "people search for these terms", can be used verbatim for all possible [[XXX language]] redirects.) --Pereru (talk) 22:21, 12 August 2012 (UTC)
              • The reason they ever existed is more historical than logical. Initially, many templates and pages imported from Wikipedia at the dawn of this project had links to "XXX language", and redirects were created to bridge the links. It was only later that the problems of redirecting links on a dictionary became apparent and "policy" was instituted. No one has yet gone through the trouble of a systematic cleanup. --EncycloPetey (talk) 23:12, 12 August 2012 (UTC)
                  • In this case, I'll go add rfd's to those terms, and hopefully they will be cleaned up now. --Pereru (talk) 00:03, 14 August 2012 (UTC)
              • I have restored it so that your example works. Additionally, this topic was already under discussion in RFD so a deletion is premature. Just don't go about creating others until the issue is settled. DAVilla 02:34, 22 August 2012 (UTC)
              • For the record, all the fill-in-the-blank Language pages above have been requested for deletion. Just as we don't put RFD of templates on the template page because it breaks the template, there isn't much point in putting an RFD on a redirect page since it breaks the redirect. Really, that looks far worse than either it redirecting or simply not being there, plus none of the bots realize they're supposed to avoid it. There should probably be another solution, but for now just pretend the RFD is there because for all intents and purposes it's in force. DAVilla 02:47, 22 August 2012 (UTC)

I can't edit my own user pages

D'oh. Could someone unprotect my user page and all subpages? Thanks. Equinox 22:57, 12 August 2012 (UTC)

Done, I think. Test them. — Ungoliant (Falai) 23:05, 12 August 2012 (UTC)

Problem: Norwegian dialectal entries

The previous thorough debate on the setup of Norwegian entries ended weakly in favour of the two-header solution: the two written standards of Norwegian are to be treated as if they were separate languages (which they clearly are in some definitions of 'language', anyway), with the headers ==Norwegian Bokmål== and ==Norwegian Nynorsk==. (As a side note, my mind has since then changed somewhat, and I now completely support this solution.)

The problem is now how to treat dialectal terms with the two-header solution. I see two viable options:

  • a) Treat dialectal Norwegian terms under the header ==Norwegian== (thus effectively settling for a three-header solution).
    • Drawbacks: Three L2 headers instead of just two. It might also seem odd that the header ==Norwegian== is exclusively dedicated to the dialects (and potentially other non-standard terms).
  • b) Create two entries (typically close to duplicates of one another) under the headers ==Norwegian Bokmål== and ==Norwegian Nynorsk==; one entry for each header.
    • Drawbacks: In many cases, this will lead to two L2 headers where only one is really needed. It is also somewhat ironic to treat dialects under the headers of strictly standardised written languages.

So, what is the best solution? Njardarlogar (talk) 14:07, 14 August 2012 (UTC)

Are we to assume that the dialects cannot be assigned to either of the standard languages, i.e. "Bokmål dialects" and "Nynorsk dialects"? —Angr 20:52, 14 August 2012 (UTC)
If there is no better solution to this, I would argue that Bokmål and Nynorsk are themselves two dialects of Norwegian. It is not that unusual for one language to have several distinct standards. Catalan and Valencian are a notable example as are (I think) European and Brazilian Portuguese, Eastern and Western Armenian, Traditional and Simplified Chinese, Hindi and Urdu, Serbo-Croatian, and even English. In most of these cases we treat them as one single language. So personally I think treating Norwegian as a single language with two standard written dialects makes more sense. —CodeCat 22:07, 14 August 2012 (UTC)
Is a large proportion of the lexicon spelled the same in both standards? That's certainly the case with US/GB English, PT/BR Portuguese, and Serbian/Croatian/Bosnian (within each alphabet, of course). If so, that's also a good argument for just having one Norwegian, as it will cut down on redundant entries. —Angr 22:33, 14 August 2012 (UTC)
I'm not quite sure how much of it is the same. If this page is an indication, the differences are only small. If we choose to have separate languages, then all the words that are not bold in that text would be duplicated, which is almost all of them. —CodeCat 23:13, 14 August 2012 (UTC)
Thanks for finding that. (I especially enjoy the quote "you don't really have to concern yourself with the feminine gender as a student".) If that's representative, I'm really in favor of having just a single Norwegian heading and marking the words that are different with a context template {{Nynorsk}} or {{Bokmål}}, or whatever dialect. —Angr 07:30, 15 August 2012 (UTC)

I am not familiar with the standardisation processes in the languages that CodeCat mentioned, so I will have to ask: to what extent are the different standards recognised as separate entities? Are books/publications in the language often marked with "language (form): <name of variant>"?

The thing is that Nynorsk and Bokmål are written standards only; they do not reflect one individual dialect or one set of dialects only. Mixing Nynorsk with Bokmål is as "bad" as mixing e.g. Swedish with Bokmål or Nynorsk; even if such mixing typically is less obvious (Swedish has ä and ö for Norwegian æ and ø). To be absolutely sure to avoid mixing, we would have to mark any sense, any related word (synonym etc.) as either Nynorsk or Bokmål only, or both. Leaving senses and words that are found in both written standards without marking is asking for mistakes to be made.

The problem with trying to gauge the differences between the two written standards is that each of them has a myriad of alternative forms for a huge number of words. Thus we have two different ways of measuring the differences: how close can we theoretically get the two written standards (allowing ourselves to use word forms that are hardly ever used by anyone where beneficial to similarity (and the other way around)), and how close are they using the most frequently used forms only?

In the 5th sample text I would myself have written Det var ei sterk nasjonal stemning i den unge norske staten på 1800-talet og mange byrja [infinitive marker å removed] arbeida for eit eige norsk skriftmål, bygt på talemålet. Two of the changes that I made (-språk (language) > -mål) are also perfectly fine for Bokmål; however, they are more in the spirit of Nynorsk, as Nynorsk tends to avoid loan words, in this case språk from Low German sprake. Likewise, myself and many other writers of Nynorsk would substitute forskjellig (different) with ulik, although technically, the word is equally accepted in both written standards. Most of the Nynorsk sample texts would have been more different from the Bokmål texts if they were written by me (the way I always write). Often, sentence structure is also different. The Bokmål sentence landets sterkeste mann ('the country's strongest man') would have to be written with a different structure in Nynorsk, something like den sterkaste mannen i landet, as "genitive" forms like landets of landet are verboten in Nynorsk. Such "genitive" constructs are very common in Bokmål.

Also, some words that have exactly the same spelling can have pretty different pronunciation, such as meg (me) (nn /meːɡ/ , nb /mɛi/) and (to a somewhat lesser extent) norsk (Norwegian) (nn /nɔrsk/, nb /nɔʂk/). Njardarlogar (talk) 10:41, 15 August 2012 (UTC)

One problem with seeing the two standards as absolute and unable to be mixed is that it also implies that any dialectal forms that are not part of the standard are not officially Bokmål or Nynorsk. So if someone uses a dialect word in their Bokmål writing, then it would no longer be Bokmål. Going by the two-header approach, we would then have to include a third header "Norwegian" just for words that occur in neither standard, which I think is a bit silly. I very much doubt anyone follows the standards so rigidly... language is living and flexible after all, and I wouldn't be surprised if some people mix Bokmål and Nynorsk whenever they see fit, too. From what I know about the two, Nynorsk is used more in southwestern Norway, so is it possible that what you consider typical 'Nynorskisms' are really just features of those dialects that occur in speech as well as in writing? —CodeCat 10:55, 15 August 2012 (UTC)
It's all about the mindset. The vast majority of Norwegian texts that are not purely dialectal are written either with a Nynorsk or a Bokmål philosophy. There is also a third mentality, and that is writing in one's own dialect using a mostly phonemic orthography. However, how distinct the three mentalities are depends on how close the writer's dialect is to the standard language he normally writes; so the dialectal orthography tends to be the most phonemic outside of southwestern Norway (Nynorsk) and central eastern Norway (Bokmål).
Either way, you could of course substitute some Bokmål words for Nynorsk words in a Bokmål text, or vice versa, and say that the result is still Bokmål or Nynorsk. However, if done without great consideration, it would be either weird Bokmål or weird Nynorsk, weird in a similar way that using e.g. English, Spanish or Albanian words instead would be; although you could still call it Nynorsk or Bokmål in these cases, too. Both Nynorsk and Bokmål have strong separate traditions, and therefore mixing will look weird. No newspapers do it, no commercials do it, no websites do it etc. Njardarlogar (talk) 15:06, 15 August 2012 (UTC)
Isn't that true in English, too? I mean, I'd find it weird if a text mixed clear Britishisms with clear Americanisms. —RuakhTALK 15:29, 15 August 2012 (UTC)
That's exactly what I am arguing for: that Nynorsk and Bokmål are not only different on paper, but that they are also generally thought of as two entirely different beasts. The reason why I brought that up, is because the focus on how many words are shared between the two feels odd when the two written standards are separate entities with separate histories. For instance, the word ut (out) is found in Nynorsk (with this exact form) because it is the most frequent form of the direct descendants of the Old Norwegian word út, whereas it is found in Bokmål (with this exact form) because Danish also has the word, and it was found in the vocabulary of the Norwegian urban upper class, and there were no widespread lower class forms that could rival it in later reforms of Bokmål when the weight was shifted further away from the urban upper class to the urban lower class and less urban dialects. In other words: identical words, but they have very different histories, and different reasons for being present in the relevant standard language. Njardarlogar (talk) 16:32, 15 August 2012 (UTC)

Njardarlogar, you say: Also, some words that have exactly the same spelling can have pretty different pronunciation, such as meg (me) (nn /meːɡ/ , nb /mɛi/) and (to a somewhat lesser extent) norsk (Norwegian) (nn /nɔrsk/, nb /nɔʂk/). — but pronunciation is not part of the language standards. People in Bergen will say /nɔʁsk/, in dialect as well as in Bokmål. And if people in the Oslo area speak Nynorsk, they will do it with the pronunciation details of Eastern Norway. I had a university teacher who talked Nynorsk the Eastern way. Pronunciation depends on the dialect background of a speaker. Most Nynorsk users live in Western Norway, so average Nynorsk will have a Western accent when used orally. In the same way, most Bokmål users live in Eastern Norway, so average Bokmål will have an Eastern accent. --MaEr (talk) 17:44, 15 August 2012 (UTC)

We agreed in February-March 2011 to have two headings, one for Norwegian Bokmål and one for Norwegian Nynorsk. Those are the two standard languages we are documenting. In any book or newspaper in either of these languages, a word that sounds like belonging to slang or some dialect rather than the standard language, should be documented as such. If the same word appears as an expression of dialect in Nynorsk, Bokmål and Swedish, that word would be listed under each of the three language headings, just like a word that appears as an expression of slang in all three standard languages. This is to say, that the b) alternative in the original post is the correct one. The objection that "In many cases, this will lead to two L2 headers where only one is really needed" is invalid. Trying to use a common heading would be like using ==Scandinavian== for words like en, fem and bil that are (roughly) the same in all Scandinavian languages. We don't do that. We treat Danish, Norwegian Bokmål, Norwegian Nynorsk, and Swedish as four independent languages. --LA2 (talk) 18:14, 15 August 2012 (UTC)
I agree with LA2 in my answer to the original question. We start with the refutable presumption that the words in a Nynorsk/English/Dutch text are Nynorsk/English/Dutch words. Sometimes, the same series of letters is found in multiple texts with the same meaning and even origin, e.g. guerrilla is used in both English and Dutch: then we have "duplicate" entries, one ==English== and one ==Dutch==, for the "same word", because the pronunciation and declension may be (and is) different. As long as we treat Nynorsk and Bokmål as separate languages (per the earlier consensus), it is appropriate to have separate, citation-supported entries for "dialectal" terms in each language: we may find that some dialectal terms are attested in both lects, but dialectal terms may be attested in only one of the two lects. But that's my answer to the original question; it's not an argument against merging the two and distinguishing them by {{context}}, {{a}}, etc, as we do for US and UK English. I lean in favour of merging them... we don't have separate headers for Schweizer Hochdeutsch (Swiss Standard German, not to be confused with Swiss German) vs Bundesdeutsches Hochdeutsch (FRG Standard German), although they differ in vocabulary, spelling, syntax (if not grammar) and pronunciation about as often as Nynorsk vs Bokmål seem to. - -sche (discuss) 22:26, 15 August 2012 (UTC)
@MaEr: norsk seems to be a poor example. To my knowledge though, most people that read Bokmål will pronounce the -eg personal pronouns as /-ɛi/ regardless of how their dialect does it - I don't think anyone pronounces jeg as /jeːɡ/, and thus it comes naturally to pronounce the other personal pronouns in a similar fashion.
@LA2: The problem is that you can find a huge amount of texts written entirely in dialect, and there is no way to tell which written standard they would "belong" to - the only link may be that the dialectal texts use the same alphabet as the written standards, and typically use the pronunciation that is represented by the name of each letter.
@-sche: What about the fact that Nynorsk and Bokmål have their own ISO 639-1 codes - wouldn't that set them apart from most other examples? Njardarlogar (talk) 10:40, 16 August 2012 (UTC)
Per ISO codes: Bosnian, Croatian and Serbian all have them: sr, hr, bs. And that is all three: 639-1, -2 and -3. Perhaps Montenegrin will have one soon. But we anyway treat them (B,C,S,M) all as one. Here. Some consensus exists to do so. And I believe that achieving that, if I may call it consensus, was no small thing. Not that I want to reunite all similar languages/dialects under one umbrella, especially as I am not that knowledgeable of the Norwegian language/dialect distinctions, but I don't put that much faith in ISO codes as they are a product of a process made by SIL International. And this organization AFAICT catalogues languages/dialects based predominantly on the fact that someone managed to produce a translation of the Bible in the required "language". --BiblbroX дискашн 13:49, 16 August 2012 (UTC)
ISO also denied Elfdalian/Dalecarlian a code, deciding it was a Swedish dialect, even though I can't understand it with my knowledge of Swedish. Meanwhile, I have no problem reading the Occitan Wikipedia with my knowledge of Catalan, and Occitan does have an ISO code. So really, having an ISO code isn't really a reliable indication of how much of a language something is. —CodeCat 14:08, 16 August 2012 (UTC)
I did not mention ISO 639-1 as a way of arguing that it was more appropriate to label Nynorsk and Bokmål as separate languages - by most definitions of the word language, they certainly are not. However, it is not hard to argue that Norwegian and Swedish also are two subdivisions of the same language, which could be called e.g. "North Nordic", and you could add Danish to the mix and call the language "Mainland Nordic". However, they are typically treated as separate languages, at least in non-linguistic contexts.
Nynorsk and Bokmål have been separate written standards for 140-160 years. In fact, when the earliest stage of Nynorsk was created (Landsmål), Nynorsk was as different from Bokmål as it was from Danish, because Bokmål did not exist until decades later and was at this time "99%" equal to Danish. Even after the creation of Bokmål, Nynorsk and Bokmål were considerably more different than they are today. E.g. traditional Nynorsk contained a huge amount of unique (not at all present in Bokmål) and fundamental words like, as examples, um (about), upp (up), burt (away), yver (over), andsvar (responsibility), soli (the sun), kvinnor (women) (the last two represent entire grammatical categories and endings that both Bokmål and present day Nynorsk lack) which were later changed to om, opp, bort, over, ansvar, sola, kvinner; terms that all coincide with Bokmål.
Those word comparisons were provided as historical context for those who are keen to go by how many duplicates we would end up with. Njardarlogar (talk) 17:31, 16 August 2012 (UTC)

Wiktionary is built around standard languages, seen as separate entities. The dialect continuum can't be dealt with in any other way than as exceptions from one of the standard languages. This is one of many imperfections of Wiktionary. Perhaps in the future, somebody will invent a smarter system, but right now we are building Wiktionary on these premises. -- The "huge amount of texts written entirely in dialect" are written, I suppose, in different dialects. It would be hard to find a large corpus by many authors written in a single dialect in a consistent way. If that were to happen, for example if a newspaper were to appear in Ranværing (Mo i Rana) dialect, it might be considered as a third Norwegian standard language, and we could consider adding a new heading for ==Norwegian Ranværing==. For the time being, however, we could list Ranværing dialect expressions under either or both of the two existing headings. --LA2 (talk) 13:15, 18 August 2012 (UTC)

But why would we make such an exception for Norwegian? We don't make it for Serbo-Croatian either, which also consists of three (or four) separate standards. And in the same way that someone would write only in their particular kind of Norwegian (rather than mixed), people probably don't write a Serbian text full of Croatisms. —CodeCat 13:27, 18 August 2012 (UTC)
I don't suggest that we will create a 3rd heading for Norwegian Ranværing. I'm convinced that this will never happen. I'm just outlining what would be necessary for it to happen, i.e. that this dialect would have to grow into a new written standard language (which seems most unlikely). -- As an example that points in the opposite direction of Serbo-Croatian, there are already 12 subcategories to Category:Sami languages with the extinct Category:Akkala Sami nouns having one entry. --LA2 (talk) 14:20, 18 August 2012 (UTC)
CodeCat: the question for you then, becomes: why do we treat Danish, Swedish and Norwegian as separate languages? Njardarlogar (talk) 15:18, 20 August 2012 (UTC)
I would say that Swedish and Danish differ quite a bit, and Swedes have a lot of trouble understanding Danish (see w:North Germanic languages). I don't think it's that extreme between Serbian, Croatian and Bosnian. In the end whether something is a language or not is relatively arbitrary (Afrikaans differs less from Dutch than some Dutch dialects do). But I don't think Bokmål or Nynorsk are languages. —CodeCat 15:35, 20 August 2012 (UTC)
I've done a comparison between Swedish, Nynorsk and Bokmål through a short text here. Nynorsk and Bokmål have more in common with each other than with Swedish, but the difference is not large. Most of the differences between Swedish and the two Norwegian standards are small: like ö instead of o and -a instead of -e, -ade and -at instead of -a, and so on. In other words, what separates Swedish from either Norwegian standard is what separates the two Norwegian standards from each other, only to a slightly larger extent. Evidently, it is from a linguistic point of view almost arbitrary to treat Swedish and Norwegian as two separate languages while treating Nynorsk and Bokmål as one language. Njardarlogar (talk) 19:20, 20 August 2012 (UTC)
There's also Norwegian ei and au/øy against Swedish e/ä and ö, which is a difference that can be traced back to the time of Runic writing. The same for the difference between Norwegian dere/dykk and Swedish er, which Norwegian shares with (Old and Modern) Icelandic. —CodeCat 19:31, 20 August 2012 (UTC)
A language is a dialect with an army and navy. --WikiTiki89 (talk) 19:33, 20 August 2012 (UTC)
Dear Njardarlogar, why are you trapped in this linguistic riddle? Of course it's arbitrary from a linguistic point of view whether Danish, Swedish, Bokmål and Nynorsk are considered to be languages or dialects, different or the same. But Wiktionary is not a philosophical exercise. It's a dictionary. And from the dictionary writing (lexicographic) point of view, it matters a whole lot that Danish, Swedish, Bokmål and Nynorsk are four recognized standard languages with published dictionaries and newspapers and literature that consider the spelling correct when it agrees with a published dictionary, and incorrect (or slang or dialect) when it doesn't. We agreed before to have separate entries for Bokmål and Nynorsk, rather than a common entry for Norwegian. That should be the end of this discussion. Now, get on with writing this dictionary. There are still far fewer entries here for Bokmål and Nynorsk than for Danish and Swedish. And there are a bunch of old entries for Norwegian that need to be split into Bokmål and Nynorsk. --LA2 (talk) 20:18, 22 August 2012 (UTC)
That can't be the end of this discussion, because this discussion starts with the question about where to place words that are considered Norwegian, but aren't part of the standard languages Bokmål or Nynorsk.--Prosfilaes (talk) 22:32, 23 August 2012 (UTC)
Yep. I also happen to argue against CodeCat's view (which is to use one header for Norwegian) by bringing up Swedish and Danish. Njardarlogar (talk) 10:12, 24 August 2012 (UTC)
The answer is to put these dialect words in one or both of the available boxes: Norwegian Bokmål or Norwegian Nynorsk. We have no other boxes to put things in. --LA2 (talk) 10:58, 26 August 2012 (UTC)
 ? That's absurd. We have many other boxes to put things in, including Norwegian (ISO 639-1 no). Arbitrarily splitting them into boxes that don't represent anything real about them isn't helpful.--Prosfilaes (talk) 14:04, 26 August 2012 (UTC)
This discussion thread starts with a link to: "The previous thorough debate", from February-March 2011 where we decided to use exactly these two headings (Norwegian Bokmål, Norwegian Nynorsk) and not Norwegian. That's the prevailing consensus and the current question of dialect words is one example of how such a consensus is used in practical everyday life. -- Prosfilaes, what is your role here? Do you contribute to the Norwegian or Scandinavian entries? Did you read the discussion thread from 2011? Or do you just want to troll the beer parlour discussion? --LA2 (talk) 19:50, 26 August 2012 (UTC)
The question of this thread is whether we want to stand by that consensus when we have words that don't cleanly fit into the two-language model. Is your position so weak that you have to resort to ad hominem accusations of trolling instead of discussing the issue at hand?--Prosfilaes (talk) 04:03, 27 August 2012 (UTC)

Adding Norwegian dialectal terms by linking them to either/both written standard through attestation does not work. It is artificial. Dialectal words both can be and are used entirely independently of written standards, and deciding whether a mix of dialect and standard language is Nynorsk or Bokmål is often impossible/meaningless. It would slow down the addition of these terms unnecessarily. If there are going to be separate headers for Nynorsk and Bokmål, the only reasonable options are the two that I listed at the top. Of these, placing the dialectal entries under ==Norwegian== is the most elegant and natural solution. Also given names would be put under this header, as they are invariant of written standard. Njardarlogar (talk) 08:57, 27 August 2012 (UTC)

So Norwegian has three forms, Bokmål, Nynorsk and Neither? That's a bit silly... I just noticed that the Bokmål Wiktionary treats Norwegian as one language, while the Nynorsk Wiktionary treats it as two. Compare no:hauk and nn:hauk. Looking at the Bokmål version I notice they also have Riksmål. What do we do if a word is found in Riksmål but not in Bokmål or Nynorsk? Do we make yet another language header, or do we consider that 'Neither'? —CodeCat 09:58, 27 August 2012 (UTC)
The fact that no.wiktionary treats both Bokmål and Nynorsk under a common ==Norsk== (i.e. Norwegian) heading was the starting point for the discussion in 2011. Norwegian doesn't have three forms, it has plenty of forms, plenty of dialects. Very few of these are ever written, and major newspapers only exist in Bokmål, Riksmål (an older form of Bokmål, used in Aftenposten), and Nynorsk. Riksmål and Bokmål are very close and are covered by the ==Norwegian Bokmål== heading. It does make sense to have one common heading for Norwegian (as no.wiktionary does) and it also makes sense to have two headings, as was the consensus for en.wiktionary in 2011. But introducing a third heading only to be used for those dialect words that are common to both Bokmål and Nynorsk doesn't make sense. Most readers of Wiktionary would, upon seeing the ==Norwegian== heading, conclude that these are standard Norwegian words, rather than some local dialect.
I have tried to assist and guide the reaching of a consensus for Norwegian entries in en.wiktionary in the hope that it would put discussion aside and allow for productive work. This assumption of mine was apparently wrong, so I will give up on the Norwegian language(s) here. It seems to be a lost cause for the next several years. I'm glad that we have a reasonable number of entries in Swedish and Danish. --LA2 (talk) 11:49, 27 August 2012 (UTC)
Any dialectal term must of course be marked with {{dialectal}} (or any similar, relevant template), regardless of which header it falls under.
CodeCat: Riksmål and Bokmål share the same origin, so it would be natural to treat them the way the differences between British and US English are treated. Nynorsk and Bokmål do not share the same origin, hence it makes sense to treat them separately.
Treating Riksmål in the manner the no.wikt does without treating Høgnorsk in the same way is an obvious bias as both forms are used by small minorities within their respective related official standard (most of the terms that are unique to Riksmål, such as sprog and efter, you will hardly ever encounter in contemporary texts). Njardarlogar (talk) 15:28, 27 August 2012 (UTC)
Ok well let's look at this again. We can't go just by what is 'modern' because we have to account for attestations in older texts. So how many Norwegian standards are there? Bokmål, Nynorsk, Riksmål, Høgnorsk... and I think also Landsmål? And isn't there also a standard called Samnorsk that never caught on? I realise that Riksmål is the precursor to Bokmål, but does that mean we include Riksmål words under the Bokmål header? And presumably then we'd have to include Høgnorsk and Landsmål under Nynorsk as well. But while they do fall under the same general grouping, it's obvious that Riksmål did not just 'become' Bokmål if both are still in separate use (one didn't actually supersede the other), so it somehow seems wrong to do it that way. It'd be a bit like treating Esperanto and Ido as forms of each other, since Bokmål was intended to improve upon Riksmål in much the same way that Ido was meant to 'fix' Esperanto. —CodeCat 22:07, 27 August 2012 (UTC)
How many standards?
  • I personally often use the word Landsmål to refer specifically to Nynorsk as it was before it received an official standard; though technically, it is simply the name Nynorsk had prior to 1929.
  • Høgnorsk is a de facto standard; it follows older reforms of Nynorsk, though there is no body regulating what is official Høgnorsk and what isn't.
  • Riksmål is a conservative (read: less removed from Danish) version of Bokmål, but unlike Høgnorsk, it is regulated by a state-independent body.
  • Nynorsk and Bokmål are the only written standards of Norwegian that are recognised by Norwegian law, and they are also both regulated by the same state-controlled body.
So, you seem to suggest that putting Riksmål under the ==Norwegian Bokmål== header would be unfair. However, given that Riksmål is essentially a subset of Bokmål (old and present), it seems entirely fair to me. The only parts of Riksmål that should not be a subset of Bokmål, would be cases where an old Bokmål spelling is used to create a new word. E.g. the word <prefix>sprog comes into use in 2005, while sprog has not been a part of Bokmål for decades, thus the combination of <prefix> and sprog exists only in Riksmål and not older Bokmål.
I know nothing about the constructed languages that you mention; however, it seems that Ido is a descendant of Esperanto, while, on the other hand, neither Bokmål nor Riksmål is a descendant of the other. Riksmål contains de facto (minus any example of the form above) two types of words: a) words that form a subset of Bokmål, b) spellings that are no longer official in Bokmål. The vast majority of the words fall in category a). Njardarlogar (talk) 14:43, 28 August 2012 (UTC)


I have always disliked phrasebook because of the inherent stupidity of the explanations. E.g. I need a toothbrush is explained as "Indicates that the speaker needs a toothbrush". This is useless, because if a user does not understand the sentence "I need a toothbrush" he is not going to understand the explanation either. It occurred to me recently when I was fulfilling the translation request for I lost my handbag (which is cleverly enough explained to indicate that the speaker has lost his/her handbag!) that the phrasebook entries should have no explanation at all. That way they would be much less annoying. --Hekaheka (talk) 18:34, 15 August 2012 (UTC)

  1. Support --Μετάknowledgediscuss/deeds 18:36, 15 August 2012 (UTC)
  2. Support -- this is the second-best solution (the best being to delete the stupid things) SemperBlotto (talk) 18:46, 15 August 2012 (UTC)
  3. Support. But I do like having a sense-line; it's how we're used to organizing things. How about
    # {{&lit|need|toothbrush}}
    1. Used other than as an idiom: see need, toothbrush.
     ? (Confession: I recently did that to [[scuba diver]].)
    RuakhTALK 19:00, 15 August 2012 (UTC)
Having no sense line would accentuate the separateness of Phrasebook, i.e. not being a part of the Wiktionary proper. Besides, "I need a toothbrush" is hardly an idiom (expression peculiar to or characteristic of a particular language, especially when the meaning is illogical or separate from the meanings of its component words). --Hekaheka (talk) 19:26, 15 August 2012 (UTC)
  • Good idea. —RuakhTALK 19:38, 15 August 2012 (UTC)
  1. Support {{translation only}} (or {{&lit}}, but not nothing). —msh210 (talk) 20:24, 15 August 2012 (UTC)
  2. Support using {{translation only}}. (If there are some phrasebook entries with non-obvious meanings, I'd make exceptions.) - -sche (discuss) 04:35, 16 August 2012 (UTC)
  3. Support {{translation only}} BTW, is this a discussion for bashing phrasebook entries and trying to get rid of it altogether or for making it better? --~~~~ —This unsigned comment was added by Atitarev (talk • contribs) at 05:04, 16 August 2012 (UTC).
  4. Support. Matthias Buchmeier (talk) 09:26, 16 August 2012 (UTC)
  5. Support with -sche's exceptions. —Angr 10:19, 16 August 2012 (UTC)
  • I don't agree with using {{translation only}} because it says "translations into other languages are [idiomatic]" which is not necessarily true. {{&lit}} also implies some kind of idiomaticity which is not necessarily there. Would it be too much trouble to make a new template? Siuenti (talk) 10:25, 16 August 2012 (UTC)
    • That's just the documentation, though. The template itself doesn't say that. —CodeCat 10:37, 16 August 2012 (UTC)
      • Basically Category:English phrasebook needs trimming to make sure it does what it says it does, rather than egotistically creating entries for our own pleasure. WT:CFI#Phrasebook does give some guidelines on what to include and what not to, would it really be so bad if we actually stuck to them? They're hardly super-rigid. Mglovesfun (talk) 10:56, 16 August 2012 (UTC)
        • Your link doesn't seem to go anywhere... —CodeCat 11:39, 16 August 2012 (UTC)
          • I imagine that Mglovesfun meant to link to § "Idiomaticity", whose last regular paragraph reads: "Phrasebook entries are very common expressions that are considered useful to non-native speakers. Although these are included as entries in the dictionary (in the main namespace), they are not usually considered in these terms. For instance, What's your name? is clearly a summation of its parts." The guidelines in question are therefore very common and considered useful to non-native speakers. It goes without saying that phrasebook-entry-creators have been ignoring the former and playing games with the latter. —RuakhTALK 12:14, 16 August 2012 (UTC)
            • It's hard to judge usefulness based on commonness though. Earlier today I wanted to know how to say "to rob a bank" in Swedish. There are several ways to say "rob", two of which are råna and beröva. In Dutch we say "een bank beroven" so my initial guess was for beröva. But after I looked on Google I found out that råna is the normal word to use in this phrase as it's far more common. That's the kind of thing a user of Wiktionary would like to know! —CodeCat 12:51, 16 August 2012 (UTC)
              • I don't think that "to rob a bank" belongs in a phrasebook. If the question is "which verb for 'to rob' is most appropriate for a given circumstance?", then the entries for råna and beröva should have usage notes that answer that. Otherwise we'll have to create phrasebook entries for "to rob a convenience store", "to rob a coffee shop", "to rob a lemonade stand", and so on. —RuakhTALK 13:12, 16 August 2012 (UTC)

OK. I will modify all English phrasebook entries exactly per Sche's statement above unless somebody tells me not to. (Warning: you have to tell me very soon, and you have to have a good reason for it.) --Μετάknowledgediscuss/deeds 18:02, 17 August 2012 (UTC)

Just let the proposal sit in Beer parlour for a week. A practice of implementing proposals two days after they have been made is unadvisable. --Dan Polansky (talk) 20:45, 17 August 2012 (UTC)
I agree. —RuakhTALK 20:48, 17 August 2012 (UTC)
thirded. Mglovesfun (talk) 21:12, 17 August 2012 (UTC)

Template listing sequence elements eg "quaternarily"

Regarding sequences like "primary", "secondary", "tertiary", etcetera, or like "semicentennial", "centennial", "sesquicentennial", etcetera: We all know there are additional elements in such sequences (and there are dozens or hundreds of sequences), but it may be very difficult for a casual user to figure out how to start from the sequence-element she knows (eg "secondarily" or "bicentennial") and learn the related element she actually wants (eg "quaternarily" or "tercentennial").
It seems like it would be really handy to enable a Wiktionary entry page to readily display other elements in that term's sequence (I'm thinking of something like Wikipedia's templates such as W:Template:US Presidents, which can be put at the side or bottom of an article on a particular term, with that term ordered in the template among its related terms).
Is there anything like that already at Wiktionary? Any ideas about how to promote the idea? --→gab 24dot grab← 16:10, 17 August 2012 (UTC)

We do already have boxes for cardinal and ordinal numbers, like those at two and second. Those boxes could probably be expanded to cover the words you mentioned. —CodeCat 16:19, 17 August 2012 (UTC)
I see "quaternary#See also" simply lists terms such as "duodenary" (12); I believe we can do better/slicker than that. I understand the mention of Template:cardinalbox, but it isn't really what I envision and it is very limited (eg direct links only to the elements immediately before and after). I guess for now... Regarding the expansion of the existing Template:cardinalbox to allow simple terms like "secondary" and "secondarily", what name is there for those two parameters? --→gab 24dot grab← 17:11, 17 August 2012 (UTC)
See Wikisaurus:number. Each member on that page needs to be referenced to the Saurus page. --BB12 (talk) 21:06, 17 August 2012 (UTC)

eat one's Wheaties

Have things changed around here? It used to be that to cite a term, we had to cite the exact spelling including the precise words of a phrase, e.g. eat his Wheaties vs. ate my Wheaties etc., and that a generalized phrase as this title suggests had an even higher bar. DAVilla 02:20, 22 August 2012 (UTC)

I don't see why "eat his Wheaties", "eat your Wheaties", and "ate my Wheaties" shouldn't be counted as cites for this. Surely inflected forms count as cites for lemma forms. Why wouldn't they? —Angr 09:17, 22 August 2012 (UTC)
I'm not saying they don't count, I'm saying the burden of proof is greater. I could probably find three citations that use tractor and bowling ball in some metaphoric way, claiming their base clause is idiomatic, but that doesn't make bowling ball from one's tractor an expression worth commenting. DAVilla 02:17, 24 August 2012 (UTC)
Huh? I think we're talking about different things. All I'm saying is, "ate my Wheaties" is just an inflected form of eat one's Wheaties and can be used in a quotation for it, just as "tractors" is an inflected form of tractor and can be used in a quotation for it. I don't really understand what "higher bar" or "burden of proof" you're talking about. —Angr 10:30, 24 August 2012 (UTC)

Declined forms of idiomatic phrases

Usually we list and create separate entries for declined forms of idiomatic phrases. For example, in addition to dependent variable, we also have a separate entry for its plural form dependent variables, even though it's just the regular plural of the second part (variable) while nothing changes about the first part dependent. However, other languages have richer inflection. In such languages, the adjective may inflect, too. For example, the German geschweifte Klammer ("curly bracket") has as many inflectional forms as there are forms of the adjective geschweift. Should we really create separate entries for all these forms, even though they are exactly like non-idiomatic combinations of an adjective and a noun, such as gelbes Auto ("yellow car")? I don't think this makes sense. But which entries should we actually create then? Longtrend (talk) 15:40, 22 August 2012 (UTC)

In English, with its limited number of inflections, the practice is to show the plurals of noun phrases, but practice is divided on verb phrases. I think there is a tendency to not show conjugation of longer verb phrases and of verb phrases with weak verbs. There seems to be some consideration of the possible value and cost of the extra entry. DCDuring TALK 16:00, 22 August 2012 (UTC)
I think for most languages, a notice like "see (term 1) and (term 2)" in the inflection section is enough. —CodeCat 16:13, 22 August 2012 (UTC)
In Russian, we have compound words with the full inflection, like именительный падеж and without it, like мобильный телефон. Providing inflection doesn't cause new entries and inflected forms are usually not wikilinks (verb inflections may have explicit links, e.g. выпускать, created manually) but there are inflected form entries as well. Compound verbs don't have entries for inflected forms. It's too much effort and if we need them, perhaps it could be done by a bot. You may want to check with User:Hekaheka, what she does for Finnish compound words and their inflected forms. --Anatoli (обсудить/вклад) 23:59, 22 August 2012 (UTC)

wiktionary - phonetic error?

That little graphic in the upper left hand corner seems to my eye to be a syllable short for pronouncing Wiktionary; any explanations? Wik - 'wik toin - (silly S shape)plus upside down e? a - missing? ry - i

??? —This unsigned comment was added by (talk) 16:09, 23 August 2012‎.

See Wiktionary for the pronunciation variations. --BB12 (talk) 16:15, 23 August 2012 (UTC)
Can we get the logo to link to the FAQ page instead of to the homepage? (I kid, but.)​—msh210 (talk) 17:59, 23 August 2012 (UTC)
That was my thought, too, but without the kidding. The main page can be accessed from the left column. --BB12 (talk) 19:35, 23 August 2012 (UTC)
Where is the logo? I can't find one. Mglovesfun (talk) 21:05, 23 August 2012 (UTC)
It's at the very top left corner of every page. It looks like a dictionary entry for the term Wiktionary. —RuakhTALK 21:19, 23 August 2012 (UTC)
I guess my .css suppresses it then, which isn't a bad thing. Mglovesfun (talk) 09:57, 24 August 2012 (UTC)
Found it, the final /ɪ/ is typical where I live (see User:Mglovesfun/Leeds IPA) but not throughout the UK. Apart from /r/ rather than /ɹ/, it is correct in the sense it is used by native English speakers. Mglovesfun (talk) 10:03, 24 August 2012 (UTC)
I think the OP's objection was not so much to the final /ɪ/ as to the lack of a vowel between /n/ and /r/. North Americans pronounce dictionary (and therefore also Wiktionary and Pictionary) with a secondarily stressed /ɛ/ there; for us it's a four-syllable word, as our actual entry for Wiktionary (as opposed to logo) shows. —Angr 22:46, 24 August 2012 (UTC)
For any non-regulars reading this discussion, here's a link to our previously-hinted-at FAQ page on the subject. - -sche (discuss) 21:37, 28 August 2012 (UTC)

Navigation "breadcrumbs" have been added to category boilerplates

I've added small links at the top of category boilerplates, to help users find their way within our category structure. For example see Category:English uncountable nouns. It should work ok, but if there are any problems with it please let me know. I hope it's useful. —CodeCat 14:36, 24 August 2012 (UTC)

I like it!   Re: if there are any problems with it: they seem to be broken at pages for "by language" categories, e.g. Category:Uncountable nouns by language. —RuakhTALK 15:13, 24 August 2012 (UTC)
Ok, I removed them from such categories, because they don't have a category 'tree' in the same way, they all have the same parent category. —CodeCat 15:17, 24 August 2012 (UTC)
I noticed there are a few cases where the first parent category is missing, and is replaced with the main language category instead. This is a problem with the way the subtemplates of {{poscatboiler}} and such define those categories. To fix them, parentbyname1= should become parentbyname2= and parent2= should become parent1=. The navigation always looks at the first parent, so if there is none (or just a 'by name' which is normally reserved for root categories) then it doesn't work. —CodeCat 15:26, 24 August 2012 (UTC)
I like it, too. :) - -sche (discuss) 21:19, 24 August 2012 (UTC)
I like it as well, although it does look a little bit ridiculous on a heavily nested page like Category:Luganda terms derived from Proto-Bantu. --Μετάknowledgediscuss/deeds 03:39, 25 August 2012 (UTC)
Pretty. DCDuring TALK 04:25, 25 August 2012 (UTC)
I looked at the derivation categories and I thought of getting rid of the 'terms derived from' part of the names, but there is no easy way to do that. —CodeCat 10:24, 25 August 2012 (UTC)

Template:arc/script (edit request)

Please add Armi. --Z 17:56, 24 August 2012 (UTC)

As a separate option? So that either script can be specified? --Μετάknowledgediscuss/deeds 23:48, 24 August 2012 (UTC)
How do you mean? You can already specify a secondary script with sc=. —CodeCat 01:23, 25 August 2012 (UTC)
Actually, I want it for here: "It is a member of the Aramaic family and written in Hebrew script." --Z 19:13, 26 August 2012 (UTC)
That's what it already says. Mglovesfun (talk) 19:16, 26 August 2012 (UTC)
I know, we want to add Aramaic script. --Z 19:24, 26 August 2012 (UTC)

About: Egyptian

I have begun an about page for Ancient Egyptian, which can be found here: Wiktionary:About Egyptian. I invite/encourage/need comments and criticism.

In addition, does anyone know exactly how much of the Manuel de Codage is implemented on wiktionary? Rotation of characters doesn't seem to work for me. Furius (talk) 02:35, 25 August 2012 (UTC)

Distinguish languages with alternate names or parenthetical disambiguators

It's long been en.Wiktionary's practice to use alternate names to distinguish languages, resorting to parenthetical disambiguations like "Kara (Tanzania)" only when languages have no alternate names. I've advanced this practice in the past... but I've started to question the wisdom of it, and wonder if we should reverse our approach and prefer parentheses.

The danger of alt names is that new contributors may overlook language codes in places they are present (e.g. if about to change a translation in a trans table), and may not even know to check what name we use for a particular language (code) the rest of the time. Thus, they may create entries like :

 # thing

which one of us will put into Category:Aja nouns and/or 'format' as

 # thing

unaware that the contributor was thinking of {{ajg}}, which we and the UN call 'Adja' but which WP and some references also call 'Aja'.

This is all the more insidious because the languages it affects are often obscure: who among us would even notice if a well-meaning scholar saw that we gave akpa as the 'Aja' ({{aja}}) translation of article and 'corrected' it to foobar (or whatever the word for article is in {{ajg}}-Aja), let alone perceive the reason for the mistake?

On the other hand, how much would we benefit from parentheses? Would contributors notice that we distinguished "Aja (Benin)" and "Aja (Sudan)", or would they blithely add ==Aja== words and "Aja:" translations, leaving us hardly better off, forced to {{rfc}} and perhaps ultimately delete the former and {{ttbc}} (but wait: with what language code?) the latter? (Just a day or two ago I saw a new entry created with the header ==Norwegian==, though I can't find it anymore.)

The examples of confusing languages I can most easily call to mind are ones I renamed myself: Template talk:ajg, Wiktionary:RFM#the_Bina_languages, #Bena_languages, #Kara_languages. But there are sure to be more I'm unaware of, given discussions like this one.

Yes, languages should be examined on a case-by-case basis: there's a strong argument for using alt names rather than parentheses to distinguish WT:RFM#Template:bgl-Bo from {{bpw}}-Bo (and from {{akm}}-Aka-Bo), for example. And it goes without saying that our header-, template- and category-structure depend on languages being distinguished from one another, somehow. But we should discuss whether our general tendency should change from alt names to parentheses. - -sche (discuss) 00:00, 26 August 2012 (UTC)

TL;DR? Short version: currently, we prefer to distinguish languages by using alt names, so e.g. {{aja}} is 'Aja' and {{ajg}} is 'Adja', though both names are used for both languages in reference works. Wouldn't it be better if we preferred parentheses, not just for 'Aja (Sudan)' vs 'Aja (Benin)', but in general? - -sche (discuss) 00:00, 26 August 2012 (UTC)
I think avoiding parentheses is good, but I think this problem is just another facet of our struggle to incorporate minority languages that none of us know much about. I don't think that there's any real solution except using {{also}} on pages like {{ajg}} and {{aja}} and realizing that there is no way out until we get (a) contributor(s), which could very well never happen :( --Μετάknowledgediscuss/deeds 02:53, 26 August 2012 (UTC)
This is a good point. The problem is going to grow as time goes on and more languages are added. One way around this is to require the ISO code be used in the wikicode instead of the language name. --BB12 (talk) 03:29, 26 August 2012 (UTC)
Unfortunately, I don't think there's overlap between the people who look up [[Template:aja]] and the one-time contributors who'd add a word in ==Aja== and maybe copy {{head|aja|noun}} from another entry (or, problematically, copy {{head|ajg|noun}} and ==Adja== assuming we just called 'Aja' by its alt name). :/ {{also}} would help us keep them straight :) but please add any {{also}}s after the language name: doing otherwise seems to break the code's entry in WT:LANGLIST, only a minor issue, but an avoidable one. (@Anyone looking to troubleshoot: see the current version of LANGLIST, where this version of {{ldl}} corrupted Kaan's entry.)
Similarly, I think requiring the code be used in the wikitext (like fr.Wikt?) isn't worth the hassle, as most languages have unambiguous names. We shouldn't let the tail of a few languages wag the dog of all the major languages with unique names that make up most of our pagecount. And I figure one-off contributors would either (a) enter ==Aja== rather than =={{aja}}== anyway, or (b) copy the code from an existing Aja entry (getting the right code half the time and the wrong code half the time), or (c) not contribute (and speakers in other, uniquely-named languages might also not contribute). The more technical we make things, the harder it is for people to contribute. :/
Whereas, I do think a person from Sudan would think twice before copying the header ==Aja (Benin)== to add a word from their Sudanese Aja. And likewise for other same-name languages. - -sche (discuss) 04:59, 26 August 2012 (UTC)
(FWIW, AFAIK the only langtemps with {{also}}s are {{ldl}} and {{law}}, both added by me.) I just think that parenthetical countries are a problem. For example, countries change, and then it can get really messy. In this case, Aja isn't spoken in Sudan any more. It's spoken in South Sudan (independent since 2011). And what about the thousands of Aja speakers across the border in the Central African Republic? So you can see that Aja (Sudan) is just asking for trouble, and that's the problem with parentheticals in general. --Μετάknowledgediscuss/deeds 05:15, 26 August 2012 (UTC)

Bot to handle {{t}} and its ilk.

I'd like to set Rukhabot (talk · contribs) on the task of converting between {{t}}, {{t+}}, {{t-}}, and {{tø}}. As currently coded, it follows these rules:

  • It goes based on database-dumps, so will typically have somewhat out-of-date information. (For the first pass, I think this is fine: currently these templates are, for the most part, two years out of date. For later passes, I'll see about improving this somewhat.)
  • It will only convert between those four templates. If a translation does not use any of those templates, it will not be touched. (Later on, I may work on {{t}}-ifying simple cases.)
  • It will not change any formatting, alter any language-codes, or anything like that. It will only change the name of the template being called. (Exception: if there's spurious whitespace inside the template-call, it can be removed. For example, {{ t+ | fr | le }} will become {{t+|fr|le}}.)
  • It does not try very hard to understand the subtle complexities of MediaWiki template syntax. It simply looks for (approximately) {{t[-+ø]?[|][a-z-]+[|][^|}=]+ followed by | or }}. So, for example, it will be fooled by {{t+|fr|asfasefasefase|2=le}}, which looks like it links to fr:asfasefasefase, but which actually links to fr:le. However, even in such pathological cases, it won't cause any serious harm — it just might select the wrong template.
  • It doesn't examine context at all; it's just as happy to update a {{t}} in a ====Synonyms==== section, or inside a comment, as a properly-used {{t}} in a ====Translations==== section. (See Wiktionary:Todo/Translations templates outside translations sections.)
  • It chooses between {{t+}}, {{t-}}, and {{tø}} using the rules you'd expect, with two special cases:
    • The language-codes nan, cmn, nb, rup, and kmr are hardcodedly mapped to zh-min-nan.wikt, zh.wikt, no.wikt, roa-rup.wikt, and ku.wikt, so they will result in {{t+}} or {{t-}}, not in {{tø}}. For example, no:yes exists, so the bot will convert {{t|nb|yes}} to {{t+|nb|yes}} and {{t|no|yes}} to {{t+|no|yes}}.
    • Due to the weird script-conversion stuff, I haven't figured out how to reliably tell if zh.wikt, sr.wikt, kk.wikt, ku.wikt, or iu.wikt has an entry, so for those, it will change {{tø}} to {{t}}, but will otherwise leave those Wiktionaries alone. (See Wiktionary:Grease pit/2012/August#ku:فەرهەنگ.)
  • It has no special behavior for B/C/S/M; for example, it will convert {{t|hr|Leiter}} to {{t+|hr|Leiter}} and {{t|sh|Leiter}} to {{t-|sh|Leiter}}.

For ten examples of the edits it makes, see Special:Contributions/Rukhabot?offset=20120826030100&limit=10.

Does anyone have any objections to my doing this?

RuakhTALK 17:09, 26 August 2012 (UTC)
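
A minimal Python sketch of the matching and selection rules listed above, for readers who want to see them spelled out. This is not Rukhabot's actual code: the regular expression is only an approximation of the pattern quoted in the list, and the names choose_template, retarget, wikis and entry_exists are hypothetical. The set of existing Wiktionaries and the entry-existence check are passed in as assumptions, standing in for whatever the bot derives from the database dumps. Whitespace cleanup inside template calls is left out.

```python
import re

# Approximation of the quoted pattern: {{t[-+ø]?|code|word followed by | or }}
T_CALL = re.compile(r"\{\{t([-+ø]?)\|([a-z-]+)\|([^|}=]+)(?=\||\}\})")

# Hard-coded language-code to Wiktionary mapping, taken from the list above.
WIKI_FOR_CODE = {
    "nan": "zh-min-nan",
    "cmn": "zh",
    "nb": "no",
    "rup": "roa-rup",
    "kmr": "ku",
}

# Wiktionaries where script conversion makes existence checks unreliable.
UNCHECKABLE = {"zh", "sr", "kk", "ku", "iu"}


def choose_template(current, code, word, wikis, entry_exists):
    """Pick 't', 't+', 't-' or 'tø' for a single translation.

    wikis        -- assumed set of language codes that have a Wiktionary
    entry_exists -- assumed callable (wiki, word) -> bool
    """
    wiki = WIKI_FOR_CODE.get(code, code)
    if wiki not in wikis:
        return "tø"                      # no Wiktionary for this language
    if wiki in UNCHECKABLE:
        # Only undo a wrong tø; otherwise leave the call as it is.
        return "t" if current == "tø" else current
    return "t+" if entry_exists(wiki, word) else "t-"


def retarget(text, wikis, entry_exists):
    """Rewrite only the template name of every matched call in text."""
    def fix(match):
        current = "t" + match.group(1)
        code, word = match.group(2), match.group(3)
        new = choose_template(current, code, word, wikis, entry_exists)
        return "{{" + new + "|" + code + "|" + word
    return T_CALL.sub(fix, text)
```

Like the pattern it imitates, this sketch ignores context and would be fooled by a call such as {{t+|fr|asfasefasefase|2=le}}; it only ever changes the template name, never the language code or the word.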

I have no real objections. In the category of minor things, though, I think that B/C/S/M translations ought to be merged and nested by the bot. I also think that {{t}}ification would be very helpful. However, if those features are not feasible currently, that's OK. --Μετάknowledgediscuss/deeds 17:37, 26 August 2012 (UTC)
I am not opposed to merging BCSM translations as such cause I treat them all as one language, but simple linking to sh.wiktionary of all the translations might be detrimental: some of them point to the real entries in the corresponding wiktionaries and by removing those links we are left without some information from those wikts. I am more for adjusting the t templates in the case of sh so it can optionally point to bs, hr, sr and/or sh. --BiblbroX дискашн 19:02, 26 August 2012 (UTC)
I don't think we should modify {{t}} to have special behavior for B/C/S/M, but your suggestion actually doesn't need any modification to {{t}}: if you want a given Serbo-Croatian translation to link to bs.wikt, then you can just use {{t|bs|...}}.
However, I was thinking we might want to create a separate template {{t-sh}}, or perhaps two templates {{t-sh-Cyrl}} and {{t-sh-Latn}}, that would link to all four B/C/S/M Wiktionaries with appropriate coloration. (Naturally, these templates would need four separate parameters, but if we use named parameters, e.g. {{t-sh|foo|m|bs=+|hr=+|sr=-|sh=-}}, it shouldn't be too bad.)
RuakhTALK 19:45, 26 August 2012 (UTC)
{{t|bs|...}} is not really a good idea because it will cause the translation itself to link to the Bosnian section on our own Wiktionary, which probably does not exist. —CodeCat 20:22, 26 August 2012 (UTC)
Oh, whoops, good point. —RuakhTALK 21:07, 26 August 2012 (UTC)
I don't mind creating a separate template, or two of them, for BCSM. It might also be useful to amend the script for adding translations so that it uses sh whenever a user tries to put bs, hr, or sr. If no one disagrees. --BiblbroX дискашн 17:17, 29 August 2012 (UTC)
It might be slightly unrelated, but your bot should also strip the xs= parameter from translation templates when it's going through them. That is a leftover from 2007 and doesn't actually do anything anymore. -- Liliana 20:02, 26 August 2012 (UTC)
O.K.; I can do that, at least in most cases. (I assume no one objects?) —RuakhTALK 01:45, 27 August 2012 (UTC)
I'd prefer all instances of {{tø|cmn| to be changed to {{t|cmn| (remove "ø"). Perhaps it's best to keep zh as the standard language code for Mandarin translations. Mandarin wiki uses "zh", not "cmn". If contributors use cmn, it always adds {{tø|cmn, which prevents it from linking to zh:wiki.
Can we stop the bots from converting "zh" to "cmn" to avoid confusions (to future contributors)? If the Chinese (Mandarin) wiki uses "zh" code, we should too, not "cmn". It's not in contradiction to our practice either.
Despite the bots knowing that "cmn" = "zh" for interwikis, in my observation, the bots incorrectly convert {{t}} to {{t-}}, even if a zh:wiki entry exists. I have been manually changing "-" to "+" in many instances. --Anatoli (обсудить/вклад) 01:06, 27 August 2012 (UTC)
Sorry, I think either you're misunderstanding me, or I'm misunderstanding you, or something. According to the rules above, this bot will change {{tø|cmn| to {{t|cmn|, because it recognizes that cmn means zh.wikt (so, not {{tø}}), but doesn't know how to tell whether a given zh.wikt entry exists (so, not {{t+}} or {{t-}}). I don't know what you're referring to when you write that "If contributors use cmn, it always adds {{tø}}". (What is "it"?) I also don't know what bots you're referring to when you write of their "converting 'zh' to 'cmn'", and I really don't know what you're referring to when you say that "in [your] observation, the bots incorrectly convert {{t}} to {{t-}}", since so far as I'm aware, there is currently no bot that ever converts {{t}} to any other template. —RuakhTALK 01:45, 27 August 2012 (UTC)
Sorry, if I sound confusing. I know little about what bot does what, so maybe I lumped things here that I shouldn't.
1) Thanks for clarifying the first item ({{tø|cmn| to {{t|cmn|). Happy if this is the case.
2) When editors use cmn code to add translations via the JavaScript, "it" adds {{tø}}. Not sure if Conrad Irwin's script has to do with it or that cmn is not considered a language but a dialect code. The same thing happens when a translation is added for a language/dialect, which doesn't have or is not supposed to have its own Wiktionary. Please take a look at this revision of gross profit. {{tø}} is always added automatically when "cmn" code is used instead of "zh". That's why "zh" is preferable when adding Mandarin translations.
* Chinese:
*: Mandarin: {{tø|cmn|毛利|tr=máolì|sc=Hans}}
3) Tbot changes 'zh' to 'cmn' in translations and changes {{t}} to {{t-}}. I'd like this to stop but I had trouble finding which bot is doing it. Sorry if it's unrelated. Obviously if the original translation was using "cmn" the bots won't change {{t}} to {{t-}} but leave it as {{tø}}. E.g. the Mandarin translation of life (first sense), which I changed to 'zh' on 13 Aug 2010 and it was changed to 'cmn' and {{t}} to {{t-}} (although the matching Chinese entry for 生命 already existed!) by this edit (Tbot). I'd like this to stop if possible. --Anatoli (обсудить/вклад) 02:11, 27 August 2012 (UTC)
* Chinese:
*: Mandarin: {{t|zh|生命|tr=shēngmìng|sc=Hani}}, {{t|zh|生活|tr=shēnghuó|sc=Hani}}

Has become (note zh->cmn change and a "-" symbol added)

* Chinese:
*: Mandarin: {{t-|cmn|生命|tr=shēngmìng|sc=Hani}}, {{t-|cmn|生活|tr=shēnghuó|sc=Hani}}

--Anatoli (обсудить/вклад) 02:23, 27 August 2012 (UTC)

Re: #2: Ah, O.K.; yes, that was in Conrad's script. I've fixed it now — see User:Conrad.Irwin/editor.js?diff=17694052 — but note that client-side caching means that not everyone will immediately get the updated JavaScript. (To get the updated JavaScript for yourself, visit http://en.wiktionary.org/w/index.php?title=User:Conrad.Irwin/editor.js&action=raw&ctype=text/javascript and perform a hard-refresh, and make sure you see cmn:{g:"",hw:1,p:0,sc:"Hans"} in the page; but obviously that will only fix your own future edits.)
Re: #3: Tbot hasn't run in almost two years, and will never run again. (Its owner, Robert Ullmann, passed away about a year and a half ago.) This bot completely supersedes it, and this bot does not make either of those changes: it doesn't change zh to cmn (though it may start doing that, in the future, if you and others are O.K. with it), and it doesn't change {{t|cmn|...}} to {{t-|cmn|...}} (at all, for now; in the future, once I figure out how to tell whether an entry exists on zh.wikt, it will convert {{t|cmn|...}} to either {{t+|cmn|...}} or {{t-|cmn|...}} as appropriate, but cmn will still be no better nor worse than zh).
RuakhTALK 02:33, 27 August 2012 (UTC)
Thanks very much, Ruakh. If all people use the new version of User:Conrad.Irwin/editor.js, then it's OK for translations to have {{t+|cmn|...}} or {{t-|cmn|...}}, in my opinion. --Anatoli (обсудить/вклад) 04:39, 27 August 2012 (UTC)

Bot to handle {{t}} and its ilk — Update

General updates, notes, and planned-changes-if-no-one-objects:

  • I started the bot running last night. Please let me know if you see any problems.
  • As some of you know, http://sr.wiktionary.org/wiki/fobur results in an HTTP 301 redirect to http://sr.wiktionary.org/wiki/фобур. I think this means that, for purposes of {{t}}/{{t-}}/{{t+}}, sr:fobur should be considered to exist. (Does anyone disagree?) I now think I understand how to determine when such HTTP 301 redirects will occur; I want to do some testing, but I expect that tonight or tomorrow night I'll modify the bot to implement this rule. The other affected Wiktionaries, ku.wikt and so on, will come later, without further notice here.
  • I haven't implemented Liliana's requested removal of xs= from translation-templates, but I'll probably do that tonight or tomorrow night.
  • At some point I plan, without further notice here, to detect the use of no or zh on translations-lines labeled "Bokmål" or "Mandarin", and change these to nb or cmn. I won't touch lines that are simply labeled "Norwegian" or "Chinese".

RuakhTALK 15:54, 28 August 2012 (UTC)
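
For illustration, the two planned line-level changes in the update above (dropping xs= and narrowing no/zh on lines labeled "Bokmål" or "Mandarin") could look roughly like this in Python. It is a sketch under assumptions, not the bot's implementation: the name tidy_line and the label-detection pattern are hypothetical, the sketch assumes one labeled translation line per call, and it strips xs= from any template on the line rather than checking that the template is a translation template.

```python
import re

# Line label -> (generic code, specific code), as described in the update.
LABEL_CODES = {"Bokmål": ("no", "nb"), "Mandarin": ("zh", "cmn")}

# "|xs=..." up to the next parameter or the end of the template call.
XS_PARAM = re.compile(r"\|\s*xs=[^|}]*")

# Opening of a {{t}}-family call plus its language code.
T_HEAD = re.compile(r"(\{\{t[-+ø]?\|)([a-z-]+)(?=\|)")

# Leading list markup ("*", "*:", "**") followed by a label and a colon.
LABEL = re.compile(r"^\*[:*]*\s*([^:]+):")


def tidy_line(line):
    """Strip xs= and narrow the language code on one translation line."""
    line = XS_PARAM.sub("", line)
    match = LABEL.match(line)
    if match is None:
        return line
    label = match.group(1).strip()
    if label not in LABEL_CODES:
        return line                      # e.g. plain "Norwegian" or "Chinese"
    generic, specific = LABEL_CODES[label]

    def narrow(call):
        code = specific if call.group(2) == generic else call.group(2)
        return call.group(1) + code

    return T_HEAD.sub(narrow, line)
```

Applied to the line "*: Mandarin: {{t|zh|生命|tr=shēngmìng|sc=Hani}}" this changes zh to cmn and nothing else, while a line labeled only "Chinese" or "Norwegian" keeps its code.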

About Tahitian (again)

Hi, I should let you know that I froze new entries and modifications in Tahitian at the end of this discussion (Okina or straight apostrophe for tahitian ?) because of problems with Tahitian transcription (French discussion about that). There are in fact at least two officially recognized writing systems for Tahitian:

  1. The "Tahitian Academy" transcription.
  2. The "Maohi Protestant Church" transcription.

Both are recognized by French Polynesia and its Education Ministry. Here is the proof, from the 2006 educational directives:

The third refers to this document, which shows the difference between the two graphic systems: Fiche outil N°2 : Les systèmes graphiques.

So, to respect neutrality, we decided on fr.wikt to treat both systems equally. For example, we will have ʻā (Tahitian Academy norms), ’ā (treated as a variant of the Tahitian Academy norms), and â (Maohi Protestant Church norms). I'm the only one who takes care of this language over there, so for now I create all three articles, and the last two are treated as variants of the first one, to make my job easier and to avoid scattering the information (sorry about the neutrality). I have created a template to flag and classify the two norms, because of course it is wrong to mix the two systems when writing in Tahitian. Are you OK with this, or do you want to try another way? V!v£ l@ Rosière /Whisper…/ 21:40, 27 August 2012 (UTC)

This could be done like we do for British and US English. When the spelling is different, we make either the US or British the main form and then refer from the other form. You can see this at anaemia.
All right, so I'll follow this approach, thanks. V!v£ l@ Rosière /Whisper…/ 01:14, 29 August 2012 (UTC)

Template:in the plural

I just noticed this template. It speaks to my personal preference to include plural-only senses like "messages"=groceries or "pontificals"=vestments in the singular lemma entries. However, it had been my impression that prevailing practice was to do the opposite — to hide the plural-only senses in the plural entries. If practice is now to use this template and include the plural-only defs in the singular lemma, I am thrilled — this will make it much simpler for users (who are generally intelligent enough to figure out that messages is an inflected form of message) to find all the senses of a term. But if practice continues to be to confine the plural-only senses to plural entries, this template shouldn't exist. So, which is it? Can I move # Groceries back out of [[messages]] and into [[message]], or should this template be systematically eliminated and the senses it tags systematically moved? - -sche (discuss) 20:00, 28 August 2012 (UTC)

I dislike adding plural-only meanings to singular forms, as it's basically saying "this word doesn't mean this". I take your point though. I don't actually remove them, I just tend not to add them. Mglovesfun (talk) 19:06, 4 September 2012 (UTC)
Input needed: This discussion needs further input in order to be successfully closed. Please take a look!
Oh, I just thought of something obvious: what if we used an {{&lit}}-like template, like: In the plural: see foobars. That would keep the info on the technically-most-correct page, [[foobars]], but make it obvious to anyone looking at [[foobar]] that there were more senses of the word than were listed in [[foobar]]. - -sche (discuss) 23:57, 4 September 2012 (UTC)
People probably don't look at plural entries much. Most of our plural entries don't contain anything interesting beside a standard message, so we should take care to either include additional information about the plural on the singular page, or provide an obvious link. —CodeCat 00:01, 5 September 2012 (UTC)
I like this second suggestion, because it gets people to the meaning regardless of whether they approach the word from the plural or singular end. The entry for the singular should inform the reader about special meanings for inflected forms - of course. But if I key in the plural word, then there should be information at that entry - the user won't always know that the unfamiliar word they're looking at is an inflected form of something else, particularly if it is an unfamiliar language. Furius (talk) 10:16, 13 September 2012 (UTC)
CodeCat: "People probably don't look at plural entries much." I'd agree that's true in general, but for the class of nouns that have a plural-only sense there is an excellent chance that they would go to the plural. For example, the terms boys and girls are somewhat likely to get users, both native speakers and others, who are looking for confirmation of the legitimacy and register of the senses in which they are used. I would expect a larger portion of FL users to lemmatize the regular plural and look at boy and girl.
The net benefit of having the material at both locations varies by the complexity of the L2 section for the singular. Not having a plural-only sense at the plural where the entry for the singular is complex (long, multiple PoSes, multiple etymologies) is simply bad. We might well lose some users. Adding multiple plural-only senses to such complex entries only serves to make them more complex. For such entries, I like the idea of simply referring the user to the plural entry. OTOH, simple singular entries can stand a little extra complexity and little harm, except the maintenance problem, comes from the duplication.
If we need to have a single consistent model in English for the benefit of both users and contributors, the universal application of the template suggested by -sche and the placement of senses at the plural appeals to me. DCDuring TALK 12:43, 14 September 2012 (UTC)

Proposed new guideline at the Commons

There's a proposal for a new Commons policy/guideline about replacing existing files. See the 'request for comments' and the proposed new policy/guideline.​—msh210 (talk) 19:46, 29 August 2012 (UTC)

Revising derived/related terms etc.

I noticed that there is a lot of confusion among editors, especially new ones, about what these things are supposed to mean. They are also not always as useful as they should be. There are often new editors that treat 'related terms' as 'semantically related' rather than 'morphologically related' which is how we use it. I think that does kind of prove a point: the term 'related terms' is too vague. I also often see it being used as an etymology, to link to the parts of a compound word. And today I came across doodshoofd, itself a compound, where the related terms linked (correctly) to words that were morphologically related, but they were all words that were themselves compounds derived from either dood or hoofd. Those terms had nothing at all to do with the term doodshoofd semantically and would probably not be very useful to someone looking at that entry. Someone wanting to find terms derived from dood or hoofd would (hopefully) look at those entries instead.

Regarding 'derived terms', there is also ambiguity about what 'derived' means. I remember from past discussions that we only consider synchronically derived terms, but that can often make such a list far more limited and far less useful. To name an example, see and zien are related to sight and zicht, but the historical relationship goes back to before Proto-Germanic times. Yet no one would doubt that the latter is related to the former. To me as a native Dutch speaker, zicht still kind of 'feels' derived from zien, because Dutch still has many other examples of the same derivational process, even though it is not productive anymore except for a few rare cases. So a Dutch speaker would still consider this derivation because the derivational relationship is still clear to them, even if the derivation itself happened millennia ago.

So I am thinking that some changes in the way we treat derived and related terms would be a good thing. I would like to propose replacing them with two new kinds of header (the exact name of which would need discussion). One for semantic relationships and the other for morphological ones (which may be synchronically derived or not). I'm proposing the first to remove the vagueness of what kind of 'related' we mean, but also because we currently have no header for this (unless 'coordinate terms' is it). The second proposal is to make it easier for editors to add information without having to get bogged down with questions of etymology and dating, which they probably don't know much about (if such information is even available for that language!). It would hopefully allow us to focus more on adding content and not nitpicking about fine details that most users and editors alike don't (need to) know about. —CodeCat 22:36, 29 August 2012 (UTC)

I agree we need to do something about this, because the mistake is too easy to make. Whether we need to change the headings I don't know. In my happy dream-world, we have all the links in the etymology section, such that an automatic process can reject an edit and say "this isn't related — look at its ety", but clearly that won't happen until the year 2999. Equinox 22:42, 29 August 2012 (UTC)
Re putting links to etymologically related words in the etymology section: See user:msh210/ELE.​—msh210 (talk) 15:39, 30 August 2012 (UTC)

"Ooh, I could crush a grape"

Stu Francis once began a song: "When life is good, like you know it should". [1] This is immediately wrong (sorry, Stu) because "it should be". But why? "Be good like you should" is probably acceptable, and that uses the same verb. Equinox 23:03, 29 August 2012 (UTC)

I thought it might be because the implicit verb form missing after should has to match the earlier-stated one exactly, but "He is doing as he should" works. So that's not it....​—msh210 (talk) 05:37, 30 August 2012 (UTC)
I think the "as" may nullify the requirement so that it is indeed a case of the verb form. Ex: "I went as I should (go.)" --BB12 (talk) 05:52, 30 August 2012 (UTC)
I think it has something to do with some sort of distinction between a sort of passive and active meanings of to be. The active meaning can be distinguished by its present tense of is being rather than just is. to be in the passive sense cannot be described as doing something, while the active sense can be. should can only be used with verbs that can be replaced by do (pretty much all verbs except for the passive to be). Because of this, the is in the first half of the sentence has a different meaning from the be in the second half, making it necessary to use be explicitly. For example, the imperative sentence Be good, like you know you should. is perfectly correct, while You are good, like you know you should. is wrong and needs the be. But that's just my theory. --WikiTiki89 (talk) 07:10, 30 August 2012 (UTC)
"Active" and "passive" are already grammatical terms, and already have a relationship to be, so let's not overload them. To use the terminology of the Cambridge Grammar of the English Language, what you're calling the "active" meaning is called the lexical use, whereas what you're calling the "passive" meaning is called the copula use. (There's also a progressive use, as in "I'm doing it", a passive use, as in "it's been done", a quasi-modal use, as in "I'm to do what now?", and a use-that-I-don't-remember-the-name-of, as in "I've been to France.") —RuakhTALK 13:54, 30 August 2012 (UTC)
I wasn't talking about mood or aspect or anything. For most verbs, the present tense distinction between, for example, eats and is eating has to do with the progressive aspect. But for the verb to be, is and is being can have a completely different distinction, depending on context. is being is still the progressive aspect, but is is no longer the whatever-the-opposite-of-progressive-is. But the distinction I am talking about has nothing to do with the progressive aspect because it can also apply in the whatever-the-opposite-of-progressive-is aspect. The reason I called this active is because there is an action, as opposed to the copulative sense which just describes things as they are. Take for example the following: (a) He is good. That is just the way he is. and (b) He is good all the time now that I've had that talk with him. In example a, is is just there for syntax. In example b, is is an action: What is he doing? He is being good. And I almost want to use the nonexistent form bes instead of is: What does he do? He bes good. (This might be related to powers that be: What do the powers do? They be.) --WikiTiki89 (talk) 14:20, 30 August 2012 (UTC)
Yes, I understood you just fine. I was just objecting to your choice of terms (because "active" and "passive" already have meanings that conflict with what you were trying to use them for), and offering you better ones (those used by CGEL). —RuakhTALK 18:23, 30 August 2012 (UTC)
But I disagree that the ones you offered me are what I was looking for. --WikiTiki89 (talk) 18:32, 30 August 2012 (UTC)
I don't know how long it's worth continuing this conversation here, but . . . you seem to have accepted the term "copula", since you yourself started using a variant of it: "as opposed to the copulative sense which just describes things as they are" (emphasis mine). So the only term that you can be rejecting is "lexical". But your reasons for disliking "lexical" are that you "[weren't] talking about mood or aspect or anything", and that "the distinction [you are] talking about has nothing to do with the progressive aspect". These are bad reasons, because they have absolutely nothing to do with what they purport to be replying to. They certainly don't justify the choice of "active" over "lexical". (I'm left wondering whether or not you actually read my comment before rejecting it.) —RuakhTALK 18:53, 30 August 2012 (UTC)
The word lexical doesn't really fit what I'm referring to. What I am talking about is not grammatical but semantic (although the distinction I guess is fairly gray). active seems to be the best word for it even though it does conflict with active voice. Either way this is a pointless argument. I was just sharing my own speculations that are not based on anything I have read anywhere (because I have never read anything anywhere about this). Anyway, regardless of what term you use, do you think this explains why be is necessary after should? --WikiTiki89 (talk) 19:07, 30 August 2012 (UTC)
Re: "What I am talking about is not grammatical but semantic": Right, and that's exactly what lexical means. (A "lexical" word is the opposite of a "grammatical" or "function" word.) Re: your speculation: the lexical-vs.-copula-be distinction was also my first thought when I saw Equinox's question, but I'm not sure whether it fully explains the difference in the verb-phrase ellipsis. —RuakhTALK 19:33, 30 August 2012 (UTC)

Minor changes to Wiktionary:Main Page

Discussion on changes to the Main Page --> Wiktionary talk:Main Page.
Maro 00:23, 30 August 2012 (UTC)

Referring to an entry.

We have a lot of different ways of linking to an entry, but it's not obvious to me how to link to an entry if I'm actually referring to the entry itself (rather than to the term defined there). One such case is at [[team#Etymology]], which ends with the sentence "More at tie, tow", meaning roughly "More at the entries for tie and tow."

A few ideas, just to get the discussion going:

  • We could just use regular links, as [[team]] does:
    More at tie, tow.
  • We could just use {{term}}:
    More at tie, tow.
  • We could use our mad English skillz:
    More at the entries for tie and tow.
  • In discussions, I often use an extra set of double-square brackets (in mimicry of how {{temp}} displays curly braces):
    More at [[tie]], [[tow]].
  • A number of dictionaries use small-caps:
    More at tie, tow.
  • We could use some sort of symbol:
    More at → tie, → tow.
    More at • tie •, • tow •.
    More s.v. tie, s.v. tow.
  • We could force underlining:
    More at tie, tow.
  • We could sidestep the whole issue:
    More elsewhere.
  • We could use a different color:
    More at tie, tow.

(Naturally, the above are not all mutually exclusive.)

I also wonder about linking to individual parts of entries:

See usage notes at when.
See usage notes at when.
See § "Usage notes" s.v. when.

though we already have problems with links other than to language-sections, so probably that's not worth thinking about yet.

Any thoughts?

RuakhTALK 18:11, 30 August 2012 (UTC)

I'm not really sure. It seems like kind of a minor issue honestly. I never quite understood the difference between {{term}} and {{l}} either, they both do pretty much the same thing except for the italic. I know one is meant to be used in running text and the other in lists, but is it really that obvious to all the editors here? The last thing we should do is add to the confusion by introducing a third type... —CodeCat 22:15, 30 August 2012 (UTC)
Actually, it occurred to me that {{l}} and {{term}} could be merged. The only really key difference is the Italicization of the Latin script. That would require an entirely new parameter, all the other parameters could happily be merged into a single template. Mglovesfun (talk) 22:20, 30 August 2012 (UTC)
If not merged outright, then at least merged in the way they are used and by the functionality they provide. {{term}} uses a lang= parameter while {{l}} uses the first unnamed parameter for the same thing, which is kind of annoying. —CodeCat 22:24, 30 August 2012 (UTC)
@CodeCat: I actually think that having more meaningful templates reduces confusion rather than adding to it. {{l}} ("link") was apparently intended to have little in the way of semantics, which makes its usage haphazard. (By contrast, {{onym}}'s name makes fairly clear what it's about.) Also, this template would probably be used relatively rarely. It doesn't happen all that often that one entry refers to another entry by name. —RuakhTALK 22:50, 30 August 2012 (UTC)
Personally, I use the {{term|...}} template. The italics seem a good way to distinguish mention from usage. (Whether it's guaranteed to render in italics in all situations, I dunno.) Equinox 22:41, 30 August 2012 (UTC)
{{term}} certainly distinguishes mention of a term from usage of a term, as you say, but it doesn't distinguish mention of an entry from mention of a term. —RuakhTALK 22:46, 30 August 2012 (UTC)
Personally, IIRC, I've used the method you call "our mad English skillz", precisely in order to avoid the question of whether to italicize, which I had no answer for. I see no problem continuing to do so; otoh, your "some sort of symbol" method sounds good, if it's to be a standard (and if the symbol chosen is not ugly). (By the way, IMO the question of whether to italicize is begged by use of s.v.: I'm not sure why you consider use of s.v. a solution.) Your "sidestep the whole issue" method has a certain appeal, too. (I kid.) As for your "extra set of double-square brackets" method, which I've stolen from you, I think strongly that we should not adopt that for use in entries: it only makes sense to those who edit MW wikis. (Not that I think you were suggesting it, but I worry others may run with it.)​—msh210 (talk) 04:08, 31 August 2012 (UTC)
I sometimes use italics and bold; see cikán. I welcome a better method. I like the idea of small caps. I dislike the idea of a symbol, "s.v." or a different colour of link (that last only because I don't think it stands out). "Using our words" is also an acceptable idea. - -sche (discuss) 05:38, 31 August 2012 (UTC)
So I'm going to poke my nose into this entry and note we should NOT (and I repeat NOT) use colour as a differentiating factor for something like this. It's incredibly inaccessible, and we can't be certain that the colour we choose for something like this, unless it's within a standard set of colours (that yellow is… difficult to read) is going to be legible for people on all screens. I could barely read it. Personally I favour something like the → symbol, as that's clearest, and clearest for those who are accessing the wiktionary with a screenreader. Anyway, I'll come back to this later this afternoon and see what else I've thought of on the subject. --Neskayagawonisgv? 18:51, 2 September 2012 (UTC)
Hm, your comment makes me wonder if, by using a template to display whatever format we choose, we could wrap the text in a span/class such that a screenreader would read "more at tie" aloud as "more at the entry tie" (as if we'd used our "mad English skillz" even if we don't choose that option). - -sche (discuss) 19:04, 2 September 2012 (UTC)
I like all three of the most recent ideas for user space:
  1. A fairly self-explanatory symbol like .
  2. A simple template for entry references to be rolled out to a variety of entries to test user response to various versions thereof, perhaps by redirecting it to different templates until we come to consensus. OTOH, if it is not simple (few calls, preferably none, to other templates), I'd rather we didn't bother at all. I dread any template that doesn't bypass our script and language system where possible.
  3. Embedding some accessibility/usability aids or links thereto in various of our templates.
OTOH, I intensely dislike the idea of merging more and more functionality into a small number of templates, as by merging {{term}} and {{l}}. This seems like a strategy for, 1., reducing experimentation by making template change a more costly proposition, 2., reducing contribution from casual users by complicating template use.
Embedding a11y/accessibility/usability aids in coding (possibly implementing things from WAI-ARIA for the screenreaders to have?) would be great. Ideally this would be done in conjunction with whatever other solution we choose, period. --Neskayagawonisgv? 21:48, 2 September 2012 (UTC)
@DCDuring: We should never link to (say) a Hebrew entry without using {{Hebr}} and #Hebrew. Do you disagree? If so — why? —RuakhTALK 23:38, 3 September 2012 (UTC)
I have no objection to anything that allows us to go forward on the all-languages objective, but I hope that we don't saddle English with overhead like calls for templates, which could instead be handled by making "en" and "Latn" defaults where it could be made possible. DCDuring TALK 01:14, 4 September 2012 (UTC)
Sorry, but I really have no idea what you're saying. I feel like you might be mixing up two very different issues. —RuakhTALK 11:57, 4 September 2012 (UTC)
  • I would like small caps if we were just an English dictionary or even just a Latin-alphabet dictionary, but we're not. Alphabets with no concept of capital letters can't use small caps, and they look really dumb in non-Latin alphabets that do have capital letters, like Cyrillic, Greek, and Armenian. —Angr 22:23, 2 September 2012 (UTC)
    • We do have the option of making certain styling elements appear differently depending on script, though. We already do this with {{term}}, it displays some scripts in italic but not others. —CodeCat 22:30, 2 September 2012 (UTC)
I've edited my earlier comment. I'm willing to go along with a symbol (→? ⇒?), as that seems to be the most favoured option. - -sche (discuss) 20:46, 5 September 2012 (UTC)
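
For illustration only, here is a minimal Python sketch of the idea raised above: an entry reference shown with a symbol for sighted readers, with the words "the entry" supplied for screen readers. The function name `entry_ref`, the class names `entry-ref` and `sr-only`, and the `/wiki/` link prefix are all assumptions made for this sketch, not existing Wiktionary templates or styles. The `sr-only` class would need a stylesheet rule that hides it visually, which is not shown here.

```python
from html import escape
from urllib.parse import quote


def entry_ref(title: str, symbol: str = "→") -> str:
    """Render a hypothetical 'entry reference' as an HTML fragment.

    The symbol is marked aria-hidden so screen readers skip it, and the
    words "the entry" sit in a span that a stylesheet (assumed, not shown)
    would hide visually while leaving it readable by screen readers.
    """
    shown = escape(title)   # text as displayed on the page
    target = quote(title)   # percent-encoded for the link
    return (
        '<span class="entry-ref">'
        f'<span aria-hidden="true">{escape(symbol)} </span>'
        '<span class="sr-only">the entry </span>'
        f'<a href="/wiki/{target}">{shown}</a>'
        '</span>'
    )


if __name__ == "__main__":
    # "More at → tie, → tow." with the spoken form "more at the entry tie"
    print("More at " + ", ".join(entry_ref(t) for t in ("tie", "tow")) + ".")
```

Keeping the spoken wording and the visual symbol in one place means the displayed format could be changed later without touching the entries that use it.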

New part-of-speech headers?

I've been adding more entries in Zulu again but got a bit stuck. Adjectives do exist in Zulu, but they are a closed class of only a handful of words. What we call adjectives usually translate into a part-of-speech called a 'relative' in Zulu. Relatives differ from adjectives because they take a different type of agreement and generally don't behave as adjectives do, and they (along with a few other uncommon parts-of-speech) are part of standard Zulu grammar. That means of course that we would want a separate header for them next to Adjective, Determiner and so on. Is there a special procedure or approval for this (so that it doesn't get marked as an error by bots)? —CodeCat 22:21, 30 August 2012 (UTC)

Can you give an example? I've never seen this in a Bantu language. Do you mean the words that act rather like present active participles of various verbs? --Μετάknowledgediscuss/deeds 00:03, 31 August 2012 (UTC)
I'm not sure if they exist in all Bantu languages. I found at least one paper that said they occurred only in the Southern Bantu languages. This page shows the differences. Adjectives follow the 'adjective concord' while relatives follow the 'relative concord'. That paper covers Tswana rather than Zulu, and from what I can see, in Tswana the relative concord is expressed by means of some kind of separate particle, but it's the same idea. —CodeCat 00:10, 31 August 2012 (UTC)
Clearly a grammarian needs to look at this. All I see are adjectives acting strangely (but acting like adjectives all the same). The relatives seem to carry adjectival meaning and agree with the nouns they modify. That said, I'm in no place to pass judgment on it. --Μετάknowledgediscuss/deeds 00:25, 31 August 2012 (UTC)
Clement Doke's 'Textbook of Zulu Grammar' also covers it, and mentions a third type called 'enumeratives' too, and that is quite clearly a grammatical treatise. In fact most other works on Zulu seem to reference Doke's book one way or another, so it is considered a standard work within the field. According to this page, which is part of a language course that uses Doke's book as part of its teaching material, the adjective concord is historically formed from a- + noun prefix, while the relative concord is formed from a- + verb subject marker. So they are definitely distinct parts of speech in that they take a different form of inflection related to verbs rather than nouns (yet, they are not verbs at all). —CodeCat 00:31, 31 August 2012 (UTC)
I've never studied a Southern Bantu language, so bear with me. Here's my comparison to Latin (and I apologise in advance for the Europeanisation): Zulu adjectives agree with Zulu nouns based on prefixes that show class, similarly to how Latin nouns and adjectives agree based on suffixes that show gender. Some Latin nouns end in -us (masc. 2nd decl. nom. sing.), but the adjective that agrees with them can either belong to the same system and end in -us (like servus industrius) or belong to a different system and end, for example, in -ens and yet still agree (as in servus intelligens). Just because they don't agree by means of the same system doesn't mean that they suddenly constitute a new POS — just a separate division of adjectives. I will only believe that it is truly a new POS if it really does something other than be a single word that directly modifies a noun like traditional Zulu adjectives do. --Μετάknowledgediscuss/deeds 05:25, 31 August 2012 (UTC)
I did make that comparison myself. But why should we reinvent Zulu grammar when everyone else already uses different terms? It would be like suddenly renumbering the Latin declensions just because we felt like it, or calling the perfect the 'completed past' instead or something like that. It would only serve to confuse our users because we're not using established terminology. —CodeCat 10:49, 31 August 2012 (UTC)
I think each language should have its own rules for headings, conforming as much as possible but differing where need be. --WikiTiki89 (talk) 11:00, 31 August 2012 (UTC)
I agree. —RuakhTALK 12:16, 31 August 2012 (UTC)
It seems to me that our decisions about grammatical categories for a language should be limited to those used by three communities: native speakers, those learning it as a second language (ie, the texts they use), and academic students of the language. I'm not sure whether native speakers of Zulu have ever felt the need for such categories outside of school but some may have learned some categories in school. The opinions of a student of languages using English-, Latin-, or universal-grammar-based categories shouldn't count for much until it is shown that no such category system is accepted in any of the three communities. If there are conflicting systems, I think we would be constrained to select one of them, probably a recent one showing signs of widening acceptance. DCDuring TALK 12:57, 31 August 2012 (UTC)
I have no idea how Zulu is taught in schools to native speakers. An adjective is isiphawulo, and according to the dictionary I am using a relative is isimelana (see [2] [3]). The Zulu Wiktionary calls everything isiphawulo, but it also lacks definitions and other grammatical information (including whether the word uses adjective or relative concords) so it's best to say it's incomplete (compare zu:-bomvu, a relative, and zu:-dala, an adjective). The Afrikaans Wiktionary is much more complete, and distinguishes the two (af:-bomvu, af:-dala). It calls relatives betreklike naamwoord (relative nominal) while adjectives are byvoeglike naamwoord (adjective nominal) and nouns are selfstandige naamwoord (independent nominal) (compare Dutch betrekkelijk, bijvoeglijk naamwoord, zelfstandig naamwoord). —CodeCat 13:50, 31 August 2012 (UTC)
I thought about what you found, but it still leaves us in a state of ignorance about the expectations of users other than ourselves and folks very much like us. DCDuring TALK 01:52, 4 September 2012 (UTC)
IsiXhosa (another Nguni language) has the same distinction. There is a handful of real adjectives that have a more complicated concord system resembling that of the nouns, but most are relatives, that are treated similarly to verb forms in terms of their concords. Jcwf (talk) 01:23, 4 September 2012 (UTC)
The concords for xh at af.wikti for af:-dala are actually wrong: it is omdala, not omudala (that's Zulu) Jcwf (talk) 01:27, 4 September 2012 (UTC)
The difference between the adj and rel concords is that in the so-called weak classes (1,3,4,6,9 and 10) that have a nasal part in the noun concords (um-,um-,imi-,ama-, in-, i(z)in-) this element is retained in the true adjectives. In the relatives this nasal element is missing just as it is missing in the verbal concords for the relative tenses.
This also means that the relatives are simpler to turn into a predicate:
umfazi olusizi - a sad woman
umfazi ulusizi - the woman is sad
An adjective would require the predicative concord -ng- like a noun would:
umfazi omdala - an old woman
umfazi ungomdala - the woman is old
In other words the two groups follow a different grammar; see
Jcwf (talk) 02:48, 4 September 2012 (UTC)

New idea for censored words

There's renewed discussion of c**ksucker on WT:RFD. As a replacement, or an alternative, I created Appendix:English censored words, a list of censored words by first letter and last letter or length, perfect for searching on any of this type of word, be it attestable or no. It's just a quick start right now, but I can see us redirecting c**ksucker and friends to it. Opinions?--Prosfilaes (talk) 00:47, 31 August 2012 (UTC)

The main problem I see is that currently Appendix:English censored words seems really messy and hard to maintain. --WikiTiki89 (talk) 11:02, 31 August 2012 (UTC)
What's the reason to use the system of last letter and also number of letters, so that every word appears twice in quick succession? Mglovesfun (talk) 15:41, 31 August 2012 (UTC)
So if you had f—t or f-----, you could easily figure out what the word was. I suppose if people are concerned about maintainability, it could be dropped to a simple alphabetical list of words and be nearly as useful.--Prosfilaes (talk) 22:34, 31 August 2012 (UTC)
Maybe it should be made into one of those tables that's sortable by every column? (Do those use stable sorting, i.e., the currently displayed ordering as a fallback ordering? If so, then someone could sort by word-length and then by first letter, or vice versa, in order to quickly find all words of the desired first letter and length.) —RuakhTALK 00:56, 1 September 2012 (UTC)
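
As a rough illustration of the stable-sorting point above, here is a small Python sketch. Python's built-in `sorted` is guaranteed to be stable, so sorting by first letter and then by length leaves each length group ordered by first letter. The word list is a mild placeholder and the helper `lookup` is hypothetical; whether the wiki's sortable tables behave the same way is not something this sketch establishes.

```python
# Placeholder word list; the real appendix would hold the censored words.
words = ["damn", "bloody", "drat", "bugger", "blast", "darn"]

# First pass: order by first letter. Ties keep their original order.
by_letter = sorted(words, key=lambda w: w[0])

# Second pass: order by length. Because the sort is stable, words of the
# same length stay in the first-letter order produced by the first pass.
by_length_then_letter = sorted(by_letter, key=len)


def lookup(candidates, first, last=None, length=None):
    """Return the words matching a censored form.

    A form like d--n with a known last letter is lookup(words, "d", last="n");
    a form like b----- with only a known length is lookup(words, "b", length=6).
    """
    return [
        w for w in candidates
        if w[0] == first
        and (last is None or w[-1] == last)
        and (length is None or len(w) == length)
    ]


if __name__ == "__main__":
    print(by_length_then_letter)
    print(lookup(words, "d", last="n"))
    print(lookup(words, "b", length=6))
```

With a lookup like this, a single alphabetical list would be enough to answer both kinds of query, so each word would not need to appear twice.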
Last modified on 13 April 2014, at 14:50