Wiktionary:Beer parlour/2012/July

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Link explanation

I find the pages describing links to be inadequate. The Tutorial has three pages and Wiktionary:Links has some information, but neither tells you how to link to Spanish Wiktionary or French Wikipedia, and there doesn't seem to be an easy reference page. I have created a Wiktionary:LDL/Link_Guide. Although it still lacks some information ("Category" sometimes requires a colon before it), it is much more thorough. I don't want to disturb the flow of the Tutorial, but is there an appropriate place to put this where people can easily find it? --BB12 (talk) 06:20, 1 July 2012 (UTC)

I like it. You could add a section for linking to other namespaces. — Ungoliant (Falai) 16:21, 1 July 2012 (UTC)
Merge it into [[Help:Wikitext quick reference#Links]]? Or create a new subpage [[Help:Wikitext quick reference/Links]] (and make [[Help:Wikitext quick reference#Links]] a link to the subpage)?​—msh210 (talk) 17:14, 2 July 2012 (UTC)
The page doesn't make any mention of {{l}}, {{term}} and such, which are widely used on Wiktionary and are preferred in many cases to bare links. —CodeCat 17:17, 2 July 2012 (UTC)
Nice additions, thank you! I've seen those on occasion but have used them only once or twice. I'll add those. --BB12 (talk) 07:31, 3 July 2012 (UTC)

Unblocking oneself

Is it legal? Special:Block/Ivan Štambuk. Mglovesfun (talk) 12:53, 1 July 2012 (UTC)

  • Blocking a sysop stops him from editing. Nothing else. You would need to get a -crat to desysop him for an equivalent time in order to prevent this behaviour. SemperBlotto (talk) 13:26, 1 July 2012 (UTC)
If he continues with this unacceptable behavior, please whine at Meta. -- Liliana 13:40, 1 July 2012 (UTC)
  • Well the justification for the block was IMHO inane (I was doing no such thing) so I lifted it. You should try to discuss the matter first, and not simply block someone you disagree with, or who is involved in a discussion you're not familiar with. --Ivan Štambuk (talk) 14:52, 1 July 2012 (UTC)
Or he was familiar with the discussion, but saw that you repeatedly dismissed everything anyone said without really addressing it. If you don't like the way the rules are interpreted, bring it up here- don't just start edit warring against something that was the product of discussion and consensus using the appropriate forum for such things.
What's the point of having RFD, RFV, etc, if one person is going to unilaterally reverse them? Should we amend CFI to add "unless Ivan Štambuk thinks it's a bad decision" to all the parts about removing content? Chuck Entz (talk) 16:56, 1 July 2012 (UTC)
Perhaps you too should have investigated the matter more thoroughly - the entry was restored only after providing sufficient (3) citations on the respective Citations page. Then one of the citations was removed because of the 404 and the entry got re-deleted due to again lacking citations. Then the deleted citation was resurrected but this time pointing to the Internet Archive mirror, and I restored the entry. It should be noted that all of the actions heretofore (prior to Gloves chiming in) were per policy, and don't constitute edit warring of any kind. And then Gloves deleted it twice inexplicably, blocking me in the process for apparently "inerting gibberish".
Discussions at RFD, RFV etc. are not binding laws, and their conclusions are not permanent and immutable. Entry that got deleted due to failed attestation can be redisputed any time in the future by anyone. --Ivan Štambuk (talk) 17:25, 1 July 2012 (UTC)
  • It was a bad block and if Ivan hadn't unblocked himself, I would have. Since the final result is the same (Ivan being unblocked) I say no harm, no foul in this case. —Angr 18:33, 1 July 2012 (UTC)
I agree, it was a bad block. MG should know that, having been on the receiving end of such a block not that long ago. That said, I can see why MG felt compelled to do it. This was a case of overreaction on both sides.
What was the compelling urgency that required getting into an edit war with other admins? Was the lack of that entry so serious a problem that there wasn't time to have a discussion here? Or was it just the sneaking suspicion that the consensus wouldn't go his way?
I did look through the logs and the discussions: the issue is that Ivan disagrees as to what "durably cited" means. The citations he's talking about aren't durably archived according to the standards that have been in use in rfv for quite some time. There are plenty of terms that have failed rfv and been deleted because they're cited using websites, in spite of those websites being archived at Internet archive. That's because the Internet Archives will remove material for any reason, as long as it's requested by the current website owner. This may be an invalid reading of the CFI rules, but it's the accepted practice and consensus of those who participate in rfv.
The next time he reverts someone who changes the language headers from Serbo-Croatian to Croatian or Serbian, he could rightly be questioned as to why it's ok for him to go against consensus, but not them. It's hard enough enforcing rules without the very admins who are charged with enforcing them getting into ugly little power struggles like this. Chuck Entz (talk) 19:49, 1 July 2012 (UTC)
Sorry, but that's not right: the issue isn't that Ivan disagrees about what "durably archived" means, it's that he disagrees with having any "durably archived" requirement, and he refuses to abide by such a requirement on the grounds that it's "meaningless". (Note that even if we accepted as durably archived, the term would still have only two durably archived cites.) I think the block was quite appropriate. —RuakhTALK 20:10, 1 July 2012 (UTC)
Don't really see what else I can do, just say "go ahead, flout the rules". It's one of those occasions when blocking is the least bad of all the options (undesirable, but not as undesirable as not blocking). Mglovesfun (talk) 20:21, 1 July 2012 (UTC)
If there's no discussion forum for administrators, then a good alternative is to take a deep breath (and maybe wait 24 hours), post a note on the administrator's user page, and then (maybe after 24 more hours) post here. Administrators blocking administrators does not seem like an appropriate step to take. Administrators have their status because they are trusted to do certain things with a higher level of responsibility and accountability. Bringing the issue to a forum rather than blocking recognizes that responsibility and accountability. Wiktionary won't come crashing down because a word entry isn't rectified for a couple of days. --BB12 (talk) 22:03, 1 July 2012 (UTC)
I agree. Ivan should not have taken the actions he did, and he should not have been blocked as he was. There was no urgent need that demanded immediate action either way: the status of a single term is not something to drop all pretense at process and take extreme measures over. Such issues routinely sit for weeks and months in rfv or BP without the world falling apart.
If Ivan were to launch on a crusade to resurrect every entry that ever failed rfv- Right Now- and there was the prospect of major disruption if he wasn't stopped, blocking might be justifiable to get his attention, but probably not a good idea. Blocking someone who can just unblock himself in a matter of seconds is not only pointless, it also escalates tensions and makes it harder for cooler heads to prevail. I wonder what other options could be developed to deal with this sort of dispute. Chuck Entz (talk) 23:22, 1 July 2012 (UTC)
You're probably right on the details, but that just strengthens my point: Ivan shouldn't have taken the actions he did. One can't make one's own set of rules because one feels one is right and everyone else is wrong. That way leads to chaos and conflict, and undermines the community that's needed for a wiki to exist. Still, blocking is worse than pointless when used against an admin- it just raises the emotional level when reason is most needed, and takes attention away from how to deal with the underlying conflict. Chuck Entz (talk) 01:54, 2 July 2012 (UTC)

Unblocking oneself is frowned upon in the community. In my mind, the best option for Ivan was to let another administrator unblock him instead of unblocking himself. That is always the right course of action if another administrator has blocked you, regardless of what the block reason was. Razorflame 01:28, 2 July 2012 (UTC)

I would maaaybe agree with Razorflame, if only for the fact that I also think it seems like a bad block. On a tangentially related note, I think our CFI regarding cites has been shown to be a bit bullshitty in this case IMO...Sure Equinox is right that we shouldn't become Urban Dictionary, but I dunno...the policy just seems like a bit of a bitch sometimes to be honest... 50 Xylophone Players talk 01:37, 2 July 2012 (UTC)
If this many admins think it was a bad block, then certainly someone would have unblocked him very soon. I don't know what is "typical" in these atypical scenarios, but certainly this would be a time if any when boldness should not be esteemed. DAVilla 04:43, 22 August 2012 (UTC)
I’ve seen close to a hundred cases (or more) where one admin has blocked another admin, but in every case the one blocked immediately unblocked himself (as soon as he discovered that he’d been blocked). However, I think I remember a time or two where an admin has blocked himself, not as a test, but as castigation for some error. I thought it was weird. But unblocking oneself is the norm in the community. —Stephen (Talk) 02:05, 2 July 2012 (UTC)
It shouldn't be. In almost every case I've seen, both on this project and other projects, this is heavily frowned upon, even if the block reasoning was ludicrous. Razorflame 02:11, 2 July 2012 (UTC)
That's the sort of substantive issues that could have been discussed. Instead we got what amounts to "I don't like the rules of this game. I'm not going to follow them and you can't maake mee." "Oh yeah? I'm going to shoot you with my toy gun. Bang! Bang! I shot you. Hey! You're supposed to play dead! That's no fair!", etc., etc., ad nauseam Chuck Entz (talk) 02:13, 2 July 2012 (UTC)
I have not looked into the circumstances of this block at all, but I approximately agree with Razorflame: one should pretty much never be unblocking himself, unless he was the one who blocked himself, or unless he's unblocking himself and not using the unblock (i.e. not editing) except to discuss the block, or similar. (There are other exceptions, too, such as when WF goes on an admin-blocking spree.)​—msh210 (talk) 17:00, 2 July 2012 (UTC)

CFI for fictional universes

In an RFV, there is disagreement about how to handle Jabba the Hutt. Regardless of what the CFI says, there is disagreement about what it should say, so perhaps a review of the policy is in order. A discussion was held above, but there was no conclusion. As Ruakh points out, there is Wiktionary:Votes/pl-2010-05/Names_of_specific_entities, but regardless, what is really needed to include all words in all languages?

Wiktionary:Criteria_for_inclusion#Fictional_universes and Wiktionary:Criteria_for_inclusion/Fictional_universes provide the guidelines currently. They allow the following:

  • [After finding a glowing blade,] Brian being Brian, his first thought was of a lightsaber.
  • Irabu had hired Nomura, a man with whom he obviously had a great deal in common, and, who, as we have seen, was rapidly becoming the Darth Vader of Japanese baseball.
  • Steve and I explained the new program to our children, who looked at us as if we had just announced that we were from the planet Vulcan.

I think what is needed is to come up with a bunch of examples and then figure out which sorts we want to include and which we want to exclude. Here are some:

  • "There is no reason that the people who ask that sort of question have a more elevated moral standing than us," he [Kissinger] intones in his guttural, Jabba the Hutt voice. (From the Jabba the Hutt entry)
  • My sister is very scientific, a real Spock.
  • My sister is Spock.
  • My sister is like Spock.
  • My father is/thinks he's James T. Kirk.
  • My father always struts around the dinner table like James T. Kirk.
  • My father always struts around the dinner table like a James T. Kirk.
  • He's no taller than a hobbit.
  • He's no taller than a Tolkien Gimli.
  • He's so short, he's like a hobbit.
  • I feel like I'm driving my own KITT. (w:Knight_Rider_(1982_TV_series))
  • This roller coaster is like cruising at warp speed 9.
  • This roller coaster is like cruising at warp speed 9 on the Enterprise.
  • This roller coaster is a warp speed 9 machine.

Disclaimers: I'm a fairly short person and have been compared to hobbits. Also, I'm not suggesting that "warp speed 9" be added, only that it be taken into consideration for what sort of criteria are wanted :) --BB12 (talk) 05:50, 2 July 2012 (UTC)

The examples given all seem like clear metaphors or similes, which I'd be wary of including. "The moon was a ghostly galleon tossed upon cloudy seas" doesn't really cite a definition of galleon meaning "moon", but the hypothetical "The crescent galleon was waxing in the sky" would. Similarly, I'd consider something like "My short friend came to visit recently. The hobbit stayed for a week." a better citation of hobbit than any of those given above. Smurrayinchester (talk) 17:17, 4 July 2012 (UTC)
I think they should all be valid. They are used outside of the universes themselves, and people are likely to encounter them and want to know what they mean. What is the reasoning for not including them? --BB12 (talk) 18:55, 4 July 2012 (UTC)

Firstly, it's not even clear that the CFI do allow the three examples mentioned at the top, because that depends on the interpretation of “in an attributive sense,” and acceptance of examples which contradict the common-sense interpretation. This should be clarified.

It's also pretty clear to me that the CFI specifically disallow James T. Kirk, James Tiberius Kirk, James Kirk, and probably Jabba the Hutt, because they are entries for individual persons “whose page title includes both a given name or diminutive and a family name or patronymic,” essentially sum-of-parts names of specific people rather than terms the way we consider them. And if we allow these do we also accept Lucille Ball as an English word because we can cite the “attributive” use “funny as Lucille Ball”?

And warp speed 10 or what have you is a sum-of-parts construction employing warp factor (which has a bad definition). Michael Z. 2012-07-04 23:42 z

Right, so the question is: what do we want to be disallowed? I think all of these are useful as words and should be listed. If "X as Lucille Ball" comes up three times in durably archived sources, I think she should be added too. What are the reasons for not including any of them? --BB12 (talk) 05:35, 5 July 2012 (UTC)
I think we want all words in all languages. However, we want to guard a little bit against terms arising in a fictional Universe that aren't recognized as words outside of it. The reason we have Captain Kirk is the same reason we have the name of non-fictional person Benedict Arnold, because the name is used (or at some point has been used) in everyday speech as a shorthand for a set of characteristics. On the other hand, Triskelion, a planet that is visited in one Star Trek episode and referenced in a handful of the novels, lacks such currency. bd2412 T 18:42, 30 July 2012 (UTC)


Seems we have a case of a good faith editor making almost nothing but bad entries. This goes way beyond Luciferwildcat and Wonderfool. Indef block seems inevitable, but I know nobody likes blocking editors who act in good faith. But... isn't it better to do it and to hate it than to not do it at all? Mglovesfun (talk) 11:39, 3 July 2012 (UTC)

Maybe we could try talking to him before blocking him for a change. I don't see anything on his talk page about his tendency to add unidiomatic SOP entries. Looking through his contribs, all of his single-word creations seem to be OK, it's just the multi-word entries that are bad. —Angr 11:51, 3 July 2012 (UTC)
I agree, we should talk to him about this and tell him what (exactly!) he's doing wrong, and tell him that we're considering blocking him because he keeps making the same mistakes. —CodeCat 11:53, 3 July 2012 (UTC)
Well that has been done, see User talk:Sae1962. I wouldn't be posting here if it hadn't. Mglovesfun (talk) 14:00, 3 July 2012 (UTC)
Sorry, I'm not seeing anything specific enough there. I see very vague and unhelpful comments of yours (like at User talk:Sae1962#Redirections and User talk:Sae1962#Breaking things) there, and I see in his responses a willingness to get better, but what I don't see is any constructive comments explaining, for example, that SOP applies also to entries in other languages than English and that we don't create entries for non-idiomatic expressions in other languages just because they're translations of English idioms. —Angr 14:21, 3 July 2012 (UTC)
I think all the welcome messages should include language that instructs new editors to post to the Tea Room for feedback each time they try something new. Also, Sae1962 could have been asked to check in at the Tea Room on a regular basis multiple times along the way.
At the point of "breaking things" on Sae1962's user page, they seem to have been doing things in good faith and in a reasonable manner. Sae1962 probably saw the initialism template used somewhere and copied it. They weren't able to figure out how to get SNAP as a separate entry. In technical fields, underlining and capitalizing the first letter of an initialism in the full form is common. Not sure about Arabic and Bosnian, but in the Bosnian case, at least, Sae1962 was using a dictionary that is not durably archived--again, something that was done in a reasonable manner.
If Sae1962 had checked in at the Tea Room on a regular basis, multiple cases of these errors could have been avoided. --BB12 (talk) 17:24, 3 July 2012 (UTC)
Just to clarify, I'm a fluent but not a native speaker of German. There's nothing grammatically wrong with ungeeignete Maßnahme and ungeeignete Person; there's no reason to think his command of German is not at a native level. The problem is just that they aren't idioms. They're pure SOP. He created them because they're translations of "square peg in a round hole" and he apparently thought it would be OK to have an entry for the non-idiom German translation of an English idiom. Going through his contribs, I'm pretty sure these aren't the only cases. —Angr 21:10, 3 July 2012 (UTC)
Quoting above "If Sae1962 had checked in at the Tea Room on a regular basis, multiple cases of these errors could have been avoided." I don't agree, he/she hasn't demonstrated ability to understand others or to put their advice into practice, the opposite if anything. Talking to him/her hasn't worked. Mglovesfun (talk) 20:59, 3 July 2012 (UTC)
Specifically which advice did they not put into practice and how have they acted in the opposite manner of what was requested? (Regardless in this case, I still think that the welcome messages should be changed and newbies should be told on a regular basis to report to the Tea Room.) --BB12 (talk) 22:52, 3 July 2012 (UTC)
I've left a comment at User talk:Sae1962#ungeeignete_Person. I don't think there's a need to raise the specter of a block unless Sae keeps creating such entries. - -sche (discuss) 21:21, 3 July 2012 (UTC)
The handling of this proves Ruakh's comment (in an old vote) that en.Wikt is a cantankerous wiki, lol. - -sche (discuss) 04:04, 4 July 2012 (UTC)
@-sche it's been a while already! All I want to say is this is a dictionary project, not a social networking website. We have to act in the best interests of the dictionary, even when doing so is unpleasant. Mglovesfun (talk) 11:20, 5 July 2012 (UTC)
Finished cleaning up after Sae's latest mess; fortunately we caught it before Sae generated hundreds of "possesive" forms to mop up. ~ Röbin Liönheart (talk) 13:36, 5 July 2012 (UTC)

Vote on unblocking or continuing to block Luciferwildcat

As a result of the discussion above (WT:BP#Indefinite_or_long-term_blocking_of_registered_well-meaning_editors), I have created this vote: Wiktionary:Votes/2012-07/Blocking of Luciferwildcat. - -sche (discuss) 23:41, 3 July 2012 (UTC)

Seems like a good idea to me. Mglovesfun (talk) 00:03, 4 July 2012 (UTC)

Monthly subpages for the current (BP, GP...) discussion page

I think this has been discussed before but I don't remember when. I would like to propose that we use a dedicated subpage for each month on discussion pages, and use it also for the current month instead of just for archives. This has several important advantages:

  • There is no longer a need to archive discussions, as they archive themselves, keeping the discussion pages short without much additional work.
  • Without a need to archive, discussions can continue on the pages of previous months.
  • It makes watchlists more fine-grained as you can watch a particular month and see when an old discussion has been restarted.
  • As archiving is no longer necessary, links to specific sections of the page don't get broken upon being archived. Under the current scheme, the link to this discussion, WT:BP#Monthly subpages for the current (BP, GP...) discussion page, will no longer work if the discussion is archived.

Under this new scheme Wiktionary:Beer parlour could become a redirect to the current month. At the start of a new month, the redirect would be changed and a new subpage created. Anyone watching Wiktionary:Beer parlour would be notified of this change so they can watch the new month's page. Of course there are other solutions (such as a soft redirect) but this is one possibility. —CodeCat 14:37, 4 July 2012 (UTC)

The problem with this solution is that generally a previous month has many discussions of which one or two are still live and we don't (I think) want those disappearing off of the main BP page. I don't think people want to keep watching all the old pages when they can focus on just one BP page. While there is a workaround (reverse-archiving, i.e. moving still-current discussions to the current page), I think a better solution would be monthly subpages that are transcluded here, like we have at [[WT:Votes]]. That'd keep everything the way it is now except (a) archiving will be much easier, (b) #links work (as described above), and (c) editing a section will be more confusing (until people get used to it) (since you click "edit" on the BP page and, on saving, wind up on the subpage).​—msh210 (talk) 16:22, 4 July 2012 (UTC)
I think that would work too, yes, as it would be the best of both worlds to a degree. —CodeCat 16:26, 4 July 2012 (UTC)
Both ideas seem fine to me. And any of them is better than the current solution. We should try one of them. --MaEr (talk) 16:48, 4 July 2012 (UTC)
Ok, if nobody has any serious objections I think we could try this out for one discussion room, starting August 2012. Preferably not the BP, so maybe GP? —CodeCat 16:55, 19 July 2012 (UTC)
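The two schemes proposed in this thread (a redirect to the current month, or monthly subpages transcluded into the main page) both depend on a predictable subpage title. The following is only a minimal sketch of that naming step, assuming the "Base page/Year/Month" pattern used by this archive's own title; the function names are hypothetical and are not part of any existing Wiktionary script.

```javascript
// Hypothetical helpers, for illustration only.
// Assumption: monthly subpages are named "<base page>/<year>/<English month name>",
// matching the title of this archive page.
const MONTH_NAMES = [
  'January', 'February', 'March', 'April', 'May', 'June',
  'July', 'August', 'September', 'October', 'November', 'December'
];

// Build the subpage title for the month containing `date` (UTC).
function monthlySubpageTitle(basePage, date) {
  return basePage + '/' + date.getUTCFullYear() + '/' + MONTH_NAMES[date.getUTCMonth()];
}

// Wikitext for the first scheme: the main page redirects to the current month.
function redirectWikitext(basePage, date) {
  return '#REDIRECT [[' + monthlySubpageTitle(basePage, date) + ']]';
}

// Wikitext for the second scheme: the main page transcludes the monthly subpage.
function transclusionWikitext(basePage, date) {
  return '{{' + monthlySubpageTitle(basePage, date) + '}}';
}
```

For example, calling `monthlySubpageTitle('Wiktionary:Beer parlour', new Date(Date.UTC(2012, 6, 4)))` gives the title of this page. Under either scheme, something (a person or a bot) would still have to update the main page at the start of each month.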

Mentoring program for newbies

Above, Kaldari says, 'Has anyone here ever wondered why there is only a "small number of people available"? Perhaps if biting the newbie wasn't a celebrated pastime here, Wiktionary would actually have more editors willing to help maintain the project.'

Administrators complain about people making numerous mistakes and newbies complain about the nasty, unyielding attitude of established editors (and quit). How about a mentoring program where newbies have someone helping them and looking over their shoulder to offer constructive advice? --BB12 (talk) 00:22, 6 July 2012 (UTC)

I would definitely support that and I would like to offer my help too. —CodeCat 00:33, 6 July 2012 (UTC)
As part of this effort, how might we make {{pediawelcome}} nicer? I find this fact (the fact itself, and its mention in the template) particularly bitey: "Be aware that well-meaning Wikipedians have unfortunately found themselves blocked in the past for perceived disruption due to misunderstandings." - -sche (discuss) 00:55, 6 July 2012 (UTC)
If you think that's bad, then have a look at this old version, which includes such nuggets as " [] you very likely will be blocked, like many other Wikipedia admins". (I mean, O.K., yes, the current version is bad, too. But it could be worse!) —RuakhTALK 01:42, 6 July 2012 (UTC)
You people essentially claim that you want more contributors assisting, enriching and enhancing the project…but then when they do something that you dislike, you lash out and act disrespectfully and rudely towards them.
It gets tiresome. --Æ&Œ (talk) 02:11, 6 July 2012 (UTC)

Help:Reverting#When to and MediaWiki:Revertpage

At #Courtesy when reverting people's edits, an anonymous editor reiterated his/her previously-expressed outrage over having edits get rolled back. That particular editor . . . well, whatever . . . but for every one person who complains vocally about something, there are probably many others who share the general sentiment but are silent about it.

Wikipedia's w:Help:Reverting#Rollback advises that:

Rolling back a good-faith edit, without explanation, may be misinterpreted as "I think your edit was no better than vandalism and reverting it doesn't need an explanation". Some editors are sensitive to such perceived slights; if you use the rollback feature other than for vandalism (for example, because undo is impractical due to the large page size), it is courteous to leave an explanation on the article's talk page or on the talk page of the user, whose edit(s) you have reverted.

and our own Help:Reverting#When to is much more forceful, contending that:

It is only acceptable to revert another users edit if it is clearly and irredeemably nonconstructive. Reverting vandalism is obviously acceptable, as is reverting copyright violation and edits that do not conform to our Criteria for inclusion. In almost any other case it is better to constructively edit the page, and clean up the problem with the edit. This is because it moves Wiktionary forwards as opposed to stagnating on the old version. It is, for example, inappropriate to revert simply because "the old version was better", as a new version you could write would be better still.

In the same vein, I've observed that editors sometimes respond to a rollback by re-reverting, and that this quickly leads to blocks. I think this results from a sort of misperception or miscommunication: to admins, a re-revert can feel like willful bad editing on the part of the original editor, but I don't think that's really how the original editor sees it. Specifically — a newbie whose edits are rolled back is unlikely to register the implicit message being conveyed, which is (hopefully) something like, "trust me, I have a lot of experience here, I know what I'm talking about, your edit was wrong."

So I think we need to change something. Maybe multiple things. Some possibilities:

  • Changing Help:Reverting#When to to reflect our actual practice.
  • Making less use of the rollback feature.
  • Creating a number of "rollback edit-summaries" (one summary could be something like "rolling back vandalism", one could be something like "the entry already has that information", and so on), and using JavaScript to replace "rollback" links with some sort of simple interface (a pseudodropdown?) for performing a rollback using one of those summaries.
  • Modifying the default rollback summary, namely MediaWiki:Revertpage, to include something like "feel free to leave me a talk-page message if you disagree".

Any thoughts?

RuakhTALK 03:21, 6 July 2012 (UTC)

Updating the help page to reflect actual practice is definitely desirable. And it's perceptive that new editors may think an admin rolled back an edit as a result of misunderstanding it as vandalism. I think it's a very good idea to modify the auto-summary. A dropdown menu could be a bit much (at that point, it's not much different from "undo"), but perhaps the auto-summary could cover all the bases: "Your edit added information already present in the entry, or which does not belong in the entry. If you disagree, leave me a message." - -sche (discuss) 04:31, 6 July 2012 (UTC)
Although I agree that policy and practice should match, I think modifying the practice, not the policy, is the better answer. I've seen some comments along the lines "If you think my reversion was in error, please..." Making it clear that the reversion was a routine edit, not a judgment, is important. --BB12 (talk) 05:29, 6 July 2012 (UTC)
Help:Reverting#When to is a bit naive and/or misses out on some practical points. It's very undesirable to undo multiple bad edits individually rather than reverting; it takes ages. A lot of our admins and rollbackers would rather be making their own edits than undoing other people's. Also, after three years editing here, I've never heard of this page, which suggests it isn't in the least influential, so updating it does seem reasonable. Again, leaving a message on the entry's talk page is often undesirable as a waste of time both for the person leaving the message and for the people reading it. @BenjaminBarrett12 I've seen a couple of messages like that too. But how many, a couple a month? I don't think we should be pandering to the tiny minority ahead of the massive majority that are aware of Wiktionary practices and respect them. Mglovesfun (talk) 13:17, 6 July 2012 (UTC)
Reading it again, it's even more fanciful than I realized. While it's often 'better just to write a new version', reverting doesn't preclude that, also it assumes that people have unlimited time and effort. I think whoever wrote this passage should clean up every Wiktionary entry ever. That'd solve the problem. Mglovesfun (talk) 13:41, 6 July 2012 (UTC)
If writing "If you think my reversion was in error, please write on my page" takes too much time, can boilerplate be created for it? --BB12 (talk) 08:32, 7 July 2012 (UTC)
Ruakh's last point is that the default (auto-supplied) message can be modified, yes. - -sche (discuss) 10:34, 7 July 2012 (UTC)
I think each of those ideas has merit and may well help, but the last, modifying mediawiki:Revertpage, is additionally very easy to implement and pretty much cost-free. Perhaps instead of (or, better yet, in addition to) "feel free to leave me a talk-page message if you disagree", it should link to a brief help: page for those whose edits were reverted.​—msh210 (talk) 23:44, 8 July 2012 (UTC)
...and should not look the same as the WP reversion edit summary, as then no WPan will follow the link.​—msh210 (talk) 23:44, 8 July 2012 (UTC)
Done; of course, this is undoable at any time.​—msh210 (talk) 16:12, 11 July 2012 (UTC)
What the fuck? I don’t want to encourage assholes to pollute my already abysmal talk‐page. Could you please destroy that modification to the rollback messages? --Æ&Œ (talk) 00:13, 12 July 2012 (UTC)
If you don't want to discuss your edits (and reverts are edits too) then I'm not sure if Wiktionary is right for you. I agree with the change. —CodeCat 00:17, 12 July 2012 (UTC)
CodeCat, I do not desire to discuss my labours with bastards, which significantly differs from your claim that ‘I may not want to discuss my edits’. Are you going to insist that assholes are fine? --Æ&Œ (talk) 00:24, 12 July 2012 (UTC)
I suppose it should be 'if you disagree for legitimate reasons' but I can see why you wouldn't want to actually put that. Mglovesfun (talk) 13:20, 12 July 2012 (UTC)
How's this?​—msh210 (talk) 17:42, 12 July 2012 (UTC)

I for one totally object to this...due in part to some reasons Æ&Œ listed...Is there some way to make it so you only have to use this if you want to? Like something through the PREFS page perhaps? User: PalkiaX50 talk to meh 17:43, 12 July 2012 (UTC)

Fine. I've rolled mediawiki:Revertpage back for now pending further discussion.​—msh210 (talk) 17:47, 12 July 2012 (UTC)
  • It's certainly possible for individual administrators and rollbackers to customize their rollback-summaries using JavaScript such as the following:
    $( function ()
      { // Run once the page has loaded; visit every link.
        $('a').each
        ( function ()
          { // Leave non-rollback links alone.
            if(this.href.indexOf('&action=rollback&') == -1)
              return;
            // Leave links that already carry a summary alone.
            if(this.href.indexOf('summary=') > -1)
              return;
            this.href +=
              '&summary=' +
              encodeURIComponent
              ( 'Reverted edits by [[Special:Contributions/$2|$2]] ' +
                '([[User talk:$2|talk]]), restoring last version by ' +
                '[[User:$1|$1]]. If you think this rollback is in error, ' +
                'please leave a message on my talk-page.'
              );
          }
        );
      }
    );
    , but I think the default summary should include this sort of message, and those who want to remove it should have to customize their JavaScript accordingly (and thereby take on a higher standard of using the "rollback" feature only for clear vandalism).
    RuakhTALK 19:34, 12 July 2012 (UTC)
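The link-rewriting step of a script like the one above can be pulled out into a plain function, which makes it easy to check outside the browser. This is only a sketch: `addRollbackSummary` is a hypothetical name, the example URL is made up, and the `encodeURIComponent` call is an assumption about how the summary should be escaped for the URL.

```javascript
// Hypothetical helper: append a custom edit summary to a rollback URL.
// Returns the URL unchanged if it is not a rollback link or already has a summary.
function addRollbackSummary(href, summary) {
  if (href.indexOf("&action=rollback&") === -1) return href; // not a rollback link
  if (href.indexOf("summary=") > -1) return href;            // summary already set
  return href + "&summary=" + encodeURIComponent(summary);
}

const summary =
  "Reverted edits by [[Special:Contributions/$2|$2]] " +
  "([[User talk:$2|talk]]), restoring last version by [[User:$1|$1]]. " +
  "If you think this rollback is in error, please leave a message on my talk-page.";

// Example URL, made up for illustration.
const link = "/w/index.php?title=Foo&action=rollback&from=Bar&token=abc";
console.log(addRollbackSummary(link, summary));
```

Keeping the two checks as early returns means a link is rewritten at most once, even if the function is applied to the same link repeatedly.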
I agree. Is that somewhere for general inclusion, though?​—msh210 (talk) 22:41, 12 July 2012 (UTC)

Template:only in for SoP terms?

Looking at WT:RFD#local variable, I wondered if it would be nice to include a special notice on SoP terms, to explain why there is no entry and why it shouldn't be created. If we just delete it or if it's never created, editors who are not familiar with our SoP policy might think it's just a missing entry and create it. With a notice, we could make it clear that it's a SoP entry that shouldn't exist, and we could also include links to the individual parts. —CodeCat 13:11, 6 July 2012 (UTC)

Well, the deleted entry will show the deletion log. Personally I tend more to worry about the opposite: cases where some entry should exist, but the existing entry is garbage and gets deleted: many potential editors will see that an entry has previously been deleted, so will refrain from recreating it. —RuakhTALK 14:08, 6 July 2012 (UTC)
I agree completely with Ruakh's first sentence. There's no need for such a template.​—msh210 (talk) 23:46, 8 July 2012 (UTC)
Where would we place those terms? What would they be "in"?
We could simply be more explicit about being or becoming a phrasebook. There's nothing wrong with being a phrasebook. We are so far from being competitive with MWOnline and OED on their terms, that perhaps we should be more frank about including terms that are merely common collocations with particular meanings that predominate. IMO, is an example. As it is, we have many terms that are borderline includable/excludable on a strictly lexical basis. The great mass of potential contributors is not going to fill in the large gap between the number of truly lexical entries we have and the number in MWOnline (mostly technical, I think) and OED (mostly older Englishes?). But we would still need some CFI to avoid the nonsense entries that sank our last effort in the phrasebook direction. DCDuring TALK 15:18, 6 July 2012 (UTC)

Collection of words for possible addition to Wiktionary

For more than two years I've been collecting words while proofreading Wikisource:Wikisource:WikiProject Popular Science Monthly. The source runs from 1872 to 1912 and contains words from almost all academic, technical and scientific fields.

I've placed a list of ~1,300 words, from A to C inclusive, HERE. Some already exist; many don't, but I know from cursory checks of web dictionaries that they qualify; and some don't qualify.

The complete list is 13,000+ words so far, and a few words are added each day. I invite anyone interested to "dig" in and use the list to add the qualifying words to Wiktionary. My eventual goal is to transfer all collected words, broken down into pages of about 1,000 each.

If you need to contact me, please leave a message here, and I also have a link to my talk page on Wikisource. Ineuw (talk) 03:39, 7 July 2012 (UTC)

Thanks for thinking of us. I took a quick look and found that we have lower case versions of many of the terms. But we are always looking for wordlists. And scientific and technical lists are good. But especially prized would be relatively current technical and scientific wordlists.
If you can do so easily, could you wikilink future listings with the first letter in both upper and lower case? The upper case is useful for proper names and for taxonomic names, but our entries for other words are lower-case. If it is not convenient, of course, we can do it ourselves. DCDuring TALK 04:28, 7 July 2012 (UTC)
Thanks, the new page is exactly right for checking what we have and don't have. We'll have to get a feel for the kind of terms on the lists that are likely to lead to new entries. And I should have asked for an * or something to mark the capitalization that a word had in the source, if you capture that. DCDuring TALK 02:03, 13 July 2012 (UTC)

Okina or straight apostrophe for Tahitian?

Hello! I haven't contributed to en.wikt until now, so I'm a bit of a stranger here, but I'm an active user on fr.wikt. I'm here to ask for some guidance about the conventions for the tahitian (<== oh-oh red link ?) language. I checked your category and found neither words nor instructions to guide me (just, maybe, this: Index:Tahitian). I'm currently adding Tahitian words to the fr project, so I'm ready to kill two birds with one stone by adding some words here too. But before adding anything here I need to know what you decided about the okina (==> ʻ ). I saw that you redirected the French words which use this kind of apostrophe (sorry, I don't know its English name), so which character should I use for Tahitian entries: the okina ʻ or the straight apostrophe ' ? On fr we use the okina for the moment, as the Tahitian Academy is supposed to encourage, but in practice it's real chaos: almost every kind of apostrophe is used on the net and on paper. The okina is really important in this language and is the letter for the sound /ʔ/, but maybe you'll prefer to use the straight apostrophe as in the French entries. I'll let you decide, and I'll start work once you have a consensus. Thank you for your attention. V!v£ l@ Rosière /Murmur…/ 06:53, 7 July 2012 (UTC)

I've been trying to get some kind of consensus to deal with this chaos, but so far, nothing much. The two of us who've been most active recently in the Polynesian languages decided to use the real okina for Hawaiian since that seems to be important to native speakers, and have sort of winged it on the rest. The main thing was to be consistent within a language, so Tongan are all the okina, while we've used the straight apostrophe for the rest just by default.
We don't have a huge number of Tahitian entries (about 200), so changing how we do things wouldn't be that hard. If you're going to be adding a good number of Tahitian entries, I, for one, would be happy to accommodate whatever you think is best. The straight apostrophe is certainly more convenient, but convenience isn't everything.
By the way, the best way to find resources for most languages on Wiktionary is through categories such as Category:Tahitian language. Unless there's an automated way set up for an index to be updated (as there is for some of the main languages here), the index tends to have just a fraction of what's really out there in the dictionary. Chuck Entz (talk) 07:34, 7 July 2012 (UTC)
Don't worry about tahitian being a red link; language names are always capitalized in English, and Tahitian is a blue link! As I've stated elsewhere, I think it's important that we use the okina correctly and not substitute the straight apostrophe or even the curly left-hand apostrophe for it, although redirects from pages using those characters to the pages using the okina are OK. Since we currently don't have many Tahitian entries, this is a good time to establish the standard. —Angr 07:48, 7 July 2012 (UTC)
Oh yes, sorry, two big mistakes. First, I totally forgot that you capitalize language names (my bad); second, I have just seen that you already have some words in subcategories (I had only looked at the main one, as on fr). Personally, I also prefer to follow the rules set out by the Tahitian Academy, using the okina and the macron and respecting the native institution, but I have no problem creating redirects or variant pages if needed. If the English community still prefers the apostrophe after all, I'm fine with that too. However, the macron, which indicates a long vowel, must appear, in my opinion. V!v£ l@ Rosière /Murmur…/ 10:58, 7 July 2012 (UTC)
When it comes to the "English community", there really isn't an opinion on this. As far as Tahitian entries are concerned, there isn't much of a community at all. Since there's a standard that says ʻokina is preferred, I think we should go with that. We have about 50 entries that will be affected, so I'm going to get started moving them now. I'll try and keep an eye out for entries that should have a macron, too. We should avoid hard redirects, because other languages may have words spelled the same, but with an apostrophe. Chuck Entz (talk) 21:55, 7 July 2012 (UTC)
I think the chances of that are in most cases slim enough that we can use hard redirects unless and until we encounter a word in some language where the apostrophe is actually correct. —Angr 22:36, 7 July 2012 (UTC)
Yes, but careful: in Polynesian languages the okina is considered a letter, not just a typographical alternative. Changing that would be like writing French or another language without its proper diacritics. It's as if you used prēsent here instead of présent for French: it seems a minor change to us, but it can be a real misrepresentation for them, and we should keep that in mind. V!v£ l@ Rosière /Murmur…/ 23:22, 7 July 2012 (UTC) Sorry, total misunderstanding; that's my signal for bedtime, hahaha. V!v£ l@ Rosière /Murmur…/ 23:25, 7 July 2012 (UTC)
(edit conflict) Thanks, I'll first check your current words in order to correct the macrons where needed (when I'm able to verify them, I mean), but here too it seems they aren't strictly respected in current use by Tahitians. Yes, the redirect problem was raised on our side too, and we didn't find the perfect solution on fr. But, from what I've seen, Polynesian languages are quite distinctive and a lot of their words are very long. So I think we can permit hard redirects for the long ones (more than 5-6 letters), because it's really improbable that a language outside the Polynesian family uses exactly the same spelling; that's why I think long words shouldn't cause any trouble (this is just a supposition; a few exceptions may exist). The more probable case of homographs will be inside the Polynesian language family itself (where sometimes 70% of the lexicon can be shared), and I guess almost all Polynesian languages chose to use the okina, but I'm really not sure of this; it needs to be confirmed. So here too, that shouldn't be a big deal.
However, real redirect problems occur with short words, like ʻa in Tahitian (U+02BB/okina + letter) and ’a (U+2019 + letter) in Neapolitan, so we tried that solution (fr.wiktionary link) on our side.
It's labelled Variante par contrainte typographique, in English "Typographic constraint variant". It shows the different languages affected and leaves the page open for new entries to be added. This solution isn't perfect on fr because it categorizes this unofficial "variant". But here there will be no problem, because your language headers don't categorize entries, so we can add this label without including the word in the Tahitian category. What do you think of this? We could also extend it to longer words, but that's fairly useless in my opinion. V!v£ l@ Rosière /Murmur…/ 23:09, 7 July 2012 (UTC)
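The characters at issue look alike but are distinct code points, which is why ʻa and ’a are different page titles. A minimal sketch of telling them apart and of rewriting a title to use the okina; `apostropheLike` and `toOkina` are hypothetical helpers for illustration, not existing Wiktionary tools, and treating U+2018 as a variant is an assumption based on the curly left-hand apostrophe mentioned above.

```javascript
// Apostrophe-like characters that come up in this discussion.
const APOSTROPHE_LIKE = new Map([
  ["\u0027", "U+0027 straight apostrophe"],
  ["\u2018", "U+2018 left single quotation mark"],
  ["\u2019", "U+2019 right single quotation mark"],
  ["\u02BB", "U+02BB okina (modifier letter turned comma)"],
]);

// Hypothetical helper: name the apostrophe-like characters found in a title.
function apostropheLike(title) {
  return Array.from(title)
    .filter((ch) => APOSTROPHE_LIKE.has(ch))
    .map((ch) => APOSTROPHE_LIKE.get(ch));
}

// Hypothetical helper: rewrite every apostrophe variant as the okina.
function toOkina(title) {
  return title.replace(/[\u0027\u2018\u2019]/g, "\u02BB");
}

console.log(apostropheLike("\u2019a"));         // the U+2019 form
console.log(toOkina("'a") === "\u02BBa");       // true
console.log("\u02BBa" === "\u2019a");           // false: different titles
```

A blanket rewrite like `toOkina` is only safe within a language that uses the okina; applied to a title from a language where the apostrophe is correct, it would produce the wrong spelling.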
Are ā, ē etc different letters from a, e in Tahitian, or does the macron only indicate vowel length? If the latter, the macrons should be used in headwords but not pagetitles (like Latin). - -sche (discuss) 23:50, 8 July 2012 (UTC)
They're not different letters, and they do only indicate vowel length, but it's not like in Latin: the macrons aren't just a dictionary convention, but are actually used in running text. (Well, to some extent. The Academy promotes their use, but they're usually just omitted.) (Obviously I don't speak Tahitian, this is just going by what Wikipedia and other such sources say.) —RuakhTALK 01:35, 9 July 2012 (UTC)
Yes, same letters. From what I've seen, they are usually used in recent printed works and often omitted on the internet (sites, forums, etc.). V!v£ l@ Rosière /Murmur…/ 07:02, 9 July 2012 (UTC)
  • @-sche, this is probably the same as for Hawaiian or Māori, where use of the macrons online is inconsistent (due to the technical problems of ASCII and limited input methods), but where the respective language authorities advocate their use, and where vowel length is contrastive (and thus should be part of the lemma). For instance, Hawaiian na (by, for, belonging to) is contrastive with nā (plural definite article; to moan, to wail; to be calmed, quieted, settled), and na keiki ("of or for the child") contrasts with nā keiki ("the children"). -- Eiríkr Útlendi │ Tala við mig 22:24, 9 July 2012 (UTC)
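To see what dropping macrons from page titles would do to such pairs, here is a small sketch using Unicode normalization. `stripMacrons` is a hypothetical helper, and removing only U+0304 (the combining macron) is an assumption about which mark matters here.

```javascript
// Hypothetical helper: remove macrons (U+0304 combining macron) from a word.
function stripMacrons(word) {
  return word
    .normalize("NFD")        // split "ā" into "a" + U+0304
    .replace(/\u0304/g, "")  // drop the combining macron
    .normalize("NFC");       // recompose any other combining marks
}

console.log(stripMacrons("n\u0101") === "na");                          // true
console.log(stripMacrons("n\u0101 keiki") === stripMacrons("na keiki")); // true
```

Once the macron is stripped, the two Hawaiian phrases collapse into the same string, which is the practical argument for keeping the macron in the lemma where vowel length is contrastive.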
I'm also okay with redirects and copious uses of the template {{also}}. Mglovesfun (talk) 23:12, 7 July 2012 (UTC)
OK, thanks for the guidance; I'll start the work and apply it. If there are any objections or anything else, just contact me here. I don't know all the English rules, conventions, etc., so don't hesitate to rectify, correct, categorize or format my contributions while patrolling. I'll follow all of them so that I can do better next time. PS: I'm neither a Tahitian speaker nor a native English speaker, so of course I'll add only words whose Tahitian meaning and French and English translations I'm 100% sure about (like dog, rainbow, etc.) to avoid mistakes. V!v£ l@ Rosière /Murmur…/ 07:20, 9 July 2012 (UTC)
I've pretty much finished moving the entries with straight apostrophe to the equivalent with ʻokina and macrons. 'e is confusing because the remaining sense doesn't seem to match what I could find in sources that show macrons, so I wasn't sure where to move it. I had similar problems with 'o. I'm also going to go through Appendix:Swadesh lists for Austronesian languages and the Proto-Polynesian, Proto-Malayo-Polynesian and Proto-Austronesian appendices to see if I missed anything else. Just from my experience with the entries so far, I suspect that there are lots of entries that will need to be moved to spellings with macrons. For the most part, I didn't leave redirects: there are several Polynesian languages where all the other entries use straight apostrophe, and that could have similar words (the entries where the glottal stop corresponds to a velar nasal in related languages being the main difference). We can always create redirect pages later as needed. I also don't have much experience with this kind of mass-move, so my use of {{also}}, etc., may need some tweaking. Chuck Entz (talk) 13:59, 9 July 2012 (UTC)

Thanks so much, especially VLR and Chuck, for bringing this up and dealing with it. The little I've seen of the mass move looks good to me, including all the also's. Anyone interested in seeing if we can solve the Wallisian problem while we're at it? I seem to remember they are quite erratic about 'okina usage and pedia says something about a strange symbol I've never seen that Unicode doesn't support. Any thoughts? --Μετάknowledgediscuss/deeds 15:14, 10 July 2012 (UTC)

After reading Metaknowledge's answer this morning, I started looking on the web for something about Wallisian, and by chance I found a note written for Tahitian teachers in 2008 (yes, that diverted me from my first goal). It comes from a department of the Polynesian Education Ministry and spells out normalisation rules... and it's really different from what Wikipedia says about the okina and from what I have in my lexicon (from 1995). I started a discussion on the fr project; I'm tired, so I'll bring the translation and the facts here tomorrow. Sorry, but it seems we may have to revert the moves. If you can read French and don't want to wait until tomorrow, I've left the link here. V!v£ l@ Rosière /Murmur…/ 21:08, 11 July 2012 (UTC)


Nailed down.

The translation section is headed "euphemistic: dead". Would this entry be better with 2 translation sections:

  • dead (euphemistically)
  • dead (non-euphemistically) see dead

This question was prompted by there being 2 Greek translations (removed) which were not euphemistic. — Saltmarshαπάντηση 10:22, 8 July 2012 (UTC)

I suspect the problem may be scarce enough that no separate translation box is necessary. Is perhaps "euphemism: dead", "dead (euphemism)", or "dead: euphemism" clearer in its intent than "dead (euphemistically)"?​—msh210 (talk) 07:46, 10 July 2012 (UTC)
Little harm would come from using {{trans-see}}. I think this problem of register for translations is not so rare. Is there consensus that the translations should only be in the same register? DCDuring TALK 14:50, 10 July 2012 (UTC)
I agree little harm would come from using {{trans-see}}, and didn't mean to imply otherwise. I don't think it's necessary, but have no objection.​—msh210 (talk) 19:12, 10 July 2012 (UTC)

Doomsday virus

Hopefully none of us have been infected with this. If you have, you could lose your internet connection tonight at midnight. See here for a description and some easy tools to check and fix. —Stephen (Talk) 00:04, 9 July 2012 (UTC)

Appendix:English terms where 'ch' sounds as 'sh'

(Jamesjiao suggested that I enquire about this here.)
Does anybody here believe that this appendix is a good idea? --Æ&Œ (talk) 05:12, 10 July 2012 (UTC)

I do. — Ungoliant (Falai) 06:00, 10 July 2012 (UTC)
Sounds like what categories were made for.​—msh210 (talk) 07:48, 10 July 2012 (UTC)
Hell no. We already have too many categories. An appendix is the perfect place for this. -- Liliana 13:34, 10 July 2012 (UTC)
We already have too many appendices. A category is the perfect place for this. —RuakhTALK 13:59, 10 July 2012 (UTC)
We have neither too many categories nor too many appendices, but this strikes me as being more typical appendix material than category material. —Angr 14:02, 10 July 2012 (UTC)
We already have too many pages at Wiktionary. Wikipedia is the perfect place for this. -- Liliana 14:04, 10 July 2012 (UTC)
By this point, you guys had me ROFL.--Jacecar (talk) 07:56, 4 October 2012 (UTC)
If our goal is to have entries for all words in all languages, we can hardly worry about having "too many pages". If you try putting this at Wikipedia, they'll nominate it for deletion and/or transwikiing back here on the grounds that "Wikipedia is not a dictionary". —Angr 14:27, 10 July 2012 (UTC)
'Twas a joke. -- Liliana 14:44, 10 July 2012 (UTC)
An appendix seems like a better place to start, especially since the category header structure is not conducive to innovation by a contributor. DCDuring TALK 15:43, 10 July 2012 (UTC)
I think a category is better suited for maintaining lists of words without further information per word. Which is the case here... unless we want to add annotations to some of the words. —CodeCat 16:33, 10 July 2012 (UTC)
Agreed. I'm not familiar with /sʰ/ as a phoneme, however. DAVilla 04:14, 22 August 2012 (UTC)
What do you mean by "category header structure"? —RuakhTALK 18:20, 10 July 2012 (UTC)

Pseudolanguage and Macaronic Terms

We keep having terms like Gott in Himmel, noli illegitimi carborundum, and ‎разблюто come up for discussion, and it seems to come down to: "these are made up by speakers of other languages, but they don't belong under those languages- what do we do?". We need to figure out a consistent and logical way to deal with these. Do we make entries under the coiners' languages? Do we make entries under the languages they're alleged to belong to? Do we make them translingual? Or do we create (a) separate "language(s)" along the same lines as translingual or the conlangs? Or do we relegate them to an appendix?

The first option would probably be best handled with some kind of context labels and/or categories to set them apart from regular terms of the language, and likewise for the second. There would also be problems from the tendency to use letter-combinations and even scripts that are otherwise impossible for a given language. The translingual option runs into the problem that they're usually very language-specific, even if they don't actually belong to either language. The separate-language option would force creation of a new "language" whenever a given language fakes a term in a language we haven't covered. The appendix option leaves a hole in mainspace which people will keep trying to fill by creating entries.

Thoughts? Suggestions? Chuck Entz (talk) 18:37, 13 July 2012 (UTC)

I'm not entirely certain about the technical aspects, but my vote would be for putting these in an appendix, and if needed, creating redirects (hard or soft, as appropriate) from the expected lemma location(s) to the appendix entry. -- Eiríkr Útlendi │ Tala við mig 20:56, 13 July 2012 (UTC)
They seem a bit like a more extreme form of 'eye dialect'... terms that are used to create a certain impression of foreignness. And as such they would be clearly English, despite appearances, because they are used by English speakers to an English audience. —CodeCat 21:00, 13 July 2012 (UTC)
The first two look like they're handled correctly. The Russian is an interesting compromise but should probably be appendicized assuming verification in Russian comes up empty. DAVilla 04:08, 22 August 2012 (UTC)

A closer look at the Appendix: namespace

Presumably this was once inspired by the appendixes of paper dictionaries, but as it stands now, the appendix namespace is used for all kinds of things without a clear guideline. The current practice seems to be 'put anything that doesn't belong in the main namespace there'. In particular, we have entry-type pages (reconstructed and constructed terms), linguistic article-type pages (pages about spelling, grammar, pronunciation etc.), lists (slang, abbreviations, words with special properties, even some topical lists) and other kinds. Category:English appendices gives a good idea of what's out there. I do think that a lot of those list pages are superfluous and should probably be converted into categories instead, but that's another story. It seems obvious to me that the 'appendix' namespace is too vague and too disorganised; we might as well call it the 'Stuff' namespace. So I would like to propose:

  • A separate namespace for reconstructed terms. Presumably a separate namespace for constructed terms should also be created, or it could be combined into the one for reconstructed terms. Having a separate namespace also has advantages for templates and bots, because there would be a clear way to tell which pages are entries (and therefore follow WT:ELE) and which are not, which can aid bots and abuse filters with tagging errors in those pages as well. (An alternative proposal that was made a while ago was to merge (at least) reconstructed terms into the main namespace, but have their names prefixed by *. This approach has some technical flaws, as templates would no longer be able to sort pages without a sort key.)
  • Perhaps also a separate namespace for pages concerning spelling, grammar, pronunciation and other general information about languages as a whole.
  • Converting lists into categories where possible. The lists were probably originally created so that we could list terms even if their entries did not exist yet, as a category would have been incomplete in that case. But this approach is flawed for that same reason: a list can also be incomplete. In addition, a list is more difficult to maintain because it is less obvious to editors that it should be updated (whereas categories just need entries placed in them). And let's not even get started on having both a category and a list... there could be terms in both, terms in one but not the other, or terms in neither. Not good.
  • If there are enough pages remaining in the Appendix namespace after we're done cleaning it up, we can consider giving it a more descriptive name.

CodeCat 20:39, 13 July 2012 (UTC)

Have you seen Wiktionary:Index to appendices? That was an attempt by me to index the whole mess. I agree that we don't need all of the pages, as some are clearly obsolete by now. -- Liliana 20:42, 13 July 2012 (UTC)
Yes, and it's very helpful! It's not even complete, though, as it doesn't mention reconstructed terms in attested languages, like Appendix:Vulgar Latin/montanea. They would presumably be treated the same as reconstructed terms in entirely unattested languages, though. —CodeCat 20:51, 13 July 2012 (UTC)
I don't think we should just clear things out of the appendix namespace. Getting some rules and deleting unwanted stuff sounds great, but let's not get desperate to move seemingly valid things out of the appendix namespace into another namespace in order to have less pages using the namespace. I don't see what in real terms this would achieve. Mglovesfun (talk) 15:30, 14 July 2012 (UTC)
At the very least I think there should be a namespace for non-mainspace entries. The others are not as important. —CodeCat 16:16, 14 July 2012 (UTC)
I don't see why we couldn't just have unattested terms in the main namespace, without being prefixed or anything. The {{reconstructed}} template makes things clear enough. --Yair rand (talk) 21:30, 16 July 2012 (UTC)

Q about header levels in exceptional cases

(Mostly copied from User_talk:Liliana-60#Q_about_Kassadbot_.26_level_headers_for_JA_entries.)

I recently ran across a change made by Kassadbot here on the JA entry 今日, moving a Coordinate terms header from L3 to L5.

JA entries pose a bit of an organizational problem, as one lemma may have multiple etymologies and multiple readings and POSes. Various items that are usually L4 or lower according to WT:ELE, and would thus hierarchically only apply to the parent header, may actually apply to the whole entry. 今日, for instance, has three distinct etymologies, each with specific readings. However, all three etyms have the same meaning, and the items listed under Coordinate terms apply equally to all three etyms.

The simplest way to indicate this would be to just have Coordinate terms at L3, on par with all of the etyms. However, this breaches WT:ELE and thus Kassadbot (and possibly other bots too) will demote the header (producing an entry that gives incorrect information) or mark the entry as needing attention.

Does anyone here know of an elegant way to make this work?

-- TIA, Eiríkr Útlendi │ Tala við mig 20:46, 13 July 2012 (UTC)

If all three readings have the same meanings, then maybe it doesn't make sense to treat them as three separate terms? —RuakhTALK 23:17, 13 July 2012 (UTC)
Hmm, perhaps, but numerous JA terms have single spellings (i.e. lemma forms) that have divergent pronunciations, each with independent etymologies and shades of meaning. It's a bit like if English undoubtable and indubitable were spelled the same but still had separate pronunciations and etymologies. Or as a more radical example, imagine if English doubt and disbelief somehow had the same spelling, but still had separate pronunciations and etymologies, and which one you used depended on the context.
For 今日, for instance, the kyō reading is from Old Japanese and is a compound of OJP /ke/ "this" and /hu/ "day", with /hu/ an ablaut version of /hi/ (). Meanwhile, konnichi is from Chinese, and konjitsu is from Chinese but with the reading tweaked in Japan at a later date. All three can mean today; the Chinese-based readings can mean nowadays; and the two different Chinese-based readings carry different connotations and are used differently. The 今日 entry could certainly use expansion to lay this all out clearly, and I'm adding the term to my to-do list. However, despite the identical kanji spelling of 今日, I feel very strongly that treating kyō, konnichi, and konjitsu as one and the same term would be a grave mistake. See, for example, the current state of the 明日 ("tomorrow") entry, which I consider to be severely defective for all the information that is left out or glossed over. (Also adding to my to-do list.).
As far as I currently understand things, language names should be L2 headers, and lemmata with multiple etymologies should list those at L3 with information pertaining to specific etyms at L4 or below under the relevant etym header. What I'm hoping to find is a format for indicating information that applies to multiple etymologies, and that does not fall afoul of bot autocorrection rules. I suppose we could just duplicate the coord terms section at 今日 under each etym, but that is clumsy at best, and raises the risk of inconsistencies arising in the duplicates over time. -- Eiríkr Útlendi │ Tala við mig 21:15, 16 July 2012 (UTC)
Just out of curiosity, how does someone reading a Japanese text know whether kyo, konnichi, or konjitsu is meant? (Since, as you say, the connotations are different, this can be important; on the other hand, since all mean "today" I imagine it would often be hard to tell.)​—msh210 (talk) 22:38, 17 July 2012 (UTC)
That's the fun of gaining Japanese literacy. :-/
Japanese is an extremely contextual language, much more so than English, and this extends to the written form as well. 今日 is usually read as kyō, and that's the most common reading. If the context calls for a meaning more like nowadays, then the reading shifts. If the word is used as nowadays and is followed by topic particle (wa), then the speaker will likely use the reading konjitsu, as konnichi followed by wa seems to be generally reserved for 今日は (konnichi wa), the Japanese version of the greeting good day. If the writer intends a specific reading, perhaps in poetry or in other cases where the sound of the word is important, the writer might use furigana or other w:Ruby text to specify the reading. Writers sometimes apply readings based on foreign words or even nonce words, relying on the kanji for the meaning; this is especially common in manga.
In terms of connotational differences, words based on native Japanese roots tend to be more basic while words based on Chinese roots tend to carry a higher register and have fancier connotations, a bit like how words with Anglo-Saxon vs. Latinate roots are treated in English. As such, OJP-derived kyō is the most common reading and is the most prosaic, while Chinese-derived konnichi and konjitsu are somewhat more formal and entail a higher social register (with the exception of konnichi being used in the everyday greeting konnichi wa).
As best I can tell, the distinction between konnichi and konjitsu is subtler, and it might boil down to different sound textures and attendant alliterative allusions. Again, konnichi is used in the everyday greeting, and thus might be regarded as a bit less formal; konjitsu sounds almost the same as honjitsu, formal for “this day, today”, and thus an author or speaker might choose to use or avoid konjitsu if also using honjitsu in close proximity. There might also be a regional dialect dimension to this -- I spent a few years in the w:Tōhoku and northern w:Kantō regions of Japan and recall hearing this particular konjitsu reading, but I note that some dictionaries don't include this reading, leading me to suspect that it might not be so common in other parts of Japan. -- Cheers, Eiríkr Útlendi │ Tala við mig 19:38, 19 July 2012 (UTC)
Thanks.​—msh210 (talk) 23:09, 19 July 2012 (UTC)
(after e/c) Presumably without greater ease than someone reading an English text knows which of the many similar senses of "set" is meant — or "epizootic", an English word with sense-specific pronunciations. - -sche (discuss) 19:46, 19 July 2012 (UTC)
Having a Coordinate terms at L3 doesn't breach WT:ELE as clearly shown in Wiktionary:Semantic relations#Coordinate_term. Unless that example is a mistake of course...Fedso (talk) 19:14, 21 September 2012 (UTC)

Vulgar Latin appendices

Should the descendants section of Vulgar Latin appendix entries (such as Appendix:Vulgar Latin/montanea) be split into branches, like we do with PIE? If so, is Ethnologue’s classification of Romance languages acceptable? — Ungoliant (Falai) 22:14, 16 July 2012 (UTC)

I dunno. Mglovesfun (talk) 19:43, 19 July 2012 (UTC)

Definitely, the more clear 'Descendants' sections are, the better. --Μετάknowledgediscuss/deeds 16:55, 20 July 2012 (UTC)

Category:Mandarin han characters

I proposed this a few months ago to deal with uncategorized Mandarin entries (would work for other Chinese languages too) but it got sidetracked as an IP suggested that what I was suggesting was actually to exclude Middle Chinese entries. I don't see the link but maybe someone can explain it to me.

Anyway, my proposal is in the same way we have Category:English letters is to categorize individual CJKV characters used in Mandarin in Category:Mandarin han characters. I do not propose to exclude Middle Chinese (again, I still don't get it). Mglovesfun (talk) 14:12, 17 July 2012 (UTC)

  • If your proposal concerns the characters themselves, those are translingual, so the "Mandarin" label might be what's throwing off that IP user. "Mandarin", as I understand it, is mostly used to indicate modern standard Chinese based on the Beijing dialect, and thus specifically excludes Middle Chinese, Cantonese, Taiwanese, etc. What about an all-encompassing title like Category:Chinese characters? This excludes Japanese kana and Korean hangul, but would include Japanese kanji and Korean hanja, as these are essentially Chinese characters (and the respective words literally mean "Han / Chinese characters"). This would also avoid any unintended exclusion of other varieties of Chinese.
If your proposal is purely limited to that subset of Chinese characters that are used in Mandarin, then what about a title such as Category:Chinese characters used in Mandarin? -- Eiríkr Útlendi │ Tala við mig 20:14, 17 July 2012 (UTC)
We do have Category:Han characters, I think Mglovesfun wants to split that by language. I don't think that's a good idea personally. -- Liliana 20:29, 17 July 2012 (UTC)
But what about Category:Spanish letters and Category:Hungarian letters, surely they are just the same letters but used in different languages? But we don't have Category:Latin script characters used in Spanish. Mglovesfun (talk) 22:24, 17 July 2012 (UTC)
  • I note that Category:Hungarian letters appears to list only those letters unique to Hungarian. The cat title in this case seems appropriate -- these letters are used in Hungarian and (AFAIK) nothing else, ergo they can reasonably be called "Hungarian letters".
Meanwhile, Category:Spanish letters lists the whole of the Latin alphabet as used in Spanish. This doesn't seem right, as most of these letters are commonly used across the whole range of written languages that use the Latin alphabet, so in this case, I would be happier if the title were indeed Category:Latin script characters used in Spanish or something of that ilk. -- Eiríkr Útlendi │ Tala við mig 23:03, 17 July 2012 (UTC)
Huh? Category:Hungarian letters includes e, h, é, and ö, all of which are used in languages other than Hungarian. —Angr 23:07, 17 July 2012 (UTC)
That's me moving too quickly. I clicked through, saw very little, and took what first met my eye -- letters like Ű and Ő. Doh.
Taking a second look, that's a very odd category. Delete? Rework, with considerable expansion? -- Eiríkr Útlendi │ Tala við mig 23:24, 17 July 2012 (UTC)

Peace Corps manuals available

The Peace Corps of America has kindly offered Wiktionary the use of their language manuals, provided proper attribution is provided. I have added Wikipedia links and ISO codes, though several are unclear. The list is at: Available References. You can contact me or leave a note on that page if interested. I hope this all works out! --BB12 (talk) 07:03, 19 July 2012 (UTC)

What are the terms here? Do we have OTRS? And why Wiktionary; if we have a language manual, it should go to Wikisource, and be referenced from there.--Prosfilaes (talk) 10:12, 19 July 2012 (UTC)
I contacted the Peace Corps, and someone responded with their list of manuals, saying we can cite them provided attribution is made. According to Wiktionary, OTRS means: "The problem center responsible for all of the Wikimedia Foundation activities, complaints, investigations and related legal matters." I don't know anything about that.
I have no problem with Wikisource but know nothing about it. I have trouble imagining trying to go to Wikisource for every single word in a manual, but maybe I don't understand the process correctly. Please advise.
FWIW, I don't think it would show good manners to request all of their manuals in one bunch; rather, asking for a couple of manuals that there is interest in would be better. I am working with the Zambia manual and Ungoliant has requested the Portuguese manuals. --BB12 (talk) 18:46, 19 July 2012 (UTC)
I think that Prosfilaes is just saying that Wikimedia can't just trust that we won't be infringing copyright, but instead need some sort of proof, registered at OTRS. That said, I really don't know anything about it, although I think Angr and probably many other users are quite familiar with it for other Wikimedia projects. Wikisource hosts texts; in any case I think that Wikisource would be incompatible copyright-wise with the manuals. I don't know about what manners may be involved, but in the meantime, I would like to get my hands on the Bislama manual (figuratively speaking) if you can get that one. By the way, I don't necessarily know what's going on any better than anyone else, so Prosfilaes or anybody else, please feel free to correct me. --Μετάknowledgediscuss/deeds 19:02, 19 July 2012 (UTC)
Permission for a Wikimedia project to use a copyrighted work is never sufficient. It has to be free to use for everyone, not just us. And it's generally required to have them send an email to OTRS if it's a copyrighted work, so if someone complains in the future, the foundation will have a permanently stored record that it does have permission to use it.--Prosfilaes (talk) 20:45, 19 July 2012 (UTC)
My contact is on vacation right now, but I will contact her when she comes back and ask about the Bislama manual. BTW, the manual I have is images in a PDF. If that can be OCRed accurately, that would be great. --BB12 (talk) 20:26, 19 July 2012 (UTC)
If we have a PDF that's going to be stored on WMF servers, it should be uploaded to Commons. It would probably be better to have it transcribed on Wikisource and referenced here, but once it's on Commons that's not a big deal.--Prosfilaes (talk) 20:45, 19 July 2012 (UTC)
I have no intention of storing PDFs on WMF servers. Should I do so? --BB12 (talk) 21:21, 19 July 2012 (UTC)
Why not? If they release the material under a free license, it can be stored at Commons. Why wouldn't you want to store it there? Also, if they release the material under a free license, we can both have it digitized at Wikisource and cannibalize it as reference material here. They can't simply give permission for Wiktionary alone to use it; they have to license it freely so that anyone can. —Angr 11:25, 20 July 2012 (UTC)
Storing on the Commons sounds fine, then. There is a small amount of information on Wikisource about OCRing, but not much. I don't know that I want to get involved with Google Tesseract myself. Are there people on Wikisource who love doing that sort of thing? Also, how accurate is OCRing with English/non-English text? --BB12 (talk) 17:48, 20 July 2012 (UTC)
@Prosfilaes: Wiktionary, not Wikisource, seems appropriate if we'll be using the manuals as references and citations, rather than uploading them in their entirety. (We can cite copyrighted works as references and citations without the copyright-holder's consent, but I think consent was obtained in this case because there's an expectation that terms may be imported en masse after some checking to ensure they're lemmata and not inflected forms.) - -sche (discuss) 19:24, 19 July 2012 (UTC)

Help decide about more than $10 million of Wikimedia donations in the coming year

(Apologies if this message isn't in your language. Please consider translating it)


As many of you are aware, the Wikimedia Board of Trustees recently initiated important changes in the way that money is being distributed within the Wikimedia movement. As part of this, a new community-led "Funds Dissemination Committee" (FDC) is currently being set up. Already in 2012-13, its recommendations will guide the decisions about the distribution of over 10 million US dollars among the Foundation, chapters and other eligible entities.

Now, seven capable, knowledgeable and trustworthy community members are sought to volunteer on the initial Funds Dissemination Committee. It is expected to take up its work in September. In addition, a community member is sought to be the Ombudsperson for the FDC process. If you are interested in joining the committee, read the call for volunteers. Nominations are planned to close on August 15.

--Anasuya Sengupta, Director of Global Learning and Grantmaking, Wikimedia Foundation 20:00, 19 July 2012 (UTC)

Distributed via Global message delivery. (Wrong page? Fix here.)

Definition of durable

PARADISEC has a collection of materials and Metaknowledge suggested I ask here to find out whether the community considers the collection to be durably archived or not.

According to their about page, they have "established a framework for accessioning, cataloguing and digitising audio, text and visual material, and preserving digital copies. The primary focus of this initial stage is safe preservation of material that would otherwise be lost, especially field tapes from the 1950s and 1960s." Among their records available online and provided with the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License is a glossary of words in Sesake (North Efate (llp)).

This looks like durable archiving to me. What do others think? --BB12 (talk) 05:54, 20 July 2012 (UTC)

It looks like they aspire to be a durable archive. How are they funded, staffed, and hosted? DCDuring TALK 10:22, 20 July 2012 (UTC)
They are funded by the Universities of Sydney, Melbourne, and New England, Australian National University and the Australian Research Council. They are hosted at the Universities of Sydney and Melbourne and ANU. PARADISEC is headed by w:Nicholas Thieberger and Linda Barwick. --BB12 (talk) 16:27, 20 July 2012 (UTC)
"Creative Commons Attribution-Noncommercial-Share Alike 3.0 License" is not compatible with CC-BY-SA, because of the "Noncommercial" part. Wiktionary cannot use that material. --Dan Polansky (talk) 17:52, 20 July 2012 (UTC)
Most books are not compatible with CC-BY-SA. That doesn't prevent us from using them as sources of quotations. —RuakhTALK 20:26, 20 July 2012 (UTC)
The material in question includes a glossary apparently published under an unsuitable license. Consider this page containing what looks like a triplet of (English word, non-English word, non-English example sentence); if translation pairs should not be systematically copied from a copyrighted source into Wiktionary, the glossary should not be systematically copied into Wiktionary. Copying example sentences from that glossary into Wiktionary would be a bit like copying example sentences from a commercial dictionary into Wiktionary. --Dan Polansky (talk) 20:40, 20 July 2012 (UTC)
Yes, sorry; I see now that you were replying to BB12's last sentence, which veered off on a tangent. I agree with you that we can't steal their glossary. —RuakhTALK 21:36, 20 July 2012 (UTC)

Thank you for the discussion of the copyright issue. How about durability? Do people consider these materials to be durably archived? --BB12 (talk) 22:05, 20 July 2012 (UTC)

I would venture to say yes since archiving is one of their stated purposes, plus the whole means, motive, and opportunity bit. DAVilla 03:20, 22 August 2012 (UTC)

Reward or bounty board

What do people think of having a page for specific rewards for completion of Wiktionary tasks? Wikipedia has the WP:Reward board that includes rewards of money, goods, tit-for-tat editing, and barnstars (which we probably utilize here). The talk-page archive also lists the various pros and cons of such a page. Our problems with violating NPOV are probably a bit different than enwiki's. There've been several times when I've thought it would be nice to pay someone to do something, especially those repetitive tasks that can't be fully automated. As long as the "completer" knows about the potential de-motivational aspect of switching to monetary reward I would encourage people to make appropriate transactions here (because they can always be done somewhere else). We would probably want to have versions of WP:Wikipedia:Paid editing, WP:Conflict of interest and related pages. Relatedly, would there be interest in having a local version of WP:Bounty board where the payment is to Wikimedia in the name of the bounty hunter? --Bequw τ 17:59, 20 July 2012 (UTC)

No. --Μετάknowledgediscuss/deeds 18:14, 20 July 2012 (UTC)
  -- Liliana 18:35, 20 July 2012 (UTC)
I am all for any initiatives that reward editors, but it has to be a reward for goodwill, not one that makes the reward the end towards which editing is only the means. I am in favour of barnstars, but not of any of the other things you mentioned. —CodeCat 18:58, 20 July 2012 (UTC)
Thanks, CodeCat, for being so polite in comparison to us :). Barnstars might not be a bad idea; besides making them bureaucrats, there's not a whole lot the community does to overtly thank admin editors. On the other hand, we're so small that we don't really need it, and it's probably not worth the trouble. --Μετάknowledgediscuss/deeds 19:02, 20 July 2012 (UTC)
(See, that's when you offer barnstars! :D ) I don't think admins are the ones that need the most thanking, it's mostly the smaller editors that may make a few edits, then get reverted once or twice with no explanation, and decide to give up even though they were willing to help improve Wiktionary. Offering barnstars would also be a good way to foster good 'vibes' among the editors that are already here. In fact, not so long ago, I read an essay about Wikipedia's use of barnstars that concluded that they actually do increase productivity: users that received one made measurably more contributions afterwards. w:Wikipedia:Wikipedia Signpost/2012-04-30/Recent research The key phrase here is "receiving recognition for one's work in an informal peer-based environment such as Wikipedia has a positive effect on productivity". —CodeCat 20:09, 20 July 2012 (UTC)
Wow, that's really fascinating. I'm all for it now. I also notice Astral got one, so clearly it's not nonexistent around here. Is there, say, a template system for barnstars on enwiki that we could move over here? --Μετάknowledgediscuss/deeds 20:13, 20 July 2012 (UTC)
Sounds good. Arguing with large font and images is inferior or empty; there is no argument. I see no reason why a person doing boring cleanup work should not be paid with money if someone offers that money. --Dan Polansky (talk) 19:53, 20 July 2012 (UTC)
Just curious, which of us is inferior and which of us is empty? Liliana, if I'm inferior, will you be insulted because that means that you have to be empty?
Seriously, though, I could argue based on wiki principles and systematic bias, but you'd just argue back, and it would be a waste of my time. So I'll make it easy and say that I think it's morally corrupting. That's my opinion, and you can disagree with it, but you can't argue its basis. --Μετάknowledgediscuss/deeds 20:01, 20 July 2012 (UTC)
I did not say any people were inferior; I said arguing in a certain manner was inferior. The hypothesis that being paid for doing real work is morally corrupting sounds like pure nonsense to me, amplified by using boldface, as if it helped the nonexistent argument. --Dan Polansky (talk) 20:13, 20 July 2012 (UTC)
I think Wiktionary is in far less danger of becoming biased than Wikipedia is, just because of its nature. It's hard to define words to mean something they simply don't; the only place where bias could creep in is missing definitions or in wording. That said, we can't stop people from being paid to edit Wiktionary, and in all likelihood it already happens, so the matter is just whether we actively promote it or just ignore it. —CodeCat 20:18, 20 July 2012 (UTC)
I was just planning to ask why wouldn't it be worth the trouble to introduce barnstars for example, when I noticed that the discussion already inclined towards accepting the proposal. Anyway, I wanted to say that the atmosphere here sometimes appears very unwelcoming if not even intimidating to the newbies - at least judging by some of their comments I've encountered. It certainly looked so when I was one newbie. Some form of more formal acknowledgments of other editors' work (as for example barnstars are) would surely help alleviate the overall spartanness of this place. I thought this idea was doomed from the start, because if the majority of editors are spartan themselves, they probably wouldn't give much toss to giving or receiving appreciations for their work either. But, luckily I was wrong - I guess. In the end, I surely would promote awarding people for their contributions. --BiblbroX дискашн 20:46, 20 July 2012 (UTC)
@Meta, the reasoning should focus on the project. What do I care if someone wants to compromise themselves as long as the project doesn't suffer? Do you think paying someone to, say appropriately clean up Category:Translation table header lacks gloss would be detrimental to the project? If so, how?
There's of course some uncertainty in predicting what the longer-run implications would be here, but given Wikipedia's several years of experience I think we should assume there'll be a small number of boring requests that don't cause that much problem. And in the end, though, it's an empirical question. I think we should try it out and periodically re-evaluate and change if necessary. --Bequw τ 03:44, 21 July 2012 (UTC)
My response was designed to stop people from arguing with me and instead argue with other people, but clearly that didn't work. Mostly, it bothers me (obviously) and I feel like it's asking for trouble. I think that paid editing is more likely to introduce the kind of problems that Wikipedia is equipped to handle (ArbCom, etc.) and that we're not (e.g., voting on whether to block Luciferwildcat). User:Scienceexplorer did fairly shoddy editing, and that's my only experience of paid editing on Wiktionary. --Μετάknowledgediscuss/deeds 04:46, 21 July 2012 (UTC)
For those asking about barnstars, we've had {{User Barnstar}} for over 2 years that's barely been used so I'm uncertain having more would help. To see why many here don't like them, look here (search the page) and here. --Bequw τ 02:51, 21 July 2012 (UTC)
Barnstars are a kind of motivator that is surprisingly effective. It is reminiscent of the points and badges automatically dispensed by Foursquare for "checking in" to locations or the 'civic points' dispensed by SeeClickFix for identifying local government/community tasks.
I have some difficulty understanding who would pay for what kind of activity and why. I could understand why the American Chemical Society might pay for the creation of scores of thousands of entries of chemical terms. But who would pay for maintenance tasks? And why?
The motivational effects seem less than straightforward. If some people are getting paid for maintenance tasks, what would be the effect on others who weren't being paid? What happens to the tasks that are for a time uncompensated? Do folks go on strike in order to cause them to become paid tasks? What about the tasks that are done by bots? Would those who develop bots to do these tasks get paid? Would people be paid on a piecework basis? How would quality be controlled and compensated for? This needs some fleshing out before I could feel comfortable with it - and I'm a microeconomist by education and inclination. DCDuring TALK 05:15, 21 July 2012 (UTC)
I've thought of paying for some clean-up tasks. Usually these are for things that matter a lot to me and probably won't get done (I have little time and it's not a priority for others). For example, I championed making the Chinese category structure more similar to that of other languages, but by the time it all got planned out I didn't have enough time to do the grunt work.
I see little demotivational consequences for this kind of request. From the wikipedia experience I think the answers to your questions would be: no one will go on strike, the reward hunter gets paid regardless of whether he completes it by bot or by hand, rewards could be paid for piecemeal work, details of the job (e.g. quality control, checking, payment method) are left up to the rewarder and the reward hunter. --Bequw τ 06:15, 22 July 2012 (UTC)
An arrangement between two people would not really require anyone to know. Are you saying that you would solicit volunteers? DCDuring TALK 06:34, 22 July 2012 (UTC)
Basically yes. The board would be used to define the job and solicit workers (and to post the possible completion). Per the enwiki page "The execution and details of the transaction are the responsibility of the participating parties", which happen elsewhere. --Bequw τ 19:58, 24 July 2012 (UTC)
I don't see anything objectionable in principle. I suppose folks could disagree with the desirability of a specific task. Would a consensus of indifference about a specific task be sufficient to let a specific task proceed? DCDuring TALK 20:12, 24 July 2012 (UTC)
I've set up WT:Barnstars as a draft, please feel free to improve it as necessary. In particular, we probably need to discuss which kinds of barnstar there are, and whether we want to allow people to be creative with them like we don't with userboxes. See Wiktionary talk:Barnstars for that. —CodeCat 12:23, 21 July 2012 (UTC)

Dealing with newbies

There's an interesting and I think very relevant new blog post out there on the Web somewhere.​—msh210 (talk) 19:14, 20 July 2012 (UTC)

There is one difference, in that our highest goal is improving the dictionary, not building community. That said, community is a good thing, and the more newbie-friendly we can be, the better. --Μετάknowledgediscuss/deeds 20:09, 20 July 2012 (UTC)
But being a wiki, we can't build a dictionary without building a community. That's not something we can ignore, Wiktionary will just bleed to death if we do. —CodeCat 20:14, 20 July 2012 (UTC)
After e/c: We can't improve the dictionary without building community. We are often at our limits in dealing with technical terms outside of linguistics, computing, law, business and economics - in English. And I think we can always use more technical adepts, at least provided they play well with others. Mostly we need to take a chance with new contributors. I suppose we could be more explicitly inviting to people in specialized fields or contexts, which might give us license to be rude more demanding of others.... DCDuring TALK 20:19, 20 July 2012 (UTC)
Recruitment from 'pedia, maybe? That could be our best bet, because 'pedians already are familiar with MW, wiki markup, wikiquette, etc, and we already have WT:Wiktionary for Wikipedians. --Μετάknowledgediscuss/deeds 20:32, 20 July 2012 (UTC)
Are you considering recruiting newbies from Wikipedia, or recruiting help with newbies? I don't think the latter would be a bad idea... an experienced Wikipedian could act as a consultant here and point out things that could do with improvement. —CodeCat 22:13, 20 July 2012 (UTC)
I'm still in favor of a mentoring program. --BB12 (talk) 22:13, 20 July 2012 (UTC)
@CodeCat: I wasn't thinking of that, but it's actually a much better idea.
@BB: I don't think I'll do, but if you can figure something out and get people to do it, more power to you.--Μετάknowledgediscuss/deeds 22:17, 20 July 2012 (UTC)
I also support a mentoring program, and I am willing to participate. — Ungoliant (Falai) 22:33, 20 July 2012 (UTC)
I think the mentoring program would work like this: A project page is set up and people who want to be mentors sign their names to it. When a new person comes on board, the next person in the list is assigned as mentor. The mentor provides a welcome message that includes instructions on how to opt out of the mentor program and how to request a different mentor. --BB12 (talk) 23:31, 20 July 2012 (UTC)
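The rotation described above can be sketched in a few lines of Python. This is only an illustration of the "next person in the list" rule under stated assumptions: the mentor names and the function name make_assigner are hypothetical, and opting out or requesting a different mentor is left to the welcome message rather than modelled here.

```python
from itertools import cycle


def make_assigner(mentors):
    """Return a function that pairs each newcomer with the next mentor in sign-up order."""
    if not mentors:
        raise ValueError("at least one mentor must have signed up")
    rotation = cycle(mentors)  # wraps around to the first mentor after the last

    def assign(newcomer):
        # Hand the newcomer to whoever is next in the rotation.
        return newcomer, next(rotation)

    return assign


# Hypothetical sign-up list, in the order the names were added to the project page.
assign = make_assigner(["MentorA", "MentorB", "MentorC"])
for newcomer in ["New1", "New2", "New3", "New4"]:
    print(assign(newcomer))
```

A plain rotation keeps the load even without anyone having to coordinate; a mentor who steps down would simply be removed from the list before the rotation is rebuilt.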

A suggestion

I've heard from a couple of users here that it is better if the inflection lines of entries aren't simply formatted with apostrophes, i.e., '''entry''', but actually with a template, even if only {{head|en}}. Yet I've seen a number of such entries (the last one being velis, which has '''velis''' right under ===Verb===, under ==Esperanto==). Wouldn't it be a good idea to get a bot to look for all those apostrophe-formatted inflection lines and replace them automatically with {{head| + LANGUAGE CODE + }}? It seems to me that it should be very easy to write the code and actually do it. Or is there a reason not to do it that I don't know? --Pereru (talk) 14:52, 21 July 2012 (UTC)

I have wondered this myself. It would be very easy for a bot to replace such a 'raw' headword-line with {{head|xx}}. And, if there is also Category:Xx POSes on the page, it could remove that and use {{head|xx|POS}} instead. —CodeCat 14:56, 21 July 2012 (UTC)
I'm all for it, but one programming the bot should be careful with the second proposal of inserting POS parametres - an entry may have multiple meanings and with that belonging to multiple POS. --BiblbroX дискашн 15:04, 21 July 2012 (UTC)
An extra safeguard would be to check whether the headword line being replaced begins with the POS that appears in the category. So, it would only replace Category:English nouns in case the headword line appears as ===Noun=== word in the English section of the page. —CodeCat 15:24, 21 July 2012 (UTC)
Hopefully, the bot would also remove Category:Langx nouns when it changes '''whatever''' to {{head|xx|noun}}. --Μετάknowledgediscuss/deeds 16:08, 21 July 2012 (UTC)
I don't think bot-adding the POS is wise unless carefully tailored. For example, ==Language==/===POS=== is used for [[Category: Language POS forms]] entries, which (at least in many languages) should not be categorized as [[Category: Language POSes]]. The safeguard CodeCat mentions above (15:24, 21 July 2012) may suffice for this. (Changing '''headword''' to {{head|langcode}} seems innocuous OTOH; but see my simultaneously-posted reply to Pereru, below.)​—msh210 (talk) 06:53, 22 July 2012 (UTC)
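For concreteness, here is a minimal sketch in Python of the narrower, "innocuous" half of the proposal: swapping a bare '''pagename''' line for {{head|code}} without touching POS parameters or categories. It is not an actual bot. The LANG_CODES table and the function name replace_raw_headwords are assumptions made up for this illustration, and it works on a page's wikitext as a plain string rather than through any bot framework.

```python
import re

# Hypothetical mapping from L2 language headers to language codes;
# a real run would need the full list.
LANG_CODES = {"English": "en", "Esperanto": "eo"}

L2_HEADER = re.compile(r"^==([^=].*?)==\s*$")           # ==Language==
SUB_HEADER = re.compile(r"^={3,}\s*[^=]+?\s*={3,}\s*$")  # ===Verb===, ====Noun====, ...


def replace_raw_headwords(title, wikitext):
    """Replace a bare '''title''' line that directly follows a header with {{head|code}}."""
    out = []
    lang_code = None
    after_header = False
    for line in wikitext.splitlines():
        l2 = L2_HEADER.match(line)
        if l2:
            # Entering a new language section; unknown languages yield None.
            lang_code = LANG_CODES.get(l2.group(1).strip())
            after_header = False
            out.append(line)
            continue
        if SUB_HEADER.match(line):
            after_header = True
            out.append(line)
            continue
        if after_header and lang_code and line.strip() == "'''" + title + "'''":
            out.append("{{head|" + lang_code + "}}")
        else:
            out.append(line)
        if line.strip():
            # Only the first non-blank line after a header is a candidate.
            after_header = False
    return "\n".join(out)


page = "==Esperanto==\n\n===Verb===\n'''velis'''\n\n# definition line"
print(replace_raw_headwords("velis", page))
```

The sketch is deliberately conservative: it only touches a line that is exactly the bolded page title and is the first non-blank line under a header, and it skips sections whose language it does not recognise. Headword lines with anything after the bold text are left for a human to look at.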
Re "it is better if the inflection lines of entries aren't simply formated with apostrophes, i.e., '''entry''', but actually with a template, even if only {{head|en}}": That is true for entries in non-Latin alphabets, as the template will style them right; for Latin-alphabet entries AFAICT '''entry''' is preferred, as it doesn't require a template transclusion.​—msh210 (talk) 06:53, 22 July 2012 (UTC)
Why is that preferred? I think using the template is preferred in all cases, as it also sets the HTML lang= property. It is also more consistent and is easier for bots to parse. I have been using headword-line templates for a long time now, and I've replaced raw bolded text with them whenever I came across it. —CodeCat 12:54, 23 July 2012 (UTC)
It seems to me likely to be language-specific. In English we already have {{plural}}, {{past of}}, etc which do all the categorizing we want and no more. We do NOT want plurals categorized as nouns, etc. I see only cost in having meaningless transclusions of already widely transcluded templates like {{head}} as a matter of policy. If a given language doesn't have a good discipline for handling categorization of words and word forms, that would be that language's problem and a top-down interim "solution" might be the use of a template like {{head}}. I view each use of {{head}} and {{infl}} applied to English inflected forms as an item needing inspection at least to remove erroneous categorization. While there I may remove any instance of just for the resource-saving effect. DCDuring TALK 13:43, 23 July 2012 (UTC)
A small item on the plus side of templates is not having to type the headword. Maybe that's not a big deal for a single-syllable, obvious spelling, but longer words open the way for typos, and it saves the extra work for characters not on the standard keyboard. Also, when an entry is moved, the headwords correct themselves, rather than requiring editing to make the new spelling match the entry spelling. It may not seem like much, but I can tell you from experience that it's a convenience when you have to make a mass change of something like straight apostrophe to ʻokina in dozens of entries. Chuck Entz (talk) 14:13, 23 July 2012 (UTC)
I'm with CodeCat; I always use e.g. {{head|en}} at English inflected forms. —RuakhTALK 14:27, 23 July 2012 (UTC)
I'm all for saving time, just not miscategorization. Save all the keystrokes by only using {{head|en}} when PoS categorization is inappropriate. I use cut and paste from the page title, which works particularly well for multi-word entries, speeding the wikilinking of individual component terms. DCDuring TALK 14:53, 23 July 2012 (UTC)
I too think that all headlines should use the {{head}}-template, but without a POS-parameter the template is quite useless. Matthias Buchmeier (talk) 15:00, 23 July 2012 (UTC)
But the POS parameter results in either wrong or redundant categorization (e.g., at an English plural form using {{plural of}}, {{head|en|noun}} is wrong and {{head|en|plural}} is redundant). —RuakhTALK 15:38, 23 July 2012 (UTC)
{{head|en|plural}} is redundant, but it helps bots to know that the following section is a plural noun form. If one has to first scan the whole page to see if there is a {{plural of}} or a [[Category:en:plural]] then the parsing code is much more complex. Matthias Buchmeier (talk) 17:02, 23 July 2012 (UTC)
I don't think that's true. And {{head|en}} would serve the same purpose, in that ===Noun=== {{head|en}} would also indicate a plural. —RuakhTALK 17:25, 23 July 2012 (UTC)
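To make the parsing question a little more concrete, here is a rough Python sketch of what a bot would do to read the language code and the optional POS parameter straight off a {{head}} line, as opposed to scanning the whole page for {{plural of}}. It is an illustration only: the function name head_params is made up here, the regular expression handles only simple, un-nested calls, and it takes no side in whether the second parameter should be used at all.

```python
import re

# Matches {{head|lang}} or {{head|lang|pos|...}}; named parameters such as
# sc=Latn are not mistaken for the POS because the POS group may not be
# followed by "=".
HEAD_CALL = re.compile(r"\{\{head\|([^|{}=]+)(?:\|([^|{}=]+)(?=[|}]))?[^{}]*\}\}")


def head_params(line):
    """Return (language code, POS or None) for a {{head}} call, or None if absent."""
    match = HEAD_CALL.search(line)
    if match is None:
        return None
    lang = match.group(1).strip()
    pos = match.group(2).strip() if match.group(2) else None
    return lang, pos


for sample in ["{{head|en|plural}}", "{{head|en}}", "'''velis'''"]:
    print(sample, head_params(sample))
```

With the POS parameter present the bot learns the form type from one line; without it, it gets only the language and has to fall back on the section header or the definition-line template.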

Why not create {{en-plural}} as an inflection template? -- Liliana 15:29, 23 July 2012 (UTC)

Why yes create it? —RuakhTALK 15:38, 23 July 2012 (UTC)
  • I must be misunderstanding something, but since when are English plural nouns no longer nouns? -- Eiríkr Útlendi │ Tala við mig 16:17, 23 July 2012 (UTC)
For categorization purposes, we keep them separate. Does this sufficiently answer your question? Mglovesfun (talk) 16:58, 23 July 2012 (UTC)
  • After e/c a longer answer than MG's: Plural forms of nouns are nouns and, as such, appear under Noun L2 headers, but as non-lemmas they should only appear in Category:English plurals, which result is accomplished by {{plural of}}. Plural-only nouns are also lemmas and should be categorized as nouns, which result is accomplished by {{en-plural noun}}. English entries are fairly highly compliant with this approach. Some entries are in the proper category as a result of ex-template categorization. —This unsigned comment was added by DCDuring (talkcontribs) at 17:01, 23 July 2012 (UTC).
    To add to what Mglovesfun and DCDuring have said . . . the idea is that a singular form and a plural form are two forms of a single noun, which should only appear once at Category:English nouns. (There are some kinks in the system — for example, we do use e.g. {{en-noun}} at alternative-form entries — but I think the overall approach is correct.) —RuakhTALK 17:25, 23 July 2012 (UTC)
    • Thank you all for the technical explanations, I very much appreciate it. The explanation of categorization for lemmata listing purposes makes things clear. -- Many thanks, Eiríkr Útlendi │ Tala við mig 17:34, 23 July 2012 (UTC)
  • I think this conversation demonstrates why English as much as any other language needs to have control of the practices that apply to its own L2 sections. DCDuring TALK 17:01, 23 July 2012 (UTC)

From all the above discussion, it seems to me the most important factors are resource economy vs. uniform treatment. If changing '''entry''' to {{head|en(|POS)}} adds an extra transclusion, and this costs resources, that's a minus; but by using templates we make entries more uniform, and modifiable as a group, which makes the dictionary business easier. Which of these two weighs more then? Is it a question of opinion, or can they be assessed empirically or quantitatively? Also, is this sufficiently important to become a matter of policy? It would seem to me to be so, or else there might be people with different opinions (say, CodeCat and Msh210 above) undoing each other's work (by changing '''entry''' to {{head|en}} or vice-versa). --Pereru (talk) 23:09, 23 July 2012 (UTC)

Applying a single template to many languages makes it harder to have any kind of experimentation as each edit takes quite a long time to work through the system. Several languages have materially different inflection lines. For example {{en-noun}}, itself very widely transcluded, facilitates the handling of questions of number in English. {{head}} does not and there are limits to how much complexity can be incorporated into a single template before it becomes unmanageable.
Of all the languages which do not need the kind of uniformity that seems to be under discussion, English is foremost here. It is already reasonably uniform with a simple logic that reflects the inflectional simplicity of English. We do not have such a great amount of technical resources available to squander on projects in the wrong areas or before the time is right for them. We still need to do what we can to facilitate the entry and correction of new manually added headwords in every language, even English, which is far from incorporating all the headwords of, say, MW3, let alone all the proper nouns that some advocate including. DCDuring TALK 23:54, 23 July 2012 (UTC)
Of course the headline templates can be language and POS specific if that makes sense, and certainly the uniformity of formatting is very important. Headline formatting with '''entry''' would be acceptable for form-of entries if that's more or less uniformly applied, however {{head|en|POS}} is IMHO better, if that's not too much work. Matthias Buchmeier (talk) 13:09, 24 July 2012 (UTC)
Could {{head}} be modified so that it transcludes the language's own template if one exists, and creates its own headword line otherwise? For example, {{head|en|noun}} would simply use {{en-noun}}. Or would that needlessly complicate it? —CodeCat 13:15, 24 July 2012 (UTC)
There are too many languages with more than one template, or with templates that require parameters that can't be predicted. In Ancient Greek, for instance, there are three declensions, and adjectives have masculine, feminine and neuter forms that are sometimes in different declensions- and not always the same combination of different declensions.
I do think it would be good to come up with a better way of documenting what templates are available for a specific language, and what they do. I find myself doing things like typing template:xxx-noun into the search box to see what the auto-fill-in suggests. Chuck Entz (talk) 14:24, 24 July 2012 (UTC)
Glad you asked. On the one hand we have Category:Headword-line templates by language and on the other hand we have Special:Uncategorizedtemplates, which has, for example, many Latin conjugation templates. It is hard to say how many uncategorized templates there are in total: certainly more than 5,000, probably fewer than 10,000. It is hard to say because we have so many language and script template subpages that are uncategorized and clog up the works that actual templates that follow the Latin conjugation templates alphabetically are not visible. DCDuring TALK 18:23, 24 July 2012 (UTC)
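The dispatch idea raised above (have {{head}} hand off to a language's own template when one exists, and build a generic headword line otherwise) can be sketched in Python as a simple lookup with a fallback. This is only an illustration of the proposal, not how {{head}} works: the registry `LANGUAGE_TEMPLATES` and the function `headword_template` are hypothetical names, and the sketch assumes at most one template per language and part of speech, which is exactly the assumption that the Ancient Greek objection above says does not hold in general.

```python
# Hypothetical registry: (language code, part of speech) -> language-specific template.
# Only templates named in the discussion are listed; a real registry would be much larger.
LANGUAGE_TEMPLATES = {
    ("en", "noun"): "en-noun",
    ("en", "plural noun"): "en-plural noun",
}


def headword_template(lang, pos=None):
    """Return the wikitext call that would produce the headword line.

    Uses the language-specific template if the registry has one,
    otherwise falls back to the generic {{head}} call.
    """
    specific = LANGUAGE_TEMPLATES.get((lang, pos))
    if specific is not None:
        return "{{" + specific + "}}"
    if pos is None:
        return "{{head|" + lang + "}}"
    return "{{head|" + lang + "|" + pos + "}}"


print(headword_template("en", "noun"))   # language-specific template
print(headword_template("grc", "noun"))  # generic fallback
```

A lookup like this cannot supply parameters that the specific template needs (declension class, gender forms), so in practice the fallback branch would be taken for most inflected languages.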

Babel-style template for one's native regional lect

A large number of entries in the newbie contribution section the past few days are from someone from Tennessee and someone from Scotland. This got me to thinking: there are several very distinctive regional speech varieties that may be comprehensible to other speakers of the same language, but which have differences in usage that those from elsewhere aren't familiar with.

What if we had some kind of template for one's user page to indicate one's regional lect, with perhaps allowance for a local area like a state or a county or equivalent, and maybe a city or town, if relevant. Mine would say I'm from the US, and southern Californian. Someone else might have US, Southern, Georgia or UK, northern England, Leeds. There would have to be provision for named varieties like Geordie or Cockney. It might be nice to have categories to go with it, but that could get messy- risking the choice of either a bunch of one-member categories or fights to determine if a particular regional variety deserves its own category.

Thoughts? Chuck Entz (talk) 16:11, 22 July 2012 (UTC)

Sure, why not. Some people (like me) just use a customized BabelBox anyway, so we might as well customize the real deal (although I haven't actually added my accent yet). --Μετάknowledgediscuss/deeds 19:51, 22 July 2012 (UTC)
I would like to expand the Babel system so that it could show how well a person reads / writes / speaks a particular language. I would say that I was it-3, it-2.5 and it-1 for reading, writing, speaking Italian (as an example). SemperBlotto (talk) 19:56, 22 July 2012 (UTC)
That’s a great idea! — Ungoliant (Falai) 20:00, 22 July 2012 (UTC)
I think a general template people can add the names of their accents to — {{[template]|Geordie|3}}, {{[template]|Southern California|N}} — makes much more sense than a gzillion individual templates, one for each accent. The latter (a) will lead to a large number of templates, some of which will be hard to find (and sometimes therefore re-created under a different name) and (b) may lead to arguments over which templates should exist. The whole point of templates with parameters is that we can have boilerplates (templates) that can be used for different things. That said, I do support having such a template (one, or even one each for the various levels, since that's how the language templates are set up).​—msh210 (talk) 20:19, 22 July 2012 (UTC)
The way {{Babel}} currently works, a user can't actually specify template parameters. I'm not sure how to fix that. Maybe a new body= parameter, with the property that {{Babel|en|fr-3}} (for example) can be written as {{Babel|body={{User en}}{{User fr-3}}}}? —RuakhTALK 12:29, 23 July 2012 (UTC)
Good point (re current situation). Yeah, the suggestion sounds good to me. It'd just require the addition of {{{body|}}} to {{Babel}} before {{#if:{{{1|}}}..., no?—msh210 (talk) 18:33, 23 July 2012 (UTC)
I had imagined {{{body| before it and }}} after it; but yeah, either way. —RuakhTALK 18:57, 23 July 2012 (UTC)
I've created template:User lect, template:User lect-4, template:User lect-3, template:User lect-2, template:User lect-1, and template:User lect-0, qq.v., per this discussion. I haven't yet edited template:Babel.​—msh210 (talk) 19:00, 24 July 2012 (UTC)
There is an alternative way to do this so that you can override each box individually rather than overriding all or none. The babel template can check each parameter and somehow see if it 'looks like' a language code with a number (presumably by the same means that other templates judge links), and transclude it if it does, otherwise use it verbatim. This would allow you to use {{Babel|{{User lect|en|British}}|fr-4|{{User lect-3|sv|Finland}}|de-2}}. —CodeCat 19:20, 24 July 2012 (UTC)
That could work. We can say that any argument under 15 characters (e.g. fr-4) must be the name of a userbox, and any argument over 15 characters (e.g., <div style="float:left;border:solid #6ef7a7 1px;margin:1px;"> [] , which is the start of what {{User lect|en|British}} expands to) must be wikitext that generates the userbox. —RuakhTALK 18:03, 26 July 2012 (UTC)
I actually meant using {{isValidPageName}}, which would be less error-prone and wouldn't depend on length. —CodeCat 23:59, 27 July 2012 (UTC)
I actually knew what you meant. I thought that length would be more straightforward. *shrug* —RuakhTALK 01:18, 28 July 2012 (UTC)
It might have been, if we had string functions. Of course if we had those, we could just look at the string itself... —CodeCat 01:26, 28 July 2012 (UTC)
See {{str ≥ len}}. I'm not a fan of the {{str ...}} family of templates in general, because they're hackish and incredibly expensive, but this one actually isn't expensive. (It's still hackish — it exploits a weird MediaWiki edge-case — but then, the same is true of {{isValidPageName}}.) —RuakhTALK 01:33, 28 July 2012 (UTC)
I've added support for this to {{Babel}} now, but it seems the {{User lect}} template isn't quite ready yet. —CodeCat 11:06, 29 July 2012 (UTC)
  • Would we want such location/dialect information to be locatable by others, as for questions about the dialect? Would we rely on regular wikisearches, which could be restricted to user space? Could we use the template to generate a page that had username and dialect listing for each language? DCDuring TALK 11:07, 23 July 2012 (UTC)
    • To the best of my knowledge: maybe (we probably wouldn't care enough), most likely, and certainly. --Μετάknowledgediscuss/deeds 16:12, 23 July 2012 (UTC)
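The two ways of telling a userbox name from ready-made userbox wikitext that were discussed above (argument length versus "does it look like a valid page name") can be compared in a short Python sketch. It is a rough model under stated assumptions, not the logic of {{Babel}} or {{isValidPageName}}: the function names are hypothetical, the page-name check only tests for the characters MediaWiki does not allow in titles, the treatment of an argument of exactly 15 characters is a guess because the discussion leaves it open, and the sample expansion of {{User lect|en|British}} is abbreviated and invented for the example.

```python
# Characters that MediaWiki does not allow in page titles.
FORBIDDEN_TITLE_CHARS = set("#<>[]|{}")


def looks_like_page_name(arg):
    """Page-name approach: non-empty and free of forbidden title characters."""
    return bool(arg.strip()) and not (set(arg) & FORBIDDEN_TITLE_CHARS)


def is_short_enough(arg, limit=15):
    """Length approach: anything shorter than the limit is taken as a userbox name.

    Treating exactly `limit` characters as wikitext is an assumption.
    """
    return len(arg) < limit


def expand_babel_args(args, is_name=looks_like_page_name):
    """Transclude arguments that look like userbox names; keep the rest verbatim."""
    boxes = []
    for arg in args:
        if is_name(arg):
            boxes.append("{{User " + arg + "}}")
        else:
            # Already-expanded userbox wikitext is passed through unchanged.
            boxes.append(arg)
    return boxes


# Invented, abbreviated stand-in for what a lect userbox might expand to.
lect_box = '<div style="float:left;border:solid #6ef7a7 1px;margin:1px;">British English</div>'
print(expand_babel_args(["fr-4", lect_box, "de-2"]))
```

Both checks agree on these inputs. They differ on long userbox names and on short literal text: the length test misreads the former as wikitext, while the character test would treat short plain text as a name.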

Searching Usenet

Apparently Google has decided to break Google Groups. Can anyone successfully search Usenet there, and if so, can you show me how?--Prosfilaes (talk) 22:53, 22 July 2012 (UTC)

Works fine for me [1], AFAICT. Are you sure it's not a local problem? --Μετάknowledgediscuss/deeds 04:47, 23 July 2012 (UTC)
Works for me too. You do have to select Google Groups rather than All groups, but that isn't new. Equinox 08:31, 23 July 2012 (UTC)
Since the redesign, I sometimes get an odd glitch where pressing "enter" to search Google Groups comes up with no hits, but clicking "search" works fine. Could that be the problem? Smurrayinchester (talk) 09:59, 23 July 2012 (UTC)
I so far haven't been able to find an advanced search feature in the new Google Groups, so I've been using the old version. Astral (talk) 19:36, 26 July 2012 (UTC)

Mentoring program

Me, BenjaminBarrett12 and Metaknowledge created a draft for a mentoring program for beginners. If the feedback is positive we could add a link to that page in {{welcome}}. Comments? Ideas? — Ungoliant (Falai) 05:27, 24 July 2012 (UTC)

Looks good. I like the minimal structure. I would recommend that folks be particularly encouraging to beginners in areas where we really need help, for example, in English, technical subject matter. DCDuring TALK 11:36, 24 July 2012 (UTC)
I have taken your advice. I also recommend that those of us who speak or study minority languages (you know who you are!) sign up, just in case. Even if you sign up only with a larger language like Russian or Arabic, it's still very likely that you wouldn't need to mentor for long periods of time, but you'd come in handy if the time arose. Thanks all --Μετάknowledgediscuss/deeds 17:04, 24 July 2012 (UTC)
Wait, where's the like button? I must be new at this! DAVilla 03:05, 22 August 2012 (UTC)
I'll mentor if the mentee cooperates. How do we mark their learning? :) I have asked two Russian speaking editors to add transliteration to their translations but they refused! Not because they don't know how but because they think it's a waste of time. Does anyone else think that Russian doesn't need transliteration? For me it's frustrating to see translation without transliteration in scripts I'm not familiar with, so I fix when I can. Anyway, I have signed up for Russian, Chinese Mandarin and Japanese, even if I don't know all words in languages I learn I can help get started with editing in these and some other languages. --Anatoli (обсудить/вклад) 03:27, 22 August 2012 (UTC)

Everyone happy with Wiktionary:Votes/pl-2009-03/Context labels in ELE v2?

A context label identifies a definition which only applies in a restricted context. Such labels indicate, for example, that the following definition occurs in a limited geographic region or temporal period, or is used only by specialists in a particular field and not by the general population. Many context label templates also place an entry into a relevant category, but they must not be used merely for categorization (see category links, below).

I think the bit people dispute is "or is used only by specialists in a particular field and not by the general population". You do get a lot of entries using # {{anatomy|lang=foo}} [[heart]] which looks good to me but is actually invalid! Perhaps # {{anatomy|lang=foo}} [[hyoid]] is ok because it's more obscure and so 'only used by specialists'.

Thoughts? Mglovesfun (talk) 09:30, 24 July 2012 (UTC)

I think that it might be helpful to have the categorization on the corresponding definition line. This way one could know to which of the various definitions the category refers. I've sometimes seen [[Category:en:Something]] at the end of definition lines. Matthias Buchmeier (talk) 11:20, 24 July 2012 (UTC)
If we could distinguish between topical labels and usage contexts that might be helpful. But it should be possible to provide multiple definitions, some suggesting general use of term and others appropriate for specialists. I think [[iron]] provides an example. Scientists, metalworkers and ordinary humans each have different meanings even when referring to the same material. DCDuring TALK 13:24, 24 July 2012 (UTC)
I recently commented (at Wiktionary:Information desk#{{grammar}}) that Oxford Dictionaries Online, le Trésor de la langue française informatisé, el Diccionario de la lengua española, and הַמִּלּוֹן הֶחָדָשׁ tag the grammar sense of verb, verbe, verbo, and פֹּֽעַל, פועל(pó'al) as Grammar,[2] GRAMM.,[3] Gram.,[4] and [בדקדוק] (respectively), despite DCDuring's point (in the same discussion) that it is "not at all restricted to use by linguists or grammarians". The restricted-context theory of sense-labels makes superficial sense -slash- it makes perfect sense in the abstract, but other dictionaries don't seem to follow it, and when I look at specific examples (such as words for “verb” and “heart”), my non-anatomical gut tells me that the sense-label belongs there, regardless of what the theory says. —RuakhTALK 15:10, 24 July 2012 (UTC)
I seem to be in disagreement with others here. The section quoted starts with, "A context label identifies a definition which only applies in a restricted context," and then says that usage by specialists and not the general population is an example. It does not say that the general population doesn't use a labeled word. I think the wording could be made clearer, but don't think there is any problem with using the label “anatomy” for “heart.” --BB12 (talk) 16:55, 24 July 2012 (UTC)
Right, use by specialists and not the general population is just one example. But the point is that a word like heart does not "only appl[y] in a restricted context". It's not restricted by region or dialect or time-period or register or anything like that. In particular, it's not restricted to any sort of "anatomy" context. (The reason that Mglovesfun is focusing on the specialists-vs.-general-population example is that it's the only dimension of restricted usage that (anatomy) could logically represent. Even if heart merited a sense-label because it was only used in New England, or something, that wouldn't give it a free pass to have a sense-label like (anatomy).) —RuakhTALK 18:06, 24 July 2012 (UTC)
Since the heart is also used as food, I see your point. --BB12 (talk) 18:43, 24 July 2012 (UTC)
We are talking about context labels as they apply to specific senses. The way people speak about it, what they mean, differs by context.
  1. Two doctors talking to each other about coronary artery disease or a surgical procedure probably mean something technical.
  2. What they mean and what their patient understands when they say he has a heart problem is probably different, especially at a diagnostic consultation.
  3. The doctor and patient share the same meanings when they talk about the organ meat or the core of something or a greeting-card illustration.
That first sense is the one that is restricted. It is not restricted to professional anatomists, but to folks who have the kind of understanding that comes from systematic study of anatomy, including ad hoc self-education. Senses 1 and 2 have the very same referent: that patient's heart, but the definitions that reflect the meanings in the conversations are different.
I hardly think that we would find the need to put topical labels like food or love or iconic image on senses of heart. But I don't see the logic by which such labels would not be almost required by a system of topical categories such as we pretend to have. DCDuring TALK 19:26, 24 July 2012 (UTC)
I think we can all agree that a specialist in anatomy understands broadly the same thing from the word 'heart' as any other English speaker would. Mglovesfun (talk) 15:44, 25 July 2012 (UTC)
  • Working primarily on non-English entries, mainly Japanese, I wind up using context labels in a clarifying role -- JA 心臓 (shinzō) is specifically the anatomical heart, while 心臓肉 (shinzō niku) is the food heart, 心 (kokoro) is more the psychological heart, 中心 (chūshin) is more like the center sense, ハート (hāto) is the playing card sense, 核心 (kakushin) is the abstract core sense, and 芯 (shin) is more the concrete core sense. (Ain't translation fun!)
In this case, would it be acceptable to use the anatomy context label for the first, the food label for the second, and so on? -- Eiríkr Útlendi │ Tala við mig 16:28, 25 July 2012 (UTC)
Very good example! I think the answer is not unless they are only used by specialists, so if 心臓 is the common population's word then it shouldn't be tagged with anatomy, but can be categorized as Category:ja:Anatomy. That's sort of the beef I have with this vote. Mglovesfun (talk) 17:51, 25 July 2012 (UTC)
Then I say the usage guidelines for context labels should be changed to allow for broader use. To say that 心臓 (=shinzō) is "anatomy" should not mean that only doctors use this word, and I really don't think that an "(anatomy)" tag appearing before a word on an entry page would be understood by most users as indicating any such restricted specialist-only meaning.
(FWIW, our [[anatomy]] page is missing the sense of "the bodily structure of a plant or an animal or of any of its parts", as listed in the 1992 3rd edition of The American Heritage Dictionary of the English Language.) -- Eiríkr Útlendi │ Tala við mig 17:59, 25 July 2012 (UTC)
No, you'd use "# [[heart]] {{qualifier|organ of the body}}" or "# [[heart]] {{qualifier|the cut of meat}}" or "# [[heart]]: [[center]], [[core]]".​—msh210 (talk) 18:52, 25 July 2012 (UTC)
I'd be for the change too, but I don't fancy drafting a vote, if someone else wants to do it I'll re-read it. Mglovesfun (talk) 10:25, 26 July 2012 (UTC)
I think you should use the context labels because this is restricted use and anatomy would be great for 心臓. (The specialist use issue is given only as an example in the guidelines.) --BB12 (talk) 22:13, 26 July 2012 (UTC)
As a newbie, the idea of categories makes perfect sense. It allows the user/reader to match a definition to a category to make a more accurate choice for word usage outside of Wiktionary, which is the whole point (well, one of) of dictionaries in general.--Jacecar (talk) 08:36, 4 October 2012 (UTC)

Slippery slope/Other stuff doesn't exist/laundry list arguments

In many RfDs (particularly ones where SOP is the main reason for deletion), one or more editors will "vote" delete, and say "Well, if we keep this, we'll have to create..." and then a list of other entries. I request that we have some guideline that discourages the use of this line of argument. If SOP is kept, it alone should be a sufficient argument without the laundry list. The laundry list of entries also flies in the face of the fact that RfDs are generally supposed to be decided on individual merits; the entries of the laundry list may not be at the same level of notability as the entry at RfD, or even at the same level of notability as each other. Purplebackpack89 (Notes Taken) (Locker) 15:00, 27 July 2012 (UTC)

"Notability" is a Wikipedia concept, with no place here. The closest corresponding Wiktionary concept is "attestation", though I guess you really mean "idiomaticity". —RuakhTALK 15:41, 27 July 2012 (UTC)
I think the closest we have is WT:CFI#Attestation, if it's attested it's 'notable' enough to be included. Mglovesfun (talk) 15:47, 27 July 2012 (UTC)
All words in all languages - however slippery the slope. SemperBlotto (talk) 15:50, 27 July 2012 (UTC)
We had a vote to get rid of the slippery slope section of CFI, but the result was no consensus. By the way, you can’t just request that people don’t use arguments you don’t like; I might as well request that we have some guideline that discourages the use of arguments such as: “we should waste as much database space as possible with clearly non-idiomatic sums of parts because of NOTPAPER”. Instead, try convincing them it’s not a good argument. — Ungoliant (Falai) 15:51, 27 July 2012 (UTC)
That's a good point, having an opinion is ok, obviously, but if you can't back that opinion up with good reasoning, Wiktionary not Wikipedia reasoning may I add, don't expect people to agree with you. Basically, you may as well make it possible for people to agree with you. Mglovesfun (talk) 16:06, 27 July 2012 (UTC)
There is value in comparing a specific collocation in question with other collocations. We make comparisons with coverage of a term in other dictionaries. The point of comparisons with other collocations is to determine what is and what is not a free collocation, not worth including and possibly misleading to include. DCDuring TALK 16:44, 27 July 2012 (UTC)
Ungoliant, if there was no consensus awhile back, then do it again to see if there's consensus this time. (Consensus can change). During, how do "collocations" add anything that's not already in SOP? Purplebackpack89 (Notes Taken) (Locker) 16:57, 27 July 2012 (UTC)
@Purple: I don't understand your question. DCDuring TALK 18:44, 27 July 2012 (UTC)
How does comparing "specific collocation in question with other collocations" add anything that isn't already covered in SOP? Why not just say "This falls under SOP" and have done, rather than list other entries that aren't necessarily guaranteed to be of the same attestability? Purplebackpack89 (Notes Taken) (Locker) 20:20, 27 July 2012 (UTC)
Because SoP is not a hard-and-fast determination, just as "meaning" or "sense" is not. We try to come up with simplifications that let our determination as to inclusion be made more expeditiously, but there are relatively few simplifications that are entirely satisfactory. I usually extract any long list of collocations from COCA and/or BNC, which virtually guarantees attestability. DCDuring TALK 20:31, 27 July 2012 (UTC)
The trouble, I think, is that WT:CFI is currently somewhat vague in its definition of SOP ("unidiomatic") terms, leaving a fair amount of room for personal interpretation. There's nothing in it which, in my reading, clearly rules out the inclusion of a collocation such as television show. I interpret the "fried egg test" portion of CFI as allowing for its inclusion, as, like fried egg, television show could be used to indicate multiple things but is generally only used to indicate something specific.
It seems that others are interpreting television show as equivalent to bank parking lot. I understand that reasoning, but I think it fails to take into consideration a distinction that some, like myself, are making. Bank parking lot is simply attributive noun + noun; many adjectives and attributive nouns could be attached to parking lot, but few, if any, would represent distinct concepts worthy of inclusion. Fried egg and television show, on the other hand, both represent something specific. The specificity of these terms' meanings cannot be easily derived from their components alone without prior knowledge of the terms' restricted usage.
I definitely think CFI should be amended to more explicitly spell out what constitutes an unidiomatic/SOP term. Whatever policy the community decides to adopt, it seems preferable to me that it should be fully set down, rather than partially exist in the unwritten, de facto manner which it seems to presently. Astral (talk) 22:15, 27 July 2012 (UTC)
As I understand the "fried egg test", television show doesn't pass it, because there is nothing else that is a show on television but isn't a television show. A scrambled egg, on the other hand, is an egg that is fried, but it isn't a fried egg. —Angr 23:02, 27 July 2012 (UTC)
Among the definitions for show is "an exhibition of items" such as an "art show" or a "dog show." But an art show on television is not a television show per se; that is, there is a television show that has the art show, but the art show itself is not the television show. In this exhibition sense, a television show would be an exhibition of television sets. --BB12 (talk) 23:22, 27 July 2012 (UTC)
@Angr — Show can also mean "movie," but television show is generally only used to indicate a TV programme/series, rather than a made-for-TV movie or a theatrical-release movie shown on TV. Hence the term's usage, in my view, is limited to a single specific meaning out of multiple potential meanings, similar to fried egg. Astral (talk) 23:25, 27 July 2012 (UTC)
@BB: A car radio is a radio in a car or radio programming listened to or intended to be listened to in a car. An AM radio is a radio using amplitude modulation or radio listened to in the morning. The meaning of free noun-noun compounds is assembled in various ways. DCDuring TALK 23:51, 27 July 2012 (UTC)
@DCDuring: Would you say, then, that the "bank parking lot" example needs to be stricken? --BB12 (talk) 02:39, 28 July 2012 (UTC)
Quite to the contrary. My experience tells me that we underestimate the power of the human mind to exploit context to resolve these ambiguities. It is only when we think inhumanly, mechanistically, that this seems so hard. (I'm not participating at Wiktionary to help computers.) I came across a reference to quote: "CS Lewis refers to this as "the insulating power of context"; that is, "the sense of a word is governed by the context and this sense normally excludes all others from the mind." Some theorists refer to "frames", but whatever model one uses, it seems apparent to me that humans don't really have 25 separate senses of in their mental lexicon. They have a smaller number and the ability to think metaphorically. Context might include universal components, like naive physics and biology; culture-specific items; much narrower subcultural knowledge, as well as personal and situational knowledge. The "bank parking lot" example just scratches the surface. DCDuring TALK 04:32, 28 July 2012 (UTC)
Perhaps I'm the only one, but I just don't get it. To me, "bank parking lot" cannot mean "(sperm) bank parking lot" and therefore is, contrary to the assertion in the CFI, idiomatic. --BB12 (talk) 04:58, 28 July 2012 (UTC)
Of course it could mean that - in context. That would require more situation specific context. The less context we have the more interpretation options we would have to keep open, but it would be a maladapted brain that computed all the possibilities rather than use default meanings. But the defaults are themselves context-conditioned. The more widely shared the contexts among speakers, whether because of the nature of physical reality or cultural reality, the more frequent a given meaning of a word - or collocation. DCDuring TALK 12:41, 28 July 2012 (UTC)
It does not necessarily seem so to me, but in any case, an example that needs explanation is not a good example. --BB12 (talk) 20:25, 28 July 2012 (UTC)
But most humans aren't troubled by the example. That you are means you are thinking about it carefully. Perhaps you could come up with some alternative examples. I find the bank parking lot example amusing because both main senses of bank are from words meaning "bench", by a series of metaphorical extensions from different aspects of a physical bench. DCDuring TALK 21:02, 28 July 2012 (UTC)
To me "bank parking lot" can not mean "sperm bank parking lot", but only because "bank" does not mean "sperm bank" to me. If I was with someone who used "bank" for "sperm bank", I'd have no problem interpreting the statement "Go park in the bank parking lot."--Prosfilaes (talk) 22:13, 28 July 2012 (UTC)
I think that's where my issue comes from. It's in combination only. I would be fairly surprised to find that some people say, "I went to the bank" to mean "I went to the sperm bank" unless they're trying to be funny. Even if they work there and are talking to a fellow worker, that seems a stretch (but who knows). In any case, DCDuring's suggestion is really where I want to go with this. I'll put this on my list of things to do. --BB12 (talk) 22:36, 28 July 2012 (UTC)
What you're describing isn't necessarily a slippery-slope argument, but a reductio ad absurdum argument: take the same logic, apply it to other cases, and show how the flawed logic leads to nonsense. Most of the time the person using the argument isn't seriously suggesting that keeping the one term will force us to have all the others, they're just performing a thought experiment, and showing the results. The way to refute such an argument is to show how the application to those other cases is different in such a way as to invalidate the experiment.
If you don't think a particular type of argument is valid, you point out its flaws so others will stop letting it sway their opinions. You don't go elsewhere and try to have it banned. Chuck Entz (talk) 22:26, 27 July 2012 (UTC)

German nominalized infinitives

I wonder if we should have entries for German nominalized infinitives. In this language (and I believe in many others), verbal infinitives (e.g. spielen (play)) can be used as nouns: Das Spielen (des Spiels) macht Spaß. This is 100% productive (i.e. works with any verb) and is very similar to the English gerund: Playing is fun. The nominalized infinitives have all the characteristics that nouns have, in particular they can appear in all cases (das Spielen, des Spielens, dem Spielen, das Spielen). All of them have neuter gender (das Spielen). Usually there are no plural forms (only if the nouns are lexicalized to a certain extent). AFAICS, English gerunds don't seem to get their own entries (playing appears to be an exception), and in fact this might be pointless since any verb can be nominalized in this way. But on the other hand, there are two arguments in favor of separate entries in German: first, the nominal inflection in cases, and second, nominalized infinitives are capitalized (as all other nouns in German are). What do you think? Longtrend (talk) 12:02, 28 July 2012 (UTC)

The same happens in Dutch as well. But I've always removed such entries when I came across them. I consider a nominalised infinitive similar to a nominalised adjective. —CodeCat 16:00, 28 July 2012 (UTC)
Nominalized adjectives are an interesting case as well (as is the question whether we should include them, but this should be discussed separately IMO), but there are clear differences to nominalized infinitives (in German at least). Nominalized adjectives are inflected just like adjectives, there is nothing noun-typical about them. Nominalized infinitives, however, are inflected like nouns (see above): they lose their verbal inflection and can appear in all cases (in the genitive, even the morphological form changes, as in des Spielens. Excluding nominalized infinitives would mean to exclude such special forms and thus to not be able to look them up). Longtrend (talk) 16:45, 28 July 2012 (UTC)
The case forms could be included in verb inflection tables. Historically, the infinitive has always been a verbal noun in Germanic languages, and the neuter gender associated with it is inherited from that noun all the way to Indo-European times. So really I suppose the infinitive isn't nominalised as such, it already is a nominal, but one that happens to reflect a verb as well. —CodeCat 17:10, 28 July 2012 (UTC)
Another difference between German and Dutch is that in German, the nominal forms are capitalized. In Dutch, there's no orthographic difference between schrijven the infinitive and schrijven the gerund, but in German there is a difference between schreiben the infinitive and Schreiben the gerund. (We already have Schreiben because it's also a regular noun meaning "letter", but of course that doesn't apply to all gerunds.) German Wiktionary does list the capitalized gerunds separately from the lower-case verbs. —Angr 19:17, 28 July 2012 (UTC)
Absolutely. I also don't like the idea of cluttering up verb inflection tables with forms that aren't even verbs. It was also once suggested to include inflected participle forms (such as spielende) in verb inflection tables despite their adjective-like behavior. Still this might be worth considering because those forms clearly have something verby about them. But nominalized infinitives and their inflections are as nouny as it can get. They are clearly not verbs, so I don't think verb inflection tables are the right place for them. Longtrend (talk) 08:12, 29 July 2012 (UTC)
Incidentally, I've just been looking through a lot of Irish entries, and noticing that many have "verbal noun" and "noun" sections that are arguably either as redundant or as distinct as Spielen and spielen. My inclination is to include substantivised infinitives for the reasons Longtrend outlines: they differ in capitalisation and they inflect, even having distinctly non-verbal genitive forms. - -sche (discuss) 04:56, 30 July 2012 (UTC)
Embryomistic and I have generally been trying to include Irish verbal nouns under ===Noun=== headings rather than separate headings, but there's not a lot of consistency yet. The Celtic languages in general don't have infinitives, they only have verbal nouns. —Angr 22:16, 30 July 2012 (UTC)

All right, at the moment it looks like Angr, -sche and I have an inclination to include such nominalized infinitives, while only CodeCat argues against them. Any more opinions? Longtrend (talk) 12:11, 1 August 2012 (UTC)

They should certainly be included. However I'm not sure if all merit a lemma entry or if rare ones should be included as form-of entries. Matthias Buchmeier (talk) 12:43, 1 August 2012 (UTC)
The reason I argue against them is that they don't have any idiomatic meaning. Their meanings are completely predictable. So it is kind of like having SoP entries, except that here even the meaning of just one 'part' is not idiomatic. Having noun entries for all verbs would involve a huge duplication of information without any great benefit, unless we have 'form of' entries that point to the same entry (which would be the first time we did something like that). This is not even specific to Dutch or German, in Latin and the Romance languages the same thing happens: Errare humanum est. Irren ist menschlich. And something similar also happens for adjectives in Dutch and German, which also function as adverbs. —CodeCat 13:48, 1 August 2012 (UTC)
Interesting choice of word(s): the equivalent in English is with gerunds, and meaning is an example of (I think) a gerund that has a long history as a noun. Chuck Entz (talk) 14:05, 1 August 2012 (UTC)
@CodeCat (and also Matthias Buchmeier): I definitely see your point. I don't like having vast amounts of predictable entries either. But English plural forms, for example, are not that different. Of course there are few nouns with plurals other than -s, but there are almost as many -s plural forms as there are nouns, yet we attempt to include them all. Provided, of course, they are attestable. This is also a reply to Matthias: Your concern would mean to stipulate a more or less arbitrary line between common and rare nominalized infinitives. This seems to be difficult to do. But in fact, all such entries have to be attestable so we can exclude the weird ones. And remember, even if we allow nominalized forms for all infinitives, that doesn't mean that all will be actually created. Longtrend (talk) 14:43, 1 August 2012 (UTC)
  • My take: Let us have entries for "Spielen" and the like when they are attested. They inflect like nouns, and they take noun positions in sentences. However, we could avoid repeating the definitions from verb entries by using something like "Noun form of [[spielen]]" on the definition line; we do something like that when we define "-ness" entries like "The quality or state of being <adjective>" regardless of the number of definitions present in <adjective> entry. The potential stance that we should exclude all terms that are derived from other terms in a fully regular manner is not appealing to me. In Wiktionary:Requests_for_verification#vuvuzela, CodeCat (talk • contribs) seems to argue in the opposite direction, wanting to include all forms--attested or not--that are derived from other forms in a fully regular manner: she wants to include unattested inflected forms that are formed in a fully regular manner. --Dan Polansky (talk) 08:49, 5 August 2012 (UTC)
Agreed. Maybe we could think of even better definition lines than "Noun form of verb" -- this is just a grammatical definition without any semantic information, unlike "The quality or state of being adjective" which provides a semantic definition. Any ideas? Longtrend (talk) 09:56, 5 August 2012 (UTC)
Just to clear up my point here... I do argue for creating entries for all inflected forms no matter if attested. But on the other hand, I don't support creating such entries when the result is having two different entries on the same page, if one is entirely predictable from the other. Rather, we should find a way to merge them into one entry and include information of both. So I do believe that if the adjective snel is attested, that we also include the inflected entry snelle. But I think we should not have a separate entry for the adverb snel, because all adjectives are always and automatically adverbs in Dutch (as in German), just like all infinitives are always and automatically neuter nouns (as in German and Latin). —CodeCat 10:42, 5 August 2012 (UTC)
Understood. You haven't said anything about the fact that German nouns are capitalized, though. This means that infinitives and their nominalized "forms" will not be on the same page in German. Whether or not this is a sufficient reason to treat German differently from Dutch in this respect, I don't know. Longtrend (talk) 10:49, 5 August 2012 (UTC)

There hasn't been further input. I think we all agreed that such entries should be made in some way or another, so I'll just create them when it's "necessary" (e.g. if they are requested). Their content can always be changed afterwards if necessary. Longtrend (talk) 19:52, 7 August 2012 (UTC)

All? —CodeCat 20:06, 7 August 2012 (UTC)
Well, you suggested including these forms in verb inflection tables, which means by your own logic that they should get entries, and in fact get them even if not attested (they don't appear on the same page as the verbal infinitive forms either, as I mentioned several times already). Longtrend (talk) 20:29, 7 August 2012 (UTC)
Ok, I thought you implied I agreed to adding them to the same page. If they are added to another page with a capital letter that is ok with me. Do you know if there is a form-of template for this situation? —CodeCat 20:33, 7 August 2012 (UTC)
I don't think there is. Note Dan Polansky's and my comments above about the possible content of such a template. I still think some semantic content rather than just "Noun form of X" would be nice. Longtrend (talk) 20:44, 7 August 2012 (UTC)

Foreign Word of the Day - revival

Ungoliant MMDCCLXIV and I want to initiate the 'Foreign Word of the Day' box on the main page. It has been discussed many times before, as 'Word du jour' in 2007 and (recently) above at #Foreign word of the day. It will be a daily feature, carefully chosen by Ungoliant and me at WT:Foreign Word of the Day/Nominations. We have built up all the necessary infrastructure, including the now fully-equipped {{FWOTD}} and an archiving system ready to be used. If anyone wants to see how it would fit in on the main page, I have set up a mock main page that includes the FWOTD at User:Metaknowledge/Main page.
All we need is approval; there is a long list of nominees, so we can literally get started within a day of being allowed to do so. I'm hoping we can get community consensus for this feature, but if necessary, we will create a vote for it. Thank you, Ungoliant, and thank you all in advance! --Μετάknowledgediscuss/deeds 03:23, 30 July 2012 (UTC)

It's interesting but I have some doubts. A word may sound interesting or fresh to the nominator but not to native speakers, or to other contributors who know the language better or don't know it at all. Language skills differ greatly, so I don't understand who will assess these nominations, or how. From the list I see some really basic and unexciting words mixed with some unusual and interesting words. I appreciate the Russian slang word юзверь (júzverʹ), which sounds like a mix of English "user" and зверь (zverʹ) "beast", but what about those basic French and German words in the top list? Why were they nominated? --Anatoli (обсудить) 03:44, 30 July 2012 (UTC)
The old list (pre-2012) was made under different specifications. We want terms that are interesting for some reason, like юзверь (that's a great one!). We may mix in 'utilitarian' foreign terms now and then, but I hope that we can focus on words unique to their language or somehow notable. --Μετάknowledgediscuss/deeds 04:06, 30 July 2012 (UTC)
I haven't been keeping up with the technical side of this: is there a framework in place, as there is for WOTD, that automatically changes from one template (or subpage or other receptacle) to another when the date changes, so that if you set FWOTDs for the next n days, they will automatically be displayed in sequence? Or will the FWOTD need to be updated manually each day? I presume the former, but just want to be sure. - -sche (discuss) 04:48, 30 July 2012 (UTC)
No, it's all automatic, just like en-WOTD. It's at {{Foreign word of the day}} (although all you'll see now is the error message, because we haven't set one for today, obviously). --Μετάknowledgediscuss/deeds 04:57, 30 July 2012 (UTC)
Great, then I fully support this. :) - -sche (discuss) 05:00, 30 July 2012 (UTC)
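
For readers unfamiliar with how a date-driven feature like this can roll over without daily edits, here is a minimal sketch in Python. It only illustrates the idea of deriving a subpage title from the current date; the page-name pattern is a hypothetical one chosen for illustration and is not taken from {{Foreign word of the day}} itself.

```python
from datetime import date

# English month names are spelled out so the result does not depend on the locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)

def fwotd_subpage(day: date) -> str:
    """Return the (hypothetical) subpage title holding the entry for `day`."""
    return f"Wiktionary:Foreign Word of the Day/{day.year}/{MONTHS[day.month - 1]} {day.day}"

# The main page would transclude whichever subpage matches today's date,
# so entries set up in advance appear in sequence automatically.
print(fwotd_subpage(date.today()))
```

If no subpage has been filled in for a given day, a lookup of this kind finds nothing, which is consistent with the error message mentioned above.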

Phonetic Russian pronunciations should be marked with the correct brackets -- they are not phonemic transcriptions

User:Wanjuscha is actively adding Russian pronunciations (transcriptions), which is nice, but I want to note that it would be correct to use square brackets for those, because they are phonetic rather than phonemic (the slashes in IPA are for phonemic transcriptions).

I've explained this on his talk page: in theory, Russian has fewer phonemic distinctions (see en:Russian phonology) than the amount of phonetic detail his transcriptions record. But he hasn't noticed this and is continuing to mark the transcriptions in the wrong way.

There are already a lot of such transcriptions added. Perhaps, someone could set up a bot to correct the brackets.--Imz (talk) 06:07, 30 July 2012 (UTC)

We use slashes for IPA transcriptions, and not just for Russian. I don't know where it started, or why exactly slashes rather than square brackets, but please don't change it. You should check first! English and other language entries use the same convention. --Anatoli (обсудить) 06:21, 30 July 2012 (UTC)
Well, ok, I didn't know that this was a convention here. Why not discuss this issue now?
Also, I've had a look at the page bat, for example. The entries for some languages use square brackets for IPA transcriptions there.--Imz (talk) 08:53, 30 July 2012 (UTC)
Also, I feel that the English transcriptions might actually be quite close to a phonemic transcription (with each symbol representing a phoneme of a chosen variety of English).--Imz (talk) 09:01, 30 July 2012 (UTC)
We use slashes when the transcription is phonemic, and brackets when the transcription is phonetic (broad or narrow). This is the established practice in linguistics. Atitarev seems mistaken about our current practices, we already do it this way and there are many entries that have both kinds of transcription, for example Danish ikke. —CodeCat 10:26, 30 July 2012 (UTC)
My impression is that we use phonemic transcription in slashes for English (which we assume our readers know, if only subconsciously, how to pronounce) but phonetic transcription in square brackets for other languages. Any language for which there is an "Appendix:Foo pronunciation" page should follow the guidelines given at that page. —Angr 22:08, 30 July 2012 (UTC)
There's actually no reason why we can't include both pronunciations for all languages. Naturally, phonetic transcription is important for languages other than English, because of language learners. But I argued sometime before that someone who already knows how to pronounce a language will not find a phonetic transcription helpful; all they will care about is the phonemes. —CodeCat 22:38, 30 July 2012 (UTC)

Summary and request to a moderator

So, the conclusion of the comments made here is that we do not have a special convention, but follow the standard IPA convention of using slashes for phonemic transcriptions, and square brackets for phonetic transcriptions. (English transcriptions tend to be phonemic here, because someone who knows English doesn't need a more detailed pronunciation instruction.) So, the Russian transcriptions have been marked incorrectly.

And User:Wanjuscha doesn't seem to notice this discussion and correction: he is continuing to do the valuable work of adding pronunciations (his ones are phonetic transcriptions), but still incorrectly using slashes for the phonetic transcriptions instead of square brackets.

Could someone please stop him for a while so that he notices the discussion?

Then we should also correct the brackets for Russian pronunciations with a bot.--Imz (talk) 06:44, 4 August 2012 (UTC)

You have done what you can. In my opinion, it's better to just let it go. A minor error is much better than nitpicking and potentially running off a valuable contributor. Maybe some day he will read his page. If not, the bot task is a pretty trivial one, as far as that goes. DAVilla 02:59, 22 August 2012 (UTC)
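
A minimal sketch of what such a bot's text transformation might look like, in Python. It assumes the pronunciations are written as `{{IPA|/.../|lang=ru}}`; that wikitext shape is an assumption made for illustration, and the actual template parameters in the entries may differ. The function name is hypothetical, and fetching and saving pages is left out entirely.

```python
import re

# Matches a single {{IPA|...}} call with no nested templates inside it.
# Assumed shape: {{IPA|/transcription/|lang=ru}}.
IPA_CALL = re.compile(r"\{\{IPA\|([^{}]*)\}\}")

def slashes_to_brackets(wikitext: str) -> str:
    """Rewrite /x/ as [x] inside {{IPA|...}} calls that carry lang=ru."""
    def fix(match: re.Match) -> str:
        params = match.group(1).split("|")
        if "lang=ru" not in params:
            return match.group(0)  # leave other languages untouched
        fixed = [
            "[" + p[1:-1] + "]"
            if len(p) > 2 and p.startswith("/") and p.endswith("/")
            else p
            for p in params
        ]
        return "{{IPA|" + "|".join(fixed) + "}}"
    return IPA_CALL.sub(fix, wikitext)

print(slashes_to_brackets("* {{IPA|/ˈslovə/|lang=ru}}"))
```

Restricting the change to calls marked `lang=ru` keeps phonemic transcriptions in other language sections, such as the English ones, as they are.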

Aramaic policy

I have some questions about the Aramaic policy (or lack thereof).

Why is the lemma form for Aramaic definite rather than indefinite as in Hebrew and Arabic? I think indefinite would make more sense if there is not already a reason to use the definite.

Why do the IPA transcriptions seem more like graphemic transcriptions rather than phonemic or phonetic? For example ספרא‎ uses the transcription /sɛprɑʔ/ rather than /sɛfrɑ/. If there is no good reason not to, I think it would be a good idea to fix these pronunciations.

Maybe we should create a Wiktionary:About Aramaic page?

--WikiTiki89 (talk) 10:15, 30 July 2012 (UTC)

Yes we should create that page; we'd need someone with enough knowledge to do so. Ideally more than one person so they can check each other's work. Mglovesfun (talk) 10:19, 30 July 2012 (UTC)
I wrote most of those entries back when we lumped everything under the "Aramaic" header and didn't distinguish between different dialects. The "Aramaic" entries written in the Hebrew script should actually be split up into Jewish Babylonian Aramaic, Jewish Palestinian Aramaic, etc. in the same way that "Chinese" is split up into Mandarin, Cantonese, Wu, etc. Since I'm most familiar with the Syriac dialect of Aramaic, I wrote those entries with a heavy bias towards that dialect. :)
Anyway, some points:
  • In Syriac, the lemma is the definite form (the "emphatic state", as it's called). It should definitely be changed for Jewish dialects.
  • On the pronunciation of סִפְרָא‎, I would argue that both "p" and "f" can be used in the phonemic form; it all depends on how you look at it. If you're using a phonetic form though, "f" should definitely be used.
  • The "ɛ" used in סִפְרָא‎ should probably be changed to something like "i" or "ɪ", as I think "ɛ" is only found in the Syriac pronunciation of that word.
  • Geminated consonants might want to be added to many pronunciations.
  • Niqqud markings might want to be added to the headwords.
  • The plural forms of most masculine words should be changed (e.g., ספרא to ספריא or, using the indefinite form, ספר to ספרין). Again, ספרא is a Syriac plural.
  • Meanings of words in general should be looked at since some may only occur in Syriac. --334a (talk) 17:20, 30 July 2012 (UTC)

A proposal to solve the number-numeral question

I've been reading a bit more about what a numeral is and what a number is, and how they differ. I came across the Wikipedia article w:Numeral (linguistics) and the discussion at w:Wikipedia talk:WikiProject Linguistics/Archive 6#what is a 'numeral'?. This is what I've gathered:

  • 'Numeral' is a part-of-speech class that includes words like one and forty-two, but not first or once. It is a subset of the quantifier class, itself a subset of the determiner class.
  • Since numerals are a part of speech, there is no such thing as a 'cardinal numeral' or 'ordinal numeral'. first is an adjective, once is an adverb.
  • 'Cardinal number' is a term that can refer to one, but it is not a part of speech. Likewise, 'ordinal number' or any other kind of 'number' is not a part of speech either.
  • Crucially, not all cardinal numbers belong to the 'numeral' part of speech; hundred and thousand are nouns (historically and currently).
  • Therefore, the category that contains both one and hundred can not be a part-of-speech category. Rather, the relationship between them is semantic.
  • The conclusion then is that the category Category:English cardinal numbers, containing words like one and hundred, is a semantic category, grouping terms on semantic rather than grammatical grounds. Likewise for Category:English ordinal numbers.
  • Consequently, these two categories, with their parent Category:English numbers, do not belong in Category:English parts of speech.

So I propose:

And of course likewise for all other languages. Does everyone agree with this proposal? —CodeCat 15:48, 30 July 2012 (UTC)

The category changes sound reasonable. What about the headings? Would the level-3 heading for forty-two be Numeral? — Ungoliant (Falai) 22:35, 30 July 2012 (UTC)
That sounds English-specific to me. Other languages are most likely another story.Matthias Buchmeier (talk) 14:47, 31 July 2012 (UTC)
The consideration that 'numeral' is a part of speech while '(cardinal/ordinal) number' is not, is independent of language. But which words are numerals differs per language, and not all languages even have words that are grammatically numerals. one or its translations being a numeral is English-specific, so one would belong in Category:English numerals, but its Old Norse cognate einn probably does not. Nevertheless, the categorical division can apply to all languages even if the words themselves are categorised differently in each. —CodeCat 14:59, 31 July 2012 (UTC)
It could make it very difficult for most users to find the numbers they want, which normally include zero, one through ten, hundred, thousand, and so on. Put Category:English cardinal numbers and Category:English ordinal numbers in Category:English numerals, but do not hide Category:English numerals within Category:English determiners. Keep Category:English numerals with the other POS such as nouns, verbs, etc. People shouldn’t have to have a Master’s degree in linguistics to find the category that includes all numbers from zero up. —Stephen (Talk) 15:16, 31 July 2012 (UTC)
I don't think putting cardinal numbers as a subcategory of numerals is a good idea since they are not a subset. On the other hand, a link in the description at the top of the page seems like a good idea. But in any case, someone who visits one will notice that it appears in both Category:English numerals and Category:English cardinal numbers so it shouldn't be too hard to find, I hope. I'm not terribly happy with keeping 'numerals' outside 'determiners', as it is this kind of categorisation that helps convey to users what we mean by those terms. Subcategorisation tells the users 'numerals are a kind of determiner' which is very useful indeed. If anything, the past has shown that it's important that we have a good understanding of terminology if we want to categorise words! —CodeCat 15:22, 31 July 2012 (UTC)
  • To respond to the introductory question: I do not agree with the proposal. It ignores all the difficulties discovered in previous discussions. It presents an analysis of number words in English, and then goes on to propose a change that is going to impact other languages, in "And of course likewise for all other languages". One key question is to what part of speech various number words belong; this may be highly language-dependent. Another key open question is whether to classify various number words by their grammar or by their semantics; a number of traditional grammars seem to present a grammatical classification that turns out to be a semantic one upon a closer inspection. See also Wiktionary:Number words, a page that shows specific examples of what some grammarians of some languages consider to be number words. --Dan Polansky (talk) 09:08, 5 August 2012 (UTC)
    What I would do, I think, is allow for parallel grammatical and topical classification. Thus, there would be "Category:English numbers" as a topical category, with any topical subcategories as we see fit, without consideration of grammatical properties. Thus, there could be "English cardinal numbers" and "English ordinal numbers" as subcategories, and "twofold" and "threefold" could end up in yet another topical subcategory. Creating a topical classification would create a browsing structure that is convenient for the users of the dictionary, and at the same time accurate. Part of speech would be decided independently of the topical classification, based on grammatical properties of the words. Thus, there could be "Category:English numerals" for those cardinal number words that have grammatical properties distinct from nouns, which "hundred" is not. Alternatively, the part of speech of such words as "two", "three", and "five" could be "Determiner", in English anyway. Thus, in English, the user would find all cardinal number words in the topical category "Category:English cardinal numbers" that would include both "three" (determiner) and "hundred" (noun). --Dan Polansky (talk) 09:28, 5 August 2012 (UTC)
    A correction: "Category:English ordinal numbers" would actually be Category:en:Ordinal numbers, like Category:en:Mammals, until the time the naming convention for topical categories changes. Category:en:Ordinal numbers would be a subcategory of Category:en:Numbers, a subcategory of Category:en:All topics or, as it is now, of Category:en:Mathematics. Category:en:Numbers would have Category:en:Fractional numbers and Category:en:Historical numbers as subcategories, as it has now. --Dan Polansky (talk) 09:37, 5 August 2012 (UTC)
Dan, you actually summed up pretty much what my proposal is (or at least what I intended) so I'm not sure what you actually don't agree with. At least, I do agree with you... I'm sorry if I didn't explain it right. I'm not quite sure if they can really be topical categories though. It seems to me that once and twice are alike in a very different way than dog and cat are. The things referred to by the words dog and cat are both mammals, but what do once and twice actually refer to? Do they even refer to any kind of concept at all? What about first and second? To me, one, first and once all refer to the same concept, the cardinal number 1, so they would belong in the same 'topic'. What distinguishes them isn't topical, but is a bit more abstract; they have different kinds of reference to the cardinal number 1. My personal intuition is that that is not a distinction topical categories (should) make. Then again, we do include dog, doglike and canine in the same topical category, so I'm not quite sure. —CodeCat 11:26, 5 August 2012 (UTC)
Hm, after rereading your proposal, I admit it seems to have a lot in common with what I am proposing. To repeat, the core of my proposal is to recreate the topical structure of Category:en:Numbers, Category:en:Cardinal numbers and Category:en:Ordinal numbers for ease of browsing, leaving the part-of-speech classification open and language-specific, thus still possibly having Category:English numerals if deemed appropriate. Topical classification of number words is what we had before the vote Wiktionary:Votes/pl-2010-01/Number categories, which I unfortunately supported. --Dan Polansky (talk) 18:37, 6 August 2012 (UTC)
Yes, that was my proposal too, except that I didn't explicitly say they should be topical categories; just not part-of-speech categories. Maybe we should reconsider our naming scheme for topical categories again. I would prefer it if the distinction between different types of category weren't part of the name. It would allow us to move categories to different parts of the category tree much more easily, and would make questions like 'is this a topical category' less of an issue. But that would belong in a separate discussion. —CodeCat 18:42, 6 August 2012 (UTC)
Whatever the naming, the point is that Category:en:Numbers, Category:en:Cardinal numbers and Category:en:Ordinal numbers should not be subcategories of Category:English parts of speech, unlike--if it is to exist--Category:English numerals. --Dan Polansky (talk) 18:46, 6 August 2012 (UTC)
(after edit conflict) Re: "To me, one, first and once all refer to the same concept, the cardinal number 1, ...": They don't. "first" refers to an ordinal number. You can define a finite ordinal number in terms of a cardinal number (like by saying that n-th item is such that the number of items before the item is n-1), but they are still distinct. --Dan Polansky (talk) 18:43, 6 August 2012 (UTC)

Turkish ş

Sae1962 (talk • contribs) just removed the Turkish section from şikayet and added an also link to şikâyet. I consulted WT:About Turkish... but it doesn't exist. We had the same issue over Romanian, and it was settled quite straightforwardly. Let's hope we can do the same again. Mglovesfun (talk) 13:48, 31 July 2012 (UTC)

This doesn't have anything to do with <ş>, which in Turkish is unambiguous. (In Romanian it's tricky, because Turkish <ş> had computer support long before Romanian <ș> did, so of course the former was frequently used for the latter. But that issue never affected Turkish.) Both of the page-titles that you link to use <ş>; the only difference between them is that one uses <a> and the other uses <â>. —RuakhTALK 14:07, 31 July 2012 (UTC)
BTW, tr.wikt confirms Sae1962's edit: it gives the Turkish form with <â>, and reserves <a> for Ottoman Turkish (ota). (Note: Sae1962 has edited both of those entries on tr.wikt, but his/her edits were minor ones not closely related to this.) —RuakhTALK 14:12, 31 July 2012 (UTC)
After edit conflicts. şikâyet must be the right spelling according to online dictionaries, but the verb form here is given as şikayet etmek, even if the noun is written as şikâyet. It's a matter of moving the Turkish section of şikayet into şikâyet; the templates should cater for the derived forms. Sae1962 (talk • contribs)'s Turkish is rated as Babel-4, so why not trust his/her judgement? --Anatoli (обсудить) 14:13, 31 July 2012 (UTC)
Another link şikâyet in Turkish wiki --Anatoli (обсудить) 14:15, 31 July 2012 (UTC)
I misread it then, sorry. Mglovesfun (talk) 14:17, 31 July 2012 (UTC)
I don't think that's an accurate description; ş has always been an acceptable glyph for the Romanian character. The unification with the Turkish character was well within standard practices both for Unicode and pre-Unicode.--Prosfilaes (talk) 04:44, 1 August 2012 (UTC)
Eh, I think it's accurate enough. The old characters are named as having "cedillas", and the reason the new characters with "commas below" were added to Unicode is that the Romanian standards body asked for them. If the point of this discussion were the Romanian situation, I'd have gone into much more detail (and given links to Wikipedia articles, as well as to past discussions and votes here), but since this discussion is about Turkish, I sought only to give a cursory explanation of the Romanian situation — enough to make clear that the same issue does not affect Turkish. —RuakhTALK 14:23, 1 August 2012 (UTC)