Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

Beer parlour archives
2002
December
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017


September 2017

September LexiSession: peace

 
An origami for peace!

The monthly suggested collective task is to make peace. September 21 is the International Day of Peace and October 2 the International Day of Non-Violence so it may be good to reinforce our content related to this topic.

By the way, LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession, because we started a year ago. I hope there will be some people interested in making some contributions! My plan is to try to draft a thesaurus on this topic, but picking good illustrations can be a nice challenge too.   Noé 10:37, 1 September 2017 (UTC)

Hey, the International Day of Peace is today. I'm quite happy to make some publicity about the new thesaurus about peace in French!   Noé 13:41, 21 September 2017 (UTC)

Addition to WT:Wikidata policy

I propose to add a 5th case in which pre-approval isn't necessary: in existing and in-use templates/modules, where the use of Wikidata does not have any effect on the output. This would allow us to do things like tracking and stuff, for testing and to explore potential new uses and the effects. —Rua (mew) 14:41, 1 September 2017 (UTC)

To repeat what I said in Wiktionary talk:Wikidata#A first experiment, I support adding this 5th exception. I think it's consistent with the spirit of the other exceptions, to allow the possibility to access Wikidata without affecting the actual content presented to readers. --Daniel Carrero (talk) 21:49, 9 September 2017 (UTC)
I've been bold and added it to the Wikidata policy page, since nobody else has shown an interest in this discussion. —Rua (mew) 19:03, 15 September 2017 (UTC)

Old Kurdish?

Please forgive my ignorance, but what is the generally accepted name for the parent form of all the Kurdish sublects? Old Kurdish? Proto-Kurdish? Is "Old Kurdish" attested and/or reconstructable?

Also, I noticed we have quite a few entries with the language code ku. Do these need to be sorted out and moved to their respective lect codes, or are these entries with identical orthography across all three lects? --Victar (talk) 20:55, 1 September 2017 (UTC)

We have discussed this before, although I'm not sure where. @-sche might have a link handy. Our entries in ku are nearly all Kurmanji in Latin script, although we also have a specific code for Kurmanji. I think we would be best off committing to a single approach, and use a modified version of {{fa-regional}} to link between dialects. —Μετάknowledgediscuss/deeds 21:15, 1 September 2017 (UTC)
Pinging @Calak, who seems to be knowledgeable in all of the Kurdish dialects. —Aryaman (मुझसे बात करो) 02:42, 2 September 2017 (UTC)
Proto-Kurdish is attested.
I always use ku code when a word is common in all of the Kurdish dialects. For example the common word for "goat" in Kurdish is "bizin"; why should we separate dialects and write "bizin" four times?! ku code means Kurdish language with all its dialects.--Calak (talk) 09:03, 2 September 2017 (UTC)
Because it can get confusing where there's Southern Kurdish one place and Kurdish another, and it's doubly confusing once someone has set computers at the job and there's just raw lists of which entries have Southern Kurdish translations and which don't, without any note of Kurdish in the vicinity.--Prosfilaes (talk) 22:52, 4 September 2017 (UTC)

Proposal: install mw:Extension:PageNotice

This extension makes it possible to add headers to pages. It would mean we no longer need to add {{reconstructed}} to every reconstruction page. —Rua (mew) 12:19, 2 September 2017 (UTC)

French Wiktionary August news

Hello!

Hey! August issue of Wiktionary Actualités just came out in English!

What's up in French Wiktionary? And in the other Wiktionaries? What is a Magic Link? Are there statistics somewhere? German words that may exist? Videos? Details on tantum categories? Nice paintings from French artists? Clowns? Yes, all of this can be found in August Actualités!

As usual, it is translated into English by non-native speakers, in less than a day, and it is not perfect, but it can be improved by readers (wiki-spirit). We are very happy to celebrate a year of English translations! Twelve issues! That's not bad considering that we do not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community, and there were eleven participants for this issue! We are all eager to receive your opinion on our publication!   Noé 21:24, 2 September 2017 (UTC)

Egyptian hieroglyphics

Why are you using the HTML tag for Egyptian hieroglyphics instead of the Unicode characters? While looking at Help:WikiHiero syntax I read that the Unicode characters are only partially supported, so I guess that's why (found the page while writing). What are the things missing for it to be fully supported and when might those "missings" be added or fixed? This turned out to be a more "I don't know anything" question than meant... sorry 'bout that. Jonteemil (talk) 23:34, 2 September 2017 (UTC)

You answered your own question: Unicode can't support all of what we want to show. WikiHiero not only displays the characters correctly, it also does so regardless of whether the reader's computer can support special fonts (hint: most can't) and allows for flexibility in stacking, which lets us show how the language's native speakers chose to organise hieroglyphs spatially. Moreover, Egyptian dictionaries are conventionally organised by romanisation. There is no expectation that Unicode will ever fix this, which is unsurprising given that it is a long-extinct language with no use community, so our current solution is the best way to handle Egyptian going forward. —Μετάknowledgediscuss/deeds 23:45, 2 September 2017 (UTC)
Actually, positioning hieroglyphs properly might be in the works. —suzukaze (tc) 07:16, 3 September 2017 (UTC)
Fingers crossed! — Ungoliant (falai) 17:54, 4 September 2017 (UTC)
Oh, I hadn't seen this version. That's interesting, although I'm not sure it outweighs the other concerns (e.g. that Egyptian spelling is so erratic that if people search by hieroglyph, we'd have to create entries for all the plentiful alternative spellings (and arrangements that aren't even truly alternative spellings, but have different Unicode control characters) because they'd never be able to guess what spellings we'd lemmatised). —Μετάknowledgediscuss/deeds 18:02, 4 September 2017 (UTC)

User:TNMPChannel

Discussion moved from Wiktionary:Tea room/2017/September#User:TNMPChannel‎.

I just blocked them for three days for creating an entry in Vietnamese- a language they don't claim to know- by plagiarizing the definitions (without attribution) from a Chinese entry that shares the same character. This is not just dishonest, it's a copyright violation and a violation of our Creative Commons license.

It's also part of a pattern of poor judgement that I've been concerned about for a while: indiscriminate mass creation of articles from a single source without checking for attestability. Creating entries, then immediately rfding them (within minutes). Submitting one of their new entries to rfc because no one else had intervened to fix it yet. Moving a category without understanding enough about our categories to have a clue whether it was a good idea (it definitely wasn't). In general, doing stuff without knowing what they were doing, then expecting others to fix it.

I may be wrong, but to me this all looks like a child who's too young to understand the implications of what they're doing, and is used to grownups stepping in and fixing things. If not, then something is really, really wrong.

At any rate, we need to decide what to do about this- I've only blocked them for three days, and they will have read this by then. Wikis aren't all that good at dealing with contributors who sincerely believe they're helping, but don't know what they're doing. What do you think we should do? Chuck Entz (talk) 05:55, 3 September 2017 (UTC)

I think he/she should be unblocked for now. The first reminder or warning to the user about making errors in unfamiliar languages was by Justin on their talk page at 03:54, 3 September 2017 (UTC), and they haven't made similar edits after the message. From what I observed in their Chinese edits, he/she seems to be quite unfamiliar with our formatting system, though I can see they are trying to improve, and I have also received 'thanks' for my subsequent edits to their created entries. The entries have been quite useful too. It's not very often that we get new users who are native in E/SE Asian languages, so I'm more inclined to fix their new edits than discourage them. Of course, if it persists despite explanation and warning, then blocking would be indicated. Wyang (talk) 06:01, 3 September 2017 (UTC)
The plagiarism part merits at least a day. Chuck Entz (talk) 07:16, 3 September 2017 (UTC)
Sure. However, I suspect (or hope) that it may just be part of their cluelessness, rather than malicious disruption or infringement. I do hope that they could return, to help with Chinese idioms and Malay entries. Wyang (talk) 08:12, 3 September 2017 (UTC)
I find their constant page moves very worrying. They seem to misunderstand how everything works here. Maybe time should be taken to explain what's wrong on their talk page. —suzukaze (tc) 06:02, 3 September 2017 (UTC)
About 2.5 hours ago this user placed {{unblock|I promise I won't do it again.}} at User_talk:TNMPChannel#Unblock. I think we need some specific, extensive acknowledgement of what won't be done again. (Also something in the documentation that requests a complete confession or allocution before the user is entitled to the request being considered.) DCDuring (talk) 12:25, 3 September 2017 (UTC)
Also, please check if it's not an Awesomemeeos sock. --Anatoli T. (обсудить/вклад) 12:53, 3 September 2017 (UTC)
Completely different. Awesomemeeos gets all the technicalities perfect but has trouble with basic common sense. This person can't get either right. Awesomemeeos simply doesn't have the self-awareness and self-restraint to pull off an impersonation like this- for one thing, they're compulsive about upgrading templates. Chuck Entz (talk) 14:32, 3 September 2017 (UTC)

Translingual terms listed under descendants

Prompted by this seemingly innocent kerfuffle, it makes me wonder if it is I who am in the wrong, or if I'm onto something. Should translingual terms be listed as descendants? I would say no, because they're mainly taxonomic terms made up of Latin terms, not natural descendants. --Robbie SWE (talk) 17:49, 4 September 2017 (UTC)

Would you support the taxonomic terms being listed as ==Latin== instead of ==Translingual==? DTLHS (talk) 17:59, 4 September 2017 (UTC)
Hmm, I'm kind of slow today. Mind giving me an example? --Robbie SWE (talk) 18:02, 4 September 2017 (UTC)
Since you say that they are "taxonomic terms made up of Latin terms" and favor derived terms instead of descendants, it would be consistent to just call them Latin instead of "Translingual". DTLHS (talk) 18:04, 4 September 2017 (UTC)
But we already have translingual taxonomic terms. The examples I was given were bombyx#Descendants, accipiter#Descendants, aequoreus#Descendants and alauda#Descendants. I don't believe they should be listed as descendants. --Robbie SWE (talk) 18:19, 4 September 2017 (UTC)
You do not understand. If you want them to be derived terms why do you still want to call them "Translingual"? DTLHS (talk) 18:23, 4 September 2017 (UTC)

I don't want them there at all - for instance, Bombyx shouldn't be under descendants nor should it be under derived terms at bombyx. --Robbie SWE (talk) 18:47, 4 September 2017 (UTC)

Why shouldn't there be some link in the Latin section to the taxonomic term that is derived or descended from it? Why would we omit the connection? DCDuring (talk) 22:17, 4 September 2017 (UTC)
@DCDuring, I understand what you mean. The reason why I would opt for not listing them under descendants at all is because translingual isn't a language per se. According to our guidelines – [l]ist terms in other languages that have borrowed or inherited the word. The etymology of these terms should then link back to the page. – I don't think that translingual terms fall under this category. I looked through this category and a vast majority of them are not listed under descendants in their original Latin entries. --Robbie SWE (talk) 08:30, 5 September 2017 (UTC)
@Robbie SWE: In the case of CJKV characters clearly we are dealing with a script, not a language. In the case of other Translingual items we are dealing with items that are used in multiple languages. If we don't have Translingual descent shown for taxonomic names, then in principle we should show the taxonomic name as a descendant in each language in which the taxonomic name is used. This seems silly at best. DCDuring (talk) 14:34, 5 September 2017 (UTC)
"Should translingual terms be listed as descendants?"
By common practice it's done that way and so you were "in the wrong". The question as now phrased, with a "should", is a different question and topic. So what possibilities are there?
  • Don't list translingual descendants at all.
    --- This probably is not a good choice.
  • List them as derived terms.
    --- By WT:ELE#Derived terms that would only be possible if translingual terms would be mislabelled Latin or if Latin would be mislabeled translingual or if both would be merged into a single pseudo-language 'Translingualolatin' (or whatever the name would be). This probably is not a good choice either.
  • List translingual terms as descendants.
    --- Why not? I can't think of any contra reasons. Pro reasons: In a translingual entry it would also be for example "From Latin TERM", and descendant is descriptive.
  • List translingual terms at see also.
    --- This does also depend on the question of what can be listed at see also, and apparently there are different views about it. If different-language terms can be listed at see also: Why not? The only reasons I could think of would be that descendants (maybe cp. descendant#Noun) sounds fitting and is more descriptive and informative. On the other hand, as some translingual terms come with {{taxlink}} and link to the English wikispecies project and not to wiktionary, this would also be an acceptable choice.
-84.161.41.93 02:52, 5 September 2017 (UTC)
{{taxlink}} "temporarily" links to Wikispecies (in all uses), with the hope that there will be a Translingual Wiktionary entry (unless it is decided that taxonomic names are Latin). The "See also" heading is for items that have no more specific heading, but in the past has been used for (true) external links and WM project links as well as for alternative forms, "coordinate terms", hyponyms, hypernyms, meronyms, etc. Placing the items now under "See also" under proper headings would be an excellent cleanup project (Augean stables?), but most of us are in pursuit of bright shiny objects. DCDuring (talk) 03:59, 5 September 2017 (UTC)

(literary or dialectal)

故此 is tagged as (literary or dialectal). I'd like to know whether, as the order of the tag seems to mean, 'literary' refers to Mandarin only, or also to the unspecified dialects implied by the tag. Is this way of inferring to be systematically understood for any such tags appearing in other entries for any other language? --Backinstadiums (talk) 21:06, 4 September 2017 (UTC)

As it is the entry is not comprehensible. @Wyang, what dialects are included in "dialectal"? DTLHS (talk) 05:17, 5 September 2017 (UTC)
Wiktionary:About Chinese#Key points: "Terms are defined in relation to Modern Standard Written Chinese. ... Senses limited to the literary language, certain dialects or regions should be marked accordingly." I changed the tag to "formal or Min Dong". I think the headers in our entries should link to the "About" pages somewhere, so that people are directed to a page which explains how a language is documented in entries on Wiktionary, also a page where people can leave their questions or feedback, or voice their interest in joining the editing team. Wyang (talk) 12:13, 5 September 2017 (UTC)
@Wyang: Isn't it uncommon for a term to be used in either formal registers or dialectal ones? --Backinstadiums (talk) 06:41, 7 September 2017 (UTC)
Not necessarily, especially if the reason for the disuse in the modern standard register is an innovation, which happens often in Chinese. Wyang (talk) 06:44, 7 September 2017 (UTC)
@Wyang: Could you please add some examples of such innovations for words used frequently in the language? thanks in advance. --Backinstadiums (talk) 14:58, 8 September 2017 (UTC)
Much of the variation in basic vocabulary (Appendix:Sino-Tibetan Swadesh lists) is due to innovations in the northern varieties of Chinese. Some examples include: (“mouth”) (later displaced by ), (“to eat”) (by ), (“to drink”) (by ), (“dog”) (by ), (“to stand”) (by ), (“he/she”) (by ). Apart from this kind of simple monosyllabic supplantation, another reason for the innovations is the process of polysyllabification, which occurred especially in the northern varieties out of the need for disambiguation, as a compensatory mechanism after the loss of many phonetic contrasts (e.g. tone) through sound changes. Examples include 石頭, (“seed”) → 種子. Wyang (talk) 04:30, 9 September 2017 (UTC)
@Wyang: Hi again, thank you for your examples but none is tagged as either "literary", "dialectal" or "literary or dialectal". Could we isolate the entries which have the tag "literary or dialectal", and then check which ones developed as innovation? --Backinstadiums (talk) 08:30, 9 September 2017 (UTC)
They don't have to be tagged. It's implied in the {{zh-dial}} boxes on those entries. Wyang (talk) 10:42, 9 September 2017 (UTC)
@Wyang: Do you mean having {{lb|zh|Min Dong}} link to the about page? — Eru·tuon 07:52, 7 September 2017 (UTC)
Not really, having <h2>Chinese</h2> link to the about page rather, in the style of fr:chinoise or something similar. Wyang (talk) 08:02, 7 September 2017 (UTC)
I suppose that could be done with JavaScript. Or with templates, if we decided to allow templates in headers like the French Wiktionary (less likely). — Eru·tuon 08:49, 7 September 2017 (UTC)

203.220.198.195 (talk)

Please someone block this. Wyang (talk) 04:40, 8 September 2017 (UTC)

@Wyang: Done, simply because I trust you. I want a reason, though — I looked over a few edits and they seemed fine, though I don't know any Uyghur. (Also, for future reference, this sort of thing can go at WT:VIP.) —Μετάknowledgediscuss/deeds 05:39, 8 September 2017 (UTC)
Hmm, is it just because it's an Australian who already knows how all our templates work? —Μετάknowledgediscuss/deeds 05:46, 8 September 2017 (UTC)
Works for me. It's not necessarily the quality of the initial batch of edits, but the camel's nose that will lead to high volumes of hard-to-check edits later on. Notice, for instance, their edits on {{Template:kk-decl-noun}} which "coincidentally" continues edits by another IP, 61.69.238.79, back in July.
(Before E/C) @Wyang Hi. What edits are wrong? I have only checked some, haven't seen anything bad.

--Anatoli T. (обсудить/вклад) 05:40, 8 September 2017 (UTC)

@Atitarev you don't find it odd that an IP pops up out of nowhere and starts out by rewriting all the inflection templates- in both Kazakh and Uyghur? Chuck Entz (talk) 07:28, 8 September 2017 (UTC)
@Chuck Entz: Yes, it's suspicious; it could be a formerly blocked editor. But I didn't know that this is a reason for blocking, and none was given. Who was it, anyway? --Anatoli T. (обсудить/вклад) 07:37, 8 September 2017 (UTC)
AwesomeMeeos, of course. I'm starting to go through Category:Noun inflection-table templates by language. In the Adyghe subcategory, for instance, you'll find edits by 203.63.135.236. Chuck Entz (talk) 07:57, 8 September 2017 (UTC)
He's quite inventive, LOL. --Anatoli T. (обсудить/вклад) 08:11, 8 September 2017 (UTC)

Adding accents to Italian headwords

I'm currently learning Italian and started to work on our Italian entries. I noticed that we don't display accents for irregularly stressed words. cavolo and diavolo for example are listed in other Italian dictionaries as càvolo and diàvolo (e.g. Treccani), because they don't follow the common stress pattern (next to last syllable). My suggestion is to add a headword parameter to those entries, {{it-noun|m|head=diàvolo}}. Or would that be confusing? Explicit parameter? Better alternatives? – Jberkel (talk) 09:04, 10 September 2017 (UTC)

The problem is that sometimes the accent is actually written, and there's no way for someone to tell the difference. —Rua (mew) 11:28, 10 September 2017 (UTC)
I have included the accent in the hyphenation as a possible solution: diavolo. --Vriullop (talk) 12:12, 10 September 2017 (UTC)
Hm, maybe that's a solution then, I think the information should go somewhere, and the pronunciation section is a good place. It's interesting that they don't bother using accents, even in ambiguous cases (e.g. pesca). – Jberkel (talk) 15:12, 10 September 2017 (UTC)
Theoretically, the "correct" solution would be to include an IPA pronunciation with a stress mark. But I like the accent in the hyphenation idea too. --WikiTiki89 18:06, 11 September 2017 (UTC)
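
As a rough illustration of the head= idea and of the ambiguity raised above, here is a minimal Python sketch. It is not Wiktionary module code: the function names are hypothetical, and it assumes that stress is shown only with a grave or acute accent. It checks that an accented display form differs from the page title by stress marks alone, and it treats an accent as display-only when the page title itself is unaccented. A reader looking at the headword still cannot tell the two cases apart, so this does not settle that concern.

```python
import unicodedata

# Combining grave and acute accents (assumption: the only stress marks used).
STRESS_MARKS = {"\u0300", "\u0301"}


def strip_stress_marks(text: str) -> str:
    """Remove grave/acute accents and leave every other character untouched."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in STRESS_MARKS)
    return unicodedata.normalize("NFC", kept)


def head_matches_title(head: str, page_title: str) -> bool:
    """True if head and page title differ at most by grave/acute accents."""
    return strip_stress_marks(head) == strip_stress_marks(page_title)


def accent_is_display_only(head: str, page_title: str) -> bool:
    """True if head carries an accent that the page title does not have."""
    return head != page_title and strip_stress_marks(head) == page_title


# Hypothetical usage for the entries mentioned in this thread.
print(head_matches_title("diàvolo", "diavolo"))
print(accent_is_display_only("diàvolo", "diavolo"))
print(accent_is_display_only("città", "città"))
```

For "diàvolo" on the page "diavolo" the accent is flagged as display-only, while for "città" the accent is part of the page title and is therefore treated as written.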

angstrom

This person keeps reverting my OED-sourced proper pronunciation of angstrom. Apparently he or she thinks that proper pronunciations must meet his personal litmus test of notability or whatever. Wiktionary exists for many reasons, one being a place where readers from Wikipedia can come to get the specifics of a word, an important part of that being proper pronunciations. I added this information for the specific reason of avoiding the continuation of ever-recurring arguments about the pronunciation of this word over at Wikipedia. The proper and the common pronunciations of words do not always overlap. Their argument seems to be that [œ] does not exist in English, but a quick look over at Open-mid front rounded vowel can tell you that's not the case. In any case it is a Swedish loanword, and you can observe this vowel in the pronunciation of the Swedish version. I made sure to display the pronunciation I added [phonetically] rather than /phonemically/, and I did not mess with or remove the existing common phonemic English pronunciations, so I really don't see what the big deal is. I don't just pull this stuff out of my ass; I am well-versed in the relevant term and how IPA works. This information, while a bit obscure, could still potentially help someone. Don't get me wrong; any other day I'm all for excising superfluous crap from reference sources, but this is not one of those cases. It looks to me to be yet another age-old case of what we called “barracks lawyers” in the Army. Pariah24 (talk) 12:48, 10 September 2017 (UTC)

I would expect that pronunciation by English speakers would rarely be the same as "correct" or common pronunciation by Swedish speakers, especially for a word fully absorbed into English. At [[ångström]] we have the Swedish pronunciation. DCDuring (talk) 14:51, 10 September 2017 (UTC)
@Pariah24 The problem I have is that you don't state which accent says [ˈɔːŋstɹœm]. The LPD and CEPD, the most respected pronunciation dictionaries of English, list only the pronunciations with /ə/ and /ʌ/. We're both aware that there's no */œ/ phoneme in English, at least in most accents. Can you prove that accents that use [œː] for the NURSE vowel (or GOAT vowel, in the case of South African English) would use that vowel in 'angstrom'? I find that highly unlikely and so that argument just doesn't hold up without additional sources that would prove that. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
@DCDuring We do, I added it there a few days ago per the LPD, which provides the Swedish IPA alongside RP and GA transcriptions. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
The established practice on Wiktionary is to only put English pronunciations in an English entry. If no English speaker actually says the word angstrom with the pronunciation [ˈɔːŋstrœm], then that pronunciation should not be listed in the English entry. So the way to resolve this dispute is to find evidence that an English speaker says [ˈɔːŋstrœm], and to put that pronunciation in the proper context (is it rare, is it used by English speakers who also speak Swedish?). — Eru·tuon 17:48, 10 September 2017 (UTC)
I didn't notice that the pronunciation came from the OED. It is given as phonemic there: /ˈɔːŋstrœm/. On the one hand, I respect the OED; on the other, Wiktionary encourages verification of information taken from other sources, so it would be good to find out what they based this transcription on and whether it would meet our standards even if the OED didn't say it, and put it in proper perspective (that is, as above, who actually uses this pronunciation?). — Eru·tuon 18:02, 10 September 2017 (UTC)
screenshot for the plebs :) Pariah24 (talk) 03:29, 11 September 2017 (UTC)
Given its relative obscurity I think most would agree that finding a sample in the wild of a pronunciation of this term is pretty unfeasible. OED has never steered me wrong. If this were Wikipedia I wouldn't have even bothered with this nitpicky silliness, but it's a dictionary for pete's sake, and I always lean on the side of too much information is better than not enough, provided the source isn't questionable. I'm not over here deleting anything, I'm just adding. Pariah24 (talk)
This doesn't look like a query for an obscure word to me. —suzukaze (tc) 03:48, 11 September 2017 (UTC)
 
After all, there are many sources that say "X is a word that means Y" out there in the world, in particular other dictionaries. But they don't actually prove that people use a word, they just say they do, which really isn't sufficient. It wouldn't be the first time that dictionaries make up words that nobody has ever used!
 
suzukaze (tc) 03:40, 11 September 2017 (UTC)
Blockquoting, really? Pariah24 (talk) 03:43, 11 September 2017 (UTC)
If you want to play this game, here's a link for you. Pariah24 (talk) 03:47, 11 September 2017 (UTC)
Alright, you can have another too. —suzukaze (tc) 03:50, 11 September 2017 (UTC)
  • I, too, suspect this is a dictionary invention and I think it should not be in the entry without proof of actual use. Incidentally, I have a background in science (in the US), where normal use was /ˈæŋstɹəm/ or /ˈæŋstɹɑm/ (the latter of which I note is not in the entry). —Μετάknowledgediscuss/deeds 03:52, 11 September 2017 (UTC)
Not trying to offend but your anecdotal experience with a physics term can not possibly be comprehensive enough to be used as a valid argument for inclusion into a worldwide dictionary project. The only plausible way to attack my view that I see is to question the validity of OED and claim they would just put made-up bullshit into their dictionary, which is what you appear to be doing. Everything I know about OED from years of using it goes against this view. This is all getting really pedantic if you ask me. Sometimes it seems like all the worst parts of Wikipedia are magnified here. Pariah24 (talk) 04:24, 11 September 2017 (UTC)
If I understand the screenshot above correctly, the OED's entry for angstrom hasn't been updated since 1933. Nobody is saying that they insert made up bullshit into their entries. But their information may be outdated. DTLHS (talk) 04:28, 11 September 2017 (UTC)
Sadly, it may be bullshit. The OED makes things up a lot more than any of us would like, I'm afraid; you'll see that they're a frequent offender over at Appendix:English dictionary-only terms (a list of terms found in dictionaries that were never actually used enough to enter Wiktionary). —Μετάknowledgediscuss/deeds 04:31, 11 September 2017 (UTC)
If it needs to be removed based on this rationale I don't have a problem with it. I only have a problem with "hey guy I'm here to police you because I don't think you know what you're talking about." I'm rarely on the other side of these situations, because I almost never remove the content of others unless it obviously needs to go. Pariah24 (talk) 04:35, 11 September 2017 (UTC)
@Pariah24 Here we go again with misrepresenting what I say. I'm not interested in policing you (though you seem to strongly believe that, which isn't correct) but in the quality of Wiktionary. I removed that information and added sourced pronunciations (or sourced already existing ones, whatever it was) because I could see that the IPA was incomplete: it didn't say which accent says [ɔːŋstrœm] and I knew for a fact that neither RP nor GA speakers use [œ] in loanwords, not to mention native words. Do I really have to repeat myself over and over again? It seems like I do. I'm tired of you twisting my words and lying about my actions. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
It's especially aggravating when you're someone who has already gone through pains to use good sources. Pariah24 (talk) 04:39, 11 September 2017 (UTC)
@DTLHS: The last quotation in the entry is from 1957, and the small text under the definition mentions a redefinition of the meter in 1960, so they must have updated at least those parts of the entry since 1933. — Eru·tuon 05:03, 11 September 2017 (UTC)
Nobody likes to say this, but we rely heavily on original research (for English at least) and generally don't give a shit what secondary sources say. Which is why you will encounter so much hostility if you try to support something based on what a dictionary says. DTLHS (talk) 04:40, 11 September 2017 (UTC)
I honestly never noticed the little update warnings off to the right, so thanks for that. Clearly my attention to detail needs some work. Pariah24 (talk) 04:47, 11 September 2017 (UTC)
However on a second look it does say Previous version OED2 (1989). I think it just means it was first published in 1933. Pariah24 (talk) 04:53, 11 September 2017 (UTC)
@DTLHS That's clearly not the case here. I sourced the IPA on angstrom, it's not OR. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
FWIW, there have been other cases where dictionaries prescribe a pronunciation that we can't find in use; bon appétit is one. If more dictionaries than just the OED prescribed the pronunciation mentioned here (and if they were consistent, e.g. in saying it was an RP pronunciation), then it might be appropriate to add a note like the one in bon appétit. - -sche (discuss) 04:20, 19 September 2017 (UTC)
Isn't it ALWAYS true that a term originating in a FL will sometimes be pronounced by a limited number of English speakers as it is in the FL? Those who pronounce it as in the FL would be limited to those who knew the FL or were repeatedly "corrected". If the pronunciation is not fairly common, why should it be included? DCDuring (talk) 05:03, 19 September 2017 (UTC)

WM language guides

Following is text extracted from a message posted on many WM mailing lists:

Many wikis in the Wikimedia world give editors suggestions about the correct usage of each respective language: orthography, register, punctuation, and so on.

I started a page to list such language guides: https://meta.wikimedia.org/wiki/Language_guides

I added a bunch of links to Hebrew there because that's my home wiki. I also added a few pages that I could find for Catalan, Indonesian, Russian, and Bosnian.

Please add your languages there! Surely there are dozens and dozens of missing links there.

Before you ask: The linked page explains why Wikidata is not very convenient for maintaining such a list, but if you think that you can put this nicely in Wikidata, be bold.

Thank you!

-- Amir Elisha Aharoni

Note that this is not the same as article style guides. I would think folks here would be interested. Potentially English usage guides would be a valuable resource and link target for us. Perhaps some of our usage notes and other material would be useful material for such usage guides in many languages. DCDuring (talk) 14:43, 10 September 2017 (UTC)

Erm, it appears that these are, in fact, just style guides... —Μετάknowledgediscuss/deeds 18:32, 10 September 2017 (UTC)
I think the glass is about 1/4 full. At least, Croatian, Hebrew, Indonesian, and Polish include grammar and/or common spelling errors. Afrikaans has something on translation errors. DCDuring (talk) 20:03, 10 September 2017 (UTC)

Draft strategy direction. Version #2

In 2017, we initiated a broad discussion to form a strategic direction that will unite and inspire Wikimedians. This direction will be the foundation on which we will build clear plans and set priorities. More than 80 communities and groups discussed and gave feedback[strategy 1][strategy 2][strategy 3]. We researched readers and consulted more than 150 experts[strategy 4]. We looked at future trends that will affect our mission, and gathered feedback from partners and donors.

A group of community volunteers and representatives from the strategy team synthesized this feedback into an early version of the strategic direction that the broader movement can review and discuss.

The second version of the direction is ready. Again, please read, share, and discuss on the talk page on Meta. Based on your feedback, the drafting group will refine and finalize the direction.

SGrabarczuk (WMF) (talk) 10:12, 11 September 2017 (UTC)

Merge Proto-Nuclear Polynesian and Proto-Eastern Polynesian into Proto-Polynesian

The Austronesian languages suffer from what you might call matryoshka grouping: each group has a branch which then branches further, which then branches further, and so on. You end up with a lot of branches which don't have very significant differences, and a lot of different proto-language entries with very similar content. It's made more complicated by the fact that some of the branches are less well established than others. To reduce this somewhat, I propose merging Proto-Nuclear Polynesian poz-pnp-pro and Proto-Eastern Polynesian poz-pep-pro into Proto-Polynesian poz-pol-pro. The differences between them are very small; each group is separated from its parent by only one or two sound changes, and the sound changes of individual languages are often more substantial than those separating the proto-languages. See *matuqa for example. Rapa Nui and Hawaiian are both in the Eastern Polynesian group, yet the former preserves the Proto-Polynesian form unchanged while the latter significantly changes it. Having separate entries for Proto-Polynesian *aka, Proto-Nuclear Polynesian *aka and Proto-Eastern Polynesian *aka is quite pointless. —Rua (mew) 22:47, 11 September 2017 (UTC)

So for some more background, what are the sound changes that supposedly differentiate PNP from PP, and PEP from PNP? --WikiTiki89 22:54, 11 September 2017 (UTC)
See w:Proto-Polynesian language. Nuclear has loss of *h and merging of *l and *r. Eastern also has *s > *h and partial loss of *q. —Rua (mew) 23:31, 11 September 2017 (UTC)
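To make the size of these changes concrete, here is a rough sketch in Python that applies the changes listed above to a plain-text form. It is only an illustration under stated assumptions: the merger of *l and *r is assumed to yield *l (the direction is not stated above), the partial loss of *q is left out because it cannot be expressed as a blanket rule, and all function and variable names are hypothetical.

```python
# Sound changes as (old, new) pairs, applied in order to a form written
# without the leading asterisk.

# Proto-Polynesian -> Proto-Nuclear Polynesian: loss of *h, merger of *l and *r.
# Assumption: the merger is modelled as r -> l.
NUCLEAR_CHANGES = [("h", ""), ("r", "l")]

# Proto-Nuclear Polynesian -> Proto-Eastern Polynesian: *s > *h.
# The partial loss of *q is deliberately not modelled.
EASTERN_CHANGES = [("s", "h")]


def apply_changes(form, changes):
    """Apply each (old, new) replacement to the form, in order."""
    for old, new in changes:
        form = form.replace(old, new)
    return form


def to_nuclear(pp_form):
    """Derive a Proto-Nuclear Polynesian form from a Proto-Polynesian one."""
    return apply_changes(pp_form, NUCLEAR_CHANGES)


def to_eastern(pp_form):
    """Derive a Proto-Eastern Polynesian form via the Nuclear stage."""
    return apply_changes(to_nuclear(pp_form), EASTERN_CHANGES)


if __name__ == "__main__":
    # "matuqa" is the form cited in the proposal; it passes through unchanged.
    for form in ["matuqa", "aka"]:
        print(form, to_nuclear(form), to_eastern(form))
```

Under these assumptions, a form containing none of *h, *l, *r or *s comes out identical at all three stages, which is the duplication the proposal is concerned with.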
  • Oppose. We have terms that can be reconstructed to PNP but not to PPn. Why on earth would you make it impossible to enter reconstructible terms? —Μετάknowledgediscuss/deeds 23:53, 11 September 2017 (UTC)
    • Same reason we did it for Proto-Germanic or Proto-Uralic. —Rua (mew) 00:00, 12 September 2017 (UTC)
      So what do you do with words reconstructible to Proto-West-Germanic but not to Proto-Germanic? —Μετάknowledgediscuss/deeds 04:29, 12 September 2017 (UTC)
      • We reconstruct them for Proto-Germanic. Many linguists don't even believe in Proto-West-Germanic. —Aɴɢʀ (talk) 09:47, 12 September 2017 (UTC)
        • To clarify, we put the reconstruction in a Proto-Germanic entry and give it a context label of "West Germanic". --WikiTiki89 16:47, 12 September 2017 (UTC)
  • Oppose. There are two factors that differentiate Polynesian languages from Indo-European and other Eurasian language families.
    1. Polynesian phonotactics are extremely adverse to consonant change: in any Polynesian language I'm familiar with, there's no such thing as a consonant cluster- every syllable begins with either a vowel or a single consonant followed by a vowel, and there are simply no final consonants. That means that any consonant change that does happen is really significant.
    2. Millions of square miles of open ocean. It is physically possible to walk from the Scandinavian Peninsula all the way to India, but not from Samoa to Hawaii. Proto-Germanic spread over a wide area, but contact between dialect areas prevented it from splitting up into separate branches, for the most part. There are Polynesian island groups such as the Hawaiian Islands and Rapa Nui where there has been a colonization event or two, but no other contact with the outside world- ever. There are parts of Polynesia where island groups are close enough to allow periodic contact, but there are also plenty of island groups where other peoples were a subject of oral history, but never actually encountered until the Europeans showed up. There again, patterns of sound changes are probably reflective of actual population movements, not of areal influence or borrowing- you can't have areal influences if people within an area have never had any contact with each other. Chuck Entz (talk) 01:36, 13 September 2017 (UTC)
      I don't get what your point is. You seem to be saying that we should group languages based on how different they theoretically could be rather than how different they actually are. To me it seems that what we need is to determine whether there actually are fundamental enough differences between PP, PNP, and PEP that it would be infeasible to treat them under a single language. I don't personally have an answer to this, nor do I have any evidence to share, but let's not base this on theory. --WikiTiki89 02:40, 13 September 2017 (UTC)
  • Support at minimum for inherited terms. Multiplication of entries like Proto-Polynesian *wai, Proto-Nuclear Polynesian *wai and Proto-Eastern Polynesian *wai is useless. These are nothing more than a waste of effort that makes things harder to maintain. Trying to document every possible reconstructed proto-language as if they were attested natural languages is not a part of Wiktionary's mission; it is w:scope creep and should be avoided.
    — I could agree either way on items that actually are reconstructible only for a smaller group of languages, though, such as Proto-Nuclear Polynesian *nui. Having these under their proper proto-language categories etc. is more exact; but on the other hand, keeping around two "tiers" of proto-languages seems more complicated than is really necessary, since the context label (dialect label would be more exact) approach works for almost all needs. --Tropylium (talk) 20:48, 18 September 2017 (UTC)

Category:Quotation templates to be cleaned

There are over eight thousand entries in this category. Does anyone know what we are supposed to do with them? SemperBlotto (talk) 19:27, 14 September 2017 (UTC)

It is a project of TheDaveRoss. Maybe he can tell you. DTLHS (talk) 19:28, 14 September 2017 (UTC)
The cleaning means changing {{quote-text}} to one of the specific quote templates, it is not at all urgent. - TheDaveRoss 19:57, 14 September 2017 (UTC)

"Morphologically from the root" IP editor

There seem to be a variety of IP users editing Arabic entries and, among other things, adding the text "morphologically from the root x". See for instance this and this and this and this. As you can see, there are several different IP addresses, but to me their editing style looks the same, particularly the tagline above in the etymology sections, so they might be a single very high-tech person who knows how to mess with IP addresses, or a conspiracy of people. Sometimes the edits are okay, aside from formatting (no newlines between sections, and second-level reference sections, for instance). Often they're replacing specific definitions or etymologies with generic morphological ones (with the above tagline), or replacing Arabic templates with generic ones. With the last example, أَعْلَى (ʾaʿlā), they've radically reformatted the entry in an unconventional fashion, with pronunciation sections above etymology sections that share that pronunciation. It makes sense, but it's something that needs discussion (and there's the deletion of valuable etymological material). On the whole, their edits are full of various things that need to be corrected.

Anyway, I don't know what to do about this. At least the tagline gives some way to find their edits. I'll leave it at that. — Eru·tuon 10:06, 15 September 2017 (UTC)

Proposed first use of Wikidata: categorising planets

A while ago, {{senseid}} was added to entries with Wikidata ids, but with no actual Wikidata access, it was mostly a formality. Now that Wikidata access has been enabled, I've done some experimenting in Module:senseid, detailed at Wiktionary talk:Wikidata#A first experiment. The experiment was meant to find out how to use Wikidata information to categorise entries. Categorising entries this way offers the advantage that we don't have to think about which categories something belongs in. As long as the data is present on Wikidata, and the appropriate IDs are added to entries on Wiktionary, the categories can be added automatically. Think of it as an {{auto cat}} for individual senses: just plop in the Wikidata ID and the template will figure out what needs to be done. This method can only work for "set" categories, which contain things belonging to a particular set, usually indicated in Wikidata with the "instance of" property. It doesn't work for categories that contain terms related to a particular thing/topic. Semantic relatedness is lexical data, which is not currently present in Wikidata. Thus, we'll still have to manually add "topic" categories like Category:Astronomy.

Because of the rule that uses of Wikidata should be approved first, I did this experiment by using tracking categories to stand in for actual topical categories. My finding was that in general it works quite well, but Wikidata handles certain things idiosyncratically which our modules need to take into account when "translating" the information. For example, many things that are combined into one category on Wiktionary have several different Wikidata entities, such as our Category:Planets of the Solar System, which has three corresponding Wikidata entities, outer planet of the Solar system (Q30014), inner planet of the Solar System (Q3504248) and planet of the Solar System (Q17362350). Wikidata makes frequent use of subclassing; writing code on our end to resolve sub/parent classes may help in these cases. Taxonomic data is also handled differently, with special taxonomic properties rather than the generic "subclass of" and "instance of" properties.

I would like to take a first step towards making it actually do something with the data. This needs to be approved in a discussion, so I hereby propose modifying Module:senseid/{{senseid}} to automatically place an entry into a language-specific Category:Planets of the Solar System, if it is given a Wikidata ID (Q followed by numbers) and if Wikidata indicates that the entity for this ID is a planet of the solar system. I'm choosing planets specifically because it's a very small set with exactly 8 known members, and the Wikidata data is known to be complete. This makes it easy to spot any problems if they arise. Extending the system for more categories is very easy; if you approve of doing it for more categories than just planets right from the start, please state so. —Rua (mew) 18:54, 15 September 2017 (UTC)
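The check proposed above can be sketched roughly as follows. This is an illustration in Python rather than the actual Lua in Module:senseid, and it makes several assumptions: the `instance_of` lookup is a hypothetical stand-in for real Wikidata access, the sample data is made up for the example, and the category name follows the "Category:zh:Planets of the Solar System" pattern used elsewhere in this discussion. The three class IDs are the ones quoted above.

```python
import re

# Wikidata classes quoted in the proposal that all map onto the single
# Wiktionary category "Planets of the Solar System".
PLANET_CLASSES = {"Q30014", "Q3504248", "Q17362350"}

# Made-up sample data standing in for the "instance of" values that would
# be fetched from Wikidata. Real code would query Wikidata instead.
SAMPLE_INSTANCE_OF = {
    "Q313": ["Q3504248"],  # assumed for illustration: Venus, an inner planet
    "Q8": [],              # assumed for illustration: not a planet
}


def is_wikidata_id(value):
    """True if the sense ID looks like a Wikidata ID: Q followed by digits."""
    return re.fullmatch(r"Q[0-9]+", value) is not None


def planet_category(lang_code, sense_id, instance_of):
    """Return the language-specific category name, or None if not a planet.

    instance_of is a hypothetical callable mapping an entity ID to a list
    of the IDs found in its "instance of" property.
    """
    if not is_wikidata_id(sense_id):
        return None  # ordinary sense IDs are left alone
    if PLANET_CLASSES.isdisjoint(instance_of(sense_id)):
        return None  # entity is not in any of the planet classes
    return f"Category:{lang_code}:Planets of the Solar System"


def sample_lookup(entity_id):
    """Look up the made-up sample data; unknown IDs have no classes."""
    return SAMPLE_INSTANCE_OF.get(entity_id, [])


if __name__ == "__main__":
    print(planet_category("zh", "Q313", sample_lookup))
    print(planet_category("zh", "Q8", sample_lookup))
    print(planet_category("zh", "Venus", sample_lookup))
```

Keeping the class IDs in one set is one way to fold several Wikidata entities into a single Wiktionary category; resolving subclasses on the fly, as mentioned above, would replace that set with a walk up the class hierarchy.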

I entirely approve and I especially like this example because it is something small and restrained (unlike e.g. adding it to every entry on a species) and it does include some possibly contentious data--i.e. the status of Pluto (which is not contentious to astronomers but is the sort of thing that could have actual individuals editing back-and-forth about it). —Justin (koavf)TCM 00:47, 16 September 2017 (UTC)
What benefit does this add? The members of the category are unlikely to change much over time, and if we did this it would require that every page which is not a planet checks to see if it is a planet, which seems like a lot of overhead. - TheDaveRoss 12:54, 18 September 2017 (UTC)
@TheDaveRoss: Your question answers itself: since this is a very small and stable use case, it will allow us to see in the wild how Wikidata integration would work. That is the value. —Justin (koavf)TCM 15:53, 18 September 2017 (UTC)
@Koavf: I don't disagree that this would be a good test, if it were the type of thing that I thought Wikidata was well suited for. I do not think that this is an example of a good use of Wikidata, however. If you have a small class of objects, you label the objects rather than querying all objects to see if they belong to that class. If you have a very large class, or one which changes often, then you query. - TheDaveRoss 11:39, 19 September 2017 (UTC)
It's not expensive at all to check this. All you need to do is retrieve the "instance of" property of the entity, and then check the IDs that you get for matches. IDs are just strings, so it's basic string matching, which is very fast. —Rua (mew) 15:48, 20 September 2017 (UTC)
It is expensive due to the volume, not the task. - TheDaveRoss 17:37, 20 September 2017 (UTC)
Yes, but structured data will vastly decrease overhead in the long term. —Justin (koavf)TCM 17:41, 20 September 2017 (UTC)
Only if we were currently using any overhead at all for this sort of thing, which we are not. I agree that judicious use of Wikidata will decrease overhead. - TheDaveRoss 17:46, 20 September 2017 (UTC)
Categorizing things is overhead. Someone has to actually do it. —Justin (koavf)TCM 18:23, 20 September 2017 (UTC)
  • I too question the utility of this. The inline markup is unaesthetic, cryptic and really seems out of place, and it offers practically no additional benefit to the current categorisation system. Furthermore, it rests on the assumption that words in various languages are direct equivalents of one another; they are not, for most words in a language. Sure, Venus may be translated as 太白星 and listing it in the translations table at Venus is acceptable, but the words are far from being the same, really. There are several names for Venus in Chinese, each with a nuance in meaning, and the same situation exists for nearly every other planet and star. Wyang (talk) 13:38, 18 September 2017 (UTC)
If 太白星 doesn't mean Venus, then why does the definition say Venus? Subtle nuances in meanings should be included in entries. But in this case it's simple: either it refers to that same ball of rock floating around the sun, or it doesn't. —Rua (mew) 14:17, 18 September 2017 (UTC)
It means Venus, but only in a Chinese astronomy, astrology, or Taoist context. It is already indicated with the label in the entry. It is the Grand White Star in traditional Chinese astronomy/folk religion, governed by the Grand White Star Lord. It is conveniently translated as Venus, but its definition really should be elaborated on in the future. Venus in modern Western astronomy and in the context of Taoist Wu Xing (five elements) is called 金星, Venus in Chinese astronomy and astrology is called 太白(星), Venus in the context of Taoist mythology is called 太白金星, Venus seen in the morning is called 啟明(星), Venus seen in the evening is called 長庚(星), etc. The nuances in most of the vocabulary in a language are simply too significant to allow a bijective map to another language, even for a simple term like Venus, especially if the two languages developed from cultures which historically had very few contacts. The principle of trying to map all words of a language to specific pre-defined semantic concepts, for the purpose of classification, is methodologically problematic. Wyang (talk) 23:09, 18 September 2017 (UTC)
@Wyang: Of course. But has an article named something on the planet second-closest to the Sun. What is that named? —Justin (koavf)TCM 23:29, 18 September 2017 (UTC)
@Koavf: You can't just say "of course" and wave that off. It's similar in Hindi: अरुण (aruṇ, Uranus in astrology), युरेनस (yurenas, Uranus the planet). Some Hindi purists also use अरुण for the planet. btw the closest planet to the sun is actually Mercury. That's not the main problem with this though, the real problem is the markup will become even more opaque and hidden from the casual editor. People wonder why we don't get many new editors, it's because it takes months to learn how to use all of our templates. —Aryaman (मुझसे बात करो) 23:57, 18 September 2017 (UTC)
@Aryamanarora: I'm not suggesting that the problem is simple or trivial: I'm suggesting that it's very complicated but that this is a good first step. Do you have a better solution? —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
If I did I would have to know enough Lua to implement it. —Aryaman (मुझसे बात करो) 00:08, 19 September 2017 (UTC)
@Aryamanarora: I'm not asking if you know the technical means (God knows I don't!), just what in principle would work better. Do you have any thoughts? I'd be happy to know what would work better even in a hypothetical sense. —Justin (koavf)TCM 04:26, 19 September 2017 (UTC)
I don't understand what you mean. Wyang (talk) 23:34, 18 September 2017 (UTC)
@Wyang: zh:w:金星 corresponds to en:w:Venus, so wouldn't that be the best word to use? Also, it's not a problem to use this senseid on more than one entry. —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
Well, it is a problem to try to assign foreign words to specific semantic labels in English: e.g. Venus on Wikidata, and try to systematically generate categories based on these crude equivalents. Like I said above, the best word to use depends on context. Very rarely do words in Chinese match with senses of a word in English exactly. Wyang (talk) 00:14, 19 September 2017 (UTC)
@Wyang: Then have both words in the category. That solves the problem. If one English word is a cognate, or rather an equivalent, to two words in another language, that's okay. (Or three or vice versa, etc.) —Justin (koavf)TCM 03:13, 19 September 2017 (UTC)
They can be both put in the same category, or any category, with the current categorisation system, without having to resort to such rigid equivalence sets. The current method is also superior in usability and aesthetics. P.S. See definitions of cognate. Wyang (talk) 04:05, 19 September 2017 (UTC)
@Wyang: It is not superior in usability because it can be exported across languages. With over 100 Wiktionaries and no fewer than 6,500 languages, using structured data to do any part of the work is far more usable. If the method of categorization that MediaWiki uses is superior, why did we ever launch Wikidata in the first place? —Justin (koavf)TCM 04:23, 19 September 2017 (UTC)
Pronunciation is templatisable, inflection is templatisable, entry layout is templatisable, but semantics is not templatisable or structurisable. Every sense of every word in a language corresponds to a semantic field or domain, and in the hypothetical 3D representation of human perception and cognition, it is a sphere in space, which spatially centres on the core, fundamental meaning of the term. What we are trying to do when we translate foreign words into English on Wiktionary is to find existing English terms with spatially close semantic areas to the source words; that's why definitions for Chinese words on Wiktionary are usually given with two or three English equivalents. Giving these multiple equivalents allows the reader to imagine the semantic area for the foreign term by superimposing the various English terms. As such, semantics across languages is not structured, and attempts to structurise it will only result in confusion and chaos. If languages were strict bijective mappings of words and grammar from one to another, machine translation would be a lot easier. Sure, it may work for water (in most languages), since the semantic fields for words for water are mostly spatially close, but it will fail for river, fluid, syrup. Wyang (talk) 04:46, 19 September 2017 (UTC)
@Wyang: So are you opposed to the notion of categorizing these terms? —Justin (koavf)TCM 16:20, 19 September 2017 (UTC)
I'm opposed to the notion of mapping senses of foreign words onto specific, pre-defined English semantic labels, and blindly achieve categorisation via those labels, as if the labels themselves are equivalent to the senses of the foreign words. Wyang (talk) 23:26, 19 September 2017 (UTC)
The Wikidata items aren't meant to encompass entire senses. They encompass referents of senses. Chinese may have different words for Venus, all with various nuances, but they all refer to the same ball of rock in space. The context in which they are used isn't relevant, that's a matter for context labels and usage notes. All that matters is that they fundamentally are different terms for that same ball of rock. So can you give concrete examples? Which terms refer to the planet and which don't? —Rua (mew) 23:45, 19 September 2017 (UTC)
────────────────────────────────────────────────────────────────────────────────────────────────────
They differ on a lexical level and these nuances will be reflected in their categories. The category of Category:zh:Planets of the Solar System is perfect as it is now. There is no need to dump all tens of synonyms of Venus, plus the names of all other planets in traditional Chinese astronomy into this category; these words, which are largely limited to traditional Chinese astronomy, should go into Category:zh:Planets of the Solar System in Chinese astronomy, or at least Category:zh:Stars and planets in Chinese astronomy (the reason the entry has the {{lb|zh|Chinese star}} label). One can easily adjust the categorisation in whatever way is most appropriate now. Putting an unattractive senseid next to the sense simply takes away this freedom and flexibility. Another example is the senseid at happiness, linked to Q8 on Wikidata which has 幸福 (xingfu) listed as the Chinese equivalent. This is unfortunate as xingfu is probably one of the hardest Chinese words to translate into English. Although it is typically glossed as happy; happiness, its connotations are hard to describe and not insignificant. English "I am very happy" and Chinese "我很幸福" have vastly different meanings. It would be quite silly to let the meaning conveyed by happiness blindly dictate the categories of the foreign words. Wyang (talk) 02:12, 20 September 2017 (UTC)
I oppose this, in particular the {{senseid|zh|Q313}} noise added to 太白星. Wiki markup should be free from identifier noise; it should be pleasant to edit directly. --Dan Polansky (talk) 13:53, 18 September 2017 (UTC)
It's not possible to use Wikidata without identifiers. —Rua (mew) 14:11, 18 September 2017 (UTC)
I support using Wikidata to categorize planets. One alternative idea might be using something like {{senseid|zh|Venus}} instead of {{senseid|zh|Q313}} with a data module that recognizes that "Venus" means "Q313". --Daniel Carrero (talk) 14:35, 18 September 2017 (UTC)
I strongly oppose creating a module which maps strings to Wikidata identifiers. The Lua errors are rampant enough without going down that path. - TheDaveRoss 15:02, 18 September 2017 (UTC)
I'm not a fan of it either. In any case, the senseids themselves aren't a part of this proposal. This proposal is only about modifying {{senseid}} to use them for categorising. Having Wikidata IDs on entries is beneficial even if others decide they don't want {{senseid}} to categorise. —Rua (mew) 15:08, 18 September 2017 (UTC)
Sure, I take back the idea of mapping things like "Venus" = "Q313". I prefer using "Q313" anyway, that was just an alternative idea. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
I strongly agree with Wyang, especially his point regarding the fact that Chinese has multiple names for Venus, each with its own connotations. --WikiTiki89 21:19, 18 September 2017 (UTC)
Is there any Chinese name for Venus that shouldn't get categorized in Category:zh:Planets of the Solar System? This is a categorization proposal, so I'd like to know how the nuances of each name affect categorization. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
Oppose per Dan Polansky and Wyang. —Aryaman (मुझसे बात करो) 21:22, 18 September 2017 (UTC)
See d:User:Amgine for a very old commentary on Wikidata, which aligns with Wyang's opposition. Feel free to expand if you can. - Amgine/ t·e 01:54, 21 September 2017 (UTC)

Split RfD by English/non-English as we have with RfV

I propose that we split Wiktionary:Requests for deletion into Wiktionary:Requests for deletion/English and Wiktionary:Requests for deletion/Non-English, just as we have done with Wiktionary:Requests for verification. RfD is presently over 425K, and although I can't say offhand what proportion is non-English, I would estimate it at somewhere over one third. As with RfV discussions, examination of English and non-English entries, of course, requires different skill sets, and a different set of editors are typically attracted to each kind of discussion. bd2412 T 02:04, 17 September 2017 (UTC)

When I proposed the split of RFV, I considered this as well but ultimately rejected it. The fact is that if you don't know at least a little Japanese, you just can't be of any use in gathering Japanese quotations or assessing whether they're uses. However, anyone who understands how the SOP concept works can look at a Japanese word broken into its component parts and, once shown that 茶色の葉 is 茶色 (ちゃいろ, chairo, brown colour) + の (no, possessive connector) + 葉 (は, ha, leaf) and that it means "brown leaf", see that it would be inappropriate to have a Wiktionary entry for it. That's why everyone can contribute at RFD, and why we should focus on clearing up the backlog by making judgement calls on whether a consensus has been reached rather than splitting the page. —Μετάknowledgediscuss/deeds 03:58, 17 September 2017 (UTC)
Just as an academic question, doesn't RfD address issues other than SOP-ness? bd2412 T 01:28, 24 September 2017 (UTC)
Yes, but much less commonly. —Μετάknowledgediscuss/deeds 03:53, 24 September 2017 (UTC)
Support, WT:RFD is too large already. --Daniel Carrero (talk) 21:37, 18 September 2017 (UTC)
Support --Backinstadiums (talk) 07:04, 21 September 2017 (UTC)
Support and I also believe scriptio continua languages and some other language groups require CFI different from English. BTW, @Metaknowledge: Japanese idiomatic terms may get a possessive particle の, e.g. 木の葉 (このは, konoha) or 木の葉 (きのは, kinoha). The 2nd one looks especially like SoP ("leaf of the tree") but both terms are considered idiomatic. Languages such as Korean or Arabic, etc. (both use spaces between) may have non-words written together, with no spaces between them, such as clitic prepositions, pronouns, etc. (Arabic) - فَقَالَ (faqāla, and (he) said) = فَ + قَالَ, غُرْفَتِي (ḡurfatī, my room) = غرفة + ي, particles and copulas (Korean) - 한국어로 (han-gugeoro, “in Korean”) = 한국어 + 로, 학생입니다 (haksaeng-imnida, “(someone) is a student”) = 학생 + 입니다. I do agree, however, that one can take part in discussions without a thorough knowledge of a given language but one has to learn fast and listen to arguments of native speakers or advanced learners. --Anatoli T. (обсудить/вклад) 07:44, 21 September 2017 (UTC)
Support. — Ungoliant (falai) 12:52, 21 September 2017 (UTC)
Support. --Canonicalization (talk) 13:19, 21 September 2017 (UTC)
Support. --Robbie SWE (talk) 18:18, 21 September 2017 (UTC)

For reference:

I don't see any particular need. RFD is nowhere near as huge as RFV was when we split. --WikiTiki89 18:19, 25 September 2017 (UTC)
I think the relative sizes of both pages have fluctuated and crossed over one another from time to time. bd2412 T 19:59, 27 September 2017 (UTC)
As I see it, there are two negatives to consider: first, of course, is the burden of working with a large page, but the other one hasn't been mentioned yet: splitting by language means requiring a language code in the {{rfd}} template and cleaning things up when people post to the wrong page. This represents yet another way that those who don't know the finer points of our templates can get tripped up while doing basic tasks. Chuck Entz (talk) 21:24, 27 September 2017 (UTC)

Upcoming Wiki Science Competition

Did you hear about the Wiki Science competition, starting in November?

The competition will focus on images, but it might evolve in the near future, so users of other content platforms should take a look at it.

I've informed the village pump on Commons, since there will be an intense flow of technical images uploaded by newbies, which will require better categorization and translation of descriptions here and there. More importantly, images can be used for the articles on specific platforms. I am thinking of some of your users who have created and taken care of many technical and scientific entries and are still active, such as User:SemperBlotto.

I give you some details.

In 2015, limited to Europe, we got thousands of entries; we can expect two or three times more this year. In Italy, for example, we will send emails to many professional mailing lists, and other national Wikimedia chapters will also use their social media to inform the public.

Ivo Kruusamägi of WM Estonia and I have finished preparing some of the juries. I did my best to gather, besides people with a strong scientific background, some expert Wikipedians (because I asked first on Wikipedia) here and there to look at the files on Commons and not just the quality of the images. I have also informed users on English Wikipedia and English Wikiversity, and will do the same on some other Wikimedia platforms in the following weeks.

The final international jury is made up of expert researchers, usually with an interest in photography, but no strong knowledge of the details of any wikimedia platforms. The main goal was to enlarge the network of "friends" of wikimedia platforms. Some national juries should probably have enough expert wikimedians and wikipedians, I guess because of the presence of an active national chapter in their setup, so someone might take care of some of the uploads, at least improving some descriptions and/or using them directly. Sometimes, suggesting technical entries to be created too.

More generally, gathering users besides wikipedians will probably help us to include more platforms in the competitions.

Now that I am sure that we have enough "scientists" here and there and from different fields, maybe we can see if we can also gather expert wikimedia users specifically, whatever their background. For example, teachers rather than researchers, who can evaluate the quality of the images for more specific uses.

For the countries without juries, there is the possibility of creating a second-level jury to select images from the rest of the world for the experts of the final jury. For such a second-level jury I have found some names, but the number of entries could be really high, so maybe that's where we can look for more standard wikimedia users.

If you are a citizen of a country with a national jury you could also join them directly (rumor has it, more will appear). I don't know the details in many cases, whether they need more jurors or they are fine.

Anyone interested?--Alexmar983 (talk) 05:59, 18 September 2017 (UTC)

I am not interested in being part of a jury, but I thought it very interesting that you knocked here. Pictures made from a Wikipedia-use perspective are quite different from pictures usable in Wiktionary. Here we also need to illustrate verbs and actions for tools (not only the tool itself) and more. I'll be very enthusiastic to integrate pictures from this competition in wiktionaries if they fit our needs! Noé 07:36, 19 September 2017 (UTC)
With thousands of uploads, statistically someone could fit some needs also here... Noé I am happy if more people take a look, this should give better feedback for the future when the competition will be bigger and we can make it more specific to the needs of some wikiplatform. For example edit-a-thons. In the meantime, I have found another juror on frwikipedia, I am close enough to finalizing the second-level jury. I am "sad" no one has replied from wikiversity yet.--Alexmar983 (talk) 12:56, 19 September 2017 (UTC)
Cool. Maybe you can try to ping the French Wikiversity, if some French Wikipedians can assist you   Noé 13:24, 19 September 2017 (UTC)

Wiktionary User Group

Hello!

The Tremendous Wiktionary User Group is a coalition of users of Wiktionaries aiming to create a common platform to share ideas and documents. It is also a way to lobby the Wikimedia Foundation to make it acknowledge the needs of our projects in terms of technical improvements.

This User Group is completing a revolution, a first year of existence! We are writing our first Annual Report (due September 26th). It's time to look at what was done during the year and to frame the future axes of action. There are 42 affiliates now but the group can include many more people. I invite you to read our works and to see if you want to participate in our actions. The most visible one is LexiSession but there is much more to do, including promotional material (leaflets, banners, stickers, etc.), inter-wiktionarian collaborations (on templates, Wikidata, policies and guidelines) and meet-ups! There are no fees or admission processes; it's open to everyone who likes Wiktionary and wants to do more for this project. Your ideas and initiatives are welcome!

Thank you for your attention, I hope to see you soon   Noé 08:14, 19 September 2017 (UTC)

Help review PulauKakatua19 (talkcontribs)'s entries

This user is editing in way too many languages for them to possibly understand all of them. I have checked and fixed all of the recent edits in Hindi, Bengali, and Sanskrit, but someone acquainted with Indonesian, Malay, and now Korean should check the rest. Atitarev (talkcontribs) warned them on their talk page about Russian a while ago too. —Aryaman (मुझसे बात करो) 01:00, 21 September 2017 (UTC)

I will check their Chinese, Korean and Malay ones. Wyang (talk) 01:11, 21 September 2017 (UTC)

Gfarnab (talkcontribs) back at it again

e.g. A recent error at . Someone please block them. Wyang (talk) 01:13, 21 September 2017 (UTC)

Wiktionary:Votes/cu-2017-09/User:SemperBlotto for checkuser

Could someone add this to Wiktionary:Votes/Active? Thanks. --2A02:2788:A4:F44:AC35:948A:635A:9569 18:16, 21 September 2017 (UTC)

Before any such thing (I don't even think anons can create votes to start with), have you at least asked @SemperBlotto if he's interested? --Robbie SWE (talk) 18:21, 21 September 2017 (UTC)
No, he isn't. SemperBlotto (talk) 20:17, 21 September 2017 (UTC)
That's what I suspected. The vote is therefore useless and will be deleted. --Robbie SWE (talk) 20:19, 21 September 2017 (UTC)

Parentheses in IPA

I really wish people would stop inserting these back into pronunciations. The only acceptable IPA use of parentheses is in w:ExtIPA, and those are subscript parentheses to represent partial devoicing. In fact, I've searched through various linguistics databases and can't find much evidence even of non-IPA uses, except their occasional use to denote silent articulation. Obviously this doesn't apply to the case in which Wiktionary editors are most commonly trying to use parentheses (optional articulation of ⟨ɹ⟩). It may be helpful, but it's wrong. The entry should show either both possible pronunciations separately or a more specific phonetic pronunciation. If people are going to keep using them, then Wiktionary as a whole needs to stop claiming it is using the IPA's system, and admit it has its own in-house system. It's rather disrespectful to the creators of a standard to cherrypick what you wish to use. If the IPA thought parentheses were really that important, don't you think they would have standardized them by now? Pariah24 (talk) 20:31, 21 September 2017 (UTC)

Of course, IPA is disrespectful to the creators of the Latin alphabet, by the way they cherrypick that alphabet. Phonetic alphabets are used in great variation throughout the world, including many, many minor variants on IPA. And our use of parentheses has precedent; we have for crater /ˈkɹeɪ.tə(ɹ)/, and Kenyon and Knott's A Pronouncing Dictionary of American English (1953) has (among others) ˈkɹetə(r).--Prosfilaes (talk) 21:52, 21 September 2017 (UTC)
@Pariah24 It's not wrong. Peter Ladefoged transcribes the unstressed form of the as [(ð)ə] in broad phonetic transcription in the Handbook of the IPA (chapter 'American English'). Just because something is not officially endorsed (and I'm not so sure of that, have you tried asking the IPA itself?) it doesn't mean that it's wrong or that it shouldn't be used. Unless I'm missing something?
You're also a bit inconsistent in your edits. In martyr, you transcribed the AuE pronunciation /ˈmɑːtəɹ/, /ˈmɑːtə/, [ˈmäːtə], [ˈmäːɾə]. The order was wrong, as the pronunciation with the final /ɹ/ is marked and used only immediately before vowels, not the other way around. Also, the way the final [ɹ] is omitted in phonetic transcriptions suggests that it's there phonemically but not phonetically, which is of course completely wrong (if anything, again, it's the other way around). I've fixed that for you. Mr KEBAB (talk) 07:39, 27 September 2017 (UTC)

Braille entries

Should we reformat Braille entries like this?--2001:DA8:201:3512:BC46:AD88:D9A7:3939 16:39, 22 September 2017 (UTC)

(@Daniel Carrero suzukaze (tc) 00:47, 23 September 2017 (UTC))
Mostly support. My opinion is this:
  1. I would suggest, in normal letter entries like a and also Braille letter entries like ⠁ (which is Braille for "a"), deleting all Latin script sections like Spanish, Portuguese, Italian, etc. because they clutter the entry and are basically infinite. The Translingual section can explain the Latin script letters.
  2. But, in Braille entries like the aforementioned ⠁, I would support keeping separate sections for Japanese, Arabic, Hebrew, etc. and other non-Latin script entries as opposed to keeping them all in the Translingual section.
  3. I would also support using proper categorization like Category:Arabic letters in Braille script (current redlink), with the written language and script. I would suggest using Category:Arabic letters in Arabic script (self-explanatory) for the normal alphabet.
--Daniel Carrero (talk) 03:41, 23 September 2017 (UTC)
@Daniel Carrero: I agree. --Backinstadiums (talk) 08:04, 23 September 2017 (UTC)

Wiktionary:Votes/sy-2017-09/User:Aryamanarora for admin

User:Aryamanarora has been nominated for adminship. Please voice your opinion on the page. Thanks! Wyang (talk) 13:56, 23 September 2017 (UTC)

Denoting long aspiration

In Northern Sami, there's a set of preaspirated consonants, but these consonants can be lengthened as well. When they are long, it is the preaspiration that lengthens rather than the occlusion itself. Usually, I've seen the preaspiration transcribed with just the letter h, e.g. hp, so that long preaspiration then becomes a matter of writing hːp. The few Northern Sami transcriptions that we have, and those on Wikipedia, use the superscript ʰ instead. I prefer the superscript, but writing ʰːp is probably less than ideal. I've written ʰpː instead on these occasions, but it doesn't really reflect the phonetic reality that it's the aspiration that lengthens. Any ideas? —Rua (mew) 18:29, 23 September 2017 (UTC)

How about hhp? — Eru·tuon 19:09, 23 September 2017 (UTC)
That's more or less equivalent to hːp. I would prefer to avoid h because there's also an actual phoneme /h/, and it's not part of these preaspirated consonants. /hːp/ is one phoneme, so I'd like it if the transcription reflected that. —Rua (mew) 19:14, 23 September 2017 (UTC)
You call it "less than ideal", but from your description I don't see much choice beside ʰːp, unless it's ʰʰp. —Aɴɢʀ (talk) 14:37, 24 September 2017 (UTC)
Yeah. I just hoped someone would think of something I hadn't thought of yet. @Tropylium any ideas? —Rua (mew) 15:14, 24 September 2017 (UTC)
In phonetic transcription there should be no problem with [hːp]. Phonetically there is no difference between [ʰ] and [h]. Even phonemically /hːp/ might be feasible. It is not universally agreed that these are unitary consonants; some analyses do consider them clusters /h/+/p/, in part precisely because it's the aspiration that lengthens and not the closure (similar to how the long counterpart of clusters such as /sk/ is /sːk/). In any case there is no contrast between /hp/ versus /ʰp/. --Tropylium (talk) 19:24, 24 September 2017 (UTC)
Ok, I'll just go with /hːp/ then. Thank you. —Rua (mew) 19:50, 24 September 2017 (UTC)
I would have picked /ʰʰp/, but it doesn't matter too much. --WikiTiki89 18:30, 25 September 2017 (UTC)

Dinajpuria

Do we have a language code for this under a different name? Used on জাৰ, নিগনি, translation at winter @Sagir Ahmed Msa, Aryamanarora. DTLHS (talk) 19:11, 23 September 2017 (UTC)

Also "Mymensinghiya", used on light. DTLHS (talk) 19:16, 23 September 2017 (UTC)
AFAIK both of these are Bengali dialects... Sagir probably knows better than me. —Aryaman (मुझसे बात करो) 20:11, 23 September 2017 (UTC)
@Aryamanarora: nope, there's no code for these languages. Yes, these are considered Bengali dialects just like Sylheti, Chittagonian, Rajbongsi etc. (Wiktionary has codes for these). Chakma and Rohingya (both very closely related to Chittagonian) are not considered Bengali dialects, probably because their native speakers are not considered Bengali people. But these languages are not actual dialects. Some of them are more closely related to other languages than to standard Bengali. The same goes for Assamese dialects (I mentioned the Kamrupi language in মেকুৰী, which is considered an Assamese dialect). I think that, just like Sylheti, Chittagonian etc., these languages should also have codes. They have different phonology, grammar, even origins. The Dinajpuria and Mymensinghia words are not present in Rarhi-Nadia (standard Bengali); they are also closer to Rajbongsi and Sylheti respectively. Please check samples.

User:Sagir Ahmed Msa

@Sagir Ahmed Msa: Is grammar significantly different in these lects from Rarhi-Nadia? I'll admit I know little about Eastern Indo-Aryan, it's just ISO is usually generous with codes (e.g. a bunch of Hindi lects are given codes when they are often considered to be dialects). If we did add a code, bn-dnj etc. would be fine right? —Aryaman (मुझसे बात करो) 17:14, 25 September 2017 (UTC)
@Aryamanarora: yes you can make codes with "bn-" since they are generally considered as Bengali dialects.

-- Sagir

@Aryamanarora The code would be inc-dnj (see Wiktionary:Languages#Language_codes). DTLHS (talk) 19:10, 25 September 2017 (UTC)
@DTLHS: Whoops, typo on my part. But why not bn- since they are often considered Bengali dialects? That's probably not supposed to be argued about here though. —Aryaman (मुझसे बात करो) 19:12, 25 September 2017 (UTC)
Bengali is not a language family. DTLHS (talk) 19:15, 25 September 2017 (UTC)
Yes, I understand that now. So could we add inc-dnj (Dinajpuria) and inc-mym (Mymensinghiya), both with script Beng and ancestor inc-mgd? —Aryaman (मुझसे बात करो) 19:18, 25 September 2017 (UTC)
I am concerned that there is no information about these languages / dialects online- not even mentions of the language names. Since they would be WT:LDLs, what references would be used to support entries? DTLHS (talk) 19:21, 25 September 2017 (UTC)
@DTLHS: [1] seems promising, and attests to the lack of mutual intelligibility between these dialects... But I am not sure whether they deserve codes. Would {{lb|bn|...}} not suffice @Sagir Ahmed Msa? —Aryaman (मुझसे बात करो) 19:27, 25 September 2017 (UTC)
@Aryamanarora, Aryaman:

Here are some examples. Unfortunately I couldn't find Mymensinghiya tenses, so I'm comparing with Dhakaiya; they are closely related to each other, and Mymensinghiya is more distinct from Standard Bengali than Dhakaiya.

  • English :
  1. I do.
  2. I am doing.
  3. I did.
  4. I was doing.
  5. I will do.
  6. I will be doing.
  • Dhakaiya :
  1. Ami kôri.
  2. Ami kôrtasi.
  3. Ami kôrsi/kôrsilam.
  4. Ami kôrtasilam.
  5. Ami kôrmu.
  6. Ami kôrtê thakum.
  • Bengali:
  1. Ami kôri.
  2. Ami kôrchi.
  3. Ami kôrêchi.
  4. Ami kôrchilam.
  5. Ami kôrbo.
  6. Ami kôrtê thakbo.
  • Assamese:
  1. Môi kôrû.
  2. Môi kôri asû. (kôri = kôrat)
  3. Môi kôrisû/kôrisilû.
  4. Môi kôri asilû.
  5. Môi kôrim.
  6. Môi kôri thakim. (kôri = kôrat)
  • Rangpuri/Rajbongsi/Kamata:
  1. Muĩ kôrû.
  2. Muĩ kôrûsû.
  3. Muĩ kôrsinû. (And kôrsû?)
  4. Muĩ kôrûsinû.
  5. Muĩ kôrim.
  6. Muĩ kôrtê thakim.

-- Sagir

Wiktionary:Votes/sy-2017-09/User:Justinrleung for admin

Another veteran editor of Wiktionary, User:Justinrleung, has been nominated for adminship. Please voice your opinion on that page. Thanks! (Vote closes on 8 Oct.) Wyang (talk) 00:39, 24 September 2017 (UTC)

Should sense ids be distinct across pages?

Israel and State of Israel have the same sense id. I can’t imagine that this will cause any problem, since a sense id will presumably always accompany a pagename, or do we want to ensure universal uniqueness? — Ungoliant (falai) 13:43, 25 September 2017 (UTC)

In principle, it might take pagename + etymology + PoS + senseid to guarantee uniqueness in English, at least if the senseid is poorly chosen (e.g., a noun and a verb spelled the same, each used by itself in two different senseids for different PoSes). In the absence of etymology, pronunciation might be required. In some FLs gender might be needed. I wonder what requirements exist in other languages. This seems messy.
Do we have anyone running comprehensive checks for this kind of thing against the XML dumps? For example, I use {{sense|genus}} under synonyms, hypernyms, and hyponyms header in taxonomic entries, but I sometimes need to differentiate by taxonomic family, order, etc. to ensure uniqueness. Have I always done so? I haven't been checking for that. DCDuring (talk) 17:00, 25 September 2017 (UTC)
Doesn't the same kind of problem exist to a vastly greater extent in FL sections where definitions consist of a single polysemic English word, with no disambiguating gloss? DCDuring (talk) 17:06, 25 September 2017 (UTC)
Yes, it does. That’s a major problem with our FL content. Our definitions in certain languages (Italian and Spanish come to mind) are still too poor to be used as my primary source of information. — Ungoliant (falai) 17:19, 25 September 2017 (UTC)
I try to fix these for Dutch whenever I spot them, but it's an uphill battle. Finding them is difficult enough. —Rua (mew) 18:05, 25 September 2017 (UTC)
(edit conflict) The only thing that has to be unique is the combination of page name, language, and sense id. Sense ids appear in a link to an entry that contains the entry name, the language name, and the bit of sense id text; they are not used without a language name or as a substitute for a page name. So they do not need to be unique across pages; if they were, they would probably be too long or unintuitive. I think they should be as short as possible, because they have to be plugged into the |id= parameter of link templates. (However, I just searched and discovered a very long sense id for radical in English: linguistics: portion of a character that provides an indication of its meaning. Oh well.) — Eru·tuon 18:12, 25 September 2017 (UTC)
Can we agree that senseids must be unique within a language section? — Ungoliant (falai) 16:38, 27 September 2017 (UTC)
They wouldn't work if they weren't. --WikiTiki89 16:46, 27 September 2017 (UTC)
Sounds like a good rule. DCDuring (talk) 18:06, 27 September 2017 (UTC)
Yes, that is a restatement of what I meant by "the combination of page name, language, and sense id must be unique". It may not have been very clear. — Eru·tuon 18:47, 27 September 2017 (UTC)
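For anyone who wants to run the kind of check DCDuring asked about, here is a minimal sketch of the rule agreed above: the combination of page name, language, and sense id must be unique. It assumes the ids appear in the wikitext as {{senseid|<language code>|<id>}}; the helper name and the sample pages are made up for illustration, and a real run would read page text from a dump rather than from a hand-written dict.

```python
import re
from collections import Counter

# Assumed call form: {{senseid|<language code>|<id>}}
SENSEID_RE = re.compile(r"\{\{senseid\|([^|{}]+)\|([^|{}]+)\}\}")


def duplicate_sense_ids(pages):
    """Return (page, language, sense id) keys that occur more than once.

    `pages` maps a page name to its wikitext (hypothetical input shape).
    """
    counts = Counter()
    for title, wikitext in pages.items():
        for lang, sense_id in SENSEID_RE.findall(wikitext):
            counts[(title, lang.strip(), sense_id.strip())] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up sample wikitext, not copied from the real entries.
sample_pages = {
    "Israel": "# {{senseid|en|country}} A country.",
    "State of Israel": "# {{senseid|en|country}} The same country.",
    "bank": "# {{senseid|en|river}} edge\n# {{senseid|en|river}} slope",
}

# Only the repeated id inside one language section of one page is reported;
# the id shared by two different pages is not.
print(duplicate_sense_ids(sample_pages))
```

Because the page name is part of the key, two pages sharing an id (as with Israel and State of Israel) never collide; only a repeat on the same page under the same language code is flagged.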

Modern Greek terms spelt with Latin characters

@Xoristzatziki has just speedied the Greek entry at marketing (marketing). I assume the reason is that the word is not written in Greek letters. Since the entry in question was reviewed by experienced editor Saltmarsh, and since marketing is trivially attestable in running text, I think its inclusibility should be at least discussed. — Ungoliant (falai) 16:26, 27 September 2017 (UTC)

Yes - and I thought hard about it at the time. I have frequently considered raising the subject here (TLDR generally stops me). As an ageing Englishman I can feel annoyed at myironic language being mangled by others (I heard an Englishman say "crawfish" on the radio this morning - I'm sorry, we say "crayfish"); I can also understand @Xoristzatziki's anger when the same thing is happening to his language. Greek web pages are affected: the first supermarket site I look at has "FRANCHISE" and "CLUB CARD" (and "SUPER MARKET"). When I go to Greece I feel sad that packaging and billboards are similarly invaded. A quick look at my w:Babiniotis Dictionary shows the Latin script "status quo" (we have it as an English term as well as Latin) and other Latin terms, but only a few English terms (I only find NATO in the time available). To pick an easy example - "weekend" is common in Greek text (the Academie francaise fought against it for years); we have entries for 5 other languages, and it even declines in Polish! But perhaps marketing is better than μάρκετινγκ ? — Saltmarsh. 06:15, 28 September 2017 (UTC)

There is no such term Modern Greek terms spelt with Latin characters. Please do not try to alter a language out of nothing. If you think Greeklish should be a new language in wiktionary, make a proposal. --Xoristzatziki (talk) 16:33, 27 September 2017 (UTC)

As a descriptive dictionary, if a word is used in texts by Greek speakers it can be included. @Saltmarsh DTLHS (talk) 16:42, 27 September 2017 (UTC)
In that sense all English words should contain a Greek section... And sections for all other languages also... Or only Greeks and Cypriots use in texts signs and words from English? Chinese do not do it? --Xoristzatziki (talk) 16:54, 27 September 2017 (UTC)
Category:Chinese terms written in multiple scripts, Category:Chinese terms written in foreign scripts. DTLHS (talk) 16:59, 27 September 2017 (UTC)
This is something else. Please do not confuse us. We are talking about the usage of real English words. Not for terms that cannot be otherwise identified (σ鍵). There is not a single English word in the above mentioned category although ex. fast-food is written, as stand alone word, in more Chinese restaurants around the world than marketing is written in Greek "googloid" texts. --Xoristzatziki (talk) 17:08, 27 September 2017 (UTC)
Look at the second category (Category:Chinese terms written in foreign scripts), especially band, size, and friend. --WikiTiki89 17:26, 27 September 2017 (UTC)
@Ungoliant MMDCCLXIV Could you link some examples of "marketing" being used in Greek texts? DTLHS (talk) 17:11, 27 September 2017 (UTC)
google books:"το marketing". Compare with google books:"το μάρκετινγκ". --WikiTiki89 17:26, 27 September 2017 (UTC)
[2], [3], [4], [5], [6], [7], [8]. Some of them also use μάρκετινγκ elsewhere in the text. — Ungoliant (falai) 17:30, 27 September 2017 (UTC)

Apart from all that, I could agree to such a "descriptive"(!?) approach if all languages were treated the same way. marketing should include every language for which google returns that word if a specific language is asked for. And, anyway, I will not revert such Greeklish entries if the dominant position of volunteers in Wiktionary is to create such "Modern Greek terms spelt with Latin characters".--Xoristzatziki (talk) 17:18, 27 September 2017 (UTC)

@Wikitiki89, @Ungoliant MMDCCLXIV one thing is sure. You do not know how google works (and especially their department of sales together with google books). Otherwise you would come with true results, mentioning counts relative to the time they were written, to whom they are addressed, how many are duplicating or copying or attesting other books etc. etc. --Xoristzatziki (talk) 17:39, 27 September 2017 (UTC)

It's true that relative numbers of Google Books hits aren't that useful. One thing I noticed is that one hit displayed on the results page for the "το μάρκετινγκ" search has "το Marketing" highlighted as the search term- you have to wonder if there's some bleed-through between languages in their search algorithms. That said, such things are beside the point when it comes to CFI: there are enough viewable hits to satisfy CFI- if they really are using the term to convey meaning in Greek. The latter is the tricky part. Chuck Entz (talk) 22:03, 27 September 2017 (UTC)

Based on User:DTLHS's idea of "descriptive dictionary" and the whole above conversation, assuming I can provide enough sources written in English as main language (electronic or printed) which have or inside the text, is it safe to assume that this is an indication to add to these terms an English section? (I have in my hands at least two such books) --Xoristzatziki (talk) 05:34, 28 September 2017 (UTC)

Do the Chinese characters in your texts convey meaning (Wiktionary:Criteria_for_inclusion#Conveying_meaning)? Are they being used as English words and not just mentioned as Chinese characters? It can be hard to answer these questions for a non-native speaker, which is why I don't know if the Greek quotes linked above would qualify. DTLHS (talk) 05:46, 28 September 2017 (UTC)
The "Conveying_meaning" section does not mention the script at all. It only mentions words in the same script (which should be read as "only words in the same script" and not the opposite). The fact that many people who speak a second language prefer to pronounce some words in the way they are pronounced in that second language does not make the pronunciation of these words part of the pronunciation of the first language. Just as the "USA" pronunciation of words by enough people living in London does not make that pronunciation British. A book targeted to a specific group might contain anything that the target group can identify. A book containing emoticons might have emoticons inside sentences, used not as an example but as a full sentence. That does not mean the emoticons are part of a specific language. (Or are they now? Can you spell File:Fxemoji u1F602.svg in English or in any other spoken language? Or are we not interested in pronunciation from that point forward? Just "a printed icon" of "example" is enough? Should we start converting any word to a picture and stop writing it here but include it as a picture?) --Xoristzatziki (talk) 09:25, 28 September 2017 (UTC)
Yes, if enough people in London pronounce something some way, that pronunciation is British. That's what "descriptive" means. Cross-lingual pronunciations are complex, but again, descriptive means that the pronunciation of a word is many times going to be foreignized. I wouldn't say it was safe to assume that 羊 is part of English, but there would certainly be an argument if it was used in running text, particularly if it was treated as an English word. There's also complexities here; English absorbs all sorts of random accents in rare words, like ʔAllāt, or Greek letters, in cases like γ-globulin, and odd characters like ℝ-order tree. But scripts outside Greek rarely get used; ℵ₀ is an odd example, and Cyrillic occasionally leaks through, like СССР, but I'd be very surprised by Chinese characters. I'd expect Greek to be a similar spot; Latin getting mixed in sometimes, with other scripts being rare.--Prosfilaes (talk) 10:02, 28 September 2017 (UTC)

As a native reader of Chinese characters, I agree with User:Xoristzatziki 100% here. If native speakers do not treat these words as their language, and do not include these so-called "attestable" words in the comprehensive monolingual or bilingual reference dictionaries they produce, there is really no point in including them. The native speakers (not language regulators) have the best Sprachgefühl regarding what is their language, what is sum of parts in their language, what part of speech a word is (for analytic languages), and often the script is a formidable barrier to something being considered 'their language'; it is a very bad idea to argue against the perception of native speakers, and say this is your language when they are native in it. I'm sure native English speakers would be similarly concerned if a user started to mass-create "English" entries of a similar nature, even if it is just Latin-script perro ([9], [10], [11], [12]). It is just the case that English is the overwhelming exporter of these uses in other languages, but all languages have principles as to what can be considered part of their language and what can not; not all words a Chinese person says or writes when they speak Chinese are Chinese ― they can mix a lot of English, Malay, Japanese, etc. words in, depending on where they are and how much Chinese/other languages they know. Likewise, Latin-script marketing in running Greek text is just not Greek. Wyang (talk) 11:11, 28 September 2017 (UTC)

I am totally with Xoristzatziki and Wyang on this one. Imagine a published dictionary where marketing is marked as a Greek, Russian, Armenian, etc. word. We have, unfortunately, allowed some Latin-script words to enter CAT:Chinese terms written in foreign scripts; they are mostly slang, very few are standard Chinese (Mandarin) and, unlike Greek, Russian and other alphabet-based or phonetic languages, there is no Chinese script to render those words phonetically. Most of these terms wouldn't pass if they were in a respected published dictionary. I'd like to mention again that a language such as Chinese needs a separate CFI for various reasons. --Anatoli T. (обсудить/вклад) 13:26, 28 September 2017 (UTC)
You may want to RFD the contents of Category:Greek terms written in Latin script. — Ungoliant (falai) 13:35, 28 September 2017 (UTC)
Gone. There wasn't even an attempt to provide citations for those. As far as I am concerned, they are all against our policies and the common sense. --Anatoli T. (обсудить/вклад) 13:47, 28 September 2017 (UTC)
There goes ain't and fuck, which weren't English words in the comprehensive monolingual or bilingual reference dictionaries of English for a long time. It also strands "English" words that aren't English anymore, that are being used in ways that no native speaker of English would use the word. I don't know about marketing, but this is not a simple case.
Also, we're a descriptive dictionary. The "correct" writing style frequently differs from the writing style in actual use. If digging around in the newsgroups, we find a few million words of Latin-script Greek, then of course we should record that.--Prosfilaes (talk) 01:20, 29 September 2017 (UTC)
These are different cases: one (ain't, fuck) where the words are deemed nonstandard or vulgar by dictionary makers, and the other where the words are rejected outright by native speakers as simply being foreign words mixed into speech or writing, much like the example of this perro above. We certainly should not include a few million words of Latin-script Greek; that will only lead us to become a laughing stock, and lead to complete dismissal by Greek speakers. Wyang (talk) 05:46, 29 September 2017 (UTC)
It's not different cases; they're both cases where dictionary writers consider a word not a word, because it's not proper. I wasn't talking about native speakers; I was responding to where you were talking about words considered inappropriate to include by dictionary makers.
What other corpuses should we ignore? All that Hebrew-script German Jargon? Scots (the very existence of the Scots Wikipedia seems to get a lot of mockery)? Should we delete Category:Macedonian language because that might cause complete dismissal by Greek speakers? If we have a corpus of several million words, we should record it.--Prosfilaes (talk) 07:14, 29 September 2017 (UTC)
I'm speechless... You are insisting ain't/fuck and marketing in Greek are of the same nature, so ― I was able to find ain't in many English dictionaries: Merriam-Webster, Oxford Dictionary, Cambridge Dictionary, Collins Dictionary, MacMillan Dictionary, American Heritage Dictionary, Longman Dictionary of Contemporary English; can you find a Greek dictionary that includes the Latin-script word marketing as a Greek word? We've already got a native Greek speaker complaining that we are butchering their language, why? Because non-native speakers and non-speakers are dictating their language, often in a self-assumed manner, as if we know what is best for their language. We are sometimes trapped in the mindset of our own rules, so trapped that we have lost touch with reality, with common sense. Show a native Greek speaker the texts containing marketing and ask them what this is, and they would unanimously tell you this is an English word mixed into a Greek text, and the author is trying to show off that they are professional, up-to-date with the lingo and superior with their knowledge. Ask them what μάρκετινγκ is, and they will tell you it is a Greek word borrowed from English. Yet, we decide for the Greeks, ruling that marketing is their language, as well as several million more Latin-script 'Greek' words. Of course this is going to lead to displeasure, dismissal, and ridicule amongst the Greeks, the native help from whom we desperately and paradoxically need here. Wyang (talk) 09:10, 29 September 2017 (UTC)
The issue boils down to having criteria that allow us to draw a line between “Y-language word used in language X” and “loanword from language Y that has been borrowed into X”. This is not as self-evident as some here seem to think, and consulting n people will yield n different opinions as to which words are the former and which are the latter.
Since no one is arguing for the deletion of μάρκετινγκ, it seems that you want to use script as a criterion, which is not at all unreasonable, but do discuss it rather than removing the entries without explanation. — Ungoliant (falai) 11:54, 29 September 2017 (UTC)
There's nothing to discuss here, really. No need to create votes and policies for the obvious, natural and universally accepted rules - languages are written in native scripts; romanisations and words in other scripts are not words in those languages. People who imagine that Greek or any other language written in a non-Roman script can be written in scripts other than the native one should be the ones seeking approvals, not the ones who protect the sanity and quality of this dictionary. --Anatoli T. (обсудить/вклад) 12:31, 29 September 2017 (UTC)
Apparently you have to teach that to Nikos, who created the Greek entry at marketing (marketing) (and to the Greeks who keep using marketing in their books). — Ungoliant (falai) 12:38, 29 September 2017 (UTC)
Nikosks (talk • contribs) was the user who created the section in 2016. They only had two edits, spaced less than twenty minutes apart, one on management, and one on marketing, both edits involving the addition of a Greek section. Looking at their edits at the time (edit to management and edit to marketing), this may be the case of an innocent newbie mistake after all: they thought the Greek sections on these entries can be used to hold translations. Wyang (talk) 12:54, 29 September 2017 (UTC)
I doubt the user even knew Greek. Both terms were created as masculines but they are neuters - both μάνατζμεντ (mánatzment) and μάρκετινγκ (márketingk). We often have this type of entry made by clueless users. A while ago [[ghar]] was created with a definition something like "This is a Hindi word for 'house'." (The history is now overwritten, as the entry was deleted.) The correct entry is, of course, at घर (ghar). --Anatoli T. (обсудить/вклад) 04:57, 30 September 2017 (UTC)

It's pretty obvious that these are English words being used in Greek text (probably for convenience), not integral Greek words. —Aryaman (मुझसे बात करो) 15:44, 28 September 2017 (UTC)

Another possibility is code switching. People who are bilingual sometimes switch between languages because different languages have different associations: using one's native language for personal, emotional topics, using another language to evoke a certain style, or yet another to show one is up on the latest in a field dominated by speakers of that language. This would probably be the latter: if the field of internet marketing is dominated by English-speakers, one might throw in an occasional bit of English internet marketing terminology to give the appearance of being well-versed in that type of thing. We don't see as much of that in English nowadays because English speakers are less bilingual and don't care as much about other languages, but there are specialized areas such as religions like Islam or Catholicism based in other languages or academic fields where you can see it, and it was once common for educated people to throw in the occasional Greek, Latin or French term in ordinary conversation. Chuck Entz (talk) 16:22, 28 September 2017 (UTC)

FWIW, I found back in 2011 that both ἄρχων and Москва were citable in running English text. The latter was deleted per RFD, but if we decide to keep marketing (marketing) et al, it would be easy to cite a bunch more like ἄρχων. IMO, it's better to analyse it as code-switching. We consider the presence/absence of italic script when trying to determine if a Latin-script foreign-language phrase or term has been borrowed into English or only mentioned, and it seems appropriate to me to consider the presence/absence of native script similarly. - -sche (discuss) 19:15, 6 October 2017 (UTC)

Project Grant proposal for Lingua Libre

 
Lingua Libre's logo

Hi!

Lingua Libre is an open-source platform created to ease mass recording of word pronunciations into clean, well-cut and well-normalized audio files. Given a clean word list, recording productivity can reach up to 1000 audio recordings per hour, i.e. ten times faster than the best method described on Help:Audio pronunciations and without requiring any technical skills.

It's currently supported by a team of volunteers (mostly French, including French Wiktionary administrators). Although the core recording tool is fully functional and very efficient, it currently suffers from very poor integration with the Wikimedia projects. To accelerate the development of this tool and overcome these problems, we have submitted a Project Grant proposal. If you're interested in this project, take a look at the proposal on meta: meta:Grants:Project/0x010C/LinguaLibre. Don't hesitate to ask questions on it if you feel there are ambiguous points, or to endorse the project if you wish to see it come true!

Furthermore, if you want the English Wiktionary to benefit from these audio recordings (through a bot, or some other way), please get in touch with me!  0x010C ~talk~ 17:21, 27 September 2017 (UTC)

Little gnomes at work?

Who was the little gnome that removed the arrow symbol from references? I kind of miss it, and there is a gap where it should be. The way watched pages (such as pages one has created) are presented has also changed. DonnanZ (talk) 12:09, 29 September 2017 (UTC)

I don't recall seeing such an arrow, but it sounds like it may have been added by a gadget, which may have been broken by the recent updates to the site software, or the fact that the "References" header has been changed to something else in many entries, or the recent cleanup of old gadgets. Sorry I can't be of any more help than that. - -sche (discuss) 19:24, 6 October 2017 (UTC)

Category:en:Automotive

The name of this category is a bit strange. Aren't we supposed to use only nouns? --Barytonesis (talk) 19:39, 29 September 2017 (UTC)

There is also Category:en:Nautical using an adjective. But I prefer "Automotive" to Category:en:Auto parts. DonnanZ (talk) 08:54, 30 September 2017 (UTC)

Wikimedia Movement Strategy phase 2, and a goodbye

Hello,

As phase one of the Wikimedia movement strategy process nears its close with the strategic direction being finalized, my contractor role as a coordinator is ending too. I am returning to my normal role as a volunteer (Tar Lócesilion) and wanted to thank you all for your participation in the process.

The strategic direction should be finalized on Meta late this weekend. The planning and designing of phase 2 of the strategy process will start in November. The next phase will again offer many opportunities to participate and discuss the future of our movement, and will focus on roles, resources, and responsibilities.

Thank you, SGrabarczuk (WMF) (talk) 21:55, 30 September 2017 (UTC)

Language userboxes: by country/region?

I don't need it, but I just wondered: can our language userboxes support specific country/region, e.g. British English, or Swiss German? (And if not, should they?) Equinox 21:55, 30 September 2017 (UTC)

Do you mean the Babel boxes? Mine says "This user is a native speaker of British English". DonnanZ (talk) 11:59, 1 October 2017 (UTC)

Desinence as a POS

I suggest adding desinence (inflectional ending) as a POS header/category, I think it would be good to differentiate them from suffix in general. Crom daba (talk) 23:37, 30 September 2017 (UTC)

I think "desinence" is a very obscure term that most people wouldn't know. What about just "inflectional suffix"? DTLHS (talk) 23:48, 30 September 2017 (UTC)
I thought it was that lovey-dovey feeling. 20 seconds of brain-searching later, I realise I was thinking of limerence. Equinox 00:07, 1 October 2017 (UTC)
(edit conflict) In principle, that sounds nice, but desinence is a lousy name (how many people know what it is without looking it up), and the Indo-European languages we're used to are deceptively simple when it comes to the types and hierarchy of affixes. For instance, Bantu languages show number with prefixes in many cases, and, as you know, agglutinative languages throw in all kinds of things represented by separate words of various parts of speech, with the lines separating inflection, derivation and syntax getting thoroughly tangled. Chuck Entz (talk) 00:02, 1 October 2017 (UTC)
We call them suffixes, but for many languages we do distinguish between inflectional and derivational suffixes (e.g. CAT:Irish inflectional suffixes and CAT:Irish derivational suffixes). Note that not all inflectional affixes are suffixes, e.g. Maltese ni-, ti-, ji- (and their equivalents in other Semitic languages) are prefixes, i.e. endings that are actually "beginnings". —Aɴɢʀ (talk) 15:07, 1 October 2017 (UTC)

October 2017

October LexiSession: punishment

 
The Punishment of Loki.

The monthly suggested collective theme is punishment. Not so funny, but the 10th of October is the World Day Against the Death Penalty so we may look at the alternatives and do better descriptions around this theme.

LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession. In one year, 35+ people have participated! I hope there will be some people interested this month, and if you can spread it to another Wiktionary, you are welcome to do so. Ideally, LexiSession should be a booster for every project at the same time, to give us more insight into the ways our colleagues work in the other projects.

See you soon   Noé 09:28, 1 October 2017 (UTC)

slow slicing and poena cullei are my requests for this month (entries for these, not as punishment...although, they could be useful detractors for trolls...) --P5Nd2 (talk) 09:39, 8 October 2017 (UTC)

Special:Contributions/98.113.14.63

Adding translations to too many unrelated languages. No idea where they get transliterations for Chinese dialects, such as Jin, Gan, Xiang, etc. --Anatoli T. (обсудить/вклад) 10:51, 1 October 2017 (UTC)

How the heck did they find Prakrit translations? They must be going through the entries we have already. —Aryaman (मुझसे बात करो) 20:44, 1 October 2017 (UTC)

French Wiktionary September news

Hello!

Hey! The September issue of Wiktionary Actualités just came out in English!

In this issue: Comments about press articles, our information desk is not like yours, a description of a dictionary of short-text signs, a comment on the expression of gender in an Andean language, some cool videos about words (in French and English!), announcements for the Wikiconférence francophone in October and plenty of statistics with fancy fleurons surrounding it all!

As usual, it is translated into English by non-native speakers, in less than a day, and it is not perfect, but it can be improved by readers (wiki-spirit). We did not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community, for the large community of lexicolovers! I hope you did not feel harassed by this notice   Noé 21:41, 1 October 2017 (UTC)

PulauKakatua19 (talk • contribs) again

This user is adding spurious Hittite entries, with some really bad/outdated etymologies. No references either. They have been warned many times for Russian, Hindi, and Rohingya edits. I suggest a week-long block. —Aryaman (मुझसे बात करो) 14:51, 3 October 2017 (UTC)

Etymological information for strong verb non-lemma forms

There are many forms, e.g. in English, where arbitrary, irregular, strong forms of verbs deserve their own etymology. Many of these individual forms have received particular attention from linguists over the decades, e.g. did (past tense of do, from a unique, non-past reduplicated root form of the ancestor of do dating back to Pre-Germanic, for unknown reasons), sang (past tense of sing derived directly from a Proto-Indo-European form of the ancestor of sing). These non-lemma forms have their own independent etymological lineage that can be traced back thousands of years.

A certain administrator (Rua) has informed me that it is policy on Wiktionary to minimize etymological information on non-lemma forms, and instead place such information in the lemma form's etymology section. This can be understood for weak forms like walked, but those forms need little explanation because they are formed regularly, and for the forms that do require extra explanation, it makes for unsightly etymology pages on the lemma form's etymology sections (see Proto-Germanic *dōną's etymology section for the current policy specification; it doesn't even specify the past form *dedǭ, referring to it only as "the past form").

I understand the concern to avoid etymology fragmentation, but in this case, the etymology itself is fragmented and the two forms are remembered as separate, arbitrary, irregular forms. Perhaps there is a solution to maintain the same etymology information in multiple pages, but I think the most simple solution would be to provide etymological information for such forms on their own pages. There is really no reason to avoid this practice and it only makes things more confusing. I am surprised that this is against current policy. Do you agree with this assessment?

128.84.127.223 16:17, 3 October 2017 (UTC)

Strongly oppose putting etymologies on every inflected form, irregular or not. —Rua (mew) 16:35, 3 October 2017 (UTC)
  • Out of curiosity -- where should such etymological information go? Some simple-present verb forms include etymological information for irregular conjugated forms, such as at [[go#Etymology 1]]. Others do not, such as at [[do]], which includes no explanation for the formation of [[did]]. ‑‑ Eiríkr Útlendi │Tala við mig 17:29, 3 October 2017 (UTC)
    • At the lemma entry, where we currently already place them. The IP is arguing that we should put etymologies on nonlemma entries too, which is going to lead to a huge duplicative mess. —Rua (mew) 17:47, 3 October 2017 (UTC)
I am proposing a move of the notable etymologies from the lemma to the non-lemma forms, if they are notable as in strong verbs. There is no duplication going on, only a move, as I indicated in the OP. 128.84.127.223 17:58, 3 October 2017 (UTC)
Then I still oppose it, the etymological information should be centralised on the lemma form. That's how all etymological dictionaries work, that's how we've worked so far too. Our users are accustomed to follow the link to the lemma for information, which is the purpose of non-lemma entries in the first place. They're there to help get users to the right place, nothing more. We should not scatter our information across various non-lemmas. —Rua (mew) 18:34, 3 October 2017 (UTC)
Traditional etymological dictionaries are constrained by the space in a book and give priority to lemma forms because they are the most popular. There is no real reason to ignore non-lemma forms or centralize their etymologies because Wiktionary doesn't have a size constraint, especially for adding reasonable information. I disagree that the only purpose of non-lemma forms is to provide a link to a lemma form; many non-lemma forms have lineages in their own right and there is no reason to marginalize them. Furthermore, users are not accustomed to follow the link to the lemma forms as you suggest; precedents for separate etymologies for non-lemma forms like done, is, are, am, etc. already exist and have existed for a long time. 128.84.127.223 18:37, 3 October 2017 (UTC)
Size constraint isn't the issue. It's keeping our information organised so that information can be found easily. And what I said is the agreed-upon purpose of non-lemma forms. It's why we don't include things such as derived terms, descendants or inflection tables on non-lemmas. Wiktionary is fundamentally lemma-oriented (or lexeme-oriented) rather than word-oriented. If we were word-oriented, we'd also include full definitions on non-lemmas, but thankfully we've been wise enough to not follow that idea. —Rua (mew) 18:42, 3 October 2017 (UTC)
Semantically and synchronically, what you're saying is correct; non-lemma forms don't require separate definitions. The placeholders used now are adequate. Etymologically and diachronically, it's incorrect. Irregular non-lemma forms are entirely independent of their lemma forms. Wiktionary is a semantically lemma-based dictionary, but that's completely unrelated to etymology. There is good reason for irregular non-lemmas to provide etymologies, and the semantic value of the terms have no bearing on it. 128.84.127.223 18:46, 3 October 2017 (UTC)
Ok, but as you must understand by now, that's not how Wiktionary works. The etymologies for the individual parts are noted on the lemma. You'll simply have to adapt to this practice. We're not going to change it just because some random user doesn't like it. —Rua (mew) 18:48, 3 October 2017 (UTC)
You are not arguing against my point, you are arguing your point because "that's the way it's always been done" and based on ad hominem because I'm "some random user". 128.84.127.223 18:54, 3 October 2017 (UTC)
  • I am not proposing putting etymologies on "every inflected form", only on the arbitrary forms with their own separate, traceable etymologies, if only to indicate their significance. The regular forms don't require etymologies because they are predictable. E.g. the etymology for the strong non-lemma form of sing, which is sang:
From Old English sang, from Proto-Germanic *sang, from Proto-Indo-European *songʷh-, o-grade past tense of *sengʷh- (sing, make an incantation).
Right now, the article sang doesn't indicate any of this lineage at all. As a strong and unpredictable form, lexically, sang is just as prominent as its form which is arbitrarily deemed the lemma form, sing, which independently derives from a different PIE form. There is no reason to treat it as a secondary form etymologically, at least in this case. 128.84.127.223 17:36, 3 October 2017 (UTC)
That's not any better. Consider how many times we'd have to duplicate the etymology for all 12 of the past tense forms of vera or syngja. The lemma is a natural place for etymologies, since it's a single central entry that covers all inflected forms. —Rua (mew) 17:47, 3 October 2017 (UTC)
That's not duplication, that's providing the very separate etymologies for very separate forms. If the forms merge at a certain point, then a link can be provided to the form from which they split off to avoid etymology duplication, like is done with borrowed terms. The words is, are, and were, for example, are all forms of be, but does that mean these forms should not provide their own etymologies? Are the etymologies of these forms of less interest and notability than any other term? They are not. 128.84.127.223 17:56, 3 October 2017 (UTC)
  • I could be wrong, but I don't think Rua is arguing that the etymologies of conjugated forms are not worthy of inclusion. I believe that she is instead arguing that the etymologies of conjugated forms should go within the etymology of the lemma form, and that the conjugated-form entries should be minimal.
The issue at hand is not whether to include or exclude certain information -- rather, it is about where to include that information. ‑‑ Eiríkr Útlendi │Tala við mig 18:09, 3 October 2017 (UTC)
Right. I am proposing that it should go in the non-lemma form. In fact, what I'm proposing is already standard practice for many notable forms, e.g. done. I want this to be consistent. For verbs like be, it would clutter its etymology section to list all of the etymologies for all of its many suppletive forms is, are, were, was, am, etc. One interested in these etymologies can follow the links to these forms' pages (which are provided in the term head) and view the etymology. More importantly, someone who specifically searches for irregular forms should have immediate access to their etymologies on the same page.
When I go to the page for am (which actually already follows the format that I'm proposing; I don't think anyone would want to move its etymology to the be page), I want to know the etymology of the term. When dealing with etymology, I don't really care if it's a form of any other (in this case, a completely unrelated lemma form), I want immediate access to its own unrelated and notable etymology. I believe this seems fairly reasonable and already has precedent. 128.84.127.223 18:20, 3 October 2017 (UTC)
The lemma entry is a central place for the term and all of its inflections. Information about am concerns the lemma be, so it should go there. The individual parts of verb paradigms may have separate origins but they don't have separate etymologies because they are inherited as a whole. The verb be in modern English is the same paradigm as the verb been in Middle English. —Rua (mew) 18:38, 3 October 2017 (UTC)
That's incorrect. Separate forms of a verb are not "inherited as a whole". That doesn't even make any sense. Irregular forms all require individual memorization and passing down. The lineage of was, for example, is entirely separate from be, as are both from am. If what you were saying was true, all Indo-European languages would still preserve the verb paradigms of Proto-Indo-European. They do not. They mix, they match, they innovate, they supply. 128.84.127.223 18:41, 3 October 2017 (UTC)
But they still form a single verbal paradigm. A question like "what is the past tense of be?" has an answer precisely because paradigms exist. We have chosen to use a single form to stand in for the entire paradigm, the lemma form, for convenience. That's where etymologies also go. —Rua (mew) 18:46, 3 October 2017 (UTC)
Etymologically, verbal paradigms don't matter for irregular forms. We have chosen the lemma form to stand in for the non-lemma forms semantically, but we have not done so etymologically, because that makes no sense. 128.84.127.223 18:48, 3 October 2017 (UTC)
We've chosen to do both. I'm sorry if that makes no sense to you, but it is what it is. —Rua (mew) 18:49, 3 October 2017 (UTC)
Please cite for me this specific point in Wiktionary policy. I will propose the change through the proper channels. 128.84.127.223 18:53, 3 October 2017 (UTC)
I found it myself, and lo and behold, you seem to be the one who added this into the "common guidelines" page in the first place. While I agree with most of your additions, exclusivity of etymology to the lemma page is one that does not make any sense. 128.84.127.223 19:06, 3 October 2017 (UTC)
I am in favor of continuing to not split etymologies, on the grounds of workability: an editor who is interested in adding this type of information should be able to see at a glance if it has been done already, without checking each relevant non-lemma entry separately. On the other hand, I don't see a problem in directing users from non-lemma forms to the lemma, in cases where they need a separate discussion.
Actual suppletion seems like a different case, though. is, are and be have completely unrelated etymologies, and continuing to maintain separate etymology sections for them seems like a good idea (but I'd again be in favor of pointing users from the lemma form to the other entries for further reading). --Tropylium (talk) 20:49, 3 October 2017 (UTC)
I don't want to split etymologies, except like you said, for terms with suppletive forms and terms with strong forms. For example, the etymology of "did" takes a separate lineage all the way back to Proto-Indo-European that's completely independent from "do"; despite not being a suppletive form, it's a strong form. I don't want to split etymologies for verbs like "walked", only for verbs like "did" and "is/am/are" and "brought". One page should not contain etymologies for different terms if the etymologies are not currently regularly formed. So this would only be an exception that would affect a relative minority of pages. Wouldn't you agree with this? 128.84.127.223 21:47, 3 October 2017 (UTC)
English has relatively few inflected forms, but it can get pretty complicated when you have forms inflected for gender, number, case, etc. Even in English, am, is and are all go back to inflected forms of the same Proto-Indo-European root. As for strong verbs, I don't think differences in ablaut grade are enough to justify maintaining separate etymologies. We have a recognized system of lemmas and non-lemmas, but I'm not sure how you could decide which form to make the "etymology lemma" for forms sharing an etymology. Chuck Entz (talk) 02:15, 4 October 2017 (UTC)
I have trouble with the vagueness of "strong forms". This is well-defined only for Germanic languages, not a generally applicable concept. Likewise, "having a separate lineage" holds for a lot of things, for starters all irregular forms in general. We have a separate etymology for mice; should we also have separate etymologies for taught or bent?
I think the default assumption should be that, if not otherwise specified, it is not merely the lemma but all applicable inflected forms that descend from a given ancestor. If we give mūs as the ancestor of mouse, then this should already imply that the former's plural mȳs is the ancestor of the latter's plural mice. This gets rid of having to treat any irregularities that represent fossilized original regular alternations, no matter how far back they go. We are working on etymology sections here after all, not on historical morphology or historical phonology.
To be fair, without morphological and phonological supplementary information, etymology often becomes fairly opaque just-take-my-word-for-it business, and I do think Wiktionary could benefit from detailing these somewhere; I just do not think etymology sections are the place for this. --Tropylium (talk) 10:26, 4 October 2017 (UTC)
Having mūs as the ancestor of mouse does not immediately imply that mice derives from mȳs, or make it clear to the viewer. There is no duplication of information going on when etymology is given for mice, only clarification and necessary etymology. Apparently, someone rightly found that etymology should be specified for this non-lemma form, since an etymology section for mice already exists. Anyhow, I think this is being blown out of proportion. I would only ask for the option of specifying non-lemma etymologies where they are notable, as has already long been done with the article of am. Rua would delete all these etymology sections (despite am being an oft-cited non-lemma form for the purposes of reconstruction). When I make an etymology section on brought and did to explain their opaque etymologies, I don't want my edits nonsensically moved and crowded under the etymology pages of bring and do (or more often than not, simply deleted). These sorts of power trips by administrators not following the spirit of the guidelines (that they themselves wrote!) just make me incredibly discouraged from adding information to this website. 128.84.125.120 18:04, 4 October 2017 (UTC)
@Tropylium How would you handle the suppletion of the potential of olla, the perfect of sum, or in być? Putting etymologies on each of the forms is not going to be feasible. —Rua (mew) 23:36, 3 October 2017 (UTC)
There's only a limited amount of suppletion for any given case; we could assign an "etymological lemma" for each nonsuppletive group (e.g. lienee for the Finnish possessive stem). --Tropylium (talk)
Ew. —Rua (mew) 11:04, 4 October 2017 (UTC)
Seconded Rua. Anti-Gamz Dust (There's Hillcrest!) 00:34, 16 October 2017 (UTC)

Rollbacking/Patrolling

Hullo. I'd like to make a request for the rollbacking or the patrolling tool. Where is it at? --Barytonesis (talk) 08:09, 5 October 2017 (UTC)

@Barytonesis: An admin has to nominate you at WT:Whitelist I think (or is that only for auto patrol)? —Aryaman (मुझसे बात करो) 17:01, 5 October 2017 (UTC)
I think that rollback/patrol most often is applied to people who, for one reason or another, do not want to be administrators. Just apply to be an admin if you want some subset of the tools. - TheDaveRoss 17:04, 5 October 2017 (UTC)
@TheDaveRoss: I'd like to, but I don't think I've gathered enough trust yet. Would you endorse me? --Barytonesis (talk) 16:42, 14 October 2017 (UTC)

A more personal form of Google Translate just for Faroese

https://www.faroeislandstranslate.com/#!/ Justin (koavf)TCM 08:01, 6 October 2017 (UTC)

Entries with deprecated labels

The label (ordinal) used for ordinal numbers is listed in Category:Entries with deprecated labels with no suggested replacement. Should it even be listed there? DonnanZ (talk) 13:21, 6 October 2017 (UTC)

There is no replacement. There should not be a label there at all, add the category with {{head}} or {{cln}} instead. —Rua (mew) 13:27, 6 October 2017 (UTC)
The label automatically generates the category though, as well as saying what it is, so I don't see any reason to change it, e.g. nittende. Besides that, there is no suggestion to use {{head}} or {{cln}} in the above-mentioned category. DonnanZ (talk) 13:45, 6 October 2017 (UTC)
It's a misuse of labels, that's why it's deprecated. "Ordinal" doesn't specify a context in which a term is used. —Rua (mew) 13:57, 6 October 2017 (UTC)
Whoever set up the label didn't take that into account. It surely would be a simple matter to change the label to "ordinal number", although loads of entries would have to be revised. "cln|nb|ordinal numbers" works for generating the category, but a qualifier would then have to be added, which is twice as much writing, and a step backwards. DonnanZ (talk) 14:11, 6 October 2017 (UTC)
The other label (cardinal) when moused over shows "cardinal number", but this doesn't happen with (ordinal). It is not deprecated. DonnanZ (talk) 14:47, 6 October 2017 (UTC)
"ordinal number" is also not a valid context. Context labels should not be used to give definitions or disambiguate them. They are meant to describe how something is used, not what it means. —Rua (mew) 15:20, 6 October 2017 (UTC)
Have you checked ordinal number? Also see here. Nineteenth is an ordinal number. DonnanZ (talk) 15:35, 6 October 2017 (UTC)
Where are you getting the idea that I'm denying that these are ordinal numbers? I only said that a context label is not how this fact should be indicated. The entry should be categorised with {{cln}} or the cat2= parameter on {{head}}, but there shouldn't be a context label saying that it's an ordinal number. —Rua (mew) 15:47, 6 October 2017 (UTC)
I agree with RuaCat. Ordinal numbers should be categorized as such using |cat2= or {{cln}} but not using {{lb}}. —Aɴɢʀ (talk) 16:21, 6 October 2017 (UTC)
I still disagree, but as you are so keen on everything else but, perhaps you would like to come up with some usage examples. DonnanZ (talk) 22:31, 6 October 2017 (UTC)

Please, please reveal the cause of the revert in the edit summary

The default text "If you think this rollback is in error, please leave a message on my talk page." conveys no information. In as many words you could give some specifics about the actual problem.

Instead of this empty formula, it would be more helpful for all of us if you would just write the reason in the edit summary (this way we won’t have to bother you on your talk page).

A revert reduces someone's work to nothing. Please, please either correct the error or at least give a hint about the problem to avoid.

(Sorry for my poor English.)

Karmela (talk) 07:09, 8 October 2017 (UTC)

There are relatively few admins who have to go through a flood of edits by new contributors and see whether they belong in the dictionary or not. Given that, we simply do not have the time to give explanations tailored for every rollback that we make (if it wasn't clear, the default text is added automatically). I created the vote that added that default text because previously, it said nothing at all — obviously, this is much better, because you followed the instructions and left a message on Wikitiki89's page, where you can further discuss the edit. —Μετάknowledgediscuss/deeds 07:22, 8 October 2017 (UTC)
Thank you. For a (non-vandal) contributor the cause of the rollback is _never_ clear; s/he made the contribution supposing it was OK.
The list of typical errors cannot be very long. Would it be possible to choose from a premade list of explanations when reverting?
Karmela (talk) 16:37, 8 October 2017 (UTC)
We have such a list for deletions of entire entries. It would be a good start for what you recommend. I do not know whether it is readily done technically. DCDuring (talk) 18:59, 8 October 2017 (UTC)
  • @DCDuring, Metaknowledge: On en.wikipedia.org you can add two dropdown boxes below the edit summary box with some useful default summaries:
  1. Common edit summaries -- click to use
  2. Common minor edit summaries -- click to use
One can enable this gadget at https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-gadgets
An analogous dropdown box, "Common revert summaries -- click to use", should be technically similar.
Karmela (talk) 07:47, 14 October 2017 (UTC)
So, apparently technically possible. How do we get it? DCDuring (talk) 14:01, 14 October 2017 (UTC)
This is how: mw.loader.load('//en.wikipedia.org/w/index.php?title=MediaWiki:Gadget-defaultsummaries.js&action=raw&ctype=text/javascript'); Dixtosa (talk) 14:20, 14 October 2017 (UTC)
All this presupposes that the community here wants it. Is this the correct place and form to ask the Wiktionary community?
Karmela (talk) 08:52, 15 October 2017 (UTC)
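For anyone who wants to try the one-liner quoted above, here is a minimal sketch. It assumes the line goes into one's personal script page (for example Special:MyPage/common.js; that placement is an assumption, not something stated in this thread) and that the Wikipedia gadget works on Wiktionary unmodified, which nobody here has confirmed. `rawScriptUrl` is a hypothetical helper written for this illustration; only `mw.loader.load` and the URL itself come from the comment above.

```javascript
// Hypothetical helper: builds a raw-script URL in the same shape as the one
// quoted above. Assumes `title` contains only URL-safe characters.
function rawScriptUrl(host, title) {
  return '//' + host + '/w/index.php?title=' + title +
    '&action=raw&ctype=text/javascript';
}

var summariesUrl = rawScriptUrl(
  'en.wikipedia.org',
  'MediaWiki:Gadget-defaultsummaries.js'
);

// `mw` only exists inside a MediaWiki page, so the call is guarded to keep
// the snippet harmless when it is run elsewhere (for example in Node.js).
if (typeof mw !== 'undefined' && mw.loader) {
  mw.loader.load(summariesUrl);
}
```

A dropdown of common revert summaries would still need its own script; this only shows how an existing gadget script is pulled in from another wiki.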

Requests for deletion - restoring the list of nominations

In June 2017, WT:RFD was changed to no longer list items nominated at the right top of the page. I propose to restore the previous state. The current state is that categories are listed but not the items nominated themselves. That is not very useful, IMHO.

Therefore, I propose:

  • List nominated items again, as a list of items for all languages.
  • To support that, list all nominated items in Category:Requests for deletion instead of listing them only in per-language categories. This, again, is a restoration proposal.

--Dan Polansky (talk) 09:34, 8 October 2017 (UTC)

@Dan Polansky: If you want to see, say, the 5 French requests, click on the "▶" symbol next to the "Requests for deletion in French entries‎ (0 c, 5 e)". In my opinion, this is more useful than before, because now you can choose the language you want to see, as opposed to seeing a mess of entries in all languages. If we want to see a mess of entries in all languages, we may look at the normal TOC (the "Contents" list). I believe we also have the option of making all languages un-collapsed by default, though personally I'd prefer them collapsed as they currently are. --Daniel Carrero (talk) 09:58, 8 October 2017 (UTC)
I want to see the complete list, not by language. I only want to check whether all the items listed there were put to RFD page itself; if I did not want to do that, I would not want to see that right-floating portion of the page at all. --Dan Polansky (talk) 13:32, 8 October 2017 (UTC)
  • I support Dan's proposals. The language-specific RFD categories seem to be useless. —Μετάknowledgediscuss/deeds 16:05, 8 October 2017 (UTC)
    How do we know that no one uses the by-language listings? (BTW, I don't use them)
    BTW, I have noticed that we have a fair number of headings on request pages that do not have tags. Do we need yet another run against the XML to identify:
    1. Tagged L2s that are not on current request pages.
      1. Tagged L2s that are for archived or otherwise closed requests.
    2. Untagged L2s that are on the request pages.
    We'd also need to treat items that have been stricken or closed, but not yet archived.
    At the moment I don't see how this can systematically be accomplished with search. Though I doubt we would need such a run every two weeks, it might be useful every quarter or, at least, every year. DCDuring (talk) 18:47, 8 October 2017 (UTC)
@DCDuring For part 1, User:DTLHS/cleanup/request consistency. I don't think 2 is that important since entries on request pages get archived eventually. It's possible that there are false positives if pages are linked unusually on the request pages. DTLHS (talk) 19:27, 8 October 2017 (UTC)
@DTLHS: For 2 I was thinking about those requests that are entered without use of any request template. Today I noticed it when [[academic institution]] was added to RFDE. (The contributor has now added {{rfd}} at my request.) Perhaps what is needed is to discourage addition of new headers on request pages except through the relevant templates. DCDuring (talk) 20:35, 8 October 2017 (UTC)
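The two checks discussed above (tagged entries missing from the request page, and request-page headings whose entries carry no tag) amount to two set differences. The following is only a sketch of that comparison, assuming the two lists of titles have already been extracted somehow (from a dump or otherwise); the function name and the sample titles are made up for illustration and do not describe how User:DTLHS/cleanup/request consistency is actually generated.

```javascript
// taggedEntries: titles of entries carrying a request template such as {{rfd}}.
// listedEntries: titles that have a heading on the request page.
function compareRequests(taggedEntries, listedEntries) {
  var tagged = new Set(taggedEntries);
  var listed = new Set(listedEntries);
  return {
    // Check 1: tagged, but no heading on the request page.
    taggedButNotListed: taggedEntries.filter(function (t) {
      return !listed.has(t);
    }),
    // Check 2: heading on the request page, but the entry is not tagged.
    listedButNotTagged: listedEntries.filter(function (t) {
      return !tagged.has(t);
    })
  };
}

// Made-up sample data, purely for illustration.
var report = compareRequests(['alpha', 'beta'], ['beta', 'gamma']);
```

Stricken, closed, or already archived requests would have to be filtered out of both lists before the comparison, as noted above.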

Classification of forms with -n't

Hello. Rua, Equinox, Erutuon and I have been talking about the classification of don't, can't and other forms with -n't in User talk:TAKASUGI Shinji/2017#Contractions. I think they are verb forms just like did and could, according to Arnold M. Zwicky and Geoffrey K. Pullum (Cliticization vs. Inflection: English n’t, Language 59(3), 1983, pp. 502-513), but not everyone agrees with their analysis. In my opinion, we shouldn't use “Contraction” as a header because it is not a part of speech, and we should replace it with a part of speech we can reasonably assign. What do you guys think? — TAKASUGI Shinji (talk) 10:09, 8 October 2017 (UTC)

Our level 3 headers are for more than just part of speech. Suffix isn't a part of speech either. We have to use "contraction" because for most cases there is no other way to do it. Look at Category:Middle Dutch contractions for example. So that argument is not very compelling.
As for these contractions specifically, I don't see how they can be considered anything else. They aren't considered verb forms in any standard grammar of English. One paper is interesting, but we should follow linguistic consensus on the matter and not the opinion of a single paper. —Rua (mew) 11:44, 8 October 2017 (UTC)
An analysis of well-known linguists and lack of analysis don't have the same value. I find their analysis convincing. You can only say don't you? and not *do not you?, from which we must conclude that don't is not a contraction of do not. — TAKASUGI Shinji (talk) 11:41, 9 October 2017 (UTC)
What do you mean by "standard grammar"? The Cambridge Grammar of the English Language (naturally, because it was co-written by Pullum) uses the inflectional-suffix analysis of -n't, and its auxiliary verb paradigms show negative forms corresponding to each of the finite forms. Certainly the more traditional version of English grammar that I learned as a kid didn't recognize negative inflected forms, but it wasn't particularly linguistically rigorous and shouldn't be the basis for our decisions on Wiktionary. — Eru·tuon 21:52, 9 October 2017 (UTC)
I'm in favor of the analysis in which -n't is an inflectional suffix and forms like don't are verb forms (and I could go on about that), but the essential thing is to at least be consistent. I don't think it's consistent to label -n't as a suffix (as it's been labeled since 2008) and then call forms like won't contractions. A contraction is basically the combination of a full word plus one or more clitics that are derived from orthographic words, but are not spelled as words in this case. So for won't to be a contraction, -n't has to be a clitic (a variant form of not). The other option is for -n't to be a suffix and won't a verb form. We need to pick an analysis and stick to it. It would be fine to include usage notes explaining the alternative analysis, or alternative inflection tables, or categories, but the headers and headword templates should stick to a single analysis. — Eru·tuon 19:20, 8 October 2017 (UTC)

Any idea for a new "Thesaurus:" shortcut?

WS:good → Thesaurus:good still works, as it should.

But "WS:" does not make a lot of sense anymore, because now "Wikisaurus" is called "Thesaurus".

Then again, "TS:good" and "TH:good" are unavailable, because they are language codes. Is there a good shortcut available? If not, I guess we'll have to keep using only "WS". --Daniel Carrero (talk) 18:37, 9 October 2017 (UTC)

THES seems the obvious choice. Equinox 18:44, 9 October 2017 (UTC)
Alright, I guess. I'm not entirely happy with a mere reduction from 9 to 4 letters, but maybe that's the best option we have.
Maybe THE would be better ("THE:good" → Thesaurus:good), but "the" is the ISO code for Chitwania Tharu (w:Tharu languages). Can't we use it anyway? --Daniel Carrero (talk) 14:31, 11 October 2017 (UTC)
But if we can't use ISO codes then we can't use any three-letter code, even if ISO hasn't used it yet. It should be considered reserved for future ISO use. Equinox 15:07, 11 October 2017 (UTC)
That may be true, but we have violated that rule before. We have "cat" and "mod" as working aliases. See CAT:English nouns and MOD:sandbox. "cat" means Catalan, which seems unlikely to be used by Wikimedia because they have settled for https://ca.wiktionary.org/ and https://ca.wikipedia.org/ (using "ca", not "cat"). "mod" is Mobilian Jargon language. --Daniel Carrero (talk) 15:22, 11 October 2017 (UTC)
To be clear, I would support using "the". ("THE:good" → Thesaurus:good) --Daniel Carrero (talk) 15:24, 11 October 2017 (UTC)
I would prefer "THS" which is the language code for the w:Thakali language, a Nepali Sino-Tibetan language with 5,900 native speakers. Chuck Entz (talk) 02:04, 12 October 2017 (UTC)
SYN. —suzukaze (tc) 02:28, 12 October 2017 (UTC)
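To make the constraint in this thread concrete, here is a small sketch of the "reserved for language codes" rule. It assumes the strict reading given above (any three-letter prefix counts as reserved, whether or not ISO has assigned it yet) and extends it to two-letter prefixes because "TS" and "TH" were ruled out for the same reason; that extension, and the function name, are assumptions made for this illustration.

```javascript
// Strict rule of thumb from the discussion: any two- or three-letter
// alphabetic prefix is treated as reserved for (current or future)
// language codes, regardless of case.
function isReservedAsLanguageCode(prefix) {
  return /^[a-z]{2,3}$/i.test(prefix);
}

// Candidates that came up in the thread.
var candidates = ['TS', 'TH', 'THE', 'THS', 'SYN', 'THES'];

// Candidates that pass the strict rule.
var unreserved = candidates.filter(function (c) {
  return !isReservedAsLanguageCode(c);
});
```

Under the strict rule only "THES" survives. The existing "CAT" and "MOD" aliases show that the rule has been relaxed before, so the check describes the objection rather than a hard technical limit.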

Linking active policy proposals

WT:EL should probably link to WT:FORMS in some fashion. I imagine there are also other cases like these, where EL is a dead-end and the actual documentation is hidden away in some obscure undocumented location.

Some might protest that the former is policy while the latter are often drafts, but as long as this is indicated, I do not see any problem in linking. Should we maybe settle on some specific more mildly worded section hatnote, such as "Read more:" (instead of "Main article:" or the like)?

Interestingly, WT:Policies and guidelines, despite being prominently linked from the policy headers ({{policy}}, {{policy-TT}}, {{policy-DP}}), is currently categorized as "inactive". There's Category:Wiktionary think tank policies, but it's not especially user-friendly. --Tropylium (talk) 14:54, 11 October 2017 (UTC)

New section "Synchronic analysis" in WT:EL

w:en:Synchrony and diachrony

It isn't useful to have only a historical analysis (the current "Etymology" section at en.wiktionary) or only a modern analysis.

Example: атония d1g (talk) 08:51, 12 October 2017 (UTC)

We include this in etymology, but the usual wording is "equivalent to". —Rua (mew) 13:36, 12 October 2017 (UTC)

Linking to Wikimedia Commons categories

Hello, I would like to know why the wiktionary entries are not linked to the Wikimedia Commons categories (by using statements at Wikidata). For example the entry Varvel can be connected to commons:Category:Vervels. It can only help the readers to (visually) learn more about that particular word. Fructibus (talk) 09:45, 12 October 2017 (UTC)

@Fructibus: Have you seen Wiktionary:Wikidata? We do in fact have some links to that sister project via local templates. E.g. tea. —Justin (koavf)TCM 09:50, 12 October 2017 (UTC)
@Koavf: Thanks a lot! By the way, was there any discussion about including the wiktionary pages into Wikidata, connecting with the Wikipedia/Commons pages? Then the Commons link would show automatically for the Wiktionary pages, in all languages. At this moment, if you want to link to Commons in all language articles, that means you have to edit 67 Wiktionary pages. Fructibus (talk) 19:05, 12 October 2017 (UTC)
@fructibus: "Was there any discussion about including the wiktionary pages into Wikidata" Oh yes, quite a bit. And there are currently options to include Wiktionary entries in Wikidata but I don't feel like I can do a good job of summarizing all of that. You may wish to see the equivalent page here: d:Wikidata:Wiktionary. I 100% agree that we should use Wikidata to make sister links--you may wish to talk with User:CodeCat, now User:Rua (I had forgotten he [she?] was renamed for some reason) about that. —Justin (koavf)TCM 19:16, 12 October 2017 (UTC)
People have expressed dislike for Wikidata IDs, so we probably won't be using Wikidata for anything after all. I tried. —Rua (mew) 19:45, 12 October 2017 (UTC)
It will happen, it's just that at the moment the advantages aren't completely obvious. – Jberkel (talk) 20:56, 12 October 2017 (UTC)
@Jberkel: Isn't this one of them? —Justin (koavf)TCM 22:26, 12 October 2017 (UTC)
The page tea already has {{wikidata|Q6097}}. Changing to a template like {{sister links|Q6097}} could fetch all sister project links with automatic update of new links, deleted ones or renamed ones. The problem is that a word may have multiple senses that can be connected to multiple equivalent pages on Wikidata. --Vriullop (talk) 08:22, 13 October 2017 (UTC)
@Koavf: I'm all for Wikidata, it's just that to some editors the advantages are less clear at the moment. @Vriullop: yes that would be great, via Wikidata one should be able to fetch all the other relationships. Couldn't {{senseid}} (or something similar) be used for fine-grained associations? – Jberkel (talk) 14:11, 13 October 2017 (UTC)

@Jberkel - @Koavf - @Vriullop - @Rua - Sorry, I am new to Wiktionary but I really don't see the reason for not linking the Wiktionary definitions in Wikidata. For example the Wikipedia article Water has a link to the Wiktionary definition, at the bottom of the article. Why not show it in the middle-left side of the page, near the other sister project links (Commons, Wikibooks, Wikiquote)? This way all the 220 Wikipedia articles can show the link to the Wiktionary definition in their respective language (if it exists), without the need to actually edit the 220 Wikipedia articles. Fructibus (talk) 18:49, 13 October 2017 (UTC)

@Fructibus: I agree as well but there were concerns that it's too difficult, impossible, or possible-but-difficult and not actually helpful. I disagree with the latter two but it's definitely an undertaking to be sure. Then again, so is everything. —Justin (koavf)TCM 19:01, 13 October 2017 (UTC)
@Koavf: Very nice answer, gives a feeling of touching a perfection in language, thanks :) - Fructibus (talk) 23:39, 13 October 2017 (UTC)

Ōbaku tō-on/sō-on readings

Found this video: Heart Sutra chanted by Ōbaku monks; is the ruby a Chinese pronunciation or as Wikipedia states: tō-on/sō-on readings? Here's a supporting resource. Domo, --POKéTalker (talk) 04:49, 13 October 2017 (UTC)

Personally it sounds suspiciously(?) too much like accented Mandarin ( () (ji)?  () (e)?), possibly dated ( (けん) (ken)), but I also don't know what I'm talking about. Maybe tō-on is Mandarin. —suzukaze (tc) 05:11, 13 October 2017 (UTC)

For reference, comparison of Japanese, Ōbaku reading (sō-on?) and standard Chinese:

 (かん) () (ざい) () (さつ) (ぎょう) (じん) (はん) (にゃ) () () (みっ) () ()
Kanjizai Bosatsu gyō jin hannya haramitta ji
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...
 (クヮン) () (サイ) () () (ヘン) (シン) () () () () () () ()
K(w)antsusai Pusa hen shin poze poromito su
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...
觀自在菩薩行深般若波羅蜜多時 [MSC, trad.]
观自在菩萨行深般若波罗蜜多时 [MSC, simp.]
Guānzìzài Púsà xíng shēn bānruò bōluómìduō shí [Pinyin]
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...

Though there is probably no clear romanization for the monks' chanting, should kanji with these Ōbaku on-readings be provided as sō-on? Just wondering. --POKéTalker (talk) 02:21, 14 October 2017 (UTC)

(The automatic pinyin generated by zh-usex is not correct because it uses the most common readings. [13] has pinyin transcription that seems to be OK —suzukaze (tc) 02:36, 14 October 2017 (UTC))
  • @POKéTalker, to confirm / clarify -- it sounds like you're asking if there is value in adding sōon readings to the individual kanji entries. If that's your proposal, I have no particular opposition, so long as the readings are clearly labeled as sōon (provided that's the correct reading category). ‑‑ Eiríkr Útlendi │Tala við mig 04:36, 14 October 2017 (UTC)
I think POKéTalker wants to make sure that they are indeed tou'on first. —suzukaze (tc) 20:48, 14 October 2017 (UTC)

TabbedLanguages default and English links in definitions

Yesterday, following Wiktionary:Beer parlour/2017/July#TabbedLanguages edit: default to English for unmarked links, I made a change to MediaWiki:Gadget-TabbedLanguages.js so that the default language would always be English, if no language is specified. This means that it's no longer necessary to use {{l|en|...}} in definitions. I'd like to ask the people who do this to use regular links from now on. —Rua (mew) 15:40, 14 October 2017 (UTC)

But not everyone uses tabbed languages. DTLHS (talk) 15:45, 14 October 2017 (UTC)
I agree that this would be a sensible default for those without it, too. But then we'd need a separate gadget. —Rua (mew) 15:46, 14 October 2017 (UTC)
If we made a separate gadget it could be smarter, such as linking derived terms to the correct language, while linking terms in definitions to English. DTLHS (talk) 15:53, 14 October 2017 (UTC)
Also, what happened to the plan to make TL the default? We had a vote and everything. —Rua (mew) 15:51, 14 October 2017 (UTC)
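The defaulting rule described at the top of this thread (a link with no language specified is treated as pointing at English) can be illustrated roughly as follows. This is not the code in MediaWiki:Gadget-TabbedLanguages.js; `targetLanguage` is a hypothetical function written for this illustration, and it assumes the language is carried in the link's "#Language" fragment with underscores standing in for spaces.

```javascript
// Returns the language section a link points at, falling back to English
// when the link carries no "#Language" fragment (or an empty one).
function targetLanguage(href) {
  var hashIndex = href.indexOf('#');
  if (hashIndex === -1 || hashIndex === href.length - 1) {
    return 'English';
  }
  // Assumption: fragments use underscores for spaces, e.g. "#Ottoman_Turkish".
  return decodeURIComponent(href.slice(hashIndex + 1)).replace(/_/g, ' ');
}

var plainLink = targetLanguage('/wiki/water');
var dutchLink = targetLanguage('/wiki/water#Dutch');
var ottomanLink = targetLanguage('/wiki/kitap#Ottoman_Turkish');
```

A separate, smarter gadget of the kind suggested above would need more context than the link itself, for instance which section of the entry the link sits in, to send derived terms to the entry's own language instead.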

Singapore terms

Just a heads up: a while ago, a Singapore schoolteacher encouraged his students to add Singaporean English terms to Wiktionary (which is, on the whole, a good thing). We seem to have a new batch of these happening at the moment, e.g. bus captain, taxi uncle. So be ready for some cleanup. Equinox 08:23, 16 October 2017 (UTC)

Special:Contributions/86.30.235.176

They've made some drastic changes to pronunciation which might not be correct. Anyone who knows Old English, would you mind taking a look? --Robbie SWE (talk) 18:12, 16 October 2017 (UTC)

@Robbie SWE: It looks like, as far as Old English pronunciation goes, they're changing sequences of /h/ and a sonorant to sonorant and voicelessness diacritic (for instance, /hr/ to /r̥/). That might be correct in a pseudo-phonetic transcription, but I don't know if it is an accepted phonological analysis. — Eru·tuon 21:46, 16 October 2017 (UTC)

Translating both ways

Hello!

When I started working on a project in which I would like to use translations from Wiktionary, I noticed that Wiktionary translations are created separately for each language edition. That means that even if the English Wiktionary contains the translation of a word into another language, e.g. Mandarin, the Wiktionary in that language may not have a translation of that word into English.


One example:

library

- the list of translations contains the translation 图书馆

zh:图书馆

- the Chinese wiktionary page does not have a translation for that word into English (the site contains: 英语(English):[[]])

Since these translations are symmetric, it should be possible to add a large number of translations to these Wiktionaries with much less effort. However, there will surely be a few issues that have to be resolved first.


TheDaveRoss already replied to me by email, stating some issues:

"1. There are numerous Wiktionaries, each one maintained by a distinct community of volunteers. Each has its own policies regarding what may or may not be included, how translations are to be added, etc. It is very important that you coordinate with the local community wherever you add content to ensure that the content meets their criteria.

2. Translations are very nuanced (as you are probably aware). Automated addition of translation has happened at small scales in the past, however close oversight by a person familiar with both languages is required. Even translations which appear to be symmetrical may require special annotation in the target language which is not included in the original language.

3. The source material may not be correct, and automation can propagate errors. The English Wiktionary, and a few other large Wiktionaries, have enough contributors that many errors are caught quickly. That is not the case for the majority of other languages, so it is important to ensure any additions to other languages are correct"

4. Attribution to the original contributor will be important

For example, adding the new words as proposed translations first and then checking them for correctness would reduce the risk of wrong translations while still adding some value right away.


What do you think about writing a script to do this? What other problems are there with this? Do you know about previous attempts to do this? I hope this could be very useful!

Noahho (talk) 01:21, 17 October 2017 (UTC)

@Noahho: Something similar to two-way translation could work if we can agree on how we will use Wikidata and how it will be connected across Wiktionaries. Unfortunately, how that would work is very difficult to determine. —Justin (koavf)TCM 02:13, 17 October 2017 (UTC)
@Noahho Hello Noah. Can I please know what project you are working on? If the aim of the project is to extract translations of foreign-language terms into English from Wiktionary, it would be much easier to extract from the pages on English Wiktionary; e.g. for simplified 图书馆 it would be at 圖書館, which says "library". Wyang (talk) 02:17, 17 October 2017 (UTC)
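The proposal in this thread, turning an existing translation around and offering it to the other Wiktionary for review, could be sketched along these lines. Everything about the data shape is an assumption made for this illustration: the record fields, the `needs-review` status, and the idea that a wiki's language code equals the code of the language it is written in. It does not touch any wiki, and it deliberately marks nothing as verified, in line with the concerns about oversight and attribution quoted above.

```javascript
// Hypothetical input: one record per translation already present on a
// source Wiktionary (here: English "library" translated into Chinese).
var sourceTranslations = [
  { sourceWiki: 'en', term: 'library', lang: 'zh', translation: '图书馆' },
  { sourceWiki: 'en', term: 'library', lang: 'fr', translation: 'bibliothèque' }
];

// Turns each matching record around, producing *candidate* translations for
// the target-language Wiktionary. Each candidate is flagged for human review
// and keeps a pointer to the page it was derived from, for attribution.
function proposeReverseTranslations(records, targetLang) {
  return records
    .filter(function (r) { return r.lang === targetLang; })
    .map(function (r) {
      return {
        targetWiki: r.lang,
        term: r.translation,
        lang: r.sourceWiki,
        translation: r.term,
        status: 'needs-review',
        derivedFrom: r.sourceWiki + ':' + r.term
      };
    });
}

var proposals = proposeReverseTranslations(sourceTranslations, 'zh');
```

Whether and how such candidates may be posted is up to each local community, and a reviewer familiar with both languages would still have to confirm every one.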

Turkish vs Ottoman Turkish

The Balkan language loanwords from Turkish should technically be Ottoman Turkish, since that's the era they entered those languages, right? Is the only main difference the script being Arabic vs. Latin? I realized I need to go back and change a bunch of Romanian and some other entries. Word dewd544 (talk) 16:12, 17 October 2017 (UTC)

Yes, they should generally be Ottoman Turkish. The script is one significant difference, but if I’m not mistaken there’s also a huge difference in lexicon, where a large portion of the Ottoman Turkish lexis consists of loanwords from Persian and Arabic that were later stamped out of usage and replaced with neologisms by Atatürk. — Vorziblix (talk · contribs) 21:40, 17 October 2017 (UTC)
There are also grammatical differences. That said, I would personally prefer to treat them as a single language, and I don't think we lose much by claiming Balkan loanwords are from Turkish rather than Ottoman Turkish when the word in question is itself the same. —Μετάknowledgediscuss/deeds 21:44, 17 October 2017 (UTC)