

OTRS call for help

Dear colleagues. The volunteer response team (aka OTRS) is currently lacking volunteers to take care of questions regarding the sister projects wikibooks, wikinews, wikiquote and wiktionary. I'd like to invite you to volunteer at meta:OTRS/Volunteering. If you have any questions, please feel free to contact me. Thank you in advance for considering. --Krd (talk) 08:00, 1 August 2016 (UTC)

Why do we have both Category:Mongolian terms derived from Mandarin and Category:Mongolian terms borrowed from Mandarin?

I just noticed that one of the templates in мөөг creates a (currently redlink) Category:Mongolian terms borrowed from Mandarin but we already have a similar category populated by other templates with a very similar name and semantics: Category:Mongolian terms derived from Mandarin, containing terms like бууз, мантуу, etc. - What to do? — hippietrail (talk) 09:12, 1 August 2016 (UTC)

Borrowing is a subset of deriving, "derived from" is the generic category that holds terms not categorised more specifically. —CodeCat 11:47, 1 August 2016 (UTC)
If Mongolian borrows a term from another language that borrowed it from Chinese, that term can't be described as a Mongolian borrowing from Chinese; thus the need for a separate category. Also, the "derived from" node needs to be there to keep the data structure parallel with other categories using sister nodes such as "inherited from". Chuck Entz (talk) 12:41, 1 August 2016 (UTC)
I hadn't noticed this level of subdivision before so I'll leave it to you guys, thanks! — hippietrail (talk) 13:53, 1 August 2016 (UTC)

First LexiSession: cat

Dear all,

Apologies for writing in non-native English; please fix any mistakes you may encounter in these lines!

Wiktionary Tremendous Group, a cabal nice and open gathering of Wiktionarians, is happy to introduce a new collective experiment: LexiSession.

So, what is a LexiSession? The idea is to coordinate a massive number of contributors from different languages to focus on a shared topic, to enhance all projects at the same time! It may remind you of the Commons monthly contests, but here everyone is a winner! For this first LexiSession, we decided on a month - until the end of August - to make friends with a cat! Not only the cat entry, but also Wikisaurus:cat and other pages dealing with the vocabulary one may need to talk about cats: adjectives, verbs and expressions.

You're welcome to contribute alone, or to create a local project and organize an edit-a-thon in your region. We will probably do at least one edit-a-thon in Lyon soon, and another in Paris during the French WikiConference. Please share your contributions here! You can also have a look at what other Wiktionarians are doing, on the LexiSession Meta page. We will discuss the processes and results in Meta, so feel free to have a look and suggest topics for the following LexiSession.

Thank you for your attention, and I hope you will be interested in this new way of contributing. I'll get back to you later for an update! Noé (talk) 21:57, 1 August 2016 (UTC)

Update: in French Wiktionary, we started a thesaurus about cat in French, and that's cool, we have plenty of red links to create! Noé (talk) 09:51, 4 August 2016 (UTC)

Ok guys, last day of this first LexiSession. It's quite a success for French Wiktionary, as we made three thesauri: thesaurus about cat in French, thesaurus about cat in Breton and thesaurus about cat in English. Sadly, it was quite hard to animate our fellow on other wiktionaries. So, we will try to make a better effort for the next ones. Yes, there will be more cross-wiki contribution events! In September, LexiSession 2 is about cartography and street types. Please, let me know if anyone have done anything about cat I haven't been noticed. See you soon. Noé (talk) 10:46, 31 August 2016 (UTC)

I never noticed this. I'll try to add some cat related stuff today while I still can. --WikiTiki89 12:43, 31 August 2016 (UTC)
I see that Wiktionnaire's thesaurus pages are far more comprehensive than the ones here! I have created a Wikisaurus entry for chat (which, to my knowledge, is the first Wikisaurus entry for French). :) Andrew Sheedy (talk) 18:10, 31 August 2016 (UTC)
@Noé (Note also that "fellow" cannot be used as an uncountable noun. Moreover, the word is only rarely used to mean a friend or comrade, which is the primary sense according to Wiktionnaire. I would rather say "those on other wiktionaries" instead of "our fellow[s] on other wiktionaries". Alternatively, one could use it as an adjective (which is much more common than the noun) and say "our fellow editors on other wiktionaries".) Andrew Sheedy (talk) 19:21, 31 August 2016 (UTC)
Instead of "Please, let me know if anyone have done anything about cat I haven't been noticed." → "Please, let me know if anyone has done anything about “cat” that I haven't noticed." ("been noticed" is in the passive voice). Andrew Sheedy (talk) 19:28, 31 August 2016 (UTC)
"That" is optional. --WikiTiki89 19:32, 31 August 2016 (UTC)
Hmmm...I'm not sure I agree that it works without it in this case. Strictly speaking, it could be omitted, but it seems rather awkward to me in this particular sentence. Andrew Sheedy (talk) 20:01, 31 August 2016 (UTC)
Interesting. Thank you very much for your comments about the language, and your creation is awesome; I am very happy that you made something. I like what the Wikisaurus pages are in English Wiktionary, but I think there is much more vocabulary to gather in a thesaurus. So it is a different kind of annex in French Wiktionary, with many more terms, sometimes too many, but homemade, crafted by hand. Plus, we added links to those thesauri in the equivalent Wikipedia pages, for example at the bottom of Chat. We aim to bring people to Wiktionary from Wikipedia, but so far we cannot evaluate whether it has worked. You can still contribute on cats; it was just a short impulse, but we can do it again later, as a cat has nine lives. Noé (talk) 22:36, 31 August 2016 (UTC)


It seems we have categories for English words with consonant pseudo-digraphs and English words with vowel pseudo-digraphs despite the fact that there is no such thing as a "pseudo-digraph". A combination of letters is either a digraph or it isn't. These are cases of letter combinations that look like digraphs, but aren't. Personally, I don't really see the need for these categories, and I don't think we should be using novel orthographic terms or concepts on Wiktionary. What do other folks think? Kaldari (talk) 06:01, 2 August 2016 (UTC)

A digraph is a sound written with two letters, and a pseudo-digraph is a combination that looks like it should be a single sound, but isn't. For instance, zoology doesn't rhyme with eulogy, and a ramshorn is a ram's horn, not a ram shorn. Chuck Entz (talk) 08:05, 2 August 2016 (UTC)
It is sometimes hard to decide what should go in them and they may need to be renamed at least slightly (to use "terms" like everything else). I'm also sceptical that they're maintainable. - -sche (discuss) 15:24, 2 August 2016 (UTC)
I don't think pseudo-digraph is accurate terminology. Is there any better name we can come up with? --WikiTiki89 15:27, 2 August 2016 (UTC)
I've got into the habit of adding this consonant category to certain entries, but I also dislike the non-standard term pseudo-digraph. I don't see maintainability as a big argument against it, however: I hope that one day we can do it with a bot, based on the spelling and the pronunciation, but that is a seriously hard problem and the solution might be decades away. But we already maintain lots of awkward things manually. Equinox 01:53, 3 August 2016 (UTC)
Re: Kaldari these aren't digraphs but they may appear to be. Hence Pseudo-digraph. Of course there's such a thing. Renard Migrant (talk) 11:22, 31 August 2016 (UTC)

{{lb}} linking to Categories

Why does {{lb}} work with some categories and not others? For example it won't work with Category:Languages. Is it because no one has thought about it? DonnanZ (talk) 15:34, 2 August 2016 (UTC)

I don't understand what you mean. Could you give some examples of what you are trying to do? --WikiTiki89 17:03, 2 August 2016 (UTC)
For example {{lb|nb|rail}} works fine for Category:nb:Rail transportation, but {{lb|nb|language}} or {{lb|nb|languages}} doesn't work if I add it to, say armensk, which is also an adjective. The category Category:nb:Languages has to be added instead. A little more typing. There are a number of categories like this, but I can't remember which ones now. DonnanZ (talk) 17:55, 2 August 2016 (UTC)
Oh that's what you meant. All the categories, auto-linking, and display text stuff are configurable in Module:labels/data. --WikiTiki89 18:01, 2 August 2016 (UTC)
Hmm, OK, that's all Greek to me. Strangely enough I can't find "language" or "languages" listed as a recognised label, assuming it should be listed alphabetically. Is that the list of labels that have been set up? DonnanZ (talk) 18:38, 2 August 2016 (UTC)
If you can't find it, it's probably not there and that is why it doesn't automatically categorize. --WikiTiki89 18:42, 2 August 2016 (UTC)
Oh great, a serious omission. Can that be rectified please? DonnanZ (talk) 18:58, 2 August 2016 (UTC)
I could do that, but there would be very few cases where it would be needed. In the case of armensk, "language" is not a context label. It's a topic category. A context label would mean that this word is only used in the context of languages, but that is not true, it simply refers to a language. --WikiTiki89 19:04, 2 August 2016 (UTC)
Right. {{lb|nb|language}} is a misuse of {{lb}}; labels indicate that a word is restricted in usage to a certain context, but I doubt that people only say armensk when talking about language/linguistics but not when talking about e.g. botany and mentioning in passing that a certain cited book was translated from armensk. The usual and proper thing to do to add a list category is just add the category manually or via {{C}} (see how it's used on French letter). - -sche (discuss) 19:05, 2 August 2016 (UTC)
Additionally, {{C}} also allows you to add multiple categories more easily, for example: {{C|en|Fruits|Trees|Pome fruits|Mythological plants}} at apple. --WikiTiki89 19:15, 2 August 2016 (UTC)
Normally, {{lb}} combined with language code and category will automatically link to a category, whether it's meant to or not, so I'm not sure that that can be classed as misuse. Admittedly I wasn't aware of {{C}} and I'm sure I will be making use of it now. I was also wanting to combine the functions of a qualifier with that of a label and category, but I obviously can't do that here. There shouldn't be any confusion with the armensk entry, adjective and noun are clearly separated; "en armensk forfatter", "en armensk bok", "armenske bøker", a book translated from armensk is obviously referring to Armenian, the language, not the adjective or the book itself. DonnanZ (talk) 20:01, 2 August 2016 (UTC)
I agree with -sche that {{lb|xx|language}} is a misuse. If the term is a language, then the definition should say so. Context labels should not be used to say what the definition should, and certainly not just to categorise. Categorising should always be secondary to the label; something that comes as part of using the label, rather than a reason to use the label in the first place. Perhaps we should start placing restrictions on labels to combat misuse. —CodeCat 20:23, 2 August 2016 (UTC)
I wouldn't class that as one of your most momentous ideas. Use of {{lb|xx|category}} is a good shortcut if used properly, and doesn't cause any harm at all. DonnanZ (talk) 21:08, 2 August 2016 (UTC)
It does cause harm. Let me give an example with one of our most commonly misused labels, "anatomy". The "anatomy" label for the term glomerulus is justified because no one who doesn't know anatomy would know what that is, so anatomy is a context in which this term is understood. However, it would be a misuse to put an "anatomy" label on the term kneecap, because everyone knows what a kneecap is and the word can be expected to be understood in practically any context (if you say "I fell and hurt my kneecap", you are not having a discussion about anatomy). Thus, putting the "anatomy" label at kneecap would mislead people into thinking that it is as much a technical term as glomerulus, and that would be harmful. --WikiTiki89 14:50, 3 August 2016 (UTC)
You're calling it misuse, but obviously not everyone agrees with you, if the translations are anything to go by. Some use Category:Anatomy for kneecap, others Category:Skeleton. But that's so-called "misuse" of a category, not of {{lb}}. DonnanZ (talk) 17:31, 3 August 2016 (UTC)
As I just said, "anatomy" is one of our most commonly misused labels. Also, you are confusing categories with labels. The label gives the context, the category just adds the term to a category so that it can be found by browsing the category. The fact that some labels also categorize is just a matter of convenience to not have to put both the label and the category, but that doesn't mean all categories should be given as a label. --WikiTiki89 20:08, 3 August 2016 (UTC)
  • Nope, no confusion. Whether all categories can be accessed via {{lb}} is seemingly another matter that I have no control over. It should be up to the editor's discretion whether they use {{Category|xx|category}}, {{C|xx|category}} or {{lb|xx|category}}, depending on circumstances, and shouldn't be deliberately restricted in this way. DonnanZ (talk) 23:16, 3 August 2016 (UTC)
    It's not deliberately restricted. {{lb}} is meant to add labels not categories. Some of these labels also categorize for convenience, but not all of them do and not all of the categories are textually equivalent to the label, thus there is no way to automatically categorize these labels. Every label that wants to categorize needs to be added to the module so that the module would know the name of the category to use for that label. --WikiTiki89 00:21, 4 August 2016 (UTC)
  • I realise that, but when a request for inclusion is declined, that becomes a deliberate restriction. DonnanZ (talk) 08:27, 4 August 2016 (UTC)
    If you give me an example of where you would use it, I would add it. But so far, I disagree with your use cases. It's important to have a real example in order to actually identify the correct category, and whether we should redirect the label to "linguistics", and things like that. --WikiTiki89 14:56, 4 August 2016 (UTC)
  • No, I'm not confusing languages with linguistics; languages are always categorised as such, linguistics covers related matters, and I wouldn't use the languages label for anything other than actual languages. Most if not all languages in Norwegian have the same spelling as the adjective, which happens in English too. Therefore it would be useful to use a label {{lb|xx|language}} for the language entry. I have already mentioned armensk, other examples are fransk, tysk, japansk, spansk, portugisisk and so on. It's no big deal, but it would be a great convenience, a clear marking and pretty harmless. DonnanZ (talk) 23:11, 4 August 2016 (UTC)
    • But "languages" is not a context. These words are not only used in the context of talking about languages. They're used generally, without context. If I ask "Do you speak tysk?", people will understand regardless of what was being discussed before, and regardless of setting. Therefore the label "languages" is a misuse on these entries. Labels should not, ever, be used to clarify or disambiguate definitions. If the definition by itself is unclear, that's what you'd use a gloss for: {{gloss|language}} after the definition. —CodeCat 23:42, 4 August 2016 (UTC)
  • I give up, some people like to make mountains out of molehills. I don't particularly like {{gloss}} anyway, the note is not in italics. DonnanZ (talk) 10:15, 5 August 2016 (UTC)
    • You mean you wanted to define armensk as "(language) Armenian"? I don't understand what you mean, but the others are right that {{lb}} should not be used to generate a label "language" and categorize an entry. If you just want to categorize the entry without generating a label, use {{C}} or a simple category link. The use of context labels was voted and approved at Wiktionary:Votes/pl-2009-03/Context labels in ELE v2. The voted text says: "A context label identifies a definition which only applies in a restricted context." --Daniel Carrero (talk) 17:03, 8 August 2016 (UTC)
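
As a rough illustration of the lookup WikiTiki89 describes above (a label only categorizes if it has an entry telling the module which category to use), here is a minimal Python sketch. It is not the real code: the actual data lives in Module:labels/data and is written in Lua, and the table contents, display texts and the `render_label` name below are assumptions made up for the example.

```python
# Hypothetical label table. The real data is kept in Module:labels/data (Lua);
# these entries and their display texts are assumptions for illustration only.
LABELS = {
    "rail": {"display": "rail transport", "topical_category": "Rail transportation"},
    "anatomy": {"display": "anatomy", "topical_category": "Anatomy"},
}


def render_label(lang, label):
    """Return (display text, category or None) for one context label."""
    entry = LABELS.get(label)
    if entry is None:
        # Unknown label: it is still displayed, but no category is added,
        # because nothing says which category name corresponds to it.
        return f"({label})", None
    category = f"Category:{lang}:{entry['topical_category']}"
    return f"({entry['display']})", category


print(render_label("nb", "rail"))
print(render_label("nb", "language"))
```

With this table, "rail" yields Category:nb:Rail transportation, while "language" is shown as plain text and adds no category, which matches the behaviour DonnanZ reported.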

Too many pictures?

I wonder if we need a policy on where/when to use pictures in entries. For example, having a picture of a Bible at Bible makes sense, but we also have one at swear on a stack of Bibles. Ditto for on it like a car bonnet. I think that having pictures for purely figurative phrases is actually misleading (it might suggest that a real Bible, or a real car bonnet, is involved), and worse than not having them. I definitely don't feel that every entry, when finally completed, ought to have a picture. Only some entries (usually those for literal things, like moon or dog) benefit from them. Thoughts? Equinox 01:51, 3 August 2016 (UTC)

  • Yes, inappropriate images (as in the two you mention) could be removed without discussion (I have so removed). SemperBlotto (talk) 06:20, 3 August 2016 (UTC)
    • As well as that, there is no need to add an image to an entry if it is linked to a Wikipedia article the image is taken from. That's pointless. DonnanZ (talk) 08:04, 3 August 2016 (UTC)
      • I agree that some image removals, such as those mentioned, are clear-cut. But what about marginal cases? Where should any necessary discussions of candidates for removal (and appeals of removals) be? Tea Room? Rfd? I don't think a new page is necessary now, nor will it ever be. DCDuring TALK 11:01, 3 August 2016 (UTC)
      • It makes no sense to me that the mere presence of a Wikipedia link on an entry page would mean that that entry should not have any images. Granted, the Wikipedia article may have tons of images -- but how is that relevant to the content of the Wiktionary entry? A relevant and appropriate (set of) image(s) in the Wiktionary entry increases the utility of the entry. Requiring the user to click through to some other page entirely is not good usability. ‑‑ Eiríkr Útlendi │Tala við mig 20:03, 3 August 2016 (UTC)

Sorry to interrupt. This discussion is interesting. In French Wiktionary we have 26,406 pictures and we think we need more! Do you know how many pictures are used in English Wiktionary? Noé (talk) 14:12, 3 August 2016 (UTC)

  • Hopefully someone can answer your question, I must add that I love images and have added a few myself, but I think it has to be done in an intelligent manner. DonnanZ (talk) 14:25, 3 August 2016 (UTC)
  • I also think that we could do with more pictures. And a linked Wikipedia article with a picture is not good enough, for a couple of reasons: the Wikipedia article might change (and remove or change the picture), the image contained in the article could be at the bottom of the page, and there are cases where it is not obvious which sense the picture is actually meant to illustrate. Having the picture right in the Wiktionary entry is more convenient and fixes these problems. Traditionally most dictionaries (and we as editors, too!) are very focussed on words (somewhat understandably), so more entries with well chosen images would help to make us stand out. Jberkel (talk) 17:33, 3 August 2016 (UTC)
  • I grant that point ("the Wikipedia article might change (and remove or change the picture"). DonnanZ (talk) 15:01, 13 August 2016 (UTC)
  • Pictures are great, but they should make some kind of useful point, such as "It looks like this", "What's different about it is this", "It got its name because of this", "It's important because of this", or "It can be found in these locations" (especially useful if we are looking for translations). One kind of image that isn't too helpful is a picture of a particular kind (e.g. a Norway maple) of a thing (e.g. a tree) that doesn't show the features that distinguish it from other kinds (e.g. of maples or of trees). DCDuring TALK 18:47, 3 August 2016 (UTC)
  • Note that we have a project in Wikipedia called Wikigrenier, whose purpose is to photograph various common objects. See a list of pictures. Photographs such as those are good for dictionary articles, and it is possible to make requests. — Dakdada 08:34, 4 August 2016 (UTC)
  • I generally agree that some pictures are inappropriate. I also agree that pictures in figurative or abstract entries are generally suspect. Of course, there is that vast category of picture-deserving entries which is not under discussion and in which better picture coverage is welcome. --Dan Polansky (talk) 17:43, 12 August 2016 (UTC)
  • As an example, I tried to find applicable images to clarify the many different senses at (sakura, cherry; cherry tree; cherry blossom; etc.). Many of these senses are easier to understand with a visual. ‑‑ Eiríkr Útlendi │Tala við mig 18:40, 12 August 2016 (UTC)
    • That's great, but looks a bit crowded and the markup is confusing, with the inline table. Are there any image related templates in use? Maybe this would help to standardize the use of images, and we could keep track of which entries have illustrations. Jberkel (talk) 19:32, 12 August 2016 (UTC)
  • I looked in the past (possibly when working on that very entry), and I didn't find anything that did what I needed. If anyone is aware of such a template, I'm certainly game to use it. ‑‑ Eiríkr Útlendi │Tala við mig 20:31, 15 August 2016 (UTC)
  • A list and count of images used, as mentioned above by User:Noé may be a useful tool, if this doesn't exist already. DonnanZ (talk) 15:01, 13 August 2016 (UTC)
  • Just wanted to say that I have been adding images to entries, primarily as part of a general effort to improve entries that appear as Words of the Day. Personally, I don't think there's anything wrong with images that are not strictly descriptive. — SMUconlaw (talk) 17:00, 13 August 2016 (UTC)
  • It's quite pleasing when one finds suitable images on Wikimedia Commons that haven't been used anywhere else before, as for hopper wagon. DonnanZ (talk) 18:15, 13 August 2016 (UTC)
  • Did an analysis of the 20160801 dump with the help of Lyokoï (who provided the numbers for the French Wiktionary): we have 35927 image links. – Jberkel (talk) 20:38, 19 August 2016 (UTC)
  • Very interesting, thanks for the figure. More than in French, apparently. DonnanZ (talk) 20:54, 19 August 2016 (UTC)
  • Yeah, just for a short time... We will change this fact quickly! Haha! --Lyokoï (talk) 11:46, 20 August 2016 (UTC)
  • That's the spirit! DonnanZ (talk) 18:24, 20 August 2016 (UTC)

To make it clear, if I remember properly, 35927 is not how many pictures are in en.Wiktionary but the number of pages that include at least one picture. Maybe I am wrong, but it is how I remember the math was. So, it is already pretty cool! Good job fellow! Noé (talk) 10:33, 31 August 2016 (UTC)

No, it's the absolute number of images (counted by searching for [[Image:]] and [[File:]]). – Jberkel (talk) 23:44, 31 August 2016 (UTC)
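
The two readings of the figure (total image links versus pages with at least one image) can be told apart with a small script. This is only a sketch in Python of the kind of search Jberkel mentions, assuming the page texts have already been extracted from the dump into a title-to-wikitext mapping; the function names and the sample pages are invented for the example, and the actual analysis may have been done differently.

```python
import re

# Matches the opening of an image link such as [[File:Cat.jpg|thumb]]
# or [[Image:Cat.jpg]]; case-insensitive, tolerating stray spaces.
IMAGE_LINK = re.compile(r"\[\[\s*(?:Image|File)\s*:", re.IGNORECASE)


def count_image_links(wikitext):
    """Count image links in the wikitext of a single page."""
    return len(IMAGE_LINK.findall(wikitext))


def summarize(pages):
    """pages maps title -> wikitext. Returns (total links, pages with >= 1 link)."""
    counts = [count_image_links(text) for text in pages.values()]
    total_links = sum(counts)
    pages_with_images = sum(1 for c in counts if c > 0)
    return total_links, pages_with_images


# Made-up sample data, not taken from the dump.
sample = {
    "cat": "[[File:Cat.jpg|thumb]] text [[Image:Kitten.png]]",
    "swear": "no pictures here",
    "moon": "[[file:Moon.svg]]",
}
print(summarize(sample))
```

The first number corresponds to Jberkel's "absolute number of images", the second to the per-page count Noé had in mind.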

Proposed creation of Module:it-IPA

I was wondering whether it would be possible to create a module similar to this one in the Catalan Wiktionary, but more complex; that is:

  1. the apostrophe read as completely absent;
  2. monosyllables only stressed when spelled with an accent;
  3. words treated separately if a space is put between them;
  4. two distinct IPAs, a phonetic and a phonemic one;
  5. the possibility of endless alternative pronunciations.

If anyone is willing to help me, please let me know. Thanks! ;) [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔt̪ːo] (parla con me) 15:33, 3 August 2016 (UTC)

There's a Module:ca-IPA already, but it was never deployed. Maybe you can do so, and adapt it to Italian as well? —CodeCat 15:41, 3 August 2016 (UTC)
@CodeCat: the fact is I’m not able to create modules, that’s why I was asking for help. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔt̪ːo] (parla con me) 16:37, 3 August 2016 (UTC)
@IvanScrooge98 I'm willing to help out. Is the idea to automatically create IPA transcriptions? Jberkel (talk) 17:38, 3 August 2016 (UTC)

@Jberkel: thank you so much; yes, basically, that’s the idea. Can you arrange that? [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔt̪ːo] (parla con me) 17:43, 3 August 2016 (UTC)

OK, what I would need are some examples of input where ca:Mòdul:it-general produces the undesired output, with the expected output. It probably also makes sense to clean up / rewrite some of the code there – it's one big chain of regular expressions, a maintenance nightmare. Module:ca-IPA is a lot easier to understand and documented as well. Jberkel (talk) 21:05, 3 August 2016 (UTC)
OK, I’m working on it. I’ll let you have a list of examples. Thank you again, @Jberkel! [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔt̪ːo] (parla con me) 21:28, 3 August 2016 (UTC)

@Jberkel, here's a short list of comparisons. You will notice certain consonants geminated at the beginning of the word; that's a feature of Italian that occurs between two different words, after vowels, only in these circumstances:

  • initial /ʃ/, /ʎ/, /ɲ/, /dz/ and /ts/ are always doubled after vowels, even between two separate words;
  • all beginning consonants (with exceptions for certain clusters, namely cn, pn, ps, tm, tn, and all clusters starting with S which don't give /ʃ/, as st, spr, sc+a, etc.) undergo this gemination if they come after:
    • words ending with a graphically stressed vowel (as città, perché, però, giù, etc.) or the list of stressed monosyllables which are spelled without accent that I provided you there (these monosyllables should display as though O were Ò /ɔ/, E were É /e/, I were Ì, etc.);
    • the unstressed monosyllables (with o = /o/, e = /e/) and the four words I provided.

All other unstressed monosyllables don't make the following consonant geminated; all monosyllables (including the geminating ones) should not display with a stress mark /ˈ/, not even by themselves, unless they are apocopic forms of nouns, etc. as ciel or cuor.
When it comes to secondary stress, I would just leave it on all stressed monosyllables unless they come immediately before the primary stressed syllable, as in è vero; I would also put it in polysyllabic words if the distance from the primary stress is less than four syllables, otherwise I'd mark them with a normal stress mark; but you can choose to just leave one primary stress and all the others as secondary.
I don't think I missed anything; I hope I've been clear enough with these few words and that the task won't be hard for you. In any case, if you have any doubts or didn't understand something in my explanation, don't hesitate to ask me for clarification. Enjoy your work!! [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔt̪ːo] (parla con me) 15:31, 5 August 2016 (UTC)
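
The word-boundary gemination rules above could be sketched roughly as follows. This is a simplified Python illustration working on spelling only, not the proposed module (which would be written in Lua). The function names are invented, the list of triggering monosyllables is not given in this discussion and is therefore passed in by the caller, and the rule that initial /ʃ/, /ʎ/, /ɲ/, /dz/ and /ts/ always double after a vowel is not modelled here.

```python
VOWELS = "aeiouàèéìíòóùú"
ACCENTED_FINALS = "àèéìíòóùú"
# Initial clusters named in the post as exempt from gemination.
EXEMPT_CLUSTERS = ("cn", "pn", "ps", "tm", "tn")


def starts_with_exempt_s_cluster(word):
    """True for s + consonant onsets that do not spell /ʃ/ (st, spr, sc+a ...)."""
    w = word.lower()
    if len(w) < 2 or w[0] != "s" or w[1] in VOWELS:
        return False
    # "sc" before e or i spells /ʃ/, so it is not an exempt cluster.
    return not w.startswith(("sce", "sci"))


def can_geminate(word):
    """Can the initial consonant of this word be doubled at all?"""
    w = word.lower()
    if not w or w[0] in VOWELS:
        return False
    if w.startswith(EXEMPT_CLUSTERS):
        return False
    return not starts_with_exempt_s_cluster(w)


def triggers_gemination(prev_word, listed_triggers):
    """Does the preceding word cause doubling of the next initial consonant?"""
    w = prev_word.lower()
    ends_in_written_stress = bool(w) and w[-1] in ACCENTED_FINALS
    return ends_in_written_stress or w in listed_triggers


def geminates(prev_word, next_word, listed_triggers=frozenset()):
    return triggers_gemination(prev_word, listed_triggers) and can_geminate(next_word)


# The trigger set here is a placeholder; the real lists were supplied elsewhere.
print(geminates("città", "vecchia"))
print(geminates("città", "stretta"))
print(geminates("a", "casa", listed_triggers={"a"}))
```

Keeping the trigger list as a parameter means the module data can be corrected without touching the logic, which seems useful given that the pronunciations are expected to be checked and overridden by hand.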

OK, that should be enough to get going. I started to learn a bit of Italian recently, and the pronunciation seemed to be quite straightforward compared to other languages. – Jberkel (talk) 13:33, 6 August 2016 (UTC)
How would the difference between high-mid and low-mid vowels be handled? Presumably you need to write é or è, ó or ò? What happens if someone leaves out the stress mark (is this an error)? Benwing2 (talk) 21:26, 7 August 2016 (UTC)
@Benwing2: in the module on Catalan Wiktionary there are already some base rules to guess close-mid E and O and proparoxytones, the others are considered paroxytones with open-mid vowels. However, there’s no general rule in Italian and that’s just tentative, pronunciations have to be checked and overwritten if it is the case. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔt̪ːo] (parla con me) 09:03, 8 August 2016 (UTC)
For {{ca-IPA}}, you have to specify the vowel, it shows an error if you don't. See gos. —CodeCat 11:00, 8 August 2016 (UTC)

Abbreviations L4 header

I'm doing some automated fixup on Dutch entries (per WT:NORM and WT:ELE) and came across the entry aanwijzend voornaamwoord. This entry uses an "Abbreviations" section under the part of speech. This section is not listed in WT:ELE as allowed, thus it's also not clear where the section should appear relative to the others. What is the proper fix here?

Note: This isn't about the use of "Abbreviation" as the part of speech, but rather the abbreviations section appearing under a POS header, and therefore describing abbreviations of the current term. —CodeCat 20:01, 3 August 2016 (UTC)

Change it to "Synonyms" and use {{qualifier|abbreviation}}. DTLHS (talk) 20:14, 3 August 2016 (UTC)
Is an abbreviation really a synonym, or an alternative form? I think it's more the latter. —CodeCat 20:27, 3 August 2016 (UTC)
This isn't a black-and-white question. --WikiTiki89 20:30, 3 August 2016 (UTC)
But it needs a black-and-white answer. —CodeCat 20:36, 3 August 2016 (UTC)
What I mean is it's a case-by-case question. Some abbreviations are best handled as alternative forms and some are best handled as synonyms. --WikiTiki89 20:37, 3 August 2016 (UTC)
And how would you decide which to use? DTLHS (talk) 20:41, 3 August 2016 (UTC)
How about using the Derived terms section instead of Synonyms or Alternative forms? --Panda10 (talk) 20:44, 3 August 2016 (UTC)
I wouldn't object to either the alternative forms or the synonyms headers as a home for these for Dutch terms. If one alternative is necessary to make life technically simpler, then who really cares about how users might interpret one rather than the other? DCDuring TALK 21:27, 3 August 2016 (UTC)

Wiktionarian skills list

Dear colleagues,

In February, I started a list of skills in French Wiktionary: what we are doing on our project and what we have learned to do. It is neither a guideline nor a Help page, but roughly what could fill a CV with our empirical learning. After a few months of improvement by other people, I am glad to inform you that I have tried to translate it into English as Wiktionarian skills list! Yay! My English is quite awful, but I imagine it can be improved collaboratively. I hope it will be interesting for some of you, and I'll be happy to discuss and improve it with you! Noé (talk) 23:43, 3 August 2016 (UTC)

@Noé: I made an attempt to fix most of your English grammar mistakes: diff. Could you please check to make sure I didn't change the meaning of anything? --WikiTiki89 17:51, 4 August 2016 (UTC)
Thank you very much, it is much clearer now! I hope you enjoyed reading it! I changed back only one sentence, about trademarks, because I was thinking about the problem we have describing objects whose names are brands. We discussed this a lot in French Wiktionary because of 3M Company. They sent a dozen e-mails to one contributor who wrote scotch (fr:scotch) and post-it (fr:post-it), to indicate that they are trademarks. Companies need to protect their brands against dilution, and Wiktionaries have to decide on a policy about this, to explain clearly that Wiktionary is descriptive and not prescriptive, so we do not claim to decide whether these brands are now commonly attested as substantives in a language. Well, it was a long discussion, and I plan to translate our conclusions into English at some point. -- Noé (talk) 12:36, 5 August 2016 (UTC)

External links and references in WT:EL

These headers are given as level 4 headers in the "example entry", but these are actually level 3 headers that go below the parts of speech. In particular, References goes below Anagrams. Can this be fixed? —CodeCat 17:35, 4 August 2016 (UTC)

I fixed this without a vote. --Daniel Carrero (talk) 10:58, 7 August 2016 (UTC)

WT:EL says nothing about multiple etymologies

It is the longstanding practice that when there are multiple POS sections, each with their own etymology, then we have separate numbered etymology sections and POS sections are nested in each etymology section, at level 4 rather than level 3. But WT:EL seems to say nothing about this. In fact, the "example entry" shows a rather atypical entry layout, with multiple POS sections under a single etymology; in the vast majority of cases, separate POSes ought to have separate etymologies too. I would like to see this remedied; does anyone have proposals for a change? —CodeCat 17:42, 4 August 2016 (UTC)

I created this vote, and it failed in January 2016: Wiktionary:Votes/pl-2015-12/Headings. It had some issues as discussed by the voters. Like in the failed vote, I'd like to propose having a decent "Headings" section with a list of all headings and levels (Etymology = level 3, Noun/Adjective/etc. = level 3, Translations = level 4), explaining how the presence of "Etymology 1"/"Etymology 2"/etc. affects the level of other sections. --Daniel Carrero (talk) 06:53, 7 August 2016 (UTC)
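The nesting rule discussed in this thread can be sketched roughly as follows. This is only an illustration, assuming the base levels listed in the comment above (Etymology = 3, Noun/Adjective = 3, Translations = 4) and assuming that numbered etymology sections push every section nested under them one level deeper. The names `BASE_LEVELS`, `heading_level` and `render` are hypothetical and not part of WT:EL or any Wiktionary module.

```python
# Base heading levels as listed in the comment above (illustrative table only).
BASE_LEVELS = {"Etymology": 3, "Noun": 3, "Adjective": 3, "Translations": 4}


def heading_level(name: str, numbered_etymologies: bool) -> int:
    """Return the heading level for a section name.

    Assumption: when an entry has "Etymology 1", "Etymology 2", ... sections,
    the etymology headings stay at level 3 and every other section is nested
    one level deeper beneath them.
    """
    level = BASE_LEVELS[name]
    if numbered_etymologies and name != "Etymology":
        level += 1
    return level


def render(name: str, numbered_etymologies: bool, number=None) -> str:
    """Render a wikitext heading line such as ===Etymology 1===."""
    marks = "=" * heading_level(name, numbered_etymologies)
    title = f"{name} {number}" if number is not None else name
    return f"{marks}{title}{marks}"


if __name__ == "__main__":
    print(render("Etymology", True, 1))
    print(render("Noun", True))
    print(render("Translations", True))
```

A table like this would only be a starting point; a real "Headings" section would have to list every permitted heading.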

WT:EL homophones and rhymes sections

These basically duplicate the content of Wiktionary:Pronunciation, which is already linked to on the page. Rather than try to elaborate on every detail of the Pronunciation section, WT:EL should stay short and to the point. I therefore propose to remove these two sections, perhaps replacing them with a sentence or two that mentions other things that go in Pronunciation sections. —CodeCat 17:58, 4 August 2016 (UTC)

If Wiktionary:Votes/pl-2016-07/Pronunciation 2 passes, as explained in the "changes and rationale", all the text in the subsections "Homophones" and "Rhymes" is going to be kept, albeit edited to occupy less space, and the subsection titles will disappear. The titles themselves are unnecessary, in my opinion. If the titles were to be kept, we might as well have titles: "Audio pronunciation", "Transcription", "Hyphenation", etc. --Daniel Carrero (talk) 06:45, 7 August 2016 (UTC)

WT:EL: "Language" under "Entry core"

The way WT:EL is currently laid out, it first mentions things that go before the definitions, then the "core" which includes the POS section and definitions themselves, and finally things that go after the definitions. But the "Language" section, which describes the use of L2 language sections, is not part of the entry core as implied here. Note also that it is nested under the "Additional headings" L2 section, which says "There are additional headings which you should include if possible, but if you don’t have the necessary expertise, resources or time, you have no obligation to add them, with the possible exception of “References”." I certainly don't think that the L2 language section is optional or dependent on expertise in any sense. Therefore it should be mentioned earlier on the page, and more prominently. —CodeCat 18:03, 4 August 2016 (UTC)

Some thoughts:
  • I think "Entry name" should be somewhere above "Language" and the explanation of any section.
  • Then "Language", above any explanations of basically everything else (etymologies, POS headers, definitions, etc.), because it is the highest-level section we use (not counting the H1 page title).
  • We could delete the titles "Headings before the definitions" and "Headings after the definitions" and just have "Headings".
  • "Additional headings" is a misnomer. It does not contain what the name promises, and the name or contents should change.
  • WT:EL seems to imply that "References" is mandatory. Is "References" mandatory? Why? If anything, any entry must have the language, POS header, headword line and at least 1 definition. (for the record, it was voted and approved that romanization entries must have a definition, too)
--Daniel Carrero (talk) 07:05, 7 August 2016 (UTC)

Variables extension

What exactly happened to Wiktionary:Votes/2015-12/Install Extension:Variables? On the phabricator (phab:T122934), plans were made to create a similar function, but after 6 months no progress has been made. -Xbony2 (talk) 21:17, 4 August 2016 (UTC)

My understanding is that the work required to make section-aware templates possible is dependent on some work to actually make the MediaWiki parser know what sections are (phab:T114072). As you can see, the Parsing team of the WMF is quite busy, so it looks like it may be some time before this work gets underway. I posted a rather hackish proof-of-concept patch for MediaWiki at phab:T122934, which would solve the problem, but there is practically zero chance of that being accepted - I think the Parsing team would prefer to do it the proper way, rather than introduce yet more technical debt into MediaWiki. This, that and the other (talk) 07:19, 7 August 2016 (UTC)

moisturising cream v. moisturizer


How do I get the diacritic in phah-sǹg to display correctly in the title?--Prisencolin (talk) 00:06, 6 August 2016 (UTC)

It might be a font problem on your own computer. It shows up fine for me (Windows 10, Monobook skin). —suzukaze (tc) 00:14, 6 August 2016 (UTC)

Vote: Using template l to link to English entries from English entries

FYI, I created Wiktionary:Votes/2016-08/Using template l to link to English entries from English entries.

Let us postpone the vote as much as discussion makes necessary, if at all. --Dan Polansky (talk) 10:06, 6 August 2016 (UTC)

Alternative forms after definitions — weaker proposal

Previous discussions:

The vote Wiktionary:Votes/2016-02/Placement of "Alternative forms" had the proposal below. It ended as no consensus (10-9-1 = 52.6%-47.4%) in March 2016.

Voting on:

  • Fix the placement of the "Alternative forms" section directly above the "Synonyms" section, as a subsection of the POS section.


  • Arguably, synonyms and alternative forms are related concepts.
  • Removing "Alternative forms" from above the definitions is a way to promote the definitions.

Simplified entry example: hardworking



# Definition.

(possibly other headers between the definitions and the alternative forms)

====Alternative forms====
* {{l|en|hard-working}}

====Synonyms====
* {{l|en|industrious}}

Unfortunately, as mentioned by some voters, if this vote passed, it would have resulted in duplication of alternative forms sections in entries with multiple POS sections.

New, weaker proposal:

  • Rather than editing all entries (as in, by bot or whatever), just allowing entries to be edited on a case-by-case basis: If someone wants to edit an entry manually and place the "Alternative forms" as a L4 section above "Synonyms", that would be OK. If someone wants to edit an entry manually and place the "Alternative forms" as an L3 section above Etymology/Pronunciation, that would be OK too, and individual entries can be discussed in case of disagreement. This would need a new vote.

Pinging all participants of the previous vote (I hope I didn't miss anyone):

@Metaknowledge, Mr. Granger, Equinox, This, that and the other, -sche, Wikitiki89, Makaokalani, Embryomystic, Andrew Sheedy
@Droigheann, Nibiko, I'm so meta even this acronym, Vahagn Petrosyan, Dan Polansky, Xoristzatziki, Erutuon, Korn, Xbony2

Thoughts? --Daniel Carrero (talk) 08:06, 7 August 2016 (UTC)

I think this makes much more sense than placing them above, as though they apply to all terms. But they're not like pronunciation, where it actually makes sense to show it as a "global" thing that applies to the entire entry. Alternative forms are often term/etymology specific. —CodeCat 12:18, 7 August 2016 (UTC)
steden is an entry where the different etymologies have different alternative forms. —CodeCat 16:38, 7 August 2016 (UTC)
I doubt all alternate forms apply to all POS's of every word. Since they need to be attested for each POS independently, I would (now) have no problem repeating them for each POS. They should definitely be split up for separate etymologies. Andrew Sheedy (talk) 20:32, 7 August 2016 (UTC)
I agree with CodeCat (talkcontribs) here. Alternative forms are very much like synonyms and it makes no sense to stick them at the top where they'll often be missed. Benwing2 (talk) 21:08, 7 August 2016 (UTC)
As for duplication of L4 alternative forms, one alternative (so to speak) is to place them as an L3 header after both or all POS sections. I do this often with Related Terms. (Although I'll grant that it makes more sense to do this for related terms than for alternative forms as often this means no more than converting an L4 to an L3, whereas with alternative forms it will involve moving them below synonyms, antonyms, derived terms and related terms, and they may be missed there just like at the top). Benwing2 (talk) 21:11, 7 August 2016 (UTC)
I don't like this idea. I think moving away from having headers apply to multiple POS sections is the way to go. If we have to duplicate a few, then so be it. It's not that frequent. —CodeCat 21:40, 7 August 2016 (UTC)
I agree with CodeCat on this. --Daniel Carrero (talk) 15:09, 8 August 2016 (UTC)

Suggestion for sense tags on antonyms

A while ago CodeCat (talkcontribs) tried changing the text of the {{sense}} tag to say something like (of sense "foo") instead of just (foo). This was roundly disliked, and reverted. The logic given by CodeCat was that it's confusing to have a simple (foo) sense tag next to antonyms, which suggests that the antonym has the meaning of the sense tag rather than the opposite. How about we do something like what CodeCat tried, but only for antonyms? It could say (of sense "foo") or (antonym of "foo") or similar. The way to implement it is to create a new template {{antsense}}, and use a bot to change all occurrences of {{sense}} in Antonyms sections to {{antsense}}. Thoughts? Benwing2 (talk) 21:17, 7 August 2016 (UTC)
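The bot step proposed above could look roughly like this. It is a minimal sketch only: it assumes entries are available as plain wikitext strings, that Antonyms sections are introduced by a heading line whose title is exactly "Antonyms", and that the template is always written as `{{sense|...}}`. The function name `convert_antonym_senses` is hypothetical, and a real bot would also need to fetch and save pages.

```python
import re

# A wikitext heading line: the same number of "=" on both sides of the title.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def convert_antonym_senses(wikitext: str) -> str:
    """Replace {{sense|...}} with {{antsense|...}} inside Antonyms sections only."""
    out = []
    in_antonyms = False
    for line in wikitext.split("\n"):
        match = HEADING.match(line)
        if match:
            # Any heading starts a new section; track whether it is "Antonyms".
            in_antonyms = match.group(2) == "Antonyms"
        elif in_antonyms:
            line = line.replace("{{sense|", "{{antsense|")
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = (
        "====Synonyms====\n"
        "* {{sense|lacking light}} {{l|en|dim}}\n"
        "\n"
        "====Antonyms====\n"
        "* {{sense|lacking light}} {{l|en|bright}}\n"
    )
    print(convert_antonym_senses(sample))
```

Sense tags in Synonyms and other sections are left untouched, which matches the "only for antonyms" part of the suggestion.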

I do think we should do something about this. Unfamiliar users fairly regularly invert the sense, thinking they are fixing an error. Equinox 21:44, 7 August 2016 (UTC)
How many dictionaries actually have antonyms (or any other semantic relations)? I think not many.
How do references of any kind that have antonyms handle this? Among OneLook references, WordNet and Collins Thesaurus have antonyms, which are published online by The Free Dictionary. They offer two presentations (using dark as an example):
  1. the freedictionary, which uses color-coded icons in red and green.
  2. the freethesaurus, which is new and uses color coded boxes, pale green for synonyms, pale red for antonyms, peach(?) for "related words".
Color-coding is imperfect (blindness, red-green color-blindness, monitors or screens that do not show colors).
The icons alone don't seem adequate for the full range of users who need the current approach supplemented or replaced.
Longmans DCE 1985 includes "—opposite light" on the appropriate sense line.
Webster's 2nd Intl. has "syn." heading a block of text explaining synonyms and (mostly) near-synonyms and "ant." before a very short list of antonyms.
What does OED do? DCDuring TALK 23:50, 7 August 2016 (UTC)
OneLook itself has a Thesaurus, which uses color coding in the manner of freethesaurus. DCDuring TALK 23:53, 7 August 2016 (UTC)
Chambers Thesaurus screenshot: [1]. They divide each entry into sections headed by examples (dark hair, dark secrets, etc.) and collect all antonyms at the end, numbered by section and marked by the inequality sign. Equinox 23:56, 7 August 2016 (UTC)
That's sort of what we do, except we label senses directly rather than by number, because numbers tend to change as senses are added and rearranged. —CodeCat 00:12, 8 August 2016 (UTC)
On my screen Chambers entry shows the antonyms in red type.
This use of the inequality symbol to mark antonyms doesn't seem obvious though it is quickly learned. DCDuring TALK 04:02, 8 August 2016 (UTC)
We certainly should not use colour exclusively to convey information, only colour in addition to something else. —CodeCat 18:13, 8 August 2016 (UTC)
My screenshot came from the CD-ROM version. I assume they use that symbol to save visual space. Equinox 20:18, 8 August 2016 (UTC)

Is Ushakov's dictionary copyrighted still?

Calling @bd2412. See Copyright law of the Russian Federation. This case is tricky because Ushakov's dictionary was published in 1935-1940 and he died April 17, 1942 (see Dmitry Ushakov). The copyright law of 1993 retroactively made a copyright of 50 years after the published date or the author's death (whichever is later), and later works extended this to a 70-year term. The Wikipedia article says this means anyone who died in 1943 or later was within the copyright period in 1993, but various additional details might possibly make Ushakov's work within this period as well, in which case the copyright would extend (presumably) to 2013, meaning it's (presumably) out of copyright now. But this stuff is sufficiently complicated that I don't know for sure. Basically I want to use some example sentences from this dictionary to illustrate some Wiktionary entries. Benwing2 (talk) 02:52, 8 August 2016 (UTC)

Possibly relevant: this discussion of fair use and de minimis copying, and of the applicability of US vs non-US laws. - -sche (discuss) 05:35, 8 August 2016 (UTC)
For loading on Commons, you have to follow the source country's copyright and US copyright, but Wiktionary doesn't have to follow source country's copyright by WMF rules, and I don't think en.Wiktionary has policy on it. If it was out of copyright in Russia in 1996, it's almost certainly out of copyright in the US, but if it was in copyright in Russia in 1996, it will be in copyright in the US for 95 years from publication, or until 2031-2036.
I'd note that fair use on stuff taken from a dictionary is going to be much more problematic than on stuff taken from a novel, since quotations from a novel don't influence the normal commercial use of the novel, but we are directly competing against a dictionary.--Prosfilaes (talk) 07:33, 8 August 2016 (UTC)
What kind of damages can they claim, though? How much profit do they still make? Also, I can't imagine it is the case that works are not free of copyright until they are in every country in the world. And retroactively copyrighting seems even more dubious, what if someone had published it under a permissive licence in the meantime? Do non-infringing works suddenly become infringing? —CodeCat 18:16, 8 August 2016 (UTC)
@Prosfilaes This stuff is such a mess. It seems quite possible that it went out of copyright in Russia, went back into copyright in 1993 (conceivably due to a rule stating that dates are moved forward to Jan 1 of the next year), went out again later that year (50 years from author's death, moved forward to Jan 1 1943???), hence was out of copyright in 1996, then went back into copyright in 2004 due to the new 70-year-from-death policy, then went out again in 2013. Presumably that means it's out of copyright in the US. But who knows. What exactly happens if you copy from an out-of-copyright work and then it later goes back into copyright? This stuff sucks. Copyright terms IMO are way way too long. Benwing2 (talk) 00:11, 9 August 2016 (UTC)
@CodeCat: They can claim up to $30,000 as statutory damages in the US. Works are not free of copyright everywhere in the world until they are free of copyright everywhere in the world. The WMF is chartered in the US, and therefore has to follow US rules. There are some countries where the rule of the shorter term is in play, and thus lack of copyright in Russia matters there (which is part of the reason Commons cares about it), but the US doesn't have the rule of the shorter term. Putting a work in the public domain back in copyright is a mess, but countries do it sometimes, usually with exceptions for preexisting users.
@Benwing2: It was never in copyright in the US, and the URAA in 1996 would have returned it to copyright in the US only if it was still in copyright in Russia. It looks like it's out of copyright close to world-round, so it should be safe to use. As far as I know, copyright terms virtually always extend through the end of the year they expire in.--Prosfilaes (talk) 08:03, 9 August 2016 (UTC)
Thanks! Benwing2 (talk) 08:33, 9 August 2016 (UTC)
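The date arithmetic used in this thread can be checked with a small sketch. It assumes, as stated above, that a copyright term runs through the end of the calendar year in which it expires, and it uses only the figures quoted in the discussion (death in 1942, publication in 1935-1940, terms of 50, 70 and 95 years). The function name `public_domain_year` is hypothetical, and the sketch makes no claim about which rule actually applies to the dictionary.

```python
def public_domain_year(start_year: int, term_years: int) -> int:
    """First calendar year in which a work is out of copyright.

    Assumption: the term is counted from start_year and runs through
    31 December of its final year, so the work is free from 1 January
    of the following year.
    """
    return start_year + term_years + 1


if __name__ == "__main__":
    # Terms counted from the author's death (1942), as discussed for Russia.
    print("50 years after death:", public_domain_year(1942, 50))
    print("70 years after death:", public_domain_year(1942, 70))
    # 95 years from publication (1935-1940), as mentioned for the US.
    print("95 years from 1935:", public_domain_year(1935, 95))
    print("95 years from 1940:", public_domain_year(1940, 95))
```

Under these assumptions the 70-year term gives 2013 and the 95-year term gives 2031 to 2036, which agrees with the years mentioned in the comments above.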


Whatamidoing (WMF) (talk) 18:02, 9 August 2016 (UTC)

Old Ruthenian

I'm wondering how we should handle this language. Should we give it its own code and make it a descendant of Old East Slavic and the ancestor of Ukrainian, Belarusian, and Rusyn, or should we make it a dialect of Old East Slavic, or even a dialect of Russian? What should we call it, Ruthenian, Old Ruthenian, Old Western Russian, Lithuanian Russian, etc.? What code should we give it? --WikiTiki89 19:33, 9 August 2016 (UTC)

@CodeCat, Atitarev, Useigor, -sche: Pinging people who might be interested. --WikiTiki89 15:58, 10 August 2016 (UTC)
It's mainly about how different they are. Is Old Ruthenian clearly identifiable as a language contrasting with Old East Slavic? —CodeCat 16:06, 10 August 2016 (UTC)
What language do we consider having been spoken in the Grand Duchy of Moscow and the Tsardom of Russia? Old East Slavic, or Modern Russian? If the answer is Old East Slavic, then we can consider the language of the Grand Duchy of Lithuania to also have been a dialect of Old East Slavic; if the answer is Modern Russian, then we would need to make it a separate language. --WikiTiki89 17:35, 10 August 2016 (UTC)
Russian and Ruthenian probably diverged by the 15th century. Old East Slavic (Old Russian) is the predecessor of both. Church Slavonic was used as the official language of Muscovy then. --Anatoli T. (обсудить/вклад) 12:03, 12 August 2016 (UTC)
If we add it, I would call it "Old Ruthenian", because "Ruthenian" seems too ambiguous. "Lithuanian Russian" also seems ambiguous and has been less common since the 1980s (per ngrams). Looking around cursorily, I do find scholars who consider Old Ruthenian distinct from Old East Slavic — some consider Old Ruthenian a jumble of Old East Slavic elements and Polish ones. It makes it sound like it would be possible to tell whether a given text was Old Ruthenian or Old East Slavic, which is an obvious prerequisite to splitting it. - -sche (discuss) 09:30, 13 August 2016 (UTC)

Involved administrator actions

Greetings. I would like to ask a question about Wiktionary policies regarding using administrative tools in situations where the admin is "involved" in a dispute. I am not sure if I am in the right place, and if not, could you please direct me to the appropriate venue?

If I am in the right place, here's the situation. The etymology section at sheng nu has been contested for quite some time, both at the deletion discussion and subsequently on the talk page. The first revert was by User:Wyang with the edit summary, "Western fantasisation". The content was sourced by reliable sources including the BBC and the New York Times. This was way back in 2013.

In 2015 I re-added the content because it had reliable sources, and it was discussed at the talk page. The conversation ended with me asking them for reliable sources that place the etymology elsewhere; otherwise it's being removed purely on personal opinion and original research. The topic was seemingly dropped and the content remained.

Then in July 2016, Wyang reverted it again. I came across it today and re-added it and then made some major changes to the etymology section. Namely added that the etymology is disputed as described in a book I cited, and then proceeded to list the varying origins for the term as cited by the various reliable sources. Wyang reverted my edits with new changes without an explanation. I left a talk page message and restored my new changes. I was reverted almost immediately, and then to my surprise, Wyang protected the article so that only administrators could edit it.

Maybe Wiktionary allows for the removal of cited etymology content. Fine, but Wyang never provided anything other than original research. Even if they did have sources that indicated a different etymology, they could have added it to the section as one of the alternate origins. All of this as far as I'm concerned is just a simple editing dispute between two editors, but I was very surprised to see administrative tools used to essentially push the argument in one direction. I'm not very familiar with Wiktionary policies, but as an admin over at the English Wikipedia, we are expressly prohibited from using our administrative tools in arguments and disputes we are involved in.

Any advice is appreciated. Will totally drop this if this is the custom here. Mkdw (talk) 22:30, 9 August 2016 (UTC)

Wyang did the same to me, with a widely-used module, and I'm also an admin. So you're not the only one. —CodeCat 22:33, 9 August 2016 (UTC)
@CodeCat Do you think it was OK to make a widely used module for Thai transliterations and transcriptions unusable, affecting thousands of entries, upsetting all Thai editors and not really giving a working alternative just because you didn't like the methods used? Please don't mention this in unrelated discussions. Sorry, I don't support you in that. --Anatoli T. (обсудить/вклад) 23:51, 9 August 2016 (UTC)
Please don't misrepresent the problem. The module was not made unusable once Wikitiki had provided an alternative. Those edits were reverted by Wyang. —CodeCat 23:53, 9 August 2016 (UTC)
Wikitiki tried to help but he wasn't sure himself it was working correctly and did what was expected. Wyang gave reasons why. --Anatoli T. (обсудить/вклад) 23:57, 9 August 2016 (UTC)
Wyang was wrong. The fixes did work. I repeatedly asked him to give examples of entries that were broken by Wikitiki's edits. He never gave any. There was no reason to revert the fixes, especially not when they re-created the problem he accused me of creating. Since the edit war, things have been left in a semi-broken state, I'm afraid to try fixing them again for fear of another edit war. I would like a guarantee that it will not happen. —CodeCat 00:00, 10 August 2016 (UTC)
Doesn't seem very collaborative. We didn't even try dispute resolution. Does Wiktionary have a formal process for reporting administrator abuse of the tools? Mkdw (talk) 22:47, 9 August 2016 (UTC)
Bringing up old grievances in unrelated discussions- when you look in the mirror you should be seeing Dan Polansky right now... Chuck Entz (talk) 14:19, 10 August 2016 (UTC)
I failed to see any evidence of substance for your claim (reputable Chinese sources, announcements by the All-China Women's Federation or the Ministry of Education). Unreliable Western media claims should be removed if no original sources can be found. Wyang (talk) 23:03, 9 August 2016 (UTC)
The New York Times, BBC, and the Huffington Post among other sources were provided. In addition, I also included a source from the China Daily, South China Morning Post, and a book by Sandy To. If you believe these sources are "unreliable", that is your personal opinion, but they are directly in line with Wikimedia Foundation policies on reliable sources. Furthermore, you have failed to provide any sources of your own to support your theory, and even if you had sources, you should have expanded the etymology section to include these other origin explanations. I already added a source that says the etymology is disputed. Lastly, indefinitely protecting the article is prohibited as an abuse of your administrative privileges. Mkdw (talk) 23:11, 9 August 2016 (UTC)
The Wikimedia Foundation has no policies on reliable sources. It is entirely up to the individual sites. DTLHS (talk) 23:14, 9 August 2016 (UTC)
None of these sources makes sense.
"The China Daily reported in 2011 that Xu Wei, the editor-in-chief of the Cosmopolitan Magazine China, coined the term."
This is obviously false (Citations:剩女).
"Chiu, Joanna (04 March 2013). Unlucky in love … or just left out of the market?. South China Morning Post. Retrieved 9 August 2016." is the reference cited for "The term was added to the national lexicon in 2007 and widely popularized by the All-china Women's Federation."
No such claim was found in the actual article.
Wyang (talk) 23:16, 9 August 2016 (UTC)
To repeat my stance from the argument with CodeCat: I consider the application of admin power to either prevent or allow editing in an edit war in which the admin is involved to be misuse of such power, even when the admin in question is correct. I would ask all administrators involved in edit wars to turn to one of their colleagues as a manager of such situations. Korn [kʰũːɘ̃n] (talk) 23:34, 9 August 2016 (UTC)
I would like to petition the community here to unlock the entry sheng nu. Wyang protected the article indefinitely not to prevent vandalism or harm, but to simply enforce their editorial position. As for the editorial dispute, Wiktionary has processes in place such as dispute resolution to which I am a willing participant. Mkdw (talk) 23:38, 9 August 2016 (UTC)
There is no point in petitioning if there is effectively no basis for your claims - the content you added misattributes content from references, or is apparently factually incorrect. Wyang (talk) 23:47, 9 August 2016 (UTC)
"The very first origins of the term sheng nu have been much contested, and it is virtually impossible to find out exactly when and who first coined the term, be it television dramas, talk show hosts, magazine articles, or academic circles. But the most significant aspect of the 2007 official definition that has been endorsed by the Chinese government, and continuously propagated by the government-run All-China Women's Federation"
To, S. (2015). China’s Leftover Women: Late Marriage among Professional Women and its Consequences. Oxford; New York: Routledge.
"The term refers to any unmarried Chinese woman over the tender age of 27, and was coined by the All-China Women's Federation"
Tunstall, Lee (15 November 2012). Are All the Single Ladies Really Like the Oil Sands?. The Huffington Post. Retrieved 2 April 2013.
"State-run media started using the term "sheng nu" in 2007. "
Magistad, Mary Kay (20 February 2013). "BBC News - China's 'leftover women', unmarried at 27". BBC News (Beijing). Retrieved 9 August 2016.
"According to The New York Times, the term was made popular by the All-China Women’s Federation in 2007"
MacLeod, Duncan (11 April 2016). "Marriage Market Takeover for Leftover Women". Inspiration Room. Retrieved 9 August 2016.
"The term “leftover women” surfaced in 2007 in a report by the All-China Women’s Federation, a state agency whose professed purpose is to “protect women’s rights and interests.”"
Reynolds, Christopher (18 April 2016). "Viral video inspires China's 'leftover women'". Toronto Star. Retrieved 9 August 2016.
"The term "sheng nu" was first used by the All-China Women's Federation (founded by the Communist party in 1949) in 2007, to explain that a leftover is an unmarried woman over the age of 27."
Iaccino, Ludovica (31 January 2014). "Single and Educated: the Problem of China's 'Leftover' Women". International Business Times. Retrieved 9 August 2016.
The list of available references goes on and on. The other points you brought up were originally used to cite the sentence, "pressure unwed women into marriage", but you reverted my changes before I was finished. Regardless of whether you think my arguments about sources have merit, it does not excuse you for abusing your administrative tools, nor does it warrant engaging in an edit war. Mkdw (talk) 23:54, 9 August 2016 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I agree with Korn that a page should not be protected to enforce an editorial position. Unless someone (other than Wyang) objects, I'll unprotect it. Benwing2 (talk) 00:45, 10 August 2016 (UTC)

I support unprotecting the page. I do think more discussion is needed before further edits are made, but they should involve more parties than just Mkdw and Wyang. —CodeCat 00:51, 10 August 2016 (UTC)
Unbelievable. Repeatedly adding unsubstantiated material amounts to vandalism, warranting a block already. This is especially true considering it has been more than three years since I asked for direct evidence for the claim, and there is none. In this case the editing pattern of the user Mkdw shows that he/she clearly has an agenda with the edits: using fantasy-driven Western media articles misconstruing the actual linguistic scenario in China to push for a point of view, i.e. the viewpoint that Chinese culture is distorted in Western eyes - there are words coined by the "All-China Women's Federation" which pejoratively refer to unmarried Chinese women over 25 as "leftover women". This is nonsense. If person 1 writes that "A claims B did something", then it is person 1's task to be able to provide direct evidence that B did something, especially when someone considers A's claim unreliable. If person 1 cannot do so, the claim should be promptly removed. Wyang (talk) 01:18, 10 August 2016 (UTC)
(After edit conflict...)
  • Looking on from outside the argument and sussing out the details, I feel compelled to chime in.
Re: the origins of the term, just looking at Citations:剩女, I see that this term was prima facie not coined by the All-China Women's Federation in 2007 -- all five citations currently on that page are older than 2007: 1964, 1992, 1995, 2002, 2006. Past that, relying on English-language sources to divine the etymology of a Chinese term does not strike me as a wholly viable approach. My field is Japanese, and I've run across numerous instances of English-language sources claiming this or that about a Japanese term, when reliable and respected native-language sources say something else entirely. Relying on mass-media sources is even less viable -- their business is to sell copies, and they do that by printing interesting content, often without much regard to strict veracity.
Re: sources, finding a citation of a term in use is enough to meet our criteria for inclusion, vaguely analogous to Wikipedia's "notability" requirement. But when it comes to the content of an entry, it is not enough that a given source says X or Y: we also pay attention to the identity, reputation, and expertise of sources. As a thought experiment, I wouldn't care one whit if you found that the New York Times itself claimed that the Japanese word gaijin (“outsider, foreigner”) originally came from Hebrew גויים (goyim) -- unless that also agreed with known Japanese sources that make the same claim.
Re: edit warring, it bears noting that Wiktionary's editor base is much smaller than Wikipedia's. We neither need, nor can we use, the kind of bureaucracy that has evolved on Wikipedia. Given also that the number of editors for any given language is much smaller than the total number of Wiktionary editors, we must often rely upon the judgment and expertise of the very small number of people who handle the day-to-day process of maintaining our content. Your edit history (23K+ on Wikipedia, 129 or so here on Wiktionary) and some of the background threads (as at Talk:sheng_nu) suggest that you're well-versed in Wikipedia's culture and way of doing things, but not so much in Wiktionary's.
Ultimately, considering that Wyang is a native speaker of Chinese, can read Chinese source materials, and has a long history of high-quality work on Chinese entries here, I'm much more inclined to trust his judgment over yours, when it comes to the origins of Chinese terms. You discount him entirely by merely posting English-language sources, many of zero etymologic value, and claiming that the burden is on him when he's asking you for reputable Chinese sources backing your claims.
I haven't agreed with everything that Wyang has done, but in this case, it does appear that he is more in the right on the etymology of English sheng nu / Chinese 剩女. ‑‑ Eiríkr Útlendi │Tala við mig 01:21, 10 August 2016 (UTC)
I admit I discounted Wyang when they removed content under the rationale "Western fantasisation". That indicates to me an unreasonable bias. It doesn't matter how Wiktionary treats original research. Any opinion needs to be supported by something otherwise it's simply an opinion. Here is what was removed:
The exact etymology of the term is disputed.[2] The China Daily reported in 2011 that Xu Wei, the editor-in-chief of the Cosmopolitan Magazine China, coined the term.[3] Other sources have indicated the All-China Women's Federation and the Ministry of Education of the People's Republic of China.[1][4] The term was added to the national lexicon in 2007 and widely popularized by the All-China Women's Federation.[1][5][6]
The first two citations at Citations:剩女 seemingly use the term to refer to people who remained after the war. This does not explain the etymology of the term sheng nu as defined by the Chinese lexicon: an unmarried woman in her late twenties. It suggests that leftover and woman (and man in this case) were put together not as a term but as a turn of phrase. The same goes for the 2002 citation that talks about food. Suggesting these are the origins for the term about unmarried women seems unlikely because nothing links these co-occurrences of the words to the term, and nothing shows an evolutionary process. Where I think Wyang's ability to read and write Chinese could be useful is not conducting their own original research, but finding Chinese sources that tie any one of their citations as being the origin of the term. In the meantime, the sources we do have are all we have. Wrong? Possibly. Sourced and not original research? Yes. I would have even settled for "The exact etymology of the term is disputed.[2] The term was added to the national lexicon in 2007 and widely popularized by the All-China Women's Federation.[1][5][6]" but that never seemed to be on the table either because Wyang always resorted to wholesale reverts.
I won't even get into all the problems Wiktionary invites by allowing original research to prevail over published material (including books). Or by allowing admins to use their tools to resolve editorial disputes. You're right, I know Wikipedia and Wiktionary are different, but I cannot see how Wiktionary hopes to grow its community and welcome newcomers when special privileges and rules are applied to a select few.
Lastly, so quickly labelling me as a disruptive editor is only evidence of that. I'm not here pushing some fringe idea. I was adding what I've found in the sources. I have adjusted the content as new sources have come up as evident in my last series of edits. It allowed for multiple explanations and acknowledged the origin is disputed. I was seeking compromise. I was using the talk page until Wyang stopped replying. I have even mentioned an openness to dispute resolution. Mkdw (talk) 02:45, 10 August 2016 (UTC)
One point that hasn't been made yet: your sources are adequate for an encyclopedia article, but this is an etymology, which requires a different skill set than most journalists have. There was an article in a non-linguistic journal that made assumptions about prehistoric human culture based on long-range linguistic reconstructions that are considered by most linguists to be way out on the fringe, and there were all kinds of articles in mainstream journalistic sources that treated this without any skepticism at all. Chuck Entz (talk) 14:19, 10 August 2016 (UTC)
  • Thank you, Chuck -- that was the point I tried to make earlier, that mass-media sources are of exceedingly low value when it comes to etymologies. Your restatement is clearer. ‑‑ Eiríkr Útlendi │Tala við mig 23:18, 10 August 2016 (UTC)
  • @Benwing2, I'm with Wyang here -- Mkdw really does not seem to get it, but he keeps pushing his position. In Wyang's place, I might have done the same thing: with few Chinese editors to collaborate with, he's been one of the most active Chinese editors for a while now (at least, from what I've seen). If we unprotect the page, what do we do if Mkdw keeps adding low-value English-language "sources"? What other approach would you all advocate? Should we block disruptive users, rather than locking disrupted pages? Serious questions, BTW, I'm not being rhetorical. ‑‑ Eiríkr Útlendi │Tala við mig 01:25, 10 August 2016 (UTC)
OK, I'll leave it alone now. I still believe that it's an abuse of admin powers to lock a page over editorial disagreements, even if the admin is almost certainly correct, as long as the other user is apparently acting in good faith and is willing to respect the dispute process (which for us would probably be a Tea Room discussion); but Eirik you make good points. Benwing2 (talk) 02:31, 10 August 2016 (UTC)
This sets a terrible precedent on principle alone. This creates a de facto community endorsement whereby administrators can revert an editor and then protect the entry indefinitely to enforce their editorial preference in a situation, even if the editor is willing to go to dispute resolution, using the talk page, and changing their edits to find a compromise -- provided the admin feels like they're "right". I'm admittedly very disappointed but I strongly believe in community consensus. If the community consensus here and now is to endorse this action, then I withdraw my request and accept it under protest. The editors at Wiktionary have the right to their own self-determination regarding their practices. It's unfortunate that this type of conduct is being endorsed rather than, say, a community consensus possibly endorsing Wyang's editorial position, if they deemed so, but also finding Wyang's administrative actions inappropriate. I would have accepted it and the community (including administrators) would have had recourse against any editor (including myself) for going against the consensus. I would think that would then be deemed a disruptive editor, but seemingly the threshold is way less than that. Maybe there's no appetite for bureaucracy but there must be checks and balances for administrators and this is one step in the opposite direction. Mkdw (talk) 02:51, 10 August 2016 (UTC)

Why are you all walking into a smokescreen now? This is not a debate about the origins of the phrase sheng nü, this is a debate about whether admins should get to abuse their goddamn power. The edits made were both good faith and sourced with media which is generally accepted to be decent, and as such as proper as any Wiki-edit can get. They can be wrong a hundred times over and Wyang can be right a hundred times over, using his superior power in an edit war, not even to stall the warring until a consensus was reached, mind you, but to allow him to forego the argument is crass abuse and should not, for any reason, be tolerated. This should be a place where arguments should be won by superior evidence, not by sucker-punching your opponent with your superior admin-muscle. In the same vein, and you damn well know whom I am looking at, this is a place where arguments should be carried out in civilised debate and not by playing volleyball with a page because nobody can stop you. If you're part of the argument, you don't get to be judge or police, that should be a very simple rule we can all agree on. While I have no doubt that Wyang is an absolute treasure as an editor on Asian languages, the fact that he's involved in two such situations within a short period of time doesn't give me the best impression of him as an admin and the fact that an outsider now has reached the conclusion that we as a whole condone such abuse should shame us all. Korn [kʰũːɘ̃n] (talk) 10:42, 10 August 2016 (UTC)

Well, Wiktionary is not like Wikipedia. A word can be demonstrated to exist at a point in time simply by showing attestations of the word at that time point, and there is no point citing an external article claiming a word was coined in 2011 while there are ample attestations for its use long prior to that. This is exactly what User:Mkdw did not get and what he/she had been trying to do repetitively over the years, including today again. It has been made clear that there are reasonable doubts regarding his edits more than three years ago, but he ignored the comments and the attestations I have gathered at Citations:剩女 to continue pushing for his POV edits. It is apparent they are trying to match the Wiktionary content to the stuff ("Good Article") they have written over at Wikipedia, with complete disregard for criticism and the linguistic facts. This is vandalism and should be dealt with as such. Wyang (talk) 11:57, 10 August 2016 (UTC)
In my book, it's not vandalism as long as it's done in good faith. But no matter whether it is: Even if there is an edit war involving one steadfast defender of the right thing and one plain vandal, then yes, the vandal has to be dealt with, but by a neutral third who has heard both sides, not by one side of the edit war. That is all I'm saying. Korn [kʰũːɘ̃n] (talk) 13:20, 10 August 2016 (UTC)
  • I think the first round of back-and-forth edits in 2013 counts as good faith. Mkdw came back just a couple days ago and added essentially the same content, completely failing to accept or respond to the past argument that the content he was adding was from sources of clearly contestable value. After Wyang made it clear that Mkdw's sources were still inadequate, Mkdw continued to insist -- he again refused to acknowledge the possibility that just having a source isn't enough here on Wiktionary. This is where I start to view Mkdw's edits as not in good faith any more. ‑‑ Eiríkr Útlendi │Tala við mig 23:27, 10 August 2016 (UTC)
PS: See [[Talk:sheng nu]] for the relevant discussion and timeline. ‑‑ Eiríkr Útlendi │Tala við mig 23:29, 10 August 2016 (UTC)
I agree with Korn in principle, but there are some practical issues that need to be dealt with. When someone makes an edit and an admin finds it, at this point the admin is still an uninvolved party, when the admin reverts the edit and the original editor reverts it back, does the admin now suddenly become an involved party and need to seek out another uninvolved admin? When the second admin reverts the edit and the original editor reverts it back, does the second admin now also become an involved party and need to seek out a third admin? It's difficult to know what the "right" thing to do is if you are an admin in that situation and we don't have any clear guidelines on this. I think we really need to draft up a policy on this, so that admins will have a procedure to follow and also so that we can clearly determine when an admin is not following it. --WikiTiki89 15:04, 10 August 2016 (UTC)
My simple proposal for every sort of edit war is that instead of a second undoing, a third person has to be contacted. That is: 1. An edit is made by John. 2. It is undone by Jim. (It is not relevant whether this is done with the undo function or by rewriting the contents.) 3. The original edit is restored by John. 4. Jim is not allowed to undo it a second time. Instead, Jim is now obliged to bring the discussion to the attention of other users of the language or the community in general (such as Beer Parlour). Korn [kʰũːɘ̃n] (talk) 15:45, 10 August 2016 (UTC)
ps.: Obviously, if John and Jim agree that they will debate this amongst themselves or that they are fine with continued editing as a form of successive proposals rather than merely trying to set the page back to a former status quo over and over, a third party need not be bothered with it. Korn [kʰũːɘ̃n] (talk) 15:48, 10 August 2016 (UTC)
My name is John and my brother's, Jim. I sincerely doubt whether we would have such a disagreement. :P JohnC5 15:54, 10 August 2016 (UTC)
  • I'd like to point out a concern here -- we have the Wiktionary editor community (not very big to start with), and we have the individual language editor communities (much smaller still). If our hypothetical Admin Jim is the only active editor for Language Foo during the time that non-admin Editor John is busy adding controversial content to entry Bar, do we now demand that Admin Jim just sit on his hands for possibly several days, or longer, until some other editor for Language Foo comes along? Again, serious question, not rhetorical. I'm interested in people's views here. ‑‑ Eiríkr Útlendi │Tala við mig 23:27, 10 August 2016 (UTC)
In response to Wikitiki89 (talk, contribs), no, it seems to me that an uninvolved admin does not become involved by reverting the editor causing the controversial edit, including multiple times. In response to Eirikr (talk, contribs), no, the admin shouldn't have to wait until someone else knowing that language comes along. Instead there should be a discussion in Tea Room or Beer Parlour or wherever. That way, others can weigh in based on the evidence. (In practice, in such a case the admin, esp. a long-time contributor to a language, will probably get the benefit of the doubt unless the other user can show a sufficiently good reason why the admin is wrong.) IMO in a controversy a long-time status quo should prevail until the controversy is resolved, and it's probably OK to lock a page on the status quo to prevent an edit war, *if* the user doing the controversial edit insists on edit-warring rather than participating in a discussion; I've seen that happen in Wikipedia. The locking should happen by an uninvolved admin, though, and only while the discussion is happening. Benwing2 (talk) 23:41, 10 August 2016 (UTC)

Proposed addition to WT:NORM: headers cannot be nested inside things

I propose adding an additional rule: Headers must not be nested inside other elements, such as templates and (HTML) tags.

This rule would make parsing a lot easier, because a parser would not need to parse the nesting of templates before they can determine whether a header is "real", appearing at page-level, or is actually nested within some template. A parser would be able to assume that every header is "real". With this change, the code


would be disallowed. —CodeCat 20:06, 10 August 2016 (UTC)

Do you have an example of a page that does this? DTLHS (talk) 20:12, 10 August 2016 (UTC)
Not really, there may well be none (there's some talk pages, but those don't count for WT:NORM). The point is to have an official rule in place that disallows it, so that a parser's design can be simplified. —CodeCat 20:14, 10 August 2016 (UTC)
Is there really a need to make a formal rule about this? I mean to me it just seems obvious not to screw around with the layout and such like that via a template. Although I do see some pretty crazy things done on wikis sometimes... Philmonte101 (talk) 21:38, 10 August 2016 (UTC)
@CodeCat: I think this is a problem with WT:NORM, Templates, 3: "For templates with many or long parameter values, line breaks are allowed at the end of a template's name or a parameter's value, for the purpose of making the wikitext easier to read." If one changes "are allowed at" to "are only allowed at", I believe this would make your example impossible, because every line inside a template would have to begin with a pipe or a double closing curly brace. I assumed that "only", although not stated, was probably intended by this line. Edit: For HTML, I agree and see something like this as important. Isomorphyc (talk) 02:57, 11 August 2016 (UTC)
I don't understand what you're saying. This is about nesting headers in templates and other things. —CodeCat 15:24, 11 August 2016 (UTC)
@CodeCat: As I understand, headers are required to begin immediately after a line break (I assume Headings 1. means "one blank line [immediately] before all headings", except the first, as anything else is extremely rare and normally treated as an error). If templates which continue onto a second line are required to do so only after the value, then the line break will be immediately followed by a pipe, a closing double curly brace, or whitespace, never an equals sign. Hence, this change to the template rule will prevent embedding pseudo-headers into templates, which it probably was intended to do in the first place. HTML and other things are a different matter. WT:NORM is a little bit subtle in places; am I misunderstanding? Isomorphyc (talk) 16:44, 11 August 2016 (UTC)

@Isomorphyc: It's a minor loophole and unlikely to be successfully exploited, but under the current rules, we could wrap the whole entry inside a template. There is no blank line before the first heading. The example below assumes that a module can convert the "\n" into newlines.

{{example|==English==\n\n===Noun===\nblah blah\n\n# defs}}

Or we could have this:


Anyway, even if nobody would do one of those things, in my opinion it's a good idea to implement the rule that CodeCat proposed, either on its own or as a complement to the other rules. According to the introduction of WT:NORM (the entirety of the policy was voted and approved, including the introduction), people aren't even required to follow NORM and technically they would be able to go wild with whitespaces and line breaks (I hope common sense is required to some extent, even if not explicitly said so in the rules) so it's good to be extra clear about what we want.

I created the vote: Wiktionary:Votes/pl-2016-09/No headings nested inside templates or tags. --Daniel Carrero (talk) 03:53, 23 September 2016 (UTC)

FWIW, I have written lots of bot scripts that split pages into sections and subsections, and I realize now that I've implicitly assumed that headers never occur in templates. CodeCat is completely right that having to worry about this is a real pain in the ass. AFAIK I've never been bitten by any header inside of a template so it probably doesn't occur. Benwing2 (talk) 05:17, 23 September 2016 (UTC)
No objections, although, in about 9 years of editing here, I don't think I've ever seen it. Renard Migrant (talk) 17:28, 24 September 2016 (UTC)
I seem to remember some kind of Usage notes template that included the header in it, but it might have been subst-only. Chuck Entz (talk) 18:05, 24 September 2016 (UTC)
I am fine with this proposal, but you should still make your parser smart enough to know when it is inside a template and when it is outside a template. More problematic is the possible inclusion of things like headers via templates, not as template parameters. - TheDaveRoss 19:44, 24 September 2016 (UTC)
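
To illustrate the parsing concern raised in this thread, here is a minimal Python sketch. It is not any bot's actual code: the function names, the header regex, and the crude brace counting are all assumptions made for the example. The first function treats every header-like line as real, which is only safe if the proposed rule holds; the second skips header-like lines that begin while a `{{` is still open.

```python
import re

# Assumed header pattern: 2 to 6 equals signs on each side of the title.
HEADER_RE = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")

def find_headers_naive(wikitext):
    """Return (level, title) for every line that looks like a header."""
    headers = []
    for line in wikitext.splitlines():
        m = HEADER_RE.match(line)
        if m:
            headers.append((len(m.group(1)), m.group(2)))
    return headers

def find_headers_depth_aware(wikitext):
    """Like the naive version, but ignore header-like lines that start
    while a template is still open. Depth tracking is a rough count of
    '{{' and '}}' per line, not a full wikitext parser."""
    headers = []
    depth = 0
    for line in wikitext.splitlines():
        if depth == 0:
            m = HEADER_RE.match(line)
            if m:
                headers.append((len(m.group(1)), m.group(2)))
        depth += line.count("{{") - line.count("}}")
        depth = max(depth, 0)  # tolerate stray closing braces
    return headers

sample = "==English==\n\n===Noun===\n{{example|\n==Fake==\n}}\n# def"
print(find_headers_naive(sample))
print(find_headers_depth_aware(sample))
```

On the sample, the naive version also reports the "Fake" header inside the hypothetical `{{example}}` call, while the depth-aware version does not. Neither version addresses the separate case TheDaveRoss mentions, where a template's own output contains a header.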

Proposed addition to WT:NORM: no template parameter expansions

This means that things like {{{1}}}, with three curly braces, can't appear in the wikitext. This is probably something that goes without saying, since regular pages aren't ever passed parameters. But to have it codified would again be a useful assumption for parsers: rather than having to decide whether a bunch of curly braces should be grouped two or three, it can assume it's always two. —CodeCat 20:11, 10 August 2016 (UTC)

I support this. Edit: Some rationale: I recognise it is not desirable to turn WT:NORM into a grammar, but I think just a few lines should make the extensions completely orthogonal to the Wikitext abstraction. Given the potential for adding logic into the triple brace notation, and the fact that a pushdown parser is required to fully treat this syntax, this exception is worth codifying. Isomorphyc (talk) 03:57, 11 August 2016 (UTC)
That sounds reasonable. - -sche (discuss) 09:32, 13 August 2016 (UTC)
I am fine with this. - TheDaveRoss 19:55, 24 September 2016 (UTC)
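
As a rough illustration of how such a rule could be checked, the following Python sketch flags triple-brace parameter expansions in page wikitext. The function name and the regex are assumptions made for this example, and the pattern only covers the simple forms `{{{name}}}` and `{{{name|default}}}`; it is a lint-style check, not a full parser.

```python
import re

# Assumed pattern: {{{name}}} or {{{name|default}}} with no nested braces.
PARAM_RE = re.compile(r"\{\{\{([^{}|]+)(?:\|[^{}]*)?\}\}\}")

def find_parameter_expansions(wikitext):
    """Return the names of all simple triple-brace parameter expansions."""
    return PARAM_RE.findall(wikitext)

page = "{{m|en|word}} and {{{1}}} and {{{lang|en}}}"
print(find_parameter_expansions(page))
```

Ordinary nested templates such as `{{foo|{{bar}}}}` are not flagged, because they never contain three consecutive opening braces. If the rule were adopted, a parser could rely on a check like this having passed and always group braces in pairs.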

Wiktionary:About Scots

Who here contributes a lot to our Scots lexicon? I notice that words that mean the same thing in both English and Scots are never added by users other than me for some reason. I've added terms like electromagnetic, Denmark, and others, that have the same meaning as in English. I guess it's because people often consider Scots a dialect rather than a language. I feel like we should have all terms in Scots, or else we should make a formal decision about which Scots terms should be allowed here and which shouldn't. Philmonte101 (talk) 21:36, 10 August 2016 (UTC)

@User:Angr, User:Nbarth, User:Leasnam, you three might be interested in this discussion. Philmonte101 (talk) 23:48, 10 August 2016 (UTC)
I agree that these words should be added, but like we saw with Middle English entries, it tends to fall by the wayside. It may seem like duplicate effort to try and get Scots words in when they're the same as the English word. Time to roll up our sleeves I guess Leasnam (talk) 04:48, 11 August 2016 (UTC)

I just had a few things pop into my head. You're right, for one, it seems really tedious to add all of those entries. But what if we could possibly get a list of these terms from somewhere, like, say, another dictionary dedicated to the Scots language? And then we could either add them manually or automatically create the entries somehow, like, say, a bot of some kind that takes information from the English entries and converts them into Scots (except no definition, just a simple one-word translation)?

@ User:Leasnam But then again, we'd have to worry about verification of these terms, as per WT:ATTEST. I wonder, is there an easier way to verify Scots terms? When I try to look for Scots terms in, say, Google Books, for "electromagnetic", all I find are English sources. Is there an online Scots library of some sort, or some way to search in only Scots language books/archived documents? Philmonte101 (talk) 05:09, 11 August 2016 (UTC)

Hrmm, there are several, but I am unsure if they are copyrighted or not. Most likely would be. I tried searching for "electromagnetic" and "maist"/"mair"/"ilka" and didn't turn up anything in Scots. Scots is mainly devoted to poetry and older language. Anything having to do with electromagnetism I think would be written in English (?) Leasnam (talk) 05:24, 11 August 2016 (UTC)
I'll be honest with you, I found a lot of those terms on Wikipedia, and searched them up to make sure they were used consistently on the site, and didn't seem to have any variations. So I assumed that because of this those terms may be attested. But I guess I was wrong about that. But wait, couldn't Scots be one of those languages that has a template that says it's scarcely documented, and so therefore the rules of CFI for it are different than for more common languages? I know I've seen Malagasy terms that had this template. Philmonte101 (talk) 16:47, 11 August 2016 (UTC)
Semi-relevant: one thing that annoys me is that when you create a Scots verb (sco-verb), it inflects in a certain way (can't remember exactly, but something like e.g. walkit instead of walked) that often doesn't reflect the actual literature. It is, of course, hard to draw a clear line between English and Scots words, given the history. Equinox 20:09, 11 August 2016 (UTC)

Telugu wikisaurus

I would like to create a Wikisaurus for the Telugu language. Where and how should I do it? Which platform is appropriate: English Wiktionary or Telugu Wiktionary? Thank you if someone answers.--Rajasekhar1961 (talk) 04:01, 11 August 2016 (UTC)

I would try Telugu Wiktionary. DCDuring TALK 10:45, 11 August 2016 (UTC)
Can someone help me create the Wikisaurus in Telugu Wiktionary? I need some technical assistance to create the necessary Wiktionary:Wikisaurus/Format and templates there. Thanking you.--Rajasekhar1961 (talk) 11:28, 11 August 2016 (UTC)
@Dan Polansky did the work on that here. DCDuring TALK 13:13, 11 August 2016 (UTC)

In French Wiktionary there are thesauri in different languages, not only in French, with translations into French. Is it different in English Wiktionary? In which convention page is it specified? Noé (talk) 10:00, 18 August 2016 (UTC)

Usage note at schrift

I removed this usage note, considering it pointless, but User:Morgengave restored it and expanded it further. I don't think this is much of a usage note at all, since it doesn't say anything about the usage of the term (that the definitions don't already say), nor is it customary for us to mention other terms different by capitalisation in usage notes. We have this on Earth vs earth, but not Moon vs moon or most other cases where this happens. What do others think? —CodeCat 21:56, 11 August 2016 (UTC)

Hi CodeCat - My personal view is that if it can help the person consulting the lemma, then let us include it, on condition that it remains sharp and concise obviously. Overall, it would be helpful to have a Wiktionary:Usage notes policy to use as guidance. Morgengave (talk) 22:05, 11 August 2016 (UTC)
We already have {{also}} for this, which works fine for the vast majority of pages. So why do these few cases in particular require a usage note? Also, does your usage note even give any notes on usage? —CodeCat 22:08, 11 August 2016 (UTC)
I'm fairly relaxed about this - I am open to multiple solutions as long as the user is helped. I believe it helps more than the "also" on the top of the page as in my view the also-template is for users to find the right lemma back easily and the usage notes are to help the user on usage (including potentially confusing situations). Two different purposes in short. There's unfortunately no usage notes policy and this one sentence seems fairly harmless even if one would deem it redundant. Can you help me understand you better: why are you keen on removing this one sentence? Morgengave (talk) 22:27, 11 August 2016 (UTC)

Reference specifications

As per a debate I have had recently and in the past with @Dan Polansky, an example of which may be found at Template talk:R:DSMG, I think that references in templates or in entries should be explicit and in full format. Pace Dan, whatever beauty may be derived from a “simple” format of templates such as {{R:DSMG}} and {{R:Webster 1913}} accompanies a loss of relevant citation information. We have templates created (and updated recently by @Smuconlaw), {{cite-book}}, {{cite-journal}}, etc., that provide standardized citation functionality. Indeed, I would be prepared to start a vote to make rules for citation generally. What do people think? Is this a minor quibble, or do people agree that we should have a standardized, full format? —JohnC5 17:26, 12 August 2016 (UTC)

I am fine with a full detail being available on a mouseover. Thereby, the detail would be there for those who require it, while it would not block the radar screen and disturb the skimming focus of those who love succinct identification. --Dan Polansky (talk) 17:28, 12 August 2016 (UTC)
By way of example: into {{R:is:IEO_1989}}, I placed the following two formats, the short one and the long one, the long one being visible upon mouseover:
  • word in Hólmarsson et al.: Íslensk-ensk orðabók. 1989.
  • word in Sverrir Hólmarsson; Sanders, Christopher; Tucker, John • Íslensk-ensk orðabók / Concise Icelandic-English Dictionary • Reykjavík: Iðunn, 1989
--Dan Polansky (talk) 17:31, 12 August 2016 (UTC)
I agree that excessive details can be hidden using a mouseover or expandable box. (I would like this for our quotation templates as well). DTLHS (talk) 17:33, 12 August 2016 (UTC)
If this is to be done it should be done consistently across all templates with regards to what information is hidden. "Box" wasn't the right word, just an expandable section. I prefer that to a mouseover because with a mouseover you can't have any links and you can't select and copy the data. DTLHS (talk) 17:43, 12 August 2016 (UTC)
What should be done for platforms that have no mouse pointer? —CodeCat 17:46, 12 August 2016 (UTC)
An alternative I have in mind is that each reference template would contain a link to a section in an appendix page for reference templates. That section would contain a full identification and more. As for technology, it is a simple wikilink. --Dan Polansky (talk) 17:52, 12 August 2016 (UTC)
The proposal here would be to have giant appendices containing full versions of every citation we use or choose to abbreviate? —JohnC5 18:12, 12 August 2016 (UTC)
You mean every reference, right? I hope you do not intend to push your ornamental cast iron to our poor attesting quotations; they are already too noisy, putting metadata before the quotation itself. As to the substance of your question, the appendices obviously do not need to be "giant"; they can be as granular as we see fit, and therefore as small as we see fit. --Dan Polansky (talk) 18:21, 12 August 2016 (UTC)
Don't worry, none of this will ever happen since there are a million different reference formats none controlled by the same back end, making any kind of unification impossible. DTLHS (talk) 18:42, 12 August 2016 (UTC)
Oh ye of little faith! We certainly can make a standard then fix all the templates. —JohnC5 19:36, 12 August 2016 (UTC)

Proposed extension to criteria for inclusion on proper names of fictional works

I'd like to propose a change (and if these become votes, they'd be separate votes from one another) to our criteria for inclusion.


  • Proper names of titles of fictional works, such as books, television series, video games and video game series, should be included in our lexicon, as long as they have 3 sources that are independent from the book/series itself. I.e. the book citations, or Usenet citations, must not specifically be about the television series or video game.
  • Please note that this proposal is not about appending fictional characters or names of fictional entities into Wiktionary; just about the titles of the works themselves (usually represented by italics). I feel that characters and entities should follow the guidelines that are here already.


  • Just like countries, cities, county names, etc., titles of these works are proper nouns.
  • Many will argue that including such things "is not traditional." Though it is not traditional, I'm surprised that we didn't include these already. For example, with Wikipedia, traditional paper encyclopedias don't generally include articles about television series or cartoon characters. Well, they might have a few of the really important ones, but not many. So, Wikipedia is thus extremely different from the traditional encyclopedias in many ways, and is in fact better if you ask me. I'd say the same thing about Wiktionary. Wiktionary includes far more information than most paper dictionaries do. Many dictionaries don't include nearly the amount of etymological information, synonym information, derived terms, anagrams, pronunciation, etc., that we do here. Also, they generally don't include rare slang terms. We do. Most paper dictionaries wouldn't include "all words in all languages", because, well, it'd be silly; millions + pages. Most paper dictionaries don't include individual entries for inflected forms. And now add this; paper dictionaries generally don't include names of popular TV series, or classic works of literature, etc. But, it would be informative to readers, so why don't we?
  • Many TV series have a few translations in other languages. Such as, The Simpsons sometimes translates to Los Simpson in Spanish. The TV series Cops apparently translates to Zsaruk in Hungarian. I could find more examples, but you get my point. Many people might want to know the translations of these proper nouns. Of course, the translations as well as the English entries would have to be verified as per the changed CFI.


TV series


  • [4] "The network felt that Duckman, the Emmy award-winning, but low-rated series about an acerbic, chauvinistic detective and his bumbling family, did not reflect the general-entertainment brand model that USA was trying to build in prime time [...]" 2013
  • [5] "This may be the show that proves TV animation can stay up past most kids' bedtimes and still find a strong, profitable audience. "Duckman," a new latenight series on USA Network, is crude, violent, cynical, antisocial and a little sexist, pretty [...]" 1996
  • [6] Usenet. Just scroll all the way down until you find ones that don't have "Duckman" in the title.
Classic book

Winesburg, Ohio

  • [7] Groups, from 2010.
  • [8] "As I begin to reevaluate the place of Sherwood Anderson's Winesburg, Ohio in the development of American fiction, I first want to look at Anderson's symbiotic relationship with Gertrude Stein, a relationship most Stein devotees will know about [...]" 1999
  • [9] "In triggering conversation in Winesburg, Ohio, however, a single word can dramatically alienate and isolate dialogic partners in the frightening immediacy of their encounter; such contact is always a "traumatism of astonishment." 2009

Separate proposal

If this doesn't work, we may be able to have these proper names of fictional or nonfictional works somewhere in the appendix namespace.


I just threw this together in about 30-45 minutes. But you get my point, I'm sure. You can find sources that aren't directly about these proper nouns, that are from varying years, and that were not written by its creator. If these were the inclusion standards for entries for book names or TV show names, we should also have a header that italicizes the proper noun, as this is the standard in English.

Comments below

So, what do you think? I feel the urge to start a vote, and it says to start discussion in the beer parlour. I know quite a few of you will most definitely and immediately disagree, and I have a semi-good idea of which users will and won't (whom I know). Although, I'm sure there will at least be some who agree or at least partially agree with this proposal, and I'm open to suggestions to things I should change before starting the vote. (No personal attacks please) Philmonte101 (talk) 05:16, 13 August 2016 (UTC)

  • Oppose. I hope other editors will articulate the reasons; I think it's kind of obvious. --Dan Polansky (talk) 07:49, 13 August 2016 (UTC)
  • Oppose We have enough work to do just filling out, cleaning up, and otherwise maintaining what we've got. DCDuring TALK 12:47, 13 August 2016 (UTC)
  • Oppose. This type of information is better suited to an encyclopedia. If only there were an encyclopedia version of Wiktionary... --WikiTiki89 13:27, 13 August 2016 (UTC)
    Precisely. Oppose. Equinox 13:28, 13 August 2016 (UTC)
  • Oppose for the reason given by Wikitiki89. If a fictional title or character has gained some idiomatic meaning, then it merits inclusion here (e.g., Wonder Woman to mean a woman of extraordinary ability). If it only retains its fictional meaning, then it belongs at Wikipedia. — SMUconlaw (talk) 11:33, 15 August 2016 (UTC)
  • Oppose. We might as well include the names of specific people, like Justin Trudeau, so that people who want to know why his parents chose that name for him can look it up under the etymology. Andrew Sheedy (talk) 17:22, 15 August 2016 (UTC)

oversized Cyrillic for Old Church Slavonic and Old East Slavic

For some reason, the Cyrillic font we use for Old Church Slavonic and Old East Slavic renders bigger than the Cyrillic font for Russian, at least on my Mac OS X laptop under Chrome. See тать for an example; compare the Old Church Slavonic entry to the Russian entry, and see the Russian etymology for an example of Old East Slavic, which looks (on my machine) the same as Old Church Slavonic. Do we want to fix this? Benwing2 (talk) 21:15, 13 August 2016 (UTC)

The reason for this is that the specific Old Cyrillic fonts come out smaller and therefore need to be rendered bigger. Your Mac is probably using Helvetica or whatever the default Mac font is, because it supports the characters and because you don't have Old Cyrillic fonts installed. Ideally, we should be able to provide font-specific sizes, but I don't think CSS supports that. --WikiTiki89 21:32, 13 August 2016 (UTC)
If you lack fonts, these fonts may help you. [10] (In this case, try Noto Sans or Noto Serif.) --Octahedron80 (talk) 12:57, 18 August 2016 (UTC)
The issue is not character support in the fonts, but rather the choice of font. The Old Cyrillic script is meant to be displayed like this. --WikiTiki89 13:18, 18 August 2016 (UTC)
I'll pass if it depends on font variations, since they are located on the same codepoint. --Octahedron80 (talk) 00:38, 19 August 2016 (UTC)

"book cites aren't usexes"

In diff user:Equinox removed the {{ux}} template. It's good and well if we decide that this template is strictly for usexes (which is far from decided as far as I know, but never mind), but the template should not be removed altogether. Instead an alternative template should be provided that is more appropriate. —CodeCat 20:26, 14 August 2016 (UTC)

At the moment, ux puts the text in italics, which doesn't look good for book citations. Equinox 20:27, 14 August 2016 (UTC)
{{ux}} is wrong if we wish to maintain the customary italicization of book/journal/newspaper titles. I can't understand how the failure to explicitly exclude {{ux}} from use for citations constitutes sanction in favor of it. One could as easily claim that wikitext can overwrite all templates not explicitly endorsed by a voted policy. This kind of thinking is dangerous in an admin. DCDuring TALK 21:46, 14 August 2016 (UTC)
Maybe we should have a template that's identical to {{ux}} except it doesn't italicize, for use with quotations. People seem to like to use it in this way. DTLHS (talk) 22:19, 14 August 2016 (UTC)
Maybe we should keep {{ux}} for both usexes and quotes and maybe change it not to use italics for Latin chars. Note that it doesn't currently use italics in Cyrillic. Benwing2 (talk) 22:20, 14 August 2016 (UTC)
It's more or less like {{l}} versus {{m}}. —CodeCat 22:26, 14 August 2016 (UTC)
What is the advantage in using {{ux}} in terms of improved user experience, improved ease of adding content, speed of downloading, server load, etc.? Why aren't we hearing about such advantages? This also seems to go against the idea of intuitive names to speed the learning by new contributors. If that isn't important, why not rename {{ux}} to {{u}} for "qUotation" and "Usage example"? DCDuring TALK 22:38, 14 August 2016 (UTC)
For English, I don't know. For foreign languages it provides uniform formatting of translations and (for non-Latin scripts) transliterations. Benwing2 (talk) 22:57, 14 August 2016 (UTC)
Because the more consistent we make our entries the easier they are to edit. And the last 10 years have proven that we are utterly incapable of any consistency that isn't rigidly enforced by templates. DTLHS (talk) 22:59, 14 August 2016 (UTC)
For quotations, we have a whole series of templates (the most important being {{quote-book}}, {{quote-journal}} and {{quote-web}}) that can be used for a consistent appearance. I don't think any other templates are required. — SMUconlaw (talk) 11:29, 15 August 2016 (UTC)
  1. How does {{ux}} do a better job of making our entries easier to edit? It looks like just a labeling requirement imposed on others to make life easier for amateur programmers.
  2. The "quote-" family of templates does make for a great deal of uniformity in line and character formatting and in order of the components of citations. What does {{ux}} add? If the idea is that the advantage will emerge in the fullness of time, we would need to have a great deal more faith in the capability of our "technical" contributors than I believe they have earned. DCDuring TALK 13:31, 15 August 2016 (UTC)
{{ux}} provides automatic transliteration, whereas {{quote-book}} et al. do not. A deal breaker for me. --Vahag (talk) 14:03, 15 August 2016 (UTC)
I didn't know that {{ux}} provides automatic transliteration. But in that case, the solution is for someone knowledgeable about Lua to add automatic transliteration to {{quote-meta}}. (The "quote-" family of templates already has a |transliteration= parameter.) Using {{ux}} in this context is not very appropriate because it formats quotations differently from the "quote-" templates, leading to a lack of consistent appearance. — SMUconlaw (talk) 15:34, 15 August 2016 (UTC)
But why mix two different functions (citations and text rendering) in one template, when it would be easier to just use two templates? --WikiTiki89 17:06, 15 August 2016 (UTC)
Since the "quote-" templates are already intended for formatting quotations, why not just build the automatic transliteration function into {{quote-meta}} instead of having to use yet another new template? — SMUconlaw (talk) 17:20, 15 August 2016 (UTC)
Transliteration is not the only thing missing. {{ux}} (and I guess now {{quote}}) support many features that are useful for rendering text, such as allowing language-links, and may support more features in the future, such as automatic linking and who knows what else. It wouldn't make sense to add each of these features in more than one place, when they can just be added to one place. Let the "quote-" templates focus on formatting the citation line itself and not worry about rendering quotation text. --WikiTiki89 17:26, 15 August 2016 (UTC)
I see. But the "quote-" templates also render the quotation text through the |passage= parameter. The current situation means that two separate templates have to be used for formatting quotations depending on what features are required in the quotation text. I hope this is explained somewhere (perhaps at "Wiktionary:Quotations"). — SMUconlaw (talk) 17:45, 15 August 2016 (UTC)
I don't see any inherent problem with using two different templates. In fact I even find that it makes the wikitext more readable. --WikiTiki89 17:52, 15 August 2016 (UTC)
I think {{quote-meta}} should be using the usex module to render the quotation text. DTLHS (talk) 17:47, 15 August 2016 (UTC)
You could do that, but you would have to ensure that there are no argument naming conflicts and such. I don't think there are any at the moment, but it would be an extra thing to worry about whenever adding a new argument to either {{ux}} or any of the individual "quote-" templates. I don't see the point. --WikiTiki89 17:52, 15 August 2016 (UTC)
I have created {{quote}}, which works the same as {{ux}} except for these differences in formatting. And it does do automatic transliteration. —CodeCat 15:44, 15 August 2016 (UTC)
What exactly are the differences in formatting? I tried it at קטון‎, but the formatting is exactly the same. Does this only apply to Latin script? --WikiTiki89 17:06, 15 August 2016 (UTC)
CodeCat answered this question at User talk:KIeio‎#Template:quote, saying: “It doesn't apply any formatting to the quoted text, so that it preserves its original formatting as much as possible.” @CodeCat: I presume you meant basically that italics are not applied? Is there any other difference? And can we get the following to work as expected: {{quote|ru|Слова ''да'' и ''нет''.}} (currently the italics don't do anything in scripts other than Latin)? --WikiTiki89 19:52, 15 August 2016 (UTC)
This can't be fixed unless we allow for Cyrillic italics in general. Previous discussions have mostly led to the conclusion not to allow them. —CodeCat 21:08, 15 August 2016 (UTC)
This can be fixed without allowing non-Latin scripts to be italicized in mentions and usexes. Previous discussions have led only to the conclusion not to allow italicizing non-Latin mentions and usexes, but that does not apply to quotations. --WikiTiki89 21:55, 15 August 2016 (UTC)
True, this would work thanks to {{mention}} having distinct style tags (finally, a good use for it). I just wanted to make sure that it was ok to remove the blocking of all Cyrillic italics, which I believe we have currently. —CodeCat 22:19, 15 August 2016 (UTC)
Is there a reason why {{ux}} no longer seems to italicize example sentences? It's introduced a whole bunch of inconsistencies (compare, for example, the verb and noun sections at shift, one of which was italicized manually; the other with the template). Also, now that the example sentences are no longer visually distinguished from the definitions, it is much harder to read. I'm guessing this discussion has something to do with the change, but could it be reverted? It's just undone possibly hundreds of my edits which were aimed at increasing consistency between entries. Andrew Sheedy (talk) 20:02, 15 August 2016 (UTC)
Module:usex if quote and (sc:getCode() == "Latn" or lang:getCode() == "und") then. @CodeCat Shouldn't it just be if (sc:getCode() == "Latn" or lang:getCode() == "und") then? DTLHS (talk) 20:09, 15 August 2016 (UTC)
That seems to have been a mistake. I fixed it. --WikiTiki89 20:11, 15 August 2016 (UTC)
Perfect, thanks. I'm glad to know it was an accident rather than another template change I have to adjust to.... Andrew Sheedy (talk) 20:17, 15 August 2016 (UTC)
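For readers following the Module:usex exchange above, here is a minimal Python sketch of the two conditions being compared. The real code is Lua; the function names below are hypothetical, and the sketch assumes the guarded branch is the one that applies italics. It only illustrates why requiring the quote flag stopped {{ux}} from italicizing Latin-script examples; it is not the module's actual implementation or the exact fix that was applied.

```python
def italicize_with_quote_guard(script_code: str, lang_code: str, quote: bool) -> bool:
    """Condition as quoted in the thread: italics only when `quote` is set (hypothetical name)."""
    return bool(quote and (script_code == "Latn" or lang_code == "und"))


def italicize_without_quote_guard(script_code: str, lang_code: str) -> bool:
    """Condition proposed by DTLHS: depends only on script/language (hypothetical name)."""
    return script_code == "Latn" or lang_code == "und"


# A plain usage example ({{ux}}, so quote=False) in Latin script:
print(italicize_with_quote_guard("Latn", "en", False))   # guarded version skips italics
print(italicize_without_quote_guard("Latn", "en"))       # proposed version italicizes
print(italicize_without_quote_guard("Cyrl", "ru"))       # non-Latin scripts stay upright
```

Under these assumptions, the guarded version never italicizes an ordinary usage example, which matches the behaviour Andrew Sheedy reported.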
It's pretty standard actually not to use a template for a citation. Not sure why. Perhaps no template has ever exceeded simply doing it by hand. Renard Migrant (talk) 20:27, 15 August 2016 (UTC)
It should never have been standard. DTLHS (talk) 20:31, 15 August 2016 (UTC)
The quotation templates have been greatly improved over the past few months. --WikiTiki89 20:37, 15 August 2016 (UTC)
 SMUconlaw (talk) 09:10, 16 August 2016 (UTC)
@Smuconlaw: I don't know if you've gotten one, but you deserve a very big thanks for that! It was something I always thought about doing but the templates were such a mess I was too afraid to go near them. --WikiTiki89 14:58, 16 August 2016 (UTC)
Awww, shucks! No problem, it was quite interesting working on those templates. — SMUconlaw (talk) 15:42, 16 August 2016 (UTC)

Templates in Category:Quotation reference templates should use {{quote-book}} et al when possible

I'm hoping there's agreement on this. Some of the templates have extra parameters that may not fit elegantly. It will be some work but there aren't too many templates to convert. DTLHS (talk) 21:27, 15 August 2016 (UTC)

Is this applicable to reference websites? DCDuring TALK 21:56, 15 August 2016 (UTC)
What do you mean? DTLHS (talk) 21:57, 15 August 2016 (UTC)
  • Support: I'm in favour as this would standardize the formatting of quotations. — SMUconlaw (talk) 09:10, 16 August 2016 (UTC)

Extended flexibility vote

FYI, I extended Wiktionary:Votes/pl-2016-07/Editing "Flexibility" by 1 month per request. --Daniel Carrero (talk) 00:44, 16 August 2016 (UTC)

I request that it be closed as per its original creation page. DCDuring TALK 02:30, 16 August 2016 (UTC)
Three people, including myself, supported the extension in the #Decision section in the vote. --Daniel Carrero (talk) 02:46, 17 August 2016 (UTC)

Russian combining forms like -бавить or -ключить

I created a number of entries visible in CAT:Russian verbal combining forms. These are verbs where the base verb is missing but various prefixed derived verbs exist, and I want to create an entry for the base verb for use in etymologies and such. CodeCat (talkcontribs) didn't like the term "combining forms". What do others think? Benwing2 (talk) 09:13, 16 August 2016 (UTC)

I have no strong opinion on this. The categories are useful but I haven't seen similar examples for naming them. --Anatoli T. (обсудить/вклад) 10:25, 16 August 2016 (UTC)
For the L3 header, I'd just call it a verb. We don't seem to have entries for parallel things in English like -ceive, but I did make an entry for Old Irish ·icc and call it a verb. —Aɴɢʀ (talk) 14:36, 16 August 2016 (UTC)
What would make the most sense would be to call them reconstructions: *бавить (*bavitʹ), *ключить (*ključitʹ). But for some reason I don't like that idea. I also don't like the hyphens in -бавить (-bavitʹ) and -ключить (-ključitʹ). I would say we should do what Angr did with ·icc and put them at бавить (bavitʹ) and ключить (ključitʹ). --WikiTiki89 15:01, 16 August 2016 (UTC)
Would they survive an RFV? —CodeCat 15:31, 16 August 2016 (UTC)
It would be an RFD question, because the claim would be that they are attested as part of their derivations. --WikiTiki89 15:49, 16 August 2016 (UTC)
That would make them like ceive or ject, which I doubt would survive. RFV demands attestation of the lemma itself, it doesn't allow for such exceptions as far as I know. —CodeCat 16:05, 16 August 2016 (UTC)
Adding a hyphen onto them doesn't suddenly make them any more or less attestable than they were before. This issue is about what the entry name should be and not about attestation. For some languages, like Arabic, we don't indicate prefixes and suffixes with any sort of hyphen in entry titles. For Sanskrit, all our noun lemmas are actually suffixless stems that don't really exist on their own. This isn't much different. --WikiTiki89 16:27, 16 August 2016 (UTC)
-ceive and -ject aren't quite parallel to -бавить and -ключить because (among other reasons) all words containing the former morphemes were borrowed from French and Latin with those morphemes in them, similar to Russian -инг. And we do have entries for things like Latin -bulum that can't be attested on their own so I don't see how the RFV issue applies. Benwing2 (talk) 00:10, 17 August 2016 (UTC)
What about the case of verdwijnen? There is no verb dwijnen, at least not in current Dutch. The point is that we have essentially an unattested verb that it might be desirable to have an entry for. In Latin, we have opted to go for a reconstructed entry for the unattested base verb, as linked in the etymology of abdō, ēdō, dīdō and others. —CodeCat 00:22, 17 August 2016 (UTC)

User:Wyang is edit warring again

I have tried to explain the situation to him on his talk page, but he doesn't seem to want to understand that he can't just change common practice regarding transliteration to suit his own personal tastes. Big changes to common established practice like this need discussion and consensus, and I consider this a big enough change to require a vote, but I am having a hard time getting him to actually do so and wait for consensus. Instead he edit wars over it to try and force his change through, since he thinks he is right, anything is warranted and any opposition is apparently shortsighted and Eurocentric and therefore it's ok to ignore consensus. Can someone else please try explaining it to him and try to get him to stop messing with the modules? The only thing I can do is continue to revert him. Thank you. —CodeCat 02:01, 17 August 2016 (UTC)

It has been very frustrating interacting with User:CodeCat - unreceptiveness to suggestion, poor participation in discussions at the Beer Parlour, blocking wilfully, replying with completely irrelevant comments, and impetuous reverts without any input to the topic at hand. The word being thrown around is consensus, when there is not even one to begin with. I repeatedly asked for consensus for treating romanisation and transliteration as equivalent in Module:links, but User:CodeCat's response is plain simple - evasion, evasion and evasion. Without any clear and thorough discussion showing your edit is consensus, why are you throwing around the consensus as if there is one? If you are not willing to discuss, you should not be making any changes, let alone reverting impetuously. Disappointing that such blatant bullying is condoned. Wyang (talk) 02:11, 17 August 2016 (UTC)

Proposal: "Description" section for symbols

I've been using the Etymology section to place descriptions like these for some symbols.

Proposal: I'd like to use a "Description" section instead.


  • These are descriptions, not etymologies.
  • Maybe this would discourage definitions that are merely the Unicode description of the symbol, which would be a good thing.

Template:editnotice-exotic symbols says: "When creating this entry please make sure you give the symbol a proper definition, preferably with attestation. Mere Unicode code point name does not constitute a definition. Symbol entries without proper definitions may be deleted." Related discussion: Wiktionary:Beer parlour/2015/January#Is documenting all Unicode characters within the scope of Wiktionary?.

If someone creates an entry like that, (using the code point name as the definition, I mean) I was hoping we would be able to say: "The definition is not the place to describe the symbol, use the Description section instead. The definitions are for real meanings that can be attested."


As I said, my idea is to use the Description section for symbols like 💾 and the others above, but if we agree about allowing the section for some symbols, it raises the question of whether common letters, numbers, punctuation, etc. as well should have a Description, too. I'm not sure about whether they should. I'm leaning towards allowing it, but I'd like to know what others have to say. I thought of a few examples for consideration:

  • A = "An upside-down V (two symmetrically opposed diagonal lines meeting at the top-middle point) with a horizontal line in the middle, from one diagonal line to the other. (also mention about the appearance of "A" in handwriting)"
  • ! = "A dot below a vertical line."
  • + = "A vertical and a horizontal line, crossing in the middle."
  • ¨ = "Two horizontally-aligned dots, to be placed above a letter."

I think there may be reasons not to want a "Description" section for all characters of all scripts. Han compounds like "秋 = compound of 禾 ‎+ 火" are real etymologies and descriptions too. Correct me if I'm wrong, but for Han compounds, I believe Etymology is enough and they don't need a "Description" section. Other scripts might have other considerations. --Daniel Carrero (talk) 03:02, 17 August 2016 (UTC)

  • Agree I think that describing symbols and how they may appear "in the wild" with actual usage is a valuable resource. I don't know that we need to describe "A"--if you can read English, you already will recognize this character. —Justin (koavf)TCM 05:44, 17 August 2016 (UTC)
  • Unsure. This would add a special case/section just for unicode characters and might be confusing. The entry layout is already complicated enough. I suspect that "Description" might get used outside of the unicode/symbol context. – Jberkel (talk) 11:19, 17 August 2016 (UTC)
"An upside-down V" would be a terrible way to describe something that wasn't actually derived from the letter V. Very misleading. Equinox 15:29, 17 August 2016 (UTC)
Fair enough. --Daniel Carrero (talk) 22:28, 19 August 2016 (UTC)
IMO "a V-shape" or "a V-like shape" is fine, though. I think "an upside-down V-shape" or "the shape of an upside-down V" would probably be OK. - -sche (discuss) 01:10, 20 August 2016 (UTC)

Is it better to put in usage note instead? --Octahedron80 (talk) 06:11, 18 August 2016 (UTC)

Maybe not, it would sound odd to me. Concerning "⚤", the text "Interlocked male and female symbol." is not how to use the symbol. It is how to draw the symbol, or what to expect in Unicode fonts.
I thought of having a separate Description section, also because it is a repeated, specific type of information that many symbols would have. The Usage notes section is for miscellaneous usage information. --Daniel Carrero (talk) 18:33, 20 August 2016 (UTC)
Maybe the Description section would be useful for someone to know what a certain symbol looks like, without installing the right Unicode font. I'm also willing to consider the hypothesis that a textual description of a symbol or letter would be useful to blind people. Maybe also for creators of fonts, I don't know.
I seem to remember that a certain Unicode character was sometimes depicted as a cross and sometimes depicted as a full church. To me, this sounds like something we should mention somewhere. --Daniel Carrero (talk) 11:36, 21 August 2016 (UTC)

I created Wiktionary:Votes/2016-08/Description. --Daniel Carrero (talk) 16:08, 22 August 2016 (UTC)

I worked closely with a blind person for two years. She used Braille (on paper) and screen-reader software (on the computer) and she had no reason whatsoever to know or care about the shapes of letters, let alone obscure mathematical symbols. Let's not create worthless rubbish for no reason please. Equinox 17:17, 22 August 2016 (UTC)
Thanks for the info. I removed "A separate hypothesis, although unproved, is that it would be useful for blind people to know what the symbol looks like, too." from the vote rationale. --Daniel Carrero (talk) 17:54, 22 August 2016 (UTC)

Image availability

Are some images not available for use in Wiktionary? I tried using this one, which is used in Wikipedia, but couldn't get it to work [11]. DonnanZ (talk) 09:34, 17 August 2016 (UTC)

It's not hosted at the Wikimedia Commons. (Note the lack of "View on Commons" tab at the top of the page) —suzukaze (tc) 09:39, 17 August 2016 (UTC)
That's a shame, it's a great image. Is it possible to rectify that? DonnanZ (talk) 09:51, 17 August 2016 (UTC)
It seems like it's been a candidate to be copied to the Commons since February 2012 (see the Licensing section, which also includes detailed information on the moving process). —suzukaze (tc) 09:56, 17 August 2016 (UTC)
I'm not skilled in doing that. I wouldn't like to try! DonnanZ (talk) 10:05, 17 August 2016 (UTC)
@Donnanz: There is a guide at w:Wikipedia:Moving files to Commons. You already have an account at Commons just by virtue of having one here, so you don't need to do anything new. If this guide is too confusing, let me know by typing {{Ping|Koavf}} and respond here. Thanks for being so eager to learn and help us! —Justin (koavf)TCM 14:04, 17 August 2016 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I transferred it. It's now called "File:London United Tramways tram in front of its tram-shed, Kew Road, Richmond, UK - c 1900.jpg". — SMUconlaw (talk) 15:24, 17 August 2016 (UTC)

@Smuconlaw: Ah, wonderful. Thanks a lot (and also to Koavf). The result can be seen at tramshed. DonnanZ (talk) 18:10, 17 August 2016 (UTC)

Quotation questions

I recently added a quotation to mouth-breather, Gamergater, alt-right, and bluestockings. The quotation captures a number of colorful words, so I added it to several entries. The quotation was originally:

  • 2016, Ross Douthat, "A Playboy for President," The New York Times, 14 Aug.
    "But the cultural conflict between these two post-revolutionary styles — between frat guys and feminist bluestockings, Gamergaters and the diversity police, alt-right provocateurs and 'woke' dudebros, the mouthbreathers who poured hate on the all-female 'Ghostbusters' and the tastemakers who pretended it was good — is likely here to stay."

Note two things:

  1. The external hyperlinks are in the original article on the NYT website.
  2. My addition was to what I perceived to be the "main" page for mouth-breather, and intentionally not to mouthbreather or mouthbreathers. I viewed the latter two as secondary because they identified themselves as an alternate spelling and a plural, respectively, had no definition or quotations, and directed the reader to the "main" page.

Equinox and I discussed several things, and it was proposed I bring them here:

  1. Is it appropriate to preserve hyperlinks at all? (I don't believe search engines are affected by such relinks in a wiki article.)
  2. If so, should it only be on the entries where they satisfy two criteria: they are in the original source and they clarify the author's meaning of the entry in question. Here that would mean they would stay for Gamergater and alt-right, but not for mouthbreather or bluestocking.
  3. What page should quotations appear on when alternate spellings are in play?
  4. If the alternate spelling has no entry, should it be created and the quotation moved?

--Flex (talk) 21:54, 17 August 2016 (UTC)

1. You should be using {{quote-news}} with the url parameter. 2. I believe quotations should go with the exact spelling of the word, unless the word is exceptionally hard to attest. DTLHS (talk) 22:04, 17 August 2016 (UTC)
One distinction: regarding the word-form where citations go: while I believe in using the exact form (i.e. mouthbreather rather than mouth breather, if there is no space in the citation — since citations need to attest and provide evidence for a specific form existing), I don't think that has to apply to grammatical inflections. In other words, I think it's okay to put a mouthbreathers (plural) citation at mouthbreather, but not at mouth breather. Equinox 22:15, 17 August 2016 (UTC)
Regarding the preservation of hyperlinks in cited text, my other points when talking with Flex were (i) they are something like formatting (e.g. font colour) and not really vital to showing attestation of the word, (ii) our linking might have annoying semantic repercussions (e.g. Google strengthening its PageRank between the linked text and its target, whereas we don't want to reinforce the writer's opinions — though I believe there are technical ways around this, e.g. nofollow), and (iii) Web links die very frequently anyway, and we often have to remove them from entries (plus dead links often benefit subsequent cybersquatters). Equinox 22:22, 17 August 2016 (UTC)
My responses:
  1. It is a common practice for websites to add hyperlinks to previous articles on their own website and, perhaps less frequently, links to external websites. I'm leaning towards the omission of such links in quotations, particularly since those links may well go dead if not archived in some way. (It would theoretically be possible in some instances to archive them at, but indicating the archive URLs would be cumbersome. Consider the following: "But the cultural conflict between these two post-revolutionary styles — between frat guys and feminist bluestockings, Gamergaters [archived from the original on 9 August 2016] and the diversity police, alt-right provocateurs [archived from the original on 15 August 2016] and 'woke' dudebros, [] ")
  2. The quotation should appear under the main lemma, not the variant spelling. (This is the practice adopted by the OED.) It would be easier to gauge the vintage of a word that way, as all the quotations would appear on one page. Thus, an 18th-century quotation containing the obsolete form stowadore should appear together with more modern quotations from the 19th to 21st centuries where the lemma is spelled stevedore.
SMUconlaw (talk) 22:24, 17 August 2016 (UTC)
(It seems like Wikimedia wikis have nofollow on by default for external links. —suzukaze (tc) 22:24, 17 August 2016 (UTC))
Smuconlaw: I know it's OED practice, but we need a better rationale than "some experts do it that way". We don't know how they store their data internally; we only see the finished product, the dictionary. If we put all cites at the main form then it becomes very hard to see whether alt forms are supported or not. Equinox 22:27, 17 August 2016 (UTC)
[Edit conflict. I added a reason before seeing your latest comment. — SMUconlaw (talk) 22:29, 17 August 2016 (UTC)]
I think this is more of a user-interface issue. Wouldn't you agree that it makes sense to store citations with their exact attested form internally, but to show them all together to users who view an entry? (This would need further development work.) Equinox 22:32, 17 August 2016 (UTC)
Yes, if this can be technically achieved I see no issue with that. — SMUconlaw (talk) 10:31, 18 August 2016 (UTC)
We could use section transclusion to maintain the quotation in one place and include it in multiple other places. - TheDaveRoss 20:28, 15 September 2016 (UTC)
I think quotations should all go at the main form, but citations should be on the citations page of the exact form they exemplify. Andrew Sheedy (talk) 18:39, 22 August 2016 (UTC)
What if the main form has more than one sense? Should all the citations for the alternative form be grouped together even if they have different senses? DTLHS (talk) 22:40, 17 August 2016 (UTC)
I think it's logical (and our current practice, as far as I can tell) for quotations to be grouped by sense. — SMUconlaw (talk) 10:31, 18 August 2016 (UTC)
I think we should remove the hyperlinks, because what we are actually quoting is the durably archived version of the article, i.e. the version printed on paper (that will eventually end up on microfilm in libraries), and that version doesn't have hyperlinks. —Aɴɢʀ (talk) 19:04, 22 August 2016 (UTC)
I disagree. I think we're archiving the text the author actually published. In the case of a purely electronic medium, the links can be very relevant to the author's intent (whom does he consider an exemplar of "alt-right provocateurs"? Sarah Palin? Donald Trump? Richard John Newhouse? Ann Coulter? Specifically, he cites Milo Yiannopoulos.) The links in question are to reputable sources (Time and the NYT) which are less likely to expire than most.
Counter argument to myself: the print edition of this article did not have the hyperlinks, so at least in this case, it seems legitimate to remove them. That doesn't answer the general question of whether they should be included when possible. --Flex (talk) 21:17, 23 August 2016 (UTC)
Yes to linking. We definitely should include links--even if they expire sometime, we still have the original citation. In fact, we should include more links to Internet Archive and WebCite as archive links. —Justin (koavf)TCM 21:54, 23 August 2016 (UTC)

I'm having trouble detecting consensus here. What should I do on the two points in question? If this is not the place to look for consensus, where should I go? --Flex (talk) 21:17, 23 August 2016 (UTC)

wheel warring between User:Wyang and User:CodeCat -- not cool

Something has to be done here. I'm not following this issue closely but I did notice that Wyang blocked CodeCat for 1 day for edit warring, when (a) almost certainly Wyang was equally guilty, (b) it is absolutely not OK for an involved admin to block someone they're involved in a dispute with, esp. another admin. I get the feeling both are equally guilty and deserve to be blocked. Wikitiki actually did block Wyang, who somehow managed to unblock himself (??), something else that's definitely not OK. My first instinct is to block Wyang again for his bad behavior, but instead I'm just going to unblock CodeCat since this particular block should not have been put in in the first place. What do others think? Benwing2 (talk) 02:06, 18 August 2016 (UTC)

The main reason this has escalated to this point is because Wyang has shown no willingness to find a consensus for his proposed changes to Wiktionary practice, and my previous call for help on the matter was completely ignored. Since Wyang is also an admin, my ability to enforce rules and common practice are limited and reverting the contested changes while trying to reason with him is all I can do. Please advise what can be done in the future in dealing with a rogue administrator without making myself a guilty party. —CodeCat 02:46, 18 August 2016 (UTC)
To answer your "??": admins have permission to unblock themselves. Obviously if it gets to that point they should hopefully be trying to generate some consensus with the blocking admin or the community. Equinox 03:44, 18 August 2016 (UTC)

Wheel War- Action Taken

The conflict between User:CodeCat and User:Wyang has gone on long enough. They've been edit warring over an absolutely critical module used by huge numbers of entries. I'm not sure what that's doing to the edit queue- but it can't be good.

Both deserve to be blocked, but that would render them unable to contribute in discussions over the issue. It's also true that their misbehavior has been limited to editing protected modules and blocking each other.

Therefore, I have temporarily desysopped both of them, which will prevent them from editing the modules in question. I intend to restore them in one week, or when this is resolved- whichever comes first.

If edits need to be made to protected modules before then, I would appreciate it if our more knowledgeable admins would make themselves available to help out- perhaps User:Wikitiki89 or User:DTLHS?

I hope we can resolve this conflict quickly and get back to building a dictionary.

I would appreciate your feedback on my actions, since such things should only be done with community consensus.

Thanks! Chuck Entz (talk) 05:49, 18 August 2016 (UTC)

  • I can't think of any other action that would have been more appropriate. SemperBlotto (talk) 06:10, 18 August 2016 (UTC)
I think the desysopping was appropriate. I would even strongly propose that community consensus be obtained before reinstating the tools. CodeCat and Wyang have wheel-warred before, and each has blocked the other at least once, among other questioned actions. CodeCat, Wyang: you two are knowledgeable contributors to our content, and you are valuable contributors to our technical infrastructure, but you've both long (and not necessarily in equal measure) shown a tendency towards using your abilities to implement faits accomplis and get your way on e.g. module and entry layout or on treatment of Chinese, respectively. For instance, although on this page CodeCat calls on Wyang "to find a consensus for his proposed changes to [what she asserts is] Wiktionary practice", mere days ago Benwing called her out for again using her bot to create many new entries inconsistent with our existing entries and practices. Wyang, in turn, has threatened a few times to take his ball and go home if we don't agree with an action or, long ago, the unification of Chinese. These attitudes have driven away other editors; for instance, User:Mkdw just recently left after calling out Wyang's use of admin tools in the BP, while User:Ruakh has been largely inactive since earlier disputes with CodeCat over modules (as noted e.g. here) and the presentation of module errors (then and still now I agreed with CodeCat that module errors should generate a visible error message, but the dispute cost us a knowledgeable technical editor). This particular wheel-war seems especially excessive because the dispute seems to be not over whether there should be an automatic translit feature for complex non-European scripts like Thai, but over where it is most elegant to put that feature. - -sche (discuss) 06:49, 18 August 2016 (UTC)
  • As, what it feels like anyway, the only non-admin reading these discussion boards, I express my consensus and agree with -sche that the stripping should not be time-bound but powers should only be restored when the community is convinced that the issue is done with in such a way that neither will have any incentive to do something which sparks it up again. I also repeat my conviction that no party of an edit war (as defined by me above as a conflict where two reversals of an edit have happened) should have the right to block or unblock any participant. Korn [kʰũːɘ̃n] (talk) 10:14, 18 August 2016 (UTC)
  • I pretty much agree with you on everything (not unusual, by the way). We have here two equally stubborn and overbearing people who have met their match- if it weren't for the stakes and the damage done, it might be satisfying to see both get their comeuppance. As for duration, I was careful to say "I intend", because the week was just an arbitrary time picked out of the air, and I was hoping we could come up with something better. Right now both are responding with stereotyped "talking points" about the failings of the other, which shows both are still dealing with this on a strictly emotional level. The truth is, both are basically right about each other in the most part, but it's irrelevant. We need to come up with a solution that makes sense and that both can live with. Chuck Entz (talk) 14:16, 18 August 2016 (UTC)
    I think the desysopping has to continue until the matter is resolved. As long as there is no sulking, the project will continue to benefit from their contributions. I hope that the project does not suffer from lingering bad feelings once these valued contributors regain their sysop status. DCDuring TALK 13:10, 19 August 2016 (UTC)
  • I support the emergency temporary desysopping of both editors made by Chuck Entz on account of interminable wheel-warring. I believe a bureaucrat is authorized to take such temporary measures to eliminate this kind of wheel-warring, without a vote. --Dan Polansky (talk) 11:44, 21 August 2016 (UTC)

To be honest, I am not expecting any functional input from User:CodeCat regarding the topic at hand, based on her bullying behaviour and unwillingness to engage in discussions in the past few days. Her only argument has been that her edit was based on "consensus", which is obviously nowhere to be found, even when requested again and again.

Treating romanisation as equivalent to transliteration is clearly erroneous (since romanisation = transcription + transliteration), but she keeps reinstating this misinterpretation, with total disregard for the infrastructure of languages which make the distinction between transcription and transliteration on a romanisation level. For example, Module:th-translit does not even describe what it does after her edits, and she is apparently nonchalant about these languages ("It's a misnomer, but that's the way it is.").

This lack of regard for correctness, coupled with her previous heedless deletion of the indispensable code in Module:links (which precipitated all this), are acts of admin sloppiness. Her one-line response of "So, what happens now? Can we please get rid of the Thai code from Module:links now, or do we need some more edit warring?" to my detailed rationales for putting transcription support in the central modules, is exemplification of her uttermost apathy towards the actual topic ("would rather fight not explain") and disrespect to people.

This second episode was perfectly bound to happen, and bound to end tragically, when all that one side of the dispute cares about is "getting rid of the Thai code from Module:links now", even if she has to use "some more edit warring" for that. Yet, there are people cheering for her. Wyang (talk) 10:38, 18 August 2016 (UTC)

I support this action and wish that someone had done something sooner. I called for help above, but nobody responded, so I was very unsure what to do as I didn't feel like I had any options left, and it was all up to me. I'm sad to see that the community only cares when there is edit warring going on but is unwilling to help in solving the problem outside of that. At least now, people's attention is finally here so I can't complain too much.

As far as the dispute goes, I can summarise what I see:

  • Wyang, in principle, believes that transliteration modules should only be used for transliteration in the strict sense: letter-by-letter conversion.
  • Consequently, the Thai transliteration module does literal transliteration, which makes it pretty much useless for Thai.
  • This goes counter to how the term "transliteration" is generally used on Wiktionary; we use the term to refer to transcription, transliteration and romanization in general. Transliteration modules perform all of these functions, and the tr= parameter that is present on many templates is frequently provided with something that is not strictly transliteration, but rather adheres to the Wiktionary usage of the word. Our policies with respect to the use of these parameters and modules are labelled "transliteration" as well, as evidenced by WT:RU TR, WT:EL TR and WT:JA TR for example. None of these transliteration policies describes transliteration in the strict sense (av rather than ay for Greek, ō rather than ou for Japanese, etc.).
  • Because the transliteration module for Thai is useless by Wyang's own choice, Wyang decided that the best way around this was to insert special-purpose code into Module:links, a widely-used general-purpose module, to transliterate Thai correctly by using code present in another module, Module:th.
  • This was disputed by me, arguing that such special language-specific code does not belong in a general purpose module, especially not when it can easily be put into the existing transliteration module and have everything work just fine.
  • User:Wikitiki89, in the last war, did just this: he moved the code over to the transliteration module, where it belongs. This was immediately reverted by Wyang however, and his special code in Module:links reinstated despite it already having been disputed. My efforts to reapply Wikitiki's edits were repeatedly reverted by Wyang.
  • Fast forward to now, when I once again noticed Wyang's special purpose code in Module:links, and got frustrated that the issue was never solved. I therefore once again moved the code to the Thai modules. This again resulted in a revert war.
  • I attempted to explain on Wyang's talk page that in order for his alternative interpretation of transliteration, which involved creating separate modules and infrastructure for transliteration versus transcription/romanization modules, to be accepted, he would have to find a consensus with the community for it and seal it with a vote.
  • Wyang showed no intention of doing this, instead arguing on the merits of his views as if to convince me that separating the two was the right way to go. In my view, this missed the point as it wasn't me he was supposed to convince, but the community at large. Thus, I ignored his arguments and instead tried to focus on stopping him from edit warring and trying to get community consensus first.
  • Wyang refused to create a vote, instead telling me to create a vote for him. Two other editors also called for a vote, and even offered to make one if Wyang didn't. I welcomed this, but nothing has been done in this regard yet, and Wyang continued his edit war, rather than waiting on the outcome of the vote.
  • I called for help on the Beer Parlour regarding the matter, hoping that other users would be better capable of solving the issue and, especially, to stop Wyang from reverting me each time and get him to wait for consensus. This call for help was entirely ignored, and thus the warring continued.

CodeCat 14:21, 18 August 2016 (UTC)

It seems the issue is a bit more complicated than that. Wyang seems to want to have both transliterations and transcriptions for Thai, used in different places. This is something that goes against the status quo and should need a vote before being implemented. Wyang has refused to draft this vote claiming that the consensus among Thai editors is enough. However, this impacts not only Thai editors, but our readers as well who may be confused by having two different romanization systems in different places. As long as Wyang continues to refuse to draft a vote, I don't think we should allow his system to be put in place. My personal opinion is that there should be one default romanization system, whether it be strictly a transliteration, or a transcription, and if it is necessary to use a different system in etymologies, this should either be done manually with tr= parameter, or potentially with a dedicated Thai template that would allow choosing a different automatic romanization. In either case, all the automatic Thai romanization code, both transliteration and transcription, should be located in Module:th-translit. --WikiTiki89 14:35, 18 August 2016 (UTC)
I'd like Thai to follow the pattern we've already established for Burmese: one automatically generated transliteration system used everywhere outside of Thai entries (translation sections, etymology sections, etc.), and Thai entries with additional transliteration systems (both spelling-based and sound-based). Ideally the automatically generated one should be ISO 11940-2 or at least based on it. —Aɴɢʀ (talk) 15:05, 18 August 2016 (UTC)
@Angr Burmese entries are nowhere near the level of current Thai entries. The current Burmese transliteration is much closer to the spelling, which doesn't help users much with the pronunciation. Ideally, we should have a system created for Thai - with phonetic respellings but for that we need more native knowledge or reliable data available. With Thai, we're luckier - we have native speakers, phonetic respellings from some dictionaries and "Paiboon" or other transcriptions from published dictionaries sometimes can help reverse-engineer the phonetic respelling (for non-natives). I'd like to see the same methods used for Burmese and Tibetan. --Anatoli T. (обсудить/вклад) 02:32, 21 August 2016 (UTC)

Transliteration is not concerned with representing the sounds of the original, only the characters, ideally accurately and unambiguously. (Wikipedia)


Romanization, in linguistics, is the conversion of writing from a different writing system to the Roman (Latin) script, or a system for doing so. Methods of romanization include transliteration, for representing written text, and transcription, for representing the spoken word, and combinations of both. (Wikipedia)

Transliteration is not the same as romanisation. Romanisation = transliteration (script conversion by spelling) + transcription (script conversion by pronunciation). It is quite embarrassing that we as a dictionary are getting this basic concept wrong in some places, and seem proud of propagating and not rectifying the error. The contrast between transliteration and transcription is a fundamental concept in basic linguistics (and I do not even have a linguistics background). The distinction is strictly adhered to when one discusses romanisation schemes for languages which have a noticeable script-pronunciation discordance, i.e. languages which distinguish between transliteration and transcription on a romanisation level. If one wishes to talk about conversion to the Latin script using any mapping, it is romanisation. There are some places on Wiktionary which have confused these two concepts - for example WT:RU TR - which is precisely why people have complained that "this [i.e. the "translit" system] looks like a mess of transliteration and transcription". If we search for "transliteration of Russian" (or other languages) on Wikipedia, we will quite sensibly be redirected to "romanisation of Russian".
So far the confusion has been largely non-disastrous, since (1) most content has been for Latin-script languages, and (2) for languages which do not use the Latin script and have had romanisation systems devised for them on Wiktionary, the difference in the romanisation outcomes generated by transliteration and transcription is comparatively minimal, for example Russian. Then we deal with languages of the East, which are renowned for keeping the spelling forms used hundreds or thousands of years ago, and therefore have a high script-pronunciation discordance. If we go back to the comparison table of various languages in transliteration and transcription outcomes that I created in June, we can see that transliteration and transcription are two visibly dissimilar modes of romanisation, and this confusion of transliteration and transcription is destined to have disastrous consequences.
People may not be aware of this, but this distinction has been faithfully adhered to when we designed the module infrastructure for Oriental languages, until this incident. From the aforementioned table, we can see that transliteration as a concept is inherently impossible for Chinese and Japanese since there is no script-to-script mapping, and appropriately there is no Module:zh-translit or Module:ja-translit on Wiktionary. What we have in place for Chinese and Japanese is Module:zh/Module:zh-pron and Module:ja/Module:ja-pron which help interpret auxiliary or native phonetic representations of these languages to generate romanisations in a transcriptive manner. On the other hand, transliteration is possible and contrastive with transcription for Tibetan and Burmese, therefore we have Module:bo-translit and Module:my-translit to generate transliterations and Module:bo/Module:bo-pron to generate transcriptions. For Thai, the distinction has also been faithfully observed until the incident: we have Module:th-translit which deals with transliteration, and Module:th/Module:th-pron dealing with transcription of Thai. It has been customary practice to devise module infrastructure for languages observing the transcription-transliteration distinction prudently, and this includes naming modules the way they should be named, to avoid future complications.
Why do we have to be prudent in devising the module infrastructure for these languages, and what are the complications of imprudent and misnomeric handling of languages observing this transcription-transliteration distinction? I explained this in detail in the previous discussions (discussion 1, discussion 2). In short, the two divisions of romanisation (transliteration and transcription) symbolise the two polarising ends of the romanisation spectrum in a dictionary-building context:
etymology (transliteration) ———— pronunciation (transcription).
The reasons we use romanisations on Wiktionary are different in various parts of the project. In translation sections, the purpose of romanisations is to inform readers how the word in another language is pronounced. In etymological comparisons, the purpose of romanisations is to inform readers how the term is spelled, i.e. how it used to be pronounced. Among languages which observe the transliteration-transcription distinction, there is variation in how acceptable it is to approximate one romanisation with the other and use it in all places. This really needs to be decided on a language-by-language basis. For some languages (e.g. Tibetan, Burmese), it is not advisable to use one mode of romanisation in all places. Again using the Tibetan example of བརྒྱད (transliteration: brgyad; transcription: gyaew): It makes no sense to say:
བརྒྱད (gyaew) is a cognate of Old Chinese (OC *preːd)
when the word is actually spelt as brgyad; and it similarly makes no sense to put:
བརྒྱད (brgyad)
as the Tibetan translation of eight. The next day after the faithful Wiktionary user downloads our app, he/she is found at a Lhasa stall, trying to bargain by whipping out their phone and awkwardly pronouncing /brɡjad/.
The transcription-transliteration distinction is a pan-linguistic phenomenon not just limited to Thai lacking support in the core module system, and more consideration and acknowledgement that many script-pronunciation discordant languages use two methods of romanisation needs to go into the infrastructure. A system is never perfect (e.g. Module:links and Module:translations already contain language-specific adaptations), but changes are gradual and have to be initiated at the correct end. A step in the wrong direction may precipitate amplified counterproductivity before eventual rectification takes place, like the misinterpretation of transliteration here. Wyang (talk) 00:41, 19 August 2016 (UTC)
@Angr: ISO 11940-2 is a transcription system. Wyang (talk) 00:43, 19 August 2016 (UTC)
@Wyang: You're starting to sound like a broken record. We all understand that in linguistics there is a distinction between transliteration and transcription, both of which can be called romanization (when this is done into the Latin alphabet). But the issue here is not of terminology. Yes, we use transliteration incorrectly according to the linguistic definition (although the common non-linguistic definition would include phonetic transcriptions as transliterations), but if we "corrected" ourselves and replaced the word transliteration with romanization everywhere that we misuse it in our templates, modules, "about" pages, etc., you still would not be happy. Why? Because your problem is that you want to use two different automatic romanization systems (one a "transcription" and the other a "transliteration"), when our templates only support using one automatic romanization system. So let's talk about that issue and not the terminology. --WikiTiki89 01:09, 19 August 2016 (UTC)
Well, I have to sound like a broken record because much of this has already been said two months ago in discussions poorly tended to by User:CodeCat (aside from the one-liner). The core issue is the confusion of transcription and transliteration by people who designed Module:links and the consequent awkwardness in its support for transcriptions. If transliteration = romanisation = transliteration + transcription in the central infrastructure, then where does transcription fit? It merely becomes a transliteration2, which it is not and should instead be contrasted with. The technical side is easy to fix - the shorthand "tr" is perfect already. We only need to store the transliteration and transcription modules as separate in language_data, and turn on the transcription modules at the appropriate point. For example, my revision at Module:links. Obviously there are more rigorous ways, but the approach has to be central to start with, not by confusing this concept even further in languages which truly distinguish them. Wyang (talk) 01:30, 19 August 2016 (UTC)
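For readers following the technical side, the proposal just above (store transliteration and transcription modules separately in the language data, and switch the transcription one on where wanted) might look roughly like the sketch below. This is an illustration in Python, not the actual Lua in Module:links or Module:languages; the names `LANGUAGE_DATA`, `romanise`, `bo_translit`, `bo_transcribe` and the language code `xx` are all made up for the example, and the lookup tables stand in for real romanisation logic.

```python
# Hypothetical stand-ins for per-language romanisation modules.
# Real modules would compute these; here each one is a one-word lookup
# using the Tibetan example quoted in this thread.
def bo_translit(text):
    return {"བརྒྱད": "brgyad"}.get(text)


def bo_transcribe(text):
    return {"བརྒྱད": "gyaew"}.get(text)


# Hypothetical language data: the two kinds of module are stored separately.
# "xx" is an invented language code that only has a transliteration module.
LANGUAGE_DATA = {
    "bo": {"translit": bo_translit, "transcribe": bo_transcribe},
    "xx": {"translit": lambda text: f"translit({text})"},
}


def romanise(lang, text, prefer_transcription=False):
    """Return a romanisation for text, or None if the language has no module.

    The transcription module is used only when the caller asks for it and
    the language actually has one; otherwise the transliteration module is
    the fallback.
    """
    data = LANGUAGE_DATA.get(lang, {})
    if prefer_transcription and "transcribe" in data:
        return data["transcribe"](text)
    translit = data.get("translit")
    return translit(text) if translit else None
```

Under these assumptions, `romanise("bo", "བརྒྱད")` gives the spelling-based form and `romanise("bo", "བརྒྱད", prefer_transcription=True)` the sound-based one, while a language with only one module behaves as it does today.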
Ok, so let's say that we do this. Now, which of the two modules is called when our modules need a romanization to be auto-generated? And what use is the module that does not get called?
Also, aside from all this, why have you never thrown up a BP discussion or vote to discuss this proposal? Why did you edit war to put it in place instead? —CodeCat 01:37, 19 August 2016 (UTC)
Personally I think it would be hella confusing to display brgyad in one place and gyaew in another place when referring to the same word. If we are to make a systematic distinction between transliteration (in the proper sense) and transcription, we should include both forms consistently. Perhaps we write བརྒྱད (brgyad ・gyaew) where the dot in the middle links to a page explaining what the two romanizations mean. Mind you, I'm not convinced it's worth the trouble, but if we are to do it, something like this would be the way. Benwing2 (talk) 01:50, 19 August 2016 (UTC)
And in fact that suggestion is already possible without Wyang's changes to Module:links. --WikiTiki89 01:53, 19 August 2016 (UTC)
"བརྒྱད (trlit. brgyad; trscr. gyaew)"? (a bit more explicit; the blue dot in headwords is not terribly intuitive IMO) —suzukaze (tc) 02:08, 19 August 2016 (UTC)
This is fine with me, and I agree is more intuitive. Benwing2 (talk) 06:01, 19 August 2016 (UTC)
CodeCat, all of this was in the original discussions (discussion 1, discussion 2). "Why did you edit war to put it in place instead" - this is an irresponsible and unnecessary accusation. Please have a look at the page history of Module:links; the first revert was your heedless revert which paralysed the Thai entries.
བརྒྱད (brgyad ・gyaew) in translations is too confusing for newcomers. The technical support for transcriptions is not difficult to put in place. A simple parallel function of Language:transcribe can be added in Module:languages. This function can be called by Module:links (i.e. to turn on transcription support) unconditionally for language A, or conditionally for language B (e.g. only when Module:links is called by Module:translations, or unless Module:links is called by Module:etymology). Wyang (talk) 04:26, 19 August 2016 (UTC)
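The "unconditionally for language A, or conditionally for language B" idea in the comment above can be pictured as a small policy table consulted by the linking code. The following is only a sketch in Python under assumed names (`TRANSCRIPTION_CONTEXTS`, `choose_mode`, `romanise_for` are hypothetical, and the per-language assignments are illustrative rather than a statement of any agreed policy); the real implementation would live in the Lua modules.

```python
# Hypothetical policy table: the contexts in which each language shows a
# transcription instead of a transliteration. "*" means every context.
# These assignments are illustrative only.
TRANSCRIPTION_CONTEXTS = {
    "th": {"*"},             # "language A": transcription everywhere
    "bo": {"translations"},  # "language B": transcription in translations only
}

# Stand-in data for the Tibetan example used in this thread.
ROMANISATIONS = {
    ("bo", "བརྒྱད"): {"transliteration": "brgyad", "transcription": "gyaew"},
}


def choose_mode(lang, context):
    """Pick the romanisation mode for a language in a given calling context."""
    contexts = TRANSCRIPTION_CONTEXTS.get(lang, set())
    if "*" in contexts or context in contexts:
        return "transcription"
    return "transliteration"


def romanise_for(lang, text, context):
    """Look up the romanisation that matches the mode chosen for this context."""
    mode = choose_mode(lang, context)
    return ROMANISATIONS[(lang, text)][mode]
```

With this table, the Tibetan word would appear as gyaew when the caller is the translations code and as brgyad in etymologies, which is exactly the behaviour that the replies below object to or defend.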
What about suzukaze's suggestion? Do you still think it's too confusing for newcomers? IMO displaying different romanizations in different places is far more confusing than displaying both and I would be strongly against that. Benwing2 (talk) 06:01, 19 August 2016 (UTC)
What about བརྒྱད (gyaew [brgyad])? We already do this for Akkadian, for example: 𒆍𒀭𒊏𒆠 (bābili [KA2.DINGIR.RAKI]). (Although I don't understand why gyaew is even needed, none of the IPA transcriptions at the page look anything like it; they all look much more like brgyad.) --WikiTiki89 09:20, 19 August 2016 (UTC)
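For comparison, the three display formats suggested so far (the middle dot, the labelled pair, and the Akkadian-style brackets) differ only in how the same two strings are arranged. A minimal Python sketch, with hypothetical function names that do not correspond to anything in the existing templates:

```python
def format_dot(term, translit, transcr):
    """Benwing2's suggestion: both forms separated by a middle dot."""
    return f"{term} ({translit} ・{transcr})"


def format_labelled(term, translit, transcr):
    """suzukaze's suggestion: both forms with explicit labels."""
    return f"{term} (trlit. {translit}; trscr. {transcr})"


def format_bracketed(term, translit, transcr):
    """WikiTiki89's suggestion: transcription first, transliteration in brackets."""
    return f"{term} ({transcr} [{translit}])"
```

Calling each with `("བརྒྱད", "brgyad", "gyaew")` reproduces the three examples quoted in the comments above, so the choice between them is purely one of presentation, not of module infrastructure.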
gyaew is the Lhasa pronunciation: gy /c/, ae /ɛ/, w /˩˧˨/. Frankly, I would be quite confused by the Akkadian word if I saw it in translations (I still am after reading the entry, especially the etymology). It may be less unsatisfactory for Akkadian, as people may be less interested in the spoken aspects of a dead language. I don't think putting transliteration in translations is a good idea for any of the non-small living languages with a high level of script-language discordance. {{bo-pron}} has more examples of transliteration-transcription correspondences in Tibetan. Wyang (talk) 09:40, 19 August 2016 (UTC)
The fact that you are confused by our Akkadian romanizations is not really a problem. We shouldn't necessarily expect people to automatically understand these things. We need to have appendix pages explaining our romanization scheme for each language, just like any other dictionary would do. Such an appendix would explain to you that bābili is the transcription and KA2.DINGIR.RAKI are the names of each character in the word, named by their usual phonetic value, with capital letters indicating Sumerian logograms (Sumerograms; kind of like Kanji) and superscript indicating determinatives. --WikiTiki89 12:26, 19 August 2016 (UTC)
I would love to read some statistics regarding the traffic of our help pages - I have always been under the impression that very few people are able to navigate to our Wiktionary:About... pages, since we do not have an obvious or subtle link on the entry itself linking to the language help page. We do not have a "translate!" tool alongside the search box that helps a reader check if translations of word A in language X exist (i.e. a simple interface with two fields "word" and "language" (dropdown by #speakers), which parses through the content of the entry A to see if it has the translation of any sense in language X), and prompt the user to suggest that we add this translation if there is none. We also do not have a fuzzy search function, or a reverse transcription search, and many other things. Personally, the reason I look up translations is because I want to know how to say the equivalent in another language. Like the common phrase "How do you say ... in the ... language?", not "How do you spell?". I imagine most readers are expecting to find out the pronunciation of a foreign non-Latin-script word on the translation page itself, which is why I'm suggesting simple, straight-to-the-point phonetic transcriptions inside translation boxes. Wyang (talk) 13:07, 19 August 2016 (UTC)
I'm all for making the "about" pages more easily accessible. As for pronunciation, you're supposed to click on the entry and not simply look at the table. The entry should have all the pronunciation information. Someone unfamiliar with Tibetan will not know how to pronounce gyaew anyway. Someone who knows a little bit about Tibetan would realize that the word might not be pronounced brgyad in Lhasa and click on the entry for further pronunciation information. I don't know why you're bringing up search features, they do not seem relevant to this discussion. --WikiTiki89 13:29, 19 August 2016 (UTC)
Yes, people are supposed to click on them, but people (especially casual visitors) often don't. People may not know how to pronounce gyaew initially, but if the display in translations consistently uses transcription and people are pointed to the correct help page, they are more likely to become regular users and use the translation functionality more frequently. The point about the search features was to lament that our user friendliness is (excuse me) crap... and yet, we are here arguing whether or not we should give support to transcriptions which prominently contrast with transliterations in many languages, and whether or not it is worthy to improve user experience with more consideration. Wyang (talk) 14:24, 19 August 2016 (UTC)
Giving a user a piece of unexplained information without a link or even a name for that information, thus effectively blocking the user from figuring out what that information is, is a problem. Because that means you have not given the user any information at all, you just blurted some nonsensical text. Korn [kʰũːɘ̃n] (talk) 15:11, 19 August 2016 (UTC)
Perhaps all of our transliterations should automatically link to a description page, like this: обезья́на (obezʹjána (key), “monkey”)? --WikiTiki89 15:53, 19 August 2016 (UTC)
That's actually how I handle it for Middle Low German grammar. Though I'm not sure it needs to happen for plain transliteration, which should be more intuitive than Sumerograms. In case of doubt, better safe than sorry, though. Korn [kʰũːɘ̃n] (talk) 16:08, 19 August 2016 (UTC)
The only downside to that idea is that it puts too much emphasis on the transliteration, rather than on the word itself. Another idea I've always contemplated was to just get rid of all transliterations in links and have them only in entries and etymologies, but that's a very radical change. Another idea I just had is what if we have links to transliteration keys after the language name in translation tables and at the top of each language header. --WikiTiki89 16:43, 19 August 2016 (UTC)
What if we just made the transliterations themselves the links to the keys? (Languages where the transliterations have entries (e.g. Gothic) could continue to link to those entries, since they contain, or link to the main entries which contain, much the same information as the key would.) - -sche (discuss) 17:15, 19 August 2016 (UTC)
I did consider that. My first thought is that it would look weird for all transliterations to be colored as links. Also, would the reader know what he would get from clicking the transliteration? But maybe it's not such a bad idea. We should limit this to link templates, though. Usage examples and other such things probably don't need to have their transliterations linked. --WikiTiki89 17:54, 19 August 2016 (UTC)
Strong oppose for any move to remove transliterations / romanizations from links. That would greatly reduce the usability of all Japanese entries. ‑‑ Eiríkr Útlendi │Tala við mig 00:42, 20 August 2016 (UTC)
A lot of online dictionaries have significantly better interfaces than us. Some use hover over for all links to show a sneak peek of the linked-to entry; examples are Moedict, CantoDict, Thai-language. These are all impressive tricks which we can potentially implement to greatly improve the user experience. The link in translations can be turned into a hover-over link which previews the pronunciation and first sense of the term, and on mobiles it can be a simple link with the transcription following it in parentheses. The point is we need to suitably name and record our utility modules, so that we can easily call on them and not come to the realisation we have mixed up all the transliteration and transcription modules when there is a need to use transcriptions. Wyang (talk) 00:46, 20 August 2016 (UTC)
But at what point has there ever been a need to choose between them or display them both? If we have both a transcription and a transliteration module, would they ever both be used for anything? —CodeCat 01:10, 20 August 2016 (UTC)
In translations. The purpose of having romanisations in translation sections is to inform readers how to say something in another language. Transcription modules, if they exist, should be preferentially called upon when romanising terms in translation sections. Wyang (talk) 02:08, 20 August 2016 (UTC)
I think presenting both romanisations simultaneously in translations is confusing - readers are unlikely to understand what the difference between transliteration and transcription is, or the difference between Wylie transliteration and Tibetan Pinyin. I would prefer presenting the information in the entry itself, and presenting only what is necessary in translations, e.g. བརྒྱད (pr. gyaew). Wyang (talk) 09:13, 19 August 2016 (UTC)
You can always make the words give a one line explanation of the difference on hover over. Korn [kʰũːɘ̃n] (talk) 09:37, 19 August 2016 (UTC)
We have to be careful with using hover over though - it does not seem to be well-supported on mobile devices. Wyang (talk) 10:01, 19 August 2016 (UTC)
  • I'd be fine with making the de-sysopping of CodeCat permanent. This is the latest in a series of abuses of the tools, ranging from bad blocks to making major changes without community consensus. Purplebackpack89 18:59, 18 August 2016 (UTC)
IMO, any action like this needs to be by formal vote. (Note that there was already a vote to desysop CodeCat, which failed.) Benwing2 (talk) 20:22, 18 August 2016 (UTC)
There should be no double standard. Either both Wyang and CodeCat have their sysop powers restored upon resolution of this problem, or they both have to reapply and be voted on. I do not understand, however, why CodeCat's edits are no longer autopatrolled. That should be fixed as soon as possible. —Aɴɢʀ (talk) 21:55, 18 August 2016 (UTC)
I overlooked that detail. Fixed. Chuck Entz (talk) 02:11, 19 August 2016 (UTC)
I can always trust you to make everything be about you and your grievances, no matter the subject. That type of attitude is a large part of what caused this mess in the first place- we need less of it, not more. Chuck Entz (talk) 02:11, 19 August 2016 (UTC)
I oppose CodeCat's recent desysop. First off, where is the formal vote? Second of all, I've not really had any problems with her. I think her intentions really are good, but she may have made a mistake, just like all of us have. Jeez if I had a penny for every mistake I've made on the internet, I'd have like 10 bucks (which is a lot of pennies!). I feel like it's only if a person continues to make such mistakes somewhat consistently over a long period of time, or do something really bad (like delete the main page, for instance), that they should be desysopped because of behavior. I'd be willing to put up a vote to get her resysopped (hey look I made up a new word!) if necessary. Philmonte101 (talk) 22:33, 18 August 2016 (UTC)
A vote would be required for a permanent desysop, but in this case, the desysopping was temporary in order to stop an ongoing edit war. Normal users would have received a temporary block for this, but admins can unblock themselves, making such a block useless if the admin is determined to circumvent it (and both of them did so in this case, before they were desysopped). Thus, I think the temporary desysopping was justified. --WikiTiki89 23:36, 18 August 2016 (UTC)
It seems like you completely misunderstand the entire situation. The desysop was the emergency countermeasure to a serious edit war, not "a mistake". —suzukaze (tc) 01:09, 19 August 2016 (UTC)

This topic must not die again. How are we going to set up the transcription/transliteration infrastructure? —suzukaze (tc) 00:39, 21 August 2016 (UTC)

Agreed. I think most people are in agreement that the status quo of one single transliteration is OK, and it's also OK to display two transcriptions/transliterations for languages like Tibetan and Burmese where the pronunciation and written script are far from each other and where the written form carries important etymological information that's missing from the modern pronunciation. This potentially could be done for Thai and Khmer as well although here I think it's less useful, as the difference between the two isn't so much, and the extra information in the written form is mostly only present in Sanskrit loanwords, which are fairly unproblematic etymologically. The main issue here is that Wyang disagrees and wants to impose a system where we show transcriptions in some places and transliterations in others, but I think pretty much everyone else is opposed to this so it won't fly. We could vote on this but Wyang has to be willing to accept the result, since he seems to be the main one who would implement it. Benwing2 (talk) 01:07, 21 August 2016 (UTC)
My main points were: 1) transliteration and transcription should not be confused; 2) for languages which can be romanised with both transliterations and transcriptions, the functional modules should be distinguished and named appropriately; 3) using multiple romanisations is very confusing in translations and readers will not understand the difference; and 4) translation sections should use transcriptions to romanise terms, if transcriptions are contrastive with transliterations. I do not believe I am the only one who is in favour of this. Discussions should involve effective argumentation, not merely accusing others of being outlandish. Wyang (talk) 01:36, 21 August 2016 (UTC)
Eh, I find it reasonable to display only the relevant romanization to reduce clutter. The entry itself could show which is a transcription and which is a transliteration. —suzukaze (tc) 01:55, 21 August 2016 (UTC)
I prefer to see transcriptions as is currently done by the Thai module. Transliterations or symbol sequence can still be found in Thai entries. --Anatoli T. (обсудить/вклад) 02:32, 21 August 2016 (UTC)
I suggest recording the transcription modules in language_data, creating a parallel Language:transcribe function in Module:languages, and making Module:links call on this function unconditionally or conditionally for certain languages. Wyang (talk) 01:26, 21 August 2016 (UTC)
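The proposal above can be pictured with a minimal sketch. It is written in Python purely for illustration (Wiktionary modules themselves are written in Lua), and every name in it (`Language`, `LANGUAGE_DATA`, `transcription_module`, `romanise_for_link`, the lookup tables) is a hypothetical stand-in, not the actual structure of Module:languages or Module:links. It only shows the idea of recording a transcription module next to the transliteration module and letting the link code choose between them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Romaniser = Callable[[str], Optional[str]]


def lookup(table: Dict[str, str]) -> Romaniser:
    """Build a toy romaniser from a lookup table (a real module would compute this)."""
    return lambda text: table.get(text)


@dataclass
class Language:
    code: str
    # Hypothetical fields: two separately recorded romanisation modules.
    translit_module: Optional[Romaniser] = None
    transcription_module: Optional[Romaniser] = None

    def transliterate(self, text: str) -> Optional[str]:
        return self.translit_module(text) if self.translit_module else None

    def transcribe(self, text: str) -> Optional[str]:
        # Parallel to transliterate(); returns None when no module is recorded.
        return self.transcription_module(text) if self.transcription_module else None


# Illustrative data only, using examples quoted in this discussion.
LANGUAGE_DATA = {
    "bo": Language("bo", lookup({"བརྒྱད": "brgyad"}), lookup({"བརྒྱད": "gyaew"})),
    "ja": Language("ja", lookup({"こんにちは": "konnichiwa"})),
}


def romanise_for_link(lang_code: str, text: str,
                      prefer_transcription: bool = False) -> Optional[str]:
    """Call transcribe() conditionally, falling back to the single recorded romanisation."""
    lang = LANGUAGE_DATA[lang_code]
    if prefer_transcription:
        result = lang.transcribe(text)
        if result is not None:
            return result
    return lang.transliterate(text)
```

Under these assumptions, languages with only one recorded romanisation behave exactly as before, and only languages that record a second module are affected by the `prefer_transcription` switch.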
Thanks for summarizing your position. Benwing2 (talk) 01:41, 21 August 2016 (UTC)
Note that you haven't answered whether you will accept the community's consensus if it goes against yours. Benwing2 (talk) 01:41, 21 August 2016 (UTC)
Fine. Bye bye. Wyang (talk) 01:42, 21 August 2016 (UTC)
Christ. I was trying to play mediator but seem not to have been successful. Wyang, I do hope you will reconsider. No one wants to see you leave. Benwing2 (talk) 03:53, 21 August 2016 (UTC)
Repeatedly using imagined “consensus” (your opinion) as majority tyranny to intimidate others is hardly mediation. I am perplexed how the above discussion could be interpreted as me spewing out nonsense and needing to be brought under control. I elaborated my various points in the discussion and there isn't really any opposing argument regarding either using transcriptions in translations or separating transliteration and transcription utilities for certain languages. Then there was your “summary” which identified the need to smother me without providing any counterarguments whatsoever. To my technical proposal, instead of commenting on the feasibility or reasonableness of this, you again tried to smother me by labelling whatever you believe in as “consensus” and coercing me to accept it. This is opposing for the sake of opposing, without bringing in any intelligent arguments to the discussion. This is bullying. How is བཀྲ་ཤིས་བདེ་ལེགས (zhacf-xih-dev-leh [bkra shis bde legs]) not confusing as the Tibetan translation of hello? It is frustrating to try to have people think sensibly and analytically about topics with the future in mind on Wiktionary. Look at how long it took for the community to come to its senses with the Chinese merger and now this; time and time again, it is regression led by the unfamiliar majority, without critically analysing proposals for what they are. Wyang (talk) 02:16, 22 August 2016 (UTC)
Wyang, I am sorry things have gotten to the point that you think I am smothering you, bullying you, tyrannizing you, etc. It was not my intention to do any of these things, and I apologize for giving the wrong impression. How about we simply hold a vote on what is the best way to handle this? This is the Wiktionary way of doing things, and will more clearly reveal the consensus. Are you willing to lead that vote? Benwing2 (talk) 03:12, 22 August 2016 (UTC)
Thank you. Wikipedia:Polling is not a substitute for discussion; Wikipedia:What_Wikipedia_is_not#Wikipedia_is_not_a_democracy has more arguments why decision making should be achieved by discussion and consensus, not votes. It is not sensible to expect voting to be the most suitable method of decision making, when the great majority of eligible voters are uninvolved and perhaps have no prior familiarity with how the transliteration-transcription distinction manifests itself in the romanisation of certain languages. It is akin to believing that User:Wyang will be a responsible voter on the topic of Akkadian romanisation; quite the contrary I have no previous experience with this and any stance I take regarding Akkadian romanisation could be very unwise for the project's future. In this discussion we should be appraising whether the preferential use of transcriptions in translations is favourable over transliterations for certain oriental languages (Tibetan, Burmese, etc.; example below), and consequently whether transliteration and transcription modules should be kept separate for these languages. Wyang (talk) 04:27, 22 August 2016 (UTC)
Romanisation | “eight” | “ear” | “hello”
Transcription | བརྒྱད (pr. gyaew) | རྣ་བ (pr. naf-waf) | བཀྲ་ཤིས་བདེ་ལེགས (pr. zhacf-xih-dev-leh)
Transcription (without tone letters) | བརྒྱད (pr. gyae) | རྣ་བ (pr. na-wa) | བཀྲ་ཤིས་བདེ་ལེགས (pr. zhac-xi-de-le)
Transliteration | བརྒྱད (brgyad) | རྣ་བ (rna ba) | བཀྲ་ཤིས་བདེ་ལེགས (bkra shis bde legs)
A vote is a good way out. The above speaker seems to underestimate the ability of voters to inspect and evaluate evidence and to consider arguments presented. Instead, he seems to commit what looks like an authority fallacy, the erroneous notion that only those already familiar with Thai can make a sound judgment about Thai romanization, be it transcription or "transliteration" narrowly construed.
How consensus, mentioned in the above post, could ever be anything different from the result of a vote is beyond me. Consensus is general agreement, even if not unanimity, and I fail to see how a passing vote could ever show anything other than consensus. --Dan Polansky (talk) 08:02, 22 August 2016 (UTC)
A vote, where the topic is unfamiliar to most of the eligible voters, can easily produce ill-advised consensus (“collective stupidity”). And yes, only those familiar with how the transliteration-transcription distinction manifests itself in the romanisation of certain languages can make a sound judgment about the issue. Calling for a vote is not the way out if that side shows no willingness to engage in discussions and present counterarguments to reach consensus. Cases of collective intelligence are when those familiar and knowledgeable about the topic critically appraise the arguments for and against the proposal to attempt to reach consensus, not when the proposal is relayed to a vote to see which side is more numerous. Wyang (talk) 10:22, 22 August 2016 (UTC)
You have the option to present "how the transliteration-transcription distinction manifests itself in the romanisation of certain languages". In fact, you have just done that in a table above. I trust most of the voters to consider such presentation in their voting decision. Discussion alone is not a mechanism of decision making; indeed, in this dispute, both parties think they are right and that they have presented the right arguments. Strength of argument is not a mechanism of decision making since there is no simple mechanism to assess strength of argument. --Dan Polansky (talk) 11:13, 22 August 2016 (UTC)
Having information presented can never, ever supplant being knowledgeable about a topic. For instance, if I were provided with a comprehensive overview of the Akkadian language, I would still feel that I am in no position to provide any judgment on Akkadian romanisation. In fact, Wiktionary has witnessed many lessons learnt from having the unfamiliar collectively voice opinions and make decisions. The merger of Chinese is one - it was only adopted in 2014, more than 10 years after the launching of Wiktionary. So much more work could have been done in the meantime, and so much work still remains to be done to rectify the initial step in the wrong direction. The misinterpretation of transliteration is another one. All of these resulted from the lack of intelligent decision making from people who are familiar and knowledgeable about the topics. Discussion is a mechanism of decision making, and arguably the most important one; I would argue that no decision should be made without substantial discussion. In this dispute, there has been a paucity of argumentation from one side throughout, and a paucity of active discussion of the topics at hand (whether the preferential use of transcriptions in translations is favourable over transliterations for certain oriental languages, and consequently whether transliteration and transcription modules should be kept separate for these languages). Wyang (talk) 12:05, 22 August 2016 (UTC)
If you submit that after being presented the table, I still don't appreciate the difference between letter or character based transliteration and pronunciation based transcription, you are drastically underestimating my capacity to understand very simple things. Nor do I think other readers have failed to appreciate the distinction. Your fallacy is grave. --Dan Polansky (talk) 12:58, 22 August 2016 (UTC)
Basically, you're fighting against any way for other editors to disagree with you. Is there any way Wiktionary editors could decide on a course that you disagree with that you'd accept?--Prosfilaes (talk) 13:43, 22 August 2016 (UTC)
Transliteration / transcription is about providing a Latin-script handle for people who don't read the script. Anywhere but the entry for the word itself we should be using one consistent Latin-script version, and there we can and should provide every transcription/transliteration version now or once in standard use.--Prosfilaes (talk) 09:57, 22 August 2016 (UTC)
Romanisation is indeed about providing a Latin-script handle for people who don't read the script. Nonetheless, the reasons we want to use romanisations are different in various parts of the project. It could be to show how the foreign-script word is pronounced (for example in translation sections), or how it is spelt literally (in etymological comparisons). At the moment, Tibetan is romanised with a transliteration method (Wylie transliteration), which is 100% automatable and is fantastic in etymological comparisons, as it faithfully represents how the word is spelt. However, there is no point showing brgyad as the Tibetan translation of eight, as readers will automatically assume the romanisation in translations is the word's pronunciation and attempt to pronounce it as such when communicating with locals. It makes more sense to simply use transcriptions to inform readers of the pronunciation in translation sections. Wyang (talk) 10:22, 22 August 2016 (UTC)
Romanization is not for showing how words are pronounced. That's what the pronunciation section in the entry is for. If you're using Wiktionary's translation tables as a pronunciation key for communicating in that language, epic fail. In any case, showing བརྒྱད shows six rather different pronunciations; giving me gyaew instead of brgyad doesn't really help me pronounce the word.--Prosfilaes (talk) 13:43, 22 August 2016 (UTC)
Are you sure?! You should then talk to Japanese editors and tell them they should transliterate こんにちは as "konnnichiha" They've been doing it wrong all these years! Also, get in touch with some other dictionary publishers and tell them their Korean and Thai transliterations are wrong. --Anatoli T. (обсудить/вклад) 13:49, 22 August 2016 (UTC)
Sarcasm aside, the rest of my statement still stands. If you want to know how a word is pronounced, look at the pronunciation key, not the translation table. Readers who "automatically assume the romanisation in translations is the word's pronunciation" are going to be consistently lost, and I fail to see how gyaew is going to help any reader who doesn't know Tibetan figure out the pronunciation is /cɛʔ¹³²/ or /bɡjat/ or /dʑɛʔ⁵³/ or /dʑed/ or /wɟjal/ or /hdʑal/. I in fact feel that any reader who could use gyaew to derive the correct pronunciation probably knows enough to figure out that brgyad isn't the pronunciation transcription they were looking for.
Romanize as you will, but the value of having one romanization throughout Wiktionary and giving readers a consistent Latin-script name for a word outweighs the value of having different romanizations in different places.--Prosfilaes (talk) 15:53, 22 August 2016 (UTC)
Now that CodeCat and her supporters have successfully driven away Wyang from the project, someone has to take over all the work he has been doing. Congratulations! I am disgusted with the community's reaction to the problem. --Anatoli T. (обсудить/вклад) 02:21, 21 August 2016 (UTC)
Anatoli, what do you think should have been done (or should be done) differently? Benwing2 (talk) 04:17, 21 August 2016 (UTC)
I don't think it's anyone's fault but Wyang's, given that the leaving was in response to "Note that you haven't answered whether you will accept the community's consensus if it goes against yours."--Prosfilaes (talk) 13:07, 21 August 2016 (UTC)
I know the technical subject is not the subject of this thread but anyway: could not the naming disagreement be solved by placing CodeCat code in Module:th-transcr rather than Module:th-translit? Then, the misnomer argument would no longer apply, and other argument against CodeCat's solution would have to be sought. --Dan Polansky (talk) 08:06, 22 August 2016 (UTC)
The code was originally placed in Module:th (function getTranslit). Either Module:th or Module:th-transcript is fine, though either way the transcription module needs to be recorded in addition in Module:links or Module:languages/data2, as translit_module is a misnomeric parameter. Wyang (talk) 10:22, 22 August 2016 (UTC)
A further question: are the modules currently present in Category:Transliteration modules in general transcription modules or are they overwhelmingly transliteration modules in the narrow sense, transcribing on the letter or character level? --Dan Polansky (talk) 08:13, 22 August 2016 (UTC)
Just call it all 'Romanisation modules' and be done with it. Korn [kʰũːɘ̃n] (talk) 10:57, 22 August 2016 (UTC)
I oppose calling those modules "Romanisation modules". There are elements of "transcription" (more or less) in many languages, most of them are standard transliteration. Here are examples of transliterations with elements of transcription, "the translit" shows more graphical transliterations of the same word (the actual spelling):
Arabic: عربى(ʿarabiyy), translit: "ʿrbā", vocalised Arabic: عَرَبِيّ(ʿarabiyy)
Greek: Μπούρμα (Boúrma), translit: "Mpoúrma"
Russian: легкого (ljóxkovo) (phonetic respelling: "лёхково"), translit: "legkogo", spelling with "ё": лёгкого (ljóxkovo)
Japanese: こんにちは (konnichiwa), translit: "konnichiha"
Korean: 십육 (simnyuk) (phonetic respelling: "심뉵"), translit: "sibyuk"
Hindi: फिल्म (film), translit: "philma", spelling with "nuqta": फ़िल्म (film)
One can argue that abjad languages like Arabic, Persian, Urdu, Hebrew, etc. can't be transliterated but romanisations are still called transliterations. Persian and Urdu are seldom fully vocalised, so their graphical transliterations would be completely useless for someone wanting to know how to pronounce Persian or Urdu words. Some irregularities are handled by transliteration modules; for some terms manual (hard-coded) transliteration is required. If someone accuses Wyang of making up transliterations for Thai, check Paiboon dictionaries for terms like ชาติ (châat) (graphical transliteration: "châa-dtì") and see how these terms are transliterated there. --Anatoli T. (обсудить/вклад) 13:18, 22 August 2016 (UTC)
I see no problem with calling any rendering of a non-Latin word in Latin script a romanisation as a hypernym and only referring to it as a transliteration/transcription specifically when it's important to underline the difference. (Maybe leave a note in the documentation.) If a module is made which does both, as some parties propose, or if only one is reasonable for a language, why not go with an indiscriminating 'romanisation' so you can categorise them all easier and don't have to waste debate time on naming conventions? Korn [kʰũːɘ̃n] (talk) 14:37, 22 August 2016 (UTC)
  • A propos of the voting/consensus matter, I am particularly well qualified to contribute as one of those ignorant of most aspects of the matters under discussion.
The rationales for requiring a consensus of more than those knowledgeable about the languages in question are that it might interfere with the module architecture as currently designed, that the translation tables might become cluttered, and that some users (including those not intending to learn the languages and scripts in question) might be confused/overwhelmed/put-off by the transliteration-transcription distinction.
What makes sense for entries in the languages in question is a matter best left to the contributors in those languages IMO. If our module architecture did not anticipate the need for a transliteration-transcription distinction, then so much the worse for the architecture. We cannot have the module architecture unreasonably preventing contributors from contributing in the manner that is best for the languages in question by their lights. IOW, we should not have the tail wagging the dog. How to apply this principle is left as an exercise to the reader.
I can only beg that the translation table matters do not make the tables cluttered and confusing for all to deliver a questionable benefit to some. DCDuring TALK 12:40, 22 August 2016 (UTC)
@DCDuring I agree! But wait, maybe CodeCat is eager to make a new module for Khmer or Burmese language and apply their "best practice" there? Well, Wyang has started, somebody can make those modules perfect!
Seriously, I perfectly understand Wyang's frustration. He created a WORKING SOLUTION for complex Asian languages nobody even attempted before. Now, someone starts changing modules without any discussion with him. I would be very upset if someone tried to change my work without first checking with me. Why do people even think they should both be blocked? How would YOU feel if you were in the same situation? I don't want CodeCat blocked but I think she is absolutely wrong here. Yes, location of the code can be reviewed and discussed, agreed first and only THEN changed, if the agreement is reached.--Anatoli T. (обсудить/вклад) 13:29, 22 August 2016 (UTC)
They were both blocked/desysopped because they both used their admin powers to continue an edit war. It's not a punishment for being on the wrong side of the argument, it's a method for suppressing disruptive behaviour. Korn [kʰũːɘ̃n] (talk) 14:43, 22 August 2016 (UTC)
@Dan Polansky Dan, can you create a vote to short-circuit endless arguing? The vote should have two choices: (1) Continue the current situation where Module:links enforces the constraint that a single romanization (which may be a two-part transcription/transliteration romanization, on a language-specific basis) is used for all types of links; (2) Modify Module:links to allow different romanizations for different types of links (e.g. etym links vs. translation links). The former is User:CodeCat's position, the latter is User:Wyang's. Set the discussion period and vote start/end dates however you think most appropriate. Benwing2 (talk) 15:34, 22 August 2016 (UTC)
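The two choices in the proposed vote can be contrasted with a small sketch. It is written in Python for illustration only; the names `LinkType`, `ROMANISATIONS`, `option_1_single` and `option_2_per_link_type` are hypothetical and do not correspond to anything in Module:links. The sample data and the bracketed two-part format are taken from the examples quoted in this discussion.

```python
from enum import Enum


class LinkType(Enum):
    ETYMOLOGY = "etymology"
    TRANSLATION = "translation"


# Illustrative data only: the Tibetan word for "eight" as quoted in this thread.
ROMANISATIONS = {
    "བརྒྱད": {"transcription": "gyaew", "transliteration": "brgyad"},
}


def option_1_single(term: str, link_type: LinkType) -> str:
    """Option (1): the same (here two-part) romanisation for every link type."""
    r = ROMANISATIONS[term]
    return f"{r['transcription']} [{r['transliteration']}]"


def option_2_per_link_type(term: str, link_type: LinkType) -> str:
    """Option (2): transcription in translation links, transliteration elsewhere."""
    r = ROMANISATIONS[term]
    if link_type is LinkType.TRANSLATION:
        return r["transcription"]
    return r["transliteration"]
```

In this sketch option (1) ignores the link type entirely, which is what keeps the display consistent across the site, while option (2) makes the link type the only input that changes the output.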
I agree with what User:DCDuring said above. My proposal is to keep transliteration and transcription utilities modules separate in Module:languages/data2 and similar modules, for languages possessing two contrastive sets of romanisation schemes. Notable examples include Tibetan (Wylie transliteration vs Tibetan Pinyin), Burmese (MLCTS vs BGN-PCGN), Thai (ISO 11940 vs Paiboon) and Korean (Yale vs RR). The rationale is that the module infrastructure should anticipate the need for a transliteration-transcription distinction in certain languages, and not unreasonably prevent contributors of these languages from contributing in a manner that is best for the languages in question by their lights. I am in no position to singlehandedly advocate that language X should use romanisation Y for a certain purpose without meticulous discussion having taken place surrounding language X, which need to happen in separate language-specific discussions. Still, there is a lack of adequate in-depth discussion concerning the issue, especially from arguments against - why do the harms outweigh the benefits if we keep the transliteration and transcription modules separate for these languages? Wyang (talk) 23:56, 22 August 2016 (UTC)
If they are kept separate, then there needs to be a functional reason. The distinction needs to have a consequence in how our modules work and treat each one, and where each one appears. I don't think it particularly desirable to have multiple romanization schemes in different parts of Wiktionary, this just confuses users. The system we have now, with a consistent representation across Wiktionary, is just fine. We don't need two systems when one suffices. —CodeCat 00:13, 23 August 2016 (UTC)
  • The whole point of Wyang's argument is that two systems are already in broader use for certain languages, with each system used for specific purposes. I.e., one romanization scheme doesn't suffice, for certain specific languages. ‑‑ Eiríkr Útlendi │Tala við mig 00:40, 23 August 2016 (UTC)
  • The question is “why do the harms outweigh the benefits if we keep the transliteration and transcription modules separate for these languages”? It does not make sense to use multiple romanisation schemes for Greek, Russian, Georgian, Armenian, etc., but the languages in question are languages which contrast these modes of romanisation prominently. Does it make sense to keep transliteration and transcription modules separate in the module infrastructure for these languages? Yes. Many editors of these languages have been conscious of the need to use the appropriate romanisation in certain contexts. See for example, how User:Angr changed the romanisation of the Burmese word to a transcription at elephant. Our Korean romanisation scheme is the transcriptive Revised Romanisation scheme, which is official in South Korea (also hidden under the misnomer Module:ko-translit). User:Visviva, our first prolific Korean contributor, created the entry 미끄럽다. Note the differential use of a transcriptive romanisation in the main text (mikkeureopda) and a transliterative romanisation in etymology (muys.kulepta). Considerations of the arguments for and against need to be made in the context of these script-pronunciation-discordant languages. Provided the romanisation is well-annotated, such as 믯그럽다 (Yale: muys.kulepta) at 미끄럽다, the appropriate, purpose-oriented use of romanisations is hugely beneficial to dictionary building, for these languages. Wyang (talk) 00:54, 23 August 2016 (UTC)
I will just add, Wyang, that you keep claiming that the people opposed to you are giving no real reasons for doing so, but you yourself have given no reasons why consistently using a dual romanization scheme, like I've suggested as a compromise between your view and CodeCat's, is unacceptable, other than the unsupported claim that it's confusing for new users. Benwing2 (talk) 01:09, 23 August 2016 (UTC)
  • FWIW, my bias is to using the more-phonetic romanization (I recognize this is not IPA-grade, but it *is* generally closer to how an English speaker would say something) in translation lists and similar locations, and using the strict transliteration (i.e. letter-for-letter) romanization in etymology sections and other discussions of the term's etymology and historical development. I.e., I disagree with Benwing's suggestion, and I don't want to see both systems used in all cases. This would be similar to what is already in practice for Korean. ‑‑ Eiríkr Útlendi │Tala við mig 01:18, 23 August 2016 (UTC)
  • Exactly. The information presented in a dictionary should be succinct; dual romanisation in translations is infoxication. The reason users look up translations is to answer their questions of “how do you say ... in ...?”, and romanisation in translations should cater to the need of users. Let's look at how other translation dictionaries do this: the only previewable English-Tibetan dictionaries on Google Books are 1, 2 and 3 and all are using transcriptions only. Transcriptions answer the users' questions directly, without additional romanisations to complicate their information processing (below). Wyang (talk) 02:57, 23 August 2016 (UTC)
“hello”: བཀྲ་ཤིས་བདེ་ལེགས (pr. zhacf-xih-dev-leh) vs བཀྲ་ཤིས་བདེ་ལེགས (zhacf-xih-dev-leh [bkra shis bde legs])
“birthday”: འཁྲུངས་སྐར (pr. chungf-gaaf) vs འཁྲུངས་སྐར (chungf-gaaf ['khrungs skar])
“brain”: ཀླད་པ (pr. laef-baf) vs ཀླད་པ (laef-baf [klad pa])
Do you have any evidence that users look up translations solely for the pronunciations of things? The main reason that the system with multiple romanisations is confusing is that not a single one of them is explained.
“brain”: ཀླད་པ (pr. laef-baf, sp. [klad] [pa]) Korn [kʰũːɘ̃n] (talk) 10:08, 23 August 2016 (UTC)
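The display variants being compared above can be lined up side by side with a short sketch. It is Python for illustration only; the function `format_translation` and its style names are hypothetical, and the output formats simply reproduce the three notations used in the examples in this thread.

```python
def format_translation(term: str, transcription: str,
                       transliteration: str, style: str) -> str:
    """Render a translation-table item in one of the notations discussed here."""
    if style == "transcription_only":
        # e.g. (pr. laef-baf)
        gloss = f"pr. {transcription}"
    elif style == "dual":
        # e.g. (laef-baf [klad pa])
        gloss = f"{transcription} [{transliteration}]"
    elif style == "labelled":
        # e.g. (pr. laef-baf, sp. [klad] [pa]): each spelled syllable bracketed
        spelled = " ".join(f"[{syllable}]" for syllable in transliteration.split())
        gloss = f"pr. {transcription}, sp. {spelled}"
    else:
        raise ValueError(f"unknown style: {style}")
    return f"{term} ({gloss})"


for style in ("transcription_only", "dual", "labelled"):
    print(format_translation("ཀླད་པ", "laef-baf", "klad pa", style))
```

The labelled variant is the only one of the three that tells the reader which part is pronunciation and which is spelling, at the cost of a longer item.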
There is plenty of evidence for that. Many transliterations are efforts of a few people over a period of time who took part in their development. Apart from Wyang and myself, you can talk to people like Eirikr, Haplology (Japanese), TAKASUGI Shinji (Japanese and Korean), Aryamanarora (Hindi), Benwing2 (Russian), Saltmarsh (Greek) about why transliterations are the way they are. They use some phonetic elements; they are not IPA and are not supposed to convey the pronunciation accurately. --Anatoli T. (обсудить/вклад) 10:26, 23 August 2016 (UTC)
Anatoli, I don't understand what you are trying to tell me with your comment. Wikipedia says that Tibetan Pinyin does not mark tone. Our pronunciation sections use the label 'Tibetan Pinyin' but according to Wyang, the superscript letters we see are tone marks. What kind of system is that? It seems in need of relabeling. Korn [kʰũːɘ̃n] (talk) 10:34, 23 August 2016 (UTC)
Korn: See Tibetan pinyin#References. This is the modified version of the official Tibetan Pinyin, with tone letters. The Wylie transliteration scheme is the gold standard of Tibetan romanisation; it and its variants are used in almost all scholarly publications. But it is interesting that all of the three previewable English-Tibetan dictionaries on Google Books use transcriptions only to romanise their Tibetan translations. Wyang (talk) 10:56, 23 August 2016 (UTC)
My point was that for each language, the interested and knowledgeable editors decided what and how to go about transliterations for specific languages. I could bring series of discussions about Korean. Wyang implemented most of it. The phonetic transliteration (RR) was adopted - officially recommended in South Korea. There were relevant long discussions, decisions made. Now with the argument between Wyang and CodeCat everybody joined with their opinions but cared little when the actual problems were discussed and solved. --Anatoli T. (обсудить/вклад) 11:08, 23 August 2016 (UTC)
Wyang, there should be a note about what the symbols mean on About: Tibetan, including what the tone marks mean, this information is not easily retrievable. Anatoli, I assume the reason for that is that now the community is forced to take note of the situation because it's brought to the Beer Parlour whereas before it was discussed amongst editors of the language. Korn [kʰũːɘ̃n] (talk) 14:55, 23 August 2016 (UTC)
Absolutely; have to work on it later. Wyang (talk) 21:16, 23 August 2016 (UTC)


Hello. I am posting here because there is nowhere else to post. It's about az.wiktionary (Azeri, aka Azerbaijani). There is a problem: last year a user (perhaps an admin) who is a native speaker changed the verb categories from fel to feil. Later, another user who is also a native speaker rejected his changes. There are not many people there to discuss this. I hope this community has knowledge of Azeri and can decide which name the categories should use. --Octahedron80 (talk) 06:02, 18 August 2016 (UTC)

PS If you want to make changes, please do at az.wiktionary directly. I am off. --Octahedron80 (talk) 06:19, 18 August 2016 (UTC)

(calling @Aabdullayev851) A third spelling for verb is fe'l. I think there are slight differences in fel / feil / fe'l, but I'm not sure what the differences are. —Stephen (Talk) 15:12, 22 August 2016 (UTC)
Hello Octahedron80 and Stephen. I am a new admin on the Azerbaijani Wiktionary. Thank you for your attention to Azerbaijani. There is no fe'l in the Azerbaijani language. I am sure about that. But fel or feil? Some sources use fel, but some sources use feil. I am also confused about these words. How can we decide which is correct? --Aabdullayev851 (talk) 17:00, 22 August 2016 (UTC)
OK. Fe'l was used in Azerbaijani in Soviet times. After that, fel came into use, and since 2004, feil. So the current version is feil. You can see the same changes in şe'r - şer - şeir. --Aabdullayev851 (talk) 17:31, 22 August 2016 (UTC)
Let's use the modern spelling, so probably feil. --Octahedron80 (talk) 00:33, 23 August 2016 (UTC)
Ok Octahedron80 --Aabdullayev851 (talk) 07:25, 24 August 2016 (UTC)

Proposal: 3-revert-rule

Wikipedia has a 3-revert-rule which exists explicitly to stop endless edit wars. I think Wiktionary should have a similar policy.

However, I don't know if it should be identical in every detail to Wikipedia's version. For example, Wikipedia gives both users three reverts, which means that once all three are used up, the edit is kept rather than reverted. On Wiktionary we follow the principle that disputed edits should be reverted first and then discussed for consensus. So it would make more sense if, when both parties have exhausted their reverts, the final state of a page is the original state rather than the state with the disputed edit. I therefore think that the editing party should get only 2 reverts.

Another detail that should be worked out is how to enforce it, especially when two admins are involved. Wikipedia has a special 3RR noticeboard, which makes sense because it happens so often there. But here it's pretty rare, so there's probably not that much need for a special forum. Would WT:VIP do? I feel that the Beer Parlour is insufficient, as I found out myself when I tried to report an incident and it was ignored. Is it possible at all to make a rule against sysop apathy? —CodeCat 22:06, 18 August 2016 (UTC)

I really don't want us to start enshrining common sense as policy or to start having highly regulated drama-fests like Wikipedia does. Editors shouldn't revert-war and wheel-war because it's disruptive; we shouldn't need an explicit policy stating what everyone already knows. As for returning a page to the "original state", who decides what the "original state" is? If today I revert something another editor did back in 2013, and 15 minutes later he reverts my revert, what's the "original state"? I like how unregulated and uncomplicated Wiktionary is compared to Wikipedia, and I want to keep it that way. —Aɴɢʀ (talk) 22:15, 18 August 2016 (UTC)
I don't see a way to enforce any of our policies. Renard Migrant (talk) 22:23, 18 August 2016 (UTC)
The enforcement mechanisms are social pressure and the action of bureaucrats. Sadly, some seem largely immune from social pressure. DCDuring TALK 02:09, 19 August 2016 (UTC)
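CodeCat's point about which version survives a revert war can be checked with a small simulation. This is only a sketch under assumptions: the function name and the strict alternation of turns (the objector reverts first, then the two sides take turns until someone runs out) are illustrative and not part of any actual policy text.

```python
def final_state(editor_reverts: int, objector_reverts: int) -> str:
    """Return which version of the page survives a revert war.

    Assumed model: the editor makes a disputed edit, the objector reverts
    it, and the two alternate until the side whose turn it is has no
    reverts left.
    """
    state = "edited"       # the disputed edit has just been made
    turn = "objector"      # the objector is the first to revert
    remaining = {"editor": editor_reverts, "objector": objector_reverts}
    while remaining[turn] > 0:
        remaining[turn] -= 1
        # An objector revert restores the original; an editor revert
        # restores the disputed edit.
        state = "original" if turn == "objector" else "edited"
        turn = "editor" if turn == "objector" else "objector"
    return state


print(final_state(3, 3))  # edited   (three reverts each, as on Wikipedia)
print(final_state(2, 3))  # original (editor limited to two reverts)
```

With three reverts each the disputed edit is what remains; limiting the editing party to two leaves the page in its original state, which is the outcome argued for above.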

PIE root

Wiktionary:Table of votes

I created Wiktionary:Table of votes. It is automatically generated using Lua.

Feel free to discuss the new page. If there are any suggestions, I can make the changes in the Lua code.

My idea was to let people know when they voted and when they didn't vote. I was hoping that maybe this would increase the turnout in all the vote pages. --Daniel Carrero (talk) 13:58, 22 August 2016 (UTC)

I also added an "expand" link, at the bottom of the vote box, pointing to the new table of votes. --Daniel Carrero (talk) 21:06, 22 August 2016 (UTC)

Unfortunately, the page is too wide. I wonder if I should implement a list of abbreviations for all users. Examples:

--Daniel Carrero (talk) 14:47, 23 August 2016 (UTC)

I will object to any foreshortening of User:I'm so meta even this acronym which is not "ISMETA." - TheDaveRoss 18:08, 23 August 2016 (UTC)
Oops! I typed "IMSOMETA" but I fixed it now. I also removed "META". Those were mistakes. --Daniel Carrero (talk) 18:43, 23 August 2016 (UTC)
Would it be too hard to read the names of the votes if you reoriented the table so that the usernames were in a vertical column? As long as the text of the vote names was wrapped, I think that might be an improvement. Andrew Sheedy (talk) 20:18, 23 August 2016 (UTC)
I agree. There's more voters than votes, so the lower number should be arranged horizontally. —CodeCat 20:42, 23 August 2016 (UTC)
@Andrew Sheedy, CodeCat:   Done. Does it look better now? --Daniel Carrero (talk) 21:48, 23 August 2016 (UTC)
Much! —CodeCat 21:54, 23 August 2016 (UTC)
Definitely—it fits on my screen now! :) Andrew Sheedy (talk) 21:58, 23 August 2016 (UTC)
It should fit on everyone's screens now: "The time allocated for running scripts has expired" doesn't take up much room at all ... Chuck Entz (talk) 09:03, 31 August 2016 (UTC)
@Chuck Entz: You mean you wanted the table to fit your screen and display some actual content? Some people are never happy!
Just kidding. I split the vote into 3 separate tables yesterday: WT:TOV, WT:TOV2, WT:TOV3. The pages seem to be displayed like they should ... most of the time, but sometimes the module error reappears. I used the "hard purge" button to fix it when it happened. You can edit the pages to change the number of votes that appear in each table if you want. --Daniel Carrero (talk) 22:59, 1 September 2016 (UTC)

Voting policy etc.

Wiktionary:Votes had a lot of important text hidden in collapsible divs. I moved it all to:

--Daniel Carrero (talk) 12:33, 23 August 2016 (UTC)

Vote: Making usex the primary name in the wiki markup

FYI, I created Wiktionary:Votes/2016-08/Making usex the primary name in the wiki markup. Let us postpone the vote as much as discussion requires. --Dan Polansky (talk) 17:32, 23 August 2016 (UTC)

New PIE root categories

Per comments at WT:RFDO#Template:PIE root, I made {{inh}}, {{der}} and {{bor}} populate categories like Category:Czech terms derived from the PIE root *swep- when the current term is derived from PIE.

This caused a number of redlinked PIE root categories to appear in Special:WantedCategories.

Is that okay with everyone? Is there any problem with these categories, or can they all be created normally? If there is no objection, can someone create all the categories automatically? --Daniel Carrero (talk) 19:07, 23 August 2016 (UTC)

I removed the code that did that, because it was creating all kinds of unwanted categories. Essentially, all PIE terms were being given categories, not just the roots. That said, it's impossible for these templates to determine what is or isn't a root, and besides that, there's tons of etymologies which refer to invalid or badly-formed roots as well. —CodeCat 19:40, 23 August 2016 (UTC)
Maybe it would be better to give the templates a parameter so users can manually make them opt in to categorizing, e.g. |PIErootcat=1 or something like that. —Aɴɢʀ (talk) 09:58, 25 August 2016 (UTC)

Proposal: Request categories with longer names

This concerns Wiktionary:Votes/2016-07/Request categories. See also the vote talk page for further discussion: Wiktionary talk:Votes/2016-07/Request categories. The vote has not started yet and is under construction.

Consider all these categories:


Rename all categories, with longer, more accurate names, with proper English grammar/syntax and "requests" in all names. Details are to be discussed below. (Feel free to suggest different names for the categories if you want.)

Proposed names:








Rationale and notes:

  • A more consistent naming style proposed to be used in all request categories.
  • ("requests" vs. "needing") These categories track where something was manually requested, not where it is needed.
  • I attempted to propose names with correct English grammar/syntax, as opposed to, for example, "English requests for example sentences". As discussed before, there are no "English requests" anywhere.
  • I'd like to replace "needing attention" and "to be checked" with "review". If we are making a request to do something, we are asking people to review the entries.
  • A minor reason: Some of the proposed category names match the request template. {{rfap}} = "request for audio pronunciation", {{rfe}} = "request for etymology".

--Daniel Carrero (talk) 09:21, 24 August 2016 (UTC)

@Daniel Carrero Thanks for directing me over here. I very much like this proposal. It solves the problem of the current names, which imply that entries in a category are the only ones that need this or that to be added to them or worked on, and the problem of the earlier proposal implying that the requests themselves are in a particular language. It might also be more understandable to newcomers.
I'm curious, though: "requests for review" sounds fine to me as an American English speaker, but is it acceptable British English as well? I recall from Harry Potter that "revising" is the verb denoting what I would call "reviewing". But that was in the context of homework assignments, so I'm not sure. — Eru·tuon 01:24, 26 August 2016 (UTC)
IMO, "revise" does not look as good as "review". Then again, I'm from Brazil, rather than the USA or England. Let me know if you would suggest any change.
I started the vote: Wiktionary:Votes/2016-07/Request categories. --Daniel Carrero (talk) 01:21, 28 August 2016 (UTC)

Upcoming 5 million entries milestone

Should be reached in around one or two months max. Maybe an occasion to celebrate a bit and do some events / communication around the project? A WMF guest blog post with some stats / data visualisations / stories? I'm still surprised about the low profile Wiktionary has, many people have not even heard about it, or confuse it with Wikipedia. Or worse, it gets treated as inferior. At the recent WikiConvention francophone I heard the remark (paraphrased from memory, probably meant as a joke) "Wikipedia is where the real work gets done, Wiktionary is for scrabble players". Time to change this perception! – Jberkel (talk) 10:08, 24 August 2016 (UTC)

The speaker was probably referring to fr.wikt. [;-}] DCDuring TALK 10:33, 24 August 2016 (UTC)
To this day at Wikipedia there are people who consider Wiktionary to be Wikipedia's trashcan. I still see "Transwiki to Wiktionary" at deletion discussions all the time there, even for terms we already have an entry for, and even when our entry is superior to the Wikipedia article up for deletion. —Aɴɢʀ (talk) 15:09, 24 August 2016 (UTC)
I wish that we had a real encyclopedia as a sister project. DCDuring TALK 17:04, 24 August 2016 (UTC)
We are their WT:LOP, and WT:LOP's WT:LOP is Urban Dictionary. Equinox 17:28, 24 August 2016 (UTC)
Deleting WT:LOP would improve our overall quality. --Daniel Carrero (talk) 22:54, 24 August 2016 (UTC)
WT:LOP was formerly the only place to put words that we now call "hot words". It also served as a means of handling good-faith new entries that is less hostile than deletion. I think we are better off looking like the work in progress that we are rather than pretending that we are at all close to being a finished product in whole or in part. DCDuring TALK 00:33, 25 August 2016 (UTC)
Yeah, LOP serves to satisfy/distract (some of the) contributors who would otherwise keep trying to add their neologisms to the mainspace, which makes it useful. 'Cause on our end, we can just ignore it... - -sche (discuss) 08:40, 25 August 2016 (UTC)
Something to be mentioned in whatever news release goes out: we have words from about one-third of the world's languages, according to conventional estimates of how many languages are spoken in the world. As of when WT:STATS was updated, we were up to 2535 languages with entries, and I expect we are at least over 2600 by now, if not higher. We include codes for 7960 languages, and given how many languages we've identified as needing codes, which I am steadily adding, I expect that figure will reach 8000 soon (by which time I expect to have passed the one-third mark of 2667 languages with entries, since most languages I am adding codes for I am also adding entries in). - -sche (discuss) 08:40, 25 August 2016 (UTC)
At ~4,845,000 pages, we're pretty close to Wikipedia's pagecount (~5,223,000). I expect that we'll overtake WP as time goes on, because we define at least 200,000 English base words (number of entries minus number of form-of definitions = 368,098, but many are variant spellings, so I conservatively guess 200,000 base words), and have the potential to include that many entries in several thousand languages (assuming poorly-attested languages on one side and highly agglutinative or inflected languages on the other side will make a wash), which is several hundred million entries. (Even just that many entries in the 500 most common of the languages we include would be 100,000,000.) - -sche (discuss) 08:50, 25 August 2016 (UTC)
Interesting, puts things into perspective, but also hints at how much there is still left to do. I'm planning to work on a visualisation which can somehow demonstrate this diversity and connectedness of languages. – Jberkel (talk) 16:19, 30 August 2016 (UTC)
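For anyone who wants to re-check the figures quoted in this thread, the arithmetic can be reproduced as follows. All numbers are taken from the comments above; the variable names are just labels chosen for this sketch.

```python
# Figures quoted in the thread (as of the WT:STATS update mentioned above).
languages_with_entries = 2535
languages_with_codes = 7960

# Share of coded languages that already have entries.
share = languages_with_entries / languages_with_codes
print(f"{share:.1%}")  # 31.8%

# The "one-third mark" for 8000 language codes.
one_third_mark = round(8000 / 3)
print(one_third_mark)  # 2667

# Conservative guess of 200,000 base words in each of the 500 most
# common languages.
base_words = 200_000
top_languages = 500
print(base_words * top_languages)  # 100000000
```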

Proposal: make headword templates for some languages automatically categorise phrasal verbs

The "phrasal verbs" category is pretty well populated for English, but not so much for other languages. There are also, presumably, many more missing from the English category. I therefore propose that

  • The English headword module be modified so that when the page name contains a space, then the phrasal verbs category is automatically added. I'm not sure if phrasal verbs should also be put in the plain "verbs" category.
  • This change also be applied to the modules of other languages, where this is desirable or applicable.
  • This change be applied to other parts of speech, if desired.

It would also be possible to implement this directly in Module:headword, and then it would apply automatically for all languages. However, I don't know if this would be desirable. If everyone else thinks it's fine, we can do that instead. —CodeCat 16:26, 24 August 2016 (UTC)

I don't think that everything that meets the condition presented is a phrasal verb, within the normal meaning of the term, eg, break wind. I don't believe that every entry for a verb followed by a preposition or particle is a phrasal verb, eg, go to hell or go to in that phrase. DCDuring TALK 17:10, 24 August 2016 (UTC)
I wasn't aware of that definition. I thought it just meant any verb that is a phrase (i.e. the SoP meaning of phrasal verb itself). —CodeCat 17:14, 24 August 2016 (UTC)
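The proposed rule and the objection to it can be illustrated with a short sketch. The function name is hypothetical and this is not code from Module:headword; it only restates the "page name contains a space" test from the proposal.

```python
def looks_phrasal(page_name: str, part_of_speech: str) -> bool:
    """Heuristic from the proposal: a verb entry whose page name has a space."""
    return part_of_speech == "verb" and " " in page_name


for title in ["give up", "break wind", "go to hell", "run"]:
    print(title, looks_phrasal(title, "verb"))
```

The heuristic flags "break wind" and "go to hell" just as readily as "give up", which is the false-positive problem raised above: a space in the page name shows that the entry is a multi-word verb, not that it is a phrasal verb in the usual grammatical sense.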

Proposal: automatically categorise palindromes in Module:headword

Recently, logic was added to categorise entries if they have unusual characters in them. We can also do other "analysis" of words automatically in the module, including palindromes. Therefore I propose that we add this feature to the module so that the categories don't have to be added manually anymore. —CodeCat 16:31, 24 August 2016 (UTC)

Would it be undesirably computationally expensive to do anagrams this way too? Equinox 17:27, 24 August 2016 (UTC)
Modules have no way to see what entries are in a category, so they are not able to go over each one and see if a term is an anagram of the current term. —CodeCat 17:39, 24 August 2016 (UTC)
  • This is a good idea, and (much unlike anagrams) I can't think of too many language-specific issues, besides the fact that it's not relevant for certain scripts. —Μετάknowledgediscuss/deeds 22:40, 24 August 2016 (UTC)
  • I agree, this is a good idea. --Daniel Carrero (talk) 22:53, 24 August 2016 (UTC)
  • I also like this idea. Will it be implemented such that periphrastic palindromes ("Madam, I'm Adam") would be allowed? You'd have to remove spaces and punctuation, lowercase the string, and pass it through the sort key logic to get reliable results. —JohnC5 00:44, 25 August 2016 (UTC)
What is the minimum length string we will consider a palindrome? 3 characters? DTLHS (talk) 00:44, 25 August 2016 (UTC)
1 character, it seems. Both a and I are in Category:English palindromes. --Daniel Carrero (talk) 00:53, 25 August 2016 (UTC)
I definitely disagree with categorizing single characters as "palindromes". DTLHS (talk) 01:01, 25 August 2016 (UTC)
Appendix:English palindromes has palindromes with a minimum length of 2 characters. There are a few two-letter palindromes in the category, too: ee, oo, BB. Can abbreviations, such as AAA, be considered normal palindromes? --Daniel Carrero (talk) 01:09, 25 August 2016 (UTC)
Those may be better in a category Repeated character than palindromes. — Dakdada 13:21, 25 August 2016 (UTC)
Suppose we define "palindrome" for these purposes as words containing multiple different characters. This would effectively exclude all two letter palindromes, which only repeat the same character, and things like WWW and ooo, but would include things like Ana and oro. bd2412 T 14:40, 25 August 2016 (UTC)
That rule should not apply to non-strictly-alphabetic scripts, such as abjads, abugidas, syllabaries, and logographic scripts. In these scripts repetition of the same letter is more than a "repeated character", and thus two-character palindromes and even three-same-character palindromes should be counted as palindromes. --WikiTiki89 14:06, 29 August 2016 (UTC)
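The rules discussed so far (ignore spaces, punctuation and case; require more than one distinct character) can be sketched as below. This is an illustration in Python, not the code of any Wiktionary module; the function names and the `min_distinct` parameter are assumptions made for the example.

```python
def normalize(term: str) -> str:
    """Keep only letters and digits, lowercased (drops spaces and punctuation)."""
    return "".join(ch.lower() for ch in term if ch.isalnum())


def is_palindrome(term: str, min_distinct: int = 2) -> bool:
    """True if the normalized term reads the same in both directions.

    min_distinct=2 implements the "multiple different characters" idea,
    which excludes "a", "ee" and "WWW"; min_distinct=1 would drop that
    requirement, e.g. for scripts where it should not apply.
    """
    chars = normalize(term)
    if len(set(chars)) < min_distinct:
        return False
    return chars == chars[::-1]


for term in ["Madam, I'm Adam", "Ana", "oro", "WWW", "ee", "a"]:
    print(term, is_palindrome(term))
```

Under these assumptions the first three terms count as palindromes and the last three do not. Passing `min_distinct=1` is one possible way to accommodate the objection about abjads, abugidas, syllabaries and logographic scripts.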
I'd like to implement this, but I don't have the sysop rights required to do it. —CodeCat 16:24, 25 August 2016 (UTC)
One other consideration, should we only do this for languages in the Latin script? Or are there scripts we should specifically exclude? DTLHS (talk) 00:00, 26 August 2016 (UTC)
Alphabetic scripts like Greek and Cyrillic also have palindromes. We also have Appendix:Chinese palindromes, and Category:Arabic palindromes. Sanskrit, despite its inherent vowels, also apparently has palindromes. It seems like all scripts can have palindromes. - -sche (discuss) 00:14, 26 August 2016 (UTC)
Is a vowelless Hebrew palindrome still a palindrome when you add the vowels? DTLHS (talk) 00:15, 26 August 2016 (UTC)
Based on the examples in Category:Hebrew palindromes and w:Palindrome, yes. - -sche (discuss) 00:34, 26 August 2016 (UTC)
In Hebrew and Arabic, vowels do not count as part of the written word. In Hebrew, you also have to make sure to count final letters as equivalent to their non-final forms (which, as I see at מום‎, already works). In Arabic, there is another little bit of trickiness: classically, ي‎ and ى‎ counted as one letter, as did ه‎ and ة‎, and hamzas did not count as letters. Thus, the following words should be palindromes: يرى‎, أنا‎, آباء‎, همة‎, يجئ‎, as well as وضؤ‎ if we had entries for such things. I don't know how much of this carries over to Persian or other languages. --WikiTiki89 13:14, 29 August 2016 (UTC)
I implemented Arabic. --WikiTiki89 14:06, 29 August 2016 (UTC)
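The Hebrew and Arabic rules described above can be sketched as a per-language folding table applied before the comparison. This is a Python illustration under assumptions, not the implementation referred to in the thread: the table names and the use of Unicode NFD decomposition to separate hamza and madda from their carrier letters are choices made for this example.

```python
import unicodedata

# Hebrew final letters folded to their non-final forms.
HEBREW_FOLDS = {
    "\u05DA": "\u05DB",  # final kaf   -> kaf
    "\u05DD": "\u05DE",  # final mem   -> mem
    "\u05DF": "\u05E0",  # final nun   -> nun
    "\u05E3": "\u05E4",  # final pe    -> pe
    "\u05E5": "\u05E6",  # final tsadi -> tsadi
}

# Arabic letters that classically count as one letter, or not at all.
ARABIC_FOLDS = {
    "\u0649": "\u064A",  # alif maqsura -> ya
    "\u0629": "\u0647",  # ta marbuta   -> ha
    "\u0621": "",        # standalone hamza is ignored
}


def fold(text: str, table: dict) -> str:
    # NFD splits letters such as alif-with-hamza into a base letter plus a
    # combining mark; dropping combining marks also removes vowel points.
    decomposed = unicodedata.normalize("NFD", text)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(table.get(ch, ch) for ch in bare)


def is_palindrome(text: str, table: dict) -> bool:
    folded = fold(text, table)
    return len(folded) > 1 and folded == folded[::-1]


# The Hebrew example and the six Arabic examples from the comment above,
# written as escapes to avoid bidirectional-text problems in source code.
print(is_palindrome("\u05DE\u05D5\u05DD", HEBREW_FOLDS))        # True
for word in ["\u064A\u0631\u0649", "\u0623\u0646\u0627",
             "\u0622\u0628\u0627\u0621", "\u0647\u0645\u0629",
             "\u064A\u062C\u0626", "\u0648\u0636\u0624"]:
    print(is_palindrome(word, ARABIC_FOLDS))                     # True each time
```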
It seems Korean palindromes are graphical, by hangeul syllables, e.g. 적극적 (jeokgeukjeok) 적 + 극 + 적, not by jamo. 적극적 wouldn't be a palindrome if decomposed into ᄌ ᅥ ᆨ ᄀ ᅳ ᆨ ᄌ ᅥ ᆨ. --Anatoli T. (обсудить/вклад) 13:26, 29 August 2016 (UTC)
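The difference between comparing by syllable block and comparing by jamo can be demonstrated with Unicode normalization. This is a Python sketch; the function names are made up for the example, and NFD is used here simply because it decomposes precomposed hangul syllables into conjoining jamo.

```python
import unicodedata

# 적극적, written as precomposed hangul syllables (U+C801 U+ADF9 U+C801).
WORD = "\uC801\uADF9\uC801"


def is_palindrome_by_syllable(text: str) -> bool:
    """Compare whole syllable blocks (one code point each in NFC)."""
    blocks = unicodedata.normalize("NFC", text)
    return blocks == blocks[::-1]


def is_palindrome_by_jamo(text: str) -> bool:
    """Compare individual jamo after canonical decomposition (NFD)."""
    jamo = unicodedata.normalize("NFD", text)
    return jamo == jamo[::-1]


print(is_palindrome_by_syllable(WORD))          # True
print(is_palindrome_by_jamo(WORD))              # False
print(len(unicodedata.normalize("NFD", WORD)))  # 9
```

So a checker that normalizes to NFD before reversing (for instance to strip diacritics in other scripts) would need to treat Korean separately if syllable-level palindromes are wanted.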
Also Scribunto doesn't have any built-in way to reverse a string by character instead of bytes- if you could write out your implementation I'd be happy to edit the module. DTLHS (talk) 00:11, 26 August 2016 (UTC)
There's a working function now, it's in Module:palindromes but I imagine it will be inserted into Module:headword eventually. There's a test at Module:palindromes/testcases, where I put in the lyrics of "Bob", a song consisting entirely of palindromes, to test it. It probably needs to ignore more types of characters though. —CodeCat 02:17, 26 August 2016 (UTC)
palindrome: "[...] sometimes disregarding punctuation, capitalization and diacritics". It should probably also remove diacritics: réifier is listed as a French palindrome. --Vriullop (talk) 08:43, 26 August 2016 (UTC)
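Two of the points raised here, reversing by character rather than by byte and ignoring diacritics, can be shown with réifier. This is a Python sketch rather than Scribunto code, and the helper names are invented for the illustration.

```python
import unicodedata

WORD = "r\u00e9ifier"  # réifier, with a precomposed é


def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def survives_byte_reversal(text: str) -> bool:
    """Reverse the UTF-8 bytes and report whether the result is still valid text."""
    try:
        text.encode("utf-8")[::-1].decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Reversing by bytes corrupts the two-byte sequence for é.
print(survives_byte_reversal(WORD))   # False

# Reversing by code point works, but é and e still differ.
print(WORD == WORD[::-1])             # False

# After stripping diacritics the word reads the same both ways.
plain = strip_diacritics(WORD)
print(plain, plain == plain[::-1])    # reifier True
```

Note that stripping diacritics wholesale is a language-specific decision; in some orthographies an accented letter is a distinct letter, so a per-language rule may be preferable.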
@CodeCat I have added this to Module:headword. We should only get false negatives, not false positives, so it should be safe. If anyone wants to add more language specific rules just edit the data table in Module:palindromes. DTLHS (talk) 23:44, 26 August 2016 (UTC)
I don't really agree with passing the language code to the function, as a string. The normal practice in the modules is to pass the language object itself, and let the function fetch the code from it when it needs it. —CodeCat 23:53, 26 August 2016 (UTC)
Also, it's usually a bad practice to include other modules at the top of a module. This is an unconditional load; the module gets loaded by the other module no matter what. But it is often more efficient to load the module in-place, where you use it. That way, the module doesn't get loaded unless it's actually needed. —CodeCat 23:55, 26 August 2016 (UTC)
But we can't know whether the module is needed without including it. DTLHS (talk) 23:56, 26 August 2016 (UTC)
You include it when you need it. Not the entirety of Module:headword needs the module, but only that one piece of code you added in. So you can require it there, on the spot. You don't even need a variable for it, just require("Module:palindromes").is_palindrome(. —CodeCat 23:59, 26 August 2016 (UTC)
Now the module just needs to be categorised. Any idea where it fits? —CodeCat 00:16, 27 August 2016 (UTC)
non-: maybe we shouldn't remove hyphens from the start or the end of the word? I'm not sure this should be categorized as a palindrome. DTLHS (talk) 00:31, 27 August 2016 (UTC)
Fixed now. —CodeCat 00:38, 27 August 2016 (UTC)
I'm no longer able to edit the module. Can this be fixed please? —CodeCat 12:18, 27 August 2016 (UTC)
It is transcluded in the main page, so it has cascading protection. DTLHS (talk) 14:06, 27 August 2016 (UTC)
That cascading protection has repeatedly caused issues like this, with no discernible benefit that can't be obtained better in another way with less collateral damage. (And discussions have shown apathy from admins towards the question.) I've turned it off so that the protection is now only local, and laid out my reasoning in more detail below, with links to some previous discussions. - -sche (discuss) 18:40, 27 August 2016 (UTC)
There are now a bunch of new palindrome categories in Special:WantedCategories if anyone would like to go over them for mistakes. DTLHS (talk) 15:02, 29 August 2016 (UTC)
Now with even more Proto-Malayo-Polynesian palindromes! (nice work) Jberkel (talk) 02:59, 31 August 2016 (UTC)
That does seem odd... @CodeCat, @JohnC5, should we disable it for reconstructions? DTLHS (talk) 03:04, 31 August 2016 (UTC)
That wasn't sarcastic. The only dictionary which would ever have something like it. Although the usefulness of reconstructed palindromes is indeed questionable. Jberkel (talk) 03:10, 31 August 2016 (UTC)
I don't think we should have them for reconstructed terms, whether in reconstructed languages or attested ones. Reconstructions aren't really orthographic forms, but rather representations of the scientific method that reconstructs them. It is, therefore, much more arbitrary. The very same word might be a palindrome to one linguist and not one to another, all depending on which notation they happen to prefer. Attested languages have no such leeway. —CodeCat 13:04, 31 August 2016 (UTC)
While I agree with you that reconstructed terms should not have palindrome categories, your reasoning is a little off. There are many words in attested languages that have alternative spellings, one being a palindrome and the other not, and it is not uncommon for people who are trying to create palindromic sentences to choose a rarer spelling because it happens to fit. My point is that the fact that "the very same word might be a palindrome to one linguist and not one to another" is not the real reason to exclude reconstructions. The real reason is simply that the orthography of reconstructions is artificial and thus meaningless, and there's no more to it than that. --WikiTiki89 13:15, 31 August 2016 (UTC)
User:DTLHS has just disabled this feature for languages whose type is "reconstructed". This brings up a question: Should reconstructed words in an attested language (for example, Vulgar Latin) be categorized as palindromes? I think not, thus the disabling should depend on the namespace rather than the language type. --WikiTiki89 14:55, 31 August 2016 (UTC)
That's fine with me. DTLHS (talk) 14:58, 31 August 2016 (UTC)
What's the best way to check the namespace? DTLHS (talk) 15:08, 31 August 2016 (UTC)
Like this. --WikiTiki89 15:11, 31 August 2016 (UTC)
Actually, I just moved this check to Module:headword. It makes more logical sense to me to have it there. --WikiTiki89 15:15, 31 August 2016 (UTC)
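The idea of deciding by namespace rather than by language type can be sketched as follows. This is a hypothetical Python illustration, not the check that was added to Module:headword; the "Reconstruction:" prefix, the function name and the example titles are assumptions made for the example.

```python
# Assumed name of the namespace that holds reconstructed terms.
RECONSTRUCTION_PREFIX = "Reconstruction:"


def should_categorise_as_palindrome(full_page_title: str, is_palindrome: bool) -> bool:
    """Skip palindrome categories for any page in the reconstruction namespace.

    Checking the namespace (rather than the language type) also covers
    reconstructed terms in attested languages, such as Vulgar Latin.
    """
    if full_page_title.startswith(RECONSTRUCTION_PREFIX):
        return False
    return is_palindrome


# Hypothetical titles, for illustration only.
print(should_categorise_as_palindrome("oro", True))                         # True
print(should_categorise_as_palindrome("Reconstruction:Example/aba", True))  # False
```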

Unicode 9.0

Can someone please update Appendix:Unicode and subpages?

The appendices cover the characters up until Unicode 8.0. Unicode 9.0 was introduced in June, apparently.

List of new Unicode 9.0 characters:

--Daniel Carrero (talk) 16:38, 25 August 2016 (UTC)

Already updated. But Appendix:Unicode/Tangut block cannot be displayed. It will need a different approach. --Octahedron80 (talk) 02:26, 27 August 2016 (UTC)
By the way, the Unihan database still has not updated the readings of the CJK extension blocks. --Octahedron80 (talk) 02:54, 28 August 2016 (UTC)

Trademarks, again

I'm very much opposed to the idea that Wiktionary has to indicate trade marks in any way. We've recently had an editor who seems to be adding trademark nonsense to dodge bow, a common term not at all connected to any particular trademark, or at the very least a genericised one. They've gone to User talk:JohnC5 and argued that their trademark deserves the same recognition as we give to Nike. What is the policy on this? Why is the trademark indicated for Nike in the first place, and how might this case differ from others? —CodeCat 21:36, 25 August 2016 (UTC)

Per WT:TM, we do not indicate trademark status. The talk page of that page has links to the discussions that led up to it, in which a WMF staff member and an[other?] intellectual property lawyer participated. I have removed the "trademark" context label from Nike. (I wonder if we should upgrade that page from think tank to a higher status.) - -sche (discuss) 21:46, 25 August 2016 (UTC)
Yet, we do explicitly define "trademark" as a recognised label. Even though it's not even a usage context. Should we get rid of it? A possibility we could even consider is to include logic in Module:labels and its data modules to explicitly forbid certain labels. —CodeCat 21:58, 25 August 2016 (UTC)
I had been leaving it in the module because it gets a couple new uses a year, in addition to the existing uses we haven't finished clearing out (update: now done), and having them all categorized is useful. It's tempting to imagine forbidding it, but people would find ways around that, like manually writing (trademark) or using "[as] a trademark", so leaving it (or at least not forbidding it) might be better, in that it makes it easy(er) to find and fix uses. - -sche (discuss) 22:17, 25 August 2016 (UTC)
Leaving it in so that you can take it out... that's wonderfully sneaky. —suzukaze (tc) 03:36, 26 August 2016 (UTC)
Leave it in so you can move it, not remove it. Leave it in so you can find such entries to format them correctly. Trademarks should be indicated elsewhere, IMO as a usage note if not in the etymology. DAVilla 09:57, 3 September 2016 (UTC)
If something was coined as a trademark, that's relevant etymology. If it merely has been trademarked, at some point (possibly the present) in some country, that's not relevant. We previously got some requests from companies which insisted that our entries on terms they trademarked (in one case apparently including a term that predated the trademark) should be indicated as trademarked; we noted that a large number of common words are trademarked (eagle, crest, tide), especially in smaller countries; a WMF representative asked us to formulate a document the WMF legal team could point companies to, and with input from an intellectual property attorney we arrived at this approach. - -sche (discuss) 17:27, 3 September 2016 (UTC)
I'm surprised that so many people dislike the trademarks. At the very least it's good to say that a word like "kleenex" originated as a marketing coinage: to some extent, this explains why it's spelled in that strange way. But per the vote it seems to be something to put in the etymology only. Equinox 13:32, 26 August 2016 (UTC)
We may be unreasonably hostile to them, but it is not unreasonable to avoid asserting or denying the legal status of any particular trademark. I do think that our determination that a given trademark has become generic would often be the same as a court's determination, but IANAL. DCDuring TALK 13:59, 26 August 2016 (UTC)
Indicating that something originated as a trademark / brand name is OK. The issue is with companies that want to indicate that they currently hold a trademark on some word ("off", etc). - -sche (discuss) 17:32, 3 September 2016 (UTC)

Austro-Asiatic and Mon-Khmer

Austro-Asiatic was traditionally divided into Mon-Khmer and Munda, but more recent classifications have made Mon-Khmer synonymous with Austro-Asiatic. On Wiktionary, we still have mkh (Mon-Khmer) as a subfamily of aav (Austro-Asiatic). Where this becomes a problem is that it prevents Munda terms like दाः from referencing their Proto-Mon-Khmer i.e. Proto-Austro-Asiatic ancestors. How should this be addressed? - -sche (discuss) 21:54, 25 August 2016 (UTC)

In case anyone wants to, the same fallback we did with Uralic vs. Finno-Ugric should be possible: make Proto-Mon-Khmer an etymology-only language, so that all mkh-pro mentions link to the corresponding aav-pro reconstruction page. In other words, there will be no separate entries for mkh-pro, but any Austroasiatic words not attested in Munda can still be referred to as "Mon-Khmer".
In this particular case though, mkh-pro *ɗaak should simply be moved to aav-pro *ɗaak, as it appears to indeed have Munda descendants. --Tropylium (talk) 20:23, 26 August 2016 (UTC)
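The "etymology-only language" fallback described above can be pictured as a small lookup: a code such as mkh-pro stays usable as a label in etymologies, but every mention resolves to the entries of its parent. The following is only a rough sketch of that idea in Python, not the actual module data; the table name and both helper functions are hypothetical, and only the mkh-pro/aav-pro pair is taken from this discussion.

```python
# Hypothetical table: etymology-only code -> full language that holds the entries.
# Only the mkh-pro -> aav-pro pair comes from the discussion above.
ETYMOLOGY_ONLY_PARENT = {
    "mkh-pro": "aav-pro",
}


def entry_language(code: str) -> str:
    """Follow etymology-only codes up to the language that actually has entries."""
    seen = set()
    while code in ETYMOLOGY_ONLY_PARENT:
        if code in seen:
            raise ValueError(f"cycle in etymology-only data at {code!r}")
        seen.add(code)
        code = ETYMOLOGY_ONLY_PARENT[code]
    return code


def mention(code: str, term: str) -> dict:
    """Keep the label the editor wrote, but point the link at the parent's entry."""
    return {"label": code, "entry_language": entry_language(code), "term": term}


if __name__ == "__main__":
    # A mention written with mkh-pro is displayed as such but links to aav-pro.
    print(mention("mkh-pro", "*ɗaak"))
```

With a table like this, no separate mkh-pro entries are needed, while "Mon-Khmer" remains available as a label for words not attested in Munda.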

Voting in "borrowing, borrowed, loan, loanword → bor"

About this vote: Wiktionary:Votes/2016-07/borrowing, borrowed, loan, loanword → bor. I am the vote creator and one of the supporters.

The vote is scheduled to end in a few days (on August 30). If the vote ended right now, it would pass. Current results: 11-5-2 (68.75% supporting votes, 31.25% opposing votes). But it is just a little above the two-thirds majority required to pass, which would be about 66.7% supporting votes.

I think it would be a good idea for more people to cast their votes here if they haven't already. If more people voted, the result would hopefully show a clearer consensus. It is not clear whether it would pass or fail if a few more people voted. --Daniel Carrero (talk) 03:28, 26 August 2016 (UTC)
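The percentages quoted above can be reproduced with a few lines of Python. This is a minimal sketch, assuming that abstentions are left out of the denominator (which is what the quoted 68.75% implies) and that reaching exactly two-thirds counts as passing; the function name and the threshold comparison are illustrative assumptions, not a statement of voting policy.

```python
from fractions import Fraction


def tally(support: int, oppose: int, abstain: int = 0,
          threshold: Fraction = Fraction(2, 3)):
    """Return (support share, oppose share, passes).

    Abstentions are accepted for completeness but not counted,
    which matches the 11-5-2 -> 68.75% figure quoted in the thread.
    """
    counted = support + oppose
    if counted == 0:
        raise ValueError("no support or oppose votes to count")
    support_share = Fraction(support, counted)
    # Assumption: a share equal to the threshold is treated as passing.
    return support_share, 1 - support_share, support_share >= threshold


if __name__ == "__main__":
    share, against, passes = tally(11, 5, 2)
    print(f"{float(share):.2%} support, {float(against):.2%} oppose, passes: {passes}")

    # How fragile is the margin? Try one more opposing vote.
    share, against, passes = tally(11, 6, 2)
    print(f"{float(share):.2%} support, {float(against):.2%} oppose, passes: {passes}")
```

Under these assumptions a single additional opposing vote (11-6-2) would drop the support share below two-thirds, which is why the result is described as only just passing.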

Vote: Enabling different kinds of romanization in different locations

FYI, I created Wiktionary:Votes/2016-08/Enabling different kinds of romanization in different locations. Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 08:32, 26 August 2016 (UTC)

Dan Polansky for admin

See Wiktionary:Votes/sy-2016-08/User:Dan Polansky for admin. Thank you. --Daniel Carrero (talk) 12:52, 26 August 2016 (UTC)

What, again? Renard Migrant (talk) 12:59, 26 August 2016 (UTC)
@Renard Migrant: Dan has rejected sysop nominations in the past. Now he has accepted this nomination. --Daniel Carrero (talk) 23:25, 26 August 2016 (UTC)

Request/proposal: Show preferred languages in translation tables as a translation, not in the header

The tool that shows translations of certain selected languages in the header of the translation table is useful, but quite limited. You can only select a small number of languages before the header gets too big. I therefore propose the following change to how it's presented. Instead of showing it in the header, it's shown in a minimal version of the translation table itself. This table would have two columns and display translations the same way they appear in the full table. When the table is expanded, the minimal table is replaced with the full table. This is similar to how some recent inflection tables work, like the Dutch verb tables (see zijn for a working example).

This change requires some good knowledge of JavaScript, which I don't really have. So please volunteer if you're able to make this change. —CodeCat 19:08, 26 August 2016 (UTC)

Am I correct in assuming that:
  1. all the JS that is common slows down entry downloading for everyone and
  2. this would be in common JS?
If so, this and other items that are not likely to be used by most users, including the non-contributing anons, should not be in common. Instead they can be added to the user JS files. DCDuring TALK 23:57, 26 August 2016 (UTC)
What are you talking about? This is a feature we already have in our JS. I'm just asking to make it look nicer. —CodeCat 00:01, 27 August 2016 (UTC)
So it's already contributing to sluggish downloading, non-appearance of icons, etc? DCDuring TALK 00:33, 27 August 2016 (UTC)
That's not relevant to my request. —CodeCat 00:36, 27 August 2016 (UTC)

User:Babel AutoCreate

This user has created several questionable Babel categories: Category:User roa-Tara, Category:User zh-Hans, Category:User zh-Hant. DTLHS (talk) 01:06, 27 August 2016 (UTC)

My understanding is that it is not a user, or even a bot, it's a WMF script that is part of the cross-site Babel extension (the Babel you get if you use '#' in front of the Babel codes, whereas if you don't use '#' you just get our local templates). It merely takes the form of an account, presumably to make its page-creations 'neater'. (I notice that it has, hilariously, received a mass-message warning that it would be renamed.) As I wrote when I unblocked it after someone blocked it, it "mostly adds valid cats, [and] when invalid cats are added, it's a sign actual people have used those Babel codes, and the solution is to educate the people and protect the cat against recreation", as I have now done to the zh-Hans, zh-Hant categories. If we had one of our bots create every possible level of every possible Babel language — Category:User aaa-4 for Ghotuo, etc — then I guess we would no longer need the script to create things on-demand for us, and could block it. - -sche (discuss) 02:58, 27 August 2016 (UTC)
The fact that it's categorizing them in Category:language (when it doesn't know what the correct language is?) is concerning, maybe we should warn the operator about that. DTLHS (talk) 03:00, 27 August 2016 (UTC)

"Template editor" user group

At Wikipedia, they have a user group called "template editors" who can edit protected, high-visibility templates and modules. More information on how they handle it can be found here. This allows for the existence of users whom we don't need to trust as admins (so no blocking or deleting) but can use their technical skills to help the project. They could be confirmed much like rollbackers are, at WT:WL (or alternatively through a vote like sysops). Is there interest in having this user group here? —Μετάknowledgediscuss/deeds 06:16, 27 August 2016 (UTC)

I would be interested if the group allowed editing gadgets and all of the protected high-profile non-template pages (like MediaWiki:Common.css and MediaWiki:Common.js).--Dixtosa (talk) 09:50, 27 August 2016 (UTC)
This sounds like a good idea to me. Andrew Sheedy (talk) 01:28, 28 August 2016 (UTC)
This sounds like a great idea to me. We definitely need more permissions groups. Benwing2 (talk) 02:01, 5 September 2016 (UTC)

Poll: Description section

This concerns Wiktionary:Votes/2016-08/Description. The vote proposes adding a "Description" section with a visual description of symbols. The vote has not started yet.

See this short entry example, with a description of the symbol "🔇".


A speaker with a stroke. Sometimes, shown as a speaker with a prohibition sign (🚫).


# [[mute]]

Question: What should be the name of the description section, in your opinion?

This is a poll with no policy value.

Use "Description"

  1.   Support --Daniel Carrero (talk) 15:34, 27 August 2016 (UTC)
    If someone said to me: "Describe the Venus symbol." (♀), I would probably say its shape.
    If someone said to me: "Describe the Eszett." (ß), I would probably say its shape, too.
    Correct me if I'm wrong: other things such as pronunciations are also "descriptive" in a sense, but visual descriptions are the first thing that comes to mind. --Daniel Carrero (talk) 15:36, 27 August 2016 (UTC)
  2.   Support Korn [kʰũːɘ̃n] (talk) 15:57, 27 August 2016 (UTC)
  3.   Support, per above. Andrew Sheedy (talk) 01:29, 28 August 2016 (UTC)

Use "Shape"

  1.   Support --Daniel Carrero (talk) 15:38, 27 August 2016 (UTC)
    I prefer "Description". Shape is the second best, in my opinion. --Daniel Carrero (talk) 02:43, 28 August 2016 (UTC)
  2. Taking Equinox's good point about "description" (which I otherwise like) into account, this is possibly the best name, or perhaps "appearance" is. - -sche (discuss) 04:40, 28 August 2016 (UTC)

Use "Glyph"

  1.   Oppose because this sounds like a Part of Speech section; compare "Letter", "Symbol". - -sche (discuss) 04:40, 28 August 2016 (UTC)
  2.   Oppose per -sche. --Daniel Carrero (talk) 23:41, 29 August 2016 (UTC)

Use "Appearance"

  1.   Support, but I prefer "Description". --Daniel Carrero (talk) 17:01, 28 August 2016 (UTC)

Use "Visual description"

  1.   Support This will work, I think. Think about the topic 'shape', where someone puts "a triangle" ... very descriptive. lol. Octahedron80 (talk) 16:04, 27 August 2016 (UTC)
    Personally, I don't like "Visual description" because it is too long for my taste.
    Maybe the actual description of that character should be a little longer, like this: "A triangle pointing upwards." So it would not be confused with other triangles in Appendix:Unicode/Geometric Shapes. It is officially called "WHITE UP-POINTING TRIANGLE", but I don't think we need to mention the word "white". It seems to have a jargon-y meaning in Unicode: here, "white" means "symbol contour, not filled with the black color".
    Incidentally, delta (Δ) can also be described as "A triangle pointing upwards." --Daniel Carrero (talk) 16:30, 27 August 2016 (UTC)
    You are correct about Unicode's jargony use of "black" and "white". However, if we need to describe shapes, we can always say "solid", "filled", "outline", etc... Equinox 02:00, 28 August 2016 (UTC)
    Point taken. For △ ("WHITE UP-POINTING TRIANGLE"), we may want to choose between "A triangle pointing upwards." and "The outline of a triangle pointing upwards." --Daniel Carrero (talk) 02:14, 28 August 2016 (UTC)
    Some characters are described in color, usually emoji. For example, 💙 blue heart, 💚 green heart, 💛 yellow heart, 💜 purple heart, 🍎 red apple, 🍏 green apple, etc. Is there other way to explain these? --Octahedron80 (talk) 02:17, 28 August 2016 (UTC)
    Yeah, this is a very strange (and very new!) thing in Unicode. I can only speak for my own Windows/Linux rendering (I think Apple is more "colourful") but I see them as different shades and stripes. Whether Unicode should concern itself with colours at all is very arguable, but those arguments have been had, elsewhere, and one of the current big issues seems to be skin colour. (I can see why a set of purely white emoji would be a problem, but I've mostly seen smileys as yellow things. Oh well. Not going to touch that with a barge-pole.) I see this as a question of "our friend just jumped off a cliff, should we follow?". Equinox 02:27, 28 August 2016 (UTC)
    The hearts are all red on my screen, and both apples are blue. I assume they are supposed to actually match the colours used to describe them? :P Andrew Sheedy (talk) 02:56, 28 August 2016 (UTC)
    Yeah, you're seeing them essentially in normal monochrome, except that on Wiktionary a linked text is blue (if the target page exists) or red (if it doesn't). In some other contexts, especially phone chat, some clients render them in actual colour, like those weird bright yellow pictures of smileys that you get when you type :). Equinox 02:59, 28 August 2016 (UTC)
    From the unicode-faq: "Some of the characters from the core emoji sets have names that include a color term, for example, BLUE HEART or ORANGE BOOK. These color terms in the names do not imply any requirement about how a character must be presented, they are intended only to help identify the corresponding character in the core emoji sets." – Jberkel (talk) 23:30, 29 August 2016 (UTC)
    Re: "Is there other way to explain these?"
    I believe "A blue heart." is enough for the blue heart emoji. Same for others. --Daniel Carrero (talk) 03:02, 28 August 2016 (UTC)
    Kinda off-topic: All the different-colored hearts represent the same concept: a heart. For that reason, I would support redirecting "blue heart", "orange heart", etc. to a single main heart entry and letting the single heart entry explain that Unicode variations exist. We are currently using ♥ ("BLACK HEART SUIT") for all heart-related senses: love (generic); love (English verb); hearts suit; hit points; etc. --Daniel Carrero (talk) 23:40, 29 August 2016 (UTC)
  1.   Oppose. As I said above, "Visual description" is too long for my taste. --Daniel Carrero (talk) 17:01, 28 August 2016 (UTC)
  2.   Oppose. This name is too long, IMO. - -sche (discuss) 04:40, 28 August 2016 (UTC)

Use "Etymology"

  1.   Oppose. I've been placing shape descriptions in the "Etymology" section because it is an allowable section and it's better doing that than leaving descriptions as definitions. But visual descriptions are not etymologies. I'd rather use an actual visual description section instead of the "Etymology" section. --Daniel Carrero (talk) 23:43, 29 August 2016 (UTC)

Use "Usage notes"

  1.   Oppose. These are not actual usage notes. --Daniel Carrero (talk) 00:01, 30 August 2016 (UTC)

Use other name


  1.   Oppose having this section, but if we do have it, I would favour "Shape". The entire entry is a description: a description of the meaning, a description of the sound, etc. Equinox 15:40, 27 August 2016 (UTC)
    @Equinox I added a short visual description in the Etymology section of these entries:
    Do you think the entry should have that information, or do you think we should remove it? I proposed using a "Description" section for the visual description. If we don't use a "Description" section (or "Shape" section, "Appearance" section, etc.), do you think we can use the Etymology section to place the visual description of these symbols? --Daniel Carrero (talk) 15:36, 29 August 2016 (UTC)
  2.   Oppose The only potential purpose I see for this is to inform blind users of what the character looks like, which is not something I think we need to be concerned about. If our purpose is the more obvious one of helping users who do not have the font support to see the character, then a much better solution would be to just include an image of the character (and all its variants, if applicable), which is something we already do sometimes. --WikiTiki89 11:12, 29 August 2016 (UTC)
    @Wikitiki89 I added a short visual description in the Etymology section of these entries:
    Do you think the entry should have that information, or do you think we should remove it? I proposed using a "Description" section for the visual description. If we don't use a "Description" section (or "Shape" section, "Appearance" section, etc.), do you think we can use the Etymology section to place the visual description of these symbols? --Daniel Carrero (talk) 15:36, 29 August 2016 (UTC)
    The Etymology section should not simply be a place to put a visual description of the symbol, but if the description fits as part of the etymology, that would be fine. --WikiTiki89 15:42, 29 August 2016 (UTC)
    @Wikitiki89 In my opinion, we should have these shape descriptions somewhere in the entry. For now, I've been using the Etymology to place shape descriptions. See this diff of the hourglass symbol for an instance where I moved a simple, presumably unattestable "hourglass" definition into the etymology and added an actual sense.
    However, you said: "The Etymology section should not simply be a place to put a visual description of the symbol ..." . In your opinion, should we remove all these shape descriptions from the entries? --Daniel Carrero (talk) 23:55, 29 August 2016 (UTC)
  3.   Oppose I'm also wondering what kind of value Wiktionary can get (or add) by describing emojis in detail. I find the definitions you've added to the entries a lot more helpful than the description itself. The meaning / gloss should be relatively stable, the images can change (maybe you have heard about the recent discussion around 🔫 (pistol)). emojipedia now lists more than 10 different character sets, not counting OS sub-variants. If really needed, why not add a (short!) visual description after the definition, e.g.
    🔗: (Internet) Indicates a hyperlink (usually displayed as chain links).
    Jberkel (talk) 01:14, 30 August 2016 (UTC)
    @Jberkel: I tried to add useful definitions for the symbols. So, concerning the definitions, I feel flattered by your comment and for that I thank you: "I find the definitions you've added to the entries a lot more helpful than the description itself."
    Now let's get down to business:
    About "describing emojis in detail". I'm not actually interested in describing emojis in detail. My idea is specifically having short descriptions. I expect it to be apparent from the descriptions I have already written. Basically I just use the Unicode character name as the "Description" when appropriate.
    • ⌛ = "A hourglass."
    • 🔫 = "A pistol."
    The following is an exception because the Unicode name would be just "link symbol", which is not that descriptive.
    Of course the pistol has a dozen variants! -- Emojipedia page: -- Like I said about "lollipop" somewhere else, I don't want to describe every nook and cranny of the pistol. Just "A pistol." suffices. It conveys the whole idea. Actually, we might want to say in the "Description" section of 🔫: "A pistol pointing left.", if applicable.
    I oppose this:
    🔗: (Internet) Indicates a hyperlink (usually displayed as chain links).
    My reasons are:
    • Every symbol can have a short description and that's why I proposed having a whole section for it. (there may be important exceptions: I would probably oppose visual descriptions for Han characters until further discussion)
    • If a given symbol has, say, 6 definitions, where would we put the visual description? Only in the first sense? In all senses?
    • See this diff of the hourglass symbol for an instance where I moved a simple, presumably unattestable "hourglass" definition into the etymology and added an actual sense. (I used the "Etymology" but I'd prefer using a "Description" section.) In my opinion, definitions should be for actual semantics. There are too many symbol entries where the definition is a visual description instead of an actual, semantic definition. If we disallow visual descriptions in definitions completely, it should become easier to clean up symbol entries by moving visual descriptions up to the "Description" section.
    When you say: "If really needed, why not add a (short!) visual description after the definition," it is a different way of entertaining the possibility of having a visual description somewhere. My idea is to use a "Description" section specifically. --Daniel Carrero (talk) 02:00, 30 August 2016 (UTC)
    Regarding "if really needed": my point was that most symbols probably don't need it since they have a straightforward mapping from graphic depiction to sense, as in 🔫 (pistol) and ⌛ (hourglass). For ⌛, the description could be its primary definition:
    1. (literally) hourglass
    2. (by extension) time
    3. (by extension, GUI) Indicates that the current application is busy performing an operation.
    The "literal" (not sure if that's a good term. "visual"?) sense would be the simplified visual description. If the pistol is pointing left or right is an implementation detail and should be left out. Symbols which denote abstract concepts such as 🔗 benefit most from the visual description, but I suspect that this is a small percentage of all symbols. Also I'm not sure if there even is a shared understanding of Unicode "meanings", since they are highly dependent on context and evolve rapidly (something we could help to document though). – Jberkel (talk) 08:07, 30 August 2016 (UTC)
    I agree with your point: "If the pistol is pointing left or right is an implementation detail and should be left out."
    I have something to say about this: "Symbols which denote abstract concepts such as 🔗 benefit most from the visual description, but I suspect that this is a small percentage of all symbols." I don't know if it's a small percentage (Unicode has more than 128,000 characters, so 1% of characters would be a high number) but if you want, I'm pretty sure I should be able to list 100 characters where the Unicode codepoint name is not a visual description, like "LINK SYMBOL" for 🔗.
    I don't like very much the idea of having this sense for ⌛:
    1. (literally) hourglass
    It may not be attestable. Is there any permanently recorded media including a sentence with ⌛ denoting the actual object "hourglass"? I wonder if someone would write, in a book or Usenet: There is a hole in my ⌛, so the sand is spilling out! Mind you, we are not allowed to create the entry ⌛ with only 1 sense: "A hourglass.", so why would it be acceptable to create ⌛ with multiple senses, the first one being "A hourglass."? If the visual description is so special that we consider allowing a whole unattested sense for it, then in my opinion we may as well have the separate "Description" section for the visual description. --Daniel Carrero (talk) 15:01, 30 August 2016 (UTC)
    Just wanna point out that on my iPhone, the link symbol consists of two linked chains, oriented diagonally. By the way, in the etymology section you can write something along the lines of "The two or three chain links that make up the symbol symbolize the concept of connection." --WikiTiki89 15:09, 30 August 2016 (UTC)
    Thank you! I edited 🔗. --Daniel Carrero (talk) 15:46, 30 August 2016 (UTC)
    I like your counterexample :) ⌛ probably falls into a similar category as 💾, 💿, ☎️ or 📠: the objects are no longer in (widespread) use. You're right, if the literal sense cannot be attested it probably should not be included. A search for ⌛ on Twitter shows that it mostly gets used to represent time or the concept of time running out (countdown). – Jberkel (talk) 16:56, 30 August 2016 (UTC)
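For anyone who wants to check the Unicode character names mentioned in this poll, Python's standard `unicodedata` module exposes them. This is just a small sketch; the helper name `describe` and the choice of sample characters are illustrative, and the names returned depend on the Unicode version bundled with the Python build in use.

```python
import unicodedata

# Code points for symbols discussed in this poll, written as escapes so the
# listing does not depend on font support.
SYMBOLS = [
    "\U0001F507",  # the muted-speaker example entry
    "\u25B3",      # the triangle example
    "\U0001F517",  # the link symbol
    "\U0001F52B",  # the pistol
    "\u231B",      # the hourglass
    "\u2665",      # the heart suit
    "\U0001F499",  # the blue heart
]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for one character, or a placeholder if unnamed."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


if __name__ == "__main__":
    for ch in SYMBOLS:
        print(describe(ch))
```

The output makes the point raised above concrete: some names read as usable visual descriptions (a triangle, an hourglass), while others, such as the one for 🔗, name the function rather than the shape.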



I'm pinging everyone else who participated in the vote talk page or the previous BP discussion: @Andrew Sheedy, Koavf, Jberkel, -sche, Octahedron80. --Daniel Carrero (talk) 15:47, 27 August 2016 (UTC)

Forgive my intrusion, but what do you really mean by "description"? Isn't a definition supposed to describe? Also, what's better about "description" than "usage notes"? A description could mean lots of things. I have this inner feeling that it isn't appropriate, but I'm not voting yet until I better understand what "description" flatly means. Philmonte101 (talk) 23:28, 27 August 2016 (UTC)
@Philmonte101: "Description" is the shape of the symbol. In the example given, "🔇" is a speaker with a stroke. So, "a speaker with a stroke" would be the description. Note that it would be incorrect to define the character as "a speaker with a stroke". The actual meaning of the character is "mute", or "mute symbol". For more, see Wiktionary:Votes/2016-08/Description. --Daniel Carrero (talk) 23:44, 27 August 2016 (UTC)
All of the above have problems, as far as I'm concerned. This isn't really a description of anything, it's an explanation of what the glyph is supposed to represent. I think something along the lines of "Glyph representation" would be better, but that's still not quite right. Chuck Entz (talk) 02:41, 28 August 2016 (UTC)
On second thought, what about just "Glyph"? Chuck Entz (talk) 02:44, 28 August 2016 (UTC)
Maybe I'd support "Glyph". Let me think it over. For now, I added a Support "Glyph" option in the poll, in case you and/or other people want to use it. --Daniel Carrero (talk) 03:08, 28 August 2016 (UTC)

As we are explicitly warned when creating an entry, "mere Unicode code point name does not constitute a definition". What are we aiming for, here? A dictionary explains what a word means. The OED famously added the smiley face a year or two ago (how do they alphabetize it?) but we should not slavishly copy every little Unicode symbol just because we can represent it. CFI applies to them all. If this idea of describing shapes is a sneaky way to legitimize symbols that otherwise have no lexical value — or if it's a way to satisfy some mechanistic desire to create entries, without regard to whether anyone will ever get the slightest use out of them — then admit it. I might be getting old but I fail to see the use. We don't have to have an entry for every single Unicode code-point. Equinox 02:35, 28 August 2016 (UTC)

I'll repeat something I said in the vote talk page: "I used the Etymology section for [a few descriptions]. Do you support using the Etymology section for that information? Would you change something? In the vote, I argued that these are actually descriptions, not etymologies, so at least if we are using the Etymology section to keep a description, then we might as well use a Description section." We might also avoid having entries defined using the Unicode name of the character.
I also said in the vote page:
  • This is not part of the current vote, but rather something that can be discussed eventually: if we have the Description section, we can either: 1) keep striving to have only the attestable symbols, deleting all other symbols; or 2) if people want, we can try having a large Unicode database, with unattestable entries that have the Description section properly filled with a textual description, and a single definition along the lines of: "# Symbol not attested. This entry merely describes the Unicode character."
This is not a "sneaky way to legitimize symbols". If we wanted to have a large Unicode database, with many unattested symbols, then the "Description" section would help by keeping descriptions away from the part of the entry reserved for actual definitions. But I'm leaning towards keeping the status quo and only allowing attestable symbols. My idea is to have the "Description" for attestable entries of symbols. --Daniel Carrero (talk) 04:17, 28 August 2016 (UTC)
At some point in the future, I would like to suggest redirecting some "Roman numerals" entries like this: II. Also I'd redirect fullwidth letters, halfwidth katakana, etc. into the normal characters. I'd also redirect vertical-writing parentheses and parentheses pieces to the entries for the normal parentheses. In the past, I created Wiktionary:Votes/2011-06/Redirecting combining characters to redirect some combining characters. (it passed.) I also recently created a discussion about redirecting the components of some matched pairs like: and ⌈ ⌉. (Not to mention the discussions about completely nuking entries for letters and replacing them by appendices.) That is, sometimes, I like to propose merging some entries of symbols that seem to be redundant to each other, which is the opposite of actively looking for a way to create unattested symbols.
I like to try and hunt for meanings given to random symbols, too, though it's unclear if I can find 3 citations for those. I found one citation for each of those: , and .
Maybe in the future I'll find a reason to propose creating unattested symbols, I don't know. I happen to like the characters for dominoes, playing cards, mahjong and box drawing, and maybe they are really hard to attest, (I didn't check.) so in my heart there's some temptation to try and find a reason to create these symbols. But I acknowledge what you said: if this would be just a "mechanistic desire to create entries, without regard to whether anyone will ever get the slightest use out of them", it'd be foolish to create entries like these. (and who knows, they can be attestable if I search hard enough) --Daniel Carrero (talk) 05:23, 28 August 2016 (UTC)

(the discussion below was moved from the Support "Shape" section)

I prefer "Description" above, and consider "Shape" a good 2nd place. This word defines the concept in a straightforward way. It feels a little less professional than "Description", though. But it could be just me. --Daniel Carrero (talk) 15:38, 27 August 2016 (UTC)
Whose profession? Particularly since you originally wanted to justify this as "possibly useful to blind people" (paraphrase), I'd feel you ought to be more sensitive to people who are physically unable to use visual descriptions. Equinox 01:59, 28 August 2016 (UTC)
I apologize for my statement that you quoted: "possibly useful to blind people". I did not mean to be insensitive towards people who are physically unable to use visual descriptions. My statement was temporarily present in the vote rationale, and then I removed it a few days ago, when you commented about it in the previous BP discussion.
Re: "Whose profession?" Correct me if I'm wrong: In my opinion, "shape" sounds like a more basic English word and "description" sounds like a more normal-ish English word. I admitted it could be just me. I should have explained my opinion better; "It feels a little less professional" was poor wording on my part. --Daniel Carrero (talk) 02:19, 28 August 2016 (UTC)
Okay I don't want to massively derail this section and we can carry it to another page if it matters, but you are saying that "shape" sounds somehow stupid, or kiddy, or not adult enough? and "description" sounds more like what an adult would say? My main criticism is that they have totally different meanings... Equinox 02:23, 28 August 2016 (UTC)
Yes. In my opinion, "shape" sounds kiddy, not adult enough, and "description" sounds more like an adult would say. --Daniel Carrero (talk) 02:41, 28 August 2016 (UTC)
But they have totally different meanings!! To "describe" a word is to give its definition, its sound, its history, all kinds of things. Imagine if the headers were like "Description of its sound", "Description of its shape" (your new section), "Description of its meaning"... don't you see why that header is too broad? Equinox 02:43, 28 August 2016 (UTC)
If I give a "description" of a car to the police, that would involve its number-plate (licence/registration), its colour, its size, everything. It doesn't just mean shape. They don't say "give me a description" and I say "...oh, kind of hemispherical, with four wheels". Equinox 02:44, 28 August 2016 (UTC)
I'm beginning to feel inclined towards supporting the "Glyph" suggested above. Do you agree with me that "Shape" sounds childish? Wouldn't it be bad to use a childish section title?
I stand by my argument about ♀ and ß, that I said in #Support "Description". As I said elsewhere, I don't think that people are going to try adding definitions, etymologies, etc. in the "Description" section, or looking for definitions, etymologies, etc. in the "Description" section. --Daniel Carrero (talk) 03:20, 28 August 2016 (UTC)
To illustrate your point, you gave as examples: "Description of its sound", "Description of its shape", "Description of its meaning". Imagine if the entries actually had these exact same sections, without "description of". The entry would have: "Sound" (instead of Pronunciation), "Shape", and "Meaning" (for definitions). These all look like kiddy English to me, and they would make me feel uncomfortable to some extent, like "Shape" already does. (Again, it could be just me.)
-sche gave a good point against "Glyph", (see the #Support "Glyph" section) and I tend to agree with the point given. Maybe "Appearance" would be a good name. If all else fails, I stand by "Description" or I may agree with you on "Shape" if it's the best name available. --Daniel Carrero (talk) 05:38, 28 August 2016 (UTC)

Cascading protection of the Main Page

In the past, every language code was in its own template, every label was its own template, and languages had separate headword templates. This was inefficient, but also meant that cascading-protecting one page didn't usurp the local protection levels of and lock every single one of our thousands of labels and prevent the addition of new labels, lock each of the various things that feed into headword templates and prevent bug-fixing, etc. As we have centralized content into modules that are now called by most headword-line templates, by any use of a label template, etc, the cascading protection of the Main Page has become a problem — whenever a WOTD or FWOTD on the main page includes a label, for instance, it overrules the lower level of protection that is applied to e.g. Module:labels/data, and prevents longtime trusted users including e.g. fr.Wikt admin User:JackPotte from adding labels (see talk); it also apparently prevented users from editing peripheral things that feed into headword templates, per the most recent comments in this thread; etc.
The cascading protection option, which applies the protection level which the main page has to all pages it transcludes, is excessive. The Word-of-the-Day templates all seem to already be independently protected, so no anon can vandalize them, and the WOTD mainspace entries themselves aren't covered by the cascading protection, so it's of no benefit to them. If FWOTDs (for example) need to be protected, protect them specifically.
Last time this was briefly discussed by a few of us, someone suggested simply changing labels in WOTDs to wikicode as a workaround to avoid having the cascading protection lock the label module, but you can see that no-one has remembered to do that, since the main page currently uses {{lb|en|humorous}}.
In light of the cascading-protection-option's clear, recurring negative effects, and its apparent lack of positive effects that can't be obtained better in other ways with less collateral damage, ... and the general apathy the last discussion exhibited towards the narrow-seeming issue of whether autoconfirmed users should be able to edit things where the local protection level allows them but is usurped by the cascading protection, ... I have taken the initiative and protected the main page against non-admin edits or moves while turning off the cascading option. - -sche (discuss) 18:36, 27 August 2016 (UTC)

Cascading is a good thing. If I understand this correctly, the only problem is labels being used in WOTDs, and @Smuconlaw simply needs to avoid doing that. We already make sure not to put labels in FWOTDs. —Μετάknowledgediscuss/deeds 23:03, 27 August 2016 (UTC)
Er, why is it a good thing? In the past, cascading protection was removed several times without negative effect AFAICT, and simply reinstated by Liliana (who is now globally blocked), when she finally noticed its absence, only out of habit / "because it used to have this [cascading option turned on]". How the cascade usurps local protection settings, not just of the label modules but apparently also headword-related modules like the palindrome one discussed above, is clear. What is the benefit of it? The WOTDs are already protected. If there were vandalism against the current day's FWOTD template (the only one protected by cascading, while all the rest are apparently free to be vandalized), we could protect them specifically, presumably at the same autoconfirmed level as the WOTDs. Or perhaps, instead of the main page itself transcluding {{#ifexist:Wiktionary:Foreign Word of the Day/{{CURRENTYEAR}}/..., there could be a template with cascading autoconfirmed (not admin-only) protection applied to it which transcluded that code, and then that template could be transcluded by the main page. - -sche (discuss) 00:37, 28 August 2016 (UTC)
@-sche: Fair point. If you implement what you described in your last sentence, I'll be happy. —Μετάknowledgediscuss/deeds 01:49, 28 August 2016 (UTC)
Just wanted to say that there is nothing on "Wiktionary:Word of the day/Nominations" which says to avoid using {{l}} (or other templates) in WOTDs. If this was an important point, it should have been documented. — SMUconlaw (talk) 06:44, 28 August 2016 (UTC)
@Smuconlaw: To clarify, this wasn't an important point when WOTD was set up and the documentation drafted; it only became important recently when we centralized labels etc. into modules, and it only became noticed more recently than that. Also, I don't think {{l}} is a problem(?), it's {{lb}}s, the use of which cause the main page to overrule the protection settings of the label modules. - -sche (discuss) 01:20, 16 September 2016 (UTC)
OK, thanks. Anyway, I think the issue has been resolved. — SMUconlaw (talk) 12:22, 16 September 2016 (UTC)
Thanks for doing that! Feels weird to have to ask around in order to make changes. Regarding the WOTD, if we have to change the markup or can't use certain templates then something is clearly wrong. Hope we can find a better solution. The radical but safe approach would be to use a static copy during the period the word is featured. – Jberkel (talk) 08:52, 30 August 2016 (UTC)
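
The effect described in this thread can be illustrated with a toy model. This is only a sketch in Python, not MediaWiki's actual implementation: the level names, the function names and the "WOTD template" page name are assumptions made for illustration. It treats a page's effective protection as the stricter of its own local setting and any level cascading down from a page that transcludes it.

```python
# Toy model only: names and data are illustrative, not MediaWiki internals.
LEVELS = {"none": 0, "autoconfirmed": 1, "sysop": 2}


def transcluded_pages(source, transclusions):
    """Collect every page that `source` transcludes, directly or indirectly."""
    seen = set()
    stack = [source]
    while stack:
        current = stack.pop()
        for child in transclusions.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def effective_level(page, local, transclusions, cascading):
    """Return the strictest protection level that applies to `page`.

    local:         page -> locally configured level
    transclusions: page -> set of pages it transcludes directly
    cascading:     page -> level that cascades to everything it transcludes
    """
    level = local.get(page, "none")
    for source, cascade_level in cascading.items():
        covered = transcluded_pages(source, transclusions)
        if page in covered and LEVELS[cascade_level] > LEVELS[level]:
            level = cascade_level
    return level


transclusions = {
    "Wiktionary:Main Page": {"WOTD template"},  # hypothetical page name
    "WOTD template": {"Template:lb"},
    "Template:lb": {"Module:labels/data"},
}
local = {"Module:labels/data": "autoconfirmed"}

# Cascading on: the module is pulled up to the Main Page's level.
print(effective_level("Module:labels/data", local, transclusions,
                      {"Wiktionary:Main Page": "sysop"}))
# Cascading off: the module keeps its own local setting.
print(effective_level("Module:labels/data", local, transclusions, {}))
```

In this model a single label template on the Main Page is enough to lock the label data module for everyone below the cascading level, which is the collateral effect the thread complains about.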


Discussion moved to WT:RFDO.

Renaming Scanian to Old Scanian

Discussion moved from Wiktionary talk:List of languages.

"Scanian" usually refers to the Swedish dialect spoken today in Scania, while the entries in Category:Scanian language belong to an archaic language similar to Old Danish and Old Swedish. After a quick googling, "Old Scanian" seems to already be in use to distinguish the archaic language from the contemporary Swedish dialect. Smiddle (talk) 08:16, 28 August 2016 (UTC)

If that is the case, I don't think renaming would help. If "Scanian" (Swedish) and the Scanian language are two different things, then renaming Scanian language to Old Scanian simply puts the confusion one step back in the chain and makes it sound like Old Scanian is the ancestor of Scanian Swedish, which should be Old Swedish. Korn [kʰũːɘ̃n] (talk) 09:41, 28 August 2016 (UTC)

Diacritic stripping in Breton

@Embryomystic and anyone else who knows Breton: we currently strip the following diacritics in Breton:

from = {"[âàä]", "[êèë]", "[îìï]", "[ôòö]", "[ûùü]", CIRC, GRAVE, DIAER},
to   = {"a",     "e",     "i",     "o",     "u"}},

But according to w:Breton_language#Alphabet, at least the following are actually used: â, ê, î, ô, û, ù, ü, ñ. Shouldn't we therefore be keeping all 8 of those, instead of only ñ? Breton Wikipedia uses at least ê and ù in its articles. (Here's a sentence from br:Backgammon: Dre m'eo berr amzer c'hoari pep den, hag a-benn digreskiñ lodenn ar chañs, e vez c'hoariet a-grogadoù alies, ha trec'h e vez disklêriet an den en deus dastumet ar muiañ a boentoù dreist 3 e-lec'h gortoz ma vije bet lamet an holl jedoueroù diouzh un tu.) So it looks like these diacritics are used in normal writing, not just in pedagogical texts and reference works. —Aɴɢʀ (talk) 14:15, 29 August 2016 (UTC)

In fact, I just noticed that we do use "ù" at least in article titles, e.g. anvioù, which is the plural of anv. But "anvioù" in the headline of anv points to "anviou" instead, so the link is broken. —Aɴɢʀ (talk) 14:19, 29 August 2016 (UTC)
@JohnC5 is the one who made the change to the module, about a month ago, so I'm pinging him too. —Aɴɢʀ (talk) 14:21, 29 August 2016 (UTC)
I'm honestly a little baffled why I made this change. Feel free to change it according to what is correct, and sorry for the inconvenience. —JohnC5 14:42, 29 August 2016 (UTC)
OK, I've unstripped the diacritics. —Aɴɢʀ (talk) 14:44, 29 August 2016 (UTC)
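
For readers who want to see why the stripping broke the link from anv to anvioù, here is a minimal sketch in Python. It is not the module's Lua code: the function name and the `strip_marks` parameter are assumptions, and it uses Unicode decomposition rather than the character classes quoted above, although the set of marks removed (circumflex, grave, diaeresis) is the same.

```python
import unicodedata

# Combining circumflex, grave and diaeresis: the marks the quoted table removed.
STRIPPED_BEFORE = {"\u0302", "\u0300", "\u0308"}


def make_link_target(text, strip_marks):
    """Return the page title a link would point to after stripping marks.

    Hypothetical helper: decomposes the text, drops the listed combining
    marks, then recomposes, so marks not listed (such as the tilde in ñ)
    survive unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in strip_marks)
    return unicodedata.normalize("NFC", kept)


# With the old settings the plural points at a page that does not exist.
print(make_link_target("anvioù", STRIPPED_BEFORE))
# With nothing stripped the link target matches the real entry title.
print(make_link_target("anvioù", set()))
```

With the old set, "anvioù" becomes "anviou" while "chañs" is left alone, which matches the behaviour reported in the thread.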

Proposal: Creating entries for Morse code characters

@Octahedron80 asked me here if they could create entries for Morse code patterns. I support the idea. This sounds like something natural to do, like we have for Braille A, for Braille B, etc. I'm opening this discussion to see if it's okay with other people.

Idea for characters: Use - ("HYPHEN-MINUS") for dash and · ("MIDDLE DOT") for dot. (there are many variants of dashes and dots in Unicode; these two feel the most "generic" to me, in a sense)

Please don't create Morse code entries until the discussion is over. Thank you! --Daniel Carrero (talk) 04:55, 30 August 2016 (UTC)

I have also heard that there is a Japanese version of Morse code, as well as codes for punctuation. --Octahedron80 (talk) 04:58, 30 August 2016 (UTC)
Oppose. Unlike Braille, morse code isn't even meant to be written. There are all sorts of encodings for letters, but if they aren't used to actually write the language, there is no point in including them. In my mind, this would be as silly as including entries for written-out Unicode codepoints (U+0020 = space, U+03B1 = α, etc.). --WikiTiki89 11:41, 30 August 2016 (UTC)
Well, we do have this, which certainly isn't meant to be written. I can see the morse proposal fit into our project. And should the debate come to the point that it is decided that we exclude all those representations of language which do not constitute a written form, sign language has to go too (and maybe all non-word symbols in general). I'm fine with either. Korn [kʰũːɘ̃n] (talk) 11:51, 30 August 2016 (UTC)
But sign language isn't merely an encoding of existing letters, it's a whole language in and of itself and for that reason it's worth documenting. The entry titles for sign language are not ideal, but there is no better way around that. For morse code, on the other hand, we already have entries for A, B, and C, so why do we need strangely encoded versions of them at ·-, -···, and -·-·? In any case, if we do include these, I think we should use periods for the dots, because that's the way it's traditionally done in typewriter and computer settings (.-, -..., and -.-.). More proper typesetting would probably use something like bullets (•) and n-dashes (–) anyway (or perhaps something better that I haven't found in Unicode), so I don't like the combination of mid-dots and hyphens. --WikiTiki89 11:54, 30 August 2016 (UTC)
Morse code isn't strictly just an encoding- it's also a sort of script. Just as Cyrillic and Latin Serbo-Croatian represent the same sounds with different scripts that aren't mutually intelligible, so does Morse code represent the same sounds as the Latin script does. It also has a very limited number of non-alphabetic prosigns and a very rudimentary set of grammar-like conventions. It's not a language like the various sign languages are, but it's not just an encoding like ASCII or ANSI or Unicode (or EBCDIC- does anyone remember EBCDIC?). Whether we decide to have entries for the letters or not I think mostly we should treat it like we treat IPA: include it where relevant as unlinked text, so people trying to learn it can see how Latin characters are represented in Morse code, but avoid creating entries for words. After all, no one is going to go to a dictionary to look up Morse code that they've heard. Chuck Entz (talk) 04:05, 31 August 2016 (UTC)
  • Can you provide examples of the use of Morse code in print outside of manuals and the like explaining how to use Morse code? bd2412 T 12:29, 30 August 2016 (UTC)
Do these count? [12] [13] --Octahedron80 (talk) 12:49, 30 August 2016 (UTC)
Support including Morse code. Though what would count towards attesting it? I imagine it's possible, as I've encountered Morse code in books before, but I imagine Morse code manuals wouldn't count. Andrew Sheedy (talk) 13:11, 30 August 2016 (UTC)
Support. In fact, I seem to remember that I added some a very long time ago and they got deleted. SemperBlotto (talk) 13:27, 30 August 2016 (UTC)
p.s. Any chance of semaphore? SemperBlotto (talk) 13:27, 30 August 2016 (UTC)
Support, go for it. --. ----- . --.. .-.. (my callsign) DonnanZ (talk) 14:52, 30 August 2016 (UTC)
Question: What characters should we use?
I suggested: - ("HYPHEN-MINUS") for dashes and · ("MIDDLE DOT") for dots.
Wikitiki89 suggested using . ("FULL STOP") for dots.
If all books that we find use the full stop, then I believe we should use the full stop in our entries.
If some books use the middle dot, and others use the full stop, I propose creating entries for both varieties, one of those being as an "alternative form".
-.-- = Morse code for Y.
-·-- = Alternative form of -.--.
More broadly, I think we should create whatever variations of Morse code dashes and dots are attestable, including "something like bullets (•) and n-dashes (–)" as Wikitiki89 mentioned. Probably the main entries would use hyphen-minus and full stop, because they are the easiest to type and probably can be found in more books. (I didn't check)
--Daniel Carrero (talk) 15:12, 30 August 2016 (UTC)
You have to find a way of displaying the dots and dashes in a straight line. DonnanZ (talk) 15:22, 30 August 2016 (UTC)
Do I? I initially chose using "MIDDLE DOT" and "HYPHEN-MINUS" to display the dots and lines in a straight line, but if published works use a "FULL STOP" like you used above, then it's not really a straight line. --Daniel Carrero (talk) 15:42, 30 August 2016 (UTC)
Note that on my computer in Arial Unicode MS font, the middle dot and hyphen-minus are not lined up (but in the Georgia font used in headings, they do line up). --WikiTiki89 16:04, 30 August 2016 (UTC)
Does this look better: –•–•–•–•–•–•–•? It is a combination of "EN DASH" and "BULLET".
If we want to use hyphen-minus + middle dot (which looks good to me), we could use a new script code, like sc=Morse, to apply Georgia and/or other fonts in Morse code headwords and links. --Daniel Carrero (talk) 16:12, 30 August 2016 (UTC)
Whether they line up is always going to depend on the font. To respond to Daniel Carrero, I think most printed material dedicated to morse code uses its own custom dots and dashes that do not necessarily correspond to any Unicode character. Morse code in more casual usage, such as within a fiction novel, is likely to use full stops and hyphens (and slashes between letters within a word, and spaces between words [or maybe I got that backwards?]). --WikiTiki89 15:32, 30 August 2016 (UTC)
Re: "slashes between letters within a word". Sounds like we can add, in the entry /, the sense: "In Morse code, used between different letters of the same word." Incidentally, the entry . ("FULL STOP") has the sense "In Morse code, the shorter of two marks (the dot)." since April 2016. --Daniel Carrero (talk) 15:39, 30 August 2016 (UTC)
It's better to be professional rather than casual. You need a font which displays in a straight line, also leaving a space between the dashes; ·-, -···, -·-·, --·· are nearly there but not quite good enough. DonnanZ (talk) 17:24, 30 August 2016 (UTC)
Then why are we using straight quotes and apostrophes (" and ') rather than curly ones (“ ” and ’)? --WikiTiki89 17:39, 30 August 2016 (UTC)
  • Oppose: Morse Code is probably best left as a Wiktionary Appendix rather than having entries in mainspace. If Morse Code is treated as a valid alternative script for the English language (which according to user WikiTiki89 above, it isn't), then every entry on Wiktionary would be eligible to have a Morse Code transliteration included. However, I would check first with the governing body for Morse Code (the International Telecommunication Union) on the validity of its written use. They should have something on their website, I know they have publicly published standards for other forms of communication. Nicole Sharp (talk) 17:33, 30 August 2016 (UTC)
    • From what I understand, Morse Code is ISO 15924 ZXXX, or an unwritten (e.g. in auditory tones or visual flashes) form of communication. Other languages which are communicated (have been communicated) as ISO 15924 ZXXX include protolanguages (such as Proto-Indo-European), the words for which have no historically-documented written forms and are also not allowed in mainspace, but are instead located in appendices. A Wiktionary Appendix of commonly-used words and expressions in Morse Code would definitely be helpful, but I think it would be difficult to include it in the Wiktionary mainspace. Nicole Sharp (talk) 17:46, 30 August 2016 (UTC)
      • Here is the official international standard for Morse Code, as published by its governing body, the International Telecommunication Union. There is no mention of Morse Code being used as a written or printed form of communication, only as an unwritten signal. They also use periods instead of middots and en dashes instead of hyphens for the code. For consistency with the official documentation, I would suggest following the typography used by the ITU for any Morse Code on Wiktionary. Nicole Sharp (talk) 17:57, 30 August 2016 (UTC)
        Actually, those are not en-dashes, but minus signs (U+2212). And also note that they put spaces between them. --WikiTiki89 18:16, 30 August 2016 (UTC)
        Actually I just noticed that they are inconsistent and sometimes use en-dashes (U+2013) and sometimes minus signs (U+2212). --WikiTiki89 18:19, 30 August 2016 (UTC)
    • I missed the comment above on allowing sign-language entries in Wiktionary. After reading the Wikipedia article as well for Morse Code (including "Morse Code as an Assistive Technology"), I think I am more ambivalent on the issue now and think that Morse Code could work out in mainspace, as long as a way of expressing it can be agreed upon. Having Morse Code transliterations available in Wiktionary could be very useful to not have to look up one letter at a time. I think in the long run it is better to not restrict Wiktionary to just written forms of communication, and that we should be open to as many forms of communication as possible (including communication via colors and symbols, e.g. the color red for "stop"), despite the technical and logistic challenges of presenting and organizing such unwritten forms of communication. Nicole Sharp (talk) 18:45, 30 August 2016 (UTC)
      • Any text can easily be translated (technically, "transliterated" or "encoded" would be more accurate) to Morse code with various automatic online translators (such as this one). In fact we can even easily create our own such tool and put it on an appendix page (Appendix:Morse code). The difference with sign language is that it is an actual language and not simply a way to encode English letters. --WikiTiki89 18:55, 30 August 2016 (UTC)
        • Given Morse Code is (or has been) used for actual communication, it deserves at least an appendix. Full entries don't seem like a problem to me. Renard Migrant (talk) 19:04, 30 August 2016 (UTC)
        • It's more like Braille really; a different representation for the same letters. We should treat it the same way. —CodeCat 19:04, 30 August 2016 (UTC)
        • The biggest difference though between Wiktionary and the majority of automated online translation/transliteration software is the license. Wiktionary is free and open-source (copylefted) whereas most of those are under commercial for-profit licenses (e.g. Google Translate). The license of the information source doesn't matter to some people, but to other people it does (e.g. Ubuntu Linux users versus Linux Mint users). I think creating a free open-source repository of Morse Code transliterations on Wiktionary isn't a bad idea, whether it is in mainspace or in an Appendix. The argument that the content is already available elsewhere doesn't really hold water (since most everything on Wiktionary is already on for-profit sites). If it is decided to include it in mainspace, then the simplicity of Morse works to our advantage, since a bot can probably be programmed to automatically add a Morse Code transliteration to each English-language entry on Wiktionary without human labor needed. I think a simpler start would be to program a Morse Code transliteration of the 2000 words needed for Basic English ("Appendix:Basic English word list," [14]) to be placed in a Wiktionary Appendix, which should be adequate. An Appendix of the 2000 words of Basic English in Morse Code also has the major advantage that it can be printed out or saved as a single document for offline reference. Nicole Sharp (talk) 19:31, 30 August 2016 (UTC)
          • Wiktionary is not copylefted. Copyleft, in my understanding, refers to parasitic licenses like the GPL that infect anything they touch. But that's beside the point. If we provide our own transliteration tool, it will be available under our own CC license. I don't see what problem we would be solving by creating thousands of duplicate entries in Morse code. --WikiTiki89 19:44, 30 August 2016 (UTC)
            • Wiktionary is copylefted. The Creative Commons ShareAlike licence is copyleft. All derivative works of Wiktionary must be licenced under the same terms. —CodeCat 19:50, 30 August 2016 (UTC)
              • Now that I think about it, you're sort of right. But Wiktionary isn't really meant to have "derivative works", but rather to have its content freely available to everyone, so the parasitism of copyleft doesn't really apply. --WikiTiki89 20:05, 30 August 2016 (UTC)
                • There are so many different licenses that it gives me a headache. I personally use the term "copyleft" to generically refer to any work that can be redistributed and modified without the permission of the author. Wikimedia falls under that. From what I understand, anyone can download a copy of Wiktionary and modify it however they like for (nonprofit) republication, as long as they keep the edit histories attached to each wikipage. Nicole Sharp (talk) 20:22, 30 August 2016 (UTC)
                  • You said: "anyone can download a copy of Wiktionary and modify it however they like for (nonprofit) republication".
                  • But Wiktionary is licensed under the Creative Commons Attribution-ShareAlike license, which explicitly states: "for any purpose, even commercially." If I want to publish and sell a copy of Wiktionary, I can do it, provided I give credit to Wiktionary. I can make changes and derivative works if I want, provided I state what the changes are. --Daniel Carrero (talk) 20:37, 30 August 2016 (UTC)
                • I think that this section though should be split into two subsections: proposals to include or bar Morse Code being within the Wiktionary mainspace versus proposals to add Morse Code as a Wiktionary Appendix outside of mainspace. The latter is less controversial than the former, and will hopefully be quicker to reach a consensus, while debate on the former can continue. Nicole Sharp (talk) 20:22, 30 August 2016 (UTC)
                  • Actually, I don't think anyone would object to an Appendix page, so it doesn't need to be discussed unless someone who opposes it brings it up. --WikiTiki89 20:32, 30 August 2016 (UTC)
                  • Speaking of licenses though, is International Morse Code public domain, or is it patented/copyrighted by the ITU? If it is not public domain, we may not even be able to use it on Wiktionary (versus Wikipedia being able to use it within an article under fair use). Nicole Sharp (talk) 20:42, 30 August 2016 (UTC)
                    • I believe the patent of the Morse code expired, because it's been around for a long time. --Daniel Carrero (talk) 21:15, 30 August 2016 (UTC)

Are we going to have only entries for Morse code letters and numbers, or will we allow words such as: -·-· ·- - = cat? At least, if there are attestable Morse code abbreviations we should include those, too. We have a Braille entry ⠁⠇⠍ ("alm") meaning "almost". --Daniel Carrero (talk) 19:29, 30 August 2016 (UTC)

@Daniel Carrero: I would be against creating a trivial example like encoding every conceivable word into Morse code (although providing a redirect would be fine or having a section called Encodings which has Braille, fingerspelling, Morse Code, and semaphore would actually be fine with me). I would be in favor of creating standardized contractions. I created ---.. ---... What does everyone think? —Justin (koavf)TCM 14:16, 1 September 2016 (UTC)
I like the entry ---.. ---... Actually, some users felt that abbreviations like these should be written in Latin script. We already have 88 for "hugs and kisses". I'm fine with keeping both 88 and ---.. ---.., with one of those being the "alternative form". Would anyone prefer to delete the Morse code entry altogether? (Technically, "88" is not "Latin script", but this is not important.)
I disagree with a few implementation details, but IMO we can create more entries like these now if we want, and the layout can be edited later. I'll assume that "88" is really English and not Translingual.
I don't like using "Contraction" as the POS for Morse code abbreviations (I also don't like using "Contraction" for Braille entries): "hugs and kisses" is an abbreviation, and the actual POS is a noun. In the entry 88, the POS is already "Noun". I edited 88 to add {{en-plural noun}}.
Also, we need to edit the headword of ---.. ---.. so it will properly display the proper Morse code images that Wikitiki89 created. I would create Morse code templates for English: {{en-morse noun}}, {{en-morse plural noun}}, {{en-morse adjective}}, {{en-morse interjection}}. --Daniel Carrero (talk) 15:34, 1 September 2016 (UTC)
Note that Morse code abbreviations are translingual, not English. In fact they were used in communication between people who did not understand each other's languages (see w:Morse code abbreviations#Informal Language Independent Conversations). --WikiTiki89 16:08, 1 September 2016 (UTC)
  Support Encoding section in entries. Korn [kʰũːɘ̃n] (talk) 15:59, 2 September 2016 (UTC)
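
As a rough illustration of the "own tool on an appendix page" idea raised above, and of how the competing dot and dash characters relate, here is a minimal sketch in Python. The function names and the choice of full stop plus hyphen-minus as the canonical form are assumptions for this example only; nothing here reflects a decided Wiktionary convention. The table covers only A-Z and 0-9.

```python
# International Morse code for Latin letters and digits,
# written with FULL STOP for dots and HYPHEN-MINUS for dashes.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
    "F": "..-.", "G": "--.", "H": "....", "I": "..", "J": ".---",
    "K": "-.-", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-",
    "U": "..-", "V": "...-", "W": ".--", "X": "-..-", "Y": "-.--",
    "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}

# Dot and dash variants mentioned in the discussion, mapped to the
# canonical characters: MIDDLE DOT, BULLET, EN DASH, MINUS SIGN.
VARIANTS = str.maketrans({
    "\u00b7": ".", "\u2022": ".",
    "\u2013": "-", "\u2212": "-",
})


def encode(text, dot=".", dash="-"):
    """Transliterate letters and digits, one space between characters.

    Raises KeyError for anything outside A-Z and 0-9 (no punctuation,
    no word spacing) to keep the sketch small.
    """
    codes = " ".join(MORSE[ch] for ch in text.upper())
    return codes.translate(str.maketrans({".": dot, "-": dash}))


def normalize(morse):
    """Map variant dots and dashes onto full stop and hyphen-minus."""
    return morse.translate(VARIANTS)


print(encode("cat", dot="\u00b7"))   # middle-dot style
print(encode("88"))                  # full-stop style
print(normalize("-\u00b7--"))        # middle-dot Y mapped to the canonical form
```

A normalizing step like this is one way the "alternative form" entries could be related to a main entry, whichever characters end up being chosen.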

Polls on the use of Morse Code

Poll: Allowing Morse code characters

Proposal: Allowing entries for Morse code letters (A-Z), digits (0-9), punctuation marks and possibly other symbols and characters, such as commercial at (@ = ·--·-·), addition sign (+ = ·-·-·), etc.

This is a poll with no policy value.

  1.   Support. This sounds like something natural to do, like we have for Braille A, for Braille B, etc. --Daniel Carrero (talk) 21:32, 30 August 2016 (UTC)
  2. Weak   Support, per Daniel. Putting all the letters in an appendix is also a possibility, although if any string which deserves an entry for another reason happens to also be a Morse code letter, I would certainly mention the Morse code letter somewhere in the entry. - -sche (discuss) 21:59, 30 August 2016 (UTC)
  3.   Support CodeCat 22:03, 30 August 2016 (UTC)
  4.   Support Also, some Japanese characters need two codes each. For example, for パ we need to send the codes for ハ and ゜ consecutively. --Octahedron80 (talk) 00:21, 31 August 2016 (UTC)
  5.   Support for the basic character set. bd2412 T 15:43, 31 August 2016 (UTC)
  6.   Support. Andrew Sheedy (talk) 17:07, 31 August 2016 (UTC)
  7.   Support. Leasnam (talk) 17:25, 31 August 2016 (UTC)
  8.   Support Eru·tuon 03:33, 1 September 2016 (UTC)
  9.   Support Aɴɢʀ (talk) 14:14, 2 September 2016 (UTC)
  1.   Oppose I think we should take the same approach to Morse Code as we do with unwritten protolanguages. We include the Morse Code transliteration in a subsection under the main entry, but with the individual Morse characters wikilinked to an Appendix, not as mainspace entries. See third poll response below. Nicole Sharp (talk) 07:04, 31 August 2016 (UTC)
  2.   Oppose I see no reason that an Appendix page would be insufficient. --WikiTiki89 12:37, 31 August 2016 (UTC)
  3.   Oppose Cruft, useless except to occupy contributors who might do even worse. DCDuring TALK 14:25, 1 September 2016 (UTC)
    See w:Wikipedia:Assume good faith. (I know it does not qualify as a Wiktionary policy, but it looks better than our Wiktionary:Assume good faith, I think.) --Daniel Carrero (talk) 14:37, 1 September 2016 (UTC)
    I've never felt that most of the silliness was in bad faith, just that it was silly with bad consequences. DCDuring TALK 01:06, 27 September 2016 (UTC)
  1. Abstain - I cannot see how the bitwise representation of ascii, unicode is substantively different from morse code. Or, for that matter, the multiple tonal representations of bit state in various baud standards. (obligatory references to pae, creep, nvc) - Amgine/ t·e 01:41, 27 September 2016 (UTC)

Poll: Allowing Morse code abbreviations

Proposal: Allowing entries for Morse code-specific abbreviations, written in Morse code. Example: ···· ·-- ("hw") = how. See w:Morse code abbreviations for a list.

This is a poll with no policy value.

  1.   Support. Absolutely. We also have entries for Braille-specific abbreviations, such as ⠁⠇⠍ ("alm") meaning "almost". --Daniel Carrero (talk) 21:32, 30 August 2016 (UTC)
  2.   Support But I think they should be presented in Latin script, not Morse notation. Same for Braille abbreviations. —CodeCat 22:04, 30 August 2016 (UTC)
  3.   Support "All words in all languages." Philmonte101 (talk) 22:26, 30 August 2016 (UTC)
  4.   Support --Octahedron80 (talk) 00:21, 31 August 2016 (UTC)
  5.   Support, but only to the extent that these can be cited/attested. The obvious example would be ··· --- ···. bd2412 T 15:45, 31 August 2016 (UTC)
  6.   Support, provided they meet CFI. Andrew Sheedy (talk) 17:08, 31 August 2016 (UTC)
  7.   Support Leasnam (talk) 17:25, 31 August 2016 (UTC)
  8.   Support Aɴɢʀ (talk) 14:12, 2 September 2016 (UTC)
  9.   Support Purplebackpack89 23:55, 26 September 2016 (UTC)
  1.   Oppose Same as above. See below. Abbreviations should be listed in mainspace in Roman script (e.g. HW), with the Morse Code transliteration for the abbreviation listed under the "Morse Code" subsection. Nicole Sharp (talk) 07:04, 31 August 2016 (UTC)
  2.   Oppose Like Nicole Sharp, I think abbreviations should be entered in roman script (at HW, for example). --WikiTiki89 12:37, 31 August 2016 (UTC)
  3.   Oppose Absolutely fucking ridiculous. Can you find any more productive way to waste your time? DTLHS (talk) 04:12, 2 September 2016 (UTC)
    @DTLHS: After you created an RFD for 2 Morse code entries, I pointed you to this poll. Were you insulting me specifically, or the person who gave the idea, the people who supported the proposal or the people who otherwise helped to edit the templates/entries? Anyway, I'd appreciate if we could abide by the (non-Wiktionary) policies w:WP:CIVIL and w:WP:AGF. --Daniel Carrero (talk) 04:20, 2 September 2016 (UTC)
    It applies equally to whoever came up with the idea and the people who supported it. DTLHS (talk) 04:26, 2 September 2016 (UTC)
  4.   Oppose DCDuring TALK 01:07, 27 September 2016 (UTC)
  1. Abstain - I cannot see how the bitwise representation of ascii, unicode is substantively different from morse code. Or, for that matter, the multiple tonal representations of bit state in various baud standards. (obligatory references to pae, creep, nvc) - Amgine/ t·e 01:41, 27 September 2016 (UTC)

Poll: Allowing Morse code terms

Proposal: Allowing entries for normal English terms written in Morse code, such as Morse for "cat" and "dog".

This is a poll with no policy value.

  1.   Support: "all words in all languages". Regardless of how they're written. If it's attested, I say go for it. Philmonte101 (talk) 22:23, 30 August 2016 (UTC)
  2. I do not think there is any need to place Morse Code as mainspace entries. However, I think it would be very helpful if a Morse Code transliteration could be added to each English-language entry, perhaps next to the IPA spelling or under its own heading. If we have room in headings for trivia such as having a heading for the anagrams of each word then we can certainly make room for adding a handful of dots and dashes in each entry. Morse Code is simple enough that a bot can be programmed to automatically add a new section to each English-language entry with the Morse Code transliteration of the word. If someone wanted to reverse-search Morse Code, they can still type it into search and it would come up as under the entry content, without needing duplicate mainspace entries in Morse. Nicole Sharp (talk) 06:33, 31 August 2016 (UTC)
  1.   Oppose, just as we don't have (full) Braille words, or AFAIK full (multi-character) Deseret words, etc. - -sche (discuss) 22:01, 30 August 2016 (UTC)
  2.   Oppose. Morse is pretty much never written, it was never intended to be written. It maps one-to-one to the Latin alphabet, people don't think of Morse code as a separate alphabet, just an encoding of Latin letters. This makes it very different from sign languages, which are very clearly not English or Latin script (although they do have a mechanism to "spell" words, but then, we have methods to spell out signs in writing too). —CodeCat 22:15, 30 August 2016 (UTC)
  3.   Oppose with extreme prejudice. Never in a million years. DTLHS (talk) 23:59, 30 August 2016 (UTC)
  4.   Oppose per others. --Daniel Carrero (talk) 00:02, 31 August 2016 (UTC)
  5.   Oppose Nope for vocabulary. Just characters and abbreviations. --Octahedron80 (talk) 00:22, 31 August 2016 (UTC)
  6.   Oppose per DTLHS. DCDuring TALK 02:34, 31 August 2016 (UTC)
  7.   Oppose. No one is going to go to a dictionary to look up a string of dahs and dits that they've heard somewhere. The only way they're going to find entries is from other entries, and there's nothing of any value in an entry that wouldn't already be in the text of the link. Chuck Entz (talk) 04:05, 31 August 2016 (UTC)
  8.   Oppose strongly. --WikiTiki89 12:38, 31 August 2016 (UTC)
  9.   Oppose - Morse code is not a language, it is just a means of transmitting existing languages. bd2412 T 15:41, 31 August 2016 (UTC)
  10.   Oppose. Andrew Sheedy (talk) 17:09, 31 August 2016 (UTC)
  11.   Oppose Leasnam (talk) 17:26, 31 August 2016 (UTC)
  12.   Oppose Aɴɢʀ (talk) 14:13, 2 September 2016 (UTC)
  Abstain until further discussion. Should we duplicate all our entries in Morse code, like we do for romanizations in some languages? I'm curious if we would allow all English terms in Morse code, (like we do for romanization entries) or just English terms that are attestable in Morse code text. --Daniel Carrero (talk) 21:32, 30 August 2016 (UTC)
Have you ever, ever, ever, EVER thought about doing stuff based on what human beings want and need?
Why that attitude? You even forgot your signature, Equinox, are you drunk again? --Daniel Carrero (talk) 00:06, 31 August 2016 (UTC)
I deliberately dropped the sig. Treat the question honestly, as it deserves. Nobody needs Morse code. It's just Unicode wank from someone who doesn't spend any time on creating useful content. If we could create an entry for every little pixel on the screen, you'd create 640x480 entries before we could do anything about it. Equinox 00:10, 31 August 2016 (UTC)
"someone who doesn't spend any time on creating useful content" is too harsh, and unfair. What about my new Portuguese and English entries, my time rewriting definitions, adding quotations, creating templates and modules and proposing new policies and edits to WT:EL, and helping people with questions and favors asked in my talk page?
I agree with @Nicole Sharp above: "I think in the long run it is better to not restrict Wiktionary to just written forms of communication, and that we should be open to as many forms of communication as possible (including communication via colors and symbols, e.g. the color red for "stop"), despite the technical and logistic challenges of presenting and organizing such unwritten forms of communication." You call it "Unicode wank". So what? Sometimes you vote oppose when I create a new proposal, but if enough people vote support, they pass, and it's not up to me to decide, or you. --Daniel Carrero (talk) 00:28, 31 August 2016 (UTC)
Indeed, I should start a vote, and a huge number of votes, so that the people who care are so overwhelmed they can't actually tell what is going on, to use Wiktionary to create entries for my personal family tree. An entry for my uncle. An entry for my aunt. Dude, I am sure you have done some good work but about 70% of your ideas are literally garbage. kisses. Equinox 00:31, 31 August 2016 (UTC)
Do you think I have too many active votes right now? Of the 14 active votes, I have 6 votes in the list. (one ended yesterday so make it 5 out of 13) I apologize because I flooded the list of votes at some point around February, and since then I've been trying not to do it again. Anyway, Morse code was not my idea but I support it for single characters and Morse-specific abbreviations like we do in Braille. (I wonder what is your position about Braille chars and abbreviations?) Apparently you are free to think my ideas are garbage and I'm not angry, but it was rude of you to say things like "He just wants to do it to fulfill some kind of nerdy, autistic bullshit." (said by you below) I'm not perfect but I like to create new things and I hope some are useful, even if sometimes I'm not up to your expectations specifically.
All this is because we've been discussing my idea about a "Description" section recently and I maintained my position despite your multiple objections? Please don't think that I'm just stubborn and refusing to admit that the "Description" section is an obvious garbage. I really tried to reply to all your questions in that conversation. (And you didn't answer some of my questions, but that's forgivable, we've been talking a lot.) --Daniel Carrero (talk) 00:59, 31 August 2016 (UTC)
Just remember that every new feature you want to add to Wiktionary will require time and effort from the community to assess it. Even worthwhile proposals have a certain cost, and the more you have going at the same time, the more these costs compound each other. I'd like to be able to just create and edit entries without having to be constantly dropping everything to express my opinion on, say, pig Latin entries (DON'T EVEN THINK ABOUT IT!!! ;-)). We can't just tune you out, because then you'll just go ahead and run with everything, including that non-negligible percentage of lame and otherwise awful ideas. Yes, it's nice to have a little water now and then, but when you open the floodgates, we have to stop and pump out the basement. Chuck Entz (talk) 04:05, 31 August 2016 (UTC)
@Chuck Entz: I'll have this in mind, thanks: "every new feature [...] will require time and effort from the community ...".
You made that comment, specifically, in the poll "Allowing entries for normal English terms written in Morse code.", so at least let me point out that I don't want terms in Morse code (I don't want Pig Latin terms either!), though I do support having Morse characters like "A" and other stuff. I just thought this poll would be a good idea in contrast to the other Morse polls, so it's clear that a lot of people oppose it.
Re: "you'll just go ahead and run with everything, including that non-negligible percentage of lame and otherwise awful ideas". I challenge you to make a list of my lame or otherwise awful ideas. Thank you. --Daniel Carrero (talk) 20:58, 2 September 2016 (UTC)

Oppose - Morse code

  • Oppose. This isn't in the right section but I can't even tell what's going on, and when I edit the page, it's confusing, and hard. Move it if you must. I oppose because I don't believe that the creator wants to create Morse code to help anyone. He just wants to do it to fulfil some kind of nerdy, autistic bullshit. It won't help anyone. It's worthless. Forbid it. If anyone ever comes here and says "we can't use Wiktionary because we need Morse code in order to read it", we should start using Morse code. This naturally won't happen because Morse code is from the telegraphy era and totally obsolete now. Grow up and start doing useful, meaningful entries. Equinox 23:35, 30 August 2016 (UTC)
    • I created a separate "Oppose - Morse code" section for your vote, if that helps. --Daniel Carrero (talk) 00:02, 31 August 2016 (UTC)
    • "He just wants to do it to fulfil some kind of nerdy, autistic bullshit." Isn't this pretty much all of us, Eq? Why else would we lavish so much time on Wiktionary? Don't be a hater for hate's sake. —Μετάknowledgediscuss/deeds 05:36, 31 August 2016 (UTC)
    • As an autistic woman myself, I find that kind of language to be derogatory, offensive, and hateful. Bashing a contributor's neurology or (dis)ability status has no place on Wikimedia. And Morse Code is not obsolete, go read "wikipedia:Morse Code." It is required knowledge for radio licenses in the USA, and also by USA Air Force personnel. Nicole Sharp (talk) 06:15, 31 August 2016 (UTC)
      • (Not to mention that we document obsolete writing systems too!) --Daniel Carrero (talk) 06:44, 31 August 2016 (UTC)
        • Speaking of obsolete writing systems, we should start a campaign to also add (asterisked) Runic transliterations to the Old English, Old Norse, and Proto-Germanic entries. :-/ Nicole Sharp (talk) 06:50, 31 August 2016 (UTC)
  •   Oppose I basically agree with Equinox. Morse code is still used as part of some rescue and disaster tropes in movies. DCDuring TALK 02:38, 31 August 2016 (UTC)

Abstain - Morse code

Polls on the typography for Morse Code

Poll: Morse code format: "FULL STOP", "HYPHEN-DASH"

Proposal: Naming Morse code using "FULL STOP", "HYPHEN-DASH", like this: .-.-.-.-.

This is a poll with no policy value.

  1.   Support Do both and keep them as alternative forms of each other. Philmonte101 (talk) 22:25, 30 August 2016 (UTC)
  2.   Support. If we create entries for Morse code characters, this should be the format of the entry titles. We can format the headword lines with images so that they line up appropriately. --WikiTiki89 12:41, 31 August 2016 (UTC)
  3.   Support as the actual entry form, but display the header with en dashes if that option is not used as the main form. Andrew Sheedy (talk) 20:05, 31 August 2016 (UTC)
  1.   Oppose --Daniel Carrero (talk) 21:32, 30 August 2016 (UTC)
    Pro: .-.-.-.- is easy to type. Con: single dot (letter E) and two dots (letter I) are impossible to use as normal entry titles and would have to be kept in Unsupported titles/Full stop and Unsupported titles/Double period, respectively.
    I'm opposing this because I prefer using "MIDDLE DOT", "HYPHEN-DASH", like this: ·-·-·-·-. Actually, I support using the full stop entries as "alternative form" entries, assuming they are attestable.--Daniel Carrero (talk) 21:32, 30 August 2016 (UTC)
  2.   Oppose For the reasons Daniel gives. —CodeCat 22:16, 30 August 2016 (UTC)
  3.   Oppose Technical limitation. But okay making them redirects. --Octahedron80 (talk) 00:24, 31 August 2016 (UTC)
  4.   Oppose The official standard for Morse Code uses minuses, not hyphens. Nicole Sharp (talk) 06:53, 31 August 2016 (UTC)
  Oppose, but create as redirects if possible. Andrew Sheedy (talk) 17:14, 31 August 2016 (UTC)
  1. Abstain - I cannot see how the bitwise representation of ascii, unicode is substantively different from morse code. Or, for that matter, the multiple tonal representations of bit state in various baud standards. (obligatory references to pae, creep, nvc) - Amgine/ t·e 01:47, 27 September 2016 (UTC)

Poll: Morse code format: "MIDDLE DOT", "HYPHEN-DASH"

Proposal: Naming Morse code using "MIDDLE DOT", "HYPHEN-DASH", like this: ·-·-·-·-.

This is a poll with no policy value.

  •   Support. There are many types of dots and dashes on Unicode, but those feel the most "generic" to me. We can introduce a new script code, such as |sc=Morse, to use the right fonts (Georgia would do, apparently) that make the dot and dash display properly aligned in a horizontal line. --Daniel Carrero (talk) 21:32, 30 August 2016 (UTC)
    I changed my vote to full stop + en dash (originally middle dot + en dash). --Daniel Carrero (talk) 16:08, 31 August 2016 (UTC)
  1.   Support, though I'm opposed to having Morse word entries in general. —CodeCat 22:17, 30 August 2016 (UTC)
  2.   Support Do both of them as alternative forms of each other. Philmonte101 (talk) 22:26, 30 August 2016 (UTC)
  3.   Support --Octahedron80 (talk) 00:25, 31 August 2016 (UTC)
  1.   Oppose The official standard for Morse Code uses periods and minuses (corrected from "en dashes"), not hyphens and middots. Nicole Sharp (talk) 06:54, 31 August 2016 (UTC)
  2.   Oppose Let's use en dash per Nicole Sharp. Still agree with middle dot. --Octahedron80 (talk) 06:55, 31 August 2016 (UTC)
    • Actually, they are apparently minuses, not en dashes (both look identical on my screen). See analysis by user Daniel Carrero below. Nicole Sharp (talk) 07:16, 31 August 2016 (UTC)
  3.   Oppose. If we want alignment, we can format the headword lines with images so that they line up appropriately. --WikiTiki89 12:42, 31 August 2016 (UTC)
  1. Abstain - I cannot see how the bitwise representation of ascii, unicode is substantively different from morse code. Or, for that matter, the multiple tonal representations of bit state in various baud standards. (obligatory references to pae, creep, nvc) - Amgine/ t·e 01:48, 27 September 2016 (UTC)

Poll: Morse code format: "MIDDLE DOT", "EN DASH"

Proposal: Naming Morse code using "MIDDLE DOT", "EN DASH", like this: ·–·–·–·–.

(This option was not present in the poll before, it was proposed and added later.)

This is a poll with no policy value.

  Support --Daniel Carrero (talk) 16:08, 31 August 2016 (UTC)
As discussed below, the official document!!PDF-E.pdf mostly uses minuses and middle dots, but the minuses seem to be a mistake on their part and might not make sense. The en dash seems to be typographically identical to the minus and does not have the same mathematical meaning as the minus has. Hence let's use the en dash. --Daniel Carrero (talk) 16:08, 31 August 2016 (UTC)
You are mistaken, it uses full stops and minus (and in some places full stops and en dashes), but not middle dots. --WikiTiki89 17:13, 31 August 2016 (UTC)
Thank you for pointing out my mistake. --Daniel Carrero (talk) 17:18, 31 August 2016 (UTC)
  1.   Oppose. --WikiTiki89 17:13, 31 August 2016 (UTC)
  1. Abstain - I cannot see how the bitwise representation of ascii, unicode is substantively different from morse code. Or, for that matter, the multiple tonal representations of bit state in various baud standards. (obligatory references to pae, creep, nvc) - Amgine/ t·e 01:49, 27 September 2016 (UTC)

Poll: Morse code format: "FULL STOP", "EN DASH"

Proposal: Naming Morse code using "FULL STOP", "EN DASH", like this: .–.–.–.–.

(This option was not present in the poll before, it was proposed and added later.)

This is a poll with no policy value.

  1.   Support --Daniel Carrero (talk) 16:08, 31 August 2016 (UTC)
    As discussed below, the official document!!PDF-E.pdf mostly uses minuses and full stops, but the minuses seem to be a mistake on their part and might not make sense. The en dash seems to be typographically identical to the minus and does not have the same mathematical meaning as the minus has. Hence let's use the en dash. --Daniel Carrero (talk) 17:19, 31 August 2016 (UTC)
    Also note that the document puts spaces between each character and enlarges the font, like this: . – . – . – . – --WikiTiki89 17:21, 31 August 2016 (UTC)
    If we add |sc=Morse in all Morse code entries, we can make the CSS "font-size" larger. I prefer not adding spaces manually between all Morse characters, because I'd rather use CSS "letter-spacing" to make the kerning a bit wider. --Daniel Carrero (talk) 17:27, 31 August 2016 (UTC)
    Or we can enter them as .--. and display them in the headword line with an image. That would put the entry at the most commonly typed form and would resolve the problem of displaying them properly. --WikiTiki89 17:29, 31 August 2016 (UTC)
    Here is an example of what an entry could look like. The images I found are not 100% ideal, but we can deal with that if we decide to go with this option. Alternatively, we can just include the audio at P and be done with it. --WikiTiki89 18:22, 31 August 2016 (UTC)
  1.   Oppose. I don't think that the document in question is implying that this is any sort of standard for the computerized encoding of written morse code. --WikiTiki89 17:22, 31 August 2016 (UTC)
  1. Abstain - I cannot see how the bitwise representation of ascii, unicode is substantively different from morse code. Or, for that matter, the multiple tonal representations of bit state in various baud standards. (obligatory references to pae, creep, nvc) - Amgine/ t·e 01:49, 27 September 2016 (UTC)

Poll: Morse code format using other characters

Proposal: Naming Morse code using other characters for dot and dash. (Presumably, oppose and abstain votes can be given once there are any new proposals for dots and dashes.)

This is a poll with no policy value.

I am off the topic. With パ as above example needs 2 codes, should we use a space, or a slash, or space-slash-space to tokenize it? IMO I would use just a space. --Octahedron80 (talk) 00:36, 31 August 2016 (UTC)
  1.   Support, but again, keep as alternative forms. Perhaps we may want to have "MorseBot", who can create all the alt forms automatically whenever a morse entry is created. Philmonte101 (talk) 22:27, 30 August 2016 (UTC)
  2. My suggestion would be to use the same typography used in the official standard for International Morse Code, the governing body for which is the United Nations International Telecommunication Union. The official standard from the United Nations for international use is given here:!!PDF-E.pdf . They use the format of periods and minuses (corrected from "en dashes"; not hyphens) in bold font, separated by spaces. Nicole Sharp (talk) 06:22, 31 August 2016 (UTC)
    • Actually, that does not seem to be the case. The PDF file you linked has a certain degree of inconsistency: it contains 114 occurrences of the − ("MINUS SIGN") and 22 occurrences of the – ("EN DASH"). For example, in the list "Punctuation marks and miscellaneous signs", almost all the Morse codes are using the minus sign, except the commercial at, which is using the en dash. --Daniel Carrero (talk) 07:06, 31 August 2016 (UTC)
      • Thank you for the analysis; en dashes and minuses display identically on my computer. I corrected above. Either way, they are not hyphens. Nicole Sharp (talk) 07:11, 31 August 2016 (UTC)
        • Minuses might not make sense. Let's use en dashes instead. --Octahedron80 (talk) 07:15, 31 August 2016 (UTC)
          • I do agree with that. Minuses should be reserved for mathematical use, per Unicode guidelines. The en dash would be the correct character for this use. Someone should email the ITU with their mistake. Nicole Sharp (talk) 07:19, 31 August 2016 (UTC)
            • I agree with using en dashes. Minuses seem to unavoidably imply the mathematical subtraction sense. (and "varieties" of the subtraction sense, such as the "negative number" sense and the "cathode" sense, as seen in: ) The en dashes seem to be typographically better because they are "generic" dashes, without an inherent meaning. --Daniel Carrero (talk) 07:23, 31 August 2016 (UTC)
              • The online Morse code converters/translators use periods and hyphens; a space between letters, and either / or three spaces between words:
              • -... . -. ... / -... . ... - / .-- .. .-. . / -... . -. -.. ... / -... . ... - —Stephen (Talk) 12:05, 27 September 2016 (UTC)
  3.   Support using en dashes and periods/full stops, with redirects for hyphens and periods and minus signs and periods. Andrew Sheedy (talk) 17:47, 31 August 2016 (UTC)
  1. Abstain - I cannot see how the bitwise representation of ascii, unicode is substantively different from morse code. Or, for that matter, the multiple tonal representations of bit state in various baud standards. (obligatory references to pae, creep, nvc) - Amgine/ t·e 01:50, 27 September 2016 (UTC)
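The converter format quoted in this poll (periods and hyphens, a space between letters, a slash between words) can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `MORSE`, `VARIANTS`, `normalize` and `decode` are hypothetical, the table covers only the letters A to Z, and folding the middle dot, en dash and minus sign onto ASCII is one possible way to treat the variants debated above as equivalent, not a decided convention.

```python
# Hypothetical sketch: table limited to A-Z, International Morse code.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}
DECODE = {code: letter for letter, code in MORSE.items()}

# Fold the dot and dash variants discussed in the polls onto ASCII.
VARIANTS = str.maketrans({
    "\u00b7": ".",  # MIDDLE DOT
    "\u2013": "-",  # EN DASH
    "\u2212": "-",  # MINUS SIGN
})


def normalize(morse):
    """Return the Morse string written with FULL STOP and HYPHEN-MINUS only."""
    return morse.translate(VARIANTS)


def decode(morse):
    """Decode Morse with spaces between letters and '/' between words.

    Raises KeyError for any symbol that is not in the A-Z table.
    """
    words = normalize(morse).split("/")
    return " ".join(
        "".join(DECODE[symbol] for symbol in word.split())
        for word in words
    )


print(decode("-... . ... - / .-- .. .-. ."))
```

With a normalization step like this, a search for any of the typographic variants could resolve to the same entry, whichever form is chosen for the page title.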

Comment - Morse code

As Wikitiki89 says above, the standards document shouldn't be interpreted as setting a standard for representation in writing. My guess is that they used whatever was convenient, and whatever word processing app they wrote it in converted the dashes and dots into whatever Unicode characters its algorithms chose (remember Smart Quotes™?).

Whatever we choose should only be used for the entry name itself. For everything else, we should have a module that converts regular characters into their Morse code representation, and use it for everything. That would make the wikitext a lot easier to work with. The only reason to make an exception would be if there were more than one possible way of representing the same character in Morse code (not likely for the Unicode Basic Latin block, but who knows about all the other characters?). It also would mean that we could switch from hyphens to en-dashes to em-dashes to images to whatever we wanted by fiddling with a line or two of code and stop debating over standards that only affect appearance. Chuck Entz (talk) 02:49, 1 September 2016 (UTC)
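A rough sketch of what such a conversion module might do, written in Python purely for illustration (an actual Wiktionary module would be written in Lua, and nothing here reflects existing code). The function name `to_morse`, its parameters, and the A to Z only table are all assumptions. The point is that the dot and dash characters are parameters, so switching from hyphens to en dashes means changing one argument.

```python
# Hypothetical sketch: table limited to A-Z, International Morse code.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}


def to_morse(text, dot=".", dash="-", letter_sep=" ", word_sep=" / "):
    """Convert Latin letters to Morse using the given dot and dash strings.

    Raises KeyError for characters outside the A-Z table.
    """
    # translate() substitutes both symbols in one pass, so the chosen
    # dot and dash strings cannot interfere with each other.
    style = str.maketrans({".": dot, "-": dash})
    words = []
    for word in text.upper().split():
        letters = [MORSE[ch].translate(style) for ch in word]
        words.append(letter_sep.join(letters))
    return word_sep.join(words)


print(to_morse("P"))                 # ASCII full stop and hyphen
print(to_morse("P", dash="\u2013"))  # same letter with en dashes
```

Because only the entry name would be stored in a fixed form, display choices such as dash style or spacing stay in one place.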

  • I created the Morse code entries for letters and numbers. See .-. It links to the other entries. Feel free to give suggestions, edit the entries, etc. --Daniel Carrero (talk) 08:17, 1 September 2016 (UTC)
    I much prefer this version. DCDuring TALK 14:31, 1 September 2016 (UTC)


FYI: I read all the votes between 2005–2016 and attempted to clean up / rewrite Wiktionary:Votes/Timeline completely. I added the missing votes.

--Daniel Carrero (talk) 06:22, 31 August 2016 (UTC)

Thank you! - -sche (discuss) 18:48, 1 September 2016 (UTC)
Very nice! — Eru·tuon 22:00, 1 September 2016 (UTC)
 :) --Daniel Carrero (talk) 07:56, 3 September 2016 (UTC)


Lately I've been referencing Proto-Turkic in Mongolian etymology sections, and I think we need to decide on a standardisation now before the usages grow too numerous and inconsistent. Here are the details we need to decide on:

  1. z or ŕ?
  2. ĺ or š or ş?
  3. y or ı?
  4. j or c or dž?
  5. c or č?
  6. y or j?
  7. ä and e or e and é or some other combination?
  8. Should length be indicated with a macron or with a colon?
  9. Should we write d- or ń- in cases of Common Turkic y- words that have been loaned with n- and d- into Mongolian and Hungarian?
    Or only an etymology section "From earlier..."?
  10. Should we write non initial syllable o/ö in cases where Brahmi script implies it and Mongolian loans have a/e?
  11. -d2- or -ð- or -d- for Chuvash -r-, Common Turkic -y-, Khalaj -d-?
  12. Should we write d in places where Oghuz has it and other languages have t?
  13. Should we mark back and front vocalic k and g differently?
    ǵ and g, or g and ğ; ḱ and k, or k and q

@Anylai, @Madina, @Vahagn Petrosyan, not sure who else uses Proto-Turkic here, please ping them. —This unsigned comment was added by Crom daba (talk • contribs).

Let me repeat the ping, since it wouldn't have worked without a signature: @Anylai, @Madina, @Vahagn Petrosyan Chuck Entz (talk) 03:46, 1 September 2016 (UTC)
Rhotacism and zetacism along with lambdacism are still debated topics in Turkic. Clauson looks at the oldest records of Turkic and thinks it is rhotacism and lambdacism that happened in other languages, but the amount of external evidence in Hungarian, Mongolic languages and some Siberian languages suggests otherwise. Therefore, usually ŕ and ĺ are reconstructed for Proto-Turkic; for Common Turkic you can use z or š. Turkic loanwords in Mongolic are divided into 3 periods; for the first period of loanwords you will meet l and r in Mongolic, therefore use ĺ and ŕ. In fact, ĺ-->l, ŕ-->r is actually observed in modern Turkic languages too. I usually reference from Starling, and for your questions they have:
č, j instead of y, ɨ instead of y or ı. Not too many reconstruct dž, at least not word initially. ń is not reconstructed for Proto Turkic in word initial position, d is also debated because only Oghuz languages have it and it is not very consistent. Generally Altaicists reconstruct a secondary d instead of t where a word ends with ĺ, n, ŕ which should again be inherited from Proto Altaic t, because Proto Altaic word initial d is not equal to Proto Turkic d for them. But we are talking about another debated topic which is Altaic, some are assumptions that rely on Altaic hypothesis so most of the time you will in fact hear they are not loanwords.
Use macron for long vowels. As far as I know, at Proto Turkic level it is assumed that k and g were not yet split into q and gh. Starling prefers /ki/ instead of ḱ where Chuvash suggests it so, for example see *Kiār (snow). By the way it is also accepted that Proto Turkic had word final b instead of w. --Anylai (talk) 18:25, 1 September 2016 (UTC)
I don't think we should rely too much on Dybo-Starostin and Starling in general, Altaic hypothesis is controversial and reconstructions given there may be too dependent on a given Altaic etymology. As far as I'm concerned I'd prefer a convention near to Clauson (I wouldn't object to easily made changes such as ŕ instead of z, and changes that look prettier or make it easier for me to input the word such as ä/e instead of e/é and macrons instead of colons) Crom daba (talk) 00:56, 2 September 2016 (UTC)
Their Proto-Turkic reconstructions are not bad and do not rely on Altaic; only semantics could be distorted. In fact they are better than Clauson, who only has a dictionary of pre-13th-century Turkic. Clauson makes a lot of assumptions too: if you read his books and the entries in his dictionary, he sometimes mentions "a 1st period loanword in Mongolian." etc... Here you meet l and r corresponding to Common Turkic š and z. He makes up an imaginary language, the language of the Tabgach (Touba), from whom the Mongols apparently borrowed these words before the earliest Turkic records (8th century), which already have z and š. For him the Tabgach somehow also spoke a language close to Chuvash, undergoing the same assimilations in the same areas, and the original consonants were z and š. This is not convincing, pretty much like most entries in Starling's Proto Altaic database. You can use that source; there are references and comments below the etymology which you can also make use of: those who reject it, those who think the Mongolic one is a borrowing, etc... We cannot deny there are a lot of cognates between Turkic and Mongolic. Yes, there are very obvious loans, but also very dangerous words that even the most primitive community wouldn't need to borrow from a different language. Yes, according to Clauson the relationship between Mongolic and Turkic is just an expected list of loanwords in a primitive community, which in this case is the Mongolic one. --Anylai (talk) 06:48, 4 September 2016 (UTC)
Whether we select zetacism or rhotacism as primary is non-essential, there is no risk of misunderstanding either way for the person reading the entry. Personally I'm undecided on the question.
Bigger problems are their *ạ and *ia which aren't accepted in mainstream Turkic studies, ignoring Khalaj h- as secondary when it doesn't line up with Altaic comparanda, and the annoying convention of writing capitals to indicate an uncertain reconstruction; and I hate to see all that adopted here wholesale.
I disagree with Clauson's outright denial of Altaic, but I feel that a bigger fallacy is being committed by Starostin et al when denying prehistoric (pre-13th century even) word loaning among Mongolic and Turkic in the introduction to their Altaic dictionary. Virtually all language contact situations I know of have been highly asymmetrical, and I think there's no lexical sphere that is completely immune to borrowing.
Starling is indeed a handy resource, but you should probably double check with the references if they're available, I've caught a few mistakes in the Mongolian database and I've heard the same about the Uralic one (as can be expected of a project with such a big scope) Crom daba (talk) 02:21, 5 September 2016 (UTC)
Not a Turkologist, but I'd like to note that ɨ does not combine well with macrons (you get ɨ̄ instead of the expected dotless version); an i/ı distinction would have similar problems. (Also, just in case, reminder to please put whatever you decide on at Wiktionary:About Proto-Turkic.) --Tropylium (talk) 07:38, 7 September 2016 (UTC)
I hope you don't let font issues get in the way of making a correct decision. --WikiTiki89 15:59, 7 September 2016 (UTC)
ɨ̄ shows as two characters for me in the editing box, but on the screen once posted, it shows correctly, with the macron replacing the dot.--Prosfilaes (talk) 22:12, 8 September 2016 (UTC)
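The "two characters" observation can be checked with Python's standard `unicodedata` module. The sketch below is only an illustration; the helper names `long_vowel` and `describe` are hypothetical. It shows that a plain vowel plus a combining macron composes into a single code point under NFC normalization, whereas ɨ and dotless ı have no precomposed macron form, so the macron remains a separate combining character and its appearance depends on the font and renderer.

```python
import unicodedata

MACRON = "\u0304"  # COMBINING MACRON


def long_vowel(vowel):
    """Return the vowel with a macron, composed (NFC) where Unicode allows."""
    return unicodedata.normalize("NFC", vowel + MACRON)


def describe(text):
    """List the Unicode character names of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


for vowel in ("a", "\u0268", "\u0131"):  # a, i with stroke, dotless i
    print(describe(long_vowel(vowel)))
```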