Wiktionary:Beer parlour/2020/April

Cancelled

April has been cancelled due to coronavirus. Please proceed to May. —Rua (mew) 12:25, 1 April 2020 (UTC)[reply]

Ahem ... actually, April contracted COVID-19 which is the disease caused by the virus; it is imprecise to say that it was canceled due to the virus when it was actually canceled due to the disease. - TheDaveRoss 22:03, 1 April 2020 (UTC)[reply]

This is a descriptivist household. —Suzukaze-c ◇◇ 05:33, 2 April 2020 (UTC)[reply]

Meaning: in case of disease, we’ll describe you some medication. --Lambiam 14:25, 2 April 2020 (UTC)[reply]

How to request a missing sense: request for definition?

I have the impression that the intention is to use {{rfdef}} to request a missing sense, but this is not evident from the documentation at Category:Request_templates or at {{rfdef}} itself. I would gladly add clarifications to {{rfdef}} and Category:Request_templates if someone can confirm that this is indeed the agreed approach – or that someone could do it themself. Looking for some examples I came across some in Category:Requests_for_definitions_in_Russian_entries where {{rfdef}} is used without a comment after other definitions, which seems almost useless (александровский, аттитюд, both added by Benwing2 – perhaps some boilerplate that they forgot to remove?). PJTraill (talk) 13:12, 1 April 2020 (UTC)[reply]

Usually, I've seen {{rfdef}} used even when other definitions are present, but more often than not, someone will request a definition another way (on the talk page, in the Tea Room, etc.). In most cases that I've seen {{rfdef}} alongside other definitions, it was accompanied by a usage example, which made it possible to identify and add the intended sense. Andrew Sheedy (talk) 01:07, 2 April 2020 (UTC)[reply]

Usage examples or citations are very useful to convey to others what sense you are looking for. It's also not all that hard to look at how other dictionaries meet the need and then 'improve' their definition. DCDuring (talk) 20:54, 17 April 2020 (UTC)[reply]

User:Rua systematically removing `{{top3}}` from Slavic entries

Despite the vote failing to deprecate topN, @Rua is still systematically removing {{top3}} from Slavic entries. Pinging those that voted: @Fay Freak, Robbie SWE, Vorziblix, Donnanz, פֿינצטערניש, Droigheann, Mellohi!, Julia --{{victar|talk}} 09:14, 5 April 2020 (UTC)[reply]

@Rua: You should stop editing entries to enforce a format or scheme that is known merely to alternate with the other format – except perhaps to counteract other people doing the same futile thing. But when somebody created an entry with {{top3}} there is no specific point to remove it and when {{c}} or {{C}} is used there is no point to write manually {{topics}} in place of it; if there ever is consensus on it then there are bots for such changes. You are splurging your time and the time of other people who inspect your edits, and entertain your own insanity, and I say this not to hurt but because I am sorry for you. Fay Freak (talk) 11:07, 5 April 2020 (UTC)[reply]

I'm sorry, but my understanding of that vote is that its failing means that topN &c can be used when appropriate, not that once used they can't be removed. In fact I find edits like this a definite improvement for human readability, whatever it does to bots. --Droigheann (talk) 11:13, 5 April 2020 (UTC)[reply]

That's my interpretation of the vote as well. The vote failed, so the normal rule remains, which is that users can improve entries whatever way they see fit. Since I think my edits are an improvement, I will make them, obviously. —Rua (mew) 11:27, 5 April 2020 (UTC)[reply]

@Rua: So it is that the system of that user will prevail who spends the most insane amount of time. Fay Freak (talk) 11:29, 5 April 2020 (UTC)[reply]

You act like this is surprising. What would Wikipedia do in a situation like this? —Rua (mew) 11:35, 5 April 2020 (UTC)[reply]

@Rua: Oh no, I am not surprised, but it’s a Pyrrhic victory for you and for all, and I thought you fain don’t let Wiktionary descend to Wikipedia’s level. Fay Freak (talk) 11:45, 5 April 2020 (UTC)[reply]

@Droigheann: You're mixing up and not properly distinguishing things. This edit deployed {{onomatopoeic}} and put {{catlangcode}} to the sense and even the removal of column templates is inarguable here because of not enough descendants for column templates, but there is specifically no consensus to remove column templates from Proto-Slavic entries which have descendants in three groups, and then there are changes like moving content from before the part of speech to after the part of speech which is not agreed upon and editing but to replace {{c}} and {{topics}} which hasn’t had any consensus and I observe with concern. Fay Freak (talk) 11:29, 5 April 2020 (UTC)[reply]

Obviously I was talking about the removal of columns in that edit, which you admit is inarguable here, I don't think this discussion is about the other templates you mention. Thing is Victar pinged me about Rua 'systematically removing columns from Slavic entries', so I went into their contribution history and the first Slavic edit I came across was related to Czech and had nothing to do with columns [1], and the second was the one I'd already linked to. Maybe Rua is systematically removing column templates everywhere on principle, but if that's the case it's a matter for admins, innit? (Not to mention that with my connexion speed I'd spend a week before I could even decide whether or not they do that.) --Droigheann (talk) 11:50, 5 April 2020 (UTC)[reply]

It seems clear that @Rua is formally justified, but that the spirit of the oppose votes was that columns were often worthwhile; I would also appreciate their using an edit summary. It therefore seems that a further discussion is desirable to see if consensus can be reached on when columns should be used (if ever), and when it is justifiable to remove them. For the record, I find removing columns from Proto-Slavic/maca justified, but regret it being done to golǫbь. My feeling is that a lot depends on the device on which you read: I use a wide screen and so like columns; if this is behind people’s preferences, the ideal solution would be a redefinable style, but I do not know if HTML/CSS can do columns like that. (CC: @Fay Freak, Robbie SWE, Vorziblix, Donnanz, פֿינצטערניש, Droigheann, Mellohi!, Julia) PJTraill (talk) 20:15, 5 April 2020 (UTC)[reply]

I suggested such a solution on the talk page of the vote, but it was ignored. —Rua (mew) 20:53, 5 April 2020 (UTC)[reply]

But we act for the general readership, and you would justify every edit with the general readership, so user preferences make little difference here. Fay Freak (talk) 21:48, 5 April 2020 (UTC)[reply]

Now it's not ignored; I have offered a solution. —Игорь Тълкачь (talk) 16:27, 13 July 2020 (UTC)[reply]

(edit conflict) I have mixed feelings about this. Yes, Rua has an unfortunate tendency to respond to disagreement by putting her head down and bulldozing her way through. Yes, Victar has a tendency to take disagreements personally and complain rather than doing the work necessary to persuade and/or arrive at an acceptable compromise. And yes, the vote was poorly designed to find the consensus on this issue because it was far too broad.

That said, the point about mobile view and narrow screens seems to make sense: if you resize a window with a three-column list, you get narrower and narrower columns, until the individual columns are only a few characters wide and everything wraps in awkward ways. I wonder if there's a way to break the list into floating blocks that can display either side by side on wide screens or one after the other on narrower ones. This would work well in cases where you want to limit a column to children of one node even if the nodes have different numbers of children, which is dealt with in the {{top3}} scheme by inserting {{mid}} before each node. Chuck Entz (talk) 21:27, 5 April 2020 (UTC)[reply]

Everything in tailoring display according to device properties is possible via CSS and HTML nowadays, especially formatting columns, which wasn’t easy less than twenty years ago. Fay Freak (talk) 21:48, 5 April 2020 (UTC)[reply]
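The behaviour discussed above (side-by-side blocks on wide screens, one block after another on narrow ones) would in practice be handled by CSS in the browser. Purely to illustrate the logic, here is a minimal Python sketch; the function names, the pixel widths and the cap of three columns are assumptions made for this example and are not taken from {{top3}} or any Wiktionary stylesheet.

```python
from math import ceil


def column_count(container_width: int, min_column_width: int, max_columns: int = 3) -> int:
    """Return how many columns fit: at least 1, at most max_columns."""
    fit = container_width // min_column_width
    return max(1, min(max_columns, fit))


def split_into_columns(items: list[str], columns: int) -> list[list[str]]:
    """Split items into at most `columns` consecutive blocks of nearly equal size."""
    if not items:
        return []
    per_column = ceil(len(items) / columns)
    return [items[i:i + per_column] for i in range(0, len(items), per_column)]


if __name__ == "__main__":
    descendants = ["East Slavic", "South Slavic", "West Slavic"]  # illustrative labels only
    for width in (1200, 320):  # hypothetical viewport widths in pixels
        n = column_count(width, min_column_width=300)
        print(width, split_into_columns(descendants, n))
```

On a narrow container the list collapses to a single block instead of squeezing three columns into a few characters each, which is the wrapping problem described in the thread.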

@Chuck Entz: Fay Freak is absolutely correct; mobile is a complete non-issue because three columns appear as one on mobile.

Because Rua is allowed to seemingly act with impunity, all I have left at my disposal is extreme bureaucracy, i.e. creating votes, and to sound alarms for misbehaviors. If I had reverted her on the edit I illustrated, she would have reverted and probably blocked me, as she did with columns on West Germanic entries. --{{victar|talk}} 02:39, 6 April 2020 (UTC)[reply]

When I voted I was more concerned with keeping these templates, not with particular applications. I do find them useful. DonnanZ (talk) 11:19, 6 April 2020 (UTC)[reply]

Module:zh/data/ltc(och)-pron/* of variant forms

For example:

Module:zh/data/ltc-pron/欬 > Module:zh/data/ltc-pron/咳 (咳 has different pronunciations and meanings)
Module:zh/data/ltc-pron/撦 > Module:zh/data/ltc-pron/扯
Module:zh/data/ltc-pron/㨾 > Module:zh/data/ltc-pron/樣
Module:zh/data/ltc-pron/喫 > Module:zh/data/ltc-pron/吃 (吃 has different pronunciations and meanings)
Module:zh/data/och-pron-ZS/喫 > Module:zh/data/och-pron-ZS/吃
Module:zh/data/och-pron-ZS/鑑 > Module:zh/data/och-pron-ZS/鑒

and:

Module:zh/data/ltc-pron/歸 > Module:zh/data/ltc-pron/皈 (皈 [not recorded in Guangyun] is a variant form of 歸, but 皈依 [used in Middle Chinese] is treated as orthodox form)

Should we:

move (merge) these pages;
or create redirections in module namespace;
or put {{zh-pron}} on the entries of variant forms? (@Justinrleung, Suzukaze-c, Tooironic, kc_kennylau, Nyarukoseijin)--沈澄心 ✉ 15:18, 5 April 2020 (UTC)[reply]

Request for using AutoWikiBrowser

Dear sysops, I would like to be permitted to use AutoWikiBrowser for creating forms (verb forms, noun forms, orthographical variants, etc.) in Esperanto. Thank you. Jonashtand (talk) 19:13, 6 April 2020 (UTC)[reply]

@Jonashtand All set. - TheDaveRoss 12:31, 7 April 2020 (UTC)[reply]

bug when editing entries

Is anyone else experiencing a weird bug when editing entries? The text comes up in blue, and lags a lot when editing larger entries. How can I disable this? ---> Tooironic (talk) 02:03, 9 April 2020 (UTC)[reply]

@Tooironic: I have not. Can you tell me which entry(ies)? —Justin (koavf)❤T☮C☺M☯ 02:07, 9 April 2020 (UTC)[reply]

It comes up anytime I edit entries, including right now as I am leaving this comment. Some bits are blue, some bits are black, some are yellow/dark green. Kind of like a HTML editor. ---> Tooironic (talk) 02:11, 9 April 2020 (UTC)[reply]

See Extension:CodeMirror on MediaWiki wiki. It provides syntax highlighting for wikitext. To disable or enable the syntax highlighting, click the marker button in the toolbar for the edit box. — Eru·tuon 02:49, 9 April 2020 (UTC)[reply]

That fixed it. Thank you very much. ---> Tooironic (talk) 04:27, 9 April 2020 (UTC)[reply]

Vote: Attestation of comparatives and superlatives

I created Wiktionary:Votes/pl-2020-04/Attestation of comparatives and superlatives for what I find pretty obvious.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 09:03, 10 April 2020 (UTC)[reply]

I'd like template editor permissions, please.

See the relevant contributions. I mostly work on quotation templates, and have occasionally run into an issue where I have to request an edit from someone with the relevant permissions. It was suggested that I request the permissions for myself. grendel|khan 20:05, 10 April 2020 (UTC)[reply]

Yes, please give the user the rights. --Vitoscots (talk) 21:20, 10 April 2020 (UTC)[reply]

Support; he or she has been doing useful work with quotation templates. — SGconlaw (talk) 08:33, 11 April 2020 (UTC)[reply]

Done. SemperBlotto (talk) 08:40, 11 April 2020 (UTC)[reply]

@SemperBlotto: Not that I much care, but I believe it takes two admins to approve. --{{victar|talk}} 08:49, 11 April 2020 (UTC)[reply]

I’m an admin. Does that count? — SGconlaw (talk) 20:43, 19 April 2020 (UTC)[reply]

More editing than normal

Is it just me, or have we been getting a lot more edits since the whole rona crisis started? --Vitoscots (talk) 21:22, 10 April 2020 (UTC)[reply]

I had a look at the stats recently, and the increase is not very pronounced. – Jberkel 21:43, 10 April 2020 (UTC)[reply]

I see. What the hell was happening on January 19? Did I miss an edit-o-thon or something? --Vitoscots (talk) 21:54, 10 April 2020 (UTC)[reply]

Seems like someone ran an import bot that day, maybe. It would be real nifty if we could look at 'recent changes' as of a particular date or time. grendel|khan 00:00, 11 April 2020 (UTC)[reply]

Apparently not. Bots are charted separately. It could've been a bot using a normal user account. I used to do that all the time in my botting days. --Vitoscots (talk) 00:16, 11 April 2020 (UTC)[reply]

I made a SQL query to figure out who was editing the most, and apparently I was. Looking at my contributions from that day, that was the day that I started protecting all the Old Chinese and Middle Chinese pronunciation data modules (see this Grease Pit discussion). Protections count as edits and there are about 37,000 of these modules. Super boring compared to the other possibility, an unauthorized bot that had slipped a bunch of edits in under the radar. :-( — Eru·tuon 00:38, 11 April 2020 (UTC)[reply]

Super boring indeed. Makes me think, though...How could we find out the record for most number of edits per day? I imagine anything above 1000 would be pretty impressive already. I've made 887 edits in the last two days, and I'm just a newbie. --Vitoscots (talk) 01:02, 11 April 2020 (UTC)[reply]

Newbie, huh? Well, anyway, here's an edits-per-day leaderboard for 2020. You're in the top 300. Bots take up lots of spots unfortunately. I wonder how long an all-time leaderboard would take to generate. — Eru·tuon 01:53, 11 April 2020 (UTC)[reply]
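The SQL query and the leaderboard mentioned above are not reproduced in this archive. As a rough sketch of the same idea, the Python below counts edits per user per day from a list of (user, timestamp) pairs. The record format, the function names and the sample data are all assumptions made for illustration; they are not the actual query or the real revision table.

```python
from collections import Counter
from datetime import datetime


def edits_per_day(revisions):
    """Count edits per (user, date) from (user, ISO-8601 timestamp) pairs."""
    counts = Counter()
    for user, timestamp in revisions:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        counts[(user, day)] += 1
    return counts


def leaderboard(revisions, top=3):
    """Return the `top` busiest (user, date) pairs with their edit counts."""
    return edits_per_day(revisions).most_common(top)


if __name__ == "__main__":
    # Made-up sample data; real input would come from the revision history.
    sample = [
        ("UserA", "2020-01-19T00:05:00"),
        ("UserA", "2020-01-19T00:06:00"),
        ("UserA", "2020-01-19T23:59:00"),
        ("UserB", "2020-01-19T12:00:00"),
        ("UserB", "2020-01-20T12:00:00"),
        ("UserB", "2020-01-20T12:01:00"),
    ]
    for (user, day), n in leaderboard(sample):
        print(user, day, n)
```

As noted in the thread, page protections are logged as edits, so a count like this would include them unless they are filtered out beforehand.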

Adam78 (talk • contribs) takes top spot for user, and that's for his astonishing over 3000 edits in a day for Hungarian Rhyme pages (but we all know that editing Rhyme pages is kinda cheating, but still, let's hope someone gets inspired and writes a decent Hungarian poem because of it). Then there's Fenakhay (talk • contribs) with 1153 edits, but they were all edits changing a template, which was probably bot-done and is also cheating. Hergilei (talk • contribs) is next best, with 1065 edits on 8 April, minor Czech edits, so I consider Hergilei (talk • contribs) as the actual leader. I'm actually just outside the top 300, but I feel good as I am above Equinox (talk • contribs) (488 per day compared to a rather paltry 405). --Vitoscots (talk) 11:42, 11 April 2020 (UTC)[reply]

Quality, not quantity! You've probably got more accounts than edits. Equinox ◑ 22:31, 11 April 2020 (UTC)[reply]

For some projects the figures of page views are notable since the beginning of the confinement in some countries: w:Special:Permalink/950292354. --Vriullop (talk) 08:32, 11 April 2020 (UTC)[reply]

That would have been my guess … — SGconlaw (talk) 08:34, 11 April 2020 (UTC)[reply]

Can a term's idiomaticity in English argue for idiomaticity of identical formations in other languages?

For example fried egg, which is the namesake of the "fried egg test" due to its surviving RFD. If this term is idiomatic in English, then do we regard the identically-formed and synonymous Dutch gebakken ei as idiomatic as well? Note that gebakken means not just "fried" but also "baked", yet this term never refers to a baked egg. —Rua (mew) 09:18, 11 April 2020 (UTC)[reply]

I would say the criterion certainly applies to other languages, but I think the idiomaticity should be determined separately, as it would be for any other term. Andrew Sheedy (talk) 04:31, 12 April 2020 (UTC)[reply]

I would certainly say that gebakken ei deserves an entry. Andrew Sheedy (talk) 04:37, 12 April 2020 (UTC)[reply]

With as hyponym spiegelei. Today’s special: gebakken paasei. --Lambiam 11:32, 12 April 2020 (UTC)[reply]

Category:English supposedly 1-syllable words

Is this empty (and funnily-named) category still useful? Equinox ◑ 15:31, 11 April 2020 (UTC)[reply]

I think it can be deleted. It apparently has to do with a method of counting English syllables in {{IPA}} that Daniel Carrero was working on; it's discussed in User talk:Daniel Carrero/2016. But now we have a different syllable-counting method in {{IPA}} so it isn't needed. — Eru·tuon 23:37, 11 April 2020 (UTC)[reply]

I don't mind deleting it. --Daniel Carrero (talk) 02:58, 12 April 2020 (UTC)[reply]

Gone. —Μετάknowledge^{discuss/deeds} 03:26, 12 April 2020 (UTC)[reply]

AWB permission

I hereby request the AWB permission to make uncontroversial, repetitive edits faster and more efficiently than through manual editing. I have recently gotten the permission at both Wikimedia Commons and English Wikipedia and would say that I'm now fairly acquainted with the tool. Jonteemil (talk) 23:41, 15 April 2020 (UTC)[reply]

@-sche I saw you added the last user to WT:AWB so I was wondering if you could add me too. Jonteemil (talk) 22:12, 30 July 2020 (UTC)[reply]

Given your long history here and the absence of any objections, I have added you. - -sche (discuss) 22:18, 30 July 2020 (UTC)[reply]

@-sche: Perfect, thanks. Jonteemil (talk) 22:33, 30 July 2020 (UTC)[reply]

Mailing list quotation template?

We have {{quote-newsgroup}}, but I've seen it used occasionally for anything on Google Groups, which includes mailing lists (like golang-dev, which could provide a cite for race in the computing sense, for example). We only have a few quotation templates, so I wanted to get some input on this. Wikipedia has Template:Cite mailing list, which links to durably-archived mailing list archives, which can include Google Groups. The generated text of the existing template points to Usenet, which isn't always accurate. Is it worth creating a separate template for this sort of thing? grendel|khan 02:19, 16 April 2020 (UTC)[reply]

How can we tell whether a mailing list, newsgroup, bulletin board or other forum is durably archived? --Lambiam 08:19, 16 April 2020 (UTC)[reply]

@Lambiam: Tough question, but Wikipedia seems to have a good idea. I don't know how much we tend to enforce in the templates by whitelists and the like, but looking at some uses on Wikipedia, seemingly durable archives include Google Groups and MARC. There are also various Pipermail archives, some of which go back more than two decades (lists.debian.org, lists.gnupg.org), individual lists not on MARC like SL4 and the various Linux Kernel mailing lists and their various archives... I'm aware of the case of gmane, which archived millions of messages across twenty thousand mailing lists and is no longer available on the web. Maybe a strong preference for Google Groups or MARC, only using a different archive if they're not available? MARC only goes back to 2000, but that should cover nearly everything. (There are other archives, for example, for the cypherpunks list in the 1990s.) Maybe a guideline that the site should have been running for at least n years and have a clear intent to be a durable archive? Maybe a short whitelist of domain URLs, making it clear that this is for mailing lists, not bulletin boards or anything else like that? grendel|khan 16:56, 16 April 2020 (UTC)[reply]
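The "short whitelist of domain URLs" floated above could be as simple as a host check. The Python below is only a sketch of that idea: the set of hosts is hypothetical (the Google Groups and MARC hostnames are my assumptions, the two Pipermail hosts are the ones named in the comment), and nothing here reflects how {{quote-web}} or any Wiktionary template actually behaves.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; not an agreed Wiktionary policy.
DURABLE_ARCHIVE_HOSTS = {
    "groups.google.com",  # assumed hostname for Google Groups
    "marc.info",          # assumed hostname for MARC
    "lists.debian.org",
    "lists.gnupg.org",
}


def is_allowlisted_archive(url: str) -> bool:
    """Return True if the URL's host is exactly one of the allowlisted archive hosts."""
    host = (urlsplit(url).hostname or "").lower()
    return host in DURABLE_ARCHIVE_HOSTS


if __name__ == "__main__":
    print(is_allowlisted_archive("https://groups.google.com/g/golang-dev"))
    print(is_allowlisted_archive("https://example.org/forum/thread/1"))
```

Matching on the parsed hostname rather than on a substring of the URL avoids accepting addresses that merely contain an allowlisted name somewhere in the path.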

Support As I said on other occasions, they are durable by the fact that the economic foundation of our century and Wikimedia depends on these software projects, so the vanishment of the mailing-lists if one is allowed to equate it with the vanishment of the source-codes is tantamount to and as likely as Wiktionary itself disappearing. See Talk:copy-pasto, @Grendelkhan, Lambiam; the deletion of the term was unfounded, and @Equinox wrongly adduced Github when it wasn’t Github, which led to @Kiwima deleting it because of the ensuing lack of understanding of this durability, which is greater than that of most printed works of the twentieth century. There were observable criteria under which such archives are durable, and only this needs to be understood for templates for quoting mailing-lists to be demanded. Fay Freak (talk) 17:32, 16 April 2020 (UTC)[reply]

"Economic foundations" have nothing to do with whether something is durably archived. Durably archived means it will be able to be accessed in future and won't disappear. That's not economics. I'm sure that the early accounts and receipts of the Ford Motor Company were crucially important to the now-huge car industry but they probably don't all exist any more. Equinox ◑ 17:45, 16 April 2020 (UTC)[reply]

@Equinox: No, they aren’t the foundations in that sense. For software projects, so Wikimedia/Mediawiki, continuous, persisting and still persisting access to the source codes together with these lists is a prerequisite for their being kept up at all. Fay Freak (talk) 18:01, 16 April 2020 (UTC)[reply]

What exactly are the durability criteria? {{quote-web}} doesn't whitelist URLs or have any specific standards for durability (though maybe it should); we could already be using it to cite mailing lists, though that doesn't seem clean. I can understand wanting to use guidelines; anything that would allow both Google Groups and MARC would get us most of the way there. grendel|khan 19:29, 16 April 2020 (UTC)[reply]

I've created {{quote-mailing list}} and added the appropriate notes in the appropriate places. I'll ask on the Grease Pit about getting a maintenance line/category for non-Usenet groups currently using {{quote-newsgroup}}. grendel|khan 19:32, 17 April 2020 (UTC)[reply]

@Grendelkhan: please add it to {{citation templates}}. Thanks. — SGconlaw (talk) 20:48, 19 April 2020 (UTC)[reply]

@Sgconlaw: Done! grendel|khan 22:20, 19 April 2020 (UTC)[reply]

FWIW, as the existence of {{quote-web}} suggests and as has been discussed in a few places, we can include citations of non-durable media even though they don't count towards attesting a term: they may still be particularly illustrative uses of a term, or perhaps be used in referencing/verifying something else like etymology or pronunciation. So, I think it's OK for {{quote-mailing list}} to exist, quite independent from questions of whether or not mailing lists are durable (and I suppose that its creation means the original question is resolved). - -sche (discuss) 01:26, 20 April 2020 (UTC)[reply]

Finding consensus on a large variety of non-content changes

Hello, while looking over Wiktionary:Criteria for inclusion a while ago I noticed many phrasing and formatting changes which could be made in order to improve it. After posting an example of what I thought the page should look like at User:The Editor's Apprentice/sandbox I started a new section at Wiktionary talk:Criteria for inclusion#Edit request: Formatting, phrasing, and minor changes in multiple sections listing and providing some rationale for my requested changes, some of which were replacing "formulae" with "formulas", moving related sentences onto the same lines, and rephrasing writing. Since then, I've expanded the list of changes that I think should be made to the criteria for inclusion page as well as closed my edit request. At this point I'm trying to figure out what might be the best strategy for finding the consensus on my various proposed changes. Would it be best to hold one large vote where users can vote on which changes they do and do not support? Would it make more sense to hold many votes separately for each of the categories of proposed changes? Do some proposed changes need their own vote while others can be grouped together? Answers to these questions as well as other suggestions on what might be the best way of gathering community consensus would be greatly appreciated! —The Editor's Apprentice (talk) 02:00, 17 April 2020 (UTC)[reply]

Ideally, you would create a vote, and if you are making multiple independent edit proposals, you could create separate support-oppose sections for each edit proposal. In Wiktionary:Votes/pl-2020-04/Attestation of comparatives and superlatives, there are two options, and this is a vote design that we had a lot of success with; you would not have options but changes, I guess, like "Change 1", "Change 2", "Change 3", etc., and "Support change 1", "Oppose change 1", etc. If you believe the changes are very likely to be accepted, you can create a single "Support" and "Oppose", but then if someone objects to one change only, they have to vote down all changes via "Oppose". --Dan Polansky (talk) 09:25, 17 April 2020 (UTC)[reply]

As for the proposed series of edits, I for one am not convinced they constitute an improvement; others may differ. --Dan Polansky (talk) 09:30, 17 April 2020 (UTC)[reply]

Support; I like many of the changes, and am indifferent about some of the others. If you asked to make them all I would vote in favor I think. I am not sure I prefer bulleted lists to numbered lists, and I would make all of the {{mention}} uses into {{m}}, but the style changes I find good. - TheDaveRoss 12:58, 17 April 2020 (UTC)[reply]

I think DanP is right about how to approach it. DCDuring (talk) 20:48, 17 April 2020 (UTC)[reply]

Cool, thank you all for the feedback. The way I am planning to move forward is something along the lines of what Dan Polansky suggested, specifically one vote with multiple sections for different changes. Although I still have an attraction to grouping together edits, which is what I tried initially, I don't think it is the best idea. Again because of what Dan Polansky said as well as because I think having multiple voting sections would most clearly demonstrate where consensus exists and how strong it is, among other things. —The Editor's Apprentice (talk) 23:26, 17 April 2020 (UTC)[reply]

A possible problem I've hit upon: I've identified more than a dozen categories of general changes I want to propose, which would mean that a vote containing all of them would have more than 40 sections. That seems like a lot and I'd rather not flood Wiktionary:Votes, or anywhere else for that matter, with all of the content, nor ask anyone to navigate it. Given the absurdity of the situation, would it be smart to just start a vote suggesting that the protection level on policy pages be lowered as well as one that English Wiktionary adopt an edit notice akin to English Wikipedia's w:Template:Editnotices/Page/Wikipedia:Edit warring and its related rules? Afterwards I could just make the changes I think should be made and, if needed, another user could just revert them. The whole thing would look like English Wikipedia's BOLD, revert, discuss cycle, something I think is worth emulating. —The Editor's Apprentice (talk) 01:44, 18 April 2020 (UTC)[reply]

CFI's being basically vote-protected served us very well. It helped us bring CFI to the good state it is in over the years. Making cosmetic changes to a core policy is not essential; what is essential is to prevent making bad changes. The bad parts still present in CFI are for the most part from the time when people edited CFI using the bold method. If you feel strongly about what are arguably cosmetic changes, I would suggest you start with a vote proposing 7 changes in one vote; that one vote will surely not overflood anything. --Dan Polansky (talk) 09:55, 18 April 2020 (UTC)[reply]

I'd start with the 7 you hope to be the least controversial: it'd be a way to build trust. DCDuring (talk) 19:32, 18 April 2020 (UTC)[reply]

I have no doubt that the criteria for inclusion (CFI) having such a high protection level has been something that has worked well, but I believe that it having a lower protection level would allow for even better working, because such would allow it to be much more refined and polished. I agree with you that cosmetic changes are not essential, but I do believe that having well and clearly written core policy would make English Wiktionary much more accessible and help grow editorship since such would help to present an organized image for English Wiktionary. One thing that I do disagree with you about is the idea that preventing "bad" changes is essential. In contrast, I would argue that preventing them isn't what is essential, but instead making sure that they do not persist in the current revision of a page is what is essential. That is pretty much the de facto policy that already exists on most mainspace pages. The existence of the powerful revert option makes this philosophy easy to implement in reality and it can be supplemented by other tools, including blocks, when necessary. If you think that would lead to a burdensome future, I would be willing to personally adopt the responsibility of reverting "bad" changes. When I look at the revision at the time when the CFI became official, I see what you're saying. Many of the things that I am pointing to as problems have existed since the time the page was boldly edited. I think two other things are also important to note: first, over the nearly 14 years since then, many of these problems have remained unresolved by the WT:VOTE system. Second, many of the great things about the CFI that persist to today also originate from the time before it was subject to the WT:VOTE policy. Also, as to the suggestion about just proposing 7 changes at a time, that's a really good point! I guess I totally missed the kind of obvious fact that I don't have to propose all of my proposed changes at the same time.
Instead, as you said, Dan Polansky, I can just start multiple smaller votes. I agree with you DCDuring, starting with the smaller and less controversial path of doing small batches of votes before doing anything more drastic is probably a smart idea and a good way to build trust. —The Editor's Apprentice (talk) 20:36, 18 April 2020 (UTC)[reply]

The vote is now live at Wiktionary:Votes/2020-04/Style changes to the criteria for inclusion. —The Editor's Apprentice (talk) 21:18, 18 April 2020 (UTC)[reply]

`{{synonyms}}` broken

{{synonyms}} is broken. It only shows the following message:

Lua error in Module:nyms at line 56: The parameter "2" is required.

Backtrace:

[C]: in function "error"
Module:parameters:204: in function "process"
Module:nyms:56: in function "chunk"
mw.lua:518: ?
[C]: ?

For example:

{{synonyms|example}} produces:

Lua error in Module:nyms at line 56: The parameter "2" is required.

--BoldLuis (talk) 07:49, 17 April 2020 (UTC)[reply]

The first parameter is the language code: {{synonyms|en|example}}. The error message is confusing. – Jberkel 08:30, 17 April 2020 (UTC)[reply]

There are lots of templates that require the language code as first parameter, even some that make no current use of it. DCDuring (talk) 20:51, 17 April 2020 (UTC)[reply]

Since this is en.wikt (so don't accuse me of colonialism or whatever) I'd be tempted to assume lang=en by default if no other is specified. It's nicer for users than a big red warning message. Equinox ◑ 23:42, 17 April 2020 (UTC)[reply]
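To make the two behaviours in this thread concrete, here is a hedged Python sketch of positional-parameter handling. It is not the code of Module:nyms or Module:parameters; the function name, the tiny set of language codes and the heuristic of treating a recognised code in first position as the language are all assumptions for illustration. In strict mode a missing language code yields a message that names the real problem; with a default, {{synonyms|example}} would quietly be read as English.

```python
# Tiny illustrative subset; the real list of language codes is far larger.
KNOWN_LANGUAGE_CODES = {"en", "nl", "cs", "ru"}


def parse_synonyms_args(args, default_lang=None):
    """Split positional args into (language code, list of terms).

    If the first argument is not a known language code, fall back to
    default_lang when one is given; otherwise raise a descriptive error.
    """
    if not args:
        raise ValueError("a language code and at least one term are required")
    if args[0] in KNOWN_LANGUAGE_CODES:
        lang, terms = args[0], list(args[1:])
    elif default_lang is not None:
        lang, terms = default_lang, list(args)
    else:
        raise ValueError(
            f'the first parameter must be a language code, got "{args[0]}"'
        )
    if not terms:
        raise ValueError("at least one term is required")
    return lang, terms


if __name__ == "__main__":
    print(parse_synonyms_args(["en", "example"]))
    print(parse_synonyms_args(["example"], default_lang="en"))
```

The heuristic has an obvious weakness: a term that happens to look like a language code would be misread, which is one argument for keeping the language code mandatory and improving the error message instead.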

I dimly recall that lots of things worked that way in the good old days. I liked the idea that items that needed a language code other than "en" appeared in English categories, which tend to have the most users and would thus be likely to be corrected. Making a good correction is not always easy so it would be handy to have a pseudo-language code for "unknown language". Script codes are handy if that is known to the would-be corrector. DCDuring (talk) 19:29, 18 April 2020 (UTC)[reply]

I totally miss the days when you could just write {{cooking}}. I think about it a lot. Equinox ◑ 19:35, 18 April 2020 (UTC)[reply]

While we're reminiscing about the days of yore, I really enjoyed {{boozing}}, because {{lb|en|drinking}} just didn't cut it for me. --Vitoscots (talk) 19:44, 18 April 2020 (UTC)[reply]

Do you remember when there were pubs and restaurants, and you could sit out in the sun scribbling furiously on sheets of paper, and then bill it as "systems analysis"? Before the plague times. Equinox ◑ 19:53, 18 April 2020 (UTC)[reply]

I wasn't much of a furious scribbler in the pubs, myself. I just used them for their wifi so I could edit Wiktionary in an environment other than my house. --Vitoscots (talk) 19:59, 18 April 2020 (UTC)[reply]

Yeah your house is bloody awful, had to get out of there to do some proper Spanish editing. Equinox ◑ 20:00, 18 April 2020 (UTC)[reply]

Also I miss pretending to be a Czech whose English was pretty dodgy. --Vitoscots (talk) 20:05, 18 April 2020 (UTC)[reply]

Allowing Jyutping polysyllabic entries as non-lemmas

Under the current policy, Jyutping transliterations for Cantonese are only allowed for monosyllables, such as zoeng1, but not polysyllables, while Pinyin transliterations for Mandarin are allowed for both, as in zhāng and jǐnzhāng. I propose that Jyutping be given equal status with Pinyin, so that polysyllables are allowed as non-lemma entries, since Jyutping has acquired the status of the standard phonetic transliteration for Cantonese in Hong Kong, considering that:

it is developed by the w:Linguistic Society of Hong Kong;
it is used in the Cantonese Read-Aloud Test; and
recent linguistic papers written in English transliterate Cantonese in Jyutping.

There are no reasons for us to treat Pinyin and Jyutping differently. Jonashtand (talk) 08:49, 17 April 2020 (UTC)[reply]

No replies? Should I create a vote for this? Jonashtand (talk) 08:55, 23 April 2020 (UTC)[reply]

Whole bunch of rfdefs

Shiro1998 (talk • contribs) has been creating loads of empty Turkish pages like these, with no actual content, just {{rfdef}} and a link to another dictionary. How do we feel about it? --Vitoscots (talk) 00:13, 18 April 2020 (UTC)[reply]

On one hand, Wiktionary:Requested entries (Turkish) seems to be the indicated route for such requests. On the other hand, these requests tend to languish there as if in limbo, and the grunt work of creating the entries has now already been done – all that is left is to fill out the definitions. --Lambiam 08:46, 18 April 2020 (UTC)[reply]

I think it's a good idea – it lowers the barriers of contribution, someone proficient in the language can jump right in, without having to worry too much about template/formatting details. – Jberkel 08:55, 18 April 2020 (UTC)[reply]

I oppose volume addition of definitionless entries: semantics is the core of an entry. There was a discussion about it; let me see if I can find it. Serbo-Croatian Wiktionary is an example of a site that has many definitionless entries and low visit rate, making the point for me. --Dan Polansky (talk) 09:12, 18 April 2020 (UTC)[reply]
en.wiktionary already has much more traffic than the Serbo-Croatian Wiktionary, so there's that. And how can you be sure it's the reason why that wiki has low traffic? P U C – 09:23, 18 April 2020 (UTC)[reply]
@PUC: He can’t, and it’s wrong. The lack of definitions is reason for lack of traffic for Serbo-Croatian Wiktionary, the presence of definitionless entries is not. There is no reason why a definitionless entry would have worse effect than say a 404. So it is good to add these definitionless entries since they lower the barrier of contribution. His reliance on empirics is an empty vessel wherein he conjures up the causality that confirms his bias, while we all know from introspection of our own how dictionary users operate or need to operate. Fay Freak (talk) 09:49, 18 April 2020 (UTC)[reply]
The empirical approach is the scientific approach. It helps one to see beyond one's own mind. Should I go by my introspection, I would reach the same conclusion: definitionless entries are nearly worthless, and reduce the overall attractiveness of a project. But I should not trust my own hunches if I can do better; what if other people find useful what I find worthless? That's where empirical research helps. --Dan Polansky (talk) 10:43, 18 April 2020 (UTC)[reply]
I agree, definitionless entries are a major turn-off. Stinks low effort. Allahverdi Verdizade (talk) 22:05, 18 April 2020 (UTC)[reply]
A discussion was at Wiktionary:Beer parlour/2014/May#Pregenerating entries, resulting in no consensus. I believe I have collected empirical evidence suggesting that definitionless entries are an undesirable thing. --Dan Polansky (talk) 09:19, 18 April 2020 (UTC)[reply]
I support the creation of definitionless entries. It's much less effort to make them complete. Serbo-Croatian Wiktionary had a huge number of 101,315 such entries. Now it's down to 84,697. It's not bad for such a small project, especially considering that most speakers of Serbo-Croatian don't even recognise it as a language, so they don't want to work on that project at all. No empirical evidence, Dan. And previous creations of definitionless entries turned out to be a success, IMO. Some of the remaining Ukrainian entries are a little bit obscure or have obscure senses but it's good to be reminded sometimes that such maintenance categories do exist. --Anatoli T. (обсудить/вклад) 09:42, 18 April 2020 (UTC)[reply]
The analysis at User talk:Dan Polansky/2019#Definitionless entries in the Serbo-Croatian Wiktionary suggests the project is worthless. Let me quote: "[...], if we focus on Croatia and Bosnia and Herzegovina (Serbia uses a different script), it seems that 8 000 Croatian lemmas in the Croatian Wiktionary (^) produced more page views than all the 137 030 Latin-script Serbo-Croatian lemmas in the Serbo-Croatian Wiktionary (^), of which 84 720 are definitionless (^).". --Dan Polansky (talk) 10:37, 18 April 2020 (UTC)[reply]

Believing the above, in 6 years since 2014, Serbo-Croatian Wiktionary changed the number of defless entries from 101,315 to 84,697, that is, by 16,618. That does not really appear very hopeful to me. To the contrary, it suggests that if a large volume of definitionless entries is created (not 500, that is quite manageable), a large portion of them is going to remain definitionless for many years to come. --Dan Polansky (talk) 11:35, 18 April 2020 (UTC)[reply]
Just saying that the math is improperly applied here and @Dan Polansky’s mention of statistics is bloviating without understanding as usual. As all great contributors learn, for each language there are only few people who do the bulk, so it is a coincidence if rfdefs in a particular language are left unsolved, apart from the fact that “Serbo-Croatian” is a politically repudiated language category (and so, Serbian Wiktionary appears more successful in defining), which makes the developments on Serbo-Croatian Wiktionary particularly unfit for comparison. Fay Freak (talk) 16:48, 23 April 2020 (UTC)[reply]
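For readers following the numbers, the arithmetic behind the figures quoted in this sub-thread can be checked with a few lines of Python. The last step is a simple linear extrapolation, which is an assumption added here for illustration; nobody in the thread establishes that the rate is constant, and the reply above disputes that such a projection is meaningful at all.

```python
# Figures quoted in the discussion above.
defless_before = 101_315   # definitionless entries, earlier count
defless_now = 84_697       # definitionless entries, April 2020
years_elapsed = 6          # "in 6 years since 2014", as stated in the thread

resolved = defless_before - defless_now   # entries that gained a definition
per_year = resolved / years_elapsed       # average yearly rate

# Assumption: the rate stays constant in the future.
years_to_clear = defless_now / per_year

print(resolved)                  # 16618
print(round(per_year))           # 2770
print(round(years_to_clear, 1))  # 30.6
```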

If the entry adds nothing beyond the mere existence of the headword then it should go into a request list, like WT:REE. If there is something more (preferably something that shows that this is a word with a meaning, like proper citations) then rfdef is (IMO) appropriate: it's not much use for site users but it gives us a sound base to work from. I would struggle to consider an entry keepable if it had no definition and no citations, even if it had e.g. pronunciation. Equinox ◑ 11:40, 18 April 2020 (UTC)[reply]

^ —Suzukaze-c ◇◇ 20:05, 18 April 2020 (UTC)[reply]

Perhaps we can have an incubator space for entries that are in a state of development beyond a mere request but not yet ready for prime time. --Lambiam 16:09, 18 April 2020 (UTC)[reply]

I don't know what's wrong with stub entries. Wikipedia has them, and tracks them. We do the same with {{rfdef}}. Why throw out something that can be expanded? —Rua (mew) 16:41, 18 April 2020 (UTC)[reply]

These are substubs, not stubs, by my lights. What is wrong with them was explained in Wiktionary:Beer parlour/2014/May#Pregenerating entries; when present in large volumes, they present a disincentive for users to consult Wiktionary since the users would then associate Wiktionary with definitionless entries in their minds, which fail the most critical lexicographical purpose. The argument "Why throw out something that can be expanded" leaks: I could equally well create entries that would only contain further reading (no pronunciation, no inflection, no gender, no definition), and they would still be "something that can be expanded". --Dan Polansky (talk) 17:42, 18 April 2020 (UTC)[reply]

We could have a new namespace, like Draft:quirkafleeg, for these, though frankly I think it would become a mess of protologisms that nobody bothers to clean up. If someone is really keen to add a word then they should put in the absolute basic minimum work to create it, or else use our existing, good and successful Requested Entries features. Equinox ◑ 18:43, 18 April 2020 (UTC)[reply]

The average user probably doesn't know about the Requested Entries feature. --Vitoscots (talk) 19:45, 18 April 2020 (UTC)[reply]

That might be true but to me it suggests we should try to increase findability of RE and not that we should fall back on creating basically empty placeholder entries. Equinox ◑ 19:51, 18 April 2020 (UTC)[reply]

Like before, I argue the stubs created by Ivan Štambuk's User:StubCreationBot are not substubs, they contained a lot of useful information. E.g. in this revision the Ukrainian entry of мі́ра (míra) already contained the full inflection paradigm, pronunciation, gender, animacy and a reference link to a dictionary. I consider this not just useful but very useful, since adding manually 14 inflected forms for each term is not something many editors here are willing to do. Adding complex inflections is probably the most discouraging factor for contributors of some languages, especially if it's not automated. And adding a missing definition (translation) by a speaker of a particular foreign language entry may take seconds! --Anatoli T. (обсудить/вклад) 13:29, 19 April 2020 (UTC)[reply]

I'm not opposed to this if someone adds a bunch of other information to the entry, like etymology, pronunciation, and related terms, but I don't think this is a good practice in general. One of the problems with it is that it eliminates redlinks, which makes anyone who doesn't actually visit the page much less likely to add the word. Andrew Sheedy (talk) 00:13, 19 April 2020 (UTC)[reply]

The citation space seems to get used as a sort of draft namespace already. Citations:quirkafleeg – Jberkel 12:58, 19 April 2020 (UTC)[reply]

Here's my own rfdef, which I occasionally make: a Khmer suffix Template:km-l used in the middle of ឯកអគ្គរដ្ឋទូត ― ʼaek ʼakkĕəʼrŏətthaʼtuut ― ambassador, made of two parts ឯកអគ្គ ― ʼaek ʼakkĕəʼ ― the best and only and រដ្ឋទូត ― rŏətthaʼtuut ― embassy. The suffix is the second component of Template:km-l. I don't feel comfortable adding the definition of the complex prefix of Pali origin but having a respelling in the entry and etymology is still useful. Irregular native speakers won't be comfortable with respellings to get the right IPA but they will be able to provide the translation. Please tell me why you think it's useless. --Anatoli T. (обсудить/вклад) 00:54, 20 April 2020 (UTC)[reply]

Maybe we could have a robots.txt file hide definitionless entries? One point against having them is that if the link isn't red, editors who would have clicked the link and made a stub entry ignore it. Maybe we should have a script make them orange for editors?

If we could have a javascript make adding a definition as easy (this should be easier as well) as adding translations it could be worth it. Krm db (talk) 16:26, 23 April 2020 (UTC)[reply]
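As a rough illustration of the link-colouring idea, the decision logic might look like the sketch below. A real gadget would be written in JavaScript and would need page data from the MediaWiki API; this Python version only shows the classification step. The function name and the rule for what counts as "definitionless" are assumptions, not an existing Wiktionary feature.

```python
import re

# Assumption: an entry counts as definitionless when every top-level
# definition line consists only of an {{rfdef}} call.
RFDEF_ONLY = re.compile(r"^#\s*\{\{rfdef(\|[^{}]*)?\}\}\s*$")
SUBLINE_PREFIXES = ("#:", "#*", "##")  # examples, quotations, subsenses


def link_colour(wikitext):
    """Return 'red' for a missing page, 'orange' for a page whose definition
    lines are all {{rfdef}}, and 'blue' otherwise."""
    if wikitext is None:
        return "red"
    def_lines = [
        line for line in wikitext.splitlines()
        if line.startswith("#") and not line.startswith(SUBLINE_PREFIXES)
    ]
    if def_lines and all(RFDEF_ONLY.match(line) for line in def_lines):
        return "orange"
    return "blue"


stub = "==Turkish==\n===Noun===\n{{head|tr|noun}}\n\n# {{rfdef|tr}}"
print(link_colour(stub))
```

A page with at least one real definition alongside an {{rfdef}} stays blue under this rule, since it is no longer a pure placeholder.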

Yes, displaying in a different colour is not a bad idea. It depends on what contributors are looking for. It does add to the hidden categories, like Category:Requests for definitions in Russian entries (hmm, it has grown to 206).

I have added my own Russian definitionless entry, which was requested. Although I am a native speaker, I don't know how to define the mining term закопу́шка (zakopúška) (it has a Russian description in the talk page). So it does require human attention but it has a lot of information already and inflected forms can help find more about the usage of the word or the context. --Anatoli T. (обсудить/вклад) 03:22, 29 April 2020 (UTC)[reply]

Merging Malay and Indonesian headers

Given that the majority of the vocabularies between the two language varieties are shared and that the two varieties are largely mutually intelligible (at least in their written forms), I think that the Malay and Indonesian headers should be merged into a single "Malay" header, just like the single "Serbo-Croatian" header for Bosnian, Serbian, Croatian, and Montenegrin. Differences between Indonesian and Standard Malay can be dealt with simply by labeling the definitions that are applicable only for one of the varieties. Jonashtand (talk) 19:41, 19 April 2020 (UTC)[reply]

In order to discuss this, I think we need first to clarify the difference between Indonesian Malay and Indonesian language, which is something I asked here two years ago and never got a satisfying answer on. See also the following previous discussions:

Wiktionary:Votes/2012-12/Unified Malay
Wiktionary:Votes/2016-10/Unified Malay Revote (a vote that never actually happened)
Wiktionary:Beer parlour/2018/September#Malay as an ISO 639 macrolanguage

Is it time for a new vote? —Mahāgaja · talk 20:34, 19 April 2020 (UTC)[reply]

@Mahagaja I am not sure about all the varieties under Malay the macrolanguage as defined by ISO 639. The new header "Malay" that I propose only includes the standardised varieties of Malaysian (bahasa Malaysia, the official language of Malaysia) and Indonesian (bahasa Indonesia, the official language of Indonesia). Jonashtand (talk) 21:01, 19 April 2020 (UTC)[reply]

This may start another konfrontasi. --Lambiam 21:43, 19 April 2020 (UTC)[reply]

It would probably be worth considering if we had dedicated contributors sick of duplicating Malay and Indonesian contents, but as it is, it's worthless. Not much effort has been seen before and no effort is to be expected with this change. A lot of merging work would be required. Any previous merger required at least labelling of the other varieties under the same L2 header, so that no information is lost. --Anatoli T. (обсудить/вклад) 03:50, 20 April 2020 (UTC)[reply]

@Lambiam I don't think this will get political. "Malay" is just a language name.

@Atitarev Presently our Malay/Indonesian dictionary is far from being complete, lacking quite a number of basic words. I think it'd be better to do the merger now. If we leave it for now, it'd be even harder to merge in the future. Jonashtand (talk) 07:55, 20 April 2020 (UTC)[reply]

@Jonashtand: It's not your fault, but this discussion keeps coming up and nothing happens because of the lack of any serious commitment and because of the opposition. You can, of course, make a new vote (previous votes might serve as a template) and try to make your case. --Anatoli T. (обсудить/вклад) 10:00, 20 April 2020 (UTC)[reply]

@Atitarev I'd like to draw some consensus here before resorting to voting. Jonashtand (talk) 12:35, 20 April 2020 (UTC)[reply]

@Jonashtand: Sure, good luck. Re: "Malay" is just a language name. There's a lot in the name. It's a very sensitive matter. If you find the Chinese (passed), Serbo-Croatian (passed) and Norwegian (failed) votes, you will see that it's not so straightforward, but we had a lot of contributors. --Anatoli T. (обсудить/вклад) 22:26, 20 April 2020 (UTC)[reply]

What do our Malay/Indonesian editors think? @Tofeiku, Zulfadli51, Xbypass, ArdiPras95, Blazerlazer555, Heydari, Rex Aurorum?

Krm db (talk) 16:16, 23 April 2020 (UTC)[reply]

@Jonashtand: I am not a Malay editor but I would vote for merging since I do not see a case made for their being different enough languages. Seems like German and Austrian to me from what I have encountered. If there is ever a difference {{tlb}} / {{lb}} can be used but there mostly isn’t or one doesn’t know, just like in Serbo-Croatian, and the claims “this is Indonesian” resp. “this is Malay but not Indonesian” are regularly suspicious of being overly specific. Therefore I agree with the statements “I think it'd be better to do the merger now. If we leave it for now, it'd be even harder to merge in the future.” Fay Freak (talk) 16:37, 23 April 2020 (UTC)[reply]

@Mahagaja: Well, since you asked about the difference between them, I would like to quote the Sneddon (2003) statement in Wikipedia: "Standard Indonesian is confined mostly to formal situations, existing in a diglossic relationship with vernacular Malay varieties, which are commonly used for daily communication". From the Indonesian point of view, Indonesian Malay is any vernacular Malay language in Indonesia that is not Standard Indonesian [and that is spoken by Malays, such as Deli Malay, Jambi Malay, Palembang Malay, Riau Malay, and Pontianak Malay]. By the way, the Malay used from Dutch colonisation until the independence of Indonesia is better described as Dutch East Indies Malay rather than as Indonesian Malay, as in current practice. --Xbypass (talk) 21:32, 23 April 2020 (UTC)[reply]

@Lambiam: A language is a dialect with an army and navy. So I have to disagree that "Malay" is just a language name. There are significant political and sociocultural backgrounds behind the choice of the Indonesian language and the difference between Standard Indonesian and Indonesian Malay. The proposed merger of the two standards itself has political and sociocultural backgrounds. But, sure, good luck. --Xbypass (talk) 21:32, 23 April 2020 (UTC)[reply]

I have no opinion on this merger, as I, as a Malay speaker, do not actually understand much when listening to formal Indonesian news, due to the choice of words and grammar. FYI, the 2 main Malay dictionaries (Kamus Dewan of Malaysia and Kamus Bahasa Melayu Nusantara of Brunei) have Indonesian entries in them, labelled as Indonesian. Also, please list which "Malay" varieties you want to merge. Is it only ms and id, or does it include all the dialects? Some dialects are not mutually intelligible. --Tofeiku (talk) 01:56, 24 April 2020 (UTC)[reply]

@Tofeiku: As I mentioned, just the Malaysian and Indonesian standard varieties of Malay, but not the dialects. Jonashtand (talk) 12:59, 24 April 2020 (UTC)[reply]

@Xbypass, Tofeiku: That is very informative, thank you. If you don't mind, I'm also interested in your point of view as Indonesian/Malay editors (as opposed to speakers). Do you think the merger would help your work (by removing redundancy, allowing for a wider set of resources, enabling collaboration with editors across the border, ...) or hinder it (by introducing too much ambiguity, demanding excessive labeling, repelling users and editors for political reasons, ...)? Krm db (talk) 16:05, 24 April 2020 (UTC)[reply]

@Krm db: My only concern is that I do not know Indonesian well, and vice versa for Indonesian speakers. I would not know whether a word is used in Indonesian or not, or whether it is rarely used. Also, when I write example sentences, I would need to write the Indonesian version as well, which I cannot do either. --Tofeiku (talk) 04:11, 25 April 2020 (UTC)[reply]

@Krm db: This proposal is to merge the two standards of Malaccan Malay (i.e. the Malay language based on that of the Malacca Sultanate), namely (standard) Malay [which is spoken in Malaysia, Brunei and Singapore] and Indonesian, isn't it? First, the merged entries have to indicate words with similar spelling but different loanword sources (especially English vs Dutch loanwords). Second, the merged entries have to indicate the different diacritics on e, as Indonesia and Malaysia have different spelling rules for e [as did their colonial predecessors]. Third, Indonesian exists in a diglossic relationship with vernacular Malay varieties, but these vernacular varieties (such as Riau Malay, which is a descendant of Malaccan Malay and is said to have a pronunciation similar to [peninsular] Malay) have different degrees of intelligibility with standard Indonesian [though they form a dialect continuum], so it has to be decided whether they are merged into Indonesian, merged into Malay or listed under a different heading. Fourth, we have a limited number of people with enough understanding of both standards, as most Malay and Indonesian speakers prefer to switch to English rather than learn the other standard. --Xbypass (talk) 18:40, 25 April 2020 (UTC)[reply]

Last but not least, the socio-political issue and the ambiguity are still unresolved: using "Malay" as the header is not quite representative, as the proposal excludes other vernacular Malay languages and does not represent the Indonesian language clearly [as Malay has been used for Standard Malay (Bahasa Melayu Piawai)]. For example, at Pembicaraan:Perbedaan antara bahasa Melayu Baku dan bahasa Indonesia, the article was criticised for its ambiguous usage of "Malay" [and was subsequently edited to say "Standard Malay"]. --Xbypass (talk) 18:40, 25 April 2020 (UTC)[reply]

I think this proposal is wonderful, but it has to go through a thoughtful discussion, as it will cause significant change. Is there any alternative to this proposal while everyone thinks about it? --Xbypass (talk) 18:40, 25 April 2020 (UTC)[reply]

Malay speaker here. I too do not understand Indonesian well. Although many words have similar spelling I find that the meaning can be very different. Sometimes it can even cause misunderstanding. For example, I once asked "kamu duduk di mana" to an Indonesian friend. I wanted to ask "where are you staying" but he thought I was asking "where are you sitting". From what I can observe, it is not true that all meanings are the same. Some words have additional meanings that only exist in Indonesian but not Malay, or the other way round. For example, "perang" means war and also brown colour in Malay, which has same spelling but slightly different pronunciation. But, the word "perang" in Indonesia means only war and does not refer to brown colour. I check the Indonesian dictionary and "perang" is marked as not standard spelling of "pirang" which is a reddish brown or reddish yellow colour. I asked my Indonesian friend and he says "pirang" means blonde and brown color is "cokelat". But "cokelat" is spelled "coklat" in Malay and it usually means chocolate or chocolate colour rather than brown colour. So if you merge the two languages, there's going to be a lot of confusion when it comes to listing synonyms, because you need to know which meaning or spelling apply to which language. For example, perang/pirang/cokelat/coklat, it can be brown, chocolate, blonde or incorrect spelling depending on region. And there are also words in Indonesian with same spelling but slightly different meaning compared to Malay. For example, "mesra" is used to describe very close relationship between two persons such as friends in Malay but in Indonesian it is only used to describe intimate relationships. My advice is, only edit in the language you are familiar with. Please do not copy and paste definition from one language to another and duplicate it, unless you are skilled in both language. 
All this copying and pasting has caused people to think that the meanings are just duplicates and that both languages can be merged. In reality, the meanings are similar but not exactly the same. If you merge, it is going to confuse language learners because there are many words with slightly different meanings and spellings. So if you don't know both Malay and Indonesian, then stop copying and pasting definitions from one section to the other. 49.228.203.169 07:30, 26 April 2020 (UTC)[reply]

@49.228.203.169: Thank you very much for your information. I'd like to ask if there are a lot of such differences. If there are only a dozen or so (I think below 100 is acceptable), then the differences can still be conveniently pointed out using usage notes and labels. Jonashtand (talk) 17:57, 26 April 2020 (UTC)[reply]

But one wouldn’t even see that in a text; one often misunderstands without knowing. Tofeiku mentions he cannot “actually” understand much when listening to formal Indonesian news and I can’t “actually“ understand implications in Swiss or Austrian legal texts, and laymen can’t either, and without special knowledge one cannot detect the regiolect. How much can one reliably detect whether something is Indonesian or Malay, without knowing from circumstantial data that it is one or the other? For Serbo-Croatian texts it is often impossible to assign a text to one of the regional standards, hence it is one language. Fay Freak (talk) 09:35, 27 April 2020 (UTC)[reply]

@Fay Freak: I would rather ask for the definition of "circumstantial data". In most situations, it is easy to assign a text to Standard Malay or Indonesian, as the two standards have different vocabulary choices, especially when the texts use loanwords, while both dictionaries will list them as synonyms where they exist. I am talking about common texts (such as news), not specialised ones like legal texts. --Xbypass (talk) 23:09, 27 April 2020 (UTC)[reply]

@Xbypass: By circumstantial data I mean that one knows beforehand whether it will be Standard Malay or Indonesian, because one is in Indonesia or in Malaysia, or the book is printed there, or the top-level domain is a country domain. If one meets someone speaking Serbo-Croatian in Germany or the US, or finds a shorter text with a TLD not related to one of the countries (or even then it may be contradictory), then it is often not conspicuous which of the standards it is. The question is whether it is the same with Malay. If one has to exert oneself in an analysis to assign a text that one already understands to a language, then the distinction is not at the language level anymore.

With common texts it is the same problem as with legal texts in a lesser degree. News texts refer to the political systems; the names of functions and institutions and their structure may be different, so many related idioms and memes. Fay Freak (talk) 07:59, 28 April 2020 (UTC)[reply]

@Fay Freak: Even without the names of political systems, functions and institutions and their structure, we can easily assign a text because of the different choice of words and existing loanwords. For example, the sentence "Indonesia lapor 275 kes baru COVID-19" is Standard Malay because Indonesian would use kasus instead of kes, and so on, while the two forms differ minimally in grammar. --Xbypass (talk) 00:37, 29 April 2020 (UTC)[reply]
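The vocabulary-based assignment described above can be illustrated with a toy sketch. The marker words are limited to pairs mentioned in this thread (kes/kasus and coklat/cokelat); the function name and the tie-breaking rule are assumptions, and a usable list would have to be compiled by editors who know both standards.

```python
# Marker words taken only from examples given in this thread.
MALAY_MARKERS = {"kes", "coklat"}
INDONESIAN_MARKERS = {"kasus", "cokelat"}


def guess_standard(sentence):
    """Guess 'Malay', 'Indonesian' or 'undetermined' from marker words."""
    words = {w.strip('.,;:!?"').lower() for w in sentence.split()}
    malay_hits = len(words & MALAY_MARKERS)
    indonesian_hits = len(words & INDONESIAN_MARKERS)
    if malay_hits > indonesian_hits:
        return "Malay"
    if indonesian_hits > malay_hits:
        return "Indonesian"
    return "undetermined"


print(guess_standard("Indonesia lapor 275 kes baru COVID-19"))
```

A sentence without any marker word, such as "kamu duduk di mana" from the earlier comment, comes back as undetermined, which is exactly the case where the two sides of this discussion disagree.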

I don't speak either Malay or Indonesian, but I've spent a lot of time over the years helping the communities for all the merged languages here and I've seen what goes into making them work. I'd like to give an overview on what I think is involved.

First of all, let's remember that boundaries between languages are not based on empirical, black and white facts. Every speaker has a slightly different version of the common language, and that language is a dynamic compromise between all of those differences, that results from people using the language with each other. As social beings, we use differences in language to say things about ourselves: who we think we are, who we want to be identified with. A language is as much a social and political construct as a linguistic one.

Indonesia is a modern world country, but it's also one of the major language-diversity hot spots on the planet and it's been the meeting place of major cultures, religions and even civilizations- which means a lot of conflict over the centuries. The current state is relatively young, so matters of identity are very political and sensitive.

From a linguistic point of view, Indonesian is obviously no more different from some varieties of Malay than they are from each other. There's no strictly linguistic reason to keep them separate without splitting up Malay, too. That's the easy part.

We have to also consider the impact on the editors. A wiki is a community, and Wiktionary depends heavily on the sub-communities that do all the work in individual languages. This kind of change won't work unless the people who work on Malay and Indonesian entries are willing to make it work. Sure, you can run a bot and change all the headers, adding {{lb|ms|Indonesian}} and {{lb|ms|Malay}} where appropriate. You can even have admins revert any attempts to change the headers back. But that's just the first step. Here are some problems to solve:

What's a lemma, and what's a regional alternative form? In English, there are minor regional spelling differences like color vs. colour and center vs. centre. We have an informal understanding that we leave the pairs of entries the way they're first set up: one may have the US spelling as the main one, with all the definitions, etymology, etc, and the other one is the regional form. Another may be the other way around. If I see someone change one of these pairs from non-US main to US-main, I revert it. Our English, Australian or New Zealand admins will revert if they see US-main changed to non-US-main. Some agreement of this sort has to be reached in order to avoid endless edit wars.
What about labels and usage notes? This is where the differences are the greatest, and it's the hardest for those who know only one or the other. What may be archaic and obscure for one may be the only correct choice for the other.
What about subtle differences in the definitions? It's easy to deal with cases where regional differences result in different definitions, but sometimes the definitions overlap, or they're ambiguous. Merging definitions can be very tricky, unless you're familiar with the specific senses in both languages.
What about headwords? I notice that, for instance negeri has:
nêgêri (plural, first-person possessive negeriku, second-person possessive negerimu, third-person possessive negerinya) [Indonesian]

nĕgĕri (plural negeri-negeri, informal 1st possessive negeriku, impolite 2nd possessive negerimu, 3rd possessive negerinya) [Malay]

Here, the forms are the same, but the labeling is different. In some cases I suppose the forms might be different too, though the language structure makes that less likely.

I also noticed from About Indonesian that Indonesian words are supposed to have diacritics in the headword to distinguish between different types of "e": ê,é,è. It's not documented, but some Malay words have diacritics, too, as you can see here.
What about the templates? The Malay ones have to have provision for Jawi spellings, but the Indonesian ones don't. I'm sure there are irregular differences in what's listed, in what order, and how it's labeled, as well as whether there's a template at all. Not all of them can be kept separate, and I'm sure there would have to be discussion and compromise to come up with something that both works and fits the space. This is likely to make things more complicated and harder for new editors.
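To make the purely mechanical first step concrete (the bot pass that changes headers and adds {{lb|ms|Indonesian}}), here is a minimal sketch. It is an assumption-laden illustration, not an existing bot: the function name is made up, it only handles a page that has an Indonesian section, and it does not touch language codes inside other templates or merge with an existing Malay section, which are among the hard parts listed above.

```python
import re

L2_HEADER = re.compile(r"==\s*([^=]+?)\s*==")   # matches level-2 headers only
DEFINITION = re.compile(r"#(?![#:*])")          # skips #:, #* and ## lines


def relabel_indonesian_section(wikitext):
    """Rename a level-2 ==Indonesian== header to ==Malay== and prefix each
    definition line in that section with {{lb|ms|Indonesian}}."""
    out = []
    in_section = False
    for line in wikitext.splitlines():
        header = L2_HEADER.fullmatch(line)
        if header:
            in_section = header.group(1) == "Indonesian"
            if in_section:
                line = "==Malay=="
        elif in_section and DEFINITION.match(line):
            line = "# {{lb|ms|Indonesian}} " + line[1:].lstrip()
        out.append(line)
    return "\n".join(out)


page = "==Indonesian==\n===Noun===\n{{head|id|noun}}\n\n# country\n#: example"
print(relabel_indonesian_section(page))
```

Note that the {{head|id|noun}} line is left unchanged in this sketch; deciding what to do with such template calls is part of the work that cannot be automated without input from editors of both languages.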

Solving all of these requires communication and cooperation between both the Indonesian and Malay editors. It also requires a lot of tedious work by people who know what they're doing.

We will also have to thoroughly document our decisions with a detailed "About" page to make it easier for new/IP editors, and to minimize arguments.

Finally, there's the pushback from people both within and outside of Wiktionary who disagree. Both the Chinese and the Serbo-Croatian mergers have prompted vehement accusations of cultural genocide from Wiktionary editors. The Serbo-Croatian one still attracts intermittent edit-warring and patronizing lectures from IP editors.

To sum it all up, this isn't just a technical adjustment. It will require someone within the community itself who has the will and the expertise to push this through- against opposition and technical challenges. It will require lots of work that can't be done by anyone who doesn't know the languages. Sure, a merger makes sense in theory, but will it work? Do we have people who will make it work? If yes to both, we can substantially improve our coverage for both languages. But if no to either question, it could very well end up alienating or burning out our contributors and doing real damage to the dictionary.

I apologize for the massive wall of text, but we need to completely understand what we're getting into before we decide. Chuck Entz (talk)

Don't you see? This is the exact kind of problem your community is facing: editors who do not speak Malay or Indonesian insisting that both languages are the same and can be merged. First of all, if you are familiar with the language, you would know that diacritics are not used in written Malay or Indonesian. The diacritics are only used in dictionaries to distinguish between the mid central vowel (e pepet) and mid front unrounded vowel (e taling). Second of all, if you are linguistically adept, you would know that modern Malay and Indonesian are proscribed based on Malacca or Riau-Lingga Malay. Why Malacca or Riau-Lingga Malay? Because this is historically the homeland of the Malay people. During colonial times, the Dutch actively promoted this ideal form of Malay, rejecting words like "bisa" (able in Javanese, poison in Malay), while they were teaching Malay as a second language in the Dutch Indies region. After the Dutch left, many words that were already in colloquial use were absorbed into Bahasa Indonesia, the most controversial being "butuh" (must in Sundanese, penis in Malay). Meanwhile in the Malay peninsula, there are changes in pronunciation that are not reflected in spelling. The current spelling reflects ideal pronunciation which differs from colloquial pronunciation, the most common being "kasih" in "terima kasih" (thank you) which is pronounced "ka-seh" in Malay despite being spelled "kasih". More importantly, how can people claim that both languages are linguistically similar when they are only judging the written form of these languages? The written form is proscribed, and it looks similar because it has been standardized by MABBIM (Majlis Bahasa Brunei-Indonesia-Malaysia) but where meaning/definition is concerned, these have not been standardized. The differences that are recorded on the Wikipedia page are just random examples and are not even complete!
And speaking of differences, isn't this a descriptivist dictionary, rather than a proscriptivist one? If you do speak either Malay or Indonesian, you would know that colloquial speech is very different from formal written Malay and Indonesian. Many words in common use, such as "kecik" for small, are rejected in formal dictionaries because they do not reflect ideal spelling. And it doesn't help that people think that the written form of Malay/Indonesian is the same as the spoken form of Malay/Indonesian because they are comparing formal written Malay/Indonesian which is widely available on the Internet, but not the spoken form. So if anyone here does speak colloquial Malay/Indonesian, they would know that the written form, used only on formal occasions, differs a lot from the spoken form of Malay or Indonesian that is used in daily life, e.g. "ngak mau" in colloquial Indonesian, "tak mau" in colloquial Malay, "tidak mau" in formal written Indonesian, "tidak mahu" in formal written Malay, all with the same meaning "don't want". Also, colloquial spelling has not been standardized, so there are all sorts of variants. So if you're proscribing the language, go ahead and merge both languages and reject all the informal and incorrect spellings that are commonly used in daily speech. 172.103.227.106 05:26, 27 April 2020 (UTC)[reply]

Using diacritics in this example is not a big deal. They can still be used for display purposes, as we do with Latin and with accents in various Cyrillic-based languages. E.g. nĕgĕri would be helpful for pronunciation. The pronunciation section would cater for various standards and dialects. Whatever is developed can be used.

One of the points User:Chuck Entz made was the presence or lack of will and interest.

You can use, following your examples,

tidak mahu (standard Malay) ― don't want
tidak mau (standard Indonesian) ― don't want
tak mau (colloquial Malay) ― don't want
ngak mau (colloquial Indonesian) ― don't want

Generic or language-specific templates would be able to handle all that. They can default to standard Malay.

I don't think contributors would be required to know each variety, but they would need to be aware of their existence, so that if they add a term or an example, they mark it appropriately. For casual visitors it's hard to grasp, but we have done complex mergers. Take a look at Chinese 歷史／历史 (lìshǐ), for example. It covers multiple mutually unintelligible Chinese topolects.

Now let's have a look at usage examples:

他們是學生／他们是学生 ― tāmen shì xuéshēng ― they are students
佢哋係學生／佢哋系学生 [Cantonese] ― keoi⁵ dei⁶ hai⁶ hok⁶ saang¹ [Jyutping] ― they are students

Let's analyse the above. The first example is standard Chinese or Mandarin (default), the second is Cantonese and labelled as such. In the examples, both traditional and simplified Chinese are displayed. The pronunciations, transliterations and even some words are different between standard Chinese and vernacular Cantonese. Of course, something like this can be developed for varieties of Malay and Indonesian.

Serbo-Croatian is written in two scripts. Croatians and many Bosnians, even some Serbs can't even read Cyrillic but the contents are duplicated in two varieties and there could be further dialectal differences - Ekavian/Ijekavian, a simple word for "river" has four forms (there could be more!)

ре́ка (Ekavian), рије́ка (Ijekavian) (Cyrillic); réka (Ekavian), rijéka (Ijekavian) (Roman).

As you can see, under one L2 header (==LANGUAGE NAME==) different varieties, scripts and dialects can fit OK. All you need is the will to make it work. There will be haters but benefits outweigh downsides. --Anatoli T. (обсудить/вклад) 07:09, 27 April 2020 (UTC)[reply]

(edit conflict) While I'm not familiar with the details, this kind of thing is fairly common. You have a mass of spoken dialects that could be grouped into multiple mutually-unintelligible languages, but which are considered by their speakers to be the same language due to their all being taught a single standard language. Serbo-Croatian is the result of Serbs, Croats, etc. deciding to base their standard languages on a single Croatian dialect. There's huge dialectal variation, but the standard languages are very close. The mainland Scandinavian languages had their standard languages converge on Danish for centuries, though the Norwegians came up with a competing standard that's much less like Danish. Chinese is an extreme example: a large family of spoken languages uses a single writing system that has all the words spelled mostly the same, though the grammar and pronunciation are completely different. What's more, completely unrelated languages like Japanese, Korean and Vietnamese also use the same writing system as part of their standards so there's a lot of the written vocabulary that's comprehensible to all of them and all of the Chinese. If you put someone from Morocco in the same room with someone from Iraq, they won't understand each other until they switch to standard Arabic.

The problem is that we're a written dictionary, and our sources are in writing. Written standard Malay and written standard Indonesian are basically the same language. The part of Malay and Indonesian that's not part of the standard is extremely complex, and I doubt the distribution of all that variation corresponds very closely with that of the standard languages. I would be very surprised if there weren't a lot of Malay speakers speaking the same dialect at home as some Indonesian speakers. We don't have enough data to organize everything around the spoken variation, so the best we can manage is starting from the written standard(s) and attaching all the information we can get to that.

To put it another way: neither the two-language nor the one-language model reflects reality, so we might as well go with what works the best for what we do have. By the way, a couple of points: I think you have proscribed (forbidden) and prescribed (required) confused. Also, I was well aware that diacritics aren't part of the standard spellings. I was talking about the headword, which is where we display dictionary-specific information about the term. The problem I was highlighting was that our Indonesian and our Malay entries use different systems of diacritics in the same place. Oh, and one last technical quibble: I hope you have a real good reason for using an anonymous proxy, because we normally block those on sight. Chuck Entz (talk) 07:20, 27 April 2020 (UTC)[reply]

Serious problems with the merger haven't been articulated yet, but I think if the headers had inflections, and if they differed for Malay/Indonesian (they don't), it would create an additional hurdle (which can be solved too). As for words that are only used in one variety, a potential solution would be just labelling the term. Indonesian has many borrowings from Dutch, but Malay has mostly borrowings from English. In such cases, words like bangkrut would need to be labelled {{lb|ms|Indonesia}}.

I want to present another example of multiple varieties under one L2 header. Have a look at the simple Persian word امام (emâm, “imam”). Multiple pronunciations are displayed by one template, including Tajik, Dari and colloquial Tehrani. The entry includes the Tajik spelling имом (imom) (it's not merged with Persian). Persian and Dari share the spelling of this word but the pronunciation is different; even the transliteration for Dari and Classical Persian would be "imâm", not "emâm", but the current compromise is to use the Iranian Persian transliteration. --Anatoli T. (обсудить/вклад) 07:47, 27 April 2020 (UTC)[reply]

The diacritic issue is the easiest problem to solve. There should be no diacritics in the headword; the diacritics should be written in the pronunciation section. However, the problem does not lie there. --Xbypass (talk) 22:42, 27 April 2020 (UTC)[reply]

@Chuck Entz: "neither the two-language nor the one-language model reflects reality", so the one-and-a-half-language model proposal does not reflect reality either, as this proposal puts Standard Malay and Indonesian as dialects of Malay but keeps the other vernacular and creole Malays separate. There are two possible solutions: either the two-language model (i.e. keep every dialect of Malay, including (standard) Malay and Indonesian, separate) or the one-language model (i.e. every subdivision under Malay (ms) is relegated to labels and usage notes). --Xbypass (talk) 00:14, 28 April 2020 (UTC)[reply]

But that sounds like “Indonesian” and “Standard Malay“ are varieties created by proscription where the colloquials can be from times more different and from times closer again, irrespectively of nation borders? Which speaks for a merge. Fay Freak (talk) 09:18, 27 April 2020 (UTC)[reply]

Your edit summary was easier to understand than the actual post :) What does "Which speaks for a merge" mean?

I'll try, anyway. If you look at 歷史／历史 (lìshǐ), it includes a number of Chinese varieties but there is no Gan, Xiang, Jin, Min Bei, etc. Does it mean that the word for "history" is different in these lects? No, not necessarily, it's not clear, it only means nobody has added them yet. Editors don't need to know ALL standards, there may be no data or a dictionary available elsewhere. So, if an Indonesian speaker adds a word and he/she doesn't know any Malay (doesn't want to check against another dictionary), they can label it as "Indonesian", another editor may add a label "Malaysian", "colloquial Malaysian", "rare Malaysian" or remove the label altogether, if this is to mean "applicable to both varieties".

This begins to sound more and more strange. If an Indonesian speaker adds a word and doesn't know any Malay... what do you mean, doesn't know any Malay? If the language is the same variety, of course he already knows Malay. And is all this labeling necessary? If UK people edit an English entry, do they need to label all the words UK because they don't know US English? And why label Malaysian? The Malay language is not the Malaysian language! It is the language spoken by ethnic Malays in South East Asia. If you don't know the language, of course you need to check a dictionary! --Ismail 10:06, 27 April 2020 (UTC)

I think you should differentiate Malay and Standard Malay before anything else. Malay is a supralanguage. If you say that Indonesians don't know Malay, that is wrong, because all Indonesians know Indonesian (as a part of the Malay supralanguage). However, it is different if we talk about Standard Malay (Bahasa Melayu Piawai, the standard used in Brunei, Malaysia and Singapore), as most Indonesians do not know Standard Malay, so most of them fall back to English to communicate (except for Malays who live in Sumatra or Borneo, probably). So, if this merger goes on, the labelling is necessary. However, if you define Malay as the language spoken by ethnic Malays in South East Asia, we should not continue the proposal, as the majority of Indonesian speakers are not Malay. --Xbypass (talk) 22:42, 27 April 2020 (UTC)[reply]

To IP 49.228.11.105 - @Ismail: Of course, if words are shared by both, no label is needed, just like with English terms. What if they aren't, or the editor doesn't know? Labelling first would be a start, especially if a large number of Dutch loanwords into Indonesian suddenly get a "Malay" header. A thorough editor would check if a term also applies to both Malaysian and Indonesian; that's what was done, and is still being done, by editors with other mergers. --Anatoli T. (обсудить/вклад) 22:48, 27 April 2020 (UTC)[reply]

Take a look at 佢哋 (keoi5 dei6): it has a label for two lects, Cantonese and Pinghua. We only know for sure that the word is used in Cantonese and Pinghua, but we only see the Cantonese pronunciation of the word. Now, another example: Serbo-Croatian Európa is labelled "Croatia" and belongs to Category:Croatian Serbo-Croatian, but the terms Евро́па and Evrópa are labelled (Bosnia, Serbia) and belong to their appropriate categories. This is basically how a language merger works by design. I'm sure it would work the same or a similar way with Malay/Indonesian, if some thought and effort are put in. More templates, categories and labels, but less repetitive work for a whopping 90% of words. --Anatoli T. (обсудить/вклад) 10:06, 27 April 2020 (UTC)[reply]
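
To make the label mechanism concrete, here is a small Python sketch of how regional labels could map onto categories under one merged header. It is not how the actual label modules work; the table, the function name and every category name other than Category:Croatian Serbo-Croatian are assumptions made up for illustration, and a term with no regional label is simply treated as shared by all varieties.

```python
# Hypothetical label table: (language header, regional label) -> category name.
# Only "Croatian Serbo-Croatian" comes from the discussion; the rest are guesses.
LABEL_CATEGORIES = {
    ("Serbo-Croatian", "Croatia"): "Croatian Serbo-Croatian",
    ("Serbo-Croatian", "Bosnia"): "Bosnian Serbo-Croatian",
    ("Serbo-Croatian", "Serbia"): "Serbian Serbo-Croatian",
    ("Malay", "Indonesia"): "Indonesian Malay",
    ("Malay", "Malaysia"): "Malaysian Malay",
}


def categories_for(language, labels):
    """Return the category pages a term with these regional labels is filed under.

    An empty label list yields no regional category, i.e. the term is
    assumed to apply to every variety under the merged header.
    """
    categories = []
    for label in labels:
        name = LABEL_CATEGORIES.get((language, label))
        if name is not None:  # unknown labels are ignored in this sketch
            categories.append("Category:" + name)
    return categories


if __name__ == "__main__":
    print(categories_for("Serbo-Croatian", ["Bosnia", "Serbia"]))
    print(categories_for("Malay", ["Indonesia"]))
    print(categories_for("Malay", []))
```

The point of the sketch is only that one entry under one header can carry several regional labels, each of which files it in its own category, while unlabelled entries need no extra work.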

Excuse me, but this is a discussion about the Malay and Indonesian languages. Please don't mislead others by discussing Chinese or Cantonese or Serbo-Croatian. How can you compare Malay with Chinese when the writing system is not the same? It is like comparing apples and oranges. If an apple is easy to plant, will an orange also be easy to plant? Of course Malay and Indonesian can be merged, but what are you going to name it? If you rename Indonesian as Malay, it is going to be very controversial. I suggest renaming it South East Asia Malay to be fair to all people, because it is a language spoken in South East Asia and language has no borders. --Ismail 10:06, 27 April 2020 (UTC)

@Lambiam, Jonashtand, Ismail: A language is a dialect with an army and navy; as an army and navy have borders, so a language has borders. Even without an army and navy in the first place, dialects have borders; otherwise we would not get British English, American English and so on, or in this case Brunei Malay, Kelantan Malay, Riau Malay, Deli Malay, Musi Malay, et cetera. This heading (L1) problem has not been addressed enough for a merger to happen, as I wrote previously. The proposal to rename it South East Asia Malay is no less controversial than the proposed Malay one, as the other vernacular Malays are still not addressed (these Malays are spoken in South East Asia too) and it does not represent the Indonesian language clearly [as Malay has been used for Standard Malay (Bahasa Melayu Piawai)]. --Xbypass (talk) 22:42, 27 April 2020 (UTC)[reply]

To IP 49.228.11.105 - @Ismail: You're not able to understand the analogy? Don't put this apples and oranges stuff here if you can't follow. You want to learn by your own mistakes? Fine by me. I'm not the one pushing this agenda. What does the writing system have to do with it? Chinese varieties share the writing system but the words used may be different, and they may have different senses. Serbo-Croatian has regional varieties as well. Doesn't that apply to Malaysian and Indonesian differences? Doesn't Malay also use Arabic Jawi, which is not applicable to Indonesian?

You're free to suggest new names for the merged language in the vote but I don't like your chances. --Anatoli T. (обсудить/вклад) 22:48, 27 April 2020 (UTC)[reply]

Yes, I am sorry if you do not like my suggestion. --Ismail 10:55, 28 April 2020 (UTC)

@Jonashtand, Tofeiku, Xbypass, Krm db I am a native speaker of Malay (Pontianak dialect) and a speaker of Standard Indonesian (I use it for daily conversations). I have interacted with a dozen Malay dialects. I have also read some Malay dialect dictionaries published by Pusat Bahasa, and a few for Malay dialects from Malaysia in particular. I support the proposal to merge Malay and Indonesian. --Rex Aurorum (talk) 06:04, 28 April 2020 (UTC)[reply]

Standard Malay is Standard Bruneian Malay, Standard Malaysian Malay and Standard Singaporean Malay. So, just labelling as Malaysia is not good enough. --Tofeiku (talk) 06:50, 28 April 2020 (UTC)[reply]

We've got a bunch of regional labels. They just keep growing. --Anatoli T. (обсудить/вклад) 07:27, 28 April 2020 (UTC)[reply]

Yes, I agree that labeling as Malaysia is not enough, because the same standard form is also used in Singapore and Brunei, so it is not fair to label "Malaysia" and ignore other regions. The official dictionaries used for Malay exams in Singapore are all published by the official language authority in Malaysia (Dewan Bahasa dan Pustaka) https://www.seab.gov.sg/home/examinations/approved-dictionaries and Brunei Malay newspapers https://www.google.com/search?&q=brunei+malay+newspaper use the same standard as Malay newspapers in Malaysia and Singapore. There is no difference between the official Malay language in Brunei, Singapore or Malaysia. See if you can prove me wrong. --Ismail 10:55, 28 April 2020 (UTC)

Well, except for minor things, the standard Malay in Brunei, Malaysia and Singapore is the same. The notable exception, as far as I know, is the translation of "government", which is kerajaan in Malaysian and Bruneian Standard Malay and pemerintah in Indonesia and Singapore, for historical reasons. So, I prefer adding Brunei and Singapore labels rather than scrapping the Malaysian label. --Xbypass (talk) 00:37, 29 April 2020 (UTC)[reply]

Here's a list of previous votes, passed and failed.

Wiktionary:Votes/pl-2014-03/Unified Norwegian (failed) - Bokmål and Nynorsk use different entries but "Norwegian" language code is still allowed, apparently too much difference in inflections and complex headers - different genders, plurals, etc.
Wiktionary:Votes/pl-2014-04/Unified Chinese (passed) - only for Chinese character-based varieties.
Wiktionary:Votes/pl-2009-06/Unified_Serbo-Croatian (passed)
Wiktionary:Votes/2011-10/Unified Romanian (passed) - Moldovan/Moldavian language code is not allowed but Romanian Cyrillic entries exist.

Read them and read the talk pages and the related discussions, in case you want to make a new vote for a Unified Malay. --Anatoli T. (обсудить/вклад) 07:17, 28 April 2020 (UTC)[reply]

Hindi and Urdu - a Unified Hindustani vote was never attempted. Not enough interest was generated, and there was some opposition.

A Unified Albanian vote never occurred. Contributors are happy with Gheg and Tosk being used under the same L2 header.

A Unified Arabic vote never occurred. Contributors oppose. Too much complexity and grammar, and no desire to merge classical Arabic with colloquial dialects. The complexity of grammatical inflections and the differences in headers and transliterations make it very difficult.

A unified Persian vote never occurred but Dari and Persian are both treated as Persian. Tajik is separate. --Anatoli T. (обсудить/вклад) 07:27, 28 April 2020 (UTC)[reply]

hihihi... look what I just found: the vote already failed TWO times, in 2012 and 2016, and now all of you want to make a new vote again, and then the policy becomes permanent. lol: wait 4 years, fail, vote again, fail, vote again, until it succeeds.

https://en.wiktionary.org/wiki/Wiktionary:Votes/2012-12/Unified_Malay

https://en.wiktionary.org/wiki/Wiktionary:Votes/2016-10/Unified_Malay_Revote --Ismail 10:55, 28 April 2020 (UTC)

Nobody really waited, LOL. The votes failed and nothing happened. Thanks for digging them up. I forgot to mention previous votes. It may be useful to look at why they failed. --Anatoli T. (обсудить/вклад) 10:37, 28 April 2020 (UTC)[reply]

Well, let them challenge the status quo. If this fails, we will have the same vote two or four years later, so don't worry, be happy. --Xbypass (talk) 00:37, 29 April 2020 (UTC)[reply]

@Xbypass: Well, good luck. Nobody can say you don't have any previous experience. You can learn from previous failures. A Malay/Indonesian merger may be quite sensitive, comparable with Serbo-Croatian, where people were shot based on their identity. Linguistically and scientifically, based on what we know, it makes sense, though, despite known differences. You can start drafting the vote without setting the date, collecting related discussions (including this one) and the rationale. Other people will join, the discussion will continue and everybody will be able to respond to opponents. You will need to make promises of preserving the contents, so that the merger would help both Malay and Indonesian contributors, not make their life harder. --Anatoli T. (обсудить/вклад) 02:47, 29 April 2020 (UTC)[reply]

@Mahagaja: I actually created some Indonesian Malay entries before. And to explain, the only reason I made them was to integrate Indonesia-specific words into the Malay entries, so that in Malay entries, one could see a complete dialectal variation on word usage between Indonesia, Malaysia, and Brunei. So that means several, if not all, of the entries in Wiktionary under Indonesian Malay are just Indonesian-specific words or senses not found in Malaysia. Not sure if there are some entries not under the category which I explained, because technically there are some Malay dialects within Indonesian territory, but I think if ever they're already in Wiktionary, they should be labelled under the name of their specific dialect, not just a country name like Indonesia. --Mar vin kaiser (talk) 09:48, 22 November 2021 (UTC)[reply]

Rhymes of non-lemma words

Hi. Sorry if this isn't the place for this doubt, I never know where to ask things. I was wondering if it's alright to add non-lemma words to rhymes pages. Thanks in advance. --Pablussky (talk) 23:27, 19 April 2020 (UTC)[reply]

I think it's fine--non-lemmas rhyme just as much as lemmas. —Justin (koavf)❤T☮C☺M☯ 00:58, 20 April 2020 (UTC)[reply]

If you're going to start doing this it's time to move the rhyme infrastructure to categories instead of the ridiculous system that we have now. DTLHS (talk) 01:00, 20 April 2020 (UTC)[reply]

@DTLHS: What do you mean, move it to categories? I really don't know how to do that. --Pablussky (talk) 08:23, 20 April 2020 (UTC)[reply]

I don't think he means it's something you need to do right now; rather, I think he wants to eliminate pages like Rhymes:English/iːl and instead have categories like Category:English words rhyming with /iːl/. But that's not the infrastructure we have set up yet. —Mahāgaja · talk 09:41, 20 April 2020 (UTC)[reply]

I see. That probably would need a vote, I guess? --Pablussky (talk) 11:59, 20 April 2020 (UTC)[reply]

It wouldn't need a vote to add categorization to the {{rhymes}} template, though it would certainly be worth having a discussion at least before changing where that template links to, and certainly before getting rid of the existing pages. - TheDaveRoss 12:31, 20 April 2020 (UTC)[reply]

It needs a little more thought than that since we would want to keep the syllable number information. Anyway the motivation for moving to categories in this case is the potential for overwhelming numbers of rhymes on one page if you start adding non-lemmas. DTLHS (talk) 16:59, 20 April 2020 (UTC)[reply]

Are quotes from the Wycliffe Bible okay for English?

See, for example, botch, which quotes Wycliffe's translation of the second chapter of Job. However, {{RQ:Wycliffe Bible}} is listed as a Middle English quotation template. Is it a usable source for English words? grendel|khan 06:50, 20 April 2020 (UTC)[reply]

No, Wycliffe's Bible is in Middle English. We have a lot of English words that give quotes from Chaucer, too, but that's also wrong. I think they were taken over from older (now public domain) dictionaries that don't distinguish between Middle and Modern English, but we do. —Mahāgaja · talk 07:10, 20 April 2020 (UTC)[reply]

It's true. We have some undated Chaucer quotes in Category:Requests for date/Chaucer, but they're probably all in the English section. --Vitoscots (talk) 11:36, 20 April 2020 (UTC)[reply]

@Mahagaja: Ah, thank you! Should we remove those quotations/requests on sight, or create a "Middle English" section for any such words and move the quotes there? grendel|khan 15:57, 20 April 2020 (UTC)[reply]

Creating a Middle English section/entry is the less destructive approach. —Mahāgaja · talk 15:58, 20 April 2020 (UTC)[reply]

I've asked Erutuon to make a list of all Chaucer quotations in the English section at Wiktionary:Todo/English Chaucer. Eru is normally pretty good at these kinds of things. --Vitoscots (talk) 17:26, 20 April 2020 (UTC)[reply]

Long overdue. Thanks. BTW, you can use regex searches to find instances yourself. See Preferences, Gadgets, Editing tools. If you don't know regexes, they are worth learning about IMHO. DCDuring (talk) 19:48, 20 April 2020 (UTC)[reply]
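
For anyone who would rather script this than use the search gadget, here is a rough Python sketch of the same regex idea. It assumes you already have an entry's raw wikitext as a string; the function names and the sample entry are made up for illustration, and the pattern only catches quotation lines (those starting with #*) that mention Chaucer by name, so it will miss quotes added through other templates.

```python
import re

# Matches the body of the ==English== L2 section, up to the next L2 header
# (a line starting with exactly two equals signs) or the end of the page.
ENGLISH_SECTION = re.compile(
    r"^==English==\s*$(.*?)(?=^==[^=]|\Z)", flags=re.M | re.S
)

# Quotation lines start with one or more '#' followed by '*'.
CHAUCER_QUOTE = re.compile(r"^#+\*.*Chaucer.*$", flags=re.M)


def english_section(wikitext):
    """Return the text of the ==English== section, or '' if there is none."""
    match = ENGLISH_SECTION.search(wikitext)
    return match.group(1) if match else ""


def chaucer_quotes_in_english(wikitext):
    """Return quotation lines mentioning Chaucer inside the English section."""
    return CHAUCER_QUOTE.findall(english_section(wikitext))


if __name__ == "__main__":
    # Made-up sample entry, not copied from any real page.
    sample = (
        "==English==\n\n===Noun===\n# A sore.\n"
        "#* '''c. 1390''', Geoffrey Chaucer, ''Some Tale''\n"
        "#*: quoted text\n\n"
        "==Middle English==\n\n===Noun===\n# A sore.\n"
        "#* Geoffrey Chaucer, ''Another Tale''\n"
    )
    for line in chaucer_quotes_in_english(sample):
        print(line)
```

Only the quote under the English header is reported; the one already sitting under Middle English is left alone, which is the distinction the todo list is after.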

You mean take away from Erutuon the simple pleasure of fulfilling my requests? That's what they live for, DCD! --Vitoscots (talk) 22:32, 20 April 2020 (UTC)[reply]

@Vitoscots, Mahagaja, DCDuring, Erutuon: I think {{RQ:Mlry MrtDrthr}} might be another example; Wikipedia says that the book is in Middle English, but it's categorized as an English template. Is this worth another todo list? (And a template rename to something more legible?) grendel|khan 05:08, 22 April 2020 (UTC)[reply]

@Grendelkhan: Yeah, Le Morte Darthur should be Middle English because it was published in 1485 and Wiktionary officially ends Middle English at 1500 (WT:AENM). Oddly the other template, {{Template:RQ:Malory Le Morte Darthur}}, is already categorized as a Middle English template. — Eru·tuon 06:19, 22 April 2020 (UTC)[reply]

It always used to be the case that ME citations were allowable in English entries for senses which survived into Modern English, thus illustrating the range of a sense's usage. I personally think this is very desirable (and in any case, lexicographically speaking, I would see ME as a temporal phase of "English" rather than a separate language, since there is no hard boundary in citations between the two). Ƿidsiþ 06:46, 22 April 2020 (UTC)[reply]

Maybe it was not uncommon, but it isn't correct. You might want to take a look at this:

WT:English entry guidelines#Etymology: "The ancestors of Modern English (1500 to present) are, in order: Middle English (1066 to 1500), Old English (450 to 1066), Proto-Germanic, and Proto-Indo-European."
WT:About Middle English: "Middle English was the form of English spoken in England between 1150 and 1500."
WT:Quotations#How to choose a quotation: "All quotations should be from works written in the language of the word in question [...]. If a word is one which is known to have been coined [...] in another language [...], that information should go in the etymology section [...]."
WT:List of languages: "en English" & "enm Middle English"

--Bakunla (talk) 08:59, 27 April 2020 (UTC)[reply]

It was "correct", since this convention pre-dates the Quotation guidelines. But in any case, the rules are there to consolidate our practice, not the other way round. The question is whether Middle English citations can help to illustrate an entry in English; to me it's clear that they can, but I understand this is not an opinion shared by everyone. Ƿidsiþ 06:09, 28 April 2020 (UTC)[reply]

It is, incidentally, complete nonsense to pretend that there is a difference in the language of texts produced in 1450 and in 1550. There is not. Ƿidsiþ 06:10, 28 April 2020 (UTC)[reply]

fr-noun

The plural of CHSLD is CHSLD (no s; it's the general rule in French for this kind of noun). I was willing to correct it, but I only found a template, fr-noun: the s is added magically. This kind of template is counter-productive, as it makes correction by casual readers impossible. Lmaltier (talk) 16:47, 20 April 2020 (UTC)[reply]

Thanks for pointing it out. It's easy to fix! --Vitoscots (talk) 17:27, 20 April 2020 (UTC)[reply]

Milton's been dated

In case you didn't know, Grendelkhan (talk • contribs) and I have pretty much dated all the John Milton quotes in Wiktionary. Generally I did the easy bits and Grendel tidied up after my sloppiness. There were around 750 of them in total. Yet again, another wikiproject nailed, and wikisatisfaction levels are high once more. Now what's my next wikigoal gonna be? --Vitoscots (talk) 22:48, 20 April 2020 (UTC)[reply]

How about clearing Category:ParserFunction errors? Most of them should look quite familiar... Chuck Entz (talk) 02:30, 21 April 2020 (UTC)[reply]

Got them. Thanks, they weren't really my fault. The original template creator did some weird things. --Vitoscots (talk) 10:07, 21 April 2020 (UTC)[reply]

I'll still be adding page links and doing some consolidation and cleanup going forward, but this is a marvelous milestone! Great work! grendel|khan 06:42, 22 April 2020 (UTC)[reply]

Open access to US govt.-funded peer-reviewed research

I think it is in the interests of Wiktionary to respond to a US government request for information about public access to unclassified US govt.-funded peer-reviewed research.

Here is the link to the request for information.

They are seeking responses to four broad questions, but also seem open to other information that bears on the subject:

"What current limitations exist to the effective communication of research outputs (publications, data, and code) and how might communications evolve to accelerate public access while advancing the quality of scientific research? What are the barriers to and opportunities for change?

"What more can Federal agencies do to make tax-payer funded research results, including peer-reviewed author manuscripts, data, and code funded by the Federal Government, freely and publicly accessible in a way that minimizes delay, maximizes access, and enhances usability?

"How can the Federal Government engage with other sectors to achieve these goals?

"How would American science leadership and American competitiveness benefit from immediate access to these resources? What are potential challenges and effective approaches for overcoming them? Analyses that weigh the trade-offs of different approaches and models, especially those that provide data, will be particularly helpful.

"Any additional information that might be considered for Federal policies related to public access to peer-reviewed author manuscripts, data, and code resulting from federally supported research."

Our interests are presumably limited to straightforward free access to the text by search, preferably seamlessly integrated into some existing large corpus of scholarly and/or policy research. Obviously, Google Scholar comes to mind, but we would want to be open to other possibilities. WMF will almost certainly be responding, but I see no reason for us not to express ourselves on our somewhat peculiar needs. Any thoughts? DCDuring (talk) 01:34, 21 April 2020 (UTC)[reply]

Never mind. Too late. DCDuring (talk) 17:38, 22 April 2020 (UTC)[reply]

User:Rua removing Old Church Slavonic as ancestor of Bulgarian

For reasons that are opaque to me, Rua changed the ancestor of Bulgarian from cu (Old Church Slavonic) to sla-pro. I don't see any discussion to prompt this and it caused a lot of errors. I've undone this but I'm bringing it here because I don't want an edit war. Benwing2 (talk) 04:35, 21 April 2020 (UTC)[reply]

Oh, there's been lots of discussion, but nothing recent, and no consensus. I'm not sure why she decided to do it this way. She could easily have avoided the module errors and the scrutiny by editing the entries first. It's sort of like trying to sneak into a house by using a battering ram on an unlocked door... Chuck Entz (talk) 06:43, 21 April 2020 (UTC)[reply]

@Chuck Entz: There was this recent discussion: Wiktionary:Beer parlour/2020/February § Bulgarian as descendant of Old Church Slavonic where I tried to conceptualize a bit. “Discussion” of course only sensu lato, no one trying to attack a position and defend another. Fay Freak (talk) 17:05, 23 April 2020 (UTC)[reply]

I wasn't aware of module errors, I'm sorry about that. It just seemed like a mistake, perhaps brought on by a nationalist drive to "claim" OCS for Bulgaria. My reasoning is that OCS, while indeed primarily based on the dialects of Bulgaria where it was first written down, soon "fanned out" to other places such as Macedonia and Moravia, each with local differences added in. And that is kinda the problem: if Bulgarian indeed descended from OCS, then it also descended from the Moravian version of OCS, and saying that Bulgarian derives from early forms of Czech is absurd. —Rua (mew) 07:42, 21 April 2020 (UTC)[reply]

I thought it was intended to indicate that Bulgarian is descended from Old Bulgarian, which is either a form of Old Church Slavonic or a language too close to it to be given its own full language code. (I thought Old Bulgarian had an etymology language code, but I can't find it. Maybe Old Bulgarian existed but was removed. If only we had an automatically generated change log for the language and script data modules.) A related discussion: Wiktionary:Beer parlour/2020/February#Bulgarian as descendant of Old Church Slavonic. — Eru·tuon 18:09, 21 April 2020 (UTC)[reply]

Does this mean that Modern English isn't a descendant of Old English because it isn't derived from Kentish Old English? Krm db (talk) 16:05, 23 April 2020 (UTC)[reply]

This is a very apt comparison. Fay Freak (talk) 17:05, 23 April 2020 (UTC)[reply]

@Krm db: No, I'm not saying that. I was trying to be not too specific about what Old Bulgarian is because I don't know much about it but it seems like my words are coming across as a statement of some specific theory. — Eru·tuon 19:47, 23 April 2020 (UTC)[reply]

@Erutuon, sorry I was responding to what @Rua said. Krm db (talk) 20:26, 23 April 2020 (UTC)[reply]

Recent Taipei and Beijing postal romanizations etc changes

@Metaknowledge Hey all, something terrible is happening on the Taipei and Beijing pages. I just want to let you all know that I object. I guess I'll just have to go along with it, but it's not right to make Peking an alternate form instead of a synonym. If I can't change it in the next decade once people come to their senses, hopefully future generations will realize the error. Too dangerous for me to discuss it at length as I would certainly get banned- discuss it amongst yourselves if interested. --Geographyinitiative (talk) 05:32, 21 April 2020 (UTC)[reply]

Not a persecution complex if you are persecuted (it's so bad, I can't even say the details of the persecution!) --Geographyinitiative (talk) 05:35, 21 April 2020 (UTC)[reply]

Wow, it hasn't even been a month since your last rant. If you can't get the professional help Eirikr recommended, you might at least consider contributing to Wiktionary rather than fighting your crusade. —Μετάknowledge^{discuss/deeds} 06:35, 21 April 2020 (UTC)[reply]

Beijing and Peking have different pronunciations. They are indeed more like synonyms than alternative forms. -- Huhu9001 (talk) 08:49, 21 April 2020 (UTC)[reply]

Most alternative forms have different pronunciations... in fact, if they didn't, we'd use {{alternative spelling of}} instead. —Μετάknowledge^{discuss/deeds} 18:25, 21 April 2020 (UTC)[reply]

I agree with User:Huhu9001. At the very least it's bizarre that the entry on Peking currently says "Former spelling of Beijing". Peking isn't similar enough to be an alternative spelling of Beijing in my view since they have very different pronunciations. It was originally an alternative transcription of the Mandarin, based on a different pronunciation system without palatalization, but in English nobody without knowledge of Mandarin would on first glance view them as basically the same word with differences in spelling. They share their primary definition and the Mandarin etymon, but not their pronunciation (unless someone is essentially reading Peking as Beijing because they know the two correspond etymologically), and their spellings differ in p and b, k and j, which are not thought of as very similar because they correspond to different phonemes in English (unlike, say, k and q in Koran and Quran), so they only qualify as alternative forms under a more expansive definition. (In Wikipedia, alternative forms are on a continuum, some nearer to each other, some farther apart; and in this case very far apart.) I think it might be less confusing for readers to label them synonyms, though I can see how putting them in the alternative forms section might be neater. The same applies to other pairs of Postal Map–derived and pinyin-derived names whose pronunciation and spelling are very different from an English perspective, like Nanking and Nanjing. — Eru·tuon 19:38, 21 April 2020 (UTC)[reply]

I missed that — by our usual standards, Peking is an alternative form rather than an alternative spelling, so I have changed the entry accordingly. —Μετάknowledge^{discuss/deeds} 19:55, 21 April 2020 (UTC)[reply]

I think a similar case in English is Joan and Jane. They are not alternative forms. -- Huhu9001 (talk) 03:11, 22 April 2020 (UTC)[reply]

@Huhu9001: This case is slightly different because "Joan" and "Jane" don't usually refer to the same entity, but "Peking" and "Beijing" do. — justin(r)leung _{{ (t...) | c=› }} 03:38, 22 April 2020 (UTC)[reply]

@Justinrleung: I found another example: geophagy and geophagia. -- Huhu9001 (talk) 01:22, 27 April 2020 (UTC)[reply]

I'm inclined to agree with Erutuon here (and lament that the connecting of this to a broader crusade against Chinese has probably made a simple change to these entries harder/unlikelier). - -sche (discuss) 15:44, 22 April 2020 (UTC)[reply]

Chinese 入声 categories

How about creating these 3 new categories? -- Huhu9001 (talk) 09:10, 21 April 2020 (UTC)[reply]

cat:Chinese -p characters (入及立甲...)
cat:Chinese -t characters (八骨勿必...)
cat:Chinese -k characters (白或谷乐...)

The idea is fine, but the naming scheme you've suggested is so opaque that not only will most readers not understand it, but I expect my natively Mandarin-speaking friends would be just as confused. —Μετάknowledge^{discuss/deeds} 18:28, 21 April 2020 (UTC)[reply]

This feels very arbitrary. Why are we only sorting based on one particular series of codas? There are other codas like -m, -n, -ng, and 入聲 is not the only tone. (It's just a feature that's lost in many northern varieties.) I'm assuming we're sorting this based on Middle Chinese - in that case the category names need to be more specific. — justin(r)leung _{{ (t...) | c=› }} 19:21, 21 April 2020 (UTC)[reply]

Possessive forms of Indonesian nouns

In Indonesian (and Malay), nouns can take a suffix showing the possessor. For example, rumah ("home"), rumahku ("my home"), rumahmu ("your home"), rumahnya ("his/her home"). Separate entries for these combined noun forms are not created under the current policy. On the other hand, in Spanish, non-finite verb forms can take a suffix showing the object. For example, decir ("to tell"), decirme ("to tell me"), decirte ("to tell you"), decirle ("to tell him/her"). Separate entries for these combined verb forms are created. Is this inconsistent, or is there a rationale behind it? Jonashtand (talk) 15:52, 22 April 2020 (UTC)[reply]

For Turkish sometimes such forms have been created (e.g. babamız: first-person plural possessive of baba), more often not (e.g., there is no entry ablamız (our older sister)). Personally I see little value in creating entries for these non-lemma forms, for two-and-a-half reasons. The first is that if you search for “ablamız”, the search page shows the term with a link to the entry abla, so the meaning is easy enough to find. (Aside, the table with buttons for preloaded entry stencils is a bit obtrusive; it might lead to users overlooking that there is more below it.) The second reason is that these suffixes are completely predictable and are among the first things a learner of Turkish will learn, since they occur multiple times in almost every sentence. So users will typically go directly to the lemma form. (However, the mapping from possessive to unmarked is not unique; for example, elması is both the third-person singular possessive of elma (apple) and elmas (diamond)). The two-and-a-halfth reason is that there is no end to it (literally): there are so many other suffixes that can and will be stacked at the end, giving a virtually endless variety of attestable but predictable forms. I do make an exception for accidental confluences of inflected forms, in particular when stemming from different lemma forms. --Lambiam 16:35, 22 April 2020 (UTC)[reply]

@Lambiam I seek to establish consistent criteria regarding the inclusion of combined forms for all languages. If we are against the creation of rumahku, ablamız, etc., should we propose to delete decirme, decirte, etc.? Jonashtand (talk) 08:49, 23 April 2020 (UTC)[reply]

For many editors “word” means “string of graphemes not delimited by spaces or punctuation” (at least for languages whose orthography has spaces), and they tend to take the slogan “every word in every language” very seriously, and are often unaware of the predictability and regularity of some languages. Proposals to delete completely transparent German compounds (such as Unfallversicherung = “accident insurance”) will surely be shot down. So I foresee considerable resistance to such a proposal. There are notable exceptions to the “every (spaceless) word” rule, though. We do typically not include English possessives of nouns (so no entry John’s), which are formed with the enclitic ’s, and the consensus among Latinists is not to include words suffixed with the enclitic -que – no virumque. We also do not list French hyphenated (but spaceless) forms such as dites-moi, so there seems to be something special about enclitics. Since suffixed pronouns like Spanish -me are also enclitics, perhaps we can establish a consensus. Note also e.g. Italian ditemi etc. --Lambiam 14:06, 23 April 2020 (UTC)[reply]

Consistency would be nice, but there's so much variation between languages that inflexible "one size fits all" rules aren't always practical. The problem is that many agglutinative languages (such as Turkish, but especially some American Indian languages) really blur the distinctions between inflection, compounding and syntax to the point where most sentences with pronouns for subjects and objects can be single words (here's an example in Chinook Jargon). In cases like this, we have to respect the practices of the community of editors working in a particular language. I would personally prefer that we get rid of Spanish forms with clitic pronouns, but that decision should be made by the editors who work with Spanish, and not forced on them by the community at large. Chuck Entz (talk) 15:06, 23 April 2020 (UTC)[reply]

Thank you @Chuck Entz. That is a nice way to present the issue. I agree with you. Jonashtand (talk) 15:55, 23 April 2020 (UTC)[reply]

One question I have about the Malay case is this: how invariable is it? Do you add the string -ku to literally any noun (whose semantics allow it) to make it mean "my X"? Or can the string differ depending on what precedes it (e.g. -gu or -ngu or -hu or -u or something in certain contexts)? Part of the reason I don't mind us excluding John's and virumque is that those clitics are invariable. I'd definitely be opposed to excluding forms with clitics when the clitic isn't invariable (and I don't necessarily support exclusion even if the clitic is invariable). That's why I want to allow all attested agglutinative forms in Native American languages and the like (incidentally, Chuck, your example is from actual Chinook, not Chinook Jargon), because there's so much morphophonology involved that changes the surface structure of the strings when they agglutinate. Another reason I don't mind excluding John's and virumque is that 's and -que are not limited to any particular part of speech they can encliticize to; -que can be attached to virtually any Latin word of any part of speech, and even though 's is chiefly attached to nouns in English, it can be attached to whole noun phrases, both those ending in a noun (The King of Ireland's Son) and those ending in some other part of speech (the boy I was talking to's mother). If Malay -ku can only be added to nouns, and especially if it can only be added to the noun it governs directly, I'm more inclined to be opposed to excluding things like rumahku. —Mahāgaja · talk 16:09, 23 April 2020 (UTC)[reply]

@Mahagaja There's no variation in "-ku". Any noun or verb can take "-ku", e.g. rumahku (my house), menghadiahiku (to gift me). The only change occurs in Jawi script. For example "باڤ" (bapa) + -کو (-ku) = باڤاکو (bapaku). Only the spelling of the stem word changes; the -ku is still the same. Also, how about Malay plurals? Should they be created too? --Tofeiku (talk) 02:10, 24 April 2020 (UTC)[reply]

I browsed CAT:Malay compound words to see how compound nouns behave when they take these possessive suffixes, and it looks like the clitic gets added to the whole noun phrase, not just the relevant noun, e.g. air kecilku (“my urine”), literally "water little-my", i.e. the clitic is added to the adjective and not to the noun. To me, that's a strong argument not to create entries for terms with the possessive clitics, because a form like kecilku is presumably meaningless by itself ("my little") and can only appear when the adjective is modifying a noun. So at the moment, I'm leaning against inclusion, but if it came down to a vote I'd still abstain just because I don't work on Malay/Indonesian and I don't speak them. As for plurals, I don't know. On the one hand, they're completely predictable: any noun makes a plural by reduplicating the whole noun and writing a hyphen between the parts (orang → orang-orang); on the other hand, compounds reduplicate only the head of the compound, not the entire compound (rumah anjing → rumah-rumah anjing, not *rumah anjing-rumah anjing), which is not really predictable for a learner. So I'm still neutral on the issue of plurals. —Mahāgaja · talk 05:59, 24 April 2020 (UTC)[reply]
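
The reduplication rule described in the comment above can be illustrated with a short Python sketch. It is only a minimal illustration: the function name `reduplicated_plural` is hypothetical (it is not part of any Wiktionary module), and it assumes that the head of a compound is simply its first space-separated word.

```python
def reduplicated_plural(noun: str) -> str:
    """Form a reduplicated plural, doubling only the head of a compound.

    Assumption: the head is the first space-separated word,
    as in "rumah anjing" (head "rumah").
    """
    head, _, rest = noun.partition(" ")
    plural_head = f"{head}-{head}"
    # Simple nouns have no remainder; compounds keep their modifier unchanged.
    return f"{plural_head} {rest}" if rest else plural_head


print(reduplicated_plural("orang"))         # orang-orang
print(reduplicated_plural("rumah anjing"))  # rumah-rumah anjing
```

The sketch covers only the head-only pattern; it does not model the alternative of reduplicating the entire compound.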

Actually both are used in Indonesian, but rumah-rumah anjing is more common. --Lambiam 08:32, 24 April 2020 (UTC)[reply]

OK, thanks. Is that generally true for all countable compound nouns, that the plural of "X Y" is either "X-X Y" or "X Y-X Y", but "X-X Y" is more common? Or are there some compounds that can really only or almost only take "X-X Y" or others that can (almost) only take "X Y-X Y"? Because the more predictable the plural formation is, the less inclined I am to include entries for plural forms. —Mahāgaja · talk 09:33, 24 April 2020 (UTC)[reply]

Personally I do not consider these reduplicated nouns plurals in a grammatical sense. They are one of several ways to indicate plurality (which is most often unmarked) when otherwise ambiguity might arise; such reduplications may also have other senses than mere plurality,^[2] and many other word classes can be likewise reduplicated to indicate frequency, intensity or variety. As to the practice with compound nouns, usually only the head noun (which comes first) is reduplicated; however, Indonesian speakers disagree about which is preferable or even acceptable.^[3] In colloquial speech, reduplication is more likely to be avoided altogether by using some other method, like a determiner meaning “several”. --Lambiam 12:12, 25 April 2020 (UTC)[reply]

First, forms with a possessive suffix (-ku, -mu, and -nya), such as rumahku, are quite predictable, so they are not a priority for lemma inclusion (and this is the practice in other Indonesian dictionaries). Second, Indonesian has a more important issue to solve, namely affixed derived terms whose meaning differs from [but is related to] that of the stem. Third, a possessive suffix, as in rumahku, is less formal than a (possessive) pronoun, as in rumah saya. --Xbypass (talk) 19:22, 25 April 2020 (UTC)[reply]

As for the plural, the reduplicated form is not considered a plural noun (such as English houses). "Not all languages have number as a grammatical category. In those that do not, quantity must be expressed either directly, with numerals, or indirectly, through optional quantifiers. However, many of these languages compensate for the lack of grammatical number with an extensive system of measure words." Thus, reduplicated plurals are not a priority for inclusion, except for irregular ones (i.e. where the reduplication does not express the plural of something), such as kupu-kupu (“butterfly”). --Xbypass (talk) 19:22, 25 April 2020 (UTC)[reply]

Talking about kupu-kupu, the Malay entry says its plural form is "kupu-kupu-kupu-kupu". Is that true or is it just a mistake made by the template? Jonashtand (talk) 17:50, 26 April 2020 (UTC)[reply]

@Jonashtand: As far as I know for Indonesian (and Malay), it is a mistake made by the template. There is no such word as "kupu-kupu-kupu-kupu"; we would say "banyak kupu-kupu" or "beberapa kupu-kupu", depending on the intended focus. --Xbypass (talk) 23:27, 27 April 2020 (UTC)[reply]
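
For illustration, here is a small Python sketch of how a default that blindly reduplicates the page title would produce the form reported above, together with one possible guard. This is an assumption about the cause, not a description of the actual Malay headword template, whose code has not been checked here; both function names are hypothetical.

```python
from typing import Optional


def naive_plural(lemma: str) -> str:
    """Reduplicate the whole lemma, as a naive default might do."""
    return f"{lemma}-{lemma}"


def guarded_plural(lemma: str) -> Optional[str]:
    """Return None when the lemma is itself already a reduplication."""
    first, sep, second = lemma.partition("-")
    if sep and first == second:
        return None  # e.g. kupu-kupu: do not generate an automatic plural
    return naive_plural(lemma)


print(naive_plural("kupu-kupu"))    # kupu-kupu-kupu-kupu
print(guarded_plural("kupu-kupu"))  # None
print(guarded_plural("orang"))      # orang-orang
```

A guard of this kind would only suppress the automatic form; whether an entry should show any plural at all remains an editorial decision.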

For quotations of use of this term, see Citations:blabla

I am sure that I am perpetrating one of my frequent blunders, but in looking up impune and opprobrium I encountered the likes of:
"For quotations of use of this term, see Citations:opprobrium."
I never have noticed such an entry before. Is it old? Is it a new trend in favour of not using quote entries in future? I have recently inserted a number of quote entries, and would like to know what gives before either wasting time or fighting a new trend or approved practice. Several of the citations I found were not helpful or had lapsed or something anyway. Instruction welcome, please. JonRichfield (talk) 18:47, 22 April 2020 (UTC)[reply]

You should keep all the citations on the entry page unless you are collecting evidence for a new sense. Other people have convoluted explanations for moving quotations to citation pages but you should ignore them. DTLHS (talk) 18:48, 22 April 2020 (UTC)[reply]

You should put just one or two quotes which clearly demonstrate how the term is used on the entry page, and put the rest on the citations page. Other people have convoluted explanations for having lots of quotes on the entry page but you should ignore them. - TheDaveRoss 21:05, 22 April 2020 (UTC)[reply]

Ideally, per WT:QUOTE and common sense, a quote on the entry page should illustrate one of the senses of an entry. If you find a quote that does the job better than existing quotes, please add it at the entry below the corresponding sense. If there are too many quotes, or quotes that do not serve any of the purposes mentioned at WT:QUOTE, please feel free to bump them to the Citations page, or in egregious cases (for example, a quote of the form, “There are many different types of ideology”) to discard them. Conversely, sometimes a Citations-page quote will be more useful at the entry. If a quote does not appear to fit any of the currently listed senses well, that is a good reason to retain it on the Citations page for future use. --Lambiam 06:10, 23 April 2020 (UTC)[reply]

Unrelated, but we have tons of redlinks to the citations namespace from a few editors including a link every time they created a page, then never adding citations. Can someone round these up and remove them? Ultimateria (talk) 23:30, 22 April 2020 (UTC)[reply]

+1 – Jberkel 09:16, 23 April 2020 (UTC)[reply]

In case it's useful, here's a SQL query showing all redlinks from mainspace to Citations namespace. It also shows whether the entry has {{no entry}} or {{seeCites}}, which contain Citations links in them, because most of them have one or both (2607 out of 3339). — Eru·tuon 20:14, 23 April 2020 (UTC)[reply]
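
The linked query itself is not reproduced here. As a rough sketch of the underlying idea only, the Python below finds links that point to non-existent Citations pages in a small in-memory data set. The function name, the data layout, and the sample titles are all hypothetical; a real query would run against the wiki's link tables instead.

```python
CITATIONS_PREFIX = "Citations:"


def citation_redlinks(links, existing_pages):
    """Return (entry, target) pairs where the target is a missing Citations page.

    links: mapping of entry title -> list of link targets found on that entry.
    existing_pages: iterable of titles of pages that exist.
    """
    existing = set(existing_pages)
    redlinks = []
    for entry, targets in sorted(links.items()):
        for target in targets:
            if target.startswith(CITATIONS_PREFIX) and target not in existing:
                redlinks.append((entry, target))
    return redlinks


# Hypothetical sample data for illustration only.
sample_links = {
    "opprobrium": ["Citations:opprobrium", "disgrace"],
    "impune": ["Citations:impune"],
}
sample_existing = ["opprobrium", "impune", "Citations:opprobrium"]

print(citation_redlinks(sample_links, sample_existing))
# [('impune', 'Citations:impune')]
```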

Thanks folks. That was helpful. As for having too many quotes on the entry page, I have on occasion removed a quote or two, for any of the reasons mentioned here, but sometimes I have added quotes for different styles of usage or to illustrate the kinds of usage at widely different dates or different countries. Something I have more or less stopped doing, is adding illustrative text apart from the quotes, and it strikes me that I haven't seen much of that lately. Is it going out of fashion, or should I be doing it more conscientiously, or what? JonRichfield (talk) 18:30, 24 April 2020 (UTC)[reply]

If there is a specific grammatical usage aspect not evident from the headword already and not clear from the quotations, I like to add a usage example illustrating it. For example, in one sense of to listen you use listen to and in another sense listen for. I think this serves the typical user better than describing this after the definitions section in a usage notes section. --Lambiam 20:20, 25 April 2020 (UTC)[reply]

Rename suppletion categories

Discussion moved from Template talk:poscatboiler#Rename suppletion categories.

Categories like "English suppletive adjectives/adverbs/nouns/verbs" are misleading. It should be "English adjectives/adverbs/nouns/verbs with suppleted forms". For example, in the case of go (which uses went), wend is a suppletive verb, not go; the latter has suppleted forms (from wend). — This unsigned comment was added by 92.184.97.41 (talk) at 11:43, 22 April 2020 (UTC).[reply]

Makes sense to me, but it seems more doable just to replace (have a bot replace) the word “suppletive” in these category names by “suppleted”. Also Category:Afrikaans suppletive adjectives, Category:Yiddish suppletive verbs, Category:Suppletive nouns by language, and so on. --Lambiam 06:29, 23 April 2020 (UTC)[reply]

As far as I can tell suppletive is used in at least two senses: it's applied to an unrelated form that is used in a paradigm and to the paradigm to which the form was added. Dictionary.com mentions both senses, and the OED gives the sense "displaying suppletion" and shows a quotation "Aller is suppletive (aller, vais, irai)". You can also find cases of "suppletive verb" in a web search. So I think this use of suppletive is fine. — Eru·tuon 19:54, 23 April 2020 (UTC)[reply]

But then our own definition of suppletive is in need of suppletion. --Lambiam 08:20, 24 April 2020 (UTC)[reply]

Wiktionary:Word Competition 2020

Starting on Monday, I'm going to be running this year's multilingual Scrabble competition. Everybody is welcome to play! --Vitoscots (talk) 07:56, 24 April 2020 (UTC)[reply]

The first set of letters to play with are L T R I * T E --Vitoscots (talk) 08:02, 24 April 2020 (UTC)[reply]

Adding etymology-only languages

How big of a deal is adding etym-only languages? I want to add a few Germanic ones, not sure if I need to create a vote, consult other admins, or if I can just add them. Julia ☺ ☆ 22:07, 24 April 2020 (UTC)[reply]

Which ones are you thinking of? Etymology-only languages are usually treated as a variety of an existing language (e.g. attested Lunfardo terms should be added as Spanish) or an umbrella term for an unknown language (e.g. pre-Latin of Iberia could be any of a number of languages). — Ungoliant ^(falai) 00:27, 25 April 2020 (UTC)[reply]

Basically all the Germanic languages that have an ISO code, so that's Walser, Eastern and Western Yiddish, and a bunch of Low German varieties. Julia ☺ ☆ 02:34, 25 April 2020 (UTC)[reply]

@Julia: I don't know if a formal vote is necessary, but it might be good to start discussions at Wiktionary:Requests for moves, mergers and splits for each one you propose to create. —Mahāgaja · talk 05:49, 25 April 2020 (UTC)[reply]

@Julia: Yeah, for example, creating an etym-only code for Western Yiddish isn't a terrible idea, because it's extinct, somewhat poorly attested, and nobody has ever worked on it at Wiktionary. But creating a code for Eastern Yiddish would be pointless, because that's the default of Yiddish and what literally all our entries are for. Bottom line: Ethnologue often doesn't know what it's doing, and ISO codes are a poor guide to linguistic reality or lexicographic praxis. —Μετάknowledge^{discuss/deeds} 06:11, 25 April 2020 (UTC)[reply]

I have just realised that I understood your post incorrectly. I thought you meant adding entries in etymology-only languages. — Ungoliant ^(falai) 15:41, 25 April 2020 (UTC)[reply]

Proper nouns with unattested plurals?

Should Category:English proper nouns with unattested plurals be created—and if so, could someone who understands category formatting better than me do so? I've noticed a number of terms I would consider proper nouns listed as ordinary nouns with unattested plurals, e.g. Divine Mercy Sunday—though I've changed that one to a proper noun with an attested plural—and I just tagged Anglosphere as having an unattested plural during my ongoing -sphere puzzle because "Anglospheres" seems plausible and there are seemingly one or two usages available online. Apparently the only other English proper noun tagged as such is Ajax, Etymology 2, though. —Nizolan ^(talk) 11:38, 25 April 2020 (UTC)[reply]

I tried to create it with {{autocat}}, but this generates an error message that the label given to the {{poscatboiler}} template is not valid. I cannot see what the invalid label is, nor is there any information on what determines the validity of a label. The documentation of {{autocat}} and {{poscatboiler}} is unhelpful. --Lambiam 19:54, 25 April 2020 (UTC)[reply]

@Lambiam: I went to Category:English nouns with unattested plurals, clicked "edit category data", and enabled the category in Module:category tree/poscatboiler/data/lemmas. — Eru·tuon 20:00, 25 April 2020 (UTC)[reply]

Thank you both! —Nizolan ^(talk) 21:49, 25 April 2020 (UTC)[reply]

Our proper-noun template does not show any plural by default (since many, like the names of cities, do not usually have one). Some things that are grammatically countable in theory ("the Anglosphere") have no plural attested. In that case I usually just rely on {{en-proper noun|head=the Whatever}}, explicitly indicating that it is used with "the" (since you wouldn't know that if it wasn't indicated). Equinox ◑ 20:26, 25 April 2020 (UTC)[reply]

That makes sense. I did see that Category:English proper nouns with unknown or uncertain plurals exists, though in the case of something like "?Anglospheres" I don't think there's any uncertainty about what the plural would be (and some of the articles in that category are odd). —Nizolan ^(talk) 21:49, 25 April 2020 (UTC)[reply]

Walter Scott's works are all successfully dated

Hey. I cleared out Category:Requests for date/Sir Walter Scott by adding dates to around 200 undated quotes. It was pretty easy, really. I also added a bunch of his books to Template:Walter Scott quotation templates. Only problem is, I couldn't figure out the correct dates of publication for three poems - The Poacher, The Saxon War Song and The Wild Huntsman. So they're left with red links for now and a bunch of pages that link to them have got some crap in them. If anyone smarter than me (which is everyone here, to be fair) can put a date on those poems, it'd be awesome. --Vitoscots (talk) 16:35, 26 April 2020 (UTC)[reply]

Quotation templates for the 1911 Britannica?

Recently, {{RQ:Macaulay Atterbury}}, {{RQ:Macaulay Johnson}}, and {{RQ:Macaulay Goldsmith}} have been created (along with, I'm sure, others), which all link to articles (in this case, by Thomas Macaulay) from the 1911 Britannica. There are resources at Wikisource and at the Internet Archive, though I'm not sure how attribution for each individual article works. @Sgconlaw, you're good with templates; is this a useful idea? grendel|khan 22:13, 27 April 2020 (UTC)[reply]

The main issue with those templates is the ugly formatting that shows up. I think the guy who originally created them was either planning to tidy them up or, more likely, hoping someone would come along one day and do so. --Equidrat (talk) 00:43, 28 April 2020 (UTC)[reply]

I’m not sure it’s useful to have a template that links to a single encyclopedia article, frankly. That seems too narrow. If the 1911 Britannica as a whole is used as a source of quotations, then a quotation template should be created for that work, particularly if the original scans are available on the Internet Archive. — SGconlaw (talk) 05:16, 28 April 2020 (UTC)[reply]

@Sgconlaw: I agree; I don't see the point of these per-entry templates. Would a master template linking to the 1911 Britannica work? Maybe with an optional parameter to specify a named author, if there is one? grendel|khan 07:22, 28 April 2020 (UTC)[reply]

@Grendelkhan: yes, that was my thought. — SGconlaw (talk) 08:26, 28 April 2020 (UTC)[reply]

Small citation templates of questionable utility

There seem to be a lot of small templates which provide very little information and don't link to an online copy; see {{RQ:Scott Anne}} or most of what's at {{Walter Scott quotation templates}} for that matter. I'm hesitant to say they're actively bad, in that it's providing a year, a title, and maybe a link to Wikipedia, but they seem minimally useful. (I'd ping @Vitoscots, but they've been banned.) grendel|khan 22:13, 27 April 2020 (UTC)[reply]

Just as useful as including the quotations as untemplated text, but five times faster to edit. --Equidrat (talk) 00:40, 28 April 2020 (UTC)[reply]

It looks like they were created just to templatize bare quotations. My suggestion would be to go through them methodically (take your time; no rush) and expand them with links to the first (or, if unavailable, early/reprinted) editions of the works. Duplicate templates can be deleted. — SGconlaw (talk) 05:21, 28 April 2020 (UTC)[reply]

Macaulay's History of England

{{RQ:Macaulay History}} is an abbreviated and less featureful version of {{RQ:Macaulay History of England}}. I'll be folding one into the other, but it may take a while; I thought I'd let people know. grendel|khan 22:13, 27 April 2020 (UTC)[reply]

Couldn't you just make one a redirect, or like a subtemplate of the other? --Equidrat (talk) 00:41, 28 April 2020 (UTC)[reply]

I'll be making one a redirect once it's orphaned, but the real work is in finding the specific quote and linking directly to the correct page of the original. grendel|khan 06:42, 28 April 2020 (UTC)[reply]

Joint articles for equal spelling variants

I trust that everyone is on board with the idea that mass duplication of content for essentially trivial spelling variants, such as "colour" vs "color" or "realise" vs "realize", is undesirable -- albeit the issue of how to handle usage examples and citations is not yet fully resolved. I would like to go a step further and suggest that, where variant spellings have equal status (a prime example being American vs. British differences such as the ones mentioned), these have a single article titled, for example "color or colour". This would avoid the issue of one spelling being demoted to "variant of ~" status while the other gets the full treatment. The only precedence issue would be the order of mention in the title and perhaps elsewhere in the article. Further work would need to be done to establish the exact presentation of the "joint" article, and there may be certain technical issues that need to be addressed too, but at this stage I would like to sound out opinion about the general principle, without getting into that detail. If the general principle is supported then we can think in more detail about how to implement it. Please share your thoughts on the general principle. Mihia (talk) 00:29, 28 April 2020 (UTC)[reply]

That would be a massive change since so much of our infrastructure relies on page titles. It's not necessarily a bad idea to have all content for one "word" on one page but it's definitely not as simple as just moving pages to a new joint title. DTLHS (talk) 00:42, 28 April 2020 (UTC)[reply]

@DTLHS: Would it be possible for you to give a couple of examples of the sort of reasons why it would be a massive change, just so I can get an idea? Mihia (talk) 22:40, 3 May 2020 (UTC)[reply]

@Mihia Mainly inflection templates, which would all be broken since they mostly rely on page titles. Also, how would you handle links? What if someone wants a link to "color ~ colour"? What page do they link to: color, colour, or the combined page? What would the derived terms section of "color ~ colour" look like? Also, how do categories work in this new system? DTLHS (talk) 23:18, 4 May 2020 (UTC)[reply]

No objection in principle, but I have no idea how this would be achieved technically (unless both the entries color and colour were turned into redirects to colour | color?) — SGconlaw (talk) 05:24, 28 April 2020 (UTC)[reply]

Yes, that is certainly what I had in mind (either automatic redirects if there are no other language sections, or I guess it would have to be a manual click in the case that other language sections exist). I don't see a problem with this in itself, but certain other things would need to be addressed too, such as dealing with templates that pick up the page title ("en-noun", "en-verb" etc.). In addition, I don't know how many other "processes" might break or behave incorrectly if they encounter an "X or Y" page title, which may be what DTLHS is alluding to. There may have to be some kind of audit to establish which things would be affected, should there be general agreement to pursue the idea further. Mihia (talk) 09:50, 28 April 2020 (UTC)[reply]

I think that the use of templates for the page body can make colour and color share their content while leaving the page names alone. --Lambiam 13:46, 28 April 2020 (UTC)[reply]

It did occur to me too that some kind of transclusion process could allow content to be maintained in one place and included on both pages. One problem with this is that content (especially examples/citations) specifically for e.g. "colour" would appear on a page titled "color", and vice versa. If the page is titled "colour or color" then it won't seem so strange to have the examples/citations for both spellings together in the same place. In fact, it would largely eliminate the problem that we have at the moment of where to put citations/examples for the spelling that is deemed the "variant". Another drawback with the "shared content" solution is that, while we may know that content is identical, dictionary users may not. They may note that there are, or appear to be, two different articles, "colour" and "color", and assume that this must be because there are some differences. These issues could be addressed by a notice on the page I suppose, but to me it seems a less elegant solution. Mihia (talk) 14:40, 28 April 2020 (UTC)[reply]

See {{ja-see}} for a template that uses a similar method. Chuck Entz (talk) 14:58, 28 April 2020 (UTC)[reply]

Turning [[color]] into a redirect to [[colour]] does not work, since we also have Latin color. --Lambiam 13:50, 28 April 2020 (UTC)[reply]

I’d love to see something like that, but I can’t imagine how we could do it in a way that is convenient to readers and editors alike without making some fundamental changes to how we structure our pages.

I’ve argued before in favour of holding each entry on a separate page. I think this would go a long way in providing the flexibility needed for this kind of format. — Ungoliant ^(falai) 14:23, 28 April 2020 (UTC)[reply]

Then the new argument will be "should it be color or colour, or colour or color?" Equinox ◑ 14:49, 28 April 2020 (UTC)[reply]

Ah, finally a hill I am willing to die on! - TheDaveRoss 13:15, 29 April 2020 (UTC)[reply]

The main problems I see are deciding which entry the categories should be added to, and confusion with non-combined entries that have names that are coincidentally in the same pattern, e.g. trick or treat. Chuck Entz (talk) 14:58, 28 April 2020 (UTC)[reply]

Are there any entries that contain ∨? Or can we use color ∨ colour? As for which comes first, how about simple alphabetic order? That way American spelling comes first in some cases (like the preceding) and British spelling comes first in others (e.g. realise ∨ realize). Or we could get medieval on our readers' asses and use ꝉ: color ꝉ colour, realise ꝉ realize etc. —Mahāgaja · talk 15:13, 28 April 2020 (UTC)[reply]

∨ is not used in any entry names besides ∨, though I doubt you're being extremely serious... — Eru·tuon 17:03, 28 April 2020 (UTC)[reply]

What about very long titles where only one word varies in spelling? It could get unwieldy: "horse of a different color or horse of a different colour"? (Also, we shouldn't use the word "or", because that might be part of the phrase itself.) Equinox ◑ 15:23, 28 April 2020 (UTC)[reply]

Why assume we would be limited by two spellings? There are some pages that have a dozen or more possible spellings. DTLHS (talk) 16:41, 28 April 2020 (UTC)[reply]

We could use regex syntax: "horse of a different (color|colour)" or "horse of a different (?:color|colour)" (to avoid captures, not that it matters at all here). Or the most efficient regex syntax: "horse of a different colou?r" (eew, and impossible to find either "color" or "colour" in the title then). Or invent a sort of programming language to represent it, with the "or" symbol suggested by User:Mahagaja: "horse of a different (color ∨ colour)". — Eru·tuon 17:03, 28 April 2020 (UTC)[reply]
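As a rough illustration of the regex notations mentioned above, here is a minimal Python sketch. The list of titles and the helper name `matches_all` are made up for this example; nothing here reflects how page titles are actually processed on the site.

```python
import re

# Hypothetical page titles, used only for illustration.
TITLES = ["horse of a different color", "horse of a different colour"]

# Alternation with a non-capturing group: both spellings are spelled out.
ALTERNATION = re.compile(r"horse of a different (?:color|colour)")

# Optional "u": shorter, but neither full spelling appears in the pattern.
OPTIONAL_U = re.compile(r"horse of a different colou?r")


def matches_all(pattern, titles):
    """Return True if the pattern matches every title in full."""
    return all(pattern.fullmatch(title) is not None for title in titles)
```

Both patterns accept the same two titles; the difference is only in readability, which is the objection raised above to the "colou?r" form.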

My proposal is that the word or would be in italics in the title. I would personally prefer to use an easily user-readable syntax rather than something more technical that may be cryptic to some users. Three or more variants can be handled using "X, Y or Z" etc., or "X or Y or Z", whichever is preferred. Anything can be accommodated in principle, including phrases, but if the title becomes too unwieldy then we need not employ this method. It is by no means proposed that this method should mandatorily be applied to every case of "alternative spelling of ~". It is mainly designed to address "important" cases, such as the US/UK variants mentioned. Mihia (talk) 17:17, 28 April 2020 (UTC)[reply]

Using the page title to denote every single spelling is a bad idea. What if a new spelling is invented or discovered? Then we would have to move the page. It would be better to decide (hopefully not arbitrarily but based on some rules) on a single spelling to host content on. DTLHS (talk) 17:20, 28 April 2020 (UTC)[reply]

I said that the proposal is NOT to use the page title to denote "every single spelling" of everything. A newly discovered or invented spelling of an established word would most likely be a marginal spelling, not an established "equal" spelling. This proposal is intended to apply only to the latter. Mihia (talk) 17:23, 28 April 2020 (UTC)[reply]

Titles are plain text, so strictly speaking, they can't contain italics. Do you mean a display title, like {{DISPLAYTITLE:X, Y ''or'' Z}} in X Y or Z, to force the top header to display with "or" italicized? Such a display title could be added even when the actual title uses a weird notation like "horse of a different (color ∨ colour)". — Eru·tuon 18:54, 28 April 2020 (UTC)[reply]

However it can be done. I don't (or didn't) know how, but per Wiktionary:Grease_pit#Copying_templates_from_Wikipedia, yes, it seems that "DISPLAYTITLE" is a method. Mihia (talk) 22:48, 28 April 2020 (UTC)[reply]
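To make the title discussion concrete, here is a small Python sketch that builds a plain "X, Y or Z" title and the matching DISPLAYTITLE wikitext with an italicised "or". The function names are hypothetical, and sorting the variants alphabetically is an assumption borrowed from a suggestion elsewhere in this thread, not an agreed rule.

```python
def joint_title(spellings):
    """Build a plain-text joint title such as 'color or colour'."""
    ordered = sorted(set(spellings))  # assumed ordering: simple alphabetical
    if len(ordered) < 2:
        raise ValueError("a joint title needs at least two spellings")
    return ", ".join(ordered[:-1]) + " or " + ordered[-1]


def display_title_wikitext(spellings):
    """Build DISPLAYTITLE wikitext that italicises only the joining 'or'."""
    ordered = sorted(set(spellings))
    if len(ordered) < 2:
        raise ValueError("a joint title needs at least two spellings")
    # Two apostrophes on each side are wikitext italics.
    return "{{DISPLAYTITLE:" + ", ".join(ordered[:-1]) + " ''or'' " + ordered[-1] + "}}"
```

Building the display title from the list of variants, rather than by replacing "or" in the finished title, avoids italicising an "or" that belongs to the phrase itself, as in trick or treat.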

Another issue: sometimes one variant (e.g. the older colour, honour) might have a few extra senses never encountered under the newer spelling; or vice versa. Equinox ◑ 17:29, 28 April 2020 (UTC)[reply]

The joint article would apply to cases where the meanings are 95%+ the same, or whatever. Where, exceptionally, there are differences, it can be dealt with using a sense-specific label, such as "colour only", or whatever. Mihia (talk) 17:33, 28 April 2020 (UTC)[reply]

Does the proposal apply only to English? Or would we be merging German German Straße with Swiss German Strasse, and Portuguese Portuguese António with Brazilian Portuguese Antônio? What about Latin and Cyrillic Serbo-Croatian? —Mahāgaja · talk 18:24, 28 April 2020 (UTC)[reply]

My proposal, as it stands, applies only to English, in that I don't have enough knowledge of other languages to have a definite opinion on how this issue should be handled there. However, even if it may not apply cross-language, my feeling is that the general principle may apply to other individual languages too -- that content for cosmetic spelling variants of what is actually the same word should not be manually duplicated, and that where "equal" spelling variants exist, one should not be arbitrarily relegated to "variant of ~" status. Mihia (talk) 23:09, 28 April 2020 (UTC)[reply]

Senses are encountered for a lexical item, not “a spelling”. The spellings are only representations and do not always need to be attested. Sometimes a spelling is normalized, sometimes a word is only found in audio, and with digitization details get omitted. And sometimes you may have one quote in one spelling and two quotes in the other which are for the same word, which is bare enough according to the rules about the criteria for inclusion, unlike parts of the Wiktionary establishment pretend. WT:CFI mentions spellings only peripherally but not as a basis, it is “a term” which needs to be found with a sense, not a spelling that matters (and not even “attested”, WT:CFI does not say that a term needs to be “attested” in a fashion that is now sought, it emphasizes that a term is to be included if “it's likely that someone would run across it and want to know what it means”, so it matters what a reader would find most informative, and the presentation of the information shan’t be deformed by editors clinging to the letter). Fay Freak (talk) 18:55, 28 April 2020 (UTC)[reply]

As others have said, there are challenges, but I do think this would actually make sensible handling of content easier and better even in situations where e.g. a British spelling has meanings the American spelling isn't attested with, since I tend to think all the meanings should still be on one page. I agree with listing things in alphabetical order. Cases where multiple spellings are standard would get tricky, e.g. AFAICT colorize is the US standard, colourise is the general UK standard but colourize is the Oxford standard, and while that could be handled by listing them all, ... I think the idea of "hosting" content on a single backend page and transcluding it onto both [[color]] and [[colour]] would have the same benefits but fewer drawbacks. It would allow long idioms to be synced without the page title becoming overlong, and would allow English color and colour to be synced while not requiring Latin color to be unexpectedly on a different page than English color. One thing mentioned previously is how to handle e.g. "colourise" showing a definition like "to color", but in the past it was proposed that, if we went the transclusion route, we could have the word "color" in the definition be enclosed in a template or something, such that it could be set to display as "color" on colorize and "colour" on colourise. It seems like it might even be possible, using something like Wikipedia uses for transcluding only sections of pages, to have the "backend" page that hosts the content be whatever page currently hosts it, e.g. color, and transclude color onto colour. - -sche (discuss) 21:29, 28 April 2020 (UTC)[reply]

If the "X or Y" method, which is my preference, is not technically possible to implement, I would support the transclusion method as better than what we presently have, provided that, for the reasons I mentioned earlier, a satisfactory presentation style can be found to make it clear to users that the same content is reused for both/all spelling variants. Mihia (talk) 23:00, 28 April 2020 (UTC)[reply]

Separate thought, after reading the above with various and sundry ideas.

If I've understood correctly, it sounds like most (all?) proposals amount to having some page with a possibly-quite-complicated string, and this page would serve as the location of the content for the many-spellings term.

Any individual spelling version would serve as a soft-redirect to the possibly-quite-complicated-string page.

Alternative proposal: What about putting the content at some arbitrary and unique page address, and having each individual spelling not just soft-redirect, but rather transclude the content? A reader looking at, say, the English color entry would just see the entry. Likewise for a reader looking at colour instead. Better usability than requiring click-throughs. We already do this in some ways with our forum pages, where [[Wiktionary:Beer parlour/2020/April]] holds the content, and [[Wiktionary:Beer parlour]] displays it. The casual reader would never have to know that the possibly-quite-complicated-string page even exists. Editors clicking on the "edit" link from any of the individual-spelling pages would open the possibly-quite-complicated-string page, same as for the Beer Parlour.

There are ways of transcluding now using Lua that are quite advanced, and that can circumvent problems with categorization and the like, when just naively transcluding a page. ‑‑ Eiríkr Útlendi │^{Tala við mig} 23:02, 28 April 2020 (UTC)[reply]
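The shared-content idea above can be modelled in a few lines of Python. This is only a toy sketch: the content key, the placeholder body text and the function name `render_entry` are all invented for illustration, and a real implementation would live in wikitext or Lua rather than in a dictionary.

```python
# Hypothetical backend store: one arbitrary, unique key per shared entry.
SHARED_CONTENT = {
    "en/shared-0001": "(shared English entry body goes here)",
}

# Each spelling page points at the same backend key.
SPELLING_TO_CONTENT = {
    "color": "en/shared-0001",
    "colour": "en/shared-0001",
}


def render_entry(spelling):
    """Return the page as a reader would see it: own title, shared body."""
    key = SPELLING_TO_CONTENT[spelling]
    return "== " + spelling + " ==\n" + SHARED_CONTENT[key]
```

Under this model an edit to the single backend body shows up on both spelling pages, while page names, and therefore anything keyed on page names, stay as they are.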

My preference would just be to have the less common spellings display something like (and I say this as someone who much prefers Commonwealth spellings):

1. Standard spelling of color in the UK, Canada, Australia, New Zealand, India, and Nigeria.

I would also support transclusion, however, but not a "color or colour" page. Andrew Sheedy (talk) 02:13, 29 April 2020 (UTC)[reply]

You don't need to make up arbitrary non-English titles. Just choose a spelling (hopefully based on some rule and not randomly) and put all alternative spelling information there. Then delete all entries for alternative spellings, alternative capitalizations, alternate hyphenations, misspellings, obsolete spellings, etc. Also transclusion is garbage. DTLHS (talk) 17:52, 29 April 2020 (UTC)[reply]

@DTLHS, you say "Just choose a spelling" -- that seems to run counter to the thread above, where the core issue is that multiple equally valid spellings each deserve their own entries and equal weighting in the dictionary. The commenters above mostly reject the approach of choosing one spelling and making all others "alternative spelling" stubs, let alone just deleting them.

Yes, I realize that, but unlike all of the other ideas this one doesn't require any actual changes to modules, categories, or any other technical aspect, just organizational changes, which means it actually has a non-minuscule chance of happening. DTLHS (talk) 22:14, 29 April 2020 (UTC)[reply]

"Also transclusion is garbage." Unclear what you mean by this, specifically in this context? Transclusion works quite well here at WT:BP, among other places. ‑‑ Eiríkr Útlendi │^{Tala við mig} 21:58, 29 April 2020 (UTC)[reply]

@DTLHS: Yes, I would also like to know why you think transclusion is such a bad idea. Mihia (talk) 22:42, 3 May 2020 (UTC)[reply]

Planned maintenance operation (read-only time) on 30th April

Hi, just wanted to inform you that there's a planned maintenance operation on Thursday 30th April at 05:00 AM UTC. It impacts all wikis and is supposed to last a few minutes. During this time, new translations may fail, and Notifications may not be delivered. For more details about the operation and on all impacted services, please check on Phabricator: phab:T250733 --Kaartic (talk) 18:49, 28 April 2020 (UTC)[reply]

Meanings of component characters: 薄暮

Hello all. I come before you to ask that this edit [4] be upheld and not reverted. My rationale for making this edit is that Wiktionary users are highly likely to be unaware that 薄 means 'to approach; to go near' in this context, and I want to share this information with them so that they understand the word better. Reference: [5] "2 迫近、接近。" --Geographyinitiative (talk) 04:46, 29 April 2020 (UTC)[reply]

(Similar example 謀生／谋生 (móushēng)) --Geographyinitiative (talk) 04:51, 29 April 2020 (UTC)[reply]

The place for discussing words is WT:TR. --Anatoli T. ^{(обсудить}/^вклад) 05:10, 29 April 2020 (UTC)[reply]

(Moved to [6]) --Geographyinitiative (talk) 06:03, 29 April 2020 (UTC)[reply]

Reflexive verbs in languages with a separate word for reflexive particles

I am now working with a language where reflexive particles are a separate word, Bulgarian, so I'm facing an issue that is new to me: where to put the lemma, what to do with the full verb with the particle, and how to display it.

Take a look at къ́пя (kǎ́pja):

I have decided to make the definition line like this:

(reflexive, intransitive) (~ се) to bathe, to have a bath

The entry contains conjugations for both the transitive and reflexive verb, and the entry къ́пя се (kǎ́pja se) is meant to be a soft redirect.

There is not much consistency across languages for handling reflexive verbs with a separate word for reflexive particles. The new thing I suggest is to display the actual particle. Rather than just saying reflexive, maybe templates could also display the particle as well?

E.g: French reflexive senses of laver and habiller:

laver

(reflexive) (se ~) to wash oneself

habiller

(reflexive) (s’~) to get dressed

If the template could also handle the position of the particle based on the language and spelling rules, it would be great too. --Anatoli T. ^{(обсудить}/^вклад) 06:04, 29 April 2020 (UTC)[reply]
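A rough Python sketch of what such a label might compute is given below. The function name, the language table and the French elision rule (use s’ before a vowel or h) are simplifying assumptions made for this example; aspirated h and other exceptions are not handled, and this is not how any existing template works.

```python
# Assumed, deliberately simplified rule: French "se" elides to "s’" before
# a vowel or "h". Aspirated h and other exceptions are ignored here.
FRENCH_ELISION_INITIALS = "aeiouyhâàéèêîïôûù"


def reflexive_label(lang, verb):
    """Return a definition-line label showing the reflexive particle."""
    if lang == "bg":
        marker = "~ се"  # Bulgarian: particle shown after the verb
    elif lang == "fr":
        if verb[0].lower() in FRENCH_ELISION_INITIALS:
            marker = "s’~"  # elided particle attaches directly
        else:
            marker = "se ~"
    else:
        raise ValueError("no reflexive rule sketched for language: " + lang)
    return "(reflexive) (" + marker + ")"
```

For the verbs mentioned above, `reflexive_label("fr", "laver")` gives the "se ~" form, `reflexive_label("fr", "habiller")` the elided form, and `reflexive_label("bg", "къпя")` the Bulgarian form with the particle after the verb.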

The policy needs to handle cases like горде́я се (gordéja se) which AFAIK only exist in the reflexive form. Note for French that s'habiller exists but is a hard redirect to habiller, while e.g. s'en aller exists as its own lemma (as it should). Another example: s'enfuir and enfuir both exist. enfuir has a "reflexive" definition "flee", and s'enfuir also defines itself as "flee". Benwing2 (talk) 01:07, 30 April 2020 (UTC)[reply]

A related issue: how should non-lemma forms inside of conjugation tables for reflexive verbs be linked? Should we link to the entire form (including the reflexive), or link just the non-reflexive part? If the latter, what to do about горде́я се (gordéja se)? AFAIK the verb *горде́я (gordéja) doesn't exist and so it would be wrong to create non-lemma forms for it. (In reality, we do have non-lemma forms of *горде́я (gordéja) created, which wrongly say they are forms of горде́я се (gordéja se).) Benwing2 (talk) 01:10, 30 April 2020 (UTC)[reply]

There is definitely inconsistency in the usage even within one language, such as Bulgarian:

усми́хвам (usmíhvam) doesn't exist/is not used without (~ се), yet there is no entry for усми́хвам се (usmíhvam se, “to smile”). This is no different from the situation with горде́я (gordéja) and горде́я се (gordéja se, “to be proud”), but the entries are reversed in how they are entered.

We can establish the following:

1. There may be verbs used only with reflexive particles, and not with a passive meaning: усми́хвам се (usmíhvam se, “to smile”) or горде́я се (gordéja se, “to be proud”).
2. There may be transitive verbs used with or without reflexive particles: къ́пя (kǎ́pja, “to bathe”) or къ́пя се (kǎ́pja se, “to have a bath”).
3. There may be intransitive verbs with reflexive particles with a special meaning.
4. In some languages, such as most modern East Slavic (Russian, Ukrainian and Belarusian but not Rusyn), reflexive particles are part of the word and are usually included in dictionaries, even if they only have a passive meaning.
It may be beneficial to display the reflexive particles themselves, according to the language in question, to help users tell the difference: for example, that (reflexive) habiller is actually s'habiller, which is reflexive, not habiller, etc. --Anatoli T. ^{(обсудить}/^вклад) 01:53, 30 April 2020 (UTC)[reply]

@Atitarev: Frankly speaking, I do not see the difference between 1 and 3. In Bulgarian the reflexive verbs are always intransitive. In an example about the German verb dünken I found the remark about the prævalent usage of the Nominativ that follows this reflexive verb and about the rare, but still admissible Akkusativ: "Ich dünke mich ein(en) Sieger". This kind of grammatical structure is completely absent from Bulgarian with regard to reflexive verbs. I have no objections to your proposals, except for the redundancy of the intransitive tag. Bogorm converſation 18:59, 30 April 2020 (UTC)[reply]

@Benwing2: The aforementioned French verbs may be used in a non-reflexive construction, but se récrier cannot. I just corrected the entry, because it had not emphasised the reflexivity until a few minutes ago. There are also examples from German like sich entschließen and from Slovak like báť sa. To my surprise, the reflexive verbs in those three languages are treated differently: In Slovak, sa is part of the lemma here, whereas in German and French sich/se are not. If Atitarev's proposal is accepted, its application might be considered with regard to Slovak reflexive verbs too. Bogorm converſation 18:59, 30 April 2020 (UTC)[reply]

Sorry for not getting back earlier. We are now using a new template {{bg-reflexive}}, which displays (reflexive) (~ се) on the definition line. IMO, it should only be used in verb entries without "се" and the usual {{lb||reflexive}} on terms with it. --Anatoli T. ^{(обсудить}/^вклад) 06:08, 5 May 2020 (UTC)[reply]

It might be helpful to display reflexive verbs on the same page as the non-reflexive form, but under a different header, displaying the reflexive form. That would be clearer than labels but without necessitating separate pages. The exception might be when the reflexive form is the only form or is part of a broader idiom. Andrew Sheedy (talk) 03:23, 30 April 2020 (UTC)[reply]

Regarding горде́я (gordéja) existing only in горде́я се (gordéja se): IMO this is exactly a scenario where we should have an entry at горде́я (gordéja) that soft-redirects to горде́я се (gordéja se) using {{only in}} or something similar, since someone unfamiliar with the language would probably look up individual words. On that basis, I would be inclined to also tolerate inflected forms of горде́я (gordéja) existing, but I expect that might be more contentious. But we're not paper, even soft redirects are relatively cheap. Meh. - -sche (discuss) 14:25, 30 April 2020 (UTC)[reply]

@-sche: I agree. --Anatoli T. ^{(обсудить}/^вклад) 06:08, 5 May 2020 (UTC)[reply]

I've been using inflection of: infinitive. See Catalan vantar. I think our main consideration should be how language learners will actually search for terms. For Romance languages at least, I think users will see es vanta in running text and look up vanta rather than es vanta because a) there are so many thousands of verbs that could be reflexive but could also be transitive with the reflexive particle (the second type Anatoli mentions) for many possible reasons. And b), dictionaries (at least in the languages I know) are totally centered around individual words. You can find a tot estirar here, but if the page didn't exist, I would look up estirar in a Catalan dictionary, and that's where it's found. I know we're not forced to do things that way, but it makes sense from a user's perspective. Is vanta really the "third-person singular present indicative form of vantar-se"? No, es vanta is, but Romance language learners are conditioned to not search for that. Asking @Rua for input because I think she either removed Catalan reflexive conjugation tables or just told me why someone else did. Ultimateria (talk) 17:17, 30 April 2020 (UTC)[reply]

@Ultimateria: Thanks for the post. It's interesting. I am not familiar with Catalan, so I would find the conjugation table at vantar-se helpful if it included the reflexive particle, even if it's not easy technically, like habiller has "Conjugation of s'habiller" table. The form vantar could probably use a definition line like this:

(reflexive) (~-se) {{form of|ca|infinitive|vantar-se}}

--Anatoli T. ^{(обсудить}/^вклад) 06:08, 5 May 2020 (UTC)[reply]

@Atitarev: I think that should be fine. I forgot to give my input on the main issue here, ~ се; I already like it in CJKV entries, and I don't see why we shouldn't extend it to other languages. As it is, there are a few Catalan and Portuguese verbs out there labeled reflexive, PAGENAME-se, that I'd like to templatize for consistency if people approve of your format. Ultimateria (talk) 17:37, 5 May 2020 (UTC)[reply]