Wiktionary:Beer parlour/2022/March

Plurals and countability

I've created the very rare article German Generaladmiralin. It is countable with the only grammatically permissible plural form being Generaladmiralinnen, but that form isn't attested (a purely statistical phenomenon, of course: very rare word + less common inflected form). What do we want to do in such cases? Pinging @Equinox as the master of countability/plural issues, @Benwing2 as a person very involved with German inflections. — Fytcha T | L | C 12:53, 1 March 2022 (UTC)[reply]

I wouldn’t know that the attestation criteria specifically apply to all forms. Would be too much to expect that every plural form of an Arabic noun is even found in use. أَمَة (ʔama) has exactly one well understood plural, for the others Nöldeke cited provides at least one occurrence in poetry, and another plural is under quarantine, as he says, and with this the reader is well informed. But there is absolutely no doubt that it is correct to state Generaladmiralinnen is the plural of German Generaladmiralin—attestation of which is unasked and unsought. It is more that unusual or unpredictable plurals ask for more evidence, in so much as extraordinary claims require extraordinary evidence. Fay Freak (talk) 15:54, 1 March 2022 (UTC)[reply]
Yeah I agree with this; excluding Generaladmiralinnen would really rather be following the letter of the law, not its spirit. — Fytcha T | L | C 11:04, 2 March 2022 (UTC)
In English we can mark a noun as plural not attested. Vox Sciurorum (talk) 17:32, 1 March 2022 (UTC)
It depends a bit on the meaning whether it even makes sense. Generaladmiralin is obviously countable. Fay Freak (talk) 17:53, 1 March 2022 (UTC)
@Vox Sciurorum: I thought about that as well but perhaps it'd be wrong to use that flag in such cases even if it existed: There's no question that the noun is countable and also none what its ("hypothetical") plural form would be, so writing anything other than Generaladmiralinnen after plural in the article Generaladmiralin does not document an inherent quality of the word but rather a statistical peculiarity. What do you think? — Fytcha T | L | C 11:10, 2 March 2022 (UTC)[reply]
My understanding based on prior discussions (some linked to and summarized in Talk:dulcamini) is that we require triple attestation of a word, not of every inflected form, unless there's reason to doubt (a) that the word inflects (some words are singular-only, etc), or (b) what the inflected form is (the expected plural of star is stars, so if someone says it's staren, that needs cites). My go-to example is "if the masculine singular dative mixed declension form of mitternachtsblau is found to have only two — or zero — Google Books hits, but enough other inflected forms are attested to confirm that mitternachtsblau does indeed inflect, we're not going to create a one-off inflection table with a gap in that one slot, or redefine mitternachtsblauen to say 'masculine singular strong genitive and accusative, weak genitive, dative and accusative, and mixed genitive and accusative but not dative, and feminine weak genitive and dative, and mixed...'", since it would be, as you say, a purely statistical artifact. - -sche (discuss) 17:00, 2 March 2022 (UTC)
@-sche: I think the discussion should be shifted from being form-centric to parameter-centric: We don't require attestations for all unique declined superlative forms of, say, German agitatorisch (e.g. agitatorischstes "wouldn't exist" by that measure) but we do require there to be attestation that proves that the "is-comparable parameter" of agitatorisch is true (which is to say, there needs to be attestation for at least one of the comparative/superlative forms; this should be self-evident). In the case of mitternachtsblauen (suppose it itself hadn't been attested), there could be two types of corroborating evidence for the claim that the parameter that ensures this form's existence is true: 1. At least one other predicative form can be attested, or 2. the constituent blau can be used predicatively. To circle back to the concrete case of countability/plurals: I've used evidence of type 1 in the past in cases where a plural of a noun is only citable in the dative with an -n but not in the nom/acc/gen; the attested existence of the dative plural suffices as evidence for the other plurals. Regarding type 2, that is what I'd want to use here. The claim that Generaladmiralin is countable and takes the -nen plural is based on the fact that these two parameters are retained while compounding.
I think this parameter-centric framework tracks the intuition of editors for what does or doesn't require attestation more closely. — Fytcha T | L | C 17:44, 2 March 2022 (UTC)

Given that the issue of CFI for inflected forms crops up on a regular basis, it may be a good idea to launch a vote to codify what -sche wrote above. MuDavid 栘𩿠 (talk) 01:22, 3 March 2022 (UTC)

No, it wouldn’t. Every codification wreaks new complications. Fay Freak (talk) 08:30, 3 March 2022 (UTC)
@Fytcha, Fay Freak I agree with the above discussion; I think it's fine to include the plural Generaladmiralinnen even if it happens not to be attested, since all native speakers would agree it can be pluralized and will have that form as the plural. I think Fytcha's parameter-centric framework makes a lot of sense. Benwing2 (talk) 07:29, 4 March 2022 (UTC)
BTW I'm running into similar issues trying to fix up the declensions of proper names in German. For some names it's hard to attest plurals, but I imagine in modern usage almost all names in German can be pluralized by adding -s (although I'm not adding them unless I can find them either in dewikt, Duden or Google Books). Benwing2 (talk) 07:31, 4 March 2022 (UTC)
I only just saw this (as I don't look at my "pings" much); also I don't know very much about German. However, the attestability of a plural form doesn't really have any bearing on the countability of the noun. If I say "this morning I saw a wug" then clearly "wug" must be countable, even if we can't find "wugs" in any corpus. That's all. Equinox 02:52, 11 March 2022 (UTC)

are you married: plural translations?

(If you're interested in the phrasebook project, consider adding yourself to the phrasebook workgroup)

Per WT:PB, "[…] or editors have collaboratively decided to include plural translations for an individual phrase.", I am hereby asking whether anybody wants to have plural translations in are you married. Prior to my changes, Polish had translations qualified with "to a couple" which I've removed per our newly voted-on policy. — Fytcha T | L | C 15:34, 3 March 2022 (UTC)

I went ahead and updated the Polish ones for that. Vininn126 (talk) 16:26, 3 March 2022 (UTC)
Edit: I added what WILL be the plural, but removed for now. I thought we were going to have plurals in the same box as their relative singular of formality? Vininn126 (talk) 16:28, 3 March 2022 (UTC)
@Vininn126: That's not what I had in mind when I created the vote; the relevant section of WT:PB is: "Plural forms must not be provided as translations, unless this is inherently necessary due to the nature of the phrase (for example, thank you all), or editors have collaboratively decided to include plural translations for an individual phrase. If, for a certain phrase, both singular as well as plural forms are to be documented, the translation boxes as prescribed by the above paragraph shall be duplicated with singular and plural prepended to the gloss respectively." The rationale for this was to make it less cluttery and to be able to request translations more easily, as the translation box gadget doesn't let you make a request (e.g. for the plural) if there's already a translation in the same box. — Fytcha T | L | C 13:12, 4 March 2022 (UTC)[reply]
The current definition explicitly excludes the most likely "to a couple" translation, which would no doubt mean "are you married to each other". To have a translation for that, you would need to add a separate sense. Although it may not be useful enough for a phrasebook entry in its own right, it might be helpful to include it to avoid confusion, somewhat analogous to {{&lit}} for SOP senses in the regular dictionary. Of course, one might also be asking a group of two or more people whether they are married in general/nonreciprocally, but the "couple" part implies otherwise (though I suppose it might not in a language that distinguishes dual and plural).
Unfortunately, there are lots of unspoken culturally-based assumptions involved that go into what makes a phrase useful for a phrasebook, and we need to be very careful to consider those. Traditionally, this phrase would be used by someone of the opposite gender (more often man asking woman) to determine if the other person is a potential romantic/sexual partner. With more tolerance for nontraditional gender identification and definitions of marriage, assumptions about the gender of the person asking and the person(s) asked (and what those mean) are more complex. And something that might be ambiguous in one language might be explicitly specified in another. Chuck Entz (talk) 21:10, 5 March 2022 (UTC)[reply]
What about those of us in group marriages? DCDuring (talk) 00:49, 6 March 2022 (UTC)

Invitation to Hubs event: Global Conversation on 2022-03-12 at 13:00 UTC

Hello!

The Movement Strategy and Governance team of the Wikimedia Foundation would like to invite you to the next event about "Regional and Thematic Hubs". The Wikimedia Movement is in the process of understanding what Regional and Thematic Hubs should be. Our workshop in November was a good start (read the report), but we're not finished yet.

Over the last weeks we conducted about 16 interviews with groups working on establishing a Hub in their context (see Hubs Dialogue). These interviews informed a report that will serve as a foundation for discussion on March 12. The report is planned to be published on March 9.

The event will take place on March 12, 13:00 to 16:00 UTC on Zoom. Interpretation will be provided in French, Spanish, Arabic, Russian, and Portuguese. Registration is open, and will close on March 10. Anyone interested in the topic is invited to join us. More information on the event on Meta-wiki.

Best regards,

Kaarel Vaidla
Movement Strategy

This user apparently started out as w:mi:User:Te Reo Ahitereiria on Maori Wikipedia in October of 2021. They were given an indefinite block on Wikipedia for vandalism (real vandalism- if they had done that here, I would have blocked them), and then globally locked for longterm abuse- all within a week of their account creation.

Which brings us to March, 2022. They started editing here, with a series of marginally questionable, but definitely not vandalistic, entry creations for various poorly-attested languages of the South Pacific.

They have since been globally locked for block evasion, after which they created a new account, which was then globally locked, and so on. By my count, they've gone through at least 5 accounts in 5 days. I should mention that I have not used my checkuser tools yet- I'll leave that to my colleagues on other wikis, since they're not breaking any local rules as far as I can tell.

So, what should we do about this? We have precedent for giving editors blocked elsewhere a second chance here. Aside from the usual concerns, they need to comply with our criteria for inclusion and follow best practices for editing in languages they don't know (the fact that they edited at Cherokee Wikipedia does make me a bit nervous). I don't have much experience with cross-wiki diplomacy, so I'm open to any help and/or suggestions (I believe we do have the capability to locally unblock them). Whatever we decide, this account-per-day pattern has to stop- there's no effective way to communicate with them, to start with. I haven't mentioned or pinged them on their latest account so as to avoid getting that account locked prematurely before we can discuss this. Chuck Entz (talk) 01:10, 6 March 2022 (UTC)

@Chuck Entz: I don't think their past behavior on other WMF projects should have any bearing on whether they're allowed to edit on en.wikt (there's precedent as you point out), which is also why I haven't blocked any of their obvious socks. I agree with the rest of the message too, we should establish a channel of communication with them and then unlock one account here locally. — Fytcha T | L | C 08:56, 6 March 2022 (UTC)
Some more of their accounts: Special:Contributions/MinecraftRapper, Special:Contributions/MinecraftKing123, Special:Contributions/MinecraftGod12345, Special:Contributions/MinecraftGod123 — Fytcha T | L | C 08:57, 6 March 2022 (UTC)
The other account I know of is Special:Contributions/PasifikaMinecraftGod. All are now globally locked. Chuck Entz (talk) 09:23, 6 March 2022 (UTC)
I have had contact with this user a few days ago and apart from a few newbie mistakes and, at some point, a questionable focus on toponyms in minority languages, they don't seem to cause any problems yet. Thadh (talk) 10:15, 6 March 2022 (UTC)
Is there a provision in global blocking to allow exceptions for specific wikis? Can this user be emailed? DCDuring (talk) 16:47, 6 March 2022 (UTC)

Surjection's Feb 2022 vote

User:Surjection has long been advocating a general split of the labels "informal" and "colloquial", reversing a merge I did 3 years ago or so. Early on in the discussion they threatened unilateral action if they did not get their intended result, and then seem to have done that anyway in the midst of the discussion, despite no consensus to do so. They then created a vote Wiktionary:Votes/2022-01/Label for lower register to ratify what they had already done, and wrote the vote as a simple-majority vote (when all votes normally require a 2/3 supermajority) (a) without either any consensus to do so, and (b) without making it clear that their vote was radically breaking with normal policy (i.e. this stipulation was hidden among a bunch of other verbiage and AFAIK never called out specifically by Surjection). A lot of people ended up surprised and discomfited (to say the least) when they found this out. They have also made specific accusations against me of forum shopping, and insinuated that by opposing them in this unmerger I don't have the best interest of the project at heart; see [1]. Anyone who has worked with me knows this is an utterly ludicrous accusation. And in fact I do not oppose creating a lower-register label; my opposition is to the specific use of the term "colloquial" for it. I proposed a lot of other possibilities, all of which Surjection ended up rejecting. I would in fact be fine with the label "lower-register", which is in the very title of this vote that Surjection created. However I strongly oppose the simple-majority nature of this vote and the way it was created without proper notice, and I will oppose any attempt to claim a consensus on the basis of less than a 2/3 majority. I am bringing this discussion to the BP because it needs to be hashed out somewhere other than in the discussion pages of the mentioned vote. Benwing2 (talk) 02:59, 6 March 2022 (UTC)[reply]

For clauses in contracts prescribing written form, the leading opinion in Germany is that the requirement of the written form can be superseded by oral agreement and thus the contractual terms can be changed orally, unless it is specifically agreed that the requirement of the written form cannot be superseded by oral agreement. Similarly it seems that the supermajority rule may be agreed upon to be superseded in an individual vote but for that it must be, as Benwing appears to refer to, made clear that this policy of supermajority form is broken. We may do so anyway if the vote is complicated by having multiple options so agreement is more difficult to reach. However in either case one should see that substantial agreement is lower than it otherwise is, and the policy has been made specifically because of the instability of user base requiring a higher threshold for continuous actual agreement. And if the vote is ambiguous people may have voted for distinct concepts so there is dissensus. Which I reckon is the actual problem of this vote. A lot of verbiage so Surjection can interpret into it what she wants after people have given support votes for sundry things they like and not grasping the ill plan they would disagree with if they comprehended it, and their agreement, though posted under a unified section, wasn’t referring to the same things. Fay Freak (talk) 03:54, 6 March 2022 (UTC)
It is disappointing albeit not surprising to see that the same lies still keep being peddled despite the fact I already have called them out. Anyone is free to check themselves from the previous thread:
* "they threatened unilateral action if they did not get their intended result". Not unilateral. At the time there was strong consensus in the BP thread that the merge should be undone.
* "then seem to have done that anyway in the midst of the discussion". The only "unilateral action", as you say, that I have done is to revert the change that made the colloquial label an alias of the informal one (so that the label colloquial shows up as informal), which is something you never even did. I have not touched the categorization, so in fact the current situation is exactly as you left it.
* "without either any consensus to do so". That's why there was a two-week (as opposed to a normal one-week) period for discussing how the vote should be carried out. Nobody objected so I assumed I had the consensus. Only now it has become apparent that the only reason that happened is because nobody noticed it. In a sense, yes, that is my fault, as I did not point that out clearly enough, but at the same time, I probably would've still gotten people like that even if I put out that note in bold, red, with 72 pt font size and a blinking effect. The reason for not demanding supermajority is simple; RFM was never the correct venue. It should've been a BP discussion or even a vote on its own.
* "insinuated that by opposing them in this unmerger I don't have the best interest of the project at heart" is a flat-out lie and is born of a willful misinterpretation of what I wrote. Then again, the fact that the rhetoric against my viewpoint has consisted of nothing but endless lies has seriously started to make me question this point.
* "all of which Surjection ended up rejecting" is complete nonsense. I have stated multiple times in the BP discussion that I would rather have any of those other labels than to not have one at all, but Benwing insists I must be opposed to them because I argued against them, completely disregarding the fact that he explicitly invited me to do so and explain why I support colloquial. Apparently it was not enough to point out that the other labels were fine by me, since those arguments still ended up being turned against me in an attempt to attack my character in showing that I don't somehow accept any compromises.
If anything, you're not the one that accepts compromises. Multiple people, me included, have suggested unmerging the label only for specific languages, which would mean having language-specific overrides for the label. This is even something you yourself proposed in the previous thread ("One possibility is to make the unmerger conditional only in specific languages"), but then opposed because you think "it is too confusing".
This casting of aspersions has to stop. — SURJECTION / T / C / L / 10:31, 6 March 2022 (UTC)
I'm not sure I like the whole simple majority thing either, but for what it's worth, I haven't noticed the discussion on splitting the categories either, and I wouldn't be surprised if others didn't, either. This should've been a BP discussion, IMO, and maybe even a full-fledged vote. Also, "They have also made specific accusations against me [...]": Surjection literally said the opposite, saying they were convinced this was not your intention. Thadh (talk) 11:38, 6 March 2022 (UTC)
@Surjection FYI: I see someone has now done a blanket unmerger, including English. Don't be surprised if I remerge everything except for specific languages (we can start with Finnish and Welsh). Benwing2 (talk) 09:24, 12 March 2022 (UTC)
Yes, Svartava did it (see the discussion below). Since the consensus is that English shouldn't have the distinction, merging colloquial with informal in English entries is completely fine by me. It should be discussed for other languages first though. — SURJECTION / T / C / L / 10:52, 12 March 2022 (UTC)

The listing of "vocative singular of __" as a separate definition on Latin headwords

For most classes of Latin nouns, the nominative singular form (used as the lemma in lexicography) is identical to the vocative singular form. For neuter nouns, the accusative singular form is also identical. Some Wiktionary entries for Latin nouns include a separate line defining the lemma form as "vocative singular of __". Should this practice be encouraged or avoided? It doesn't seem useful to me, since almost all noun entries have a "Declension" section that shows all the forms of the word. Including the vocative singular as its own definition line seems to just waste space and add clutter. An example is the entry for gloria: "1. glory, renown, fame, honor 2. vocative singular of glōria". Aside from the nominative and accusative, something similar can apply to the ablative and genitive singulars, which are sometimes spelled the same as the nominative singular, but I think I've less often seen these forms added as extra definitions.--Urszag (talk) 21:14, 7 March 2022 (UTC)

Yeah, I delete those when I see them. They're distracting and pointless. I also feel the same way about having a separate header for, say, feminā underneath the entry for femina. The declension table is right there. Non-lemma entries are meant as cross-references only, and when there is no cross-referencing to be done, they are inappropriate. This, that and the other (talk) 01:40, 13 March 2022 (UTC)
@Urszag I made a list of entries with these unnecessary form-of senses: User:This, that and the other/Latin redundant form-of. This, that and the other (talk) 02:29, 13 March 2022 (UTC)

Universal Code of Conduct Enforcement guidelines ratification voting open from 7 to 21 March 2022

You can find this message translated into additional languages on Meta-wiki.

Hello everyone,

The ratification voting process for the revised enforcement guidelines of the Universal Code of Conduct (UCoC) is now open! Voting commenced on SecurePoll on 7 March 2022 and will conclude on 21 March 2022. Please read more on the voter information and eligibility details.

The Universal Code of Conduct (UCoC) provides a baseline of acceptable behavior for the entire movement. The revised enforcement guidelines were published 24 January 2022 as a proposed way to apply the policy across the movement. You can read more about the UCoC project.

You can also comment on Meta-wiki talk pages in any language. You may also contact the team by email: ucocproject@wikimedia.org

Sincerely,

Movement Strategy and Governance

Wikimedia Foundation

MediaWiki message delivery (talk) 10:22, 8 March 2022 (UTC)

Update on Universal Code of Conduct Enforcement guidelines ratification vote (as of 17 March)

Hi all,

With a little under 4 days left in the poll, I can share that there are 1518 voters as of 17 March 21:00 UTC, and there were 6 voters with a home wiki registration of en.wiktionary.

While I'm keeping in mind that homewiki isn't always indicative of where an editor is active, I want to remind everyone that local opinions are sought in this global decision. Even if your homewiki is not Wiktionary, commenting on the guidelines from the perspective of this project will be beneficial.

Here are some of the other votership numbers:

Votership numbers by homewiki as of 17 March 2022 21:00 UTC

( 564 : enwiki ) ( 168 : dewiki ) ( 90 : frwiki ) ( 69 : eswiki ) ( 71 : ruwiki ) ( 65 : plwiki ) ( 50 : metawiki ) ( 46 : zhwiki ) ( 44 : jawiki ) ( 45 : itwiki ) ( 29 : commonswiki ) ( 20 : arwiki ) ( 19 : cswiki ) ( 18 : ptwiki ) ( 17 : nlwiki ) ( 17 : kowiki ) ( 15 : trwiki ) ( 11 : cawiki ) ( 10 : idwiki ) ( 144 : 78 other projects )

The poll can be accessed locally here: Special:SecurePoll/vote/378.

Let me know if you have any questions. --Mervat (WMF) (talk) 15:36, 18 March 2022 (UTC)
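
As a quick sanity check on the figures in the update above, the per-homewiki counts can be summed and compared with the reported total. The sketch below is only illustrative: the dictionary simply transcribes the listed numbers, and it assumes (the message does not say so) that the 6 en.wiktionary voters are counted separately from the "78 other projects" bucket.

```python
# Votership figures transcribed from the 17 March 2022 update (homewiki -> voters).
votes_by_homewiki = {
    "enwiki": 564, "dewiki": 168, "frwiki": 90, "eswiki": 69, "ruwiki": 71,
    "plwiki": 65, "metawiki": 50, "zhwiki": 46, "jawiki": 44, "itwiki": 45,
    "commonswiki": 29, "arwiki": 20, "cswiki": 19, "ptwiki": 18, "nlwiki": 17,
    "kowiki": 17, "trwiki": 15, "cawiki": 11, "idwiki": 10,
    "78 other projects": 144,
}

# Figures quoted in the prose of the same message.
REPORTED_TOTAL = 1518
EN_WIKTIONARY_VOTERS = 6  # assumption: not already inside "78 other projects"

listed_total = sum(votes_by_homewiki.values())
gap = REPORTED_TOTAL - listed_total

print(f"sum of listed figures: {listed_total}")
print(f"reported total:        {REPORTED_TOTAL}")
print(f"difference:            {gap}")
print(f"enwiki share of total: {votes_by_homewiki['enwiki'] / REPORTED_TOTAL:.1%}")
```

Under that assumption the listed figures plus the en.wiktionary voters match the reported total; if en.wiktionary were instead inside the "other projects" bucket, the list would fall 6 short of it.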

Edit request on

I think the following two translingual definitions should be added:

  1. The right-facing swastika, a symbol widely used in South Asian religious beliefs as a symbol of divinity and spirituality.
  2. The right-facing swastika, a symbol of Nazism.

The first proposed definition is based on the wording used at (note that both variants are used spiritually, sometimes with different meanings).

The second one is based on the wording used on for communism. 70.172.194.25 03:02, 9 March 2022 (UTC)

  Done —Svārtava (t/u) • 03:28, 9 March 2022 (UTC)

What's going on with colloquialism categories?

User:The Ice Mage here editing from NLN centre as per my user page. I just want to ask what is going on with these categories? I recreated 4 and there were only 5 in Category:Colloquialisms by language before I got to work just now. The logs show they were deleted as empty categories last year but Category:Finnish colloquialisms, which I just recreated, has over 1000 entries in it! Has there been some kind of U-turn on some policy or something related to these categories..? 37.110.218.43 11:54, 9 March 2022 (UTC)

Per RFM discussion at Category talk:Colloquialisms by language, the category was merged with Category:Informal terms by language. Today, Wiktionary:Votes/2022-01/Label for lower register, which sought to overturn the RFM result, passed, so the RFM change was reverted and the colloquialisms categories suddenly filled up again. See also: (history) Module:labels/data. —Svārtava (t/u) • 12:05, 9 March 2022 (UTC)
So, these categories should be recreated then? Alright. I'm disappointed that WantedCategories hasn't updated itself yet so I can't see a list of categories to revive... 37.110.218.43 12:11, 9 March 2022 (UTC)
Yes, they will be recreated soon, probably by WingerBot. —Svārtava (t/u) • 12:12, 9 March 2022 (UTC)
Deleted categories: list. —Svārtava (t/u) • 12:14, 9 March 2022 (UTC)
That would be nice, but maybe I'll beat the bot to the punch lol. Idk why really but I enjoy creating chains of category pages such as "X terms derived from Y" and topical categories. :) 37.110.218.43 12:15, 9 March 2022 (UTC)
Well, then you might find User:Erutuon/scripts/addAutoCat.js helpful and quick, but it would probably not work for an anon; you might create a temporary account for it. —Svārtava (t/u) • 12:19, 9 March 2022 (UTC)
I'm confused and a little curious after skimming through the old RFM discussion...am I right in saying you supported the merge, but then later advocated for the categories to be resplit...? 37.110.218.43 12:36, 9 March 2022 (UTC)
Not really, I would still support the categories merged and I opposed the vote, but it passed so I reverted the category merger in the module (Surjection would probably do it anyway even if I hadn't). —Svārtava (t/u) • 12:48, 9 March 2022 (UTC)
I see, fair enough. 37.110.218.43 13:22, 9 March 2022 (UTC)
I would've let the above BP discussion run its course first, although it seems at this point it already kind of has without any clear consensus either way. — SURJECTION / T / C / L / 13:52, 9 March 2022 (UTC)
I'm rebuilding most of them now. I have tried so hard to clear out Special:WantedCategories, but there is so much dross in there where someone has made a one-off category that has no connection with a module or any particular scheme, added it to three entries and never tried to actually make the category or categorize said category that it's impossible. —Justin (koavf)TCM 19:05, 9 March 2022 (UTC)
@Koavf There was no intention on the part of anyone to split English informal/colloquial. I am going to remerge them. Benwing2 (talk) 09:17, 12 March 2022 (UTC)
@Benwing2: Okay. I'm not sure how that responds to what I wrote. —Justin (koavf)TCM 16:28, 12 March 2022 (UTC)

Arabic root categorization

Currently we have two parallel category schemes for Arabic roots, exemplified by the following:

I think it would make sense to put the Arabic-specific category inside the general "Terms derived" category, perhaps with a sort key to make it show up first, for easier navigation. 70.172.194.25 07:18, 10 March 2022 (UTC)

What is the rationale for using this template? I guess it should be used for ‘nonexisting’ entries that could potentially be created in the future, but currently do not meet our criteria for inclusion: thus basically RFV-failed entries, like cyberpathy, Banglaphilia, unculturalness, etc. However, I see that many entries in this category are also RFD-deleted stuff, such as Republican Party, Democratic Party, that would likely not be created again; therefore I wonder why they should stay in that category instead of being totally deleted. ·~ dictátor·mundꟾ 14:24, 10 March 2022 (UTC)

It's more useful for someone to see that an entry has been deleted than to see nothing at all. Theknightwho (talk) 15:09, 25 March 2022 (UTC)
@Theknightwho: Thanks for replying. However I don’t get your point: while creating a new entry, one can already see if the entry had been previously deleted… ·~ dictátor·mundꟾ 17:17, 23 April 2022 (UTC)

Outranked by forks?

I've noticed that forks/mirrors like Wordcyclopedia and WordSense sometimes rank before Wiktionary on Google (or only they are listed, and not Wiktionary). Example. I'm not sure there's really anything we can or should do about this, but I thought it was something worth noting. Obviously, as long as these sites give credit, they are free to redistribute Wiktionary's content. I just think it's odd that they would have better SEO. 70.172.194.25 00:09, 11 March 2022 (UTC)

I don't think it's odd. They can manually optimize the SEO and tailor it for the dictionary domain: the page is littered with keywords and metadata like "synonym, meaning, spelling" which might be used in queries. And MediaWiki is not optimized for readers, with lots of distracting, non-content related elements. – Jberkel 00:46, 11 March 2022 (UTC)[reply]
No, we don’t have bad SEO, Google is a bad search engine, our main content which bears on SEO is—naturally—hardly different and we must have better backlinks. Having “good SEO” is basically suggesting you are spammer trying to manipulate attention in order to flog some snake-oil, which to present is the main-purpose of Google. They don’t cease to think like an adman, hence they stay close to the marketing snake pit. On DuckDuckGo Wiktionary uses to rank first or otherwise high when having been given random words, and I use that because it has more informative results, particularly considering that the interest is linguistic: If I seek to buy something Google indeed is a better choice, but otherwise Google does not have the best results. You see that one succeeds more with superficialities than organic content on it. Fortunately we don’t need to care for Google since we are not selling things. Reputation spreads a dictionary. (While quality was less decisive in the choice of the masses for a search engine.) Fay Freak (talk) 01:01, 11 March 2022 (UTC)[reply]
If no one knows you have a reputation, you have none. Respondents at the Wikipedia Reference desk citing a source for the meaning of a term rarely link to Wiktionary.  --Lambiam 10:33, 11 March 2022 (UTC)[reply]
@Lambiam: I'm increasingly seeing foreign words in WP articles linked directly to Wiktionary. It's not a lot but better than none and I think this should be encouraged. ApisAzuli (talk) 12:18, 14 March 2022 (UTC)[reply]
Well, let's nest SEO keywords into the fine-print of the footer that goes at the bottom of all our pages, e.g. add a short statement like "Wiktionary is a free dictionary with definitions, synonyms, [etc]". That'd be easy to do, most humans don't notice we have that fine-print footer at all and so shouldn't mind, and if it'd help our prominence, let's try it... - -sche (discuss) 03:20, 12 March 2022 (UTC)[reply]
Well, it is doubtful that search engines put value on text that is repeated on every page of a site. If I put in a common (and all the more an uncommon) Arabic word like أخ or دم or ميس, Wiktionary is first on DDG or second only after Arabic Wikipedia. Apparently it knows this is a dictionary and relevant. And people complain that the search engine does not give the results “as is” any more. Clearly it should weigh hits and should not be trivially gameable. Fay Freak (talk) 15:33, 12 March 2022 (UTC)[reply]
I know a lot of people prefer to use a search engine other than Google. But the undeniable reality is that Google is by far the most popular engine used, and our entries generally do poorly in its results, being outranked by spammier sites. Even those sites that borrow our definitions will omit our etymologies, translations and so on, which is a loss to readers. Adding some gentle SEO keywords could be good (sadly, it has been known to work), but I know that Google also looks at how modern the site's appearance is - we have text that spills across the entire width of the page like something out of the 90s, which can't help. This, that and the other (talk) 10:38, 13 March 2022 (UTC)[reply]

I believe that Wiktionary has incredible potential, but it still absolutely SUCKS. See for instance, my very recent creation of an entry for Wake Island. You can't overlook a concept that is in all internet dictionaries and encyclopedias and expect to be considered a reputable source, you just can't. The mirrors and forks represent scammers investing in the potential of what Wiktionary could one day be. --Geographyinitiative (talk) 13:24, 14 March 2022 (UTC)[reply]

SEO questions aside, it's a shame that Wiktionary hasn't reached the same status and level of trust as Wikipedia, after all this time. People link to well-known/traditional dictionaries or even urbandictionary.com, maybe because they are seen as "authoritative". And it should just be a small mental leap from "encyclopedia" to "dictionary". – Jberkel 14:14, 16 March 2022 (UTC)[reply]

Thanks to Justin/Koavf's efforts, we have a nice Appendix:English capitonyms. Someone could generate a list of English entries (and next, other languages) that differ only in capitalization, where both entries have "gloss definitions" (i.e. excluding cases where one has no definition beyond "alternative spelling / lettercase form of" the other), and use that to expand the appendix. This raises two questions: is this better as an appendix or a category? And, should we exclude certain types of pairs? There are many trivial conversions of nouns into names (smith → Smith, joy → Joy, etc), which seem less interesting than etymologically-unrelated pairs like March vs march. (We could shunt them to a separate subsection of the appendix and/or a separate subcategory, but do we want to do that? Or exclude them entirely? Or lump everything together without distinction?) - -sche (discuss) 03:10, 12 March 2022 (UTC)[reply]

Thanks for thinking about this. I think that the kind of trivial examples are really only a problem depending on how many instances we end up having, really. —Justin (koavf)TCM 05:00, 12 March 2022 (UTC)[reply]
As a wiki we won't "end up" with a number of them. It seems like yet another open class.
I expect that a good first use would be to make sure that, if two entries differing only in capitalization both have gloss definitions, there is good reason one should not be an alternative form. I would like to see whether we have both capitalized and uncapitalized vernacular names of organisms.
BTW, I assume that Translingual names will be treated as if they were from a language distinct from any normal language. DCDuring (talk) 02:27, 13 March 2022 (UTC)[reply]
@-sche, @Koavf This is a fascinating project, and I think it is much more worthy of being a curated appendix sorted by scenario, as -sche suggested at Appendix talk:English capitonyms, compared to a plain category.
For your curiosity, I've generated a full list of English capitonyms with very rudimentary glosses (just the text of the first English sense in the entry) at User:This, that and the other/capitonyms. There are about 10,000 pairs. The lists are slightly messy (for example, multi-way capitonyms like BPs/bps/BPS/Bps get listed as multiple "pairs"), but they still expose some potentially interesting pairs like Bieberite/bieberite and Hank/hank. Please let me know if you have ideas for how I could improve them so they can be useful as raw input for the project. This, that and the other (talk) 09:24, 13 March 2022 (UTC)[reply]
This is great and I personally love B/bieberite. Maybe the solution is manually adding to the appendix and striking out ones from your userpage that have been added? We'll probably need to split the appendix, since 10,000 is quite a few. Maybe we could have a criterion that we don't include abbreviations? —Justin (koavf)TCM 15:41, 13 March 2022 (UTC)[reply]
Honestly I think we should just pick the ones we find "interesting" to start with! Long term, there are three broad categories of trivial ones that we might want to exclude: abbreviations, place names, and surnames. A fourth category where the difference in sense is very minor (Aidos/aidos, Devil/devil) might also be excluded. This, that and the other (talk) 02:54, 14 March 2022 (UTC)[reply]
I made User:This, that and the other/capitonyms/ba. I'd better stop now or I'll end up doing the whole alphabet... This, that and the other (talk) 11:57, 14 March 2022 (UTC)[reply]
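For anyone who wants to reproduce this kind of list, here is a minimal sketch of the grouping step in Python. It is not the script used for the userpage above; the function name and the sample titles are only illustrative, and it assumes you already have a plain list of entry titles (for example from a dump). It groups titles by their case-folded form, so a multi-way set like BPs/bps/BPS/Bps comes out as one group instead of several "pairs". It does not check whether the entries have gloss definitions; that filter would need the entry text.

```python
from collections import defaultdict


def find_capitonym_groups(titles):
    """Group entry titles that differ only in capitalization.

    Returns a list of groups, each a sorted list of two or more spellings.
    """
    groups = defaultdict(set)
    for title in titles:
        # casefold() is a simplification: for some non-English letters it
        # merges more than pure case differences (an assumption to revisit).
        groups[title.casefold()].add(title)
    # Keep only keys with at least two distinct spellings, in a stable order.
    return [sorted(forms) for _, forms in sorted(groups.items()) if len(forms) > 1]


# Hypothetical sample input; real input would be the full title list.
titles = ["March", "march", "Hank", "hank", "BPs", "bps", "BPS", "Bps", "Wake Island"]
for group in find_capitonym_groups(titles):
    print(" / ".join(group))
```

The trivial classes mentioned above (abbreviations, place names, surnames) could then be filtered out of the groups in a second pass, once there is a way to tell them apart.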

Hubs Dialogue Findings Summary

Hello; After conducting conversations with many Wikimedians from around the world, who are planning or working on possible future hubs, we are happy to share the findings summary from the series of the Hubs Dialogue, which you can find here.
The results focus on the aspect of the existing problems in the Wikimedia movement that the hubs may be needed to solve, including both: 1. globally shared needs among all Wikimedia communities, and 2. needs specific to the context of a particular community or a group of communities.

Movement Strategy and Governance Team Wikimedia Foundation --Mervat (WMF) (discusscontribs) 11:58, 12 March 2022 (UTC)[reply]

I have created the above vote for the coalmine rule to reflect more of the current practises. Comments, suggestions, and objections are welcome here before the vote starts. —Svārtava (t/u) • 06:54, 13 March 2022 (UTC)[reply]

Categories for sprachbünde

(See also: Category_talk:Languages_of_the_Balkans, Wiktionary:Requests_for_deletion/Others#Category:Languages_of_the_Caucasus)

While grouping any bunch of languages together based on merely spatial proximity is nonsense, I want to know whether creating categories for sprachbünde is something that the community is in favor of. I would want to add Category:Romanian language to Category:Languages belonging to the Balkan sprachbund, which in turn should be part of Category:Sprachbünde. I know that what is and isn't a sprachbund is debatable but so are language families which we nevertheless include (so we'd just go with the scientific mainstream). I personally think this is interesting and valuable linguistic information that is also very low-maintenance. — Fytcha T | L | C 10:45, 13 March 2022 (UTC)[reply]

FYI I do not see how this is useful because categories work best when they index large amounts of data in a sorted fashion. The Balkan Sprachbund Category would have a handful of entries only, and a collection of Sprachbünde would be fairly unrelated. It's best served by an encyclopedic treatment, where the individual context is to be elaborated. Its only utility, that I can fathom, would be increasing awareness for those few users who stumble through the unstructured (sorted alphabetically) category box at the bottom of a page, and if I understand correctly that goes only for the categorization of a category, not the lemmas. A link can be established through the about-language pages, however. On the downside, there is the risk that other categories, e.g. Fusional Language, would be justified similarly. This implies that the matter should not be decided on the one particular instance of Romanian, although I think I understand the significance in this case.
The irony is, on the one hand, that Sprachbund means almost by definition that the Sprachbund languages precisely do not fit the usual tree models. I am not involved enough in the Balkans to have any opinion on how this squares with the usual taxonomy as an Eastern Romance language. I do suspect that, since Italic folks entered the peninsula from the north, the relation is much vice versa. This is not debatable, because it is well known that the individual disciplines, linguistics, archaeology, history etc. do not talk enough to each other, or didn't until recently.
On the other hand, some debatable language families are debatable precisely because they do show Sprachbund qualities. That's what I heard at least about Proto-West-Germanic, which we include nevertheless. And I don't see much difference from the notorious vulgar dialect continuum that is broadly labeled Latin.
In a similar vein, the dating of our nominally Ancient Greek spanning millennia throws a wrench in your works because the anachronisms between that and the much much later attested Romanian, or Albanian, whatever, are so far not resolved. ApisAzuli (talk) 13:23, 19 March 2022 (UTC)[reply]

To make this more useful for people not fluent in Hebrew script, we could add automatic transliteration. There are a few characters (beth, vav, pe, qoph) with two different possible Latin transliterations, but we could just do e.g. b/v for beth. Is there any reason not to add this? 70.172.194.25 19:39, 13 March 2022 (UTC)[reply]

Implemented. Feel free to give feedback if the transliterations could be improved. I ended up mostly following Wiktionary:About_Hebrew#Romanizations, using the (IMO) simplest unambiguous form for each consonant. I did not yet implement the thing to substitute _ for word-final h, etc., but I'm not sure how necessary that is for a root; e.g., only one existing root page seems to do it: צ־ו־ה. 70.172.194.25 05:13, 14 March 2022 (UTC)[reply]
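To make the idea concrete for anyone reading along, here is a rough Python sketch of letter-by-letter root transliteration. It is not the code that was actually deployed: the mapping table, the slash notation for ambiguous letters (as in the "b/v for beth" suggestion above) and the function name are all assumptions for illustration, and the values are not checked against Wiktionary:About Hebrew. It assumes root letters are separated by a maqaf (U+05BE), as in צ־ו־ה.

```python
MAQAF = "\u05be"  # Hebrew hyphen assumed to separate the letters of a root

# Illustrative consonant table (an assumption, not the deployed one).
# Final letter forms map to the same value as their regular forms.
TRANSLIT = {
    "א": "'", "ב": "b/v", "ג": "g", "ד": "d", "ה": "h", "ו": "v/w",
    "ז": "z", "ח": "kh", "ט": "t", "י": "y", "כ": "k/kh", "ך": "k/kh",
    "ל": "l", "מ": "m", "ם": "m", "נ": "n", "ן": "n", "ס": "s",
    "ע": "'", "פ": "p/f", "ף": "p/f", "צ": "ts", "ץ": "ts", "ק": "k/q",
    "ר": "r", "ש": "sh", "ת": "t",
}


def transliterate_root(root):
    """Transliterate a maqaf-separated Hebrew root letter by letter."""
    letters = root.split(MAQAF)
    # Letters missing from the table are passed through unchanged.
    return "-".join(TRANSLIT.get(letter, letter) for letter in letters)


print(transliterate_root(MAQAF.join(["צ", "ו", "ה"])))
```

Using one unambiguous value per consonant, as described above, would just mean replacing the slashed entries in the table with a single form.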

Small change to WT:EL#Translations

WT:EL#Translations says: "If you think {{t}} is too complex, simply enclose the translation in square brackets." This obviously should never ever be done. Does anybody object to me removing this sentence? — Fytcha T | L | C 13:35, 14 March 2022 (UTC)[reply]

Since the entry layout is supposed to reflect how we want entries to look, and not just the bare minimum that we tolerate, the change seems fine. If people add translations using a plain link it can always be changed. FYI there are currently quite a few pages that use this syntax, e.g. [2]. Perhaps a bot could fix these? 70.172.194.25 20:37, 14 March 2022 (UTC)[reply]
I support that. I recently ran a bot to convert single words enclosed in square brackets to {{t}}. More complex entries need human-oversight because there is a lot of variation. JeffDoozan (talk) 15:14, 15 March 2022 (UTC)[reply]
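As a rough illustration of the "single word in square brackets" case, here is a minimal Python sketch. It is not JeffDoozan's bot; the regular expression, the function name and the idea of passing the language code in from the caller are assumptions. It rewrites a translation line only when it consists of exactly one plain link and leaves anything more complex untouched for a human to review.

```python
import re

# A translation line such as "* French: [[mot]]" holding exactly one plain
# link (no pipe, no section anchor, no spaces inside the brackets).
SIMPLE_LINE = re.compile(r"^(\*+:?\s*[^:\[\]]+:\s*)\[\[([^\[\]|#\s]+)\]\]\s*$")


def convert_line(line, lang_code):
    """Convert a simple bracketed translation to {{t}}; else return it as is.

    lang_code is supplied by the caller (hypothetical; a real bot would
    have to look it up from the language name).
    """
    match = SIMPLE_LINE.match(line)
    if match is None:
        return line  # anything more complex is left for human oversight
    prefix, word = match.groups()
    return prefix + "{{t|" + lang_code + "|" + word + "}}"


print(convert_line("* French: [[mot]]", "fr"))
print(convert_line("* German: [[Haus]], [[Heim]]", "de"))
```

Lines with several links, qualifiers or piped links fall through unchanged, which matches the point above that those need a human eye.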

Wiki Loves Folklore 2022 ends tomorrow

 

The international photographic contest Wiki Loves Folklore 2022 ends on 15th March 2022 23:59:59 UTC. This is the last chance of the year to upload images about local folk culture, festivals, cuisine, costume, folklore etc. on Wikimedia Commons. Watch our social media handles for regular updates and the declaration of winners.

(Facebook, Twitter, Instagram)

The writing competition Feminism and Folklore will run till 31st of March 2022 23:59:59 UTC. Write about your local folk tradition, women, folk festivals, folk dances, folk music, folk activities, folk games, folk cuisine, folk wear, folklore, and tradition, including ballads, folktales, fairy tales, legends, traditional song and dance, folk plays, games, seasonal events, calendar customs, folk arts, folk religion, mythology etc. on your local Wikipedia. Check if your local Wikipedia is participating

A special competition called Wiki Loves Falles is organised in Spain and the world from 15th March 2022 till 15th April 2022 to document local folk culture and Falles in Valencia, Spain. Learn more about it on the Catalan Wikipedia project page.

We look forward to your immense co-operation.

Thanks Wiki Loves Folklore international Team MediaWiki message delivery (talk) 14:41, 14 March 2022 (UTC)[reply]

Leadership Development Working Group: Apply to join! (14 March to 10 April 2022)

You can find this message translated into additional languages on Meta-wiki.

Hello everyone,

Thank you to everyone who participated in the feedback period for the Leadership Development Working Group initiative. A summary of the feedback can be found on Meta-wiki. This feedback will be shared with the working group to inform their work. The application period to join the Working Group is now open and will close on April 10, 2022. Please review the information about the working group, share with community members who might be interested, and apply if you are interested.

Thank you,

From the Community Development team
Mervat (WMF) 09:22, 17 March 2022 (UTC)[reply]

Phono-semantic matching

Is cool arrow a correct example of a phono-semantic matching? I ask because I was surprised by the redlinked category "English phono-semantic matchings from Spanish", and then checked the parent category of Category:English phono-semantic matchings and saw it was nearly empty. Other similar phrases like mercy buckets and grassy ass weren't categorized as such either. Some of them are grouped under pronunciation spellings or eye dialect. Are those more appropriate categories to use? Does the fact that it's intentionally spelled wrong in order to be humorous make a difference? 96.57.6.131 04:46, 18 March 2022 (UTC)[reply]

In your Spanish examples I see the "phono-" but not the "semantic". What connection do the words cool arrow have to the concept of being an asshole? The Wikipedia article gives some French examples where there is a true phono-semantic matching process. This, that and the other (talk) 05:17, 18 March 2022 (UTC)[reply]
I see. 'Mercy buckets' may be sort of a valid example then (given the explanation in its etymology section), but the others don't reflect any semantic matching. I think I did actually know the correct definition of the term but for some reason forgot and blurred it with malapropism in my mind. 96.57.6.131 05:25, 18 March 2022 (UTC)[reply]
Is Dutch hangmat (hammock), from Spanish hamaca, an example? Category:Dutch phono-semantic matchings from Spanish is empty.  --Lambiam 09:03, 18 March 2022 (UTC)[reply]

Why are all Chinese varieties stuffed under one Chinese?

I find it inconvenient to look up a specific language (e.g. Cantonese or Hokkien). Not to mention Romance languages like Ladin and Istriot that I have never heard of, if even Japonic languages can be divided into Okinawan, Miyako, Yaeyama, etc. as in なー, why not just split the Sinitic languages as much as possible? 汩汩银泉 (talk) 11:13, 19 March 2022 (UTC)[reply]

@汩汩银泉, See Wiktionary:Votes/pl-2014-04/Unified_Chinese, and while I'm not a Chinese-language editor, I do agree that there's some obscurity, more so when it comes to historical Chinese (Old & Middle) and terms that derive from them, especially for readers who are not well-acquainted with Chinese. I've brought it up to the #Chinese channel on Discord as well. (Notifying Atitarev, Tooironic, Fish bowl, Justinrleung, Mar vin kaiser, RcAlex36, The dog2, Frigoris, 沈澄心, 恨国党非蠢即坏, Michael Ly): AG202 (talk) 02:56, 20 March 2022 (UTC)[reply]
I don’t think it can be inconvenient. Just accept it as different. And the portrayal of the Chinese languages collectively is more compact, so it is likely that you will be able to use it for a more efficient grasp of information. Merging other language groups would pose other problems. Fay Freak (talk) 03:16, 20 March 2022 (UTC)[reply]
Is it? I think Ladin and Istriot look compact enough in the part "Descendants" of Latin as in lignum#Latin. 汩汩银泉 (talk) 15:39, 20 March 2022 (UTC)[reply]
How is it inconvenient to look up a specific language? All topolect readings are listed in one place in every entry (under the Pronunciation heading). We also provide a wide range of dialectal synonyms. ---> Tooironic (talk) 04:06, 20 March 2022 (UTC)[reply]
The examples and explanations are mixed between languages, and often can only be identified by paying attention to the subscripts; the examples themselves often only correspond to one of the languages. I see cross-linguistic comparisons as more of an etymological work, like the etymological part of Germanic languages. Of course, I'm not opposed to putting it under one heading for cross-linguistic comparison reasons, but wouldn't the format ===languageA=== be more elegant than stuffing them all together? 汩汩银泉 (talk) 15:32, 20 March 2022 (UTC)[reply]
Because the usage of characters has made a unique situation where boundaries are blurred, like reading literary Chinese or ancient poems aloud using the local dialect, Hong Kong's "written language" being essentially Mandarin but perceived by locals as "formal Cantonese" (書面語书面语 (shūmiànyǔ) [3]), and modern dictionary makers reconstructing Mandarin pronunciations for every obscure hapax abomination. Unification would be impossible if each Chinese had its own phonetic script. —Fish bowl (talk) 05:02, 20 March 2022 (UTC)[reply]
I am not from Hong Kong, and I have never taken Mandarin scripts to be the written form of my mother tongue. When we talk about the written form of our language, we mainly only refer to sentences like the following:
我某识锦子讲话有好耍到边啲去。
(after regularization and in Trad. Characters) 我冇識噉子講話有好耍到邊啲去。
(En.)I don't get the punch lines in those kinds of words.
Of course, there were sister languages in French Indochina that were recorded purely in Latin characters, which may be compared to the relationship between Dungan and a certain variety of Mandarin, but much closer. In fact, my ability to read aloud the shūmiànyǔ of Hong Kong is close to that of an illiterate person. 汩汩银泉 (talk) 15:17, 20 March 2022 (UTC)[reply]
If you divide Chinese into different separate varieties, where will you put Classical/Literary Chinese? And how will you deal with individual characters in different Chinese lects especially when many of them are bound morphemes that make up compounds? RcAlex36 (talk) 05:34, 20 March 2022 (UTC)[reply]
"where will you put Classical/Literary Chinese?" Just like there's a separate language header for Middle English, make a new header for zh-classical, just like there's a zh-classical.wikipedia.org. One day Wiktionary will wake up and regret slamming (minimum) ten languages into one language header. --Geographyinitiative (talk) 11:47, 20 March 2022 (UTC)[reply]
Sometimes it's worth reconsidering whether wiki code is the best way to support the world's thousands of languages. 汩汩银泉 (talk) 15:49, 20 March 2022 (UTC)[reply]
Well, like ==Classical/Literary Chinese==? 汩汩银泉 (talk) 15:33, 20 March 2022 (UTC)[reply]
What would you include under this heading? Would you include Romance of the Three Kingdoms? RcAlex36 (talk) 15:45, 20 March 2022 (UTC)[reply]
I don't really understand your point of view. The Romance of the Three Kingdoms is obviously a novel in the near ancient Mandarin. 汩汩银泉 (talk) 15:55, 20 March 2022 (UTC)[reply]
How about: Would you include Bianwen under Classical/Literary Chinese? 汩汩银泉 (talk) 15:59, 20 March 2022 (UTC)[reply]
@汩汩银泉: To be honest, there are pros and cons for both options. I think though that the pros outweigh the cons in the current Unified Chinese format. The Chinese language is too messy to separate into several smaller varieties. As for the cons, I admit, they have not been solved yet. For example, many Chinese entries in Wiktionary list several pronunciations from various Chinese varieties, but they do not indicate the intricacies of how literary/colloquial the word is in each variety/dialect (at least for most words), plus it's hard to provide sentence examples for each variety/dialect. But I think there are ways to get around that, to give more information on how a specific word is used in various varieties/dialects. Either through usage notes, or the qualifiers in definitions. --Mar vin kaiser (talk) 22:15, 20 March 2022 (UTC)[reply]
Words that are commonly used in one language may be extremely rare or even considered foreign in another language. The Wiktionary only suggests that a word is used in a particular "Non-Standard Chinese language" by subscripting it, but does not specify for which language the word is rare in the "Written Standard Chinese". It was very difficult for me to get the information quickly as a learner of a particular Sinitic language. So does one really say 不然 or 貌似 in Min Nan? If not, should I look up "Min Nan" every time by ctrl+f under synonyms? Why not just list the vocabulary of the descendant languages under the ancestral languages? Or, conversely, why not list all similar words in Castilian, Galician, Portuguese, Catalan, Occitan, French, Romansh, Lombard, Venetian, Italian, Sicilian, Romanian, Sardinian, etc., under the word in one of these languages? I also often learn French through Spanish or vice versa, which also requires "a more efficient grasp of information".
No matter what, I have no such feeling when I looked up English and German etymologies, as well for French, Spanish, Vietnamese, and Korean words via Wiktionary. Maybe in the case of Min Nan (Taiwan), Wiktionary itself is not a dictionary, not comparable to twblg.dict.edu.tw, moedict.tw or even itaigi.tw. 汩汩银泉 (talk) 09:47, 21 March 2022 (UTC)[reply]
As a non-native advanced speaker of both Minnan and Hakka, I support the current layout for the Chinese languages and find it as a unique layout and primary resource for many things not possible with my paper dictionaries. I don't think there is any problem with listing the Minnan pronunciation for 不然 even though it's not as common as "na7-bo5". Currently the 不然 does not list the synonyms in other lects, however on the 貌似 page, I can find the synonyms 甲像 and 親像 which are common in Minnan, which have their own pages. If I enter "na-bo" in the search box, I can get to the page 若無 which has a synonym box that includes 不然, so I don't know why the synonym box is missing from the 不然 page. I enjoy being able to browse all of the other Chinese dialect vocabulary under these headings, especially as I improve my knowledge of both Yue and Wu and others. I own approximately 100 large-volume Chinese dialect paper dictionaries, many of which are translations between each other, and each pose their own challenges when it comes to time efficiency. I don't own that many because I intend to learn them all, I just like having the comprehensive reference available. My only regret with Wiktionary as it is, is the lack of example sentences and usage for Minnan and other lects which my paper dictionaries are full of. So sometimes I'll come to Wiktionary to find a synonym or pronunciation via Mandarin, then go back to my paper dictionary to get more usage information. I think over time all of this will improve. 
As for Romance languages, the same issue exists, as not every word that is spelled the same has the same meaning, for example the Catalan term encinta listed under the Latin descendants for incinctus goes to the Spanish page encinta, which leads to encinto, neither of which list the Catalan term found on the Latin descendants, but only there do you find the common Spanish synonym embarazado, which I find amusing as men don't get pregnant, but they use seahorse in the example sentence, which is clever. So actually, I find that I've wasted more time between all the Romance links than I did on the Chinese. I think a better solution is one like the Chinese, where you have a common Latin ancestor and all the descendants are listed on one page with a separate synonym box that indicates which languages use their own words. Descendants do not mean they carry the same semantics, as we all know from French/English faux amis. Kangtw (talk) 03:30, 3 April 2022 (UTC)[reply]
As for the drawbacks for listing every Sinitic language, I think that hindering the enthusiasm of Japanese, (unlikely) Korean, and (unlikely) Vietnamese contributors can be counted as one, assuming that each word as long as in Chinese characters is filled with information unrelated to their native language(s). 汩汩银泉 (talk) 09:59, 21 March 2022 (UTC)[reply]
@汩汩银泉: I get your point about the drawback of the current system. Actually, it's an issue that has bothered me for a while, but I got thinking about it more since you raised the Unified Chinese format here. There are many Chinese entries in Wiktionary that are listed in Standard Chinese, and it lists pronunciations in various Chinese lects, but it's often unclear whether that word is actually colloquially used in that Chinese lect or not. Like for example, the entry 趕快赶快 (gǎnkuài) has a Min Nan pronunciation but I've never heard anyone speaking Hokkien and saying 趕快. But at least in this case, we have a dialectal synonym chart, but that doesn't exist for most entries. @Justinrleung, Tooironic, Fish bowl What do you guys think? Is there a solution to this? --Mar vin kaiser (talk) 14:11, 21 March 2022 (UTC)[reply]
There are several issues with splitting Chinese (back). First, how do we want to split it? By major topolects (Mandarin, Yue, Wu, etc.)? The same issues would arise even with these splits because even within these topolects there is significant variation (Shanghainese vs. Wenzhounese, for example). If we split them up too much (like by city/town), it would never be manageable. Second, as mentioned above, it ignores the fact that in many varieties of Chinese, there is a sort of diglossic situation where some words are reserved for the written register or reading written texts. For example, in reading, Cantonese speakers would see 我們的祖母 and could read it as ngo5 mun4 dik1 zou2 mou5, even though in colloquial speech, they would say 我哋嘅嫲嫲 (ngo5 dei6 ge3 maa4 maa4). This is not unique to Cantonese, as Teochew (example) and Hakka (example) seem to allow this, for example. The degree of integration of these words from Standard Mandarin differs in each variety, but it's hard to ascertain. Third, most big dictionaries of Chinese do not really separate non-modern Chinese from modern Chinese. It is difficult to split just based on time, like with Old English and Middle English, since classical Chinese was common up to the Qing dynasty (and limitedly even until now), though works with different degrees of vernacular wording show up since the Tang dynasty.
Currently, we would give pronunciations for words in a variety only when we have proof of some usage in that variety. Ideally, we should have more labels and usage notes that make it clear which words are more reserved for particular varieties and/or registers. — justin(r)leung (t...) | c=› } 16:21, 21 March 2022 (UTC)[reply]
Are Ladin and Istriot manageable? Are these types of sentences considered to be Cantonese, or let's relax the constraints a bit, to be commonly used as Cantonese expressions in e.g. Cantonese corpora, in Cantonese audio or visual materials? Why would Hong Kong presenters and politicians rather use 哋 dei6, 嘅 ge3, 畀 bei2, 同 tung4, 話 waa6, 先係 sin1 hai6 than use 們 "mun4", 的 "dik1", 給 "kap1", 和 "wo4", 纔是 "coi4 si6" in television or radio broadcasting, which is commonly regarded as one of the formal registers? If the problem of word frequency cannot be solved, it will never be clear to the learner how useful the words "的吧這也" "dik1 baa6 ze5 jaa5" are in reality. As for the ancient literary language and the vernacular language, they were already in diglossia since the Tang and Song dynasties, and the people then did not think that vernacular was literary or vice versa; also modern research does not confuse literary language with Tang and Song vernacular, which is easy to divide according to dynasties as languages from a historical view, e.g. in 漢語白話史 (History of Chinese Vernacular), 古白話詞彙研究論稿 (A Study of Ancient Vernacular Vocabulary), 《朱子語類》文獻語言研究 (A Study of the Language in A Collection of Conversations of Master Zhu), etc. So I don't know how difficult it would be to diachronically break down the vernacular varieties. 汩汩银泉 (talk) 18:21, 21 March 2022 (UTC)[reply]
我們的祖母 would not be colloquial Cantonese, but it would certainly be possible to find these words in Cantonese songs or works written by people who only speak Cantonese (and write Chinese). News reporters would not use 們, 的, etc. because news reports are in a formal vernacular. Writing in Hong Kong, for example, can range from purely 我手寫我口 "down-to-earth" colloquial vernacular to replacing functional words with the vernacular (something like the news) to adding colloquial words into what's otherwise "Standard Chinese" to purely Standard Chinese (which could sound like a formal Mandarin with a lot of 文言 elements) (and probably anything in between). Where do we draw the line? I would not be comfortable with calling HK authors' work Mandarin even if it walks like Mandarin (but maybe doesn't talk like it), because the authors themselves would not be reading them in Mandarin. Do we now need a separate header called "Hong Kong Written Chinese"? — justin(r)leung (t...) | c=› } 18:42, 21 March 2022 (UTC)[reply]
Why not? We need varieties like Hong Kong Written Chinese, Tang vernacular, Song vernacular, Yuan vernacular, Ming vernacular, Qing vernacular etc., as long as their academic definitions are clear enough. 汩汩银泉 (talk) 19:41, 21 March 2022 (UTC)[reply]
@汩汩银泉: We can't just include all the labels that academia has put out there. That's not practical. It's not like we have Elizabethan English/Early Modern English separate from modern English, for example, or Singlish separate from other varieties of English. These changes would require a lot of thought and consensus. The current consensus would be hard to change unless there is a satisfactory alternative proposed. — justin(r)leung (t...) | c=› } 19:45, 21 March 2022 (UTC)[reply]
The current assortment of Sinitic languages is difficult for people to read and contribute unless there is a satisfactory alternative proposed. 汩汩银泉 (talk) 11:40, 22 March 2022 (UTC)[reply]
These are my own personal qualms with the Chinese L2 header, mainly focusing around Old & Middle Chinese and their descendants. (Taken directly from #Chinese in Discord) The descendants section at () is a prime example, some of these are Old Chinese borrowings while others are Middle Chinese borrowings while some are listed as just "Chinese" borrowings, and it feels unclear as a reader could think that they're Modern Chinese borrowings/descendants (minus the obvious ones like "Proto-Turkic"). AG202 (talk) 16:38, 21 March 2022 (UTC)[reply]
This reminds me of some doublets in Vietnamese that share Chinese origin, e.g. (in Vietnamese: vạn, muôn), while as long as one of it is Sino-Vietnamese, it is still distinguishable. 汩汩银泉 (talk) 18:35, 21 March 2022 (UTC)[reply]
Thank you for your understanding; it is much harder to know the frequency of words in a language as a non-native learner. Yes, there is a list of words, but it is still difficult for me to focus on finding words in a particular language. It would be much better if each language in this list were hidden individually by default and I could click on "Min Nan" individually, for example. 汩汩银泉 (talk) 16:45, 21 March 2022 (UTC)[reply]

@汩汩银泉: We have this thing called the dialectal synonyms module, so you can find a table of the equivalents in different dialects of Chinese. It is represented in the form of {{zh-dial}}. So in that way, people can look up the standard Mandarin term here, then use the dialectal synonyms table to find the equivalent in the relevant dialect. Look at 開玩笑 for example. When you open the dialectal table, you can see that the equivalent in Cantonese is 講笑. Speaking of which, why don't you contribute by adding the equivalent terms in your native dialect of Cantonese? We can always add your city/district to the dialectal tables if you want to make that contribution. The dog2 (talk) 19:19, 21 March 2022 (UTC)[reply]

I am neither a Hongkongese nor a Shanghainese. Do Hongkongese or Shanghainese use for "(of eating) having the appearance that one really enjoys the food"? Maybe (not in words.hk however)? "Sound" (of sleep)? Really? "Popular"? What does "香" for "popular" mean? I don't even really understand this meaning without an example sentence.
I am first a reader and face inconvenience and problems finding information. I already said above that if there is any way to jump to a specific branch of Chinese (e.g. say Hokkien) with other varieties hidden, with registers (e.g. vulgar, colloquial, formal, literary), word frequencies (e.g. rare), historical hints (e.g. archaic) and example sentences only relevant to that language branch, then the experience will be much better. Until then, any excessive contributions will only exacerbate the inconvenience. 汩汩银泉 (talk) 11:26, 22 March 2022 (UTC)[reply]
@汩汩银泉: We have people who have academic documents regarding the use of these terms. Ask Justinrleung if you want to know the source. Anyway, the different dialects are already grouped by family, so if there is data, you can tell the difference between Guangzhou Cantonese and Dongguan Cantonese for instance. Since you're a native Cantonese speaker who speaks a lesser-known dialect, we can add your native dialect of Cantonese as a data point if you want to contribute. The dog2 (talk) 22:56, 22 March 2022 (UTC)[reply]
Then I would expect to have these tags in at least a dozen languages under one meaning. Also, my native language is already in the list. Anyone who wants to edit can get a relatively complete picture of it by simply looking up more academic works. I would be curious to see how it would be recorded in the Wiktionary, like more subscripts with subscripts? Anyway, I still can't learn Min Nan through Wiktionary. 汩汩银泉 (talk) 11:52, 23 March 2022 (UTC)[reply]
  • Reading through this thread, and having read the older one as well, my impression is that the primary complaints boil down to two issues:
    1. An apparent usability issue, where someone interested in just one variety of Chinese has trouble finding just that information within the entry
    2. An apparent completeness issue, where Chinese entries are simply incomplete in terms of nuances, examples, usage notes, etc.
#1 might have a technical solution. Splitting Chinese varieties out into full entries is one such possible approach.
Personally, I think that would be a mistake -- it would require a ton of data duplication, and the lines for where to divide are far from clear, as described some above.
It does appear that dialectal data is already grouped within an entry. It's possible that different formatting might make entries easier to use.
#2 is a problem for all entries, quite frankly. I could make the same complaint of many of the German entries I've seen here, for instance schimmeln has no etymology listed, verbrauchen has an etymology section but very little usable detail (no explanation of the constituent etyma, no date of first appearance, no explanation of sense development, etc. etc.).
Filling in this kind of detail is precisely what this entire Wiktionary project is about. Information is missing simply because we aren't done yet. (Arguably, we never will be, considering that this is a volunteer project, and considering the way language changes -- but that's a separate issue altogether.) 汩汩银泉's question above about 香 (xiāng) and senses shared or not shared between Hongkongese and Shanghainese is valid, and I would argue that this is best addressed by adding the relevant information to the existing entry at 香#Chinese, rather than by splitting out Hongkongese and Shanghainese into separate entries. Simply put, if information is missing, let's add it.
That's my perspective, at any rate. ‑‑ Eiríkr Útlendi │Tala við mig 22:02, 3 April 2022 (UTC)[reply]

It is not only about the usability, or putting them under one entry. Some entries assume that Chinese equals standard Chinese (or places it above others), and other varieties become subsumed under that despite being a completely separate language. E.g. compare 吃#Chinese and 食#Chinese: for the sense of "to eat", 食 is used in Cantonese, Hakka, and Min and 吃 is used in others. In 吃, there is no mention in which variety of Chinese is 吃 used, nor the usage of 食, and therefore the average reader may assume that "to eat" is 吃 in all Chinese languages, unless they check the dialect box. On the other hand 食 has all entries tagged with dialect, and it even states that 食 is only used in Cantonese and Hakka, while 吃 is used in Mandarin. Another example would be , , and , where various senses for walking and running are mixed together, alongside various other meanings. The problem is not only about lumping them under Chinese, it is also about the presumption that all Chinese languages are Chinese, but some Chinese (i.e. "standard" Chinese or Putonghua) are more Chinese than others. I would suggest that either we separate Chinese into each of the major branches, or if we still want to keep them together, then all Chinese entries should be labelled with {{lb}} regardless of which variety used, unless the difference is minimal such as in 朋友#Chinese. --Wpi31 (talk) 05:48, 8 April 2022 (UTC)[reply]

A few extra problems arising from the current method:
  • The links in categories do not point to the correct location, for example in Category:Hong Kong Cantonese the link is bio#Cantonese but the second-level heading listed in bio is bio#Chinese. This problem becomes less annoying when Chinese is listed first in the CJK entries, but it nevertheless complicates things.
  • The sections for synonyms, derived terms, usage notes, etc. always list only the standard Mandarin pronunciation or usage, despite the fact that other languages have different pronunciations or usages, e.g. in 人類 or 食物. This becomes nonsensical when the synonyms are more commonly used in non-Mandarin languages, e.g. under 切爾西#Synonyms which points to 車路士. Indeed ISO-7098 is a romanisation of Mandarin Chinese (see [4]), but not all Chinese languages. Again, all Chinese are equal, but Mandarin is more equal than other Chinese.
  • The pronunciation box is still quite long even if everything is collapsed, since it tries to list the pronunciations in every major Chinese language/dialect. When there are multiple readings listed, some of the Chinese languages would have the same reading of the same character split under different headings, e.g. , making it impossible to maintain. It is never clear whether to split or lump the same reading together, for instance, 會#Pronunciation 1 and 會#Pronunciation 2 have the same readings for Mandarin, so perhaps we should merge them? Meanwhile some of the words in 會#Pronunciation 1 are pronounced wui6-2 in Cantonese and the rest wui6-2, with a consistent predictable pattern. Should we split them into two pronunciations or keep it status quo?

--Wpi31 (talk) 06:57, 8 April 2022 (UTC)[reply]

Internationalisms

Somewhat inspired by the above conversation on Sprachbunds, and having thought about this for months, would there be any value in having an internationalisms etymology template for categories? I believe Hungarian has one. One potential danger would be that editors apply it to pages without doing a further etymology. Vininn126 (talk) 14:11, 19 March 2022 (UTC)[reply]

The last discussion went nowhere. — SURJECTION / T / C / L / 14:55, 19 March 2022 (UTC)[reply]
That's right, I remember that. I'd like to push for it again, it would be very useful for a lot of different pages like chemical names and other scientific names. I really can't imagine this would be the kind of thing we'd need to put to a vote, unless we'd have to for things such as categorizations. What's to stop us from just making it now, other than community agreement? To me the lukewarm response is not meant to be taken as we shouldn't. Vininn126 (talk) 15:01, 19 March 2022 (UTC)[reply]
For almost all taxonomic names this would not be germane, IMHO. There is almost always a coiner (often identifiable) or a prescribed morphological derivation. Something similar applies to modern chemical terms AFAICT formed from standard chemical morphemes per IUPAC.
For a time Merriam-Webster used "ISV" (International Scientific Vocabulary) in lieu of a more specific etymology for these kinds of terms. They give an explanation of their use of it from a print edition of MW3, citing impracticality of ascertaining the language of origin of every such term. "Accordingly, whenever a term that is entered in this dictionary [MW3] belongs recognizably to this class of internationally current terms, and no positive evidence is at hand to show that it was coined in English, the etymology recognizes its international status and the possibility that it originated elsewhere than in English by use of the label ISV [] . In some instances a statement as to probable language of origin is added after a semicolon." DCDuring (talk) 15:51, 19 March 2022 (UTC)[reply]
I'm a bit confused by your use of germane, as we have it, it means "not pertinent". Do you mean not an ideal solution? If so, then it would be a good idea to not put them on those entries. Also, I believe this would mostly be a secondary etymology, going alongside the morphological/historical one, used mainly for categorization. The last part sounds mostly the same as using an internationalism template. Vininn126 (talk) 15:57, 19 March 2022 (UTC)[reply]
Yes, it is not pertinent to taxonomic names and it may not be pertinent to chemical names for similar reasons. As those were the two classes of names that you had suggested this would be useful for, I wonder what other classes of terms would benefit from this proposal. DCDuring (talk) 16:08, 23 March 2022 (UTC)[reply]
@Surjection I think we could maybe move ahead then. Vininn126 (talk) 16:11, 23 March 2022 (UTC)[reply]
Surjection has suggested having some restriction, such as "use this template only if the direct source language is not clear, and always link to a term (often the English term) with complete etymology". Would this help alleviate your concerns? Vininn126 (talk) 14:17, 23 March 2022 (UTC)[reply]
How would that restriction advice apply to cases where a specific epithet is from a not-well-specified language of a local people? DCDuring (talk) 16:10, 23 March 2022 (UTC)[reply]
What would an example of that be? Vininn126 (talk) 16:18, 23 March 2022 (UTC)[reply]
This is a good idea. Maybe we could also add a link to Category:Internationalisms by language to Wiktionary:Todo#Regular tasks encouraging people to add cognates and the like to the entries. brittletheories (talk) 11:17, 21 March 2022 (UTC)[reply]

Adding a parameter for "normalizations"

Hello all!

Words and sentences in cuneiform languages (Akkadian, Sumerian, Hittite) or runic languages (Old Norse, Proto-Norse, etc.) are usually transliterated (sign-to-sign correspondence) and then "normalised" (i.e. given a reconstructed "standard" spelling). Normalisations are neither transliterations (|tr=) nor transcriptions (|ts=). At the moment we don't have a parameter for normalizations on Wiktionary, which has forced editors in the above languages to find various ways to deal with it, often having to use |ts=, which is not ideal, since the opening and closing "/" make no sense at all in a normalization. (This is especially clear with proper nouns, which are regularly capitalised in normalizations, see 𒆠𒂗𒄀.)

See for example 1) 𒋼𒀀𒀝𒆤 for Sumerian and 2) 𒅴𒂠 (Šumerum) for Akkadian.

  • Current Sumerian:
𒋼𒀀𒀝𒆤 • (kar-kid₃kid /karkid/)

𒋼𒀀𒆤𒈾𒀭𒉚𒉚𒀭
  kar-kid na-an-sa₁₀-sa₁₀-an
  /karkid nansasan/
  Do not buy a prostitute!
  • Current Akkadian:
𒅴𒂠 𒀀𒄩𒍪𒌝 ― EME.GI₇ a-ḫa-zu-um /Šumeram aḫāzum/ ― to learn Sumerian

𒇷𒊭𒀭 𒋗𒈨𒊑𒅎 𒋫𒄠𒅆𒅋 𒀝𒅗𒁲𒅎 𒂊𒁺𒌝
  li-ša-an šu-me-ri-im ta-am-ši-il ak-ka-di-im e-du-um
  /lišān Šumerim tamšīl Akkadîm edûm/
  to know the Akkadian counterpart of the Sumerian

Ideally, normalizations should not appear between /slashes/, but on their own. I'm not too bothered about exactly what layout is best, that's something that we can discuss separately, but the above examples should appear as something like the following:

  • Desired Sumerian:
𒋼𒀀𒀝𒆤 • (kar-kid₃kid) karkid

𒋼𒀀𒆤𒈾𒀭𒉚𒉚𒀭
kar-kid na-an-sa₁₀-sa₁₀-an
  karkid nansasan
  Do not buy a prostitute!
  • Desired Akkadian:
𒅴𒂠 𒀀𒄩𒍪𒌝 (EME.GI₇ a-ḫa-zu-um) Šumeram aḫāzum ― to learn Sumerian

𒇷𒊭𒀭 𒋗𒈨𒊑𒅎 𒋫𒄠𒅆𒅋 𒀝𒅗𒁲𒅎 𒂊𒁺𒌝
li-ša-an šu-me-ri-im ta-am-ši-il ak-ka-di-im e-du-um
  lišān Šumerim tamšīl Akkadîm edûm
  to know the Akkadian counterpart of the Sumerian

(I think @Mårtensås should be able to give some more examples of runic languages?)

Therefore, I'd like to propose adding a normalization parameter (|nr=?) to the existing |tr= and |ts=. What do y'all think? Sartma (talk) 16:45, 20 March 2022 (UTC)[reply]

For those of us who are stupid, such as myself, maybe you can explain more what the problem is. Transliteration is "replace letter/character in the script of language A with letter/character/digraph/etc. in the script of language B" and transcription is "encode signed or spoken language with written language (a.k.a. transcribe it)". What is normalization? It looks a lot like transliteration to me: you have some written language using one script and you want to use a different script to give others a way to vocalize the original script. Maybe you can give an example where an entry would be transliterated as [x] but normalized as [y]? —Justin (koavf)TCM 17:11, 20 March 2022 (UTC)[reply]
@Koavf: True, I guess most people are not familiar with the concept. It's something that only comes up for ancient languages that have been deciphered/reconstructed through linguistic investigation. The difference between transcription and normalization is the additional linguistic work performed on the word.
For example:
  • Original script: 𒅴𒂠 𒀀𒄩𒍪𒌝
  • Transliteration (sign-to-sign): EME.GI₇ a-ḫa-zu-um
  • Romanization (linguistic reconstruction): Šumeram aḫāzum
The original script doesn't indicate that 𒀀𒄩𒍪𒌝 (a-ḫa-zu-um) has a long /ā/, for example, or that the /u/ is short. That long /ā/ has been linguistically reconstructed, and so has the short /u/. The original script also doesn't tell you whether the /z/ is /z/ or /ṣ/, or whether the /ḫ/ or /z/ are geminated or not.
That same transliteration /a-ḫa-zu-um/ could be transcribed as: aḫazum, aḫaṣum, aḫḫazum, aḫḫaṣum, aḫazzum, aḫaṣṣum, āḫazum, āḫāzum, āḫāzūm, aḫazûm, āḫazûm, aḫāzûm... and so on for any possible combination.
In a romanization you have any sort of linguistic reconstructions that are not directly connected with the original script, its transliteration or its possible transcription.
I guess the main point is that romanizations are a thing in the study of reconstructed/deciphered ancient languages, being the main way those languages are written in dictionaries and textbooks. The /slashes/ are not needed, the same way one wouldn't use slashes to differentiate two different ways a word can be written in languages that have two writing systems (like, say, Serbian).
In the case of Akkadian, the romanization of a word is actually the pagename of a lemma, so it makes no sense to put it between /slashes/. See for example šamû. The quotation uses a custom template that got rid of the /slashes/ and the layout reflects what people studying Akkadian expect it to be. Usexes should have the same layout, too. Sartma (talk) 18:49, 20 March 2022 (UTC)[reply]
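To make the ambiguity described above concrete, here is a minimal Python sketch. It joins a sign-by-sign transliteration into a bare skeleton and then lists the spellings that skeleton leaves open. The function names, and the rule that every vowel may be short, long or contracted and every intervocalic consonant single or geminated, are illustrative assumptions based only on the examples in this thread, not an established Assyriological procedure. Logograms such as EME.GI₇ are not handled.

```python
import unicodedata
from itertools import product

VOWELS = "aeiu"
LONG = {"a": "ā", "e": "ē", "i": "ī", "u": "ū"}
CONTRACTED = {"a": "â", "e": "ê", "i": "î", "u": "û"}
# Assumption taken from the thread: the sign read "zu" may stand for z or ṣ.
AMBIGUOUS = {"z": ["z", "ṣ"]}


def join_signs(translit):
    """Join hyphenated signs; a CV sign followed by a VC sign with the
    same vowel is written as a single vowel (zu-um -> zum)."""
    signs = unicodedata.normalize("NFC", translit).split("-")
    word = signs[0]
    for sign in signs[1:]:
        if word[-1] in VOWELS and sign[0] == word[-1]:
            sign = sign[1:]
        word += sign
    return word


def candidate_readings(translit):
    """List every spelling compatible with the transliteration under the
    illustrative assumptions stated above."""
    skeleton = join_signs(translit)
    slots = []
    for i, ch in enumerate(skeleton):
        if ch in VOWELS:
            # Vowel length is not written, so all three values stay open.
            slots.append([ch, LONG[ch], CONTRACTED[ch]])
            continue
        options = AMBIGUOUS.get(ch, [ch])
        intervocalic = (
            0 < i < len(skeleton) - 1
            and skeleton[i - 1] in VOWELS
            and skeleton[i + 1] in VOWELS
        )
        if intervocalic:
            # Gemination is not written either.
            options = options + [c * 2 for c in options]
        slots.append(options)
    return ["".join(parts) for parts in product(*slots)]


readings = candidate_readings("a-ḫa-zu-um")
print(len(readings), "possible spellings, for example:", readings[:5])
```

Only linguistic reconstruction can pick aḫāzum out of that list, which is why the result is neither a transliteration nor a mechanical transcription.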
Thanks for walking me thru that. Your proposal seems sensible to me. —Justin (koavf)TCM 19:29, 20 March 2022 (UTC)[reply]
Maybe |ts= could still be used, but with its behaviour modified when the language is Sumerian/Akkadian.--Ser be être 是talk/stalk 23:26, 20 March 2022 (UTC)[reply]
@Ser be etre shi: I guess that's an option too. I'm happy with what's easier, as long as the /slashes/ disappear. :D
I'd like to hear what @Mårtensås thinks about this, too. I remember chatting about the need for normalizations in runic languages at some point in the past. Akkadian/Sumerian can do without |ts=, would runic languages too? Sartma (talk) 10:44, 21 March 2022 (UTC)[reply]
Definitely. I can't think of a possible use for it.
I will also give an example of a word and its normalization: ᛋᛡᛏᛖ sAte. First, the rune ᛡ, transliterated as A, here stands for a regular a sound. Secondly, the t is geminate (which we see from comparison with the later dialects, Old Norse, Old Swedish, Old Danish and Old Gutnish), and third the e at the end is long (as seen by its outcome as Old Norse -i). Thus we get transliterated sAte, normalized sattē. Many such cases! ᛙᛆᚱᛐᛁᚿᛌᛆᛌProto-NorsingAsk me anything 11:47, 21 March 2022 (UTC)[reply]
@Sartma: I am here long enough to remember what the original motivation to introduce |ts= has been, not yet half a decade ago: It actually was for Akkadian—when the pagename never was the Latin-written word, and there were more Akkadian terms mentioned in etymologies than Akkadian pages. Of course there was neither an essentialist motivation of distinguishing transliteration and transcription: The parameter |ts= I rather imagine encompassed by German Lesung or Lesart; in most languages |tr= contains anything that is most usually given alone. I am ready to acknowledge that the implementation of the option for making a language-specific additional distinction was not thought through the best: Slashes may be ugly and obscure to an important majority of readers and what people who study a certain language are not used to, and if one only gives |ts= it adds a “term request” and not “script needed”. Still it makes sense and from manaman's walls of texts I do not see what a better suggestion is. Fay Freak (talk) 18:12, 21 March 2022 (UTC)[reply]
@Fay Freak: Oh, ok... I see... Do we know if other languages are using |ts= at the moment, or is it just Sumerian (since Akkadian is not using it anymore)? Sartma (talk) 16:52, 22 March 2022 (UTC)[reply]
@Sartma: Sometimes guessing vocalizations for Ugaritic and Phoenician-Punic. But most often and important for Pahlavi, all kinds of Middle Iranian languages with their terrible scripts. It might be telling of course, for your preceding speech, that you have not had them in mind at all. Fay Freak (talk) 17:02, 22 March 2022 (UTC)[reply]
@Fay Freak: So we'll have to change behaviour just for Sumerian and Runic. Got it. Thanks. Sartma (talk) 17:07, 22 March 2022 (UTC)[reply]
@Sartma: "just for Sumerian..." but also Akkadian, Hittite, etc. (i.e. the other cuneiform-using languages)? Or only Sumerian? —Justin (koavf)TCM 17:10, 22 March 2022 (UTC)[reply]
@Koavf: I guess Hittite is included too (but I don't edit Hittite, so I wouldn't know what's best there). Akkadian will need a new formatting only for usex's and quotes, since |ts= is not used in headword lines or linking templates. Sumerian will need new formatting of |ts= in headword lines, linking templates and usex/quotes. Sartma (talk) 20:55, 22 March 2022 (UTC)[reply]
FWIW, Ashokan Prakrit is using |ts= in header lines, to deal with the fact that consonant gemination is not written. It could have been used for Pali using the Lao repertoire of the Lao script, but for the fact that the slashes too strongly imply pronunciation. Pali using the Lao repertoire doesn't show some phonetic distinctions that aren't made in the Lao pronunciation. --RichardW57m (talk) 16:27, 20 April 2022 (UTC)[reply]

By the way, there is more to search than G_____e

When <word> is not found, the Search results page says

See whether another page links to <word>. Or, try searching the site using Google.

I'd rather Wiktionary was vendor-neutral and acknowledged the diversity of search engines. WP:NPOV, filter bubble and all that. --Noliscient (talk) 16:47, 20 March 2022 (UTC)[reply]

Could not agree more. If you posted a ticket to phab:, I would second it. —Justin (koavf)TCM 17:02, 20 March 2022 (UTC)[reply]
This is handled locally by MediaWiki:Searchmenu-new, no need for dev intervention. Anyway, if you look just below the search bar on the results page and have JS enabled, you should see links for Yahoo and Bing on the search page as well. 70.172.194.25 17:24, 20 March 2022 (UTC)[reply]
Ah, how silly. I could unilaterally change this. I'd like to get more feedback before I do it, so I'll leave this open for a bit to get comments. —Justin (koavf)TCM 17:54, 20 March 2022 (UTC)[reply]
We could use more choices than the three. It would be nice if the default were randomly selected from the entire list for anons and new registrants with positive user selection taking over the session and for the life of the registrant name. DCDuring (talk) 18:21, 20 March 2022 (UTC)[reply]
I support this de-Googlization. To me, it's like a Coca-Cola ad slapped on the side of the Kaaba. I'm sure that this was added to Wiktionary during the Bush administration when the corporation had a different motto. This is a great decision I support wholeheartedly, and if there's a vote on this, ping me and I will vote in support of immediate removal of the corporation from that page. --Geographyinitiative (talk) 18:29, 20 March 2022 (UTC) (modified)[reply]
I also always found the message odd as Google is not the fittest search engine for a dictionary. It has long ignored quotation marks while DuckDuckGo adheres to them, although we specifically try to find rare forms instead of doing popular searches. So yes, Google has not earned its market-share being cemented this way. Fay Freak (talk) 03:00, 21 March 2022 (UTC)[reply]
I'd support just replacing it with "your preferred search engine" or something rather than random selection as I feel like that'd annoy some users. I'm personally a bit iffy with certain ones mentioned when it comes to scripts/terms in some languages as they can sometimes be less accessible, which is part of why I just suggest using the aforementioned phrasing. AG202 (talk) 03:09, 23 March 2022 (UTC)[reply]
I think it's useful to provide a direct link to a search through a search engine. I agree that privileging Google is less than ideal in many respects, but so as to continue to have a direct link, I would suggest something along the lines of:
Graham11 (talk) 04:34, 24 March 2022 (UTC)[reply]
Just saw this implemented with the mention of DuckDuckGo. What a great move this is for this website. This is a nonprofit, volunteer dictionary project, and we don't need to advertise for Google anymore. This was maybe appropriate in the 2000s, but certainly not appropriate in the 2020s. --Geographyinitiative (talk) 09:49, 13 April 2022 (UTC)[reply]

Proto-Northern Iroquoian and Old Mohawk

Can someone add Proto-Northern Iroquoian as the ancestor of all the Iroquoian languages except Cherokee (as per Julian 2010), and also add Old Mohawk as an etymology-only language? /mof.va.nes/ (talk) 22:52, 20 March 2022 (UTC)[reply]

I'm wondering if it would make sense to make PNI an etym-only code and label for Proto-Iroquoian, since we're talking about excluding just one language. This would potentially reduce the amount of reconstructions, but I don't know how historically accurate that would be, and whether there are many reliable PI reconstructions altogether. Thadh (talk) 23:33, 20 March 2022 (UTC)[reply]
@Thadh Are there any other instances of this being done? I'm not against it. I'm just wondering how I would format it. /mof.va.nes/ (talk) 03:12, 22 March 2022 (UTC)[reply]
What does literature on the topic do: do scholars regularly reconstruct *words in PNI and PI differently, in some detail, the way they regularly reconstruct and flesh out Proto-Algic and Proto-Algonquian separately (even though Proto-Algonquian is "excluding just two languages"), or is this more like Austronesian, where we could have a great many redundant identical or near-identical reconstructions for every level the family tree branches on, but we have elected to only have a few levels? - -sche (discuss) 18:42, 25 March 2022 (UTC)[reply]
Forgot to mention I have gone ahead and created these (after a discussion on my talkpage), although if someone wants to contest whether this is a good idea, they have my blessing, it's not too difficult to merge back. Thadh (talk) 19:07, 25 March 2022 (UTC)[reply]

Archaic spellings of Ukrainian words

I think it's rather unfair that Russian words are the ones with archaic alternative spellings, whereas I cannot find any in the Ukrainian-language entries.

However, as much as I would like to research and find those forms, I don't know if they might be an acceptable addition to Wiktionary or not. Especially because some of the forms that I found might resemble archaic Russian spelling, so it might be controversial.

PulauKakatua19 (talk) 10:15, 21 March 2022 (UTC)[reply]

I'm not familiar with the distinction between archaic East Slavic languages myself, but if you can locate those archaic forms in context that is obviously Ukrainian, there should be no problem in creating an alt form even when a Russian one already exists. brittletheories (talk) 11:24, 21 March 2022 (UTC)[reply]
As a parallel, almost all of the terms in Category:Meänkieli lemmas have more fleshed out duplicate Finnish entries, but they're still permitted on Wiktionary. brittletheories (talk) 11:26, 21 March 2022 (UTC)[reply]
We should not shy away from including true and verifiable information on words due to vague political wariness. Nobody denies that these languages are related anyway, and there are already many words spelled the same in current orthography. If you can find old alternative spellings, I don't see any reason not to add them (whether or not the spelling is the same as for the Russian cognate). Just be sure that they are actually being used in a Ukrainian text and not Russian or Old East Slavic, etc. 70.172.194.25 17:01, 21 March 2022 (UTC)[reply]
I've just split off Old Ruthenian; If a word is verifiable to have been in use in or after Kotliarevsky's Eneïd, it's Ukrainian, but otherwise it's probably Old Ruthenian and should be handled as such. Thadh (talk) 18:52, 21 March 2022 (UTC)[reply]
I have asked a subreddit for the Ukrainian language, and several people have collected some sources:
https://uk.m.wikipedia.org/wiki/%D0%9F%D0%B0%D0%BC%27%D1%8F%D1%82%D0%BA%D0%B8_%D1%83%D0%BA%D1%80%D0%B0%D1%97%D0%BD%D1%81%D1%8C%D0%BA%D0%BE%D1%97_%D0%BC%D0%BE%D0%B2%D0%B8
https://uk.m.wikipedia.org/wiki/%D0%A3%D0%BA%D1%80%D0%B0%D1%97%D0%BD%D1%81%D1%8C%D0%BA%D0%B0_%D0%BB%D0%B0%D1%82%D0%B8%D0%BD%D0%BA%D0%B0 (Latin alphabet forms when Latin was used to write Ukrainian as an official script)
http://litopys.org.ua/berlex/be.htm (a dictionary dating back to 1627 - is it Ukrainian, or Old Ruthenian?) PulauKakatua19 (talk) 02:32, 22 March 2022 (UTC)[reply]
Dictionary from 1627 is definitely Old Ruthenian. We classify Ukrainian as starting from 1792. Note that up to the 15th century we classify as Old East Slavic. Thadh (talk) 07:38, 23 March 2022 (UTC)[reply]

Edit request on אברם

Please copy the content of [5] (sans sandbox header). Thanks. 70.172.194.25 02:12, 22 March 2022 (UTC)[reply]

  Done. —Justin (koavf)TCM 03:31, 22 March 2022 (UTC)[reply]

Join the Community Resilience and Sustainability Conversation Hour with Maggie Dennis

You can find this message translated into additional languages on Meta-wiki.

The Community Resilience and Sustainability team at the Wikimedia Foundation is hosting a conversation hour led by its Vice President Maggie Dennis.

Topics within scope for this call include Movement Strategy, Board Governance, Trust and Safety, the Universal Code of Conduct, Community Development, and Human Rights. Come with your questions and feedback, and let's talk! You can also send us your questions in advance.

The meeting will be on 24 March 2022 at 15:00 UTC (check your local time).

You can read details on Meta-wiki. --Mervat (WMF) (talk) 19:12, 22 March 2022 (UTC)[reply]

Request for template editor permissions

Hello, I would like to request template editor permissions. I believe that I have a decent amount of editing history here and on English Wikipedia and this could make some contributions easier. Although I don't have the greatest amount of experience with editing templates specifically, I am comfortable with programming in general. Moreover I intend to use the permission to correct smaller errors as I see them, such as incorrect characters at Module:zh-glyph/phonetic (eg, 啴). For this purpose @Huhu9001 suggested on his talk page that I consider asking for the permissions to edit the data. Thank you, ChromeGames (talk) 19:12, 25 March 2022 (UTC)[reply]

Hypothetical question about paid editing

The following is just an idea; it is not something I have done, nor is it something I am planning to do.

Would it be considered against Wikimedia's terms of service to pay someone to create/edit entries? Let's say paying someone to clean up formatting in a way that is not quite suitable for automation, or to look for and add relevant quotations, or to research and add etymologies, etc. And in this hypothetical, the person being paid would agree to release their work under a suitable license.

As far as I know, the main reason paid editing is frowned upon is because people do it to promote themselves or their friends. I don't think that paying someone to edit random words in a dictionary is what the policy has in mind when it condemns the practice. 70.172.194.25 00:24, 26 March 2022 (UTC)[reply]

Wiktionarian opinion, from the archives: Wiktionary:Beer parlour/2015/October#Boring cleanup work for money, Wiktionary:Beer parlour/2012/February#Being paid to write Wiktionary entries Fish bowl (talk) 00:32, 26 March 2022 (UTC)[reply]
This guy was getting paid for cleaning up terms Notusbutthem (talk) 00:39, 26 March 2022 (UTC)[reply]
My opinion: as long as the work is work we would approve of if performed by an unpaid volunteer, I see no objection to an editor being paid for the work. The conflict-of-interest risk for a dictionary, although not entirely absent, is obviously much less than for an encyclopedia. A paid editor would do well, though, to avoid even the appearance of a potential conflict of interest. (If paid by the Kolaloka company, steer away from coke and whiskey.)  --Lambiam 16:09, 27 March 2022 (UTC)[reply]
Mostly agree with Lambiam. You (IP) say "policy [...] condemns the practice", but it seems like policies on most(?) wikis allow it, they just require disclosure. (AFAICT disclosure is also required here, by the global policy, which requires it on each wiki unless it explicitly opts out in a vote/RFC.) If someone were paid to promote something, whether a particular point of view (e.g. skewing entries on Kashmiri towns to promote the claims of one side) or a particular book (e.g. being paid to add quotations of a new Bible translation to entries to increase its prominence, the way North Face paid someone to add images with their branded clothes to lots of Wikipedia articles) that'd be an issue, but paying for valid cleanup or to add accurate pronunciations or (non-promotional) images or citations seems OK. - -sche (discuss) 05:28, 28 March 2022 (UTC)[reply]

Which was entry number 7,000,000?

We should update WT:MILE with the seven-millionth entry, if anyone can work it out. Currently I'm seeing the number of entries (shown in Recent Changes) as 7,000,616, having just created unvariedness. Equinox 16:55, 26 March 2022 (UTC)[reply]

I tried working it out from the page creation log, ignoring non-mainspace pages and accounting for removed entries and I got א־ט־ט. I'm not too sure though, so people should feel free to double-check. — SURJECTION / T / C / L / 18:20, 26 March 2022 (UTC)[reply]
A scripted recheck gives me 𒃮𒋗𒃻. An alternative is 𒋗𒋳 if we count redlinks the other way. Which way would be more "correct"? — SURJECTION / T / C / L / 19:00, 26 March 2022 (UTC)[reply]
Entries still extant at the time of reaching a milestone should count. If deleted earlier, not.  --Lambiam 15:50, 27 March 2022 (UTC)[reply]
This isn't scientific, but I noticed that the second-to-last created entry in Recent Changes when the counter was displaying 7,000,001 was Sprachkenntnissen. This remained the case for a few minutes, too. Unsure how out-of-sync the two are, though. Theknightwho (talk) 15:02, 28 March 2022 (UTC)[reply]
Excuse me, are we talking about pages of main content (excluding categories, thesauri, annexes, etc.) or entries, considering there is often more than one entry per page? For example, page could be considered to have at least 10 entries (one for each language), or more if we even consider etymology subdivisions and noun/verb subdivisions - that's what is considered as an "entry" in a printed dictionary. I am not sure MediaWiki could count this quantity of traditional dictionary entries. Anyway, cheers for this milestone!   Noé 07:42, 31 March 2022 (UTC)[reply]
@Noé: Yes, this is for pages only, there are many more entries than pages. There's WT:STATS which shows a 'real' entry count of 7.678.336 (as of 2021-12-01). – Jberkel 09:43, 31 March 2022 (UTC)[reply]
@Jberkel Congrats on making the 7 millionth entry btw! :) I've been a bit sick for the past few days so I completely forgot about how the page count was approaching 7 million... Acolyte of Ice (talk) 10:16, 31 March 2022 (UTC)[reply]
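
The counting rule proposed in this thread (an entry counts only if it is still extant when the milestone is reached; entries deleted earlier do not) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the script that was actually run above: the event format, the function name `milestone_entry`, and the premise that the log has already been filtered to mainspace pages are all hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical log format: (ISO timestamp, action, page title),
# where action is either "create" or "delete".
Event = Tuple[str, str, str]


def milestone_entry(events: Iterable[Event], milestone: int) -> Optional[str]:
    """Return the title whose creation first brings the number of
    still-existing pages up to `milestone`, or None if never reached."""
    live = set()
    # ISO timestamps sort chronologically as plain strings.
    for _, action, title in sorted(events):
        if action == "create" and title not in live:
            live.add(title)
            if len(live) == milestone:
                return title
        elif action == "delete":
            # A page deleted before the milestone no longer counts.
            live.discard(title)
    return None
```

With a naive count of creations, the third page created would be the "third entry"; once an earlier deletion is taken into account, the milestone shifts to a later page, which is the kind of discrepancy described above.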

Ownership/control of the Wiktionary Discord server

Only throwing out a thought here, and not trying to set up a vote. The Wiktionary Discord server is quite a success. I hang out there myself quite a bit. Right now I see 5 other admins online (the Discord admins and Wiktionary admins are the same set) and 31 general users, plus 157 offline users who might return or not.

If I remember rightly, this was originally set up by User:PseudoSkull. I'm not sure what the Discord rules are, regarding "inheritance" or the bus factor: if we were very unlucky and 'Skull died, or chose to leave the Internet forever, what would happen? I guess nobody would be able to configure the place fully?

I find myself comparing it to the old IRC servers/channels, which (AFAIK) have been controlled by some kind of WMF entity, rather than a single random user who happened to set it up first. Should we try to shift the ownership to WMF so that it's future-proof as we die and move on? Or do we hate WMF a lot? On the one hand I don't like this "one person" thing that seems like it might leave us stranded (real story: last year, the maintainer of a video game longplay site died, and nobody knows any passwords or anything, so the entire thing is untouchable, and we've had to set up a new one -- if only we had shared details among the community); on the other hand, most of Wiktionary doesn't like WMF much, and neither do I, and I can imagine them taking over and implementing a lot of dry code-of-conduct stuff and ripping the fun out of it.

(Incidentally, one little technical difference is that IRC servers really are separate servers, i.e. different computers on the Internet, potentially owned and controlled by different users, under the auspices of a greater network; whereas on Discord, "server" is just a convenient term for a virtual partition of their general network — a Discord server isn't really a tangible object at all.)

Equinox 05:32, 28 March 2022 (UTC)[reply]

The Polish wiktionary discord is on the same server as the rest of Polish WMF, and it seems to be fine, that said English WMF is a different beast. But I was thinking the same - it might not be the worst idea to "formalize" it a tiny bit. It would also be useful if say some new user becomes admin and needs their rights changed. Skull does a good job, but what if someday they can't? Vininn126 (talk) 10:32, 28 March 2022 (UTC)[reply]
@Equinox: I agree. The Discord server is precious. We should do something to make it future-proof. Sartma (talk) 11:30, 8 April 2022 (UTC)[reply]
There will come a time when people wish they had commented on this ignored discussion. [6] Equinox 11:17, 8 April 2022 (UTC)[reply]
Why not just give all of those Wiktionarians with checkuser or other similar status ex-officio "power" over Discord? If those with such rank declined the honor/responsibility one could reach into the population of admins. DCDuring (talk) 16:25, 8 April 2022 (UTC)[reply]
That's how it works mate. However, stuff we do on Wiktionary can be seen in the history and logs, but stuff we do on Discord can't be seen at all. Apparently nobody cares so fuck it, I warned you. Equinox 18:06, 8 April 2022 (UTC)[reply]
Are there important passwords that are not known by multiple people? DCDuring (talk) 19:49, 8 April 2022 (UTC)[reply]
I don't know much about Discord, but some cursory internet research indicates that admins can take over ownership in the case that owners disappear, so maybe it isn't a big problem. - TheDaveRoss 18:52, 8 April 2022 (UTC)[reply]

Leadership Development Working Group: Reminder to apply by 10 April 2022

You can find this message translated into additional languages on Meta-wiki.

Hello everyone,

The Community Development team at the Wikimedia Foundation is supporting the creation of a global, community-driven Leadership Development Working Group. The purpose of the working group is to advise leadership development work. Feedback was collected in February 2022 and a summary of the feedback is on Meta-wiki. The application period to join the Working Group is now open and is closing soon on April 10, 2022. Please review the information about the working group, share with community members who might be interested, and apply if you are interested.

Thank you,

From the Community Development team
--Mervat (WMF) (talk) 12:42, 28 March 2022 (UTC)[reply]

Please explain the use of the edit summary as a medium for controversy

I refer anyone interested to virtue signalling, where:

Firstly, @Jberkel apparently assumes that the editing summary is the place to settle controversies. Jberkel also seems to think that the way to settle such a controversy is by Wikiwarrior tactics such as repeated reversion to one's preferred version. I do not ask for disciplinary action, but if someone in authority could explain such matters to Jberkel, it would probably be best all round.

Secondly, concerning the content, the point at issue is that I added the wp template to the article, leading it to display the box that says "English Wikipedia has an article on: virtue signalling". Jberkel apparently takes offence at this because if the user only takes the trouble to page down to the bottom of the item, he will see "Further reading virtue signalling on Wikipedia". A little inconspicuous, but perfectly legible.

Now, the point of Wiktionary is to help users, not make them jump through hoops to find what they want to know. The further reading section is fine, if not strictly necessary in this item, but it is easily overlooked (I overlooked it, for one, so, even allowing for my inferior intellect, the point is not academic) because it does not display the conspicuous box. I suppose I could have deleted the further reading section, as Jberkel demands "it should not be linked twice. If you absolutely need the box, remove the other link", but this is small-minded nonsense: although we don't want to clutter items with reams of links, the link in the Further reading section is inconspicuous, does not harm the smooth reading of the page, and it just might help some user sometime who is musing over the term, having bypassed the box.

So why am I making a federal case of such a tiny detail? To begin with, if there is one thing we can do without it is having some nit-picking OCD authoritarians committing themselves to "improving" Wiktionary by unnecessary removal of items that actually are there deliberately, just because it gives them a thrill to discipline the lower orders. I assume that Jberkel is thinking of the WP wikiwar about "sea-of-blue" links. Allow me to point out that even in WP it is acceptable to repeat links in separate sections, and in this case the two (different kinds of) links are in separate sections anyway.

I'll give this a day, then, if no one in authority does anything to settle the two matters, I'll re-enter the link, including the further reading entry if removed. JonRichfield (talk) 14:52, 28 March 2022 (UTC)[reply]

I agree that the talk page is the best place for this, but at least when it comes to a supposed war of attrition on reverting, the last edit summary is "If you absolutely need the box, remove the other link", so what is the problem? If you just did that, then there would be nothing to post about here and there would have been no reason for you to have posted insults in this thread. —Justin (koavf)TCM 15:24, 28 March 2022 (UTC)[reply]
@Koavf The reason is firstly that the exchanges already exceed the number of reversions that reflect acceptable practice, secondly, that the assertion that, as a matter of principle, two links cannot be accepted is inappropriate as well as counter-productive, thirdly that the edit summary is not the place for that sort of exchange. Furthermore, @Jberkel's invitation to remove the link is no resolution, but largely the basis for disagreement, amounting to "if you don't like it, do it my way or I'll proceed with the wikiwarrior approach", which is precisely the issue. If those are insulting, then how would you have handled it? Not only for this case, but in case of future counterconstructive link-hunting? JonRichfield (talk) 18:21, 29 March 2022 (UTC)[reply]
If you don't understand how calling him an "OCD authoritarian" is inappropriate, I'm not sure how to help you. —Justin (koavf)TCM 19:21, 29 March 2022 (UTC)[reply]
@Koavf @Jberkel The description was directed at the behaviour, not the person. You could help by explaining to him that it is no part of his duty or right to assert that an item should not be linked twice, especially if the links are not in the same section of the entry; that removing links where a user might reasonably look for them is a disservice. For a start, I quote: There is deliberately no hard-and-fast rule about what is considered to hinder or harm our progress. Clear examples of such behaviour include: Deliberately harming our content by deleting useful things or adding useless content or pages. Links in different contexts are undebatably useful for people who look in different places, otherwise it would be unnecessary for books to have indexes. It is not for us to insist that the user must read through the whole entry just in case s/he has missed or misinterpreted something. The wp template is in practically every context separate from the rest of the entry, and should only be forbidden when it interferes with, or misleads, users. You also could explain that the edit descriptions are not the place for controversy, but that there is a talk page for settling disagreements, and that, though Wiktionary is not as obsessive about it as WP, the right response to a continuing disagreement is not to persistently revert each other's edits, but to take it to the talk page first, then to the community portal. I'll give this a day or two, then correct the deletion if it has not yet been done. JonRichfield (talk) 08:04, 4 April 2022 (UTC)[reply]
I don't agree with having two separately entered (and separately maintained) links to WP: the side box and the in-line link. If we have evidence that users are failing to spot WP links (and I doubt that we do), we should keep one link only but improve its visibility somehow. If we must have two different-looking links to WP per entry, then make it consistent: introduce a template or something that generates both of them, instead of forcing two separate things to be maintained in each entry. Equinox 08:07, 4 April 2022 (UTC)[reply]
@Equinox There is some confusion of concepts here. The point is not that we should duplicate links, but that there is not and should not be any blanket objection to duplicate links just because they are duplicates. Where links are given in different parts of an article or for different purposes, there is every reason to permit the duplication. (I mentioned the example of an index in a book, even though the reader could do without the index entry by simply reading through the book every time an item was wanted.) For example in Wikt. I have just been editing line, which has so many meanings as to justify a separate link for the unit of length, which has a separate article in WP. The suggestion of having to maintain separate links in separate contexts hardly arises; more often than not, if there is a justification for separate links in the first place, (as in a complex article) then if maintenance is likely to be an issue, then either only one need be changed because only one context is affected, or, if it leads to a redir at the other end, none. And in any case if there is any issue, the proper place to argue an individual case is in the talk page or the Community portal, as we are doing now, not repeated reversion and argument in the editing descriptions. JonRichfield (talk) 19:16, 4 April 2022 (UTC)[reply]
You say "[there] should not be any blanket objection to duplicate links just because they are duplicates". I personally think we should object to that. So, we disagree on that basis. I'm sorry about any edit-warring but I haven't been involved there and don't really care. Equinox 05:47, 8 April 2022 (UTC)[reply]
@Equinox To justify elimination of all links that occur in more than one context in the same article whether they serve separate purposes or not, you need to show that they do not reasonably serve any users' needs. Even in quite a short article they certainly can; in longer articles e.g. line they very likely do. Editors always can remove functionless material without having to justify their action by reference to explicit regulations, but when material is functional, the last thing we want is for anyone to sabotage the quality of the service by appeal to ill-conceived regulations. The rational rule for regulations is: "if we can do without this one most of the time, leave it out". JonRichfield (talk) 06:47, 8 April 2022 (UTC)[reply]
I agree that there should only be one link, preferably the floating box on the right, as it is far more conspicuous. It is difficult to imagine who could miss that but still find the 'further reading' link at the bottom of the entry.
"Wikiwarrior", "OCD", "authoritarian", "disciplinary action", "small-minded", etc. is unnecessarily toxic. Nicodene (talk) 09:27, 4 April 2022 (UTC)[reply]
I agree with @Equinox about WP links. One is more than enough (having two in two different formats looks sloppy to me), but if we must have two, then it should have a template for it so we can keep consistency throughout. Sartma (talk) 13:12, 4 April 2022 (UTC)[reply]
@Sartma I don't follow how a template would deal with this; could you please elaborate? JonRichfield (talk) 19:16, 4 April 2022 (UTC)[reply]
@JonRichfield: I have no idea either, I can barely create simple templates myself. If it's not possible with a template or something else, then we shouldn't have two links tout court. I really don't see the point of two identical links with a different degree of visibility. Your explanation above of how you missed-it-ergo-it's-needed is not very logical, so not quite good enough a reason for me. Sartma (talk) 22:08, 4 April 2022 (UTC)[reply]
@Sartma There are at least two very logical reasons, and I invite you to think again: the fact that I could miss a thing suggests that other people could as well, and Wiktionary is not intended for the people who don't need help so much as for people who do need help, whether because they don't understand the system well (not being able to write templates etc), or because they are inclined to overlook unobvious links. Secondly, as I pointed out, one can have links for various purposes, and then commonly in different places (I mentioned the example of entries in a book's index, if you remember, which might easily duplicate items in footnotes in text, or in notes pages, or in the bibliography). Forbidding such differences in function and possibly format, you suggest as logical? I invite you to explain such logic cogently. JonRichfield (talk) 13:13, 5 April 2022 (UTC)[reply]
@JonRichfield: There is no logical connection between you missing something and the need for that something to be there. It's your preference, and that's ok, but it's not correlated. I think it's ok to miss stuff once. That's how you learn where things are. I'm sure you missed it that one time and won't miss it again. For me that's not enough a reason to put two links. Anyway, I'm not against having two links per se, I just find it sloppy: unless there is a template or something, it'll become very inconsistent very soon (possibly more than it is now).
As for your example about indexes on books, I didn't quite get how it's relevant. The index in a book is there to help you navigate the book giving you the page of where a chapter is. It's not there to give you the same information twice. Nor are notes given everywhere, at the end of the page, at the end of a chapter and at the end of the book. It's usually only one of those places.
If the issue is visibility, make the link more visible: that would be the logical approach. Keeping a visible one and a less visible one makes no sense to me. Sartma (talk) 14:37, 5 April 2022 (UTC)[reply]
@Sartma The logical connection between missing something you would have wanted to find if you had known about it and the need for that something to be hard to miss in a resource notionally designed for presenting what you needed, is embarrassingly simple: it means that the resource is inadequate. It is doubly inadequate if the sucker who missed it is long experienced in looking things up. A preference for effectiveness in a service one troubles to present, and which one represents as a superior service, is exactly the preference appropriate to Wiktionarians. If that is your preference, then skimping on the links is definitely not OK, and contrary to what you said, it is strongly correlated.
You think it's OK to miss stuff once? Sure it is, but that is not the issue. The point is that if you miss stuff once and you fail in your duty to prevent other folks from missing it in turn, then you are not acting as an asset in a service where the objective is to help people who would like to find things they might need.

You say that's how you learn where things are? When you don’t even know you missed something hidden? As a user in a dictionary, when I don’t know what I have missed? And you are talking about logic? You are sure that having missed something that one time the victim won't miss it again? How is he to guess? Only if someone who knows the user missed it will chip in and tell them, otherwise how is he to know it is there to look for? By your logic, nobody needs any links anywhere.

If you claim that you are not against having two links per se, you had some folks fooled. Your idea of sloppiness is mistaken in a facility like Wiktionary: even one link is sloppy where it is not calculated to help anyone; if every one of ten links rather than nine is appropriately placed to help a typical (possibly new) user, then omitting the extra link is sloppy design.

You say that: “unless there is a template or something, it'll become very inconsistent very soon (possibly more than it is now)”? I don’t think that you checked the logic of that assertion. What sort of template (or something) would help? What kind of inconsistency do you see creeping in, especially if the separate links are not there with the same purpose? Especially in long entries with many meanings and usages, such as set, line, bleed, pit, point, lug and hundreds or thousands more, such requirements are routine.

I do quite understand that you didn't quite get how the indexes in books are relevant. The index in a book is there to help you navigate the book giving you the page of where a term is used. The thing that tells you where the chapter is is called the table of contents (TOC). The TOC and index, to the extent that they overlap, most decidedly are there to give you the same information twice. Each is needed in the appropriate context, and if you ever do any serious reference work in which such “duplication” has been omitted, you will learn just how painful it can be, and how hard it is to design an effective index.
Your ideas concerning “notes” leave me wondering what books you work with. There is no single standard, especially in modern textbooks, and in a well-designed book, the links might appear in various forms for various purposes, and the more complex the subject matter, the likelier the links are to appear multiple times. And even if it usually were “only one of those places” as you put it, that would be no argument against a tool like Wiktionary using multiple links wherever appropriate to the user’s convenience and assistance.

You say: “If the issue is visibility, make the link more visible: that would be the logical approach”. If it were the issue, it could be the logical approach; the issue is however, to put the link where it is helpful, and as conspicuous as is appropriate – no more – blue links are adequate in running text. (A certain class of critic goes ape even about the blue.) Something more is appropriate when the link is out of line. Logic, see? JonRichfield (talk) 16:22, 5 April 2022 (UTC)[reply]
@JonRichfield: You're blowing this way out of proportion. We're talking about a "shall-I-google-it-for-you" service, and you talk about it as if it was the main purpose of Wiktionary.
  • "By your logic, nobody needs any links anywhere": no, by my logic one link is enough.
  • "index ≠ TOC": My mistake. Both TOC and index are called indice in Italian, I actually didn't realise it wasn't the same in English until today. Anyway, that doesn't really change the point of my argument. The index doesn't duplicate the content, it just tells you all the places where they talk about a certain word in the book. I don't know any book that has a duplicated TOC or index. There's usually just one TOC and one index.
  • "it's not an issue of visibility, it's a question of making the link as conspicuous as appropriate": conspicuous = noticeable. It is pretty much only a matter of visibility you're complaining about. It's not visible enough, it's not enough in your face, it's not noticeable enough. Otherwise why would you be arguing that you want it in multiple places? "helpful" in this case coincides with "visible", but while "visible" is objectively measurable, "helpful" is just subjective.
Anyway, I understand you're very passionate about this thing. Unfortunately, I don't really care enough to waste any more of my time on this, so be it what it needs to be. I'm off to eat pizza. Sartma (talk) 17:37, 5 April 2022 (UTC)[reply]
@Sartma Getting a pizza is the nearest you have come to justifying your favourite word. You have not once countered a single thing I have said, nor justified the issues you tried to invent. For example: I don't know any book that has a duplicated TOC or index. There's usually just one TOC and one index. I bet you don't even realise why you should be blushing at having said anything of the type. JonRichfield (talk) 18:21, 5 April 2022 (UTC)[reply]
@JonRichfield: Sure. Sartma (talk) 18:50, 5 April 2022 (UTC)[reply]

Jon, drop this topic: nothing good is coming of it. —Justin (koavf)TCM 19:54, 5 April 2022 (UTC)[reply]

Although I'm not on Jon's side here, apparently, I don't appreciate your unilateral nag either, Koavf. Perhaps you are missing WP's Wikipedia:Drop the stick and back slowly away from the horse carcass? We don't usually operate on twee platitudes around here. Hopefully...? Equinox 05:52, 8 April 2022 (UTC)[reply]
What would you like to come of this post? —Justin (koavf)TCM 06:14, 8 April 2022 (UTC)[reply]
Thanks for your concern @Equinox, but the reason I did not reply to @Koavf on that point, was that I agreed with him. JonRichfield (talk) 06:47, 8 April 2022 (UTC)[reply]

"Fringe" quotes

I checked recent changes and saw a series of edits removing quotations from "fringe" sources. One example is this edit. The quote itself doesn't seem fringe (although the title may be considered political, as it involves China and Uyghurs). Does anyone have thoughts on this? 70.172.194.25 02:29, 29 March 2022 (UTC)[reply]

Problematic. --Geographyinitiative (talk) 03:48, 29 March 2022 (UTC)[reply]
Bitter Winter is a publication of CESNUR. This organization is aimed at "[studying] new religious movements and [opposing] the anti-cult movement." In other words, they advocate for the religious freedom of groups many deem cults, including Aum Shinrikyo and the Church of Scientology. They have also been involved in documenting the persecution of religious minorities by the Chinese government. I'm not familiar enough with this organization to make a judgment call. But I can see why their history might raise alarms (broken clocks are right sometimes...). WordyAndNerdy (talk) 04:48, 29 March 2022 (UTC)[reply]
The editor who added this does great work on lots of languages, but they're not very good at real-life judgments like credibility and fringe-ness. This has been a problem for a long, long time, but they're not intentionally pushing a POV- it's just that they read this stuff offwiki, so they use quotes from it.
That said, using quotes from fringe sources for non-fringe usage IMO gives them undue weight and gives the impression that we're trying to push a POV. I think deciding to remove the quotes was a good call, and no one should shy away from doing so in the future. Chuck Entz (talk) 13:59, 29 March 2022 (UTC)[reply]
It’s not even fringe sources—I refuse to acknowledge that this is by itself sufficient reason, or we should or could care much about that, given the varying quality and low standards of mainstream sources as well, or think what mainstream means if said about a regime like Russia whose language we document—, but that editor from them picks out the most outlandish generalizations designed as if to be at war with large swaths of society, or say the most untrue and memetic parts. You can of course quote the science fiction of L. Ron Hubbard. Fay Freak (talk) 14:15, 29 March 2022 (UTC)[reply]
I am tagging @Apisite who added the cite in question [7] (that I restored) and seems to be obliquely referenced above. I believe in maintaining "relatively 'free' speech" (aka not really free speech, but trying to hope to pretend we could actually do that and trying to tolerate things to the max) within the boundary of the Wiktionary dictionary project so as to capture the most organic range of linguistic expression possible, which includes the fringes of society. I have actually quoted L. Ron Hubbard's science fiction. If you want to talk about 'giving an impression we're trying to push POV', take a gander at the 960 results in a Wiktionary search for Mao Zedong- by some standards, the content you will see is FRINGE. --Geographyinitiative (talk) 14:40, 29 March 2022 (UTC)[reply]
Hey @Chuck Entz, despite what I have said here, I recognize the need to avoid fringe content because of the need to maintain the legitimacy of the Wiktionary dictionary project as a whole. I probably have a higher tolerance than most for fringe though. If you are interested, would you give me a judgment on whether [8], [9], [10] and [11] are too fringe for mainspace cites? Would they be okay for the Citations page, for the purpose of demonstrating the 'range' of the word? (Madame Blavatsky is a relatively historically important kook for instance.) Thanks for any wisdom you can provide me with. --Geographyinitiative (talk) 17:57, 29 March 2022 (UTC) (modified)[reply]
@Fish bowl After thinking about this issue for a few days, I have concluded that if you really think something is a fringe quotation or cite, just move it to the Citations page like I did here: [12]. Fringe material is part of the English language and we need to not be blind to it as a descriptive dictionary; however the danger of appearing to endorse fringe views (as Chuck mentioned) is definitely important. I have made too many comments on this topic so I plan to cease participation in this thread. --Geographyinitiative (talk) 12:53, 31 March 2022 (UTC)[reply]

Adding Łacinka spelling to Belarusian words

The Belarusian language has a special feature that sets it apart from its other Eastern Slavic sister languages. The Belarusian language has an official Latin-based alphabet (interestingly enough, Belarusian has an Arabic-based alphabet as well) that was used for some centuries before 1918. This alphabet is called "Łacinka", and although today Łacinka has largely fallen into disuse in favor of the Cyrillic alphabet, Łacinka is still a part of Belarusian and I want to hear everyone's opinion about maybe letting Łacinka be added to Belarusian words on Wiktionary. Similar to how Serbo-Croatian supports the usage of both the Latin and Cyrillic alphabet. Or do you guys think it's better to just include Łacinka under "alternative forms"? Kyning (talk) 17:39, 29 March 2022 (UTC)[reply]

As someone who is ignorant of this language, we should include these spellings, for sure and update Module:be:Dialects. —Justin (koavf)TCM 17:32, 29 March 2022 (UTC)[reply]
Is this something that's regularly used? That's the big question. I don't have a lot of experience with Belarusian, but from what I've seen, it's only been Cyrillic and no Łacinka. I could be wrong though. The important thing is attestability. (I'm aware of Łacinka's existence, but I've never seen it used). If it's attestable, then I see no issue with it. Someone may want to make a "Łacinka form of" template if that's the case, and we may want to do something like what Serbo-Croatian does and modify the headwords to include a place for the Łacinka variant. Vininn126 (talk) 21:31, 29 March 2022 (UTC)[reply]
How much does it differ from the transliterations we already give? 70.172.194.25 21:38, 29 March 2022 (UTC)[reply]
From what I know, quite a bit. Just look at the title, Ł is used, and I believe our transliterations don't. It's a very different orthography. Think more along the lines of Serbo-Croatian. Vininn126 (talk) 21:40, 29 March 2022 (UTC)[reply]
@Atitarev I think you may have some thoughts on this. Thadh (talk) 22:14, 29 March 2022 (UTC)[reply]
The attestability will be a big problem and also finding correct spellings. The original poster Kyning added an incorrect spelling in diff, even if the case with the word Беларусь (Bjelarusʹ) is straightforward - the łacinka spelling is "Biełaruś". I don't feel like going and correcting the spellings. We don't have native speakers or experts in Belarusian here.
Also, there were multiple versions of łacinka at different periods, e.g. "ж" was "ż" (as in Polish) and "ž", "ш" was "sz" and "š", "в" was "w" and "v". The conversion to łacinka is rather straightforward but there are multiple considerations, since modern (government) spelling differs significantly from Taraškievica (capitalised (English) łacinka spelling of тарашке́віца (taraškjévica)) and even the choice of words and grammar forms can be different. So, modern standard Belarusian стаго́ддзе (stahóddzje, century) is actually стаго́дзьдзе (stahódzʹdzje) in taraškievica and łacinka would be "stahodździe", not "stahoddzie", modern сі́мвал (símval, symbol), taraškievica сы́мбаль (sýmbalʹ) and łacinka is "symbal", not "simvał" or "simwał". --Anatoli T. (обсудить/вклад) 23:01, 29 March 2022 (UTC)[reply]
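
To make the point about competing conventions concrete, here is a small Python sketch that applies only the three letter pairs listed in the comment above, in either the Polish-style or the háček-style variant. It is deliberately not a Łacinka converter: the table names and the function `romanize_known_letters` are invented for illustration, every other letter is left untouched, and none of the palatalisation or Taraškievica-versus-modern-spelling issues mentioned above are handled.

```python
# Two historical conventions for three Cyrillic letters, as listed above.
# Both table names are illustrative, not an established standard.
POLISH_STYLE = {"ж": "ż", "ш": "sz", "в": "w"}
HACEK_STYLE = {"ж": "ž", "ш": "š", "в": "v"}


def romanize_known_letters(text: str, table: dict) -> str:
    """Replace only the letters present in `table`; leave all else as-is."""
    return "".join(table.get(ch, ch) for ch in text)


# The same three letters come out differently under each convention.
print(romanize_known_letters("жшв", POLISH_STYLE))
print(romanize_known_letters("жшв", HACEK_STYLE))
```

Even this toy table shows why a single Cyrillic spelling can correspond to several attested Latin spellings, which is one reason attestation matters more than mechanical conversion.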
I think this should be used only to document known historical usage- properly labeled as such- and not as an alternative way of spelling modern terms. We should even avoid it for terms where one might expect them to have been written in the script based on their history, but where we lack direct evidence of how they were spelled. It should be like runic spellings of Germanic languages: for instance, it probably wouldn't be that hard to figure out how to spell just about any Old English term in runes, but we have only half a dozen Old English runic entries, and those are all based on actual inscriptions which are quoted in the entries. While I'm sure the corpus of Łacinka texts isn't even remotely that small, the principle still holds: first find the usage, then create the entries- not the other way around. Chuck Entz (talk) 04:54, 30 March 2022 (UTC)[reply]
This site: rodnyja vobrazy (родныя вобразы) shows Belarusian poems in Cyrillic Taraškievica (traditional/classical Belarusian spelling) and Łacinka. You can switch by clicking on кірыліца (Cyrillic) or łacinka.
E.g. Cyrillic: Канчаецца дваццатае стагодзьдзе...
The same text in Latin Kančajecca dvaccataje stahodździe... --Anatoli T. (обсудить/вклад) 06:44, 30 March 2022 (UTC)[reply]
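The correspondence shown in the examples above can be illustrated with a rough Python sketch. It is only an illustration under stated assumptions: the function name `to_lacinka` and the mapping tables are hypothetical, it follows just one of the łacinka conventions mentioned earlier (ž, š, v rather than ż, sz, w), and it covers only the basic rules (hard vs. soft л, softening by ь, iotated vowels). It says nothing about attestation, and it only gives the traditional forms when fed Taraškievica spellings.

```python
ACUTE = "\u0301"  # combining stress mark used in dictionary headwords

# Hypothetical mapping tables for one łacinka convention (ž/š/č, v).
CONSONANTS = {
    "б": "b", "в": "v", "г": "h", "ґ": "g", "д": "d", "ж": "ž", "з": "z",
    "й": "j", "к": "k", "л": "ł", "м": "m", "н": "n", "п": "p", "р": "r",
    "с": "s", "т": "t", "ў": "ŭ", "ф": "f", "х": "ch", "ц": "c", "ч": "č",
    "ш": "š",
}
PLAIN_VOWELS = {"а": "a", "о": "o", "у": "u", "ы": "y", "э": "e", "і": "i"}
IOTATED = {"е": "e", "ё": "o", "ю": "u", "я": "a"}
SOFTENED = {"з": "ź", "с": "ś", "ц": "ć", "н": "ń", "л": "l"}  # consonant + ь
DROPPED = {"ь", "'", "’", "ʼ"}  # soft sign and apostrophes leave no letter
# Consonants after which an iotated vowel is written with "i" (й and ў excluded).
HARD_BEFORE_IOTATED = set(CONSONANTS) - {"й", "ў", "л"}


def to_lacinka(text):
    """Convert Belarusian Cyrillic to łacinka (simplified sketch)."""
    chars = text.replace(ACUTE, "")
    out = []
    for i, ch in enumerate(chars):
        low = ch.lower()
        prev = chars[i - 1].lower() if i > 0 else ""
        nxt = chars[i + 1].lower() if i + 1 < len(chars) else ""
        if low in CONSONANTS:
            if nxt == "ь" and low in SOFTENED:
                piece = SOFTENED[low]      # зь -> ź, сь -> ś, ль -> l
            elif low == "л" and (nxt in IOTATED or nxt == "і"):
                piece = "l"                # soft л before е, ё, ю, я, і
            else:
                piece = CONSONANTS[low]
        elif low in IOTATED:
            if prev == "л":
                piece = IOTATED[low]       # ле -> le
            elif prev in HARD_BEFORE_IOTATED:
                piece = "i" + IOTATED[low]  # бе -> bie
            else:
                piece = "j" + IOTATED[low]  # word-initial, after vowel, ь, ў
        elif low in PLAIN_VOWELS:
            piece = PLAIN_VOWELS[low]
        elif low in DROPPED:
            piece = ""
        else:
            piece = ch                     # punctuation, spaces, other scripts
        if ch != low:                      # keep capitalisation of the input
            piece = piece[:1].upper() + piece[1:]
        out.append(piece)
    return "".join(out)


print(to_lacinka("Канчаецца дваццатае стагодзьдзе..."))
print(to_lacinka("Беларусь"), to_lacinka("сы́мбаль"))
```

Fed the modern official spellings стаго́ддзе and сі́мвал, the same sketch returns "stahoddzie" and "simvał", i.e. exactly the forms described above as not being the łacinka spellings, which is why the choice of input orthography matters.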
So it is most easily comprehended as Taraškievica Latinized? Then I suggest, also with Chuck, as most practical and least noisy, to present the Latin spellings but in the headers and inflection tables of the Taraškievica alternative spellings entries. Fay Freak (talk) 07:51, 30 March 2022 (UTC)[reply]
Why not, e.g., define Biełaruś as “Łacinka spelling of Беларусь”?  --Lambiam 11:14, 30 March 2022 (UTC)[reply]
Attestability is a huge aspect of this. Vininn126 (talk) 11:16, 30 March 2022 (UTC)[reply]
That is why I used an attestable form. I’m not suggesting we should give definitions for unattested spellings.  --Lambiam 23:40, 31 March 2022 (UTC)[reply]
Well, I wrongly presumed that there always is a different Taraškievica spelling. For consistency then, if the Taraškievica spelling is the same, the Latin spelling should be in {{alter}}. About the entry layout of Łacinka spellings I have not said anything: if created then of course they use {{spelling of}}. Fay Freak (talk) 11:29, 30 March 2022 (UTC)[reply]
Most traditional Belarusian terms, which have a Taraškievica, should have correspondence in the Latin script. I am sure there could be some corner cases, e.g.
Belarusian is known to consistently use akanye on native words, unstressed о is used only when it is a secondary stress (which can be optional in some cases, BTW) or in some loanwords but this inconsistency occurs mostly in Taraškievica. E.g. філо́лаг (filólah, philologist) is the official spelling. Akanye (and yakanye) is applied to loanwords as well (unstressed "o"->"a"). In Taraškievica (note the difference in stress as well), it can be філёлё́г (filjoljóh) or філялё́г (filjaljóh), філёлё́ґ (filjoljóg) or філялё́ґ (filjaljóg). (Letter ґ (to render [ɡ]) was only used for a short period in Belarusian.) I mean extra care would be required. E.g. "gvałt" can possibly be verified (see гвалт (hvalt) and ґвалт (gvalt)) but "hvałt" is more common. Cases with inconsistent akanye/yakanye in spellings and use of letter "ґ" vs "г" ("g" vs "h") would require special attention. I would be suspicious if I saw łacinka spellings with "g", since "ґ" is also very rare and temporary (unlike Ukrainian where it's official and used more frequently). Also, just a note - łacinka in Polish and Belarusian лаці́нка (lacínka) (also лаці́ніца (lacínica)) simply means Latin/Roman script, even if it may also mean a specific standard. I don't know if we need to label the łacinka spellings as "łacinka" or simply "Roman" or "Latin", like we do with Serbo-Croatian. --Anatoli T. (обсудить/вклад) 00:03, 1 April 2022 (UTC)[reply]

Blocked on EN WP

I'm blocked indefinitely for vandalism and sockpuppetry on the English Wikipedia. May I edit Wiktionary as long as I follow the rules? --99.197.202.188 21:39, 29 March 2022 (UTC)[reply]

As long as you don't repeat your past mistakes on Wikipedia, follow the rules and make good edits, I don't see why not. Make sure to double-check that what you're doing is okay and in line with the dictionary. Thadh (talk) 22:19, 29 March 2022 (UTC)[reply]

Adding European Portuguese plurals to Brazilian Portuguese forms and vice-versa

It's often the case that words ending in "n" are nasalized in Brazilian Portuguese (BP), but not in European Portuguese (EP). Examples are hífen, which is read IPA(key): /ˈi.fẽj̃/ in BP but IPA(key): /ˈi.fɛn/ in EP, or éon, read IPA(key): /ˈɛ.õ/ in BP and IPA(key): /ˈɛ.ɔn/ in EP. This leads to cases where the EP plural is formed by adding "-es" while BP only adds "-s" (e.g.: hífenes/hifens; éones/éons). My question is whether we should add plurals for words or spellings that are exclusive to one variant. For instance, the BP íon (the EP equivalent is ião) would be read in EP as IPA(key): /ˈi.ɔn/, thus inducing the plural íones as an alternative to the BP plural íons. Would it make sense to add the plural íones or not? - Sarilho1 (talk) 09:43, 30 March 2022 (UTC)[reply]

If there are at least three attested uses, such a form may be created (and possibly tagged as rare, if appropriate). 70.172.194.25 09:47, 30 March 2022 (UTC)[reply]
Attestation doesn't seem to be a common policy for regular inflections, though. What's in question is if this is, or not, a regular inflection. - Sarilho1 (talk) 12:59, 30 March 2022 (UTC)[reply]
Offhand, I'd say "the plural this dialect would use for a word that it does not however actually use because the word is only found in the other dialect" doesn't sound like a regular inflection. (It sounds like the kind of irregular inflection we'd require cites for, like if someone said the plural of clue is cluen just because shoe can be shoon.) We could test, though; if we look at as many "Brazilian-only" words as we can, and for most of them a "European-style" plural actually can be attested (and vice versa, Brazilian plurals of European words), that'd be evidence it's actually regular. But in that case it'd be interesting to look at the context of the citations and whether it calls either of the assumptions "x is a BP-only word" or "y is an EP-only plural" (and vice versa) into question. - -sche (discuss) 06:39, 31 March 2022 (UTC)[reply]

2021 ISO code changes

In 2021 the ISO made several code changes we could follow (or not). I think a few of these like Toki Pona have already been brought up, but I didn't spot any comprehensive discussion. The ISO/SIL created the following codes:

  • egm Benamanga (a Bantoid language Wikipedia and Glottolog have nothing on)
  • gov Goo (a Mandean language; Wikipedia has a paper from 2013) (added)
  • nww Ndwewe (a Bantoid language Wikipedia has no info on beyond the existence of the code)
  • phj Pahari (a Himalayish language; Wikipedia has it as one variety of Newari, "unrelated to other 'Pahari' languages of the region")
  • rsk Ruthenian (is this the same thing we currently have an exceptional code for and call "Old Ruthenian"?)
  • tok Toki Pona
  • ugh Kubachi (a Nakh-Dagestani language Wikipedia says is also called Kubachin and "is often considered a divergent dialect of Dargwa. Ethnologue lists it under the dialects of Dargwa but recognizes that it may be a separate language.")
    not added for now, until there are editors who need it, see Vahag's comment
  • xdq Kaitag (a Nakh-Dagestani language; Wikipedia says is "divided into three dialects: Lower Kaitag, Upper Kaitag, Shari")
    not added for now, until there are editors who need it, see Vahag's comment
  • xhm Middle Khmer (1400 to 1850 CE)
  • zcd Las Delicias Zapotec
  • ajs Algerian Jewish Sign Language
  • dsz Mardin Sign Language
  • lsc Albarradas Sign Language
  • lsw Seychelles Sign Language
  • psc Iranian Sign Language
  • rib Bribri Sign Language
  • rnb Brunca Sign Language
  • rsn Rwandan Sign Language

And split the following languages:

  • cug Chungmboko split into cnq Chung and bpc Mbuk
  • lno Lango (South Sudan) split into lgo Lango (South Sudan), imt Imotong, lqr Logir, and oie Okolie
  • uun Kulon-Pazeh split into uon Kulon and pzh Pazeh
    uon and pzh have been added, see WT:ES; once all uses of uun are updated it can be removed
  • wya Wyandot split into wyn Wyandot and wdt Wendat

And merged or retired these:

  • merged ajt Judeo-Tunisian Arabic into aeb Tunisian Arabic
    we already retired ajt; WT:LT implies we merged it into jrb, but we don't have that code either, so apparently we need to update WT:LT (I assume these were all merged into Arabic?)
  • merged lak Laka (Nigeria) into ksp Kaba (there's apparently much uncertainty about this lect, I see we changed its family code a while back)
  • merged smd Sama into kmb Kimbundu
  • merged snb Sebuyau into iba Iban
  • retired pii Pini (an Australian language)
  • retired wrd Warduji (an Indo-Iranian language)
    we apparently already merged this into sgy Sanglechi

They also rejected dej for Indo-Iranian language Delvari, rejected wgf for Wangerooger Frisian, and rejected wtb for Bantoid language Matambwe. And they:

  • renamed brb from Lave to Brao (this rename might or might not affect other languages here / allow them to drop disambiguators or change from using other spellings like Lawe)
  • renamed env from Enwan (Edu State) to Enwan (Edo State)
  • renamed pzn from Para Naga to Jejara Naga
  • renamed trv from Taroko to Sediq
  • renamed xkk from Kaco' to Kachok
  • renamed xmx from Maden to Salawati

and changed the status of ssf Thao from extinct to living. To simplify seeing what's been done, let's strike through any changes above once we've adopted or rejected them. - -sche (discuss) 04:08, 31 March 2022 (UTC)[reply]
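For anyone tracking which old codes still need updating, the splits, merges and retirements listed above could be held in a small lookup, sketched here in Python. The names `SPLITS`, `MERGES`, `RETIRED` and `successors` are hypothetical and not part of any existing module; the data is copied only from the lists in this post and reflects the ISO changes, not what Wiktionary has decided to do with them.

```python
# ISO 639-3 changes from 2021, as listed in this thread (hypothetical layout).
SPLITS = {
    "cug": ["cnq", "bpc"],                # Chungmboko -> Chung, Mbuk
    "lno": ["lgo", "imt", "lqr", "oie"],  # Lango (South Sudan) split four ways
    "uun": ["uon", "pzh"],                # Kulon-Pazeh -> Kulon, Pazeh
    "wya": ["wyn", "wdt"],                # Wyandot -> Wyandot, Wendat
}
MERGES = {
    "ajt": "aeb",  # Judeo-Tunisian Arabic into Tunisian Arabic
    "lak": "ksp",  # Laka (Nigeria) into Kaba
    "smd": "kmb",  # Sama into Kimbundu
    "snb": "iba",  # Sebuyau into Iban
}
RETIRED = {"pii", "wrd"}  # retired by ISO with no successor named in the list


def successors(code):
    """Return the ISO codes that replace `code`, or [code] if it is unchanged."""
    if code in SPLITS:
        return list(SPLITS[code])
    if code in MERGES:
        return [MERGES[code]]
    if code in RETIRED:
        return []
    return [code]


print(successors("uun"))
print(successors("ajt"))
```

A split returns several candidates on purpose: each existing use of the old code still has to be checked by hand to decide which successor applies.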

We recently also split Wyandot into Wyandot and Wendat, but the former may need a code change. Thadh (talk) 06:30, 31 March 2022 (UTC)[reply]
Also, yes, Ruthenian is what we have as Old Ruthenian, that's a code change we need. Thadh (talk) 06:33, 31 March 2022 (UTC)[reply]
Apparently, ISO's "Ruthenian" is actually Pannonian Rusyn. Thadh (talk) 17:46, 26 April 2022 (UTC)[reply]
Interesting. (Indeed the code request refers to it as "the language of the Ruthenians in Serbia and Croatia".) Which (a) means we should probably name it something more usual, like (Pannonian) Rusyn, if we include it, and (b) raises the question of whether it needs its own code or can be handled as rue Rusyn. Pinging @ZomBear, PetrGruko: do you have any sense of whether Pannonian Rusyn should be treated as a separate language from Rusyn? - -sche (discuss) 23:51, 27 April 2022 (UTC)[reply]
@-sche I doubt that we need the 5th East Slavic language. Firstly, it should be clear that rsk Pannonian Rusyn (aka "Ruthenian" or Rusnak), supposedly a language, is not zle-ort Old Ruthenian. Personally, as a Ukrainian, it is generally difficult for me to perceive the Rusyn language as a different language, because listening to the Rusyns speak, I hear 98% of the Ukrainian language mixed with Slovak, Polish and Hungarian borrowings. Returning to Pannonian Rusyn... I couldn't find a single dictionary at all to compare it with Carpathian Rusyn language. If anyone knows where to find it, please share the link. I'll be happy to hear other opinions. I can quite accept the creation of a new code for the rsk Pannonian Rusyn language. Probably, like with the Sorbian languages in West Slavic. --ZomBear (talk) 02:44, 28 April 2022 (UTC)[reply]
I have just updated Wiktionary:Language treatment to mention what we have done and do with “Judeo-Arabic” (which at least in general reads as Arabic transliterated into Hebrew script or dialects though featuring distinctions still mutually intelligible with the local majority one). Fay Freak (talk) 18:16, 11 April 2022 (UTC)[reply]
Thanks! - -sche (discuss) 20:41, 8 May 2022 (UTC)[reply]
The splitting of xdq Kaitag and ugh Kubachi from dar Dargwa is due to the efforts of the Kaitag language activist @Alkaitagi. Russian sources usually view Dargwa as a single language with different dialects. There are many more Dargwa varieties for which no activists exist: w:ru:Категория:Даргинские языки. Leaving aside the theoretical problem of dialect vs language, we should not split Dargwa on Wiktionary until we get an active editor here in these languages/dialects who requests the codes for practical reasons. The code dar Dargwa is sufficient for now for adding an occasional translation or a cognate by non-natives like me who are not competent to distinguish between the dialects. Vahag (talk) 19:14, 11 April 2022 (UTC)[reply]

Cuneiform in inflection tables: yes or no?

Hi all! I would like to ask your opinion about having cuneiform phonetic spellings on inflection tables for Akkadian entries. The current inflection template for Akkadian nouns and verbs creates two tables:

  1. with the normalised (=reconstructed transliteration) forms of the inflection;
  2. with the same forms but spelled out in phonetic cuneiform + its transcription.

You can see an example here: 𒅕𒍢𒌈 (erṣetum).

The phonetic cuneiform spelling is just one of the many possible ones, and moreover the same word could also be spelt with the corresponding logogram(s) or a mix of the two, so the second table is not really giving any more information than the first one.

Now, Akkadian words are lemmatised in their normalised form (in the Latin alphabet) with Logograms and phonetic or mixed spelling listed in each entry's "Cuneiform spelling" table. We also give a cuneiform spelling of Akkadian words in linking templates ({{m}}, {{l}}, {{cog}} etc, mainly when they appear in other languages' etymology sections or descendant trees) and quotes/usexes, so we do give the cuneiform as well when needed.

But in all other cases we just use the normalised forms.

Giving two tables for Akkadian inflections feels a bit "heavy" and redundant (if not confusing) to me. I would be inclined to get rid of the phonetic cuneiform table and keep just the normalised one, to keep things simple and neat.

I don't think this matter requires a formal vote. If enough people here agree, I'd proceed and simplify the template. Otherwise, I'll keep things like they are now. Well, let me know! Thank you! Sartma (talk) 17:39, 31 March 2022 (UTC)[reply]

Yes. Tables are collapsed by default, they don't use that much space, and the button could be compressed to use even less. -- ObnoxiousCoder (𒅀𒀀𒋾𒁺𒁍𒌒) 18:03, 31 March 2022 (UTC)[reply]
@ObnoxiousCoder: You created the templates, so fair enough. It's not about the space they take, though. This being Wiktionary, space is the last thing we should worry about.
I gave you my reasons above, but you didn't touch on any of them. Why do you think that we should give the phonetic cuneiform in inflection tables? What's the reasoning behind having it in the first place?
Anyway, if we end up keeping it, we need to go through those templates and work on them seriously before adding them to all entries, possibly join the tables into a single one and review verbal spelling. At the moment, for example, the Akkadian verb template gives the wrong cuneiform spelling for the Durative of I-a verbs, like for amārum arārum. The first person of the Durative is correct in the normalization table as arrar, but the cuneiform one is wrong. In Old Babylonian the spelling would have been either 𒀀𒅈𒊏𒅈 (a-ar-ra-ar) or 𒀀𒊏𒅈 (a-ra-ar), but spellings like 𒅈𒊏𒅈 (ar-ra-ar), as shown in the table, are not normally found. Sartma (talk) 20:58, 31 March 2022 (UTC)[reply]
We should have them because that is the way we do it with all other languages here on Wiktionary. In fact, the way we actually do things is by prioritizing the original script, and offering a transliteration/transcription as a secondary option.
With respect to amāru, how can arrar be the Durative form? Don't you mean ammar? ObnoxiousCoder (𒅀𒀀𒋾𒁺𒁍𒌒) 02:23, 1 April 2022 (UTC)[reply]
@ObnoxiousCoder: Soz, I meant arārum. Sartma (talk) 08:05, 1 April 2022 (UTC)[reply]
@Sartma Those edge cases are not difficult to fix. ObnoxiousCoder (𒅀𒀀𒋾𒁺𒁍𒌒) 11:51, 1 April 2022 (UTC)[reply]
@ObnoxiousCoder: I still think it's completely arbitrary to give just one possible spelling, especially when it comes to nouns. Akkadian should follow the example of Egyptian, which is the most honest and balanced way of giving inflections. There is no need to misrepresent the language by making arbitrary spelling choices. Sartma (talk) 16:13, 1 April 2022 (UTC)[reply]
@ObnoxiousCoder: We do that with modern or classical languages that have a standardised spelling where one word has one spelling (with very few exceptions) and that's it. For ancient languages that don't have a standardised orthography we use some kind of transcription. See Egyptian, for example bḫnt (Just one random word. You can check other lemmas too). Sartma (talk) 08:18, 1 April 2022 (UTC)[reply]
Isn’t Egyptian the only script we do that for? And I was under the impression that’s because Unicode doesn’t cover it very well yet. I don’t see why we can’t do what we do with every other language with variant spellings and simply have entries at each of them. There’s no need to prioritise one over the other, either. Theknightwho (talk) 18:19, 23 April 2022 (UTC)[reply]