Welcome to the Beer Parlour! This is the place where many a historic decision has been made, and where important discussions are being held daily. If you have a question about fundamental aspects of Wiktionary—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list below (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don’t make personal attacks, don’t change other people’s posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page and consider before posting here whether one of our other discussion rooms may be a more appropriate venue for your questions or concerns.

Sometimes discussions started here are moved to other pages for further development. In particular, changes to a major policy or guideline may be discussed on the corresponding talk page and “simple votes” (as opposed to drawn-out discussions) can be conducted on our votes page.

Questions and answers typically remain visible on this page for one to two months, but they can always be found in the appropriate monthly archive (based on the date the discussion was initiated). While we make a point to preserve all discussions that were started here, talk that is clearly not appropriate for this page may be deleted. Enjoy the Beer parlour!

June 2024

How to resolve conflicts on Wiktionary

Thou wilt quarrel with a man that hath a hair more, or a hair less, in his beard, than thou hast[.] — William Shakespeare, Romeo and Juliet

Throughout the lifetime of this online dictionary, there have been plenty of conflicts between users. Some of these are unlikely to end any time soon, such as the fight between admins and vandals. This fight has a clear "good guy" and "bad guy", unlike some of the other fights we have had over the years. These morally-conflicting fights often turn into virtual bloodbaths, with people hurling vicious insults at each other, at each other's throats over who's harassing whom or whether some person should be banned. While these are important conversations to have, too often the main point is ignored in favor of calling people "idiots". This problem has been pointed out before [cf. Beer Parlour July 2023 § "please reduce the heat"], but nothing seems to be done on the topic, and we seem to keep going in circles, never reaching the point where we can have civilized discussions.

I intend to change that. Please, leave your thoughts as to how we can avoid any future conflicts for good. CitationsFreak (talk) 04:13, 1 June 2024 (UTC)

In principle, there is no way to stop all conflict, of course, but a really good start would be an expectation of civility and admins enforcing that. —Justin (koavf)❤T☮C☺M☯ 06:38, 1 June 2024 (UTC)

Would that just be the way to end all heated conflicts? CitationsFreak (talk) 14:48, 1 June 2024 (UTC)

Clearly not all of us have the same standards for civility and for the need for civility in all interactions with others. Further, some don't seem to care much about feedback from others about their behavior. In some cases, people seem to get very annoyed that others might be potentially causing them to waste their precious time, taking them away from their sacred mission to improve Wiktionary, by their lights. I'm pretty sure that the folks whose behavior I most object to are supremely confident that they are right and that civility is for others who are not on the sacred mission they have defined. DCDuring (talk) 15:04, 1 June 2024 (UTC)

Pobody's nerfect and no system is perfect, but it's a start. —Justin (koavf)❤T☮C☺M☯ 20:09, 1 June 2024 (UTC)

Conflicts are not always bad if people are arguing about something they both care about. I guess that the biggest problem is when people call each other bad words when arguing about some stupid stuff like a definition of a sand broom, without caring about anyone’s input. It feels that they are either too drunk or not drunk enough. However, luckily, it’s not happening so often. Tollef Salemann (talk) 15:38, 1 June 2024 (UTC)

I think @Theknightwho should try to make an effort to participate in drama less. I think we can be reasonable and agree that nothing good came out of engaging with Wiktionary:Beer parlour/2024/May#Stalking/harassment by User:Theknightwho which was a pretty obvious (yet successful) attempt to fan up drama. Ioaxxere (talk) 18:46, 1 June 2024 (UTC)

@Ioaxxere Sure, but there needs to be a way to resolve issues that doesn't just amount to ignoring them. Theknightwho (talk) 18:57, 1 June 2024 (UTC)

Yeah, Help:Dispute resolution doesn't offer much guidance except, hilariously, relax and do something more important and assume that they [the other user you're disputing about] are eccentric and will thus never be able to see eye to eye with you. Hardly worthy of a Nobel peace prize. P. Sovjunk (talk) 19:11, 1 June 2024 (UTC)

It's rather pithy... but I also sometimes feel this way looking at drama from the outside in. Vininn126 (talk) 19:49, 1 June 2024 (UTC)

@Theknightwho Sometimes stepping away from things IS the best course of action. You need to do that more often. Purplebackpack89 06:02, 5 June 2024 (UTC)

@Purplebackpack89 Given the amount of friction you're creating, you actually might want to do the same. Benwing2 (talk) 06:15, 5 June 2024 (UTC)

@WordyAndNerdy: Do you have feedback here? I'd value it. —Justin (koavf)❤T☮C☺M☯ 20:10, 1 June 2024 (UTC)

It isn't feasible to "avoid any future conflicts for good." Such an approach will only worsen conflicts that inevitably emerge. Ignoring problems doesn't resolve them. It's like putting a lid on a pot and expecting it not to make a huge mess when it boils over. If TKW had been given guidance at an early stage, this issue may not have grown to this extent. Now there are at least five productive contributors (me, Huhu9001, LlywelynII, Mahāgaja, Purplebackpack89) who find his admin conduct to be a recurrent issue. Wiktionary desperately needs both formal dispute resolution processes and the willingness to enact them. This isn't about "fan[ning] up drama." Seeing it characterised as such with no pushback (except from TKW, to his credit) does little to reassure me that Wiktionary is interested in having difficult conversations as a community and making necessary systemic changes. I have noted that TKW hasn't been as combative as in past incidents. That gives me hope that there's room for course correction. But my continued participation here is contingent on resolving the current policy vacuum. We cannot have a repeat of an admin (Equinox) functionally being given carte blanche to be as hostile and combative as he pleases for years because he also makes valuable contributions. WordyAndNerdy (talk) 21:54, 1 June 2024 (UTC)

Huhu and Purple have both been criticized for not being productive. Mahāgaja's complaint has been addressed as their behavior was problematic. Please don't ignore these aspects in your diagnosis. Vininn126 (talk) 22:00, 1 June 2024 (UTC)

Also LlywelynII has been heavily criticized for being sloppy. The term "productive" is being used too loosely here. Vininn126 (talk) 22:01, 1 June 2024 (UTC)

Huhu9001 seems to do solid work in the Japanese language area and in template-space. My understanding is that the dispute between TKW and Huhu9001 arose over changes that TKW made to modules that ended up unintentionally breaking things. So, if LlywelynII can be faulted for "being sloppy," so can TKW. Purple has been around as long as I have. Wiktionary has shown habitually sloppy editors (Luciferwildcat) the door before. I wouldn't necessarily number Purple among them. In any case, the common denominator in these disputes is TKW, not anything any editor did to get on his radar. It also needs to be underscored that all of these disputes were unrelated. TKW has found himself at the centre of multiple heated disputes with unconnected editors working in different areas of this project. That isn't a coincidence. It's a sign of a pattern of escalating and personalising conflicts. WordyAndNerdy (talk) 22:19, 1 June 2024 (UTC)Reply

This only half-addresses the issues I raised with hand-waving. A user frequently (I admit, too frequently) addressing sloppiness in others' edits does not make their edits not sloppy. From Purple I have seen 10x more drama and the issue "is a hot-dog a sandwich", which I'd hardly call productive. Huhu has been criticized by others, as well, and is known to be abrasive in conversations. So no, it's not just knight there, it's also an uncooperative personality. I find your reply to be lacking. Vininn126 (talk) 22:23, 1 June 2024 (UTC)

Isn't this thread now devolving into precisely the kind of escalation that it was designed to stop? Theknightwho (talk) 22:46, 1 June 2024 (UTC)

This seems like a seeing-the-forest-for-the-trees situation at best. Whether a rank-and-file contributor is insisting hot dogs qualify as sandwiches (if subs are sandwiches, so are hot dogs, FWIW) is perpendicular to the issue of problematic admins. Huhu9001 having been "abrasive" at some point doesn't justify an admin becoming hostile in kind. It absolutely did not justify TKW implementing a blatantly retaliatory block against Huhu last year. Admins have more power than rank-and-file editors. They need to be held to a higher standard of conduct accordingly. They absolutely shouldn't take administrative action in disputes in which they are personally involved. Admins aren't frontier sheriffs. They shouldn't be making and enforcing policy at their own discretion. Power necessitates accountability and a certain level of restraint. What has been core policy on every other WMF project for decades shouldn't be weirdly controversial here. We shouldn't have a culture in which everyone nods along as an editor (not TKW, to be clear) with a history of inserting Daily Stormer quotes votes against an anti-harassment proposal with inane blather about "wokery" and the suggestion that PB89 seek "treatment for paranoia." This discussion is doing nothing to relieve my sense that Wiktionary loves being a boys' club. It really does seem that some users will be forgiven any trespass, however severe, while others, no matter how much good work they do, will be summarily dismissed and denigrated and blamed for inviting the hostility to which they've been subjected. WordyAndNerdy (talk) 02:44, 2 June 2024 (UTC)

1) An admin doing their job by addressing sloppy edits is a good thing.

2) Fay Freak’s point about “wokery” seems à propos given your throwing out questionable accusations of misogyny and now, apparently, Nazism. Apparently pointing out that a user is being a bit high maintenance means one must hate women. And pulling a random collection of usage examples from Google that happens to include some kind of far-right tabloid rubbish means you might as well be merrily goose-stepping and Heil Hitlering your way to the Reichstag. Nicodene (talk) 04:43, 2 June 2024 (UTC)

Ugh, User:Nicodene, you are not doing yourself any favors with this post, and you are aptly illustrating User:WordyAndNerdy's point about Wiktionary being a boys' club. Benwing2 (talk) 04:53, 2 June 2024 (UTC)

Also, IMO Fay Freak is in a class of their own with their weird views (and contorted syntax). They know a ton about obscure languages but tend to go off on bizarre rants/tangents that are best ignored; I would not hold them up as an example to be emulated. Benwing2 (talk) 05:00, 2 June 2024 (UTC)

@Benwing2 Given that neither I nor WAN are happy about this, it seems fairly clear that the underlying issue is not that this is an old boys' club, but that there is no adequate way to resolve conflicts, because consequences are essentially arbitrary, and there is a culture of admins allowing things to peter out instead of actively drawing things to a close. WAN has concluded it's because of nepotism because she's only considering me (and now, apparently Fay Freak), but doesn't seem to realise that she's got away with quite a lot of disruptive behaviour herself, and it's not like people haven't noticed ([1]). Theknightwho (talk) 05:15, 2 June 2024 (UTC)

Name one example of "disruptive behaviour" on my part. Since I'm allegedly guilty of so many you ought to be able to name one. Our clashes at shitgibbon and cupsona don't count. Neither of us behaved with the decorum we ought to in both instances. I've never deleted the main page. I don't habitually insert nonsense into entries. I don't add translations for languages I don't know. I think the most dust I've ever kicked up is over a user having a Patreon in 2015 and the weird resistance to accepting online cites in 2020-2022. And in both cases I just voiced my opinions and left for a time. Rather the opposite of "disruption," I'd say. Unless you're insinuating that not contributing is itself a form of disruption. In any case, you're deflecting again. WordyAndNerdy (talk) 05:42, 2 June 2024 (UTC)Reply

@Theknightwho I agree with all your points about the problems with Wiktionary (and I think Nicodene's comments were inappropriate). I do not think User:WordyAndNerdy's attempt to get you desysopped soon after Huhu9001's attempt was called for, and I said that at the time; but at the same time it's hard not to notice how multiple times, WAN has made a statement about something problematic in Wiktionary, and expressed a fear of getting subjected to denigration and hostility for expressing this, and someone then proceeded to come out and do exactly that.

As for a more systematic way of resolving conflicts, we definitely need that; but at the same time I don't think there's any appetite for a Wikipedia-style legalistic approach. IMO it has to be more mediation-based than arbitration-based, with arbitration-style "let's lay down the law" as a last resort. I think a good start would be maybe something like this: (1) a more clearly expressed code of conduct that clearly prohibits bigoted remarks, and gives examples of reasonable punishments for transgressions that admins (or bureaucrats if an admin is the transgressor) can make; (2) some sort of "appeal" process if one or the two sides (transgressor or transgressee) feels they're not getting fair treatment or their concerns aren't being heard or addressed. My hope is to avoid long, drawn out processes in the vast majority of cases, because IMO people here don't have the time or energy for this. Benwing2 (talk) 05:43, 2 June 2024 (UTC)Reply

I am in full support of your plan. CitationsFreak (talk) 05:47, 2 June 2024 (UTC)

@Benwing2 I'd like to avoid long, drawn out processes as well, but I'd prefer them over long, drawn out threads where everyone gets angry and nothing gets done. Theknightwho (talk) 06:09, 2 June 2024 (UTC)

I'd support this as well. AG202 (talk) 21:06, 2 June 2024 (UTC)

@User:Benwing2 I think that "bigoted remarks", problematic though they are, are not the source of all the bad behavior that policy needs to address. More common are uses of derogatory labeling of people as, e.g., idiots, morons, drama queens, even when cleverly or humorously worded. The emphasis in establishing a behavioral norm like "No personal attacks" has to be on personal. We may need a total ban on personal attacks (including accusations of Nazism, gender bias, etc.). Enforcement of such a ban couldn't be on a hair-trigger, but it would point in the right direction. A single personal attack should require an apology or temporary block; multiple personal attacks, say, over the course of 12 months would earn longer blocks, etc. I'm not sure about how to enforce better behavior by admins and veteran users (and their bots, templates, and modules). DCDuring (talk) 19:42, 2 June 2024 (UTC)

Calling out someone for using racist/misogynistic/etc. language or linking to a neo-Nazi site in an entry isn't a "personal attack." Usually such call-outs are backed up by diffs demonstrating said behaviour. As a community we need to be able to discuss inappropriate conduct in order to effectively mitigate it. You nailed your colours to the mast long ago.[2][3][4] WordyAndNerdy (talk) 20:29, 2 June 2024 (UTC)

Did I say it was? Whatever evil we attribute to such behavior would not justify attacking the person as a Nazi or an advocate of Nazism. We should be calling out the behavior, not the person, no matter what. I am proud to advocate freedom of expression, toleration, and universal coverage of English expressions in Wiktionary based on uniform standards of attestation and idiomaticity, regardless of the source or meaning. DCDuring (talk) 02:35, 3 June 2024 (UTC)

You're off-the-mark on this front, I think, but I do respect you haven't been combative about it, and I do get the sense your take is born of principle. It's why I don't consider you a problem admin even if I regard your thinking as totemic of Wiktionary's systemic issues. Some of my reaction here may be that your initial comment was posted in "mostly unproductive." There's agreement that Nicodene's comments toward me in there crossed a line and -sche putting a lid on that is the main reason I've felt comfortable returning to this discussion.

Protecting individual freedom of expression shouldn't be a pressing concern on a crowd-sourced dictionary project. (Government censorship regimes OTOH can make our mission more difficult). No one's legal rights are infringed by a website setting standards on the type of speech permitted on the site itself. People still have a legal right to express their views on other platforms and in other contexts. Wiktionary is functionally a professional setting. Many employers maintain some type of code of conduct. Letting employees freely spout off their opinions will very likely create a hostile work environment. Our unvarnished thoughts aren't always helpful. I'm sure no one here wants to read my random thoughts on tax reform, ongoing military conflicts, etc. But sometimes uncomfortable conversations are necessary for change to occur or for problems to be rectified. We can't discuss individual user conduct issues if we can't name the specific problems some users present. It isn't a personal attack to characterise someone's speech as "racist" etc. If we treat it as such, all we'll be doing is ensuring that marginalised voices go unheard, as the majority is often resistant to putting its own biases under a microscope.

I also haven't advocated disallowing the inclusion of offensive terms. A lot of Category:English 4chan slang and Category:English incel slang is my work. I do think there are middle-ground interpretations of "Wiktionary is not censored." There is a lot of distance between having [slur] as an entry and including a quote featuring [slur] in some random entry like umbrella. The former is objectively documenting language as it exists. The latter is an unnecessary and inflammatory editorial choice. The Daily Stormer quote shoehorned into smash wasn't for a specifically neo-Nazi/white supremacist sense. It was for a sense that was a synonym of hottie (“attractive person”). This is why I long ago concluded that Fay is an edgelord. Edgelords don't necessarily personally endorse the views they express. For many it's about stirring up trouble for the lulz. But -sche seems to think Fay may be the real deal, and I do trust his expertise in this area. WordyAndNerdy (talk) 04:42, 3 June 2024 (UTC)

Expertise indeed, didn’t he graduate with a PhD in identifying Nazi lexicographers?

Frankly it looks like you’re bullying a person who may not be entirely neurotypical, wielding “they’re an X-ist!” as a cudgel to smash someone you dislike into submission. Nicodene (talk) 11:16, 3 June 2024 (UTC)

@Nicodene: She also underrates that I am not a native speaker and was triangulating new definitions, that I wouldn’t know how specifically saucy or redpilled it is or not. Lacking intercultural competence in an international dictionary. Where would I have looked, amongst all bilingual and monolingual dictionaries, to find all bymeanings and implications, huh, WordyAndNerdy? All was gained inductively, the gold-standard of documenting language for a dictionary. For these movement-kind of words multiple people had to guess around because they were previously uncovered, cf. slam later written in Feb 2022 by me.

And then rather than trying to be edgy I wasn’t too happy to quote that guy so that’s why I hedged and balanced it with other quotes and Wikipedia links for author and publication where you already read “far-right, conspiracy theorist, neo-Nazi, white supremacist, misogynist, Islamophobic, antisemitic, and Holocaust denial”; that was all I could do, save not including it, which wasn’t compelling, since dictionaries nowadays are notoriously not SFW unless defined otherwise, but the quote read so easy and illustrative! And since I had not studied psychology consciously to guesstimate the tribalization programs hinging on references, it was barely possible to be concerned; I say concerned but not bothered because I have only a cognitive simulation of what happens in others, which I note is a bit extreme in WordyAndNerdy.

This day I read p. 63:

> Carter and her colleagues (2012) were interested in investigating the ability of children with ASD to make judgments about pictured social interactions. The pictured scenarios did not require the use of language and both the children with ASD and those with typical development accurately identified the situations that depicted inappropriate social interactions. However, the children with typical development had robust activation in their language processing network when performing the task; they appeared to be spontaneously verbally encoding the information from the scenes that they were viewing. In contrast, the children with ASD had activation in a network associated with the processing of social information but no significant use of neural resources in the language network. This result suggested that the children with ASD were not spontaneously encoding the information into a verbal form.

Well, when I read this social interaction by political content creators, nothing happens and I don’t connect scenes and don’t encode the author’s or my or my publishing platform’s eventual position. Seems like others do but the automated categorization is still likely to be toned down or correlated with better possibilities by reason, and some wokeness courses do away with this capability again, such that people graduate to see intersectional discrimination structures and victimizations everywhere, trouble as a business sector, for which people privately readapt whole personal identities, as it is defined to operate by means of identification. Fay Freak (talk) 13:48, 3 June 2024 (UTC)Reply

I won't dignify this with a response except to note that I have decades of first-hand personal experience of ASD and somehow manage not to use it as an excuse for questionable behaviour. WordyAndNerdy (talk) 21:05, 3 June 2024 (UTC)

What is your excuse, then? For things like making a personal attack on the same page where you’d voted to ban personal attacks. Nicodene (talk) 23:14, 3 June 2024 (UTC)

That's not a "personal attack" by any reasonable standard. It's a thing that actually happened, as is demonstrated by the diff. You need to stop dogging every comment I make. You've already been told that your previous comments toward me in this thread have been out of line. WordyAndNerdy (talk) 01:50, 4 June 2024 (UTC)

The very first sentence of "No personal attacks" reads "Comment on content, not on the contributor." The exact opposite of what you did.

In the list of examples of what constitutes a personal attack, we specifically find:

"Using someone's political affiliations as an ad hominem means of dismissing or discrediting their views."

I should like to add that in this case it's a matter of "imagined political affiliations", since FayFreak has never once that I have ever seen actually expressed the slightest whiff of believing in racial superiority or exterminating undesirables.

I'm curious by what standard you consider anything I have written "out of line" which would not apply just as well to what you have been writing yourself. Nicodene (talk) 02:04, 4 June 2024 (UTC)

Please, it's not worth your time or effort. AG202 (talk) 02:08, 4 June 2024 (UTC)

It’s you who is dogged. The same point stands, whether framed with this concept or not. You can also consider yourself as one of extreme, insane, unhealthy, not to forget wrong. Hyperfocus on the same fiddlestick for five years. Someone with the same neural preconditions as me I would know to tell to stop being autistic; it appears the same “revisiting past points” happens if you answer trauma. Like to reinstate the DMN they rub themselves off on railway tracks, though it be obviously disadvisable.

Other candidates with ASD seclude themselves, keep their interactions brief, out of concerns or actual anxiety of being blamed or missing to react on social information appropiately, depending on their verbal abilities. Looking at the stats, instead of your first-hand anecdotes we can’t revisit (unlike my story which I retell you as far as I remember), with the must-criteria for this diagnosis of impaired social cognition + repetitive and restricted behaviours, most fall just short of schizoid or obsessive-compulsive personality disorder, so they (and me) need to make an actual effort to sidestep avoidant reaction to social input arising from its restricted interpretation. It is artificial though, rather than people-pleasing, and I had to practice it years to see things differently, which includes discussing and defending controversial viewpoints favourable for certain outcomes even if I don’t feel strongly or anyhow at all about them. I am supposed to be controversial. It is sure there would be issues incomprehensible to you, with your inflexible paradigms, if they engaged in politics, which I now have to do as a daily business since graduating law school, which is an exceptional case which you probably haven’t experienced even from hearsay, and you can’t imagine how edgelordy it had to be, for my life. Stats say hubris is greatest among jurists in Germany as compared to all academics (you take your standards from other fields?), again you miss the language and culture barrier on top of the double empathy problem, by which a lot more questionable things appear anyhow if one speaks across continents, and I couldn’t expect to reap a stalker from reciting the Daily Stormer once: kidding of course, don’t get it twisted, it is what AG202 said, not worth your time or effort, though we interpret the result differently. I know you aren’t trying to stalk or concern-troll, I tried to interpret and put into different perspective, again, and enable you. 
Extreme viewpoints which one contradicts are super necessary to set benchmarks, very different they appear in me from how you currently deal with them. Fay Freak (talk) 03:02, 4 June 2024 (UTC)

@WordyAndNerdy: I wasn't planning to respond but I noticed you quoted me here. The reasons why I characterized User:Purplebackpack89's posts as an attempt to fan up drama:

Hyperbolic language ("stalking/harassment") which is frankly disrespectful to actual victims of stalking.
Referencing TKW's desysop vote, which has little to do with the current situation (and which also seems to argue against your premise that no action is taken against problematic editors — recall that Dan got indeffed mid-vote) and quoting random comments.
Inflammatory language, viz. "his edit was so bad", "it's likely he's following me around BECAUSE it bothers me.", etc., apparently intended to provoke TKW.

Ioaxxere (talk) 05:19, 3 June 2024 (UTC)

"Wikistalking" is old-school wiki-jargon. "Wikihounding" or simply "hounding" has seemingly replaced it. But it needs to be remembered that PB89 has been around since 2010. It's not unexpected for a veteran editor to sometimes use older jargon. Wikistalking/hounding has never been regarded as a one-for-one equivalent of real-life stalking or even cyberstalking. It's exactly as PB89 has expressed it: combing through someone's edit history, systematically undoing their edits, inserting yourself into unrelated disputes, etc. It might not be intended as antagonistic, but it understandably comes across as such. TKW should ideally seek to moderate his tone and conduct if he wishes to avoid finding himself at the centre of conflicts (he has improved since last year). And mentioning the desysop votes was absolutely relevant. This is not an isolated incident. It's a pattern of conduct. Ignoring past incidents won't do us any favours. WordyAndNerdy (talk) 05:44, 3 June 2024 (UTC)Reply

@WordyAndNerdy PB89 made an absolutely unfounded accusation of harassment towards me for a single RfD of one of his terms combined with a single comment I made about him in another RfD (which is located directly above the RfD I added), in response to a comment of his. This is not the first time he has made unfounded accusations of harassment (and not towards TKW; I reserve judgment on this matter as I haven't looked at it in detail to see what the circumstances were). PB89 seems to think he can shut down criticism of his (IMO often sloppy or ill-considered) edits with such accusations. I should also add, from statements made on his user page, he rejects some core Wiktionary principles such as SOP, and seems to have difficulty understanding why Wiktionary isn't just Wikipedia-lite; so it's not surprising to me that several users feel his edits deserve extra scrutiny. Benwing2 (talk) 05:57, 3 June 2024 (UTC)Reply

I'm going to quote myself from ten years ago (June 2014) because I found this while digging up the roadworn diff and it seems just as relevant today:

Expressing minority viewpoints or being the lone dissenter in an RfD discussion does not constitute disruption. The fact that we have a formal discussion process at all means that the deletion of entries isn't an open-and-shut policy-enforcement matter completely up to the discretion of administrators. It means that RfD is an open forum where people may put forward serious arguments for or against the inclusion of terms and have these arguments weighed on their merits. Sometimes arguments put forward will not align with majority opinion. Sometimes they'll challenge the soundness of our policies. That's good! We need more of that, not less. The exchange of ideas is what discussion is all about. On the issue of "drama," as an outsider who's watched these incidents transpire from the sidelines, I'm not going to disagree that PBP's behaviour has been problematic, or that it needs to change. But the passive tolerance of incivility on Wiktionary is the proverbial elephant in the room here. We don't have formal dispute resolution or mediation processes like Wikipedia, and when incivility occurs and someone gets upset, the general response, in my own experience, is getting told that occasional rudeness and hostility is par for the course and one should learn to deal with it. This is unacceptable. So if PBP has developed a flair for the dramatic, perhaps it's because Wiktionary, lacking any means for addressing civility concerns in a reasonable and orderly fashion, has left PBP no recourse but dramatics. PBP isn't the problem; PBP is a symptom of the problem. Is it really fair to punish someone for a problem that Wiktionary as a whole has helped to create?

WordyAndNerdy (talk) 06:07, 3 June 2024 (UTC)

@User:WordyandNerdy How would you categorize the types of incivility that are not personal attacks? Or are all types of incivility personal attacks, possibly veiled? I am wondering how to give shape to a civility policy beyond the most obvious. Attacking people's unstated (and possibly imagined) values, attitudes, beliefs, or motives is an example of problematic behavior, IMO. On occasion I have resorted to this, but I believe it to be undesirable in a wiki, as well as in many other environments. DCDuring (talk) 12:40, 3 June 2024 (UTC)

@DCDuring Two examples of this are rudeness and passive aggression. We all engage in them sometimes, but they can easily have a chilling effect on productive discourse. I'm not saying we should ban them (which would probably have a much bigger chilling effect), but I do think any civility policy needs to be more nuanced than simply banning overt personal attacks and leaving it at that. Theknightwho (talk) 13:57, 3 June 2024 (UTC)

I am looking for categories of items that are relatively easy to characterize and which have a high likelihood of triggering escalation. Such categories can form the core of undesirable behavior which can be controlled. There are lots of types of incivility that are undesirable, but are hard to police. I think 'rudeness' and 'passive-aggression' are hard to define operationally. We can't start with them or let their existence prevent action on what might be relatively easy to control. My hope is that the basic lessons of the psychology of interpersonal relations can be productively applied here. DCDuring (talk) 14:07, 3 June 2024 (UTC)

Targeting a large number of edits by the same editor all at once is likely to make that editor feel targeted. You talk of basic psychology...basic psychology would suggest that, if a large number of edits (made in some cases over a period of years) are all targeted at once, that I would feel targeted! Anyone probably would! What's the solution here? Spread it out! Instead of targeting all my edits in a period of a few days, maybe take a couple months. Purplebackpack89 14:40, 3 June 2024 (UTC)

I would argue that this discussion should not be a forum for airing personal grievances and settling scores. User talk pages are better for those purposes. When they fail, a mediator's assistance might be warranted. We could at least try to generalize to the matter of how, in an environment of volunteers, a patroller should select entries and edits for revision and how the patroller and 'targeted' patrollee should interact. DCDuring (talk) 15:03, 3 June 2024 (UTC)

"When it rains, it pours." I'd say the airing of personal grievances here was inevitable. When there's been no remedy for problematic user conduct – and when discussions on the subject have fizzled out – the result is feeling unseen, unheard, and unvalued. Such an experience can naturally leave one with a sense of injustice. But I hope that everyone's gotten things out of their system now and we can focus on finding solutions.

I'd say that the question of how to categorise "types of incivility that are not personal attacks" depends on tone, context, and several other factors. If someone is generally in the right but is unnecessarily hasty or severe about it, I'd characterise that as "rude," "short," or "brusque." An example might be someone reverting a poorly-formatted but well-meaning edit by a newbie with "learn correct mark-up!" If they're unnecessarily harsh ("f***king learn mark-up!"), I'd describe that as "abrasive," "hostile," etc. If they assert their own superiority ("learning mark-up isn't hard!"), I'd call that "snide," "condescending," etc.

None of these present major issues in isolation. We're all human and we all err from time to time. It becomes a community problem when it's a pattern of behaviour. That said I don't think it will be necessary for any user conduct policy we create to classify types of incivility with this level of granularity. All of this could be covered by a general advisement to "please try to keep a cool head and remain respectful in discussions".

Where I think we would need to get into specifics is with statements that express antipathy toward characteristics typically covered by human-rights legislation. There's no reason for invective statements to target someone's race, religion, national origin, disability, sex, gender identity, sexual orientation, etc. Of course there'll be disagreement on what crosses the line. Calling someone a "dumb American" would be taking an unambiguous potshot at their nationality. But "British people don't say 'elevator.' Why are so many Americans ignorant?" is arguably just an unhelpfully cranky statement of the fact English varies between countries. More nuanced incidents will warrant consideration on a case-by-case basis. But a blanket rule against what is generally deemed hate speech is necessary for a communal project like Wiktionary to function (and thrive!). WordyAndNerdy (talk) 22:41, 3 June 2024 (UTC)

mostly unproductive

I support having actual bigotry be punished, not “one of the random usage quotes you found turns out to be from a far-right website”. For all I know one or more of the countless linguists I’ve cited over the years could have been a literal Nazi and I’d have had no idea whatsoever. Should I be put on trial too?

I get that you think I’m “spewing crap” and such but to me this is a genuine concern. I’ve seen more than one community turn unbearably toxic from this sort of thing. Nicodene (talk) 06:53, 2 June 2024 (UTC)

I feel that you being on a far-right website would be obvious. CitationsFreak (talk) 07:03, 2 June 2024 (UTC)

If you’re there to actually read the article or check out the website, yes. It’s never occurred to me to examine a site like that when I’m just there to grab a quick usage example that Google found for me. Now I will, though, out of terror of being accused of “Shoehorning neo-Nazi propaganda into random entries”. Nicodene (talk) 07:31, 2 June 2024 (UTC)

You probably should read the articles, at least the parts that give context for the word. (Plus, FF literally said "[N]eo-Nazis are […] possible contributors – and [quoting them] shows that Wiktionary knows its onions." in Talk:smash, the talk page for the page where he added the Daily Stormer quote.) CitationsFreak (talk) 07:39, 2 June 2024 (UTC)

I see. I don’t agree with hosting (even innocuous) quotes from such people. Nicodene (talk) 08:04, 2 June 2024 (UTC)

No one innocently stumbles onto the Daily Stormer website off Google. Particularly not when Google is likely to filter results in compliance with local laws. WordyAndNerdy (talk) 17:40, 2 June 2024 (UTC)

That is quite literally the only plausible explanation. How else do you imagine he found “smash” used in such a specific sense, at the same time on that site as various others, if not via a search engine? He’s a Nazi with superhuman memory or perhaps incredibly lucky? Nicodene (talk) 21:20, 2 June 2024 (UTC)

I guess, it is working towards all the political bias as such, not only Nazi linguists. It is not a good idea to rely on linguistic works of Josef Stalin or Nikolai Marr. But it does not mean that all the Soviet linguists are complete garbage, even if they write some Soviet propaganda in their works sometimes. There is a difference between quoting a story of H.P. Lovecraft or Gabriele D'Annunzio and quoting a Nazi propaganda paper, even if all of them have some political bias, but a Nazi propaganda paper is made for the propaganda reason, while a story about monsters or flowers has some other main goal. As Bogdanov-Malinovsky said, it is always possible to find political bias behind any text if you dig hard enough. As for me, when I choose quotes for Russian words, I find it hard to find any Soviet or Russian well-known author not involved in any political ideology at all (maybe except for Pelevin or Nabokov). Tollef Salemann (talk) 12:57, 2 June 2024 (UTC)

I just don’t see it, I guess. E’s comment was catty, without a doubt, but how one goes from “he called me a drama queen” to “he hates women” I genuinely don’t understand. Or how one could seriously think FayFreak, of all people, to be far-right. Nicodene (talk) 05:15, 2 June 2024 (UTC)

(e/c) I'm actually mystified in the other direction, how one could have avoided noticing it; his leanings are apparent (and even prove useful sometimes, e.g. that he knew where to look to aid with Talk:ᛦ), although I made clear early on that open Nazis / Nazism isn't welcome on this site. (Beyond that, it's as Benwing says, he makes useful contributions in many areas, and disruptively best-ignored tangents and misdefinitions [see e.g. the recent discussion of Tatbestand, or Talk:negligence per se] in other areas.) - -sche (discuss) 05:20, 2 June 2024 (UTC)

I genuinely read (past and present tense) him as left-wing. Maybe I’m naïve. Nicodene (talk) 06:14, 2 June 2024 (UTC)

@Nicodene: In the past I have not well known how one could read me as anything, social cognition impaired, you know. To jump on the bandwagon of any political wing I would have to go outside or at least expose myself to some community, otherwise my personal weltanschauung at any given time is decidedly idiosyncratic. Now I hit thirty engaging none but my computer screen, library books and professors, not having been on intimate terms with any peer, it is unclear how (in this alexithymia) one could picture, or essentialize, allegiance, anymore than that of a faceless cyberhacker, only because I randomly, eclectically, made my awareness of ideological trench wars, which I never experienced other than by second hand, noted by coloured language, which cannot be extirpated, as by dint of the idiom principle they constitute the metaphors you live by. I don’t live by them, they are just epistemic content providers with signalling hazards for me.

Parents have evinced fatal neglectfulness when the first books I got after elementary school, which was a waste of time as all succeeding one, was some serious crackpot stuff ordered from some physical malvertisement, after which, without any attachment and due to genuine enthusiasm for seeing things from unconventional perspectives, I went fully down all conspiracy rabbitholes and also got the hang of online extremism extensively, while being excluded from school for patterning some bully up (legal loophole in compulsory education, you just make them not want you pending further clarification, may also be rightful self-defence, doesn’t matter, teachers are only right themselves always), though to be fair that was the triggering event for an eventual upcoming diagnosis that would have to lay the foundations for special support (which never took place, had to self-learn), and reading books is good either way, from my experience. There you start general epistemology and language science, because there must be methods to parse and evaluate any information. Without methods I am a non-responder, there are no “such people”, only information and art.

Arguing around is underrated, that’s why I am a jurist, after arguemaxxing since that time, preferably with freethinkers of course. Some write to me though and assume I am lefty for their own convenience and I help them to solve their self-care questions their DMN, trying them to fit into society somewhere, burdens them with, you are not extra. Nazis get “each to their own” as well. They are easy! And what do you think happens if someone approaches me and wants to talk me into the message of Jesus? Normally people just dismiss him but I extend my walk until dusk in order to dissuade him and convert him to infidelity, while equating Jesus and Hitler, the mindsets of such proselytizers, climate activists and terrorists of all kind on multiple occasions. All ideologies are wrong. Their proponents just made unfortunate experiences and now everything they suffer is their bedlam governed by imaginary friends and icons. I can only consent with individual policies, never partisan.

You never fail to conjecture too much into people, so I still expand. It was doubly difficult to read the room if a school assistant (so an apparent pedagogue translates, UK equivalents seem to exist not, unless you substantiate the contrary) engenders an observer effect. Protection layers of abstraction are between me and any input, mirror neurons unknown. It still baffles me how much intertextual association can haunt people. That that much of a social feedback loop experience is for allists, that someone not distancing himself from a Nazi player (which is unmotivated, as laid out, in the way that I don’t distance myself from cows and gavials, it is a different species) is permanent, pervasive, and personal. And your “social circle” is actually infectious, since you respond to social confirmation! I mean, by analogy, if that’s an inappropriate image to you, given that for my configuration social behaviour is not open to intuition but only cognitive simulation, which expressed WordyAndNerdy impudently calls “inane blather”. Apparently being unemotional and putting things into perspective is inane. Fay Freak (talk) 11:05, 2 June 2024 (UTC)Reply

Good Lord. The Daily Stormer is a neo-Nazi website. There is absolutely no ambiguity on that front. Shoehorning neo-Nazi propaganda into random entries is detrimental to Wiktionary's reliability, whether it's as the result of 1) an edgelord trying to stir up trouble, 2) a free-speech absolutist making some self-sabotaging "Wiktionary is not censored" point, 3) or an actual neo-Nazi evangelising their real beliefs. I also did not make "questionable accusations of misogyny" against Equinox. I made one (1) reference to his use of "misogynistic language" in connection to me. I still have the screenshot of him referring to my Wiktionary history as the "tragic tale of the abused wife who comes back" in the Discord last year. I would've supplied that as evidence privately if this matter had ever reached a higher level. But Equinox's departure has rendered a need for that moot. That doesn't absolve Wiktionary of its deep-rooted systemic issues as a community. WordyAndNerdy (talk) 05:19, 2 June 2024 (UTC)

It seems clear from the edit you’ve linked that FF used Google (or Bing or whatever) to gather quotes containing various senses of “smash”. One of which happened to be that website. For the record not all of us even know what the Daily Stormer is, and I sure wouldn’t have guessed correctly based on the two sentences Fayfreak got from it. I support removing that quote, to avoid directing any traffic to that site, but I can’t understand seeing so much malice in a simple oversight.

At least with E there is a level of actual malice. The comment was catty, needlessly personal - yes. But misogynistic? Would he have not been catty about it all if you were male? And apparently he thought you yourself were misogynistic for re-adding “cisgender” to the entry for febfem? I just don’t understand any of this. Nicodene (talk) 05:53, 2 June 2024 (UTC)

Getting really sick of being gaslit and told I don't understand what misogyny is as a woman. I'll finish attesting an in-progress entry and then I'm done. I've given enough second chances to this site. WordyAndNerdy (talk) 05:59, 2 June 2024 (UTC)

It's also clear that you didn't even read my comment because it's clear that the "misogynistic language" was a reference to a Discord comment and not the transphobic jibe at febfem. WordyAndNerdy (talk) 06:04, 2 June 2024 (UTC)

I have read your comment as well as his. And mentioned that the latter was rude and explained how I read it. He was mocking your behaviour, as far as I can tell, not your gender. If it is misogynistic then you could just say how?

The edit summary he left on febfem jokingly says that you hate women. I don’t understand the rest of that at all. Nicodene (talk) 06:11, 2 June 2024 (UTC)

@WordyAndNerdy Please (try to) ignore comments of this sort, if possible. As I said to TKW (using the equivalent Spanish proverb), a closed mouth catches no flies, and (IMO at least) it's best to give no air to people who spew crap. Benwing2 (talk) 06:08, 2 June 2024 (UTC)

My input was specifically invited in this discussion. It's turned into a beatdown, exactly as I feared it would, because some people seemingly don't want to inspect Wiktionary's hostile, corrosive side too closely. There's a reason I only made two oblique references to my gender in my first ten years on Wiktionary. Suffice it to say that existing openly as a woman on the Internet is generally not conducive to positive experiences. This discussion has only served to reaffirm that. I've had others firmly object to my judgements/opinions/etc. before, but I've never been condescended to and gaslit by multiple users like this. I'm tired and disgusted and done. WordyAndNerdy (talk) 06:44, 2 June 2024 (UTC)

“Suffice it to say that existing openly as a woman on the Internet is generally not conducive to positive experiences.” Don’t think so. Most places on the internet are generally not conducive to positive experiences, they would have to be specialized to be otherwise: minorities stick together. “Deep-rooted systemic issues as a community” exactly I see not; though there are ways to feel like it, using loaded language with spicy twang is much better than daily business outside. Wiktionary is goodt. You wouldn’t be here if it were unsafe, it’s not just hope for revenge on mirror-images of abusers, I hope.

You should by now know how men work; they structure their whole livelihoods, and by extension that of others, about enjoying one of their preferred sex two days a week (average) and thus behave differently depending on whether women are present or not, anxious if they are unsure; you are lucky not to have all the T and would wish to go back if you ever found yourself to wake up as a man. In this respect I understand Equinox as, at bottom, sincere in assuring himself that you are, in fact, cisgender. I am sorry for myself I have talk about humans’ constant rutting now, I have never started this topic, but when we have resolved the whys of its arising it can go away. None of the male sex who does not attempt to expend exceptional empathy owns up to this actual concern without a level of cheekiness, unconscious of it himself. Don’t be condescended, it is an opportunity to set the hierarchies straight, possibly appearing smarter yourself! Cognitive reframing trumps both action-based coping and venting. Indeed I strive to beget positive attitudes and don’t say anything demotivating to people, gaslighting or not from my side (not that I would know; a heart for men and women both anyway). Fay Freak (talk) 07:45, 2 June 2024 (UTC)Reply

@Vininn126 it is irrelevant whether a particular editor is perceived to be "productive" or "sloppy". That shouldn't be an excuse to be combative with them, or escalate things. Purplebackpack89 15:51, 2 June 2024 (UTC)

I've had limited but productive interaction with both TKW and WAN. I respect them both as editors and hope they can both find a way to continue editing. Contributing to Wiktionary is a particularly thankless endeavor and I imagine that, like many editors, each has received much less praise than they deserve for their efforts while being on the receiving end of a disproportionate amount of criticism. They, and other editors, have good reason to feel aggrieved and I think that we, as a community, could do a better job of shutting down bad behavior earlier and providing a forum to air grievances where the involved parties could get some perspective from uninvolved editors instead of feeling like they have to personally defend themselves against attacks. I would hope such a forum could provide actionable support for legitimate grievances, perspective for editors who feel slighted by innocuous remarks or edits, and a quick boot for anyone using it in bad faith. JeffDoozan (talk) 00:16, 2 June 2024 (UTC)

I'll be honest. I think too many people here have stopped actually building a dictionary. I don't like that. So I'll be absolutely clear as to my position once, and I sincerely hope that at least some of the people here that are trying to figure out how to emit as much aggression as possible onto unknowns on the internet will find a better hobby.

I didn't become an admin to enforce any rules on "civility" or the like. I simply don't care. I should probably start helping out with closing RFDs and RFVs more often (I have been pretty busy with real-life things, but right now I have a bit more time), but other than that I am a volunteer as much as anyone else on this website, and I don't come here to do busywork I wouldn't even do if I were paid.
- So, basically: If you need a nanny, this isn't a website for you.
- If you get called an idiot, or stupid: Tough luck, you making a BP post on that only proves this statement.
- Actual slurs are a different matter, and we shouldn't tolerate those in any shape or form. Use your head.
We aim to be a full dictionary. We are also a politically neutral dictionary.
- Yes, that means we have entries for slurs, Neonazi slang, communistic formation and whatnot.
- We shouldn't use politically loaded quotes unless necessary, but sometimes they are: 99% of literature written on the territory of modern Russia in languages other than Russian will be loaded with communistic messages, that doesn't mean we shouldn't quote them.
- If anything, quoting anything that shows a capitalistic or religious view (including the Bible even!) should be as problematic as neonazism or communism.
- If you can't handle us hosting such quotes when they are necessary, maybe lexicography isn't your thing.
- If you find something you think wasn't necessary, remember: assume good faith. That's like page one of our whole dictionary. I feel this rule that should be plastered all over the website is forgotten too easily in the last few years. The person adding a neonazistic quote isn't necessarily a neonazi themselves, they may just be lazy and have found this quote before any others. That's why I add communistic quotes for Ingrian, because that's most of the literature, and it's easier for me to just take a book and add quotes word for word than look through the entire corpus hoping to find a sentence where the word "religion" isn't followed by "is complete bollocks".
The recent amount of technical "fixes" has grown out of control.
- Entries go first, templates go second, and markup goes last.
- Going out to change any technical feature of a language you aren't personally in the process of adding entries for should be done only at the request/agreement of the ones that do edit it. In the best case, you will have to re-do these changes later on when an active editor appears, and in the worst case you will lose every single editor that is invested in working on this dictionary at all.
In the end, seriously, I would rather have an editor do constructive work and be a little rude than an editor doing nothing and be the nicest person in the world.
- I'd say 99% (yes, I like that number) of the languages in our dictionary are grossly underrepresented. To give an example: Just today Ingrian (which has an estimated 20 native speakers) surpassed the closely-related Estonian (which has an estimated 1.2 million native speakers) in terms of number of lemmas, and the situation in Africa and Southeast Asia is even worse.
- If an admin is monitoring your edits, it's because you apparently did something wrong. Doesn't mean you're a bad editor, just means you have room to grow. See what was changed and try applying that in the future.
- Now, if you continue to make the same mistakes over and over again, then you'll at some point get the message "Please stop, and if you don't you'll get a block", and at that point you should really stop. We cannot keep fixing your mistakes for you.
- To the admins monitoring: If you tell the people why you're going to monitor their edits, that will probably be more effective than just acting like you're not doing that, or only explaining it after they have completely freaked out.

Maybe let's stop trying to figure out who's right and wrong and start actually working on the dictionary? Does that sound like a plan? In that case, we don't need any conflict resolution, because nobody will offend anyone and nobody will get offended. Sounds like a win-win to me.

Because seriously, what in the world is keeping you from editing so much that you absolutely need me and a few dozen other editors to write this type of enormous text just to solve it? Thadh (talk) 15:59, 2 June 2024 (UTC)

I'd have to strongly agree with a lot here. Maybe not everything, but a lot. I'd like to emphasize that the people who stir up the most mud also seem to do the least editing. Vininn126 (talk) 16:01, 2 June 2024 (UTC)

Here's my 2c, most of which has been said by me or others elsewhere.

- Wiktionary tends to be dominated by a relatively small group of "guardians", such as Knightwho, Equinox and Fay
- Some of those guardians (again, Knightwho, Equinox and Fay) have problems getting along with non-guardians
- The guardians aren't that interested in holding each other accountable
- Some of the guardians are OK with driving non-guardians from the project. At least one of them (rather foolishly) stated that publicly.
- This is in conflict with one of the base principles of all Wikimedia projects: that anyone can edit them
- With great power comes great responsibilities. In exchange for being awarded the blocking tool, admins should be expected to be held to a higher standard than non-admins
- There is no deadline. Except for obvious vandalism, there's no need for minor tweaks to be done immediately, nor is there any need for them to be done by any one editor in particular
- It's been pointed out quite a few times, by several different editors, that Knightwho has a problem with conflict and escalation (one example being that, when I felt harassed, he just went further and further back into my edits, rather than stepping away)
- Remedies have been offered to KnightWho on how to avoid conflict, and he's ignored them

What does this mean in real terms?

- De-escalation is a good and necessary thing
- If the parties are unwilling to de-escalate, remedies like two-way interaction bans need to be available.

Purplebackpack89 15:44, 2 June 2024 (UTC)

I am going to be perfectly frank. Someone shouldn't be an admin if they aren't willing to enforce user conduct standards. Civility is one of the five pillars on Wikipedia. There is no reason for a load-bearing policy to be entirely absent on Wiktionary except to preserve and enable a toxic culture. Any rank-and-file editor could theoretically do menial maintenance tasks such as closing RfVs. I had a short stint running Word of the Day back in 2012 and I was (and remain) a non-admin. The necessity of admins is not in doing maintenance tasks but in keeping the peace. With the ability to block disruptive users, they might be thought of as a wiki's police. Ideally, blocking shouldn't be the first line of defence. Problem users can be dealt with through guidance, de-escalation, interaction bans, mediation (if such a process existed here). When one of the few woman editors sticks her head above the parapet to speak on her negative experiences, she shouldn't receive gaslighting, condescension, and a stunningly weird and deeply discomfiting jeremiad about how men are too horny to work with women in response. It's impossible to have a serious conversation when this type of rank nonsense is tacitly allowed. Was this thread started to have a discussion about how Wiktionary can create dispute resolution processes? Or is it an exercise in hand-waving and navel-gazing ("Why can't everyone just get along?") without any actual commitment to examining Wiktionary's systemic issues and implementing badly-needed changes? The fact that a civility policy seems slated to be rejected by a landslide beggars belief. I honestly don't think anything is going to change without WMF intervention. The rot has spread too deep for Wiktionary to keep its own house. WordyAndNerdy (talk) 17:17, 2 June 2024 (UTC)

I did hope that it would lead to the former, myself. (In fact, I hoped that we would make a dispute resolution process.) CitationsFreak (talk) 18:08, 2 June 2024 (UTC)

It won't. I went into this discussion skeptical, and it's affirmed every misgiving I had. Even the level heads in the room seem to be taking a hands-off approach. No one wants to be the one to button down and call for change. Tall poppies are smacked down; squeaky wheels are dismantled. Doesn't matter if they've got 14 years of solid work behind them. Preserving a cootie-free space for the boys' club is apparently more important than building a dictionary. Heaven forbid anyone be required to exercise personal restraint in what is functionally a professional setting. That's woke pinko free-speech suppression or something. WordyAndNerdy (talk) 18:47, 2 June 2024 (UTC)

@WordyAndNerdy I made a (bare-bones) proposal above, do you have any thoughts about that? User:CitationsFreak and User:Theknightwho are the only ones who made any comments about it so far. I am trying to find something that will both have some substance in it and work in practice (two aims that aren't easy to reconcile). Benwing2 (talk) 18:59, 2 June 2024 (UTC)

Do you mean this? I'd considered the possibility of a semi-formal mediation process myself. But such a scheme would be just as easy to game as a more legalistic one. Too often subjective judgments inform individual perceptions of a situation. The scale will always be weighted in favour of those with power and the right connections. People are more willing to assume good faith of people they admire and/or consider friends. Which is why I believe an intermediate stage in the dispute-resolution process would be necessary. Problem users (including admins) could be restricted to 1RR and required to bring concerns to the BP to ensure uninvolved eyes assess the situation. We'd need to be comfortable with enforcement being applied asymmetrically in some cases. Sometimes both "sides" in a conflict aren't equally guilty of bad behaviour. An admin who is habitually hostile/antagonising isn't the same as a rank-and-file editor who reacts poorly in an isolated instance. That's a level of nuance more legalistic approaches are generally better at handling. WordyAndNerdy (talk) 19:39, 2 June 2024 (UTC)Reply

@WordyAndNerdy Thank you for your response. I think in general, edit wars should quickly be brought to the Beer parlour; if you get to the point that you've done 3 reverts (or even two), you should stop and bring the discussion to the BP. At least, this is what I've done and I have seen others do the same. We are generally less tolerant of edit warring than Wikipedia is. Maybe something like this can be put into a formal policy. I do agree that sometimes one person will be right and the other wrong, although it's not always apparent to outside admins. As an example, there was a dispute a few years ago between User:Saranamd (aka Tibidibi/Karaeng Matoaya) and B2V22BHARAT. Both users asserted the other was wrong and was edit warring; eventually it was clear that the latter user was in the wrong and was blocked for a week (causing them to leave), but it took a while to sort this out, esp. since there was no admin dedicated to the dispute. I agree in general that any process can be gamed, but having the process is better than not having one at all, and I think maybe a mediation process with a single uninvolved admin could be an intermediate step required before a full legalistic panel. I have read through such panels in Wikipedia, and they're exhausting just to read (much less to participate in, I'm sure). Such panels may be necessary in Wikipedia because they are often caused by underlying real-world political disputes (abortion and other US political issues; the Israeli-Palestinian conflict; a whole host of Eastern European conflicts; etc.). But in my experience these disputes are thankfully less relevant in Wiktionary, where the disputes instead are more on the personal level. I invite others to contribute suggestions regarding what should be considered actionable, what the steps are in the process, etc. Benwing2 (talk) 21:49, 2 June 2024 (UTC)

You're correct that people are often unaware of points of contention outside their own personal experience and knowledge base. That's why it seems integral for project Wiktionary to strive to both invite and sustain a diverse editor base in order to help counteract systemic bias. While I'd personally prefer a more structured ("legalistic") approach, any dispute-resolution process would be a vast improvement on none. WordyAndNerdy (talk) 22:59, 2 June 2024 (UTC)Reply

I am cautiously more hopeful; I read support on the vote page, even from oppose voters, for having a thought-out civility policy; the thing which the vote looks set to defeat is one editor's attempt to win a personal dispute by pushing through a page from 2006 seemingly without even reading/comprehending it enough to notice it still said one of the processes involved notifying Jimbo. I'd like to hope a guideline that doesn't posit "Head Boy of the boy's club should be notified", a modern civility policy written in 2024, is attainable. (I also think ensuring the policy / community has mechanisms for dealing with gaming is a valid and serious concern; on Wikipedia, my anecdotal count is that it seems like about half the trans editors who've dared edit trans topics there have gotten baited/gamed and censured/censored/banned; I think we do need to think about how to write a civility policy that doesn't empower the one or two people taking the stance that someone calling out / disliking Nazism is the one in the wrong.) - -sche (discuss) 18:33, 2 June 2024 (UTC)Reply

I'm also not aware of any openly trans or non-binary Wiktionarians. I'm sure there's a couple but how many want to hang around with all the trans-antagonistic soapboxing that goes on here? Our collection of trans-related terms has seemingly been built primarily by cis people. Imagine if all entries for a language were created exclusively by non-native speakers. How would that shape Wiktionary's coverage of that language in subtle ways? I mean, the general lack of AFAB editors on here is of genuine lexicographical concern. WordyAndNerdy (talk) 20:52, 2 June 2024 (UTC)Reply

While not the same issue, I feel the same way about racial issues. I've been called epithets by users/IPs and had to go on resource dives for showing that the most basic terms are actually offensive, see the history of all lives matter, specifically this edit, for an example. However, one thing I do think I've learned here, for better or worse, is that it's not worth it to get into spats even if you're in the right. It just bogs you down and puts a negative light on you. For myself, I just keep mental track of folks I've interacted with and act accordingly, such as with Equinox. Not worth it to argue anymore. That obviously doesn't work for everyone, and it's not easy, but it keeps me sane on this project, especially after 2022 with the discussions leading up to the creation of WT:DEROGATORY. I just hope that one day this project will be welcoming enough to where we can get actual coverage done for the languages that really need it. AG202 (talk) 21:05, 2 June 2024 (UTC)Reply

Same here. CitationsFreak (talk) 21:12, 2 June 2024 (UTC)

I'm not sure if I'd personally label all lives matter as "offensive." That phrase seems to be employed more as a silencing tactic than a provocation. One might argue it's the racial analogue of not all men. That kind of complexity can be difficult to condense into a context label. I might've offloaded it onto usage note as happened at TERF. But I'm willing to accept that I've got a large blind spot here. It's definitely good to have a diverse editor pool for this reason. Not everyone is going to catch errors that result from their own limited experience and/or biases. As for continuing to edit despite it all, I'm not sure that's feasible for me, given it's clear I'm unwelcome here. There was a time when it took me more than a year to point out that an editor (not Equinox, to be clear) was habitually inserting inflammatory quotes from manosphere blogs into random entries. I don't have the patience for tying myself in knots trying to explain why that's a bad thing without referencing systemic oppression and prejudice anymore. WordyAndNerdy (talk) 21:54, 2 June 2024 (UTC)Reply

@WordyAndNerdy I'd like to clarify that you are definitely not generally unwelcome. Yes, some contributors have essentially told you to fuck off, but I for one appreciate your contributions. E.g. you have added a lot of info about fandom ships, something I know next to nothing about; from reviewing your contributions, I also see stuff related to non-binary and other gender-non-conforming communities (if that is the right term), social-media memes and trends, and other stuff that's important for keeping Wiktionary up-to-date and representative of all (sub)cultures, not just the dominant one. Benwing2 (talk) 05:13, 3 June 2024 (UTC)Reply

Thank you for the kind words. One of the most gratifying things was randomly seeing "WIKTIONARY HAS SHIP NAMES???" in a tweet. Knowing my work is being referenced by people outside the fandom sphere is cool. WordyAndNerdy (talk) 06:13, 3 June 2024 (UTC)Reply

I can think of at least two who have openly identified themselves; I'm sure -sche knows of more. I'm not sure however if either of the people I'm thinking of have contributed to trans-related entries. One used to be one of the most active contributors, esp. for bot-related work, but left for reasons that (I think) are at least partly unrelated to their trans status. The other is still active but has stayed away from this discussion. Benwing2 (talk) 21:20, 2 June 2024 (UTC)

Nor are they required to engage in this discussion in an "any marginalised individual in a group is required to serve as a spokesperson" kind of way. I just think it would be nice to have more LGBT editors onboard to help counteract systemic bias. As rewarding as it has been documenting trans-related coinages on Wiktionary, it can feel like talking over actual trans people or treating them as anthropological curiosities at times. WordyAndNerdy (talk) 22:14, 2 June 2024 (UTC)

If Wiktionary really is a "boys' club", may I suggest you take the first step to improve this state of affairs by de-sysopping yourself, having been one of the boys in charge for years now? "Walk the walk", as they say.

For the record I don't buy it. A perennially catty user (Equinox) being catty to yet another person is not because they're a woman, it's because they're just another person. FayFreak is not a Nazi whatsoever, he's a "free speech" champion. You disagree with him, I disagree with him as well – the difference is you see burning malice where I see a kind of optimistic naïveté. Nicodene (talk) 22:15, 2 June 2024 (UTC)Reply

What part of "I was (and remain) a non-admin" do you not understand? Would be really nice if you actually followed this discussion instead of shadowboxing against things that no one said. WordyAndNerdy (talk) 23:11, 2 June 2024 (UTC)Reply

What part of my replying to -sche, not you, do you not understand? Nicodene (talk) 23:19, 2 June 2024 (UTC)Reply

Then use @ to make it clear to whom you're speaking because this thread is playing fast and loose with indentation. WordyAndNerdy (talk) 23:23, 2 June 2024 (UTC)

Your hostile remarks toward -sche are also completely unwarranted. Maybe sit this one out if you're just gonna throw peanuts from the gallery. WordyAndNerdy (talk) 23:26, 2 June 2024 (UTC)Reply

Basic reading comprehension on your part is not my responsibility. How "[you've been] one of the boys in charge for years now" could possibly be construed as being about you is beyond me.

I don't think what I've said (and I stand by it) comes anywhere near frivolously accusing someone of Nazism. If you'd like to apply your own apparent standards for hostility to yourself and "sit this one out", I'll be happy to follow suit. Nicodene (talk) 23:34, 2 June 2024 (UTC)Reply

Can we please de-escalate here? —Justin (koavf)❤T☮C☺M☯ 23:42, 2 June 2024 (UTC)

Feel free to start a de-sysop vote for me, but something tells me your idea of what an admin should or shouldn't want or have to do is not the community consensus. Thadh (talk) 18:36, 2 June 2024 (UTC)Reply

If we subtracted all of the statements in this discussion that themselves were about individual persons' values, attitudes, and beliefs, including defensive reactions, we would have a very short discussion indeed. I don't see that most of the discussion here is contributing to the topic-creator's concerns or even to an improvement of that statement of concerns. DCDuring (talk) 12:40, 3 June 2024 (UTC)Reply
I completely agree with you. Theknightwho (talk) 13:59, 3 June 2024 (UTC)

how to identify locations in audio snippets of minority languages?


I am cleaning up the captions of audio snippets, and I've come across an issue that needs discussion. Sometimes if the audio file refers to the location where the language in question is a minority language, the file identifies the location using the minority language's preferred name instead of the common English name (which is usually based on the majority language). Examples:

There are 1,179 snippets for Palestinian Arabic as spoken in Lod, Israel, which identify it using the Arabic name al-Lidd.
The audio for the Northern Kurdish term emerîkî comes from Van in Turkey but originally identified it using the Kurdish name Wan. (In this case I changed it to Van before the wider issue became apparent.)
There are 5-6 Northern Kurdish terms from Diyarbakır that identify the location as Diyarbakir (note the two i's in the spelling), using the Kurdish form of the same name, and one that identifies it as Amed, using the normal Kurdish name. (Note, in this case, the form Diyarbakır is a Turkified name adopted in 1937; the older form in Turkish was Diyarbekir, from Arabic.)

I'm sure there are others, but these are the most politically fraught ones I've come across. The questions are:

Should we use the common name, as Wikipedia does (the above cities are found under Lod, Diyarbakır and Van, Turkey) or defer to the minority language's name?
If we defer to the minority language's name, do we do this only in certain cases (e.g. ones that are politically fraught)? (I bring this up because e.g. Navajo names of places tend to be radically different from the corresponding English ones, cf. Window Rock, Arizona vs. Navajo Tségháhoodzání and I think it would be confusing to use the Navajo names.)
What about accent marks not typically found in the common English name? E.g. there are hundreds of Vietnamese audio snippets that currently use the spellings Hà Nội and Hồ Chí Minh City, which I've changed to Hanoi and Ho Chi Minh City in accordance with the common English names.

Benwing2 (talk) 04:22, 2 June 2024 (UTC)

~~This is the sort of thing that AI should be good at doing. —Justin (koavf)❤T☮C☺M☯ 04:29, 2 June 2024 (UTC)~~Reply

@Koavf I don't get what you're saying at all. Maybe you're misunderstanding my questions? Benwing2 (talk) 04:49, 2 June 2024 (UTC)Reply

I can be ignored here. Sorry. —Justin (koavf)❤T☮C☺M☯ 04:58, 2 June 2024 (UTC)Reply

For Navajo and other Native languages, my gut reaction is: if the entries currently use Navajo names, then either just continue to use the Native name, or list both ("Tségháhoodzání / Window Rock" or vice versa). Perhaps not in that specific case, but in the case of some other Native placenames, the nearest semi-applicable English name may have different scope/boundaries (or it may be unclear where the Native placename was, although this is probably not going to be a problem with audio files), so retaining the Native name seems useful. Slashing both would be a lot to type, but this might be mitigated if the template/module drew on T:a-et-al and so e.g. "Tségháhoodzání", "Window Rock", and optionally some even shorter name like ~"nv-TG", could all be aliases...? Pinging User:Eirikr for your thoughts.
For Palestinian Arabic, renaming cities to Israeli names indeed feels way too loaded, and for my part I would not support it. (If we have audio samples from "Bakhmut, Ukraine", does there come a point at which it's been occupied long enough that we change them to "Artyomovsk, Russia"? Ehhh...) For diacritic differences like the Vietnamese examples, I'd be inclined to use the common English form; that seems like another place where it could be useful if the template/module could know Hà Nội was an alias of Hanoi and display "Hanoi" when given the input "Hà Nội". - -sche (discuss) 06:06, 2 June 2024 (UTC)Reply
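The alias idea raised above (a template or module that accepts several input names for a place and displays one agreed form) could be sketched roughly as follows. This is only an illustration in Python, not the actual Lua behind the audio templates; the table contents, the function name `display_place`, and the chosen display forms are all assumptions made for the example.

```python
import unicodedata

# Hypothetical alias table: every key and display form here is illustrative,
# not a record of what any Wiktionary template or module actually stores.
_RAW_ALIASES = {
    "Hà Nội": "Hanoi",
    "Hồ Chí Minh City": "Ho Chi Minh City",
    "Window Rock": "Tségháhoodzání / Window Rock",
    "Tségháhoodzání": "Tségháhoodzání / Window Rock",
    "nv-TG": "Tségháhoodzání / Window Rock",
}

# Normalize keys to NFC so that composed and decomposed diacritics match.
PLACE_ALIASES = {
    unicodedata.normalize("NFC", key): value
    for key, value in _RAW_ALIASES.items()
}


def display_place(name: str) -> str:
    """Return the display form for a place name, or the name itself if unknown."""
    key = unicodedata.normalize("NFC", name.strip())
    return PLACE_ALIASES.get(key, key)


print(display_place("Hà Nội"))
print(display_place("nv-TG"))
print(display_place("Lod"))
```

Under this sketch an editor could type any of the listed spellings and readers would see a single form. Which form that should be is exactly the policy question being discussed here; changing the answer would mean editing only the table.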

@-sche The template does use {{a}} for this purpose so a lot could be done with aliases, although I'm not sure it would make sense to have slashed names in most circumstances. (The Navajo example I brought up is theoretical in any case; AFAICT none of the Navajo audio files identify any place name at all, although many say "Audio (NV)", which I am tempted to delete because it seems to convey no useful info. Similar issues occur with "Audio (AF)" for Afrikaans, "Audio (CS)" for Czech, "Audio (KN)" for Khiamniungan Naga [KN is the country code for St. Kitts and Nevis, which is nowhere near India :) ...], and "Audio (BCL)" for Bikol Central = lang code bcl.) The issue with Lod, as with all Israeli/Palestinian issues, is very complex and fraught; the reason I brought up this example in particular is that Lod is not internationally considered occupied and AFAICT the term "Lod" does not have the sort of political baggage associated e.g. with terms like Judea and Samaria, so it may not be parallel with the case of Bakhmut or with cities in Gaza and the West Bank, which unquestionably should use Arabic language names. Maybe a more parallel example is Lviv, formerly a Polish city known as Lvov; if we somehow had Polish audio from this city, it might make sense to use a slashed form Lviv/Lvov, and similarly here maybe Lod/al-Lidd? Same thing might apply to Jerusalem/al-Quds? (The status of this city is even more convoluted and intractable but since the common name in English is "Jerusalem" and most readers won't be familiar with "al-Quds", I think it would be confusing to only say "al-Quds".) For that matter, maybe this approach is tenable also for the Northern Kurdish terms I mention above. Benwing2 (talk) 06:39, 2 June 2024 (UTC)Reply

OK, I seem to have reversed myself from what I said at top. Benwing2 (talk) 06:40, 2 June 2024 (UTC)

Lvov is the Russian name. The Polish name is Lwów. There is surely some English dialectological study of Palestinian Arabic in which the Jerusalem dialect has some name. If it is called Al-Quds, I would rather use Al-Quds, because that is how this dialect is known in the English books about the Palestinian dialects. But nobody's gonna refer to the Moscow dialect of Russian as "Moskva", because the English books on Russian dialectology surely use "Moscow" as the name of this dialect. On Diyarbakir, we should check what some English books on Kurdish dialects call this dialect. Tollef Salemann (talk) 19:33, 2 June 2024 (UTC)

@Tollef Salemann Thanks, my mistake. If you know of any books dedicated to Palestinian or Kurdish dialects, feel free to list them. I would guess that the more well-known a place is, the more likely the common name will be used (as you note with Moscow vs. Moskva, etc.). Benwing2 (talk) 21:51, 2 June 2024 (UTC)Reply

Thanks for the ping, but I'm not sure I have any useful input here. Cheers! ‑‑ Eiríkr Útlendi │^{Tala við mig} 22:08, 5 June 2024 (UTC)Reply

We should use whatever the literature does, which will probably be the language's own name. Thadh (talk) 07:32, 2 June 2024 (UTC)

@Thadh I actually suspect it will vary greatly depending on the individual author. It's hard for me to believe there will be any discernible standard here. But I may be wrong. Benwing2 (talk) 08:28, 2 June 2024 (UTC)Reply

Of course it will vary, but there will probably be an overall tendency to prefer native words over local words or the other way around. Thadh (talk) 08:32, 2 June 2024 (UTC)

Agree with Thadh. Now we need to find all the English books about Palestinian and Kurdish dialects. Tollef Salemann (talk) 19:36, 2 June 2024 (UTC)

Dealing with controversial quotes


In a bid to end the discord concerning the addition of quotes that disseminate objectionable political etc. views, I would like to draw everyone’s attention to a recent discussion in which User:Geographyinitiative said he favors adding controversial quotations in the Citations namespace, which he deems a safe haven for such quotes which may not be suitable for adding in the dictionary entry. I, on the other hand, held the opinion that we could consider adding a note of disclaimer stating that Wiktionary does not endorse any of the views expressed in any quotes and they are for educational purposes alone (in this case however there’s the problem of cluttering up the dictionary page, so the note probably could be put in the mainpage?) Alternatively as a marriage of the twain ideas, we could as well resort to adding every controversial or inappropriate quote soever in the Citations namespace along with the said note of disclaimer put at the top of the Citations page using a template.

I think any of these ideas will be an attractive option if some people get so triggered by quotes bearing controversial POVs. Just my tuppence, thank you. Inqilābī 22:05, 2 June 2024 (UTC)Reply

I thank Inqilābī for the above comment, and I will say that I do not anticipate there is any negative outcome from this discussion from my view. I am fine with any note of disclaimer as proposed. Even if every Citations page I have worked on were deleted, I'm still okay. However, one among many uses for the Citations page seems to be to catalogue "fringe" material in a way that people can see it without it being right on the entry. There are other reasons for a Citations page. But I consider it one of its uses. For instance, the users here like to analyze some wild racist words from dangerous evil blogs. That material seems so vile and repulsive to me that no note of disclaimer could fix it. But there should be some venue for the material given the "descriptivist" stance of the dictionary, so Wiktionary "throws it in the hole" (the Citations page) so you can consult that if needed. There are numerous other uses for Citations pages including: a place for inter-sense citations or citations of uncertain sense (the 1966 and 1975 citations for Citations:transgender), a place for re-organizing senses or analyzing contexts, a place for cites of little importance or value for the entry proper, a place for words with only two acceptable cites so far (Citations:intercessionate), a staging area for a potential future entry if conditions permit (Citations:Pinghai), etc etc. The Citations page doesn't have to meet the standards of the entry proper, and stuff is less likely to be deleted there. But I tell you, some of the soul-scarring shit I've seen on the Citations page could NOT be solved by any note of disclaimer. It would HAVE to be deleted from the entry proper, regardless of anything, IMO. The Citations pages create distance between some of the most evil authors' evilest sentences I've ever seen and Wiktionary's entries, while simultaneously remaining true to the purist descriptivist mission.
Wiktionary will not be allowed to exist if it puts those sentences on the entry proper. --Geographyinitiative (talk) 22:18, 2 June 2024 (UTC)Reply

Thank you for the elaborate reply Geographyinitiative. Just for the record, the main reason I wrote this post is due to disputes involving other editors, and not because of my RFD nomination that day. I would also like to maintain that I do not advocate deleting every Citation page, I understand your reasoning. Now if other editors overwhelmingly agree that such quotes can be thrown and kept secure in the Citations bin, then my suggestion of a disclaimer can be ignored. Inqilābī 22:34, 2 June 2024 (UTC)Reply

The "citations page containment zone" idea was floated back in 2022 and was not well-received for all its merits. WordyAndNerdy (talk) 23:22, 2 June 2024 (UTC)

IMO, quotes espousing controversial or bigoted viewpoints should be limited to terms that are themselves associated with such viewpoints. If we stick to this, it shouldn't be necessary to have a cordon sanitaire like putting them in the Citations page, because the terms themselves will normally have (or certainly should have) labels indicating that they are controversial, offensive, etc., which clues the reader into the fact that the quotes (which are hidden by default) may express such viewpoints. Benwing2 (talk) 23:44, 2 June 2024 (UTC)Reply

Does this include things like using a quote from a racist speech on a word that is related to racism? CitationsFreak (talk) 23:53, 2 June 2024 (UTC)

I would think so in general. What is the example you're thinking of? What to me isn't appropriate is e.g. User:WordyAndNerdy's example of an incel-type quote added to the word roadworn, since there's nothing about this term that relates specifically to the incel community or any other controversial viewpoints. Benwing2 (talk) 23:59, 2 June 2024 (UTC)Reply

I do not have a good grasp of policy. I'm just trying to 1) protect Wiktionary while 2) allowing the purist descriptivist mission to flourish. So my view does create a cowardly "semi-censored" and "self-censored" aspect to the project. It's not a good solution. But we exist in a society, and I guarantee Wiktionary could be snapped like a twig if it crossed the wrong lines. One device we can use to assuage people is say "hey it's not on the entry". Basically this applies to "fringe" content, so you just have to judge it for yourself. --Geographyinitiative (talk) 00:14, 3 June 2024 (UTC)Reply

Fundamentally, it boils down to "don't use a controversial quote unless you absolutely have to"

Don't use controversial quotes to talk about editors or about real people
If a word has three non-controversial quotes, use those three

Purplebackpack89 01:00, 3 June 2024 (UTC)

No, it doesn’t. We do things that we don’t have to because it comes out optimized or more illustrative, rather than absolutely necessary. Don’t do things I “need to”, for example I don’t need creatine monohydrate but probably still benefit from it. And you wipe off the issue how controversiality is inferred and portrayed; in isolation, the Daily Stormer quote wasn’t the same as the site in general, but someone pushes a stance about the whole resource. Fay Freak (talk) 01:20, 3 June 2024 (UTC)Reply

Like the lock in {{R:OED Online}} “paid subscription required” (which I wanted elsewhere, for legal databases I quoted from) we could have a symbol and tooltip warning about “low factuality”. As on Ground News but less regularly. Nothing too regular since we generally shan’t consider any sources controversial inasmuch they are used for their language (which in rare cases itself is trolling), we already have a contradiction here and a lot of cognitive capacity is wasted for evaluating sources. “Incel-type quotes”? Am I supposed to waste my energy to say anything about these people?

Yet still Geographyinitiative does not recognize content generated via AI by the Chinese propaganda department snuck in as quotes, about which in some cases I have insider knowledge. Academic databases are littered with automated language, and in the former cases “publishing” takes place via PEMT. I invite everyone to search "gullible Bayes".

At least for random Neo-Nazis we know they are real people putting in the effort, and back then I also reasoned that this human language has durability, since Mr. Anglin has still not been downed from the internet despite all the efforts. AI imitates Mr. Average, and avoids controversial statements, think about it. Fay Freak (talk) 01:20, 3 June 2024 (UTC)Reply

@Fay Freak: The AI as a technology is perfectly capable of generating hate speech and Nazi propaganda [5][6]. It's just that the big players in the AI industry are making efforts to suppress this in their own products. But the technology can't be stopped and it is available to anyone. There may be already a lot of AI generated Neo-Nazi content in the net. So I wouldn't just blindly assume that every Neo-Nazi content is human generated and thus has some kind of linguistic relevance. --Ssvb (talk) 16:04, 3 June 2024 (UTC)Reply

@Ssvb: I don’t blindly assume it, but there are a number of reasons against the existence of formally plausible versions of it, apart from the circumstance that I have not stumbled upon it despite searches of the most heterodox things and following the upcoming trends in politics, which are warily tracked by hostile journalism more than anything if coming from this end. AI-generated article images of white families or the like appear, but we mean the texts. Currently everyone suppresses it, the hard cores of Neo-Nazis are too dumb or ideologically averse for targeted computer-generated content, and manual labour is too cheap and worth it for them: Like Kremlebots are real people sitting at a known address in Saint Petersburg. And it does not work: As the neural networks are trained on some old averages, even if it be biased content, and then have so-called model decay, they don’t hit humans where it hurts, they would have to have intricate understanding of current connotations of ideological concepts in order to reframe personal identities of people. You don’t change people’s worldviews with AI, though you can promote specific assumptions.

It’s a general problem in education, too. AI programs very much but teaches programming very little and human teachers will always exist and be preferred by totalitarian systems as well, and our dictionary be human-made because we explain politics, philosophy, psychology etc. Fay Freak (talk) 16:45, 3 June 2024 (UTC)Reply

@Fay Freak: "the hard cores of Neo-Nazis are too dumb" - this is a very questionable claim and I wouldn't count on that. Additionally, dumb people tend to make grammar and spelling mistakes, so this reduces the value of their content for Wiktionary. And some of them are not even native speakers. For example, I wouldn't consider Anders Breivik's manifesto to be a valuable example of written English. --Ssvb (talk) 19:07, 3 June 2024 (UTC)

My approach, as I said in the 2022 discussion linked above, is: "we can (and do) already move un-illustrative, including unnecessarily offensive, quotes to Citations: pages if they're needed for WT:ATTEST. (If they're not, like someone is adding racist screeds as cites of and, just replace them with normal cites and block the user if needed.) This does lack a reader-facing warning [...] but eh, that probably reduces the amount of bad-faith or even good-faith debates over whether a quote is "really offensive" that a content warning would attract." We already see trolling about "they're not white supremacists, they're white racialists / race realists" etc etc: any "this quote is offensive"/"we don't agree with this quote" notice would just be a magnet for endless disputes. And do we apply "this quote doesn't represent our views" to quotes that express e.g. old or modern flat-earth or geocentric views, i.e. views that aren't really offensive but which nonetheless aren't Wiktionary's views? It's a morass we needn't create. Indeed, I'm not sure there's actually a problem here in the first place? AFAICT what I outline is what is broadly already done; is anyone actually going around and adding citations of Mein Kampf to und and der (and not immediately being reverted), is there an actual issue happening...? - -sche (discuss) 01:26, 3 June 2024 (UTC)Reply

Maybe the age of a quotation also plays a big role? Being old gives it at least a historical value. So that the ancient "flat-earth" theories are okay, but modern "flat-earth" theories - not so much. The former are likely to be honest mistakes, while the latter are likely to be the work of nutcases. Also if the readers see that a quotation is older than maybe 1950, then they can figure out themselves that it's unlikely to present relevant, up-to-date scientific information even without any extra disclaimers. For example, I added this quotation recently, which states something that is possibly not true nowadays (and possibly even debatable back in 1916). But does anyone really care? --Ssvb (talk) 18:49, 3 June 2024 (UTC)

I pretty much agree with -sche here. I am not sure if this problem really merits a whole policy to tackle it, it's really a problem of common sense.

If an offensive quote does not add lexicographical value compared to a non-offensive quote, don't use it, or feel free to replace it with a more neutral quote (even if only because it is a waste of everyone's energy building this communal project to be bogged down in disputes over offensive quotes, or what constitutes offensiveness).

If it does add value (such as in illustrating firsthand the usage of offensive words, or of offensive senses of otherwise unoffensive words) or there are no good unoffensive candidate citations available in durably archived sources, feel free to use an offensive quote within the limits of reason. The guidelines that apply at WT:USEX ("Be friendly", particularly) already codify this for usage examples, fwiw. If we really want, we could expand WT:Quotations#Choosing quotations with a few (permissive) lines to the same effect, I wouldn't be opposed to that.

(I would on the other hand be opposed to disclaimers in mainspace indicating that a quote may be considered offensive, and I do not think that quarantining potentially offensive quotes in the Citations namespace is necessary as long as the principle of least offensiveness is followed wherever offensive quotes do not add any lexicographical value.) — Mnemosientje (t · c) 14:43, 5 June 2024 (UTC)Reply

Announcing the first Universal Code of Conduct Coordinating Committee


You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hello,

The scrutineers have finished reviewing the vote results. We are following up with the results of the first Universal Code of Conduct Coordinating Committee (U4C) election.

We are pleased to announce the following individuals as regional members of the U4C, who will fulfill a two-year term:

North America (USA and Canada)
- –
Northern and Western Europe
- Ghilt
Latin America and Caribbean
- –
Central and East Europe (CEE)
- –
Sub-Saharan Africa
- –
Middle East and North Africa
- Ibrahim.ID
East, South East Asia and Pacific (ESEAP)
- 0xDeadbeef
South Asia
- –

The following individuals are elected to be community-at-large members of the U4C, fulfilling a one-year term:

Barkeep49
Superpes15
Civvì
Luke081515
–
–
–
–

Thank you again to everyone who participated in this process and much appreciation to the candidates for your leadership and dedication to the Wikimedia movement and community.

Over the next few weeks, the U4C will begin meeting and planning the 2024-25 year in supporting the implementation and review of the UCoC and Enforcement Guidelines. Follow their work on Meta-wiki.

On behalf of the UCoC project team,

RamzyM (WMF) 08:15, 3 June 2024 (UTC)

"ux" template


I now religiously (well, most times) use the "ux" template for usage examples, since it is what I see others have done, but since this is no easier (in fact actually more to type) than not using it, I wonder whether anyone could explain what the actual advantage is, if any? Mihia (talk) 17:41, 3 June 2024 (UTC)Reply

As opposed to plain wikitext? Category. Same with {{co}}. Vininn126 (talk) 17:46, 3 June 2024 (UTC)Reply

By "category", do you mean that it puts the article in the category "English terms with usage examples"? Not that I am really complaining about typing a couple more characters to use "ux", it's not a big deal, but out of curiosity I wonder what use to anyone or anything is such a category? (A category for articles without usage examples I could understand.*) Mihia (talk) 17:57, 3 June 2024 (UTC) -- (* or, actually, a category for definitions without usage examples would be more useful, since an entry could have ten definitions, only one of which had a usage example, yet still, as far as I gather, show up in "terms with usage examples")Reply

A big underappreciated advantage is that the "ux" and "quote-book" templates are machine-readable. This makes various kinds of automatic processing easy. Yes, it's possible to find terms with missing usage examples if you are interested in that. --Ssvb (talk) 18:06, 3 June 2024 (UTC)

I use it quite often to see what entries still need a usex in the languages I edit. I think others do, too. It's similar to the "English terms with quotations". Thadh (talk) 18:22, 3 June 2024 (UTC)Reply

How do you use category "terms with usage examples" to find entries that don't have usage examples? Mihia (talk) 18:33, 3 June 2024 (UTC)Reply

I compare it to the other category. Thadh (talk) 20:55, 3 June 2024 (UTC)Reply
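
One way to do the comparison Thadh describes is a plain set difference between two category listings. The Python sketch below is only an illustration: the function name and the sample entries are hypothetical, and fetching the real category members is left out.

```python
def entries_missing_usex(all_lemmas, lemmas_with_usex):
    """Return the lemmas that do not appear in the 'with usage examples' listing."""
    return sorted(set(all_lemmas) - set(lemmas_with_usex))


# Hypothetical category contents, for illustration only.
all_lemmas = ["apple", "banana", "cherry"]
lemmas_with_usex = ["banana"]

print(entries_missing_usex(all_lemmas, lemmas_with_usex))
```

As Mihia notes above, this works per entry rather than per definition, so an entry with ten senses and one usage example would still count as covered.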

For non-English languages, {{ux}} is required for the text to be tagged in the correct language for e.g. screen readers or other automated software. — SURJECTION ^{/ T / C / L /} 19:15, 3 June 2024 (UTC)Reply

Not to mention script and font. Thadh (talk) 20:56, 3 June 2024 (UTC)Reply

I mentioned these points at https://en.wiktionary.org/wiki/Template:ux/documentation. Anyone who wants to amend or add to this, please go ahead. Mihia (talk) 21:02, 3 June 2024 (UTC)

`{{etymon}}`

I wasn't quite aware of the intended scope of this. Apparently it's to be an all-in-one etymology template, subsuming the functions of {{affix}}, {{inherited}}, {{etymid}}, etc.

Its current syntax strikes me as more than a bit unintuitive, and I'd like to propose a somewhat more user-friendly way of going about it:

cleverly: {{ety|en:ID|clever:en:ID|-ly:en:ID}} "From clever + -ly".
- Note that the first ID is for cleverly, the second for clever, and the third for -ly.

charity: {{ety|en:ID|charitee:enm:ID}} "From Middle English charitee".

furlough: {{ety|en:ID|verlof:nl:ID}} "From Dutch verlof".

монтировать (montirovatʹ): {{ety|ru:ID|montieren:de:ID|-овать:ru:ID}} "From German montieren + Russian -овать (-ovatʹ)".

For categorization purposes, the default assumptions would be as follows.

If all the language codes match (i.e. it's a language-internal formation): compounding, suffixation, prefixation, or confixation. That can be automatically determined by hyphens: yass + -ify is suffixation, neuro- + -genic is confixation, etc. Other types of derivation can be specified with an additional parameter like |blend=1 or |deverbal=1.

If the language codes do not all match: "English terms derived from Dutch", etc. For mixed cases like the aforementioned монтировать, nonsensical categories like "Russian terms derived from Russian" would of course be disabled. More specific types of relation can be expressed with an additional parameter like |bor=1, |inh=1, |calque=1, |conflation=1, and so on.

This strikes me as a reasonably straightforward way to handle things.

Thoughts, objections, or alternative suggestions?

Paging @Ioaxxere as the person who made the template and @Vininn126, @Rex Aurorum, @Qwertygiy, @Akaibu, @Biolongvistul, @Protegmatic as people who have used it. Nicodene (talk) 21:48, 3 June 2024 (UTC)Reply
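
The default categorization assumptions proposed above could be sketched roughly as follows. This is a toy Python illustration, not the template's actual logic: the function names are made up, language codes stand in for full category names, and the hyphen test would misfire on hyphen-final roots such as PIE *peh₂- unless those were handled separately.

```python
def derivation_type(parts):
    """Guess the kind of language-internal formation from hyphen placement."""
    # A trailing hyphen marks a prefix ("neuro-"), a leading one a suffix ("-ify").
    prefixes = [p for p in parts if p.endswith("-") and not p.startswith("-")]
    suffixes = [p for p in parts if p.startswith("-") and not p.endswith("-")]
    if prefixes and suffixes:
        return "confixation"
    if suffixes:
        return "suffixation"
    if prefixes:
        return "prefixation"
    return "compounding"


def default_categories(term_lang, etymons):
    """etymons is a list of (language code, term) pairs."""
    source_langs = [lang for lang, _ in etymons]
    if all(lang == term_lang for lang in source_langs):
        terms = [term for _, term in etymons]
        return [f"{term_lang}: {derivation_type(terms)}"]
    # Mixed case: skip the term's own language so that nonsense such as
    # "ru terms derived from ru" is never produced.
    foreign = dict.fromkeys(lang for lang in source_langs if lang != term_lang)
    return [f"{term_lang} terms derived from {lang}" for lang in foreign]


print(default_categories("en", [("en", "yass"), ("en", "-ify")]))
print(default_categories("ru", [("de", "montieren"), ("ru", "-овать")]))
```

Overrides such as |blend=1 or |bor=1 are deliberately left out; they would replace the guessed label rather than add to it.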

Condensing the language and the ID parameters is very agreeable. As for the reshuffling in the etymon slots, it disrupts the ascending hierarchy of specificity and would not prove any easier to internalise to me.

The semantic austerity of the af keyword is, I dare to assure, a temporary solution. We don’t even have categorisation implemented yet. ―⁠Biolongvistul (talk) 22:19, 3 June 2024 (UTC)Reply

Could you explain what you mean by ‘ascending hierarchy of specificity’? Nicodene (talk) 23:25, 3 June 2024 (UTC)Reply

Broadest first, most specific last, as in taxonomy for species. I believe it's mostly a happy coincidence that it's implied with the current syntax using the "greater than" symbol. Language > term > sense.

The rest of the proposition I don't believe I quite understand. Syntax like "bor|fr>unité>to unite|af|en>-ed>past participle" for "Borrowed from French unité (“to unite”) and suffixed with -ed (“past participle”)" feels intuitive enough to me. Qwertygiy (talk) 23:41, 3 June 2024 (UTC)Reply

@Ioaxxere I have been meaning to respond to another thread about adding manual transliteration into {{etymon}}. The obvious way to do that is through inline modifiers; in that respect, the choice of > as a separator is singularly unfortunate as it prevents use of inline modifiers with the normal <...> syntax. I would recommend changing this to something else; for example, the {{given name}} template uses < to indicate inheritance, but requires that spaces be put around the < sign, which allows concurrent use with inline modifiers. You could also use ^, @, etc. Benwing2 (talk) 00:11, 4 June 2024 (UTC)Reply

BTW if you need help changing this, I can do this easily by bot. Benwing2 (talk) 00:12, 4 June 2024 (UTC)Reply

@Benwing2 I don't think it does prevent the use of < >, as it's not actually ambiguous, but I could see it being confusing (though no more than template syntax). Theknightwho (talk) 00:37, 4 June 2024 (UTC)Reply

I suppose you may be right, I need to think if there are any edge cases that will be problematic, although without spaces it will be very hard to read, e.g. фоо<tr:foo>>бар<tr:bar>>баз<tr:baz> is well-nigh unreadable. Benwing2 (talk) 00:43, 4 June 2024 (UTC)Reply
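
Theknightwho's point that > stays unambiguous can be illustrated with a small scanner that tracks whether an inline modifier is currently open. This is a hypothetical Python sketch, not how {{etymon}} is implemented, and it assumes that modifier values never contain a literal < or >.

```python
def split_chain(text):
    """Split on '>' separators while leaving <...> inline modifiers intact."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "<":
            depth += 1                 # an inline modifier opens
            current.append(ch)
        elif ch == ">" and depth > 0:
            depth -= 1                 # this '>' closes the open modifier
            current.append(ch)
        elif ch == ">":
            parts.append("".join(current))  # a bare '>' is a separator
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


print(split_chain("фоо<tr:foo>>бар<tr:bar>>баз<tr:baz>"))
print(split_chain("enm>stat>condition"))
```

The parse is mechanical, but the example also shows Benwing2's concern: nothing in the raw string helps a human reader see which > is which.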

@Benwing2 It's not great, I agree. My suggestion is foo:bar<id:baz>, which probably maximises consistency with other templates. Theknightwho (talk) 00:46, 4 June 2024 (UTC)

@Theknightwho I agree with this. Benwing2 (talk) 00:49, 4 June 2024 (UTC)Reply

I tried it that way to have the same adjacent order of language and ID, as in {{ety|en:ID|charitee:enm:ID}} "From Middle English charitee". But I don't have any issue with {{ety|en:ID|enm:charitee:ID}}.

As for the use of ">", in addition to the issue that Benwing mentions, I found it unintuitive. The code on state for example currently contains "enm>stat>condition". Reading this according to the standard meaning of ">" in linguistics results in "condition is from stat, which is from enm".

As for united, as it happens I don't agree with the given etymology, since French unité is a noun meaning "unity", not a past participle comparable to united. The latter is just unite + -ed. But if I were to agree with the given etymology, my proposal would result in {{ety|en:ID|fr:unité:ID|en:-ed:ID}} "From French unité + English -ed", which seems a good deal simpler. Nicodene (talk) 00:50, 4 June 2024 (UTC)

There are a lot of suggestions in here so I'll just dump a few opinions:

- Neutral on changing the etymon parameter format. However, I oppose any scheme where > is used both as a separator and for inline modifiers for the reasons pointed out by Benwing. Out of the options discussed here I would take foo:bar<id:baz> (I assume foo is the language code).
- Weak oppose having |1 be in the format lang:ID as I find this very unintuitive, although it does admittedly save keystrokes.
- Oppose changing anything about the keyword parameters for now until the requirements are more established. I feel like @Nicodene is putting the cart before the horse in discussing categorization when it's not even clear how this should work. In particular, I'd like to eventually deprecate the existing "X terms derived from Y" system in favour of something more fine-grained (although this will be tough to implement in the short term).

Ioaxxere (talk) 04:15, 4 June 2024 (UTC)Reply

In something like foo:bar, foo should definitely be the lang code, otherwise it will be too confusing. In foo:bar:baz:bat, I would assume foo is a lang code and the others are terms. If the lang code is optional, we'll need a different separator for the terms. Benwing2 (talk) 04:28, 4 June 2024 (UTC)Reply

@Benwing2: Currently, with the > separator, the lang code is optional. Hence you can do something like {{etymon|ine-pro|id=father|af|unc|*peh₂->protect|*-tḗr>agent noun}} (the ine-pro> part is implied). Part of the reason I like the current system is that it's optimized for keystrokes, e.g. *peh₂->protect has 14 characters, whereas ine-pro:*peh₂-<id:protect> has 26 characters. But I think that it should be possible in the new system to omit the lang code in the same manner as long as : characters are escaped everywhere else. Ioaxxere (talk) 04:42, 4 June 2024 (UTC)Reply
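
To make the trade-off concrete, here is a minimal Python sketch of a parser for the lang:term<id:X> form in which the language code may be omitted. The regular expression, the function name and the default-language argument are all assumptions made for illustration; the real template is not implemented this way.

```python
import re

ETYMON = re.compile(
    r"^(?:(?P<lang>[a-z]+(?:-[a-z]+)*):)?"  # optional language code, e.g. "ine-pro:"
    r"(?P<term>[^<>]+?)"                     # the lemma itself
    r"(?:<id:(?P<id>[^<>]*)>)?$"             # optional <id:...> inline modifier
)


def parse_etymon(spec, default_lang):
    """Return (lang, term, id), falling back to default_lang when no code is given."""
    match = ETYMON.match(spec)
    if match is None:
        raise ValueError(f"cannot parse {spec!r}")
    return (match.group("lang") or default_lang, match.group("term"), match.group("id"))


print(parse_etymon("ine-pro:*peh₂-<id:protect>", "ine-pro"))
print(parse_etymon("*peh₂-<id:protect>", "ine-pro"))

# Character counts quoted in the comment above.
print(len("*peh₂->protect"), len("ine-pro:*peh₂-<id:protect>"))  # 14 26
```

A lemma that itself contains a colon after lowercase letters would be misread as carrying a language code, which is the escaping requirement Ioaxxere mentions.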

@Ioaxxere I am not saying you need to use inline modifiers for things like IDs that occur frequently. You will find, for example, in {{it-conj}} that there are various delimiters used, e.g. {{it-conj}} for riempire might look like {{it-conj|a/riémpio,riempìi,riempìto:riempiùto}}; here the a/ at the beginning indicates the auxiliary verb avere; following are three principal parts, comma-separated, and alternatives for principal parts are colon-separated. Some verbs need four principal parts and use ^ to separate the fourth principal part, e.g. venire, whose full spec looks like {{it-conj|e/vèngo^viène:viéne,vénni:vènni,venùto.fut:verrò.presp:veniènte}}. To help unpack this, the format for principal parts is PRES1S,PHIS1S,PP in most verbs (specifying the 1sg pres indic, the 1sg past historic, and the past participle), but PRES1S^PRES3S,PHIS1S,PP in verbs where the 3sg pres indic is also irregular. In addition, . separates distinct specs, where the main principal parts are collectively a single spec, and fut:verrò is another spec indicating the future principal part, and presp:veniènte is yet another spec indicating the present participle. I could have used the format of fut:verrò for all principal parts, which would look like {{it-conj|e/pres:vèngo.pres3s:viène:viéne.phis:vénni:vènni.pp:venùto.fut:verrò.presp:veniènte}} (BTW you can put spaces and newlines next to any delimiter to make it easier to read), but that's a lot more keystrokes. Benwing2 (talk) 05:07, 4 June 2024 (UTC)
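
The layered delimiters described above can be unpacked with a few nested splits. The Python sketch below is a toy reading of the format as Benwing2 describes it, not the actual {{it-conj}} module: the function name and the dictionary keys are invented here, and whitespace handling and error checking are omitted.

```python
def parse_it_conj(spec):
    """Unpack 'AUX/PRES1S[^PRES3S],PHIS1S,PP[.key:value...]' into a dict of lists."""
    main, *extras = spec.split(".")          # '.' separates distinct specs
    aux, _, principal = main.partition("/")  # 'a/' or 'e/' names the auxiliary
    pres, phis, pp = principal.split(",")    # ',' separates principal parts
    pres1s, _, pres3s = pres.partition("^")  # '^' adds the optional 3sg present
    result = {
        "aux": aux,
        "pres1s": pres1s.split(":"),         # ':' separates alternatives
        "pres3s": pres3s.split(":") if pres3s else [],
        "phis1s": phis.split(":"),
        "pp": pp.split(":"),
    }
    for extra in extras:                     # e.g. 'fut:verrò'
        key, _, values = extra.partition(":")
        result[key] = values.split(":")
    return result


print(parse_it_conj("a/riémpio,riempìi,riempìto:riempiùto"))
print(parse_it_conj("e/vèngo^viène:viéne,vénni:vènni,venùto.fut:verrò.presp:veniènte"))
```

Each delimiter does one job at one level, which is what keeps the compact form parseable without keyed parameters.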

~~Is handling language and ID the same way throughout, as in~~

{{ety|en:polity|stat:enm:condition|inh=1}}

less intuitive than handling them in different ways like this?

~~{{etymon|en|id=polity|inh|enm>stat>condition|tree=1}}~~ ed: nevermind; see below

I wasn't aware you're considering getting rid of "X terms derived from Y" categories. Is the problem the name (as it happens I'd been thinking of suggesting "X terms of Y origin") or is it the problem that such categories exist at all? Nicodene (talk) 04:58, 4 June 2024 (UTC)Reply

@Biolongvistul, Qwertygiy, Ioaxxere, Theknightwho, Benwing2:

Adjusting for your comments, we get something like:

{{ety|en<id:X>|en:clever<id:Y>|en:-ly<id:Z>}} "From clever + -ly".

Does that syntax satisfy everyone?

If so perhaps we can get to discussing Ioaxxere's proposed changes to categories. Nicodene (talk) 09:16, 4 June 2024 (UTC)Reply

I like this, to be honest. Vininn126 (talk) 09:18, 4 June 2024 (UTC)Reply

I'd prefer something like

{{ety|en|clever#Y|-ly#Z}}

That way you minimize typing. Benwing2 (talk) 09:21, 4 June 2024 (UTC)Reply

Happy to go for #X instead of <id:X> if people like it.

It looks like you favour setting the default assumption for language codes to “same as the first one mentioned, unless otherwise specified”? So in this case, given the {{ety|en…}}, the following clever and -ly are assumed to be English.

I suppose in that case the syntax for Russian монтировать (montirovatʹ) would read {{ety|ru#X|de:montieren#Y|-овать#Z}} “From German montieren + Russian -овать (-ovatʹ)” or similar. Nicodene (talk) 10:47, 4 June 2024 (UTC)Reply

This raises the issue of adapted borrowings anyway. I suppose for the tree you'd have a fork either way, but the question is whether to print "bor" in the tree or not. I have a slight preference for <id:X>. Vininn126 (talk) 10:51, 4 June 2024 (UTC)Reply

Can we move forward with one of these syntaxes? Vininn126 (talk) 09:47, 6 June 2024 (UTC)Reply

@Vininn126 languagecode:lemma<id:X> appears to be the most accepted. Perhaps space-saving features can be added down the line, like the aforementioned #ID or having language codes default to the first one mentioned.

@Ioaxxere wants to make major changes to the category system. From what I gather we’ve a long ways to go before reaching that: we’ve yet to hash out any details, and then there’s community consensus to reckon with.

On the other hand we have, if I’m not mistaken, agreed on a new syntax for {{etymon}}. So I also think we might as well implement it now, unless someone has further modifications to suggest. It shouldn’t make adapting to future category changes any easier or more difficult than it would be currently.

I’ve volunteered to manually clean up the existing transclusions of {{etymon}} and update the documentation. Nicodene (talk) 10:22, 6 June 2024 (UTC)

Yes, I think adding categories would be great; I also don't think it's necessary for updating the syntax? I could be wrong. If not, then I think we can move forward. Vininn126 (talk) 10:25, 6 June 2024 (UTC)Reply

I have no strong feelings about the exact markup, I can adjust. Vininn126 (talk) 08:47, 4 June 2024 (UTC)Reply

It was suggested that I bring up the fact that I've been placing the trees below the etymology, as opposed to the "current practice" of putting them above. To me, the trees are not the focus of the etymology section, or at least they shouldn't be considered as such: your average joe will probably not care that creepypasta's lineage contains the doublets pasta and paste; they'll just be interested that it came from the /x/ board. Akaibu (talk) 06:31, 5 June 2024 (UTC)

Personally I'd prefer them above. Vininn126 (talk) 06:33, 5 June 2024 (UTC)Reply

I prefer above, it just looks much better. Plus it's collapsed by default, so I definitely think people will notice the etymology first. — SAMEER (؂・؄・؏) 07:31, 5 June 2024 (UTC)Reply

@Babr re diff: not currently. But you may be interested in this discussion. Ioaxxere (talk) 20:21, 15 June 2024 (UTC)Reply

Rethinking confidence parameters

Currently, to indicate uncertainty, you might do something like {{etymon|ine-pro|id=father|af|unc|*peh₂->protect|*-tḗr>agent noun}}. As pointed out by @Fenakhay, this is a bit unintuitive due to the fact that there are two "layers" of keywords present (both etymons are associated with both af and unc). As an alternative, I support being able to write {{etymon|ine-pro|id=father|af|*peh₂->protect?|*-tḗr>agent noun?}}. This is intuitive and also saves two characters. We would just have to make sure that there are no IDs ending in a question mark.

Also, I'm personally not a fan of using # to show IDs, since it could be confused with the actual fragment. In Benwing's example, {{ety|en|clever#Y|-ly#Z}} would link to clever#English:_Y. Ioaxxere (talk) 19:45, 4 June 2024 (UTC)Reply
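
The trailing ? proposal above amounts to stripping one character before the usual term>ID split. A minimal Python sketch under that assumption follows; the function name and the returned keys are hypothetical, and the optional language segment of the current syntax is ignored.

```python
def parse_etymon_arg(arg):
    """Split 'term>id', treating a trailing '?' as an uncertainty marker."""
    uncertain = arg.endswith("?")
    if uncertain:
        arg = arg[:-1]
    term, sep, ident = arg.rpartition(">")
    if not sep:  # no ID was given at all
        term, ident = arg, None
    return {"term": term, "id": ident, "uncertain": uncertain}


print(parse_etymon_arg("*peh₂->protect?"))
print(parse_etymon_arg("*-tḗr>agent noun"))
```

This also shows why no ID may end in a question mark: the marker and a literal ? would be indistinguishable.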

If you like <id:X>, perhaps another inline modifier like <unc:1>? Nicodene (talk) 21:24, 4 June 2024 (UTC)Reply

I think using ? to indicate uncertainty is fine. I'm not sure about what > and -> mean here; I need to read the docs, but they maybe could be replaced with something more intuitive. Benwing2 (talk) 04:36, 5 June 2024 (UTC)Reply

> precedes an ID, and the hyphen is just part of the PIE lemma *peh₂-. Nicodene (talk) 04:41, 5 June 2024 (UTC)Reply

I see. In that case maybe use @ or ^ to separate the ID from the lemma. Benwing2 (talk) 04:51, 5 June 2024 (UTC)Reply

Classical Attic audio files

Umm ... I have come across several of these. Do we really want them? E.g. λέγω, where on top of everything else, the pronunciation is completely wrong; the speaker says /leːɡuː/ when the reconstructed pronunciation should be /lɛɡɔː/. Some others (which I have not checked yet): καί, ὁ, ψυχή, φύσις, αὐτός, εἰμί, χείρ, οὗτος, χθών, τίς, φθόγγος. Benwing2 (talk) 00:48, 4 June 2024 (UTC)Reply

I believe there was consensus to remove the audio files for Classical Latin, so this should be no different. Andrew Sheedy (talk) 01:10, 4 June 2024 (UTC)Reply

I don't want them either personally. At the very least they should be labelled with a disclaimer like ‘modern attempt to approximate Attic’ to convey some idea of the uncertainties involved in attempting a phonetic rendition of a pronunciation predating Christ. Nicodene (talk) 01:23, 4 June 2024 (UTC)Reply

The ones you have not checked, I am surprised how well they match. Such small details could make readers fond, in their grim and despondent struggles to master Greek. Can’t withsay them in the interest of the art and science. Fay Freak (talk) 01:42, 4 June 2024 (UTC)Reply

Audio for reconstructed pronunciations is extremely unscholarly. It's practically conlanging. Ioaxxere (talk) 08:31, 23 June 2024 (UTC)Reply

Use of etymology trees made with Template:etymon in the entries for multi-word terms

Hello, following the passage of Wiktionary:Votes/2024-04/Allowing etymology trees on entries last week, etymology trees generated by {{etymon}} have been added to a number of entries. Earlier today, there was some discussion on the Discord server about the inclusion of etymology trees in the "Etymology" sections of multi-word entries like United States of America (added here) and Abkhaz Autonomous Soviet Socialist Republic (not added as of writing). Some supported etymology trees on such entries while others opposed their inclusion. The discussion started getting detailed enough as well as got enough attention that I've decided to try and move it here, on-site so that it is more "official" and can have more organization and visibility. Pinging those who expressed views on Discord: @Qwertygiy, Vininn126, Lattermint, Ioaxxere, Akaibu, Soap, Saph668, AG202, Theknightwho. —The Editor's Apprentice (talk) 02:08, 4 June 2024 (UTC)Reply

Replying to say that I don't think it's best to have etymology trees on multiword terms like United States of America. It starts to get unwieldy, and while it looks "cool", we should be aiming for information presented in a concise and helpful way, not the pseudo-gamification that I've started to see. AG202 (talk) 02:15, 4 June 2024 (UTC)Reply

Completely agreed. In general the etymology of a multiword term should indicate the way the term was constructed in the same language, and that's it, unless the term was calqued from some other language. Benwing2 (talk) 03:20, 4 June 2024 (UTC)Reply

In the same vein, the discussion around adding a tree to Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch on Discord shows me the gamification that I'm talking about. Even after it was pointed out to them that they shouldn't work with languages that they don't know, the tree was still added. I assume that's because it's a long word and they explicitly stated that they couldn't edit pneumonoultramicroscopicsilicovolcanoconiosis (locked to auto-patrollers and up). I'd also like to remind editors of the statement from the vote:

This vote does not:

Allow or encourage editors to mass-add etymology trees across the site. As stated above, each language community will decide if or when they are appropriate.

AG202 (talk) 00:37, 5 June 2024 (UTC)Reply

Weak support etymology trees on multi-word terms. I don't see the harm considering they're collapsed and don't take a lot of effort to create. However, I admit that the tree on United States of America is virtually unusable simply due to how wide it is. I think the best course of action is to have trees of a certain width display in a horizontal format as seen in Wiktionary:Beer parlour/2024/May#Descendant tree design. Ioaxxere (talk) 04:22, 4 June 2024 (UTC)

@AG202: I would like some clarity on what you're actually aiming for. Are you saying that no etymology tree should be added to terms with a space? What about a term like chow mein, which was directly borrowed from a single word? Ioaxxere (talk) 04:28, 7 June 2024 (UTC)Reply

@Ioaxxere: No, I said words like United States of America, where it’d be a clear SOP term if not for the fact that it’s a proper noun. When we start debating whether or not to add the tree for of in a multiword term, it’s getting out of hand. AG202 (talk) 04:41, 7 June 2024 (UTC)Reply

Oppose per AG202. —Caoimhin ceallach (talk) 00:36, 7 June 2024 (UTC)Reply

Oppose per AG202. DCDuring (talk) 15:02, 8 June 2024 (UTC)Reply

As much as I support the template in general,

Oppose the generation of trees on multiword entries. Of course having it for an ID and such is still useful. Vininn126 (talk) 15:09, 8 June 2024 (UTC)Reply

Oppose per AG202. — Fenakhay ^{(حيطي · مساهماتي)} 17:01, 8 June 2024 (UTC)Reply

@AG202, Caoimhin ceallach, DCDuring, Vininn126, Fenakhay: I've tweaked the CSS so that the tree on United States of America is less "unwieldy". Does this have an impact on your opinion? Ioaxxere (talk) 19:23, 4 July 2024 (UTC)Reply

Mildly, I'm still not sure open compounds are the best candidate. Vininn126 (talk) 19:25, 4 July 2024 (UTC)Reply

@Ioaxxere What were the tweaks: was it adjusting the horizontal vs. vertical space? The tree now shown at united doesn't look optimized in this regard to me: viewing it on a computer screen, there's a lot of wasted horizontal space and there's unnecessary and awkward hyphenation of language names on the rightmost branch, like "Proto-Ger- manic" and "Old Eng- lish". If there's some way for the tree to adjust a bit based on what space is available, that seems like it would be good.--Urszag (talk) 19:57, 4 July 2024 (UTC)

@Urszag: Language names shouldn't be getting cut up like that—what browser are you on? But I've changed it so united still uses the wide format on large screens. Ioaxxere (talk) 20:13, 4 July 2024 (UTC)Reply

This is on Safari. I still see those breaks in the language names on [7], although it's not as egregious since they aren't at the right side of the tree anymore. I don't see them on Firefox or Chrome, so it does vary by browser apparently.--Urszag (talk) 20:19, 4 July 2024 (UTC)Reply

No. The default should be no etymology-tree display for open compounds. The few warranted exceptions would be marked by tree=1 provided no trace whatsoever appeared in the default. If some etymology fans wanted custom CSS to display hidden-by-default, non-performance-impairing etymology trees, so be it. DCDuring (talk) 20:22, 4 July 2024 (UTC)Reply

While they don't particularly interest me, I have to say that I'm baffled at this kind of staunch opposition given that it's automatically folded into a dropdown. If you don't care about it, can't you just ignore them? Theknightwho (talk) 20:26, 4 July 2024 (UTC)Reply

I'm appalled at the single-minded obsession with creating complete sets of things, whether they add value to most or not. I don't like the waste of dropdown bars or the distraction of the dropdown-opening tool. DCDuring (talk) 02:33, 5 July 2024 (UTC)Reply

@DCDuring There are no "obsessions" here. I wish you'd stop misrepresenting people you disagree with as caricatures. Theknightwho (talk) 23:37, 5 July 2024 (UTC)Reply

Honestly I find the vertical bars worse. I really just do not think that there's any significant benefit for having trees on multiword terms like this. And clearly consensus agrees with that right now. AG202 (talk) 07:57, 5 July 2024 (UTC)Reply

Yes indeed, I agree. Benwing2 (talk) 08:20, 5 July 2024 (UTC)Reply

I agree with AG202. I would like to expand the tree ban to compounds whose etymology is identical to its surface etymology, in other words recent compounds, e.g. homework. It's cool that you can make trees do all of this, but that isn't a good reason to include them. I think the standard should be that they really add something, i.e. display something notable that you can't easily get from a text etymology. —Caoimhin ceallach (talk) 13:09, 7 July 2024 (UTC)Reply

I can't really agree with this; I don't think the presence of the tree is harmful in any way, and banning it on all but root words (which is what this implies) wouldn't achieve anything, in my view. Theknightwho (talk) 16:24, 7 July 2024 (UTC)Reply

agreed here Vininn126 (talk) 17:18, 7 July 2024 (UTC)Reply

I think similar principles to what I said here about the {{root}} template can be applied to etymology trees:

Only lemmas should be given etymology trees (or {{root}} templates), unless sufficiently distinct (such as inflected forms of be) or suppletive (such as people). Where something like datum versus data falls, I don't know.
Every entry using {{root}} or etymology trees must be a single word or morpheme. "Unsplittable" terms, such as Hong Kong, Ku Klux Klan, and sgian dubh, can be treated as one word for this purpose.
WT:COALMINE scenarios... I'm not sure.
Descendant hubs should not use {{root}} or etymology trees, though using the {{etymon}} template is okay for passing information to other pages. -BRAINULATOR9 (TALK) 19:26, 5 July 2024 (UTC)Reply

User:Purplebackpack89

This user is making an awful lot of noise for very little signal, and judging by their mainspace-to-talkpages edit ratio, they don't seem particularly interested in actually building a dictionary.

Purplebackpack will probably argue that they're not making as many mainspace edits as they'd like because other people are constantly putting spokes in his/her wheel. They apparently don't like their work being reviewed and quality-controlled, or their edit history being looked at, and will readily dismiss criticism as "harassment", an accusation they've levelled at no less than four different people in the course of a single week (diff, diff, diff, diff).

While we should look to see if there isn't some truth there (I think we could have done without WF's trolling, at least), and make sure that there isn't a systemic problem of people feeling pressured (a topic which has recently been brought up), I would argue that rapid-fire accusations from a single editor make it harder to think clearly on such an issue.

And the fact that the same person has levelled similar accusations at an entirely different set of editors many years ago (diff, diff, diff, diff) certainly doesn't help in taking their claims seriously now.

They seem to take particular exception to people challenging them on their votes (see this discussion); notice the similarity between this and the accusation of harassment thrown at Benwing2 after his comment (on Purplebackpack regularly failing to provide a rationale for his/her votes).

I'd also like to mention that, while complaining of other people's behaviour towards them, they seem unbothered (diff, diff) by the idea that their own attitude might have played a role in the abrupt decision of a fellow editor to leave; note the striking temporal proximity between the aforementioned discussion and that editor's departure.

If Purplebackpack perceives any kind of scrutiny as harassment, I would say Wiktionary simply isn't the right place for them. Everyone on this project must be ready to face criticism - sometimes repeatedly.

I personally am loath to imagine not being able to go through a user's contributions and express earnest concern about the quality of their interventions (in the main space or elsewhere) without being labelled as a "harasser".

Therefore, for the good of the project, I would like to propose that this user be prevented from further editing. This is not meant as a punitive measure (I'm not "out to get him/her"), but as a way of putting an end to highly toxic and massively detrimental behaviour, thereby preserving an atmosphere more conducive to serene dialogue and productive work. P U C – 23:07, 4 June 2024 (UTC)Reply

PBP's false harassment accusations have gotten to the point of trolling. I view such unwarranted accusations, esp. a pattern of them, as a blockable offense, and I think if PBP makes any more such accusations that aren't clearly warranted, they should be blocked, maybe on the schedule of one week, then one month, then permanently if they keep it up. PBP reminds me of Dan Polansky; a ton of heat, little light, and a strong increase in the toxicity of the atmosphere as a result of them. In Dan Polansky's case, I finally permablocked him for outright racism on top of everything else. I suspect PBP is smart enough not to engage in outright racism, but IMO that should not prevent a warranted block. Benwing2 (talk) 04:33, 5 June 2024 (UTC)Reply

^ I agree with Ben's suggestion of issuing increasing blocks. PBP's recent behavior has been really inappropriate and rude, but I'm not sure if a permaban is the best immediate action. But I definitely think we should not tolerate disruptions to the project. — SAMEER (؂・؄・؏) 07:24, 5 June 2024 (UTC)Reply

I agree. Theknightwho (talk) 12:56, 5 June 2024 (UTC)Reply

I have not had a single productive encounter with this user. Vininn126 (talk) 05:53, 5 June 2024 (UTC)Reply

Same - just a lot of vitriol and repeated sniping. Theknightwho (talk) 09:15, 5 June 2024 (UTC)Reply

Many can and have characterized your interpersonal relations the same way, @Theknightwho. Purplebackpack89 12:31, 5 June 2024 (UTC)

"no u". thread's not about them, bro, it's about you. Vininn126 (talk) 14:23, 5 June 2024 (UTC)Reply

Have you heard what Wordy and I have been saying? There are greater systemic concerns here and it's wrong to single out one editor. Purplebackpack89 12:16, 6 June 2024 (UTC)Reply

@Purplebackpack89 There may be larger systemic concerns here, *AND* this does not absolve you from behaving in a civil fashion at all times. Imperfections in the system don't give you a free pass to run rampant and blame your bad behavior on "the system". Everyone (even Wordy) has tried to make that point in one way or another, but IMO you don't want to listen. Benwing2 (talk) 07:24, 7 June 2024 (UTC)Reply

At some point I may prepare a longer response, but I gotta interject this right now: I'm CLEARLY HERE to build a Wiktionary, as I've created 636 entries. Purplebackpack89 05:21, 5 June 2024 (UTC)Reply

I was going to say:
On a balance, I'm inclined (perhaps naively) to think PBP is not trolling but sincere, that he really regards people as harassing him, and is really freaked out about being blocked... in part because I think a troll would know that being so over-the-top — accusing so many different users of harassment (some on very flimsy grounds); and when blocked, sending lots of pings on his talk page, sending me an e-mail and contacting me on Wikipedia asking to be unblocked; and holding up creating ~636 entries since 2009 as an accomplishment — is counter-persuasive. Sincerity doesn't ameliorate the extent to which many of the accusations are unwarranted; indeed, sincerely perceiving most disagreement as harassment is a problem. PBP, when you're complaining to multiple different users about (for example) the fact that they RFDed an entry you made, but then the community discusses the entries at RFD and determines they indeed aren't the sort of thing we want to include, it would be prudent to reflect that the RFDer was not harassing you but correctly perceiving that the entry didn't meet commonly-accepted criteria for inclusion.
However, before I could post that, I see his lack of any indication of awareness of irony in telling other users to walk away while himself continuing to poke at them🙄 which... well, whether it's trolling or sincere, it's ill-advised either way. - -sche (discuss) 06:25, 5 June 2024 (UTC)Reply

@-sche I think the idea that I "perceive most disagreement as harassment" is exaggerated. Below I am going to explain how I came to the conclusion that I am being harassed. Purplebackpack89 12:36, 5 June 2024 (UTC)Reply

I agree with -sche's views here. Rarely have I seen so histrionic a user, who demands so much attention from his fellow editors and politicks so energetically on the discussion pages, while contributing so little and showing so few signs of introspection. — Mnemosientje (t · c) 15:42, 5 June 2024 (UTC)Reply

Not gonna weigh in on the question of whether PB89's contributions have been constructive on balance. I do find there seems to be a lot of selectivity in which editors are deemed intolerably disruptive. WordyAndNerdy (talk) 07:52, 5 June 2024 (UTC)Reply

Oh, totally agree. There are the guardians here and there are the peons. The behavior of the guardians is no better than that of the peons, but no peon can ever tell a GUARDIAN that he's wrong

And some of the people who are commenting on this are people who, in undoing or modifying my edits, have made questionable edits themselves. For example, Theknightwho stumbled into the hot-dog-is-a-sandwich debate being too hasty about reverting me. Benwing nominated dont tread on me for deletion...and quickly five votes that he was wrong showed up. Instead of owning up to their screw-ups, they're here. Purplebackpack89 12:29, 5 June 2024 (UTC)Reply

To be clear the "intolerably disruptive" remark was intended to reference generally trollish editors that Wiktionary has collectively chosen to tolerate/ignore for some reason. Problem admins ("guardians") are definitely an issue as well – and my experience is also that Wiktionary typically circles the wagons around them – but that's separate from a wiki keeping pet trolls. I'd also urge you to consider the possibility that Benwing RfD'ing dont tread on me was independent of TKW leaping into the fray at hot dog. You didn't include an etymology explaining that this is the precise text on the Gadsden flag. It's possible Benwing saw the entry without being familiar with that history and concluded it was simply an unlikely misspelling. WordyAndNerdy (talk) 15:47, 5 June 2024 (UTC)

I am willing to concede that they are possibly unrelated...but they still happened in the same window of a few days, which again brings us to the problem of a whole lot happening to me at once and that (understandably!) making me frustrated. If we're talking in hypotheticals, it's also possible Benwing could've acknowledged there was information he didn't know and admitted he erred. HE DIDN'T (He's a guardian...why would he?). If we're talking hypotheticals, it's also possible that Knight or Ben could've noticed "hey, Purplebackpack89 feels stressed out and put upon! Maybe I should leave him alone for a while, and if there's problems that need fixing, I'll get to them at a later date!". THEY DIDN'T. Purplebackpack89 16:22, 5 June 2024 (UTC)

I think this is the stage at which we need to shift from narrowly focusing on individual incidents to discussing remedies to overarching systemic issues. WordyAndNerdy (talk) 16:47, 5 June 2024 (UTC)Reply

@User:WordyAndNerdy You could take the lead on that. My proposals haven't gained any traction:

1. to forbid any mention of any username (pings and signatures naturally excepted, probably also sayonaras and welcome-backs) in principal namespaces, Wiktionary space and their talk spaces, excepting the page required for the following proposals and enforcement thereof.
   a. this would be enforced by increasing blocks and/or removal of admin powers. Formal public apologies on offended users' talk pages or in BP might mitigate the blocks or removals.
2. to have a request for mediation process, page, and template. Requests for interaction bans could be handled there as well.

Only the request for mediation addresses 'hounding' or 'abuse of administrative powers', including unjustified blocking, 'passive aggressive behavior', etc, or, possibly, the consequences of the 'gender-related, structural' composition of our veteran contributors, admins, and discussion participants. DCDuring (talk) 21:48, 6 June 2024 (UTC)Reply

I think there is abundant evidence, even on this subpage, that mentioning individual users on core community discussion pages too often rapidly leads to defensiveness and a total loss of focus on substantive, principled discussion, even discussion of how to limit (interpersonal) conflict. We should not want to have our conflict-suppression mechanisms be targeted against individuals, as has been suggested here. DCDuring (talk) 22:15, 6 June 2024 (UTC)Reply

Point one as I'm reading it strikes me as unworkable. Some discussions will inevitably centre on a specific user or group of users. Sometimes these discussions will be of a positive or neutral nature. Sometimes they'll involve navigating more difficult territory. But implementing formal mediation as a frontline remedy to interpersonal concerns doesn't seem like a viable plan. Some people won't look at the process as mediation. They'll see it as arbitration – being put on wiki-trial. Starting out on what some will find to be an adversarial footing doesn't seem like it would be conducive toward conflict suppression to me. It seems more likely to put people in a siege mentality and escalate matters that might otherwise be resolved without much fuss. I do think limiting the number of active BP discussions concerning a specific user to one at a time might be a step in the right direction. We do need a formal mediation process. I just think less-formal discussion might be ideal as a frontline approach. Why require a mediation process by default when it won't be necessary to resolve every disagreement that arises? WordyAndNerdy (talk) 06:52, 7 June 2024 (UTC)Reply

I'd be happy to hear about other proposals that have a better chance of success.

As a starting point, it is basic practical psychology (followed in business, law, government, and sometimes, even politics) to frame issues as about substance and not persons, even personal actions, let alone invisible attributes, like motivations, values, attitudes, beliefs, intelligence or energy levels, etc. To the extent our users aren't doing that, they would benefit from learning to do so. The first locus after edit wars are talk pages for entries, next are user talk pages. Right now people chime in (or pile on) on talk pages they are watching. At some point the discussion may fail to resolve the issue. This is where things go wrong if the issues are framed as personal and not substantive.

As soon as issues of personal behavior come up, especially in a public forum, we see: defensiveness, score-settling, etc. This can be worse than a real trial, it can be mobocracy. Were interpersonal conflicts diverted to a mediation, as there are necessarily two parties in an interpersonal conflict, neither party need be on trial. I would suggest that we may need the mediation page to be basically private, invisible to the community at large, except possibly in the event of failure of the process, after a waiting period. The role of a mediator is probably first to sort out substantive issues (for the appropriate forums) to the extent the users have failed to do so. Then behavioral issues can be sorted. Keeping attributions out of the discussion at all stages is critical. DCDuring (talk) 23:53, 7 June 2024 (UTC)Reply

@WordyAndNerdy I agree with you. In particular I think, as you do, that the mediation process should start only when an informal BP discussion fails to resolve the issue. I also think there's no way that it's workable to forbid mentioning specific users in Wiktionary-space, talk spaces, etc. Most of these mentions as they currently occur are not intended to single out a user for opprobrium or anything but for any of a number of other reasons, e.g. to agree with someone, to mention their theory or proposal on something, etc. I think your suggestion of limiting BP discussions concerning a particular user to one at a time should be enforceable; if there are multiple simultaneous concerns about a particular user they're likely to be related and should be merged. If in some weird circumstance we really need to have two unrelated simultaneous discussions about a given user and one can't wait for the other to finish, that should require prior explicitly discussed consensus. Benwing2 (talk) 07:18, 7 June 2024 (UTC)Reply

The occasional efforts of some of our wiser experienced users to mediate discussions in public forums often seem to simply lead to the interpersonal conflict threatening to involve them.

Direct person-to-person contact on user talk page is the first-line location for discussions. If an issue arises from a substantive matter, then the substantive matter should be discussed in the appropriate forum: BP, TR, GP. It should not be hard to refer to edits by diffs without mentioning the editor by name. I don't think that we have a very good record of resolving interpersonal conflicts in group forums, unless we count driving contributors of all kinds away or into virtual hiding (changing username, narrow range of edits) as success. It is very easy to exclude personal mentions: policies, warnings, escalating blocks. DCDuring (talk) 23:53, 7 June 2024 (UTC)Reply

Purplebackpack on feelings of harassment

Were people actually harassing me? Maybe, maybe not. May I explain why I felt harassed?

A large portion of my edits has been scrutinized in a very short amount of time. People have taken literally years of work, some of which hadn't bothered anybody for years, and tried to change or delete a lot of it in just two weeks. Had the scrutiny occurred more slowly, I would not have felt as put upon.
Editors have given the appearance of assuming bad faith and focusing on the editor, not the content. There have been several nominations or comments along the lines of "oh, well, this is a Purplebackpack89 edit". That's not supposed to matter.
Editors made no good-faith effort to de-escalate and continued making the edits even though it was clearly bothering me. There was no deadline; they could just have waited until I was less stressed out.
Some of the attempts to modify my edits ended up being questionable themselves. For example, Theknightwho stumbled into the hot-dog-is-a-sandwich debate being too hasty about reverting me. Benwing nominated dont tread on me for deletion...and quickly five votes that he was wrong showed up. Denazz piled on by trolling left and right.

Given those four things happening, basic psychology would suggest that I would be frustrated. And naturally, a questionable 31-hour block and this thread would also put me on edge! It would probably put anybody on edge! Was I over the top? Maybe, but I feel that where my feelings of harassment came from are understandable. The solution to say that this is entirely my fault, I'm never entitled to feel frustrated, and nobody else did anything questionable is...just wrong. Fundamentally, this thread COULD end up having a chilling effect on speaking up if you feel put upon and...we don't want that either. Purplebackpack89 12:49, 5 June 2024 (UTC)Reply

Regarding your comments about "focusing on the editor": I think it's to be expected that if a user has a history of questionable edits/entries, their activity will get more scrutiny. For better or worse, there's an (unwritten) reputation system here, and it does matter who created an entry. Pretending otherwise is a fantasy. Jberkel 13:17, 5 June 2024 (UTC)Reply

Of late, people have been exaggerating the questionability of my edits though, @Jberkel. Above, people are essentially claiming that I never did anything productive at all and that is inaccurate. Purplebackpack89 13:31, 5 June 2024 (UTC)Reply

I'd say you're particularly sensitive to corrections, from what I've seen. I'd love to have people scrutinize my work. Vininn126 (talk) 14:24, 5 June 2024 (UTC)Reply

I think you'd love it to a point and not be comfortable with it beyond that point, @Vininn126 (And I believe that is true for most editors). If people scrutinized you in the manner I outlined, I think you (or anyone) would be somewhat bothered. Purplebackpack89 15:19, 5 June 2024 (UTC)Reply

I was heavily scrutinized when I first started editing. Even berated. I don't see similar berating towards you. I see corrections that I personally would welcome. Vininn126 (talk) 15:28, 5 June 2024 (UTC)Reply

I'd argue there's a significant difference between receiving heightened scrutiny as an actual wiki-newbie and receiving it as a veteran editor. At a certain point it's only natural for a veteran to start feeling that they're being subjected to disproportionate scrutiny and opposition. Especially when this community creates a special policy carveout for a habitually trollish editor. It's almost as if provocation is treated as excusable while being especially provokable is not. WordyAndNerdy (talk) 16:12, 5 June 2024 (UTC)

I really feel like you didn't read my messages. I'd say I'm a veteran editor at this point and that I'd love more scrutiny. Vininn126 (talk) 16:13, 5 June 2024 (UTC)Reply

Also I really don't see why editing for a long time gives you this freedom. Let's say someone took a long break, or has just always been problematic. Vininn126 (talk) 16:21, 5 June 2024 (UTC)

That's you. Other editors will respond differently. Yes, this is a wiki. Every edit comes with the caveat it might be objected to or undone. But it's not unexpected for someone to start feeling like a pariah or whipping kid if they routinely encounter intense opposition. That feeling doesn't come from nowhere. This wiki definitely plays favourites at times. WordyAndNerdy (talk) 16:26, 5 June 2024 (UTC)Reply

Your assumption that it can come from nowhere, in my opinion, greatly misrepresents PB's reaction. I'm not trying to invalidate anyone's emotions, but that also doesn't mean someone's reaction can't be over-the-top or unproductive. If we never address that behavior, things get bad very quickly. Vininn126 (talk) 16:28, 5 June 2024 (UTC)Reply

It doesn't matter, @Vininn126. You still have to assume good faith about their edits (and the "reputation system" mentioned above flies in the face of that btw). And if an editor feels put upon, it seems like a good idea to lay off him for a bit unless there's something serious like vandalism that has-to, has-to, has-to be dealt with right away.

I don't think Wordy is saying my reaction came from NOWHERE, I think he's saying that there is a SOMEWHERE, AND that that needs to be addressed rather than singling me out alone. Purplebackpack89 16:32, 5 June 2024 (UTC)Reply

At no point did I assume bad faith on your part. Having good faith doesn't absolve you from any bad behavior. Vininn126 (talk) 16:33, 5 June 2024 (UTC)Reply

My point is that it's a double standard to treat having a dramatic reaction to provocation as requiring community action while giving a pass toward actual provocation (see the second link in my above comment). WordyAndNerdy (talk) 16:40, 5 June 2024 (UTC)Reply

I saw, and see my comment that this thread is about this user in question. If it's about being "hounded", I don't think that those claims are founded. If it's about other actions, I'd prefer they stay in that thread. Please don't muddy the waters on the conversation to make a point that's tangentially related. Vininn126 (talk) 16:43, 5 June 2024 (UTC)Reply

It isn't "muddy[ing] the waters." It's providing relevant context. Nothing happens in a vacuum. My thoughts on this haven't changed in ten years. WordyAndNerdy (talk) 16:53, 5 June 2024 (UTC)Reply

You are muddying the waters - it's just a boatload of whataboutism. Theknightwho (talk) 00:03, 6 June 2024 (UTC)Reply

There was no heightened scrutiny, and even if there had been it would have been justified given the number of mistakes I (and others) have found in PB89's edits. Saying that PB89 "routinely encounter[ed] intense opposition" is simply a complete fiction. Theknightwho (talk) 00:16, 6 June 2024 (UTC)Reply

Knight, you bear some responsibility for this situation. There was nothing you were doing vis-a-vis me that had to be handled immediately. You could have noticed that I was frustrated by the way you were handling things and proceeded more slowly and cautiously. You didn't, in fact, you literally did the exact opposite.

And on top of this, you yourself made mistakes while hastily trying to undo my mistakes. And you never owned up.

There are several threads that have expressed concern about your confrontationalism and this one should echo those concerns. Purplebackpack89 12:14, 6 June 2024 (UTC)Reply

I wasn’t confrontational with you at all, and making some minor changes to entries you’d edited and then posted about on high-traffic pages didn’t have anything to do with you specifically. I tagged you in one edit as a form of guidance, and your disproportionate feelings of negativity are not a reasonable response, as numerous people have said by now. It is not my responsibility to manage your emotions; you are an adult, and you do not get a free pass on mistreating other editors simply because you feel upset. Theknightwho (talk) 16:25, 6 June 2024 (UTC)Reply

PBP, you need to understand that this is a collaborative project and the other users here are not your enemies. You brought up Assume good faith but you yourself have never assumed good faith in anyone this whole time. In the many disputes you've had you always assumed the other person had a motive against you, which is extremely rude and disrespectful. Which begs the question: Why are you the only one who deserves the assumption of good faith? Why do you never afford others the same assumption you mention?

You should not assume people who look through your contributions are doing so out of spite, but rather because everyone makes mistakes and it's honestly for the best that everyone's edits get reviewed at least occasionally. Otherwise, mistakes could go unnoticed for decades! If you've ever done entry maintenance, you would know that yourself. Hell, I actually used to get into disputes with Fenakhay when I first joined the project for the same reason. But he basically taught me how to format entries and now he's the main person I ask when I have a question about entry formatting.

Additionally, your repeated provocations of editors recently are completely inappropriate. What good reason is there to tag TKW 5 days after things calmed down to tell him to walk away? Especially since he DID walk away, 5 days ago! It was you who didn't! What reason is there to send aggressive messages to Ben telling him that he "better rescind" his RfD? After Ben told you to calm down because you were being aggressive, what was the point in continuing to double down rather than walk away and discuss the issue at RfD?? And after being blocked for "intimidation and bullying", what is the reason to try to pick an argument with the blocker —who has not participated in any conversations with/about you— rather than just walking away (as you yourself suggested)? You preach values you don't even follow and regularly throw stones in a glass house. You started a whole post about how TKW checking your edits was harassment, yet you're somehow incapable of seeing how your actions towards Ben could be perceived as bullying by a third party. — SAMEER (؂・؄・؏) 18:55, 5 June 2024 (UTC)

Benwing made a bad RfD. It was so bad that five people almost instantly voted keep. But...the problem is me telling him it's a bad RfD, not him creating one?

Benwing and Knight and Denazz bear some responsibility for this situation. They made the situation worse with trolling in Denazz's case and questionable edits in the other two. Why do they get free passes and I don't? Is it because they're GUARDIANS and I'm a peon? Purplebackpack89 20:06, 5 June 2024 (UTC)Reply

You're right, that RfD may have been a "mistake" (as in, it seems people disagree with him), but you had no right to be rude about it. When we see RfD's we disagree with, we discuss at the RfD why we think it's a bad idea and let other people compare the reasoning provided. We do not hound the people who made the RfD demanding they withdraw it (that's not even the process for resolving RfD's), and mock them for incorrectly putting up a term for RfD. That is bullying.

What do you mean Ben got a free pass just because he's a "Guardian"?? Do you mean that everyone just listened to Ben cuz he's an admin? Cuz if so, that literally didn't happen. In fact, you said yourself that most users in the RfD read your reasoning and agreed. Nobody voted against you just because you're a "peon" (whatever that means), so I have no idea what you are talking about.

Also, looking solely at interactions —that you specifically— have had with Ben and TKW, it seems like they were just doing entry maintenance. And looking at your recent interactions with Ben, it genuinely appears to me as though you are being a bully. — SAMEER (؂・؄・؏) 20:38, 5 June 2024 (UTC)Reply

Making an RfD that doesn't end up passing is, in fact, not a problem. That's perfectly ordinary. Making an RfD into a pissing contest, on the other hand... Nicodene (talk) 21:36, 5 June 2024 (UTC)Reply

Yeah agreed. It’s ok if you’ve ‘mistakenly’ rfd-ed an entry convinced that the entry will fail.

Purple is bereft of maturity and is sorely inexperienced. He isn’t necessarily acting in bad faith, he takes it for granted that he is always right in any untoward issues involving him but… evil dictators also know they are doing the correct deed, oh well. Thus it’s just a matter of interpretation whether Purple’s bearing is going to result in other editors getting psychologically harassed and being coerced into quitting Wiktionary—just as the populace of a brutal dictatorship are forced to flee their country or face persecution in their homeland—or Purple is actually an innocent victim of harsh law enforcement here. Inqilābī 22:09, 5 June 2024 (UTC)Reply

Yeah, this is basically my reading of it: PB89 isn't acting in bad faith. They just lack social awareness and think they're axiomatically correct about everything, so they conclude the only possible reason anyone disagrees with them must be because they're out to get them. Frankly, I don't care whether it's down to incompetence or maliciousness, but either way it's having a very negative effect. Theknightwho (talk) 00:10, 6 June 2024 (UTC)Reply

I tell you that something bothers me, Knight. Your response is to do that thing that much more, and to do it so hastily you make mistakes while doing so. It's not surprising that anybody would feel attacked under that circumstance. Your critique comes off as hypocritical because embedded in it is that your edits and conduct towards me are "axiomatically correct". Also, can everybody cut out playing amateur psychologist? You ain't Joyce Brothers. Purplebackpack89 15:17, 7 June 2024 (UTC)Reply

It is unreasonable to demand that an admin stop doing their job simply because of your personal feelings. Nicodene (talk) 07:02, 9 June 2024 (UTC)Reply

You keep saying I make lots of mistakes, but what you actually mean is that after I reverted your change to the definition of hot dog from “sandwich” to “entree”, Equinox then changed it to “snack”. The fact that you keep focusing on this doesn’t make any sense to me, as it clearly misrepresents what happened, and I don’t see how it’s supposed to be hypocritical anyway. Theknightwho (talk) 10:41, 9 June 2024 (UTC)Reply

I would suggest that, even if there is consensus to block Purplebackpack89, it would make more sense to just block him from discussion pages— while still letting him contribute to the dictionary proper, cause his lexicographical additions, with fixes and corrections by other editors, are still substantial. Inqilābī 19:50, 6 June 2024 (UTC)Reply

Constructed languages in the mainspace


(Notifying -sche, The Editor's Apprentice, Mahagaja): I recently created this vote (start date TBD): Wiktionary:Votes/2024-06/CFI for mainspace constructed languages, in hopes of coming to a consensus on which conlangs should be included in the mainspace and why. Since its creation, however, I've come to realize that we currently include possibly two conlangs in the mainspace outside of the ones listed at WT:CFI, and I'm not sure what to do about them. These include:

Eskayan
N'Ko, though it's debated whether this is a conlang (per Glottolog), a mixed language (per Ethnologue), or simply a literary register (per Wikipedia).

If we consider N'Ko a conlang, should it be included in our permitted mainspace list? I would think so, but I also don't feel like it's an actual conlang. I don't know anything about Eskayan to comment on it.

On the same note, I'd like to bring up the case of palawa kani, created by the Tasmanian Aboriginal Centre as it seems closer to the revival of an indigenous language instead of a language like Volapük, at least based on my surface-level research of it. It looks to be taught to children, is used in place names, is used in official dubbing, has a growing oral tradition and more. I cannot yet verify if it has native speakers, but I wouldn't be surprised if it does, if not for a lack of direct access to the language (the merits of which I won't comment on). If so, I'd like to see what the consensus is about adding it to Proposal 1 of the above vote. AG202 (talk) 23:57, 4 June 2024 (UTC)Reply

Pinging @Mar vin kaiser since you seem to be the most active editor of Eskayan & @Thadh since you mentioned it on Discord. AG202 (talk) 00:22, 5 June 2024 (UTC)Reply

This looks well thought out, nicely done. Thanks. Vininn126 (talk) 07:15, 5 June 2024 (UTC)Reply

In a previous discussion (before the Interslavic discussion), someone said it felt like the divide we make between mainspace conlangs and appendix-space ones was that the handful of long-used conlangs are in mainspace, and new ones are in appendix space... and they said that thinking it was a bad thing (arbitrary), but I think it's been a reasonable approach. Having a fair number of native speakers and/or works in the language could be another decent rule of thumb. As regards Eskayan, I note how many aspects of our attitudes to / treatment of artificial languages seem to have been developed with Western conlangs in mind (often created recently and for certain reasons, attempting and failing to be world languages or new nations, or for fiction), to the extent that the existence of old non-Western artificial languages like Eskayan (created for different reasons and used in rather different ways, in Eskayan's case as a language of the Eskaya people, taught in several schools) seems to have slipped the minds of the people devising the original conlang policies, and flown under the radar. All things considered, that (fact that Eskayan is currently included) seems OK to me. I'm not wedded to it being in mainspace if people want it moved to appendix-space, but it does seem to be in a different boat from various Western conlangs that have been suggested for inclusion. - -sche (discuss) 00:53, 5 June 2024 (UTC)Reply

I have no strong opinions either way on Eskayan; it feels different in some way from run-of-the-mill conlangs but I don't know if that's just a bias based on its non-Western origin. Benwing2 (talk) 04:06, 5 June 2024 (UTC)Reply

BTW as for N'Ko, from reading the Wikipedia entry it sounds more like Standard Basque, Standard Moroccan Amazigh, Rumantsch Grischun or Unified Kichwa, which I do not consider conlangs so much as intentionally created koines. These are on the same spectrum as Modern Hebrew, standard German and standard Italian, all of which are partly planned languages but none of which are reasonably considered conlangs IMO. Benwing2 (talk) 04:17, 5 June 2024 (UTC)Reply

Thanks! Yeah I won't worry about it then. AG202 (talk) 05:06, 5 June 2024 (UTC)Reply

I've since done a skim of relevant chapters of The Last Language on Earth: Linguistic Utopianism in the Philippines by Dr. Piers Kelly (Dec 2021), which focuses on Eskayan, and based on what I've read, it seems like it has a strong rationale for inclusion. It's taught in schools to children, used in praying, singing, speechmaking, excluding overhearers, and common phrases, and there's an extensive literary history. "In effect, Eskayan appears to have supplanted the special authoritative role of English." They estimate that there are between 500-550 speakers of Eskayan, with several speakers with a high degree of linguistic competence in speaking, reading, and writing the language.

The only issue I'm seeing is that it's technically not a mother tongue: "Unlike Boholano-Visayan, which is acquired as a mother tongue, knowledge of Eskayan is learned through voluntary attendance at traditional Eskaya schools, and mastery of the language is considered a prerequisite for becoming truly Eskaya." Thus, there technically aren't any L1 speakers from birth, but seeing as though there are children taught it from a fairly young age, would that not qualify as a pseudo-native language? It's definitely different from the typical conlang, and has fully-fledged educational aspects, including arithmetic & equations being taught and performed in schools. The author starts out the final section, stating:

The immediate future for Eskayan as a viable language is reasonably assured. Competent speakers have status within the communities; in Biabas and Taytay the language is being actively learned by children, and plans are well under way to construct an Eskaya school in Cadapdapan. Recent government recognition, through the Indigenous Peoples Rights Act, provides additional legitimacy to an already valued language.

This makes it clear to me that it holds legitimacy and should be included in the namespace. AG202 (talk) 05:06, 5 June 2024 (UTC)Reply

@AG202 Sounds good to me; I would amend your proposals to include it along with Esperanto. Benwing2 (talk) 05:17, 5 June 2024 (UTC)Reply

@AG202, Benwing2: I seem to be late in entering this discussion but yeah, I agree that Eskayan is quite different from other conlangs. I would describe Eskayan as already part of the indigenous culture of that region of Bohol. So it should be part of the mainspace. --Mar vin kaiser (talk) 07:10, 5 June 2024 (UTC)

Actually the more I'm thinking about it, the more I feel it might be possible to include it as a "jargon" of Cebuano. I guess it doesn't really give a complete picture, but it's basically Cebuano with an almost complete substitution of words, which is very similar to what we usually consider a jargon, rather than an independent language. Thadh (talk) 07:11, 5 June 2024 (UTC)Reply

On a similar note, Eskayan is currently considered an LDL; I assume that that should stay the same? I hate to continue having separate threads on this topic, but I want to make sure that everything is addressed before setting the start time & date for the vote. AG202 (talk) 22:39, 6 June 2024 (UTC)Reply

It's been brought to my attention that Ido is said to have 26 native speakers in Finland per the Ido language Wikipedia page. However, I'm not sure if it's been independently verified and would like more input before I make any changes in either direction about it. Surjection previously told me on Discord that the Finnish website does not provide any additional information. CC: @Benwing2, @Thadh, @-sche, @Vininn126 AG202 (talk) 19:55, 6 June 2024 (UTC)Reply

I think that if we can't verify it, we shouldn't consider it. Vininn126 (talk) 20:08, 6 June 2024 (UTC)Reply

I mean, we can verify it, namely by looking at the reference. Tilastokeskus isn't an organisation to just invent 26 native speakers, is it? Thadh (talk) 20:19, 6 June 2024 (UTC)

I suspect that it can't be ruled out that these were just some pranksters fooling around with their own self-reported information submitted to the population census database (if such a choice had been presented in the questionnaire form). I doubt that the Finnish statisticians actually made any effort to verify the actual Ido language proficiency of these people. And if native speakers actually exist, then it should be possible to confirm this information from other sources. --Ssvb (talk) 21:10, 6 June 2024 (UTC)

I was very astonished when I saw it for the first time, and frankly, I find it extremely hard to believe. I mean, there are reports about native speakers of Volapük, but that was when Volapük was still a huge movement. Ido has never been a huge movement. In fact, I think these 26 native speakers appearing out of the blue are possible only if the entire community of Ido users decided to move to Finland within a short period of time, started multiplying themselves and teaching it to their newborns. But on the other hand, the source doesn't seem to be unreliable, although in this case I wonder if it couldn't simply be a mistake. IJzeren Jan (talk) 21:28, 6 June 2024 (UTC) N.B. I wouldn't put my money on pranksters either, as that would require some pretty good organization; and why would they pick Ido, of all possibilities? Putting myself in their shoes, I'd rather have chosen Klingon, Na'vi, Huttese or something similar. IJzeren Jan (talk) 21:39, 6 June 2024 (UTC)Reply

@Ssvb: From what I know, the Finnish statistical database, just like the Dutch one, is based on population data obtained at birth/subsequent corrections during the person's lifetime. Unlike the census, this is personal information the government has on you, so the chance people would play with that is a lot lower. But it's always possible that I misunderstood this? Thadh (talk) 21:41, 6 June 2024 (UTC)Reply

But on the other hand, does Finland collect data on one's native language, too? Because I'm sure as hell that the Netherlands don't! IJzeren Jan (talk) 21:59, 6 June 2024 (UTC)Reply

That's a good point. I don't know, but I'm sure that that information can be found somewhere. Thadh (talk) 22:05, 6 June 2024 (UTC)Reply

@Thadh: Yes, it's personal information, but it seems that Finnish residents can just log in using their online banking credentials here and update various details, including their "native language". I doubt that some oddly selected native language can possibly affect anything in everyday life. And I think that having a few dozen conlanger weirdos in the whole of Finland isn't statistically improbable. For example, there were some nutcases in Taiwan, who even changed their names just to get a discount. If there's a loophole in the system AND a real incentive to abuse it, then it will be abused. --Ssvb (talk) 01:01, 7 June 2024 (UTC)

@Ssvb: More realistically tax advisors now suggest to change genders according to the new self-determination act in the FRG, because you get different capitalisation factors for the assessment of the value of a land encumbrance, remaining at the owner after donating a property and hence reducing gift tax, depending on legal gender. I can imagine legal advantages to slip in for someone determining his native language, as it is also an idpol kind of thing, or even purposefully teach a child an artificial language as a second native language just for benefits introduced somewhere. Wiktionary alone though is just not important enough to be gamed this way. Fay Freak (talk) 14:13, 7 June 2024 (UTC)Reply

The borderline transphobia aside (was that really a meaningful addition to the discussion?), I don't see how Ido, a constructed language, would give anyone benefits, so I am highly sceptical anyone would change their native languages for that reason, even more so than for the reason of trying to be funny. Thadh (talk) 16:37, 7 June 2024 (UTC)Reply

Strange individuals exist in every society. You can't expect everyone to be sane and reasonable. --Ssvb (talk) 18:06, 7 June 2024 (UTC)Reply

Maaaybe self-promotion, as you could brag about how your hip new language has "26 recorded native speakers in Denmark", even if it isn't technically true. CitationsFreak (talk) 08:32, 8 June 2024 (UTC)Reply

Personally I think we should ignore this data point about Ido, because it seems a priori unlikely, as others have pointed out. Claims about native and total speakers are habitually inflated, e.g. someone insists on putting back into Wikipedia the claim that there are over 200 million total speakers of Swahili, based on a single questionable reference and in contradiction to all other references; I have deleted this info several times but it keeps getting put back, and I don't have the energy to fight this. Benwing2 (talk) 22:33, 6 June 2024 (UTC)Reply

True that. Same goes for Esperanto, by the way: the ridiculously high number of 2 million speakers (sometimes even 10 million) keeps popping up regularly, even though it was refuted already a long time ago. Today we know that even a number of 100,000 is probably way too optimistic. Same goes for those one or two thousand so-called native speakers. Usually, such figures come from sources with an interest in inflating them. However, that cannot be said of those figures from Finland. Instead of drawing conclusions based on suppositions, shouldn't we at least ask them where those 26 native speakers come from? IJzeren Jan (talk) 23:25, 6 June 2024 (UTC)Reply

synthesized audio files

Do we have a policy on this? I have encountered some, e.g. at inconsequential the audio is explicitly labeled "CA synth", which I take to mean synthesized Canadian. Although it's now possible to synthesize realistic sounding text-to-speech audio, this particular audio sounds very artificial to me, and I think it doesn't belong. Even for realistic-sounding audio, I'm skeptical. Here are some other words with audio labeled "CA synth": extraterrestrial, catamaran, angst, centralization, depolarization, disorganization, amnesia, counterfactual, homily, atherosclerosis, icicle, ecclesiastical, enclose, intruder, gasp, entitle, grievance, goose flesh, biodiversity, lethargic, hyperventilation, coliseum, macrobiotics, impracticality, autobiographical, disputant. An additional file labeled as ca-synth in the filename but not the caption occurs in isolationism. Some of these have additional non-synthesized audio files, some don't. Benwing2 (talk) 04:04, 5 June 2024 (UTC)Reply

Our general rule (not sure how much of a policy it is) is that audio pronunciations ought to be recorded by native speakers of the languages—a machine is admittedly not a native speaker of any language. Some time ago I came across a bunch of synthesized audio files on English entries (around 20–30 IIRC) that were all created by a single Commons user years ago and then a few years later were automatically added by a bot (User:DerbethBot) that was adding missing Commons audio recordings not on the entries. The quality of the audios was really poor and many weren't even correct, so I went ahead and removed them. I admit perhaps I should’ve brought up the matter here, but it seemed pretty clear-cut to me that they had to go as, with an audio recording, one would expect the voice of an actual native human speaker. Even with better quality recordings and as voice synthesis technology gets better and better (particularly with the AI stuff), I think we should still try to supply authentic human recordings as any voice synthesis services will be available to the readers elsewhere. lattermint (talk) 05:05, 5 June 2024 (UTC)Reply

I support removing these and sticking a note somewhere that people shouldn't add synthesized audios as pronunciations. (OTOH, if someone could add a synthesized audio to e.g. voice synthesis as a T:examples type thing, that could actually be appropriate use of synthesized audio, ha.) I suspect these will become an increasingly common issue unfortunately; Commons is similarly dealing with low-quality AI art being added. - -sche (discuss) 05:33, 5 June 2024 (UTC)Reply

Agree with lattermint and -sche. Vininn126 (talk) 05:46, 5 June 2024 (UTC)Reply

I’ve encountered these before. Not a fan. Among other things a formal policy of ‘recordings should be of native speakers’ would help by automatically disqualifying this sort of thing. Nicodene (talk) 12:16, 5 June 2024 (UTC)Reply

We can formulate it this way, “recordings within pronunciation sections must be of native speakers”; how for allowed conlangs without native speakers? For dead languages I formulate “recordings for extinct languages are discouraged”, to leave room for interpretation, mine being that if it would pass off as native if we were ignorant of the information of the language being extinct then it is tolerated for practical purposes—we will find agreement on making a statement about recordings of natural languages possessing native speakers easy. Fay Freak (talk) 13:19, 5 June 2024 (UTC)Reply

Perhaps it could be phrased as 'If a language has a large body of native speakers, any recording should be of a native speaker. If a language is extinct or constructed but has a large body of non-native speakers, any recording should be of a proficient speaker using one of the conventional pronunciations’. Nicodene (talk) 20:09, 5 June 2024 (UTC)Reply

it reminds me of the radio recordings of traffic conditions i used to hear on the radio, where both the tone and the speed were out of step. outside of acute distress, we just dont speak that way. basically it sounds like someone who was abducted and is being forced to read a letter saying "no im fine please dont look for me im definitely okay i promise" —Soap— 12:50, 5 June 2024 (UTC)Reply

Phonology is also out of step; of native speakers announcing the stops in the tram I hear local toponyms being pronounced dodgily, since these speakers rarely get IPA transcriptions; even the pronunciations of municipality-level names from TV presenters are unreliable. It is impossible to guess and almost impossible to look up that Baumheide, Altenhagen, Hiddenhausen, Oerlinghausen but not Bad Oeynhausen, Ubbedissen and Asemissen are stressed at the second stem, and for Hövelhof I still don’t know whether ⟨v⟩ is /v/ or /f/ – IP corrected it to /v/, from a local public broadcaster’s newsreader I remembered it /f/. There is a lot of confirmation bias and no source, lest to speak of reliable ones, on de.Wikipedia’s claim for Hiddenhausen and Oerlinghausen being stressed at the onset. Fay Freak (talk) 13:19, 5 June 2024 (UTC)Reply

Can this policy be generalized to deal with any low quality audio? E.g. anything with disruptive background noise too. Also, rather than deleting, maybe it's more productive to aggressively categorize and label them as something that needs replacement? The deleted low quality audio samples won't just disappear from the net and may be re-added by less attentive editors or bots. It would probably help if https://lingualibre.org/wiki/User:Olafbot could prioritize replacement of such known low quality recordings over adding totally new audio samples. Pinging @Olaf in case he is interested in this discussion.

As for the possible replacement of the current artificially synthesized isolationism audio, the https://commons.wikimedia.org/wiki/Category:Lingua_Libre_pronunciation-eng?from=isolationism link lists one human recorded sample. But it's a sample recorded by a native Mandarin Chinese speaker and, despite that, it's used by fr.wiktionary.org and pl.wiktionary.org. This is also not ideal in my opinion. --Ssvb (talk) 15:02, 5 June 2024 (UTC)

And to give an example, just listen to the click in the iridium audio sample. I encountered a lot of samples with similar or even worse defects, but can't easily find them offhand right now.

What is a Wiktionary editor supposed to do upon spotting such audio? Just simply let it be because it had been recorded by a native speaker? Remove it? Label it somehow? --Ssvb (talk) 18:03, 5 June 2024 (UTC)Reply

If the audio is particularly bad, please do bring it up for discussion (compare e.g. Wiktionary:Tea_room/2013/February#Dutch_enig.2C_Audio_file_file:Nl-enig.ogg; note that the current audio file is different than the one that was there when that discussion happened); if it's bad, people will probably agree on removing it; if a lot of files by a particular speaker have problems, we may want to remove them all systematically and try to 'blacklist' the user's files (Metaknowledge was spearheading a project to do this, but has been inactive). I like the idea of listing known bad recordings somewhere after removing them from entries, so they can hopefully be replaced (either outright overwritten with a better file, or someone just records and uploads a separate file). - -sche (discuss) 18:19, 5 June 2024 (UTC)Reply

@-sche: I understand and fully agree with starting a public discussion when bad faith is suspected, such as covert vandalism or a person with an obvious accent having the audacity to pretend to be a native speaker.

However this is simply not workable in all other cases due to excessive bureaucracy involved. You already set the bar at "particularly bad", possibly to limit the scope and the amount of paperwork. But even "moderately bad" or "slightly bad" audio samples shouldn't be normally desirable. Using Lingua Libre, it's possible to easily record more than 100 audio samples in less than one hour if one is up to it. Another factor is that the beginners are likely to systematically record and upload a certain percentage of low quality audio samples simply due to lack of experience. Jumping the gun to harass or blacklist the users for this is also counterproductive, because this is a sure way to lose a potentially valuable contributor.

My suggestion is to simply add a new parameter to Template:audio for flagging low quality audio samples. The parameter value can be a short text description: "wrong accent", "noise", "clipped", "muffled", "synthetic", etc. So that when I encounter a bad quality audio sample, I can just spend a few seconds on a quick edit to flag it. When this process is established, the problematic words can be automatically added to a Lingua Libre list, similar to this one: https://lingualibre.org/wiki/List:Eng/Lemmas-without-audio-sorted-by-number-of-wiktionaries (so that the Lingua Libre contributors know what to prioritize when recording their audio samples).

Please look at the same iridium audio sample again. Is it so bad that it needs an urgent removal or a public discussion? Maybe not. Would it be a good idea to eventually replace it? Yes, of course. --Ssvb (talk) 17:14, 6 June 2024 (UTC)Reply

@-sche: And here's another example: the Ukrainian хлор (xlor) sounds almost indistinguishable from хор (xor) because the sound "л" is missing.

It's interesting that another audio sample also recorded by @Tohaomg drops a different sound ("р" instead of "л") in хлорметан. And I can actually hear a click in place of the missing sound. After searching a bit, I found a known problem https://lingualibre.org/wiki/LinguaLibre:Technical_board/Audio_click_bug#HIGH_PRIORITY:_Audio_recordings_have_dust_and_clicks, which was likely fixed only in 2023.

Anyway, even though the audio samples likely got corrupted because of a bug in the recorder application, I see this primarily as a QA issue. Corrupted audios shouldn't normally be uploaded to commons.wikimedia.org by the person who recorded them. --Ssvb (talk) 02:47, 9 June 2024 (UTC)

@Benwing2: What's your opinion? Should we label problematic audio samples as |a=synthetic or |a=defective? Or a better solution is needed? --Ssvb (talk) 03:01, 9 June 2024 (UTC)Reply

@Ssvb I added support for a |bad= parameter for labeling bad audio recordings with arbitrary text. You can see it in action in User:Benwing2/test-audio (specifically, the last example under the "Production" section). The "bad recording" note should appear boldfaced in red, but it may take 5-10 minutes for it to appear this way as I just added the appropriate specs to MediaWiki:Common.css for this and it takes a few minutes after doing so for the changes to propagate. Let me know if this is helpful or if you want some other param. Benwing2 (talk) 03:16, 9 June 2024 (UTC)Reply

Also, uses of this param are currently tracked using the WT:Tracking mechanism, by visiting Special:WhatLinksHere/Wiktionary:Tracking/audio/bad-audio and Special:WhatLinksHere/Wiktionary:Tracking/audio/bad-audio/LANG for a specific lang code. Maybe this should be made into a category. Benwing2 (talk) 03:19, 9 June 2024 (UTC)Reply
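For anyone who wants to list the flagged entries programmatically rather than through Special:WhatLinksHere, the following is a minimal sketch that only builds a MediaWiki API query URL for the tracking pages named above. It assumes that tracking pages register as transclusions (hence `list=embeddedin`; `list=backlinks` with `bltitle` would be the analogue if they show up as plain links), and the function name `tracking_query_url` is made up for illustration.

```python
from urllib.parse import urlencode

# English Wiktionary API endpoint
API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def tracking_query_url(lang_code=None, limit=50):
    """Build a query URL listing pages that transclude the bad-audio tracking page.

    Assumption: tracking pages are registered as transclusions, so the
    `embeddedin` list is the right one to ask for.
    """
    title = "Wiktionary:Tracking/audio/bad-audio"
    if lang_code:
        # Per-language tracking page, e.g. .../bad-audio/uk
        title += "/" + lang_code
    params = {
        "action": "query",
        "list": "embeddedin",
        "eititle": title,
        "eilimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(tracking_query_url())
    print(tracking_query_url("uk"))
```

The sketch deliberately stops at building the URL; fetching and paging through the results is left to whichever HTTP client the caller prefers.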

I wouldn't be a priori opposed to artificially produced audios if native speakers can vouch for their sounding natural. --Lambiam 14:28, 6 June 2024 (UTC)Reply

@Lambiam: This is a slippery slope and it's not always easy to tell the difference between natural and non-natural. Audios may be just slightly unnatural and people would hesitate to discard them. That said, I don't mind having synthetic audios as a temporary placeholder, but only if they are always clearly labelled as such. And only if they are added to a publicly visible to-be-replaced list. --Ssvb (talk) 17:25, 6 June 2024 (UTC)Reply

I have no strong feelings about this, but note that occasionally an audio file presumably produced by flesh-and-blood native speaker may sound off as well (and sometimes even plainly wrong). In the end, whatever the means of production, ~~the capitalist will appropriate the surplus value~~ the quality of the result needs to be assessed and ensured by native speakers. --Lambiam 17:34, 6 June 2024 (UTC)Reply

Sure, not everyone is a professional voice actor. But a synthetic audio is like a synthetic flower. Some aspects of it are as good or even better than the real thing. Yet the other aspects are different, possibly in a subtle way. --Ssvb (talk) 19:10, 6 June 2024 (UTC)Reply

Hard against. Vininn126 (talk) 17:30, 6 June 2024 (UTC)Reply

I don't think it's a good idea that AI will learn humans how to speak "right". We've already got humans who are doing it wrong. Tollef Salemann (talk) 13:29, 23 June 2024 (UTC)Reply

@Tollef Salemann: Did you mean "AI will teach humans"? I don't like this idea either. --Ssvb (talk) 14:29, 23 June 2024 (UTC)Reply

Support removing synthesized audio. — SAMEER (؂・؄・؏) 04:21, 9 June 2024 (UTC)Reply

Support synthesized/AI audio on the condition that it's indistinguishable from a natural human voice. The ones listed above are extremely robotic-sounding which I dislike. Ioaxxere (talk) 08:31, 23 June 2024 (UTC)Reply

@Ioaxxere: Who will decide that it's indistinguishable? Even native speakers sometimes can't notice a foreign or a regional accent unless they pay close attention to very specific subtle details. --Ssvb (talk) 17:08, 23 June 2024 (UTC)Reply

Oppose synthesized/AI audio. IPA by non-natives depends on the expertise and ear of the person who originally recorded the information- sometimes it's very good. At any rate, one doesn't have to speak a language to record what has been heard. Audio by non-natives is a lie- sometimes a harmless white lie, but always a lie. Chuck Entz (talk) 15:23, 23 June 2024 (UTC)Reply

Oppose synthesized/AI audio. Such synthesized audio would be effectively squatting Wiktionary pages, effectively preventing audio samples recorded by humans from finding their way there. --Ssvb (talk) 17:24, 23 June 2024 (UTC)Reply

Anti-intensifiers and the epidemic of British meiosis


At the moment our entry for maybe lists the following sense:

(UK, meiosis) Certainly

Similarly, we find the following under a bit:

(UK, meiosis) Very.
(UK, meiosis) A lot.

and the following under somewhat:

(UK, meiosis) Very

The problem is that, as far as I am aware, every single word or phrase that carries a sense of moderation is fair game for ‘meiosis’. In no particular order I cite slight, modest, mild, decent, small, minor, light (adj.); relatively, perhaps, to some extent, fairly, a little, possibly, not exactly; might, could, seems; scuffle, tiff, misunderstanding.

The flipside of this is that people can and (especially in the UK) do assign sarcastic senses to any word that denotes a positive quality: genius, fantastic, brave, brilliant, revolutionary, creative, and so on.

Both meiosis and sarcasm are, I think, cultural/metalinguistic and as such beyond the purview of a dictionary. Nicodene (talk) 00:24, 6 June 2024 (UTC)Reply

Are there examples where words actually acquired new meanings through meiosis? What exactly distinguishes it from understatement? —Caoimhin ceallach (talk) 00:46, 7 June 2024 (UTC)

Not that I'm aware of, and I don't believe there is a difference other than meiosis coming off as ‘a bit’ affected. Nicodene (talk) 00:53, 7 June 2024 (UTC)Reply

I do think there is a place for meiosis in Wiktionary as it can contribute to etymology. As mentioned in the Wikipedia article on meiosis, the Australian 'outback' is one example where the word did acquire a new meaning through meiosis. It was originally used as a meiotic comparison to the back yard of a house, but is now commonly used without that comparison in mind. That said, I would agree that meiosis should not be included as an additional sense in each of the entries you referenced. If meiotic or sarcastic senses are to be included at all, I suggest it be in usage notes, as in nice. Pangur Bán & I (talk) 06:21, 9 June 2024 (UTC)

On balance I think you are right, this is not worth a separate sense line, at least in those entries. (As Pangur says, there are cases where meiosis seems to become lexical, like pond.) I suppose the (small-c) conservative thing to do would be to conserve the information in a usex like the one showing sarcastic use at Sherlock, or move the quotes under the 'regular' sense. Indeed, it is nonobvious to me how one discerns that the "somewhat weatherbeaten" quote at somewhat is meiosis, anyway; do I need to have knowledge of the real condition of the train in question to know the writer is understating its weatherbeatenness, and out of meiotic intent rather than misassessment? - -sche (discuss) 16:31, 18 June 2024 (UTC)

Kyakhta Russian–Chinese Pidgin


I suggest adding Kyakhta Russian–Chinese Pidgin in addition to the existing pidgins based on Russian: Mednyj Aleut (mud), Russenorsk (crp-rsn), Solombala English (crp-slb), Taimyr Pidgin Russian (crp-tpr). AshFox (talk) 08:18, 6 June 2024 (UTC)

Support Protegmatic (talk) 18:13, 8 June 2024 (UTC)Reply

Do you have any plan for how to make the entries for Kyakhtian? The only clear feature it has is the suffix -la, and you can't even always use it. There are no clear grammar, spelling or pronunciation records. Also, I remember rumors that there are some Chinese records of this pidgin, and there were some problems with them as well. I have thought about this pidgin for a long time and just gave up, because I am not sure how to make it structured enough for Wiktionary. Anyway, good luck with this work if you decide to do it. The pidgin is a mess, but it has many cool words worth mentioning on Wiktionary. Tollef Salemann (talk) 22:02, 8 June 2024 (UTC)

But as for adding an own language code for it, I'm fully supporting it. Tollef Salemann (talk) 22:04, 8 June 2024 (UTC)Reply

Oh, yeah, there is also -shek-/-nek- instead of Russian -shk-/-nk- but I guess that it is just the Chinese pronunciation, and not really a pidgin grammar feature. Tollef Salemann (talk) 22:07, 8 June 2024 (UTC)Reply

Full stops after templates like {{synonym of}}


Should automatic full stops be added after templates used in definitions like {{clipping of}}, {{short for}}, and {{synonym of}} (with the option to turn it off)? @Sgconlaw suggested to make this discussion after I manually added one to sacrifice. J3133 (talk) 13:20, 6 June 2024 (UTC)Reply

Support (for English; separate discussion for other languages desirable). In fact, {{clipping of}} and {{short for}} already automatically add a full stop at the end. I think it makes sense to have a full stop automatically added for other templates like {{synonym of}} (with the option to turn it off in appropriate cases) for consistency with the earlier-mentioned templates, and because we treat our definitions for English entries like sentences, starting them with a capital letter and ending them with a full stop. — Sgconlaw (talk) 14:25, 6 June 2024 (UTC)Reply

Support Def templates usually benefit from having full stops.

Support for English, Oppose for other languages in light of Ben's argument. Vininn126 (talk) 09:30, 7 June 2024 (UTC)

Support (edited to add: for English only, in agreement with Benwing below); IMO, the templates should use the langcode to format themselves the way definitions are formatted, capital + period for English, lowercase + no period for other languages. That might need separate/more discussion, because some people disagree and want capital + period for all languages/definitions, or want lowercase + no period for all langs/definitions, but in any event having some [but only some] templates do "capital but no period" is not consistent with anything. Let's check for cases these are followed by (a) a manual period a bot should remove if we have the template start supplying them, or (b) something else, such as another template or gloss, which we / a bot might solve by adding nodot= — I sometimes see (or do!) things like "{{altcase|en|fooh}}: {{altform|en|foo}}". - -sche (discuss) 17:54, 6 June 2024 (UTC)Reply

Oppose. It's trivial to add punctuation: it's one keystroke. It's comparatively cumbersome to use parameters to disable unwanted punctuation: |nodot=1. Not automatically outputting punctuation is a more flexible design, more user-friendly, and less obtuse.

There have been cases in the past where someone has gone in and added auto-punctuation to long-standing templates, requiring lots of manual editing to fix existing wikicode where the templates were used mid-sentence.

The time gains from auto-punctuation are trivial. The time losses are substantially larger. ‑‑ Eiríkr Útlendi │^{Tala við mig} 20:33, 6 June 2024 (UTC)Reply

Hmm, this is a testable argument: T:alternative form of (for example) is used on 174,660 pages, so the cost of adding the missing period to them all would in fact not be 1 keystroke, but 174,660 keystrokes. [If we only add dots for English, the number will be lower but we also won't have to worry about adding nodot= to non-English entries... in any event, by ratio,] it seems like somewhere in the vicinity of 1/8th of affected entries would have to be putting other text (besides a period) after the template in order for the keystroke argument to support defaulting to no dot, rather than supporting defaulting to dots. Can anyone check what's the case? I suspect the number which are putting other text after the template is in fact far, far smaller than 1/8th, but I could be proven wrong! (In most cases, I think whatever display would be correct in the majority of cases should be the default display.) If we do decide the default display should be no dot, I hope someone will write a bot to add dots where they're missing, since at present this is not done and entries just sit around with their normal definitions and these templates looking inconsistent. - -sche (discuss) 21:29, 6 June 2024 (UTC)Reply
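The keystroke comparison above can be worked through explicitly. The sketch below assumes the only costs are one keystroke to type a full stop by hand and eight keystrokes to type `|nodot=1`; the function names are made up for illustration. Under those assumptions the break-even point comes out at 1/9 of uses, which is close to the "vicinity of 1/8th" estimate.

```python
from fractions import Fraction

NODOT_KEYSTROKES = len("|nodot=1")  # 8 characters to suppress a default dot
DOT_KEYSTROKES = len(".")           # 1 character to add a dot by hand


def break_even_fraction(nodot_cost=NODOT_KEYSTROKES, dot_cost=DOT_KEYSTROKES):
    """Fraction p of uses wanting NO dot at which both defaults cost the same.

    Default dot:    cost per use = p * nodot_cost
    Default no dot: cost per use = (1 - p) * dot_cost
    Setting these equal gives p = dot_cost / (dot_cost + nodot_cost).
    """
    return Fraction(dot_cost, dot_cost + nodot_cost)


def keystroke_costs(pages, pages_without_dot):
    """Return (cost if the dot is the default, cost if no dot is the default)."""
    cost_dot_default = pages_without_dot * NODOT_KEYSTROKES
    cost_no_dot_default = (pages - pages_without_dot) * DOT_KEYSTROKES
    return cost_dot_default, cost_no_dot_default


if __name__ == "__main__":
    print(break_even_fraction())
    # 174,660 is the page count quoted above for T:alternative form of;
    # the second argument is a made-up figure for illustration only.
    print(keystroke_costs(174_660, 5_000))
```

This ignores everything except typing effort (for instance, the cost of a bot run or of reviewing mid-sentence uses), so it only addresses the keystroke argument itself.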

On Translingual entries, I have been forced to add "nodot=1" for synonyms templates used within {{taxon}} (which has a default period) and for instances of "See {{specieslite}} and Wikipedia for other species". I'd be happy to forego the default period in {{taxon}} for the benefit of consistency in the need to consider the punctuation needs of the entry. DCDuring (talk) 15:12, 7 June 2024 (UTC)

Oppose per Eirikr. If this is an issue, I can start adding full stops to the entries I create, or someone can make a bot do that; automatically printing a full stop always means a whole headache when trying to remove it. Thadh (talk) 21:12, 6 June 2024 (UTC)

~~Strong oppose~~ Support for English, Strong oppose for other languages. I believe as a general rule that all form-of templates should auto-generate capital letters and final periods (full stops) only for English (if that), and should default to lowercase and no periods for all other languages. I have wanted to implement that for all form-of templates instead of the morass of randomness we currently have, but need to get consensus for it. Benwing2 (talk) 22:26, 6 June 2024 (UTC)

OK, I see User:-sche agrees with me, but has phrased it using "support". I think we should have a separate poll to implement this option. Benwing2 (talk) 22:28, 6 June 2024 (UTC)Reply

@Benwing2: mmm, isn't your view in support of J3133's proposal, at least where English is concerned? I also have no objection if it is felt that for non-English languages there should not be an initial capital letter or terminal full stop (though I'm unclear why). — Sgconlaw (talk) 22:29, 6 June 2024 (UTC)Reply

@Sgconlaw User:J3133 did not qualify their proposal with a restriction to English; I'm strongly opposed to making this a blanket addition to all languages, which is what the proposal suggests on its face value. Benwing2 (talk) 22:37, 6 June 2024 (UTC)Reply

@Benwing2: We (Sgconlaw and I) were discussing English entries, but I forgot to mention it. J3133 (talk) 06:05, 7 June 2024 (UTC)Reply

Abstain The current state of capitalisation and full stops in definitions is pretty chaotic. I think it doesn't make much sense to change some templates one way or the other before reaching consensus on punctuation for each type of definition (English and non-English, lemma and non-lemma, gloss and non-gloss). After deciding on that, the inclusion of automatic full stops in form-of templates may be worth another discussion. Personally, I think most (or all) non-gloss definitions, including those which use form-of templates, should have capital letters and full stops in both English and non-English entries. Einstein2 (talk) 23:07, 6 June 2024 (UTC)

Support for English, Oppose for other languages, in strong agreement with Benwing above. — Vorziblix (talk · contribs) 00:30, 7 June 2024 (UTC)

Support Per User:Benwing2 and User:-sche. Ioaxxere (talk) 04:28, 7 June 2024 (UTC)Reply

Support for English, Oppose for other languages. Same thing with capitalization (capitals for English, lowercase for other languages). However, there should be a "nodot" parameter on all of these. Sometimes it's useful to add information (that should be part of the same sentence) after the template. Andrew Sheedy (talk) 05:02, 7 June 2024 (UTC)

@Andrew Sheedy Agreed. Whenever a template auto-capitalizes or auto-adds a final period, there should be (and usually are) |nocap=1 and |nodot=1 params to disable the capitalization and auto-period. Benwing2 (talk) 06:23, 7 June 2024 (UTC)Reply
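The behaviour several people describe here (capital + period for English, lowercase + no period for other languages, with `nocap`/`nodot` as opt-outs) can be sketched as follows. This is a hypothetical illustration in Python, not the code of the actual form-of templates or modules; the function name `format_form_of` and its signature are assumptions.

```python
def format_form_of(text, lang, nocap=False, nodot=False):
    """Format a form-of definition line under the proposal discussed above.

    English ("en") gets an initial capital and a final full stop by default;
    every other language code is left lowercase with no full stop.
    `nocap` and `nodot` switch off the English defaults individually.
    """
    is_english = lang == "en"
    if is_english and not nocap:
        # Capitalize only the first character, leaving the rest untouched
        text = text[:1].upper() + text[1:]
    if is_english and not nodot:
        text += "."
    return text


if __name__ == "__main__":
    print(format_form_of("synonym of foo", "en"))
    print(format_form_of("synonym of foo", "en", nodot=True))
    print(format_form_of("synonym of foo", "de"))
```

With `nodot=True` an editor can continue the sentence after the template output, which is the use case mentioned just above.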

Support for English, Oppose for other languages (as this proposal’s initiator). J3133 (talk) 06:28, 7 June 2024 (UTC)

Abstain for English, given current practice which I would not have begun but now we have it, Oppose for other languages and strongly support the opposite. Fay Freak (talk) 09:53, 7 June 2024 (UTC)

Oppose for English and Translingual. Abstain for other languages. The nodot=1 option required to maintain flexibility in the use of the template is an annoyance. Prohibiting use of such templates except in prescribed cases and in prescribed manner needs some kind of justification that I haven't seen here. I hope we aren't going in the direction of "Everything that is not mandatory is forbidden." DCDuring (talk) 14:38, 7 June 2024 (UTC)

Somewhat oppose for all languages, strong oppose having English as a special case apart from other languages. It not only might confuse editors, it will definitely confuse editors. — SURJECTION ^{/ T / C / L /} 21:56, 7 June 2024 (UTC)

Oppose -- Sokkjō 05:51, 9 June 2024 (UTC)Reply

Comment - it appears we have consensus to not include final full stops/periods (and probably not initial capitalization either) in non-English form-of templates, but no obvious consensus for English form-of templates. It appears there are two options, either include them by default with English or don't include them. The latter makes English consistent with non-English, but the former is closer to existing practice (in many cases, at least). Some thoughts:

Would it make a difference in your voting if there were a one-character way of turning off initial-caps and/or final period/full-stop? For example, a symbol like ^ or > (just brainstorming here, maybe there are better symbols, and it doesn't have to be the same symbol at the beginning and the end) could be placed at the beginning to suppress the initial caps and at the end to suppress the final period.
How strongly do you feel about the inclusion or non-inclusion of initial caps and final periods for English? (E.g. for me, I could go either way with English; what I feel strongly about is that initial caps and final periods should *NOT* be present for non-English.)

Benwing2 (talk) 09:18, 9 June 2024 (UTC)Reply

Assuming we are still talking about "templates like {{synonym of}}", I am considering abandoning their use embedded in {{taxon}} because {{syn of}} doesn't accept nocap=1. If the final period is mandated, then there would also be an extra period. DCDuring (talk) 19:25, 9 June 2024 (UTC)Reply

@DCDuring Are you sure that {{syn of}} doesn't accept |nocap=1? It's documented to accept it and internally it sets |withcap=1, which simultaneously turns on initial capitalization and adds a |nocap= option to turn it off. Also as mentioned above, I am thinking of adding a feature to make it easier (fewer keystrokes) to turn off the initial caps/final period. Benwing2 (talk) 19:39, 9 June 2024 (UTC)Reply

I'll try again. See Geomalia for the look at present. DCDuring (talk) 20:42, 9 June 2024 (UTC)Reply

@DCDuring Looks good to me. Benwing2 (talk) 20:51, 9 June 2024 (UTC)Reply

That's after I corrected my error. Previously: [8]. I still wish that the italics could be removed (optionally: noi=1), so italicized taxa could appear italicized in Translingual/taxonomic definitions that use {{syn of}} within {{taxon}}. DCDuring (talk) 21:34, 9 June 2024 (UTC)Reply

@DCDuring I could implement that, although at that point I wonder if it wouldn't be better just to manually write out "synonym of"; the template doesn't categorize so there seems little point in using it if you have to add a bunch of flags to get non-default behavior. Benwing2 (talk) 21:54, 9 June 2024 (UTC)Reply

It might be better for me to fork out {{taxonsyn}} for the increasing number of cases where I embed a synonymous taxon in a definition within {{taxon}}. Those 'lesser' taxon definitions merit less detail than 'real' taxon definitions, so excluding them from searches for 'incomplete'/improvable entries is desirable. The formatting peculiarities of taxa derived from italicization (and perhaps the meaning of synonym) may justify otherwise undesirable forking. DCDuring (talk) 22:12, 9 June 2024 (UTC)Reply

I am not familiar with the ins and outs of taxon formatting, but in general if you are doing something repeatedly, it makes sense to have a dedicated template or template parameter for it. Benwing2 (talk) 22:35, 9 June 2024 (UTC)Reply

Support for English,

Oppose for other languages. @Benwing2 I note many (not all) of the full oppose votes are from editors who don’t edit English, who may have not realised the growing consensus for separate options. Theknightwho (talk) 11:18, 9 June 2024 (UTC)Reply

Oppose for English,

Support for other languages. P U C – 13:20, 9 June 2024 (UTC)Reply

@PUC: But we (including you) do not use full stops for non-English definitions. J3133 (talk) 13:33, 9 June 2024 (UTC)Reply

He's goofin. Vininn126 (talk) 13:38, 9 June 2024 (UTC)Reply

I feel strongly about having all English definitions start in a capital letter and end in a period, because my initial reason for becoming a Wiktionary editor was to rectify inconsistencies like that. But I would also like it to be based on consensus and not be forced on people who feel strongly the other way. Andrew Sheedy (talk) 18:16, 9 June 2024 (UTC)Reply

Support. Imetsia (talk (more)) 22:22, 9 June 2024 (UTC)Reply

@Imetsia Can you clarify? What do you support exactly, and for English or non-English? Benwing2 (talk) 22:35, 9 June 2024 (UTC)Reply

I support automatically adding full stops in templates like {{synonym of}}, for both English and non-English. I've wanted this for Italian entries for quite a while by now. Imetsia (talk (more)) 22:49, 9 June 2024 (UTC)Reply

Support, with the automation that nodot be enabled for non-English by default. While at it, I also support nocap be enabled for non-English by default but I have not seen editors follow that convention as rigorously (I do, though, since I was told it is prescribed to do so). Svartava (talk) 07:11, 12 June 2024 (UTC)Reply

Oppose. Full stops are a nuisance where they are included in the template, especially when adding text after it, and a comma is needed. DonnanZ (talk) 23:47, 12 June 2024 (UTC)Reply

OK, from what I can tell, consensus is in favor of initial caps for English and no initial caps otherwise. There seems to be a general consensus in favor of final period for English and no final period otherwise (cf. around 10-6) so I'm going to proceed with this. To address the concern about the annoyance of turning off the initial caps and final period, there will be a single-character way of turning them off for English. I'm leaning towards using a ~ char (meaning "switch"), which if it comes before the language code will turn off initial caps and if it comes after the language code will turn off final period, hence:

{{alt form|en|foobar}} -> Alternative form of foobar.

{{alt form|en~|foobar}} -> Alternative form of foobar

{{alt form|~en|foobar}} -> alternative form of foobar.

{{alt form|~en~|foobar}} -> alternative form of foobar

The alternative is to put one or both of the switch characters before or after the lemma rather than the language code. I'm planning on doing this gradually, starting with a single lesser-used form-of template. There are 142 form-of templates in Category:Form-of templates so this won't happen overnight. Benwing2 (talk) 00:15, 20 July 2024 (UTC)Reply
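As an illustration only, the proposed ~ switch could be sketched as follows. This is not the actual form-of module code: the function names, the fixed phrase "alternative form of", and the choice to ignore the switches for non-English codes are all assumptions made for the sake of the example.

```python
def parse_lang_switches(lang_param):
    """Split a parameter such as '~en~' into (code, capitalize, add_dot).

    A leading '~' turns off the initial capital; a trailing '~' turns off
    the final period, as described in the proposal above.
    """
    capitalize = not lang_param.startswith("~")
    add_dot = not lang_param.endswith("~")
    code = lang_param.strip("~")
    return code, capitalize, add_dot


def render_form_of(lang_param, lemma, phrase="alternative form of"):
    """Render a form-of definition line (hypothetical helper)."""
    code, capitalize, add_dot = parse_lang_switches(lang_param)
    # Assumption: only English gets the capital and period by default;
    # for other languages both are off regardless of any '~'.
    if code != "en":
        capitalize = False
        add_dot = False
    text = phrase + " " + lemma
    if capitalize:
        text = text[0].upper() + text[1:]
    if add_dot:
        text += "."
    return text


for param in ("en", "en~", "~en", "~en~", "fr"):
    print(param, "->", render_form_of(param, "foobar"))
```

The four English cases reproduce the outputs listed above; the "fr" case shows the non-English default with neither capital nor period.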

Old Franconian

Latest comment: 1 month ago; 1 comment; 1 person in discussion

Old Franconian is a language variety derived from Frankish, and has many languages within West Central German like Luxembourgish, Rhine Franconian, East Franconian, and Central Franconian. See this. That Northern Irish Historian (talk) 03:49, 7 June 2024 (UTC)Reply

Collapsible lists within definitions

Latest comment: 1 month ago; 24 comments; 8 people in discussion

I propose that for cases in which definitions include lists (especially long ones), it be adopted as best practice to make said lists collapsible with a template such as those existing for quotations and semantic relationships (or one based on a code I cobbled together to attempt this for the list of place names in Eden). I believe this would be worthwhile to help streamline some unwieldy pages, prioritizing definitions and relationships. @Soap @Ioaxxere – Pangur Bán & I (talk) 21:41, 7 June 2024 (UTC)Reply

I support this. Hopefully if we approve this we can base it on existing code like that of {{collapse}} or {{collapse-top}} (neither of which will work inside a list as of yet) so that it can be guaranteed to work on all browsers. —Soap— 21:52, 7 June 2024 (UTC)Reply

@Soap, you said before that you would be willing to assist me in drafting this proposal. What are your thoughts on this in light of DCDuring's opposition? Pangur Bán & I (talk) 22:19, 17 June 2024 (UTC)Reply

I only meant I could help start the post since you're a new user and I felt you might be too shy to come here outright. But you have a good understanding of the issue and how to express yourself, so right now I don't have anything else to add. —Soap— 17:44, 18 June 2024 (UTC)Reply

I gotcha. Thanks anyway! Pangur Bán & I (talk) 17:50, 18 June 2024 (UTC)Reply

Support although there aren't that many pages where collapsed definitions are worth using. Maybe Mandarin màn (consider someone searching for the Vietnamese entry)? Ioaxxere (talk) 22:48, 7 June 2024 (UTC)Reply

Thank you for your support. To your point, as noted by DCDuring, this does seem to be primarily an issue with toponyms. See entries like Chester, Richmond, Franklin, and Weston for a few examples of bloated pages where I think collapsible lists of subsenses would be worth using. Pangur Bán & I (talk) 15:54, 17 June 2024 (UTC)Reply

Support but I don't want it to make a box around the sub definitions when you expand it, cuz I think that's kinda ugly. That is, if it's even possible to do that. — SAMEER (؂・؄・؏) 04:25, 8 June 2024 (UTC)Reply

That is absolutely possible. I gave my makeshift collapsible list a border just to make it visually distinct, but in hindsight I think it would make more sense for something like this to more closely follow the style of the semantic relations and quote templates, just in a bulleted or numbered list format unlike those. Pangur Bán & I (talk) 16:20, 17 June 2024 (UTC)Reply

~~Abstain~~

Oppose Not a complete proposal. It's just based on the Eden anecdote. By my lights it would have to be restricted to definitions formatted as subsenses. As nobody seems to have analyzed the cases, perhaps we should wait to see how it would be applied to toponyms for now. DCDuring (talk) 15:12, 8 June 2024 (UTC)Reply

@User:Geographyinitiative Any thoughts? DCDuring (talk) 20:58, 17 June 2024 (UTC)Reply

I really have no opinion on the proposal. I will be fine with it if you do it. Please compare Washington County on Wiktionary with Washington County on Wikipedia and Category:Washington County on Wikimedia Commons. The solution sounds like an innovation beyond Wikipedia and Commons. I would want to find out if this has been discussed in Wikipedia, etc. Geographyinitiative (talk) 23:04, 17 June 2024 (UTC)Reply

But what's the benefit in the case of Washington County? There is only one screenful of total content in the entry. DCDuring (talk) 16:50, 18 June 2024 (UTC)Reply

Apologies for my perhaps poor phrasing. I would be absolutely fine with amending this proposition to be restricted to definitions formatted as subsenses and I would even support having a toponym-specific template. Though, I would still be in favor of having one for subsenses more generally as I think that would allow some editor freedom without any cost that I can see. Any thoughts? Pangur Bán & I (talk) 15:51, 17 June 2024 (UTC)Reply

How many non-toponym entries would benefit from this? What criteria are to be applied, eg, number of subsenses, total number of definitions in PoS section, nature of supersense definition (Some are purely hypothetical for purpose of grouping. @User:-sche)? Others may have more questions and issues. I feel this might need a formal vote, not just a straw poll on this page. DCDuring (talk) 16:47, 17 June 2024 (UTC)Reply

Your concerns about a general subsenses template are absolutely worth discussing, but before we move on to that, would you definitely support a toponym-specific collapsed-list template in the vein of the formatting of in-line collapsed quotations, and hypernyms, meronyms, etc. (but formatted as a bulleted or numbered list)?

Once the details are more hammered-out, a formal vote sounds like a great idea. My main trouble is that I don't have the coding knowhow to do a good job writing the template I'm envisioning. I don't know how I would go about producing a comprehensive count of how many entries would benefit, but block, cross, finger, head, stand, slash, and band are just a few non-toponyms I've found that I think could potentially use collapsible subsenses. As for requisite criteria for use, if you have any specific suggestions I'd genuinely love the help in fleshing out this proposal. The existence of two or more items seems to be the only hard criterion for quotations formatting and semantic relations templates, which seem fine models for something like this, but I'm happy to consider alternatives. Based on this poll, it would certainly seem that there is some interest in this functionality, and if it does reach the point of a formal vote, different options for potential criteria could easily be offered. Pangur Bán & I (talk) 18:51, 17 June 2024 (UTC)Reply

If we don't begin to address the issues now, then it will not be possible to draft a meaningful proposal. At head we have two levels of subsenses. The first definition is "The part of the body of an animal or human which contains the brain, mouth and main sense organs.". Under this definition, the first subsense layer consists of two non-definitions: "(people) To do with heads." and "(animal) To do with heads." Would that first layer be visible or not under a yet-to-be specified proposal? DCDuring (talk) 20:58, 17 June 2024 (UTC)Reply

I'm not set on anything and am entirely willing to continue workshopping this proposal. In pages like head, perhaps subsenses hosting a second layer of subsenses should not be collapsible under this prospective template. I see no problem with that if that's what you're suggesting. Pangur Bán & I (talk) 22:19, 17 June 2024 (UTC)Reply

Neither of the two members of the first subsense layer at the first definition of head is a real definition. The first definition itself does not necessarily suggest the range of definitions at the second layer of subsenses. To me this is a specific sign that hiding subsenses can make it harder for less experienced users to find less common definitions. DCDuring (talk) 16:47, 18 June 2024 (UTC)Reply

Support. Imetsia (talk (more)) 22:23, 9 June 2024 (UTC)Reply

I am ambivalent about the idea of doing this to placenames, or long lists of Chinese "romanization of"s as also suggested above; I would not support collapsing 'real' definitions e.g. at take, even if there are very many. It seems like the number of placename entries which really have so many senses as to merit collapsing is small, and it seems like the sort of person who'd go to màn#Mandarin is someone interested in learning what it's a romanization of: why else wouldn't they go to or click through in the TOC to màn#Vietnamese? so collapsing just adds an extra step for them. In general it does not seem like that much of a hassle to scroll past placenames one is uninterested in. Whereas, collapsed content is easily missed, even by veteran editors who know to look for it (I myself often missed the existence of various inline -nyms under definitions back when they were autocollapsed, and have seen other veteran users miss collapsed etymology content), let alone new users. So I am ambivalent, leaning against it. - -sche (discuss) 16:15, 18 June 2024 (UTC)Reply

Thank you for your consideration and your well articulated concerns. I have no opinion on the "romanization of" example as that's not something with which I have any experience myself, and in hindsight I do think my original proposal here is likely too broad. I don't really want every list of subsenses to be collapsed, but rather for this to be available as a tool in situations where it may be truly helpful, its usage being determined via consensus for edge cases.

For 'real' definitions, I would agree that genuinely distinct subsenses such as those in take probably shouldn't be collapsed, at least not by default (I think making them open by default but with the ability to collapse them could still be useful). I really had in mind entries wherein the "subsenses" are really just examples, which can be seen in some of the examples I cited (block, head#Noun sense 2, etc.).

My issue is less that it is inconvenient to scroll past them per se, but rather that lists of examples subordinate to the most common senses are effectively privileged over secondary senses that can be more prevalent/noteworthy than items in those lists. This, I think, is not conducive to efficiently absorbing the information, and rather counter to the purpose of ordering senses in the first place. Pangur Bán & I (talk) 17:30, 18 June 2024 (UTC)Reply

We only have opinions, not facts, about the relative frequency of use of different definitions, the relative frequency of requests for different definitions, even of the time-period of use of definitions. I have trouble justifying the privileging of some contributor(s) opinions about what is to be listed first and what de-privileged by being rendered into subsenses. I also have trouble understanding why we discuss this in terms of the rights and privileges of definitions. Our concern is merely with users and their ability to navigate an essentially linear presentation of data, in which some data necessarily precedes other data. I'm afraid that tradeoffs are inevitable and that we have little reasoned basis to make them in general. DCDuring (talk) 20:47, 18 June 2024 (UTC)Reply

I suppose 'privilege' was ill-chosen here. I meant 'prioritize', in the sense of placing one thing before another in sequence. As for privileging contributors' opinions about what is listed first, every Wiktionary entry that includes multiple senses already does that, in accordance with WT:SG#Definition sequence, with the frequency-based order determined via consensus, exactly as I'm proposing the usage of this template be. My concern is also with users and their ability to navigate the information, which is precisely why I am proposing this. Pangur Bán & I (talk) 21:41, 18 June 2024 (UTC)Reply

User:Ioaxxere/MTE glossary

Latest comment: 13 days ago; 9 comments; 4 people in discussion

I invite you to check out this new glossary format. Using JavaScript (User:Ioaxxere/auto-glossary.js), it automatically scrapes every entry in a certain category and finds definitions containing a certain label. To see the output, you will have to add the line importScript("User:Ioaxxere/auto-glossary.js"); into your common.js page. Here's what the output looks like: https://imgur.com/a/kKQLGSG.

I propose that we create more of these automatically-generated glossaries in Appendix space, as I think that they are very useful for keeping track of a certain category. Ioaxxere (talk) 05:45, 9 June 2024 (UTC)Reply

Yeah, they should be efficient search engine spam, people land on when searching slang words. Fay Freak (talk) 08:20, 9 June 2024 (UTC)Reply

I wonder if search engines will manage to index them, given that they are dynamically loaded. In general I'd think the outcome Ioaxxere is aiming for is better accomplished by frequently updating the actual wikitext of the page using a bot or script. This, that and the other (talk) 00:18, 24 June 2024 (UTC)Reply

@This, that and the other: I'm not sure if this is possible given the relatively strict pagesize limits, but it's definitely worth trying. Who would be available to run the bot? (note that we may end up with hundreds of these glossaries) Ioaxxere (talk) 05:07, 24 June 2024 (UTC)Reply

@Ioaxxere Good point about page size limits. It just doesn't "feel" like the right thing to do using client-side JS... but maybe that's just the old-school web developer in me talking. Is there at least some pagination or limiting in the event the number of glossary entries exceeds a certain number?

Having said that, the concept of auto-generated glossaries, however implemented, is undeniably good for Wiktionary, so if no-one else objects (or offers to implement it) let me know and I'll put your JS into place as you request, and we can review it later if a better solution appears. This, that and the other (talk) 12:15, 24 June 2024 (UTC)Reply

@This, that and the other Before you do that, I should mention: since the template makes up virtually all of the content on the pages on which it is used, it would make sense for the script to run before everything else — maybe even at the very top of MediaWiki:Common.js. At the very least we should optimize away User:Ioaxxere/auto-glossary.js#L-143. Ioaxxere (talk) 04:51, 27 June 2024 (UTC)Reply

@This, that and the other: It looks like no one has offered any comments or suggestions in the past two weeks. Ioaxxere (talk) 21:24, 8 July 2024 (UTC)Reply

Neat. Vininn126 (talk) 18:38, 9 June 2024 (UTC)Reply

Should this gadget be enabled by default? Notifying a few active interface administrators: @-sche, Benwing2, Surjection, This, that and the other. Ioaxxere (talk) 08:08, 23 June 2024 (UTC)Reply

standardizing the form of phrase lemmas

Latest comment: 17 days ago; 9 comments; 7 people in discussion

This is based on a discussion in WT:RFM originally concerning tail wagging the dog, which someone proposed moving to the tail wags the dog. User:Theknightwho asked about general conventions, and I suggested the following:

1. try to avoid "one" or "someone" in a lemma unless it's unavoidable, e.g. it's in the possessive; so kiss goodbye not kiss one goodbye or kiss someone goodbye;
2. if "one" or "someone" needs to be expressed, use "one" if it is the same as the subject, "someone" otherwise; hence kiss one's ass goodbye is correct, not kiss someone's ass goodbye; take someone's word for it is correct, not take one's word for it (which is correctly a redirect); but someone's ass off should be one's ass off (the latter is incorrectly a redirect to the former);
3. use the infinitive for verbs occurring at the beginning of an expression (in a verb-object phrase), but the simple present for verbs occurring with a subject (hence the tail wags the dog not the tail wagging the dog; time stands still not time stood still, time standing still, time stand still, etc.);
4. there should be something about whether to include the word "the", e.g. in tail wags the dog or the tail wags the dog.

User:DCDuring asked:

Those seem like good rules to me. There is an interaction with what I think is our preference not to have headwords with leading the. Also, to clarify, when you say infinitive you mean the 'bare infinitive', not the 'to infinitive'. When should something be used instead of someone? (Does it depend on the relative frequency of use of the expression with non-gendered things? Threshold?) Are there circumstances in which we would go with a different lemma headword? Should we have alt form entries for some of the inflected and other variant forms or just hard redirects? I don't know how complete we should try to be. Too much detail might delay implementation and course correction. DCDuring (talk) 01:53, 7 June 2024 (UTC)Reply

To which I replied:

These are good questions. You are right that I mean "bare infinitive" rather than "to-infinitive". As for something vs. someone, I think if it can reasonably occur with both, one should be a soft redirect to the other. Generally I prefer soft redirects over hard redirects, although I understand that hard redirects are easier to enter. Another issue is, what's the inanimate equivalent of one's? Is it its? I will bring these rules to the BP and see what people say. Benwing2 (talk) 03:01, 7 June 2024 (UTC)Reply

The suggestion is to put these in the WT:Style guide rather than WT:Entry layout (which requires a vote to make any substantive changes). Does anyone have any thoughts or additional suggestions for standardization rules? Benwing2 (talk) 09:06, 9 June 2024 (UTC)Reply

Definitely agree on point 2, which is WT:CFI#Pronouns already. Re point 4, Wiktionary:Tea room/2023/December#Proverb_entries_starting_with_"the" suggested more people want to include the in proverbs than don't (obviously only for phrases that can include the; nobody is moving →*the Rome wasn't built in a day), hopefully a wider discussion finds a wider consensus. (Maybe we can even determine whether to standardize the situation with short the X phrases / nouns: we have the bomb, but (after some TR discussion) the talk is a redirect to talk; the Netherlands redirects to Netherlands, but the Rock is an entry, and I don't think anyone would dream of moving The Hague.) I advocate redirects from whichever form we don't lemmatize to whichever we do. Point 3 seems reasonable; there too I advocate redirects from other common forms (e.g. the tail is wagging the dog). If we remove the object from the entry title (point 1), I hope we strongly encourage people to add usexes or citations showing where in the phrase the object goes, because sometimes it's [verb] [other word] someone and sometimes it's [verb] someone [other word] and sometimes it's other possibilities. - -sche (discuss) 17:29, 9 June 2024 (UTC)Reply

@-sche Thanks, and I completely agree with your idea of strongly encouraging the inclusion of usexes showing where the object goes. Sometimes even a single expression can go both ways; my canonical example for this is see through. For this example, we do include usexes for each sense, along with a usage note indicating that some senses take the object before through, some after. Maybe there is a way to standardize this? Benwing2 (talk) 18:40, 9 June 2024 (UTC)Reply

I don't have a strong opinion on how we lemmatise (though I see the merits of the cut down) but I agree that pronouns (and other arguments) are extremely important (and in the case of phrasal/particle verbs, also their relative positions), especially for learners, and would support a policy which requires mandatory marking of the arguments which a word/phrase takes, at least in the entry (via usex or similar), if not also in the headword. By way of illustration, compare the variants of the lemma turn on:

turn something on (“activate, start"; also possible in the order: "turn on something”) (as far as I can tell, this is the only construction from these examples which can display ergativity, and thus can occur in the bare form, apparently without an object: "the coffee machine turned on [by itself] in the middle of the night"),
turn someone on (“excite, esp. sexually”),
turn on something (“revolve around, centre on"; also: "activate, start”),
turn on someone (“unexpectedly attack or betray"; but IMO this order is ~~not possible~~ rather awkward in the meaning "excite sexually”).

In practice, this information may not be as obvious/readily accessible to editors as we might hope, since although I've probably used all the above examples before, the third and fourth examples only occurred to me after consulting a dictionary.

Edit: it occurs to me that the preferred order may also vary depending on whether the object is a pronoun or a noun.

Helrasincke (talk) 07:31, 19 June 2024 (UTC)Reply

I disagree slightly about the "avoid someone/one" rule - where an entry would be ambiguous, I think having the pronoun is better. For instance, I think leave someone holding the bag seems better than leave holding the bag, which could be misinterpreted as "to leave [a room] while holding the bag". Occasionally this is all that separates entries like get there vs get someone there. Smurrayinchester (talk) 19:05, 12 June 2024 (UTC)Reply

Support rules 1–3. As for using "the", I think it should be avoided unless the entry title is a fixed phrase, like a proverb, or would sound extremely unnatural without it (admittedly subjective). Ioaxxere (talk) 08:31, 23 June 2024 (UTC)Reply

Generally, I am also in support of the first three rules. I agree with Helrasincke in that we are mostly lacking guidance about the correct (or most common) usage of pronouns for different senses of English phrasal verbs, which should probably also be standardized in the future. However, agreement on page titles is a good first step towards normalization. Einstein2 (talk) 11:54, 5 July 2024 (UTC)Reply

The right to bear ewes

Latest comment: 1 month ago; 4 comments; 3 people in discussion

A usual way of qualifying the restricted applicability of a verb sense is to have a label saying, of a .... For example, for the verb proceed:

6. (intransitive, of a rule) To be applicable or effective; to be valid.

Since the verb is intransitive, this can only refer to the subject of the verb. For transitive verbs, there is an ambiguity: does the restriction apply to the subject or the object of the verb?

Here is an example. At bear, Etymology 2, we see both

1.2. (transitive, of garments, pieces of jewellery, etc.) To wear.

and

1.3. (transitive, rarely intransitive, of a woman or female animal) To carry (offspring in the womb), to be pregnant (with).

Common sense tells us that the first sense does not mean to refer to diamonds wearing a smile and the second sense not to being pregnant with a ewe. But common sense may not be good enough in cases where both interpretations make sense.

Is there a way to disambiguate this that does not depend on common sense? --Lambiam 16:31, 9 June 2024 (UTC)Reply

Granting that this doesn't help someone who is unfamiliar with our subtle norms (and doesn't help if the norms aren't followed): in theory I think the nature of the restriction is supposed to be clarified by the form and placement of the restriction: "of..." labels precede the definition and restrict the subject, whereas restrictions on the object are supposed(?) to occur within the definition itself, not as a label, and not normally with "of" (although clearly this is not always followed, and maybe my sense of this is wrong!). Hence "To carry (offspring)" uses "(offspring)" to indicate that the thing in the womb is normally restricted to being offspring, and that if a surgeon left a surgical implement in a woman's womb after surgery she wouldn't normally be described as bearing it in this sense. (However, there was a discussion recently where a set of "of..." labels were moved—because they had been using {{a}} or {{q}} or manual formatting—from being in front of the definition, to being qualifiers after the definition, which made things [even] less standardized/predictable in this respect.) In theory we could make this explicit by saying things like "(SUBJECT is a pregnant person)", "+ OBJECT (offspring)" or something modelled on however we express objects being in the accusative-vs-dative (etc) already. - -sche (discuss) 17:45, 9 June 2024 (UTC)Reply

Agree with User:-sche here about using of (before or after the definition) to indicate subject restrictions, and parens after the definition without of to indicate object restrictions. Preposition restrictions should use {{+preo}}. Other sorts of predicate restrictions should use {{+obj}}. (I have a sandbox version of {{+obj}} that reworks it to support prepositions and such much better than {{+preo}} currently does; you can see examples at User:Benwing2/test-obj. At some point I will finish this and deploy it.) Benwing2 (talk) 18:45, 9 June 2024 (UTC)Reply

If I understand this correctly, a more appropriate way to express sense 1.2 of bear above is

1.2. (transitive) To wear (garments, pieces of jewellery, etc.).

It would be nice if this was documented in some form of guidance to creating good definitions.

But note that there is a slight problem in applying this to sense 1.1. We get

1.1 (transitive) To carry (weapons, flags or symbols of rank, office, etc.) upon one's person, especially visibly; to be equipped with (weapons, flags or symbols of rank, office, etc.).

(although I can't immediately think of a use covered by the second part not already covered by the first part). --Lambiam 19:09, 9 June 2024 (UTC)Reply

Batch editing Wiktionary with AWB

Latest comment: 25 days ago; 10 comments; 4 people in discussion

As discussed in February, there are cases where, for both US and UK English, the voiced alveolar approximant /ɹ/ is transcribed as the trill /r/. Our team at CUNY (myself and @Yaejunmyung) would like to use the AWB tool to do a batch edit, mapping all instances of the trilled /r/ to /ɹ/ for both US and UK English. Please let us know if you see any issues with this batch edit. If it sounds okay to you, could you please add me and @Yaejunmyung to the enabled user list? Thank you! Cpeng2 (talk) 19:31, 9 June 2024 (UTC)Reply

This seems reasonable. Indeed, it's possible that the replacement could be fully automated (for specific accents where it's known that trilled /r/ is not phonemic and thus that it can be replaced systematically). I will wait to see if any bot-maintainer wants to run it as a bot task, or if anyone has objections; if not, I can add you to the AWB list after ~a week, or someone else can feel free to do that sooner. (For other people, let me provide a link to the February discussion; this seems like a more limited and safer proposed change than the changes to parenthetical (ɹ).) - -sche (discuss) 20:00, 9 June 2024 (UTC)Reply

OK, based on Surjection's comment it looks like it would be better for this standardization to be done by someone more familiar with Wiktionary, so we can be sure it's done correctly. (I do think that now that accents have been incorporated into T:IPA, it would be possible for a bot operated by one of en.Wiktionary's competent bot operators to do this if they are reading this and have time; indeed, it might even be possible for the T:IPA template to know that if the input is /r/ + an accent that doesn't have trilled /r/, it should simply correct the displayed output to /ɹ/ and/or add a cleanup category, the last of which is possibly the safest option.) - -sche (discuss) 17:06, 12 June 2024 (UTC)Reply

Yes, we'd appreciate it if it could be done by a bot operated by one of en.Wiktionary's competent bot operators. Is there a way that we can reach out to them to coordinate this? Cpeng2 (talk) 17:33, 25 June 2024 (UTC)Reply

Pinging @JeffDoozan, Surjection, Erutuon, Benwing2, as operators of bots: How feasible do you think it would be to find and replace instances of /r/ (with →/ɹ/) in dialects of English that don't have trilled /r/? Or does it seem worthwhile (balancing how much effort it'd take to do vs benefit) to simply have {{IPA}}, whenever the input is /r/ but the accent is tagged as GA, RP, etc, simply output/display /ɹ/? Or do you think it's better to wait for a pronunciation module? - -sche (discuss) 21:29, 25 June 2024 (UTC)Reply

@-sche This should be possible. Erutuon already set up a tracking category here Special:WhatLinksHere/Wiktionary:Tracking/IPA/en/plain r for tracking uses of r in English pronunciations. There are 1,183 pages linked so this would have to be done by a bot. I don't know which accents legitimately allow /r/ but if you have an idea I can generate a list of all template invocations using r in English and scan through them manually to see if any are tagged with the relevant accents. Benwing2 (talk) 21:49, 25 June 2024 (UTC)Reply

/r/ exists in some Scottish dialects and, if Wikipedia is to be believed, some Welsh, South African(?), and Indian dialects. One idea would be, as a first step, to change any /r/ which was either tagged as GA/US/UK/RP or presented as pan-dialectal, and then review what's left. - -sche (discuss) 22:15, 25 June 2024 (UTC)Reply
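The first step suggested above (change /r/ only where the transcription is tagged GA/US/UK/RP or carries no accent tag, and set everything else aside for review) could be sketched roughly as follows. This is only an illustration in Python: the function name, the tag set and the data shapes are assumptions, not the behaviour of {{IPA}} or of any existing bot.

```python
# Hypothetical accent labels; the tags actually used on entries may differ.
NON_TRILLED = {"GA", "US", "UK", "RP"}


def normalize_r(ipa, accents):
    """Return (new_ipa, needs_review) for one transcription.

    ipa      -- the transcription string, e.g. "/rɛd/"
    accents  -- the accent tags attached to it (empty = pan-dialectal)
    """
    tags = set(accents)
    # Untagged, or tagged only with accents assumed to lack a trilled /r/:
    # replace plain r with the approximant.
    if not tags or tags <= NON_TRILLED:
        return ipa.replace("r", "ɹ"), False
    # Any other tag (or a mix of tags): leave untouched and flag it
    # if it still contains a plain r.
    return ipa, "r" in ipa


print(normalize_r("/rɛd/", ["GA"]))        # ('/ɹɛd/', False)
print(normalize_r("/rɛd/", ["Scotland"]))  # ('/rɛd/', True)
```

A filter like this only decides which transcriptions are candidates; it does not check whether the rest of the transcription is sound.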

@-sche I generated a list of all the English pronunciations with plain /r/ but I don't think they are amenable to simply replacing /r/ -> /ɹ/ by bot because many or most of them have all sorts of other issues in them; the /r/ is often the canary in the coal mine indicating that the creator didn't really know what they were doing. The list of just the instances of {{IPA|en}} with /r/ in them is here: User:Benwing2/IPA-en-plain-r; but what might be more useful is the list of all the English pronunciations on each page with /r/ in any English pronunciation, which is here: User:Benwing2/IPA-en-plain-r-all. The latter is only about 25% larger than the former. In both cases I took out page 679, Appendix:Protologisms/Long words/Titin, and put it here: User:Benwing2/IPA-en-plain-r-titin because it alone blows up the total list size by about 4x bytes. If you want to go about cleaning up these entries, you can do it directly in any of the userspace lists I linked, and I will then run a bot script to push all the changes to the respective pages, as long as you follow these rules:

Don't change the <begin> or <end> markers.
If you (or someone else) edits any of the listed pages directly, that's fine; I have kept a copy of the unchanged versions of the above lists, and the bot script compares the unchanged version against what's currently present and won't make any changes if there's a mismatch.
If you want to delete a line (e.g. because it's correct or because you made a change to the page directly), that's fine as well, but in that case it's best to delete all the lines associated with a given page.

Benwing2 (talk) 05:36, 26 June 2024 (UTC)Reply
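The safety check described in the rules above (compare the saved, unchanged copy against what is currently on the page, and make no change on a mismatch) can be pictured with a small sketch. It is not the actual bot script, which is not shown in this discussion; the function name and the line-level granularity are assumptions made for illustration.

```python
def apply_edit(snapshot, edited, live_page):
    """Return the new page text, or None if nothing should be saved.

    snapshot  -- the line as it was when the list was generated
    edited    -- the same line after manual clean-up in the userspace list
    live_page -- the current wikitext of the entry
    """
    if edited == snapshot:
        return None  # the reviewer left this line alone
    if snapshot not in live_page:
        return None  # page no longer matches the saved copy, so do not touch it
    # Replace only the first occurrence of the snapshot line.
    return live_page.replace(snapshot, edited, 1)


page = "==Pronunciation==\n* {{IPA|en|/rɛd/}}\n"
print(apply_edit("{{IPA|en|/rɛd/}}", "{{IPA|en|/ɹɛd/}}", page))
```

Skipping on a mismatch, rather than trying to merge, is the conservative choice: if someone has already edited the page directly, the list entry is simply ignored.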

Thanks! Some of those (e.g. hour) are probably fine; others (world) look wrong. I think the ones that give /r/ as Australian are also wrong. I will try to edit the list (more) later. - -sche (discuss) 17:14, 26 June 2024 (UTC)Reply

Oppose granting AWB. I have had to block Yaejunmyung twice for bot-like edits so careless that they did not even check which language they were editing (exhibit A, exhibit B, exhibit C, exhibit D). Some other edits are also inexplicable. This level of editing is simply not acceptable, and if this is what we can expect, we absolutely should not be making it any easier. — SURJECTION / T / C / L / 09:17, 11 June 2024 (UTC)Reply

The final text of the Wikimedia Movement Charter is now on Meta

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hi everyone,

The final text of the Wikimedia Movement Charter is now up on Meta in more than 20 languages for your reading.

What is the Wikimedia Movement Charter?

The Wikimedia Movement Charter is a proposed document to define roles and responsibilities for all the members and entities of the Wikimedia movement, including the creation of a new body – the Global Council – for movement governance.

Join the Wikimedia Movement Charter “Launch Party”

Join the “Launch Party” on June 20, 2024 at 14.00-15.00 UTC (your local time). During this call, we will celebrate the release of the final Charter and present the content of the Charter. Join and learn about the Charter before casting your vote.

Movement Charter ratification vote

Voting will commence on SecurePoll on June 25, 2024 at 00:01 UTC and will conclude on July 9, 2024 at 23:59 UTC. You can read more about the voting process, eligibility criteria, and other details on Meta.

If you have any questions, please leave a comment on the Meta talk page or email the MCDC at mcdc@wikimedia.org.

On behalf of the MCDC,

RamzyM (WMF) 08:45, 11 June 2024 (UTC)Reply

Proposal for a Turkish conjugation module

(Notifying İtidal, Fytcha, Vox Sciurorum, Lambiam, Whitekiko, Ardahan Karabağ, Orexan, Moonpulsar, Lagrium):

I've noticed that the current conjugation tables for Turkish verbs are incomplete, sometimes wrong (korkmak has korkmış as its inferential past 3rd person singular form, according to the table) and different from one another, albeit for minor things (etmek and gitmek seem to be, together with their derivatives, the only verbs that show the polite imperative forms in their table). These reasons, together with the fact that as of now there are way too many templates (Template:tr-conj, Template:tr-conj-v, Template:tr-demek-yemek, Template:tr-conj-*tmek) that require way too many parameters (tr-conj requires the verb's stem, the last vowel in the verb's stem, the stem with the aorist suffix, the last vowel when the aorist suffix is attached and a t/d to know which consonant to use in the suffix -dI) to conjugate Turkish verbs, have made me decide to work on a module that could summarize every possible Turkish verb's conjugation, adding more forms, requiring parameters only if strictly necessary (i.e. if the verb's aorist suffix is unpredictable or if it ends in a t which turns into d before vowels) and making the default table smaller too by setting some forms as collapsible, and I'd like to propose that we switch to this module (here are some sample verbs to display the table).

— Trimpulot (talk) 12:18, 12 June 2024 (UTC)Reply
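To illustrate why several of the parameters listed above can in principle be derived from the stem alone, here is a minimal sketch of four-way vowel harmony and the t/d choice for the -dI and -mIş suffixes. It is written in Python purely for illustration and is not the proposed module (Wiktionary modules are written in Lua); all names are hypothetical, and irregular verbs and consonant alternations are ignored.

```python
VOWELS = "aeıioöuü"
# Four-way harmony: the high vowel selected by the last vowel of the stem.
HIGH = {"a": "ı", "ı": "ı", "e": "i", "i": "i",
        "o": "u", "u": "u", "ö": "ü", "ü": "ü"}
VOICELESS = set("çfhkpsşt")


def last_vowel(stem):
    """Return the last vowel of a lowercase stem."""
    for ch in reversed(stem):
        if ch in VOWELS:
            return ch
    raise ValueError("stem contains no vowel: " + stem)


def past_3sg(stem):
    """Definite past, 3rd person singular: stem + -dI (t after a voiceless consonant)."""
    consonant = "t" if stem[-1] in VOICELESS else "d"
    return stem + consonant + HIGH[last_vowel(stem)]


def inferential_3sg(stem):
    """Inferential past, 3rd person singular: stem + -mIş."""
    return stem + "m" + HIGH[last_vowel(stem)] + "ş"


print(inferential_3sg("kork"))  # korkmuş
print(past_3sg("kork"))         # korktu
print(past_3sg("gel"))          # geldi
```

With the last vowel computed rather than passed in, a form like korkmış cannot arise from a mistyped parameter.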

The module is very impressive. I would totally support switching to the module version. Lagrium (talk) 12:49, 12 June 2024 (UTC)Reply

It looks like a huge improvement. --Lambiam 15:06, 12 June 2024 (UTC)Reply

In many ways this looks like a huge improvement over what we have now. Before we ship it could you squeeze in some more info? Like the formal imperative forms, maybe? And you added the verbal noun but the -iş form is not there. Maybe these two should be listed on the same row to save space horizontally. Rn -me form is there in its own mansion of a box. Same things with adverbial forms. You listed 2 but many are missing. Like -ince, -ip, -e -e, -dikçe, -eli, -esiye and maybe a few more if I'm forgetting any. Whitekiko (talk) 15:58, 12 June 2024 (UTC)Reply

@Whitekiko: The formal imperative forms (as well as -sene and -senize labeled as informal imperatives since I didn't know how else to name them for the time being), -ince, -ip and -e -e are already on the table but aren't shown as a default, mostly because I thought it would overcrowd the table. As for the other forms you mentioned I did miss some of them but I'm not sure adding -iş is really necessary since as far as I can tell it's more of a derivational suffix more akin to -im or -i, whereas -me has actual grammatical functions.

— Trimpulot (talk) 16:52, 12 June 2024 (UTC)Reply

There is enough space for adding the -ince forms:

temporal adverbs açınca, açarken

However, speaking in general, tables for Turkish forms will never be complete. For example, the verbal nouns are declined like all nouns, including case forms of possessive forms. Under ekmek we give the form ekmeğime, so shouldn’t we also, for the sake of completeness, give the form ememememe (as used in ememememe bakmayın! – “don’t mind my inability to suck!”) under the impotential verbal noun emememe? What about the passive, causative and reciprocal forms? And the causatives of reciprocals, like uyuşturmak, or the causatives of causatives, like öldürtmek? The just-do-it suffix -(y)iver? Maybe, one day, we’ll have a module for analyzing Turkish forms, but attempts to be complete in tables are doomed to fail. --Lambiam 20:03, 12 June 2024 (UTC)Reply

@Lambiam: Of course we can't include the entire noun-like declension of the verbal noun nor do we need to as it is implicit in the fact that it is a verbal noun. As for the adverbial forms though, -ince is actually included, but it only appears after toggling the "Show complex tenses" switch for no reason in particular other than if all the hidden adverb and participle forms were visible by default they would overcrowd the table in my opinion, as they would outnumber the finite TAMs. Also I don't think listing them all on the same line would work because that way they wouldn't get any description of their usage or function at all, however small it may be: if -esiye and -eli were in the same box separated by a mere comma how is one supposed to understand that they are pretty much polar opposites in meaning?

— Trimpulot (talk) 20:20, 12 June 2024 (UTC)Reply

Could you also add -er -mez ("as soon as") as a temporal adverb? I forgot to mention that. As with -iş... Our current template has it and I think that's for a reason. -im comes only after a finite number of verbs to derive nouns and these nouns always appear on dictionaries. On the other hand every verb has an -iş form and it always means "the way someone does x". It'll help users that are beginners in Turkish find the infinitive of the verb. -iş has a weird status. There was a debate around it, idk how it ended. We weren't sure what to call it, if it should be a lemma or a non lemma, if the pages should be created. Whitekiko (talk) 08:26, 14 June 2024 (UTC)Reply

@Whitekiko: I don't think that -iş always having the same meaning and being able to be applied to any verb is enough of a reason to include it in the conjugation table, since that argument could also be made for -ici and similar suffixes in other languages as well, like -tio and -tor in Latin, but those are left out. Of course the line between what counts as conjugation and what doesn't isn't precise but it has to be drawn somewhere and I think that semantically heavier suffixes with little grammatical or syntactical meaning should be left out.

As for -er -mez, I would like to add it but I still don't understand if it works with polarities other than positive, and if so how? If you can help me figure it out I'll see it added.

— Trimpulot (talk) 11:38, 14 June 2024 (UTC)Reply

It's just that the first part takes the aorist and the 2nd part takes negative aorist. I've added the def and an example under -er some time ago, rather than creating -er -mez and such. Not sure which one's the right thing to do. Putting these 2 suffixes together will create 6? combinations because first part can go through vowel changes.

Maybe we're of different opinions but I'd like to see -iş and -ici forms too somewhere on the table. I don't think adverbial forms are considered conjugations either but I loved to see them. I don't know the technicalities behind this but it would be revolutionary if we could add "ghost texts" to the templates. Yalayış and Yalayıcı, for example should pull yalamak as a result. In case users run into it in the wild, and they surely will. Whitekiko (talk) 12:46, 14 June 2024 (UTC)Reply
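The -er -mez pattern described above (positive aorist followed by negative aorist) could be generated along these lines. This is a hedged Python sketch with hypothetical names: the positive aorist is passed in explicitly because, as noted earlier in the thread, it is not always predictable, while the negative aorist -mAz is derived from the stem by two-way vowel harmony.

```python
VOWELS = "aeıioöuü"
BACK = set("aıou")


def as_soon_as(stem, aorist):
    """Build the 'as soon as' form: positive aorist + negative aorist.

    stem   -- the lowercase verb stem, e.g. "gel"
    aorist -- the positive aorist 3rd person singular, supplied by the caller
    """
    last = next(ch for ch in reversed(stem) if ch in VOWELS)
    # Two-way harmony: -maz after back vowels, -mez after front vowels.
    negative = stem + ("maz" if last in BACK else "mez")
    return aorist + " " + negative


print(as_soon_as("aç", "açar"))    # açar açmaz
print(as_soon_as("gel", "gelir"))  # gelir gelmez
```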

Proposed module looks great. I had noticed the irregular behaviour with certain above mentioned verbs, but unfortunately I'm module illiterate. And I've always thought the current template gives terrifyingly too much info to an absolute beginner checking one of the simplest conjugations, so the drop-down menus are smart. The details can be discussed and smoothed out, but I definitely support this improvement.

By the way, there are a few more active native editors of Turkish, who might have something of their own to say about this; @Hswehli, Blueskies006, Kakaeater, Science boy 30. Orexan (talk) 20:39, 12 June 2024 (UTC)Reply

@Trimpulot: That looks really good! I have a few suggestions:

Make the colours a bit more muted (see {{es-conj}} and {{la-conj}} for good examples).
Make sure each link has #Turkish.
The "show complex tenses" button should be inside the table itself to make it more clear that it expands the table rather than showing a new table.
I feel like the infinitive, being the lemma form, should be at the top.
You may want to use — to indicate an "impossible" conjugation (although leaving the cell empty works as well).
You may want to have a slightly different colour scheme for each table.
You could add a disclaimer explaining which forms aren't included.

Ioaxxere (talk) 02:35, 13 June 2024 (UTC)Reply

@İtidal, Fytcha, Vox Sciurorum, Lambiam, Whitekiko, Ardahan Karabağ, Orexan, Moonpulsar, Lagrium, Hswehli, Blueskies006, Kakaeater, Science boy 30.

I have updated the module with some minor changes (fixed the links, moved the "Show complex tenses" button inside the table and added some missing adverbial forms); however, I would like to ask if you think it makes sense to have those "complex tenses" hidden at all. At first it was meant to make the table more readable by hiding all of the forms that employ more than one suffix but I noticed that even without hiding them it is still relatively small and readable. Also let me know if there is a better way to label the forms in -eli, -esiye and -dikçe since I really don't like using a translation as a label but I also don't know how else to call them.

— Trimpulot (talk) 13:19, 13 June 2024 (UTC)Reply

My personal opinion of the complex tenses drop down menu is not only should it stay but it should also have a high-vis warning that says something like "Attention! May cause shock, anxiety, dizziness, despair in beginner learners. Abandon all hope, ye who click here!" Maybe that's a little much, but it should definitely stay.

I would like to float the idea of adding "-cesine" meaning something like "as if ...", forming adverbials. It is productive with a myriad of suffix combinations, see here ("-ercesine", "-mişçesine" "-yorcasına" "-ecekçesine" etc.) as well as "noun + -cesine" though unrelated. It's a common enough usage to encounter. I guess you would only display the "Simple" aorist form like you did with "-ken" (which is also productive as "-erken", "-yorken", "-mişken", "-ecekken" etc.).

Also, as a native and someone who's got above average grasp on English but isn't a grammar expert, the labels don't mean anything to me, even some of the conjugated forms don't mean anything unless I see it in an example sentence. I literally had to google "-esiye" to see what the hell it was used for. I assume it would be similar with other natives or learners, so I think coming up with labels is kind of an exercise in futility. I get that each form will point to a page of their own, where a text like temporal adverb "until" inflection of açmak would look strange. I'm not saying the module should include example sentences for each and every usage, but translations actually make it easier to have some idea about how or when something is used. At the end of the day, nothing short of turning the table into a full scale grammar book will go very far in the way of helping someone understand the contexts in which these conjugations are used, at least beyond the most basic ones like "açarım, açıyorum, açtım" etc. — Orexan (talk) 14:54, 13 June 2024 (UTC)Reply

@Orexan: That makes sense. As for the various forms suffixed with -ken and -cesine, I've been thinking about putting them all on the same line like so:

temporal adverb simple açarken, açıyorken, açmışken, açacakken

modal adverb "as if" açarcasına, açıyorcasına, açmışçasına, açacakçasına

Or alternatively we could just display the aorist form as you said and add a note of some kind to explain that those suffixes are actually way more productive.

Also for the drop down menu, do you think it should stay even for those participle and adverb forms, and the formal and "informal" imperative forms, even though they don't employ more than one suffix?

— Trimpulot (talk) 15:32, 13 June 2024 (UTC)Reply

I'm not sure if the line is long enough to contain all combinations of some suffixes, especially with verbs with longer roots than two letters like "aç-". For reference, the paper I linked lists "açarcasına, açıyorcasına, açmışçasına, açacakçasına, açarmışçasına, açmazcasına, açmazmışcasına, açıyormuşçasına, açacakmışçasına, açmacasına, açmamacasına" but I highly doubt any mortal could possibly identify all possible combinations. Displaying the aorist form only, with a note indicating their productivity, and maybe a link to that suffix's lemma page, which hopefully one day comes to be and shows at least a good portion of these combinations and the meanings they convey and, if one can be so bold to ask, one or two example sentences while one's at it, would maintain the table's structural integrity and still be helpful, even if the suffix pages that don't exist yet aren't made in the near future. A suffix page of this comprehensivity is a ton of work, though. I tried to put something together for "-sa" a while back, which is in dire need of an update and some cleanup. That was painful.

The participle and adverbials could be outside the drop down, yeah. But the alternative forms of the imperative are good within, in my opinion. — Orexan (talk) 16:20, 13 June 2024 (UTC)Reply

@Orexan: I see and I agree with you. I have updated the module as well.

— Trimpulot (talk) 18:54, 13 June 2024 (UTC)Reply

First thoughts, good. It's very big, though, like Swahili. There are some parts that are grammatically correct that might be omitted to save space. I suggest putting some boolean constants near the top of the module to control behaviors we might change our minds about.

the -abilmek forms are not necessary (and the -ivermek forms should not be added)
omit passive imperative (probably requires a template argument), potential imperative, and maybe even formal and informal imperative
what about the -iş verbal noun form?
I suggest packing the impersonal participle and gerund/adverb forms into as few lines as possible even if that means omitting the less common forms or losing some of the labels ("impersonal participles | açan, açmış, açacak")

A minor coding style issue, the initializers for local variables lv and hv should have line breaks in the same places. Vox Sciurorum (talk) 16:49, 14 June 2024 (UTC)Reply

I agree that -ivermek shouldn't be added, but omitting -ebilmek while -ememek is included is just asymmetrical
I see why you say to omit the potential imperative (as well as the impotential, I assume), but why the others as well, especially the informal and even more so the formal imperatives?
as I said before, I think -iş is past the boundary of what counts as conjugation and what doesn't
cramming all of the participle or adverb forms on the same line without any hint as to what distinguishes them from one another wouldn't really be helpful in my opinion

As for the line breaks, I just put all the vowel inputs that return the same vowel on the same line, that's the only reason for it being the way it is.

— Trimpulot (talk) 17:34, 14 June 2024 (UTC)Reply

@Vox Sciurorum: I have an idea on the last point: how about placing all impersonal participles or adverbs on the same line by default but separating them when the table is expanded? You can see the table like that here as the conjugation for açmak

— Trimpulot (talk) 09:09, 15 June 2024 (UTC)Reply

I guess this makes sense, as per an earlier comment of mine, the labels and even the conjugated forms don't mean much in a vacuum like this, so this setup makes sense at least from a design point of view. Orexan (talk) 15:10, 15 June 2024 (UTC)Reply

Names of people

[Thread moved from Tea Room]

Van Gogh

Vincent van Gogh, Dutch draughtsman and painter.

Monet

Claude Monet, French painter.

Picasso

Pablo Picasso (1881–1973), Spanish painter, best known as a founder of the Cubist movement.

Can anyone clarify upon what basis we have these entries and others similar? Mihia (talk) 19:18, 11 June 2024 (UTC)Reply

There are others, like Einstein. This issue has come up before. Personally, I do not think they comply with WT:CFI, and thus should not be in the dictionary. I think it is worth having a formal vote to clarify the wording in the CFI. — Sgconlaw (talk) 20:00, 11 June 2024 (UTC)Reply

I agree. CFI says about names "No individual person should be listed as a sense in any entry whose page title includes both a given name or diminutive and a family name or patronymic. For instance, Walter Elias Disney, the film producer and voice of Mickey Mouse, is not allowed a definition line at Walt Disney."

However, it says nothing about entries for individual persons under their family name only (or given name only, for that matter). This seems to be an omission, perhaps because there is no agreement.

(By the way, I also think that the "Walter Elias Disney" example introduces an unnecessary complication/distraction, being different from "Walt Disney". I think it would be clearer to use an example such as, let's say, Pete Tong, which does not have this complication.) Mihia (talk) 20:15, 11 June 2024 (UTC)Reply

@Mihia: I don't have strong feelings about the "Walt Disney" example, but have no objection if the example is changed as you suggest. — Sgconlaw (talk) 22:41, 11 June 2024 (UTC)Reply

I think it may also be worth taking this opportunity to clarify the following, which have also come up before:

Whether terms which are a combination of an honorific or title and a name are permitted, e.g., King Charles (meaning Charles III) and Queen Mum (meaning Queen Elizabeth The Queen Mother). I'm generally of the view that we shouldn't allow such terms, because we may then get entries like King Louis (Louis I, Louis II, Louis III, etc.) and Pope Leo (meaning Leo I, Leo II, Leo III, etc.). The relationship between such terms and nicknames (e.g., Brangelina), which I believe are generally thought acceptable, needs to be considered. Perhaps the rule should be that a term which is a combination of an honorific title and a name is generally not permitted unless it is a widely used nickname.
Whether senses which mean "a work by a person with the surname X" are allowed, e.g., Picasso (meaning "an artwork by Picasso") and Roy (meaning "a book by Arundhati Roy"). Again, I am not in favour of such senses because any surname can be used in this way.

— Sgconlaw (talk) 22:33, 11 June 2024 (UTC)Reply

Yes, based on the consensus in the spate of RFDs now at Talk:Michelangelo, uses of NAME to mean "a work by NAME" should not be included; if we can formalize this somewhere, all the better. Regarding the Walt Disney example, I would only add Pete Tong, but not remove Walt Disney: having Walt Disney as an example is useful for showing that you can't defend having someone's name just because you didn't enter their full name. - -sche (discuss) 00:49, 12 June 2024 (UTC)Reply

Right, I see what you mean. Mihia (talk) 09:25, 12 June 2024 (UTC)Reply

@Mihia, -sche: actually I realized that Pete Tong might be better as an example since we don't have Walt Disney as an entry at all. — Sgconlaw (talk) 12:50, 13 June 2024 (UTC)Reply

I do see -sche's point, though, that the Disney example does illustrate how the policy applies even if the article title is not the exact full name. Mihia (talk) 14:36, 13 June 2024 (UTC)Reply

Draft proposal

For discussion purposes, I've taken the liberty of drafting a proposed amendment to be inserted under "Wiktionary:Criteria for inclusion#Names of specific entities". — Sgconlaw (talk) 12:34, 13 June 2024 (UTC)Reply

Original text

However, policies exist for names of certain kinds of entities. In particular:

No individual person should be listed as a sense in any entry whose page title includes both a given name or diminutive and a family name or patronymic. For instance, Walter Elias Disney, the film producer and voice of Mickey Mouse, is not allowed a definition line at Walt Disney.

Proposed amended text

However, policies exist for names of certain kinds of entities. In particular:

Names of people are subject to the "People's names" section of this page.

People's names

In an entry consisting of both a given name or diminutive and a family name or patronymic, including a pseudonym, no individual person should be listed as a sense. For instance, at the entry Pete Tong, the following sense is not allowed: "Peter Michael Tong (born 1960), the English disc jockey." The entry Mark Twain is not allowed if its only sense is "The pen name of Samuel Langhorne Clemens (1835–1910), the American author". However, any figurative sense is allowed.
In a forename or surname entry:
- No individual person should be listed as someone having that forename or surname. For instance, at the entry Mariah the sense "Mariah Carey (born 1969), American singer" is not allowed, and at the entry Van Gogh the sense "Vincent van Gogh (1853–1890), Dutch draughtsman and painter" is not allowed.
- As a corollary, a sense meaning "a work by a person with the surname" is not allowed. For instance, at the entry Picasso, the following sense is not allowed: "An artwork by the Spanish artist Pablo Picasso (1881–1973)."
A nickname for a person, or two or more persons collectively, which is not their legal name, is allowed. For example, the entry Brangelina (defined as "The couple consisting of celebrities Brad Pitt and Angelina Jolie, together from 2005 to 2016") is allowed. Ye defined as "Kanye West, American rapper, songwriter, record producer, and fashion designer" is allowed, because it was a nickname before West legally adopted it as his name in 2021.
An entry consisting of an honorific or title and a name is not allowed unless it qualifies as a nickname as described above or has a figurative sense. For instance, Lord Byron (defined as "George Gordon Byron, 6th Baron Byron (1788–1824), the English poet") and Prince William (defined as "William, Prince of Wales (born 1982)") are not allowed. Prince Albert, meaning (among other things) a Prince Albert coat, is allowed.

@Sgconlaw: Would Jack the Ripper be deleted as a pseudonym? J3133 (talk) 13:02, 13 June 2024 (UTC)Reply

@J3133: my initial impression is no, because it does not consist of "both a given name or diminutive and a family name or patronymic". — Sgconlaw (talk) 13:06, 13 June 2024 (UTC)Reply

@Sgconlaw: I assume you do not mean we could have anyone’s pseudonym as long as there is no family name included. J3133 (talk) 13:10, 13 June 2024 (UTC)Reply

@J3133: Yes in general, but I haven't given full thought to this point. I think we would want to extend the general forename + surname rule to pseudonyms (perhaps including names like Cardi B and Malcolm X which are in the same format), but if a pseudonym is only a single word it comes close to becoming (or may be indistinguishable from) a nickname, in which case there may be consensus for including such names. — Sgconlaw (talk) 13:34, 13 June 2024 (UTC)Reply

I suggest the nickname portion of the proposal be amended to explicitly disallow stage names and assumed names. As it stands, the proposal as written would technically allow entries for Malcolm X, The Rock, Grimes, Pink, etc. None of those monikers are legal names. But they go beyond being simply nicknames. They're how those individuals identify and are identified publicly. The nickname policy was designed to allow for informal/colloquial nicknames for people. E.g. King of Pop for Michael Jackson, RPattz for Robert Pattinson, Elongated Muskrat for Elon Musk, or Maggie for Margaret Thatcher. Those entries have lexical value that entries based on stage names don't. Someone seeing a celeb news headline like "RPattz to play Dark Knight" might not think to punch "RPattz" into Wikipedia. Whether readers can easily connect a nickname to its bearer via WP depends on whether there's a redirect or disambiguation page. Alternatively, someone is unlikely to encounter a headline like "Elon Musk and Claire Boucher split." Everyone knows her as "Grimes." It's what her Wikipedia entry is titled. There'd be no benefit in having a definition for her at Grimes. WordyAndNerdy (talk) 21:08, 13 June 2024 (UTC)Reply

Hmm... to me, someone seeing "RPattz to play Dark Knight" (or seeing "RPattz" anywhere else) and thinking "I should look that up in a dictionary" seems even less plausible, vs. them thinking to google it or thinking Wikipedia might have a redirect from that to the article on whoever it is. No? I mean, if I'm not going to find out what "Slipknot to appear in new John Wick" or "Grimes and Pink to appear in Barbie sequel" means from a dictionary, and I'm not going to find out what/who Margot Robbie ("Margot Robbie to reprise Barbie role", etc.) is from a dictionary, why would I expect to find out about RPattz from a dictionary? What is the rationale for having RPattz in a dictionary, and not having Grimes, Pink, Slipknot and Margot Robbie? - -sche (discuss) 21:37, 13 June 2024 (UTC)Reply

@-sche You're right, but people do tend to click on things that pop up on Google search results, which is why Urban Dictionary is so successful. Theknightwho (talk) 22:34, 13 June 2024 (UTC)Reply

RPattz is a proper noun that's used exclusively as informal slang. Margot Robbie, Slipknot, and Grimes are the "official" names (legal and self-styled) of various entities. Slang is something a descriptive dictionary should aim to document. Proper nouns like Margot Robbie, Slipknot, etc. are best left to the encyclopedia side, where they can be covered with the depth and detail afforded by biographical articles. The purpose of the RPattz entry is to tell readers this term means "Robert Pattinson, British actor," while the goal of RPattz's Wikipedia entry is to tell you where he was born, how many siblings he has, his first acting job, etc. WordyAndNerdy (talk) 23:03, 13 June 2024 (UTC)Reply

I have never heard of either "RPattz" or "Grimes". I would have no reason to imagine that I could look up the former in Wiktionary but not the latter. Mihia (talk) 23:31, 13 June 2024 (UTC)Reply

This is one of those areas where someone's individual knowledge base seems likely to inform their perspective in nuanced and hard-to-pin-down ways. Regional variations in English, differences between native speakers vs. proficient secondary speakers, generational differences, differences in interests and subcultures (follows celeb news vs. doesn't). I don't think there's a "right" or "wrong" answer to some of the questions being raised in this thread. I just think some approaches are generally more workable than others. More conducive toward hitting the sweet spot of a dictionary that's more inclusive and up-to-date than Oxford but vastly more serious and reliable than UD. WordyAndNerdy (talk) 23:57, 13 June 2024 (UTC)Reply

We certainly do not want to emulate the sea of crap that is UD. However, although it somewhat goes against my personal instincts, I do think it is at least worth considering allowing ALL proper names that meet some reasonable requirement of widespread mention sufficient to prevent a tidal wave of trivia. In this way we would avoid the need to make fine policy distinctions that might make sense to us at the time but are probably lost on ordinary users, such as listing "RPattz" but not "Grimes", or listing "Mona Lisa" because we can find references to "a Mona Lisa smile" but not "Barbara Streisand" because we can find references to "a Barbara Streisand nose", or whatever it might be -- and also avoid the need to be perpetually debating these distinctions. If you asked me, or had asked me, I would say that every single tiny place name definitely was not dictionary material, and yet that policy was agreed. If we can have every tiny place name, then why not also "Grimes", "Monet" and the rest of them? What is the difference, essentially? They are no more or less encyclopedic than the place names, in my opinion. Mihia (talk) 11:50, 14 June 2024 (UTC)Reply

@Sgconlaw: I have a couple of comments on your proposed text.

I wonder whether allowing nicknames, with no further restriction, could open the door to some potentially unwanted entries. Strictly speaking, as the text stands, there seems nothing to prevent me from adding an entry for my mate nicknamed "Bagger". I wonder whether we want unrestricted coverage even for well-known people. I was going to give the example "Giggsy", which is a fairly trivial nickname for a footballer called Ryan Giggs, as something that we wouldn't want to include, but now I see that we actually already DO have this entry! I guess someone thought it was suitable for inclusion.

To be doubly clear, I wonder whether we could explicitly mention that stage names are excluded as pseudonyms.

You mention the exclusion of "a work by a person with the surname"; I wonder if at the same time we should consider making some exclusions as to what does not count as "figurative" use. In my opinion, the following are all candidates for exclusion (in fact, these apply to other proper nouns as well as to people). It seems to be possible to find examples of these for almost anyone/anything that one has heard of, or certainly anyone famous.

"like X", referring to some characteristic of X.
"the X of Y", e.g. "The Ronald Reagan of liberalism".
"do a X", "pull a X", referring to some behaviour associated with X, e.g. "do a Ronald Reagan".
"an X moment", e.g. "a Ronald Reagan moment".

By the way, would it be appropriate to move this discussion to the Beer Parlour, as it concerns general policy? Mihia (talk) 17:59, 13 June 2024 (UTC)Reply

@Mihia: yes, by all means relocate the discussion to the Beer Parlour, and we can continue it there. — Sgconlaw (talk) 18:05, 13 June 2024 (UTC)Reply

Sorry, just one other point that occurred to me. Would it be simpler/shorter to specify under what circumstances definitions that consist ONLY of a real person actually ARE allowed, rather than listing the exclusions, which seem to cover most cases? Mihia (talk) 18:37, 13 June 2024 (UTC)Reply

@Mihia:

I seem to recall from previous discussions that there seems to be a consensus that nicknames are generally allowed, though it seems that this isn't reflected in the CFI. I don't think entries for nicknames of people's random friends will be an issue—it's almost certain that such entries won't pass the verifiability standard.
I'm not clear what you mean by "whether we could explicitly mention that stage names are excluded as pseudonyms". Are you suggesting that stage names should or should not be allowed as entries? (I assume the latter?)
Yes, I think it is a good idea to clarify what counts as a figurative use. Feel free to work that into the draft.
Personally, I think it is clearer to specify in the policy both what is allowed and what isn't, otherwise later on we may be in a difficult position of trying to discern what the applicable rule is from the silence of the text. But maybe it would be clearer to specify what is allowed first, followed by what is therefore not allowed.

— Sgconlaw (talk) 18:45, 13 June 2024 (UTC)Reply

Yeah, for better or worse, even obvious abbreviations of first+last names have tended to be included (Talk:RPattz, Talk:JBiebs), although if enough people comment here we might get a sense of whether there's appetite to reconsider that. I think our CFI are bizarre when it comes to what names we do vs don't include. Why is it considered that I need to know which specific person Giggsy is, but not which specific person Dua is? Last I checked, I was only able to find 2-3 people with the name Dua (and our current presentation of it as an Albanian female given name fails to reflect that two of the 3 bearers got it from Arabic, and one is nonbinary) but perhaps more works have been digitized and the name is better attestable now. Do we include band names, e.g. Slipknot, Rammstein, Einstürzende Neubauten? It seems we do not, and that seems reasonable to me... but then why is Slipknot referring to a set of individuals not included, but Brangelina referring to a set of individuals is? (This is only the tip of the iceberg, consider e.g. fictional places' names.) For Prince William et al., cf. Talk:George VI.
I will opine that if a nickname is used for multiple individuals, and especially if it's productively applicable to e.g. everyone with the surname Giggs, it is probably better defined as "a nickname for people with the surname Giggs" [etc] rather than as "the nickname of [specific person], [specific other person], [specific third person], [specific fourth person], [specific fifth person], ...", similar to how we treat Ed. - -sche (discuss) 19:21, 13 June 2024 (UTC)Reply

Things would be a lot simpler if we decided either "No definitions at all are allowed that simply describe a proper noun -- go and look at Wikipedia for that" OR "Every proper noun (attestable to some minimum level) is allowed"! Mihia (talk) 19:32, 13 June 2024 (UTC)Reply

@Mihia That seems unhelpful at best: we document terms, whereas Wikipedia documents the referents for those terms. Excluding terms that describe a certain class of referent because people might be looking for information about that referent would lead to us excluding English tree because the Wikipedia article Tree exists. Obviously that's a silly example, but it underlines the point that it's not sound logic to be basing policy on. If you don't care about proper nouns that's fine, but quite clearly many users and editors do. Theknightwho (talk) 22:26, 13 June 2024 (UTC)Reply

On the contrary, I believe that it is an EXCELLENT idea to choose one or the other. Judging by your last sentence, you seem to have missed my second option. Mihia (talk) 22:53, 13 June 2024 (UTC)Reply

@Mihia It's only an excellent idea if you prioritise swift policy decisions over anything else, but I don't think it would even achieve that: any set of rules always raises questions about what does and does not qualify, unless it is infinitely permissive or restrictive, but neither of those stances would improve the dictionary, in my view. Theknightwho (talk) 16:10, 14 June 2024 (UTC)Reply

The problem with this is that not every language/variety has the privilege of having a Wikipedia. Even if they do have a Wikipedia, that Wikipedia might be prescriptive. For example, the official Persian word for Malaysia is مالزی (malezi) in Iran and مالیزیا (malīziyā) in Afghanistan (those are the respective terms used by news agencies in both countries). However, Persian Wikipedia is extremely prescriptive and considers standard Iranian Persian "correct" and standard Dari "wrong". Mentioning that the country is called مالیزیا (malīziyā) in Afghanistan is actually not even allowed and would be reverted. So it's not as though we can implement a hard rule that says "go look to Wikipedia for proper nouns" because in some cases, the only place it can be documented is on Wiktionary!! — BABR (talk) 02:44, 14 June 2024 (UTC)Reply

Brangelina is an informal nickname for a celebrity couple used in the media and colloquial speech. Celebrity couples generally don't present themselves to the public by such monikers in the same way bands collectively identify as Radiohead or Slipknot. Official band names can be treated like stage names. They're names that individuals have chosen for themselves and thus seemingly fall outside our scope. Whereas informal/colloquial nicknames fall under the umbrella of documenting language as it exists. We have Fab Four (informal nickname), but Beatles should probably only exist as a plural of Beatle. This is a complex and somewhat subjective line to draw. Which is why I think CFI should ideally leave room for case-by-case considerations. Nailing down hard and detailed rules about what is and isn't inclusion-worthy in this area might create more headaches than it resolves. WordyAndNerdy (talk) 21:59, 13 June 2024 (UTC)Reply

Nevertheless, the present situation is a mess, whereby there are perpetual case-by-case arguments. Mihia (talk) 22:20, 13 June 2024 (UTC)Reply

I generally agree that having clear and consistent policy is preferable to vague (and often unwritten) rules. But in this particular case I'm not sure that exhaustively itemizing what's includable would be an improvement. Would the clarity make for swifter resolution to discussions, or would it create new opportunities for bickering? I just don't see heated disagreements erupting over whether "a Monet" used in reference to an individual work of art is sufficiently figurative to warrant inclusion (it is, IMO) outside a call to explicitly disallow such terms. People often remain indifferent to policy considerations until their hard work is on the chopping block. Which is the main reason I've tried to take an inclusionist approach. People gauge the relevance of language to Wiktionary's mission differently. I've never seen the relevance of taxonomical names. But clearly a number of Wiktionarians do and have put in good work in that area. WordyAndNerdy (talk) 23:32, 13 June 2024 (UTC)Reply

1. Yes, good point about verification.

2. I think it would be helpful to mention that "pseudonyms" includes stage names, if that is indeed the intention (or not, if that is the case, I guess). I mention this because "pseudonyms" can sound more "literary".

4. It seems to me that the silence of the text is more likely to be an issue IF we try to explain both what is allowed and what is not, since "almost inevitably" some case will later arise that is not mentioned at all. If we were to say "these are the only cases when people are allowed as definitions, and everything else is excluded" there can't be any room for doubt. Of course, anything can be challenged later if it transpires that something important has been overlooked. Mihia (talk) 19:27, 13 June 2024 (UTC)Reply

What about foreign renderings of names? Such as 忽必烈 (Hūbìliè) and Hốt Tất Liệt for Kublai Khan. MuDavid 栘𩿠 (talk) 01:36, 14 June 2024 (UTC)Reply

@Sgconlaw Re: the draft proposal above and adding to what MuDavid brought up here - the proposal as it stands seems to fall short in the case of borrowed names of specific individuals in corpus languages. For example, we have zero evidence that 𐌰𐌻𐌰𐌹𐌺𐍃𐌰𐌽𐌳𐍂𐌿𐍃 (alaiksandrus) was a given name in Gothic; it is "encyclopedic" content in that sense, as it seems to refer only to an individual. Yet it is valuable to include, because names such as this constitute valuable linguistic and onomastic evidence in otherwise poorly attested languages. Another similar case is Old High German Ōtacher, which is very valuable evidence from a philological standpoint, but which again refers only to a specific individual, without attested use as a given name afaik. — Mnemosientje (t · c) 12:23, 14 June 2024 (UTC)Reply

I don't think it is the function of principal namespace to be a repository of unattested terms whose only justification is their possible value to linguists. DCDuring (talk) 12:54, 14 June 2024 (UTC)Reply

These are not unattested. As neither of these is "an entry consisting of both a given name or diminutive and a family name or patronymic", I don't see how they would fall under the exclusions outlined in the proposal. Including such terms in the case of extinct languages with a relatively closed corpus seems clearly preferable to me.--Urszag (talk) 13:03, 14 June 2024 (UTC)Reply

You are, of course, as right about attestation as I was wrong. I really don't think that principal namespace should have entries for terms whose main justification is the convenience of linguistic researchers, that doesn't meet our standards for inclusion for all languages. I believe that there is nothing that prevents the use of names in etymologies. Whether we would want to have Appendices of such items is a separate question. DCDuring (talk) 14:33, 14 June 2024 (UTC)Reply

Gothic and Old High German are extinct and are Limited Documentation Languages, so unless otherwise excluded, terms meet criteria for inclusion if they are attested by one use in a contemporaneous source or one mention in a source accepted by the community of editors for that language. I don't see a reason to have a stricter policy for proper names of the type Mnemosientje mentioned than for other terms.--Urszag (talk) 14:43, 14 June 2024 (UTC)Reply

@DCDuring whose main justification is the convenience of linguistic researchers Who else do you think is interested in entries on Gothic or Old High German at all? Theknightwho (talk) 22:07, 14 June 2024 (UTC)Reply

Support. I don't feel strongly about the details but I think reducing ambiguity is always a good idea. Ioaxxere (talk) 08:31, 23 June 2024 (UTC)Reply

Jatki and Western Punjabi

First off, I believe Jatki (i.e. the Lahnda dialects of Jhangli, Shahpuri and Dhanni) need to be given their own language code under Lahnda. Currently Jatki entries have to be put as dialectal Punjabi, which doesn't make sense as all the other Lahnda dialects (Saraiki, Pahari-Potwari, Northern and Southern Hindko) get their own language codes.

Secondly, there is an issue where Punjabi (the Wiktionary sense) is not exactly Punjabi anymore. Because although half or more of Lahnda speakers (of Jatki and Pothwari particularly, up to 50 million people) call their language Punjabi, Punjabi of the Wiktionary sense only includes the Eastern dialects (Majhi, Doabi, Malwai, Puadhi).

So I have a wild suggestion; rename Punjabi as it is now to Eastern Punjabi. (I know this would have a tonnn of complications, just a suggestion :D)

Assuming this did happen, it kind of brings up another problem, because "Eastern Punjabi" does not correlate with "Eastern Punjab" (the Punjab state of India), which could cause confusion. Majhi (the taken standard and central dialect of Punjabi) is an eastern dialect and shares its grammar with other eastern dialects. However, the majority of its speakers are from Western Punjab (Pakistan).

Thoughts? OblivionKhorasan (talk) 14:21, 14 June 2024 (UTC)Reply

English pronunciation module

I am soliciting comments for a possible English pronunciation module. I originally thought of doing this using English-style respelling but it occurs to me it may be too complicated to do it this way. For comparison, I wrote a German pronunciation module that uses respelling based on standard German spelling conventions and is mostly finished; it runs to 2400+ lines and supports only a single dialect (the prescribed one with /ɛ:/ for long ä). You can see testcases (lots and lots of them) here, here and here. So I'm thinking of reusing something similar to enPR notation, i.e. something that can map fairly directly onto phonemes but abstracts out the dialectal differences as much as possible. It would be pan-dialectal as much as possible, at least across conservative GenAm (i.e. without the cot-caught and merry-marry-Mary mergers) and RP, so that if a distinction is made in either dialect it needs its own symbol. But it would also support giving separate per-dialect respellings to handle one-off differences like in controversy and advertisement. Does this make sense to people? What do people think of "augmented enPR" as a notation?

BTW by "augmented enPR" I mean enPR with some additional symbols. For example, enPR calls for writing short o in cot as ŏ and au in caught as ô, but cases where short o is pronounced like au in GenAm (the lot-cloth split, as in dog, long, moth, coffee, chocolate, etc.) would need an additional symbol, maybe ŏ*. Similarly for the RP trap-bath split, where affected words (class but not crass, path but not math etc.) would need an additional symbol, maybe ă*. And probably similarly for the weak vowel merger, because (I think) some unstressed /ɪ/ vowels do not turn into a schwa in GenAm (although I can't say which ones other than bring up the canonical minimal pair Rosa's ~ roses). Ideally in this augmented enPR notation people would write hw in words like which and whale, and ōr in words like hoarse and borne that are distinct from horse and born in accents without the horse-hoarse merger; although in practice the latter might be hard to get right as I'm not sure which dictionaries still notate the distinction. (Update: The Longman Pronunciation Dictionary does indicate this distinction for GenAm, as a secondary pronunciation in the cases where the hoarse vowel can exist. For example, force writes the RP pronunciation as only /fɔːs/ but the GenAm pronunciation as primary /fɔːrs/, secondary /foʊrs/.)
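To make the "augmented enPR" idea a bit more concrete, here is a minimal sketch of what a symbol table could look like. It is written in Python purely for illustration (an actual Wiktionary module would be Lua), it covers only the handful of vowels mentioned above, and the names `VOWELS` and `vowel_to_ipa` are hypothetical. The symbols ŏ* and ă* are the tentative ones floated above, and the IPA values are conventional textbook ones, not a settled proposal.

```python
import unicodedata

# Hypothetical table: augmented-enPR vowel symbol -> IPA per accent.
# Only the vowels discussed above; values are illustrative assumptions.
_RAW = {
    "ŏ":  {"RP": "ɒ",  "GenAm": "ɑ"},   # cot
    "ô":  {"RP": "ɔː", "GenAm": "ɔ"},   # caught
    "ŏ*": {"RP": "ɒ",  "GenAm": "ɔ"},   # lot-cloth split: dog, long, moth
    "ă":  {"RP": "æ",  "GenAm": "æ"},   # crass, math
    "ă*": {"RP": "ɑː", "GenAm": "æ"},   # trap-bath split: class, path
}

# Normalise keys so precomposed and combining-diacritic input both match.
VOWELS = {unicodedata.normalize("NFC", k): v for k, v in _RAW.items()}


def vowel_to_ipa(symbol: str, accent: str) -> str:
    """Look up one vowel symbol for one accent; raise on unknown input."""
    key = unicodedata.normalize("NFC", symbol)
    try:
        return VOWELS[key][accent]
    except KeyError:
        raise ValueError(f"unknown symbol/accent: {symbol!r}/{accent!r}") from None
```

The point of the extra starred symbols is that a single respelling can then yield different vowels per accent: `vowel_to_ipa("ŏ*", "RP")` and `vowel_to_ipa("ŏ", "RP")` agree, while the GenAm values differ.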

Probably we'd have to manually put spaces or hyphens at all syllable boundaries as this is hard to do automatically, although possibly there could be defaults.

There would have to be parameters for the supported dialects so you can specify different pronuns for each (or some subset) as needed, but it might also make sense to have a way of adding pronunciations with arbitrary accent labels.

I might ditch the standard primary and secondary stress symbols that go after the syllable in question (rather than before as in IPA), or at least let you also use IPA-style symbols that go before, as well as probably acute and grave accents that go on the stressed vowel. (The latter would result in lots of double-accented vowels but most modern fonts support them reasonably well, and at least on my Mac using the ABC-Extended keyboard layout, it's easy to type acute and grave accents but harder to enter IPA or enPR stress marks.)

Thoughts? Benwing2 (talk) 03:46, 16 June 2024 (UTC)Reply

An excellent starting point is the diaphonemes listed here. One will need a distinct way to represent each of them. Nicodene (talk) 07:16, 16 June 2024 (UTC)Reply

@Nicodene That is quite a table. I won't be starting off with anywhere near the coverage of dialects given here; probably just traditional GenAm (w/o cot-caught and Mary-marry-merry), "new" GenAm (w/cot-caught and Mary-marry-merry), and RP, maybe also GenAus. Benwing2 (talk) 08:16, 16 June 2024 (UTC)Reply

Still usable for reference, whatever dialects one chooses to include. Nicodene (talk) 07:10, 17 June 2024 (UTC)Reply

Support. My only request is that you consider generating reconstructed Early Modern English pronunciations (see w:Shakespeare in Original Pronunciation). For example, the Oxford Dictionary of Original Shakespearean Pronunciation glosses knight as /(k)nǝɪt/, although I would prefer /(k)nǝɪ(x)t/ to reflect the fact that it existed in other EME dialects (and in some cases unexpectedly shifted to /f/, e.g. thruff) [9]. Simon Roper's videos are also an invaluable resource. By the way, I thought @Theknightwho was working on this too? Ioaxxere (talk) 16:30, 16 June 2024 (UTC)Reply

@Ioaxxere I definitely don't want to step on User:Theknightwho's toes but they said they wouldn't be getting to this for a while. The approach in their prototype was quite different, using English-based respelling and a whole bunch of rules taken I think from a Git package for text-to-speech (which were maybe RP-specific?) to convert to IPA. Benwing2 (talk) 19:02, 16 June 2024 (UTC)Reply

I'm of two minds about how to handle widespread mergers (especially the horse-hoarse merger): on one hand I support notating what the pre-horse-hoarse merger pronunciation was (and very much support notating what the non-wine-whine and non-Mary-etc merger pronunciations are), and indeed I like the idea of making it easier to notate the full phonological history, mentioning what the Early Modern pronunciation was, what the pre-pane-pain merger sound was, etc. On the other hand, if no or few people make a particular distinction anymore... and we expect the single required 'main' input to make that distinction... people won't make the distinction correctly. Realistically, they'll be notating a hoarse word and it'll be a 50/50 chance whether they look at and copy the notation of a hoarse vs a horse word, because they don't realize there's any difference between those (because there isn't any difference, for any of the major modern national standards, nor most of the subnational dialects, AFAICT). So, it might be safer to have the 'main' input be horse-hoarse-merging, and require the horse-hoarse-distinction sound to be input as a separate value? This means the extra horse-hoarse-distinguishing line will be missing most of the time (like at present), but perhaps that's better than it being wrong much of the time(?). But I concede that there's only so far we could go in that direction: if a speaker doesn't make the Mary-marry-merry or wine-whine distinction or the trap-bath split, they'll likewise just use whatever sounds right in their dialect without realizing they were supposed to make a distinction for the sake of some other dialect, and yet I understand the desire to have the 'main' input make the distinction... and for things like cot-caught I fully agree the main input should make the distinction (since it's still the norm AFAICT) even though this does mean people who merge the sounds will indeed sometimes notate the wrong sound (e.g. [10]).
Other than that, I'll just observe that if one input generates multiple outputs, e.g. both US and UK, then an American adding an American pronunciation may not realize if/that the auto-generated British pronunciation is wrong, and vice versa; maybe we could provide a parameter so that at least conscientious users (if not blithe ones) could add "foobar|USonly=1" (or whatever) so only the US pronunciation they could vouch for was generated, and then the entry went into some maintenance category so a Briton could check whether "foobar" also generated the correct British pronunciation and then remove the "USonly=1"...? IDK. PS I hope there's a key mapping the notation to IPA; I have to look at a key whenever I need to figure out what some enPR is intended to be.😅😂 - -sche (discuss) 22:14, 16 June 2024 (UTC)Reply

@-sche What you say makes sense and I was thinking of adding parameters to allow arbitrary pronunciations to be input (either using enPR or whatever respelling or direct IPA) with an accent qualifier added to indicate which accent would be involved; so possibly the horse-hoarse distinction could be handled that way. Take a look at {{pt-IPA}}; the way it handles multiple accents is similar to what I was thinking of doing here (except it doesn't provide parameters to input arbitrary accents). Basically, if you put a pronunciation in |1=, it applies everywhere unless you override a particular accent using e.g. |us=, |uk=, |rp=, etc.; but if you just put a pronunciation in e.g. |us=, it applies only to that accent or set of accents (depending on the parameter), and all the others are considered unspecified and don't display. As for enPR, we could have people input some diaphonemic version of IPA like is used in Wikipedia, but I would be concerned that people would have difficulty using it correctly and would tend to input whatever IPA they felt like inputting, leading to an inconsistent mess just like we have now. The advantage of enPR or English respelling is that it is a clear abstraction layer separate from IPA and doesn't allow as much flexibility, reducing the likelihood of inconsistency. And yes I'd definitely provide a key indicating how the enPR symbols map to IPA in different accents.
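The override behaviour described above can be sketched in a few lines. This is only an illustration in Python (the real thing would be a Lua module), and the accent list and the function name `resolve` are hypothetical, chosen for the example: a value in `1` applies to every accent unless a specific accent parameter overrides it, and if only an accent parameter is given, the other accents stay unspecified and are not displayed.

```python
from typing import Dict, Optional

# Hypothetical accent parameters; the real set is still under discussion.
ACCENTS = ("us", "uk")


def resolve(params: Dict[str, str]) -> Dict[str, str]:
    """Return accent -> respelling for the accents that should display."""
    default: Optional[str] = params.get("1")
    shown: Dict[str, str] = {}
    for accent in ACCENTS:
        # An accent-specific value wins; otherwise fall back to |1=.
        value = params.get(accent, default)
        if value is not None:
            shown[accent] = value  # accents left unspecified are omitted
    return shown
```

With placeholder respellings `A`, `B` and `C`: `resolve({"1": "A"})` gives both accents the value `A`; `resolve({"1": "A", "uk": "B"})` keeps `A` for `us` and uses `B` for `uk`; and `resolve({"us": "C"})` displays only the `us` line, which also covers the "USonly" case raised earlier in the thread.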

Another possibility of dealing with the horse-hoarse issue is to provide different notations to indicate "the horse sound", "the hoarse sound" and "the merged horse-hoarse sound". For example, hōrs "hoarse" vs. hôrs "horse" vs. hors (merged horse-hoarse). That way someone who doesn't know the difference could at least avoid being wrong, and in that case the module would only generate the merged version and not the unmerged version. (Maybe the same thing could be done with the cot-caught distinction, which is very unpredictable for words spelled with o. I don't know.)

I also think we might have to have flapping indicated explicitly, or at least have symbols to override whatever the default rules are for deciding whether a given t is flapped. There's no way, for example, the module could automatically know that capitalistic has a flapped t but militaristic doesn't. (Unless maybe it goes by whether the t is placed in the preceding or following syllable? Hence kằp-ĭt-əl-ĭ́st-ĭk vs. mĭ̀l-ĭ-tər-ĭ́st-ĭk?) Another similar case is with so-called "Canadian raising" of /aɪ/, which IMO should definitely be shown (since it's probably by now the majority pronunciation in the US?) and which has unpredictable exceptions, like spider and tiger (at least for me, where tiger has "Canadian raising" but taiga doesn't). Benwing2 (talk) 22:42, 16 June 2024 (UTC)Reply

@-sche Please take a look at User:Benwing2/enPR-table. This is my attempt so far at coming up with a list of enPR-style symbols for vowels and their mapping in three accents: RP, "traditional" GenAm and "merged" GenAm. It's not complete (but getting there), and there are certainly mistakes in the table as well as places needing further discussion. There's a column for GenAus but it's so far not filled in. Note that in some cases there are two possible symbols, particularly before r that is not followed by a vowel: a more expressed symbol (i.e. with more diacritics) and a less expressed one, corresponding to the fact that in this context there are a reduced set of possibilities. The two symbols would be equivalent. Benwing2 (talk) 05:30, 17 June 2024 (UTC)Reply

Re having three symbols, for "horse", "hoarse", and "merged horse-hoarse", I suppose the usefulness of that depends on whether we think the average person adding a pronunciation is more likely to look up the documentation page where we can spell out "if you have the same sound in horse and hoarse and don't know which of those originally-distinct classes a word is from, just use notation X; if you do know which class the word is from (consult Longman's, Dictionary.com, the old 1930s OED, [etc other references]), use Y for horse or Z for hoarse" — in which case doing so is sensible — or if they are more likely to just mimic what they see in other entries, e.g. if I know court sounds like horse or hoarse to me (just with h->k and s->t), maybe I just go to [flip a coin: one or the other of those entries] and copy what's there, changing h->k and s->t, in which case it's a coin flip as to whether I've used the right notation.
It also occurs to me that another thing people might do if we use enPR-like notation is just copy the enPR-like notation of the AHD, MW, Dictionary.com, old gazetteers, etc (and if they don't know IPA and the pronunciations used in all national dialects we're outputting, never notice if that causes wrong IPA to be output) . . . but there may be no intelligible notation system which would avoid that problem, since using IPA we equally have people who copy IPA from places without understanding whether it makes sense, e.g. blithely putting length marks and /ɒ/ in GenAm, using /r/, etc.
If we deploy the template semi-manually, not just bot-converting IPA to it, I suppose we could aspire to manually check and correctly input the horse-vs-hoarse class of words as we went along (and the wine vs whine class, etc) and then just ... maybe try to track new additions with an edit filter or something to ensure they were right? And in that case, just having the one main input make the horse-hoarse and wine-whine etc distinctions would indeed be less effort than having a separate w=wh / w=w or hh=hors vs hh=hoars (or whatever).
Regarding Canadian raising and /ʌɪ/: is this phonemically contrastive with /aɪ/? (AFAIK the contrast between writer and rider is viewed as being phonemically /t/ vs /d/?) If it's not contrastive, I would suggest leaving it as a [narrow bracket] thing (and might not consider it important to require the 'main' template input to distinguish it, though if displaying the [phonetic] difference can be done automatically and/or with simple added symbols like your +/- idea, great). Likewise, I would consider not requiring flapping to be indicated in the input (if it's not phonemically contrastive and isn't present at all in one of the major dialects our inputters will be coming from), but if it too can be accomplished by add-ons like you suggest, great. I will note that using hyphen-minus, while it has the appealing advantage of being intuitive, has the disadvantage that it'll cause unexpected behavior/interpretations when people retain orthographic hyphens when inputting the pronunciation of e.g. sky-high and hit-and-run, if the template takes ī- / t- to be signalling something about raising or flapping but in fact the inputter just meant it to signal "there's a hyphen here". (OTOH, if what the template displays in response to that hyphen is nonetheless correct — if sky-high-type words indeed don't raise and hit-and-run-type words don't flap — then I suppose it doesn't matter, hah). - -sche (discuss) 20:16, 17 June 2024 (UTC)Reply

@-sche Hmm, your point about hyphens is a possible issue, as I was thinking of using hyphens to separate syllables. Maybe instead I will use dot (.), which is also intuitive. As for whether /ʌɪ/ is phonemically contrastive, aside from cases like writer vs. rider, there are near-minimal pairs at least in my dialect of spider /ʌɪ/ vs. spied-her /aɪ/, tiger /ʌɪ/ vs. taiga /aɪ/, high school "secondary school" /ʌɪ/ vs. high school "a school that is high (e.g. in elevation)" /aɪ/, etc. I don't know if those pairs are universal, but I think at least the spider and high school exceptions are pretty standard. My thought was that the template would have a default rule "use /ʌɪ/ before an unvoiced sound, /aɪ/ otherwise" that would work in the large majority of cases, so the cases needing a specific ī+ or ī- override would be fairly rare. Similarly for flapping, the rule might be something like "syllable ends in vowel + t or rt and the next syllable is unstressed and begins with a vowel", which should work in the majority of the cases provided people put the t in the right place (which of course isn't guaranteed, but as you've shown, it's difficult to make something foolproof).
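As a rough sketch of the raising default with overrides (Python for illustration only; a real module would be Lua): the function below picks /ʌɪ/ before a voiceless sound and /aɪ/ otherwise, unless an override is given. Two things here are assumptions rather than anything settled above: that `+` forces raising and `-` blocks it, and the deliberately small `VOICELESS` inventory. The name `price_vowel` is hypothetical.

```python
from typing import Optional

# Simplified, assumed inventory of voiceless consonant respellings.
VOICELESS = {"p", "t", "k", "f", "s", "ch", "sh", "th"}


def price_vowel(next_sound: Optional[str], override: str = "") -> str:
    """Choose the IPA for enPR ī given the following sound.

    override: "+" forces the raised vowel, "-" blocks raising
    (assumed meanings), "" applies the default rule.
    """
    if override == "+":
        return "ʌɪ"
    if override == "-":
        return "aɪ"
    # Default: raised before a voiceless sound, unraised otherwise
    # (including word-finally, where next_sound is None).
    return "ʌɪ" if next_sound in VOICELESS else "aɪ"
```

Under these assumptions writer (ī before t) and rider (ī before d) come out right by default, while the exceptions mentioned above need an explicit mark: spider would be entered with ī+ because the default before d is unraised, and the literal "school that is high" reading would take ī- because the default before s is raised.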

As for trying to catch people misusing the template, I think that IPA is very easy to misuse (as you've given examples of) and hopefully the use of enPR will be a little less so; at least, I was thinking of having the code check for erroneous usages and throw errors in those cases to make it more likely they get fixed. Examples of erroneous usages would be omitting syllable breaks (there should never be more than one vowel in a syllable), using an unmarked vowel other than in the particular cases where it's allowed, putting two of the same consonant in a row, etc.
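Two of those checks are easy to sketch. The following is a minimal illustration in Python (a real module would be Lua), assuming dot-separated syllables and input containing only letters, diacritics and dots. The helper names are hypothetical, a run of adjacent vowel letters is counted as one vowel so that digraphs are not flagged, and the doubled-consonant check looks only within a syllable. The unmarked-vowel check is left out because it depends on which contexts end up being allowed.

```python
import unicodedata
from typing import List

VOWEL_LETTERS = set("aeiouə")


def _base_letters(text: str) -> str:
    """Strip combining diacritics so ĭ, ī, ô etc. reduce to plain letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def check_respelling(respelling: str) -> List[str]:
    """Return a list of error messages; an empty list means no problems found."""
    errors: List[str] = []
    for syl in _base_letters(respelling.lower()).split("."):
        # Count vowel nuclei: each maximal run of vowel letters counts once.
        nuclei = 0
        prev_was_vowel = False
        for ch in syl:
            is_vowel = ch in VOWEL_LETTERS
            if is_vowel and not prev_was_vowel:
                nuclei += 1
            prev_was_vowel = is_vowel
        if nuclei > 1:
            errors.append(f"more than one vowel in syllable {syl!r}")
        # Flag identical adjacent consonant letters within the syllable.
        for a, b in zip(syl, syl[1:]):
            if a == b and a.isalpha() and a not in VOWEL_LETTERS:
                errors.append(f"doubled consonant {a + b!r} in syllable {syl!r}")
    return errors
```

For example, `check_respelling("kăp.ĭt.əl.ĭst.ĭk")` returns an empty list, while `check_respelling("kăpĭt.əl")` reports the missing syllable break. A module would presumably throw an error rather than return a list, as described above.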

As for whether it makes sense to have a symbol for cases of mergers, I'm not sure what the right answer is here. If we do have such symbols, we can have cleanup categories for their use. If we don't, we can use the WT:Tracking mechanism to track cases where e.g. the horse and hoarse symbols are used, but I'm not sure how to "mark off" the ones we've checked other than e.g. to have a page somewhere containing a whitelist of terms that have been checked. Can you elaborate on how you think an edit filter would work? Benwing2 (talk) 20:53, 17 June 2024 (UTC)Reply

Dot for syllable breaks is intuitive. I've been worried people would use hyphen for things like hĭt-bī-pĭtch.ĭs — we were discussing a while ago the various unstandardized ways people indicate various kinds of word breaks in the hyphenation template — so I was thinking if we used a different symbol than - for indicating flapping / (non)raising (e.g. use t^ or something), then if people do use - to mean "there's a hyphen here", the template can easily flag it as something to clean up, whereas if the template expects t- as valid input, I was worrying it'd be harder for it to know whether a given instance is right, but your proposed checks against two vowels etc sound like they'd catch any problems. (I may be wrong to think people will input hĭt-bī-pĭtch.ĭs, anyway.)
Re raising, hopefully more people can weigh in; for my part I would rather be conservative and wait until I see more literature referring to it as phonemic rather than allophonic (AFAICT it is near-uniformly referred to as allophonic), before moving it out of [brackets] and into /phonemic/ status. (The various near-minimal pairs I see mentioned all seem to be said to exist for only some speakers and dialects; besides spider and tiger I also see hire vs higher mentioned as a pair some people distinguish, but not others.) BTW, on the subject of ʌɪ, the OED gives the British pronunciation of all these words, tiger, taiga, rider, writer, etc, as /ʌɪ/ (and the American pronunciation of them all as /aɪ/), although I suppose that's not RP.
Re an edit filter, I meant a filter could tag all new additions of the pronunciation template with a horse/hoarse vowel in the input, so people could manually review those additions to see if they were correct; this would be labor intensive / inefficient. I like your idea of having the third / merged symbol add a cleanup category; that's probably the best approach, though I think it only helps with aware/conscientious users (who know they're supposed to make the distinction, and can use the "merged" symbol if they're unsure), whereas I'm thinking about the users who don't realize they're supposed to make a distinction (but perhaps nothing can be done about them). - -sche (discuss) 04:04, 19 June 2024 (UTC)Reply

What you're saying about phonemic vs. allophonic of Canadian raising and flapping makes sense; I'll have them indicated as allophonic (but still provide symbols for inputting them if needed, since it's hard for the module to always get it right). BTW one possibility for notating the horse/hoarse distinction is to use similar +/- or whatever symbols rather than ör vs. ōr, so that e.g. you'd have merged hors vs. something like ho-rs "horse" and ho+rs "hoarse" (or maybe some other special characters); maybe that will make people more likely to look up the documentation and see that it's OK to write hors if you're not sure. (The idea is that the + and - additions will always indicate finer distinctions that can be left out in cases of doubt.) Dunno though. Also, I've been looking at sample words to come up with how I would structure the arguments to {{en-IPA}} or {{en-pr}}; the first two words I picked were tree and three and both of them have weirdly narrow IPA transcriptions added. IMO these *really* should not be there; e.g. I don't see how [t̠͡ɹ̠̊˔ʷɪi̯] possibly helps anyone. Benwing2 (talk) 04:46, 19 June 2024 (UTC)Reply

@Benwing2, FWIW, Canadian raising is at least borderline phonemic for me (writer and rider are a minimal pair in my idiolect, as are house [noun] and house [verb] (or house verb and how's)), but I pronounce spider and spied her identically and I don't raise the vowel in tiger. Andrew Sheedy (talk) 23:36, 25 June 2024 (UTC)Reply

@Andrew Sheedy Thanks! The writer ~ rider difference is widespread but often analyzed as underlyingly /ɹaɪtəɹ/ vs. /ɹaɪdəɹ/, where Canadian raising applies before voiceless sounds earlier than flapping applies. Similarly house (noun) would be /haʊs/ and house (verb) would be /haʊz/ (do you pronounce the final consonant differently in these two words?). Also there may be a difference between "Canadian raising" in Canada vs. "Canadian raising" in the US; certainly, there is no raising of /aʊ/ in the US, and I think the raising of spider is fairly widespread in the US although maybe not universal. (Strangely I do have raising of /aɹ/ in my speech, making Carter and carder sound different.) Benwing2 (talk) 23:54, 25 June 2024 (UTC)Reply

@Benwing2: Ah, yes, I remember hearing writer/rider analyzed that way. I do find the sounds quite distinct though—ice cream without the raising is one that always stands out to me as distinctly "American". As for house, I pronounce the noun /hʌʊs/ and the verb /hʌʊz/. So the verb house forms a minimal pair with how's, which I pronounce /haʊz/. (Likewise, houses is /hʌʊzəz/ and espouse is /ɛˈspʌʊz/. I didn't make the connection between louse and lousy until I was in my 20s, because to me the words had completely different vowels (/lʌʊs/ vs. /laʊzi/).) I think my having phonemic Canadian raising is fairly idiosyncratic. In a word like houses most people where I live either retain /s/ or lower the vowel to [aʊ]. I'm surprised to see our entry for spider list [ˈspʌɪ̯ɾə(ɹ)] as the Canadian pronunciation. I'll have to pay more attention, but I don't think that pronunciation is widespread in the Prairies. Andrew Sheedy (talk) 18:33, 26 June 2024 (UTC)Reply

@Andrew Sheedy Interesting. The Wikipedia article on Canadian raising talks about idiosyncratic raised [ʌɪ] in the words "tiny, spider, cider, tiger, dinosaur, cyber-, beside, idle (but sometimes not idol), and fire" as at least possible in certain East Coast and Midwest US accents. For me, tiny, cider and idle/idol can go both ways whereas spider and tiger are usually raised and the remainder not. It definitely seems like this is an idiosyncratic phonemic split in the process of happening. Benwing2 (talk) 18:53, 26 June 2024 (UTC)Reply

@Benwing2: It's interesting that I don't have that split. In my case, regular Canadian raising of /aʊ/ has simply been cemented as phonemic (I don't think that applies to /aɪ/ and /ʌɪ/ for me, though I perceive them as quite distinct in certain environments (but not others)). As far as I can tell, it only affects words that would have a different realization of /aʊ/ according to the form of the word. It's worth noting that our entry for dinosaur lists that pronunciation as following the "idle-idol split" (also mentioned in idle), which I've never heard of before. Andrew Sheedy (talk) 19:04, 26 June 2024 (UTC)Reply

@Andrew Sheedy Hmm, Googling for "idle-idol split" only brings up a few Wiktionary links and a couple of Reddit topics e.g. [11]. Sounds like something that a Wiktionary contributor might have made up. Benwing2 (talk) 19:24, 26 June 2024 (UTC)Reply

Numerals


Aside from the obvious numerical definitions, the pages on numbers like 3 and 4 include defs for one topic: indicating phonological tones in tonal languages. Is that a rule? Why include that and not other non-numerical things representable by numbers? Dewey decimal numbers, Hornbostel-Sachs numbers, Fujita scale, etc. Or are those things includable? I don't see anything relevant in WT:CFI and I have no interest in adding such defs, I was just wondering if this has been discussed. What's special about tonal markers? Mazzlebury (talk)

Probably because they are used in the transcription of words, e.g. in Jyutping. Voltaigne (talk) 13:57, 16 June 2024 (UTC)

Oh ok, that makes sense, it's more like a character, I get that. Mazzlebury (talk)

AWB request (Brainulator9)


I would like access to AutoWikiBrowser, for use in helping with doing things such as diffusing categories like Category:English terms prefixed with un-. I already have been approved for this tool on English Wikipedia and Wikimedia Commons and have used them with little issue. -BRAINULATOR9 (TALK) 23:42, 16 June 2024 (UTC)Reply

Can you be more specific about what you plan to do? Generally categories like Category:English terms prefixed with un- are not supposed to be added manually. Benwing2 (talk) 05:32, 17 June 2024 (UTC)Reply

In this case, I would be adding |idN= parameters to the {{suffix}} templates, putting them in categories like Category:English terms prefixed with un- (negative). I'm not sure if every little task needs to be brought up here first, but that's the specifics for the task I mentioned. -BRAINULATOR9 (TALK) 14:16, 17 June 2024 (UTC)Reply

OK that sounds fine, I just want to make sure you have some idea what you're doing :) ... if no one objects in a couple of days, I'll add you to the list. Benwing2 (talk) 18:54, 17 June 2024 (UTC)Reply

Thank you! Hopefully, what I've done so far isn't cause for concern (I stopped making one type of change partway in case there wasn't consensus to do it, especially en masse). -BRAINULATOR9 (TALK) 04:15, 27 June 2024 (UTC)

Japanese historical kana transliteration


Hello, is there maybe a problem with how historical kana spellings are currently transliterated? I've had to change 柔和's "にうわ" to "にう.わ", because, for some reason, the former was producing "niwa" instead of "niuwa". I just now went on 飢える and find that "うゑる" is transliterated as "weru"... is there some sort of mistake here? Why would this be the default behavior and require a "." to fix it? Kiril kovachev (talk・contribs) 01:23, 18 June 2024 (UTC)Reply

This looks like a problem with how the template handles labialized consonants that were present in middle Japanese. IIRC only k and g could be labialized, and the spelling for these would use く or ぐ before a わ行 sound, e.g. くわ for kwa, ぐゑ for gwe, etc. (sounds other than くわ and ぐわ specifically may be counted as ancient readings instead?). It looks like this logic is being extended to all う段 sounds though, not just く and ぐ, resulting in things like うわ > wa, すゑ > swe, etc. I've already fixed this on articles for 末, 故, and 梢, but no doubt others are affected. No idea how to fix the template so you don't have to manually insert a period though. Any ideas? @Eirikr Horse Battery (talk) 20:29, 26 June 2024 (UTC)Reply

Thank you for the heads-up, but sadly I am not of much use when it comes to our module infrastructure. Agreed that this romanization behavior as a marker for labialization should only apply to the "w" kana when immediately following either く (ku) or ぐ (gu). Even there, we need a means of indicating when this should not happen, in those rare cases such as words like 久遠 (modern kuon, classical kuwon; "boundless time, eternity; far in the past or future"). I think the current practice of adding a period should work just fine for these corner cases. ‑‑ Eiríkr Útlendi │^{Tala við mig} 21:57, 26 June 2024 (UTC)Reply
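The behaviour agreed on above (labialize only く or ぐ before a "w" kana, with a period to block it) might be sketched like this. It is Python for illustration only, not the actual module code; the tiny kana table covers just the examples in this thread, and the function name is made up.

```python
# Minimal kana table covering only the examples discussed here.
KANA = {
    "く": "ku", "ぐ": "gu", "う": "u", "す": "su", "に": "ni",
    "わ": "wa", "ゐ": "wi", "ゑ": "we", "を": "wo", "る": "ru", "ん": "n",
}
LABIALIZABLE = {"く": "k", "ぐ": "g"}
W_ROW = {"わ", "ゐ", "ゑ", "を"}


def romanize_historical(kana):
    """Romanize historical kana, merging く/ぐ + w-kana into kw-/gw-.

    A period in the input is a manual boundary: it blocks the merge
    and is not itself transliterated.
    """
    out = []
    i = 0
    while i < len(kana):
        ch = kana[i]
        if ch == ".":
            i += 1
            continue
        nxt = kana[i + 1] if i + 1 < len(kana) else ""
        if ch in LABIALIZABLE and nxt in W_ROW:
            out.append(LABIALIZABLE[ch] + KANA[nxt])
            i += 2
        else:
            out.append(KANA[ch])  # KeyError for kana outside the demo table
            i += 1
    return "".join(out)
```

Under this rule にうわ and うゑる no longer need a period, while 久遠 would still be entered as く.をん to get kuwon rather than kwon.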

@Eirikr @Horse Battery @Kiril kovachev I am working on Japanese modules at the moment, so I'll try to make time to look at this. That being said, I haven't ever touched anything to do with historical kana transliteration, so I'll need to get up to speed with it first. It's on the to-do list, at least. Theknightwho (talk) 20:33, 27 June 2024 (UTC)Reply

@Eirikr @Horse Battery We get the same problem with いゆ (iyu) becoming "yu". Unlike labialisation, it looks like palatalisation is the correct behaviour for most consonants, but there are still exceptions like this.

The way the module's been implemented makes this a little tricky to fix, but it should be doable. Theknightwho (talk) 17:07, 1 July 2024 (UTC)Reply

@Eirikr Sorry for the double ping - should palatalisation be applied in cases like 消(き)ゆ (kiyu), where ゆ (-yu) is a Classical Japanese suffix? The output is currently "kyu". I assume not, but just want to double check. I suppose my broader question is whether we always want a morphemic break between furigana and okurigana in the kanji readings section, as it seems odd for 消 to have "kyu" as a historical reading as it does right now. Theknightwho (talk) 17:58, 1 July 2024 (UTC)Reply

Ugh, ya, the ゆ in 消ゆ is a suffix, and suffixes are not valid cases for palatalization. Diachronically, sure, that happens, but then it's not a suffix anymore and instead a fused morpheme.

There will be a few words like this, which might affect your implementation. 見ゆ (miyu), 聞こゆ (kikoyu), 覚ゆ (oboyu, omoyu), 冷ゆ (hiyu), 煮ゆ (niyu), among others.

There are cases that aren't suffixing, and that would also be affected by this, such as きやきや (kiyakiya), にやにや (niyaniya), ちちよちちよ (chichiyochichiyo), as a few examples. That said, I am uncertain if there are enough non-suffixing exceptions to warrant explicitly coding for these. It might be enough to have editors use the medial-period workaround to force the template(s) to treat these as separate morae. ‑‑ Eiríkr Útlendi │^{Tala við mig} 18:43, 1 July 2024 (UTC)Reply

@Eirikr As a first step, I'll change {{ja-readings}} so that it always puts a morpheme boundary between the furigana and okurigana. For some reason, it's already doing that for modern readings, but not historical or "ancient" ones, so I assume there was an oversight at some point.

You can see this in action already in the newly-revamped kanji category descriptions (e.g. Category:Japanese kanji with historical kun reading き・ゆ), since it has to work from a parallel implementation. (As a side point, I have replaced the hyphen with the middle dot, as that's what Daijisen use, among others, and it's a lot more legible; this doesn't affect user input - only the names of the automatically-generated categories.) Theknightwho (talk) 00:55, 2 July 2024 (UTC)Reply

(Edit: forgot to ping @Eirikr - see below. Theknightwho (talk) 19:15, 17 July 2024 (UTC))

One issue that I've noticed is that we are extremely inconsistent with small kana usage in historical spellings (e.g. see Category:Japanese terms historically spelled with ゎ). However, I do think I have a solution to this:

For input, small kana should be used, just like with modern spelling rules.
For output, small kana will not be displayed or linked to, but will be accounted for in the transliteration.

This has three advantages:

It guarantees consistency in our historical kana entries. Even if someone uses full-size kana, the worst that will happen is the transliteration will be wrong; the link will still be to the correct entry, since historical kana entries should never use small kana in the title.
It reduces the need for manual overrides in transliteration. Plus, being able to break things down by mora allows for more sophisticated transliteration (which is the approach I'm taking in the rewrite of the transliteration module that I'm currently working on).
It's intuitive for users who are familiar with Japanese, but who aren't very experienced with wikitext/our templates, which keeps the barrier to entry low.
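As a rough illustration of the input/output split proposed above, the following Python sketch takes input written with small kana, produces the full-size form used for display and linking, and groups the input into morae for transliteration. The function names are hypothetical, and the sokuon (っ) is deliberately left out because its handling is not settled in this proposal.

```python
# Small kana that attach to the preceding kana; っ is intentionally omitted.
SMALL = "ゃゅょゎ"
SMALL_TO_FULL = str.maketrans({"ゃ": "や", "ゅ": "ゆ", "ょ": "よ", "ゎ": "わ"})


def display_form(kana_input):
    """Full-size spelling used for display and for the link target."""
    return kana_input.translate(SMALL_TO_FULL)


def split_morae(kana_input):
    """Group the input into morae; a small kana joins the kana before it."""
    morae = []
    for ch in kana_input:
        if ch in SMALL and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae
```

For example, an input of くゎし would display and link as くわし while being transliterated from the morae くゎ and し, whereas きゆ entered with a full-size ゆ stays two morae.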

What do you think? Also pinging @Fish bowl.

Theknightwho (talk) 19:13, 17 July 2024 (UTC)Reply

Using small kana is a good idea IMO. However, I still have worries about the "historical romanization" system itself being poorly defined, and also wonder if it could be unified with a general system for romanizing quotes in pre-war orthography in general. But here it gets quite thorny because those quotes have been romanized according to how people would pronounce them today, not in any half-assed "historical romanization". Providing "Historical romanization" is also misleading for more modern terms that can easily be written in the orthography (as it is just an orthography, not to be confused with historical attestation, although it is in multiple areas *based on* historical attestation). Frankly I would like to see "historical romanization" retracted from ja-headword, and further deliberation on the treatment of middle Japanese vs. Meiji-era pre-war Japanese. —Fish bowl (talk) 20:24, 20 July 2024 (UTC)Reply

@Fish bowl Yeah, I agree with all of that. This also affects the sokuon as well, which we don't really account for at all at the moment: e.g. we've got historical spellings like をつとつせい (wotutotusei), which should probably be wottossei, and it gets worse if the pre-reform orthography uses something other than つ, like がくかう (gakukau), which would be better as gakkau. Not sure if it's possible to solve that with small kana, either, as I don't think the etymological mora would be predictable from an input like がっかう (though if it is, that's great).

In terms of your last point, I think it's high time we split Middle Japanese out as its own L2, because blending everything from the end of Old Japanese in c. 800 to the orthographical reform in 1946 into a "historical kana orthography" is over-simplified. Category:Middle Japanese currently only contains derived terms, so we're basically pretending it doesn't exist outside of etymology sections at the moment. Yes, there are 190 terms in Category:Classical Japanese, but that's not the same thing as the vernacular language, which we simply label "obsolete". Theknightwho (talk) 01:32, 21 July 2024 (UTC)Reply

Personally I think we have the larger problem that "historical kana transliteration" is invented from whole cloth and undocumented. If I recall correctly, people instructed me to create something that "seemed appropriate" for the sake of {{ja-readings}} (which also has support for a badly-defined "ancient kana" feature), and it was later unilaterally added to Module:Jpan-headword by User:Huhu9001, despite issues such as morpheme boundaries for words like きやきや as stated above. —Fish bowl (talk) 22:05, 1 July 2024 (UTC)Reply

Updates to WT:AINE


It was suggested that I update WT:AINE to better reflect common convention, so I went ahead and did so. Probably the biggest change is deleting some of the sort rules, which were a bit complicated and therefore mostly ignored. @Mahagaja, Rua, This, that and the other, Nicodene, Benwing2 --{{victar|talk}} 05:41, 18 June 2024 (UTC)Reply

@Victar Seems reasonable to me from looking over the changes. Benwing2 (talk) 05:46, 18 June 2024 (UTC)

AWB request (Babr)


Was originally gonna wait a few weeks after the first AWB request cuz I didn't want to ask too soon after someone else, but then someone else asked again so I guess I can't control how close my request is to someone else's.

Anyway, I will be using it to clean up Tajik entries. For an example of what I am changing, compare what this entry looked like before I cleaned it up to what it looks like after I cleaned it up. I've already cleaned up about 400 entries, so I will just continue what I'm already doing at a faster pace.

BTW I am User:Sameerhameedy, I just changed my username a few days ago. — BABR (talk) 08:13, 18 June 2024 (UTC)Reply

@Babr Hi! I added you to Wiktionary:AutoWikiBrowser/CheckPageJSON. Please let me know if this works; some users have said that this page doesn't work and you have to be added to Wiktionary:AutoWikiBrowser/CheckPage despite that page saying it's superseded. Benwing2 (talk) 21:08, 18 June 2024 (UTC)Reply

Unfortunately it would probably be disruptive and a bad idea to test this (if some users' AWB use in fact depends on the non-JSON CheckPage existing), but . . . iff Wiktionary:AutoWikiBrowser/CheckPage has in fact been superseded, I wonder if the issue might be that the page nonetheless still exists (with names on it and everything), so perhaps AWB first looks there, sees it exists, assumes it's operating on an old wiki that still uses the old name for the page, and looks for names there and doesn't find them: I wonder if not having a page with that title would force it to look for the new JSON page. - -sche (discuss) 02:09, 19 June 2024 (UTC)Reply

Hmmm, that is an interesting hypothesis. I wonder if we can check this in some other fashion, maybe by looking through the AWB docs or asking one of the AWB developers (wherever they hang out). Benwing2 (talk) 02:17, 19 June 2024 (UTC)Reply

BTW on Wikipedia, their CheckPage is a hard redirect to CheckPageJSON. Maybe that would work for us? Benwing2 (talk) 02:25, 19 June 2024 (UTC)Reply

I guess we could try it and revert if it turns out to cause problems. (Might as well move the useful text to WT:AWB while we're at it.) - -sche (discuss) 04:06, 19 June 2024 (UTC)Reply

We'd need the cooperation of someone who has AWB access as well as AWB installed so they could try things out to see if anything breaks. (I don't have AWB installed because (a) I'm on a Mac and (b) I have bot scripts for downloading sets of pages, editing them offline and pushing them in a batch; this is the source of those (manually assisted) notations in my bot changes.) Benwing2 (talk) 04:50, 19 June 2024 (UTC)Reply

I have AWB and can test whether it still works if the page is redirected. (If only some users find that being added to the JSON page isn't enough, that isn't foolproof; it might be better to get one of the people who found that merely being added to the JSON page wasn't enough for them.) - -sche (discuss) 05:12, 19 June 2024 (UTC)Reply

So far I can still edit, but next I'll try closing and restarting AWB, as I suspect it performs its check on startup. - -sche (discuss) 05:41, 19 June 2024 (UTC)Reply

I closed AWB and started it afresh, and: "Logged in, user and software enabled", it says. - -sche (discuss) 05:43, 19 June 2024 (UTC)Reply

OK, hmmm. Let me try redirecting the page then. Benwing2 (talk) 05:59, 19 June 2024 (UTC)Reply

@-sche OK, I merged the two pages, copying the non-user text to WT:AutoWikiBrowser, and redirected Wiktionary:AutoWikiBrowser/CheckPage to Wiktionary:AutoWikiBrowser/CheckPageJSON. Let me know if it still works after closing, logging out explicitly (if possible), logging in and seeing if you can make an edit. Benwing2 (talk) 06:12, 19 June 2024 (UTC)Reply

OK, I logged out in my other browsers, opened AWB, logged in, explicitly logged out in AWB, closed it, reopened it, logged back in (in AWB), hit the "refresh status" option (I figured if anything would "refresh my 'has-AWB-rights' vs 'doesn't' status", that seemed like a likely candidate, although I think it in fact refreshes some list of typos somewhere), logged out and back in again for good measure, and it still says I'm approved. - -sche (discuss) 06:32, 19 June 2024 (UTC)Reply

OK great! So hopefully everything is sorted now. Benwing2 (talk) 06:34, 19 June 2024 (UTC)

Didn't get a chance to test it until now but it works just fine! — BABR (talk) 07:19, 20 June 2024 (UTC)

Template:univerbation


Could we change the behaviour of this template please? Currently it's a mere copy of {{affix}} / {{compound}} with a specific categorisation. But univerbations are a specific type of compound: they are originally entire phrases/syntagms which came to be joined together.

For example, French aujourd’hui is not the mere sum of au + jour + de + hui: it's the whole phrase au jour(-)d’hui (now obsolete) rewritten and felt as one word.

Imo the template shouldn't take parameters (so we'd write {{univerbation|fr|[[au]] [[jour]] [[de|d']][[hui]]}} instead of {{univerbation|fr|au|jour|de|hui}}), and it certainly should not output a "+" between the components. P U C – 17:43, 18 June 2024 (UTC)Reply

@PUC This is a pretty major change. If we were to implement this we'd need to figure out a strategy for migrating the 2,000 or so pages that currently use the old format to the new one. Benwing2 (talk) 02:26, 19 June 2024 (UTC)Reply
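One possible first pass at such a migration is sketched below in Python. It only drafts the new format by linking each component and joining them with spaces; it cannot know about elisions such as d', so every result would still need manual review. The regular expression and function name are assumptions, and calls that use named parameters or already contain links are left untouched.

```python
import re

# Matches simple {{univerbation|lang|part|part|...}} calls with no nested templates.
OLD_CALL = re.compile(r"\{\{univerbation\|([^|{}]+)\|([^{}]+)\}\}")


def draft_new_format(wikitext):
    """Rewrite old-style calls into a single linked phrase (draft only)."""
    def repl(match):
        lang, rest = match.group(1), match.group(2)
        parts = rest.split("|")
        # Skip anything with named parameters or existing links.
        if any("=" in p or "[[" in p for p in parts):
            return match.group(0)
        phrase = " ".join(f"[[{p}]]" for p in parts)
        return f"{{{{univerbation|{lang}|{phrase}}}}}"
    return OLD_CALL.sub(repl, wikitext)
```

Applied to {{univerbation|fr|au|jour|de|hui}}, this yields {{univerbation|fr|[[au]] [[jour]] [[de]] [[hui]]}}, which an editor would then have to correct by hand to the intended au jour d'hui form.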

I feel like knowwhaddamean is a good candidate for this template. So I support the formatting you have in mind. Ioaxxere (talk) 08:31, 23 June 2024 (UTC)

CFI for translations?


Is there some bare level of attestability needed for translations? I ask because of the translations of Mummerset, a word that will be used vanishingly rarely - if ever - in other languages. We have Finnish and ~~Russian~~ Macedonian translations, but it doesn't look like these words have ever been used in those languages (and it's arguable whether they are right: Mummerset isn't actually a dialect, it's just a stage accent). Smurrayinchester (talk) 13:38, 19 June 2024 (UTC)Reply

Some people insist the same CFI applies to translations as does for entries. I cannot find anything in the policy to support it personally. At the same time, I don't know what people expect to happen when translation requests are added indiscriminately, without paying any attention to how likely it is for the term in question to ever be used in the target language, if outside English at all. — SURJECTION ^{/ T / C / L /} 15:44, 19 June 2024 (UTC)Reply

I would support that CFI for translations be the same as other entries, and I share your confusion. Vininn126 (talk) 20:07, 19 June 2024 (UTC)Reply

In theory, because CFI applies to entries, what you're supposed to do if you think a translation (or redlinked Derived term, etc) is wrong, is: create an entry for it. Then you RFV it and it gets removed as both an entry and a translation if it fails RFV. Because this is rather ... faffy ... there are people, as Surjection mentions, who prefer to just apply CFI directly to the translation and remove it if it fails ATTEST without creating an entry for it first, but this gets a surprising amount of pushback, so... if you think something is wrong and doesn't exist, you can always fall back on creating an entry for it. - -sche (discuss) 16:09, 19 June 2024 (UTC)Reply

If so, there has to be a solution to removing the only translation from an entry and people simply requesting a translation to be added later "because it's missing". — SURJECTION ^{/ T / C / L /} 19:43, 19 June 2024 (UTC)Reply

Also using {{not used}} and {{no equivalent translation}} seem inappropriate sometimes. Vininn126 (talk) 20:12, 19 June 2024 (UTC)Reply

I think there are still a lot of cases where the language may have an equivalent translation, but it's not attested/attestable. Like pretty much any place name in pretty much any LDL. Thadh (talk) 23:12, 19 June 2024 (UTC)Reply

This also happens with some multiword terms, or something similar, where the given English word is idiomatic, a language would translate it the same way, but that exact phrase isn't attested. Vininn126 (talk) 23:14, 19 June 2024 (UTC)Reply

@Thadh @Vininn126 This is why {{no attested translation}} exists, but it's hardly used. Theknightwho (talk) 23:16, 19 June 2024 (UTC)Reply

@Theknightwho, Vininn126: When the topic becomes technical, it necessarily occurs: I remember adding a natural SOP translation in my native German which should have such a term but has not dropped it anywhere yet … so I just starred it preceded by “suggestion”: attrition bias. There is one for non-response bias or participation bias, so here we see the effect of another bias by which scientists select the keywords they write, publication bias or filter bubbles, ironically. Fay Freak (talk) 22:44, 23 June 2024 (UTC)Reply

Yes, clearly CFI applies to translations as well. Otherwise I would be able to add ⡰⪘◰ⵗ⥙⽟⪢ⷼⲋⴑ⎽⪬⣏⫙ as the Arabic translation of neurohistopathologist and no one could stop me. I assume that any challenged translation which is a redlink can be removed on the spot since there isn't even an entry to RFV. Ioaxxere (talk) 08:31, 23 June 2024 (UTC)Reply

This is correct. So there should be criteria for inventions, in the case translations are needed irrespectively of attestation. Due to the rarity of the problem, it does not need to be formulated just yet however, since rules are gameable. Fay Freak (talk) 22:44, 23 June 2024 (UTC)Reply

Entries by Geshiza


Following up on this BP discussion back in March, after which I notified Geshiza that they'd be given time to fix their entries before they were moved out of the main space. It's been 3 months since then, and Geshiza has been completely inactive and hasn't requested a language code for Eastern Geshiza, so we cannot even fix the entries they made ourselves if we wanted to. I think the best solution now is to move the entries they made to their user space and notify them that their entries have been moved there and can be fixed and readded (if and when they return). I'd be happy to move the entries myself if there is consensus to do it, but someone would still need to delete every redirect page, so I suppose it's better that someone else does it.

Notifying Chuck Entz, Benwing2 and User:Theknightwho, who were involved in the BP discussion from March. — BABR (talk) 07:14, 20 June 2024 (UTC)

@Babr Let's go for it. Can you identify a list of pages to be moved and deleted? Benwing2 (talk) 07:50, 20 June 2024 (UTC)Reply

@Benwing2 Pretty much all the entries they've made need to be moved. Luckily, it seems they've added all their entries to Category:Eastern Geshiza nouns. Though, since they tagged the category manually, it's possible they missed some (though I didn't notice any missing entries when comparing the category to their contributions).

BTW I was planning on notifying them of where their entries went, so if you plan on moving the entries yourself then please let me know when they have been moved (unless you were planning on notifying them). If not, I could move the entries myself, but I would need extended mover in order to do so. I'm fine either way, so I'll leave that up to your discretion. — BABR (talk) 08:17, 20 June 2024 (UTC)Reply

@Babr Apologies for the delay. I have given you the extended mover right. Let me know if you need help moving or deleting any pages. Benwing2 (talk) 19:38, 23 June 2024 (UTC)Reply

Htoklibang Pwo


See this paper. Htoklibang Pwo is apparently a Pwo lect that is not mutually intelligible with any of the other Pwo languages, and is not culturally more related to any Pwo group over another. It seems neither Glottolog nor Wikipedia have entries or even descriptions of this lect, and according to this, this lect was only first identified in 2008! Should we make a distinct code for this? I have currently kept the one entry as Eastern Pwo, but that doesn't seem like a good solution. Thadh (talk) 10:10, 20 June 2024 (UTC)Reply

Support. Theknightwho (talk) 03:49, 21 June 2024 (UTC)

West-Central Thailand Pwo Karen

Another Pwo lect that has escaped addition as a code. This paper makes clear that this is a group of Pwo lects that aren't well intelligible with any other Pwo group, and forms a sociolinguistic group on its own. The request for an addition of a code was rejected by SIL, with an argument that, in my opinion, is absolute hogwash, namely that SIL "found no evidence that [...] the [...] West-Central Thailand variety of Pwo Karen was not intelligible with the Eastern Pwo Karen", even though the paper I linked above just says this outright in the first line... Anyway, we're not bound by SIL's decisions; let's just add this code as well so we can document these languages properly. Thadh (talk) 22:20, 21 June 2024 (UTC)

P. S. Perhaps better to call this group by a different name, perhaps either "West-Central Thailand Pwo" or "Southern Pwo". Thadh (talk) 22:21, 21 June 2024 (UTC)Reply

I've made User:Thadh/Pwo for comparison. We might want to even split it three-way, and handle Southern separate from WCT. Thadh (talk) 16:02, 23 June 2024 (UTC)Reply

I know little about these lects but my general experience is that splits are easier to implement than merges, so I would favor a more conservative approach when splitting (in this case, a two-way rather than a three-way split). FWIW the cognate sets in the table you've created don't look so different to me (except for /sʷɛ˥˥/ and /θʷa˩˩/ from /mɛ˥˥/ — how does that work?) but this doesn't say all that much, as this is a very small sample. Benwing2 (talk) 19:49, 23 June 2024 (UTC)Reply

@Benwing2: Do take into account that the first three lects are written in the Thai script, while the last three are written in Burmese. Also, tonal contrast is quite a distinction, and so are the vowels - what I have given there are all vowels that are phonemic for all lects afaik. Also, I have gone ahead and added cognates, but of course there will also be tons of differences lexically and usage-wise. Finally, the paper discussing WCT Pwo specifically distinguishes it from Southern Pwo, both of which are still in Thailand. Thadh (talk) 20:00, 23 June 2024 (UTC)Reply

Decluttering the altform mess


(Previous discussion.)

At the moment part-of-speech categories are practically unusable for languages with numerous altforms. For instance iluec and its 270 variants account for nearly half (!) of all entries in Category:Old French adverbs.

This state of affairs would be greatly improved by adding an optional parameter to {{head}} which disables the normal categorizations handled by that template and instead puts entries in categories named '[language name] alternative forms'.

Thoughts? Nicodene (talk) 03:48, 21 June 2024 (UTC)Reply

I agree. It's worth noting that iluec is an extreme outlier, but even so, if there are a lot of examples where the same word shows up several times, a reader would waste time trying to guess which was the correct one. —Soap— 09:36, 21 June 2024 (UTC)

Clearly something needs to be done about this situation. A lot of scripts, bots and tools (like OrangeLinks) depend on every entry being categorised into either a "LANG lemmas" or "LANG non-lemma forms" category (or a couple of other special categories for Japanese entries iirc).

Is your proposal specifically that if a POS header consists solely of one or more alternative form sense lines and no other senses, it should be categorised into "LANG alternative forms" instead of "LANG lemmas"? Sounds like a maintenance headache, if I'm honest. I don't think it's possible for the {{head}} template to detect this automatically, so it would need to be done manually or by bot. Wondering if @Benwing2, Theknightwho have any input on this.

If your concern is that these entries are cluttering the POS categories specifically, let's keep them in "LANG lemmas" and substitute the POS category with the "LANG alternative forms" category: {{head|fro|alternative form}} This, that and the other (talk) 01:23, 22 June 2024 (UTC)Reply

I don't actually think it's such a big deal to change things like the OrangeLinks gadget to know about alternative forms as an alternative (so to speak) to lemmas or non-lemma forms. It's usually just a one-line change and I don't think there are that many tools or scripts that would need changing. Your suggestion of using alternative form (probably with a shorter alias altform provided) as the POS is a good one, I think, although if we want to use lang-specific templates to provide inflections of these alt forms, we'd probably need to add a parameter |altform=1 or similar to the lang-specific templates. Whether we actually want alt forms to be in lemmas might depend on the particular language. In particular, non-standardized languages like Old French and Middle English are IMO materially different from (semi-)standardized languages like English or Portuguese that have more than one spelling convention. In the former case, it makes sense IMO to put the canonical lemma at a standardized spelling and make all the other spellings be alt forms that don't appear in LANG lemmas, but in the latter case, we can't reasonably privilege one standard spelling over another (although we could do the Old French thing for obsolete and superseded spellings). Benwing2 (talk) 01:42, 22 June 2024 (UTC)Reply

@This, that and the other: Another possibility would be to have subcategories with "alternative" prefixed: Category:Middle English lemmas would have a subcategory "Category:Middle English alternative lemmas", for instance. This could be triggered by something like |isalt=1 in the headword template. There may be some cases where the same headword has both altform and mainform senses, though. We would also have to consider whether there might be some category names that would be just too long.

The advantage of this is it would preserve the lemma/nonlemma noun/noun form, etc. distinctions and require less modification of the code. It would also mean that all of the forms we have together now would still be findable from the old category name, but the alternative forms would be out of the parent category. Chuck Entz (talk) 01:53, 22 June 2024 (UTC)Reply
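As a rough illustration of the subcategory idea above, here is a minimal Python sketch of how an |isalt=1 style flag could change the categories a headword line produces. The function name, its parameters, and the exact category wording other than "Middle English alternative lemmas" are assumptions made for illustration; this is not how {{head}} or its module is actually implemented.

```python
def headword_categories(lang_name, pos_plural, is_lemma=True, is_alt=False):
    """Return the category names for one headword line.

    Hypothetical sketch: when is_alt is set, each category name gets an
    "alternative" prefix, so the entry lands in a subcategory instead of
    the usual parent category.
    """
    prefix = "alternative " if is_alt else ""
    top_level = "lemmas" if is_lemma else "non-lemma forms"
    return [
        f"Category:{lang_name} {prefix}{top_level}",
        f"Category:{lang_name} {prefix}{pos_plural}",
    ]


# A main-form entry keeps today's categories.
print(headword_categories("Old French", "adverbs"))
# An alternative form is routed to the prefixed subcategories.
print(headword_categories("Middle English", "nouns", is_alt=True))
```

Because the prefix is applied uniformly, the lemma/non-lemma and part-of-speech distinctions survive, which is the advantage claimed above. Cases where one headword has both altform and mainform senses would still need a manual decision.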

This seems like a good idea to me. Benwing2 (talk) 19:50, 23 June 2024 (UTC)Reply

I don’t have any particular objection against there being categories like ‘Old French plurals of alternative noun lemmas’ but it isn’t clear to me how someone could find them useful.

By the way, regarding cases like British vs American English, I think a language having more than one official standard isn’t really a matter of ‘altforms’. In an ideal world I think we’d simply have a template that binds for example center to centre, such that they both automatically share the same content (definitions, etymology, and so forth). Someone for instance flips a coin and decides that American English should be the default on Wiktionary. Then the entry for centre is reduced to just {{mirror|en|center|UK}} which displays a copy of the corresponding American English entry. (With the headword spelling adjusted of course.) How feasible that would be I don’t know. Nicodene (talk) 21:55, 23 June 2024 (UTC)Reply

This is definitely feasible. In fact we've discussed doing exactly this for Serbo-Croatian and Punjabi, where there are multiple scripts for the same language and we don't want to prioritize one over another. It would probably be an extension of the existing {{tcl}} ("transclude") template, which transcludes individual meanings from one entry to another, or something else similar in spirit. It can get tricky in English because there are some complex cases where e.g. only spelling A is used in American English but spellings A and B are both used in British English with different meanings (I can't think of an example but I know they exist). There's also the issue of what to do with quotes and usexes; e.g. presumably quotes should maintain the original spelling but do we want different quotations illustrating the respective spellings, or does it not matter? Should usexes have the spelling automatically adjusted, and if so what about other terms needing spelling adjustments? But these sorts of issues should be solvable, one way or another. Benwing2 (talk) 22:49, 23 June 2024 (UTC)Reply

(@Benwing2: what about license which is used as both a noun and verb in American English, but only as a verb in British English (the noun being spelled licence)?) — Sgconlaw (talk) 22:54, 23 June 2024 (UTC)Reply

@Sgconlaw Yes, this is one. The one I was thinking of was draft, which is split draft ~ draught in British English in a very complex fashion. There's also program in American English vs. programme ~ program in British English, and disk ~ disc in both varieties with different preferred usages. Benwing2 (talk) 23:06, 23 June 2024 (UTC)Reply

I don’t think quotations will be an issue. At present the lemma usually has quotations with all alternative forms, while alternative form entries only have quotations with that specific form. If there’s some sort of transclusion, then we just won’t need to list quotations with alternative form spellings separately at the alt entries. Regarding usage examples, maybe the solution is just to give several examples using the different forms and add qualifiers stating “American spelling”, “British spelling”, etc. A more difficult issue is what spelling to use in definitions. I’m not sure how that should be dealt with. — Sgconlaw (talk) 23:11, 23 June 2024 (UTC)Reply

I suppose one way is to use a combination of automatic conversions with manual overrides as necessary. For example, the automatic conversions could be a combination of pattern matches (for ise ~ ize words, with overrides as needed) and individual entries, and manual overrides specified in the Wikicode maybe as {{~|disc|disk}} for words or grammatical differences that can't be handled automatically. In cases like draft vs. draft ~ draught, ideally the Wikicode would have the British spelling, because in this case British -> American can be done automatically but the other way can't. Similarly for license vs. license ~ licence and program vs. program ~ programme. (Or we could just punt the whole issue and let the spelling be whatever, although I don't consider that ideal.) Benwing2 (talk) 23:22, 23 June 2024 (UTC)Reply
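The combination described above (pattern rules, individual word entries, and manual overrides) can be sketched roughly as follows in Python. Everything here is an assumption for illustration: the word table, the -ise rule and its exception list are deliberately tiny, and the override is read as {{~|British form|American form}}, an argument order the discussion does not specify.

```python
import re

# Hypothetical word-level table: British spelling -> American spelling.
WORD_MAP = {"draught": "draft", "licence": "license", "programme": "program"}

# Hypothetical pattern rule for the ise ~ ize alternation.
ISE_RULE = re.compile(r"(\w{3,})is(e|es|ed|ing|ation|ations)")

# Words that keep -ise in both varieties (deliberately incomplete).
ISE_EXCEPTIONS = {"advertise", "exercise", "promise", "surprise"}

# Manual override, assumed here to read {{~|British form|American form}}.
OVERRIDE = re.compile(r"\{\{~\|([^|{}]*)\|([^|{}]*)\}\}")
WORD = re.compile(r"[A-Za-z]+")


def convert_word(word):
    """Convert one British-spelled word to American spelling if a rule applies."""
    lower = word.lower()
    match = ISE_RULE.fullmatch(lower)
    if lower in WORD_MAP:
        new = WORD_MAP[lower]
    elif match and lower not in ISE_EXCEPTIONS:
        new = f"{match.group(1)}iz{match.group(2)}"
    else:
        return word
    return new.capitalize() if word[0].isupper() else new


def convert_plain(text):
    """Apply the automatic rules to text that contains no overrides."""
    return WORD.sub(lambda m: convert_word(m.group()), text)


def to_american(text):
    """British-spelled source text -> American spelling."""
    out, pos = [], 0
    for m in OVERRIDE.finditer(text):
        out.append(convert_plain(text[pos:m.start()]))
        out.append(m.group(2))  # the manual override wins over the rules
        pos = m.end()
    out.append(convert_plain(text[pos:]))
    return "".join(out)


def to_british(text):
    """British-spelled source text: only the overrides need resolving."""
    return OVERRIDE.sub(lambda m: m.group(1), text)


source = "They organise a draught of the programme on {{~|disc|disk}}."
print(to_american(source))
print(to_british(source))
```

Storing the British spelling in the source, as suggested above, is what keeps the automatic direction one-way: draught and licence map to a single American form, whereas the reverse mapping would need to know which sense is meant.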

We could save ourselves the trouble and leave the quotes as-is. We already include quotes with altforms on lemma entries - why not quotes with other official spellings?

For cases like draft~draught, I think it’d be easiest to lemmatize forms found in both British and American English (draft, program, license, and arbitrarily either disc or disk), altformify the others, and then explain regional differences in a usage note. For license the usage note would mention “in British English the noun is spelt licence”. And the entry for licence would just have “noun, British spelling, alternative form of license”.

I can’t think of splits that go in the other direction (distinct American spellings that are homographs in British English) so this may be a good reason to treat American spelling as a general default. I say this despite personally using British (Oxford) spelling. Nicodene (talk) 00:08, 24 June 2024 (UTC)Reply

Support Ioaxxere (talk) 08:31, 23 June 2024 (UTC)Reply

Hyphenation: syllabi(fi)cation in writing


What is the algorithm followed in entries such as man·u·script? JMGN (talk) 15:37, 22 June 2024 (UTC)Reply

@JMGN: usually the etymology of the term. — Sgconlaw (talk) 15:45, 22 June 2024 (UTC)Reply

If a word is a compound, it can be split before the second component; e.g. knights-wort. The (phonetic) onset of a syllable after a hyphen must be phonotactically possible in English. Because [-ptə(ɹ)] is impossible, helico-pter is not an acceptable hyphenation, pace its etymology. The quality of the vowel of the syllable preceding the hyphen plays a role: compare po-stern and pos-ture. But it is both disci-ple and disci-pline, in spite of the difference in quality, so there may not be a straightforward rule based on the pronunciation. See further the article Syllabification on Wikipedia. --Lambiam 20:35, 24 June 2024 (UTC)Reply

@Lambiam: For some consistency, I wonder if it is worth coming up with a set of guidelines and seeing if we can find consensus for them, then updating "Wiktionary:Pronunciation#Hyphenation" (a draft proposal) to specify them. — Sgconlaw (talk) 21:22, 24 June 2024 (UTC)Reply

The only guideline I can think of is to consult a major dictionary that indicates how English words are hy‧phen‧at‧ed,^[12] such as The American Heritage Dictionary of the English Language. This will not reveal a different British English hyphenation, as of the word knowledge. I am not aware of online sources for British English hyphenation. --Lambiam 07:42, 25 June 2024 (UTC)Reply

User talk:Equinox


Could the revision history of this page be restored? Ioaxxere (talk) 08:09, 23 June 2024 (UTC)Reply

It's not what he would've wanted. Denazz (talk) 21:22, 23 June 2024 (UTC)Reply

Never change, WF. Nicodene (talk) 21:41, 23 June 2024 (UTC)Reply

WP has a rule against deleting user talk pages (w:WP:DELTALK), on the grounds that "they are usually needed for reference by other users". But it seems we don't follow that here. See the deletion histories of some prominent contributors' user talk pages.

Personally I think it is a poor look to delete your own talk page, and it shouldn't be done. I wouldn't ever honor a {{d}} request of this kind (unless the user was a non-contributor of course). But there are clearly several admins who are comfortable with the practice of deleting a user's talk page when they ask. I'm interested to hear what others think here. This, that and the other (talk) 12:57, 24 June 2024 (UTC)Reply

I would support a rule against allowing the deletion of a user talk page. Honestly, the fact that it's done currently in cases of controversial users or criticism just shows to me that it's done to hide critiques, rather than protecting the user from anything. AG202 (talk) 14:56, 24 June 2024 (UTC)Reply

I am inclined to agree. Benwing2 (talk) 19:41, 24 June 2024 (UTC)Reply

Me too. If the user wants to blank the page that's up to them, but unless there's a very good reason (e.g., it contains someone's private information) administrators shouldn't delete the page. — Sgconlaw (talk) 21:24, 24 June 2024 (UTC)Reply

I too am inclined to agree. There might be edge cases where it'd make sense to just delete a talk page, like if (nearly) every revision contained personal information and so deleting it was simpler than revdelling every revision (maybe if a user in good standing edited under their full name and then decided to be globally renamed to "Renamed user 2345675434" to vanish, but their talk page would be full of them commenting under their old username?), but in general, it seems like we can revdel specific revisions without needing to delete a whole page / revision history. Seeing multiple users in good standing asking for this page to be restored, I'll restore it now, whether we adopt a general policy or not. I agree that a user merely blanking their page is a different matter. - -sche (discuss) 22:48, 24 June 2024 (UTC)Reply

Due to the number of revisions involved, my first attempt got: "To avoid creating high replication lag, this transaction was aborted because the write duration (6.0507435798645) exceeded the 3 second limit. If you are changing many items at once, try doing multiple smaller operations instead. [...] Fatal exception of type "Wikimedia\Rdbms\DBTransactionSizeError". Restored it on the second try. - -sche (discuss) 22:54, 24 June 2024 (UTC)Reply

I'm adding to the chorus to agree. I don't see the actual justification for deleting and the purported reason in the log is "tl dr", which is not helpful. —Justin (koavf)❤T☮C☺M☯ 22:55, 24 June 2024 (UTC)Reply

I agree generally, but maybe allow deleting in cases related to harassment? CitationsFreak (talk) 23:07, 24 June 2024 (UTC)Reply

I would say revert or revdel as needed and then protect the page, but wholesale deletion is using a rocket launcher when you need a flyswatter. —Justin (koavf)❤T☮C☺M☯ 23:17, 24 June 2024 (UTC)Reply

Speaking of which, I note that the old history of @Victar was hidden in the same way, and think it should be restored into an archive. Happy to make a separate thread to request this, if necessary. Theknightwho (talk) 01:23, 25 June 2024 (UTC)Reply

@Theknightwho inspection of the complete deletion log shows that in fact the page was restored in 2023 on Victar's own request. There are no deleted revisions. This, that and the other (talk) 05:32, 25 June 2024 (UTC)Reply

Alright. Theknightwho (talk) 05:37, 25 June 2024 (UTC)Reply

In case of privacy violation, it can and will be hidden either way, and we even comply with GDPR requests, don’t we? So I don’t see how talk pages should be deleted only because they are in the user-space, which in reality is only a topic-space with varyingly loose relations to a user personally. For the sake of the scientific argument, or also good coding practice, the default assumption should be that a user page is kept.

There are of course low-relevance cases of drive-by IPs only receiving a welcome message and some random notes, but we would needs withstand deletion if, say, one of Theknightwho or Benwing2 becomes loony and deletes reasonings given on his talk page for module implementations, likewise the contributions of many users to philologic argument are too high for one to suffer suppression of the tracks of the scientific discourse well. It would be as if we worked on something irrelevant. So I figure how Theknightwho has the general impression of deleting a talk-page not being right; the personal preference has an objective basis here. Fay Freak (talk) 04:41, 25 June 2024 (UTC)Reply

Voting to ratify the Wikimedia Movement Charter is now open – cast your vote


You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hello everyone,

The voting to ratify the Wikimedia Movement Charter is now open. The Wikimedia Movement Charter is a document to define roles and responsibilities for all the members and entities of the Wikimedia movement, including the creation of a new body – the Global Council – for movement governance.

The final version of the Wikimedia Movement Charter is available on Meta in different languages and attached here in PDF format for your reading.

Voting commenced on SecurePoll on June 25, 2024 at 00:01 UTC and will conclude on July 9, 2024 at 23:59 UTC. Please read more on the voter information and eligibility details.

After reading the Charter, please vote here and share this note further.

If you have any questions about the ratification vote, please contact the Charter Electoral Commission at cec@wikimedia.org.

On behalf of the CEC,

RamzyM (WMF) 10:52, 25 June 2024 (UTC)Reply

The slashes in the transcription (`ts=`) parameter


Could we please change the (ts=) parameter so that it uses something else instead of slashes to delimit transcriptions? I understand that transcriptions are intended to accompany transliterations for certain languages, where it's useful to have a literal transliteration followed by a transcription that shows how it was actually read, but the problem with slashes is that they make it look like the transcription is meant to be a phonemic IPA pronunciation, which isn't what's intended most of the time. Indeed, commenters in the original discussion explicitly didn't want people to use ts= for pronunciations; however, 6 years on, and taking a completely random selection of uses, I can sort them into three buckets:

Very likely intended as IPA:
- Korean ㅓ (/⁠ʌ⁠/) - transliteration suppressed and transcription added; clearly referring to pronunciation
- Chagatai قزاق (qazāq /⁠qazaq⁠/) - seems to be a slightly broader version of the pronunciation on the entry (/qɑ.zɑq/), intended to show the lack of length distinction
Resemble IPA on the surface, but the nature of the language means they must be deciphered/reconstructed readings:
- Sumerian 𒅎𒊑𒀀 (im-ri-a /⁠imria⁠/)
- Mycenaean Greek 𐀀𐀯𐀹𐀊 (a-si-wi-ja /⁠aswijaː⁠/)
Definitely not IPA:
- Old Persian 𐎭𐎠𐎼𐎹𐎺𐎢𐏁 (d-a-r-y-v-u-š /⁠Dārayauš⁠⁠/) - capitalised proper noun
- Old Turkic 𐰖𐰉𐰕 (y¹b¹z /⁠yabïz⁠/) - interpreting this as IPA /ï/ would be hopeless (is it even allowed?), but ⟨ï⟩ is a common Turkic transcription of /ɯ/
- Phoenician 𐤏𐤋𐤉𐤑 𐤏𐤁𐤀 (ʿlyṣ ʿbʾ /⁠ʿaliṣ-ʿuboʾ⁠/) - ⟨ʿ⟩ and ⟨ʾ⟩ are not part of IPA

The ones that really concern me are those in the second group, because without any indication it's invalid IPA, it runs the risk of misleading even experienced users who may not be familiar with the language in question; especially when we do give pronunciations for some languages from hundreds/thousands of years ago. Hell, at Middle Persian 𐭬𐭤 (mh /⁠čē⁠/) we even have a pronunciation section with /tʃeː/ and the transcription /čē/ on the headword line, which is very silly. Obviously the transcription is useful to have, but we shouldn't be using slashes for two different things within the same entry, especially when we don't give readers any clue that that's what's happening, so a naive reader may assume one of them is simply a mistake.

From reading the discussion linked above, the basis for using slashes seems to have been that (a) one user started using them because that's how A Dictionary of Manichaean Middle Persian and Parthian uses them, and (b) someone else mentioned that Russian dictionaries sometimes include Cyrillic transcriptions in square brackets (which is maybe sort of the same thing if you squint really, really hard). However, it's all very well for a publication to use slashes to mean something other than IPA, so long as it's consistent, but the big difference between us and that dictionary is that we also use slashes to refer to IPA, and indeed some editors have used ts= for genuine IPA, as you can see above.

Is there anything else we could use instead? Theknightwho (talk) 14:46, 25 June 2024 (UTC)Reply

I completely agree. I don't know what would be best though. Either we could use some other sort of delimiters (but which ones?), some sort of font or color indication (but how?), or some abbreviation like ts., appropriately linked so readers will have some idea what it means. Benwing2 (talk) 20:41, 25 June 2024 (UTC)Reply

@Benwing2 I'd prefer delimiters, as anything involving colours runs into accessibility issues with colour-blindness etc that I don't want to figure out. We should pick something that doesn't have another meaning, which rules out most of the common delimiters, but I think these three probably work okay, with the first being my preference:

قزاق (qazāq, ｢qazaq｣)
قزاق (qazāq, ⸢qazaq⸣)
قزاق (qazāq, ‹qazaq›)

Theknightwho (talk) 19:36, 26 June 2024 (UTC)Reply

@Benwing2, Theknightwho: just wanted to highlight that the second example above is showing up for me as two rectangles when viewed on a mobile device. I assume that isn’t the desired output. — Sgconlaw (talk) 22:37, 26 June 2024 (UTC)Reply

Can you paste a screenshot to imgur.com? For me the second example contains two half-brackets, one in the top left and one in the top right. Benwing2 (talk) 22:39, 26 June 2024 (UTC)Reply

@Benwing2 I have the same issue when viewing it on an iPhone, so I assume the boxes are replacement characters. Theknightwho (talk) 22:40, 26 June 2024 (UTC)Reply

Interesting, I see the same thing with the chars ⸢⸥, while the overly tall ｢｣ look totally fine (they look like how the ⸢⸥ chars look on my desktop). This can be fixed with CSS if necessary. Benwing2 (talk) 22:45, 26 June 2024 (UTC)Reply

@Benwing2 Great - if we can use CSS to fix any size issues then ｢｣ sounds like the best option. Theknightwho (talk) 22:59, 26 June 2024 (UTC)Reply

@Theknightwho Not sure it's possible to use CSS to fix size issues like this (the issue is rather that the characters ｢｣ on the desktop extend to the full height of the bounding box rather than going halfway down) but I think if the characters are appropriately tagged with a CSS class, you can e.g. use ⸢⸥ and make the surrounding CSS class on mobile have "font-size: 0;" (to collapse the original character; "display: none;" would also suppress the ::before content) and use the ::before selector to insert a different character before, something like this:

/* Collapse the original character; display: none would also hide ::before. */
.ts-left {
    font-size: 0;
}
/* Insert the replacement at a visible size (1rem is an assumption; match the surrounding text). */
.ts-left::before {
    content: "｢";
    font-size: 1rem;
}

and similarly for the bottom-right half-bracket. I would prefer we use ⸢⸥ as the actual chars and the above hack on mobile only (maybe iPhone only if it's possible to have a selector for that), since it seems to be a bug in the iPhone's handling of the chars, and since the ⸢⸥ chars seem to be preferred over the ｢｣ chars (which are in the U+FFxx compatibility area). Benwing2 (talk) 23:17, 26 June 2024 (UTC)Reply

I'm having the same issue, and I'm using the latest iPhone too! If my default font doesn't support it then I think there's a good chance that a lot of mobile devices don't. I think that means ⸢⸥ are out of the question. (edited cuz I didn't read the thread before commenting) — BABR・talk 00:28, 27 June 2024 (UTC)

My preferences are ‹qazaq› followed maybe by ｢qazaq｣ and then ⸢qazaq⸣, although I think the latter two both look somewhat strange. If you want to use box corners (or whatever you call them), maybe ⸢qazaq⸥ would be better; this uses top-left and bottom-right "half brackets" rather than the taller versions in ｢qazaq｣, which appear to be called "halfwidth left corner bracket" and "halfwidth right corner bracket". Benwing2 (talk) 19:56, 26 June 2024 (UTC)Reply

Would absolutely not support this change for the reasons found in the original discussion. --{{victar|talk}} 20:48, 26 June 2024 (UTC)

The original reason was that you (and you alone) were used to slashes because of a single dictionary, which isn't a reason at all really. They're misleading, and need to be changed to something else. Theknightwho (talk) 21:11, 26 June 2024 (UTC)Reply

@Benwing2 I find ‹qazaq› slightly hard to make out, but ⸢qazaq⸥ works well. Theknightwho (talk) 21:16, 26 June 2024 (UTC)Reply

⸢qazaq⸥ and ｢qazaq｣ look good to me, ‹qazaq› is too similar to ⟨⟩ employed for representations of writing forms. Fay Freak (talk) 21:23, 26 June 2024 (UTC)Reply

Show me dictionaries or academic papers that use ⸢⸥ or ｢｣. --{{victar|talk}} 21:50, 26 June 2024 (UTC)

Are you going to address the point that using / / to mean two different things is a problem? Show me a single academic dictionary or paper that does that. To be honest, I'm curious to know if there are any sources outside of that one dictionary that use / / for transcriptions. Theknightwho (talk) 22:30, 26 June 2024 (UTC)Reply

Yep, {{R:ira:Novak:2013}} and Basharin (2013) also use slashes for transcriptions. Reusing symbols happens everywhere. If people are not reading the documentation and erroneously putting IPA characters in the |ts= field, there should be better errors to stop them. --{{victar|talk}} 00:03, 27 June 2024 (UTC)

@Victar Where do they do that in the second one? They're just using single slashes as "X/Y" to mean "X or Y", which is completely different. Even if they were using slashes in the way you've chosen to use them, I don't see any instances of IPA in that paper, so it doesn't count, because the whole point is that we're being inconsistent. Try again.

It's not possible to throw errors for this, because symbols used in transcription frequently make for legal IPA, as I've shown above. This is why it's a problem, as I have already pointed out. Theknightwho (talk) 00:15, 27 June 2024 (UTC)

Basharin (2013) p 114: HḄWṢYNʾ hlbyck /xarbīčak, xarbūčak/ ‘water-melon’; Novak (2013) p91: wšw /ʷəxšú, ᵊxʷəšú/.

There are certain characters which are only used in IPA, like ˈ and ː, but what we can also do is let languages that set manual transcriptions specify which characters are acceptable and which aren't. --{{victar|talk}} 00:34, 27 June 2024 (UTC)
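A per-language check of the kind suggested here might look roughly like the Python sketch below. The allow-list for the code otk (used here for Old Turkic) is invented for illustration, as are the function and variable names. As noted earlier in the thread, many transcriptions are also legal IPA, so a check like this could only catch a subset of misuse.

```python
import unicodedata

# Characters named above as used only in IPA: the primary stress mark and
# the length mark (U+02C8 and U+02D0).
IPA_ONLY = {"\u02c8", "\u02d0"}

# Hypothetical per-language allow-lists, keyed by language code.
# The contents are illustrative, not an agreed inventory.
ALLOWED = {
    "otk": set(unicodedata.normalize("NFC", "abčdegïiklmnŋoöpqrsštuüyz")),
}


def check_transcription(lang, ts):
    """Return a list of problems found in a |ts= value (empty if none)."""
    ts = unicodedata.normalize("NFC", ts)
    chars = set(ts)
    problems = []

    ipa = sorted(ch for ch in chars if ch in IPA_ONLY)
    if ipa:
        problems.append("IPA-only characters: " + " ".join(ipa))

    allowed = ALLOWED.get(lang)
    if allowed is not None:
        bad = sorted(ch for ch in chars if ch not in allowed and ch not in IPA_ONLY)
        if bad:
            problems.append("characters not allowed for " + lang + ": " + " ".join(bad))

    return problems


print(check_transcription("otk", "yabïz"))   # passes the illustrative allow-list
print(check_transcription("otk", "qɑzɑq"))   # IPA vowel outside the allow-list
```

Normalising to NFC first means a transcription typed with combining diacritics is treated the same as one typed with precomposed letters.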

Took me no time to find, actually the first modern thing in my Semitic folder: Peter Stein: Lehrbuch der Sabaischen Sprache 2 vols. 2012–2013 uses transcriptions between slashes.

As Victar implies, this has permeated the philologies of the relevant languages, so I felt surprised and gaslighted by the offence in it. I only reviewed the visual merits of various options notwithstanding usage.

Turns also out that ⸢qazaq⸣ cannot be used because it is already used for some conjecture kind of thing, e.g. in Akkadian Written by Egyptian Scribes in the 14th and 13th Centuries BCE in Proceedings of the 53th Rencontre Assyriologique Internationale Vol. 1 page 805 (2010) they cite from the cuneiform version of the treaty between Ramses II and Ḫattusilli III, without explanation, so it is known in the field:

ul-tù ⸢dá⸣-ri-ti ilu(DINGIR-LIM) ú-ul i-na-an-⸢din⸣ a-na e-pé-ši nukurti (LÚ.KÚR) i-na be-ri-šu-nu/ [i-na ri-ki-il-ti a-d]i da-a-ri-ti

‘from the beginning the god did not ever permit the making of hostilities between them [by means of a treaty for]ever’ (lines 10–11; Edel 1997:6, 18). Fay Freak (talk) 00:09, 27 June 2024 (UTC)Reply

Wiktionary is the best glossary and has entry on ⸢ ⸣ , too, which Theknightwho has edited two times. Fay Freak (talk) 00:13, 27 June 2024 (UTC)Reply

I have no particular attachment to ⸢ ⸣. I just want us to use something that isn't going to mislead users by standing for two different things. Theknightwho (talk) 00:18, 27 June 2024 (UTC)Reply

I agree it's a bit problematic to use /.../ for multiple different types of transcriptions, but I'm somewhat conflicted by the idea of using something other than slashes, since slashes are pretty ubiquitous with transcription. Like, just spit-balling here, but what if we did something like:
قزاق (qazāq, trans.^?/qazaq/) or
قزاق (qazāq, /qazaq/^?)
with the ? taking you to a section of the documentation page that explains not to use it for IPA?? Idk, just an idea.

— BABR・talk 00:50, 27 June 2024 (UTC)Reply

The usage of the proposed new delimiters for transcriptions seems unprecedented, while / / has a long-standing tradition behind it. To my eyes there is no inconsistency in how we are using them: / / always means "transcription", and when it is an IPA transcription in particular there is usually a blue "IPA (key)" text before it. In any case, by this logic we would have to strip away [ ] from its usage for normalisations, brackets=on, corrupted portions of quotations, translation/transliteration of book titles, etc. only because it is used already for phonetic transcriptions. Catonif (talk) 00:57, 27 June 2024 (UTC)Reply

(vnv) meaning


The pronunciation of several French words is shown that way, e.g. "crevé":

IPA^(key): (vnv) /kʁə.ve/

What does (vnv) mean here?

I searched in the help pages to no avail. Jlliagre (talk) 00:15, 26 June 2024 (UTC)Reply

That seems to have been added by WingerBot. I also noticed the |pos=v examples at {{fr-IPA}} don’t give what I’d expect. @Benwing2, you seem to be the boss there. MuDavid 栘𩿠 (talk) 01:14, 26 June 2024 (UTC)Reply

@MuDavid Oops, I standardized the handling of various parameters and I forgot that |pos= had a special meaning for {{fr-IPA}}. Will fix. Benwing2 (talk) 01:25, 26 June 2024 (UTC)Reply

Fixed. I'm not sure what pos=vnv was intended to mean; it has no effect now. Benwing2 (talk) 01:33, 26 June 2024 (UTC)Reply

Thanks! Jlliagre (talk) 01:41, 26 June 2024 (UTC)Reply

In October 2019, when WingerBot added vnv to the entry, the module was being rewritten to cover various things like "ihV". But even at the end of that round of edits, I don't see vnv in the module. If no-one knows what pos=vnv was intended to do, should we just remove it...? - -sche (discuss) 02:09, 26 June 2024 (UTC)Reply

Yeah I agree. I looked back to the introduction of pos= in 2016 and it always only had the value of "v". Benwing2 (talk) 03:33, 26 June 2024 (UTC)Reply

Headword word IDs in multiword entries


We need some way to add sense IDs when linking to words in the headword line. Having to do e.g. |head=[[olla]] [[kukko#Finnish: animal|kukko]]na [[tunkio]]lla is not very practical. My first idea was to support some kind of new parameter for {{head}} that supports e.g. inline modifiers after links, like |head2=[[olla]] [[kukko]]<alt:kukkona><id:animal> [[tunkio]]lla. — SURJECTION ^{/ T / C / L /} 12:28, 26 June 2024 (UTC)Reply
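To make the proposal concrete, here is a minimal Python sketch of how inline modifiers after a link could be folded back into an ordinary wikilink. The <alt:...> and <id:...> names come from the example above. The parsing rules, the function name, and the assumption that the sense ID becomes a "#Language: id" anchor (as in the manual example) are illustrative only.

```python
import re

# A [[link]] followed directly by one or more <name:value> modifiers.
LINK_WITH_MODS = re.compile(r"\[\[([^\[\]|]+)\]\]((?:<[a-z]+:[^<>]*>)+)")
MOD = re.compile(r"<([a-z]+):([^<>]*)>")


def expand_head(head, lang_name):
    """Rewrite [[target]]<alt:...><id:...> into a plain piped wikilink."""

    def repl(match):
        target = match.group(1)
        mods = dict(MOD.findall(match.group(2)))
        display = mods.get("alt", target)
        if "id" in mods:
            # Assumed anchor format, copied from the manual example above.
            target = f"{target}#{lang_name}: {mods['id']}"
        if display == target:
            return f"[[{target}]]"
        return f"[[{target}|{display}]]"

    return LINK_WITH_MODS.sub(repl, head)


head = "[[olla]] [[kukko]]<alt:kukkona><id:animal> [[tunkio]]lla"
print(expand_head(head, "Finnish"))
```

Links without modifiers are left untouched, so existing |head= values would pass through unchanged under this scheme.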

I would support something like this. Note that I've implemented a special syntax for some languages (so far, English and various Romance languages) to make it easier to correctly handle multiword linking in long terms. It's documented under Module:en-headword#Link modifications and is enabled when you use a value for |head=, |head2=, etc. that begins with ~. I have thought of extending this to all languages but haven't done it yet. Inline modifiers could be added to this syntax and/or to the plain |head= syntax, as in your example. Benwing2 (talk) 19:00, 26 June 2024 (UTC)

I'm not sure I understand the benefit of the ~ syntax. In any case, modifier support could be added to |head= too, but one would have to avoid breaking existing uses. — SURJECTION ^{/ T / C / L /} 08:36, 4 July 2024 (UTC)

I think that could be done. The reason I introduced the ~ syntax was to avoid you having to repeat the whole head e.g. in a 5+-word lemma when e.g. you need to modify the default linking of one or two words. It doesn't save you effort in a case like |head=[[olla]] [[kukko]]na [[tunkio]]lla when you have only 3 words and two have to modify the default linking, but it helps in a case like English admiral of the Swiss Navy, where the default linking is [[admiral]] [[of]] [[the]] [[Swiss]] [[Navy]] and you want to change the linking of Navy to [[navy|Navy]] while keeping the remainder unchanged. So you'd write ...|head=~[N:n]avy instead of ...|head=[[admiral]] [[of]] [[the]] [[Swiss]] [[navy|Navy]], a savings of 49 - 11 = 38 characters of typing.

I think modifier support could be added without breaking existing uses; a simple solution, which I already have implemented, is to not parse modifiers if HTML is detected at top level. See Module:parse utilities#L-209. This allows HTML inside of qualifiers and such but if you e.g. use {{l|foo|bar}} at top level inside of |head= (which some people do), or ..., it won't trip up the parser. We could make the parser correctly handle inline modifiers in the presence of ... at the cost of a bit more complexity in the parsing code. Benwing2 (talk) 09:07, 4 July 2024 (UTC)
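
As a rough illustration of the default linking discussed above (each word linked to itself, with only the exceptions spelled out), here is a small Python sketch. It does not implement the actual ~ syntax of Module:en-headword; the function `link_head` and its `overrides` mapping are hypothetical stand-ins for the idea that only the changed word needs to be written out.

```python
def link_head(head, overrides=None):
    """Link every space-separated word to itself unless an override is given.

    `overrides` maps a displayed word to the page it should link to
    (a hypothetical input format, not the real ~ syntax).
    """
    overrides = overrides or {}
    linked = []
    for word in head.split():
        target = overrides.get(word)
        if target:
            linked.append(f"[[{target}|{word}]]")  # piped link for the exception
        else:
            linked.append(f"[[{word}]]")           # default: link the word as-is
    return " ".join(linked)


print(link_head("admiral of the Swiss Navy"))
print(link_head("admiral of the Swiss Navy", {"Navy": "navy"}))
```

The second call produces the fully written-out head from the example above while the editor only supplies the one exception.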

Words of uncertain reading, etc.

I have a bit of a conundrum. Many Old Polish words have uncertain readings, for which we have Appendix:Old Polish terms of uncertain reading. However, I discovered that some also have uncertain parts of speech, like Appendix:Old Polish terms of uncertain reading#baze. What would be the best way to handle this? Vininn126 (talk) 18:35, 26 June 2024 (UTC)

@Vininn126: maybe put the word under the most plausible part of speech, and add a usage note explaining that there’s uncertainty about this. — Sgconlaw (talk) 22:40, 26 June 2024 (UTC)

That's the thing. It sort of looks like a preposition but it could be a noun. I'm not sure there is a most plausible one here. Vininn126 (talk) 22:52, 26 June 2024 (UTC)

@Vininn126: in that case it probably doesn’t matter which one you pick. — Sgconlaw (talk) 22:56, 26 June 2024 (UTC)

Sgconlaw's idea sounds reasonable, but to spitball some other ideas in case any are more appealing: I notice assalay just doesn't put a part of speech at all. I have seen "particle" used as a catchall / wastebasket for anything that doesn't clearly fit somewhere else. I have seen Chinese entries just use "Definitions" as the L3 / POS header (not without considerable controversy). - -sche (discuss) 23:01, 26 June 2024 (UTC)

@-sche That's an interesting idea. Vininn126 (talk) 05:42, 27 June 2024 (UTC)

We also have kelaunikui under the POS heading "Word"... This, that and the other (talk) 09:12, 27 June 2024 (UTC)

Hm, I suppose since appendices often are the wild west, this could work, too. This might be my favorite option so far. Vininn126 (talk) 09:14, 27 June 2024 (UTC)

Is |3=lemma documented in {{head}}? Vininn126 (talk) 09:15, 27 June 2024 (UTC)

I don’t think new part of speech headings not sanctioned by Wiktionary:Entry layout should be used without consensus. — Sgconlaw (talk) 10:46, 27 June 2024 (UTC)

Links within glosses

Do we have any sitewide guidelines or policies or policy-adjacent recommendations regarding links within glosses? By that I mean for example eau (“[[water]]”) versus eau (“water”). I have long been under the impression that links within glosses were discouraged, but I have seen other editors not just use links in glosses when writing new text, but actively add links to glosses in existing text. Is this a "do whatever you like" situation or is there actual guidance somewhere? I couldn't find anything at WT:ELE or in the documentation for {{l}}. —Mahāgaja · talk 08:55, 27 June 2024 (UTC)

I'm interested to know what others say as well. I personally don't have strong opinions either way, so I'm willing to adapt to what people think the practice should be. Vininn126 (talk) 09:00, 27 June 2024 (UTC)

I was once advised not to link in etymology sections, but I do it in definitions since we usually link key words in definitions anyway. — Sgconlaw (talk) 10:48, 27 June 2024 (UTC)

It would make sense to ensure that words used in glosses didn't require links to be understood. If an obscure, technical, or highly polysemic word is used in a gloss, then a link is warranted and even essential. DCDuring (talk) 13:57, 27 June 2024 (UTC)

@DCDuring I treat glosses of English and non-English terms differently (and I don't think I'm the only one): for non-English terms, I prefer to give a precise English equivalent, if possible (i.e. something you'd use if you were translating it into English as part of a larger piece of text). For English terms, glosses are purely explanatory. In the latter case, I agree that they should use straightforward language that most people would understand without needing a link, but in the first case that's not always possible (or even appropriate, if the term is technical or esoteric), so a link makes sense. If I see a non-English term with an explanatory gloss, that indicates to me that there isn't an English equivalent (or the editor didn't know what it was). Theknightwho (talk) 16:00, 27 June 2024 (UTC)

Yes. If anything, we need more explanatory glosses in non-English L2s. It is a very lazy approach that simply inserts a polysemic English term as a gloss. In those cases, in particular, a link to a senseid'ed definition would be a good alternative to an unlinked explanation. DCDuring (talk) 16:10, 27 June 2024 (UTC)

@DCDuring Yeah, if the best English equivalent is a polysemic term, then I'd definitely opt for the double-approach (with maybe an abridged explanation, depending on context): 德國人／德国人 (Déguórén, “German; person from Germany”).

Otherwise, we risk more "vessel" incidents (Rajkiandris added a stub with the definition "vessel", and linked it as a translation under "ship", but the actual meaning was "drinking vessel"). Theknightwho (talk) 20:17, 27 June 2024 (UTC)

Yeah, I think this is more or less my approach as well. Vininn126 (talk) 16:40, 27 June 2024 (UTC)

That's my approach. I avoid linking glosses in etymologies and much prefer them not to be linked there. I link only difficult terms in glosses in FL entries, though I don't usually remove links that are already there. Andrew Sheedy (talk) 17:30, 27 June 2024 (UTC)

I agree with the themes herein. Thus: Try to use simple words in glosses, which don't need links; but when a nonsimple word is apt (and circumlocuting around it is counterproductive), just ensure that it is linked so that any user who wants to find out what it means can easily click/tap. Corollary: in this context, take a moment to bother to send them straight to a POS anchor or ID anchor. They don't want to land at the top of a big-ass entry and then hunt for what was meant in the context that they came from. Quercus solaris (talk) 18:34, 27 June 2024 (UTC)

It used to be done a lot back in the old days -- not so much anymore. I generally remove them when I see them, so I would be in support of hardcoding it into the rules. --{{victar|talk}} 18:56, 27 June 2024 (UTC)

I think User:Theknightwho summed up the principles well for how to write good definitions. I would just add/clarify (in the following I'm thinking specifically of non-English terms):

Don't use obsolete, dated or archaic terms in definitions, even if the term or sense itself is obsolete. E.g. I just came across مَنَّ (manna) with the definition #6 "to reproach, to upbraid, to exprobrate". I don't know what "exprobrate" means and it adds negative value to the definition; "reproach" and "upbraid" are enough ("scold" would be even better) and including an archaic term just confuses things. Similarly, definition #4 says "# (obsolete) to jade, to tire". Even though the meaning itself is obsolete, you should not use obsolete or obscure terms like jade in definitions.
Don't give more than 3 synonyms. E.g. Stephen Brown (RIP?) and certain other contributors would sometimes list 10 or more synonyms in a definition. This is IMO unhelpful esp. as most of the time the different synonyms all have different shades of meaning in English, so it's not clear which ones are the best translations.
If a term used in a definition has more than one possible meaning, you should include context to clarify the meaning; either a label, or a synonym, or an explanatory qualifier, etc.
When giving the definition of a non-English term, you should strive very hard to find the equivalent English term; give a multiword definition only if there is no equivalent English term or you really can't find it (e.g. for some technical terms in foreign languages, it can be extremely hard to figure out the equivalent in English unless you're an expert in the field in question). You want to think in terms of translation equivalence as much as possible. At the same time, some foreign terms have extra shades of meaning that aren't conveyed by the closest English term; those shades should be conveyed using qualifiers.

Benwing2 (talk) 06:54, 28 June 2024 (UTC)

@Benwing2 One last thing I'd add is not to shy away from using the term itself as the first word in the gloss if English has borrowed it in an unadapted form. e.g. 호떡 (hotteok, “hotteok; a type of filled pancake popular as street food in South Korea”) might look a bit silly, but it tells the reader that English does have a term for it, and it just-so-happens to be a direct borrowing. You sometimes see a similar phenomenon in non-English entries where the term is the same as English (e.g. manga), where editors don't bother linking to the English entry and instead write an explanatory definition (or copy the English one), not realising that that implies English has no equivalent term. The fact it's the same in both languages shouldn't make any difference, really, since a naive reader isn't going to know that until you tell them. Theknightwho (talk) 19:16, 29 June 2024 (UTC)

Totally agreed. Benwing2 (talk) 19:57, 29 June 2024 (UTC)

Request for Template/module editing permissions

I've been working on templates/modules for the last few days (mostly clearing out Category:Categories_that_are_not_defined_in_the_category_tree), and these permissions would allow me to keep doing so. I've also been wanting to work on updating Template Data on various templates. I was told this was the right place to make such a request. Akaibu (talk) 16:40, 27 June 2024 (UTC)

@Akaibu Can you be more specific as to what you are interested in working on and what permissions you're looking for? Are you referring to Wiktionary:Template editors? Generally we don't give template editor permissions unless there's a very good reason to do so and for a long-established user with a solid record of work on modules, because template editor permissions give you the ability to make changes to core modules that can massively mess things up if not done carefully. If there is a specific module you want to work on that is template-editor-protected, it might be better to downgrade the permissions on the module (depending on which module it is). Also, when you say "Template Data" are you referring to the TemplateData documentation stuff at the bottom of documentation pages such as the one for Template:mention? You don't need template editor rights to edit doc pages. Benwing2 (talk) 21:29, 27 June 2024 (UTC)

I suppose the various subpages of Module:category_tree/poscatboiler/data are what I'm intending to work on? I don't know; I've just encountered a number of cases where I needed to ask someone else to modify a data module in the course of trying to clear that maintenance category. Some of the subpages I am able to edit, but it's kind of arbitrary from what I understand. Akaibu (talk) 02:11, 28 June 2024 (UTC)

@Akaibu Yeah we need to fix the permissions of any of these submodules that aren't set to "autoconfirmed". Let me know if you find any. Benwing2 (talk) 18:32, 28 June 2024 (UTC)

Hi, sorry for the late reply. Now that I'm looking into it: for example, I currently can't add anything to Module:category_tree/topic_cat/data/Places. Akaibu (talk) 20:47, 1 July 2024 (UTC)

new version of Template:+obj

We currently have three very underpowered templates {{+obj}}, {{+preo}} and {{+posto}} for indicating governance of verbs, nouns, adjectives and the like. Back in Jan 2021 I created a better and much more powerful replacement, but I wasn't satisfied with the formatting so it's sat on the back burner. Today I made a bunch of formatting changes that IMO make it look significantly nicer, along with better support for qualifiers and support for alternants etc. in the case associated with a given adposition. I am thinking it's ready for deployment, but I'm soliciting some further comments esp. on the formatting. See User:Benwing2/test-obj for a bunch of examples. Some comments:

I'm not wedded to the square brackets surrounding the whole governance structure. This is consistent with how {{+obj}} etc. currently work, but maybe we should just switch to parens.
I'm also not totally wedded to the glosses written small in single quotes like this: ‘towards’. I did it this way because I felt that regular-sized glosses with double quotes distracted from the overall structure.
The conjunctions "along with" and "and" mean the same thing; "along with" is used when joining arguments that contain alternants, as in [with accusative ‘whom’, along with genitive or an ‘which matter’]; here, if you take out the "or an", it would change to [with accusative ‘whom’ and genitive ‘which matter’]. The reason for using "along with" is to avoid the ambiguity that would result from having "foo and bar or baz".

My plan is to rename {{+obj}} to {{+obj/old}}, deploy the new {{+obj}} (which subsumes {{+preo}} and {{+posto}}), with proper documentation, and convert all the old uses to the new syntax. This should be mostly doable by bot, along with some manual cleanup to handle cases where the underpowered nature of the current templates results in people using weird hacked-up notation to express situations that aren't handlable in the current syntax. Benwing2 (talk) 02:03, 28 June 2024 (UTC)
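
The "and" versus "along with" rule from the third comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Argument` class and `render_governance` function are hypothetical, the real template's input format is not modelled, and the sketch assumes the same conjunction is used between all arguments.

```python
from dataclasses import dataclass


@dataclass
class Argument:
    alternants: list[str]  # e.g. ["genitive", "an"]; a single item means no alternants
    gloss: str             # e.g. "which matter"


def render_argument(arg: Argument) -> str:
    # Alternants inside one argument are joined with "or".
    return f"{' or '.join(arg.alternants)} ‘{arg.gloss}’"


def render_governance(args: list[Argument]) -> str:
    # If any argument contains alternants, plain "and" would be ambiguous
    # ("foo and bar or baz"), so ", along with" is used instead.
    has_alternants = any(len(arg.alternants) > 1 for arg in args)
    conjunction = ", along with " if has_alternants else " and "
    return "[with " + conjunction.join(render_argument(a) for a in args) + "]"


print(render_governance([
    Argument(["accusative"], "whom"),
    Argument(["genitive", "an"], "which matter"),
]))
print(render_governance([
    Argument(["accusative"], "whom"),
    Argument(["genitive"], "which matter"),
]))
```

The two calls reproduce the two bracketed examples quoted in the comment.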

I personally actually like the formatting. I think brackets are better than parentheses. I guess making the object of the preposition not small would be marginally better. We may want to establish whether "something/someone" would be better as the input or "what/whom". I'm supposing most people would prefer the indefinite pronoun. I think I have a slight preference for "whom/what".

Being able to list multiple prepositions/cases that have the same meaning (i.e. let's say you have Polish przy czymś (loc)/nad czymś (ins) or maybe even "co/nad czymś" or what-have-you) would be nice (unless I am missing something). Vininn126 (talk) 18:36, 28 June 2024 (UTC)

@Vininn126 You can in fact list multiple prepositions/cases with the same meaning. I put an example of that at the very bottom with czekać. Internally they are slash-separated but currently it displays using with foo + case or bar + other-case. Benwing2 (talk) 20:35, 28 June 2024 (UTC)

Great! Vininn126 (talk) 20:37, 28 June 2024 (UTC)

I am in complete assent and approval of the initiative. I especially like the way this allows me to add register qualifiers like ‘(formal) with preposition or (informal) with accusative’—to say nothing of the liberty to combine prepositional and non-prepositional objects. ―⁠Biolongvistul (talk) 19:10, 28 June 2024 (UTC)

Looks and performs better, so

Support. Fay Freak (talk) 19:29, 28 June 2024 (UTC)

So, not intended to work with English verbal complements, like 'to infinitive', 'bare infinitive', 'ing-form/gerund/present participle'? DCDuring (talk) 21:17, 28 June 2024 (UTC)
Yes it should work with English. It's not lang-specific. Benwing2 (talk) 21:41, 28 June 2024 (UTC)
But not without modification of Module:form of/data, which is beyond my ken. DCDuring (talk) 23:50, 28 June 2024 (UTC)
You can put arbitrary text in place of things like "acc" or "accusative", just like when using {{infl of}}; it's just that you won't get such text auto-linked to the glossary. Benwing2 (talk) 01:42, 29 June 2024 (UTC)

This looks great, very nice work! I wonder if it will be able to incorporate more complex displays in a language-specific manner, e.g. Hungarian érdeklődik ⟨ to inquire [with delative (-ról/-ről) or iránt or felől or után ‘about something’] ⟩, and add the relevant categories (currently implemented with {{hu-case}}). (@Adam78, Panda10) Einstein2 (talk) 21:49, 28 June 2024 (UTC)
@Einstein2 There isn't currently built-in support for displaying the (-ról/-ről) portion or auto-adding categories; the rest can be handled with the existing support. Is the (-ról/-ről) something that should be auto-added any time you use delative or does it need to be added per usage? Benwing2 (talk) 23:49, 28 June 2024 (UTC)
I was not expecting the template to support such things at this early stage, I was just curious if it was possible to display arguments in a more advanced manner for specific languages in the future. In Hungarian, each case comes with its own set of suffixes, so it would not be necessary to add the suffixes separately in each entry. Einstein2 (talk) 00:28, 29 June 2024 (UTC)
@Einstein2 OK, I'll keep that in mind, it should be possible to add in the future. Benwing2 (talk) 01:42, 29 June 2024 (UTC)
@Einstein2, @Benwing2: I'd prefer to display the actual case suffixes instead of the case names. It's much clearer to learners. Examples: elnézést kér, beszél. (@Adam78). Panda10 (talk) 17:42, 30 June 2024 (UTC)
I agree with Panda10. The Hungarian language has 18 cases and only the most hardened linguists can identify their suffixes by their names. (Not to mention at least twenty postpositions that can also be arguments.) Adam78 (talk) 17:57, 30 June 2024 (UTC)
Oofda, ya, totally agree with this! As a native English speaker and Hungarian learner, it has been much easier to simply learn the case suffixes themselves (and what they mean), rather than the obtuse English-language names for each case. Just by my own subjective experience of viewing other online fora, I'm not the only one in this position. 😉

→ Please just display the suffixes, instead of the case names. ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:42, 1 July 2024 (UTC)
@Panda10, Adam78, Eirikr I was not proposing to exclude the suffixes from the template. In my example above, I included both the case name and the corresponding suffixes (with the latter shown in boldface). While they are not very well-known, I think displaying case names would be beneficial. They are also displayed in {{hu-infl-nom}}, therefore, showing them in verb entries could assist learners when looking for the correct inflected form of a noun etc. that serves as an argument. Besides, they also account for cases when the argument does not strictly contain the case suffix (as with the pronominal adverbs neki, velünk etc.). Also, they take up minimal space so I don't think there is much harm in displaying them. Einstein2 (talk) 19:25, 1 July 2024 (UTC)
Yeah I will add some lang-specific support so that if you underlyingly write "inessive" or "allative" or whatever, it automatically shows the corresponding suffix next to the case name. Benwing2 (talk) 21:37, 1 July 2024 (UTC)
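
A minimal Python sketch of what such language-specific support might look like: a per-language table from case names to suffixes, consulted when the case name is displayed. The table name `CASE_SUFFIXES` and the function `display_case` are hypothetical; the delative suffixes come from the example above, and the inessive and allative entries are the standard Hungarian suffixes for those cases, added here only for illustration.

```python
# Hypothetical per-language data: case name -> suffixes shown next to it.
CASE_SUFFIXES = {
    "hu": {
        "delative": "-ról/-ről",       # from the érdeklődik example above
        "inessive": "-ban/-ben",       # standard Hungarian inessive suffixes
        "allative": "-hoz/-hez/-höz",  # standard Hungarian allative suffixes
    },
}


def display_case(lang: str, case: str) -> str:
    """Return the case name, followed by its suffixes when the language defines them."""
    suffix = CASE_SUFFIXES.get(lang, {}).get(case)
    return f"{case} ({suffix})" if suffix else case


print(display_case("hu", "delative"))
print(display_case("pl", "accusative"))  # no table for this language: name only
```

Keeping the suffixes in a data table means an entry only names the case, and languages without a table are displayed exactly as before.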

Cheers, that sounds good, thank you for clarifying. ‑‑ Eiríkr Útlendi │^{Tala við mig} 22:19, 1 July 2024 (UTC)

Does it have all the functionalities of {{indtr}}? P U C – 14:30, 29 June 2024 (UTC)

@PUC Overall it is much more powerful than {{indtr}} but also fundamentally different as it is intended to go after the definition rather than mixed in with the labels. As a result, for example, it doesn't have support for including arbitrary labels; the intention is that you use {{lb}} for that. Benwing2 (talk) 19:59, 29 June 2024 (UTC)

BTW it is intended to replace {{indtr}} by using {{lb}} for labels and {{+obj}} for governance. Benwing2 (talk) 20:01, 29 June 2024 (UTC)

This looks good :) for the kinds of languages which currently use those other templates, languages in which verbs etc govern cases. I take it that it is to be used for those languages (not English, which has basically no cases to speak of)? The only English uses I spotted (so far) are the ones raised here, at back out and back, where it seems more confusing to me (with the equals sign and the invention of an accusative case) than just using parentheses to indicate the object, as other English verbs do. (A comment there prompted me to copy this sentiment over to here.) - -sche (discuss) 00:45, 30 June 2024 (UTC)

Yes, I agree we should just put regular objects in parens. Right now I'm in the process of offline-converting uses of {{+obj}} in the old syntax to the new one, and in cases of a simple object I've changed them to use parens instead of {{+obj}}. Occasionally I've found it necessary to explicitly indicate transitivity using {{+obj|&transitive}}, e.g. in Catalan reparar (“to notice, to pay attention to”), where I've written {{+obj|ca|&transitive/:en<someone/something>}} which looks like [transitive or with en ‘someone/something’] (where the initial & suppresses the with that would normally precede the word "transitive"); this is to indicate that you can say either "reparar something" or "reparar en something". Benwing2 (talk) 02:14, 30 June 2024 (UTC)
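
For readers trying to follow the Catalan example, here is a rough Python sketch that turns the argument string `&transitive/:en<someone/something>` into the displayed text. It models only what the post states (slash-separated alternants, a leading & suppressing "with", a gloss in angle brackets); everything else is an assumption. In particular, the leading `:` is simply stripped because its exact meaning is not given above, "with" is repeated for every alternant that lacks &, and the function names are hypothetical.

```python
import re

# Hypothetical: a word optionally followed by a <gloss>.
WORD_AND_GLOSS = re.compile(r"^(.*?)(?:<(.*)>)?$")


def split_top_level(spec: str, sep: str = "/") -> list[str]:
    """Split on `sep`, ignoring separators inside <...> (glosses may contain '/')."""
    parts, current, depth = [], [], 0
    for ch in spec:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def render_alternant(alternant: str) -> str:
    suppress_with = alternant.startswith("&")  # & suppresses the leading "with"
    body = alternant.lstrip("&:")              # assumption: ':' is just a marker
    word, gloss = WORD_AND_GLOSS.match(body).groups()
    text = word if suppress_with else f"with {word}"
    return f"{text} ‘{gloss}’" if gloss else text


def render_obj(spec: str) -> str:
    return "[" + " or ".join(render_alternant(a) for a in split_top_level(spec)) + "]"


print(render_obj("&transitive/:en<someone/something>"))
```

The sketch covers just this one pattern; it is not a description of how the new {{+obj}} parses its arguments in general.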

Category for subjunctives

Should one be created to accompany Category:English imperative sentences? J3133 (talk) 02:40, 28 June 2024 (UTC)

Would that it were. DCDuring (talk) 21:19, 28 June 2024 (UTC)

@DCDuring: I have created Category:English subjunctive expressions. J3133 (talk) 17:30, 3 July 2024 (UTC)

If I knew whether the category should include conditionals or redirects that contain would to entries that do not, I would increase the population of the category substantially. DCDuring (talk) 18:46, 3 July 2024 (UTC)

I think Category:Japanese onomatopoeias should be Category:Japanese ideophones

As with Korean, which is similar. Here we see Category:Korean ideophones, which encompasses both sound symbolism and expressives (emotions, sights, and so on), as the umbrella category, and then Category:Korean onomatopoeias within it for the sound symbolism only.

As the situation seems similar in Japanese, I think we should move the present onomatopoeia category to ideophones and then create a new category within it for the sound-based words.

For example, there is a word どぎまぎ (dogimagi), which seems like a loose parallel, both in meaning and in rhythm, for English heebie-jeebies, but we don't call heebie-jeebies onomatopoeia.

Also, is there a reason why Category:Japanese onomatopoeias is not a subcategory of Category:Japanese lemmas? It took me quite a while just to find it, because I was expecting it to be categorized like that of Korean and other languages. Thanks, —Soap— 19:19, 28 June 2024 (UTC)

I just noticed the existing category Category:Gitaigo, which contains terms that I would consider to be expressives, which is to say, they're the non-sound-based ideophones, which I believe are the counterpart to Korean's 의태어. There are certainly more than 3 of these words in Japanese, so I propose that this category be expanded and perhaps renamed so that it will be easier to find. —Soap— 19:29, 28 June 2024 (UTC)

It may be that one reason why the Japanese categories seem so under-documented is that we don't usually have an etymology for these words, and that is where the {{onom}} template goes in the other languages' words. —Soap— 19:30, 28 June 2024 (UTC)

I would be supportive of recategorizing such terms as "ideophones". It's awful difficult to think of terms like シーン (shīn) as "onomatopoeia", when it's supposed to be the "sound effect" of no sound at all — silence. Or things like ぴりぴり (piripiri), a "pins and needles" feeling or possibly the feeling of spicy food on the tongue. There's no sound to that either. The "ideophone" label would seem to cover these, as well as actual imitative adverbs like ぞっと (zotto, “with a shudder”) or ばんばん (banban, literally “bang bang”).

‑‑ Eiríkr Útlendi │^{Tala við mig} 01:05, 29 June 2024 (UTC)

I think there's also quite a distinctive difference between adverbial ideophones, such as the ones you describe, and basic onomatopoeic interjections like カー (kā, “caw”, sound of a crow) or SFX sound effects in manga like チャッ (cha', “shing”, a sharp metallic sound).

I think a lot of the issue is caused by the fact that it's often difficult to translate ideophones idiomatically into standard English, so they get lumped in with sound effects and/or translated in ways that don't convey the ideophonic effect, but the same phenomenon is actually pretty common in colloquial English - especially when spoken: "my chest was thump thump thump-ing as I..." (どきどき (dokidoki)); "I crept forward tip-toe-tip-toe and then..." (ちょこちょこ (chokochoko)) etc. etc. They're verbal or (quasi-)adverbial terms that evoke some kind of vibe, not literal sound effects, and are often completely ad hoc. Theknightwho (talk) 22:41, 29 June 2024 (UTC)

Sounds good to me. If some ideophones right now are classed as onomatopoeic despite not really representing a real-world sound, it may be sort of misleading to keep it as it is. Kiril kovachev (talk・contribs) 21:21, 9 July 2024 (UTC)

Requesting template editor right

Requesting template editor right so I can edit the topical label data modules and the like. I just added a new category for Brazilian politics but cannot add related label data now (I previously added new label data for similar regional politics categories, such as for Philippine politics and Palestinian politics but the level of protection has been increased lately due to disruptive additions from other users). TagaSanPedroAko (talk) 21:25, 30 June 2024 (UTC)

@TagaSanPedroAko I lowered the protection one notch to "autopatrollers" so you can edit this. Let me know if you run into issues with any other modules. Benwing2 (talk) 03:41, 1 July 2024 (UTC)

July 2024

Parsing the principal parts in the headword lines of Latin verbs

Previous discussion:

Wiktionary:Beer parlour/2023/October § Changing Latin verb definitions to use "to ..." instead of "I ..."

As a consequence of the change to Latin verb definitions initiated in the above-linked discussion, some editors raised the concern that some readers might be confused by the mismatch between the grammatical status of the verb form which serves as the Latin lemma (the first-person singular present active indicative) and the status quo novus of how Latin verbs are defined (with the English infinitive). Accordingly, the suggestion was made to parse the first principal part of Latin verbs (the other three principal parts had already been parsed since time immemorial). The code to institute this was added, but soon after removed again because “[it] wildly change[d] the visual outcome of what Latin verbs have always been and [had] not [been] changed with a valid vote”. Given that consensus for the addition is not yet apparent, I thought I'd try to reinvigorate the discussion here.

Since the occasion has presented itself, I would like to propose a reform to how Latin verbs are parsed. I shall use for my example the old grammarians' favourite, Latin amō (“I/to love”). Currently, Latin verbs are parsed thus:

amō (present infinitive amāre, perfect active amāvī, supine amātum); first conjugation

and restoring the parsing of the first principal part as it stood 14–17 November 2023 would result in:

amō first-singular present indicative (present infinitive amāre, perfect active amāvī, supine amātum); first conjugation

In both these schemes, however, the parsings are badly and inconsistently conceived. All the parsings are abbreviated; if they weren't, this is what we'd see:

amō first-person singular present active indicative (present active infinitive amāre, first-person singular perfect active indicative amāvī, accusative supine amātum); first conjugation

The first three principal parts (hereafter PPs) are all active, so why is only the third PP labelled as such? The first and third PPs are both first-person, singular, and indicative, so why is only the first PP labelled as such? This doesn't really make sense. Note that, for normal Latin verbs' principal parts, no person but the first person, no number but the singular number, and no voice but the active voice is ever mentioned. Only the tense (present or perfect) and the mood (indicative or infinitive) vary; accordingly, they are the important features to parse. If we restricted ourselves thereto, amō would look like this:

amō present indicative (present infinitive amāre, perfect indicative amāvī, supine amātum); first conjugation

Of course, the initial concern about the aforementioned mismatch remains in that scheme, so what I actually propose is:

amō present indicative (present infinitive amāre, perfect indicative amāvī, supine amātum); first conjugation

Thoughts? 0DF (talk) 03:14, 1 July 2024 (UTC)

I agree that we can leave out "active" from all parts: I doubt anyone would expect unlabeled forms to be passive, so treating the active as an unmentioned default seems unproblematic. So I support replacing "perfect active" with "perfect indicative". I think it's not so obvious though that the citation form is first-person singular. While "first-person singular present indicative" is a bit long, possibly we could use "1sg present indicative" and "1sg perfect indicative" with a link explaining what the abbreviation 1sg means?--Urszag (talk) 03:35, 1 July 2024 (UTC)

I suggest using standard linguistic glosses throughout:

amō 1sg.prs.ind, amāre prs.inf, amāvī 1sg.prf.ind, amātum sup

With tooltips spelling everything out in plain English. Nicodene (talk) 09:56, 1 July 2024 (UTC)
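
To illustrate how the tooltip text for such glosses could be generated, here is a small Python sketch that expands dot-separated abbreviations into plain English. The table covers only the abbreviations used in the example above; the names `ABBREVIATIONS` and `expand_gloss` are hypothetical and no existing module is implied.

```python
# Abbreviations from the suggested headword line, with their plain-English readings.
ABBREVIATIONS = {
    "1sg": "first-person singular",
    "prs": "present",
    "prf": "perfect",
    "ind": "indicative",
    "inf": "infinitive",
    "sup": "supine",
}


def expand_gloss(gloss: str) -> str:
    """Expand a dot-separated gloss such as '1sg.prs.ind' for use as tooltip text."""
    # Unknown abbreviations are passed through unchanged rather than raising an error.
    return " ".join(ABBREVIATIONS.get(part, part) for part in gloss.split("."))


for form, gloss in [("amō", "1sg.prs.ind"), ("amāre", "prs.inf"),
                    ("amāvī", "1sg.prf.ind"), ("amātum", "sup")]:
    print(f"{form}: {expand_gloss(gloss)}")
```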

This one reads the smoothest. Fay Freak (talk) 14:16, 1 July 2024 (UTC)

I don’t think this is a good idea. Tooltips don’t display on mobile devices, and the abbreviations are difficult for ordinary readers to understand. — Sgconlaw (talk) 16:17, 1 July 2024 (UTC)

We already do that for gender and number. Nicodene (talk) 21:05, 1 July 2024 (UTC)

Funny that I specifically considered mobile users and reached the opposite conclusion. It is advised either way to keep the lines short and avoid line breaks ergo as well vertical space. Our mobile presentation can look different, by including a question-mark toolbutton in place of relying on tooltips. And link to w:List of glossing abbreviations or something; we have not shied away from having an Appendix:Glossary, so @Sgconlaw is inconsequential here, though Nicodene has not expressly thought it through. Fay Freak (talk) 23:15, 1 July 2024 (UTC)

My opinion may be a bit radical. Our headwords are already five times more verbose than normal paper dictionaries, because of course, we can afford it, but visual clutter is still a thing (i.e. scouting out the real information drowned in labels). Even as they currently stand I would make them shorter:

amō (perfect amāvī, supine amātum, infinitive amāre)

More thorough explanations can be held in a centralised appendix, which should somehow be accessible and reader-friendly. As a side note, I notice only now that our Latin headwords have moved infinitives from the fourth place (where they are in all Latin dictionaries) to the first one. I don't see any obvious reason why we would be breaking such a secular practice in Latin lexicography. Catonif (talk) 11:24, 2 July 2024 (UTC)

I’d be happy with that. Plus perhaps a footnote/toolbutton/whatever with a message along the lines of:

“The four principal parts are as follows: first-person singular present active indicative (first-person singular perfect active indicative, accusative supine, present active infinitive).”

Nicodene (talk) 12:29, 2 July 2024 (UTC)

Seeing as the headword will never need to link to another page (since it's identical to the name of the page), I guess we could put a tooltip on the headword itself that when hovered on, explains that it is a first-person present indicative form--unless that would be too distracting. Like this:

amō (perfect amāvī, supine amātum, infinitive amāre)

--Urszag (talk) 13:56, 2 July 2024 (UTC)

I suppose that is enough. More complete information about the other three principal parts can always be found by just clicking on them. Nicodene (talk) 16:02, 2 July 2024 (UTC)Reply

I support this--i.e., (1) having tooltips so that all the relevant information about the verb forms is presented, but (2) without cluttering the entry, and (3) moving the infinitive to the end to match the order used by other dictionaries. I would also support a solution that used straightforward abbreviations (e.g. "1st person sing."). I am concerned about the status quo, which has simply removed any indication of the lemma not being an infinitive form. Andrew Sheedy (talk) 18:48, 2 July 2024 (UTC)Reply

Re reordering the principal parts, I was surprised to discover that Charlton Thomas Lewis's An Elementary Latin Dictionary (amō) and Félix Gaffiot's Dictionnaire illustré latin-français (ămo) do indeed give the principal parts in the order present indicative, perfect indicative, supine, present infinitive. However, I'm not seeing universality here. Harm Pinkster's Woordenboek Latijn/Nederlands (amō via Logeion) gives only the present indicative and present infinitive; Joaquim Affonso Gonçalves' Lexicon Magnum Latino-Sinicum (Amo) gives the first- and then second-person singular present active indicative (amō, amās), followed by the present active infinitive; whereas Lewis & Short's A Latin Dictionary (ămo) gives them in the order present indicative, perfect indicative, supine, omitting the present infinitive altogether. Our order is that which prevails on Wikipedia and in Joseph Henry Allen and James Bradstreet Greenough's New Latin Grammar for Schools and Colleges (see § 1.26.1 thereof). Pace Catonif, I am not persuaded that we are in fact "breaking…a secular practice in Latin lexicography" by ordering verbs' principal parts as we do. 0DF (talk) 20:38, 2 July 2024 (UTC)Reply

Aha! Well conducted research. Not trying to persuade you, legitimately thought that was a thing, mostly due to the Italian lexicographic tradition for Latin verbs we're taught in schools, which is actually first person singular indicative present, second person singular indicative present, perfect, supine, infinitive: all four Italian-Latin works I have access to work like that. I guess if it is not as rigid as I was taught we can go with whatever is most common in Anglophone literature, or really, we can keep the current system and eventually discuss this separately if we feel like we have to.

Anyways, I also agree with having a tooltip (even though there's always the mobile issue), though maybe one just containing "first person singular"? Indicative, present, and active are all what you'd expect anyway from a lemma, and the first-person status is also shared with the perfect. Catonif (talk) 22:10, 2 July 2024 (UTC)

Given that around 30-40% of our readers are using mobile devices (source), I Oppose any further use of hover titles beyond what is already in place. Would definitely Support removing the word "active" and re-adding the caption that labels the first principal part as the first-person singular present (indicative? not sure if we need to mention that). This, that and the other (talk) 23:54, 8 July 2024 (UTC)

Should minimum wage be in a PIE root category?


Sorry for such a specific header; what I mean to address is actually this general issue, but I wasn't able to capture it in a concise enough title. It is something that has bugged me for quite some time already, but now that {{etymon}} has started to be widely employed and even deals with categorisation (not without disagreements, see GP § June 2024), it seems to have become a very pressing issue. For the sake of an example, take Category:English terms derived from the Proto-Indo-European root *mey- (small). At the moment it contains mostly sensible entries and is an actually enjoyable and relatively useful category. But notice entries such as minimum wage (where the question would be whether that needs an etymology in the first place) and, most importantly, note that once we fully achieve proper automated consistency, this will contain thousands of entries containing the prefix mono- (e.g. monobrominated, monochromaticity, monotheistically), making the category overly cluttered and non-obvious results much harder to find in it.

My approach for this (automation aside, this is how I have always used {{root}}) is to only add to that category the terms that aren't themselves derivates (or derivable, tricky here) from other entries that are already in the category. Essentially the entry would contain only the basic terms by which all the other ones can be derived.
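As a rough illustration of the rule described above, here is a minimal Python sketch. It assumes each entry can list the terms it is built from; the function name `basic_members`, the `components` mapping and the toy data are all hypothetical and are not part of {{root}}, {{etymon}} or any existing module.

```python
def basic_members(derived_from_root, components):
    """Keep only terms that are not built from another term already in the category.

    derived_from_root: set of terms that ultimately go back to the root.
    components: hypothetical mapping from a term to the terms it is formed from.
    """
    return sorted(
        term
        for term in derived_from_root
        if not any(part in derived_from_root for part in components.get(term, ()))
    )


# Toy data, invented for illustration only.
derived = {"minimum", "minimum wage", "minor", "minority"}
components = {
    "minimum wage": ["minimum", "wage"],
    "minority": ["minor", "-ity"],
}

print(basic_members(derived, components))
```

Under these assumptions only the "basic" terms survive; whether a term counts as derivable from another entry is exactly the tricky judgment the comment mentions, and the sketch does not settle it.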

I can see how this can seem unappealing to those seeking full module automation and steel-hard consistency across all entries, although I hope many can agree that whoever chose to make root categories a thing didn't do it for them to be an endless list of monotonousness.

Catonif (talk) 17:42, 1 July 2024 (UTC)

I don't see why an MWE like minimum wage needs an etymology. If an MWE were to have an etymology, it would seem useful to exclude it from etymology trees. This kind of thing is also a problem under Derived terms headers, where it would not surprise me to find minimum wage law. DCDuring (talk) 18:32, 1 July 2024 (UTC)Reply

It was originally agreed that this template would not be deployed in multiword terms, not sure who's ignoring that. I don't see the problem with single-term entries being categorized. Vininn126 (talk) 19:21, 1 July 2024 (UTC)Reply

Side note, but can we make an exception for that when a multiword term can't be broken down into its constituents? It would be really dumb to exclude Hong Kong, for instance. Theknightwho (talk) 20:19, 1 July 2024 (UTC)Reply

I think that was mentioned in the thread, I can't remember. Vininn126 (talk) 20:25, 1 July 2024 (UTC)

The exceptions to the MWE exception do need to be respected. Do we have a category for such terms? If not, we could benefit from having one. DCDuring (talk) 21:19, 1 July 2024 (UTC)Reply

I am aware of a few other multi-word entries that use {{root}}, namely Ku Klux Klan ("Ku Klux" being a split form of Ancient Greek κύκλος (kúklos)) and sgian dubh (borrowed together from Scottish Gaelic; neither word is used separately in English).

This discussion reminds me of a similar one from June about the etymon template. As for this discussion, I wondered the same thing, but I feel that the following rules of thumb are likely to prevent people from getting too riled up:

Only lemmas should be included, unless sufficiently distinct (such as inflected forms of be) or suppletive (such as people). Where something like datum versus data falls, I don't know.
Each one must be a single word or morpheme; this may include, I suppose, every word prefixed with insert prefix here (I won't go crazy with it, though). However, "unsplittable" terms such as the ones described above can be treated as one word.
WT:COALMINE scenarios... I'm not sure.
Descendant hubs should not use {{root}} or etymology trees, though using the {{etymon}} template is okay for passing information to other pages.

-BRAINULATOR9 (TALK) 02:18, 2 July 2024 (UTC)

@Catonif: minimum wage is:

English
A term
Derived from the Proto-Indo-European root *mey- (small).

So there is absolutely no justification for it to not be in a category called Category:English terms derived from the Proto-Indo-European root *mey- (small).

It seems like what you really want to do is to change the category itself. What you're describing (i.e. a category without thousands of mono- words) might be more accurately dubbed Category:Common English words derived from the Proto-Indo-European root *mey- (small). I don't know how we could decide which terms are "common" enough to include. Also @Vininn126: the previous BP discussion was about etymology trees on multi-word entries, not the template itself. Ioaxxere (talk) 04:17, 2 July 2024 (UTC)Reply

@Ioaxxere: It's also spelled with the letters "a","e","g","i","m","n","u", and "w", but a decision was made to not have "spelled with" categories for letters that are part of the normal orthography of a language. We absolutely should have single words and morphemes in all the applicable derivation categories, but derivational categories for all the parts of a multi-word expression is unnecessary overkill. Do we really want categories for function words like "a", "and", "of", "or", "the", etc. in phrase entries? Chuck Entz (talk) 05:46, 2 July 2024 (UTC)Reply

@Chuck Entz: I agree. That's why we don't have a category called Category:English terms spelled with A. My point is that if the category Category:English terms derived from the Proto-Indo-European root *mey- (small), as literally specified, is too large to be useful, then it should simply be deleted, rather than post hoc negotiating "well, actually, it's not all the English terms...". Ioaxxere (talk) 06:26, 2 July 2024 (UTC)Reply

@User:Ioaxxere The principle that you seem to require us to follow is that once something is specified, the specification must remain unchanged, either for all time or perhaps until you decide otherwise. This would seem to mean that nothing should be specified unless it is perfectly specified for all time. Good luck with that. I've always thought that humans, both individually and in groups, were at their best when learning and adjusting their institutions accordingly.

To me making the simple adjustment of excluding MWEs by default from the listing in question and requiring exceptions to be made manually seems reasonable to cover the cases where the MWE has a non-trivial etymology. Occasionally the etymology of an MWE is relevant to an etymon tree. But usually it is not. For example, I don't understand why anyone finds adding a trivial etymology to a multi-part taxonomic name a good use of their time or of a user's attention, but some do. DCDuring (talk) 14:23, 2 July 2024 (UTC)Reply

I think it's a fair point that these kinds of categories are doomed to be highly incomplete and arbitrary, especially if it's left up to manual placement of "root" templates. It's not so obvious that this is something that makes sense as a category (if we're just presenting a manually curated list with no pretensions at being comprehensive, wouldn't that almost be more appropriately presented in an appendix?). I definitely think it seems especially low value to include multi-word terms in these kinds of categories, and that isn't a very difficult exclusion criterion to apply (although the use of "term" instead of "word" in the category name doesn't help with making this criterion apparent). But the conversation has also brought up prefixed or derived single words, which is a much harder criterion to follow. (Incidentally, it seems like "mono-" probably doesn't come from *mey- after all, though that doesn't resolve the issue of if we want all the mono- words included in some category or another.)--Urszag (talk) 14:34, 2 July 2024 (UTC)Reply

I agree with User:Urszag here. Ioaxxere (talk) 18:26, 2 July 2024 (UTC)

Well put. Vininn126 (talk) 18:35, 2 July 2024 (UTC)

Then I propose we simply change these categories to read "English words derived from the Proto-Indo-European root...". Andrew Sheedy (talk) 18:53, 2 July 2024 (UTC)Reply

Agree with Urszag and Sheedy. But what about Ku Klux Klan and similar? DCDuring (talk) 19:35, 2 July 2024 (UTC)

@Andrew Sheedy That doesn't work, because we need an exception for terms with spaces that aren't decomposable in the given language. Theknightwho (talk) 21:03, 2 July 2024 (UTC)Reply

@Ioaxxere I think a short-term way out of this mess would be to allow control over categorization: there should be parameters to tell the module that certain parts of the tree should not generate categories. There may be better names for the parameters I'm suggesting.

first the easy part:
- Prevent addition of categories to the current entry only without affecting the drawing of the tree:
  - |nocat=[language code or spec of node to be uncategorized]
    - So, if you don't want {{etymon}} to add the cat for Middle French to an English borrowing from modern French, you would just use |nocat=frm, and it won't add Category:English terms derived from Middle French to the entry.
  - |endcat=[language code or spec of the highest node to be uncategorized]

This tells it to show categories up to the node in question, but not those of any of its ancestors. If you don't think the ancestry of a minor morpheme in an Old English ancestor should be added to the categories for a Japanese calque of an English term, you would just put |endcat=ang, or whatever the spec is for the Old English morpheme itself. If the node given is the current entry, no category at all would be added, so you could use |nocat=[spec for "the" in the current entry name] to keep it from showing any categories for "the".

more complicated:
- The same as above, but the parameters would control {{etymon}} in all the other entries that use the entry as a node in their trees. Thus, a parameter in an Old English entry could prevent a certain node in its tree from being used to add categories for any of its children, and another could tell all of its children not to look past a certain ancestral node in its tree when adding categories.

That's all I have time for right now, but at least it should give the basic idea. Chuck Entz (talk) 15:14, 3 July 2024 (UTC)
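The "easy part" of the proposal can be sketched in Python as follows. This is only a toy model under stated assumptions: the ancestry is treated as a flat chain rather than a tree, the function name `categories` is hypothetical, and the sketch assumes that the node named by |endcat= is itself still categorized while its ancestors are not (the proposal could also be read the other way). It is not how {{etymon}} is implemented.

```python
def categories(lang_name, chain, nocat=(), endcat=None):
    """Return derivation categories for an entry.

    chain: (language code, language name) pairs, nearest ancestor first.
    nocat: codes whose category should be skipped for this entry.
    endcat: code of the last node to categorize; anything older is ignored.
    """
    cats = []
    for code, name in chain:
        if code not in nocat:
            cats.append(f"Category:{lang_name} terms derived from {name}")
        if code == endcat:
            break  # assumption: the endcat node itself is still categorized
    return cats


# Toy ancestry of an English borrowing from modern French.
chain = [
    ("fr", "French"),
    ("frm", "Middle French"),
    ("fro", "Old French"),
    ("la", "Latin"),
]

print(categories("English", chain, nocat={"frm"}))
print(categories("English", chain, endcat="fro"))
```

With |nocat=frm the Middle French category is dropped but the tree is otherwise untouched; with |endcat=fro nothing older than Old French generates a category.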

@Chuck Entz: Those are interesting ideas. But I think it would be better to come up with some simple and consistent rules that the template can enforce rather than letting editors arbitrarily cut off whichever categories they like.

Ryukyuan kanji entries


I'm a bit concerned about the large number of unsourced kanji entries we have for the non-Okinawan Ryukyuan languages. I note that they were generally added to pages en masse by users who like(d) to bounce about between languages (e.g. [13] [14] [15]), and most are completely unsourced. In some cases, they don't seem to make much sense, either: e.g. Miyako 食ー, which I think has been inferred from JLect or from Nikolay Nevskiy’s Miyakoan dictionary, which gives "foː", but I can find no dictionary to verify the kanji spelling, and it seems implausible that we'd have a lone ー after the kanji, given that's not where the morpheme boundary is.

There are a handful of entries that do provide a source for the kanji spelling, like Kikai 蒜, and although JLect isn't seen as very reliable by some contributors, it's better than nothing. However, I really think we should remove all of the unsourced entries, as they look strongly like inferences to me. Before I nominate them, though, I wanted to hear what others have to say first. @Eirikr @Fish bowl @Chuterix @Lattermint @Poketalker @Kwékwlos @Mellohi! @TongcyDai Theknightwho (talk) 19:41, 1 July 2024 (UTC)

Also pinging @荒巻モロゾフ.

Recently, for some new Yonaguni entries I create that cite a reference for the original word, I tend to put the main headword at hiragana (used in Dunanmunui Jiten). The alternative kanji, however, are entirely inferred from etymology/semantics; feel free to remove them if you don't like them. Chuterix (talk) 20:14, 1 July 2024 (UTC)

Most Ryukyuan languages, except for Shuri, have little to no literary tradition. Hiragana would be best suited, since the kanji is meant to signify a possible Japanese cognate but not all etymologies are correct. Kwékwlos (talk) 21:38, 1 July 2024 (UTC)Reply

Agreed with Kwekwlos. We should move all Ryukyuan entries to hiragana. For Shuri, kanji should only be an alternative spelling. Chuterix (talk) 21:48, 1 July 2024 (UTC)Reply

IFF kanji spellings are used in texts written in those respective languages, then great, we should include those somewhere (whether as main or alt-spelling entries, I currently have no strong opinions). ‑‑ Eiríkr Útlendi │ Tala við mig 22:22, 1 July 2024 (UTC)

I've expressed similar concerns at Beer Parlour (March) but do not have the knowledge to comment further. —Fish bowl (talk) 22:01, 1 July 2024 (UTC)Reply

My thoughts on this are still the same:

We should lemmatize at what native speakers have used the most, absent a standard orthography, regardless of if it seems inconsistent or "ad-hoc". Defective or variant orthographies are not specific to Ryukyuan, and in other cases, we list the variants as alternative forms with the "standard" or most-common form as the lemma. (Or in the case of two differently-pronounced words represented by the same orthography, we disambiguate in the etymology + pronunciation sections)

For Okinawan in particular, there are several works written in mixed script (Kanji & kana), and it looks to be the traditional orthography as well, so I wouldn't support a move to solely kana, and definitely not the Latin script. The same level of research should be done for the other languages as well; if they are more-written in the Latin script or katakana [or hiragana], then shifts can be made, but the research needs to be done first.

The same applies here. I would highly recommend doing a deep dive into what speakers use most often. And again, I would not support a move of Okinawan entries to hiragana. AG202 (talk) 17:19, 3 July 2024 (UTC)Reply

@AG202 I completely agree with you re Okinawan; I had hoped I'd made it clear that it's not in the scope of this thread, as I'm aware it has its own literary tradition that is best handled in the same way we handle Japanese. Theknightwho (talk) 20:49, 4 July 2024 (UTC)Reply

Category:English arbitrarily coined terms


Should we have a category for words that were completely "made up" like quark, grok, or frabjous? Searching "arbitrary formation" on the OED reveals many more results. (accepting suggestions if anyone has better ideas for a name)

Personally Support. Ioaxxere (talk) 21:29, 2 July 2024 (UTC)

Wouldn’t this just be “Category:English nonce terms”? — Sgconlaw (talk) 04:06, 4 July 2024 (UTC)Reply

@Sgconlaw: Checking the contents of that category I see that very few of them would fit into my proposed category. Ioaxxere (talk) 05:21, 4 July 2024 (UTC)Reply

@Ioaxxere: in your view, what is a “completely made up” term and how does it differ from a nonce term? — Sgconlaw (talk) 10:14, 4 July 2024 (UTC)Reply

ex nihilo vs nonce. Vininn126 (talk) 10:18, 4 July 2024 (UTC)

@Vininn126: seems to me that “completely made up” terms are also “terms invented for the occasion”, so the former could quite happily be included in “English nonce terms” without the need for an additional category. — Sgconlaw (talk) 10:23, 4 July 2024 (UTC)Reply

No they are not. Nonce terms are "made for a single occasion". One could argue that normal affixation could also relate to nonce terms, wherein the speaker realizes it's not fully "lexicalized" the way other words are, and they are not restricted to new stems. Ex nihilo is a new stem, it may be nonce, it may catch on and become fully lexicalized. Vininn126 (talk) 10:25, 4 July 2024 (UTC)Reply

‘Arbitrary’ isn't all that descriptive.

I'd suggest ‘ex nihilo coinages’. Nicodene (talk) 05:03, 4 July 2024 (UTC)Reply

@Nicodene: How about "English terms coined ex nihilo"? Ioaxxere (talk) 05:21, 4 July 2024 (UTC)Reply

Sounds good to me. Nicodene (talk) 05:30, 4 July 2024 (UTC)

Related: what do we do if only a part is ex nihilo, as often turns out to be the case in pharmacology? → ipamorelin and its whole suffix. Fay Freak (talk) 05:31, 4 July 2024 (UTC)

RQ for Rollbacker (and Patroller)


I'm not sure what the bar for these rights is, but I would like to request these tools, as I believe they would be helpful on those days I end up watching RC for vandalism (which has been happening more frequently as of late). I will not roll back anything other than obvious vandalism, and my primary use of patroller would be simply to review unpatrolled edits. — BABR・talk 08:34, 3 July 2024 (UTC)

I nominated you in WT:WL. — Fenakhay (حيطي · مساهماتي) 15:58, 3 July 2024 (UTC)

Approved. Vininn126 (talk) 08:48, 4 July 2024 (UTC)

Gender Only, or Gender + Number for Nouns in Translation Tables?


I've noticed that many noun entries in translation tables are annotated with gender only, with no indication of number, though there is no specific guidance or policy on this particular point in the documentation related to translation tables.

Should number indeed be left out of translation table entries, perhaps unless the number of the translated noun differs from the number of the original noun? On the other hand, the argument for including number, along with gender, in translation tables, would be that it provides the most possible context for a reader who is more unfamiliar with the language in question.

If the consensus is that number should only be included if it differs from the number of the original English entry, I would suggest this policy be made explicit either in the translation table "add translation" forms themselves, or in Wiktionary:Translations.

Hermes Thrice Great (talk) 10:01, 3 July 2024 (UTC)

I wouldn't support a policy of routinely including number. The point as I see it of including gender is that (for many commonly used languages) gender is lexically specific and relatively arbitrary relative to the meaning and potentially also the form of the word. Number is usually non-arbitrary and semantically transparent. I would agree with a policy of including number only when it differs from English, as in "meubles (fr) m pl" at furniture.--Urszag (talk) 05:42, 6 July 2024 (UTC)Reply

My impression is that by far the most common situation in noun translation tables is "English singular noun is translated into another language as Other-Language singular noun": I see no reason to indicate number in that case, since it's the 'default'. Whether to indicate it when the English and foreign-language numbers are both plural (which is different from the default, but matches) is debatable. Obviously, it would be helpful to indicate where the English vs foreign-language numbers differ, as Urszag says. - -sche (discuss) 21:09, 7 July 2024 (UTC)
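The rule both replies converge on (always show gender, show number only when it differs from the English entry) could be sketched like this in Python. The function name `annotation` and the "sg"/"pl" abbreviations are assumptions made for illustration; they are not the actual output of the translation templates.

```python
def annotation(gender, english_number, translation_number):
    """Build the gender/number note shown after a translation.

    Number is added only when it differs from the English entry's number.
    """
    parts = [gender]
    if translation_number != english_number:
        # Hypothetical abbreviations, chosen for the sketch.
        parts.append({"singular": "sg", "plural": "pl"}[translation_number])
    return " ".join(parts)


# French "meubles" (masculine plural) as a translation of singular "furniture".
print(annotation("m", "singular", "plural"))
# A singular translation of a singular English noun: gender only.
print(annotation("f", "singular", "singular"))
```

The sketch leaves number unmarked when both sides are plural, which is the case described above as debatable.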

Full entries for alternative forms in Chinese


@Justinrleung, Fish bowl, Wpi, Mar vin kaiser, Kc_kennylau Right now, etymology 2 of 腳／脚 soft redirects to 跤, while under etymology 3 of 乾／干 is a full entry containing the pronunciation of Min 焦 in the different varieties of Min. I was wondering if a full entry for the Min word for "foot" at 腳／脚 is allowed, since 漢語方音字彙 lists the readings of 跤 under the entry for 腳／脚 as 訓讀／训读. What are the rules regarding entries for alternative forms in Chinese?

As a side note, perhaps 骹 should be adopted as the main form for the Min word for "foot" instead of 跤, which we currently use. 跤 is used only in Taiwan sources, and as far as I know, is used only for Hokkien. On the other hand, 骹 is used in Mainland China publications for all varieties of Min. RcAlex36 (talk) 12:40, 3 July 2024 (UTC)Reply

I would argue that {{zh-see}} should only be used for orthographic variants that one would consider as representing the same underlying character, so things like traditional/simplified or 異體字; those that involve intermediate steps which cannot simply be summarised as "orthographic variant" (e.g. modern kun'yomi, modern borrowings for the character pronunciation, romanisations, puns) would be a full entry and use {{alt form}} (or preferably a template that specifies the type of alt form). Often the latter type would require (or already have) its own set of attestations separate from the main entry, so it would be more convenient to have a definition line (as in the {{alt form}}) to which quotes can be added.

Some examples: 里 and 裏 redirecting to 裡; 羣 redirecting to 群 will continue to use {{zh-see}}, while 腳 for 跤; 甲 for 佮; 懟 or 隊 for 㨃; der for 的 will use {{alt form}}. – wpi (talk) 14:39, 3 July 2024 (UTC)Reply

I agree with the general intuition from wpi. I also would like to add that {{zh-see}} should only be used when it can entirely replace a whole etymology section. Any proper subset of an etymology section, such as restrictions to certain pronunciations, definitions, and/or lects, should use {{alt form}} for sure. — justin(r)leung { (t...) | c=› } 14:46, 3 July 2024 (UTC)
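Taken together, the two criteria suggested in this thread amount to a simple decision rule, sketched here in Python. The function and parameter names are hypothetical, and deciding the two yes/no inputs for a given character is of course the editorial work the sketch does not do.

```python
def choose_template(same_underlying_character, replaces_whole_etymology):
    """Pick a template for a Chinese alternative form.

    {{zh-see}} only when the form is a pure orthographic variant AND the soft
    redirect can stand in for an entire etymology section; otherwise a full
    entry with {{alt form}}.
    """
    if same_underlying_character and replaces_whole_etymology:
        return "{{zh-see}}"
    return "{{alt form}}"


print(choose_template(True, True))    # e.g. 羣 pointing to 群
print(choose_template(False, True))   # e.g. 腳 used for 跤
```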

Is it ever necessary to use `{{etymon}}` in a redirect?


I'm referring to pages that look like this:

#REDIRECT [[some page]]
{{etymon|en|id=something}}

Here are the disadvantages:

Harder to follow an {{etymon}} chain, as you have to check both the redirect and the redirect target to find an {{etymon}} call.
Worse performance, as the template has to search both the redirect and the redirect target for the parent.
A *lot* of corner cases. Say we have {{etymon|en|id=someID}} on some_redirect and {{etymon|en|id=someID}} on redirect_target. Should this be allowed? It has to be, because otherwise you can never add {{etymon}} on a page until verifying that the same ID isn't already used on any of its redirects, which is obviously inconvenient. But if we do that, now [[some_redirect#English:_someID]], when clicked, takes you to the wrong place!
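The corner case in the last point can be made concrete with a small Python model. Everything here is a simplification for illustration: the dictionaries `redirects` and `etymon_ids` and both function names are hypothetical, and real redirect and anchor handling in MediaWiki is more involved.

```python
# Hypothetical, simplified model: which pages redirect where, and which
# IDs {{etymon}} declares on each page.
redirects = {"some_redirect": "redirect_target"}
etymon_ids = {
    "some_redirect": {"someID"},
    "redirect_target": {"someID"},
}


def landing_page(page):
    """Page a reader ends up on once any redirect has been followed."""
    return redirects.get(page, page)


def shadowed_ids(page):
    """IDs declared on a redirect that are also declared on its target."""
    target = redirects.get(page)
    if target is None:
        return set()
    return etymon_ids.get(page, set()) & etymon_ids.get(target, set())


print(landing_page("some_redirect"))
print(shadowed_ids("some_redirect"))
```

In this model a click on the redirect's anchor always lands on the target page, so any ID reported by `shadowed_ids` points at the target's section instead of the redirect's own {{etymon}} call.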

Thus, unless there's a very good reason for this to be supported, I would like to remove all {{etymon}} uses from redirects.

Pinging @Theknightwho who pushed for this to be supported. Ioaxxere (talk) 14:16, 3 July 2024 (UTC)Reply

@Ioaxxere I didn't push for this to be supported - I just did the implementation, so that pre-existing attempts to do this weren't completely broken. Theknightwho (talk) 14:21, 3 July 2024 (UTC)Reply

@Theknightwho: I say "pushed" because you called it the "most sensible solution". But maybe I'm misinterpreting what you meant. Ioaxxere (talk) 16:03, 3 July 2024 (UTC)Reply

I don't think this is optimal - I think allowing for alt forms in the title of the pointed-to page would be better. Vininn126 (talk) 14:29, 3 July 2024 (UTC)Reply

categorizing modern English verbs as "class 4 strong verbs" etc


At Wiktionary:Requests for cleanup#Cat:English_class_4_strong_verbs, User:Mahagaja and I questioned whether it makes sense to be presenting modern English verbs as still having the class system they had back in PIE. Many verbs which were historically one class now behave like another class, or class distinctiveness has been lost; a lot has changed over the last few thousand years. I suggested that if anyone wants an etymology category, renaming the cats like "English verbs derived from ~~PIE~~ PGmc class X verbs" would make the intended(?) purpose and scope clearer, but alternatively it might make more sense to just not be categorizing this. What do you think? - -sche (discuss) 16:13, 3 July 2024 (UTC)

The strong verb class system only dates back to Proto-Germanic, not Proto-Indo-European, AFAIK. I would prefer not categorizing them at all, but if we do, then yes, "derived from Proto-Germanic class ## verbs" makes more sense than calling them class ## verbs synchronically. —Mahāgaja · talk 16:18, 3 July 2024 (UTC)Reply

(That's its own issue, then, because the category descriptions are defining themselves in terms of the PIE conditions of the words, with no obvious reference to PGmc.) - -sche (discuss) 16:23, 3 July 2024 (UTC)Reply

I think the category descriptions are supposed to be giving background information, not defining criteria (even if that's not clear from how they are written).--Urszag (talk) 17:01, 3 July 2024 (UTC)Reply

IMO English verbs should not be categorized according to the Proto-Germanic strong verb system because most of the classes no longer have any coherence in modern English. (German is a different story. We still categorize modern German verbs according to this system because most of the classes have not lost their coherence in modern German.) Having this be an etymology category (reflecting what language? Middle English, Old English, Proto-West-Germanic, ...?) doesn't make a lot of sense IMO. Benwing2 (talk) 22:28, 3 July 2024 (UTC)Reply

I think it makes sense if all the verbs in the category are irregular due to still behaving like they're a member of a particular class, but it's probably better to give them a different name, as "English class 4 strong verbs" implies this is a common/agreed upon way of classifying English verbs. Theknightwho (talk) 20:43, 4 July 2024 (UTC)Reply

I read somewhere that someone tried to group English irregular verbs into classes and came up with 26 of them. Needless to say, there's no standard way of forming such classes; dictionaries just list the principal parts. Benwing2 (talk) 20:59, 4 July 2024 (UTC)Reply

What defines "transitive"?


I am in the process of converting {{indtr}} to use {{+obj}}, but {{indtr}} seems to play fast and loose with the "transitive" label so I'd like to get a sense of what people think "transitive" means. In my book, a "transitive" verb takes a direct object, and a verb whose only object is taken through a preposition is not a transitive verb. However, {{indtr}} labels such usages as transitive with em or similar. (Note that {{indtr}} doesn't actually categorize such verbs in e.g. CAT:Portuguese transitive verbs due to the way it implements the labels; I'm not sure if that was intended, though.) My questions are:

Do we agree that a verb usage like Portuguese pegar em (“to touch”) is intransitive, even though it's translated in English using a transitive verb? (IMO yes.)
If so, should we label the verb using {{lb|pt|intransitive}}, thereby categorizing it into CAT:Portuguese intransitive verbs? (IMO yes.)
What about verbs like Latin serviō (“to serve”) (which takes a dative object) in languages like Latin that have a case system? Should these be labeled as intransitive? (IMO yes. This is what Gaffiot does, for example.)

I should add that Italian generally follows the above rules, and it's useful to do so because all non-reflexive transitive verbs (according to the above definitions) take avere as an auxiliary, but intransitive verbs can take avere or essere. I think Spanish does too, and Italian and especially Spanish make little use of {{indtr}} compared with Portuguese and French. Benwing2 (talk) 22:23, 3 July 2024 (UTC)Reply

From what I figured when I specifically researched this question in order to write government labels, transitivity depends on semantic properties and thus can be mediated through adpositions, so I am pleased to see that the author of the template {{indtr}}, which I had not known of until now because I generally edit other languages, understood it the same way. Many reference works dance around the mines when defining the concept, of course, particularly Wikipedia, whose manifold authors in one article have different understandings without realizing it, giving a false impression of a coherent article. The German Wikipedia at least was and is pretty explicit about the optional restriction to only direct objects: “Als intransitiv werden … Verben bezeichnet, die kein (oder, je nach Definition, kein direktes) Objekt haben” (“Verbs that have no object (or, depending on the definition, no direct object) are called intransitive”) in the introduction, and then a whole section about “conceptual (or terminology) variants”. Fay Freak (talk) 23:25, 3 July 2024 (UTC)

As a consequence, I disagree with your first conclusion.

Labels {{lb|langname|intransitive}} and {{lb|langname|transitive}} seem to have the purpose of disambiguating English equivalents, so one would have to reject the idea that they should categorize at all, were one not to know that other editors use the labels with a different focus. In other words, the labels are polysemous, contrary to what we, who describe language, trapped in its linearity, tend to intuit. That is one reason why I avoid {{lb|langname|intransitive}} and {{lb|langname|transitive}} completely, preferring to specify government by {{+obj}}, formerly and in its new version, and to use unambiguous verbose glosses. Fay Freak (talk) 23:38, 3 July 2024 (UTC)

The simplest criterion and the one I think that is most commonly used by English speakers is that "transitive" means "takes a direct object", which excludes verbs that take prepositional phrases as their complements. I guess there could be unclear edge cases in some languages like the use in Spanish of "personal a" for animate patients of otherwise transitive verbs. The necessity of accusative case isn't entirely clear to me: I believe it is traditionally treated as diagnostic in Latin, but it seems like there might be a tradition in Polish of recognizing some verbs that take a genitive or instrumental complement as transitive if they can be passivized.--Urszag (talk) 03:03, 4 July 2024 (UTC)Reply

There is a great deal of sloppiness in labeling English phrasal verb senses as transitive or intransitive. I think it derives from including full definitions that are arguably SoP in the phrasal-verb L2s. DCDuring (talk) 18:54, 4 July 2024 (UTC)

I'm opposed to calling any verb which takes a complement (be it a direct object - "verbes transitifs directs" in French - or a prepositional object - "verbes transitifs indirects") intransitive, though I agree it's not satisfactory to call them all transitive and leave it at that. Why not use "prepositional transitive" when needed? P U C – 18:20, 4 July 2024 (UTC)Reply

@User:PUC I assume that your opposition does not extend to all languages. DCDuring (talk) 18:54, 4 July 2024 (UTC)

@PUC Concepts like "indirect transitive" appear to be Romance-specific. Benwing2 (talk) 19:01, 4 July 2024 (UTC)

And even then, not found in all Romance languages. So I think it's much better to just call them intransitive; the fact that there is a preposition that can be (and often is not) attached is clear from the use of {{+obj}}. Benwing2 (talk) 19:02, 4 July 2024 (UTC)

To clarify, I have so far found the concept of "indirect transitive" only in the TLFi French dictionary and Michaelis Portuguese dictionary. It is not found in any other Portuguese dictionary, nor in any Spanish, Galician or Catalan dictionary that I have consulted. The Spanish, Galician or Catalan dictionaries label verbs as transitive only if they take a direct object; the Priberam and Infopedia Portuguese dictionaries are sloppy about the use of the labels "transitive" and "intransitive", sometimes calling verbs that only take a prepositional object "transitive" and sometimes "intransitive". The policy I'm following is that a verb that takes a prepositional object is transitive if and only if it also takes a direct object; hence "to base (something in something else)" is transitive, but "to confide (in someone)" is intransitive. Benwing2 (talk) 19:42, 4 July 2024 (UTC)
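The policy stated just above can be written down as a tiny decision rule. The following is only a minimal sketch in Python, not anything used by {{indtr}}, {{lb}} or {{+obj}}; the `VerbUsage` record and the `transitivity_label` helper are hypothetical names introduced purely for illustration, and the two sample usages are the ones quoted in the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbUsage:
    """One sense/usage of a verb (hypothetical record, for illustration only)."""
    gloss: str
    direct_object: bool          # does the usage take a direct object?
    prepositional_object: bool   # does it take an object through a preposition?


def transitivity_label(usage: VerbUsage) -> str:
    # Under the stated policy only a direct object makes a usage transitive;
    # a prepositional object on its own leaves the usage intransitive.
    return "transitive" if usage.direct_object else "intransitive"


examples = [
    VerbUsage("to base (something in something else)", True, True),
    VerbUsage("to confide (in someone)", False, True),
]
labels = {u.gloss: transitivity_label(u) for u in examples}
```

Note that the prepositional object never changes the label in this sketch; it would only matter for how the government is annotated, which is a separate question from the label.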

I'm getting the impression we may need to adopt various definitions of transitive per-language. Vininn126 (talk) 19:02, 4 July 2024 (UTC)

In Polish grammars a requirement is generally that it can form the passive. Some verbs can take "accusative" arguments but cannot form the passive, and most scholars analyze the accusative argument as more of an adverbial. Vininn126 (talk) 18:22, 4 July 2024 (UTC)Reply

@Vininn126 Are you referring to what are often called "cognate accusatives"? Benwing2 (talk) 19:04, 4 July 2024 (UTC)

@Benwing2 No, remember Talk:przejść? Vininn126 (talk) 19:10, 4 July 2024 (UTC)

@Vininn126 Hmm, can you give a sentence with one of these adverbial accusatives in them? I'm having a hard time understanding what's being referred to. Benwing2 (talk) 19:12, 4 July 2024 (UTC)

here's an article.

Basically people might create nonce formations like jezioro zostało obeszłe or something but they're highly non-grammatical, despite being able to say "Jaś obszedł jezioro". Vininn126 (talk) 19:13, 4 July 2024 (UTC)Reply

I think I recall this happening in Russian, too, where some transitive verbs don't have a passive participle. Zaliznyak labels them (depending on the verb) as either missing the passive participle entirely or as having a "rare and awkward" passive participle (which in practice won't ever be encountered). There are also a very small number of intransitive verbs that do have a passive participle; I think these are verbs that for whatever reason use an object in some other case but sound transitive. It's also interesting to me that the article you linked mentions that Polish in general avoids the passive, unlike German; this is like Spanish and unlike English. Benwing2 (talk) 19:51, 4 July 2024 (UTC)Reply

"Encyclopedic" as a deletion reason

Latest comment: 11 days ago | 12 comments | 6 people in discussion

Discussion moved from Wiktionary:Requests for deletion/English#Sticks Nix Hick Pix. (to be archived at Talk:Sticks Nix Hick Pix)

I've forked this from that discussion because it concerns broader issues and was starting to take it over. Chuck Entz (talk) 17:56, 5 July 2024 (UTC)

@Chuck Entz You're about the only one who's actually given any substantive reasons. Inqil and Fay gave no reasons at all, Sgconlaw said "we don't have these" but didn't say why, and Fenakhay simply said "encyclopedic" without saying encyclopedic in what respect. When I talk of a "false dichotomy", what I mean is a false dichotomy between dictionary definition and encyclopedia entry. The idea that there is some bright, hard-and-fast line between dictionary definition and encyclopedia entry is fantasy. There are Wikipedia articles about parts of speech and Wiktionary entries about places and occasionally events too. Furthermore, the term "encyclopedic" has now become a catch-all excuse for deleting or revising almost anything. One frequent usage I've seen of "encyclopedic" is to claim that an entry is too detailed, but that's clearly NOT the case here. Purplebackpack89 16:25, 5 July 2024 (UTC)

Well, it is true that many encyclopedic entries can be lexicographical and vice versa; however, when we say that a term is encyclopedic, it means it is purely non-lexicographical and has zero rationale for inclusion, unlike those tons of encyclopedic entries we keep such as toponyms, anthroponyms, and any abbreviations. Inqilābī 17:16, 5 July 2024 (UTC)

a) If by encyclopedic you meant "non-lexicographical", you should have said "non-lexicographical" at the outset. And "non-lexicographical" isn't much better because it is also an amorphous idea. Purplebackpack89 17:34, 5 July 2024 (UTC)Reply

(After edit conflict) Just because there isn't a perfect bright line between the two concepts doesn't mean either is invalid. As I've explained to people in the past: an encyclopedia is about things: ideas, events, people, places and things. A dictionary is about the words, phrases, etc. used to refer to those as words, phrases, etc.

Yes, Wikipedia has articles about parts of speech- but Wiktionary doesn't (not in mainspace, anyway). Part of speech is information about the terms, that we give in the entries for them. This is illustrated by the fact that definitions for "verb" as a part of speech are under a "Noun" part of speech header, since the word "verb" is a noun.

Likewise, we don't have entries for things like List of ethnic slurs, even though we have lots of entries for ethnic slurs.

There's overlap between an encyclopedia and a dictionary as far as definitions, because they have to be clear about what the terms refer to and thus give some of the same information that an encyclopedia article contains. There's also overlap in encyclopedia articles, because they often contain information about the names and terminology used for the article subjects. Still, overlap isn't identity.

Of course, whether something is encyclopedic or not is sometimes not clear- but that just means we need to discuss it. Also, a wiki is a community that decides things via consensus. All of the rules you cite were originally arrived at by consensus. Right or wrong, your opinions are just your personal opinions unless there is a consensus that agrees with them. Chuck Entz (talk) 18:57, 5 July 2024 (UTC)Reply

I think you've actually acquiesced to the balance of what I've said.

And if there's a lack of clarity of the term "encyclopedic" (and it's pretty clear there is!), we need to tighten the language. Purplebackpack89 20:14, 5 July 2024 (UTC)

I think this is a good opportunity to clarify "Wiktionary:Criteria for inclusion#Wiktionary is not an encyclopedia", which currently states:

Care should be taken so that entries do not become encyclopedic in nature; if this happens, such content should be moved to Wikipedia, but the dictionary entry itself should be kept.
Wiktionary articles are about words, not about people or places. Articles about the specific places and people belong in Wikipedia.

The first sentence seems to be addressing the point that a definition (for example, for a scientific concept like relativity) should not become like a Wikipedia entry in length. Thus, the entry itself should remain but the encyclopedic information should be moved to Wikipedia (if it isn't already there).

The second sentence comes closer to addressing the general issue about when an entry is "encyclopedic" and so is not sufficiently lexical for inclusion in the dictionary, but it does not explain very much. It needs to be read in conjunction with "Wiktionary:What Wiktionary is not", which states: "Wiktionary is not an encyclopedia, a genealogy database, or an atlas; that is, it is not an in-depth collection of factual information, or of data about places and people. Encyclopedic information should be placed in our sister project, Wikipedia." We should discuss whether this makes it clear enough to determine when a term is "encyclopedic" and so inherently unsuitable for the dictionary.

Following the discussion here, we should have a formal vote to amend and clarify the CFI. — Sgconlaw (talk) 18:12, 5 July 2024 (UTC)

The general idea of the RfD and now of this post is that the word "Encyclopedic" is thrown around way too casually as a catch-all for everything. The secondary concern is attempts to establish a bright line between encyclopedia entry and dictionary definition are folly. There's just too many similarities and too much of a gray area. Let's take a more specific look at the two clauses Sgconlaw references. I believe both are in need of some revision:

The first one needs to use greater precision than the catch-all "Encyclopedic". Instead of saying, "entries do not become encyclopedic in nature", perhaps a better phraseology might be, "entries and definitions do not become overly lengthy or detailed".
Given my druthers, I'd dispense with the second, or reword it, because there are already many exceptions and there need to be more. There is a long list of places that are acceptable entries or definitions. If a person or a fictional character becomes genericized, they are permitted a definition. There are also nicknames for individual people or groups of people.

Furthermore, there may need to be guidance at RfD that simply stating "Encyclopedic" isn't a thorough enough rationale for deletion, and greater specificity is required. Purplebackpack89 18:32, 5 July 2024 (UTC)

If we do that, could we also legislate on bare "Keep" votes without rationale? P U C – 19:58, 5 July 2024 (UTC)

When someone says keep or delete without any further statement, it generally means the rationale is the same as that of the previous voters in the post. Inqilābī 20:37, 7 July 2024 (UTC)

Purple, may I suggest reading encyclopedia vs dictionary (on Wikipedia)? Just because you mentioned being unsure what type of content is encyclopedic material and what type of content is dictionary material. Hope that helps clear things up.

That being said, even putting aside it being encyclopedic material, it fails CFI as the term is SOP. The general expectation for multi-word terms is that the words put together have a different meaning than they do apart, but that doesn't seem to be the case here. The definition is literally just every word that makes up the term, hence it's SOP. And, while we do sometimes make exceptions to certain CFI policies, we do so when there is agreement that the terms would be helpful for translations; such an exception wouldn't apply here, as there isn't really any translation-based usage for this term that I can think of. So, no hard feelings Purple, but this does not meet our CFI, even if you challenged the current definition of "encyclopedic". — BABR・talk 01:16, 6 July 2024 (UTC)

I think most of what you've said belongs at the RfD discussion as it focuses on the specific rather than the general. Purplebackpack89 14:16, 10 July 2024 (UTC)

Moratorium on editing other languages' etymology sections for the purpose of English etymology trees

Latest comment: 15 days ago | 11 comments | 7 people in discussion

I'd like to request a moratorium on the editing of other languages' etymology sections for the purpose of populating English etymology trees (outside of adding {{etymon}} based on the already existing etymology). This has led to several conflicts and cleanups due to English editors wanting to display an etymology tree, haphazardly editing another language's etymology and causing misinformation to spread for other editors to clean up. It's been brought up to such users several times (primarily on Discord), but it looks like the problem has been continuing. As such, I'm bringing it to BP for a wider audience.

Whether it be the creation of problematic PIE reconstructions as detailed at Wiktionary:Requests for deletion/Reconstruction § Reconstruction:Proto-Indo-European/stéh₂tus, the editing of Spanish guayaba based on a misreading of an already faulty source for the tree at English poggers, the creation of {{etymon}} for nonexistent Middle Irish entries as stated at Wiktionary:Grease pit/2024/June § Template:etymon for nonexistent entries added to other entries, or the editing of Welsh entries for the purpose of the tree in English Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch by an editor with very little experience, it has become clear to me that certain editors are more focused on the gamification aspect of trees than on propagating pertinent and accurate information. See also: Wiktionary:Beer parlour/2024/June § Use of etymology trees made with Template:etymon in the entries for multi-word terms.

This was mainly sparked by the edit at guayaba because I don't even know where to begin to fix it, and it doesn't seem like {{etymon}} allows a derivation from a parent language with no attested term. I'm tired at this point of bringing this up on an individual basis to users and having to play cleanup, and had I known it would've exploded like this I would've voted oppose at the vote instead of abstain until this was made more clear. It's gotten out of hand. AG202 (talk) 00:49, 6 July 2024 (UTC)Reply

Support 100%. The fact that {{etymon}} seems to be in large part used to add "cool" trees to terms like United States of America, pneumonoultramicroscopicsilicovolcanoconiosis and such makes it clear that too many people view this as a game. At the time this template was being created, I had a bad feeling about the design and usage and expressed my concerns; unfortunately these concerns appear to be borne out. Benwing2 (talk) 02:48, 6 July 2024 (UTC)Reply

Being entirely fair, a lot of this seems to be the work of one editor (@Akaibu), but I agree with @AG202: I have removed {{etymon}} from Welsh Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch, as the template call was essentially malformed by not even accounting for mutations, and I don't feel I'm in a position to correct it. While I'm all in favour of having etymology trees, I think we shouldn't be afraid to simply revert its addition to an entry if an editor is completely inexperienced with a given language, just as we would with any other part of an etymology section. There's no big rush here; the trees will get added over time, but it has to be done by editors who know what they're doing. Theknightwho (talk) 04:08, 6 July 2024 (UTC)Reply

I really wish it were mostly just one user. And the reason I've brought it to light is because it's led to mini edit wars as with the revision history at þeir that only ended when I brought it to Discord to remind folks yet again to not edit languages they're not familiar with. And then I myself had to go find and revert all the tree additions. AG202 (talk) 04:15, 6 July 2024 (UTC)Reply

I will add that I haven't taken a close look at the {{etymon}} syntax but from what I've seen it needs an overhaul. This implies we should not be adding it all over the place yet because all the uses will have to be fixed by bot. Benwing2 (talk) 04:43, 6 July 2024 (UTC)Reply

Just for clarity's sake, after posting the above message, I was (inadvertently) pinged by @Akaibu on Discord in the English channel who stated:

>I really wish it were mostly just one user. ~ @AG202

continues to bitch and moan about something that was still a single user's doing

This is just highly inappropriate. They've since replied that it was a joke about "[themselves] myself being the cause of the problems" and "saying one deserves more credit for causing trouble", but I still do not find it appropriate. This adds to their other messages on Discord pushing back against simple asks to not edit languages that they're not knowledgeable about (on multiple instances), along with misunderstanding basic tenets about how this project works. For a user with under 500 edits (a large chunk related to this effort), I'm concerned about constructive editing for the future. AG202 (talk) 04:45, 6 July 2024 (UTC)

Support, and your statement that "certain editors are more focused on the gamification aspect of trees rather than propagating pertinent and accurate information" is spot on. P U C – 08:05, 6 July 2024 (UTC)

Support. I do feel that people just wanna use the new tool wherever, without checking the results.

(Also, should we gamify making good etymologies?) CitationsFreak (talk) 09:34, 6 July 2024 (UTC)Reply

Support. Going to the first redlink or the first place we are sure is fine. It's better to be slow and sure than fast and wrong. It's better to make a request somewhere or discuss it at WT:ES than anything. Vininn126 (talk) 09:41, 6 July 2024 (UTC)Reply

Support. @VGPaleontologist --{{victar|talk}} 02:06, 7 July 2024 (UTC)

Support. Theknightwho (talk) 02:23, 7 July 2024 (UTC)

Criteria for "...terms inherited from..." categories

Latest comment: 12 days ago | 8 comments | 2 people in discussion

Related to the discussion above about PIE root categories, what kinds of bounds do we want on the membership of categories like Category:Latin terms inherited from Proto-Italic? I noticed this now includes words like solstitium and tessellatus, presumably because of the etymology trees. While I don't see an issue with putting these in Category:Latin terms derived from Proto-Indo-European, I don't think compound words that were put together after the ancestor language should go in this kind of category, much less words like tessellatus where only the suffixes are inherited, and the base is a borrowing. For simplicity, I think it would be best not to have words categorized as "inherited" unless they are inherited from only one etymon in the relevant ancestor language. @Ioaxxere would you be able to clarify if this is intended behavior, or a bug? Urszag (talk) 01:29, 6 July 2024 (UTC)Reply

@Urszag: the two entries you linked were using the template incorrectly which is what caused the unwanted categories to be added. I've gone ahead and fixed the entries. Ioaxxere (talk) 01:36, 6 July 2024 (UTC)

Thanks for fixing my mistake. I hadn't realized it was erroneous to use "from" for affixed words, although that makes sense given the analogy to Template:from.--Urszag (talk) 01:39, 6 July 2024 (UTC)

@Ioaxxere: What would be the correct derivation keyword to use for univerbations or multi-word phrases such as ab ante (or, if they don't end up being forbidden, cases like United States of America, LASER etc, which are now in "English terms inherited from Proto-West Germanic", "English terms inherited from Middle English", etc.)? I noticed the same issue here ("ab ante" being wrongly placed in "Latin terms inherited from Proto-Italic"), but wasn't sure about the semantically correct way to fix it. Should they be equated to compounds and marked with "af" (despite not containing any affixes)? I don't quite see in what circumstances the current behavior of "from" with multi-word etymologies will be appropriate, so maybe it should trigger an error message?--Urszag (talk) 04:37, 6 July 2024 (UTC)Reply

@Urszag: Yes, compounds are af. Maybe the name is a little misleading but it's explained in the documentation. Both of the examples you listed were added by @Akaibu who I hope can resolve the issue. Switching af and from still results in a valid etymology in this case (just not one that matches reality) so it's hard to automatically judge when someone has done it by accident. Ioaxxere (talk) 14:27, 6 July 2024 (UTC)Reply

I don't understand yet what the hypothetically valid alternative etymology would be in cases like this. Are there any concrete examples of entries derived from more than one etymon in the same language where it would be correct to use "from" rather than "af"? Seeing an example would help me understand better why it's necessary to distinguish the two keywords and their behaviors in this context. (I tried to look at examples of Template:from, but it seems to be barely used judging from the "What links here" tool.) Would "from" be reserved for cases like "a word used to have multiple forms that then merged into one" (but wouldn't that be more of a case for "influence"?) or "a term/phrase was derived from modification of a preexisting phrase" (but wouldn't that call for syntax like "from|en>united states of America...", like at religion of piss)?--Urszag (talk) 15:04, 6 July 2024 (UTC)Reply

@Urszag: Yes, you're correct. One example of from with more than one etymon would be cytotech which might be written |from|cytotechnologist>id1|cytotechnician>id2. Ioaxxere (talk) 15:38, 6 July 2024 (UTC)Reply

Thanks. While I think I understand this now, I think the way it currently works is unintuitive and is likely to lead to a lot of cases of "from" being used in contexts where it isn't appropriate according to these criteria, and where it will create categorization errors. Making it the default and describing it as "unspecified derivation type" doesn't help: "from" sounds a lot more generic than it really is, in contrast to "af" which sounds more specific than it really is. E.g. Reconstruction:Latin/ad vix was created with {{etymon|la|id=barely_vulg|la>ad>to|la>vix>barely}}, where the absence of a keyword apparently makes it be treated as "from", which inaccurately puts this entry into Category:Latin terms inherited from Proto-Italic.

Also, even in cases like "cytotech", would it really be accurate to describe this term as "inherited from Middle English" in a hypothetical situation where "cytotechnologist" and "cytotechnician" are inherited from Middle English, but the shortened form arose only in Modern English?

If the plan is to keep the current behavior of "from" with regard to inheritance categories the same, what would you think of making "af" instead the default when there is more than one etymon, all of which are in the same language as the entry?--Urszag (talk) 12:37, 9 July 2024 (UTC)Reply
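To make the proposal in the last paragraph concrete, here is a minimal sketch in Python of how the default keyword could be chosen. It is not the actual Module code behind {{etymon}}. Several things are assumptions made only for illustration: that "from" and "af" are the only keywords (they are the only two named in this thread), that an etymon written with three `>`-separated parts is `lang>term>id`, that one written with fewer parts is in the entry's own language, and that a template call carries at most one keyword. The function name `effective_keyword` is hypothetical.

```python
# Assumption: only the two keywords mentioned in this discussion are modelled.
KNOWN_KEYWORDS = {"from", "af"}


def effective_keyword(entry_lang, args, proposed=False):
    """Return the derivation keyword that would apply to a template call.

    entry_lang: language code of the entry, e.g. "la".
    args: positional parameters after the language code,
          e.g. ["la>ad>to", "la>vix>barely"] (named parameters such as
          id=... are assumed to have been stripped already).
    proposed: if True, apply the suggested default ("af" when there are
          several etymons that are all in the entry's own language).
    """
    keyword = None
    etymon_langs = []
    for arg in args:
        if arg in KNOWN_KEYWORDS:
            keyword = arg
            continue
        parts = arg.split(">")
        # Assumption: "lang>term>id" when there are three parts,
        # otherwise "term>id" in the entry's own language.
        etymon_langs.append(parts[0] if len(parts) == 3 else entry_lang)

    if keyword is not None:
        return keyword  # an explicit keyword always wins
    same_language = all(lang == entry_lang for lang in etymon_langs)
    if proposed and len(etymon_langs) > 1 and same_language:
        return "af"
    return "from"  # behaviour described in the thread: no keyword means "from"
```

With the ad vix call quoted above, `effective_keyword("la", ["la>ad>to", "la>vix>barely"])` gives "from" under the behaviour described in the thread and "af" with `proposed=True`, while an explicit `from` as in the cytotech example is left untouched either way.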

merge "pronominal" into "reflexive"

Latest comment: 13 days ago | 9 comments | 3 people in discussion

We have two labels pronominal and reflexive that are supposed to reflect a difference made in the linguistic tradition of certain Romance languages, whereby a "pronominal" verb is a reflexive verb whose meaning isn't obviously reflexive. Unfortunately in practice there's absolutely no consistency in how these labels are used, and it doesn't reflect anything in the actual syntax of the verbs, but only in an extremely subjective judgment as to whether a given sense has a sufficiently "reflexive" meaning to it. On top of this, the actual display of the labels doesn't do anything but muddy the waters: reflexive displays as "reflexive" while pronominal displays as "takes a reflexive pronoun". Furthermore, the pronominal label doesn't seem to be used outside certain Romance languages even though there are several other languages (e.g. Slavic languages) that have a similar concept of reflexive verbs that may or may not be semantically reflexive. Finally, whether a verb is semantically as well as syntactically reflexive should be obvious from the specified meaning of the verb, i.e. the pronominal label adds absolutely nothing of value to the entry beyond what reflexive would do. So given all this I propose simply bot-replacing pronominal with reflexive. Benwing2 (talk) 02:44, 6 July 2024 (UTC)

I'm not sure about this. I agree the distinction doesn't seem particularly necessary in terms of usage label text. This blog post makes a four-way distinction between reflexive, reciprocal, idiomatic pronominal, and essentially pronominal verbs. The last category seems to be fairly small, and it is possible to characterize it relatively unambiguously in terms of these verbs normally not occurring without a reflexive pronoun (compare Latin deponent verbs, which can be identified by the absence of attested active-morphology forms in most parts of their paradigm). It looks like currently, we have separately named categories for Category:French pronominal verbs, Category:French reflexive verbs, Category:French reciprocal verbs, although the first contains only one verb. We seem to already treat these verbs differently by including the pronoun "se" in the entry name, at least in the case of se barrer and a number of verbs in Category:French reflexive verbs, such as se passionner, se casser, se pouvoir; what's our policy on this? We have essentially duplicated information about the reflexive sense at passionner and pouvoir but not at casser.--Urszag (talk) 04:58, 6 July 2024 (UTC)Reply

@Urszag But this isn't at all how the label 'pronominal' is used here. It is used facultatively for reflexive senses of verbs (including those that are also used non-reflexively) that are idiomatic, that's all. Benwing2 (talk) 05:13, 6 July 2024 (UTC)Reply

BTW the policy for Spanish and Portuguese is that verbs are lemmatized with the reflexive pronoun only if they don't occur non-reflexively. This is different from the practice with Italian, which strictly separates reflexive and non-reflexive verbs into different lemmas. Benwing2 (talk) 05:14, 6 July 2024 (UTC)Reply

French appears to mostly follow the Spanish and Portuguese practice. Benwing2 (talk) 05:15, 6 July 2024 (UTC)

Yes, I see that the current usage of the labels doesn't follow this or any other clear distinction (there are even some cases like se la péter that have both labels), so a bot replacement seems like it wouldn't remove information. I don't oppose that, but it seems like an opportunity to consider the question of whether it is possible to make a non-subjective distinction between the concept of pronominal and reflexive verbs, and to what extent our entries should mark this or leave them undistinguished. Since labels are sometimes used to add words to categories, that made me think about the categorization of these verbs, but it seems like pronominal doesn't actually even place a verb in Category:French pronominal verbs. Is there any easy way to see which pages use a certain label? I just noticed that pronominal is used not only in the lb template but also in other templates such as Template:indtr.--Urszag (talk) 05:35, 6 July 2024 (UTC)Reply

@Urszag {{indtr}} underlyingly uses the label machinery to handle things like .pronominal and other parameters preceded by a dot. You can see which pages use a given label by visiting Special:WhatLinksHere/Wiktionary:Tracking/labels/label/pronominal or a language-specific subcategory such as Special:WhatLinksHere/Wiktionary:Tracking/labels/label/pronominal/fr for French. Note that I'm in the process of converting all uses of {{indtr}} to a combination of {{lb}} and {{+obj}}, which is why I'm running into this issue. Benwing2 (talk) 05:41, 6 July 2024 (UTC)Reply
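The two tracking-page titles mentioned above follow one pattern, so they can be generated for any label and language code. The following is a minimal sketch in Python; the helper name `label_tracking_page` is hypothetical, and the only thing taken from the comment is the title pattern itself.

```python
from typing import Optional

# Title prefix copied from the two page names quoted in the comment above.
TRACKING_PREFIX = "Special:WhatLinksHere/Wiktionary:Tracking/labels/label"


def label_tracking_page(label: str, lang_code: Optional[str] = None) -> str:
    """Build the WhatLinksHere title for a label, optionally per language."""
    parts = [TRACKING_PREFIX, label]
    if lang_code:
        # Language-specific page: the language code is appended as a final segment.
        parts.append(lang_code)
    return "/".join(parts)
```

For example, `label_tracking_page("pronominal", "fr")` reproduces the French-specific title given in the comment, and the same call with `"reflexive"` would give the corresponding page for the other label, assuming it is tracked the same way.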

@Urszag I am also finding various examples e.g. Portuguese dedicar where "pronominal" is used even with explicitly reflexive meanings. Benwing2 (talk) 03:13, 9 July 2024 (UTC)Reply

Support merging. In my experience "reflexive" is simply the term used in English-based learning materials where Romance languages use "pronominal". You could distinguish between the different functions of these verbs that Urszag listed, but I imagine none of the editors adding labels to these verbs have those distinctions in mind. They are most likely just using the terminology of their native language. Ultimateria (talk) 08:10, 9 July 2024 (UTC)Reply

where does Medieval Latin begin?

Latest comment: 4 days ago | 45 comments | 9 people in discussion

This came up in a WT:RFDO topic. I'd like to establish clearly where Medieval Latin begins, so we can determine whether categories like CAT:Proto-West Germanic terms borrowed from Medieval Latin are legitimate, or should be emptied by fixing the terms in it to refer to Late Latin, Vulgar Latin or some other variety. AFAIK Medieval Latin begins no earlier than 600 AD; anything prior is Late Latin. User:Theknightwho and User:Nicodene agree with me, but User:Victar claims that Medieval Latin begins in the 4th century AD with Christian writers such as Jerome. What's the consensus here? Benwing2 (talk) 04:44, 7 July 2024 (UTC)

‘Late antiquity extends roughly from 200 to 600, and the grammarians active during this period are often known as the Late Latin grammarians [...] The early Middle Ages (600—800) was characterized by the need to study Latin as a foreign language [...]’ - Mantello & Rigg (1996), Medieval Latin: An introduction and bibliographical guide, page 288.

Nicodene (talk) 05:23, 7 July 2024 (UTC)

I would not expect it to be used of Latin earlier than 500 AD (or 476 if we use historical events as a marker). The Dictionary of Medieval Latin from British Sources apparently focuses on texts from between 540 and 1600.--Urszag (talk) 09:12, 7 July 2024 (UTC)

What Nicodene and Urszag said. From the Early Middle Ages; for me Boethius is a marker, himself Late Latin. The 4th century even in Christian writers is clearly far from Medieval. It is bizarre to view Augustine as Medieval Latin, though the so-called Church Fathers be all at fault for the decline and fall of the Roman Empire. This is the same fallacy as calling the Qurʔān Classical Arabic only because it is the basis of Classical Arabic. Fay Freak (talk) 09:40, 7 July 2024 (UTC)

I don't really have that much of a dog in this, but this is from {{R:ine:EIEC|xxi}}: "[M]edieval Latin, a rather generic designation for Latin of the third century AD and later. (The cutoff date between Latin and medieval Latin follows that of the Oxford Latin Dictionary)". Personally, I also find this early, but the 7th century seems quite late. If one used the fall of the Roman Empire, that would be the end of the 5th century, and the works of Boethius would be the start of the 6th century. --{{victar|talk}} 05:07, 8 July 2024 (UTC)Reply

Quoting what @Nicodene said in response to this in the other thread: "The Oxford Latin Dictionary set an approximate cut-off of 200 AD for the end of Classical Latin (the date I use as well), not the start of 'Medieval Latin'. I hate to say it, but the authors of the EIEC are simply mistaken." Benwing2 (talk) 05:11, 8 July 2024 (UTC)Reply

Also, to repeat (and add to) what I said: if what Victar is claiming is true, it either leaves no room for Late Latin, or means that we have to start treating Late Latin as a period of Medieval Latin; neither of which makes much sense to me. Theknightwho (talk) 05:27, 8 July 2024 (UTC)Reply

TKW, "Victar is claiming is true"? These aren't my claims, and I was citing Urszag and Fay Freak. Please see their replies above. {{R:itc:EDL|14}}, going off of Weiss, claims Late Latin spans 3rd~4th c. to 5~6th c., leaving Medieval Latin to begin in the 5~6th century. That would allow for ML borrowings into 5th century Frankish/Proto-West Germanic, as well as Proto-Slavic. --{{victar|talk}} 05:34, 8 July 2024 (UTC)Reply

I think that's pushing it. Benwing2 (talk) 05:54, 8 July 2024 (UTC)Reply

And by that you mean to say you think de Vaan's/Weiss' dates are wrong? --{{victar|talk}} 05:56, 8 July 2024 (UTC)Reply

If "Late Latin spans 3rd~4th c. to 5~6th c", then Medieval Latin should start 6th~7th century, not 5~6th century. It's pushing it to infer from Late Latin having a 5th-6th century ending date that Medieval Latin can start as early as c 425 AD. That seems exceedingly unlikely to me. Benwing2 (talk) 05:59, 8 July 2024 (UTC)Reply

"then Medieval Latin should start 6th~7th century": No it wouldn't. See https://pasteboard.co/4tt1HXHqRvUq.png from the EDL. --{{victar|talk}} 06:03, 8 July 2024 (UTC)Reply

I take c 5th/6th century to mean c 500 AD. You can't take it to mean a 200 year range and arbitrarily pick the earliest possible date as the beginning of Medieval Latin. Benwing2 (talk) 06:06, 8 July 2024 (UTC)Reply

In any case, I think you'll have a hard time getting consensus on a date for Medieval Latin before 500 AD at the earliest, and you're kinda tilting at windmills trying to do so. Benwing2 (talk) 06:08, 8 July 2024 (UTC)Reply

That shows Late Latin lasting into the 5th~6th c., as we've been saying.

Also there really is no calling anything prior to 476 (at the earliest) 'medieval' in any sense. Nicodene (talk) 06:12, 8 July 2024 (UTC)Reply

Nicodene, in Benwing's opening statement it is claiming the 7th century as the start of Medieval Latin. I am fine with a 5th~6th century start to ML, which still allows for some very late PWG and SL borrowings. --{{victar|talk}} 06:16, 8 July 2024 (UTC)Reply

As I said, you're trying to impose an artificially early date on Medieval Latin so you can borrow from Medieval Latin into PWG. I don't buy it. PWG is < 500 AD, Medieval Latin is > 500 AD, hence no overlap. Benwing2 (talk) 06:25, 8 July 2024 (UTC)Reply

And you keep trying to impose some finite date, when it's actually porously 5th~6th century. The end of PWG itself too is vague, and probably better also labeled 5th~6th century, as many scholars would call the 6th century Malberg glosses still Frankish.

In the end, it really doesn't matter. If all those entries on CAT:Proto-Slavic terms derived from Medieval Latin and CAT:Proto-West Germanic terms borrowed from Medieval Latin were changed to Vulgar or Late Latin, it would be of little consequence. You came at me hot, though, and so I'm just giving my understanding of the scholarship on the issue. --{{victar|talk}} 07:33, 8 July 2024 (UTC)Reply

Specialists generally place the cutoff in the sixth century AD (beginning, end, or somewhere in between) give or take a few decades. The cutoff is often tied to the death of a scholar, for instance Boethius or even more so Isidore of Seville. The latter lived to see the last gasps of the old Roman order. Nicodene (talk) 08:29, 8 July 2024 (UTC)Reply

I have to agree that anything before 476 AD can't be considered medieval. Also, Wikipedia claims that Etymologiae (c. 625) is Late Latin, so maybe the date should be pushed even later...? Ioaxxere (talk) 13:19, 8 July 2024 (UTC)Reply

The endpoint of "Late Latin" can be put at various places: Wikipedia says some would put it as late as 900 CE. My viewpoint is that if we make use of the term "Medieval Latin", it is best to define it in terms of the same date range commonly recognized for the Medieval period/Middle Ages in historical periodization. While the start of the Middle Ages isn't entirely fixed by convention (The Catholic Encyclopedia suggests you could take 375, 476, or 609) our entry at Middle Ages and Wikipedia both describe it as starting at 500 CE. If we expect this definition of "Medieval" to be the most common prior expectation for our readers, I think it seems strange to cut off several centuries from the category bearing that name. Sure, the division is arbitrary since there is no sharp transition between Late Latin and Medieval writers, but the same applies to Classical and Late Latin, and Medieval and New Latin.--Urszag (talk) 14:08, 8 July 2024 (UTC)Reply

This claim is for his generation. Isidore, over 60 when authoring his work, must have employed the Late Latin language he learnt when a bairn, like some of our seniors appear to relate to 20th-century English, and foreign languages, better than Generation Alpha slang. Idiolects aren’t all updated at the same time, so chronolects have intersections in reality. Fay Freak (talk) 14:13, 8 July 2024 (UTC)Reply

To give a real-world example, Latin plastrum is only attested in Medieval Latin, so the etymology on PWG *plastr is {{bor|gmw-pro|ML.|plastrum}}, supported by {{R:nl:NEW|plaaster}}. What should this be changed to in cases like this, Vulgar Latin? --{{victar|talk}} 20:57, 8 July 2024 (UTC)Reply

"Vulgar Latin" is a problematic term. I would lean towards saying it's not necessary to distinguish different kinds of Latin in the context of categories for borrowings into Proto-West-Germanic, and thus "Proto-West Germanic terms borrowed from Latin‎", "Proto-West Germanic terms borrowed from Medieval Latin", "Proto-West Germanic terms borrowed from Early Medieval Latin‎" and "Category:Proto-West Germanic terms borrowed from Vulgar Latin" might be better as just one category. In that case, it could simply use {{bor|gmw-pro|la|plastrum}}. If more context is desired, another format could be "from {{bor|gmw-pro|la|emplastrum}} via a clipped form {{m|la|plastrum}} (attested in Medieval Latin)."--Urszag (talk) 21:27, 8 July 2024 (UTC)Reply

Latin plastrum needs a label for a conjectured chronolect, starred “Late Latin”. I commented four weeks ago about this missing functionality. Fay Freak (talk) 21:34, 8 July 2024 (UTC)Reply

@Victar @Urszag I agree here with Fay Freak that if we actually believe this term existed in PWG, it needs to be derived from a hypothesized Late Latin term. I should add that the earliest cites listed in Du Cange and DMLBS are c. 1200 AD; not even Early Medieval Latin. @Fay Freak I saw your comment but didn't respond because I wasn't sure (and still am not sure) what you're asking for exactly. Can you give an example? Benwing2 (talk) 22:15, 8 July 2024 (UTC)Reply

@Benwing2 I think Fay Freak was saying exactly the same thing as you, but as a FF-ism. Theknightwho (talk) 22:30, 8 July 2024 (UTC)Reply

@Benwing2: I think the thing is simple, but people are unsure how to fit it in. The claimed dialect or chronolect label can be based on conjecture rather than attestation, so merits a star, or something else, if the idea from my side to put it beside language names rather than word forms is erratic, though it is just a general icon for reconstruction and I wouldn’t know which other sign to invent. By the same reasoning a sense has to be marked as reconstructed when the term is attested in but a part of the distinct meanings. Fay Freak (talk) 22:32, 8 July 2024 (UTC)Reply

Another example is PWG *lubistik, where the etymology lists multiple stages of Latin: Borrowed from Medieval Latin lubisticum, libisticum, from Late Latin levisticum, corrupted from Latin ligusticum. Detailing the different forms of Latin helps to give a sense of chronology, which just using plain Latin doesn't afford. --{{victar|talk}} 22:44, 8 July 2024 (UTC)Reply

I suspect a lot of these terms are wanderwords that didn't exist at the PWG stage. For example, if there was really a PWG plăstr, wouldn't we expect OE plæster not #plaster? Benwing2 (talk) 22:48, 8 July 2024 (UTC)Reply

Old High German pflastar exhibits p > pf, which points to it being borrowed before this change occurred, i.e. in Proto-West Germanic. What happens with a lot of Latin borrowings is that they get later reinforced by Latin individually and even later by Old French. --{{victar|talk}} 23:03, 8 July 2024 (UTC)Reply

Not necessarily; the p -> pf change could have survived as a surface filter for hundreds of years after it first occurred. Benwing2 (talk) 23:45, 8 July 2024 (UTC)Reply

I didn't realize you were a PWG editor. --{{victar|talk}} 00:44, 9 July 2024 (UTC)Reply

Cut the sarcasm. You're not a Latin "editor" either. Benwing2 (talk) 01:42, 9 July 2024 (UTC)Reply

No, however I am an editor who spends a large portion of their time focusing on Latin borrowings into West Germanic. OHG had no problem borrowing p from Latin, with examples like pensil, from Medieval Latin penicillum. --{{victar|talk}} 02:32, 9 July 2024 (UTC)Reply

I know that, but IMO it doesn't prove that much. Spanish borrowed some words from Latin with ie reflecting short ĕ hundreds of years after the initial sound change /ɛ/ -> ie under stress; Italian borrowed some words from Latin with closed /o/ reflecting Latin ŭ late into the Medieval and Early Renaissance period, more than 1,000 years after the corresponding sound change took place. Russian still sometimes makes the substitution /h/ -> г /g/ in borrowings.

In any case we seem to have 5-1 consensus that PWG can't borrow from Medieval Latin. Benwing2 (talk) 03:23, 9 July 2024 (UTC)Reply

"5-1 consensus"? Where do you see that? What's been discussed is the start date of Medieval Latin. If ML begins in 5th~6th century, 5th~6th century PWG can conceivably borrow from it. --{{victar|talk}} 03:42, 9 July 2024 (UTC)Reply

I've sorted out some of the details of plastrum.

Not the first time I've encountered a late borrowing from Romance into Latin that happens to be spelt the same way as we'd spell the 'Vulgar Latin' reconstruction. An even later example is consutura. Nicodene (talk) 03:15, 9 July 2024 (UTC)Reply

Thanks for creating an entry for it. --{{victar|talk}} 03:23, 9 July 2024 (UTC)Reply

@Nicodene, want to create an entry for *buttia, from buttis? --{{victar|talk}} 04:36, 9 July 2024 (UTC)Reply

@Victar Done. Let me know if there are others like this. Nicodene (talk) 22:50, 9 July 2024 (UTC)Reply

Is it necessary to draw a distinction between Late and Medieval Latin? What if we merge them as simply ‘Post-classical Latin’ or something? Since different sources can variously call the same etymon LL and ML, a merge might be helpful… Inqilābī 04:14, 9 July 2024 (UTC)Reply

Yes. They represent notably different stages and dialects. Late Latin was still spoken natively, while Medieval Latin was learned as a foreign language and has a lot of weirdnesses in it by comparison. Benwing2 (talk) 04:46, 9 July 2024 (UTC)Reply

The end period of natively spoken Latin overlaps with the early period of the use of Latin as a scholarly, learned Lingua Franca. Therefore, any switch-over date we pick is somewhat artificial and arbitrary. To make it easy, we should then use some century. It will not make a great deal of difference whether we let Medieval Latin “take over” per the 5th, 6th or 7th century. But perhaps we should allow the languages to overlap in time, depending on the evidence of use – whether it was a (possibly reconstructed) term used by the common people or a term used by a scholar or scribe. After all, Late Latin was not truly the ancestor of Medieval Latin in the sense in which Middle English is the ancestor of English. --Lambiam 22:42, 17 July 2024 (UTC)Reply

There is a label "post-Classical" that I've used in the past (e.g. octoplus, although an IP replaced it with the more specific ML. label). As a label, it would in theory cover all of Late Latin, Medieval Latin and New Latin, which is a pretty broad time range. I can see why we might want to include a few more divisions. I'm not entirely on board with conceptualizing the boundary between "Late Latin" and "Medieval Latin" as a real internal transition rather than a convenient yet more-or-less arbitrary convention of periodization: while some analyses consider the transition between native use and "learned as a foreign language" use of Latin to be significant and something that happened around the start of the "Medieval" era, there isn't consensus either on the nature of that transition or its date, so I don't think our dictionary should commit to the idea that this is an essential criterion dividing Late Latin from Medieval Latin. (I don't think the presence of these labels in itself requires that theoretical commitment, but I wanted to push back a bit against the viewpoint mentioned by Benwing.)--Urszag (talk) 12:55, 9 July 2024 (UTC)Reply

Unchecked proliferation of pages generated due to using `{{uder}}`

I noticed many editors who aren’t well-acquainted with etymology templates are using {{uder}}. I would recommend displaying a default warning after someone uses this template to deter people from using it. @Benwing2, Chuck Entz Inqilābī 20:28, 7 July 2024 (UTC)Reply

@Inqilābī Could you please give some context as to the kinds of pages you mean, and what the problem is? Theknightwho (talk) 20:34, 7 July 2024 (UTC)Reply

Category:Undefined derivations by language. The previous template was {{etyl}} (which was the oldest etymology template and was non-specific in nature), later replaced by {{uder}} for smoother appearance. This template ought to be substituted with a legitimate etymology template such as {{der}}, {{inh}}, {{bor}}, {{lbor}}, etc. Using {{uder}} generates the aforesaid category and subcategories; even Wingerbot keeps generating these categories as part of its routine category creation. Inqilābī 20:49, 7 July 2024 (UTC)Reply

@Inqilābī My understanding is that the template is supposed to be used if you're not fully sure of the derivation. Ideally every entry would be more specific, but that's not feasible. Theknightwho (talk) 20:53, 7 July 2024 (UTC)Reply

I guess I would rather someone who doesn't know what the right template to use is, uses {{uder}} rather than just {{der}}, which lots of people are in the (unfortunate) habit of doing. Benwing2 (talk) 21:00, 7 July 2024 (UTC)Reply

As someone who does etymology cleanups, {{der}} and {{uder}} are the same thing to me. Using a wrong template isn’t the only issue though; wrong etymons are another problem, commonly found in very old entries. Changing {{bor}} to the more precise {{lbor}} is another maintenance task. Inqilābī 21:10, 7 July 2024 (UTC)Reply

@Inqilābī Well, going forward shall we agree the following?

{{uder}} should be used when you're unsure of the exact derivation, so it categorises the entry into a maintenance category so that it can be replaced with a more specific template if appropriate, or otherwise {{der}} if it's not.
{{der}} should only be used when the derivation does not fit into one of the types of derivation covered by another template.

This seems like a useful distinction to me, at least. Theknightwho (talk) 21:16, 7 July 2024 (UTC)Reply
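To make the proposed distinction concrete, here is a minimal Python sketch of the decision rule as stated in the two points above. The function name and the relation labels are hypothetical, chosen only for illustration; only the template names ({{inh}}, {{bor}}, {{lbor}}, {{der}}, {{uder}}) come from this discussion.

```python
def choose_etymology_template(relation, sure=True):
    """Return the template name suggested by the two rules above.

    relation: "inherited", "borrowed", "learned borrowing", or any other
              string for a derivation that no specific template covers.
    sure:     False when the editor does not know the exact derivation.
    """
    # Hypothetical labels mapped to the specific templates named in this thread.
    specific = {
        "inherited": "inh",
        "borrowed": "bor",
        "learned borrowing": "lbor",
    }
    if not sure:
        # Unsure: use uder so the entry lands in the maintenance category.
        return "uder"
    # Sure: use the specific template, or der when nothing specific fits.
    return specific.get(relation, "der")
```

For example, an editor who knows a term was borrowed gets "bor", one who is unsure gets "uder", and an 'ultimately from' step with no more specific template gets "der".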

@Theknightwho Yes, that was the original intention of these templates. Benwing2 (talk) 21:17, 7 July 2024 (UTC)Reply

@Inqilābī Please don't do that. If you're not sure of whether to use {{inh}} or {{bor}}, leave it alone. Benwing2 (talk) 21:17, 7 July 2024 (UTC)Reply

As someone who also does etymology cleanups, I would rather people be vague than wrong. With {{uder}}, people who know what they're doing can find these entries to fix them. It's already too easy for someone to just copypaste a derivation template from another language without changing the language codes. I find things like Bengali entries in Category:Pashto terms borrowed from Arabic, Bengali terms categorized as Assamese and vice versa. I even find entries for European languages in Category:Indonesian terms inherited from Malay. Then there are the etymologies where a term is borrowed from a neighboring language, which inherited it from another language, which borrowed it from yet another language, and the etymology uses {{bor}} for that last step, even though there was no direct contact between the first language and the last language. Of course, the people who make those errors would probably get the language codes wrong in {{uder}} if they even knew about it. My point is that unnecessary usage of {{uder}} is pretty minor compared to that kind of nonsense. Chuck Entz (talk) 22:26, 7 July 2024 (UTC)Reply

@Theknightwho: I am pretty sure that is not the case. It’s only a cleanup template / category— and if unsure about the immediate etymon, using {{der}} is sufficient (but a {{rfe}} can be added alongside). It wasn’t created to exist permanently, and will be deprecated eventually. Inqilābī 21:02, 7 July 2024 (UTC)Reply

I'm a little confused, because wouldn't that mean it could have been wholesale replaced by {{der}} without any loss of specificity? I thought the whole point was that it created the maintenance category because the editor suspects a more specific derivation may be possible, whereas {{der}} does not necessarily imply that, meaning it's still useful if people add it. Theknightwho (talk) 21:05, 7 July 2024 (UTC)Reply

@Theknightwho That's right. {{der}} is supposed to be used only when {{inh}} and {{bor}} don't apply. I don't agree with User:Inqilābī that {{der}} is OK if you're not sure of the correct template. Benwing2 (talk) 21:07, 7 July 2024 (UTC)Reply

Firstly, I think having {{der}} is okay in cases where the etymology says ‘ultimately from’. Another editor could subsequently add to the etymology if more data is available. {{rfe}} can always be used, and editors can fix etymologies from that category page. Originally, {{uder}} was created to prevent someone who was converting {{etyl}} to {{der}} wholesale, and not out of the desire to make a category for to-be-fixed etymologies.

But if people want to change the purpose of this template, then I have no objections- I am just stating what I thought was a misuse of the template and categories. Inqilābī 21:21, 7 July 2024 (UTC)Reply

@Inqilābī On a separate-but-related note, please don't change empty categories with {{auto cat}} to {{d}}, because if they stop being empty then they're still no longer part of the category tree, and it means someone else (read: an admin) has to actively undo your change, which is annoying. Empty categories get automatically categorised into Category:Empty categories, and are deleted routinely. However, these categories are maintenance ones that shouldn't be deleted anyway, since they will occasionally see new entries. Theknightwho (talk) 21:23, 7 July 2024 (UTC)Reply

I agree with this; I just did a run deleting empty categories a couple of days ago. Benwing2 (talk) 21:25, 7 July 2024 (UTC)Reply

Okay sorry, I didn’t know I wasn't supposed to do that with the {{uder}} categories, even though I have done this for a long time without anyone objecting. Inqilābī 21:32, 7 July 2024 (UTC)Reply

@Inqilābī Yes, IMO {{der}} is OK in 'ultimately from' cases but from what you said above it sounded like you were doing exactly that wholesale conversion of {{uder}} to {{der}}, which is absolutely wrong. Benwing2 (talk) 21:23, 7 July 2024 (UTC)Reply

Yeah, people don’t clearly understand what I say. This was funny because I am against indiscriminately using {{der}}, and I was the main reason this was halted after I started a BP discussion about the problem some years ago. Inqilābī 21:31, 7 July 2024 (UTC)Reply

My apologies, when you said {{der}} and {{uder}} are the same thing to me it sounded like you were using {{der}} indiscriminately. Benwing2 (talk) 21:35, 7 July 2024 (UTC)Reply

Few considerations

Future consideration: It seems like editors who use {{uder}} use it randomly (this includes copying and pasting from elsewhere) and not because they think it is to be used if they are uncertain about the specific etymology. Our default welcome message could also contain a link to a page listing the etymology templates with explanations about each of their purposes, which would help prevent a misuse of any of the templates by new editors. This in turn would rule out the necessity of any maintenance etymology template.
Immediate consideration: If empty pages aren’t regularly deleted then {{uder}} cleanup becomes difficult. I don’t want to click on every subcat to check which languages’ cleanups are complete. Hence I would suggest a bot deleting the empty pages periodically (from what I know this has not been done for the category in question), or actually allowing editors to tag the pages for deletion, or even creating a duplicate copy of the category which won’t contain empty subcats (I’m not sure if the last option is feasible). This makes life easier for people going across language entries to substitute this template with better etymology templates.

Inqilābī 12:24, 8 July 2024 (UTC)Reply

@Inqilābī Your second point confuses me: if you're monitoring them from Category:Undefined derivations by language, then you can already see which are empty. The next level up from that is Category:Entry maintenance subcategories by language, which is way too broad to be conducting routine entry maintenance from, since you generally want to pick a single area to focus on at any one time. Theknightwho (talk) 15:20, 8 July 2024 (UTC)Reply

@Theknightwho: Do you mean those (x e) things beside the category names? Wow I never realized until just now that it indicated the number of entries contained. Sorry for wasting everyone’s time!- but also thank you for pointing it out. Inqilābī 15:31, 8 July 2024 (UTC)Reply

@Inqilābī No worries. If it has subcategories, you can also click the little arrow to the left of the name to expand them. Theknightwho (talk) 15:33, 8 July 2024 (UTC)Reply

@Theknightwho: I am aware of those arrows, but for some reason every single arrow in this category actually appears grey, and I’m unable to expand them. Inqilābī 15:38, 8 July 2024 (UTC)Reply

@Inqilābī Yeah, it only shows subcategories - you can't use it to see pages, unfortunately. Theknightwho (talk) 15:39, 8 July 2024 (UTC)Reply

@Benwing2, well I would still urge periodically deleting empty subcats of undefined derivations using your bot, given that limitless new subcats can be created by anyone, while I prefer it be a short list of cleanup category which I don’t have to scroll through to be able to spot the non-empty ones. Thanks for considering! Inqilābī 11:59, 9 July 2024 (UTC)Reply

Voting to ratify the Wikimedia Movement Charter is ending soon

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hello everyone,

This is a kind reminder that the voting period to ratify the Wikimedia Movement Charter will be closed on July 9, 2024, at 23:59 UTC.

If you have not voted yet, please vote on SecurePoll.

On behalf of the Charter Electoral Commission,

RamzyM (WMF) 03:46, 8 July 2024 (UTC)Reply

etydate template

Originally, {{etydate}} displayed its text in small type inside square brackets. This was later removed, while the dot at the end of the text and the parameter |nodot= were retained. The trailing dot is not always needed, and editors often continue the same sentence after the template-generated text. So I think it should be consistent with other etymology-line templates like {{doublet}}, {{calque}}, etc., which generate text but no dot, also because it's much less of a hassle to type a dot than |nodot=1. I'd like to know if people agree with or object to such a change. Svartava (talk) 12:15, 9 July 2024 (UTC)Reply

I'd be in support of this, as probably the biggest user of this template. All instances would need a bot update, but it would be much better imo. Similarly, I've considered removing "the" when a non-number was given, opting to manually have it, but maybe it's better to have it. Vininn126 (talk) 12:17, 9 July 2024 (UTC)Reply

Support. (Adding the langcode as the first parameter might also be useful as it would help creating lists of terms in a given language by first attestation.) Einstein2 (talk) 12:34, 9 July 2024 (UTC)Reply

There was also talk of having categorization for dates at some point, but no one was able to come up with a concrete system. Vininn126 (talk) 12:57, 9 July 2024 (UTC)Reply

Support removing the default dot and removing the |nodot= parameter in {{etydate}} because it's easily much less a hassle to type a dot.

Adding the langcode as the first parameter seems like a useful idea as well for creating lists of terms in a given language by first attestation. Kutchkutch (talk) 13:46, 9 July 2024 (UTC)Reply

Support and I can do the bot changes. Benwing2 (talk) 18:39, 9 July 2024 (UTC)Reply

Support — BABR・talk 19:06, 9 July 2024 (UTC)Reply

Support - Leasnam (talk) 02:26, 10 July 2024 (UTC)Reply

Done; as of now, the cleaning up is in process. @Benwing2: If you can run a bot job of adding . at the end of instances of {{etydate}}, here's a list of entries on which cleanup has not been done -- rest have been fixed. Thanks! Svartava (talk) 09:18, 11 July 2024 (UTC)Reply
@Svartava Can you tell me exactly what steps you took and in what order? In the future it would be better not to do half-cleanups like this, particularly when making a change that isn't idempotent such as adding or removing a period, because it's difficult to figure out how to do the bot changes correctly. It would be better to let me do it completely. Benwing2 (talk) 02:00, 12 July 2024 (UTC)Reply
@Benwing2: I removed all instances of |nodot=1 and added periods at the end of {{etydate}} calls on some of the pages on which |nodot=1 wasn't used. I've created the list mentioned above for the entries on which cleanup is yet to be done, so on those entries it's just a matter of adding periods after {{etydate}}, since all of those pages are among the ones that were not using |nodot=1. Svartava (talk) 02:51, 12 July 2024 (UTC)Reply
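The two steps described here can be sketched as a single pass, which also shows why the change is not idempotent. This is only a rough Python illustration, not the bot code actually used. The function name is hypothetical, the positional arguments in the sample calls are invented, and the regex assumes that an {{etydate}} call contains no nested templates.

```python
import re

# Matches a whole {{etydate|...}} call that has no nested braces (an assumption).
ETYDATE = re.compile(r"\{\{etydate\|[^{}]*\}\}")

def migrate_etydate(wikitext):
    """Apply the one-time migration to every {{etydate}} call in wikitext."""
    def fix(match):
        call = match.group(0)
        if "|nodot=1" in call:
            # The template no longer emits a dot, so the flag is simply dropped.
            return call.replace("|nodot=1", "")
        # The dot used to be automatic; write it out explicitly after the call.
        return call + "."
    return ETYDATE.sub(fix, wikitext)
```

Running this a second time on already-migrated text would append another period after every call, because a migrated call no longer carries any marker that distinguishes it from an unmigrated one. That is the difficulty with a half-finished cleanup: the pass has to know exactly which pages have already been touched.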

Removing hiragana transliterations in Japanese

Hello, I propose that I run a bot task to remove the instances where we see hiragana used as part of the transliteration when linking to Japanese, e.g. {{t|ja|窯|tr=かま, kama}} → {{t|ja|窯|tr=kama}}. This is because the hiragana doesn't add anything more than the romanization already offers; the transliteration doesn't help those who can't read Japanese writing anyway; and, in general, I think |tr= should be reserved for Latin writing, as people who only know English can at least always derive something from it. I believe we had (maybe not everyone) agreed that we should only use the romanization in this case for Japanese, but please let me know what you think.

What I would do: 1. Likely make a tracking category for entries that use non-Latin in Japanese translations in {{t}}, as I believe the majority of uses of this are in translation sections; 2. Iterate over all translations, and for each one: 3. If the transliteration of the hiragana equals the romanization, simply remove the hiragana; else, save it for our review, in case there are any mismatches in transliteration out there.

If there's a more refined way to access any instances of the link module using non-Roman transliterations, that might also be a better substitute for step 1, but I don't know if that exists. Kiril kovachev (talk・contribs) 21:35, 9 July 2024 (UTC)Reply
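Steps 2 and 3 above could look roughly like the following Python sketch. It is a toy version under stated assumptions: the kana table covers only a handful of syllables, the regex handles only the simple {{t|ja|…|tr=…}} shape from the example, and the function names are hypothetical. Anything the toy converter cannot romanize is left untouched and queued for review rather than guessed at.

```python
import re

# Deliberately tiny kana table (illustration only); a real run would need
# a complete Hepburn converter.
KANA = {"か": "ka", "ま": "ma", "さ": "sa", "く": "ku",
        "ら": "ra", "や": "ya", "み": "mi"}

# Simple {{t|ja|TERM|tr=...}} or {{t+|ja|TERM|tr=...}} calls with no other parameters.
T_JA = re.compile(r"\{\{t\+?\|ja\|([^|{}]+)\|tr=([^|{}]+)\}\}")
HAS_KANA = re.compile(r"[\u3040-\u30ff]")  # hiragana and katakana blocks

def romanize(kana):
    """Return romaji for kana covered by KANA, or None if any character is not."""
    try:
        return "".join(KANA[ch] for ch in kana)
    except KeyError:
        return None

def clean_translations(wikitext):
    """Return (new_wikitext, calls_needing_review)."""
    review = []

    def fix(match):
        tr = match.group(2)
        if not HAS_KANA.search(tr):
            return match.group(0)              # already romanization-only
        parts = [p.strip() for p in tr.split(",")]
        if len(parts) == 2 and romanize(parts[0]) == parts[1]:
            # Kana and romaji agree: keep only the romanization.
            return match.group(0).replace("tr=" + tr, "tr=" + parts[1])
        review.append(match.group(0))          # mismatch or unsupported kana
        return match.group(0)

    return T_JA.sub(fix, wikitext), review
```

With this, {{t|ja|窯|tr=かま, kama}} becomes {{t|ja|窯|tr=kama}}, while a call whose kana and romanization disagree is returned unchanged in the review list.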

@Kiril kovachev Yes, I agree. I don't mind/quite like having hiragana displayed as rubytext (or we could even do Chinese-style and have it displayed after a slash), but having it in transliterations like this is generally pretty crap. Theknightwho (talk) 21:15, 10 July 2024 (UTC)Reply

@Theknightwho Do you think we should do either of these ideas, instead of outright removing it? Kiril kovachev (talk・contribs) 21:44, 10 July 2024 (UTC)Reply

@Kiril kovachev I think for now it's best to remove them, since any conversion to rubytext would need to be done manually, and a lot of them are quite sloppy.

We probably want to have a proper discussion about displaying multiple forms (i.e. rubytext), as it might make more sense to use the Chinese style in translation sections (where space is tight), as compared to other places. I'm still keen for us to display kana forms, though, since having to work backwards from transliterations is annoying. Theknightwho (talk) 21:57, 10 July 2024 (UTC)Reply

@Theknightwho Alright, I'll focus on removing them now, then, if that's what we want. I'm asking because if we eventually wish to convert the translations to give kana inline anyway, getting rid of it now would just make it harder for us later, no? Kiril kovachev (talk・contribs) 22:13, 10 July 2024 (UTC)Reply

@Kiril kovachev Hopefully it should be possible to automatically scrape kana in some cases, which should mitigate this. However, I don't think it's too much of a problem if we remove it now, since it would all need to be converted properly by hand anyway. Theknightwho (talk) 22:20, 10 July 2024 (UTC)Reply

Okay, gotcha. I'll figure it out one of these days hopefully then. Kiril kovachev (talk・contribs) 22:23, 10 July 2024 (UTC)Reply

Agreed also. Benwing2 (talk) 21:32, 10 July 2024 (UTC)Reply

@Kiril kovachev:

Oppose simply deleting these hiragana readings. In Hepburn romanization (which is standard in the English Wiktionary), the long vowel in any intra-morpheme combination of an お段(だん) (odan) kana + う (u) or お (o) is transcribed ō without distinction, じ and ぢ are both transcribed ji, and ず and づ are both transcribed zu, so it is untrue that the hiragana adds nothing, since one can't in all cases infer the hiragana from the Romanisation (the converse is also true, since kana do not distinguish case, whereas the Latin script does). It is also very common for learners of Japanese to be proficient in kana but be unfamiliar with many kanji (I am just such a learner). I would propose converting these hiragana readings to furigana, but as Theknightwho already noted, the smaller text size and lack of space in translation tables militate against this. I'm not thrilled about slashed translations à la Chinese traditional/simplified spellings and would prefer the kana remain in parentheses alongside the rōmaji, but it would be a tolerable solution.

0DF (talk) 22:54, 10 July 2024 (UTC)
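
To make the point about lost distinctions concrete, here is a minimal Python sketch. The table is a hand-picked assumption covering only the syllables mentioned in the comment above (plus one illustrative long-vowel pair); it is not a full Hepburn implementation and not any module used on Wiktionary. It groups kana by romanization and reports which romanizations map back to more than one kana spelling.

```python
# Tiny, hand-written kana-to-Hepburn table (illustrative assumption, not a full system).
HEPBURN = {
    "じ": "ji", "ぢ": "ji",      # both transcribed "ji"
    "ず": "zu", "づ": "zu",      # both transcribed "zu"
    "とう": "tō", "とお": "tō",  # o-row kana + う or お, both transcribed with "ō"
}


def ambiguous_romanizations(table):
    """Return the romanizations that correspond to more than one kana spelling."""
    by_roman = {}
    for kana, roman in table.items():
        by_roman.setdefault(roman, []).append(kana)
    return {roman: kana for roman, kana in by_roman.items() if len(kana) > 1}


if __name__ == "__main__":
    for roman, kana in ambiguous_romanizations(HEPBURN).items():
        print(roman, "<-", kana)
```

Every romanization in this small table is ambiguous, which is the sense in which the kana cannot always be inferred from the rōmaji alone.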

@0DF Well, I've personally argued for having a 1-to-1 transliteration system in the past, but that doesn't seem to be overly popular, so for now you're right that it does add minor distinctions, which I chose to ignore because I didn't believe them to be overly significant. The reason is because you can just click on the link to see the kana if you wanted to see the original; after all, if you wanted to see the historical spelling, which is even more distant from the current pronunciation, you'd again have to do that too. The differences in kana and romaji don't affect the way a reader would pronounce the word, as far as I'm aware, which is in the first place why the romanization is identical for all those syllables.

But, I get that there is a difference, and that you won't be able to spell the word in kana correctly if all you have is the Hepburn. Fair enough. Maybe we can abstain from removing it for now.

I also have a few suggestions for what we can do with the kana, though, to keep it a bit smaller: as some dictionaries do, we can have the furigana given as a subscript on the kanji. Or as brackets after each kanji, but that's less readable IMO. Kiril kovachev (talk・contribs) 23:06, 10 July 2024 (UTC)Reply

@0DF I completely disagree with this. You are arguing based on a small number of edge cases, which can easily be determined for those small numbers of people who care, by looking at the lemma page. Benwing2 (talk) 23:39, 10 July 2024 (UTC)

@Benwing2 Well, you could make the same argument for removing simplified Chinese, since - like kana - it can't be readily determined in only a quite small proportion of cases. Theknightwho (talk) 23:46, 10 July 2024 (UTC)

@Theknightwho But the difference is that simplified Chinese *IS* the normal way of writing these lexemes for 95%+ of native speakers, which doesn't apply to kana in the case of words normally written with kanji. Benwing2 (talk) 23:58, 10 July 2024 (UTC)

@Benwing2 It's trivial to find examples of words which are in free variation between the two. It's not safe to just assume the kana forms aren't used. Theknightwho (talk) 00:03, 11 July 2024 (UTC)

@Theknightwho I'm not sure why you are arguing. Did you change your mind about removing kana from transliterations or are you just playing devil's advocate? Benwing2 (talk) 00:08, 11 July 2024 (UTC)

@Benwing2 I think we should remove the ones given in manual transliterations, but I'm ultimately in favour of having kana displayed somehow. Theknightwho (talk) 00:12, 11 July 2024 (UTC)

@Theknightwho I see. Personally I think furigana is enough. Benwing2 (talk) 01:04, 11 July 2024 (UTC)

U4C Special Election - Call for Candidates

You can find this message translated into additional languages on Meta-wiki. Please help translate to your language.

Hello all,

A special election has been called to fill additional vacancies on the U4C. The call for candidates phase is open from now through July 19, 2024.

The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. Community members are invited to submit their applications in the special election for the U4C. For more information and the responsibilities of the U4C, please review the U4C Charter.

In this special election, according to chapter 2 of the U4C charter, there are 9 seats available on the U4C: four community-at-large seats and five regional seats to ensure the U4C represents the diversity of the movement. No more than two members of the U4C can be elected from the same home wiki. Therefore, candidates must not have English Wikipedia, German Wikipedia, or Italian Wikipedia as their home wiki.

Read more and submit your application on Meta-wiki.

In cooperation with the U4C,

-- Keegan (WMF) (talk) 00:03, 10 July 2024 (UTC)

MediaWiki:Gadget-SpecialSearch

Should we get rid of this old gadget (currently on by default)? It creates three new buttons at Special:Search (Google, Bing, Yahoo) which are meant to let you search Wiktionary with an alternative search engine, although the gadget is not working properly at the moment. @This, that and the other mentioned that the gadget might have been created at a point in time in which the built-in MediaWiki search was more "primitive". I think it is no longer necessary, but maybe some people would still like to be able to use it. Ioaxxere (talk) 04:21, 10 July 2024 (UTC)Reply

I don't care either way but if other people don't want to keep & fix it then let's just get rid of it.

I will add, though, you are able to link to Google using [[google:]] (e.g. google:Wiktionary), and if you type google: in the search bar you'll be redirected to Google as well. So if the goal is to search something on Wiktionary and pull results on Google, I assume a fix is possible but I'm not sure how helpful that would be. OTOH if the goal is to pull up Google results on Wiktionary, I'm still not sure it would be helpful. Unless it could count the total number of results (since Google removed that feature), but even that feels like a stretch. — BABR・talk 04:35, 10 July 2024 (UTC)

Looks like nobody cares that much. I'll take the gadget away and we'll see if anyone complains. This, that and the other (talk) 03:25, 19 July 2024 (UTC)

I don't care about the issue, but 9 days in Summer seems like a very short time to expose any issue to objections. DCDuring (talk) 11:39, 19 July 2024 (UTC)

Someone can still voice an objection after it is removed. I don't think there's any harm in removing the gadget now, and I think it's a good way to see if the gadget being gone actually causes issues for anyone. Plus, FWIW, the gadget is not even functional right now and no one has voiced support for fixing it, so there's really no difference between it remaining installed or not. — BABR・talk 19:30, 19 July 2024 (UTC)

Mahagaja changing references to `{{reflist|size=smaller}}`

This was brought up before (link?), but User:Mahagaja has a habit of going around changing the references section on pages from <references /> to {{reflist|size=smaller}}. Assuming this is out of a visual preference, they can simply edit their private common.css to accommodate their personal taste. If we, as a project, wanted this font size as the default, that change would have been made in the backend. --{{victar|talk}} 18:28, 10 July 2024 (UTC)Reply

I don't remember anyone bringing this up before, but if we're not supposed to be allowed to do this, we shouldn't have |size= in {{reflist}}, or maybe we shouldn't have {{reflist}} at all. —Mahāgaja · talk 18:31, 10 July 2024 (UTC)Reply

What a bizarre reason to remove a feature (one can also always use <references />), but I find it useful for notes on entries and references within discussions. Also, {{reflist}} is the only way you can have references inside a references list, which is helpful for notes. --{{victar|talk}} 18:47, 10 July 2024 (UTC)

I wasn't actually advocating removing either {{reflist}} or |size=; I was pointing out the oddity of providing a template that has a certain function, and then complaining when users use it. —Mahāgaja · talk 08:39, 11 July 2024 (UTC)Reply

And it has its uses here and there, but changing all instances in the references section is another thing entirely. Is this just for aesthetic reasons? You can add ol.references { font-size: smaller; } to your common.css file which will accomplish the same. --{{victar|talk}} 03:15, 12 July 2024 (UTC)Reply

That would make them smaller only for me, not for everyone. There's a reason that books have for centuries been printing footnotes at the bottom of the page in a smaller font size than the regular text: you don't want less important information like references to be written as large as the more important information. The difference between main text and footnotes is clearer to the reader when there's a size difference. —Mahāgaja · talk 12:44, 13 July 2024 (UTC)Reply

@Mahagaja: if your intention is to apply {{reflist|size=smaller}} to all entries, this is a matter that should be discussed first. — Sgconlaw (talk) 13:36, 13 July 2024 (UTC)Reply

Well, I never edit entries just to apply {{reflist|size=smaller}}, but if I'm editing, for some other reason, an entry that uses <references/>, I often change that too. Never in a billion years would it have occurred to me that anyone would be annoyed by that, but if they are, I'll stop changing it in existing entries. But I will keep using {{reflist|size=smaller}} in entries I create or entries I'm adding references to for the first time. —Mahāgaja · talk 14:06, 13 July 2024 (UTC)Reply

@Mahagaja: but that's exactly what should be discussed. It makes certain entries look different from others—I'm not sure that's a good idea. — Sgconlaw (talk) 14:09, 13 July 2024 (UTC)Reply

We're a wiki with hundreds of editors. It's inevitable that some entries look different from others, and that's never going to change. And you know how discussions of this type end up: lots of people express lots of different opinions, much more heat than light gets generated, eventually the conversation fizzles out without anything being resolved, and everyone goes back to doing things exactly the way they always have. —Mahāgaja · talk 14:20, 13 July 2024 (UTC)Reply

Language of surnames

Someone has had a lot of fun adding tons of Polish surnames as English entries, rendering Category:English terms borrowed from Polish practically unusable in the process.

Five years after a failed vote (Wiktionary:Votes/pl-2019-11/CFI policy for foreign given names and surnames), couldn't we have another go at devising a policy about this? P U C – 18:52, 10 July 2024 (UTC)Reply

The phenomenon of proper nouns "drowning out" regular words in categories is not unique to this situation. E.g. it's a little bit of work to spot the common nouns in Category:Latin feminine nouns in the second declension and Category:Latin masculine nouns in the first declension given all of the borrowed names in these categories. Does this mean it would be useful to have categories for these that exclude proper names? I once thought so, but I think I've seen some argument about how intersectional categories aren't necessary because there's supposed to be some way to generate them yourself—not that I remember how to do it. Since Category:English terms borrowed from Polish, Category:English proper nouns, and Category:English surnames all exist, in theory there's all the information needed to calculate the difference of these sets.--Urszag (talk) 19:23, 10 July 2024 (UTC)Reply

@Urszag I don't know how easy it is to do set differences. Maybe @Chuck Entz or @DCDuring or someone else who knows the search system would know. But one way to deal with your specific issue is to categorize proper nouns differently from (common) nouns in the above categories.

@PUC I completely agree we need a criterion preventing people from arbitrarily adding surname X as term in language Y. I remember this happening various times, leading to mass RFD's that haven't been resolved consistently. In inflected languages like Russian and Polish it's useful to know how to decline certain foreign names, but not arbitrary ones; an appendix would be sufficient for that. Benwing2 (talk) 23:36, 10 July 2024 (UTC)Reply

It's easy to do searches in the searchbox: 'incategory:"English terms borrowed from Polish" -incategory:"English proper nouns"' (There are 69; note the "-".). Using categories and templates (hastemplate:"template name") in combination is quick. Adding individual words or phrases to shorten the result is also quick. If you do these things, adding regex searches for very specific targeting (eg, rare typos) isn't much of a performance hit. See Help:CirrusSearch (at MediaWiki). DCDuring (talk) 01:54, 11 July 2024 (UTC)
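
For anyone who would rather script this than type it into the search box, here is a minimal Python sketch of the same category difference. It only builds the query string and a request URL for the standard MediaWiki search API (`action=query&list=search`); the helper names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def category_difference_query(include, exclude):
    """CirrusSearch query for pages in `include` but not in `exclude`."""
    return f'incategory:"{include}" -incategory:"{exclude}"'


def search_api_url(query, host="en.wiktionary.org"):
    """Build (but do not send) a MediaWiki API search request for `query`."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "format": "json",
    }
    return f"https://{host}/w/api.php?{urlencode(params)}"


if __name__ == "__main__":
    query = category_difference_query(
        "English terms borrowed from Polish", "English proper nouns"
    )
    print(query)
    print(search_api_url(query))
```

The leading "-" before the second `incategory:` is what turns the intersection into a set difference, as noted in the comment above.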

We should make a subcat called "LangX proper nouns borrowed from LangY". CitationsFreak (talk) 06:11, 11 July 2024 (UTC)

Pintupi-Luritja

Currently, Pintupi-Luritja does not have a script code assigned (just None), meaning that the one translation we have at peace comes up in CAT:Pintupi-Luritja terms in nonstandard scripts. For Pitjantjatjara, we have a special encoding pjt-Latn at MediaWiki:Gadget-LanguagesAndScripts.css which uses the same special characters (in addition to Pintupi-Luritja and Pitjantjatjara both being classified as part of the Western Desert dialect cluster). I personally do not have the ability to do any of these things, because I lack the rights, but I propose that:

a (etymology-only? I don't know what the right handling is here) language code be created for the Western Desert (see w:Western Desert language for what would be included) cluster, or at least a family code,
pjt-Latn be renamed to follow that code,
and all Western Desert dialects be changed to use the special encoding.

Minor Pama-Nyungan languages are currently severely neglected and neither I nor anyone else can give them any attention in this state. Pinging @Soap who I discussed this with on the Discord. -saph668 (user—talk—contribs) 20:56, 10 July 2024 (UTC)Reply

I'm keen for us to not add more custom script codes, as these were essentially inherited from a time when we used different templates for every script. What we need to do is sort this out via CSS. For now, I've added the Latin script, so CAT:Pintupi-Luritja terms in nonstandard scripts is now empty. Theknightwho (talk) 22:02, 10 July 2024 (UTC)

What would sorting it out via CSS entail? -saph668 (user—talk—contribs) 22:05, 10 July 2024 (UTC)

@Saph668 There should be a way to specify it based on what's in the lang= tag, though @This, that and the other may know more. Theknightwho (talk) 22:21, 10 July 2024 (UTC)

I'll look into this again.

I do agree with what @Saph668 says about Pama-Nyungan languages. There doesn't appear to have been any attempt to group them into even the most obvious and uncontroversial subfamilies (Western Desert, Kulin, ...) This, that and the other (talk) 06:17, 12 July 2024 (UTC)

@Theknightwho see the last discussion at Wiktionary:Beer_parlour/2024/January#Deprecating_pjt-Latn. The issue is specifically with page titles and would need some Lua coding to fix. This, that and the other (talk) 06:19, 12 July 2024 (UTC)

I thought the idea was that it needs a special font so that the underscores appear properly. See on the peace page, under "Translations to be checked", how Pintupi still appears with a normal font, but the closely related Pitjantjatjara language (which coincidentally is next to it alphabetically) uses a font that looks slightly more bunched together. I was only guessing, but my intuition told me that the reason we do this is because these languages are among the few that use letters with underscores as part of their alphabet, and that these might not render properly on some fonts, especially the ḻ. But I could be wrong. —Soap— 08:14, 11 July 2024 (UTC)

Are taxonomic names Latin or Translingual?

Following an RFV discussion, there has been some further discussion between me and @Benwing2 on how we should treat taxonomic names. So I think it is best to take this discussion to BP. The questions that stand to be resolved are:

For specific epithets that are only attested in taxonomic names and not in the Latin literature, should we categorize them as Latin or as Translingual?
If we do categorize them as Translingual, how should we deal with their inflections?

According to @Urszag on the linked RFV discussion, "other editors agree in the past, a taxonomic name by itself doesn't count as a usage of a word in the Latin language", but that isn't the same as a formal discussion, so here I am. Here are my concerns:

abbotti is currently a Translingual adjective that has no gender specified, but lycioides is currently m or f or n, even though they function identically in the context of being a specific epithet in Translingual. (This is because in Latin, abbotti is the genitive form of a noun, and lycioides is an adjective.)
We're kind of implying that Translingual is a gendered language...

Here are Ben Wing's concerns, to the best of my knowledge (I apologize if I have misrepresented him in any way and I am open to be corrected):

actinocarpus is also currently a Translingual adjective, and it is gendered as m, and the feminine and neuter forms are provided. Further, actinocarpa is currently marked as the "feminine" form of actinocarpus, whereas in Latin it would be instead "nominative feminine singular". He thinks that partially borrowing the inflectional structure from Latin is problematic, because "Translingual doesn't have any grammatical rules"; but then it also "seems wrong" to have to "make three lemmas for the masc/fem/neut varieties".
The whole taxonomic naming system is really "a restricted sort of Latin" and so they should be classified as Latin in the first place.

(Also pinging @Chuck Entz, Trooper57, DCDuring.)

--kc_kennylau (talk) 21:44, 10 July 2024 (UTC)

After some discussion with @Nicodene it has been brought to light that in the (pre-)modern scientific Latin literature, the species names are declined normally, as in Latin (see noctula where the species name Vespertilio murinus is declined in the ablative as Vespertilione murino). This has changed my opinions a bit, and I now think that it would be reasonable to categorize them under Latin. --kc_kennylau (talk) 22:43, 10 July 2024 (UTC)

When I address such things, I normally leave the L2 header alone, not because I don't have beliefs and preferences about them, but because there has been no codification, and not much interest in codifying, how such matters are addressed.

Specifically, Translingual does have simple rules about gender agreement, not unlike those of Latin, that are enforced by the Code authorities.

As to the inflection line for Translingual adjectives (ie, those we have not found in any vintage of Latin), almost all of which imitate Latin in form, it seems a good presumption that all three genders potentially occur, as all three genders are represented among genus names.

I would not be surprised to find that specific epithets that are homonyms of Latin adjectives are used with an apparent definition that differs in some way from the Latin.

The guardians of our Latin entries have not even allowed legal or medical Latin, or modern Latinate inscriptions or mottos (either on the grounds that they are SoP or that they are not properly formed, ie, not SoP) to sully the Latin categories. There is not even much enthusiasm (ie, citation effort) to include modern Church Latin, despite use in running text.

In principle, the same kind of problem can occur with genus names and perhaps names at other ranks. For example, Atlas is a synonym of Dicronorhina, Gaea of Euhagena, Zeus is a genus of ray-finned fish. We treat such terms as both Translingual and Latin (also English), albeit with different definitions.

Finally, in the past century (or longer) many specific epithets have been derived from poorly documented languages that were spoken near where specimens were found. Often there is no Latinate ending grafted on, so the epithet is invariant, its PoS is not obvious, and it is treated as an adjective.

I have no proposal to make and wonder what, if anything, is now being proposed. DCDuring (talk) 22:47, 10 July 2024 (UTC)

Why not remove the genders? Then it can be Translingual and no assumption must be made.

On the other hand, I think specific epithets (or any part of the name) that are obviously construed as being Latin should be given a Latin section in addition to their Translingual section, where you can specify the declension. But if that's no good, then I also see no problem with simply saying that Translingual "can" be a gendered, and declined, language. It's not one language after all, it's any terms that aren't specific to one language, so if some of the vocabulary used translingually is declinable or gendered, is there really a problem? Kiril kovachev (talk・contribs) 22:47, 10 July 2024 (UTC)

I tend to agree with DCDuring and Kiril here, with one exception. I don't think the occasional use of taxonomic names in Latin literature is sufficient reason to treat all these terms as Latin. We have special rules for Translingual taxonomic names and templates for specific epithets that denote the gender, which is essential imo and cannot be omitted - this is where I differ from Kiril.

The only thorny patch arises due to the fact that Latin is an LDL, so even a single attestation of a declined taxonomic name (like Vespertilione murino) in a Latin text would give us license to add a Latin entry for the genus and the specific epithet. Perhaps time to revisit the idea of treating post-1500 Latin as a WDL. (Although in this case it's a moot point, as vespertilio and murinus both date back to classical Latin.) This, that and the other (talk) 23:14, 10 July 2024 (UTC)Reply

I have been following the criterion that any term that has any use at all in Latin text passes RFV (in accordance with Latin's classification as a LDL). So I haven't attempted to convert any term to Translingual when there are concrete attestations like "Secretum in Vespertilione murino et V. noctula foetidum". This, that and the other, it sounds like you would be in favor of a stricter criterion? Whereas kc_kennylau, it sounds like you are saying that because some species names are attested like this, we should include a Latin entry for any species name, even if zero attestations can be found for that specific term in running Latin text? I'm not seeing a consensus yet.

Even though I wouldn't agree with the viewpoint that binomial nomenclature is a form of Latin, it is of course undeniable that this naming system follows some conventions that are derived from Latin grammar. One of these is agreement in gender (masculine, feminine, or neuter) for adjectival epithets. So I don't think it's adequate to simply have an entry for the form "actinocarpus" with no indication that its feminine form is "actinocarpa" and its neuter form is "actinocarpum". Nor does it make sense to have these as separate, disconnected entries. I don't see it as problematic to include this in a Translingual entry, as Translingual is not itself a language that can have or lack grammatical rules as a whole: different entries in Translingual may belong to their own subsystems of communication that follow their own particular rules.--Urszag (talk) 01:33, 11 July 2024 (UTC)Reply
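
As a rough illustration of the gender-agreement convention described above, here is a minimal Python sketch that derives the three forms of an adjectival epithet from its masculine form. The rules are a deliberate simplification and an assumption on my part: only the -us/-a/-um pattern and the -is/-is/-e pattern are handled, and anything else is treated as invariant. It does not reflect how any Wiktionary template works and would mishandle nouns in apposition or adjectives in -er.

```python
def epithet_forms(masculine):
    """Return masculine/feminine/neuter forms for a Latinate adjectival epithet.

    Simplified assumption: -us/-a/-um and -is/-is/-e patterns only;
    everything else is returned unchanged for all three genders.
    """
    if masculine.endswith("us"):
        stem = masculine[:-2]
        return {"m": masculine, "f": stem + "a", "n": stem + "um"}
    if masculine.endswith("is"):
        return {"m": masculine, "f": masculine, "n": masculine[:-2] + "e"}
    # One-termination adjectives (e.g. those in -oides) and invariant epithets.
    return {"m": masculine, "f": masculine, "n": masculine}


if __name__ == "__main__":
    for word in ("actinocarpus", "lycioides"):
        print(word, epithet_forms(word))
```

A single entry carrying this kind of mapping is what keeps "actinocarpus", "actinocarpa" and "actinocarpum" connected rather than existing as three unrelated lemmas.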

So, it seems like all the respondents so far (@DCDuring, Kiril kovachev, This, that and the other, Urszag) would agree with (or at least accept) a policy like this:

1. If a specific epithet is found in the Latin literature, then it is to be formatted as ==Latin==, with all the gender and inflection information included (e.g. noctula).
2. Otherwise, it is to be formatted as ==Translingual==, where there are at most three forms (masculine, feminine, neuter), and no inflection (e.g. actinocarpus).

(For the second point, it should be mentioned that the official guidelines do specify that the specific epithets are gendered and need to agree with the gender of the genus, when they originate as a Latin adjective.)

In addition, I would like to propose the following points:

3. The Translingual genus names (e.g. Abroma) should be included in the gendered categories (e.g. Category:Translingual neuter nouns).
   (When {{taxoninfl}} was converted to use {{head}} by Special:Diff/59825365/72996213, |nogendercat=1 was specified because the original code did not categorise according to gender. I'm not sure if that was a deliberate decision, or they just forgot to categorize.)
4. The various other relations between the various nouns and adjectives should be suffixes instead of inflections. For example, abdimii has the suffix -ii that forms specific epithets (even though it comes from the Latin genitive), and Acanthodii has the suffix -ii that forms classes (even though it comes from the Latin plural).
5. We implicitly agree that these words are not Latin and thus do not have vowel length.

A question remains to be resolved, that I have mentioned above: abbotti originates as a genitive, and lycioides originates as an adjective whose three genders are the same (also see Point 5). When they function as specific epithets, they behave the same in all regards. Should we treat them the same? i.e. should they either both be m or f or n or both have no gender specified?

(All three approaches have potential problems:

If abbotti has no gender specified and lycioides is m or f or n, then it's inconsistent descriptively (as specific epithets). (I think this is the original intent of the official guideline, but the guideline itself kind of retains (reasonably so) some features of Latin.)
If abbotti becomes m or f or n, the Latinists might not like this. (In my opinion this seems to be the best solution, and if the Latinists don't want to accept neo-Latin then I guess they also cannot decide how Translingual grammar works.)
If lycioides becomes no gender specified, then this is kind of inconsistent with our previous ruling that specific epithets agree in gender with the genus.)

--kc_kennylau (talk) 11:46, 11 July 2024 (UTC)

@Kc kennylau I agree with all those points! As for the final point, I principally want to say abbotti should have no gender, whereas lycioides have m or f or n, but you make a good point that, in as much as they aren't Latin, those two ought to behave virtually the same, i.e. be usable after any gender of genus. I had prepared a long argument about why I would propose what I said originally, but now I think we would be best to label them the same, probably with all three genders; as, in both cases, we aren't labelling the inherent gender of the adjective, but what genders of genus it can agree with, which in both cases is all of them. I guess for two- or three-termination adjectives, if those are used in taxonomy, I don't know, they would have one or two genders at the "base" form and links to feminine/neuter versions? Kiril kovachev (talk・contribs) 13:29, 11 July 2024 (UTC)Reply

(barbadensis and cervicornis would be examples. The latter might be a bit problematic.) --kc_kennylau (talk) 13:38, 11 July 2024 (UTC)

Yes. A combination of both. The two main taxonomic codes explicitly state that the names are in Latin and Latinized Ancient Greek, but only a very restricted subset. Basically, all taxonomic writing used to be in Latin, then most of the writing was replaced by the vernacular except that a diagnosis describing a new name in Latin had to be provided, and finally that faded out, just leaving the names themselves. One could make the argument that taxonomic names really are Latin, but the extremely narrow context in which they're used doesn't allow for us to see verbs (except participles), prepositions, or accusative, dative, locative or vocative nominal/adjectival inflections.

As for the names themselves: they all theoretically have gender, but above the rank of genus it's usually impossible to know what it is. The names of genera are nouns in the nominative singular, but they have gender. Species, subspecies, varieties, etc. modify the name of the genus either as:

An adjective in the nominative that agrees with the generic name in gender and number
A noun in the genitive that agrees in gender and number with the referent (so abbottii isn't an adjective)

or as:

A noun in apposition that only agrees with itself.

The genitive can be used for species named after someone or something, or it can be used to refer to some association with the referent, as in Sempervivum tectorum, which has historically been found growing on roofs, or parasitic species named after their host. The last case is the only way to determine the gender of a name above the rank of genus, since a species that parasitizes members of a taxonomic group would have a name that agrees in gender and number with the name of that group.

Thus a species in the genitive named after Mr. Smith would be smithi or smithii, one named after Ms. Smith would be smithae, after the Smith sisters would be smitharum and after Mr. Smith and at least one other Smith would be smithorum.

There's a lot more I could say, but I don't have time right now. Chuck Entz (talk) 15:07, 11 July 2024 (UTC)
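
The Smith examples above can be written out as a small lookup. This is only a sketch in Python: the function name and the gender/number keys are hypothetical, the mixed-group case is folded into masculine plural as in the description above, and the alternative -ii spelling (smithii) is left out for simplicity.

```python
# Genitive endings for epithets that honour people, following the Smith examples.
GENITIVE_ENDINGS = {
    ("m", "sg"): "i",     # one man: smithi (the variant smithii is not generated here)
    ("f", "sg"): "ae",    # one woman: smithae
    ("f", "pl"): "arum",  # several women: smitharum
    ("m", "pl"): "orum",  # several people including at least one man: smithorum
}


def honorific_epithet(surname, gender, number):
    """Attach the genitive ending for the honoree's gender and number to a surname."""
    return surname.lower() + GENITIVE_ENDINGS[(gender, number)]


if __name__ == "__main__":
    for key in GENITIVE_ENDINGS:
        print(key, honorific_epithet("Smith", *key))
```

Note that the ending depends on the person or people being honoured, not on the gender of the genus, which is why such epithets do not change when a species is moved to a genus of a different gender.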

Just wanted to add that the "noun in apposition" is the rule for things that are not Latin. For example: piranga, from Old Tupi, is present in some scientific names (Issoca piranga, Pyrianoreina piranga) and remains the same regardless of whether the genus is masculine, feminine or neuter. Some authors can still choose to latinize them: Aulonastus pirangus and Sternostoma pirangae do exist, but just piranga is still valid. Trooper57 (talk) 15:33, 11 July 2024 (UTC)

@Chuck Entz: I understand that we are following Latin rules, and that the guidelines intend the names to be Latin; but that does not change the descriptive reality of how they are currently used in the scientific community (I would suppose that most biologists don't know Latin), nor should this impact our policy making decisions. This is also apparent in the fact that 31.1.2 of this document had to spell out the various genitive endings, just as you did for the Smiths. In Point 4 of my proposals, these endings (-i, -orum, -ae, -arum) would be classified as Translingual suffixes for the purpose of the English Wiktionary.

Also, after reading your reply, I'm not really sure what your opinion is, regarding this matter.

@Trooper57: The apposition rule can also apply for Latin names, with an example given in the document being Cephenemyia (f) phobifer (m), or Acrochordonichthys (m) ischnosoma (n).

--kc_kennylau (talk) 16:32, 11 July 2024 (UTC)

I didn't think to check it before now, but it seems that the essay Wiktionary:Taxonomic names has covered some of the topics discussed here.--Urszag (talk) 21:42, 11 July 2024 (UTC)

Regarding "As for the names themselves: they all theoretically have gender, but above the rank of genus it's usually impossible to know what it is.", I note that LPSN lists genders above genus level. Examples: phylum Acidobacteriota as "neuter" and family Zavarziniaceae as "feminine".

Although re-reading your remarks, perhaps you intended a contrary interpretation of "above the rank of genus"?

—DIV (1.129.106.197 08:02, 22 July 2024 (UTC))

I suppose there is some merit in making a distinction between noun and adjective for specific epithets even when there is no descriptive difference, in that it is a nice balance between the Latinists and pragmatism. If we do so, then abdimii can be a noun with no gender specified (specific epithets that are nouns in the Latinist sense do not need to agree in gender with the genus, so there is no need to specify the gender thereof). Would people agree with this approach? --kc_kennylau (talk) 22:39, 11 July 2024 (UTC)

I'd say it's at least as accurate (if not more so) to call them nouns as it is to call them adjectives, so I have no objection to that.--Urszag (talk) 22:45, 11 July 2024 (UTC)

(By the way, there are currently 586 entries using the template {{la-epithet}}.) --kc_kennylau (talk) 14:27, 12 July 2024 (UTC)

Automating taxonomic entries

Latest comment: 3 days ago; 4 comments; 3 people in discussion

I have recently made {{taxref}} which can automate the reference section for genus entries and tested it on Felis and Autoserica. Basically, those reference sections usually contain a link to Wikipedia, a link to Wikispecies, and a link to the Commons category. This new template can automatically detect if each link exists, given the Wikidata ID.

I am wondering if we should add more links to templates, and more generally if more things about taxonomic entries can be automated.

For example, I think {{taxfmt}} (which links to a Translingual entry) and {{taxlink}} (which links to Wikispecies) can be unified.

--kc_kennylau (talk) 13:32, 12 July 2024 (UTC)Reply
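
To make the "detect if each link exists, given the Wikidata ID" idea concrete, here is a minimal sketch in JavaScript. It is not the actual {{taxref}} code. It assumes the entity's sitelinks have already been fetched as an object keyed by Wikidata site IDs (`enwiki`, `specieswiki`, `commonswiki`); the function name `referenceLinks`, the interwiki prefixes, and the output format are illustrative assumptions.

```javascript
// Hypothetical sketch, not the real {{taxref}} implementation.
// Assumption: `sitelinks` is shaped like the "sitelinks" object of a Wikidata
// entity, e.g. { enwiki: { title: "Felis" }, ... }.
const PROJECTS = [
  { key: "enwiki", label: "Wikipedia", prefix: "w:" },
  { key: "specieswiki", label: "Wikispecies", prefix: "species:" },
  { key: "commonswiki", label: "Wikimedia Commons", prefix: "commons:" },
];

// Return one wikitext link per sister project that has a sitelink,
// skipping projects for which no page is recorded.
function referenceLinks(sitelinks) {
  return PROJECTS
    .filter(({ key }) => sitelinks[key] && sitelinks[key].title)
    .map(({ key, label, prefix }) => `[[${prefix}${sitelinks[key].title}|${label}]]`);
}

// Made-up sample data for illustration only (no Wikispecies sitelink here).
const sample = {
  enwiki: { title: "Felis" },
  commonswiki: { title: "Category:Felis" },
};
console.log(referenceLinks(sample).join("\n"));
```

Keeping the project list in one table means that adding another linked project would be a one-line change, which may matter if more links are added to the reference section later.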

Reasons not to combine {{taxfmt}} and {{taxlink}} are:

Instances of taxonomic name within {{taxlink}}, but not {{taxfmt}}, are occasionally counted and used to create lists of taxonomic names "wanted" in principal namespace.
There is no reason for every instance of {{taxfmt}} to check for the existence of a Translingual L2 section.
The ease of converting one to the other as we add taxonomic-name entries.

Wikidata links are fine, where they exist. If there is no Wikidata link, then it is usually desirable to link to Species, WP, or Commons pages for taxa of a higher rank, which AFAICR Wikidata does not do. Also useful are some of the links to external databases, even though many of them carry no data that is not also present in others.

BTW, I continue to believe that we do not need to have entries for 10,000,000 species, nor even just the 1,000,000+ described species. We need entries for those that are important for humans, often as evidenced by the existence of vernacular names that more-or-less correspond to species, genus, or higher-ranked taxon. DCDuring (talk) 19:18, 12 July 2024 (UTC)Reply

@DCDuring On your point about the large number of potential entries, many of those won't pass CFI at the end of the day. If a name is used in one paper and then only ever mentioned in taxonomic databases, I don't think that passes, quite frankly. Theknightwho (talk) 16:18, 18 July 2024 (UTC)Reply

We haven't decided that last point, about whether occurrence in a taxonomic database (or table in a print or electronic document, for that matter) was a use or a mention.

But your point would seem to argue against any simple automation of the creation of taxonomic entries and, I would argue, in favor of a system for directing new-entry creation efforts toward those names that had links thereto (ie, were "wanted") by other entries in principal namespace. There are a fair number of orphan or near-orphan taxonomic entries being created now. Some might be good candidates for RfV. DCDuring (talk) 17:23, 18 July 2024 (UTC)Reply

Implementing auto-glossary

Latest comment: 8 days ago; 4 comments; 4 people in discussion

I propose creating pages using {{auto-glossary}} in appendix space and adding User:Ioaxxere/auto-glossary.js to the gadgets list. We should start by beta-testing a couple of pages and get user feedback (@Vininn126). Also, if the gadget gets moved to MediaWiki space it would be ideal if I could be made an interface administrator so I could continue working on the gadget if need be and possibly help maintain our other gadgets as well. What do we think? (Pinging @Benwing2). Ioaxxere (talk) 05:19, 13 July 2024 (UTC)

It seems nice to have. Vininn126 (talk) 09:54, 13 July 2024 (UTC)Reply

I am fine with installing this. Also I'm the one who suggested to Ioaxxere to post about becoming an interface admin. Benwing2 (talk) 18:51, 13 July 2024 (UTC)Reply

I support you for interface admin too. Kiril kovachev (talk・contribs) 19:53, 13 July 2024 (UTC)Reply

User:Ioaxxere/PagePreviews.js

Latest comment: 7 days ago; 2 comments; 2 people in discussion

As the name suggests, this is a script to display a preview of an entry when hovering over a link. To try it out, add

importScript("User:Ioaxxere/PagePreviews.js");

into your common.js page. Please try it out! My goal is to create a preview gadget that's good enough to be on by default to match Wikipedia's page previews. Ioaxxere (talk) 05:48, 14 July 2024 (UTC)Reply

Works great on mobile iOS – I tested it on both my iPhone and tablet. I really like how it skips straight to the definitions and remains in the single language that was linked. Just like its use on Wikipedia is to quickly preview an article without having to click it, this will be more convenient for readers who just want to find out the meaning of a word without having to leave the main entry one is viewing. Conversely, I do not really see any negatives here – The only point of discussion for me might be on what exactly is included in the preview, but I do prefer its current configuration for the added convenience. Easy support from me. LunaEatsTuna (talk) 21:16, 14 July 2024 (UTC)Reply

dated/archaic/obsolete in the glossary as well Wiktionary:Obsolete_and_archaic_terms

Latest comment: 19 minutes ago; 22 comments; 9 people in discussion

People seem to have a lot of confusion about these words. Some people seem to use "archaic" for obsolete words, or think that this is a sliding scale.

I propose we change the text in the proposed link to mention the same information as the glossary (i.e. archaic is for stylization) and also mention that this isn't really a scale of oldness, but rather markedness. Vininn126 (talk) 08:34, 14 July 2024 (UTC)Reply

Do you have some example of the confusion? DCDuring (talk) 14:50, 14 July 2024 (UTC)Reply

An exact edit, no. Although compare pośmiewać and the recent edit history. But it's been many a time the subject of discussion on Discord, maybe @PUC can chime in. Vininn126 (talk) 15:01, 14 July 2024 (UTC)Reply

Discord doesn't count as either authority or evidence. DCDuring (talk) 16:30, 14 July 2024 (UTC)Reply

@DCDuring No, but Vininn126's word that this has come up a lot on Discord means that this is an issue, and it's one I can attest to as well. People do get these confused - I've seen it myself. Theknightwho (talk) 17:10, 14 July 2024 (UTC)Reply

Why always so dismissive of the idea? Did you ignore the other part of my comment or did you see "Discord" and think "DISCORD BAD"? Vininn126 (talk) 19:18, 14 July 2024 (UTC)Reply

Why, yes, Discord IS bad, being a potential communication channel for cabals. DCDuring (talk) 21:05, 14 July 2024 (UTC)Reply

It's being communicated here with official examples, so I fail to see the issue. Vininn126 (talk) 08:18, 15 July 2024 (UTC)Reply

I am unable to assess the import of the labels in Polish. DCDuring (talk) 21:05, 14 July 2024 (UTC)Reply

‘Markedness’ felicitously puts into a single word the impressions I myself have formed through editing. My own thoughts, developing on the idea: ‘dated’ properly encompasses a rather small selection of terms, and ‘archaic’ even more so. A large part of terms common in the past and rare in the present actually elicit no recognition of antiquity from the reader, being neither part of the vocabulary of conventional archaisms, nor a word whose decline is apparent from the recent past. These uses (and I call them that because, from my experience, they mostly encompass senses of polysemous words, and not entire words) are, in my opinion, best described with a neutral {{lb|xx|now|uncommon}} or {{lb|xx|now|rare}}.

I would also like to make a comment on the nature of ‘obsolete’, which our glossary defines as ‘no longer likely to be understood’. I have seen this given as justification to not apply this label, in cases where an obsolete term bears enough similarity to an existing one to be understood even today. (‘Dated’ labels for alternative forms from the seventeenth century!) I find a good rule of thumb in such cases is that a term like this is obsolete when a reader no longer recognises it as antique, but simply wrong, novel or unusual. ―⁠Biolongvistul (talk) 16:01, 14 July 2024 (UTC)Reply

Any thoughts on how these labels should be applied (would be understood) for no-longer-widely-accepted taxonomic names? It isn't just aging taxonomists that might recognize such a term, but also users of older reference works, advocates of some less accepted taxonomic position, etc. DCDuring (talk) 16:36, 14 July 2024 (UTC)

‘Historic’ maybe? ―⁠Biolongvistul (talk) 17:45, 14 July 2024 (UTC)Reply

The history can be quite recent, as DNA analysis has led to many changes in name, not to mention placement (hypernyms) and circumscription (hyponyms). Current experts in the taxonomy of a family or order would know about many old names, but view them as not suitable for their use. True obsolescence takes a long time. I am not sure that dated captures much that is relevant, but those with more exposure to the taxonomic community discourse may know better. DCDuring (talk) 21:05, 14 July 2024 (UTC)Reply

I think it's "dated" as soon as the scientific name changes, maybe "obsolete" as people stop using it. CitationsFreak (talk) 02:10, 16 July 2024 (UTC)Reply

Can "non-standard" be applied?

Specifically for taxonomic names, another possibility is "not validly published" (example: Thermodesulfobacteriota), but I feel that this might cause more confusion among general readers (unless, perhaps, linked to a glossary?).

My main concern with applying "dated" automatically is that sometimes the scientific community at large may ignore the standards/guidelines set by official bodies. For example, "Gibbs free energy" remains a common term, even though neither IUPAC (nor IUPAP, from my recollection) recommends it. At WP: "an increasing number of books and journal articles do not include the attachment 'free'," but keep in mind that this is several decades after the recommendation was made (1988). FWIW, I support the IUPAC recommendation!

Thermodesulfobacteriota became officially correct in 2021, but you will still see older forms being used — largely through lack of awareness of the change in status, and (hence) not necessarily being perceived as wrong/dated by author/reader.

Secondly, to me "dated" has connotations that the usage is old-fashioned (like groovy), and I'm not sure (yet) that that is a good fit for describing scientific words that have suddenly been superseded.

Actually, superseded could be a reasonable label too.

—DIV (1.129.106.197 08:28, 22 July 2024 (UTC))Reply

Markedness is already something we have to contend with, vis-a-vis colloquial, formal. Vininn126 (talk) 19:19, 14 July 2024 (UTC)Reply

Yeah, I have confusion about them. I generally go for dated=no hits in 50 years, archaic=nothing in 100 years, obsolete=nothing in 200+ years, or if Webster 1913 marks it as archaic. Sometimes it might be on the cusp of different tags, in which case I randomly choose, by instinct. Any tag telling us "you shouldn't use this word today" is better than none, IMHO. Newfiles (talk) 21:45, 14 July 2024 (UTC)Reply
That's not always indicative of how the words are truly marked. Vininn126 (talk) 08:18, 15 July 2024 (UTC)Reply

"Dated" definitely doesn't mean "no hits in 50 years." I've never seen anyone interpret it that way before. Take a look at the page linked in the header for this topic. That should give you a better sense of what the consensus has been in the past. Andrew Sheedy (talk) 17:11, 15 July 2024 (UTC)Reply
I agree with you, @Andrew Sheedy. The rules of thumb described by @Newfiles strike me as overly strict. And I would query what corpus is being used. —DIV (1.129.106.197 08:33, 22 July 2024 (UTC))Reply
I think @Newfiles' rules of thumb might be a bit rigid, but I see where they're coming from. I don't know if there is any "scientific" way of determining which label should be used, but I do agree generally that a term is "dated" if it feels old-fashioned when used in the present day, "archaic" if it feels very old-fashioned, and "obsolete" if it has fallen out of use for a long time, possibly centuries. So in my view it is a sliding scale. I am not in favour of using now in labels, because of the uncertainty this creates. — Sgconlaw (talk) 11:41, 22 July 2024 (UTC)Reply
As I've said, this is about how the words are received when heard. Vininn126 (talk) 11:42, 22 July 2024 (UTC)Reply

Ban the POS "prepositional phrase"

Latest comment: 2 days ago; 47 comments; 12 people in discussion

I am going through and eliminating the POS "prepositional phrase" from Russian lemmas. I propose we ban new lemmas with "prepositional phrase" as the POS. IMO, all lemmas tagged as "prepositional phrase" are better identified as either an adjective, adverb, preposition or interjection. "Prepositional phrase" tells you nothing about the syntactic function of the phrase. In Russian, at least, there is no consistency whatsoever in whether a given multiword phrase headed by a preposition is identified as a "prepositional phrase" or as an adjective, adverb, preposition or interjection. I think it comes down to the laziness of the editor. Existing cases of the POS "prepositional phrase" have to be grandfathered in, but we can prohibit new ones with an edit filter. The only potential issue I see is that some prepositional phrases can function either as an adjective or an adverb, and we currently don't have a concise way of putting multiple POS's in a single entry. This means every once in a while, a prepositional phrase will have to be converted into two entries, an adjective and an adverb (or maybe identify as one of them and use |cat2= to categorize under the other, although that is less ideal). Benwing2 (talk) 03:01, 15 July 2024 (UTC)Reply

It looks like this header was specifically allowed by a vote in 2010. I don't really agree with calling a prepositional phrase an adjective or adverb; I guess some might be lexicalized to the extent of becoming essentially single words, but we also have entries for idioms that really have to be considered phrases rather than words, such as of a lifetime, on the surface, at first glance, up a tree.--Urszag (talk) 03:15, 15 July 2024 (UTC)Reply

@Urszag But you will find zillions of multiword adverbs here. There is no consistency whatsoever in whether multiword terms are characterized as "prepositional phrases" or "adverbs" etc. Why can't we call on the surface an adverb? It functions exactly like one syntactically. Benwing2 (talk) 03:21, 15 July 2024 (UTC)Reply

It's not that phrases can't be categorized as adverbs (on Wiktionary). What I meant was that I could get behind avoiding the term "prepositional phrase" if it was inaccurate—for example, if the term it was applied to wasn't really a phrase. But in cases where it is a perfectly accurate description, I don't see why it should be avoided. It seems consistent enough to call all phrases with the form of a prepositional phrase "prepositional phrases"; if some are currently called "adverbs", they could be changed to match the others, rather than the reverse. "Adverb" is a pretty diverse and heterogeneous category, so a lot of things can potentially fit in it. I can't think of a specific test to differentiate "on the surface" from a lexical adverb, but maybe someone else knows of one.--Urszag (talk) 03:47, 15 July 2024 (UTC)Reply

@Urszag We don't include "transitive verb" or "intransitive verb" as part of speech headers, even though those might be perfectly accurate. In many cases these are not phrases (since "phrase" doesn't apply to just anything with multiple words), and calling it a "prepositional phrase" obscures the actual function of the term, since it's too broad. Theknightwho (talk) 03:49, 15 July 2024 (UTC)Reply

To me, the analogy with "transitive verb" and "intransitive verb" cuts the other way. We call verbs "verbs", and don't try to include extra information about how they grammatically function in the POS header. Likewise, I think we should call prepositional phrases "prepositional phrases", and not bother with trying to make the header tell the reader details about how they function grammatically in a sentence: that's what the definition, examples, and if necessary usage notes are for, not what the Part of Speech header is for.--Urszag (talk) 15:31, 15 July 2024 (UTC)Reply

I completely agree. It feels like a crutch for people who don't like multiword terms. Theknightwho (talk) 03:48, 15 July 2024 (UTC)Reply

I, too, find that the usage of this POS is generally not helpful. I've eliminated most of its use from Polish entries. Vininn126 (talk) 08:23, 15 July 2024 (UTC)Reply

Keep. The phrase "on the surface" contains a preposition, an article, and a noun. None of those are adverbs. Furthermore, even if the POS is sometimes misused, that's not a reason to delete the ENTIRE POS. Purplebackpack89 11:58, 15 July 2024 (UTC)
Not containing an adverb does not mean that the phrase itself isn't an adverb. That's a ridiculous statement. Vininn126 (talk) 12:08, 15 July 2024 (UTC)Reply
Adverbs are, according to our own definition, words, not phrases. You're confusing functioning as an adverb with actually being one. Purplebackpack89 15:18, 15 July 2024 (UTC)
Or, taken from Wikipedia: An adverb is a word or an expression. Just because our current entry is missing information doesn't make it right. The claim that adverbs have to be a single word is ridiculous. Vininn126 (talk) 15:21, 15 July 2024 (UTC)Reply

Quoting opening comment: "The only potential issue I see is that some prepositional phrases can function either as an adjective or an adverb, and we currently don't have a concise way of putting multiple POS's in a single entry. This means every once in a while, a prepositional phrase will have to be converted into two entries, an adjective and an adverb (or maybe identify as one of them and use |cat2= to categorize under the other, although that is less ideal)."

In English, probably the majority of prepositional phrases can serve as both adjectives and adverbs. Do we have any facts to support the adverbial "Every once in a while"? I believe it is just wrong, at least for English. If there are languages other than English for which there is some good reason to remove the "prepositional phrase" PoS, so be it.

I really don't understand the motivation for this kind of indiscriminate, complicating, revolutionary change. Is it fun? Does it indulge a homogenizing, controlling impulse? DCDuring (talk) 12:29, 15 July 2024 (UTC)Reply

Because not all can be converted, and placing them under one umbrella implies that these phrases all have the same syntactic behavior, which isn't true. Why must you always throw accusations like this around? It's unbecoming, rude, and frankly I'm tired of it. This is a poor attitude to have. Vininn126 (talk) 12:37, 15 July 2024 (UTC)

My questions can be rephrased as "What is the "attitude" that warrants this kind of proposal?"

All categories are "umbrella classes", with a great deal of diversity of syntactic behavior. That each PP tends to modify sentences, phrases, adverbs, or adjectives with different frequencies might be a call for some kind of usage note, but I doubt that we will do the work to justify such a note, just as we haven't done the work to support even the broader claims on which this proposal is based. It does seem to be much easier to change things all at once in code than to engage in the one-definition-at-a-time effort to improve entry quality. DCDuring (talk) 13:27, 15 July 2024 (UTC)Reply

What is or isn't a prepositional phrase is rather clear-cut. Unless it changed since I was in middle school English, all prepositional phrases start with a preposition and have some more words afterwards (usually articles and nouns). Purplebackpack89 15:21, 15 July 2024 (UTC)Reply

That has nothing to do with what I said. DC's claim is that most prepositional phrases may need to be converted to both an adjective and an adverb, and my argument is that not all will be. By using prepositional phrase we are assuming that they all have the same syntactical behavior, when in reality, they do not. Vininn126 (talk) 15:23, 15 July 2024 (UTC)Reply

The fact that they do not is the reason why we have to preserve prepositional phrase as an acceptable POS... Purplebackpack89 17:17, 15 July 2024 (UTC)Reply

Huh? We need to know which ones act as both adverbs and adjectives, and which ones act as only adverbs, and which ones only as adjectives. If they always behaved as both, it would be predictable. Your logic makes no sense. Vininn126 (talk) 17:21, 15 July 2024 (UTC)Reply

His logic does not make sense, but do we feel confident that users would accurately distinguish the prepositional phrases by adverbs and adjectives? Just two years ago we had to find the common fallacy of categorizing predicatively used adjectives as adverbs, Wiktionary:Requests for deletion/Non-English § extrem, Talk:extrem. Sure we can do it, but I have not discerned the benefit of the eventual effort (only significant in English though, given the numbers in the category), instead of leaving the ambiguity, only that Benwing can apparently more parsimoniously conceptualize the parts of speech, since a rule or theory becomes less convincing with each exception. Fay Freak (talk) 20:26, 15 July 2024 (UTC)Reply

Trusting users to accurately distinguish things should not always be a priority. We should remain faithful to the truth, regardless of how complicated it is. Vininn126 (talk) 20:28, 15 July 2024 (UTC)Reply

I mean it is not wrong, and then can it be more wrong, or less correct or more vague if people prefer to employ the more general category “phrase”. “Prepositional phrase” could also be kept as a tracking category therefore. Fay Freak (talk) 20:34, 15 July 2024 (UTC)Reply

FWIW, Wiktionary consistently misuses the word "phrase". am I under arrest is not a linguistic phrase but that's what we call it. Benwing2 (talk) 20:36, 15 July 2024 (UTC)Reply

Is that relevant to this discussion? @Theknightwho suggested that some of the terms currently labeled "prepositional phrase" are not phrases, but most of the terms in Category:English prepositional phrases seem to qualify. I've relabeled cases like at the bottom of as prepositions. By the way, I noticed Category:English phrasal prepositions seems to be a manually curated category, but couldn't it be implemented automatically as the intersection of Category:English prepositions and Category:English multiword terms?--Urszag (talk) 21:48, 15 July 2024 (UTC)Reply
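
As a rough illustration of the "intersection" suggestion above, the sketch below computes the titles that appear in both of two category membership lists. It is only a JavaScript sketch: the function name `intersectCategories` is hypothetical, the sample lists are made up rather than real category dumps, and how the membership lists would actually be obtained (or whether this would be done in a module instead) is left open.

```javascript
// Hypothetical sketch: derive a category as the intersection of two others.
// Both arguments are plain arrays of page titles.
function intersectCategories(membersA, membersB) {
  const inB = new Set(membersB); // Set gives constant-time membership checks
  return membersA.filter((title) => inB.has(title));
}

// Made-up sample lists for illustration only.
const prepositions = ["at", "at the bottom of", "in front of", "under"];
const multiwordTerms = ["at the bottom of", "in front of", "on the surface"];

console.log(intersectCategories(prepositions, multiwordTerms));
// → [ 'at the bottom of', 'in front of' ]
```

The result keeps the order of the first list, so a sorted input yields a sorted output.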

If you still have the problem with prepositional phrases that function as adjectives (but aren't adjectives) AND function as adverbs (but aren't adverbs), by deleting the prepositional phrase category, you're just replacing one problem with another problem. People are way too focused on trying to label prepositional phrases as adjectives or adverbs, but I don't think that that's an important or necessary distinction, certainly not important enough to delete a part of speech to implement. Purplebackpack89 22:08, 15 July 2024 (UTC)Reply

@Purplebackpack89 But they are adjectives and/or adverbs, whereas "prepositional phrase" doesn't refer to an independent part of speech. All we're doing at the moment is making it more difficult to tell which ones can be used as adjectives, which can be used as adverbs, and which can be both. Theknightwho (talk) 22:37, 15 July 2024 (UTC)

A preposition is an independent part of speech. Why wouldn't a prepositional phrase be as well? I still say calling something that's preposition + article + noun an adjective or an adverb is inaccurate. Purplebackpack89 23:59, 15 July 2024 (UTC)Reply

What on earth are you even talking about? I feel like you are discussing something completely different. Basically the idea is to make prepositional phrases more like na bani. Vininn126 (talk) 08:09, 16 July 2024 (UTC)

From what I gather, Purple doesn't know linguistics well but thinks he does. Benwing2 (talk) 08:34, 16 July 2024 (UTC)Reply

@DCDuring Do you find it fun caricaturing every opinion you disagree with? You continually pepper discussions with these kinds of passive-aggressive comments, and they are getting very tiresome. Theknightwho (talk) 21:50, 15 July 2024 (UTC)Reply

How are they passive? DCDuring (talk) 22:10, 15 July 2024 (UTC)Reply

You "pepper discussions with [...] aggressive comments"? CitationsFreak (talk) 01:11, 16 July 2024 (UTC)Reply

No comment right now on the proposal to remove "Prepositional Phrase" but it's a bit funny to me how our current list (outside of "Prepositional Phrase" which had its own vote) was institutionalized by a vote consisting of only 7 editors in a 5-1-1 vote as seen at Wiktionary:Votes/pl-2015-12/Part_of_speech. It was part of User:Daniel Carrero's vote-a-rama in 2015 to rightfully update Wiktionary's policies as of that time, as explained in Wiktionary:Beer parlour/2015/December § WT:EL new votes. It's been 9 years since then. Maybe it's time we do a full review of WT:EL, though I suspect it won't be as simple as it used to be. AG202 (talk) 23:30, 15 July 2024 (UTC)

Delete. If they are just multi-word adverbs/adjectives/other things, just call them what they are. (I also support seeing which phrases are really just other parts of speech disguised as phrases. I know that tail between one's legs is only a phrase due to editing conflicts over if it should be a noun or adverb.) CitationsFreak (talk) 01:26, 16 July 2024 (UTC)

@DCDuring, let's remember that "prepositional phrase" is a common header now chiefly because Equinox merged thousands upon thousands of separate adverb/adjective sections under a unified "prepositional phrase" header back in the day. I've always disagreed with that, as it seemed to me we were erasing a useful distinction. Moreover it's not always possible to word a definition both adjectivally and adverbially. P U C – 17:54, 16 July 2024 (UTC)Reply

Yes, I remember. I also believe that we would be foolish not to recognize that many of the PP entries would probably benefit from {{&lit}} as they often have SoP as well as entry-worthy definitions, however this gets resolved. DCDuring (talk) 21:24, 16 July 2024 (UTC)Reply

Support eliminating this header. Part of speech headers exist to describe the syntactic function, not the form, of a term. We got rid of "acronym", "initialism", and "abbreviation" for this reason. Ultimateria (talk) 20:12, 16 July 2024 (UTC)Reply

Oppose. Many of the Italian entries I create are prepositional phrases. Splitting these into separate POS headers for adjectives and adverbs would be both inconvenient and misleading for almost all such entries. It’s also more intuitive for me to categorize these entries as prepositional phrases rather than as adjectives/adverbs. Imetsia (talk (more)) 22:55, 17 July 2024 (UTC)

Could you provide an example where this would be detrimental? Vininn126 (talk) 22:57, 17 July 2024 (UTC)Reply

Just looking through my recent contributions, entries like a lume di candela, a coppia, a palla, and a gonfie vele would be affected. In these cases, we'd have to split them up into adjective and adverb headers, which I find to be an unintuitive way to label these constructions. Imetsia (talk (more)) 23:12, 17 July 2024 (UTC)Reply

I would classify most of these as adverbs, to be honest, in a syntactical sense. Vininn126 (talk) 23:13, 17 July 2024 (UTC)Reply

I don't see how you could. I have the example "cena a lume di candela" ("candlelit dinner"), in which case "a lume di candela" is modifying a noun. I could come up with "gara a coppia" ("competition between pairs of contestants"), "musica a palla" ("loud music"), "vendita ticket a gonfie vele" ("smooth-sailing ticket sales"); in which case the constructions are always modifying nouns. Imetsia (talk (more)) 23:17, 17 July 2024 (UTC)

Taking a noun argument does not exclude it - often phrases built from a preposition + noun are adverbs anyway, asking the syntactic question "when", compare w nocy (at night). Vininn126 (talk) 23:19, 17 July 2024 (UTC)Reply

Aren't these just adjectives? CitationsFreak (talk) 03:51, 20 July 2024 (UTC)Reply

Honestly this sounds like "it's too much work for me to figure out whether it's an adjective or adverb so I'd rather just put something unhelpful and let the reader figure it out". Benwing2 (talk) 23:02, 17 July 2024 (UTC)Reply

No, for almost all cases, the prepositional phrase can act as both an adjective and an adverb. It's not a matter of "figuring out whether it's an adjective or adverb". And I don't think readers have been confused by the prepositional phrase header. Imetsia (talk (more)) 23:13, 17 July 2024 (UTC)Reply

I don't think it's always about what leaves the reader confused, but rather about what's factually more precise, since some PP can be adverbs, some adjectives, and some both. Vininn126 (talk) 23:15, 17 July 2024 (UTC)Reply

Minor changes to CFI attestation section

Latest comment: 4 days ago; 6 comments; 5 people in discussion

At WT:RFVE#myxa, P Aculeius wrote a comment that drew my attention to some divergences between our current practice and the text of CFI. I want to propose two small changes to align the policy with practice.

1. What is a "citation"?

Currently CFI does not actually say what a "citation" is. Experienced Wiktionary editors know that (at least for WDLs) a citation is a quotation - a snippet of (usually) running text that includes the word and is referenced to a particular durably archived work. But CFI somehow manages to avoid saying that. The word citation means different things to different people and non-lexicographers may be unfamiliar with the applicable sense. Case in point: to us, "citation" and "quotation" are interchangeable, but to a Wikipedian, "citation" is synonymous with "reference" (Wikipedia:Citing sources).

This can be cleared up by adding a sentence at the beginning of WT:CFI#Number of citations:

{{l|en|citation|Citations|id=lexicography}}, in the form of [[WT:Quotations|quotations]], provide evidence that a term exists and provide examples of how it is used as part of a language.
For languages well documented on the Internet, three citations in which a term is used is the minimum number for inclusion in Wiktionary. […]

2. Dictionary entries are not "uses"

It is well understood by seasoned Wiktionarians that a mere listing of a word as a headword in a dictionary or glossary is a mention, not a use, and only relevant for the attestation of LDLs. But perhaps this is not so obvious to less experienced contributors. It wouldn't hurt for CFI to say this outright.

To help clear up any confusion, I propose to alter the existing example in WT:CFI#Conveying meaning from

For example, an appearance in someone’s online dictionary is suggestive, but it does not show the word actually used to convey meaning.

to:

For example, the fact that a dictionary contains an entry for the word is suggestive, but it does not show the word actually used to convey meaning.

(I note for completeness that users have differing views on whether the appearance of a word as a gloss in a dictionary is a use or a mention. That's why I chose the wording carefully, specifically using the word "entry" instead of the general term "appearance" to avoid any possible confusion.)

These changes are small and, I suspect, uncontroversial, so I'm asking here to see if there is support or opposition instead of opening a formal vote. If it is felt that a vote is required I will start one. This, that and the other (talk) 10:35, 17 July 2024 (UTC)Reply

These are indeed minor changes, in that they only codify "common understandings" rather than addressing significant issues raised by these questions. The fact that the rest of the universe defines a "citation" as any reference to authority for a given point, while on Wiktionary the word is "commonly understood" to refer only to particular usage examples, and to exclude all authority is a significant and frankly nonsensical situation. To provide context, a common situation is that an editor will refer words that have no citations or attested usage in their entries to RFV because they're not mentioned in a particular dictionary, e.g. OED, or because the OED entry doesn't include a particular sense. But if Webster's Third New International Dictionary gives the word and supports the definition for the entry—or for that matter any number of other authorities—then the concern raised hasn't been addressed because dictionary entries don't count as citations, even though the reason why the word or sense was brought to RFV was because of what wasn't included in another dictionary!

Worse, the entry is then vulnerable to deletion as an unattested word—despite literal attestation from strong authority—because we want a minimum of three usage examples. Which is a fine goal to show that a word is actually in use—but this standard is often difficult to satisfy in the case of archaic or technical terms, which may be contained, mentioned, or defined in every textbook and manual of a subject, but may be used without any definition whatever only in old and largely inaccessible works that can't be easily located on the internet.

The case that brought this to discussion was a technical term for "the fused distal end of the lower mandible" of a bird, which evidently was occasionally found in ornithological literature of the late nineteenth and early twentieth century, but which was described as "rare" in its dictionary entries. Although three examples of use besides in dictionaries and glossaries were eventually found by another editor, I made several attempts to locate it in use via Google and Google Books, searching for the term in combination with keywords such as "ornithology", "avian", "anatomy", "birds", and for ornithological texts that might use the word to describe bird mandibles—and I came up with exactly one, which did little more than define it and explain that it was a rare term used by at least one important authority, cited by last name and year only, but for which it recommended the use of a different word.

Of course, Wiktionary hosts countless entries and definitions that lack three attested uses, but which go unchallenged because they appear to be correct anyway, and there seems to be no rush to go out and find them. But once someone discovers that a particular dictionary doesn't include the word or sense, it's not enough to show that others do—you have to go beyond that and find three uses that don't define or discuss the word, or else the entry can be deleted. To be clear, this is not an argument for allowing random rubbish on the internet, or "dictionary-only words", i.e. terms that have never been used for their intended purpose, but were simply invented as examples of words that someone thought should exist, like "hippopotomonstrosesquipedalian", or the fanciful collective plurals of animals that were never used before someone decided that a particular term would be amusing, and which are only ever used as examples of words.

Instead, it is an argument that when a word or sense can be cited to a strong authority—for instance a pre-internet dictionary not stuffed full of nonce words, or technical works that describe what something means and how or even whether it is or was in widespread use, and especially when multiple authorities can be cited, then it makes little sense to hold that it should then be marked as "unverified", or subject to deletion, due to a lack of sufficient attestation, even though it may have significantly more evidence of, and authority for both meaning and usage, than a significant proportion of other, unchallenged entries.

On Wikipedia, we have a number of tags that can be applied to articles, sections, or claims, to indicate that additional sources are needed or wanted, and that potentially-controversial material may be subject to deletion if it can't be verified. But on Wiktionary, even things that can easily be verified—and have been—are subject to deletion under the present CFI. It seems to me that what's really needed is an adjustment to policy that recognizes that words or senses that can be found in and cited to authority have at least a minimal degree of attestation, and that while additional sources or examples may be wanted, the entry or sense does not need to be deleted solely due to the fact that no editor has succeeded in finding—or perhaps even attempted to go out and find those sources or examples. If you can challenge an entry based on its inclusion or lack thereof in a dictionary, then the fact that it appears in one ought to be at least as relevant, otherwise there is a significant imbalance between inclusion and exclusion—one that seems to do a serious disservice to our readers. P Aculeius (talk) 13:08, 17 July 2024 (UTC)Reply

I responded to most of these points at WT:RFVE#myxa. I would only note here that what I'm proposing is nothing new; it is merely seeking to update the policy to reflect longstanding practice on this wiki. I'll let others respond but I don't think you'll see much support for your suggestions here. This, that and the other (talk) 14:44, 17 July 2024 (UTC)

Above, it is mentioned that the three cites rule is "difficult to satisfy in the case of archaic or technical terms, which may be contained, mentioned, or defined in every textbook and manual of a subject, but may be used without any definition whatever only in old and largely inaccessible works that can't be easily located on the internet." This is true for many modern minor geographical terms as well. There are geographical terms with legitimately cited English Wikipedia entries (cited from non-English materials) that can't meet Wiktionary's three cites. The three cites rule itself is a preposterous absurdity and the emperor has no clothes. But despite this truth, it is indeed because of the three cites rule that I can make entries for many legitimate but rare words like Banmendian that do not appear in other dictionaries, and many Tongyong Pinyin words or similar minor location names. Geographyinitiative (talk) 16:43, 17 July 2024 (UTC)

@P Aculeius What you've said is all very well, but it would be a major policy change. What might make more sense is to change the word "citation" to "quotation" on the policy page, to avoid any confusion over what the word "citation" means. Theknightwho (talk) 21:55, 17 July 2024 (UTC)

Are you building a false dichotomy? I know how to balance inclusion and exclusion.

Given that our manpower is limited, we “send to RFV” terms for which, on a case-by-case basis, there are indications of them being hoaxical, protological, or corrupted. That is for the positive assumption, which editors arrive at differently, that they have not been used at any point in time or corner of the earth, as opposed to you being unable to find them while specifically searching.

For Geographyinitiative’s topographic terms you thereby have a litmus test that if you believe it had to be in a secret military map it is better left included. For living languages I have been content for a spelling to exist only conceptually: Wiktionary:Requests for deletion/Non-English#5-DM-Banknote, Talk:5-DM-Banknote, after all it is the most used banknote in Europe. Inclusion may provide a picture more consistent with existing data. This is perhaps a wider implication of the “assume good faith” principle. But it is only me considering the general spirit, principles and purposes of the primary documents so much, while others outpope the Pope in their legalism; I would leave rixig as a term secured for some period and place in spite of only one quote and anything in general if our resources are convincingly specifically argued to suffer undercoverage: we need a flexibility clause for cases when one term points to more occurrences which we just don’t find, if a word has arisen organically in a community which presumably shared it rather than being an occasional literary creation. We know how to build a dictionary, if not for the CFI, indeed, wherewith we have been increasingly creative in order not to make the text work against its own goals. … Fay Freak (talk) 21:12, 17 July 2024 (UTC)Reply

consensus on inclusion/exclusion of "someone" in multiword English verb lemmas


Hi. I'd like to get consensus on how to handle the placement of "someone", "something", "one" and "it" in multiword verb phrases. I already made a BP post about this last month, here: Wiktionary:Beer parlour/2024/June#standardizing the form of phrase lemmas. Most of the rules I proposed were well-received, but there was disagreement over whether and when to include the word "someone" etc. in verb phrases (except when it occurs as someone's, where it's generally mandatory). The statistics indicate that mostly it is excluded in the lemma:

14,446 multiword English verb lemmas
430 of them contain someone's
297 of them contain someone
4 of them contain somebody or somebody's (IMO we should always use someone in preference to somebody)
53 of them contain something
0 of them contain something's

The upshot is that the vast majority don't contain someone or something.

Some people said that we should put "someone/something" in the lemma when it belongs in the middle, e.g. "see something through", because it's supposedly mandatory here. I note that this isn't actually the case; even expressions like this can have the object placed at the end if it's heavy, e.g. "I saw through all the projects that were assigned to me", whereas "I saw all the projects that were assigned to me through" is maybe possible but awkward to say the least. Furthermore, it's not generally the practice to include "someone" or "something" in the lemma, as exemplified by see through, which contains both the "see through it" and "see it through" senses.

What I do propose instead is to indicate the position of the object, especially when it goes before the preposition, in the headword but not the lemma. This would mean that under see through we'd have two separate entries with separate headwords, one for the meanings that are construed using "see through it" and another for the meanings that are construed using "see it through". This might be indicated something like this:

===Verb===
{{en-verb|see<,,saw,seen> through (sth/so)}}

# {{lb|en|transitive}} To perceive visually through something [[transparent]].
#: {{ux|en|Their fabric is so thin that I can '''see through''' these curtains.}}
#: {{ux|en|We '''saw through''' the water with ease; it was as clear as glass.}}
# {{lb|en|transitive|idiomatic}} To not be [[deceive|deceived]] by something that is [[false]] or [[misleading]]; to understand the hidden truth about someone or something.
#: {{ux|en|I'm surprised she doesn't '''see through''' his lies.}}
#: {{ux|en|I can '''see through''' his [[poker face]]. He isn't fooling anyone.}}
#* {{quote-book|en|title=Rationality and the Pursuit of Happiness: The Legacy of Albert Ellis|author=Michael E. Bernard|year=2010|passage=Now, when you awfulize you go beyond that and tell yourself, instead “It's horrible, awful and terrible!” You then mean several things, all of which are clearly unprovable and which any self-respecting Martian with an IQ of 100 could easily '''see through'''.}}
# {{lb|en|transitive|idiomatic}} To [[recognize]] someone's true [[motive]]s or [[character]].
#: {{ux|en|In that moment, I finally '''saw through''' her; this petition drive had nothing to do with her love for animals, and everything to do with impressing Michael, the cute intern.}}

===Verb===
{{en-verb|see<,,saw,seen> (sth/so) through}}

# {{lb|en|transitive|idiomatic}} To provide support or cooperation to (a person) throughout a period of time; to support someone through a difficult time.
#: {{ux|en|And may we all, citizens the world over, '''see''' these events '''through'''.}}
#* {{quote-song|en|title=w:Never, Never Gonna Give Ya Up|author=w:Barry White|year=1973|passage=Forever and ever, yeah / I'll '''see you through''' it}}
#* {{quote-song|en|year=1976|title=Coney Island Baby|author=w:Lou Reed|passage=The glory of love might '''see you through'''}}
# {{lb|en|transitive|idiomatic}} To do something until it is finished; to continue [[work on|working on]] (something) until it is finished.
#: {{syn|en|see out}}
#: {{cot|en|carry out}}
#: {{ux|en|Despite her health problems, Madame Prime Minister '''saw''' the project '''through'''.}}
#* {{quote-journal|en|date=2022 January 12|author=Sir Michael Holden|title=Reform of the workforce or death by a thousand cuts?|journal=RAIL|issue=948|page=25|text=But if the Government really wants our railway to reduce the level of its subsidy and improve value for taxpayers' money, then it must provide the political air cover to enable managers to get on and make the hard decisions that are needed... and then '''see''' them '''through'''.}}
# {{lb|en|transitive|idiomatic}} To constitute ample supply for one for.
#: {{ux|en|Those chocolates should '''see''' us '''through''' the holiday season.}}

Here, the format of the argument to {{en-verb}} is provisional. The proposed syntax displays as see through something/someone (for the first entry) and see something/someone through (for the second entry). Maybe there's an even more efficient but still understandable syntax. For example, the Italian verb module has a large list of built-in verbs and I may do the same here; then you can just say {{en-verb|see<@> through (sth/so)}} where @ means "consult the built-in verb list for see", so you don't have to duplicate the principal parts of see in each expression involving it if you don't want to.
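As a rough illustration of how the provisional argument could map to the displayed headword, here is a minimal Python sketch. It is not the actual {{en-verb}} module code; the function name `display_form` and the abbreviation table are assumptions made up for this example, based only on the "sth/so" shorthand and the display forms described above.

```python
import re

# Hypothetical abbreviation table: "sth" and "so" are taken from the
# proposal above; the mapping itself is an assumption for illustration.
PLACEHOLDERS = {"sth": "something", "so": "someone"}


def display_form(arg):
    """Turn e.g. 'see<,,saw,seen> (sth/so) through' into display text."""
    # Drop the angle-bracket inflection spec attached to the verb
    # (this also covers the shorter '<@>' form).
    text = re.sub(r"<[^>]*>", "", arg)

    def expand(match):
        # Expand each slash-separated abbreviation inside the parentheses;
        # unknown abbreviations are passed through unchanged.
        parts = match.group(1).split("/")
        return "/".join(PLACEHOLDERS.get(p, p) for p in parts)

    # Replace every parenthesized placeholder group with its long form.
    return re.sub(r"\(([^)]*)\)", expand, text)


print(display_form("see<,,saw,seen> through (sth/so)"))
print(display_form("see<,,saw,seen> (sth/so) through"))
```

The sketch only handles the display string; it does not generate inflections or links, which a real headword module would also need to do.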

Note also that I'm about to change things so that multiword verbs link each word separately by default in the listed inflections instead of linking the whole expression as a green link; it seems overkill to have non-lemma entries for the inflections of all these expressions. Benwing2 (talk) 06:06, 18 July 2024 (UTC)

Seems good. I note that one/one's (used to placehold for a reflexive pronoun) is omitted from the list of placeholders. But having placeholders (something, someone etc.) with the same orthography as the core terms of the expression makes the placeholders seem to be a required part of the expression. Some dictionaries use parentheses for such cases. Perhaps we could just not embolden the placeholders. (We could then use parentheses to orthographically distinguish optional from required terms and placeholders.)

I also wonder how we could detect whether we have usage examples that actually match the definitions when someone (a person) is common, as well as something (inanimate). I mention that because the first definition in see sth/so through explicitly includes "(a person)" whereas the usex has an inanimate object. If we are going to have "something" and "someone" used and distinguished in our definitions, as we should, then we should make sure our cites and usexes fit. If it can't be done automagically, then we need some human engineering to induce contributors to clean these things up. We need to make such cleanup a bright shiny object. The only way I've experienced is cleanup lists. Creating such lists would be an important task for magic. DCDuring (talk) 14:42, 18 July 2024 (UTC)

@DCDuring I agree with everything you say. I don't think it's possible to automatically determine whether a cite matches a usex and the header; this would have to be done manually, through cleanup lists as you suggest. Benwing2 (talk) 21:29, 18 July 2024 (UTC)

Do you have any ideas about how to generate useful cleanup lists? They don't have to be perfect in inclusion or selectivity. Actually, maybe all we need is to look at all the cases that have each of the placeholders in the headword and either of the standard indicators of usage examples: {{usex}} or *:. The numbers you've provided above are not vast. All one needs is sufficient Sitzfleisch. DCDuring (talk) 21:49, 18 July 2024 (UTC)

How would you handle cases like break up, where the noun can go on either side of the preposition in some cases? "I broke up the fight" and "I broke the fight up" are AFAICT synonymous, but I'm not sure that holds for all senses (does it work for sense 3, which I think of as break someone up)? Smurrayinchester (talk) 15:27, 18 July 2024 (UTC)
@Smurrayinchester Hmm, this is interesting. It seems there may be at least four separate cases to consider:
1. It is mandatory to place the noun or pronoun after the preposition, as in see through (it)/see through (the ruse).
2. It is mandatory to place a pronoun before the preposition, but nouns normally go after, as in break up (the monotony).
3. It is mandatory to place a pronoun before the preposition, but nouns can go before or after, as in break up (the class into groups) or break (the class) up (into groups).
4. It is mandatory to place a pronoun before the preposition, but nouns normally go before, as in break (the class) up (into fits of laughter) or see (the project) through.
If you look at the definition of break up in [16] for the Farlex Dictionary of Idioms, they seem to distinguish these cases in that they identify case #4 using In this usage, a noun or pronoun is commonly used between "break" and "up." and case #3 using In this usage, a noun or pronoun can be used between "break" and "up.", while case #2 doesn't have any trailing verbiage. (Note, though, that by this criterion they put "break up into pieces" in case #4 instead of #3.) Under see through in [17] for this same dictionary, they have three separate headers, "see (one) through", "see (something) through" and "see through (someone or something)".
A few things to add:
1. "someone" and "something" are pronouns, and accordingly they get placed before the preposition whenever possible, even in case #2 above; "break up something" sounds a bit strange to me unless you put a pause between "up" and "something".
2. Even in case #2 it's possible to put short nouns before the preposition, as in "break the monotony up", it's just not so natural.
3. In case #4, sufficiently heavy/long nouns have to be placed after the preposition, as in the example I gave above: "I saw through all the projects that were assigned to me".
Anyway, I'm not quite sure how to distinguish cases #2 - #4 above. It seems we have two choices: fit this into the header somehow or use labels or similar. My intuitive sense is that labels might be better, because otherwise there might be a lot of duplication of headers and because the labels can appropriately link to an appendix that explains the usage in more detail.

BTW I'm sure there have been oodles of papers written on this topic but I don't know of any good ones. Can anyone find a definitive explanation of the above phenomena? Benwing2 (talk) 21:28, 18 July 2024 (UTC)

Wikimedia Movement Charter ratification voting results


You can find this message translated into additional languages on Meta-wiki. Please help translate to your language

Hello everyone,

After carefully tallying both individual and affiliate votes, the Charter Electoral Commission is pleased to announce the final results of the Wikimedia Movement Charter voting.

As communicated by the Charter Electoral Commission, we reached the quorum for both Affiliate and individual votes by the time the vote closed on July 9, 23:59 UTC. We thank all 2,451 individuals and 129 Affiliate representatives who voted in the ratification process. Your votes and comments are invaluable for the future steps in Movement Strategy.

The final results of the Wikimedia Movement Charter ratification voting held between 25 June and 9 July 2024 are as follows:

Individual vote:

Out of 2,451 individuals who voted as of July 9 23:59 (UTC), 2,446 have been accepted as valid votes. Among these, 1,710 voted “yes”; 623 voted “no”; and 113 selected “–” (neutral). Because the neutral votes don’t count towards the total number of votes cast, 73.30% voted to approve the Charter (1710/2333), while 26.70% voted to reject the Charter (623/2333).

Affiliates vote:

Out of 129 Affiliates designated voters who voted as of July 9 23:59 (UTC), 129 votes are confirmed as valid votes. Among these, 93 voted “yes”; 18 voted “no”; and 18 selected “–” (neutral). Because the neutral votes don’t count towards the total number of votes cast, 83.78% voted to approve the Charter (93/111), while 16.22% voted to reject the Charter (18/111).

Board of Trustees of the Wikimedia Foundation:

The Wikimedia Foundation Board of Trustees voted not to ratify the proposed Charter during their special Board meeting on July 8, 2024. The Chair of the Wikimedia Foundation Board of Trustees, Nataliia Tymkiv, shared the result of the vote, the resolution, meeting minutes and proposed next steps.

With this, the Wikimedia Movement Charter in its current revision is not ratified.

We thank you for your participation in this important moment in our movement’s governance.

The Charter Electoral Commission,

Abhinav619, Borschts, Iwuala Lucy, Tochiprecious, Der-Wir-Ing

MediaWiki message delivery (talk) 17:53, 18 July 2024 (UTC)
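For anyone who wants to check the percentages quoted in the announcement, the arithmetic can be sketched in a few lines of Python. The function name `approval_split` is made up for this example; the only inputs are the "yes" and "no" counts given above, with neutral votes left out of the denominator as the announcement describes.

```python
def approval_split(yes, no):
    """Return (approve %, reject %) with neutral votes excluded."""
    counted = yes + no  # neutral votes are not part of the denominator
    approve = round(100 * yes / counted, 2)
    reject = round(100 * no / counted, 2)
    return approve, reject


print(approval_split(1710, 623))  # individual vote
print(approval_split(93, 18))     # affiliate vote
```

Rounded to two decimal places, these agree with the figures in the announcement (73.30% / 26.70% and 83.78% / 16.22%).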

It's my understanding that the vote fell a bit short of the 2% quorum requirement as well. DCDuring (talk) 19:34, 19 July 2024 (UTC)

Deprecating MediaWiki:Gadget-Navigation popups


This gadget does not work correctly and has a replacement in the form of Page Previews (see #User:Ioaxxere/PagePreviews.js). If no one objects I will remove it from the gadgets list. Ioaxxere (talk) 04:03, 19 July 2024 (UTC)

As someone who uses Navigation popups, I notice the following differences: your popups do a better job of creating a preview of definitions for entries, but don't create previews for other pages, and lack the suite of other links that the 'Navigation popups' have in the "actions" dropdown, which I use to do things like get a preview of—or click through and go directly to—the edit history of the page upon hovering over the link (instead of having to click the link to go to the page, and then click the "history" tab and go that page, and then go back to my watchlist or the recent changes feed and do that for the next entry that catches my eye). I also like to use Navigation popups to preview diffs. So, I object to removing them (for now). The ideal solution IMO might be to incorporate the full functionality of the Navigation popups into your popups, but other (possibly easier?) solutions might be to make one or the other shift where it pops up so that they could just both be used. - -sche (discuss) 20:59, 19 July 2024 (UTC)Reply

@-sche: Unfortunately there's no way I can incorporate the full functionality of Navigation Popups into my gadget while still maintaining its minimalist aesthetic. Also, it seems like Navigation Popups has virtually no restrictions on what can be displayed, whereas I would like to ensure that the previewed content actually makes sense — so I won't try to match it on that front, either. But if there's something specific that I could add which would get you to switch, please let me know. Ioaxxere (talk) 05:35, 20 July 2024 (UTC)Reply

Admin abuses


To admins: I opened a discussion on a problematic administrator at Wiktionary talk:Administrators#Admin abuses rv rights. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:05, 19 July 2024 (UTC)

synonyms, antonyms ▲


Do you guys use these buttons to hide and show inline synonyms? I'm thinking of removing these (at least on mobile) as they create visual clutter for seemingly little benefit. I don't think we should even have so many nyms on a single definition that someone would want to collapse them. Ioaxxere (talk) 19:37, 19 July 2024 (UTC)

I strongly prefer having them hidden (which is what my preferences are set to) and I'd prefer having them hidden by default. I think there's more clutter added to entries by the -nyms themselves. I would be very opposed to removing the collapsibility feature. Andrew Sheedy (talk) 20:20, 19 July 2024 (UTC)

Collapsed by default would be fine. (Thus, set to collapse syn and ant and cot by default, just as hyper and hypo and mer and hol are already set to collapse by default.) The option to unhide should not be taken away, though. Its button could be as unobtrusive as anyone likes, so long as it exists. It is true that list length can sometimes be trimmed via the final list item being a Thesaurus entry, and that is always nice to do when we get a chance. But it is not always practical, and there should not be any banning of semantic relations links. Quercus solaris (talk) 22:08, 19 July 2024 (UTC)

I prefer them visible/expanded by default, because having them collapsed makes them hard to find even if you're an adept user actively looking for them, so it surely makes them unnoticeable for many people who would be interested in them but don't think to look for them: there was a discussion semi-recently (which I can't relocate at the moment) about no longer collapsing coordinate terms, altform-inline, etc, because when they used to be collapsed, even I had gone to entries which I thought I added e.g. a coordinate term to, hadn't seen it (because it was collapsed and the 'expand' button was small and easily overlooked), hadn't found it when Ctrl-F-ing, and only noticed when I went to edit the wikitext of the page to add it that it was already there in the wikitext, just hidden from view. But since some people prefer having them collapsed, to me it seems most reasonable to let people individually opt in to collapsing content they don't want (vs relying on people to notice content is being hidden and specifically opt out of it being hidden).
I notice that on at least some mobile devices the buttons seem to be the same size and font/colour as the definitions; perhaps we could change that to make the buttons look like the distinct thing that they are, so the entry was less of an undifferentiated lump of clutter? In general our mobile interface seems suboptimal, e.g. it's nonobvious how to get to the page history, and I'm not sure if my phone screen is just too small for the little 'circles' to display that allow selecting individual revisions to compare, but they don't display for me on my phone, though I see them if I access the mobile version of the site from my computer. (In turn, our desktop interface is uncompact, with wasted whitespace; one idea there would be to make the "synonyms:", "antonyms:" etc text stand out in some way and then optionally put them all on one line, "synonyms: foo, antonyms: bar, coordinate terms: baz".) - -sche (discuss) 22:16, 19 July 2024 (UTC)Reply

Agreed. That last idea would be A-OK in my opinion. A follow-up to my comment above. Wiktionary serves various user personas, and that's A-OK. Everyone from (1) people who barely even speak a language and don't want anything except (what Collins would call) a gem definition, to (2) people who can handle all of the semantic relations and would be well-served by having the option to see them, even if they are hidden by default (in service of the aforementioned users who either don't want or can't handle them). The knowledgeable users can simply unhide them if they choose to do so. The button to unhide could be unobtrusive but also should not be such a desperately hidden Easter egg that serendipitous discovery becomes unlikely. A nice balance can be struck. Quercus solaris (talk) 22:23, 19 July 2024 (UTC)

I don't understand why we have individual [synonyms/antonyms] buttons on every sense for users who have them expanded by default. It's just clutter. There is no need to offer the capability to hide the nyms of one individual sense and leave all others expanded. I think the [synonyms/antonyms] buttons should only be visible to users who have chosen, via the sidebar toggle, to collapse these nyms by default. This, that and the other (talk) 04:12, 22 July 2024 (UTC)

Applying `{{ux}}`


Could someone create a bot to automatically replace raw markup with {{ux}} in entries? This would have the benefit of allowing users to easily change the appearance of usage examples if desired as well as make it easier to extract structured data from Wiktionary. Maybe @JeffDoozan would be interested. Ioaxxere (talk) 23:13, 19 July 2024 (UTC)

You need to be careful of this and {{co}}. Vininn126 (talk) 23:17, 19 July 2024 (UTC)

Yes, and this request would have been easily implemented before the creation of {{co}}. I always manually apply the templates whenever I spot uncategorized usage examples, collocations, quotations, etc. Inqilābī 15:41, 20 July 2024 (UTC)

Sounds good. Vininn126 (talk) 15:55, 20 July 2024 (UTC)

My bot already does this for unambiguous bare UX entries that use the formatting recommended by WT:UX. JeffDoozan (talk) 15:54, 20 July 2024 (UTC)

@JeffDoozan: Thank you, I did recall it was you. Do you know why your bot seems to have missed attire#Noun? Ioaxxere (talk) 16:22, 20 July 2024 (UTC)

@Ioaxxere: attire#Noun is ignored because it does not follow the WT:UX guideline "Example sentences should ... not contain wikilinks (the words should be easy enough to understand without additional lookup)". JeffDoozan (talk) 16:26, 20 July 2024 (UTC)

@JeffDoozan That text seems to have been present from the earliest revision of the page and doesn't seem to reflect current practice. I'll start a discussion about removing it. Are there any other usage example formats that your bot is ignoring? Ioaxxere (talk) 16:39, 20 July 2024 (UTC)

@Ioaxxere: The bot is very careful to convert only text that is unambiguously a UX: sentences that are completely enclosed in italics, contain exactly one bolded item, do not contain wikilinks, do not contain templates, start with a capital letter A-Z, end with punctuation mark "." "!" or "?", do not contain "sibling" text at the same indentation level (except expected templates like ux, syn, coi, etc), do not contain any text or templates at a deeper indentation level except for non-English sections that may contain a single translation one level deeper that is italicized and contains bolded text. JeffDoozan (talk) 17:10, 20 July 2024 (UTC)
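To make the single-line criteria above concrete, here is a rough Python sketch of such a check. This is not JeffDoozan's bot code; the function name `is_unambiguous_ux` and the regular expressions are assumptions for illustration. It deliberately ignores the criteria about sibling text and deeper indentation levels, handles only lines starting with "#:", and conservatively rejects sentences that begin or end with the bolded term.

```python
import re

# Non-greedy match for a '''bold''' span in wikitext.
BOLD = re.compile(r"'''(.+?)'''")


def is_unambiguous_ux(line):
    """Check one '#:' wikitext line against the single-line criteria."""
    if not line.startswith("#:"):
        return False
    body = line[2:].strip()
    # Must be wrapped in italics: '' ... ''
    if len(body) < 5 or not (body.startswith("''") and body.endswith("''")):
        return False
    # Bold at the very start or end merges with the italic quotes;
    # this sketch treats that as ambiguous and skips it.
    if body.startswith("'''") or body.endswith("'''"):
        return False
    inner = body[2:-2]
    # No wikilinks and no templates.
    if "[[" in inner or "{{" in inner:
        return False
    # Exactly one bolded item.
    if len(BOLD.findall(inner)) != 1:
        return False
    plain = BOLD.sub(r"\1", inner)
    # Leftover quote markup means the italics do not enclose everything.
    if "''" in plain:
        return False
    # Starts with A-Z and ends with '.', '!' or '?'.
    return bool(re.fullmatch(r"[A-Z].*[.!?]", plain))


print(is_unambiguous_ux("#: ''I can '''see through''' these curtains.''"))
print(is_unambiguous_ux("#: ''Her '''attire''' was [[formal]].''"))
```

A line that passes would be a candidate for wrapping in {{ux}}; anything that fails is simply left alone, which matches the cautious approach described above.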

CAT:Dialects


I propose that this category be deleted. All dialects are languages in their own right, and all languages are ultimately dialects as well; hence a distinction between languages and dialects in our categories is unnecessary and is a source of chaos due to the lack of well-defined criteria (compare Old Icelandic, Old Novgorodian, the Prakrits, Arabic lects, Chinese lects, etc.). Further, many language varieties, as opposed to real dialects, are wrongly categorized as dialects (Italian English, Korean English, Sri Lankan English, etc.). On the other hand, if we are ready to systematically draw a line between them (say, by cleaning up the categorization, or by categorizing as both a language and a dialect), we would be justified in creating more linguistic categories, namely for Regiolekte, creoles, etc. Also, I noticed lots of lect names are sums of parts, which should actually be deleted. What do you think? Inqilābī 15:30, 20 July 2024 (UTC)

Sister project boxes


Today I noticed that water has sister project boxes for Wikipedia, Commons, Wikiquote, and Wikiversity. Do we really need all those? Ioaxxere (talk) 16:28, 20 July 2024 (UTC)Reply

This is why variants of these templates exist that can be added under the Further reading section (e.g. {{pedialite}}). A hot take is that we should do that everywhere and retire floating sister project boxes from entries entirely. SURJECTION / T / C / L / 22:37, 20 July 2024 (UTC)

@Surjection I would support this for the floating boxes other than {{wikipedia}}. I think that one can still be useful. Ioaxxere (talk) 02:40, 21 July 2024 (UTC)

That is a hot take. I personally don't have a strong opinion, but I do feel we should synchronize it perhaps through a wikidata entry, since often many things are kept there. Vininn126 (talk) 12:48, 21 July 2024 (UTC)

Wiktionary:Example sentences


I propose removing the point: Example sentences should […] not contain wikilinks. This text was added in a 2007 vote, but it no longer seems to be followed, particularly in Chinese entries. There are many good reasons to wikify usage examples, even in English: see deafaz for an example. Ioaxxere (talk) 16:39, 20 July 2024 (UTC)

Support. As with linking in definitions, the criterion should simply be that useful links are welcome and overlinking (a pointlessly excessive degree) will be pruned back. As with pruning bushes, no need for overpruning. Quercus solaris (talk) 17:56, 20 July 2024 (UTC)

@Ioaxxere: I don't see how the deafaz entry illustrates a problem. Could you explain it? DCDuring (talk) 20:20, 20 July 2024 (UTC)

@DCDuring: According to the policy, all the wikilinks in the example sentence should be removed, but this would obviously be less than ideal as all of the wikilinked terms are slang terms which are unknown to most English speakers. Ioaxxere (talk) 20:33, 20 July 2024 (UTC)

Maybe change it to something like "Example sentences should [...] use wikilinks only sparingly". Or, yes, just remove it; I don't think we want people to start wikilinking every word or even just most words, but I do agree there are cases where wikilinking is reasonable. - -sche (discuss) 20:39, 20 July 2024 (UTC)

I think that would be preferable. The original idea was that usexes were supposed to be simple enough to actually illustrate the word to someone who didn't know it, not be the cause for looking up more words. We should continue to discourage wikilinking usexes, while being flexible when it makes sense. Andrew Sheedy (talk) 22:04, 20 July 2024 (UTC)

Yes, I agree. As an alternative to blue links, I ended up giving "translations" at we#Etymology 2, but that only makes sense in some contexts (and even then, I'm still unsure if it's the right approach). Theknightwho (talk) 22:50, 20 July 2024 (UTC)

Support relaxing the prohibition in general, but especially for non-English example sentences. The policy states that the words should be easy enough to understand without additional lookup, but readers of the English Wiktionary should not be assumed to understand other languages, and such wikilinks can be a convenience to language learners. Voltaigne (talk) 21:01, 20 July 2024 (UTC)

Support but not in blue. Useful for examples and for quotations. 1. Perhaps a l2 (link2) like word dashed, black? 2. Hoping that in the future, for Examples and Quotations, double-click links would be available for all words, with manual links provided if linking to a different form. 3. Quotations linked to dozens of lemmata, e.g. an Ancient Greek paragraph linking to 50 pages. A repository of texts, at least some? Thank you ‑‑Sarri.greek ♫ I 18:12, 21 July 2024 (UTC)

Oppose at least complete removal. I think links in usage examples (and quotes, but that's not being discussed here) tend to be cluttersome and distracting, drawing attention away from the bolded word in question. I'd actually remove the {{ux}} on deafaz because the quotes do a decent job of showing the word in use and because I should not have to click on another page just to understand what this other word is. I suppose when it comes to foreign languages, things might be different, and exceptions may exist regardless, so perhaps this can be lightened at least by adding "generally" before "not". I also like Sarri.greek's suggestion of being able to double-click on any word or term as needed. -BRAINULATOR9 (TALK) 22:30, 21 July 2024 (UTC)

Support. Imetsia (talk (more)) 22:36, 21 July 2024 (UTC)

Support. This also needs to be changed at EL: WT:EL#Example sentences. The principle that words used in example sentences "should be easy enough to understand without additional lookup" is a good one and should be kept, perhaps with added guidance at WT:UX that difficult or unusual terms should only be used where they significantly add to the illustrative value of the example sentence. This, that and the other (talk) 03:20, 22 July 2024 (UTC)

Support as links can be useful. J3133 (talk) 05:31, 22 July 2024 (UTC)

Oppose: if a user is looking up deafaz, that implies they're reading/hearing that kind of slang, and they're either already familiar with most of it (in which case the links are mostly useless), or they should start with a basic course (in which case we should link to a basic course). The same goes for foreign languages. (The usex at deafaz could be considered a foreign language, so add a translation into "normal" English rather than a gazillion links.) I really dislike looking up Chinese entries because of the completely pointless sea of blue. MuDavid 栘𩿠 (talk) 07:58, 22 July 2024 (UTC)

Transcriptions of the nurse vowel

Can someone remind me why Appendix:English pronunciation uses /ɜɹ/ for GenAm but many entries use /ɝ/? Was the recommendation revised after a bunch of entries had inherited the older recommendation? No big deal, I am just idly curious about it. Quercus solaris (talk) 18:03, 20 July 2024 (UTC)

I don't know if it was intended as a firm "/ɜɹ/ good, /ɝ/ bad" declaration; there was a similar discussion of /əɹ/ vs /ɚ/ recently. Each option (/ɜɹ/ vs /ɝ/) has arguments for it; maybe we just need to take a straw poll / vote about which to use. I will note that we almost never notate /ɑ˞/ or /ɔ˞/, it's always V+ɹ, so /ɜɹ/, /əɹ/ would be consistent with that and would require positing fewer phonemes / using fewer symbols (which is, I suspect, why the Appendix is set up like it is). In any case we should definitely add footnotes that /ɜɹ/ and /ɝ/ (and likewise for /ɑ˞/, etc) are the same phoneme, so anyone reading one work (or entry!) that has one, and another work or entry that has the other, knows they're not separate phonemes in English. - -sche (discuss) 20:47, 20 July 2024 (UTC)

The Spanish salle problem

Is there some way we can amend the selected combined forms table at salir so that it doesn't claim that the forms salle (et al.) don't exist? I appreciate that this stems from a well-known problem with Spanish orthography that renders these forms "unwriteable", since the morphemes are "sal-le" (/ˈsalle/), which clashes with the orthography rules that dictate it should be read "sa-lle" (/ˈsaʝe/), but that's something we should be explaining in a footnote. What we shouldn't be doing is brushing it under the carpet by pretending they don't exist, since it's misleading to learners, who may well encounter these forms in speech, particularly given that salir is a really basic verb. Theknightwho (talk) 03:45, 21 July 2024 (UTC)

I would support a usage note, and we can make entries for salle and such, if they're attestable, but they'd have to have a nonstandard label. However, while salir is a common verb, the usage that an imperative salirle brings is not nearly as common, and in this specific instance many choose to rephrase it as this page explains.

Looking at the RAE's historical corpus, salir gets 63857 hits, while salirle gets 314. Searching for sálgale gets 3 hits and salile gets me 0. salle gets 673, but the vast majority of them are pre-1600s and the usage is not the same as what's being discussed. After the 1600s, the only hits are clear cases where "salle" is part of the name. There are no hits for sal-le (and the RAE does keep track of hits with dashes). The CORPUS XXI also has no hits for sálgale, salile, nor sal-le, and the salle hits are all names of something.

Moving to Google Books, the queries are a bit harder to find, so I'll be using the construction with al encuentro behind it to illustrate. Out of the searches for salle al encuentro, only one is an actual use, with the others being a mention or explicitly talking about the fact that you can't write it. Sálgale al encuentro only has 3 pages of hits. Salile al encuentro has slightly more, but some of those usages are clearly salí (1p preterite) + le from back when pronouns could be added to indicative verbs as enclitics. Finally, sal-le al encuentro paints the same picture, as there are only one or maybe two genuine uses of it in running text (with the others either simply repeating the text or talking about the phenomenon). Even when I expand the search to sal-le al, I get maybe one more use.

As such, overall, I don't think this is really that serious of an issue, and I highly doubt that a learner would come across it in running text. I doubt that they're likely to hear it in speech either considering how rare the other forms are as well. Folks clearly took the "you can't write this in Spanish!" headline and ran with it, but the data shows that it's not common at all. Honestly I'm not even sure if "sal-le" is attestable under CFI at this rate, and if it is, it'd be exceedingly rare. It's more trivia than anything. AG202 (talk) 05:46, 21 July 2024 (UTC)

Knight did say that the term was encountered in speech more than in writing, so maybe there are more uses in recorded speech (like talk shows and vinyl recordings), although I do acknowledge that the data suggests that this form is rare.

(Also, do we need one use of the nonlemma form to make an entry, or three?) CitationsFreak (talk) 08:06, 21 July 2024 (UTC)

German Low German and Low German

First of all, there's no such thing as "German Low German"; there's just one Low German, with west and east varieties. I've seen some people explaining that Low German is Low German spoken in the Netherlands (even if that were true, why not just call it Dutch Low Saxon? It already has a name), but then I notice that we classify Westphalian as German Low German (which is spoken both in Germany and the Netherlands), so what's Low German and what's German Low German? It seems like no one really knows; I've seen people adding Low Prussian entries (a dialect of colonial east German) sometimes as Low German and sometimes as German Low German. My proposal to fix this issue is to merge Low German with German Low German, and split it into "East Low German" or "Low Saxon" and into "West Low German". I'm not good at the technical side of Wiktionary, but I think it's also possible to automatically label entries east or west: when someone labels an entry as Low Prussian, for example, it automatically becomes East German, so we don't need to write "East German, Low Prussian". Rakso43243 (talk) 13:32, 21 July 2024 (UTC)
